Reliability Centered Maintenance Simplified

A Reliability Centered Maintenance (RCM) process systematically identifies all of an asset’s functions and functional failures, along with all of its reasonably likely failure modes (the causes of those failures). It then identifies the effects of these failure modes and the ways in which those effects matter. Once it has gathered this information, the RCM process selects the most appropriate asset management policy.
Unlike other maintenance development processes, RCM considers all asset management options: on-condition tasks, scheduled restoration tasks, scheduled discard tasks, failure-finding tasks, and one-time changes (to hardware design, operating procedures, personnel training, or other aspects of the asset outside the strict world of maintenance).

Seven Questions Addressed by RCM

Fundamentally, the RCM process seeks to answer the following seven questions in sequential order:
1-Functions 
What are the functions and associated desired standards of performance of the asset in its present operating context (functions)? The specific criteria that the process must satisfy are: 
  • The operating context of the asset shall be defined.     
  • All the functions of the asset/system shall be identified (all primary and secondary functions, including the functions of all protective devices).     
  • All function statements shall contain a verb, an object, and a performance standard (quantified in every case where this can be done).     
  • Performance standards incorporated in function statements shall be the level of performance desired by the owner or user of the asset/system in its operating context. 
The operating context is the circumstance in which the asset is operated. The same hardware does not always require the same failure management policy in all installations. For example, a single pump in a system will usually need a different failure management policy from a pump that is one of several redundant units in a system. A pump moving corrosive fluids will usually need a different policy from a pump moving benign fluids. Protective devices are often overlooked; an RCM process shall ensure that their functions are identified. Finally, the owner/user shall dictate the level of performance that the maintenance program shall be designed to sustain.
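
As a rough illustration of the criteria above, a function statement can be held as a small record with a verb, an object, a performance standard, and an operating context. This is only a sketch: the class name, field names, and the "hypothetical heating loop" context are assumptions made for the example, not part of any RCM standard or tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionStatement:
    """One RCM function statement (all field names are illustrative)."""
    verb: str
    obj: str
    performance_standard: str   # quantified wherever this can be done
    operating_context: str      # where and how the asset is operated
    protective_device: bool = False

    def is_complete(self) -> bool:
        """True when verb, object, standard, and context are all filled in."""
        parts = (self.verb, self.obj, self.performance_standard,
                 self.operating_context)
        return all(part.strip() for part in parts)

    def text(self) -> str:
        """Render the statement in the usual 'To <verb> <object> <standard>' form."""
        return f"To {self.verb} {self.obj} {self.performance_standard}"


# Hypothetical example; the operating context is invented for illustration.
keep_temperature = FunctionStatement(
    verb="keep",
    obj="system temperature",
    performance_standard="between 50 C and 70 C",
    operating_context="hypothetical heating loop, single non-redundant unit",
)
```

Keeping the operating context on the record is a reminder that the same hardware in a different installation would need its own statement, and possibly a different failure management policy.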
 
2-Functional Failures  
In what ways can the asset fail to fulfill its functions (functional failures)? This question has only one specific criterion: all the failed states associated with each function shall be identified. If functions are well defined, listing functional failures is relatively easy. For example, if a function is to “keep system temperature between 50 °C and 70 °C,” then functional failures might be:
  • Unable to raise system temperature above ambient
  • Unable to keep system temperature above 50 °C
  • Unable to keep system temperature below 70 °C
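
The temperature example can be sketched in a few lines of Python. The function below maps a measured temperature to one of the three failed states listed above, or to no failure. The names are hypothetical, and the sketch assumes the ambient temperature is below the 50-degree lower limit.

```python
from enum import Enum
from typing import Optional


class FunctionalFailure(Enum):
    NO_HEATING = "unable to raise system temperature above ambient"
    TOO_COLD = "unable to keep system temperature above the lower limit"
    TOO_HOT = "unable to keep system temperature below the upper limit"


def failed_state(temp_c: float, ambient_c: float,
                 low_c: float = 50.0,
                 high_c: float = 70.0) -> Optional[FunctionalFailure]:
    """Return the functional failure that matches a reading, or None.

    Assumes ambient_c < low_c, so "not above ambient" is always a failure.
    """
    if temp_c <= ambient_c:
        return FunctionalFailure.NO_HEATING
    if temp_c < low_c:
        return FunctionalFailure.TOO_COLD
    if temp_c > high_c:
        return FunctionalFailure.TOO_HOT
    return None  # the function is being fulfilled
```

Each failed state is a separate value because each one may have different failure modes, and therefore different failure management policies.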
3-Failure Modes  
What causes each functional failure (failure modes)? In Failure Modes, Effects and Criticality Analysis (FMECA), the term “failure mode” is used in the way that RCM uses the term “functional failure.” However, the RCM community uses the term “failure mode” to refer to the event that causes functional failure. The standard’s criteria for a process that identifies failure modes are:  
  • All failure modes reasonably probable to cause each functional failure shall be identified.     
  • The method used to decide what constitutes a “reasonably probable” failure mode shall be acceptable to the owner/user of the asset.     
  • Failure modes shall be identified at a level of causation that makes it possible to identify an appropriate failure management policy.     
  • Lists of failure modes shall include failure modes that have happened before, failure modes that are currently being prevented by existing maintenance programs, and failure modes that have not yet happened but are thought to be reasonably likely (credible) in the operating context.
  • Lists of failure modes should include any event or process that is likely to cause a functional failure, including deterioration, human error (whether by operators or maintainers), and design defects.
RCM is the most thorough of the analytic processes that develop maintenance programs and manage physical assets. It is therefore appropriate for RCM to identify every reasonably likely failure mode.
 
4-Failure Effects  
What happens when each failure occurs (failure effects)? The criteria for identifying failure effects are: 
  • Failure effects shall describe what would happen if no specific task were done to anticipate, prevent, or detect the failure.  
  • Failure effects shall include all the information needed to support the evaluation of the consequences of the failure, such as:
  1. What evidence (if any) there is that the failure has occurred (in the case of hidden functions, what would happen if a multiple failure occurred).
  2. What the failure does (if anything) to kill or injure someone, or to have an adverse effect on the environment.
  3. What the failure does (if anything) to have an adverse effect on production or operations.
  4. What physical damage (if any) is caused by the failure.
  5. What (if anything) must be done to restore the function of the system after the failure.
FMECA or FMEA usually describes failure effects in terms of the effects at the local level, at the subsystem level, and at the system level.
 
5-Failure Consequences  
In what way does each failure matter (failure consequences)? The standard’s criteria for a process that identifies failure consequences are:  
  • The assessment of failure consequences shall be carried out as if no specific task is currently being done to anticipate, prevent, or detect the failure.      
  • The consequences of every failure mode shall be formally categorized as follows:
      • The consequence categorization process shall separate hidden failure modes from evident failure modes.
      • The consequence categorization process shall clearly distinguish events (failure modes and multiple failures) that have safety and/or environmental consequences from those that only have economic consequences (operational and non-operational consequences).
RCM assesses failure consequences as if nothing is being done about the failure. Some people are tempted to say, “Oh, that failure doesn’t matter because we always do (something), which protects us from it.” However, RCM is thorough: it checks the assumption that the action “we always do” actually does protect against the failure, and it checks the assumption that this action is worth the effort.

RCM assesses failure consequences by formally assigning each failure mode to one of four categories: hidden, evident safety/environmental, evident operational, and evident non-operational. The explicit distinction between hidden and evident failures, performed at the outset of consequence assessment, is one of the characteristics that most clearly distinguishes RCM, as defined by Stan Nowlan and Howard Heap, from MSG-2 and earlier U.S. civil aviation processes.
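
The four-way categorization can be sketched as a short decision function. The hidden/evident question is asked first, as described above. The enum and parameter names are hypothetical, and the three yes/no answers are assumed to come from the failure effects analysis.

```python
from enum import Enum


class Consequence(Enum):
    HIDDEN = "hidden"
    EVIDENT_SAFETY_ENVIRONMENTAL = "evident safety/environmental"
    EVIDENT_OPERATIONAL = "evident operational"
    EVIDENT_NON_OPERATIONAL = "evident non-operational"


def categorize(evident: bool,
               safety_or_environmental: bool,
               affects_operations: bool) -> Consequence:
    """Assign a failure mode to one of the four consequence categories."""
    # Hidden versus evident is decided at the outset.
    if not evident:
        return Consequence.HIDDEN
    # Safety and environmental consequences take precedence over economic ones.
    if safety_or_environmental:
        return Consequence.EVIDENT_SAFETY_ENVIRONMENTAL
    if affects_operations:
        return Consequence.EVIDENT_OPERATIONAL
    return Consequence.EVIDENT_NON_OPERATIONAL
```

In this simplified sketch, a hidden failure mode stays in the hidden category whatever its other answers are. Whether the associated multiple failure has safety or environmental consequences is then considered when a failure management policy is selected.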

6-Proactive Tasks  
What should be done to predict or prevent each failure (proactive tasks and task intervals)? This is a complex topic, and so its criteria are presented in two groups. The first group pertains to the overall topic of selecting failure management policies. The second group of criteria pertains to scheduled tasks and intervals, which comprise proactive tasks as well as one default action (failure-finding tasks). 
The criteria for selecting failure management policies are:
  1. The selection of failure management policies shall be carried out as if no specific task is currently being done to anticipate, prevent, or detect the failure.
  2. The failure management selection process shall take account of the fact that the conditional probability of some failure modes will increase with age (or exposure to stress), that the conditional probability of others will not change with age, and that the conditional probability of still others will decrease with age.
  3. All scheduled tasks shall be technically feasible and worth doing (applicable and effective), and the means by which this requirement will be satisfied are set out under scheduled tasks in the failure management section.
  4. If two or more proposed failure management policies are technically feasible and worth doing (applicable and effective), the policy that is most cost-effective shall be selected.
The criteria for scheduled tasks and intervals are as follows. Scheduled tasks are tasks that are “performed at fixed, predetermined intervals, including ‘continuous monitoring’ (where the interval is effectively zero).” Scheduled tasks shall be identified that fit the following criteria:
  • In the case of an evident failure mode that has safety or environmental consequences, the task shall reduce the probability of the failure mode to a level that is tolerable to the owner/user of the asset.
  • In the case of a hidden failure mode where the associated multiple failure has safety or environmental consequences, the task shall reduce the probability of the hidden failure mode to an extent which reduces the probability of the associated multiple failure to a level that is tolerable to the owner/user of the asset.
  • In the case of an evident failure mode that does not have safety or environmental consequences, the direct and indirect costs of doing the task shall be less than the direct and indirect costs of the failure mode when measured over comparable periods of time.
  • In the case of a hidden failure mode where the associated multiple failure does not have safety or environmental consequences, the direct and indirect costs of doing the task shall be less than the direct and indirect costs of the multiple failure plus the cost of repairing the hidden failure mode when measured over comparable periods of time.
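
The "worth doing" criteria for scheduled tasks reduce to two kinds of comparison: a probability against a tolerable level when safety or the environment is at stake, and a cost against a cost otherwise. The sketch below is one way to express that. The record fields are hypothetical, and it assumes all costs and probabilities have already been estimated over the same period.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskAssessment:
    """Inputs for judging one scheduled task (field names are illustrative)."""
    hidden: bool                   # is the failure mode hidden?
    safety_or_environmental: bool  # for hidden modes: of the multiple failure
    probability_with_task: float   # of the failure mode, or of the multiple failure if hidden
    tolerable_probability: float   # level set by the owner/user
    task_cost: float               # direct + indirect cost of doing the task
    failure_cost: float            # cost of the failure mode, or of the multiple failure if hidden
    hidden_repair_cost: float = 0.0


def worth_doing(a: TaskAssessment) -> bool:
    """Apply the scheduled-task criteria to one assessment."""
    if a.safety_or_environmental:
        # The task must bring the probability down to a tolerable level.
        return a.probability_with_task <= a.tolerable_probability
    # Economic consequences only: the task must cost less than the failure.
    threshold = a.failure_cost
    if a.hidden:
        threshold += a.hidden_repair_cost
    return a.task_cost < threshold
```

For example, a task costing 3,000 per year against a hidden failure mode whose multiple failure costs 2,500 per year, plus 1,000 per year to repair the hidden failure, passes the test because 3,000 is less than 3,500. These figures are invented for illustration.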

Categories of Tasks

There are three general categories of tasks that are considered to be proactive in nature:  

On-condition Tasks

An on-condition task is “a scheduled task used to detect a potential failure.” Such tasks go by many other names in the maintenance community:
  • “Predictive” tasks (in contrast to “preventive” tasks, a name often applied to scheduled discard and scheduled restoration tasks)
  • “Condition-based” tasks, referring to “condition-based maintenance” or CBM (again, in contrast to “time-based maintenance,” meaning scheduled discard and scheduled restoration tasks)
  • “Condition-monitoring” tasks, since the tasks monitor the condition of the asset

Scheduled Discard Tasks

The second kind of task is a scheduled discard task, defined as “a scheduled task that entails discarding an item at or before a specified age limit regardless of its condition at the time.” A scheduled discard task must satisfy the following criteria before it is accepted:
  • There shall be a clearly defined (preferably a demonstrable) age at which there is an increase in the conditional probability of the failure mode under consideration.     
  • A sufficiently large proportion of the occurrences of this failure mode shall occur after this age to reduce the probability of premature failure to a level that is tolerable to the owner or user of the asset.

Scheduled Restoration Tasks  

The third kind of task is a scheduled restoration task, defined as “a scheduled task that restores the capability of an item at or before a specified interval (age limit), regardless of its condition at the time, to a level that provides a tolerable probability of survival to the end of another specified interval.” A scheduled restoration task must satisfy the following criteria before it is accepted:
  • There shall be a clearly defined (preferably a demonstrable) age at which there is an increase in the conditional probability of the failure mode under consideration.      
  • The task shall restore the resistance to failure (condition) of the component to a level that is acceptable to the owner or user of the asset.      
  • A sufficiently large proportion of the occurrences of this failure mode shall occur after this age to reduce the probability of premature failure to a level that is tolerable to the owner or user of the asset.
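
Both scheduled discard and scheduled restoration depend on the conditional probability of failure rising with age. One common way to explore this is with a Weibull life model. The source does not prescribe Weibull, so it is used here purely as an assumption for illustration, and the shape and scale values in the usage loop are invented. Under that assumption, the conditional probability of failing in the next interval, given survival to the current age, is 1 - R(age + interval) / R(age), where R(t) = exp(-(t / scale) ** shape).

```python
import math


def conditional_failure_probability(age: float, interval: float,
                                    shape: float, scale: float) -> float:
    """P(fail within the next `interval` | survived to `age`), Weibull model."""
    def reliability(t: float) -> float:
        # Weibull survival function R(t) = exp(-(t / scale) ** shape)
        return math.exp(-((t / scale) ** shape))

    return 1.0 - reliability(age + interval) / reliability(age)


# Illustrative parameters only: shape > 1 rises with age, shape == 1 stays
# constant, and shape < 1 falls with age.
for shape in (3.0, 1.0, 0.5):
    young = conditional_failure_probability(0.0, 100.0, shape, 1000.0)
    old = conditional_failure_probability(1000.0, 100.0, shape, 1000.0)
    print(f"shape={shape}: young={young:.4f}, old={old:.4f}")
```

Only the rising case (shape greater than 1 in this model) can satisfy the first criterion for a discard or restoration task. When the conditional probability is constant or falling, replacing or overhauling the item on a schedule does not reduce the chance of failure.
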
7-Default Actions  
What should be done if a suitable proactive task cannot be found (default actions)? This question pertains to unscheduled failure management policies: the decision to let an asset run to failure, and the decision to change something about the asset’s operating context (such as its design or the way it is operated).

Failure-Finding Tasks  

A failure-finding task is defined as “a scheduled task used to determine whether a specific hidden failure has occurred.” Failure-finding tasks usually apply to protective devices that fail without notice. This task represents a transition from the sixth question (proactive tasks) to the seventh question (default actions, or actions taken in the absence of proactive tasks). Failure-finding tasks are scheduled tasks, like the proactive tasks, but they are not proactive: they do not predict or prevent failures. They detect failures that have already happened, in order to reduce the chances of a multiple failure (the failure of a protected function while its protective device is already in a failed state).

Run to Failure  

If a process offers a decision to let an asset run to failure, the following criteria should be applied before accepting the decision:
  • In cases where the failure is hidden and there is no appropriate scheduled task, the associated multiple failure shall not have safety or environmental consequences.      
  • In cases where the failure is evident and there is no appropriate scheduled task, the associated failure mode shall not have safety or environmental consequences.
In other words, the process must not allow its users to select “run to failure” if the failure mode, or (in the case of a hidden failure) the associated multiple failure, has safety or environmental consequences.
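
The default-action logic can be sketched as a small function that is reached only when no suitable proactive task has been found. This is a simplified reading of the criteria above, not a full RCM decision diagram. The function and parameter names are hypothetical, and "one-time change" stands for any change to design, procedures, or training.

```python
def select_default_action(hidden: bool,
                          safety_or_environmental: bool,
                          failure_finding_worth_doing: bool = False) -> str:
    """Pick a default action when no suitable proactive task exists.

    For a hidden failure, `safety_or_environmental` refers to the associated
    multiple failure; for an evident failure, to the failure mode itself.
    """
    # Failure-finding only makes sense for hidden failures.
    if hidden and failure_finding_worth_doing:
        return "failure-finding task"
    # Run to failure is not permitted with safety/environmental consequences.
    if safety_or_environmental:
        return "one-time change"
    return "run to failure"
```

The key property is that "run to failure" can never be returned when safety or environmental consequences are present. Everything else in the sketch is a simplification.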
NOTE: Ricky Smith and Keith Mobley have written many articles on reliability that may help answer questions about RCM or maintenance strategies. Some of these articles can be found in Chapter 10 of their book “Rules of Thumb for Maintenance and Reliability Engineers”.

About:
To all my friends, The Maintenance Community on Slack is an incredible free space where over 1,500 maintenance and reliability professionals like myself share real life experiences with each other.   
 
To join us, sign up here: https://upkeep.typeform.com/to/icC8EKPT
