Failure Modes, Effects and Criticality Analysis (FMECA) is a quality tool that builds on the results of Functional Analysis to identify risks and their consequences. It grew out of Reliability Engineering efforts in the late 1950s.

FMECA can be applied to systems, products, manufacturing processes, equipment, plant and even less tangible subjects such as logistic or information flows. It is used to identify the possible ways in which failure can occur, the corresponding causes of failure, the corresponding effects of failure, and the impact on Customer Satisfaction.

The objective of FMECA is to identify the components of products and systems most likely to cause failure, so that these potential failures can then be designed out. FMECA allows potential problems or safety hazards inherent in a product design to be identified early in the product development process. The safety and reliability of the product are assessed, and modifications are initiated at relatively low cost before the problems are built into the product. Product reliability and customer satisfaction will be improved by preventing failures from occurring.

FMECA allows a Concurrent Engineering team to address reliability issues early in the design cycle, when modifications are less costly. Critical risks associated with design or process concepts can be identified, and the necessary corrective actions taken, in time.

When used in conjunction with Root Cause Analysis, Design of Experiments and Quality Function Deployment (QFD), the technique provides the basis for designing out potential failure modes.

FMECA helps assess the relative importance of the failure of different components. For example, when an engine fails, it is extremely rare that the cause is a breakage in a component such as a connecting rod or crankshaft. The most likely cause is a fault in the ignition or fuel system. Similarly, in most electronic assemblies, faults tend to occur most often in connections and ancillaries. FMECA can be used to show why a particular action is necessary. For example, the replacement of a tiny engine component costing 20 cents with an apparently over-engineered one costing 35 cents might eliminate more than 40 percent of failures. With FMECA, it becomes apparent why the cost of a tiny component should be almost doubled, yet without it, the change can be difficult to justify.

FMECA contributes greatly to achieving internal quality targets. Its results can also be used to demonstrate the quality of engineering to potential customers.

In FMECA terminology, a function is the task a component/system performs. A failure is defined as an inability to perform a function normally. Four classes of failure can be identified:

To illustrate this, consider an engine valve. The function of a valve is to open and close. An example of a failure is that the valve does not close. There can be different reasons for this failure. For example, the valve will fail to close if its spring breaks, but it can also stick in its guide or be held open by the cam should the camshaft belt break.

For another example, consider a spring. The function of the spring could be to return a spool to its rest position. Failure could occur if the spring is broken, or is too hard or too weak. Potential effects of failure can be identified:

Potential failure mode    Potential effect of failure
spring broken             loss of system efficiency
spring too hard           slow system response
spring too weak           system noise and vibration

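There are two main phases to the use of FMECA: risk assessment and failure prevention.

In the risk assessment phase of FMECA, a multi-functional team analyzes the way the product/system operates. For each possible failure mode of the product/system, it identifies the probability of the failure occurring, the severity of its effect, and the likelihood of it not being detected, and assigns a value to each of these factors.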

Failure modes are then prioritized as a function of their probability, severity, and detectability. Based on this prioritization, action can then be taken to prevent failure or to reduce the likelihood of failure occurring.

The probability (P) of a particular mode of failure can be fixed on a scale of 1 to 10 with a higher number being scored for more probable failures. As an example, a score of 1 could be assigned to 1 failure in 100 000, and a score of 10 to a probability of 1 failure in 20.

The severity (S) factor describes the degree of reduced functionality resulting from the failure. This can also be scored on a scale of 1 to 10, with a high score implying seriously reduced functionality, and a low score representing little or no effect. As an example, a score of 1 could signify that the customer would not notice that a failure has occurred, 9 could represent a major problem for the customer, and 10 could represent a significant safety hazard or non-compliance with regulations.

The detectability (D) factor describes the likelihood of the failure not being detected by the design process before the product or system is used by the customer. Again it can be scored on a 1 to 10 range, where a score of 1 could indicate that the failure will be found every time, and a score of 10 could indicate that the failure will not be detected before use by the customer.

A criticality index or risk priority number (RPN) can then be calculated:
RPN = P * S * D
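
As a rough illustration of this calculation, the following Python sketch multiplies the three scores and rejects any value outside the 1 to 10 scales described above. The function name rpn and the example scores are assumptions made for this illustration only; they are not taken from any FMECA standard or tool.

```python
def rpn(probability: int, severity: int, detectability: int) -> int:
    """Return the risk priority number P * S * D.

    Each score is assumed to be an integer on the 1-to-10 scale
    described in the text; anything else raises ValueError.
    """
    for name, score in (("probability", probability),
                        ("severity", severity),
                        ("detectability", detectability)):
        if not isinstance(score, int) or not 1 <= score <= 10:
            raise ValueError(f"{name} must be an integer from 1 to 10, got {score!r}")
    return probability * severity * detectability


# Hypothetical scores: moderately likely (4), a serious problem for the
# customer (8), and detected about half the time before use (5).
example = rpn(4, 8, 5)
print(example)  # 160
```

With three 1-to-10 scales, the index ranges from 1 (rare, harmless, always detected) to 1000 (frequent, hazardous, never detected).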

In the case of the spring, this analysis could result in:

Potential failure mode    Potential effect of failure    Possible causes    RPN
spring broken             loss of system efficiency      wrong material     582
spring too hard           slow system response           wrong material     252
spring too weak           system noise and vibration     wrong material     290

The higher the resulting index, the more urgent the need to find a solution.

In the failure prevention phase, having identified the most urgent failure modes to address by examining the RPN in conjunction with its individual elements, the FMECA team applies Root Cause Analysis and other creativity techniques to develop solutions to prevent failures from occurring. A tracking system is put in place to ensure feedback into the FMECA process and avoid wasting time solving problems which have already been solved.
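
The prioritization step can be sketched in a few lines of Python. The sketch below simply sorts the three spring failure modes from the example above by their RPN, highest first. The FailureMode class and its field names are hypothetical, introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    mode: str
    effect: str
    cause: str
    rpn: int


# RPN values are taken from the spring example in the text.
spring_modes = [
    FailureMode("spring broken", "loss of system efficiency", "wrong material", 582),
    FailureMode("spring too hard", "slow system response", "wrong material", 252),
    FailureMode("spring too weak", "system noise and vibration", "wrong material", 290),
]

# Highest RPN first: the most urgent failure modes head the list.
ranked = sorted(spring_modes, key=lambda fm: fm.rpn, reverse=True)

for fm in ranked:
    print(f"{fm.rpn:>4}  {fm.mode}: {fm.effect}")
```

An ordering by RPN alone is only a starting point. Since the RPN is meant to be examined together with its individual elements, a team might also choose to review any failure mode with a very high severity score, whatever its overall index.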

FMECA can be carried out either by starting at the component level and expanding upwards (the 'bottom-up' approach), or starting at the system level and working downwards (the 'top-down' approach). In most cases the bottom-up approach is used, with the engineer beginning at the component level and working upward to the top level of the design. During this process all possible failures should be identified and examined. The top-down, 'system-level' approach is often used to reduce the amount of analysis that must be carried out. The system-level FMECA starts off by identifying the functional blocks containing high risk components. These blocks are then analyzed in more detail.
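
The difference in workload between the two approaches can be illustrated with a small Python sketch. Everything in it is hypothetical: the system layout, the preliminary risk scores, and the HIGH_RISK cut-off are invented for the example, and a real analysis would rely on the team's judgment rather than a single threshold.

```python
# Hypothetical system: functional blocks mapped to their components and a
# preliminary risk score for each component (made-up numbers).
system = {
    "ignition": {"spark plug": 320, "coil": 180},
    "fuel system": {"pump": 410, "filter": 90},
    "crank train": {"connecting rod": 40, "crankshaft": 30},
}

HIGH_RISK = 300  # assumed cut-off, chosen only for this illustration


def bottom_up(system):
    """Bottom-up: every component of every block is examined."""
    return [(block, part) for block, parts in system.items() for part in parts]


def top_down(system, threshold=HIGH_RISK):
    """Top-down: keep only the blocks that contain at least one high-risk
    component, then list all components of those blocks for detailed analysis."""
    risky_blocks = [block for block, parts in system.items()
                    if any(score >= threshold for score in parts.values())]
    return [(block, part) for block in risky_blocks for part in system[block]]


print(len(bottom_up(system)))  # 6 components to analyze
print(len(top_down(system)))   # 4 components, in the two high-risk blocks
```

In this made-up case the top-down pass skips the crank train block entirely, which is the saving in analysis effort that the system-level approach aims for.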

Key factors in the successful application of FMECA include:

FMECA has many uses. Examples include evaluating system reliability, evaluating hardware and software interfaces, identifying potential design defects and safety hazards, simplifying maintenance, and trouble-shooting. It is also useful for rating alternative solutions to a problem.

Computer systems are available to support FMECA, providing templates and databases for storage and retrieval of commonly used information such as failure modes and possible causes. Use of these systems eliminates much of the tedious work associated with each part or system, such as entering detailed information. As solutions to failure modes are found, the FMECA database is updated, increasing the overall knowledge base.

FMECA produces and uses a lot of product/system data that must be distributed to many users in the development process. To avoid loss of valuable information about the design, this information must be carefully stored and distributed. Engineers must be able to see what changes were made to previous versions. As designs change, new revisions of the FMECA results will be created and modified. They too will have to be stored and made available for future use. An Engineering Data Management (EDM) or Product Data Management (PDM) system can be used to manage the storage, access and distribution of FMECA data.


Copyright 1998 by John Stark