NASA's Not Shining Moments
The space agency's approach, including its "faster, better, cheaper" credo, may be a recipe for disaster.
Spaceflight remains such an expensive, hazardous, edge-of-the-precipice activity that the cost of disasters can be staggering. The presumed loss of the Mars Polar Lander in December 1999 is only the latest setback. The Mars Climate Orbiter spacecraft crashed into the destination planet's atmosphere and was destroyed last September 23 because of navigation judgment errors. The entire space shuttle fleet was grounded for nearly half a year when a short circuit from a mishandled wire bundle nearly led to an emergency landing in July; more than 100 similarly frayed wires were subsequently found in other shuttles.
The recent blizzard of U.S. space accidents traceable to sloppiness implicates not only the National Aeronautics and Space Administration but also its aerospace contractors, such as Lockheed Martin and Boeing. The total costs far exceed $3 billion, out of an annual national space budget of about $30 billion. Errors can never be totally eliminated--this is rocket science, after all. But many observers have been alarmed at the apparent increase, which may be a symptom of deeper problems that could lead to more failures in the future. Observers and old-time NASA personnel fear that the agency's current philosophies, including its "faster, better, cheaper" credo--the use of more frequent but smaller-scale, less expensive missions--may not be leaving enough room for quality control.
Launches are the most common point of failure and markedly illustrate the kind of mistakes critics say are avoidable. Two potentially serious problems occurred on the STS-93 shuttle flight in July, which launched the Chandra X-ray Observatory. The first, at main engine ignition, saw an improperly fastened pin fall from inside one of the rocket engines, piercing the thin piping that circulates the cryogenic hydrogen fuel through a nozzle to cool the structure. The resulting loss of fuel, though small, caused the shuttle engines to shut down prematurely, just short of the craft's planned altitude. The second problem involved a short circuit several seconds into the flight, which took two computers that control the main engines off line, forcing backup systems to complete the ascent.
Engineers traced the short circuit to worn insulation on cables running the length of the shuttle's payload bay. The source of the wear was not clear, so NASA prudently examined all of the shuttle fleet's wiring. More than 100 additional cases of wear, including some as serious as the one that nearly aborted STS-93, were found and repaired. NASA determined the cause to have been careless handling and bumping by workers.
A string of handling errors continued even as NASA struggled to recover from the frayed-wire near miss. Workers ran a test on a wing elevon (a combined elevator and aileron) without removing a support structure, and as a result several spars harpooned the elevon, requiring its replacement. One main engine had to be replaced when x-rays discovered that a drill bit had been left inside engine plumbing. (Such sloppiness is not limited to NASA systems: the European Space Agency's first Ariane 5 heavy booster blew up on launch in 1996 because of a software oversight, and its SOHO satellite went out of control in mid-1998, apparently because overworked technicians failed to monitor it properly. Commercial rockets in the U.S. and Russia also suffered a rash of launch explosions in 1998 and 1999.)
In September an independent review of a string of expensive failures by Lockheed Martin's Titan IV rockets concluded that "the company focused too heavily on cutting costs and not enough on supervising the quality of its work," according to press accounts. Henry Spencer, a regular commentator on space events, provided more details in a privately circulated report. In addition to the emphasis on cost cutting, he reported that the study found "lack of accountability and well-defined responsibility, growing problems with skills retention, violations of traditionally rigorous rules about testing flight hardware, procedures overly vulnerable to human error, declining workforce quality, and poor customer communications."
Edward M. Hanna, a management consultant for the aerospace safety group FasterBetterCheaper.com, stated in an article circulated around NASA last summer that "there's been a tendency to replace older, more experienced workers with younger people. And that's related to a loss of quality." After a five-year study into the declining quality of aerospace work, Hanna's group determined that "cost cutting and short-term objectives have taken priority over the retention of an experienced core of talent." As a result, wages in aerospace are 20 percent below those of other engineering professions, when the criticality of quality requirements should demand not parity but 20 to 50 percent higher salaries, according to Hanna.
Besides the retention problem ("erosion of critical skills" is the phrase most commonly used within NASA), there are other roadblocks to quality work. For example, the technology itself is more complex and unforgiving. Norman Augustine, former chief executive of Lockheed Martin and a frequent commentator on aerospace quality techniques, told the Washington Post that "after the fact, it's always obvious what went wrong. But before the fact, the problems are so hard to find."
Another obstacle is the style of some managers. The key to success, Augustine says, is a culture where workers know "they won't lose their heads" if they tell the boss bad news. His rule: "We'll tolerate problems, but we won't tolerate not reporting them." NASA had this kind of leadership in the 1960s, when men such as Robert R. Gilruth led the successful Apollo program. But agency insiders privately describe how such an approach sadly never caught on at some other centers and is alien to the style of current leadership at NASA, which has been run since 1992 by Daniel S. Goldin.
"The organization that I spent most of my professional career in had these same problems," states Charles Harlan, the now retired head of safety at the NASA Johnson Space Center in Houston. "The current top management at NASA is famous for 'kill the messenger'-type management style." Harlan, now an aerospace safety consultant, concludes: "It is somewhat depressing that neither Boeing nor NASA can rise above this kind of behavior."
Early in December a presidential board on space launch accidents released its report. It traced the incidents mainly to engineering and fabrication flaws introduced while the boosters were being assembled, flaws that resulted from inadequate management attention and possibly also from the loss of the most experienced employees to retirement and layoffs. "Maintaining management, technical and engineering oversight expertise is becoming increasingly difficult in both government and industry," the report stated.
Last year's space setbacks are certain to create a psychological rebound, in which workers try harder to avoid future disasters. NASA has publicly stated that its approach is still fundamentally sound, although it admits that its Mars strategy needs major rethinking in the wake of the Mars Polar Lander and Climate Orbiter disappearances. The agency may postpone the next landing attempt, scheduled for 2001, as it tries to determine whether the Mars program is sufficiently well designed and budgeted. But in the long run, NASA will have to address its systemic weaknesses if it is to avoid a new string of expensive, embarrassing and perhaps in some cases life-threatening foul-ups.