Re: [time-nuts] DC distribution

jimlux Sun, 06 Oct 2019 04:13:53 -0700

On 10/5/19 3:35 PM, Hal Murray wrote:


jim...@earthlink.net said:

There is *great* resistance to changing any assembly and workmanship
standard - nobody wants to be the person who says "we don't need to do
*that* anymore" and then a disaster happens, and one of the potential causes
is "you didn't do *that*"

It is entirely possible that the original rationale and explanation is no
longer valid.


There is also a risk of troubles because you are still doing *that*.

Do the people who maintain the rules occasionally look around to see if a
better way has been developed?


TL,DR: Yes, but..

Sore point there - since my job these days is managing what are called Risk Class D missions, for which some of the (perceived) risk is that you don't have to follow all the process that is typical for Class A, B, and C missions. And I've had in-flight failures where the spacecraft was lost (Ouch!). There's the question of "should we have followed some process that we didn't follow" - The idea is that process is expensive, and knowing acceptance of risk allows you to do things you could not otherwise do.

NASA divides missions into risk classes (NPR 8705.4), in terms of the "consequences of failure" or "national significance" or "difficulty of reflight" or cost, ranging from A down to D. Class A is human or multibillion-dollar flagship; Class B is things like Mars rovers; Class C is "less than two year missions that cost less than $100M" kind of thing; Class D is "ok if it fails".

There is an enormous amount of "standard practice" for NASA missions - often derived from long experience, or, perhaps, from some "bad day" and a process/rule gets created that says "we're not going to do that again".

It's important to know that NASA, in general, does not do "reliability calculations" in a MIL-HDBK-217 way - there's no stacking up of individual part reliabilities to get an estimated system MTBF. This is historical - NASA typically builds "just one unit" (maybe 2 or 3) - so there's no chance to do life testing and build up statistics. I think (Jim's opinion) that when they started coming up with process, the part/assembly reliability data had huge variances, so the resulting MTBF predictions spanned a wide range, or worse yet, said "failure is certain". There's also a problem that parts reliability probably isn't the dominant factor for reliability - it's design (is that wire under tension, causing it to break with thermal cycles?) or workmanship (not in a good/bad sense, but a variability sense).
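To make the "stacking up" concrete, here is a minimal sketch of the kind of series-system calculation described above, not something taken from any NASA or MIL-HDBK-217 document. The part names and failure rates are made up for illustration, and the factor-of-10 uncertainty is an assumption chosen only to show how a wide spread in part data carries straight through to the MTBF estimate.

```python
def series_mtbf_hours(failure_rates_per_1e6_hours):
    """Series system: every part must work, so failure rates add.

    Rates are in failures per 10^6 hours; MTBF is the reciprocal of the sum.
    """
    total_rate = sum(failure_rates_per_1e6_hours)
    return 1e6 / total_rate


# Hypothetical parts list (illustrative numbers, not real data).
parts = {"regulator": 0.5, "oscillator": 1.0, "connector": 0.5}

nominal = series_mtbf_hours(parts.values())

# Assumption: each rate is only known to within a factor of 10 either way.
pessimistic = series_mtbf_hours(r * 10 for r in parts.values())
optimistic = series_mtbf_hours(r / 10 for r in parts.values())

print(f"nominal MTBF:     {nominal:,.0f} h")
print(f"pessimistic MTBF: {pessimistic:,.0f} h")
print(f"optimistic MTBF:  {optimistic:,.0f} h")
```

With a factor of 10 uncertainty in each direction, the optimistic and pessimistic estimates differ by a factor of 100, which is the "spanned a wide range" problem in miniature.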

So there are tons of process to try and drive the variability of workmanship down - You don't just tighten a fastener, you torque it to a specified level, determined (in theory) by the design loads, etc.; and someone witnesses the torquing to make sure someone didn't forget to install the bolts. Mistakes happen - the system tends to get paperwork heavy - and disasters have occurred because someone ignored the evidence of their hands/eyes and trusted the paper - NOAA N-prime is the case in point. Interestingly, these are called "process escapes" - and there's a huge amount of work (multiple work years even for things where nothing bad happened) that goes into determining why someone did something that's outside the process - was it just a bad day? is the process itself inconvenient or incomplete, etc. There is an intense amount of contemplation on changing the process - typically it was created because of a single bad event (NASA just doesn't build that many things), it addressed the causes of that event and appears to be a "good idea" for the future. It then becomes part of the "received wisdom of the ages" and everyone does it - until some event triggers a reevaluation.

In general, the system is set up so that it's easier to just "do the standard thing" than to get a waiver to not do it. Getting the waiver typically requires that you *prove* in some sense that it won't increase risk, or that you've somehow backed yourself into a corner and there's no way to get the job done without it. The latter is the "willing acceptance of risk" and there's a lot of people who have to sign off on it - The NASA administrator does NOT want to sit in front of Congress explaining why a $500M mission was lost because a waiver was issued to not do something. "You mean, sir, that we saved a few hours' labor and it cost us $500M?" You don't get to say "There were 10,000 things, each of which is individually a good idea, but if we did them all, the mission would have cost $1B, and you only gave us $500M"

For a Class D mission, there is a formal process (at JPL, anyway) where you go through the roughly 700 "Design Principles" and "Flight Project Practices" and identify which ones you will comply with, which you won't, and which are "comply with intent, but adjusted for this mission". The DP and FPP are high level documents that describe "stuff you should do" - things like "you should have no more than 30% CPU loading at PDR", "You shouldn't discharge the batteries more than X%".

The result of this process (which takes a few months) is a list of blanket waivers - for instance, maybe you don't need to have independent people do a worst case analysis or parts stress analysis of all your circuits - you trust in the experience of the engineer doing the design, and they do some informal analysis (a spreadsheet of voltage rating vs voltage it sees in the circuit). A big one is getting waivers to not have inspection and test at ALL levels of integration - you can assemble the whole thing, test it as a whole, at the risk of discovering a problem late in the project. For instance, you assume all the transistors are good from the mfr, and that the board is correctly assembled by the automated fab, so you don't need electrical test. You plug the board in, and if the system doesn't work, you have a spare you can swap in. On the other hand, if it takes 6 months to dismantle the spacecraft and extract the failed board, you probably won't get the waiver. My spacecraft were easy - you could assemble or disassemble them into their component assemblies in less than a day - so it wasn't schedule risk, it was "will we break something by handling it" - little teeny connectors are fragile.
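The informal "spreadsheet of voltage rating vs voltage it sees in the circuit" might look something like the sketch below. This is only an illustration: the part designators, the voltages, and the 60% derating limit are all assumptions, not values from any JPL or NASA derating guideline.

```python
# Assumed derating limit: applied voltage should stay at or below
# 60% of the rated voltage. Illustrative only, not an official figure.
DERATING_LIMIT = 0.6

# Hypothetical rows: (part, rated volts, volts seen in the circuit).
rows = [
    ("C1", 16.0, 5.0),
    ("C2", 10.0, 9.0),
    ("Q1", 40.0, 12.0),
]


def over_stressed(rows, limit=DERATING_LIMIT):
    """Return the parts whose applied/rated voltage ratio exceeds the limit."""
    return [part for part, rated, applied in rows if applied / rated > limit]


for part, rated, applied in rows:
    print(f"{part}: {applied / rated:.0%} of rating")

print("needs a second look:", over_stressed(rows))
```

The point of the waiver is that the design engineer runs this kind of check themselves, rather than having an independent group redo it for every circuit.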

Ultimately, whether your mission succeeds or fails, but especially if it fails, we go back and look at all those waivers and over a period of years, we decide, hmm, maybe we should change that because technology has changed. Each time someone goes through the Class D process, the FPP and DP get looked at, and if everyone is getting exempted from some requirement, and has good reasons, then there's a rule change. But it's slow.

And where there is a problem, maybe a new rule will be created - with the large number of SmallSats (cubesats and slightly larger) being done these days, you wind up with physical properties that are outside the "traditional" experience. A 10 foot long flexible antenna sticking out of a 1000kg spacecraft is mechanically very different from that same antenna sticking out of a 5kg spacecraft.

And there will need to be new processes to deal with swarms and massive constellations - NASA is used to flying one spacecraft, maybe 2 (MER) - if there's a failure, it's a big deal. You convene a Failure Review Board (FRB), you identify Corrective Actions, etc. If you fly 100 spacecraft to perform a function, and one fails, and the function is still performed, meeting all requirements, is it a big deal? Maybe it's just that the spacecraft have 90% reliability, and you planned for that by launching 100 when you need 50 to make your measurement. Are you going to convene an FRB for each failure? Or are you going to say - oh yeah, that is an expected failure mode, we know it's random and not a common design flaw among all 100, move on.
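The "launch 100 when you need 50" reasoning can be sketched with a binomial model. This is a rough illustration, assuming each spacecraft works independently with the same probability (i.e. failures are random and not a common design flaw); the function names are mine, and the 99% target in the usage lines is an arbitrary example.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability that at least k of n independent units work,
    when each works with probability p (binomial model)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def smallest_fleet(k, p, target, max_n=10_000):
    """Smallest n such that at least k of n work with probability >= target.

    Returns None if no n up to max_n is enough.
    """
    for n in range(k, max_n + 1):
        if prob_at_least(k, n, p) >= target:
            return n
    return None


# Numbers from the text: 90% reliable spacecraft, 50 needed, 100 launched.
print(prob_at_least(50, 100, 0.9))

# Arbitrary example target: how few could you launch and still be
# 99% sure of having 50 working?
print(smallest_fleet(50, 0.9, 0.99))
```

If the failures share a common cause, the independence assumption breaks and this calculation is optimistic, which is exactly why you would still want to know whether a given failure is random or a design flaw.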

With a move to "statistics" instead of "build it perfect" - there will be process changes - but there will need to be test data to back up the statistics.


_______________________________________________
time-nuts mailing list -- time-nuts@lists.febo.com
To unsubscribe, go to 
http://lists.febo.com/mailman/listinfo/time-nuts_lists.febo.com
and follow the instructions there.
