On 2025-07-03 16:12, Wayne S via cctalk wrote:
Yes, stats are kept about issues.
Someone should look at the stats and start to investigate when there’s a lot of 
failures with the same issue. Explicit instructions should be sent to field 
engineers to take extra steps to document what they found and how it was 
resolved, and report their conclusions back to the investigation leader.

That’s how IBM did it.
I know DEC kinda did it for software on VMS through their “Software and 
Solutions” database. I really liked looking at that.


On Jul 3, 2025, at 12:52, Paul Koning<[email protected]> wrote:



On Jul 3, 2025, at 2:26 PM, Wayne S via cctalk<[email protected]> wrote:

That’s a good business practice anyway. You want your high-priced system up and 
running as fast as possible, so not having to do more than cursory diagnostics 
is a good thing. I think DEC realized that with the VAX and its remote 
diagnostic capability. As for the board breaks, IBM used to do that for all the 
boards they replaced. They even had a special board-breaking tool.
My CE from IBM said that it costs IBM more to diagnose a faulty board than it 
does to make a new one so that’s why they do it.  Breaking the board also 
ensures that the engineers won’t get caught up in a side project trying to 
figure out what went wrong.
That's true for problems seen occasionally.  When people realize a particular issue 
appears "too often" it does become an engineering matter, because then it 
indicates an issue with design or manufacturing or part selection.

For example, I remember a product that had a memory backup battery issue, which turned 
out to be a change in plating for the battery holder.  For engineering it turned into an 
exercise in learning what "electrovoltaic series" means -- not something 
familiar to most digital logic EEs.

    paul

Stats are very important if the manufacturer has a lot of their own product under comprehensive maintenance agreements, especially fixed disk drives that would require major time to recover from backups when they fail.  Control Data had a problem once, I believe it was with the MMDs, where they noticed premature failures in the field.  Because they kept accurate records, they were able to trace it back to serials after a particular date when a water filter had been changed at the factory and the new one caused some sort of problem with the epoxy holding the magnetic material to the substrate.

cheers

Nigel


--
Nigel Johnson, MSc., MIEEE, MCSE VE3ID/G4AJQ/VA3MCU
Amateur Radio, the origin of the open-source concept!
Skype:  TILBURY2591
