Linux-Advocacy Digest #842, Volume #27           Fri, 21 Jul 00 06:13:04 EDT

Contents:
  The Failure of the USS Yorktown ("Adam Warner")
  Re: The Failure of the USS Yorktown (Nico Coetzee)
  Re: MS advert says Win98 13 times less reliable than W2k (Arthur Frain)
  Re: The Failure of the USS Yorktown ("Stuart Fox")
  Re: Am I the only one that finds this just a little scary? (Donal K. Fellows)
  Re: Am I the only one that finds this just a little scary? (Donal K. Fellows)
  Re: Some Windows weirdnesses... (Russell Wallace)
  Re: Some Windows weirdnesses... ("Ferdinand V. Mendoza")
  Re: Linux is blamed for users trolling-wish. ("David Brown")
  Re: Am I the only one that finds this just a little scary? ("Christopher Smith")

----------------------------------------------------------------------------

From: "Adam Warner" <[EMAIL PROTECTED]>
Crossposted-To: comp.os.ms-windows.nt.advocacy
Subject: The Failure of the USS Yorktown
Date: Fri, 21 Jul 2000 20:00:15 +1200

Hi all,

I see the USS Yorktown is a very popular topic. Here's some quick online
research:
http://www.cs.clemson.edu/~steve/Spiro/stories/node75.html

The most important quotation in this link is from the official report:

"The Yorktown lost control of its propulsion system because its computers
were unable to divide by the number zero ... The Yorktown's Standard
Monitoring Control System administrator entered zero into the data field for
the Remote Data Base Manager program. That caused the database to overflow
and crash all LAN consoles and miniature remote terminal units. The program
administrators are trained to bypass a bad data field and change the value
if such a problem occurs again."

"Sunk by Windows NT"
http://www.wired.com/news/technology/0,1282,13987,00.html
Contains advocacy paragraph:
"Why Windows NT Server 4.0 continues to exist in the enterprise would be a
topic appropriate for an investigative report in the field of psychology or
marketing, not an article on information technology," said John Kirch, a
networking consultant and Microsoft certified professional, in his white
paper, Microsoft Windows NT Server 4.0 versus Unix. "Technically, Windows NT
Server 4.0 is no match for any Unix operating system."

(The paper referred to is here: http://unix-vs-nt.org/kirch/)

The Scientific American article:
http://www.sciam.com/1998/1198issue/1198techbus2.html

"The controversy began when the USS Yorktown, a guided-missile cruiser that
was the first to be outfitted with Smart Ship technology, suffered a
widespread system failure off the coast of Virginia in September last year.
After a crew member mistakenly entered a zero into the data field of an
application, the computer system proceeded to divide another quantity by
that zero. The operation caused a buffer overflow, in which data leak from a
temporary storage space in memory, and the error eventually brought down the
ship's propulsion system. The result: the Yorktown was dead in the water for
more than two hours."

Application viewpoint:
'Others insist that NT was not the culprit. According to Lieutenant
Commander Roderick Fraser, who was the chief engineer on board the ship at
the time of the incident, the fault was with certain applications that were
developed by CAE Electronics in Leesburg, Va. As Harvey McKelvey, former
director of navy programs for CAE, admits, "If you want to put a stick in
anybody's eye, it should be in ours."...

'For now, the navy's official stance remains unchanged. "We are absolutely
committed to COTS ... and to the Windows NT operating system," insists
Captain Charles Hamilton, deputy for Fleet in the Program Executive Office
for Theater Surface Combatants.'

"The Yorktown Affair"
http://jerrypournelle.com/reports/jerryp/Yorktown.html

This article is the clincher. Here are some excerpts:
'For example, I'm sure that many of you have heard of the sad story of the
USS Yorktown. For those that haven't: The Yorktown has been completely
networked using Windows NT as part of the Navy's "SmartShip" program, to try
and reduce the manpower requirements for large warships. However, because of
an operator incorrectly entering a zero in an entry field, the system
crashed and somehow corrupted the central database-as a result, the ship was
dead in the water for two hours, and had to be towed back to port.'

(It appears these facts are not an exaggeration, but you'll have to read the
whole article to decide for yourself).

'I know that a lot of the interest in this specific incident is because an
NT system blew up and there are people who take great joy reading about NT
systems blowing up. I can guess what's behind that, of course, and I
certainly don't need to take up your time with that.'

Government News Extracts
http://bonehead.sedonageo.com/~vool/articles/gcn/980713-1.html

'Atlantic Fleet officials acknowledged that the Yorktown last September
experienced what they termed "an engineering local area network casualty,"
but denied that the ship's systems failure lasted as long as DiGiorgio said.
The Yorktown was dead in the water for about two hours and 45 minutes, fleet
officials said, and did not have to be towed in.'

'"Refining that is an ongoing process," Redman said. "Unix is a better
system for control of equipment and machinery, whereas NT is a better system
for the transfer of information and data. NT has never been fully refined
and there are times when we have had shutdowns that resulted from NT."'

'"Because of politics, some things are being forced on us that without
political pressure we might not do, like Windows NT," Redman said. "If it
were up to me I probably would not have used Windows NT in this particular
application. If we used Unix, we would have a system that has less of a
tendency to go down."

Although Unix is more reliable, Redman said, NT may become more reliable
with time.'

Proceedings of the US Naval Institute
http://www.usni.org/Proceedings/Articles98/PROjohns.htm

'Can a PC handle the load? For as-required strategic and nontactical
applications, a PC-based system may be adequate. For real-time tactical
applications the processing requirement is less likely to be met in the near
future. One issue is the suitability of software to meet the needed data and
security loads. Military weapon systems are enterprise systems that have
large numbers of users and require high levels of security and reliability.
At present, these capabilities are available in UNIX-based systems but not
in Windows 95 or Windows NT. In other words, the pure PC environment is not
ready for the demands of real-time tactical operations.'

GCN Editorial
http://206.144.247.86/archives/gcn/1998/november23/20.htm
'Few stories in GCN over the last year have produced as much reaction as our
coverage of the Smart Ship USS Yorktown. Back in July, GCN reporter Gregory
Slabodkin uncovered the fact that on at least one occasion, the systems
aboard the Navy's model fly-by-wire ship had crashed, leaving the vessel
partially disabled. It took two hours for the crew to reboot.'

---
After all that here are my observations:

1. We really can't tell whether the underlying NT OS did in fact crash, but
on the balance of probabilities I'd guess it eventually did and had to be at
least rebooted given the length of time it took to get the vessel
operational again.

2. Regardless, the application clearly did not contain enough error checking
and was primarily at fault.

3. If it contains substance, the issue raised of political pressure to use
NT is probably the most damning. There's little point in us arguing
technical merits if the decision to use NT on the USS Yorktown wasn't
primarily a technical decision.
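
To make observation 2 concrete, here is a minimal sketch of the kind of check
that appears to have been missing. It is only an illustration in Python: the
function names and messages are hypothetical, and nothing here is taken from
the actual Yorktown software.

```python
def read_field(raw: str) -> float:
    """Parse an operator entry and reject zero before it is stored.

    Hypothetical helper: assumes the field feeds a later division.
    """
    value = float(raw)
    if value == 0:
        raise ValueError("zero is not a valid entry for this field")
    return value


def safe_ratio(numerator: float, divisor: float) -> float:
    """Divide only after checking the divisor (a second line of defence)."""
    if divisor == 0:
        raise ValueError("divisor must be non-zero")
    return numerator / divisor
```

The point is simply that a bad entry is refused at the console instead of
propagating into a later calculation.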

Regards,
Adam



------------------------------

Date: Fri, 21 Jul 2000 10:28:53 +0200
From: Nico Coetzee <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
Crossposted-To: comp.os.ms-windows.nt.advocacy
Subject: Re: The Failure of the USS Yorktown

Adam Warner wrote:

> Hi all,
>
> I see the USS Yorktown is a very popular topic. Here's some quick online
> research:
> http://www.cs.clemson.edu/~steve/Spiro/stories/node75.html
>
> ---
> After all that here are my observations:
>
> 1. We really can't tell whether the underlying NT OS did in fact crash, but
> on the balance of probabilities I'd guess it eventually did and had to be at
> least rebooted given the length of time it took to get the vessel
> operational again.
>
> 2. Regardless, the application clearly did not contain enough error checking
> and was primarily at fault.
>
> 3. If it contains substance, the issue raised of political pressure to use
> NT is probably the most damning. There's little point in us arguing
> technical merits if the decision to use NT on the USS Yorktown wasn't
> primarily a technical decision.
>
> Regards,
> Adam

The question remains: is the US Government prepared to endanger lives and lose
millions of dollars of equipment in the event of a system failure in the heat
of combat?

I am not American, but I do respect human life and I think it is not fair at
this stage to put human life on the line with systems that are not proven combat
ready. I was in the army a long time. Our golden rule (with regard to
technology) was to always have backup systems, and always train just as hard on
"conventional" systems as on high tech systems.

I would like to add one of my experiences as an afterthought. On a military
exercise somewhere in Africa we had a very nasty problem. We were getting very
used to GPS systems. One practice mission was in an area full of iron mountains
(literally), so not even a normal compass worked. Naturally, on one particular
day (D day -1) the GPS systems could get one satellite at best. If it
weren't for our good training, we would not have been able to mark targets for
bombardments (artillery) or plan air strikes. We used very old (and maybe
forgotten) methods of navigation and chart plotting to do all calculations.
But, in the end, the job could be done.

Now, with that said, I do not say technology should not be used, but the
military must always train people to perform their functions as if technology
was not available. In the ship's example - what if a guided missile takes out
the server room? The personnel must still be able to continue with their tasks
in such a situation.

In the article, the Navy spokesperson mentioned that engineers were trained to
work around these problems in future. I hope he was telling the truth...

As far as the OS is concerned - My opinion is that military systems should be
developed in house.

Cheers,

Nico

--
==============
The following signature was created automatically under Linux:
. 
People's Action Rules:
        (1) Some people who can, shouldn't.
        (2) Some people who should, won't.
        (3) Some people who shouldn't, will.
        (4) Some people who can't, will try, regardless.
        (5) Some people who shouldn't, but try, will then blame others.




------------------------------

From: Arthur Frain <[EMAIL PROTECTED]>
Subject: Re: MS advert says Win98 13 times less reliable than W2k
Date: Fri, 21 Jul 2000 01:19:38 -0700

Steve Mading wrote:
> 
> Arthur Frain <[EMAIL PROTECTED]> wrote:
> : Steve Mading wrote:

> :> What the hell is "13 times less reliable" supposed to mean?  How do
> :> you attach numbers to a concept like "reliability"?
 
> : I'm not saying either MS or NSTL or the ad is using a
> : reasonable measure of reliability, but it is possible
> : to quantify it. "Reliability" is "performs to specs
> : over time" or some similar definition. You simply
> : measure the time between failures (MTBF - "mean time
> : between failures") or the reciprocal, the failure rate (FIT,
> : 'failures in time'). If you're really interested, see if
> : MIL-HDBK-217 is online.
 
> Yeah, but that is meaningless unless the inputs over that time run the
> gamut from one end of the scale to the other.  Otherwise you can be
> missing the conditions that cause the crashes.  Testing by throwing
> infinite monkeys at the problem doesn't work if the number of monkeys
> isn't really infinite - you end up missing large parts of the testspace.

Sorry - I didn't see your reply until today.
You're absolutely right - I've only done this
with semiconductors and systems, not with 
software, but the principles are about the 
same.

In the semi area, you worry about things like
"fault coverage" and "pattern sensitivities",
which is similar to what you're describing. A
lot of this is very difficult to quantify,
even when parts are designed for testability.
Semis and systems also have very clearly 
defined specs for every detail of operation
and they're all measurable. Semi failures
also tend to be catastrophic - they don't
work at all, not just that they run slower
or put out the wrong voltage or something else
parametric.

I've been out of that area for a long time,
but I'm sure it's advanced tremendously. I
can't imagine how you'd do it for software
without designing for it. To get the same
kind of coverage you can get on semis would
be horrendously complex. Which is one reason
MS's testing claims for W2K don't impress
me much.

OTOH, the MTBF's for semis are into the
100K to 1M+ hour range. I've had seven systems
running for over 2 years 24/7 (a lot of
device hours) and the only component failure
I've had (other than monitors, which is
another story) is a power supply wiped out
by a power surge. Even HD's have MTBF's
stated at 100K+ hrs - I haven't had one
fail since about 1990 (although
obviously I'm not running the 100KB
drives any more).
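
As a rough worked example of the arithmetic above, here is a short Python
sketch. It assumes a year of 8,760 hours, treats the power supply as the single
observed failure, and uses the usual convention that FIT means failures per
10^9 device-hours. The variable names are mine, and one failure is far too few
for a statistically meaningful estimate.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760; leap days ignored for simplicity

systems = 7
years = 2
failures = 1  # assumption: only the surge-damaged power supply is counted

# Total accumulated operating time across all systems.
device_hours = systems * years * HOURS_PER_YEAR

# Point estimate of mean time between failures, in hours.
observed_mtbf = device_hours / failures

# Failure rate expressed in FIT (failures per 1e9 device-hours).
observed_fit = failures / device_hours * 1e9


def mtbf_to_fit(mtbf_hours: float) -> float:
    """Convert an MTBF in hours to a FIT rate, assuming a constant failure rate."""
    return 1e9 / mtbf_hours
```

On these assumptions the seven machines have logged 122,640 device-hours, which
is the same order of magnitude as the 100K-hour figures quoted for hard drives.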

Arthur

------------------------------

From: "Stuart Fox" <[EMAIL PROTECTED]>
Crossposted-To: comp.os.ms-windows.nt.advocacy
Subject: Re: The Failure of the USS Yorktown
Date: Fri, 21 Jul 2000 09:30:56 +0100


"Adam Warner" <[EMAIL PROTECTED]> wrote in message
news:8l8vuh$i7g$[EMAIL PROTECTED]...
> Hi all,
>
> I see the USS Yorktown is a very popular topic. Here's some quick online
> research:
> http://www.cs.clemson.edu/~steve/Spiro/stories/node75.html
>
> The most important quotation in this link is from the official report:
>
> "The Yorktown lost control of its propulsion system because its computers
> were unable to divide by the number zero ... The Yorktown's Standard
> Monitoring Control System administrator entered zero into the data field for
> the Remote Data Base Manager program. That caused the database to overflow
> and crash all LAN consoles and miniature remote terminal units. The program
> administrators are trained to bypass a bad data field and change the value
> if such a problem occurs again."

This I think is the most telling, and raises a few questions.
1. How does an app failure cause all LAN consoles to crash?
2. Why not have the field set to NOT NULL so the administrators can't make
the mistake?
3. Why not have some error checking in the database - if the app is mission
critical, surely this is the least they can do?
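
A small sketch of points 2 and 3, using Python's built-in sqlite3 module. The
table and column names are hypothetical, and I have no idea what database the
ship actually ran. One caveat worth showing: NOT NULL on its own rejects a
missing value but still accepts a zero, so a CHECK constraint (or equivalent
validation) is what would actually block the bad entry.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables: one with NOT NULL only, one with an added CHECK.
conn.execute("CREATE TABLE plain (setting REAL NOT NULL)")
conn.execute(
    "CREATE TABLE checked (setting REAL NOT NULL CHECK (setting <> 0))"
)

# NOT NULL alone accepts a zero, because 0 is a value and not NULL.
conn.execute("INSERT INTO plain VALUES (0)")
plain_rows = conn.execute("SELECT COUNT(*) FROM plain").fetchone()[0]

# The CHECK constraint refuses the same entry at the database level.
try:
    conn.execute("INSERT INTO checked VALUES (0)")
    zero_rejected = False
except sqlite3.IntegrityError:
    zero_rejected = True

checked_rows = conn.execute("SELECT COUNT(*) FROM checked").fetchone()[0]
```

With the constraint in place the application gets an error it can report to
the operator, and the bad value never reaches the stored data.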
>
>
> 1. We really can't tell whether the underlying NT OS did in fact crash, but
> on the balance of probabilities I'd guess it eventually did and had to be at
> least rebooted given the length of time it took to get the vessel
> operational again.

But given that there was bad data in the database, rebooting would make no
difference - yes?  I wouldn't read that into it, especially since the reboot
would only take a few minutes max.  Unless they rebooted it a lot of
times...  :)

>
> 2. Regardless, the application clearly did not contain enough error checking
> and was primarily at fault.

Absolutely.
>
> 3. If it contains substance, the issue raised of political pressure to use
> NT is probably the most damning. There's little point in us arguing
> technical merits if the decision to use NT on the USS Yorktown wasn't
> primarily a technical decision.

I wouldn't have used NT to start with, and I like NT.  I wouldn't have used
Linux either, but probably one of the commercial unices.

Cheers

Stu



------------------------------

From: [EMAIL PROTECTED] (Donal K. Fellows)
Crossposted-To: comp.os.ms-windows.nt.advocacy
Subject: Re: Am I the only one that finds this just a little scary?
Date: 21 Jul 2000 08:32:37 GMT

In article <8l7teg$mep$[EMAIL PROTECTED]>,
Christopher Smith <[EMAIL PROTECTED]> wrote:
> It's trivial to prove that a user space application doing an x/0
> operation won't crash NT,

Is it?  OK, it is trivial enough to prove that there are some user
space applications that do not crash NT when they perform a divide by
zero, but extending that to all is non-trivial.  Suppose an x/0 caused
stack corruption (it could happen); this could then lead to a series
of pretty-much random system calls (believable) and demonstrating that
that sort of thing would cause no problems is not easy.  Especially if
the code is running with administrator privileges.  If, for example,
the bug caused dud packets to be put onto the ship's network, the
routers might have got confused and turned the servers into (the
network equivalent of) a black hole.  Stranger things have happened...
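
For what it is worth, the "some applications" half of that is easy to show. The
sketch below is Python, not NT-specific, and relies on Python raising a
language-level ZeroDivisionError instead of taking a hardware trap, so it is
only an existence proof of one well-behaved case: a child process divides by
zero and dies, and the parent carries on. It says nothing about the general
case argued above.

```python
import subprocess
import sys

# Run a separate interpreter that divides by zero and let it fail on its own.
result = subprocess.run(
    [sys.executable, "-c", "print(1 // 0)"],
    capture_output=True,
    text=True,
)

# The child exits with a non-zero status; this process is unaffected.
child_failed = result.returncode != 0
child_error = result.stderr
```

The interesting cases are the ones this does not cover: code that misbehaves
after the fault instead of exiting cleanly.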

Donal.
-- 
Donal K. Fellows    http://www.cs.man.ac.uk/~fellowsd/    [EMAIL PROTECTED]
-- I may seem more arrogant, but I think that's just because you didn't
   realize how arrogant I was before.  :^)
                           -- Jeffrey Hobbs <[EMAIL PROTECTED]>

------------------------------

From: [EMAIL PROTECTED] (Donal K. Fellows)
Crossposted-To: comp.os.ms-windows.nt.advocacy
Subject: Re: Am I the only one that finds this just a little scary?
Date: 21 Jul 2000 08:34:30 GMT

In article <8l8uo3$lnq$[EMAIL PROTECTED]>,
Stuart Fox <[EMAIL PROTECTED]> wrote:
> I find it hard to believe that a mission critical app is **that**
> poorly written.

I don't.  I'm obviously more cynical than you.

Donal.
-- 
Donal K. Fellows    http://www.cs.man.ac.uk/~fellowsd/    [EMAIL PROTECTED]
-- I may seem more arrogant, but I think that's just because you didn't
   realize how arrogant I was before.  :^)
                           -- Jeffrey Hobbs <[EMAIL PROTECTED]>

------------------------------

From: Russell Wallace <[EMAIL PROTECTED]>
Subject: Re: Some Windows weirdnesses...
Date: Fri, 21 Jul 2000 09:49:57 +0100
Reply-To: [EMAIL PROTECTED]

Tim Kelley wrote:
> The problem is if you start adding features it will be unreliable
> ... that's the point.  There is very little for FAT to keep track
> of so it does not have much data to corrupt.

So it does come down to reliability vs features after all?  Then give me
reliability :)

-- 
"To summarize the summary of the summary: people are a problem."
Russell Wallace
mailto:[EMAIL PROTECTED]
http://www.esatclear.ie/~rwallace

------------------------------

From: "Ferdinand V. Mendoza" <[EMAIL PROTECTED]>
Subject: Re: Some Windows weirdnesses...
Date: Wed, 19 Jul 2000 14:47:24 +0400



Arthur Sowers wrote:

>  You do need to do graceful
> shutdowns, have a UPS to prevent power failures from causing an
> ungraceful shutdown.

He-he-he... At least Mandrake 7.1 can tolerate this with reiserfs - if
you can't afford a UPS.


Ferdinand



------------------------------

From: "David Brown" <[EMAIL PROTECTED]>
Crossposted-To: alt.sad-people.microsoft.lovers,alt.destroy.microsoft
Subject: Re: Linux is blamed for users trolling-wish.
Date: Fri, 21 Jul 2000 11:26:54 +0200

This discussion has got way too long to continue as it stands - I suggest a
big snip, and a change of format before continuing.

<Snip>

I have no argument as to what MS has done - you can quote as many examples
of illegal or immoral behaviour as you like, and I will agree with every one.
MS has done very little real software development, has made few
"innovations", and produced very little software of high quality (it has
done some good stuff, but in proportion to its size, there is very little).
This will be considered fact by all but the most ardent of BG groupies, so
there is no need to discuss more details here.

Consider the following points:

1) When discussing where MS would be, we are really talking about where BG
would be.

2) BG is a megalomaniac.  He wants power and money, and has a much stronger
drive to achieve that ambition than most people.

3) He is very resourceful.  He is an excellent marketer and salesman.  He is
intelligent (I don't want to discuss how intelligent, but he is certainly
not stupid).  He understands the computing market and market forces.

4) He wants to get rich fast, and has no scruples.


In the real world, where nobody bothered about the illegal practices of MS
(although they were convicted at one point, the "sentence" was basically to
promise not to do it again, and even that was never enforced), the easiest
path to success was through crime.  Competition would only have slowed him
down, so it was eliminated in whatever way was most convenient at the time.

In our hypothetical world in which crime is not tolerated (including crime
by other companies, so that no one else could use the old MS's tactics),
this would not work.  But we can be reasonably sure that, given the
characteristics 2) to 4), BG would succeed at making a rich and powerful
company.  He might fail, but he would not give up easily.  Without the help
of crime, this company (MS, or whatever it were called) would not be in
nearly the same position as MS today, but it would still be a solid,
successful company.  It may even be several semi-independent companies.

So how would this new MS achieve its success?  It cannot break the law, and
the law is designed to promote competition and protect the consumers.  It
cannot force its software on people through illegal contracts - it must
compete for its market share.  There are many ways to do this, and you can
be sure that, given characteristic 4), BG will avoid competition where
possible.  But unless the laws are critically lacking in substance, there is
no doubt that the new MS would have to compete fairly in some markets
(whether it be competing directly for consumers, or for OEM support, or for
developer support).


Now, please let me know if there is anything you strongly disagree with in
points 1) to 4).  I think they are general enough to be considered valid
both in the real world and in the hypothetical law-abiding world.  Then
think about my conclusions, which are based almost solely on these facts.
Then re-read them and think about them some more.  Then write a reply.  Read
through that reply, delete all the details of what MS has done wrong and all
the "you're wrong because I'm always right" parts, and explain exactly what
is wrong with my reasoning.




------------------------------

From: "Christopher Smith" <[EMAIL PROTECTED]>
Crossposted-To: comp.os.ms-windows.nt.advocacy
Subject: Re: Am I the only one that finds this just a little scary?
Date: Fri, 21 Jul 2000 19:37:22 +1000


"Donal K. Fellows" <[EMAIL PROTECTED]> wrote in message
news:8l91r5$hmd$[EMAIL PROTECTED]...
> In article <8l7teg$mep$[EMAIL PROTECTED]>,
> Christopher Smith <[EMAIL PROTECTED]> wrote:
> > It's trivial to prove that a user space application doing an x/0
> > operation won't crash NT,
>
> Is it?  OK, it is trivial enough to prove that there are some user
> space applications that do not crash NT when they perform a divide by
> zero, but extending that to all is non-trivial.  Suppose an x/0 caused
> stack corruption (it could happen,) this could then lead to a series
> of pretty-much random system calls (believable) and demonstrating that
> that sort of thing would cause no problems is not easy.

How would a user space app cause stack corruption that would affect the
entire system, just out of interest ?

> Especially if
> the code is running with administrator privileges.

No.  Administrator is not the same as root.  It would have to be running in
kernel mode (ie a driver) to do the sorts of things you're describing (AFAIK,
anyway).

> If, for example,
> the bug caused dud packets to be put onto the ship's network, the
> routers might have got confused and turned the servers into (the
> network equivalent of) a black hole.  Stranger things have happened...

How is a *user space* app going to manipulate network packets ?



------------------------------


** FOR YOUR REFERENCE **

The service address, to which questions about the list itself and requests
to be added to or deleted from it should be directed, is:

    Internet: [EMAIL PROTECTED]

You can send mail to the entire list (and comp.os.linux.advocacy) via:

    Internet: [EMAIL PROTECTED]

Linux may be obtained via one of these FTP sites:
    ftp.funet.fi                                pub/Linux
    tsx-11.mit.edu                              pub/linux
    sunsite.unc.edu                             pub/Linux

End of Linux-Advocacy Digest
******************************
