>No it's not absolute rubbish - and describing it as such does a great
>dis-service to those talented engineers who have worked on it.  And to
>those that understand it.


Quite; the particular case of ksh is a difficult one; while I don't
think there are many people writing products based on ksh scripts
(I'd certainly hope there aren't any), scripts are often written
by system administrators and breakage is bad.

We've also got a number of rather convoluted ksh scripts ourselves
(patchadd/patchrm, etc) which depend on possibly many of the intricacies
of ksh.

It's clear that the ksh developers wanted to move forward and make
ksh POSIX compatible; it's also clear that in doing so they broke
many scripts because scoping rules and other things changed.

We also know that the list of incompatibilities posted on the ksh
website is incomplete, but that's another matter.

The risk of "people writing ksh based products taking their business
elsewhere" is minimal compared to the risk of breaking programs
as a result of replacing ksh 88 with ksh 93.  The people writing the
ksh scripts can easily work around the ksh issues; each platform
requires testing and such testing would easily reveal the shortcomings
of Solaris ksh.

Having said that, I think we should move forward swiftly on introducing
ksh93 in Solaris, which should certainly help those developing products.

We wanted this to happen for S10 but we were just too busy.

>No it's not "simply a decision not to change some aspect of the interface"!
>It's taking the time and making the extra effort to see if a technical
>solution can be arrived at without breaking backward compatibility.  As
>such, it forces a more disciplined approach to software engineering, rather
>than a cursory: "I don't like this interface - so I'll simply change it and
>then build my software on top of that change".  The latter is called
>hacking - or as I would describe it to most people - thinking with the
>keyboard, instead of thinking with your brain.

Interfaces cannot be changed, they can only be extended.  Interface
design is really hard and projects typically fail miserably at it.

Sometimes the interface manages to scrape through because it turns out
to be extensible; sometimes it just needs to be tossed.

In the past, interfaces have been changed and we are still regretting
it.  Case in point: signal().  Different in SV and BSD; changed in BSD for
no good reason.  Programs written for BSD suddenly weren't portable to
SV and vice versa.  This still causes issues.  And guess which issue caused
most grief when porting stuff from SunOS 4.x to Solaris 2.x?
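
A minimal sketch of the dance this forced on portable code (handler and
names made up): under SysV semantics the disposition is reset to SIG_DFL
the moment the handler runs, under BSD it stays installed, so handlers
had to reinstall themselves just in case:

        #include <signal.h>
        #include <unistd.h>

        static void
        on_sigint(int sig)
        {
                /* SysV resets the disposition; reinstall ourselves.
                 * Under BSD semantics this is a harmless no-op. */
                (void) signal(sig, on_sigint);
        }

        int
        main(void)
        {
                (void) signal(SIGINT, on_sigint);
                for (;;)
                        (void) pause();  /* returns after each caught signal */
        }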

Other bad designs include select() and sigvec().  Now select() just managed
to scrape through because of its use of value/result parameters passed by
pointer, and a pointer can point to an array of ints, not just one.
(But the interface was designed for 32-bit ints and processes with at
most 20 fds.)
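
To illustrate: fd_set these days is a bit array of FD_SETSIZE bits hidden
behind the FD_* macros, and because select() only ever sees pointers to it
the type could grow from the original int without touching the call itself
(descriptors >= FD_SETSIZE are of course still out of luck).  A throwaway
example, names made up:

        #include <sys/select.h>

        int
        wait_for_input(int fd)
        {
                fd_set readfds;         /* bit array, FD_SETSIZE bits */

                FD_ZERO(&readfds);
                FD_SET(fd, &readfds);
                if (select(fd + 1, &readfds, NULL, NULL, NULL) < 0)
                        return (-1);
                return (FD_ISSET(fd, &readfds) != 0);
        }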

Sigvec is broken because it limited the number of signals to 32 by
defining the mask as an int.

Sigaction and ilk fixed that, but they have a similar flaw: they require
picking a new arbitrary limit for each implementation; Solaris picked
128, but that still means we can't really add signals arbitrarily.
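
Compare: sigvec's sv_mask was literally an int, one bit per signal, while
sigaction hides the mask behind sigset_t and the sigsetops(3C) calls; the
object is still a fixed size (128 bits in our case).  A trivial example of
the newer style:

        #include <signal.h>

        void
        block_usr1(void)
        {
                sigset_t set;   /* opaque, but still a fixed-size object */

                (void) sigemptyset(&set);
                (void) sigaddset(&set, SIGUSR1);
                (void) sigprocmask(SIG_BLOCK, &set, NULL);
        }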

The open() system call has a "flags" argument which allowed it to be
extended from a 2-argument form to a 3-argument form (the 3rd argument is
only valid when O_CREAT is used).
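
A quick illustration (function and file names made up); the mode is simply
ignored unless O_CREAT is among the flags, so old callers never notice the
extension:

        #include <fcntl.h>

        void
        open_examples(void)
        {
                int fd1, fd2;

                /* old 2-argument form */
                fd1 = open("/etc/passwd", O_RDONLY);

                /* extended 3-argument form; the mode (0600) is only
                 * consulted because O_CREAT is present in the flags */
                fd2 = open("/tmp/example", O_WRONLY | O_CREAT, 0600);

                /* ... use fd1 and fd2 ... */
        }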

The socket interface is an example of an attempt at a generic interface
for multiple types of protocols and addresses, and while the kernel interface
is reasonable, all programs needed to be upgraded for IPv6,
and that is a bug.

Had there been a generic "name to address" translation sequence, things like
"telnet /tmp/.X11-unix/X0" would have come for free.  XTI/TLI did a number
of things better, but people generally prefer simpler interfaces.
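
getaddrinfo() later gave us a reasonable approximation of such a
translation step, at least for IP; code written against it didn't need to
change for IPv6, unlike the code that hard-coded AF_INET and struct
sockaddr_in.  A rough sketch (names made up, error handling minimal):

        #include <sys/socket.h>
        #include <netdb.h>
        #include <string.h>
        #include <unistd.h>

        int
        connect_to(const char *host, const char *service)
        {
                struct addrinfo hints, *res, *ai;
                int s = -1;

                (void) memset(&hints, 0, sizeof (hints));
                hints.ai_socktype = SOCK_STREAM;  /* any address family */

                if (getaddrinfo(host, service, &hints, &res) != 0)
                        return (-1);
                for (ai = res; ai != NULL; ai = ai->ai_next) {
                        s = socket(ai->ai_family, ai->ai_socktype,
                            ai->ai_protocol);
                        if (s != -1 &&
                            connect(s, ai->ai_addr, ai->ai_addrlen) == 0)
                                break;          /* connected */
                        if (s != -1)
                                (void) close(s);
                        s = -1;
                }
                freeaddrinfo(res);
                return (s);
        }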

It's hard to avoid making mistakes; I think we can name plenty we
made for Solaris 2.0, some forced on us because there was already a SPARC
reference implementation of SVr4; the most horrible limitation is the
8-bit file descriptor field, referenced by some programs and binaries because
of the old definition in <stdio.h>:

#define fileno(fp)      ((fp)->_file)

We can't change that very well because, even though programs compiled in
2.6 and later will use the proper function, some programs are known to have
used stuff like:

        fileno(myfile) = fd;

or just
        myfile->_file = fd;

or
        fd = myfile->_file;


The first two just don't work with fds > 255 and the code may break anyway;
the last is insidious and very common for programs compiled against 2.5.1
and earlier headers.

If we change the fd to be stored elsewhere, such programs would start doing
operations on the wrong fd or "fd % 256".
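
The effect is easy to show with a mock-up (the struct here is illustrative,
not the actual libc layout):

        #include <stdio.h>

        struct old_FILE {               /* mock-up for illustration */
                unsigned char _file;    /* 8 bits: descriptors 0..255 only */
        };

        int
        main(void)
        {
                struct old_FILE f;

                f._file = 260;                  /* stores 260 % 256 == 4 */
                (void) printf("%d\n", f._file); /* prints 4 */
                return (0);
        }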

We did a lot of things better in 64 bit Solaris.

Because binary compatibility is a core value for Solaris, we cannot change
this unless we find a way to do so safely.

We hope that "binary compatibility" and "development platform stability"
are the two core values for Solaris which will be seen as core values
for OpenSolaris as well.

But stability and binary compatibility do not stand in the way of progress
at all; all you need to do is define a new interface, preferably an
extensible one, which covers the newly discovered needs.

>> whether you want to develop a word processor or a web browser.
>
>Absolutely disagree.  And trivializing it like this demonstrates a complete
>lack of understanding.

I think this is not so much a lack of understanding as a different
outlook on the world.  What Roy says is true for web browsers and
word processors.  But it is not at all true for Operating Systems.
Stability of interfaces is important for the OS; but a word processor
can change its format and interface every release, and so they do.  And they
solve the compatibility problem by allowing old documents to be imported
and usually have some form of slightly broken "old style" export format.
So the market isn't really forced to upgrade all at once, but mostly does.

But the people writing word processors really like to be able to ship a
version compiled on 2.6 for that release and all later releases.

Case in point is possibly the Windows OS: the OS itself is by and large
backward compatible (as is Solaris), so you can run old and new applications
on the newer releases, but the GUI and the applications are not.

But the Windows APIs contain many attempts at doing the same thing
in different ways.  And they contain many DoFoo() and DoFooNewFangled()
functions; you cannot change an interface but you can make a new interface
which is a superset.

And rather than having to have developers change their code from:

        DoFoo(X, Y)
to
        DoFoo(X, Y, NULL, NULL)

developers aren't required to change their code because you just keep
DoFoo, possibly reimplemented as:

        DoFoo(X, Y)
        {
                DoFooNewFangled(X, Y, NULL, NULL);
        }



>- interface stability requires some extra effort.  There is no free lunch
>with software or with any other engineering discipline.

But the payback is enormous....

>- the extra effort required is not a significant barrier - it's more of a
>mindset change.

There's another issue which crosses into this territory: interfaces must
be unforgiving.

        fclose(badptr)                          should SIGSEGV
        x = malloc(10); x[10] = 1               should be caught
        "reserved, must be zero fields"         must be verified.

The reason is that once you allow additional bits to contain garbage
rather than 0s, you can never extend the interface by saying:
        - if 0
          then do it the old way
          else do it the new way
        
We noticed this when we went to UltraSPARC; our previous SPARC implementations
ignored certain reserved bits and some Ada compiler left trash in them.
UltraSPARC verified the reserved bits and gave a bad opcode trap.
So we had to work around this in the kernel by fixing the opcodes
as they trapped.
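
In API terms the rule looks like this (a made-up interface, just to show
the shape of the check):

        #include <errno.h>

        struct foo_args {
                int     flags;
                int     reserved[4];    /* documented: must be zero */
        };

        int
        do_foo(const struct foo_args *args)
        {
                int i;

                if (args->flags & ~0x3) {       /* only two flags defined */
                        errno = EINVAL;
                        return (-1);
                }
                for (i = 0; i < 4; i++) {
                        if (args->reserved[i] != 0) {
                                /* garbage here today would make it
                                 * impossible to give these bits a
                                 * meaning tomorrow */
                                errno = EINVAL;
                                return (-1);
                        }
                }
                /* ... do the actual work ... */
                return (0);
        }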

>> Discovering if an interface would be changed by an integration is
>> a technical problem.  An ARC review should certainly be looking for
>
>No - discovering "if an interface would be changed by an integration"
>implies a reactive, slap yourself on the head _mistake_.  The ARC process
>is designed to predict requirements and issues early in the technical
>software development process and assure that integration is smooth and
>almost a "no brainer".

The ARC process includes two deliverables:
        - interfaces used (to a point, not all the C/POSIX/etc stuff)
        - interfaces delivered (new, modified)

>"100 monkeys typing away at the interface" implies a complete hacker
>mentality to me and it the very antithesis of software engineering.
>Software development is an engineering discipline - and when treated as
>such, provides a stable, predictable and correctly behaving Operating
>System that the developer can deliver his/her applications on with complete
>confidence.  Or they can extend/expand the Operating System itself knowing
>that they are adding value for *every* user/developer in the community and
>that they are *not* a cause for developers to waste their valuable time
>recompiling/fixing software and discovering/fixing new and devious bugs.

Interface building is one of the hardest parts; and it was early in the
design process of the Solaris Privileges that we abandoned the Trusted
Solaris privilege set definition: a fixed bitset which can be defined
as follows:

        priv_set_t foo;

to one which needs to be allocated:

        priv_set_t *pfoo = priv_allocset();

The other challenge, of course, was to make seteuid(0) and seteuid(uid) behave
like before; and that is what makes the Solaris privilege implementation
different.
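
Because the size of priv_set_t never escapes into the application, we can
add privileges in a later release without breaking compiled code.  In use
that looks roughly like this (the privilege picked is just an example):

        #include <priv.h>

        int
        drop_fork(void)
        {
                priv_set_t *pset = priv_allocset();

                if (pset == NULL)
                        return (-1);
                priv_basicset(pset);            /* start from the basic set */
                (void) priv_delset(pset, PRIV_PROC_FORK);
                if (setppriv(PRIV_SET, PRIV_PERMITTED, pset) != 0) {
                        priv_freeset(pset);
                        return (-1);
                }
                priv_freeset(pset);
                return (0);
        }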

>Further I'll predict that any software developer who truly cares about
>their craft will benefit greatly by taking the time to understand the
>significant payback they'll accrue by putting in a little extra effort to
>maintain interface compatibility.  It's not an easy sell - but I guarantee
>to any developer that makes the effort, it'll make them a significantly
>better software engineer and provide them with skills that they can/will
>leverage throughout their career (assuming that they have chosen software
>development as their career).

It should be understood that the API is the hardest part of a library;
it's often treated almost as an afterthought.

Casper
