Re: [osol-discuss] Re: How would "the ARC process" look at this discussion of KSH 88-vs-93?

Joerg Schilling Tue, 02 Aug 2005 03:24:20 -0700

[EMAIL PROTECTED] wrote:

> Quite; the particular case of ksh is a difficult one; while I don't
> think there are many people writing products based on ksh script
> (I'd certainly hope there aren't any), scripts are often written
> by system administrators and breakage is bad.


Similar to what Sun did with the tcp/ip code in the kernel, Sun should have
started to plan the needed migration 2 years ago. If ksh93 had been present on
Solaris 10 and ksh had been renamed to ksh88 (with a link to ksh), things would
be a lot easier.


> We've also got a number of rather convoluted ksh scripts ourselves
> (patchadd/patchrm, etc) which depend on possibly many of the intricacies
> of ksh.

If the main differences that may cause problems are local variables,
Sun could have added a method that allows the parser of ksh88 to complain
about the use of local vars.


> It's clear that the ksh developers wanted to move forward and make
> ksh POSIX compatible; it's also clear that in doing so they broke
> many scripts because scoping rules and other things changed.

Are you sure that it has been this way? Could it be that POSIX just adopted
ksh93 as it was newer?


> We also know that the list of incompatibilities posted on the ksh
> website is incomplete, but that's another matter.

And I did read the pdksh incompatibility list, which seems to be even longer
than the list for ksh93 vs ksh88. If nobody ever tried to find problems
in Sun-owned ksh scripts caused by such incompatibilities, it makes
no sense to talk about them seriously.

> We wanted this to happen for S10 but we were just too busy.

See above, I believe the need was as evident as it was for the tcp/ip stack
changes, because it has been known to be a cause of problems for OpenSolaris.


> Interfaces cannot be changed, they can only be extended.  Interface
> design is really hard and projects typically fail miserably at them.

This is an important statement, and the reason why I lost my faith in Linux
is the fact that the Linux kernel people constantly change interfaces and
are unable to have a discussion on interface definitions and interface
stability.

This is where Solaris has a big advantage over Linux and this is why I believe
OpenSolaris has a real chance and will win new users in the long term.


> Some times the interface manages to scrape through because it turns out
> to be extensible; sometimes it just needs to be tossed.

It is important to plan interfaces in a way that keeps them stable as long
as possible and in a way that makes them extensible once the limits of
stability have been reached.

Creating a new interface that has a chance of being stable for a long time
requires thinking about all possible aspects and defining it in a way that
gives it a chance for stability. At this point it seems that I should mention
that while the number of software engineers working on UNIX or UNIX-like OSes
did increase dramatically during the past 25 years, the number of excellent
people seems to be almost constant.

With the right methods for collaboration, there is a chance to slightly increase
the number of good enough people in our community. Just compare interface
quality between Linux and FreeBSD. FreeBSD is much better, and my impression
is that this is a result of the way the FreeBSD people deal with people that
are members of the inner circle.

To be able to do this, we first need to know about the skills
of the important people. We need to know "who worked on which parts"
and who designed which interface. We also need to have a place where people
talk more about technical issues.

Knowing the skills of people is important for a "circle of trust" that we
would need when discussing important changes to OpenSolaris. I personally
cannot judge the weight of an opinion from someone who is not fully
explicit in his statements if I don't know that person's level of
knowledge. Unfortunately, many people on this list are not verbose enough in
their texts when something is being discussed.

I personally know the skills of about 10-30 people from Sun, and for the
most part I have this knowledge not from this mailing list but from
personal meetings, phone calls and IRC chats.

For this reason, I strongly vote for an OpenSolaris developer meeting that
brings the important people from Sun together with the important people
from the OpenSolaris community. If the meeting is being held in Menlo Park,
we would need to bring 5-20 people from the community to MPK.

The purpose of this meeting should be to learn more about the skills of the
people in order to allow for shorter email conversations later. If I know
the skills of a person who just sends a mail "believe me...", it is easier
to know whether I should believe him or not. There are a lot of Sun employees
on this list that I don't know well enough for this kind of conversation
and I assume that the same may apply the other way round too.

I would like to know which people I should put into my personal list of people
that should belong to the "inner circle" of knowledge when having discussions
on important aspects of interfaces or similar things.



> In the past, interfaces have been changed and we are still regretting
> it.  Case in point signal().  Different in SV and BSD; changed in BSD for
> no good reason.  Programs written for BSD suddenly weren't portable to
> SV and vice versa.  This still causes issues.  And guess which issue caused
> most grief when porting stuff from SunOS 4.x to Solaris 2.x?

Signal() was changed by BSD around 1979 for good reason: the old AT&T
interface did not work correctly because it did not treat signals like
interrupts.
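
A minimal sketch of the difference, assuming a POSIX system with sigaction():
the historical AT&T signal() reset the handler to the default action when the
signal was delivered and did not block the signal while the handler ran,
whereas the BSD behavior keeps the handler installed and blocks the signal
during the handler, much like an interrupt. The names install_reliable(),
on_usr1() and demo_two_deliveries() are made up for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Counts how often the handler ran; sig_atomic_t is safe to touch in a handler. */
static volatile sig_atomic_t hits;

static void on_usr1(int sig)
{
	(void)sig;
	hits++;
}

/*
 * Install a handler with BSD-like semantics: it stays installed after
 * delivery (no SA_RESETHAND) and the signal is blocked while the handler
 * runs (no SA_NODEFER).
 */
static int install_reliable(int sig, void (*handler)(int))
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = handler;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_RESTART;
	return sigaction(sig, &sa, NULL);
}

/* Deliver SIGUSR1 twice and report how often the handler was called. */
static int demo_two_deliveries(void)
{
	hits = 0;
	if (install_reliable(SIGUSR1, on_usr1) != 0)
		return -1;
	raise(SIGUSR1);
	raise(SIGUSR1);
	return (int)hits;
}
```

With these semantics demo_two_deliveries() returns 2. With a handler that is
reset on delivery, the second raise() would hit the default action for
SIGUSR1 and terminate the process.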


> Other bad design include select(), sigvec().  Now select() just managed
> to scrape through because of the use of value/result parameters with
> pointers and pointers can point to an array of ints not just one.
> (But the interface was designed for 32 bit ints and processes with at
> most 20fds).
>
> Sigvec is broken because it limited the number of signals to 32 by
> defining the mask as an int.

Correct, BSD needed to introduce a new method but failed to make it
extensible. But please note that at that time, UNIX had 16 signals and
nobody thought we would ever need more than 32.


> Sigaction and ilk fixed that but they have an equal flaw: they required
> picking a new arbitrary limit for each implementation; Solaris picked
> 128 but that still means that we can't really arbitrarily add signals.

POSIX only requires:

        sigset_t Integer or structure type of an object used to represent
                 sets of signals. 

It is fully compliant to have something like:

        typedef struct {
                uint32_t        __nsigs;
                uint8_t         __sigbits[(NSIG+BITSPERBYTE-1)/BITSPERBYTE];
        } sigset32_t;

It would even be possible to extend the current interface by using something
like this:

        typedef struct {
                uint32_t        __o_sigbits[2];
                uint32_t        __nsigs;
                uint32_t        __filler;
                uint8_t         __sigbits[(NSIG+BITSPERBYTE-1)/BITSPERBYTE];
        } sigset32_t;

Of course, we must not already have more than 64 signals before the change
is applied.
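
A rough sketch of how such an extended set could be handled, not a proposal
for the real Solaris headers: the type xsigset_t, the limit XNSIG and the
functions xsig_empty(), xsig_add() and xsig_ismember() are hypothetical names
for this illustration. It assumes that the first two 32-bit words keep
mirroring signals 1..64 so that old binaries which only know the old layout
still see them.

```c
#include <limits.h>
#include <stdint.h>
#include <string.h>

#define XNSIG		256	/* hypothetical upper limit for this sketch */
#define XSIG_BYTES	((XNSIG + CHAR_BIT - 1) / CHAR_BIT)

typedef struct {
	uint32_t	o_sigbits[2];		/* old layout: signals 1..64 */
	uint32_t	nsigs;			/* number of signals the set holds */
	uint32_t	filler;
	uint8_t		sigbits[XSIG_BYTES];	/* new layout: one bit per signal */
} xsigset_t;

/* Clear the set and record how many signals it can represent. */
static void xsig_empty(xsigset_t *s)
{
	memset(s, 0, sizeof(*s));
	s->nsigs = XNSIG;
}

/* Add signal number sig (1-based); returns -1 if it is out of range. */
static int xsig_add(xsigset_t *s, unsigned sig)
{
	unsigned bit;

	if (sig == 0 || sig > s->nsigs)
		return -1;
	bit = sig - 1;
	s->sigbits[bit / CHAR_BIT] |= (uint8_t)(1u << (bit % CHAR_BIT));
	if (bit < 64)		/* keep the old words in sync */
		s->o_sigbits[bit / 32] |= (uint32_t)1 << (bit % 32);
	return 0;
}

/* Returns 1 if sig is in the set, 0 otherwise. */
static int xsig_ismember(const xsigset_t *s, unsigned sig)
{
	unsigned bit;

	if (sig == 0 || sig > s->nsigs)
		return 0;
	bit = sig - 1;
	return (s->sigbits[bit / CHAR_BIT] >> (bit % CHAR_BIT)) & 1;
}
```

Because the size is carried in nsigs, code that walks the set does not need
a compiled-in limit, which is the property the sigvec mask was missing.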

...

> It's hard to avoid making mistakes; I think we can name plenty we
> made for Solaris 2.0, some forced on us because there was already a SPARC
> reference implementation of SVr4; the most horrible limitation is the
> 8 bit filedescriptor field, referenced by some programs and binaries because
> of the old definition in <stdio.h>:
>
> #define       fileno(fp)      ((fp)->_file)
>
> We can't change that very well because, even though programs compiled in
> 2.6 and later will use the proper function, some programs are known to have
> used stuff like:
>
>       fileno(myfile) = fd;
>
> or just
>       myfile->_file = fd;
>
> or
>       fd = myfile->_file;
>
>
> the first two just don't work with fds > 255 and the code may break anyway;
> the latter is insidious and very common for programs compiled against 2.5.1
> and earlier headers.

This problem was introduced by people from AT&T who believed that they
would need to keep their old and broken stdio for binary compatibility with
SVr3. However, they forgot that SVr4 also introduced a new
loader format (SVr3 has COFF, SVr4 has ELF) and that binary compatibility
was only needed in the old COFF compatibility libraries but not for newly
compiled code.
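
The effect of the 8 bit field from the quoted text can be shown with a small
sketch. The struct old_file below is a made-up stand-in and not the real
Solaris FILE layout; only the macro form is taken from the old <stdio.h>
definition quoted above.

```c
/* Hypothetical stand-in for an old stdio FILE with an 8 bit descriptor field. */
struct old_file {
	int		_cnt;
	unsigned char	*_ptr;
	unsigned char	_flag;
	unsigned char	_file;	/* file descriptor, only 8 bits wide */
};

/* Same shape as the old macro: it exposes the field as an lvalue. */
#define old_fileno(fp)	((fp)->_file)

/*
 * Mimics code like "fileno(myfile) = fd;" and returns what is read back
 * afterwards. Descriptors above 255 are silently truncated.
 */
static int store_fd(struct old_file *fp, int fd)
{
	old_fileno(fp) = (unsigned char)fd;
	return old_fileno(fp);
}
```

Here store_fd() gives back 3 for fd 3, but 44 for fd 300 (300 modulo 256), so
a program built against such a header ends up using the wrong descriptor
without any error.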


> But the people writing wordprocessors really like to be able to ship a
> version compiled on 2.6 for that release and all later releases.
>
> Case in point is possibly the Windows OS: the OS itself is by and large
> backward compatible (as is Solaris) so you can run old and new applications
> on the newer releases but the GUI and the applications are not.
>
> But the Windows APIs contain many attempts at doing the same thing
> in different ways.  And they contain many DoFoo() and DoFooNewFangled()
> functions; you cannot change an interface but you can make a new interface
> which is a superset.
>
> And rather than having to have developers change their code from:
>
>       DoFoo(X, Y)
> to
>       DoFoo(X, Y, NULL, NULL)
>
> developers aren't required to change their code because you just keep
> DoFoo, possibly reimplemented as:
>
>       DoFoo(X, Y)
>       {
>               DoFooNewFangled(X, Y, NULL, NULL);
>       }

This is where the number of people that have the needed background knowledge
becomes really small if you want to discuss things. For this reason, it would
be important for external people from the community to know which people from
Sun have these skills, and of course Sun people would need to know the same
about people from outside Sun.

Jörg

-- 
 EMail:[EMAIL PROTECTED] (home) Jörg Schilling D-13353 Berlin
       [EMAIL PROTECTED]                (uni)  
       [EMAIL PROTECTED]        (work) Blog: http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
_______________________________________________
opensolaris-discuss mailing list
opensolaris-discuss@opensolaris.org
