[sage-devel] Re: Random banter about Sage standards

Bill Hart Mon, 30 Aug 2010 18:37:49 -0700


On Aug 30, 8:51 pm, Robert Bradshaw <rober...@math.washington.edu>
wrote:
> On Sun, Aug 29, 2010 at 1:56 AM, Tim Daly <d...@axiom-developer.org> wrote:
> > tl;dr old curmudgeon flaming on about the dead past, not "getting it" about
> > Sage.
>
> > Robert Bradshaw wrote:
>
> >> In terms of the general rant, there are two points I'd like to make.
> >> The first is that there's a distinction between the Sage library
> >> itself and the many other spkgs we ship. By far the majority of your
> >> complaints have been about various arcane spkgs. Sage is a
> >> distribution, and we do try to keep quality up, but it's important to
> >> note that much of this software is not as directly under our control,
> >> and just because something isn't as good as it could be from a
> >> software engineering perspective doesn't mean that it won't be
> >> extremely useful to many people.  Even if it has bugs. We try to place
> >> the bar high for getting an spkg in but blaming the Sage community for
> >> poor coding practices in external code is a bit unfair. I hold the
> >> Sage library itself to a much higher standard.
>
> > The point that "software is not as directly under our control" is not really
> > valid.
>
> > This is a *design* decision of Sage, not a necessary evil. Sage
> > specifically chose to build on dozens of other pieces of software.
>
> I think reusing the large body of existing code is a necessary evil to
> achieve our goals. Unfortunately, I think it is also a huge source of
> problems, and we can all agree the code is of varying quality (some of
> it's great, some is not so much...).
>
> > Once you make spkg functionality part of Sage functionality, you own it.
>
> To fully "own" the code we need to either (1) fork (2) get the code
> fixed upstream or (3) re-write it entirely ourselves. We have taken
> all three of these routes in various cases, but all take a huge amount
> of duplicated effort.


I think the thing that is required is:

(4) carefully test the upstream functions that we use in Sage.
(5) review the upstream code, at least each function used in Sage.

I know it won't happen though, because there's too much of it to
review and too many functions to test. It's too late to fix this
problem.

Sage might never have gotten off the ground if this had been a
requirement. It would grind to a halt if this became one. So it will
never happen.

>
> Sometimes the route we currently take is "this code exists, let's try
> to make use of it" which has compromises. So there is the large number
> of spkgs, some of which cause many headaches, that few people work on,
> and then there is the core library which, though not without its
> issues, I feel is in better shape and where most of the work goes.

I think you mean that most of the *reviewing* work goes into the Sage
library code.

There's plenty of work going into the spkgs. That work just isn't
necessarily under the direct control of the Sage project.

How many lines of Sage python/cython code are there? About 1.5M or so?

GMP/MPIR and FLINT together get up towards 0.5M lines of code, and
that's just two of nearly 100 spkgs....

>
> > The statement that Sage tries "to place the bar high for getting an spkg
> > in" isn't actually much of a claim. I've watched the way spkgs get voted
> > onto the island and it usually involves a +1 by less than half a dozen
> > people. Would you really consider this to be placing "the bar high"? I'd
> > consider developing a test suite, or an API function-by-function code
> > review, or a line-by-line code review to be placing the bar high. At the
> > moment I see Sage writing test cases for python code but I don't see the
> > same test cases being pushed into the spkgs. Even where external test
> > cases are available (e.g. the computer algebra test suites for Schaums
> > and Kamke) I don't see them being run.
>
> You're right, the bar isn't high. My main point is that we are trying
> to raise it. It used to take almost nothing for an spkg to go in.

It's not really just about raising a bar. One part of an spkg may be
very solid and another part very poor. Without some finer control over
what is used by Sage, the problem will always exist.

As it is at the moment, spkgs get in based on needed functionality and
a cursory examination of a few vitals, not because of a systematic
review of all the code in them.

Of course, excluding packages that are not actively maintained might
be helpful. The trouble is that whatever rule you make, there'll
always be exceptions.

>
> > From a software engineering perspective there are some things that *are*
> > directly under Sage control such as the pexpect interfaces. How carefully
> > are these designed? Just yesterday I saw comments about the gensym
> > (question-mark variables) connections to Maxima not being handled. This
> > syntax is not a new Maxima feature so a pexpect interface design could have
> > taken this into account but it did not. Each pexpect interface should be
> > designed to be automatically constructed from the BNF of the underlying
> > spkg. This would eliminate the mismatch by design and be good software
> > engineering.
>
> > The conclusion that "blaming the Sage community for poor coding practices
> > in external code" is "a bit unfair" is not valid. While it is grossly
> > unfair to assume that spkgs are of poor quality, if your *design* calls
> > for using materials of "unknown quality" it seems that a very large
> > portion of your effort *must* involve quality reviews of spkgs. End users
> > just see Sage.
>
> Personally, I think there's a distinction between "the Sage community
> writes code of questionable quality" and "the Sage community uses code
> of questionable quality." Now I'm not saying that everyone here has
> excellent software development skills (which is far from the truth),
> but what I do find disingenuous are comments along the lines of
> "spkg x.y.z has compiler warnings, the Sage community doesn't know
> how to write good code."

You forget that numerous members of the Sage community write spkgs,
not just python. And the fact that C gives compiler warnings, instead
of leaving everything to runtime or giving no feedback at all, doesn't
make the python/cython code more solid. I am certain the python/cython
code in Sage is just as much to blame for the bugs in Sage.

>
> Would I like to see such issues fixed? Yes, for sure. But sometimes
> treating an spkg as a black box that does what you ask it to gets the
> job done. Hopefully over time the poorly-written or poorly maintained
> packages get fixed/replaced. (I see the spkg model staying with us for
> a long time, hopefully the average quality going up--there are a lot
> of solid ones.)
>

The other option would be to reimplement everything in cython, except
for a very small core which needs to be in assembly.

>
> > Still to come will be the "code rot" issue. Open source packages tend to
> > have a very small number of active contributors. Projects tend to stop
> > when those people drift away. Once a package is no longer maintained it
> > stops working due to a lot of factors such as incompatible standards like
> > python 3.0, operating system changes like include files, architecture
> > changes like parallel builds, loss of primary development platforms like
> > the implosion of open solaris, etc. Recent examples of this in Sage might
> > be the Atlas 64bit issue (architecture), the Sympow issue (author loss),
> > the loss of pointful effort due to the death of open solaris (platform
> > death), the python GIL issue on multicore (software architecture), the
> > rise of python 3.x (software standards change), etc.
>
> > Now that the wave of new spkg adoption has slowed I expect to see a growing
> > need for maintaining "upstream" code. By *design*, their problems are now
> > your problems. Who will debug a problem that exists in 500,000 lines of
> > upstream code? Who will understand the algorithms (e.g. sympow) written by
> > experts, some of whom are unique in the world, and debug them?
>
> > Writing new code is always fun. Maintaining old code you didn't write is
> > painful. But from an end-user perspective "it is all Sage" so all bugs are
> > "Sage bugs". That may seem unfair but the end-user won't know or care.
>
> > The belief that Sage will gradually rewrite the code pile it has (5 million
> > lines?) into higher quality seems odd. For comparison, Axiom is about 1
> > million things-of-code (lisp doesn't have "lines"). It took over 20 years
> > and over 40 million dollars of funding. Scaling linearly, Sage would take
> > 100 years and 200 million dollars to be rewritten into "all python".
> > Frankly, I think the spkgs are going to be around for a very long time.
>
> >> The second point is that much of the older, crufty code in Sage was
> >> put in at a time when standards were much lower, or even before there
> >> was a referee process at all.
>
> > When Axiom was written we were using Liskov's ideas directly from the
> > primary papers. I believe that we were the first system to dispatch not
> > only on the type of the arguments but also on the type of the return
> > (something that is still not common). But Axiom was developed as research
> > software, not with the intention of being brought to market as a product
> > (free or commercial). Sage is being developed with this intention.
>
> > Our choice of "standards" was to build on abstract algebra. There were a
> > great many debates about the right way to do things and we always went
> > back to the touchstone of what abstract algebra implied. At the time (40
> > years ago) there were no existing examples of computational mathematics
> > for many of the ideas so we had to invent them. Axiom set the standards
> > (e.g. integration) and they were quite high (Axiom still has the most
> > complete implementation). Sage has existing examples to guide it.
>
> > So at the time Sage was being developed there *were* standards in place.
> > You seem to feel that Sage was started "pre-standard" (2005?) and
> > "pre-referee" (ISSAC?).
>
> >> I think this was necessary for the
> >> time--Sage wouldn't have gotten off the ground if it couldn't have been
> >> useful so quickly. This includes in particular many of the spkgs that
> >> have been grandfathered in and wouldn't make the cut now, but it takes
> >> time to remove/replace/clean them up. Of course there's room for
> >> improvement, but do you think the current review process is
> >> insufficient and lots of new bad code is being written and included?
> >> If so, what should we do better?
>
> > I *do* feel that the current review process in Sage is insufficient (see
> > my earlier diatribe).
>
> > I see reviews of bug fixes but I don't see reviews of spkgs.
>
> Yes, spkgs are a problem.
>
> > We are now over 50 years into the development of computational
> > mathematics and Sage has the goal of competing with systems developed in
> > the 1970s/1980s, over 30 years ago. This would be a great thing if Sage
> > were to deeply document the algorithms, develop the standards, and/or
> > prove the code correct but I don't see anyone advocating any of these. I
> > don't see anyone advocating alternative ideas that would "raise the bar"
> > in computational mathematics.
>
> > Even in the area of education I don't see anyone hammering on the NSF to
> > fund more efforts in computational mathematics. I don't see pushback to
> > NIST to standardize the algorithms. Obama wants to bring science back to
> > life and encourage research. As the largest group of academics I would
> > wish that you would petition the funding sources. Even if all of the funds
> > went to Sage I'd still feel that this was worthwhile.
>
> > In short, I don't see *change*.
>
> If I understand you correctly, you want to set the goal for Sage much
> higher than just a free, open alternative to the Ma*s.
>
> - Robert

I don't think there is anything wrong with Tim's sentiments. He's
really just directing them at the wrong project. Sage doesn't have
those aims, nor do any of the Ma*s. He needs a project with those
specific aims that he can direct those sentiments towards.

Unfortunately, the number of people who will think it is sexy to work
on such a project is pretty small. So it really is a project that will
take 30 years. Who knows if it can keep pace with the development of
computers over that time.

Bill.

-- 
To post to this group, send an email to sage-devel@googlegroups.com
To unsubscribe from this group, send an email to 
sage-devel+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/sage-devel
URL: http://www.sagemath.org