On 08/27/10 12:59 PM, Tim Daly wrote:
tl;dr we need to raise the standards and "get it right".

On getting different answers...

Some algorithms in computational mathematics use random
values. Depending on the source of random values you might
get different but correct answers. Random algorithms can
be very much faster than deterministic ones and might even
be the only available algorithms.

In this case the answer I got was just plain wrong.

So the problem is not the result of a valid algorithm giving different, but still correct, answers each time.

In Axiom the test is marked as using a random value so a
test which does not produce the same result every time is
still considered valid.

But this is not the problem here. A root was not found; for what reason, we do not know.

I do not know if Sage uses such algorithms and I do not
know if that is the source of your failure. If not, then
as you point out, the result is quite troubling. This might
be hard to establish in Sage, as some of its computations pass
through several different subsystems, and who knows whether
they use random algorithms?

It's not very reproducible. I've only seen it once in seventy-odd runs.

I suspect it could be an issue with the actual testing framework.

On software engineering....

One needs to understand a problem to write correct software.
What is not understood will be where the problems arise.
Sage has the advantage of being written by domain experts
so the algorithms are likely of high quality. Unfortunately,
they rest on many layers of algorithms with many assumptions
(it is kittens all the way down).

There is, however, a difference between good mathematical skills and the ability to write good software. It is very clear to me that some software in Sage has been written by experts in their fields of mathematics who are nevertheless quite poor at writing software.

Most current systems, at least in this generation, have the
participation of some of the original authors so problems
can be understood and fixed. That will not be true in the
future (some of Axiom's many authors have died).

Though people may choose not to fix problems even if they are alive. People lose interest. I could imagine that if someone went to work for Wolfram Research, the company would not appreciate him/her maintaining software used by Sage.

In software engineering terms the question is, what happens
when the world expert in some area, such as Gilbert Baumslag
in Infinite Group Theory, writes a program to solve a problem
which later fails? The failure could be due to his lack of
understanding of underlying implementation details or it could
be because an upstream system has changed an assumption, as
is being discussed on another thread. That is, "code rot".

Without Gilbert, who can debug it? Do you throw the subsystem
away, as has been discussed with SYMPOW? If you do then you
have given up real expertise. If you don't then you can't trust
the results. It could be many years before another infinite
group theory AND computational mathematics expert can find and
fix the bug. Meanwhile, the code will continue to rot as more
things change under it.


Who can debug this in 30 years? ...

There are three potential attacks on this problem: documentation,
standard test suites, and program proofs.

Sage has been making doctests, which are useful for bringing
potential failures to attention. Other than providing simple
checks and possible user examples, they are useless for other
purposes.
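
For anyone who has not looked at one, a doctest is just an example
session embedded in a docstring; the doctest framework re-runs the
sage: lines and compares the printed output. A minimal sketch (the
function here is made up purely for illustration):

    def cube(n):
        r"""
        Return the cube of ``n``.

        EXAMPLES::

            sage: cube(3)
            27
            sage: cube(-2)
            -8
        """
        return n**3

So a doctest documents and spot-checks one call at a time, which is
exactly why it catches regressions but says little about whether the
underlying algorithm is right.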

Sage also has the ability to run the test suites provided by the developers of the constituent parts. So, for example, the Python, mpir, mpfr and other test suites can be run.

Unfortunately, most of the packages making up Sage do not have the required file to execute the tests, which is a real shame, as it often needs to consist of little more than

make test

or

make check

cliquer, docutils and many other programs in Sage have test suites of their own, but these currently cannot be run from within Sage.
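
Just to illustrate how thin the missing layer is, here is a rough
Python sketch of a wrapper that would run each upstream suite and
report which ones fail. The package names, commands and directory
layout below are illustrative guesses only, not Sage's actual build
layout:

    # Sketch only: the package list, commands and directory layout are
    # assumptions for illustration, not how Sage actually organises its build.
    import os
    import subprocess

    UPSTREAM_TESTS = {
        "cliquer":  "make test",
        "docutils": "make check",
        "mpfr":     "make check",
    }

    def run_upstream_tests(build_root):
        """Run each package's own test suite and collect the failures."""
        failures = []
        for pkg, command in UPSTREAM_TESTS.items():
            pkg_dir = os.path.join(build_root, pkg)
            print("Testing %s with '%s' ..." % (pkg, command))
            if subprocess.call(command, shell=True, cwd=pkg_dir) != 0:
                failures.append(pkg)
        return failures

    if __name__ == "__main__":
        failed = run_upstream_tests("/path/to/sage/build")
        print("Upstream suites that failed: %s" % (failed or "none"))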

*Documentation* should consist of a detailed discussion of
the algorithm. Since we are doing computational mathematics the
discussion has to have both a mathematical part and an implementation
part.

True. A lot of code does lack this.

Since "computational mathematics" is not (yet) a widespread
and recognized department of study at many U.S. universities,
the authors of today are likely weak in either the mathematics or
the implementation. Hopefully this will change in the future.

Maybe, though there may be cases where a person with only reasonable mathematics skills but good programming skills could work with someone who really understands the maths but not the computer implementation. It does not necessarily follow that both parts have to be written by the same person.

*Standard test suites* involve testing the results against
published reference books (Schaums, Luke, A&S, Kamke, etc.).
This can make the testing less ad hoc. These tests also allow
all of the other systems to publish their compliance results.

Yes, though those tests need a lot of work. From the links you provided about Rubi, there were basically four possible results:

1) Software produced the correct and simplest result.
2) Software produced a correct but overly complex result.
3) Software failed to compute an answer.
4) Software gave the wrong answer.

I would imagine that taking a large number of those tests and putting them into Sage would be a huge undertaking.
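
To make those four categories concrete, here is a rough sketch of how
a single test case might be graded automatically. It uses SymPy only
so that it runs stand-alone; the integrand and reference
antiderivative below are made up for illustration, whereas a real
harness would pull them from Rubi's tables or the reference books and
would have to drive each CAS in its own syntax:

    # Sketch: grade one indefinite-integration test case into the four
    # categories listed above.  Example data is for illustration only.
    from sympy import symbols, integrate, diff, simplify, count_ops, Integral

    x = symbols('x')

    def grade(integrand, candidate, reference):
        """Return 1, 2, 3 or 4 according to the categories above."""
        if candidate is None or candidate.has(Integral):
            return 3                # failed to compute an answer
        if simplify(diff(candidate, x) - integrand) != 0:
            return 4                # "antiderivative" does not differentiate back
        if count_ops(candidate) <= count_ops(reference):
            return 1                # correct and at least as simple as the reference
        return 2                    # correct but more complex than the reference

    integrand = x * (x**2 + 1)**5
    reference = (x**2 + 1)**6 / 12          # known closed form
    candidate = integrate(integrand, x)     # stands in for "the system under test"
    print(grade(integrand, candidate, reference))

Even with something like this, translating thousands of test cases
into each system's syntax is where the real work would be.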

It's a shame that all these systems (Mathematica, Maple, MATLAB, Sage, Axiom, etc.) have different syntaxes. If we were testing C compilers, they would all take the same code. With the maths software, the tests have to be written separately for each and every piece of software.

All software
engineers know that you can't write your own tests so this is a
good way to have a test suite and an excellent way to show users
that your system gets reasonable results.

It would be good engineering practice for the person writing the code not to be the one writing the test suite. In practice, that will be less easy with open-source software.

I honestly don't know how many Sage developers would actually have picked up a book on software engineering and read advice like "the person writing the code should not be the person writing the tests". My guess is not very many, which is not helped by the fact that such books tend to be quite expensive.

*Program proofs* are important but not yet well established in this
area. In prior centuries a "proof" in mathematics involved a lot
of "proof by authority" handwaving. In the 20th century the standard
of a mathematical proof was much more rigorous. In computational
mathematics, if we have proofs at all they are likely just some
pre-rigorous handwaving that ignores implementation details.
Worse yet, almost none of the systems so far have any kind of
theoretical scaffolding on which to hang a proof. If you don't
know what a Ring is and you don't have a strong definition of a
Ring underlying your implementation choices, how can you possibly
prove the code correct?
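
As a toy illustration of what such scaffolding could look like (this
is only a sketch of the idea, not how Sage or Axiom actually structure
their category frameworks), one can at least write the ring axioms
down explicitly and spot-check them on random elements:

    # Sketch only: an explicit statement of the ring axioms as a checkable
    # property, rather than an implicit assumption buried in the code.
    import random

    class IntegersModN(object):
        """The ring Z/nZ, written out as a set with +, * and constants."""
        def __init__(self, n):
            self.n = n
        def zero(self):       return 0
        def one(self):        return 1 % self.n
        def add(self, a, b):  return (a + b) % self.n
        def mul(self, a, b):  return (a * b) % self.n
        def neg(self, a):     return (-a) % self.n
        def random_element(self):
            return random.randrange(self.n)

    def check_ring_axioms(R, trials=100):
        """Spot-check the ring axioms on random elements."""
        for _ in range(trials):
            a, b, c = (R.random_element() for _ in range(3))
            assert R.add(a, b) == R.add(b, a)                      # + commutative
            assert R.add(R.add(a, b), c) == R.add(a, R.add(b, c))  # + associative
            assert R.add(a, R.zero()) == a                         # additive identity
            assert R.add(a, R.neg(a)) == R.zero()                  # additive inverse
            assert R.mul(R.mul(a, b), c) == R.mul(a, R.mul(b, c))  # * associative
            assert R.mul(a, R.one()) == R.mul(R.one(), a) == a     # multiplicative identity
            assert R.mul(a, R.add(b, c)) == R.add(R.mul(a, b), R.mul(a, c))  # left distributivity
            assert R.mul(R.add(a, b), c) == R.add(R.mul(a, c), R.mul(b, c))  # right distributivity

    check_ring_axioms(IntegersModN(12))

A random spot-check is obviously far short of a proof, but it makes
explicit exactly which properties the rest of the code is allowed to
rely on.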



Developing computational mathematics for the long term....

For *documentation*, I believe that we should require literate programs
which contain both the theory and the implementation. Each algorithm
should be of publication quality and peer-reviewed.
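
As a deliberately tiny illustration of the flavour (a real literate
program would be far more substantial, with the mathematics typeset
properly), theory and implementation might sit side by side like this:

    def fib(n):
        r"""
        Return the n-th Fibonacci number F(n), with F(0) = 0 and F(1) = 1.

        THEORY:

        The sequence satisfies the recurrence F(n) = F(n-1) + F(n-2).
        The loop below maintains the invariant (a, b) = (F(k), F(k+1))
        after k iterations, so after n iterations a = F(n).

        IMPLEMENTATION NOTES:

        Python integers have arbitrary precision, so unlike a C version
        of the same recurrence no overflow handling is needed.

        EXAMPLES::

            sage: fib(10)
            55
        """
        a, b = 0, 1
        for _ in range(n):
            a, b = b, a + b
        return a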

Sounds good as a long-term aim.

For *standard test suites*, I believe we should have these in many
different areas. We should collect a set of reference books that are
fairly comprehensive and jointly develop tests that run on all systems.
Rubi is doing this with rule based integration.
http://www.apmaths.uwo.ca/~arich/

I read that - see above comments.

Axiom is doing this with the Computer Algebra Test Suite
http://axiom-developer.org/axiom-website/CATS/index.html

If we publish and maintain the test suite results for all systems
there will be great pressure to conform to these results, great
pressure to develop standard algorithms that everyone uses, and
great pressure to improve areas of weakness.

But if a test suite were considered a "gold standard", then I doubt it would be too hard to add a bit of code to Sage which can do a particular integral in the test suite. Just adding code for the purpose of passing a test would be tempting. I'm sure that if Wolfram Research thought it was in their commercial interests, they could get Mathematica to pass all those Rubi tests.

I noticed that Mathematica appeared to do quite a bit better than Maple in those tests. That did not totally surprise me. You are probably aware of Vladimir Bondarenko, who is a somewhat strange character, but from some discussions I've had with him, he is of the opinion that Wolfram Research took quality control and bug reports more seriously than Maplesoft did. His way of reporting bugs was not exactly conventional, though.



Dave

