On Jun 5, 2010, at 1:55 AM, David Kirkby wrote:
Many years ago I developed some software based on finite differences
to determine the impedance of an arbitrary shaped transmission line.
http://atlc.sourceforge.net/
That code is highly portable - it has been run by me on a Cray
supercomputer, and by someone else on a Sony Playstation! In between,
it built on just about every system I could get access to, which was
quite a lot.
I did not test that manually, but had a simple shell script which
copied the source code to multiple machines and built it on them in
parallel. I set up systems running multiple Linux distributions, as
well as IRIX, NetBSD, OpenBSD, FreeBSD, Solaris, AIX, HP-UX & Tru64. I
also used some Linux distros that others gave me access to. It was
not rocket science to do this - all I did was basically what is described below.
Consider 'server' the machine where I developed the software, and
client1, client2, client3 .. the clients on which I tested the
software. Assume the source code was source-x.y.z.tar.gz
1) Copy source-x.y.z.tar.gz via passwordless scp to client1, client2,
client3 ..
2) Copy a small build script to each client. I've long since lost the
script (or rather it would take me too long to find it on tape), but
basically it did the following (a rough sketch of the whole setup
appears after this list):
decompress the .tar.gz file
extract the tar file
change to the directory of source code
configure
make
make test
scp test.log u...@server:$version/$OperatingSystem/$client/$date/test.log
3) Run the build scripts on the multiple clients from the server via
ssh. (It is trivial to run a command on a remote system via ssh.)
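Something along these lines would do it. To be clear, this is only a rough
reconstruction of the sort of script described above, not the original: the
tarball name, client list, user name and report path are all placeholders,
and it assumes passwordless ssh/scp keys are already in place.

    #!/bin/sh
    # build-and-test.sh - run on each client once the tarball has been copied.
    # All names below (source-x.y.z.tar.gz, user@server, the report path)
    # are placeholders.
    VERSION=x.y.z
    OS=`uname -s`
    CLIENT=`hostname`
    DATE=`date +%Y-%m-%d`

    gunzip -c source-$VERSION.tar.gz | tar xf -   # decompress and extract
    cd source-$VERSION                            # change to the source directory
    ./configure
    make
    make test > test.log 2>&1                     # keep the log even if tests fail
    # copy the log back to the server, filed by version/OS/client/date
    # (assumes that directory already exists on the server)
    scp test.log user@server:$VERSION/$OS/$CLIENT/$DATE/test.log

and on the server, something like:

    #!/bin/sh
    # dispatch.sh - copy the tarball and build script to each client, then
    # start the builds in parallel over ssh.
    for host in client1 client2 client3; do
        scp source-x.y.z.tar.gz build-and-test.sh $host: && \
        ssh $host sh build-and-test.sh &
    done
    wait   # wait for all the remote builds to finish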
The only exceptions to this, which had to be done manually, were some
machines I used (mainly Itanium-based) on what was the HP Testdrive
farm. They insisted on the use of ftp/telnet, which I never bothered
automating.
This was 15 or so years ago, before things like buildbot or other
solutions existed.
For one individual to test their code on multiple machines is pretty
easy.
I believe we have something like this already, though it only pushes
to the build farm (which obviously isn't diverse enough).
You're right, it shouldn't be too hard to put something like this
together, it's just that no-one's sat down and done it yet.
If this simple system were applied to Sage, it would be complicated by
a few things.
1) You could only test your own code, not others' code, so interactions
between patches would not be easy to test. Those will only be
found when everyone's patches are integrated together.
But it seems to me a lot of what gets broken, and what release
managers have to deal with, is code that builds on platform X but not
on platform Y. It would make the release manager's job a lot easier if
people could test their code on multiple machines before submitting it
for review.
2) 't2' is so slow. The biggest problem is building ATLAS, but an
updated (beta) release of ATLAS apparently builds in < 1 hour on 't2',
which is still slow, but a lot quicker than the current process. (It
has tuning parameters for the T2+ processors, so it avoids the lengthy
tuning process.)
Fortunately, most patches only involve rebuilding a small number of
files (potentially just pure Python in the Sage library), so a full
rebuild would not be needed.
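For a library-only patch, the per-client round trip could be much shorter.
A minimal sketch, assuming the usual sage -b (rebuild the Sage library) and
sage -t (run doctests) commands and the Mercurial repository in
SAGE_ROOT/devel/sage; the patch name and doctested directory below are just
placeholders:

    # Apply a library-only patch and retest, without a full rebuild of Sage.
    cd $SAGE_ROOT/devel/sage
    hg import /path/to/trac_XXXX.patch   # placeholder patch name
    cd $SAGE_ROOT
    ./sage -b                            # rebuild only the changed parts of the library
    ./sage -t devel/sage/sage/rings/     # doctest the affected files/directories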
I urge William if possible to try to get some money for a decent SPARC
machine. 't2' is very very very slow. The skynet SPARC boxes are just
very slow. They are old machines running at 1.28 GHz. (The Blade 2500
was introduced in 2003, so they are 7 years old).
If Oracle would agree, selling or part-exchanging 't2' would be one
solution. I suspect 't2' would fetch nearly enough money to buy an
eight-core M3000, which would blow 't2' away for what we use it for.
('t2' is designed for a VERY different task to what we use it for).
If we had a pull, rather than push, build farm, then those with
hardware they cared about Sage working on could participate by
donating their cycles more easily. That's a more scalable solution as
well. Perhaps we could take advantage of http://gcc.gnu.org/wiki/CompileFarm
occasionally as well?
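As an illustration of the pull model, each volunteer machine could simply
poll for the latest release, build it, and send the results back. This is
purely a hypothetical sketch: the download URL and reporting address below
are made up, and a real setup would want some authentication and throttling.

    #!/bin/sh
    # Hypothetical pull-based volunteer client. URL and report address are
    # placeholders, not real endpoints.
    URL=http://example.org/sage/sage-latest.tar.gz
    wget -N $URL                        # only fetches if newer than our copy
    tar xzf sage-latest.tar.gz
    cd sage-*
    make 2>&1 | tee build.log           # build Sage from source
    make test 2>&1 | tee test.log       # run the doctest suite
    # report the results (placeholder address)
    mail -s "Sage build report: `uname -a`" [email protected] < test.log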
3) It would put a very high load on the machines.
But ultimately, a very simple system which copies a tar file to
multiple remote systems, then builds and tests it on each, is not hard
to write. Setting up a buildbot is not something I've done, and I
can't believe it's rocket science, but it's clearly a lot of work to
get it just right. It uses ssh, which is exactly the same mechanism I
used 15 years ago.
Part of the problem to me seems to be that there is not a central
repository for Sage. It would be good if as soon as a patch gets
positive review, everyone had immediate access to the source tree with
that patch integrated. That must include both standard packages and
the library.
How to deal with spkgs vs. the standard library is a sticky issue--
certainly the latter could be automated much more easily, and is where
most of the development takes place (though also much less likely to
be platform dependent).
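To make that concrete: if there were such a repository for the library,
testing against the current merged state could be as simple as the sketch
below. This is hypothetical - the URL is made up, and only the
SAGE_ROOT/devel/sage Mercurial layout and the sage -b command are taken
from how Sage works today.

    # Hypothetical: pull the tree with all positively-reviewed patches
    # already integrated, then rebuild just the library.
    cd $SAGE_ROOT/devel/sage
    hg pull -u http://example.org/sage-merged   # made-up URL for illustration
    cd $SAGE_ROOT
    ./sage -b                                   # rebuild the Sage library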
There are other things that I believe would be useful.
1) People should read about release management - there is plenty on the
web.
2) There is at least one commercial build system
http://www.viewtier.com/buy.htm for which free licenses are available
for open-source projects. (Personally, whilst I'd prefer open-source
solutions, we have to accept that sometimes commercial systems are
more appropriate.) But ultimately, it takes time to evaluate the
various candidate solutions, and narrow it down.
3) Try to fund a release manager, who comes from a software
engineering background, not a maths background.
4) People should read books on software engineering. Ultimately, Sage is an
engineered product, which IMHO should have software engineering skills
applied to it. Currently it appears to lack that to me.
5) Make it clear to students that developing software for their own
use is a very different thing from getting it into Sage for the wider
community. That means documenting it well and testing it on multiple
systems.
Personally, I don't believe Sage is tested enough, which often results
in tickets being created to fix bugs. Just have a read about how
sqlite (which is a component of Sage) is tested:
http://www.sqlite.org/testing.html
Perhaps if Sage patches had to go through the same rigorous process,
there would not be a need for as many bug-fixes. That would cut down
the amount of time the release manager needs to deal with patches. The
vast majority of patches are to fix defects. Let's pick 20 in order,
starting at 9000.
#9000 enhancement
#9001 defect
#9002 defect
#9003 defect
#9004 defect
#9005 enhancement
#9006 defect
#9007 gcc-bug->wontfix
#9008 defect
#9009 defect
#9010 enhancement
#9011 defect
#9012 defect
#9013 defect
#9014 enhancement
#9015 defect
#9016 defect
#9017 defect
#9018 defect
#9019 defect
#9020 enhancement
So 70% (based on that small sample) are patches to correct defects.
Some of those ticket numbers looked familiar, and indeed three of them
that I worked on should have been labeled enhancement. (Defect is the
default if one doesn't change it--it would probably be better to have a
blank default, so it's clear when something hasn't been filed
correctly.) Still way too many bugs.
Perhaps if there was more emphasis on testing (as there is in sqlite),
and less on getting one's own patches integrated quickly, there would
be fewer bugs to correct. That would dramatically reduce the workload
for the release managers, as there would be fewer tickets to merge.
It may be that the present large number of tickets is unsustainable
without a paid release manager. If that is the case, reducing the
quantity of patches, whilst increasing the quality, might be the
answer.
Before commenting, please make sure to read
http://www.sqlite.org/testing.html
- it is not very long.
I'm a huge fan of both more testing and more automation.
- Robert