OK, I can't help but respond to this....

On 5/28/07, Richard Fateman <[EMAIL PROTECTED]> wrote:
> The FOI act request is not likely to work.
> Who is going to provide the information?  The NSF does not audit all grants
> down to the level where it could say if money in the budget was actually
> spent for a particular software product. You'd never get a response. If the
> NSF asked me, I wouldn't know, exactly. Some of my money probably went to a
> university or department license for software, but it came out of my grant
> as a monthly access fee.

I am not an accountant or lawyer, but I think
it can't hurt to try, depending on how much it costs us.
We would have to ask for something realistic, and the questions
you put forth above should help clarify what we ask for.
Perhaps we can ask for a rough estimate of how much software
money people actually ask for on awarded proposals, not how
much actually got spent.  My impression is that the NSF has
extremely high standards for tracking how they spend
their money. At the very least they could give an upper bound on how much
money they award that is allocated to a category that could
be spent on software.

> Much easier in principle would be to ask Maplesoft and WRI how many licenses
> were sold to academic researchers. Not all are funded by NSF or even US
> gov't, but you might get an idea.  Of course these companies would be
> entirely within their rights to refuse to provide such info, and I would
> expect them to do so.

This won't work, as you explain.  Also, the makers of Maple, Mathematica,
MATLAB, etc., are all privately held companies, so they are not required
to divulge their financial records to the general public.

> Now, the white paper problems...
>
> The FOI problem doesn't matter, I think, because the idea that the NSF
> should support open-source free software to compete with commercial software
> that does the same thing, is just a non-starter. Unless Bob Grafton is going
> to change things in a big way, and somehow convince peer reviewers,
> panelists, and other NSF program directors, that this change is good for
> science.

Maybe he is.  Maybe he isn't.   Is there some advantage for us by not
providing our perspective to him?  Even if what he does is an initial
drop in the bucket, it may make a difference.

I watched a documentary recently about open source.  Evidently the
open sourcing of Netscape was started primarily by a white paper
that circulated at Netscape, Inc., which eventually got the attention
of the top officers of the company.   Netscape's open sourcing of
their software was a major event in the 1990s
that had a ripple effect on the software industry.

> The NSF is supposed to fund research projects that advance the state of the
> art, not duplicate commercial software that might be used by other people.
> Maybe.

According to http://www.nsf.gov/about/, the NSF's mission is
"to promote the progress of science; to advance the national health,
prosperity, and welfare; to secure the national defense..."

I think open source software has the potential to contribute greatly
to the NSF mission to "promote the progress of science", since it makes
many specialized research-oriented programs easily and widely usable,
and the development of many of those programs is partially supported by the NSF.

The NSF website goes on to say "NSF's goals--discovery, learning,
research infrastructure and stewardship--provide an integrated
strategy to advance the frontiers of knowledge, cultivate a
world-class, broadly inclusive science and engineering workforce and
expand the scientific literacy of all citizens, build the nation's
research capability through investments in advanced instrumentation
and facilities, and support excellence in science and engineering
research and education through a capable and responsive organization."

The NSF mission thus goes beyond just funding research projects that
advance the state of the art.  They also fund infrastructure, education,
and projects that "expand the scientific literacy of all citizens".  I think
open source math software falls squarely within the scope of their mission
statement.

> The use of commercial software is rampant. Windows. Powerpoint. Excel.
> Oracle. Symantec, Cisco, Adobe.

The use of open source software is also rampant.  Firefox.
OpenOffice.  LaTeX.  PostgreSQL/MySQL, etc.

> To argue that open-source is verifiable, correct, extensible  is bogus.
> It is essentially impossible to verify a CAS even if it is open-source,
> because no one has the time or energy to attempt it. So it doesn't matter.
> Has anyone verified GCC?
> The argument that a software program must be proved correct in order to use
> it to prove a theorem is also bogus.

Agreed, but I think this misses the point somewhat.
The statement you attack is the analogue of the statement "a theorem must
be proved correct in order to use that theorem". Of course that argument
is bogus, since referees make mistakes as do authors of papers.
The basic scientific requirement in mathematics is that the argument
that claims that a theorem is correct must be published in a journal
that anybody can access.    This is the minimum requirement for
publishing a pure mathematics theorem today; correctness is not a
minimum requirement (though we try to get close) since actually formally
verifying the correctness of proofs is often too difficult.

Most working pure mathematicians will agree that the fact that proofs are
published alongside the statements of theorems has a major positive impact on how
mathematics research is carried out.  When you read the proof of a theorem instead of
just the statement, even if the proof is incomplete -- or even wrong
(!), the following happens:

   (1) you understand much more deeply what the theorem really asserts,
   (2) you are better equipped to prove a similar theorem,
   (3) you learn *techniques* that can be applied to proving new
theorems in related (and sometimes unrelated) areas.

Likewise, when using an algorithm in a computer algebra system, something
similar happens.  For example, consider computation of Bernoulli numbers.
I once tried computing them in Maple, Mathematica, Magma, GAP, and PARI.
I was surprised to find that PARI's Bernoulli command was 10,000 times
faster than any other system at the benchmark I was doing, and I wondered
why.  A student of mine then looked at the source code of PARI,
(1)-(3) above occurred, and the student and I generalized the
algorithm to compute "generalized Bernoulli numbers".
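For readers who have never looked inside a Bernoulli-number routine, here is a minimal sketch in Python of the classical recurrence; the function name is mine, and this naive O(n^2) method is nothing like the fast algorithm PARI actually uses.  It only illustrates the kind of readable source one can learn from:

```python
from fractions import Fraction
from math import comb  # Python 3.8+

def bernoulli_numbers(n):
    """Return [B_0, B_1, ..., B_n] as exact fractions, using the
    classical recurrence sum_{k=0}^{m} C(m+1, k) * B_k = 0 for m >= 1.
    (Convention: B_1 = -1/2.)  Naive O(n^2) method, for illustration only."""
    B = [Fraction(1)]  # B_0 = 1
    for m in range(1, n + 1):
        # Solve the recurrence for B_m in terms of B_0, ..., B_{m-1}.
        s = sum(comb(m + 1, k) * B[k] for k in range(m))
        B.append(-s / Fraction(m + 1))
    return B
```

For example, bernoulli_numbers(4) gives [1, -1/2, 1/6, 0, -1/30].  Even this toy version makes it obvious where the cost is (the inner sum of binomial terms), which is exactly the kind of observation that points (1)-(3) are about.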

In Mathematica, there is also a function to compute Bernoulli numbers.
In version 5.1 it was very slow.  In versions >= 5.2 it is very fast
(it takes about twice as long as PARI). If I used only Mathematica,
instead of (1)-(3), I would have simply wondered what magic they use
to compute Bernoulli numbers -- in fact, once I did wonder, and
somebody looking over my shoulder said "they make huge tables". Who's
to know?

In fact, the Mathematica documentation has something to say about this
(see 
http://reference.wolfram.com/mathematica/tutorial/WhyYouDoNotUsuallyNeedToKnowAboutInternals.html):

"Particularly in more advanced applications of Mathematica, it may
sometimes seem worthwhile to try to analyze internal algorithms in
order to predict which way of doing a given computation will be the
most efficient. And there are indeed occasionally major improvements
that you will be able to make in specific computations as a result of
such analyses.

But most often the analyses will not be worthwhile. For the internals
of Mathematica are quite complicated, and even given a basic
description of the algorithm used for a particular purpose, it is
usually extremely difficult to reach a reliable conclusion about how
the detailed implementation of this algorithm will actually behave in
particular circumstances."

In short, they acknowledge that if you could view the source you could
sometimes vastly speed things up, etc., but it "will not be worthwhile."
This is like saying "you can read the statement of my theorems
but not the proofs, because though you could sometimes greatly improve
my theorems, their proofs are too complicated for you to understand
and it wouldn't be worth your time."  If a pure research mathematician tried to
make such claims, the mathematical community would not take them
seriously.

> It is also more than a little
> disquieting that it is considered a good argument to fund computer
> programming research in order to prove some esoteric theorem's proof does
> not depend on a computer bug.  It suggests that somehow some (small)
> percentage of the tiny amount of money for pure math should be diverted to
> computer algebra researchers.  Instead we should be looking at diverting
> some percentage of the huge amount of money that is spent on scientific /
> engineering computing to CAS work.

The white paper is not supposed to be about esoteric
theorem-proving research.  It's about diverting money to CAS work.
This should be clear from the list of CAS's we suggest supporting.

> By the way, the meaning of a mathematical proof (even if helped by a
> computer) is a much subtler concept than portrayed in this white paper. See
> the interesting paper by deMillo, Lipton, Perlis  in CACM  (1977). And the
> many retorts, to it [which mostly are ineffective].  One quick point: many
> published theorems are false. You don't need a computer for that.

Agreed. See above.

> The total cost of ownership of free software is not 0.  It is the cost of
> downloading, installing, updating, registering, etc.  For specialty software
> (i.e. not so many users), there may be no expert available, free or even for
> pay, to fix bugs. The situation for Gnu Common Lisp and some
> long-outstanding bugs.
> If I place a cost on the time of a software engineer of $50/hour
> ("wholesale"), and the installation of free software on computers for all
> interested parties takes a few hours, it may not be worth it.

This is a tired argument
against open source software.

>  Compare that to the cost of buying a supported piece of software.

That software cost me $1000, and I still have to download, install,
update, register, etc., and in my experience the support often sucks
anyway.   Moreover, there is the massive hidden cost of becoming
dependent on that commercial software in order to carry out my research
and to share my work with others.   That hidden cost is both huge
and in some cases quite painful.

> Paying a student $1000 does not guarantee that anything useful will emerge.

True, there is no guarantee.

> In fact, it rarely will emerge.

Blatantly false in my experience.   I am repeatedly simply *amazed*
at what students can do given suitable motivation, peer review, workshops,
funding, etc.

> The question about ethics/law is entirely backwards.  Should the NSF pay
> some tax-exempt institution to hire a student to write a program to compete
> with a commercial, tax-paying, company?  Should the NSF be running companies
> out of business?

Should the NSF be keeping commercial math software companies in
business instead of supporting students to create
infrastructure that "expand[s] the scientific literacy of all citizens"?

> The NSF cannot reasonably look over everyone's shoulder and say, hm,
> couldn't you use free software XYZ instead of paying $1000 for commercial
> software.  Or $60 or $6?

Agreed.  We're not proposing that they do that.

> Now it is apparently the case that some
> governments have instituted a rule essentially requiring open-source for all
> government software.  This would be interesting, but I do not expect the US
> to do this.

We're not proposing they do that.

> Comments that SAGE is user friendly, extensible, well documented, are rather
> motherhood-and-apple-pie,  or false.   SAGE include Maxima, which is not, in
> my opinion, well documented, overall.

(1) We do not claim in the white paper that SAGE is any of those things.
In section 3, we list those among the guiding principles for SAGE
development, i.e., as goals that we aim for. In fact, I wrote that list
of goals over 2 years ago, and they have helped clarify our focus as
the developer base has grown.

(2) Your argument that the quality of SAGE's documentation is at best
that of Maxima's is incorrect.   SAGE uses Maxima behind the scenes,
just like programs use standard libraries behind the scenes.  All the
documentation that an end user views when working with SAGE is new
documentation written by us for new functions or functions that provide
a clean front end to functions provided by Maxima or other systems. Often
we try to improve on the documentation of the other systems
in writing our documentation.

> If Grafton wants to fund CAS research, I think he means to get a collection
> of research problems that researchers could attack that --along the way--
> would augment the power of a CAS.  Looking at the recently published ISSAC
> 2007 program, there is almost nothing that "researchers" are doing that
> would affect CAS performance or capabilities that the ISSAC program
> committee has accepted. Look at the WRI library or the Mapleprime library
> for things that strike someone's fancy.  More often than not, the
> implementation in these places is amateurish and could be the target for
> research.

That is one model for getting support for CAS research.  It doesn't
make sense to say that one must argue either for open source CAS's
*or* for research problems that just might help open source CAS's.
Why not argue for both?  That's what I do.

> (I'll be traveling for 2 weeks, so I am shooting this off fast, and maybe it
> is ill-considered.  I may not be listening for a while.. )
>
> RJF

Thank you for your counterarguments.

-- 
William Stein
Associate Professor of Mathematics
University of Washington
http://www.williamstein.org

--~--~---------~--~----~------------~-------~--~----~
To post to this group, send email to sage-devel@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sage-devel
URLs: http://sage.scipy.org/sage/ and http://modular.math.washington.edu/sage/
-~----------~----~----~----~------~----~------~--~---
