Re: [sage-devel] Re: Sage & Pari
> On Jul 27, 11:58 am, John Cremona wrote: > > My innocent posting to sage-devel on this subject two days ago seems > > to have done something to increase the number of postings this month, > > even if a lot of the discussion has been much wider than the orginal > > title (Sage & Pari) suggests! Well, credit is "the coin of the realm" in open source so it is important to make sure that those who work get recognized ("paid"). This is especially sensitive in an academic environment. If you spend a year writing software but don't publish then the tenure committee won't see any of that work. I managed to widen ("hijack") this thread to the larger topic of computational math and literate programming. Apologies for that. Tim Daly -- To post to this group, send an email to sage-devel@googlegroups.com To unsubscribe from this group, send an email to sage-devel+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/sage-devel URL: http://www.sagemath.org
[sage-devel] Re: Sage & Pari
I cited Pari in a non-peer reviewed article today. Bill. On Jul 27, 11:58 am, John Cremona wrote: > My innocent posting to sage-devel on this subject two days ago seems > to have done something to increase the number of postings this month, > even if a lot of the discussion has been much wider than the orginal > title (Sage & Pari) suggests! > > John -- To post to this group, send an email to sage-devel@googlegroups.com To unsubscribe from this group, send an email to sage-devel+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/sage-devel URL: http://www.sagemath.org
Re: [sage-devel] Re: Sage & Pari
My innocent posting to sage-devel on this subject two days ago seems to have done something to increase the number of postings this month, even if a lot of the discussion has been much wider than the orginal title (Sage & Pari) suggests! John -- To post to this group, send an email to sage-devel@googlegroups.com To unsubscribe from this group, send an email to sage-devel+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/sage-devel URL: http://www.sagemath.org
[sage-devel] Re: Sage & Pari
Of course if you code is so dependent on CPU cycles then it is a good candidate for assembly optimisation. After superoprimisation you might hit the theoretical retirement rate of muops in which case you can actually say something about how fast it will run. Integer arithmetic is one such case. In cases where you are entirely cache dependent you may be able to set a cutoff based entirely on the sizes of the caches on the machine. But yes, in general it is an understatement. You are right. Bill. On Jul 27, 11:10 am, Volker Braun wrote: > On Wednesday, July 27, 2011 10:33:57 AM UTC+1, Bill Hart wrote: > > > It's rarely possible to fully > > explain a cutoff without reference to the architecture, e.g. > > Instruction latency, caching, etc > > Thats an understatement. The current state of CPUs makes it factually > impossible to precisely quantify the execution time of a particular piece of > code without running it. For starters, there are multiple cache levels of > varying size. The compiler will reorder your C/C++ program when generating > x86 (say) assembly, and the CPU will again reorder the x86 operands when it > decodes them into mu-ops. And thanks to speculative execution sometimes > mu-ops are effectively processed in parallel (depending on their order). For > all practical purposes its impossible to say how many clock cycles are used > to evaluate some C/C++ code. -- To post to this group, send an email to sage-devel@googlegroups.com To unsubscribe from this group, send an email to sage-devel+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/sage-devel URL: http://www.sagemath.org
[sage-devel] Re: Sage & Pari
On Wednesday, July 27, 2011 10:33:57 AM UTC+1, Bill Hart wrote: > > It's rarely possible to fully > explain a cutoff without reference to the architecture, e.g. > Instruction latency, caching, etc Thats an understatement. The current state of CPUs makes it factually impossible to precisely quantify the execution time of a particular piece of code without running it. For starters, there are multiple cache levels of varying size. The compiler will reorder your C/C++ program when generating x86 (say) assembly, and the CPU will again reorder the x86 operands when it decodes them into mu-ops. And thanks to speculative execution sometimes mu-ops are effectively processed in parallel (depending on their order). For all practical purposes its impossible to say how many clock cycles are used to evaluate some C/C++ code. -- To post to this group, send an email to sage-devel@googlegroups.com To unsubscribe from this group, send an email to sage-devel+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/sage-devel URL: http://www.sagemath.org
[sage-devel] Re: Sage & Pari
Hi Tim, Let me first say that I am a little embarrassed that you downloaded Flint as a possible example of literate code. Reading my original post I did seem to give that impression. Actually I was merely mentioning it as an example of a documented system, not necessarily literate as such. I didn't personally implement the Bell numbers code. The 7000 is almost certainly a tuning cutoff which works well for "Fredrik's laptop" which probably explains why it is not documented. At this point there is no general tuning framework in flint. When we do implement that this number will simply be autogenerated for the architecture on which Flint is built. It's rarely possible to fully explain a cutoff without reference to the architecture, e.g. Instruction latency, caching, etc. But sometimes the cutoffs defy explanation even when this information is available. It is useful when a formula is found which can be used with only a few inputs such as cache size and relative latencies as tuning can then often be done without running timing experiments at build time, which is time consuming. I applaud your efforts with Clojure and look forward to seeing the results. This is an important language which has received a lot of attention. In order to answer your other questions I will need to be at my laptop, which I will not be for about a day. I'll try to reply in more detail then. I think it would be very interesting to have a Flint example of literate programming. I'd be happy to provide some details (or at least ask Fredrik if he might help). Again with regard to the Sage library I wasn't directly claiming it was literate in the sense you mean it. Merely that it is sometimes a reasonable approximation. I'd be happy to look for some good examples of this, though many of the Python files will do. It probably only takes a few minutes to find an example of Sage best practice. Bill. On Jul 27, 7:32 am, daly wrote: > > > Again I have to credit Sebastian Pancratz and Fredrik Johansson here > > > for raising the standard in flint. I thought I had been producing > > > beautiful code until Sebastian started rewriting some of it for > > > me. :-) > > I downloaded Flint and looked at the source code and documentation. > First, I applaud you on the PDF. It looks very nice and it includes > quite a bit of information. I especially liked the hyperlinks to > Wolfram's site and the fact that you included a bibliography. > The inline mathematics is very useful (a TeX advantage). > > As an experiment I tried to understand "Bell Numbers", of which I > know nothing. Since the algorithm for "bell_number_vec_recursive" > is small I can see what is being computed. However, I also see that > "bell_number_vec" contains a magic number 7000 as the defining > value to split the computation. One advantage of a literate form > of the documentation would be the obvious need to explain this > number. The pdf simply says that it "Automatically switches..." > but not when or why. > > The hard part of a fully literate document is the need to explain > the special "tricks" that make an algorithm fast. Mathematically > they might not matter but computational math hinges on the details. > Many an algorithm in Sage has been carefully tuned to get speed > and it is this careful tuning that makes the algorithm worthwhile. > It also makes it obscure. But this tuning includes crucial knowledge > that cannot be reverse-engineered and is only known to the author. > Eventually the author dies, as has happened with several Axiom authors. > > If a future maintainer needed to port Flint to run on the ARM > or the Apple5 or the GPU/CPU/APU setup what are the critical things > to optimize? Is 7000 machine independent? Knowing these details is > the difference between a classroom algorithm and a world class one. > If I write Bell Numbers in Axiom, does 7000 matter? > > The reason to favor a fully-embedded literate program is that you > can't ignore or hide any implementation detail. The magic number > 7000 would cry out for an explanation. Keeping the code in one > file and the documentation in another file makes it trivial to > skip details. A peer review of a literate program would certainly > notice that this was skipped. > > I am tempted to make a literate paper about bell numbers that > is built on your code but I'd obviously make a fool of myself > in front of a crowd of number theorists :-) Never the less, > a small "pamphlet" on the implementation of Bell Numbers would > make a good starting example to critique. I would be willing > to provide the tools and techniques if you'd be willing to > provide the explanations. Feel free to decline. > > Tim Daly -- To post to this group, send an email to sage-devel@googlegroups.com To unsubscribe from this group, send an email to sage-devel+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/sage-devel URL: http://www.sagemath.org
Re: [sage-devel] Re: Sage & Pari
> > Again I have to credit Sebastian Pancratz and Fredrik Johansson here > > for raising the standard in flint. I thought I had been producing > > beautiful code until Sebastian started rewriting some of it for > > me. :-) I downloaded Flint and looked at the source code and documentation. First, I applaud you on the PDF. It looks very nice and it includes quite a bit of information. I especially liked the hyperlinks to Wolfram's site and the fact that you included a bibliography. The inline mathematics is very useful (a TeX advantage). As an experiment I tried to understand "Bell Numbers", of which I know nothing. Since the algorithm for "bell_number_vec_recursive" is small I can see what is being computed. However, I also see that "bell_number_vec" contains a magic number 7000 as the defining value to split the computation. One advantage of a literate form of the documentation would be the obvious need to explain this number. The pdf simply says that it "Automatically switches..." but not when or why. The hard part of a fully literate document is the need to explain the special "tricks" that make an algorithm fast. Mathematically they might not matter but computational math hinges on the details. Many an algorithm in Sage has been carefully tuned to get speed and it is this careful tuning that makes the algorithm worthwhile. It also makes it obscure. But this tuning includes crucial knowledge that cannot be reverse-engineered and is only known to the author. Eventually the author dies, as has happened with several Axiom authors. If a future maintainer needed to port Flint to run on the ARM or the Apple5 or the GPU/CPU/APU setup what are the critical things to optimize? Is 7000 machine independent? Knowing these details is the difference between a classroom algorithm and a world class one. If I write Bell Numbers in Axiom, does 7000 matter? The reason to favor a fully-embedded literate program is that you can't ignore or hide any implementation detail. The magic number 7000 would cry out for an explanation. Keeping the code in one file and the documentation in another file makes it trivial to skip details. A peer review of a literate program would certainly notice that this was skipped. I am tempted to make a literate paper about bell numbers that is built on your code but I'd obviously make a fool of myself in front of a crowd of number theorists :-) Never the less, a small "pamphlet" on the implementation of Bell Numbers would make a good starting example to critique. I would be willing to provide the tools and techniques if you'd be willing to provide the explanations. Feel free to decline. Tim Daly -- To post to this group, send an email to sage-devel@googlegroups.com To unsubscribe from this group, send an email to sage-devel+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/sage-devel URL: http://www.sagemath.org
Re: [sage-devel] Re: Sage & Pari
...[snip]... > > In regular mathematics it is standard practice to fully document, that > > is, show the complete proof of your findings. In computational maths > > we do not do this. There is no particular reason why we don't except > > that it is not the expected behavior. > > Unfortunately this is rarely true these days. In regular mathematics > it is now very common for there to be large leaps between steps. > Usually the more advanced the mathematician the less detail that is > expected. Graduate students spend a lot of time filling it in. > > But your point regarding computational mathematics is spot on. It is > very common to find code completely undocumented in any way. There is > a substantial disparity in attitudes here. What concerns me is the long run. We are at the very beginning of a new science of computational mathematics. The early systems are the "newton's notebooks" of this science. Most of them are locked up in companies (MMA, Maple, Derive, Macsyma, etc.). But companies die and take their software with them, for example, Macsyma. (For company deaths see the last few minutes of: http://www.ted.com/talks/geoffrey_west_the_surprising_math_of_cities_and_corporations.html If we are to have a firm foundation for computational mathematics it is necessary that we begin to build the citation trail of algorithms available in a public, published way. Why are we wasting time and effort reinventing algorithms? Why can't I find executable algorithms along with their theory? Why can't I attend a talk, download the literate paper, and execute the algorithm while the talk is ongoing? > > When I began my massive rewrite of flint from scratch I decided that > each function would have a complete description and justification of > the algorithm (if not well-known) in an associated text file, with > references and proofs if necessary. Sebastian Pancratz came along and > saw these text files and decided to write a parser that automatically > turned them into pdf documentation! Excellent! Although I'd much rather follow Knuth and have a literate document I find it outstanding that you have created these. Can you point to a URL as an example? > > > > > We could raise the bar of expected behavior by having algorithms fully > > explained in the source code along with citations of theory sources. We > > could include a section on limitations, boundary conditions, examples, > > assumptions, and test cases. This is not as challenging as it sounds > > if it is done by the original author. It is extremely hard to recover > > or discover this information after the fact. > > To a good extent this is done in most of the Sage library. I agree > this should be the expected standard. I am unaware of this, which is clearly my fault. Can you point to an example you consider literate? > > Again I have to credit Sebastian Pancratz and Fredrik Johansson here > for raising the standard in flint. I thought I had been producing > beautiful code until Sebastian started rewriting some of it for > me. :-) > > > > > Knuth proposed the idea of "literate programs" which combine the > > actual source code with the human language text as a single document. > > The document re-arranges the source code for ease of explanation and > > includes the usual paper sections on background, theory, along with > > a bibliography citing prior (literate) work. (See Queinnec, Christian > > ``Lisp in Small Pieces'' ISBN 0-521-56247-3 for a literate example) > > My favourite example of this is Jonesforth. I highly recommend anyone > who has not read this program to do so immediately. It is one of the > most beautiful documents you can obtain for free. And you get yourself > a Forth implementation at no extra cost. All page references to Jonesforth seem to time out. I am writing a similar document for the Clojure language. The literate document contains a full implementation of Clojure. Unpacking the document creates a Clojure REPL and a PDF of the documentation (well, as far as it has gotten at this point). source: http://daly.literatesoftware.com/clojure.pamphlet pdf : http://daly.literatesoftware.com/clojure.pdf > > I don't personally do literate programming in the sense you intend it, > with rearranged code so that it reads more naturally. But for very > complex and touchy pieces of critical code I sometimes add formal > justifications for every line of code. A good example of this is the > bitpacking code in flint. > > > > > We could take small steps in this direction, possibly by hosting a > > "literate program" track at conferences or workshops to encourage the > > more literate among us to show examples and build up the technology. > > I could certainly imagine a talk being constructed around an actual > program. I might try this one day and see how it goes. I would like to see Sage have some small step in this direction but I have no idea where to start. I know next to nothing about number theory so I couldn't begin to rev
[sage-devel] Re: Sage & Pari
On Jul 26, 7:02 pm, daly wrote: > On Tue, 2011-07-26 at 08:30 -0700, Bill Hart wrote: > To paraphrase your above comment, I believe that we need to raise the > standards of computational mathematics further toward being "a > respectable sport". > > Consider your observation: > "Of course properly citing Pari and the algorithm used and checking > under what conditions the result is meaningful, etc, assumes that > these things are well documented and accessible. So, a positive step > the Pari developers could take is to make it easy to read the source > code for a given function in Pari, much as Sage has done for Sage > library code, and to document the algorithms used and where they are > published and what their restrictions are." > > In regular mathematics it is standard practice to fully document, that > is, show the complete proof of your findings. In computational maths > we do not do this. There is no particular reason why we don't except > that it is not the expected behavior. Unfortunately this is rarely true these days. In regular mathematics it is now very common for there to be large leaps between steps. Usually the more advanced the mathematician the less detail that is expected. Graduate students spend a lot of time filling it in. But your point regarding computational mathematics is spot on. It is very common to find code completely undocumented in any way. There is a substantial disparity in attitudes here. When I began my massive rewrite of flint from scratch I decided that each function would have a complete description and justification of the algorithm (if not well-known) in an associated text file, with references and proofs if necessary. Sebastian Pancratz came along and saw these text files and decided to write a parser that automatically turned them into pdf documentation! > > We could raise the bar of expected behavior by having algorithms fully > explained in the source code along with citations of theory sources. We > could include a section on limitations, boundary conditions, examples, > assumptions, and test cases. This is not as challenging as it sounds > if it is done by the original author. It is extremely hard to recover > or discover this information after the fact. To a good extent this is done in most of the Sage library. I agree this should be the expected standard. Again I have to credit Sebastian Pancratz and Fredrik Johansson here for raising the standard in flint. I thought I had been producing beautiful code until Sebastian started rewriting some of it for me. :-) > > Knuth proposed the idea of "literate programs" which combine the > actual source code with the human language text as a single document. > The document re-arranges the source code for ease of explanation and > includes the usual paper sections on background, theory, along with > a bibliography citing prior (literate) work. (See Queinnec, Christian > ``Lisp in Small Pieces'' ISBN 0-521-56247-3 for a literate example) My favourite example of this is Jonesforth. I highly recommend anyone who has not read this program to do so immediately. It is one of the most beautiful documents you can obtain for free. And you get yourself a Forth implementation at no extra cost. I don't personally do literate programming in the sense you intend it, with rearranged code so that it reads more naturally. But for very complex and touchy pieces of critical code I sometimes add formal justifications for every line of code. A good example of this is the bitpacking code in flint. > > We could take small steps in this direction, possibly by hosting a > "literate program" track at conferences or workshops to encourage the > more literate among us to show examples and build up the technology. I could certainly imagine a talk being constructed around an actual program. I might try this one day and see how it goes. > > Raising the standards for published mathematical software will help > us all in the long run. We can look forward to a time when we can > read, understand, and cite particular Pari and Sage algorithms which > are their actual implementations in literate form. > It's not the only important step that needs to be taken though. My epiphany regarding automated program checking came when I serendipitously encountered a fantastic book on Prolog on the exact same day that I sat down to implement Damas-Hindley-Milner type inference for an interpreter I have been writing. I marvel that a 20 line implementation of the unification algorithm saves me so much hassle. After this little epiphany I have not had any problems seeing the future. Type inference and unification have been around for ages, but I feel like someone who has been in a time machine and returned and who has to sit around and watch the inevitable development of the compiler technology that I have already seen. Progress feels like a slow motion replay. Bill. -- To post to this group, send an email to sage-devel@goog
[sage-devel] Re: Sage & Pari
On 7/26/11 11:03 AM, William Stein wrote: (Sorry if anybody's favorite Sage component is missed in the attached diagram -- these were just the notes for my talk, and the diagram I actually drew was by hand with chalk and had ... below some spots to emphasize incompleteness.) Nice; I think it's funny how Jmol is pictured in the center, but not mentioned :). Matplotlib should also probably have been in the python list, as it's a well-used component of Sage. But all in all, this is a great diagram! Jason -- To post to this group, send an email to sage-devel@googlegroups.com To unsubscribe from this group, send an email to sage-devel+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/sage-devel URL: http://www.sagemath.org
Re: [sage-devel] Re: Sage & Pari
On Tue, 2011-07-26 at 08:30 -0700, Bill Hart wrote: > Hi Simon, > ...[snip]... > > Of course properly citing Pari and the algorithm used and checking > under what conditions the result is meaningful, etc, assumes that > these things are well documented and accessible. So, a positive step > the Pari developers could take is to make it easy to read the source > code for a given function in Pari, much as Sage has done for Sage > library code, and to document the algorithms used and where they are > published and what their restrictions are. > > A lot of Pari code is especially difficult because it uses messy stack > allocation all over the place. This is not thread safe and it is > difficult to read and contribute to. If I wanted to make Pari really > easy to contribute to and understand, I'd probably use garbage > collection for all but the core arithmetic routines. I'd still make > the core threadsafe though. I believe flint2 proves that this is > possible in an efficient manner without messy stack allocation > routines. > > The Pari developers might also like to consider people who wrote the > Sage wrapper for Pari as contributing to Pari and credit them as > such. > > It is in matters like these that Sage has taken a massive first step > towards making Computational Number Theory a respectable sport. Whilst > it doesn't address all the issues I raised in my earlier post, it has > taken the field from being a mass of poorly documented, broken, non- > portable, incompatible code lying around on people's hard drives or > locked up in inscrutable closed implementations to being a unified > distribution which meets some minimum standards. The code in the Sage > library is open and easily accessible. Every function has a docstring > and doctests (alright, that's a lie, but it's almost true). Library > code is subjected to some kind of peer review. References to papers > are provided. And moreover, Sage builds on multiple platforms and is > tested regularly so it is more useful for some people. These are all > very important first steps in bringing the field into good repute. > None of these things seems to have hurt the popularity of Sage. > > Having said that, I personally find it very difficult to trace through > which algorithm is used in what regime in Sage. I once made some > comments about which algorithms were used in Sage in a certain regime > and someone lambasted me for getting it wrong. I subsequently > carefully checked it through, following the rabbit warren of function > calls across multiple interface boundaries through many files > separated across vast reaches of the Sage directory tree and I am > convinced I had the details substantially right. Of course if we just > relied on Sage documentation to tell us what algorithm is being used > in which package, we'd get it wrong much of the time, because that > relies on the documentation having been maintained correctly. But at > least Sage makes a step in the right direction. I've had similar > issues trying to trace through the linbox package to figure out which > algorithm is used. ...[snip]... To paraphrase your above comment, I believe that we need to raise the standards of computational mathematics further toward being "a respectable sport". Consider your observation: "Of course properly citing Pari and the algorithm used and checking under what conditions the result is meaningful, etc, assumes that these things are well documented and accessible. So, a positive step the Pari developers could take is to make it easy to read the source code for a given function in Pari, much as Sage has done for Sage library code, and to document the algorithms used and where they are published and what their restrictions are." In regular mathematics it is standard practice to fully document, that is, show the complete proof of your findings. In computational maths we do not do this. There is no particular reason why we don't except that it is not the expected behavior. We could raise the bar of expected behavior by having algorithms fully explained in the source code along with citations of theory sources. We could include a section on limitations, boundary conditions, examples, assumptions, and test cases. This is not as challenging as it sounds if it is done by the original author. It is extremely hard to recover or discover this information after the fact. Knuth proposed the idea of "literate programs" which combine the actual source code with the human language text as a single document. The document re-arranges the source code for ease of explanation and includes the usual paper sections on background, theory, along with a bibliography citing prior (literate) work. (See Queinnec, Christian ``Lisp in Small Pieces'' ISBN 0-521-56247-3 for a literate example) We could take small steps in this direction, possibly by hosting a "literate program" track at conferences or workshops to encourage the more literate among us
[sage-devel] Re: Sage & Pari
On Jul 26, 5:59 pm, William Stein wrote: > When doing such investigations, we care greatly that the data we get is > correct. We just don't care so much that the program producing it be > *proven* correct in a formal and technically rigorous sense. I understand that you qualified that with "such investigations" in reference to my example of investigating conjectures and gathering numerical evidence etc. And I agree with you in that context. However, I wish to emphasise the point I am making that the issues of correct data and rigorous programming methods are very much related. We are currently doing this ad hoc. It will in future be largely automated. And where it is not, it will be part of producing code to publication standard. I only wish to emphasise the point so that in a few decades time you will see how absolutely certain I was of that fact at this moment. That is to take nothing away from the important work of FoCM and similar ventures to improve attitudes. That is important too and I believe very relevant to Pari. Bill. -- To post to this group, send an email to sage-devel@googlegroups.com To unsubscribe from this group, send an email to sage-devel+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/sage-devel URL: http://www.sagemath.org
Re: [sage-devel] Re: Sage & Pari
On Tue, Jul 26, 2011 at 9:00 AM, Bill Hart wrote: > > Regarding rigor, mathematics itself went through a phase where > informal arguments were displaced with formal ones. Likewise, informal > computer programs will eventually give way to formally verified ones, > and this will naturally be embraced by mathematicians. You are right > that this will not directly have an effect on Number Theory > computations as they are currently done because people will continue > to use computers as a tool to investigate conjectures and collect > numerical evidence and so on. They care little about whether the > program is correct. > When doing such investigations, we care greatly that the data we get is correct. We just don't care so much that the program producing it be *proven* correct in a formal and technically rigorous sense. When people begin justifying their programs in their papers, (which > will of course also be electronic in the future), then the field will > become more academically sound and eventually the problems it > currently has will become a thing of the past. At the moment the field > (if you can even call it that) has a serious image problem. People > speak about working on computation in whispers as though it is sinful. > There are other things happening in the world of mathematics that are helping with this image problem. For example, I was just at FoCM 2011 [1] and did not get this impression about computation, and there was a very wide range of mathematics represented (everything from numerical PDE's to Computational Topology to Number Theory). And there were certainly some well respected mathematicians among the participants (e.g., Steven Smale, Gunar Carlson, etc.). The people who build communities (major conference series / proceedings volumes) like the FoCM organizers are helping. [1] http://www.damtp.cam.ac.uk/user/na/FoCM11/workshops.html > This is possibly something you don't encounter as a computer > scientist. > > Bill. > > -- > To post to this group, send an email to sage-devel@googlegroups.com > To unsubscribe from this group, send an email to > sage-devel+unsubscr...@googlegroups.com > For more options, visit this group at > http://groups.google.com/group/sage-devel > URL: http://www.sagemath.org > -- William Stein Professor of Mathematics University of Washington http://wstein.org -- To post to this group, send an email to sage-devel@googlegroups.com To unsubscribe from this group, send an email to sage-devel+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/sage-devel URL: http://www.sagemath.org
[sage-devel] Re: Sage & Pari
On Jul 26, 1:51 am, rjf wrote: > I hope that Bill was hoping to stir up some dissenting opinions. Yes, naturally. However, it looks to me that you have simply recorded the way things currently are. People will do whatever they feel like. That's true by definition. Regarding rigor, mathematics itself went through a phase where informal arguments were displaced with formal ones. Likewise, informal computer programs will eventually give way to formally verified ones, and this will naturally be embraced by mathematicians. You are right that this will not directly have an effect on Number Theory computations as they are currently done because people will continue to use computers as a tool to investigate conjectures and collect numerical evidence and so on. They care little about whether the program is correct. I'm also not talking about program proving in the sense that people usually mean it. I simply mean that compiler technology will give more confidence in implementations by virtue of the fact that they will check things currently left unchecked. This will obviously be picked up by serious mathematicians, namely the ones who are currently growing up with computers and who come to care about rigor, and will be made a formal part of doing mathematics with computers. I don't envision some centralised authority imposing conditions on published work. It will simply become part of mathematics de rigueur. Reearchers will not risk submitting papers that may be rejected on the grounds of inappropriate formal justification of their code for the standard of journal they are submitting to. When people begin justifying their programs in their papers, (which will of course also be electronic in the future), then the field will become more academically sound and eventually the problems it currently has will become a thing of the past. At the moment the field (if you can even call it that) has a serious image problem. People speak about working on computation in whispers as though it is sinful. This is possibly something you don't encounter as a computer scientist. Bill. -- To post to this group, send an email to sage-devel@googlegroups.com To unsubscribe from this group, send an email to sage-devel+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/sage-devel URL: http://www.sagemath.org
[sage-devel] Re: Sage & Pari
Hi Simon, After reading what you wrote, I fully agree with you. Your distinction between citing a citation and citing the original work is a useful one. This would happen naturally if you focused on identifying the algorithm and understanding its limitations, as I recommended. But they way you have put it is more nuanced and sensible. Actually, regarding thin wrappers, nothing irritates me more than a programmer saying "we have just implemented feature X in system Y" when what they mean is they "wrapped function W from external library Z" and called it an "implementation" of feature X. [This issue actually grates on me a lot. I recently saw the most absurd example of this. Someone claimed they had "implemented" a C REPL. It turned out to be a Python program which took the C code you entered and passed it to GCC via the command line. It was completely and totally broken and didn't even preserve state. Now, it turned out that a group at CERN has actually been working on a real C++ REPL called Cling, for analysing data from the LHC (something like 25 petabytes of serialised C++ objects a year after filtering). I could imagine some miscreant wrapping the whole Cling project in Python and claiming they have implemented a C++ REPL. At least the Sage library has many new and interesting implementations of original algorithms which build on the functionality it "wraps".] As for citation, unfortunately professional mathematicians rarely have time to learn anything about programming and so often you find less than adequate details when they cite computer software. I've been at a Pure Maths conference for a week and a half and I've seen computers mentioned twice. The first time no credit was given at all (though it was almost certainly Pari). The second time the entire credit was "SAGE", although the speaker took care to acknowledge the help of John Cremona in helping them perform the computation. In that case I believe the computation was possibly done entirely in Pari via Sage. It looked to be computation of class numbers and unit groups of some number fields, although I am not 100% sure of this. The speaker was delighted that "computers can do these things". Of course computers can't do anything. Better to cite the person who did the implementation. And did that speaker realise that the computation was subject to the GRH or worse? They may be excused for not going into great detail if the computation was as mundane as I thought it was. It seems to fall into the category of "by a well known result...". The situation we really need to be worried about is where some special algorithm is used which makes a given computation possible where it was not otherwise. If my computation relied critically on multiply two large polynomials of degree ten billion, I might want to mention that I used flint via Sage, in case the reader haplessly tries to do this in Pari or some system that only supports polynomials of length up to 2^30. Of course properly citing Pari and the algorithm used and checking under what conditions the result is meaningful, etc, assumes that these things are well documented and accessible. So, a positive step the Pari developers could take is to make it easy to read the source code for a given function in Pari, much as Sage has done for Sage library code, and to document the algorithms used and where they are published and what their restrictions are. A lot of Pari code is especially difficult because it uses messy stack allocation all over the place. This is not thread safe and it is difficult to read and contribute to. If I wanted to make Pari really easy to contribute to and understand, I'd probably use garbage collection for all but the core arithmetic routines. I'd still make the core threadsafe though. I believe flint2 proves that this is possible in an efficient manner without messy stack allocation routines. The Pari developers might also like to consider people who wrote the Sage wrapper for Pari as contributing to Pari and credit them as such. It is in matters like these that Sage has taken a massive first step towards making Computational Number Theory a respectable sport. Whilst it doesn't address all the issues I raised in my earlier post, it has taken the field from being a mass of poorly documented, broken, non- portable, incompatible code lying around on people's hard drives or locked up in inscrutable closed implementations to being a unified distribution which meets some minimum standards. The code in the Sage library is open and easily accessible. Every function has a docstring and doctests (alright, that's a lie, but it's almost true). Library code is subjected to some kind of peer review. References to papers are provided. And moreover, Sage builds on multiple platforms and is tested regularly so it is more useful for some people. These are all very important first steps in bringing the field into good repute. None of these things seems to have hurt the popularity of Sage. Hav
[sage-devel] Re: Sage & Pari
There seem to be two issues here. 1. It would be nice for Pari to get credit when Pari is used, even if the user gave credit to Sage. and 2. (Bill Hart's assertion) that citing an algorithm that you haven't studied is like citing a paper you haven't read. Regarding item 1. If you accept code under GPL, you are restricted in your further distribution of the code. I am unaware of requirements that you cite the code if you publish a result of that code. If that were the case, we would have to cite everything from Python, Cython, GCC, to Linux, or whatever layers of software were used. Maybe hardware too (which processors etc.) (It is sometimes useful to know such details, especially when comparing times; one hopes that numerical results in number theory are affected by the version of GCC, but that might happen.) Thus someone using Sage need not even mention Sage, though it would be courteous to do so. At least some of the early use of Macsyma by distinguished MIT math professors (usually through some surrogate lower-form-of-life:) ) was entirely hidden from view. Presumably all use of Maxima from Sage is hidden from the view of the user. It would be courteous to credit the programmers who wrote the programs that computed the results, but this is hardly ever known. Who really wrote the python system? Linux? etc. Giving credit where credit is due is hardly required of anyone. Indeed, taking credit where credit is NOT due occasionally happens. (Wolfram, in a blurb on their new CDF format, incidentally claims Wolfram was the major force behind MathML...) If Pari were distributed with a condition that persons using it would have to acknowledge it, then it would presumably not be part of Sage. If Pari has to show usage statistics, maybe some Sage server could be instrumented to show this. Otherwise, all one can do, it seems, is request number-theorists who use Sage to consider somehow whether they are computing using Pari, and cite it too. But they might not know. As for item 2, I disagree with Bill. I think that saying your result was computed by program X, and that other people (including the referees) can reproduce your results by using program X is perfectly valid. You might strengthen your article by saying that you believe the results to be correct for some reason, or even provide a proof of some sort. Your paper might still be right even if the program is riddled with bugs; your paper might be wrong and the program correct. The referees might accept the paper and it might be wrong. Or they might reject it even if it is right. Citing a paper rather than a computational result is not necessarily stronger, as evidenced by the fact that many papers have been shown to be wrong. It seems to me that the reality is: an author takes primary responsibility for the correctness of a paper submitted to a journal. Whatever tools, or references, are used to bolster the confidence of the correctness are potentially valid. I think we are largely past the idea that a paper should be rejected on the grounds that a proof was (merely) a computation and not confirmed by humans. But maybe I'm naive on that score. Also in response to Bill's view, it seems to me dubious that advances in proving program correctness or in formulation of standards will have much effect on number theory calculations. I hope that Bill was hoping to stir up some dissenting opinions. RJF -- To post to this group, send an email to sage-devel@googlegroups.com To unsubscribe from this group, send an email to sage-devel+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/sage-devel URL: http://www.sagemath.org
[sage-devel] Re: Sage & Pari
Hi Bill, let me comment on the "how to cite individual Sage components" topic. On 26 Jul., 00:18, Bill Hart wrote: > * Citing a Number Theory package you know nothing about in published > research is about as dubious as citing a paper you have never read. > Unless you are prepared to read the code and understand the algorithm > Pari uses, what right do you have to cite it? However, when writing > papers, if you use a published result which makes use of some other > published results, you usually cite the result you directly used, not > the many other results that backed it up. In fact it is considered > poor form to cite papers you know nothing about. You draw an analogy between "using software A that in fact uses software B behind the scenes" and "using a theorem of paper A that in fact uses a theorem from paper B". You conclude: One would only cite paper A, and it would actually be wrong to cite paper B if one doesn't know it; hence, one should (by analogy) cite software A, but not software B that one doesn't know but that does the actual work. I believe you need to differentiate a bit more. Assume that the theorem that you use appears in paper A just as a citation from paper B. What would you do in your paper? Would you plainly write "by Thm 3.2 in [A]", even though the theorem is not proved in A? Or would you try to look it up in paper B and write "by Thm 7.8 in [B] (see also Thm 3.2 in [A])"? The analogy of "paper A cites the theorem from paper B" in computer algebra is "software A just uses a thin wrapper around software B". That's to say: Assume that you have in your Sage session an object X that in fact is a group living in GAP. Now, you do X.SylowSubgroup(2). Sage's main contribution to that computation is to send the string "SylowSubgroup(%s, 2)"%X.name() to GAP and to wait for an answer. So, would you say "Sage has computed the Sylow subgroup"? Wouldn't it be more honest to say "GAP has computed the Sylow subgroup"? There are, of course, more complicated cases. For example, I wrote an (optional) spkg that computes modular cohomology rings of finite groups. * It makes intense use of GAP and Singular, but the code base is mainly original Cython code. However, the computations performed by GAP and Singular are essential steps in the algorithm. Would you say that it suffices to cite Sage for my spkg? I can tell you that some Singular developers would be VERY upset if I wouldn't properly credit Singular! And I agree with them. * In one line of an auxiliary method, I factor a multivariate polynomial over a finite field. The factorisation is of no importance for the final result, but it is part of a heuristic to speed up one step of the algorithm. Would you say that I should try to find out what component of Sage is responsible for the factorisation, and cite it? I think that would go too far. I don't know much about Pari. But if it does the actual work and Sage only provides the interface, then I think it is clear that Pari must be cited. Cheers, Simon -- To post to this group, send an email to sage-devel@googlegroups.com To unsubscribe from this group, send an email to sage-devel+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/sage-devel URL: http://www.sagemath.org
[sage-devel] Re: Sage & Pari
Can we please be more precise and scientific when discussing this important issue. I find it confusing to try to think about the many issues raised without some precision. Some trivialities first. Please take care to distinguish: * finding a bug in Pari and contributing a bug report * finding a bug in Pari and contributing a fix to Pari * finding missing functionality in Pari and implementing it in the Sage library * finding missing functionality in Pari and implementing it in C and contributing it to Pari Can we also distinguish "blame" from "reasons things are the way they are". Are we criticising the choice of language (C) or are we simply giving a reason why people prefer contributing to Sage. Blame implies someone did something wrong. And if we are to "blame" someone for there only being two developers, please state the reasons. It is circuitous logic to state that no one wants to contribute to Pari because there are only two developers. It is therefore a logical fallacy to argue that someone should be blamed for this. Can we also distinguish between the fact that Pari has a developers list (I am subscribed and see posts every few days, the most recent being today) and the opinion that some functionality of the list is considered suboptimal. Also it is a matter of fact that there have been code contributions to Pari from the Sage group. Jason Moxham worked for 3 months on porting Pari to MSVC with a grant obtained from MSR by William Stein and myself. Some other issues require further discussion in my opinion. * I think it is simply an illusion that there are hordes of skilled C programmers out there who wish to write Open Source, Number Theory code. I think, basically, the intersection of these two specialties is almost empty these days. And where it is not, these people are working on their own projects or are already Pari developers. I think it is misguided for the Pari developers to expect Sage developers to contribute to Pari. The best way to improve the state of affairs is to train people, i.e. run undergrad classes, do summer projects, offer Computational Number Theory classes, teach mathematics students to write C, hire people to do PhD's and postdocs, run workshops on contributing to Pari, etc. * It is perfectly logical that the number of people citing Sage is increasing. The breadth of coverage in Sage of all areas of mathematics is increasing all the time. And the number of users of Sage is (or at least was) increasing. But this is not necessarily anything to do with Pari. * It is not at all clear to me that the number of people using Pari via Sage for publishable Computational Number Theory research is actually increasing. It may be that the number of real users of Pari is actually shrinking. I do not claim that this is the case. I am merely making the observation that I have not seen any quantifiable data to substantiate a claim in either direction (this may be a statement of ignorance on my part). * Citing a Number Theory package you know nothing about in published research is about as dubious as citing a paper you have never read. Unless you are prepared to read the code and understand the algorithm Pari uses, what right do you have to cite it? However, when writing papers, if you use a published result which makes use of some other published results, you usually cite the result you directly used, not the many other results that backed it up. In fact it is considered poor form to cite papers you know nothing about. Yet here we are asking people to cite Pari when they know nothing about it and in fact used Sage, which took care to use Pari in the way that it was intended (which was checked by referees right!?). The people who need to cite Pari are the people who interface directly with it in Sage. After all, they are implementing new algorithms which they are publishing, and they are carefully checking that Pari is being used correctly according to the published limits of the algorithms and the documented interface, right? If we apply the logic regarding citation of Pari to all software, then where does it stop? If someone writes a package on top of Sage, and another on top of that, should they all cite Pari if they use Number Theory functionality? Obviously not! And yes, I infer that it is the policy of the Sage developers to cite Pari when using number theoretical code that makes use of Pari. * It is very important to carefully check that you are using computer software carefully, according to the documented interface (it is documented, right!?) and that it does what you think it does. This is even more relevant for computer algebra than for papers, because papers -- especially ones in good journals -- are carefully peer reviewed and often relatively widely read, whereas computer software is usually not reviewed critically according to agreed standards and almost never read. In fact, computer software is not held to any academic standard. There is no rigorous formal basis fo