On Fri, Apr 4, 2008 at 3:31 PM, Anne Archibald <[EMAIL PROTECTED]> wrote:
> On 04/04/2008, Alan G Isaac <[EMAIL PROTECTED]> wrote:
> > On Fri, 4 Apr 2008, Gael Varoquaux apparently wrote:
> > > I really think numpy should be as thin as possible, so
> > > that you can really say that it is only an array
> > > manipulation package. This will also make it easier to
> > > sell as a core package for developers who do not care
> > > about "calculator" features.
> >
> >
> > I'm a user rather than a developer, but I wonder:
> > is this true?
> >
> > 1. Even as a user, I agree that what I really want from
> > NumPy is a core array manipulation package (including
> > matrices). BUT as long as this is the core of NumPy,
> > will a developer care if other features are available?
> >
> > 2. Even if the answer to 1. is yes, could the
> > build/installation process include an option not to
> > build/install anything but the core array functionality?
> >
> > 3. It seems to me that pushing things out into SciPy remains
> > a problem: a basic NumPy is easy to build on any platform,
> > but SciPy still seems to generate many questions.
> >
> > 4. One reason this keeps coming up is that the NumPy/SciPy
> > split is rather too crude. If the split were instead
> > something like NumPy/SciPyBasic/SciPyFull/SciPyFull+Kits
> > where SciPyBasic contained only pure Python code (no
> > extensions), perhaps the desired location would be more
> > obvious and some of this recurrent discussion would go away.
>
> It seems to me that there are two separate issues people are talking
> about when they talk about packaging:
>
> * How should functions be arranged in the namespaces? numpy.foo(),
> scipy.foo(), numpy.lib.financial.foo(), scikits.foo(),
> numkitfull.foo()?
>
> * Which code should be distributed together? Should scipy require
> separate downloading and compilation from numpy?
>
> The two questions are not completely independent - it would be
> horribly confusing to have the set of functions available in a given
> namespace depend on which packages you had installed - but for the
> most part it's not a problem to have several toplevel namespaces in
> one package (python's library is the biggest example of this I know
> of).
>
> For the first question, there's definitely a question about how much
> should be done with namespaces and how much with documentation. The
> second is a different story.
>
> Personally, I would prefer if numpy and scipy were distributed
> together, preferably with matplotlib. Then anybody who used numpy
> would have available all the scipy tools and all the plotting tools; I
> think it would cut down on wheel reinvention and make application
> development easier. Teachers would not need to restrict themselves to
> using only functions built into numpy for fear that their students
> might not have scipy installed - how many students have learned to
> save their arrays in unportable binary formats because their teacher
> didn't want them to have to install scipy?
>
> I realize that this poses technical problems. For me installing scipy
> is just a matter of clicking on a checkbox and installing a 30 MB
> package, but I realize that some platforms make this much more
> difficult. If we can't just bundle the two, fine. But I think it is
> mad to consider subdividing further if we don't have to.

If these were tightly tied together, for instance in one big DLL, this
would be unpleasant for me. I still have people downloading stuff over
56k modems, and adding an extra 30 MB to the already somewhat bloated
numpy distribution would make their lives more tedious than they
already are.

> I think python's success is due in part to its "batteries included"
> library.
> The fact that you can just write a short python script with
> no extra dependencies that can download files from the Web, parse XML,
> manage subprocesses, and save persistent objects makes development
> much faster than if you had to forever decide between adding
> dependencies and reinventing the wheel. I think numpy and scipy should
> follow the same principle, of coming "batteries included".

One thing they try to do in Python proper is think a lot more before
adding stuff to the standard library. Generally, packages need to exist
separately for some period of time to prove their general utility and
to stabilize before they get accepted. Particularly in the core, but in
the library as well, they make an effort to choose a compact set of
primitive operations without a lot of duplication (the old "There
should be one-- and preferably only one --obvious way to do it.").

The numpy community has, particularly of late, been rather quick to add
things that seem like they *might* be useful. One of the advantages of
having multiple namespaces would have been to enforce a certain amount
of discipline on what went into numpy, since it would've been easier to
look at and evaluate a few dozen functions that might have comprised
some subpackage rather than, let's say, five hundred or so. I suspect
it's too late now; numpy has chosen the path of matlab and the other
array packages and is slowly accumulating nearly everything in one big
flat namespace. I don't like it, but it seems pointless to fight it at
this point.

> So in this specific case, yes, I think the financial functions should
> absolutely be included; whether they should be included in scipy or
> numpy is less important to me because I think everyone should install
> both packages.
>

Personally I think it's a bad idea to add stuff that, as far as I can
tell, no one has even asked for yet. Put them in the sandbox. Advertise
them. If people use them, figure out what needs to be changed.
Then add them to SciPy once they've stabilized, if they actually get
used.

-- 
. __
. |-\
.
. [EMAIL PROTECTED]
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion