David Cournapeau wrote:
> I would love having a core C library of containers -
I'm all for that. However, I think that a very, very common need is simply for a growable numpy array. It seems this would actually be pretty darn easy (again, for someone familiar with the code!). IIUC, it would be exactly like a regular old ndarray, except:

1) The data block could be larger than the current size of the array:
   - this would require an additional attribute to carry that size.

2) There would be an "append" method, or some other name:
   - this would increase the size of the array, manipulating .dimensions
     and, if needed, reallocating the data pointer.
   - This is not very complex code, really: you simply allocate more than
     required, choosing some (user-resettable?) scale -- 25% more than
     before, or whatever. Reallocating memory is actually pretty cheap
     these days, so it's not all that critical how you do it.

3) We'd probably want a .fit() method that would resize the data block
   to fit the current size of the array.

(A rough Python sketch of all three is appended at the end of this message.)

In fact, as I write this, I wonder if these features could simply be added to the regular old ndarray -- if not used, they wouldn't be much overhead.

Issues:

- Views on the same data block: as the data pointer would need to be
  reallocated, you couldn't have views on the data block. Maybe this
  could be handled the same way ndarray.resize is handled now: you'd get
  an error if you try to resize an array that has a view into its data
  block. This is a bit tricky, as it's really easy to create a view
  without thinking about it (by using ipython, for instance).

- You could only grow the array in the first dimension (C order,
  anyway). I think it could still be very useful, handling many of the
  common cases.

- If you append to an array of more than one dimension, do you need to
  give the entire appended slice? Again, restrictive, but still useful.

- Others? -- I'm not thinking of any right now.

David Cournapeau wrote:
> If you need to manipulate core datatypes (float, etc...), I think
> python lists are a significant cost, because of pointer chasing in
> particular.

Exactly -- and add custom numpy dtypes as well. Putting python types and/or tuples of python types into a list can be far more memory intensive than putting them in a numpy array (a rough size comparison is also appended below). That's why I built a python implementation of a growable array -- it does save memory, but it's a fair bit slower than lists: when adding to it from python you already have python objects, so the additional overhead of an extra function call, etc. is substantial. That's why I'd like a C version. Even more important is having one that is not only written in C (or Cython) but can be accessed directly from C (or Cython), so you can fill an array with core data types efficiently; going from C type to python object just to put it in a list is substantial overhead.

> Stefan used some tricks to avoid using python list and use basic C
> structures in some custom code.

Exactly -- and he's not the only one. We're all re-inventing the wheel here.

By the way -- what does the fromstring/fromfile code do when you don't give it a size to start with?

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov
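To make the proposal concrete, here is a minimal Python sketch of the growable array described above, using ndarray.resize (with its refcheck) for the reallocation. The class name GrowableArray, the default 25% growth factor, and the .fit()/.array spellings are illustrative assumptions, not an existing numpy API:

import numpy as np

class GrowableArray:
    """Sketch of a growable 1-D array.

    The data block is over-allocated so that repeated appends are
    amortized O(1); fit() trims it back to the logical size.
    """

    def __init__(self, dtype=float, capacity=16, growth=1.25):
        self._data = np.empty(capacity, dtype=dtype)  # allocated block
        self._size = 0          # logical length (the "extra attribute")
        self._growth = growth   # user-resettable over-allocation scale

    def append(self, value):
        if self._size >= self._data.shape[0]:
            # Grow the block by ~25% (or whatever scale was chosen).
            # refcheck=True makes resize raise ValueError if anything
            # else references the block -- the same rule ndarray.resize
            # enforces today, so live views block reallocation.
            new_cap = max(self._size + 1,
                          int(self._data.shape[0] * self._growth))
            self._data.resize(new_cap, refcheck=True)
        self._data[self._size] = value
        self._size += 1

    def fit(self):
        # Shrink the data block to exactly the current array size.
        self._data.resize(self._size, refcheck=True)

    @property
    def array(self):
        # A view of the valid portion; while a caller holds this view,
        # any append that needs a reallocation will fail, by design.
        return self._data[:self._size]

Used from python it looks like this -- and of course it pays exactly the per-append python overhead complained about above, which is why a version fillable directly from C (or Cython) would be the real win:

ga = GrowableArray(dtype=np.float64)
for i in range(1000):
    ga.append(i * 0.5)
ga.fit()
print(ga.array.shape)   # -> (1000,)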
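As for the memory claim: a quick back-of-the-envelope comparison of a python list of floats against a numpy array. The printed figures are assumptions that vary with platform and CPython version, but the ratio is roughly 4x for float64:

import sys
import numpy as np

n = 100000
lst = [float(i) for i in range(n)]
arr = np.arange(n, dtype=np.float64)

# The list stores n pointers plus n separate float objects;
# the array stores n raw 8-byte doubles in one contiguous block.
list_bytes = sys.getsizeof(lst) + sum(sys.getsizeof(x) for x in lst)
print(list_bytes)   # roughly 3.2 MB on a 64-bit CPython
print(arr.nbytes)   # exactly 800000 bytes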