On Wed, Jan 9, 2013 at 2:35 AM, Mike Anderson wrote:

>> First -- is this a "matrix" library, or a general-use nd-array
>> library? That will drive your design a great deal.
>
> This is very useful context - thanks! I've had opinions in favour of both
> an nd-array style library and a matrix library. I guess it depends on your
> use case which one you are more inclined to think in.
>
> I'm hoping that it should be possible for the same API to support both,
> i.e. you should be able to use a 2D array of numbers as a matrix, and
> vice-versa.

Sure, but the APIs can/should be different -- in some sense, the numpy
matrix object is really just syntactic sugar. You can use a 2-d array as a
matrix, but then you have to explicitly call linear algebra functions to
get things like matrix multiplication, and do some hand work to make sure
you've got things the right shape -- i.e. a column or row vector where
called for. Tacking on the matrix object helped with this, but in practice
it gets tricky to prevent operations on a matrix from accidentally
returning a plain array.

Also, numpy's matrix concept does not include the idea of a row or column
vector, just 1xN or Nx1 matrices -- which works OK, but then when you
iterate through a vector you get 1x1 matrices rather than scalars -- a bit
odd.

Anyway, it takes some thought to have two clean APIs sharing one core
object.
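To make that concrete, here's a minimal sketch in plain numpy (the values
are arbitrary) showing the iteration quirk, and the explicit linear
algebra calls you need with plain arrays:

    import numpy as np

    # An Nx1 "column vector" as a numpy matrix: iterating over it
    # yields 1x1 matrices, not scalars.
    v = np.matrix([[1.0], [2.0], [3.0]])
    for item in v:
        print(type(item).__name__, item.shape)   # matrix (1, 1)

    # A true 1-d array yields scalars when iterated:
    a = np.array([1.0, 2.0, 3.0])
    for item in a:
        print(type(item).__name__)               # float64

    # With plain 2-d arrays you call the linear algebra explicitly:
    # np.dot() is the matrix product, while A * A is elementwise.
    A = np.array([[1.0, 2.0], [3.0, 4.0]])
    print(np.dot(A, A))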
>> Not a bad start, but another major strength of numpy is the multiple
>> data types -- you may want to design that concept in from the start.
>
> But I'm curious: what is the main use case for the alternative data types
> in NumPy? Is it for columns of data of heterogeneous types? Or something
> else?

Heterogeneous data types were added relatively recently in numpy, and are
great mostly for interacting with other libraries that may store data in
arrays of structures (and for some syntactic-sugar uses...). But multiple
homogeneous data types are critical for saving memory, speeding up
operations, doing integer math when that's really called for, manipulating
images, etc., etc.

> 20-100GB is pretty ambitious and I guess reflects the maturity of
> NumPy - I'd be happy with good handling of 100MB matrices right
> now.....

100MB is pretty darn small these days -- if you're only interested in
smallish problems, then you can probably forget about performance issues
and focus on a really nice API. But I'm not sure I'd bother with that --
once people start using it, they'll want to use it for big problems!

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA 98115        (206) 526-6317   main reception

chris.bar...@noaa.gov
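As a quick illustration of the dtype points above -- a minimal sketch in
plain numpy (the sizes and field names are arbitrary):

    import numpy as np

    # Homogeneous dtypes control memory use and the kind of math done:
    img = np.zeros(1000000, dtype=np.uint8)    # 1 MB -- e.g. 8-bit image data
    big = np.zeros(1000000, dtype=np.float64)  # 8 MB for the same length
    print(img.nbytes, big.nbytes)              # 1000000 8000000

    # Integer math when that's really what's called for:
    print(np.array([7, 8], dtype=np.int32) // 3)   # [2 2]

    # A structured (heterogeneous) dtype -- an array of C-like structs,
    # mostly useful for exchanging data with other libraries:
    rec = np.zeros(3, dtype=[('x', np.float64), ('count', np.int32)])
    rec['x'] = [0.1, 0.2, 0.3]
    print(rec[0])   # (0.1, 0)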