Re: [Numpy-discussion] Huge arrays
On Wed, Sep 9, 2009 at 2:10 PM, Sebastian Haase wrote:
> Hi,
> you can probably use PyTables for this. Even though it's meant to
> save/load data to/from disk (in HDF5 format) as far as I understand,
> it can be used to make your task solvable - even on a 32bit system !!
> It's free (pytables.org) -- so maybe you can try it out and tell me if
> I'm right

You still would not be able to load a numpy array > 2 GB. Numpy's memory model needs one contiguously addressable chunk of memory for the data, which is limited under 32-bit archs. This cannot be overcome in any way AFAIK.

You may be able to save data > 2 GB by appending several chunks < 2 GB to disk - maybe pytables supports this if it has large file support (which enables writing files > 2 GB on a 32-bit system).

cheers,

David
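A minimal sketch of the chunk-appending approach described above, using PyTables' extendable arrays (the file name, shapes, and the tables.open_file/create_earray calls are assumptions here, not details from this thread):

import numpy as np
import tables

# Append modest chunks to an on-disk HDF5 array; the file can grow far
# beyond the 32-bit address-space limit because no single in-memory
# array ever holds the whole dataset.
f = tables.open_file('measurement.h5', mode='w')
arr = f.create_earray(f.root, 'data', atom=tables.Int16Atom(),
                      shape=(0, 2000000))   # extendable along axis 0
for _ in range(256):
    chunk = np.zeros((1, 2000000), dtype=np.int16)  # one acquired row
    arr.append(chunk)
f.close()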
Re: [Numpy-discussion] Huge arrays
Hi,
you can probably use PyTables for this. Even though it's meant to save/load data to/from disk (in HDF5 format), as far as I understand it can be used to make your task solvable - even on a 32bit system!! It's free (pytables.org) -- so maybe you can try it out and tell me if I'm right. Or someone else here would know right away...

Cheers,
Sebastian Haase

On Wed, Sep 9, 2009 at 6:19 AM, Sturla Molden wrote:
> Daniel Platz skrev:
>> data1 = numpy.zeros((256,2000000),dtype=int16)
>> data2 = numpy.zeros((256,2000000),dtype=int16)
>>
>> This works for the first array data1. However, it returns with a
>> memory error for array data2. I have read somewhere that there is a
>> 2GB limit for numpy arrays on a 32 bit machine but shouldn't I still
>> be below that? I use Windows XP Pro 32 bit with 3GB of RAM.
>
> There is a 2 GB limit for user space on Win32; in practice this is
> about 1.9 GB. You have other programs running as well, so this is
> still too much. Also Windows reserves 50% of RAM for itself, so you
> have less than 1.5 GB to play with.
>
> S.M.
Re: [Numpy-discussion] Huge arrays
Daniel Platz skrev:
> data1 = numpy.zeros((256,2000000),dtype=int16)
> data2 = numpy.zeros((256,2000000),dtype=int16)
>
> This works for the first array data1. However, it returns with a
> memory error for array data2. I have read somewhere that there is a
> 2GB limit for numpy arrays on a 32 bit machine but shouldn't I still
> be below that? I use Windows XP Pro 32 bit with 3GB of RAM.

There is a 2 GB limit for user space on Win32; in practice this is about 1.9 GB. You have other programs running as well, so this is still too much. Also Windows reserves 50% of RAM for itself, so you have less than 1.5 GB to play with.

S.M.
Re: [Numpy-discussion] Huge arrays
On Tue, Sep 8, 2009 at 7:30 PM, Daniel Platz <mail.to.daniel.pl...@googlemail.com> wrote:
> Hi,
>
> I have a numpy newbie question. I want to store a huge amount of data
> in an array. This data come from a measurement setup and I want to
> write them to disk later since there is nearly no time for this during
> the measurement. To put some numbers up: I have 2*256*2000000 int16
> numbers which I want to store. I tried
>
> data1 = numpy.zeros((256,2000000),dtype=int16)
> data2 = numpy.zeros((256,2000000),dtype=int16)
>
> This works for the first array data1. However, it returns with a
> memory error for array data2. I have read somewhere that there is a
> 2GB limit for numpy arrays on a 32 bit machine but shouldn't I still
> be below that? I use Windows XP Pro 32 bit with 3GB of RAM.

More precisely, 2 GB for Windows and 3 GB for (non-PAE enabled) Linux. The rest of the address space is set aside for the operating system. Note that address space is not the same as physical memory, but it sets a limit on what you can use, whether swap or real memory.

Chuck
Re: [Numpy-discussion] dtype and dtype.char
Ok, I finally got it. I was going at it backwards... Instead of checking for NPY_INT64 and trying to figure out which letter it is (different on each platform), I needed to check for NPY_LONGLONG / NPY_LONG / NPY_INT, etc., i.e. I need to check for the numpy types that have an associated unique letter - not their aliases, since these can be different...

It works now.

C.

On Sep 8, 2009, at 1:13 PM, Charles سمير Doutriaux wrote:
> Hi Robert,
>
> Ok, we have a section of code that used to be like that:
>
> char t;
> switch(type) {
>   case NPY_CHAR:
>     t = 'c';
>     break;
>   etc...
>
> I now replaced it with:
>
> char t;
> switch(type) {
>   case NPY_CHAR:
>     t = NPY_CHARLTR;
>     break;
>
> But I'm still stuck with numpy.uint64: NPY_UINT64LTR does not seem to
> exist.
>
> What do you recommend?
>
> C.
>
> On Sep 8, 2009, at 1:02 PM, Robert Kern wrote:
>> Yes. dtype.char corresponds more closely to the C type ("L" ==
>> "unsigned long" and "Q" == "unsigned long long") which is platform
>> specific.
>>
>> --
>> Robert Kern
Re: [Numpy-discussion] question about future support for python-3
On Tue, Sep 8, 2009 at 5:08 PM, David Cournapeau wrote:
> - it remains to be seen whether we can do the py3k support in the
> same source tree as the one used for python >= 2.4. Having two source
> trees would make the effort even much bigger, well over the current
> developers' capacity IMHO.

I know ipython is a very different beast than numpy for this discussion (no C code at all, but extensive, invasive and often obscure use of the stdlib and the language itself). But FWIW, I have convinced myself that we will only really be able to seriously tackle the py3 transition when we can ditch 2.5 compatibility and have a tree that runs on 2.6 only, with all the -3 options turned on. Only at that point does it become feasible to start attacking the py3 transition for us. We simply don't have the manpower to manage multiple source trees that diverge fully and exist separately for 2.x and 3.x.

Cheers,

f
Re: [Numpy-discussion] question about future support for python-3
On Wed, Sep 9, 2009 at 9:37 AM, Darren Dale wrote:
> Hi David,
>> I already gave my own opinion on py3k, which can be summarized as:
>> - it is a huge effort, and no core numpy/scipy developer has
>> expressed the urge to transition to py3k, since py3k does not bring
>> much for scientific computing.
>> - very few packages with a significant portion of C have been ported
>> to my knowledge, hence very little experience on how to do it. AFAIK,
>> only small packages have been ported. Even big, pure python projects
>> have not been ported. The only big C project to have been ported is
>> python itself, and it broke compatibility and used a different source
>> tree than python 2.
>> - it remains to be seen whether we can do the py3k support in the
>> same source tree as the one used for python >= 2.4. Having two source
>> trees would make the effort even much bigger, well over the current
>> developers' capacity IMHO.
>>
>> The only area where I could see the PSF helping is the point 2: more
>> documentation, more stories about 2->3 transition.
>
> I'm surprised to hear you say that. I would think additional developer
> and/or financial resources would be useful, for all of the reasons you
> listed.

If there were enough resources to pay someone very familiar with the numpy codebase for a long time, then yes, it could be useful - but I assume that's out of the question. This would be very expensive, as it would require several full months IMO.

The PSF could help with point 3, by porting other projects to py3k and documenting it. The only example I know of so far is pycog2 (http://mail.python.org/pipermail/python-porting/2008-December/10.html). Paying people to write documentation about porting C code seems like a good way to spend money: it would be useful outside the numpy community, and would presumably be less costly.

David
Re: [Numpy-discussion] Huge arrays
On Wed, Sep 9, 2009 at 9:30 AM, Daniel Platz wrote:
> Hi,
>
> I have a numpy newbie question. I want to store a huge amount of data
> in an array. This data come from a measurement setup and I want to
> write them to disk later since there is nearly no time for this during
> the measurement. To put some numbers up: I have 2*256*2000000 int16
> numbers which I want to store. I tried
>
> data1 = numpy.zeros((256,2000000),dtype=int16)
> data2 = numpy.zeros((256,2000000),dtype=int16)
>
> This works for the first array data1. However, it returns with a
> memory error for array data2. I have read somewhere that there is a
> 2GB limit for numpy arrays on a 32 bit machine

This has nothing to do with numpy per se - that's the fundamental limitation of 32-bit architectures. Each of your arrays is 1024 MB, so you won't be able to create two of them. The 2 GB limit is a theoretical upper limit, and in practice it will always be lower, if only because python itself needs some memory. There is also the memory fragmentation problem, which means allocating one contiguous, almost 2 GB segment will be difficult.

> If someone has an idea to help me I would be very glad.

If you really need to deal with arrays that big, you should move to a 64-bit architecture. That's exactly the problem they solve.

cheers,

David
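For concreteness, a quick check of the figures above (a sketch assuming the (256, 2000000) int16 shape discussed in this thread):

import numpy as np

per_array = 256 * 2000000 * np.dtype(np.int16).itemsize  # bytes
print(per_array / 2.0**20)      # ~976 MiB per array
print(2 * per_array / 2.0**30)  # ~1.9 GiB for both arrays - right at the
                                # 2 GB user-space ceiling, before counting
                                # Python itself and fragmentation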
[Numpy-discussion] Huge arrays
Hi,

I have a numpy newbie question. I want to store a huge amount of data in an array. This data comes from a measurement setup and I want to write it to disk later since there is nearly no time for this during the measurement. To put some numbers up: I have 2*256*2000000 int16 numbers which I want to store. I tried

data1 = numpy.zeros((256,2000000),dtype=int16)
data2 = numpy.zeros((256,2000000),dtype=int16)

This works for the first array data1. However, it returns with a memory error for array data2. I have read somewhere that there is a 2GB limit for numpy arrays on a 32 bit machine but shouldn't I still be below that? I use Windows XP Pro 32 bit with 3GB of RAM.

If someone has an idea to help me I would be very glad.

Thanks in advance.

Daniel
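One way to stay under the address-space ceiling, if the acquisition leaves even brief gaps between blocks, is to stream the second dataset to a raw binary file instead of holding two ~1 GB arrays at once (a sketch; the file name and the row-by-row arrival of the data are assumptions):

import numpy as np

# Keep only one ~1 GB buffer in memory; append the second dataset to
# disk one row (~4 MB) at a time.
buf = np.zeros((256, 2000000), dtype=np.int16)
with open('data2.bin', 'wb') as fh:
    for i in range(256):
        # ... fill buf[i] from the measurement hardware ...
        buf[i].tofile(fh)

# Later (ideally on a 64-bit machine), read it back:
data2 = np.fromfile('data2.bin', dtype=np.int16).reshape(256, 2000000)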
Re: [Numpy-discussion] question about future support for python-3
Hi David,

On Tue, Sep 8, 2009 at 8:08 PM, David Cournapeau wrote:
> On Wed, Sep 9, 2009 at 4:21 AM, Darren Dale wrote:
>> I'm not a core numpy developer and don't want to step on anybody's
>> toes here. But I was wondering if anyone had considered approaching
>> the Python Software Foundation about support to help get numpy working
>> with python-3?
>
> I already gave my own opinion on py3k, which can be summarized as:
> - it is a huge effort, and no core numpy/scipy developer has
> expressed the urge to transition to py3k, since py3k does not bring
> much for scientific computing.
> - very few packages with a significant portion of C have been ported
> to my knowledge, hence very little experience on how to do it. AFAIK,
> only small packages have been ported. Even big, pure python projects
> have not been ported. The only big C project to have been ported is
> python itself, and it broke compatibility and used a different source
> tree than python 2.
> - it remains to be seen whether we can do the py3k support in the
> same source tree as the one used for python >= 2.4. Having two source
> trees would make the effort even much bigger, well over the current
> developers' capacity IMHO.
>
> The only area where I could see the PSF helping is the point 2: more
> documentation, more stories about 2->3 transition.

I'm surprised to hear you say that. I would think additional developer and/or financial resources would be useful, for all of the reasons you listed.

Darren
Re: [Numpy-discussion] question about future support for python-3
On Wed, Sep 9, 2009 at 4:21 AM, Darren Dale wrote:
> I'm not a core numpy developer and don't want to step on anybody's
> toes here. But I was wondering if anyone had considered approaching
> the Python Software Foundation about support to help get numpy working
> with python-3?

I already gave my own opinion on py3k, which can be summarized as:
- it is a huge effort, and no core numpy/scipy developer has expressed the urge to transition to py3k, since py3k does not bring much for scientific computing.
- very few packages with a significant portion of C have been ported to my knowledge, hence very little experience on how to do it. AFAIK, only small packages have been ported. Even big, pure python projects have not been ported. The only big C project to have been ported is python itself, and it broke compatibility and used a different source tree than python 2.
- it remains to be seen whether we can do the py3k support in the same source tree as the one used for python >= 2.4. Having two source trees would make the effort even much bigger, well over the current developers' capacity IMHO.

The only area where I could see the PSF helping is point 2: more documentation, more stories about the 2->3 transition.

cheers,

David
Re: [Numpy-discussion] question about future support for python-3
On Tue, Sep 8, 2009 at 5:57 PM, Christian Heimes wrote:
> Darren Dale wrote:
>> I'm not a core numpy developer and don't want to step on anybody's
>> toes here. But I was wondering if anyone had considered approaching
>> the Python Software Foundation about support to help get numpy working
>> with python-3?
>
> What kind of support are you talking about? Developers, money, software,
> PR, test platforms ...? For quite some time we have been talking on the
> PSF list about ways to aid projects. We are trying to figure out what
> projects need, especially high profile projects and important
> infrastructure projects. I myself consider NumPy a great asset for both
> the scientific community and Python.

I think a full time developer would do the most to speed up the transition. Having a variety of platforms available for testing is good, but I don't think it will speed things up significantly.

Chuck
Re: [Numpy-discussion] question about future support for python-3
Darren Dale wrote:
> I'm not a core numpy developer and don't want to step on anybody's
> toes here. But I was wondering if anyone had considered approaching
> the Python Software Foundation about support to help get numpy working
> with python-3?

What kind of support are you talking about? Developers, money, software, PR, test platforms ...? For quite some time we have been talking on the PSF list about ways to aid projects. We are trying to figure out what projects need, especially high profile projects and important infrastructure projects. I myself consider NumPy a great asset for both the scientific community and Python.

It's true that PyCon '09 was a major drawback on our financials. But there are other ways besides money to assist projects. For example, the snakebite network (http://snakebite.org/) could be very useful for you once it's open. Please don't ask me about details on the status; I don't have an account yet. About a month ago we got 14 MSDN premium subscriptions with full access to MS development tools and all Windows platforms, which is very useful for porting and testing applications on Windows. Some core developers may also be interested in assisting you directly. The PSF might (!) even donate some money, but I'm not in the position to discuss it. I can get you in touch with the PSF if you like. I'm a PSF member and a core developer.

Christian
Re: [Numpy-discussion] Fwd: GPU Numpy
George Dahl wrote:
> Sturla Molden writes:
>> Teraflops peak performance of modern GPUs is impressive. But NumPy
>> cannot easily benefit from that.
> I know that for my work, I can get around a 50-fold speedup over
> numpy using a python wrapper for a simple GPU matrix class.

I think you're talking across each other here. Sturla is referring to making a numpy ndarray gpu-aware and then expecting expressions like:

z = a*x**2 + b*x + c

to go faster when a, b, c, and x are ndarrays. That's not going to happen.

On the other hand, George is talking about moving higher-level operations (like a matrix product) over to GPU code. This is analogous to numpy.linalg and numpy.dot() using LAPACK routines, and yes, that could help those programs that use such operations. So a GPU LAPACK would be nice.

This is also analogous to using SWIG, or ctypes or cython or weave, or ??? to move a computationally expensive part of the code over to C.

I think anything that makes it easier to write little bits of your code for the GPU would be pretty cool -- a GPU-aware Cython?

Also, perhaps a GPU-aware numexpr could be helpful, which I think is the kind of thing that Sturla was referring to when he wrote: "Incidentally, this will also make it easier to leverage on modern GPUs."

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959 voice
7600 Sand Point Way NE   (206) 526-6329 fax
Seattle, WA 98115        (206) 526-6317 main reception

chris.bar...@noaa.gov
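To make the memory-usage point concrete with the quadratic above, here is roughly what a fused-evaluation tool like numexpr buys (a sketch; plain numexpr on the CPU, not a GPU-aware variant):

import numpy as np
import numexpr as ne

a, b, c = 2.0, -1.0, 0.5
x = np.random.rand(1000000)

z1 = a * x**2 + b * x + c                 # plain numpy: each operation
                                          # allocates a temporary and
                                          # walks memory again
z2 = ne.evaluate("a * x**2 + b * x + c")  # one fused pass, no
                                          # intermediate arrays
assert np.allclose(z1, z2)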
Re: [Numpy-discussion] question about future support for python-3
Hi David,

On Tue, Sep 8, 2009 at 3:56 PM, David Warde-Farley wrote:
> Hey Darren,
>
> On 8-Sep-09, at 3:21 PM, Darren Dale wrote:
>> I'm not a core numpy developer and don't want to step on anybody's
>> toes here. But I was wondering if anyone had considered approaching
>> the Python Software Foundation about support to help get numpy working
>> with python-3?
>
> It's a great idea, but word on the grapevine is they lost a LOT of
> money on PyCon 2009 due to lower than expected turnout (recession,
> etc.); worth a try, perhaps, but I wouldn't hold my breath.

I'm blissfully ignorant of the grapevine. But if the numpy project could make use of additional resources to speed along the transition, and if the PSF is in a position to help (either now or in the future), both parties could benefit from such an arrangement.

Darren
Re: [Numpy-discussion] dtype and dtype.char
Hi Robert,

Ok, we have a section of code that used to be like that:

char t;
switch(type) {
  case NPY_CHAR:
    t = 'c';
    break;
  etc...

I now replaced it with:

char t;
switch(type) {
  case NPY_CHAR:
    t = NPY_CHARLTR;
    break;

But I'm still stuck with numpy.uint64: NPY_UINT64LTR does not seem to exist.

What do you recommend?

C.

On Sep 8, 2009, at 1:02 PM, Robert Kern wrote:
> 2009/9/8 Charles سمير Doutriaux :
>> Hi,
>>
>> I'm testing our code on 64bit vs 32bit.
>>
>> I just realized that the dtype.char is platform dependent.
>>
>> I guess it's normal.
>>
>> Here is my little test:
>>
>> for t in [numpy.byte, numpy.short, numpy.int, numpy.int32, numpy.float,
>>           numpy.float32, numpy.double, numpy.ubyte, numpy.ushort,
>>           numpy.uint, numpy.int64, numpy.uint64]:
>>     print 'Testing type:', t
>>     data = numpy.array([0], dtype=t)
>>     print data.dtype.char, data.dtype
>>
>> On 64bit I get for numpy.uint64:
>> Testing type: <type 'numpy.uint64'>
>> L uint64
>>
>> Whereas on 32bit I get:
>> Testing type: <type 'numpy.uint64'>
>> Q uint64
>>
>> Is it really normal? I guess that means I shouldn't expect the
>> dtype.char to be the same on all platforms.
>>
>> Is that right?
>
> Yes. dtype.char corresponds more closely to the C type ("L" ==
> "unsigned long" and "Q" == "unsigned long long") which is platform
> specific.
>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
>   -- Umberto Eco
Re: [Numpy-discussion] dtype and dtype.char
2009/9/8 Charles سمير Doutriaux :
> Hi,
>
> I'm testing our code on 64bit vs 32bit.
>
> I just realized that the dtype.char is platform dependent.
>
> I guess it's normal.
>
> Here is my little test:
>
> for t in [numpy.byte, numpy.short, numpy.int, numpy.int32, numpy.float,
>           numpy.float32, numpy.double, numpy.ubyte, numpy.ushort,
>           numpy.uint, numpy.int64, numpy.uint64]:
>     print 'Testing type:', t
>     data = numpy.array([0], dtype=t)
>     print data.dtype.char, data.dtype
>
> On 64bit I get for numpy.uint64:
> Testing type: <type 'numpy.uint64'>
> L uint64
>
> Whereas on 32bit I get:
> Testing type: <type 'numpy.uint64'>
> Q uint64
>
> Is it really normal? I guess that means I shouldn't expect the
> dtype.char to be the same on all platforms.
>
> Is that right?

Yes. dtype.char corresponds more closely to the C type ("L" == "unsigned long" and "Q" == "unsigned long long") which is platform specific.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
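A small demonstration of the point, with checks that stay platform independent (an added example, not from the thread):

import numpy as np

# dtype.char exposes the underlying C type, so it differs across
# platforms: uint64 may print 'L' (unsigned long) or 'Q' (unsigned
# long long).
print(np.dtype(np.uint64).char)

# Size- and kind-based comparisons are stable everywhere:
assert np.dtype(np.uint64) == np.dtype('u8')   # 8-byte unsigned integer
assert np.dtype(np.uint64).itemsize == 8
assert np.dtype(np.uint64).kind == 'u'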
[Numpy-discussion] dtype and dtype.char
Hi,

I'm testing our code on 64bit vs 32bit.

I just realized that the dtype.char is platform dependent.

I guess it's normal.

Here is my little test:

for t in [numpy.byte, numpy.short, numpy.int, numpy.int32, numpy.float,
          numpy.float32, numpy.double, numpy.ubyte, numpy.ushort,
          numpy.uint, numpy.int64, numpy.uint64]:
    print 'Testing type:', t
    data = numpy.array([0], dtype=t)
    print data.dtype.char, data.dtype

On 64bit I get for numpy.uint64:
Testing type: <type 'numpy.uint64'>
L uint64

Whereas on 32bit I get:
Testing type: <type 'numpy.uint64'>
Q uint64

Is it really normal? I guess that means I shouldn't expect the dtype.char to be the same on all platforms.

Is that right?

C.
Re: [Numpy-discussion] question about future support for python-3
Hey Darren,

On 8-Sep-09, at 3:21 PM, Darren Dale wrote:
> I'm not a core numpy developer and don't want to step on anybody's
> toes here. But I was wondering if anyone had considered approaching
> the Python Software Foundation about support to help get numpy working
> with python-3?

It's a great idea, but word on the grapevine is they lost a LOT of money on PyCon 2009 due to lower than expected turnout (recession, etc.); worth a try, perhaps, but I wouldn't hold my breath.

David
Re: [Numpy-discussion] Fwd: GPU Numpy
Sturla Molden writes:
> Erik Tollerud skrev:
>>> NumPy arrays on the GPU memory is an easy task. But then I would have to
>>> write the computation in OpenCL's dialect of C99?
>> This is true to some extent, but also probably difficult to do given
>> the fact that parallelizable algorithms are generally more difficult
>> to formulate in straightforward ways.
>
> Then you have misunderstood me completely. Creating an ndarray that has
> a buffer in graphics memory is not too difficult, given that graphics
> memory can be memory mapped. This has nothing to do with parallelizable
> algorithms or not. It is just memory management. We could make an
> ndarray subclass that quickly puts its content in a buffer accessible to
> the GPU. That is not difficult. But then comes the question of what you
> do with it.
>
> I think many here misunderstand the issue:
>
> Teraflops peak performance of modern GPUs is impressive. But NumPy
> cannot easily benefit from that. In fact, there is little or nothing to
> gain from optimising in that end. In order for a GPU to help,
> computation must be the time-limiting factor. It is not. There is not
> more to say about using GPUs in NumPy right now.
>
> Take a look at the timings here: http://www.scipy.org/PerformancePython
> It shows that computing with NumPy is more than ten times slower than
> using plain C. This is despite NumPy being written in C. The NumPy code
> does not incur 10 times more floating point operations than the C code.
> The floating point unit does not run in turtle mode when using NumPy.
> NumPy's relative slowness compared to C has nothing to do with floating
> point computation. It is due to inferior memory use (temporary buffers,
> multiple buffer traversals) and memory access being slow. Moving
> computation to the GPU can only make this worse.
>
> Improved memory usage - e.g. through lazy evaluation and JIT compilation
> of expressions - can give up to a tenfold increase in performance. That
> is where we must start optimising to get a faster NumPy. Incidentally,
> this will also make it easier to leverage on modern GPUs.
>
> Sturla Molden

I know that for my work, I can get around a 50-fold speedup over numpy using a python wrapper for a simple GPU matrix class. I might be dealing with a lot of matrix products where I multiply a fixed 512 by 784 matrix by a 784 by 256 matrix that changes between each matrix product, although to really see the largest gains I use a 4096 by 2048 matrix times a bunch of 2048 by 256 matrices. If all I was doing were those matrix products, it would be even faster, but what I actually am doing is a matrix product, then adding a column vector to the result, then applying an elementwise logistic sigmoid function, and potentially generating a matrix of pseudorandom numbers the same shape as my result (although not always). When I do these sorts of workloads, my python numpy+GPU matrix class goes so much faster than anything that doesn't use the GPU (be it Matlab, or numpy, or C/C++, whatever) that I don't even bother measuring the speedups precisely. In some cases, my python code isn't making too many temporaries since what it is doing is so simple, but in other cases that is obviously slowing it down a bit. I have relatively complicated jobs that used to take weeks on the CPU and now take hours or days.

Obviously improved memory usage would be more helpful, since not everyone has access to the sorts of GPUs I use, but tenfold increases in performance seem like chump change compared to what I see with the sorts of workloads I do.
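For reference, the per-step workload George describes looks roughly like this in plain numpy (shapes taken from his post; the variable names and random data are placeholders):

import numpy as np

W = np.random.randn(512, 784).astype(np.float32)  # fixed matrix
X = np.random.randn(784, 256).astype(np.float32)  # changes each step
b = np.random.randn(512, 1).astype(np.float32)    # column vector

# Matrix product, bias add, then elementwise logistic sigmoid - several
# temporaries on the CPU, but one or two fused kernels on a GPU.
H = 1.0 / (1.0 + np.exp(-(np.dot(W, X) + b)))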
[Numpy-discussion] question about future support for python-3
I'm not a core numpy developer and don't want to step on anybody's toes here. But I was wondering if anyone had considered approaching the Python Software Foundation about support to help get numpy working with python-3?

Thanks,
Darren
Re: [Numpy-discussion] creating mesh data from xyz data
On 2009-09-08 10:38, Christopher Barker wrote:
> Giuseppe Aprea wrote:
>> I have some files with data stored in columns:
>>
>> x1 y1 z1
>> x2 y2 z2
>> x3 y3 z3
>> x4 y4 z4
>> x5 y5 z5
>>
>> I usually load data using 3 lists: x, y and z; I wonder if there is
>> any function which is able to take these 3 lists and return the right
>> inputs for matplotlib functions.
>
> There may be some MPL utilities that help with this, so you may want to
> ask there, but:
>
> What you want to do depends on the nature of your data. If your data is
> on a rectangular structured grid, then you should use your knowledge of
> the data structure to re-create that structure to pass to MPL.

To expand on Chris's very nice explanation: if the data points are in "raster" order, where x1 = x2 = ... = xn and so forth, then you can use reshape to get your arrays for matplotlib. Here's an example:

>>> x = [1]*5 + [2]*5 + [3]*5
>>> y = [6,7,8,9,10]*3
>>> z = range(15)
>>> x, y, z
([1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3],
 [6, 7, 8, 9, 10, 6, 7, 8, 9, 10, 6, 7, 8, 9, 10],
 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
>>> plot_x = np.array(x).reshape(3,5)
>>> plot_y = np.array(y).reshape(3,5)
>>> plot_z = np.array(z).reshape(3,5)
>>> plot_x, plot_y, plot_z
(array([[1, 1, 1, 1, 1],
        [2, 2, 2, 2, 2],
        [3, 3, 3, 3, 3]]),
 array([[ 6,  7,  8,  9, 10],
        [ 6,  7,  8,  9, 10],
        [ 6,  7,  8,  9, 10]]),
 array([[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]]))

-Neil
Re: [Numpy-discussion] numpy/scipy/matplotlib + 10.6 + Apple python 2.6.1
David Cournapeau wrote:
> I think it is best to avoid touching anything in /System.

Yes, it is.

> The better solution is to install things locally, at least if you
> don't need to share one install with several users.

And if you do, you can put it in:

/Library/Frameworks

(/Library is kind of Apple's answer to /usr/local, at least for Frameworks.)

What that means is that you need to install a new Python, too. I think those notes were for using the Apple-supplied Python. But it's a good idea to build your own Python (or install the python.org one) in /Library anyway -- Apple has never upgraded a Python within an OS-X release, and tends to have a bunch of not-quite-up-to-date packages installed. Since you don't know which of those packages are being used by Apple utilities, and Python doesn't provide a package versioning system, and not all package updates are fully backwards compatible, it's best to simply not mess with Apple's python at all.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959 voice
7600 Sand Point Way NE   (206) 526-6329 fax
Seattle, WA 98115        (206) 526-6317 main reception

chris.bar...@noaa.gov
Re: [Numpy-discussion] Behavior from a change in dtype?
Skipper Seabold wrote:
> Hmm, okay, well I came across this in trying to create a recarray like
> data2 below, so I guess I should just combine the two questions.

Key to understanding this is to understand what is going on under the hood in numpy. Travis O. gave a nice intro in an Enthought webcast a few months ago -- I'm not sure if those are recorded and up on the web, but it's worth a look. It was also discussed in the advanced numpy tutorial at SciPy this year -- and that is up on the web:

http://www.archive.org/details/scipy09_advancedTutorialDay1_1

Anyway, here is my minimal attempt to clarify:

> import numpy as np
>
> data = np.array([[10.75, 1, 1],[10.39, 0, 1],[18.18, 0, 1]])

Here we are using a standard array constructor -- it will look at the data you are passing in (a mixture of python floats and ints), and decide that they can best be represented by a numpy array of float64s.

numpy arrays are essentially a pointer to a block of memory, and a bunch of attributes that describe how the bytes pointed to are to be interpreted. In this case, they are 9 C doubles, representing a 3x3 array of doubles.

> dt = np.dtype([('var1', 'f8'), ('var2', '>i8'), ('var3', '>i8')])

This is a data type descriptor that is analogous to a C struct, containing a float64 and two int64s.

> # Doesn't work, raises TypeError: expected a readable buffer object
> data2 = data2.view(np.recarray)
> data2.astype(dt)

I don't understand that error either, but recarrays are about adding the ability to access parts of a structured array by name; you still need the dtype to specify the types and names. This does seem to work (though it may not be giving the results you expect):

In [19]: data2 = data.copy()
In [20]: data2 = data2.view(np.recarray)
In [21]: data2 = data2.view(dtype=dt)

or, indeed, in the opposite order:

In [24]: data2 = data.copy()
In [25]: data2 = data2.view(dtype=dt)
In [26]: data2 = data2.view(np.recarray)

So you've done two operations: one is to change the dtype -- the interpretation of the bytes in the data buffer -- and one is to make this a recarray, which allows you to access the "fields" by name:

In [31]: data2['var1']
Out[31]:
array([[ 10.75],
       [ 10.39],
       [ 18.18]])

> # Works without error (?) with unexpected result
> data3 = data3.view(np.recarray)
> data3.dtype = dt

That all depends what you expect! I used "view" above 'cause I think there is less magic, though it's the same thing. I suppose changing the dtype in place like that is a tiny bit more efficient -- if you use .view(), you are creating a new array pointing to the same data, rather than changing the array in place.

But anyway, the dtype describes how the bytes in the memory block are to be interpreted; changing it by assigning the attribute or using .view() changes the interpretation, but does not change the bytes themselves at all. So in this case, you are taking the 8 bytes representing a float64 of value 1.0, and interpreting those bytes as an 8-byte int -- which is going to give you garbage, essentially.

> # One correct (though IMHO unintuitive) way
> data = np.rec.fromarrays(data.swapaxes(1,0), dtype=dt)

This is using the np.rec.fromarrays constructor to build a new record array with the dtype you want; the data is being converted and copied, and it won't change the original at all.

So the question remains -- is there a way to convert the floats in "data" to ints in place?

This seems to work:

In [78]: data = np.array([[10.75, 1, 1],[10.39, 0, 1],[18.18, 0, 1]])
In [79]: data[:,1:3] = data[:,1:3].astype('>i8').view(dtype='>f8')
In [80]: data.dtype = dt

It is making a copy of the integer data in the process -- but I think that is required, as you are changing the values, not just the interpretation of the bytes. I suppose we could have an "astype_inplace" method, but that would only work if the two types were the same size, and I'm not sure it's a common enough use to be worth it.

What is your real use case? I suspect that what you really should do here is define your dtype first, then create the array of data:

data = np.array([(10.75, 1, 1), (10.39, 0, 1), (18.18, 0, 1)], dtype=dt)

which does require that you use tuples, rather than lists, to hold the "structs".

HTH,
-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959 voice
7600 Sand Point Way NE   (206) 526-6329 fax
Seattle, WA 98115        (206) 526-6317 main reception

chris.bar...@noaa.gov
Re: [Numpy-discussion] creating mesh data from xyz data
Giuseppe Aprea wrote:
> I have some files with data stored in columns:
>
> x1 y1 z1
> x2 y2 z2
> x3 y3 z3
> x4 y4 z4
> x5 y5 z5
>
> I usually load data using 3 lists: x, y and z; I wonder if there is
> any function which is able to take these 3 lists and return the right
> inputs for matplotlib functions.

There may be some MPL utilities that help with this, so you may want to ask there, but:

What you want to do depends on the nature of your data. If your data is on a rectangular structured grid, then you should use your knowledge of the data structure to re-create that structure to pass to MPL.

If it is unstructured data, i.e. the (x,y) points are at arbitrary positions, then you need some sort of interpolation scheme to get an appropriate rectangular mesh. Here's a good start:

http://www.scipy.org/Cookbook/Matplotlib/Gridding_irregularly_spaced_data

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959 voice
7600 Sand Point Way NE   (206) 526-6329 fax
Seattle, WA 98115        (206) 526-6317 main reception

chris.bar...@noaa.gov
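For the unstructured case, a minimal sketch of the gridding step using scipy.interpolate.griddata (one possible interpolator, chosen for this example; the Cookbook page above shows other approaches, and the sample data here is made up):

import numpy as np
from scipy.interpolate import griddata

# x, y, z would be the flat columns loaded from the files.
x = np.random.rand(200)
y = np.random.rand(200)
z = x**2 + y

# Build a regular mesh and interpolate the scattered points onto it.
xi = np.linspace(0.0, 1.0, 50)
yi = np.linspace(0.0, 1.0, 50)
X, Y = np.meshgrid(xi, yi)
Z = griddata((x, y), z, (X, Y), method='linear')  # NaN outside the
                                                  # convex hull
# X, Y, Z can now go straight into matplotlib's contour/contourf.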
[Numpy-discussion] creating mesh data from xyz data
Hi list,

I have some files with data stored in columns:

x1 y1 z1
x2 y2 z2
x3 y3 z3
x4 y4 z4
x5 y5 z5
...

and I need to make a contour plot of this data using matplotlib. The problem is that contour plot functions usually handle a different kind of input:

X=[[x1,x2,x3,x4,x5,x6],
   [x1,x2,x3,x4,x5,x6],
   [x1,x2,x3,x4,x5,x6], ...

Y=[[y1,y1,y1,y1,y1,y1],
   [y2,y2,y2,y2,y2,y2],
   [y3,y3,y3,y3,y3,y3], ...

Z=[[z1,z2,z3,z4,z5,z6],
   [z7,z8,z9,z10,z11,z12], ...

I usually load data using 3 lists: x, y and z; I wonder if there is any function which is able to take these 3 lists and return the right inputs for matplotlib functions.

cheers

g
Re: [Numpy-discussion] greppable file of all numpy functions ?
denis bzowy writes:
> Does anyone have a program to generate a file with one line per Numpy
> function / class / method, for local grepping?

Sorry I wasn't clear: I want just all defs, one per long line, like this:

...
PyQt4.QtCore.QObject.findChildren(type type, QRegExp regExp) -> list
PyQt4.QtCore.QObject.emit(SIGNAL(), ...)
PyQt4.QtCore.QObject.objectName() -> QString
PyQt4.QtCore.QObject.setObjectName(QString name)
PyQt4.QtCore.QObject.isWidgetType() -> bool
...

This file (PyQt4.api) is a bit different, but you get the idea:

egrep kilroy all.defs -> a.b.c.kilroy ... with args -- no __doc__

then pydoc or ipython %whoosh a.b.c.kilroy -> __doc__ is step 2.

Sound dumb? Well, grep is fast, simple, and works even when you don't know enough for tree-structured search.

Whooshdoc looks very nice; can it do just all.defs?

(Oops:

wdoc -v index numpy.core numpy.lib -> ...
  File "/opt/local/lib/python2.5/site-packages/epydoc-3.0.1-py2.5.egg/epydoc/doc
    module_doc.package.submodules.append(module_doc)
AttributeError: _Sentinel instance has no attribute 'append'

log on the way to enthought-dev)
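One quick way to generate such an all.defs for numpy is a short introspection script (a sketch of an added example; it uses today's inspect.signature, and C-implemented callables without introspectable signatures fall back to "(...)"):

import inspect
import numpy

# One line per public top-level numpy callable, suitable for grepping.
for name in sorted(dir(numpy)):
    if name.startswith('_'):
        continue
    obj = getattr(numpy, name)
    if not callable(obj):
        continue
    try:
        sig = str(inspect.signature(obj))
    except (TypeError, ValueError):
        sig = '(...)'   # many C ufuncs don't expose a signature
    print('numpy.%s%s' % (name, sig))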