Re: [Numpy-discussion] Huge arrays

2009-09-08 Thread David Cournapeau
On Wed, Sep 9, 2009 at 2:10 PM, Sebastian Haase wrote:
> Hi,
> you can probably use PyTables for this. Even though it's meant to
> save/load data to/from disk (in HDF5 format), as far as I understand
> it can be used to make your task solvable - even on a 32-bit system!
> It's free (pytables.org) -- so maybe you can try it out and tell me if
> I'm right

You still would not be able to load a numpy array > 2 GB. Numpy's memory
model needs one contiguously addressable chunk of memory for the data,
which is limited under 32-bit archs. This cannot be overcome in
any way AFAIK.

You may be able to save data > 2 GB by appending several chunks < 2
GB to disk - maybe pytables supports this if it has large file support
(which enables writing files > 2 GB on a 32-bit system).
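
For instance, something along these lines (an untested sketch; the file
name and chunk sizes here are made up):

import numpy as np

# Append several sub-2 GB chunks to a single file on disk.
with open('data.bin', 'wb') as f:
    for i in range(8):
        chunk = np.zeros((256, 200000), dtype=np.int16)  # stand-in for real data
        chunk.tofile(f)  # raw bytes, appended in C order

# Read it back lazily with a memory map; mapping the whole file at once
# only works where the address space allows it, i.e. on 64-bit systems.
data = np.memmap('data.bin', dtype=np.int16, mode='r')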

cheers,

David


Re: [Numpy-discussion] Huge arrays

2009-09-08 Thread Sebastian Haase
Hi,
you can probably use PyTables for this. Even though it's meant to
save/load data to/from disk (in HDF5 format), as far as I understand
it can be used to make your task solvable - even on a 32-bit system!
It's free (pytables.org) -- so maybe you can try it out and tell me if
I'm right
Or someone else here would know right away...
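
A rough sketch of what I mean (untested; the atom and shapes are just
placeholders):

import numpy
import tables

f = tables.openFile('data.h5', 'w')
# An EArray is extendible along its first axis, so chunks can be
# appended one at a time without holding everything in memory at once.
arr = f.createEArray(f.root, 'data', tables.Int16Atom(), shape=(0, 200))
for i in range(8):
    arr.append(numpy.zeros((256, 200), dtype=numpy.int16))
f.close()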

Cheers,
Sebastian Haase


On Wed, Sep 9, 2009 at 6:19 AM, Sturla Molden wrote:
> Daniel Platz skrev:
>> data1 = numpy.zeros((256,2000000),dtype=int16)
>> data2 = numpy.zeros((256,2000000),dtype=int16)
>>
>> This works for the first array data1. However, it returns with a
>> memory error for array data2. I have read somewhere that there is a
>> 2GB limit for numpy arrays on a 32 bit machine but shouldn't I still
>> be below that? I use Windows XP Pro 32 bit with 3GB of RAM.
>
> There is a 2 GB limit for user space on Win32; in practice about 1.9 GB
> is usable. You have other programs running as well, so this is still
> too much. Also, Windows reserves 50% of RAM for itself, so you have
> less than 1.5 GB to play with.
>
> S.M.
>


Re: [Numpy-discussion] Huge arrays

2009-09-08 Thread Sturla Molden
Daniel Platz skrev:
> data1 = numpy.zeros((256,2000000),dtype=int16)
> data2 = numpy.zeros((256,2000000),dtype=int16)
>
> This works for the first array data1. However, it returns with a
> memory error for array data2. I have read somewhere that there is a
> 2GB limit for numpy arrays on a 32 bit machine but shouldn't I still
> be below that? I use Windows XP Pro 32 bit with 3GB of RAM.

There is a 2 GB limit for user space on Win32; in practice about 1.9 GB
is usable. You have other programs running as well, so this is still
too much. Also, Windows reserves 50% of RAM for itself, so you have
less than 1.5 GB to play with.

S.M.



Re: [Numpy-discussion] Huge arrays

2009-09-08 Thread Charles R Harris
On Tue, Sep 8, 2009 at 7:30 PM, Daniel Platz <
mail.to.daniel.pl...@googlemail.com> wrote:

> Hi,
>
> I have a numpy newbie question. I want to store a huge amount of data
> in an array. This data comes from a measurement setup, and I want to
> write it to disk later since there is nearly no time for this during
> the measurement. To put some numbers up: I have 2*256*2000000 int16
> numbers which I want to store. I tried
>
> data1 = numpy.zeros((256,2000000),dtype=int16)
> data2 = numpy.zeros((256,2000000),dtype=int16)
>
> This works for the first array data1. However, it returns with a
> memory error for array data2. I have read somewhere that there is a
> 2GB limit for numpy arrays on a 32 bit machine but shouldn't I still
> be below that? I use Windows XP Pro 32 bit with 3GB of RAM.
>
>
More precisely, 2 GB for Windows and 3 GB for (non-PAE enabled) Linux. The
rest of the address space is set aside for the operating system. Note that
address space is not the same as physical memory, but it sets a limit on
what you can use, whether swap or real memory.

Chuck.


Re: [Numpy-discussion] dtype and dtype.char

2009-09-08 Thread Charles سمير Doutriaux
Ok I finally got it

I was going at it backward... Instead of checking for NPY_INT64 and
trying to figure out which letter it is (different on each platform), I
needed to check for NPY_LONGLONG / NPY_LONG / NPY_INT, etc.

i.e. I need to check for the numpy types that have an associated unique
letter, not their aliases, since these can be different...

It works now.

C.


On Sep 8, 2009, at 1:13 PM, Charles سمير Doutriaux wrote:

> Hi Robert,
>
> Ok we have a section of code that used to be like that:
>
>   char t;
>   switch(type) {
>   case NPY_CHAR:
>     t = 'c';
>     break;
>   etc...
>
> I now replaced with
>
>   char t;
>   switch(type) {
>   case NPY_CHAR:
>     t = NPY_CHARLTR;
>     break;
>
> But I'm still stuck with numpy.uint64
> NPY_UINT64LTR does not seem to exist
>
> What do you recommend?
>
> C.
>
> On Sep 8, 2009, at 1:02 PM, Robert Kern wrote:
>
>> 2009/9/8 Charles سمير Doutriaux :
>>> Hi,
>>>
>>> I'm testing our code on 64bit vs 32bit
>>>
>>> I just realized that the dtype.char is platform dependent.
>>>
>>> I guess it's normal
>>>
>>> here is my little test:
>>> for t in [numpy.byte, numpy.short, numpy.int, numpy.int32,
>>>           numpy.float, numpy.float32, numpy.double, numpy.ubyte,
>>>           numpy.ushort, numpy.uint, numpy.int64, numpy.uint64]:
>>>     print 'Testing type:',t
>>>     data = numpy.array([0], dtype=t)
>>>     print data.dtype.char,data.dtype
>>>
>>>
>>> On 64bit I get for numpy.uint64:
>>> Testing type: 
>>> L uint64
>>>
>>> Whereas on 32bit I get
>>> Testing type: 
>>> Q uint64
>>>
>>> Is it really normal? I guess that means I shouldn't expect the
>>> dtype.char to be the same on all platforms.
>>>
>>> Is that right?
>>
>> Yes. dtype.char corresponds more closely to the C type ("L" ==
>> "unsigned long" and "Q" == "unsigned long long") which is platform
>> specific.
>>
>> -- 
>> Robert Kern
>>
>> "I have come to believe that the whole world is an enigma, a harmless
>> enigma that is made terrible by our own mad attempt to interpret it  
>> as
>> though it had an underlying truth."
>> -- Umberto Eco
>



Re: [Numpy-discussion] question about future support for python-3

2009-09-08 Thread Fernando Perez
On Tue, Sep 8, 2009 at 5:08 PM, David Cournapeau  wrote:
>  - it remains to be seen whether we can do the py3k support in the
> same source tree as the one used for python >= 2.4. Having two source
> trees would make the effort even much bigger, well over the current
> developers' capacity IMHO.

I know ipython is a very different beast than numpy for this
discussion (no C code at all, but extensive, invasive and often
obscure use of the stdlib and the language itself).  But FWIW, I have
convinced myself that we will only really be able to seriously tackle
the py3 transition when we can ditch 2.5 compatibility and have a tree
that runs on 2.6 only, with all the -3 options turned on.  Only at
that point does it become feasible to start attacking the py3
transition for us.  We simply don't have the manpower to manage
multiple source trees that diverge fully and exist separately for 2.x
and 3.x.

Cheers,

f


Re: [Numpy-discussion] question about future support for python-3

2009-09-08 Thread David Cournapeau
On Wed, Sep 9, 2009 at 9:37 AM, Darren Dale wrote:
> Hi David,

>> I already gave my own opinion on py3k, which can be summarized as:
>>  - it is a huge effort, and no core numpy/scipy developer has
>> expressed the urge to transition to py3k, since py3k does not bring
>> much for scientific computing.
>>  - very few packages with a significant portion of C have been ported
>> to my knowledge, hence very little experience on how to do it. AFAIK,
>> only small packages have been ported. Even big, pure python projects
>> have not been ported. The only big C project to have been ported is
>> python itself, and it broke compatibility and used a different source
>> tree than python 2.
>>  - it remains to be seen whether we can do the py3k support in the
>> same source tree as the one used for python >= 2.4. Having two source
>> trees would make the effort even much bigger, well over the current
>> developers' capacity IMHO.
>>
>> The only area where I could see the PSF helping is the point 2: more
>> documentation, more stories about 2->3 transition.
>
> I'm surprised to hear you say that. I would think additional developer
> and/or financial resources would be useful, for all of the reasons you
> listed.

If there were enough resources to pay someone very familiar with the
numpy codebase for a long time, then yes, it could be useful - but I
assume that's out of the question. This would be very expensive, as it
would require several full months IMO.

The PSF could help with point 3, by porting other projects to py3k
and documenting it. The only example I know so far is psycopg2
(http://mail.python.org/pipermail/python-porting/2008-December/10.html).

Paying people to produce documentation about porting C code seems like a
good way to spend money: it would be useful outside the numpy community,
and would presumably be less costly.

David


Re: [Numpy-discussion] Huge arrays

2009-09-08 Thread David Cournapeau
On Wed, Sep 9, 2009 at 9:30 AM, Daniel
Platz wrote:
> Hi,
>
> I have a numpy newbie question. I want to store a huge amount of data
> in an array. This data comes from a measurement setup, and I want to
> write it to disk later since there is nearly no time for this during
> the measurement. To put some numbers up: I have 2*256*2000000 int16
> numbers which I want to store. I tried
>
> data1 = numpy.zeros((256,2000000),dtype=int16)
> data2 = numpy.zeros((256,2000000),dtype=int16)
>
> This works for the first array data1. However, it returns with a
> memory error for array data2. I have read somewhere that there is a
> 2GB limit for numpy arrays on a 32 bit machine

This has nothing to do with numpy per se - it is a fundamental
limitation of 32-bit architectures. Each of your arrays is 1024 MB, so
you won't be able to create two of them.
The 2 GB limit is a theoretical upper limit, and in practice it will
always be lower, if only because python itself needs some memory.
There is also the memory fragmentation problem, which means allocating
one contiguous, almost-2 GB segment will be difficult.
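
To illustrate the fragmentation point (a sketch; the sizes are
arbitrary):

import numpy as np

# One large allocation can fail even when the same total amount fits
# as several smaller, non-contiguous blocks.
try:
    big = np.zeros(900 * 2**20, dtype=np.uint8)       # one ~900 MB block
except MemoryError:
    pieces = [np.zeros(90 * 2**20, dtype=np.uint8)    # ten ~90 MB blocks
              for _ in range(10)]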

> If someone has an idea to help me I would be very glad.

If you really need to deal with arrays that big, you should move to a
64-bit architecture. That's exactly the problem it solves.

cheers,

David


[Numpy-discussion] Huge arrays

2009-09-08 Thread Daniel Platz
Hi,

I have a numpy newbie question. I want to store a huge amount of data
in an array. This data comes from a measurement setup, and I want to
write it to disk later since there is nearly no time for this during
the measurement. To put some numbers up: I have 2*256*2000000 int16
numbers which I want to store. I tried

data1 = numpy.zeros((256,2000000),dtype=int16)
data2 = numpy.zeros((256,2000000),dtype=int16)

This works for the first array data1. However, it returns with a
memory error for array data2. I have read somewhere that there is a
2GB limit for numpy arrays on a 32 bit machine but shouldn't I still
be below that? I use Windows XP Pro 32 bit with 3GB of RAM.

If someone has an idea to help me I would be very glad.

Thanks in advance.

Daniel


Re: [Numpy-discussion] question about future support for python-3

2009-09-08 Thread Darren Dale
Hi David,

On Tue, Sep 8, 2009 at 8:08 PM, David Cournapeau wrote:
> On Wed, Sep 9, 2009 at 4:21 AM, Darren Dale wrote:
>> I'm not a core numpy developer and don't want to step on anybody's
>> toes here. But I was wondering if anyone had considered approaching
>> the Python Software Foundation about support to help get numpy working
>> with python-3?
>
> I already gave my own opinion on py3k, which can be summarized as:
>  - it is a huge effort, and no core numpy/scipy developer has
> expressed the urge to transition to py3k, since py3k does not bring
> much for scientific computing.
>  - very few packages with a significant portion of C have been ported
> to my knowledge, hence very little experience on how to do it. AFAIK,
> only small packages have been ported. Even big, pure python projects
> have not been ported. The only big C project to have been ported is
> python itself, and it broke compatibility and used a different source
> tree than python 2.
>  - it remains to be seen whether we can do the py3k support in the
> same source tree as the one used for python >= 2.4. Having two source
> trees would make the effort even much bigger, well over the current
> developers' capacity IMHO.
>
> The only area where I could see the PSF helping is the point 2: more
> documentation, more stories about 2->3 transition.

I'm surprised to hear you say that. I would think additional developer
and/or financial resources would be useful, for all of the reasons you
listed.

Darren


Re: [Numpy-discussion] question about future support for python-3

2009-09-08 Thread David Cournapeau
On Wed, Sep 9, 2009 at 4:21 AM, Darren Dale wrote:
> I'm not a core numpy developer and don't want to step on anybody's
> toes here. But I was wondering if anyone had considered approaching
> the Python Software Foundation about support to help get numpy working
> with python-3?

I already gave my own opinion on py3k, which can be summarized as:
  - it is a huge effort, and no core numpy/scipy developer has
expressed the urge to transition to py3k, since py3k does not bring
much for scientific computing.
  - very few packages with a significant portion of C have been ported
to my knowledge, hence very little experience on how to do it. AFAIK,
only small packages have been ported. Even big, pure python projects
have not been ported. The only big C project to have been ported is
python itself, and it broke compatibility and used a different source
tree than python 2.
  - it remains to be seen whether we can do the py3k support in the
same source tree as the one used for python >= 2.4. Having two source
trees would make the effort even much bigger, well over the current
developers' capacity IMHO.

The only area where I could see the PSF helping is the point 2: more
documentation, more stories about 2->3 transition.

cheers,

David


Re: [Numpy-discussion] question about future support for python-3

2009-09-08 Thread Charles R Harris
On Tue, Sep 8, 2009 at 5:57 PM, Christian Heimes  wrote:

> Darren Dale wrote:
> > I'm not a core numpy developer and don't want to step on anybody's
> > toes here. But I was wondering if anyone had considered approaching
> > the Python Software Foundation about support to help get numpy working
> > with python-3?
>
> What kind of support are you talking about? Developers, money, software,
> PR, test platforms ...? For quite some time we have been talking on the
> PSF list about ways to aid projects. We are trying to figure out what
> projects need, especially high-profile projects and important
> infrastructure projects. I myself consider NumPy a great asset for both
> the scientific community and Python.
>
>
I think a full-time developer would do the most to speed up the
transition. Having a variety of platforms available for testing is good,
but I don't think it will speed things up significantly.

Chuck


Re: [Numpy-discussion] question about future support for python-3

2009-09-08 Thread Christian Heimes
Darren Dale wrote:
> I'm not a core numpy developer and don't want to step on anybody's
> toes here. But I was wondering if anyone had considered approaching
> the Python Software Foundation about support to help get numpy working
> with python-3?

What kind of support are you talking about? Developers, money, software,
PR, test platforms ...? For quite some time we have been talking on the
PSF list about ways to aid projects. We are trying to figure out what
projects need, especially high-profile projects and important
infrastructure projects. I myself consider NumPy a great asset for both
the scientific community and Python.

It's true that PyCon '09 was a major setback for our finances. But
there are other ways besides money to assist projects. For example, the
snakebite network (http://snakebite.org/) could be very useful for you
once it's open. Please don't ask me about details on its status; I don't
have an account yet. About a month ago we got 14 MSDN premium
subscriptions with full access to MS development tools and all Windows
platforms, which is very useful for porting and testing applications on
Windows. Some core developers may also be interested in assisting you
directly. The PSF might (!) even donate some money, but I'm not in the
position to discuss it.

I can get you in touch with the PSF if you like. I'm a PSF member and a
core developer.

Christian



Re: [Numpy-discussion] Fwd: GPU Numpy

2009-09-08 Thread Christopher Barker
George Dahl wrote:
> Sturla Molden  molden.no> writes:
>> Teraflops peak performance of modern GPUs is impressive. But NumPy 
>> cannot easily benefit from that. 

> I know that for my work, I can get on the order of a 50-fold speedup over
> numpy using a python wrapper for a simple GPU matrix class.

I think you're talking across each other here. Sturla is referring to 
making a numpy ndarray gpu-aware and then expecting expressions like:

z = a*x**2 + b*x + c

to go faster when a, b, c, and x are ndarrays.

That's not going to happen.

On the other hand, George is talking about moving higher-level 
operations (like a matrix product) over to GPU code. This is analogous 
to numpy.linalg and numpy.dot() using LAPACK routines, and yes, that 
could help those programs that use such operations.

So a GPU LAPACK would be nice.

This is also analogous to using SWIG, or ctypes or cython or weave, or 
??? to move a computationally expensive part of the code over to C.

I think anything that makes it easier to write little bits of your code 
for the GPU would be pretty cool -- a GPU-aware Cython?

Also, perhaps a GPU-aware numexpr could be helpful, which I think is the 
kind of thing that Sturla was referring to when he wrote:

"Incidentally, this will also make it easier to leverage modern GPUs."

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] question about future support for python-3

2009-09-08 Thread Darren Dale
Hi David,

On Tue, Sep 8, 2009 at 3:56 PM, David Warde-Farley wrote:
> Hey Darren,
>
> On 8-Sep-09, at 3:21 PM, Darren Dale wrote:
>
>> I'm not a core numpy developer and don't want to step on anybody's
>> toes here. But I was wondering if anyone had considered approaching
>> the Python Software Foundation about support to help get numpy working
>> with python-3?
>
> It's a great idea, but word on the grapevine is they lost a LOT of
> money on PyCon 2009 due to lower than expected turnout (recession,
> etc.); worth a try, perhaps, but I wouldn't hold my breath.

I'm blissfully ignorant of the grapevine. But if the numpy project
could make use of additional resources to speed along the transition,
and if the PSF is in a position to help (either now or in the future),
both parties could benefit from such an arrangement.

Darren


Re: [Numpy-discussion] dtype and dtype.char

2009-09-08 Thread Charles سمير Doutriaux
Hi Robert,

Ok we have a section of code that used to be like that:

   char t;
   switch(type) {
   case NPY_CHAR:
     t = 'c';
     break;
   etc...

I now replaced with

   char t;
   switch(type) {
   case NPY_CHAR:
     t = NPY_CHARLTR;
     break;

But I'm still stuck with numpy.uint64
NPY_UINT64LTR does not seem to exist

What do you recommend?

C.

On Sep 8, 2009, at 1:02 PM, Robert Kern wrote:

> 2009/9/8 Charles سمير Doutriaux :
>> Hi,
>>
>> I'm testing our code on 64bit vs 32bit
>>
>> I just realized that the dtype.char is platform dependent.
>>
>> I guess it's normal
>>
>> here is my little test:
>> for t in [numpy.byte, numpy.short, numpy.int, numpy.int32,
>>           numpy.float, numpy.float32, numpy.double, numpy.ubyte,
>>           numpy.ushort, numpy.uint, numpy.int64, numpy.uint64]:
>>     print 'Testing type:',t
>>     data = numpy.array([0], dtype=t)
>>     print data.dtype.char,data.dtype
>>
>>
>> On 64bit I get for numpy.uint64:
>> Testing type: 
>> L uint64
>>
>> Whereas on 32bit I get
>> Testing type: 
>> Q uint64
>>
>> Is it really normal? I guess that means I shouldn't expect the
>> dtype.char to be the same on all platforms.
>>
>> Is that right?
>
> Yes. dtype.char corresponds more closely to the C type ("L" ==
> "unsigned long" and "Q" == "unsigned long long") which is platform
> specific.
>
> -- 
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
>  -- Umberto Eco



Re: [Numpy-discussion] dtype and dtype.char

2009-09-08 Thread Robert Kern
2009/9/8 Charles سمير Doutriaux :
> Hi,
>
> I'm testing our code on 64bit vs 32bit
>
> I just realized that the dtype.char is platform dependent.
>
> I guess it's normal
>
> here is my little test:
> for t in [numpy.byte, numpy.short, numpy.int, numpy.int32,
>           numpy.float, numpy.float32, numpy.double, numpy.ubyte,
>           numpy.ushort, numpy.uint, numpy.int64, numpy.uint64]:
>     print 'Testing type:',t
>     data = numpy.array([0], dtype=t)
>     print data.dtype.char,data.dtype
>
>
> On 64bit I get for numpy.uint64:
> Testing type: 
> L uint64
>
> Whereas on 32bit I get
> Testing type: 
> Q uint64
>
> Is it really normal? I guess that means I shouldn't expect the
> dtype.char to be the same on all platforms.
>
> Is that right?

Yes. dtype.char corresponds more closely to the C type ("L" ==
"unsigned long" and "Q" == "unsigned long long") which is platform
specific.
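
A quick way to see this, and a couple of platform-independent
alternatives (comparing dtype objects, or using .str / .name):

import numpy as np

dt = np.dtype(np.uint64)
print dt.char                   # 'L' or 'Q', depending on the platform
print dt.str                    # '<u8' (or '>u8'): size + byte order, portable
print dt == np.dtype('uint64')  # True everywhere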

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


[Numpy-discussion] dtype and dtype.char

2009-09-08 Thread Charles سمير Doutriaux
Hi,

I'm testing our code on 64bit vs 32bit

I just realized that the dtype.char is platform dependent.

I guess it's normal

here is my little test:
for t in [numpy.byte, numpy.short, numpy.int, numpy.int32,
          numpy.float, numpy.float32, numpy.double, numpy.ubyte,
          numpy.ushort, numpy.uint, numpy.int64, numpy.uint64]:
    print 'Testing type:',t
    data = numpy.array([0], dtype=t)
    print data.dtype.char,data.dtype


On 64bit I get for numpy.uint64:
Testing type: 
L uint64

Whereas on 32bit I get
Testing type: 
Q uint64

Is it really normal? I guess that means I shouldn't expect the
dtype.char to be the same on all platforms.

Is that right?

C.



Re: [Numpy-discussion] question about future support for python-3

2009-09-08 Thread David Warde-Farley
Hey Darren,

On 8-Sep-09, at 3:21 PM, Darren Dale wrote:

> I'm not a core numpy developer and don't want to step on anybody's
> toes here. But I was wondering if anyone had considered approaching
> the Python Software Foundation about support to help get numpy working
> with python-3?

It's a great idea, but word on the grapevine is they lost a LOT of  
money on PyCon 2009 due to lower than expected turnout (recession,  
etc.); worth a try, perhaps, but I wouldn't hold my breath.

David


Re: [Numpy-discussion] Fwd: GPU Numpy

2009-09-08 Thread George Dahl
Sturla Molden  molden.no> writes:

> 
> Erik Tollerud skrev:
> >> NumPy arrays on the GPU memory is an easy task. But then I would have to
> >> write the computation in OpenCL's dialect of C99? 
> > This is true to some extent, but also probably difficult to do given
> > the fact that parallelizable algorithms are generally more difficult
> > to formulate in straightforward ways.
> Then you have misunderstood me completely. Creating an ndarray that has 
> a buffer in graphics memory is not too difficult, given that graphics 
> memory can be memory mapped. This has nothing to do with parallelizable 
> algorithms or not. It is just memory management. We could make an 
> ndarray subclass that quickly puts its content in a buffer accessible to 
> the GPU. That is not difficult. But then comes the question of what you 
> do with it.
> 
> I think many here misunderstand the issue:
> 
> Teraflops peak performance of modern GPUs is impressive. But NumPy 
> cannot easily benefit from that. In fact, there is little or nothing to 
> gain from optimising in that end. In order for a GPU to help, 
> computation must be the time-limiting factor. It is not. There is not 
> more to say about using GPUs in NumPy right now.
> 
> Take a look at the timings here: http://www.scipy.org/PerformancePython 
> It shows that computing with NumPy is more than ten times slower than 
> using plain C. This is despite NumPy being written in C. The NumPy code 
> does not incur 10 times more floating point operations than the C code. 
> The floating point unit does not run in turtle mode when using NumPy. 
> NumPy's relative slowness compared to C has nothing to do with floating 
> point computation. It is due to inferior memory use (temporary buffers, 
> multiple buffer traversals) and memory access being slow. Moving 
> computation to the GPU can only make this worse.
> 
> Improved memory usage - e.g. through lazy evaluation and JIT compilation 
> of expressions - can give up to a tenfold increase in performance. That 
> is where we must start optimising to get a faster NumPy. Incidentally, 
> this will also make it easier to leverage modern GPUs.
> 
> Sturla Molden
> 


I know that for my work, I can get on the order of a 50-fold speedup over
numpy using a python wrapper for a simple GPU matrix class.  So I might be
dealing with a lot of matrix products where I multiply a fixed 512 by 784 matrix
by a 784 by 256 matrix that changes between each matrix product, although to
really see the largest gains I use a 4096 by 2048 matrix times a bunch of 2048
by 256 matrices.  If all I was doing were those matrix products, it would be
even faster, but what I actually am doing is a matrix product, then adding a
column vector to the result, then applying an elementwise logistic sigmoid
function and potentially generating a matrix of pseudorandom numbers the same
shape as my result (although not always).  When I do these sorts of workloads,
my python numpy+GPU matrix class goes so much faster than anything that doesn't
use the GPU (be it Matlab, or numpy, or C/C++ whatever) that I don't even bother
measuring the speedups precisely.  In some cases, my python code isn't making
too many temporaries since what it is doing is so simple, but in other cases
that is obviously slowing it down a bit.  I have relatively complicated jobs
that used to take weeks on the CPU and can now take hours or days.

Obviously improved memory usage would be more helpful since not everyone has
access to the sorts of GPUs I use, but tenfold increases in performance seem
like chump change compared to what I see with the sorts of workloads I do.
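
In numpy terms, one step of that workload looks roughly like this (a
sketch; the sizes match the ones mentioned above):

import numpy as np

W = np.random.randn(512, 784)  # fixed matrix
X = np.random.randn(784, 256)  # changes between products
b = np.random.randn(512, 1)    # column vector

A = np.dot(W, X) + b              # matrix product, then add the column vector
Y = 1.0 / (1.0 + np.exp(-A))      # elementwise logistic sigmoid
S = np.random.rand(*Y.shape) < Y  # optionally, a pseudorandom binary matrix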



[Numpy-discussion] question about future support for python-3

2009-09-08 Thread Darren Dale
I'm not a core numpy developer and don't want to step on anybody's
toes here. But I was wondering if anyone had considered approaching
the Python Software Foundation about support to help get numpy working
with python-3?

Thanks,
Darren


Re: [Numpy-discussion] creating mesh data from xyz data

2009-09-08 Thread Neil Martinsen-Burrell
On 2009-09-08 10:38 , Christopher Barker wrote:
> Giuseppe Aprea wrote:
>> I have some files with data stored in columns:
>>
>> x1 y1 z1
>> x2 y2 z2
>> x3 y3 z3
>> x4 y4 z4
>> x5 y5 z5
>> I usually load data using 3 lists: x, y and z; I wonder if there is
>> any function which is able to take these 3 lists and return the right
>> inputs for matplotlib functions.
>
> There may be some MPL utilities that help with this, so you may want to
> ask there, but:
>
> What you want to do depends on the nature of your data. If your data is
> on a rectangular structured grid, then you should use your knowledge of
> the data structure to re-create that structure to pass to MPL.

To expand on Chris's very nice explanation, if the data points are in 
"raster" order where x1 = x2 = ... = xn and so forth, then you can use 
reshape to get your arrays for matplotlib.  Here's an example:

 >>> import numpy as np
 >>> x = [1]*5+[2]*5+[3]*5
 >>> y = [6,7,8,9,10]*3
 >>> z = range(15)
 >>> x,y,z
([1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3],
 [6, 7, 8, 9, 10, 6, 7, 8, 9, 10, 6, 7, 8, 9, 10],
 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
 >>> plot_x = np.array(x).reshape(3,5)
 >>> plot_y = np.array(y).reshape(3,5)
 >>> plot_z = np.array(z).reshape(3,5)
 >>> plot_x,plot_y,plot_z
(array([[1, 1, 1, 1, 1],
        [2, 2, 2, 2, 2],
        [3, 3, 3, 3, 3]]),
 array([[ 6,  7,  8,  9, 10],
        [ 6,  7,  8,  9, 10],
        [ 6,  7,  8,  9, 10]]),
 array([[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]]))
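
If the unique coordinate values are already at hand, np.meshgrid builds
the same coordinate arrays directly (a sketch; xi/yi are the distinct x
and y values):

 >>> xi = np.array([1, 2, 3])
 >>> yi = np.array([6, 7, 8, 9, 10])
 >>> plot_y, plot_x = np.meshgrid(yi, xi)  # each has shape (3, 5)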

-Neil


Re: [Numpy-discussion] numpy/scipy/matplotlib + 10.6 + Apple python 2.6.1

2009-09-08 Thread Christopher Barker
David Cournapeau wrote:
> I think it is best to avoid touching anything in /System.

Yes, it is.

> The better
> solution is to install things locally, at least if you don't need to
> share with several users one install.

And if you do, you can put it in:

/Library/Frameworks

(/Library is kind of Apple's answer to /usr/local, at least for Frameworks)

What that means is that you need to install a new Python, too. I think 
those notes were for using the Apple-supplied Python. But it's a good 
idea to build your own Python (or install the python.org one) in 
/Library anyway -- Apple has never upgraded a Python within an OS-X 
release, and tends to have a bunch of not-quite-up-to-date packages 
installed. Since you don't know which of those packages are being used 
by Apple utilities, and Python doesn't provide a package versioning 
system, and not all package updates are fully backwards compatible, it's 
best to simply not mess with Apple's python at all.

-Chris



-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] Behavior from a change in dtype?

2009-09-08 Thread Christopher Barker
Skipper Seabold wrote:
> Hmm, okay, well I came across this in trying to create a recarray like
> data2 below, so I guess I should just combine the two questions.

key to understanding this is what is going on under the 
hood in numpy. Travis O. gave a nice intro in an Enthought webcast a few 
months ago -- I'm not sure if those are recorded and up on the web, but 
it's worth a look. It was also discussed in the advanced numpy tutorial 
at SciPy this year -- and that is up on the web:

http://www.archive.org/details/scipy09_advancedTutorialDay1_1


Anyway, here is my minimal attempt to clarify:

> import numpy as np
> 
> data = np.array([[10.75, 1, 1],[10.39, 0, 1],[18.18, 0, 1]])

here we are using a standard array constructor -- it will look at the 
data you are passing in (a mixture of python floats and ints), and 
decide that they can best be represented by a numpy array of float64s.

numpy arrays are essentially a pointer to a block of memory, and a bunch 
of attributes that describe how the bytes pointed to are to be 
interpreted. In this case, they are 9 C doubles, representing a 3x3 
array of doubles.

> dt = np.dtype([('var1', 'f8'), ('var2', '>i8'), ('var3', '>i8')])

This is a data type descriptor that is analogous to a C struct, 
containing a float64 and two int64s.

> # Doesn't work, raises TypeError: expected a readable buffer object
> data2 = data2.view(np.recarray)
> data2.astype(dt)

I don't understand that error either, but recarrays are about adding 
the ability to access parts of a structured array by name; you still 
need the dtype to specify the types and names. This does seem to work 
(though it may not be giving the results you expect):

In [19]: data2 = data.copy()
In [20]: data2 = data2.view(np.recarray)
In [21]: data2 = data2.view(dtype=dt)

or, indeed in the opposite order:

In [24]: data2 = data.copy()
In [25]: data2 = data2.view(dtype=dt)
In [26]: data2 = data2.view(np.recarray)


So you've done two operations, one is to change the dtype -- the 
interpretation of the bytes in the data buffer, and one is to make this 
a recarray, which allows you to access the "fields" by name:

In [31]: data2['var1']
Out[31]:
array([[ 10.75],
[ 10.39],
[ 18.18]])

> # Works without error (?) with unexpected result
> data3 = data3.view(np.recarray)
> data3.dtype = dt

That all depends on what you expect! I used "view" above, 'cause I think 
there is less magic, though it's the same thing. I suppose changing the 
dtype in place like that is a tiny bit more efficient -- if you use 
.view(), you are creating a new array pointing to the same data, rather 
than changing the array in place.

But anyway, the dtype describes how the bytes in the memory block are to 
be interpreted; changing it by assigning to the attribute or using 
.view() changes the interpretation, but does not change the bytes 
themselves at all. So in this case, you are taking the 8 bytes 
representing a float64 of value 1.0 and interpreting those bytes as an 
8-byte int -- which is going to give you garbage, essentially.
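
For instance (the big integer is just the bit pattern of the float 1.0):

In [1]: import numpy as np

In [2]: np.array([1.0]).view(np.int64)
Out[2]: array([4607182418800017408])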

> # One correct (though IMHO) unintuitive way
> data = np.rec.fromarrays(data.swapaxes(1,0), dtype=dt)

This is using the np.rec.fromarrays constructor to build a new record 
array with the dtype you want; the data is being converted and copied, 
so it won't change the original at all.

So the question remains -- is there a way to convert the floats in 
"data" to ints in place?


This seems to work:
In [78]: data = np.array([[10.75, 1, 1],[10.39, 0, 1],[18.18, 0, 1]])

In [79]: data[:,1:3] = data[:,1:3].astype('>i8').view(dtype='>f8')

In [80]: data.dtype = dt

It is making a copy of the integer data in the process -- but I think that 
is required, as you are changing the values, not just the interpretation 
of the bytes. I suppose we could have an "astype_inplace" method, but 
that would only work if the two types were the same size, and I'm not 
sure it's a common enough use case to be worth it.

What is your real use case? I suspect that what you really should do 
here is define your dtype first, then create the array of data:

data = np.array([(10.75, 1, 1), (10.39, 0, 1), (18.18, 0, 1)], dtype=dt)

which does require that you use tuples, rather than lists, to hold the 
"structs".

HTH,
  - Chris







-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] creating mesh data from xyz data

2009-09-08 Thread Christopher Barker
Giuseppe Aprea wrote:
> I have some files with data stored in columns:
> 
> x1 y1 z1
> x2 y2 z2
> x3 y3 z3
> x4 y4 z4
> x5 y5 z5
> I usually load data using 3 lists: x, y and z; I wonder if there is
> any function which is able to take these 3 lists and return the right
> inputs for matplotlib functions.

There may be some MPL utilities that help with this, so you may want to 
ask there, but:

What you want to do depends on the nature of your data. If your data is 
on a rectangular structured grid, then you should use your knowledge of 
the data structure to re-create that structure to pass to MPL.

If it is unstructured data: i.e. the (x,y) points are at arbitrary 
positions, then you need some sort of interpolation scheme to get an 
appropriate rectangular mesh:


Here's a good start:

http://www.scipy.org/Cookbook/Matplotlib/Gridding_irregularly_spaced_data
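
For the unstructured case, matplotlib's mlab module has a griddata
helper; a sketch (the sample data and grid resolution are made up):

import numpy as np
from matplotlib.mlab import griddata

# Scattered sample points, standing in for the x/y/z columns of a file.
x = np.random.rand(200)
y = np.random.rand(200)
z = x**2 + y**2

xi = np.linspace(x.min(), x.max(), 100)
yi = np.linspace(y.min(), y.max(), 100)
Zi = griddata(x, y, z, xi, yi)  # interpolate onto the rectangular mesh
# contour(xi, yi, Zi) then works as usual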

-Chris



-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov


[Numpy-discussion] creating mesh data from xyz data

2009-09-08 Thread Giuseppe Aprea
Hi list,

I have some files with data stored in columns:

x1 y1 z1
x2 y2 z2
x3 y3 z3
x4 y4 z4
x5 y5 z5
...

and I need to make a contour plot of this data using matplotlib. The
problem is that contour plot functions usually handle a different kind
of input:

X=[[x1,x2,x3,x4,x5,x6],
   [x1,x2,x3,x4,x5,x6],
   [x1,x2,x3,x4,x5,x6], ...]

Y=[[y1,y1,y1,y1,y1,y1],
   [y2,y2,y2,y2,y2,y2],
   [y3,y3,y3,y3,y3,y3], ...]

Z=[[z1,z2,z3,z4,z5,z6],
   [z7,z8,z9,z10,z11,z12],
   ...]

I usually load data using 3 lists: x, y and z; I wonder if there is
any function which is able to take these 3 lists and return the right
inputs for matplotlib functions.

cheers

g


Re: [Numpy-discussion] greppable file of all numpy functions ?

2009-09-08 Thread denis bzowy
denis bzowy  t-online.de> writes:

> 
> Does anyone have a program to generate a file with one line per Numpy function
> / class / method, for local grepping ?

Sorry I wasn't clear: I want just all defs, one per long line, like this:
...
PyQt4.QtCore.QObject.findChildren(type type, QRegExp regExp) -> list
PyQt4.QtCore.QObject.emit(SIGNAL(), ...)
PyQt4.QtCore.QObject.objectName() -> QString
PyQt4.QtCore.QObject.setObjectName(QString name)
PyQt4.QtCore.QObject.isWidgetType() -> bool
...
This file (PyQt4.api) is a bit different but you get the idea:

egrep kilroy all.defs -> a.b.c.kilroy ... with args -- no __doc__

then pydoc or ipython %whoosh a.b.c.kilroy -> __doc__ is step 2.
Sounds dumb? Well, grep is fast, simple,
and works even when you don't know enough for tree-structured search.
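
A minimal sketch of that first step (introspecting numpy's top-level
namespace; ufuncs have no argspec, hence the fallback):

import inspect
import numpy

for name in sorted(dir(numpy)):
    obj = getattr(numpy, name)
    if callable(obj):
        try:
            args = inspect.formatargspec(*inspect.getargspec(obj))
        except TypeError:  # ufuncs and other builtins have no argspec
            args = '(...)'
        print 'numpy.%s%s' % (name, args)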

Whooshdoc looks very nice, can it do just all.defs ?

(Oops, wdoc -v index numpy.core numpy.lib ->
...
  File "/opt/local/lib/python2.5/site-packages/epydoc-3.0.1-py2.5.egg/epydoc/doc
module_doc.package.submodules.append(module_doc)
AttributeError: _Sentinel instance has no attribute 'append'
log on the way to enthought-dev)