Re: [Numpy-discussion] Fwd: GPU Numpy

2009-09-09 Thread Fernando Perez
On Wed, Sep 9, 2009 at 9:47 PM, Sturla Molden wrote:
> James Bergstra skrev:
>> Suppose you want to evaluate "dot(a*b+c*sqrt(d), e)".  The GPU is
>> great for doing dot(),
> The CPU is equally great (or better?) for doing dot(). In both cases:
>
> - memory access scales O(n) for dot products.
> - computation scales O(n) for dot products.

Remember that we have a little terminology ambiguity here: in numpy,
dot(a,b) is used to describe both the vector dot product, an O(n)
operation if a and b are n-element vectors, and the matrix product, an
O(n**3) operation if a and b are both nxn square matrices.

Just a clarification...
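To make the distinction concrete (plain numpy, nothing exotic):

```python
import numpy as np

v = np.arange(3.0)        # a 3-element vector
A = np.ones((3, 3))       # a 3x3 matrix

s = np.dot(v, v)          # vector dot product: O(n) work, scalar result
M = np.dot(A, A)          # matrix product: O(n**3) work, 3x3 result

print(s)                  # 5.0  (0*0 + 1*1 + 2*2)
print(M.shape)            # (3, 3)
```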

Cheers,

f
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Fwd: GPU Numpy

2009-09-09 Thread David Warde-Farley
On 10-Sep-09, at 12:47 AM, Sturla Molden wrote:

> The CPU is equally great (or better?) for doing dot(). In both cases:
>
> - memory access scales O(n) for dot products.
> - computation scales O(n) for dot products.
> - memory is slow
> - computation is fast (faster for GPU)

You do realize that the throughput from onboard (video) RAM is going  
to be much higher, right? It's not just the parallelization but the  
memory bandwidth. And as James pointed out, if you can keep most of  
your intermediate computation on-card, you stand to benefit immensely,  
even if doing some operations where the GPU provides no tangible  
benefit (i.e. the benefit is in aggregate and avoiding copies).

FWIW I agree with you that NumPy isn't the place for GPU stuff to  
happen. In the short to medium term we need a way to make it simpler  
for naturally expressed computations not to go hog wild with temporary  
allocations (it's a very hard problem given the constraints of the  
language). In the long term I envision something with flexible enough  
machinery to be manipulating objects in GPU memory with the same ease  
as in main memory, but I think the path to that lies in increasing the  
generality and flexibility of the interfaces exposed.

David


Re: [Numpy-discussion] Fwd: GPU Numpy

2009-09-09 Thread Sturla Molden
James Bergstra skrev:
> Suppose you want to evaluate "dot(a*b+c*sqrt(d), e)".  The GPU is
> great for doing dot(), 
The CPU is equally great (or better?) for doing dot(). In both cases:

- memory access scales O(n) for dot products.
- computation scales O(n) for dot products.
- memory is slow
- computation is fast (faster for GPU)

In both cases, the floating point unit is starved. That means it could 
do a lot more work if memory were faster.

For the GPU to be "faster than CPU", you have to have a situation where 
computation dominates over memory access. Matrix-matrix multiplication 
is one such example. This is what GPUs are designed to do, as it is the 
major bottleneck in 3D graphics.

The proper way to speed up "dot(a*b+c*sqrt(d), e)" is to get rid of 
temporary intermediates. That is, in Python pseudo-code:

result = 0
for i in range(n):
    result += (a[i]*b[i] + c[i]*sqrt(d[i])) * e[i]

instead of:

tmp0 = empty(n)
for i in range(n):
    tmp0[i] = a[i] * b[i]

tmp1 = empty(n)
for i in range(n):
    tmp1[i] = sqrt(d[i])

tmp2 = empty(n)
for i in range(n):
    tmp2[i] = c[i] * tmp1[i]

tmp3 = empty(n)
for i in range(n):
    tmp3[i] = tmp0[i] + tmp2[i]

result = 0
for i in range(n):
    result += tmp3[i] * e[i]


It is this proliferation of temporaries that makes NumPy an order of 
magnitude slower than hand-crafted C (but still much faster than pure 
Python!). Adding GPUs will not change this: the amount of computation 
(flop count) is the same, so it cannot be the source of the slowness.
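Within plain NumPy, part of that overhead can be clawed back by reusing a buffer with in-place ufuncs; fully fusing the loops needs something like numexpr or hand-written C. A rough sketch (variable names follow the expression above):

```python
import numpy as np

n = 100_000
a, b, c, d, e = (np.random.rand(n) for _ in range(5))

# Naive form: every operator allocates a fresh temporary array.
naive = np.dot(a * b + c * np.sqrt(d), e)

# Fewer temporaries: reuse one buffer with in-place ufuncs.
buf = np.sqrt(d)      # buf = sqrt(d)
buf *= c              # buf = c*sqrt(d)
buf += a * b          # one remaining temporary for a*b
fused = np.dot(buf, e)

assert np.allclose(naive, fused)
```

The flop count is identical in both versions; only the number of heap allocations and memory passes differs.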


Sturla Molden





Re: [Numpy-discussion] Fwd: GPU Numpy

2009-09-09 Thread Sturla Molden
George Dahl skrev:
> I know that for my work, I can get around a 50-fold speedup over numpy
> using a python wrapper for a simple GPU matrix class.  So I might be
> dealing with a lot of matrix products where I multiply a fixed 512 by
> 784 matrix by a 784 by 256 matrix that changes between each matrix
> product, although to really see the largest gains I use a 4096 by 2048
> matrix times a bunch of 2048 by 256 matrices.



Matrix multiplication is at the core of 3D graphics, and the raison 
d'être for GPUs; it is specifically what they are designed to do. 
Matrix multiplication scales O(n**3) in floating point operations and 
O(n**2) in memory access. That is, GPUs give fast 3D graphics (matrix 
multiplications) by speeding up the floating point operations.

GPUs make sense for certain level-3 BLAS calls, but that really belongs 
in BLAS, not in NumPy's core. One could e.g. consider linking with a 
BLAS wrapper that directs these special cases to the GPU and the rest to 
ATLAS / MKL / netlib BLAS.
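Back-of-the-envelope arithmetic shows why the matrix product is the favourable case: the flop-per-byte ratio of a square matrix product grows linearly with n, so large products keep the floating point units busy. A rough model that ignores caching:

```python
def matmul_intensity(n, itemsize=8):
    """Flops per byte moved for an n x n double-precision matrix product."""
    flops = 2.0 * n**3                 # n**2 output elements, each an n-term multiply-add
    bytes_moved = 3 * n**2 * itemsize  # read A and B once, write C once
    return flops / bytes_moved

print(matmul_intensity(512))    # ~42.7 flops per byte
print(matmul_intensity(4096))   # ~341.3 flops per byte
```

By contrast, a vector dot product moves O(n) bytes for O(n) flops, so its intensity is constant and small, which is why it stays memory-bound on both CPU and GPU.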

Sturla Molden


Re: [Numpy-discussion] Huge arrays

2009-09-09 Thread David Warde-Farley
On 9-Sep-09, at 4:48 AM, Francesc Alted wrote:

> Yes, the latter is supported in PyTables as long as the underlying
> filesystem supports files > 2 GB, which is very usual in modern
> operating systems.

I think the OP said he was on Win32, in which case it should be noted:  
FAT32 has its upper file size limit at 4GB (minus one byte), so  
storing both your arrays as one file on a FAT32 partition is a no-no.
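A quick sanity check before writing the file out (the helper name here is mine, not a PyTables API):

```python
import numpy as np

FAT32_MAX_BYTES = 2**32 - 1  # 4 GB minus one byte

def fits_on_fat32(shape, dtype):
    """True if an array of this shape/dtype, saved as one file, stays under the FAT32 limit."""
    nbytes = int(np.prod(shape, dtype=np.int64)) * np.dtype(dtype).itemsize
    return nbytes <= FAT32_MAX_BYTES

print(fits_on_fat32((10_000, 10_000), np.float64))  # True:  0.8 GB
print(fits_on_fat32((25_000, 25_000), np.float64))  # False: 5.0 GB
```

An array over the limit would have to be split across several files (or the partition reformatted to NTFS).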

David


Re: [Numpy-discussion] Adding a 2D with a 1D array...

2009-09-09 Thread Dag Sverre Seljebotn
Ruben Salvador wrote:
> Your results are what I expected...but. This code is called from my main 
> program, and what I have in there (output array already created for both 
> cases) is:
> 
> print "lambd", lambd
> print "np.shape(a)", np.shape(a)
> print "np.shape(r)", np.shape(r)
> print "np.shape(offspr)", np.shape(offspr)
> t = clock()
> for i in range(lambd):
> offspr[i] = r[i] + a[i] 
> t1 = clock() - t
> print "For loop time ==> %.8f seconds" % t1
> t2 = clock()
> offspr = r + a[:,None]
> t3 = clock() - t2
> print "Pythonic time ==> %.8f seconds" % t3
> 
> The results I obtain are:
> 
> lambd 8
> np.shape(a) (8,)
> np.shape(r) (8, 26)
> np.shape(offspr) (8, 26)
> For loop time ==> 0.34528804 seconds
> Pythonic time ==> 0.35956192 seconds
> 
> Maybe I'm not measuring properly, so, how should I do it?

Like Luca said, you are not including the creation time of offspr in the 
for-loop version. A fairer comparison would be

offspr[...] = r + a[:, None]

Even fairer (one less temporary copy):

offspr[...] = r
offspr += a[:, None]

Of course, see how the trend is for larger N as well.

Also your timings are a bit crude (though this depends on how many times 
you ran your script to check :-)). To get better measurements, use the 
timeit module, or (easier) IPython and the %timeit command.
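For instance, with the timeit module (array sizes taken from Ruben's output above; the measured numbers will of course vary by machine):

```python
import timeit

setup = """
import numpy as np
a = np.random.rand(8)
r = np.random.rand(8, 26)
offspr = np.empty_like(r)
"""

loop_stmt = "for i in range(8): offspr[i] = r[i] + a[i]"
vec_stmt = "offspr[...] = r; offspr += a[:, None]"

t_loop = timeit.timeit(loop_stmt, setup=setup, number=10_000)
t_vec = timeit.timeit(vec_stmt, setup=setup, number=10_000)
print("loop: %.6f s   vectorized: %.6f s" % (t_loop, t_vec))
```

Because the setup (including the allocation of offspr) runs outside the timed statement, both versions are measured on equal footing.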

> 
> On Wed, Sep 9, 2009 at 1:20 PM, Citi, Luca wrote:
> 
> I am sorry but it doesn't make much sense.
> How do you measure the performance?
> Are you sure you include the creation of the "c" output array in the
> time spent (which is outside the for loop but should be considered
> anyway)?
> 
> Here are my results...
> 
> In [84]: a = np.random.rand(8,26)
> 
> In [85]: b = np.random.rand(8)
> 
> In [86]: def o(a,b):
>    ...:     c = np.empty_like(a)
>    ...:     for i in range(len(a)):
>    ...:         c[i] = a[i] + b[i]
>    ...:     return c
>    ...:
> 
> In [87]: d = a + b[:,None]
> 
> In [88]: (d == o(a,b)).all()
> Out[88]: True
> 
> In [89]: %timeit o(a,b)
> 1 loops, best of 3: 36.8 µs per loop
> 
> In [90]: %timeit d = a + b[:,None]
> 10 loops, best of 3: 5.17 µs per loop
> 
> In [91]: a = np.random.rand(8,26)
> 
> In [92]: b = np.random.rand(8)
> 
> In [93]: %timeit o(a,b)
> 10 loops, best of 3: 287 ms per loop
> 
> In [94]: %timeit d = a + b[:,None]
> 100 loops, best of 3: 15.4 ms per loop
> 


-- 
Dag Sverre


Re: [Numpy-discussion] question about future support for python-3

2009-09-09 Thread Robert Kern
On Wed, Sep 9, 2009 at 11:40, Christopher Barker wrote:
> Robert Kern wrote:
>> On Wed, Sep 9, 2009 at 07:15, Darren Dale wrote:
>> We aren't supposed to break APIs that aren't related to the 2-3
>> transition. PEP3118 is related to the 2-3 transition. Since I'm that
>> somebody that always pipes up about this topic, I'm pretty sure it
>> hasn't been PEP3118-related breakage that has been proposed.
>
> Is there a difference between changing the C API and the Python API? I'd
> kind of expect that any C code is going to be broken more than Python
> code anyway, so maybe that's a distinction worth making.
>
> Or maybe not -- I suppose the logic is that the transition of user code
> from 2->3 should be as easy as possible, so we don't want users to have
> to deal with Python changes, and numpy changes, and wxPython changes,
> and ??? all at once.

Yes, that is the logic. If the breakage is actually related to the
Python 3 transition (e.g. reworking things to use bytes and unicode
instead of str), then it's okay. You do whatever you need to do to
convert your code to work with Python 3. What you shouldn't do is
break APIs for reasons that are unrelated to the transition, like
cleaning up various warts that have been bugging us over the years.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


Re: [Numpy-discussion] question about future support for python-3

2009-09-09 Thread Christopher Barker
Robert Kern wrote:
> On Wed, Sep 9, 2009 at 07:15, Darren Dale wrote:
> We aren't supposed to break APIs that aren't related to the 2-3
> transition. PEP3118 is related to the 2-3 transition. Since I'm that
> somebody that always pipes up about this topic, I'm pretty sure it
> hasn't been PEP3118-related breakage that has been proposed.

Is there a difference between changing the C API and the Python API? I'd 
kind of expect that any C code is going to be broken more than Python 
code anyway, so maybe that's a distinction worth making.

Or maybe not -- I suppose the logic is that the transition of user code 
from 2->3 should be as easy as possible, so we don't want users to have 
to deal with Python changes, and numpy changes, and wxPython changes, 
and ??? all at once.

-Chris





-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] question about future support for python-3

2009-09-09 Thread Dag Sverre Seljebotn
Dag Sverre Seljebotn wrote:
> Darren Dale wrote:
>> On Tue, Sep 8, 2009 at 9:02 PM, David Cournapeau wrote:
>>> On Wed, Sep 9, 2009 at 9:37 AM, Darren Dale wrote:
 Hi David,
> I already gave my own opinion on py3k, which can be summarized as:
>  - it is a huge effort, and no core numpy/scipy developer has
> expressed the urge to transition to py3k, since py3k does not bring
> much for scientific computing.
>  - very few packages with a significant portion of C have been ported
> to my knowledge, hence very little experience on how to do it. AFAIK,
> only small packages have been ported. Even big, pure python projects
> have not been ported. The only big C project to have been ported is
> python itself, and it broke compatibility and used a different source
> tree than python 2.
>  - it remains to be seen whether we can do the py3k support in the
> same source tree as the one used for python >= 2.4. Having two source
> trees would make the effort even much bigger, well over the current
> developers capacity IMHO.
>
> The only area where I could see the PSF helping is the point 2: more
> documentation, more stories about 2->3 transition.
 I'm surprised to hear you say that. I would think additional developer
 and/or financial resources would be useful, for all of the reasons you
 listed.
>>> If there was enough resources to pay someone very familiar with numpy
>>> codebase for a long time, then yes, it could be useful - but I assume
>>> that's out of the question. This would be very expensive as it would
>>> require several full months IMO.
>>>
>>> The PSF could help for the point 3, by porting other projects to py3k
>>> and documenting it. The only example I know so far is pycog2
>>> (http://mail.python.org/pipermail/python-porting/2008-December/10.html).
>>>
>>> Paying people to do documentation about porting C code seems like a
>>> good way to spend money: it would be useful outside numpy community,
>>> and would presumably be less costly.
>> Another topic concerning documentation is API compatibility. The
>> python devs have requested projects not use the 2-3 transition as an
>> excuse to change their APIs, but numpy is maybe a special case. I'm
>> thinking about PEP3118. Is numpy going to transition to python 3 and
>> then down the road transition again to the new buffer protocol? What
>> is the strategy here? My underinformed impression is that there isn't
>> one, since every time PEP3118 is considered in the context of the 2-3
>> transition somebody helpfully reminds the list that we aren't supposed
>> to break APIs. Numpy is a critical python library, perhaps the
> 
> I'd be surprised if this is the case and if there are any issues.
> 
> What Robert said applies, plus: In Python 2.6 the ndarray type would 
> support *both* the old and the new buffer protocols, which can be used 
> in parallel on Python 2.6.
> 
> There's no real issue on the PEP 3118 at all as I can see, it just needs 
> to be done. I'll try hard to give this a small start (ndarray export its 
> buffer) in November (though when the time comes I might feel that I 
> really should be studying instead...).
> 
>> transition presents an opportunity, if the community can yield a
>> little on numpy's C api. For example, in the long run, what would it
>> take to get numpy (or the core thereof) into the standard library, and
>> can we take steps now in that direction? Would the numpy devs be
>> receptive to comments from the python devs on the existing numpy
>> codebase?
> 
> I think this one is likely a question of semantics. My feeling is that 
> for instance the slice-returns-a-view on an array type would be hard to 
> swallow on a standard library component? (Seeing as list returns a copy.)
> 
> Python 3 kind of solved this by calling the type "memoryview", which 
> implies that slicing returns another view.
> 
> I have a feeling that the best start in this direction might be for 
> somebody to give the memoryview type in Python 3 some love, perhaps set 
> it up as a light-weight ndarray replacement in the standard library.
> 
> (If anybody implemented fancy indexing on a memoryview I suppose it 
> should return a new view though (through a pointer table), meaning 
> incompatibility with NumPy's fancy indexing...)
> 
>> I'm willing to pitch in and work on the transition, not because I need
>> python-3 right now, but because the transition needs to happen and it
>> would benefit everyone in the long run. But I would like to know that
>> we are making the most of the opportunity, and have considered our
>> options.

Another note: Perhaps there is an opportunity for replacing NumPy with 
more buffer-centric cross-library approaches in Python 3 eventually, but 
current NumPy with the current API really has to be ported to Python 3 
just so that people can port their existing programs to Python 3.



-- 
Dag Sverre

Re: [Numpy-discussion] question about future support for python-3

2009-09-09 Thread Darren Dale
On Wed, Sep 9, 2009 at 11:25 AM, Charles R
Harris wrote:
>
>
> On Wed, Sep 9, 2009 at 7:15 AM, Darren Dale  wrote:
>>
>> On Tue, Sep 8, 2009 at 9:02 PM, David Cournapeau
>> wrote:
>> > On Wed, Sep 9, 2009 at 9:37 AM, Darren Dale wrote:
>> >> Hi David,
>> >
>> >>> I already gave my own opinion on py3k, which can be summarized as:
>> >>>  - it is a huge effort, and no core numpy/scipy developer has
>> >>> expressed the urge to transition to py3k, since py3k does not bring
>> >>> much for scientific computing.
>> >>>  - very few packages with a significant portion of C have been ported
>> >>> to my knowledge, hence very little experience on how to do it. AFAIK,
>> >>> only small packages have been ported. Even big, pure python projects
>> >>> have not been ported. The only big C project to have been ported is
>> >>> python itself, and it broke compatibility and used a different source
>> >>> tree than python 2.
>> >>>  - it remains to be seen whether we can do the py3k support in the
>> >>> same source tree as the one used for python >= 2.4. Having two source
>> >>> trees would make the effort even much bigger, well over the current
>> >>> developers capacity IMHO.
>> >>>
>> >>> The only area where I could see the PSF helping is the point 2: more
>> >>> documentation, more stories about 2->3 transition.
>> >>
>> >> I'm surprised to hear you say that. I would think additional developer
>> >> and/or financial resources would be useful, for all of the reasons you
>> >> listed.
>> >
>> > If there was enough resources to pay someone very familiar with numpy
>> > codebase for a long time, then yes, it could be useful - but I assume
>> > that's out of the question. This would be very expensive as it would
>> > require several full months IMO.
>> >
>> > The PSF could help for the point 3, by porting other projects to py3k
>> > and documenting it. The only example I know so far is pycog2
>> >
>> > (http://mail.python.org/pipermail/python-porting/2008-December/10.html).
>> >
>> > Paying people to do documentation about porting C code seems like a
>> > good way to spend money: it would be useful outside numpy community,
>> > and would presumably be less costly.
>>
>> Another topic concerning documentation is API compatibility. The
>> python devs have requested projects not use the 2-3 transition as an
>> excuse to change their APIs, but numpy is maybe a special case. I'm
>> thinking about PEP3118. Is numpy going to transition to python 3 and
>> then down the road transition again to the new buffer protocol? What
>> is the strategy here? My underinformed impression is that there isn't
>> one, since every time PEP3118 is considered in the context of the 2-3
>> transition somebody helpfully reminds the list that we aren't supposed
>> to break APIs. Numpy is a critical python library, perhaps the
>> transition presents an opportunity, if the community can yield a
>> little on numpy's C api. For example, in the long run, what would it
>> take to get numpy (or the core thereof) into the standard library, and
>> can we take steps now in that direction? Would the numpy devs be
>> receptive to comments from the python devs on the existing numpy
>> codebase?
>>
>> I'm willing to pitch in and work on the transition, not because I need
>> python-3 right now, but because the transition needs to happen and it
>> would benefit everyone in the long run. But I would like to know that
>> we are making the most of the opportunity, and have considered our
>> options.
>>
>
> Making numpy more buffer centric is an interesting idea and might be where
> we want to go with the ufuncs, but the new buffer protocol didn't go in
> until python 2.6. If there was no rush I'd go with Fernando and wait until
> we could be all python 2.6 all the time.

I wonder what such a timeframe would look like, what would decide when
to require python-2.6 for future releases of packages. Could a
maintenance-only branch be created for the numpy-1.4 or 1.5 series,
and then future development require 2.6 or 3.1?

> However, if anyone has the time to work on getting the c-code up to snuff
> and finding out what the problems are I'm all for that. I have some notes on
> the transition in the src directory and if you do anything please keep them
> current.

I will have a look, thank you for putting those notes together.

Darren


Re: [Numpy-discussion] question about future support for python-3

2009-09-09 Thread Dag Sverre Seljebotn
Darren Dale wrote:
> On Tue, Sep 8, 2009 at 9:02 PM, David Cournapeau wrote:
>> On Wed, Sep 9, 2009 at 9:37 AM, Darren Dale wrote:
>>> Hi David,
 I already gave my own opinion on py3k, which can be summarized as:
  - it is a huge effort, and no core numpy/scipy developer has
 expressed the urge to transition to py3k, since py3k does not bring
 much for scientific computing.
  - very few packages with a significant portion of C have been ported
 to my knowledge, hence very little experience on how to do it. AFAIK,
 only small packages have been ported. Even big, pure python projects
 have not been ported. The only big C project to have been ported is
 python itself, and it broke compatibility and used a different source
 tree than python 2.
  - it remains to be seen whether we can do the py3k support in the
 same source tree as the one used for python >= 2.4. Having two source
 trees would make the effort even much bigger, well over the current
 developers capacity IMHO.

 The only area where I could see the PSF helping is the point 2: more
 documentation, more stories about 2->3 transition.
>>> I'm surprised to hear you say that. I would think additional developer
>>> and/or financial resources would be useful, for all of the reasons you
>>> listed.
>> If there was enough resources to pay someone very familiar with numpy
>> codebase for a long time, then yes, it could be useful - but I assume
>> that's out of the question. This would be very expensive as it would
>> require several full months IMO.
>>
>> The PSF could help for the point 3, by porting other projects to py3k
>> and documenting it. The only example I know so far is pycog2
>> (http://mail.python.org/pipermail/python-porting/2008-December/10.html).
>>
>> Paying people to do documentation about porting C code seems like a
>> good way to spend money: it would be useful outside numpy community,
>> and would presumably be less costly.
> 
> Another topic concerning documentation is API compatibility. The
> python devs have requested projects not use the 2-3 transition as an
> excuse to change their APIs, but numpy is maybe a special case. I'm
> thinking about PEP3118. Is numpy going to transition to python 3 and
> then down the road transition again to the new buffer protocol? What
> is the strategy here? My underinformed impression is that there isn't
> one, since every time PEP3118 is considered in the context of the 2-3
> transition somebody helpfully reminds the list that we aren't supposed
> to break APIs. Numpy is a critical python library, perhaps the

I'd be surprised if this is the case and if there are any issues.

What Robert said applies, plus: In Python 2.6 the ndarray type would 
support *both* the old and the new buffer protocols, which can be used 
in parallel on Python 2.6.

There's no real issue on the PEP 3118 at all as I can see, it just needs 
to be done. I'll try hard to give this a small start (ndarray export its 
buffer) in November (though when the time comes I might feel that I 
really should be studying instead...).

> transition presents an opportunity, if the community can yield a
> little on numpy's C api. For example, in the long run, what would it
> take to get numpy (or the core thereof) into the standard library, and
> can we take steps now in that direction? Would the numpy devs be
> receptive to comments from the python devs on the existing numpy
> codebase?

I think this one is likely a question of semantics. My feeling is that 
for instance the slice-returns-a-view on an array type would be hard to 
swallow on a standard library component? (Seeing as list returns a copy.)

Python 3 kind of solved this by calling the type "memoryview", which 
implies that slicing returns another view.

I have a feeling that the best start in this direction might be for 
somebody to give the memoryview type in Python 3 some love, perhaps set 
it up as a light-weight ndarray replacement in the standard library.

(If anybody implemented fancy indexing on a memoryview I suppose it 
should return a new view though (through a pointer table), meaning 
incompatibility with NumPy's fancy indexing...)
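The view semantics described here already hold for memoryview slicing in stock Python, in contrast to list slicing, which copies:

```python
buf = bytearray(b"0123456789")
mv = memoryview(buf)

view = mv[2:5]        # no copy: a window onto the same underlying bytes
buf[3] = ord("X")     # mutate the underlying buffer
print(bytes(view))    # b'2X4' -- the slice sees the change

lst = list(range(10))
sl = lst[2:5]         # list slicing makes a copy
lst[3] = 99
print(sl)             # [2, 3, 4] -- unaffected by the mutation
```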

> I'm willing to pitch in and work on the transition, not because I need
> python-3 right now, but because the transition needs to happen and it
> would benefit everyone in the long run. But I would like to know that
> we are making the most of the opportunity, and have considered our
> options.

Well, something that may belong here: There's been some talk now and 
then on whether one should to port parts of the NumPy C codebase to 
Cython (which gives automatic Python 3 compatability, up to string/bytes 
issues etc.). That could probably take somewhat longer, but perhaps 
result in a better maintainable code base in the end which more people 
could work on.

-- 
Dag Sverre

Re: [Numpy-discussion] Fwd: GPU Numpy

2009-09-09 Thread Dag Sverre Seljebotn
Christopher Barker wrote:
> George Dahl wrote:
>> Sturla Molden writes:
>>> Teraflops peak performance of modern GPUs is impressive. But NumPy 
>>> cannot easily benefit from that. 
> 
>> I know that for my work, I can get around an order of a 50-fold speedup over
>> numpy using a python wrapper for a simple GPU matrix class.
> 
> I think you're talking across each other here. Sturla is referring to 
> making a numpy ndarray gpu-aware and then expecting expressions like:
> 
> z = a*x**2 + b*x + c
> 
> to go faster when a, b, c, and x are ndarrays.
> 
> That's not going to happen.
> 
> On the other hand, George is talking about moving higher-level 
> operations (like a matrix product) over to GPU code. This is analogous 
> to numpy.linalg and numpy.dot() using LAPACK routines, and yes, that 
> could help those programs that use such operations.
> 
> So a GPU LAPACK would be nice.
> 
> This is also analogous to using SWIG, or ctypes or cython or weave, or 
> ??? to move a computationally expensive part of the code over to C.
> 
> I think anything that makes it easier to write little bits of your code 
> for the GPU would be pretty cool -- a GPU-aware Cython?

Cython is probably open to that, if anybody's interested in implementing 
it or making a student project of it (way too big for GSoC, I think, 
unfortunately).

However I'd definitely make it a generic library turning expressions 
into compiled code (either GPU or CPU w/SSE); that could then be used 
both at compile-time from Cython, or at run-time using e.g. SymPy or 
SAGE expressions. Both PyCUDA and CorePy would tend to allow both 
compile-time operation and run-time operation.

-- 
Dag Sverre


Re: [Numpy-discussion] Fwd: GPU Numpy

2009-09-09 Thread James Bergstra
On Wed, Sep 9, 2009 at 10:41 AM, Francesc Alted wrote:
>> Numexpr mainly supports functions that are meant to be used element-wise,
>> so the operation/element ratio is normally 1 (or close to 1).  In these
>> scenarios is where improved memory access is much more important than CPU
>> (or, for that matter, GPU), and is the reason why numexpr is much more
>> efficient than NumPy when evaluating complex expressions like
>> ``a*b+c*sqrt(d)``.
>>
>> In other words, a GPU-enabled numexpr makes little sense.

There's another way of looking at this, which has been mentioned
before in the conversation, but which I think should be mentioned
again...

The cost of transfer to and from a GPU is very high, compared with
most of the sorts of things that we do with ndarrays.  So the approach
of using libraries to speed up little pieces here and there (i.e. with
VML or ATLAS) but basically to let stock numpy take care of the rest
does not work.  In order to benefit from huge speedups on a GPU, data
need to be on the GPU already.  It is a good idea to perform
low-instruction density functions on the GPU even when the CPU could
go just as fast (or even if the CPU is faster!) just to ensure that
the data stay on the GPU.

Suppose you want to evaluate "dot(a*b+c*sqrt(d), e)".  The GPU is
great for doing dot(), but if you have to copy the result of the
elemwise expression to the GPU before you can start doing dot(), then
the performance advantage is ruined.  Except for huge matrices, you
might as well just leave the data in the system RAM and use a normal
BLAS library.

So that's why it is a good idea to use the GPU to do some functions
even when the CPU would be faster for them (in isolation).

All that said, there is a possibility that future devices (and some
laptops already?) will use an integrated memory system that might make
'copying to the GPU' a non-issue... but we're not there yet I think...

James
-- 
http://www-etud.iro.umontreal.ca/~bergstrj


Re: [Numpy-discussion] question about future support for python-3

2009-09-09 Thread Charles R Harris
On Wed, Sep 9, 2009 at 7:15 AM, Darren Dale  wrote:

>  On Tue, Sep 8, 2009 at 9:02 PM, David Cournapeau
> wrote:
> > On Wed, Sep 9, 2009 at 9:37 AM, Darren Dale wrote:
> >> Hi David,
> >
> >>> I already gave my own opinion on py3k, which can be summarized as:
> >>>  - it is a huge effort, and no core numpy/scipy developer has
> >>> expressed the urge to transition to py3k, since py3k does not bring
> >>> much for scientific computing.
> >>>  - very few packages with a significant portion of C have been ported
> >>> to my knowledge, hence very little experience on how to do it. AFAIK,
> >>> only small packages have been ported. Even big, pure python projects
> >>> have not been ported. The only big C project to have been ported is
> >>> python itself, and it broke compatibility and used a different source
> >>> tree than python 2.
> >>>  - it remains to be seen whether we can do the py3k support in the
> >>> same source tree as the one used for python >= 2.4. Having two source
> >>> trees would make the effort even much bigger, well over the current
> >>> developers capacity IMHO.
> >>>
> >>> The only area where I could see the PSF helping is the point 2: more
> >>> documentation, more stories about 2->3 transition.
> >>
> >> I'm surprised to hear you say that. I would think additional developer
> >> and/or financial resources would be useful, for all of the reasons you
> >> listed.
> >
> > If there was enough resources to pay someone very familiar with numpy
> > codebase for a long time, then yes, it could be useful - but I assume
> > that's out of the question. This would be very expensive as it would
> > require several full months IMO.
> >
> > The PSF could help for the point 3, by porting other projects to py3k
> > and documenting it. The only example I know so far is pycog2
> > (
> http://mail.python.org/pipermail/python-porting/2008-December/10.html
> ).
> >
> > Paying people to do documentation about porting C code seems like a
> > good way to spend money: it would be useful outside numpy community,
> > and would presumably be less costly.
>
> Another topic concerning documentation is API compatibility. The
> python devs have requested projects not use the 2-3 transition as an
> excuse to change their APIs, but numpy is maybe a special case. I'm
> thinking about PEP3118. Is numpy going to transition to python 3 and
> then down the road transition again to the new buffer protocol? What
> is the strategy here? My underinformed impression is that there isn't
> one, since every time PEP3118 is considered in the context of the 2-3
> transition somebody helpfully reminds the list that we aren't supposed
> to break APIs. Numpy is a critical python library, perhaps the
> transition presents an opportunity, if the community can yield a
> little on numpy's C api. For example, in the long run, what would it
> take to get numpy (or the core thereof) into the standard library, and
> can we take steps now in that direction? Would the numpy devs be
> receptive to comments from the python devs on the existing numpy
> codebase?
>
> I'm willing to pitch in and work on the transition, not because I need
> python-3 right now, but because the transition needs to happen and it
> would benefit everyone in the long run. But I would like to know that
> we are making the most of the opportunity, and have considered our
> options.
>
>
Making numpy more buffer centric is an interesting idea and might be where
we want to go with the ufuncs, but the new buffer protocol didn't go in
until python 2.6. If there was no rush I'd go with Fernando and wait until
we could be all python 2.6 all the time.

However, if anyone has the time to work on getting the c-code up to snuff
and finding out what the problems are I'm all for that. I have some notes on
the transition in the src directory and if you do anything please keep them
current.

There is a lot of work to be done in the python code also, some of which can
be done at this time, i.e., making all exceptions use class ctors, etc.

Chuck


Re: [Numpy-discussion] re loading f2py modules in ipython

2009-09-09 Thread Robert Kern
On Wed, Sep 9, 2009 at 06:25, John [H2O] wrote:
>
> Hello,
>
> I've started to rely more and more on f2py to create simple modules
> utilizing Fortran for efficiency. This is a great tool to have within
> Python!
>
> A problem, however, is that unlike python modules, the reload() function
> does not seem to update the f2py modules within ipython (which I use
> extensively for testing).
>
> Is there another function to call?

No. Extension modules, regardless of whether they are built with f2py
or pure C, cannot be reloaded. It's a limitation of CPython.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


Re: [Numpy-discussion] question about future support for python-3

2009-09-09 Thread Robert Kern
On Wed, Sep 9, 2009 at 07:15, Darren Dale wrote:

> Another topic concerning documentation is API compatibility. The
> python devs have requested projects not use the 2-3 transition as an
> excuse to change their APIs, but numpy is maybe a special case. I'm
> thinking about PEP3118. Is numpy going to transition to python 3 and
> then down the road transition again to the new buffer protocol? What
> is the strategy here? My underinformed impression is that there isn't
> one, since every time PEP3118 is considered in the context of the 2-3
> transition somebody helpfully reminds the list that we aren't supposed
> to break APIs.

We aren't supposed to break APIs that aren't related to the 2-3
transition. PEP3118 is related to the 2-3 transition. Since I'm that
somebody that always pipes up about this topic, I'm pretty sure it
hasn't been PEP3118-related breakage that has been proposed.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


Re: [Numpy-discussion] Fwd: GPU Numpy

2009-09-09 Thread Francesc Alted
A Wednesday 09 September 2009 11:26:06 Francesc Alted escrigué:
> A Tuesday 08 September 2009 23:21:53 Christopher Barker escrigué:
> > Also, perhaps a GPU-aware numexpr could be helpful which I think is the
> > kind of thing that Sturla was referring to when he wrote:
> >
> > "Incidentally,  this will  also make it easier to leverage on modern
> > GPUs."
>
> Numexpr mainly supports functions that are meant to be used element-wise,
> so the operation/element ratio is normally 1 (or close to 1).  In these
> scenarios is where improved memory access is much more important than CPU
> (or, for that matter, GPU), and is the reason why numexpr is much more
> efficient than NumPy when evaluating complex expressions like
> ``a*b+c*sqrt(d)``.
>
> In other words, a GPU-enabled numexpr makes little sense.

Er, I forgot the fact that one exception to operation/element ratio being 
normally 1 in numexpr is the computation of transcendental functions 
(trigonometrical, exponential, logarithmic...) where the number of CPU 
operations per element is much larger than 1 (normally in the 100s).  Right 
now, there is support for accelerating them in numexpr via VML (Intel's Vector 
Math Library), but I suppose that a library making use of a GPU would be very 
interesting too (and the same applies to numpy).

But again, it makes more sense to rely on external packages or libraries 
(similar to the VML above) for this sort of thing.  After having a look at 
CULA (thanks for the pointer, Lev!), my hope is that in short we will see 
other libraries allowing for efficient evaluation of transcendental functions 
using GPUs too.

-- 
Francesc Alted


Re: [Numpy-discussion] Fwd: GPU Numpy

2009-09-09 Thread Lev Givon
Received from Francesc Alted on Wed, Sep 09, 2009 at 05:18:48AM EDT:

(snip)

> The point here is that matrix-matrix multiplications (or, in general, 
> functions with a large operation/element ratio) are a *tiny* part of all the 
> possible operations between arrays that NumPy supports.  This is why Sturla 
> is 
> saying that it is not a good idea to include support of GPUs in all parts of 
> NumPy.  A much better strategy is to give NumPy the possibility to link with 
> external packages (à la BLAS, LAPACK, Atlas, MKL) that can leverage the 

.. and CULA: http://www.culatools.com/

> powerful GPUs for specific problems (e.g. matrix-matrix multiplications).

L.G.




Re: [Numpy-discussion] Adding a 2D with a 1D array...

2009-09-09 Thread Ruben Salvador
I forgot...just in case:

rsalva...@cactus:~$ python --version
Python 2.5.2

python-scipy: version 0.6.0


On Wed, Sep 9, 2009 at 2:36 PM, Ruben Salvador wrote:

> Your results are what I expected...but. This code is called from my main
> program, and what I have in there (output array already created for both
> cases) is:
>
> print "lambd", lambd
> print "np.shape(a)", np.shape(a)
> print "np.shape(r)", np.shape(r)
> print "np.shape(offspr)", np.shape(offspr)
> t = clock()
> for i in range(lambd):
> offspr[i] = r[i] + a[i]
> t1 = clock() - t
> print "For loop time ==> %.8f seconds" % t1
> t2 = clock()
> offspr = r + a[:,None]
> t3 = clock() - t2
> print "Pythonic time ==> %.8f seconds" % t3
>
> The results I obtain are:
>
> lambd 8
> np.shape(a) (8,)
> np.shape(r) (8, 26)
> np.shape(offspr) (8, 26)
> For loop time ==> 0.34528804 seconds
> Pythonic time ==> 0.35956192 seconds
>
> Maybe I'm not measuring properly, so, how should I do it?
>
> On Wed, Sep 9, 2009 at 1:20 PM, Citi, Luca  wrote:
>
>> I am sorry but it doesn't make much sense.
>> How do you measure the performance?
>> Are you sure you include the creation of the "c" output array in the time
>> spent (which is outside the for loop but should be considered anyway)?
>>
>> Here are my results...
>>
>> In [84]: a = np.random.rand(8,26)
>>
>> In [85]: b = np.random.rand(8)
>>
>> In [86]: def o(a,b):
>>   : c = np.empty_like(a)
>>   : for i in range(len(a)):
>>   : c[i] = a[i] + b[i]
>>   : return c
>>   :
>>
>> In [87]: d = a + b[:,None]
>>
>> In [88]: (d == o(a,b)).all()
>> Out[88]: True
>>
>> In [89]: %timeit o(a,b)
>> 1 loops, best of 3: 36.8 µs per loop
>>
>> In [90]: %timeit d = a + b[:,None]
>> 10 loops, best of 3: 5.17 µs per loop
>>
>> In [91]: a = np.random.rand(8,26)
>>
>> In [92]: b = np.random.rand(8)
>>
>> In [93]: %timeit o(a,b)
>> 10 loops, best of 3: 287 ms per loop
>>
>> In [94]: %timeit d = a + b[:,None]
>> 100 loops, best of 3: 15.4 ms per loop
>>
>
>


Re: [Numpy-discussion] Adding a 2D with a 1D array...

2009-09-09 Thread Ruben Salvador
Your results are what I expected...but. This code is called from my main
program, and what I have in there (output array already created for both
cases) is:

print "lambd", lambd
print "np.shape(a)", np.shape(a)
print "np.shape(r)", np.shape(r)
print "np.shape(offspr)", np.shape(offspr)
t = clock()
for i in range(lambd):
offspr[i] = r[i] + a[i]
t1 = clock() - t
print "For loop time ==> %.8f seconds" % t1
t2 = clock()
offspr = r + a[:,None]
t3 = clock() - t2
print "Pythonic time ==> %.8f seconds" % t3

The results I obtain are:

lambd 8
np.shape(a) (8,)
np.shape(r) (8, 26)
np.shape(offspr) (8, 26)
For loop time ==> 0.34528804 seconds
Pythonic time ==> 0.35956192 seconds

Maybe I'm not measuring properly, so, how should I do it?
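
(A hedged aside on the measurement itself: time.clock() has coarse resolution
on some platforms and a single run is easily dominated by noise, so timeit's
best-of-several-repeats over many loops is usually preferred. The shapes below
mirror the small arrays from the post.)

```python
import numpy as np
from timeit import repeat

a = np.random.rand(8)
r = np.random.rand(8, 26)

def loop_version():
    # explicit C-style loop over the rows
    offspr = np.empty_like(r)
    for i in range(len(a)):
        offspr[i] = r[i] + a[i]
    return offspr

# best of 3 repeats of 1000 executions each, for each variant
best_loop = min(repeat(loop_version, number=1000, repeat=3))
best_bcast = min(repeat(lambda: r + a[:, None], number=1000, repeat=3))

# both versions must agree before their timings mean anything
assert np.allclose(loop_version(), r + a[:, None])
```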

On Wed, Sep 9, 2009 at 1:20 PM, Citi, Luca  wrote:

> I am sorry but it doesn't make much sense.
> How do you measure the performance?
> Are you sure you include the creation of the "c" output array in the time
> spent (which is outside the for loop but should be considered anyway)?
>
> Here are my results...
>
> In [84]: a = np.random.rand(8,26)
>
> In [85]: b = np.random.rand(8)
>
> In [86]: def o(a,b):
>   : c = np.empty_like(a)
>   : for i in range(len(a)):
>   : c[i] = a[i] + b[i]
>   : return c
>   :
>
> In [87]: d = a + b[:,None]
>
> In [88]: (d == o(a,b)).all()
> Out[88]: True
>
> In [89]: %timeit o(a,b)
> 1 loops, best of 3: 36.8 µs per loop
>
> In [90]: %timeit d = a + b[:,None]
> 10 loops, best of 3: 5.17 µs per loop
>
> In [91]: a = np.random.rand(8,26)
>
> In [92]: b = np.random.rand(8)
>
> In [93]: %timeit o(a,b)
> 10 loops, best of 3: 287 ms per loop
>
> In [94]: %timeit d = a + b[:,None]
> 100 loops, best of 3: 15.4 ms per loop
>


Re: [Numpy-discussion] question about future support for python-3

2009-09-09 Thread Darren Dale
On Tue, Sep 8, 2009 at 9:02 PM, David Cournapeau wrote:
> On Wed, Sep 9, 2009 at 9:37 AM, Darren Dale wrote:
>> Hi David,
>
>>> I already gave my own opinion on py3k, which can be summarized as:
>>>  - it is a huge effort, and no core numpy/scipy developer has
>>> expressed the urge to transition to py3k, since py3k does not bring
>>> much for scientific computing.
>>>  - very few packages with a significant portion of C have been ported
>>> to my knowledge, hence very little experience on how to do it. AFAIK,
>>> only small packages have been ported. Even big, pure python projects
>>> have not been ported. The only big C project to have been ported is
>>> python itself, and it broke compatibility and used a different source
>>> tree than python 2.
>>>  - it remains to be seen whether we can do the py3k support in the
> >>> same source tree as the one used for python >= 2.4. Having two source
>>> trees would make the effort even much bigger, well over the current
>>> developers capacity IMHO.
>>>
>>> The only area where I could see the PSF helping is the point 2: more
>>> documentation, more stories about 2->3 transition.
>>
>> I'm surprised to hear you say that. I would think additional developer
>> and/or financial resources would be useful, for all of the reasons you
>> listed.
>
> If there was enough resources to pay someone very familiar with numpy
> codebase for a long time, then yes, it could be useful - but I assume
> that's out of the question. This would be very expensive as it would
> require several full months IMO.
>
> The PSF could help for the point 3, by porting other projects to py3k
> and documenting it. The only example I know so far is pycog2
> (http://mail.python.org/pipermail/python-porting/2008-December/10.html).
>
> Paying people to do documentation about porting C code seems like a
> good way to spend money: it would be useful outside numpy community,
> and would presumably be less costly.

Another topic concerning documentation is API compatibility. The
python devs have requested projects not use the 2-3 transition as an
excuse to change their APIs, but numpy is maybe a special case. I'm
thinking about PEP3118. Is numpy going to transition to python 3 and
then down the road transition again to the new buffer protocol? What
is the strategy here? My underinformed impression is that there isn't
one, since every time PEP3118 is considered in the context of the 2-3
transition somebody helpfully reminds the list that we aren't supposed
to break APIs. Numpy is a critical python library, perhaps the
transition presents an opportunity, if the community can yield a
little on numpy's C api. For example, in the long run, what would it
take to get numpy (or the core thereof) into the standard library, and
can we take steps now in that direction? Would the numpy devs be
receptive to comments from the python devs on the existing numpy
codebase?

I'm willing to pitch in and work on the transition, not because I need
python-3 right now, but because the transition needs to happen and it
would benefit everyone in the long run. But I would like to know that
we are making the most of the opportunity, and have considered our
options.

Darren


Re: [Numpy-discussion] Adding a 2D with a 1D array...

2009-09-09 Thread Citi, Luca
I am sorry but it doesn't make much sense.
How do you measure the performance?
Are you sure you include the creation of the "c" output array in the time spent 
(which is outside the for loop but should be considered anyway)?

Here are my results...

In [84]: a = np.random.rand(8,26)

In [85]: b = np.random.rand(8)

In [86]: def o(a,b):
   : c = np.empty_like(a)
   : for i in range(len(a)):
   : c[i] = a[i] + b[i]
   : return c
   :

In [87]: d = a + b[:,None]

In [88]: (d == o(a,b)).all()
Out[88]: True

In [89]: %timeit o(a,b)
1 loops, best of 3: 36.8 µs per loop

In [90]: %timeit d = a + b[:,None]
10 loops, best of 3: 5.17 µs per loop

In [91]: a = np.random.rand(8,26)

In [92]: b = np.random.rand(8)

In [93]: %timeit o(a,b)
10 loops, best of 3: 287 ms per loop

In [94]: %timeit d = a + b[:,None]
100 loops, best of 3: 15.4 ms per loop



[Numpy-discussion] re loading f2py modules in ipython

2009-09-09 Thread John [H2O]

Hello,

I've started to rely more and more on f2py to create simple modules
utilizing Fortran for efficiency. This is a great tool to have within
Python!

A problem, however, is that unlike python modules, the reload() function
does not seem to update the f2py modules within ipython (which I use
extensively for testing).

Is there another function to call?

Regards!



Re: [Numpy-discussion] Adding a 2D with a 1D array...

2009-09-09 Thread Pauli Virtanen
Wed, 09 Sep 2009 13:08:22 +0200, Ruben Salvador wrote:
> Perfect! Thank you very much :D
> 
> It's not obvious, though...I think I should read more deeply into
> Python/NumPy...but for the use I'm giving to it...
> 
> Anyway, I thought the pythonic way would be faster, but after trying
> with a size 8 instead of 8...the for loop is faster!
> 
> Pythonic time ==> 0.36776400 seconds
> For loop time ==> 0.31708717 seconds

Doubtful:

In [1]: import numpy as np

In [2]: a = np.zeros((8, 26))

In [3]: b = np.zeros((8,))

In [4]: def foo():
   ...: d = np.empty(a.shape, a.dtype)
   ...: for i in xrange(a.shape[0]):
   ...: d[i] = a[i] + b[i]
   ...: 

In [5]: %timeit d=a+b[:,None]
100 loops, best of 3: 18.3 ms per loop

In [6]: %timeit foo()
10 loops, best of 3: 334 ms per loop



Re: [Numpy-discussion] Adding a 2D with a 1D array...

2009-09-09 Thread Ruben Salvador
Perfect! Thank you very much :D

It's not obvious, though...I think I should read more deeply into
Python/NumPy...but for the use I'm giving to it...

Anyway, I thought the pythonic way would be faster, but after trying with a
size 8 instead of 8...the for loop is faster!

Pythonic time ==> 0.36776400 seconds
For loop time ==> 0.31708717 seconds

:S

On Wed, Sep 9, 2009 at 12:46 PM, Citi, Luca  wrote:

> Hi Ruben
>
> One dimensional arrays can be thought of as rows. If you want a column, you
> need to append a dimension.
>
> >>> d = a + b[:,None]
> which is equivalent to
> >>> d = a + b[:,np.newaxis]
>
> Best,
> Luca


Re: [Numpy-discussion] Adding a 2D with a 1D array...

2009-09-09 Thread Citi, Luca
Hi Ruben

One dimensional arrays can be thought of as rows. If you want a column, you 
need to append a dimension.

>>> d = a + b[:,None]
which is equivalent to 
>>> d = a + b[:,np.newaxis]
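
A tiny runnable check of that broadcasting, with the shapes from Ruben's post:

```python
import numpy as np

# shapes from the thread: (8, 26) plus (8,), one scalar of b per row of a
a = np.arange(8 * 26, dtype=float).reshape(8, 26)
b = np.arange(8, dtype=float)

d = a + b[:, None]        # b[:, None] has shape (8, 1) and broadcasts over columns
assert d.shape == (8, 26)
assert np.allclose(d[3], a[3] + b[3])
```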

Best,
Luca


[Numpy-discussion] Adding a 2D with a 1D array...

2009-09-09 Thread Ruben Salvador
Hi there!

I'm sure I'm missing something, but I am not able of doing a simple sum of
two arrays with different dimensions. I have a 2D array and a 1D array

np.shape(a) (8, 26)
np.shape(b) (8,)

and I want to sum each *row* of 'a' with the equivalent *row* of 'b' (that
is, summing each 1D row array of 'a' with each scalar of 'b' for all rows).
It's straightforward doing the sum with a for loop (yes, I learnt C long
ago :S ):

for i in range(8):
c[i] = a[i] + b[i]

but, is there a Pythonic way to do this? or even better...which is the
pythonic way of doing it? because I'm sure there is one...

Thanks, and sorry for such an easy question...but I'm stuck


Re: [Numpy-discussion] Row-wise dot product?

2009-09-09 Thread Chris Colbert
the way I do my rotations is this:

tmat = rotation matrix
vec = stack of row vectors

rotated_vecs = np.dot(tmat, vec.T).T
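
A concrete sketch of that pattern (the angle and vectors below are invented
for illustration): transpose the row-vector stack into columns, apply the
rotation, and transpose back.

```python
import numpy as np

# hypothetical 2-D example: rotate a stack of row vectors by 90 degrees
theta = np.pi / 2
tmat = np.array([[np.cos(theta), -np.sin(theta)],
                 [np.sin(theta),  np.cos(theta)]])
vec = np.array([[1.0, 0.0],   # one row vector per row
                [0.0, 1.0]])

rotated_vecs = np.dot(tmat, vec.T).T   # transpose in, rotate, transpose out
assert np.allclose(rotated_vecs, [[0.0, 1.0], [-1.0, 0.0]])
```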



On Mon, Sep 7, 2009 at 6:53 PM, T J wrote:
> On Mon, Sep 7, 2009 at 3:43 PM, T J wrote:
>> Or perhaps I am just being dense.
>>
>
> Yes.  I just tried to reinvent standard matrix multiplication.


Re: [Numpy-discussion] Huge arrays

2009-09-09 Thread Francesc Alted
A Wednesday 09 September 2009 10:48:48 Francesc Alted escrigué:
> OTOH, having the possibility to manage compressed data buffers
> transparently in NumPy would help here, but not there yet ;-)

Now that I think about it, in case the data is compressible, Daniel could try 
to define a PyTables' compressed array or table on-disk and save chunks to it.
If data is compressible enough, the filesystem cache will keep it in-memory, 
until the disk can eventually absorb it.

For doing this, I would recommend using the LZO compressor, as it is one of the 
fastest I've seen (at least until Blosc is ready): it can compress data up to 5 
times faster than that data could be written uncompressed to disk (depending on 
how compressible the data is, and the speed of the disk subsystem).

Of course, if data is not compressible at all, then this venue doesn't make a 
lot of sense.
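
As a rough illustration of why compressing before the disk can win (zlib
standing in for a fast codec like LZO or Blosc, and the sizes invented):
highly compressible blocks shrink to a small fraction of what would otherwise
have to be pushed through the disk subsystem.

```python
import zlib
import numpy as np

# invented sizes: four highly compressible 2 MB chunks, compressed before
# they would hit the disk (zlib level 1 as a fast-codec stand-in)
chunks = []
for _ in range(4):
    block = np.zeros(250_000)              # very compressible data
    chunks.append(zlib.compress(block.tobytes(), 1))

raw_bytes = 4 * 250_000 * 8
packed_bytes = sum(len(c) for c in chunks)
assert packed_bytes < raw_bytes / 5        # far less data for the disk to absorb
```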

HTH,

-- 
Francesc Alted


Re: [Numpy-discussion] Fwd: GPU Numpy

2009-09-09 Thread Francesc Alted
A Tuesday 08 September 2009 23:21:53 Christopher Barker escrigué:
> Also, perhaps a GPU-aware numexpr could be helpful which I think is the
> kind of thing that Sturla was referring to when he wrote:
>
> "Incidentally,  this will  also make it easier to leverage on modern GPUs."

Numexpr mainly supports functions that are meant to be used element-wise, so 
the operation/element ratio is normally 1 (or close to 1).  These are the 
scenarios where improved memory access is much more important than CPU (or, for 
that matter, GPU) speed, and that is why numexpr is much more efficient than 
NumPy when evaluating complex expressions like ``a*b+c*sqrt(d)``.

In other words, a GPU-enabled numexpr makes little sense.
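
To make the temporaries point concrete, here is a numpy-only sketch (not
numexpr itself) of the same expression evaluated naively versus with
preallocated out= buffers, which is roughly the kind of traversal- and
allocation-saving that numexpr automates:

```python
import numpy as np

n = 100_000
rng = np.random.default_rng(0)
a, b, c, d = (rng.random(n) for _ in range(4))

# naive evaluation materializes a temporary array for each intermediate
naive = a * b + c * np.sqrt(d)

# the same expression with explicit out= buffers reuses just two arrays
tmp = np.empty(n)
out = np.empty(n)
np.sqrt(d, out=tmp)            # tmp = sqrt(d)
np.multiply(c, tmp, out=tmp)   # tmp = c * sqrt(d)
np.multiply(a, b, out=out)     # out = a * b
np.add(out, tmp, out=out)      # out = a*b + c*sqrt(d)

assert np.allclose(naive, out)
```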

-- 
Francesc Alted


Re: [Numpy-discussion] Fwd: GPU Numpy

2009-09-09 Thread Francesc Alted
A Tuesday 08 September 2009 21:19:05 George Dahl escrigué:
> Sturla Molden  molden.no> writes:
> > Erik Tollerud skrev:
> > >> NumPy arrays on the GPU memory is an easy task. But then I would have
> > >> to write the computation in OpenCL's dialect of C99?
> > >
> > > This is true to some extent, but also probably difficult to do given
> > > > the fact that parallelizable algorithms are generally more difficult
> > > > to formulate in straightforward ways.
> >
> > Then you have misunderstood me completely. Creating an ndarray that has
> > a buffer in graphics memory is not too difficult, given that graphics
> > memory can be memory mapped. This has nothing to do with parallelizable
> > algorithms or not. It is just memory management. We could make an
> > ndarray subclass that quickly puts is content in a buffer accessible to
> > the GPU. That is not difficult. But then comes the question of what you
> > do with it.
> >
> > I think many here misunderstands the issue here:
> >
> > Teraflops peak performance of modern GPUs is impressive. But NumPy
> > cannot easily benefit from that. In fact, there is little or nothing to
> > gain from optimising in that end. In order for a GPU to help,
> > computation must be the time-limiting factor. It is not. There is not
> > more to say about using GPUs in NumPy right now.
> >
> > Take a look at the timings here: http://www.scipy.org/PerformancePython
> > It shows that computing with NumPy is more than ten times slower than
> > using plain C. This is despite NumPy being written in C. The NumPy code
> > does not incur 10 times more floating point operations than the C code.
> > The floating point unit does not run in turtle mode when using NumPy.
> > NumPy's relative slowness compared to C has nothing to do with floating
> > point computation. It is due to inferior memory use (temporary buffers,
> > multiple buffer traversals) and memory access being slow. Moving
> > computation to the GPU can only make this worse.
> >
> > Improved memory usage - e.g. through lazy evaluation and JIT compilation
> > of expressions - can give up to a tenfold increase in performance. That
> > is where we must start optimising to get a faster NumPy. Incidentally,
> > this will  also make it easier to leverage on modern GPUs.
> >
> > Sturla Molden
>
> I know that for my work, I can get around a 50-fold speedup
> over numpy using a python wrapper for a simple GPU matrix class.  So I
> might be dealing with a lot of matrix products where I multiply a fixed 512
> by 784 matrix by a 784 by 256 matrix that changes between each matrix
> product, although to really see the largest gains I use a 4096 by 2048
> matrix times a bunch of 2048 by 256 matrices.  If all I was doing were
> those matrix products, it would be even faster, but what I actually am
> doing is a matrix product, then adding a column vector to the result, then
> applying an elementwise logistic sigmoid function and potentially
> generating a matrix of pseudorandom numbers the same shape as my result
> (although not always).  When I do these sorts of workloads, my python
> numpy+GPU matrix class goes so much faster than anything that doesn't use
> the GPU (be it Matlab, or numpy, or C/C++ whatever) that I don't even
> bother measuring the speedups precisely.  In some cases, my python code
> isn't making too many temporaries since what it is doing is so simple, but
> in other cases that is obviously slowing it down a bit.  I have relatively
> complicated jobs that used to take weeks on the CPU can now take hours or
> days.
>
> Obviously improved memory usage would be more helpful since not everyone
> has access to the sorts of GPUs I use, but tenfold increases in performance
> seem like chump change compared to what I see with the sorts of workloads I
> do.

50-fold increases over NumPy+[Atlas|MKL] are really impressive.  However, the 
point is that these speed-ups can be achieved only when the ratio of 
operations per element is really huge.  Matrix-matrix multiplication (your 
example above) is a paradigmatic example of these scenarios, where 
computations are O(n**3) (or with an exponent a little smaller than 3, when 
optimized algorithms are used), while memory access is O(n**2).  Of course, when 
the matrices are large, the ratio of operations to elements is larger, allowing 
much better speed-ups; this is why GPUs really do a good job here.

The point here is that matrix-matrix multiplications (or, in general, 
functions with a large operation/element ratio) are a *tiny* part of all the 
possible operations between arrays that NumPy supports.  This is why Sturla is 
saying that it is not a good idea to include support of GPUs in all parts of 
NumPy.  A much better strategy is to give NumPy the possibility to link with 
external packages (à la BLAS, LAPACK, Atlas, MKL) that can leverage the 
powerful GPUs for specific problems (e.g. matrix-matrix multiplications).
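
A back-of-the-envelope sketch of that ratio, plugging in the (hypothetical)
shapes George mentioned, just to show how matrix products stand out:

```python
# rough operations-per-element ratio for an (m, k) x (k, n) matrix product
def ops_per_element(m, k, n):
    flops = 2.0 * m * k * n            # one multiply + one add per inner step
    elements = m * k + k * n + m * n   # inputs read plus output written
    return flops / elements

small = ops_per_element(512, 784, 256)    # ~280 operations per element touched
large = ops_per_element(4096, 2048, 256)  # ~431: bigger matrices, better payoff
assert large > small                      # element-wise ops sit at ratio ~1
```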

-- 
Francesc Alted

Re: [Numpy-discussion] Huge arrays

2009-09-09 Thread Francesc Alted
A Wednesday 09 September 2009 07:22:33 David Cournapeau escrigué:
> On Wed, Sep 9, 2009 at 2:10 PM, Sebastian Haase wrote:
> > Hi,
> > you can probably use PyTables for this. Even though it's meant to
> > save/load data to/from disk (in HDF5 format) as far as I understand,
> > it can be used to make your task solvable - even on a 32bit system !!
> > It's free (pytables.org) -- so maybe you can try it out and tell me if
> > I'm right 
>
> You still would not be able to load a numpy array > 2 Gb. Numpy memory
> model needs one contiguously addressable chunk of memory for the data,
> which is limited under the 32 bits archs. This cannot be overcome in
> any way AFAIK.
>
> You may be able to save data > 2 Gb, by appending several chunks < 2
> Gb to disk - maybe pytables supports this if it has large file support
> (which enables to write files > 2Gb on a 32 bits system).

Yes, this later is supported in PyTables as long as the underlying filesystem 
supports files > 2 GB, which is very usual in modern operating systems.  This 
even works on 32-bit systems as the indexing machinery in Python has been 
completely replaced inside PyTables.
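
A minimal sketch of the append-chunks idea with plain numpy (tofile/memmap
standing in for PyTables here; the path and sizes are made up): each chunk is
written independently, and the result is read back lazily rather than as one
giant in-memory block.

```python
import os
import tempfile
import numpy as np

# made-up path and sizes: append several independent chunks to one file
path = os.path.join(tempfile.mkdtemp(), "big.dat")
chunk = np.arange(1000, dtype=np.float64)
with open(path, "ab") as f:
    for _ in range(3):
        chunk.tofile(f)             # each write stays well under any size limit

# read it back lazily: memmap addresses the file, not one contiguous RAM block
mm = np.memmap(path, dtype=np.float64, mode="r")
assert mm.shape == (3000,)
assert mm[1001] == 1.0              # element 1 of the second chunk
```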

However, I think that what Daniel is trying to achieve is to be able to keep 
all the info in-memory because writing it to disk is too slow.  I also agree 
that your suggestion to use a 64-bit OS (or 32-bit Linux, as it can address 
the full 3GB right out-of-the-box, as Chuck said) is the way to go.

OTOH, having the possibility to manage compressed data buffers transparently 
in NumPy would help here, but not there yet ;-)

-- 
Francesc Alted