Re: [Numpy-discussion] Simple problem. Is it possible without a loop?

2010-06-10 Thread V. Armando Solé
Hi Bruce,

In the context of the actual problem, I have a long series of 
non-equidistant and irregularly spaced float numbers and I have to take 
values between given limits with the constraint of keeping a minimal 
separation. Option 2 just misses the first value of the input array if 
it is within the limits, but for my purposes (performing a fit with a 
given function) that is acceptable. I said it seems to be quite close to 
what I need because I do not like missing the first point: that gives 
equivalent, but not exactly identical, solutions.

By the way, thanks for the % hint. That should make the .astype(int) 
disappear and make the expression look nicer.

Armando

Bruce Southey wrote:
 On 06/09/2010 10:24 AM, Vicente Sole wrote:
 Well, a loop or list comprehension seems like a good choice to me. It is
 much more obvious at the expense of two LOCs. Did you profile the two
 possibilities and are they actually performance-critical?

 cheers

   
 The second is between 8 and ten times faster on my machine.

 import numpy
 import time
 x0 = numpy.arange(1.)
 niter = 2000   # I expect between 1 and 10


 def option1(x, delta=0.2):
     y = [x[0]]
     for value in x:
         if (value - y[-1]) > delta:
             y.append(value)
     return numpy.array(y)

 def option2(x, delta=0.2):
     y = numpy.cumsum((x[1:] - x[:-1]) / delta).astype(numpy.int)
     i1 = numpy.nonzero(y[1:] > y[:-1])
     return numpy.take(x, i1)


 t0 = time.time()
 for i in range(niter):
     t = option1(x0)
 print "Elapsed = ", time.time() - t0
 t0 = time.time()
 for i in range(niter):
     t = option2(x0)
 print "Elapsed = ", time.time() - t0

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
   
 For integer arguments for delta, I don't see any difference between 
 using option1 and using the '%' operator.
 >>> (x0[(x0*10)%2==0]-option1(x0)).sum()
 0.0

 Also option2 gives a different result than option1, so these are not 
 equivalent functions. You can see that from the shapes:
 >>> option2(x0).shape
 (1, 9998)
 >>> option1(x0).shape
 (1,)
 >>> ((option1(x0)[:9998])-option2(x0)).sum()
 0.0

 So, allowing for the shape difference, option2 gives the same values as 
 most of the output from option1, but its result is still smaller than 
 option1's.

 Probably the main reason for the speed difference is that option2 is 
 virtually pure numpy (and hence done in C), while option1 uses a lot 
 of array lookups, which are always slow. So keep it in numpy as much 
 as possible.


 Bruce
 

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
   


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] yet another scipy Install problem

2010-06-10 Thread David Cournapeau
On Thu, Jun 10, 2010 at 1:16 PM, Bino Oetomo b...@indoakses-online.com wrote:
 Dear All ..
 I tried to install numpy and scipy ... basically I want to use OMPC

You need at least numpy 1.4 to build scipy from current svn,

David
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy re-factoring project

2010-06-10 Thread Charles R Harris
On Wed, Jun 9, 2010 at 5:27 PM, Jason McCampbell
jmccampb...@enthought.comwrote:

 Hi everyone,

 This is a follow-up to Travis's message on the re-factoring project from
 May 25th and the subsequent discussion. For background, I am a developer at
 Enthought working on the NumPy re-factoring project with Travis and Scott.
 The immediate goal from our perspective is to re-factor the core of NumPy
 into two architectural layers: a true core that is CPython-independent and
 an interface layer that binds the core to CPython.

 A design proposal is now posted on the NumPy developer wiki:
 http://projects.scipy.org/numpy/wiki/NumPyRefactoring

 The write-up is a fairly high-level description of what we think the split
 will look like and how to deal with issues such as memory management.  There
 are also placeholders listed as 'TBD' where more investigation is still
 needed and will be filled in over time.  At the end of the page there is a
 section on the C API with a link to a function-by-function breakdown of the
 C API and whether the function belongs in the interface layer, the core, or
 need to be split between the two.  All functions listed as 'core' will
 continue to have an interface-level wrapper with the same name to ensure
 source-compatibility.

 All of this, particularly the interface/core function designations, is a
 first analysis and in flux. The goal is to get the information out and
 elicit discussion and feedback from the community.


A few thoughts came to mind while reading the initial writeup.

1) How is the GIL handled in the callbacks?
2) What about error handling? That is tricky to get right, especially in C
and with reference counting.
3) Is there a general policy as to how the reference counting should be
handled in specific functions? That is, who does the reference
incrementing/decrementing?
4) Boost has some reference-counted pointers, have you looked at them? C++
is admittedly a very different animal for this sort of application.

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy re-factoring project

2010-06-10 Thread Charles R Harris
On Thu, Jun 10, 2010 at 7:26 AM, Charles R Harris charlesr.har...@gmail.com
 wrote:



 On Wed, Jun 9, 2010 at 5:27 PM, Jason McCampbell 
 jmccampb...@enthought.com wrote:

 Hi everyone,

 This is a follow-up to Travis's message on the re-factoring project from
 May 25th and the subsequent discussion. For background, I am a developer at
 Enthought working on the NumPy re-factoring project with Travis and Scott.
 The immediate goal from our perspective is to re-factor the core of NumPy
 into two architectural layers: a true core that is CPython-independent and
 an interface layer that binds the core to CPython.

 A design proposal is now posted on the NumPy developer wiki:
 http://projects.scipy.org/numpy/wiki/NumPyRefactoring

 The write-up is a fairly high-level description of what we think the split
 will look like and how to deal with issues such as memory management.  There
 are also placeholders listed as 'TBD' where more investigation is still
 needed and will be filled in over time.  At the end of the page there is a
 section on the C API with a link to a function-by-function breakdown of the
 C API and whether the function belongs in the interface layer, the core, or
 need to be split between the two.  All functions listed as 'core' will
 continue to have an interface-level wrapper with the same name to ensure
 source-compatibility.

 All of this, particularly the interface/core function designations, is a
 first analysis and in flux. The goal is to get the information out and
 elicit discussion and feedback from the community.


 A few thoughts came to mind while reading the initial writeup.

 1) How is the GIL handled in the callbacks.
 2) What about error handling? That is tricky to get right, especially in C
 and with reference counting.
 3) Is there a general policy as to how the reference counting should be
 handled in specific functions? That is, who does the reference
 incrementing/decrementing?
 4) Boost has some reference counted pointers, have you looked at them? C++
 is admittedly a very different animal for this sort of application.

I've thought that there should be getter/setter functions
(PyArray_GetNumericOps/PyArray_SetNumericOps) so that the structure doesn't
have to be made public in the split-up files.

Can the casting functions be implemented as ufuncs? Or at least look like
ufuncs so they can handle strided memory. Likewise for some of the other
things in arraytypes.c.src.

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy re-factoring project

2010-06-10 Thread Jason McCampbell
Hi Chuck,

Good questions.  Responses inline below...

Jason

On Thu, Jun 10, 2010 at 8:26 AM, Charles R Harris charlesr.har...@gmail.com
 wrote:



 On Wed, Jun 9, 2010 at 5:27 PM, Jason McCampbell 
 jmccampb...@enthought.com wrote:

 Hi everyone,

 This is a follow-up to Travis's message on the re-factoring project from
 May 25th and the subsequent discussion. For background, I am a developer at
 Enthought working on the NumPy re-factoring project with Travis and Scott.
 The immediate goal from our perspective is to re-factor the core of NumPy
 into two architectural layers: a true core that is CPython-independent and
 an interface layer that binds the core to CPython.

 A design proposal is now posted on the NumPy developer wiki:
 http://projects.scipy.org/numpy/wiki/NumPyRefactoring

 The write-up is a fairly high-level description of what we think the split
 will look like and how to deal with issues such as memory management.  There
 are also placeholders listed as 'TBD' where more investigation is still
 needed and will be filled in over time.  At the end of the page there is a
 section on the C API with a link to a function-by-function breakdown of the
 C API and whether the function belongs in the interface layer, the core, or
 need to be split between the two.  All functions listed as 'core' will
 continue to have an interface-level wrapper with the same name to ensure
 source-compatibility.

 All of this, particularly the interface/core function designations, is a
 first analysis and in flux. The goal is to get the information out and
 elicit discussion and feedback from the community.


 A few thoughts came to mind while reading the initial writeup.

 1) How is the GIL handled in the callbacks.


How to handle the GIL still requires some thought.  The cleanest way, IMHO,
would be for the interface layer to release the lock prior to calling into
the core; each callback function in the interface is then responsible for
re-acquiring it.  That's straightforward to define as a rule and should work
well in general, but I'm worried about potential performance issues if/when
a callback is called in a loop.  A few optimization points are OK, but too
many and it will just be a source of heisenbugs.

One other option is to just use the existing release/acquire macros in NumPy
and redirect them to the interface layer.  Any app that isn't CPython would
just leave those callback pointers NULL.  It's less disruptive but leaves
some very CPython-specific behavior in the core.
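
To make the second option concrete, here is a rough sketch (names like
npy_interface_callbacks and core_scale are purely illustrative, not the API
proposed on the wiki): the CPython interface installs two callbacks, any
other host leaves them NULL, and the core brackets long-running loops with
them.

/* Illustrative sketch only: npy_interface_callbacks, npy_set_interface and
 * core_scale are hypothetical names, not the proposed refactoring API. */
#include <Python.h>

typedef struct {
    void *(*release_gil)(void);          /* returns an opaque thread state */
    void  (*acquire_gil)(void *state);   /* restores it                    */
} npy_interface_callbacks;

static npy_interface_callbacks npy_cb = { NULL, NULL };  /* core default */

/* The CPython interface layer installs these at start-up. */
static void *cpython_release(void)        { return PyEval_SaveThread(); }
static void  cpython_acquire(void *state) { PyEval_RestoreThread(state); }

void npy_set_interface(void)
{
    npy_cb.release_gil = cpython_release;
    npy_cb.acquire_gil = cpython_acquire;
}

/* Core routine: brackets a long loop with the callbacks if they are set. */
void core_scale(double *data, long n, double factor)
{
    long i;
    void *state = npy_cb.release_gil ? npy_cb.release_gil() : NULL;
    for (i = 0; i < n; i++)
        data[i] *= factor;               /* no CPython calls in here */
    if (npy_cb.acquire_gil)
        npy_cb.acquire_gil(state);
}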


 2) What about error handling? That is tricky to get right, especially in C
 and with reference counting.


The error reporting functions in the core will likely look a lot like the
CPython functions - they seem general enough.  The biggest change is that the
CPython ones take a PyObject as the error type.  99% of the errors reported
in NumPy use one of a half-dozen pre-defined types that are easy to
translate.  There is at least one case where an object type (complex number)
is dynamically created and used as the type, but so far I believe it's only
one case.
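
A minimal sketch of what that translation could look like, assuming a
core-level error enum (npy_core_error and raise_from_core are made-up names,
not the proposed API):

/* Illustrative only: a core-side error code plus an interface-layer
 * translation into the half-dozen CPython exception types mentioned above. */
#include <Python.h>

typedef enum {
    NPY_ERR_NONE = 0,
    NPY_ERR_VALUE,
    NPY_ERR_TYPE,
    NPY_ERR_INDEX,
    NPY_ERR_MEMORY
} npy_core_error;

static void raise_from_core(npy_core_error code, const char *msg)
{
    switch (code) {
    case NPY_ERR_VALUE:  PyErr_SetString(PyExc_ValueError, msg); break;
    case NPY_ERR_TYPE:   PyErr_SetString(PyExc_TypeError, msg);  break;
    case NPY_ERR_INDEX:  PyErr_SetString(PyExc_IndexError, msg); break;
    case NPY_ERR_MEMORY: PyErr_NoMemory();                       break;
    default:             break;   /* NPY_ERR_NONE: nothing to raise */
    }
}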

The reference counting does get a little more complex because a core routine
will need to decref the core object on error and the interface layer will
need to similarly detect the error and potentially do its own decref.  Each
layer is still responsible for its own cleanup, but there are now two
opportunities to introduce leaks.


 3) Is there a general policy as to how the reference counting should be
 handled in specific functions? That is, who does the reference
 incrementing/decrementing?


Both layers should implement the existing policy for the objects they
manage. Essentially a function can use its caller's reference but needs to
increment the count if it's going to store it.  A new instance is returned
with a refcnt of 1 and the caller needs to clean it up when it's no longer
needed.  But that means that if the core returns a new NpyArray instance to
the interface layer, the receiving function in the interface must allocate a
PyObject wrapper around it and set the wrapper's refcnt to 1 before
returning it.
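
Roughly, that hand-off might look like the following sketch (NpyArray,
NpyArray_Decref and PyArrayWrapper are placeholder names, not the actual
refactoring types):

/* Illustrative only: how the interface might wrap a *new* core reference. */
#include <Python.h>

typedef struct NpyArray NpyArray;      /* opaque core object               */
void NpyArray_Decref(NpyArray *a);     /* assumed core refcount routine    */

typedef struct {
    PyObject_HEAD
    NpyArray *core;                    /* the wrapper owns one core ref    */
} PyArrayWrapper;

extern PyTypeObject PyArrayWrapper_Type;

static PyObject *wrap_new_core_array(NpyArray *a)
{
    PyArrayWrapper *w = PyObject_New(PyArrayWrapper, &PyArrayWrapper_Type);
    if (w == NULL) {
        NpyArray_Decref(a);            /* error path: still our reference  */
        return NULL;
    }
    w->core = a;                       /* steal the new core reference     */
    return (PyObject *)w;              /* PyObject returned with refcnt 1  */
}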

Is that what you were asking?

4) Boost has some reference counted pointers, have you looked at them? C++
 is admittedly a very different animal for this sort of application.


There is also a need to replace the usage of PyDict and other uses of CPython
for basic data structures that aren't present in C.  Having access to C++
for this and for reference counting would be nice, but it has the potential to
break builds for everyone who uses the C API.  I think it's worth discussing
for the future, but it's a bigger (and possibly more contentious) change than
we are able to take on for this project.


 Chuck

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org

Re: [Numpy-discussion] NumPy re-factoring project

2010-06-10 Thread Sturla Molden


I have a few radical suggestions:

1. Use ctypes as glue to the core DLL, so we can completely forget about 
refcounts and similar mess. Why put manual reference counting and error 
handling in the core? It's stupid.


2. The core should be a plain DLL, loadable with ctypes. (I know David 
Cournapeau and Robert Kern are going to hate this.) But if Python can 
have a custom loader for .pyd files, so can NumPy for its core DLL. For 
ctypes we just need to specify a fully qualified path to the DLL, which 
can be read from a config file or whatever.


3. ctypes will take care of all the mess regarding the GIL. Again there 
is no need to re-invent the wheel. ctypes.CDLL releases the GIL when 
calling into C, and re-acquires it before making callbacks to Python. 
ctypes.PyDLL keeps the GIL for legacy library code that is not thread-safe.


ctypes will also make porting to other Python implementations (or even 
other languages: Ruby, JavaScript) easier. Not to mention that it 
will make NumPy impervious to changes in the Python C API.


4. Write the core in C++ or Fortran (95/2003), not ANSI C (aka C89). 
Use std::vector instead of malloc/free for C++, and possibly templates 
for generics on various dtypes.


5. Allow OpenMP pragmas in the core. If arrays are above a certain size, 
it should switch to multi-threading.
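
As a rough illustration of points 2, 3 and 5, a core entry point could be a
flat C function with no CPython types in its signature, exported from the
DLL and guarded by an OpenMP pragma; the function name, export macro and
size threshold below are made up for illustration. From Python such a
function could be called through ctypes.CDLL, which releases the GIL around
the call.

/* Illustrative sketch only: npy_core_scale is a made-up entry point. */
#include <stddef.h>

#ifdef _WIN32
#  define NPY_CORE_API __declspec(dllexport)
#else
#  define NPY_CORE_API
#endif

/* y[i] = a * x[i] over possibly strided data; strides are given in bytes,
 * so any FFI (ctypes, etc.) can pass raw buffer addresses directly.
 * Assumes an OpenMP 3.0 compiler for the ptrdiff_t loop variable. */
NPY_CORE_API void npy_core_scale(size_t n, double a,
                                 const char *x, ptrdiff_t xstride,
                                 char *y, ptrdiff_t ystride)
{
    ptrdiff_t i;
    #pragma omp parallel for if (n > 100000)
    for (i = 0; i < (ptrdiff_t)n; i++)
        *(double *)(y + i*ystride) = a * (*(const double *)(x + i*xstride));
}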


Sturla






On 10.06.2010 15:26, Charles R Harris wrote:


A few thoughts came to mind while reading the initial writeup.

1) How is the GIL handled in the callbacks.
2) What about error handling? That is tricky to get right, especially 
in C and with reference counting.
3) Is there a general policy as to how the reference counting should 
be handled in specific functions? That is, who does the reference 
incrementing/decrementing?
4) Boost has some reference counted pointers, have you looked at them? 
C++ is admittedly a very different animal for this sort of application.


Chuck


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
   


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy re-factoring project

2010-06-10 Thread Sturla Molden
On 10.06.2010 18:48, Sturla Molden wrote:
 ctypes will also make porting to other Python implementations 
 (or even other languages: Ruby, JavaScript) easier. Not to mention 
 that it will make NumPy impervious to changes in the Python C API.

Linking is also easier with ctypes. I started getting a lot of linker 
errors when switching to 64-bit Python on Windows. Just throwing out all 
of Cython et al. in favour of plain C keeps the problem away.

Sturla

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy re-factoring project

2010-06-10 Thread Sturla Molden


Another suggestion I'd like to make is bytearray as the memory buffer for 
the ndarray. An ndarray could just store or extend a bytearray, instead 
of having to deal with malloc/free and the mess that comes with it. 
Python takes care of the reference counts for bytearrays, and ctypes 
calls the NumPy core DLL. Yes, it will not work with older versions of 
Python. But for a complete refactoring (NumPy 3000), backwards 
compatibility should not matter much. (It's easier to backport bytearray 
than to track down memory leaks in NumPy's core.)


Sturla


On 10.06.2010 18:48, Sturla Molden wrote:


I have a few radical suggestions:

1. Use ctypes as glue to the core DLL, so we can completely forget 
about refcounts and similar mess. Why put manual reference counting 
and error handling in the core? It's stupid.


2. The core should be a plain DLL, loadable with ctypes. (I know David 
Cournapeau and Robert Kern are going to hate this.) But if Python can 
have a custom loader for .pyd files, so can NumPy for its core DLL. 
For ctypes we just need to specify a fully qualified path to the DLL, 
which can be read from a config file or whatever.


3. ctypes will take care of all the mess regarding the GIL. Again 
there is no need to re-invent the wheel. ctypes.CDLL releases the GIL 
when calling into C, and re-acquires it before making callbacks to 
Python. ctypes.PyDLL keeps the GIL for legacy library code that is 
not thread-safe.


ctypes will also make porting to other Python implementations 
(or even other languages: Ruby, JavaScript) easier. Not to mention 
that it will make NumPy impervious to changes in the Python C API.


4. Write the core in C++ or Fortran (95/2003), not ANSI C (aka 
C89). Use std::vector instead of malloc/free for C++, and possibly 
templates for generics on various dtypes.


5. Allow OpenMP pragmas in the core. If arrays are above a certain 
size, it should switch to multi-threading.


Sturla






On 10.06.2010 15:26, Charles R Harris wrote:


A few thoughts came to mind while reading the initial writeup.

1) How is the GIL handled in the callbacks.
2) What about error handling? That is tricky to get right, especially 
in C and with reference counting.
3) Is there a general policy as to how the reference counting should 
be handled in specific functions? That is, who does the reference 
incrementing/decrementing?
4) Boost has some reference counted pointers, have you looked at 
them? C++ is admittedly a very different animal for this sort of 
application.


Chuck


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
   



___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
   


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy re-factoring project

2010-06-10 Thread David Cournapeau
On Fri, Jun 11, 2010 at 6:56 AM, Sturla Molden stu...@molden.no wrote:


 Also about array iterators in NumPy's C base (i.e. for doing something
 along an axis): we don't need those. There is a different way of coding
 which leads to faster code.

  1. Collect an array of pointers to each subarray (e.g. using
  std::vector<dtype*> or dtype**)
 2. Dispatch on the pointer array...

Do you have the code for this? That's something I wanted to do, but
never took the time to do. A faster generic iterator would be nice, but
very hard to do in general.

 Another thing I did when reimplementing lfilter was copy-in copy-out
 for strided arrays.

What is copy-in copy-out? I am not familiar with this term.

David
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [Job] research developer job opening at AQR Capital

2010-06-10 Thread Pierre GM
Wes,
Sorry to contact you directly. I answered this ad on 05/24, sending a cover 
letter and a link to my resume to Matt. However, I haven't had any news since. 
Do you mind telling me how to check the status of my application? If it is 
rejected, so it goes, but I'd still like to know.
In other news, I have plenty of free time in front of me and need a project of 
some sort to stay busy. That'd be a good opportunity to work on a standard way 
to deal w/ time series in Numpy/Scipy, e.g. providing bridges between 
scikits.timeseries and pandas...
Let me know what you think.
Sincerely,
P.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] timeseries - dates prior to 1970

2010-06-10 Thread Bevan Jenkins
Hello,

I have posted previously about dates prior to 1900, but this seems to be a 
separate issue.  The error message is definitely different.
I can not seem to convert a timeseries from one frequency ('D') to another 
('H') when I use dates prior to 1970, as shown below.  This works fine when I 
use a date after 1970.  Is this something that can be easily fixed, or is 
there a workaround that I can use? Thanks
 
In [1]: import datetime
In [2]: import numpy as np
In [3]: import scikits.timeseries as ts
In [4]: from scikits.timeseries.lib.interpolate import interp_masked1d

In [5]:

In [6]: dta = np.linspace(1.0, 5.0,5)

In [7]: msk = [1,0,1,0,0]

In [8]: dta_maskd = np.ma.masked_array(dta,msk)

In [9]: yr = 1969

In [10]: dtes = [datetime.datetime(yr, 1, 1),
   :  datetime.datetime(yr, 1, 2),
   :  datetime.datetime(yr, 1, 3),
   :  datetime.datetime(yr, 1, 4),
   :  datetime.datetime(yr, 1, 5)]

In [11]: day_ts = ts.time_series(dta_maskd, dtes, freq='D')

In [12]: hour_ts = day_ts.convert('H')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)


C:\Python26\lib\site-packages\scikits\timeseries\tseries.pyc in convert(series, 
freq, func, position, *args, **kwargs)
   2000
   2001 if series.ndim == 1:
-> 2002 obj = _convert1d(series, freq, func, position, *args, **kwargs)
   2003 elif series.ndim == 2:
   2004 base = _convert1d(series[:, 0], freq, func, position, *args, 
**kwargs)

C:\Python26\lib\site-packages\scikits\timeseries\tseries.pyc in _convert1d
(series, freq, func, position, *args, **kwargs)
   1910
   1911 cdictresult = cseries.TS_convert(data_, from_freq, to_freq, 
position,
-> 1912  int(start_date), mask_)
   1913 start_date = Date(freq=to_freq, value=cdictresult['startindex'])
   1914 data_ = masked_array(cdictresult['values'], mask=cdictresult
['mask'])

ValueError: start_date outside allowable range for destination frequency

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy re-factoring project

2010-06-10 Thread Travis Oliphant

On Jun 10, 2010, at 11:48 AM, Sturla Molden wrote:

 
 I have a few radical suggestions:

There are some good ideas there.   I suspect we can't address all of them in 
the course of this re-factoring effort, but I really appreciate you putting 
them out there, because they are useful things to consider. 

 
 1. Use ctypes as glue to the core DLL, so we can completely forget about 
 refcounts and similar mess. Why put manual reference counting and error 
 handling in the core? It's stupid. 

If we could get to a single core DLL, then perhaps this would work.   It may be 
very difficult to actually get to that point though from where we are because 
there is a lot to the current CPython interface that would have to be 
re-thought.   Right now, it looks like there is still a need for an interface 
layer.  

 
 2. The core should be a plain DLL, loadable with ctypes. (I know David 
 Cournapeau and Robert Kern are going to hate this.) But if Python can have a 
 custom loader for .pyd files, so can NumPy for its core DLL. For ctypes we 
 just need to specify a fully qualified path to the DLL, which can be read 
 from a config file or whatever.

This approach does not build a new Python type in compiled code.   There are 
speed disadvantages to this --- especially for the numpy scalars. 


 
 5. Allow OpenMP pragmas in the core. If arrays are above a certain size, it 
 should switch to multi-threading.

This is an interesting idea. 

-Travis

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy re-factoring project

2010-06-10 Thread Sturla Molden
On 10.06.2010 22:28, Pauli Virtanen wrote:

 Some places where Openmp could probably help are in the inner ufunc
 loops. However, improving the memory efficiency of the data access
 pattern is another low-hanging fruit for multidimensional arrays.



Getting the intermediate array out of expressions like a + b would be 
very cool. But it is not as trivial as it seems.

The possibility of controlling aliasing (cf. Fortran and C99) can help. For 
example, for binary array operators we know that the output array (a 
temporary intermediate) is not aliased with the two arguments.
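
For example, a minimal sketch of the kind of no-aliasing guarantee meant
here, using C99 restrict (the function is illustrative only):

/* Because out, a and b are declared restrict, the compiler may assume the
 * output does not overlap the inputs and can vectorize/parallelize freely. */
void add_arrays(double *restrict out,
                const double *restrict a,
                const double *restrict b,
                long n)
{
    long i;
    for (i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}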

ufuncs involving transcendental functions would benefit from OpenMP. 
O(N**2) ufuncs like var and std would also benefit.

We can also use copy-in copy-out for strided array access (see below).

 I'm not sure how Openmp plays together with thread-safety; especially if
 used in outer constructs. Whether this is possible to do easily is not so
 clear, as Numpy uses e.g. iterator and loop objects, which cannot
 probably be shared between different openmp threads. So one would have to
 do some extra work to parallelize these parts manually.



OpenMP has pragmas for serializing access to code, which is:

#pragma omp critical
{
  // done by all threads
  // serialized access (we can also name the lock)
}

#pragma omp atomic
a += 1; // atomic operation (e.g. nice for refcounts, instead of a big lock)

#pragma omp single
{
  // done by one thread (random or named)
}

#pragma omp master
{
  // done by the master thread only
  // e.g. used when calling back to the Python C API
}

#pragma omp barrier
// wait for all threads here

That's about all you need to know about thread synchronization in OpenMP...
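
As a small, self-contained illustration of how a few of these pragmas
combine (a generic sketch, not NumPy code):

#include <stdio.h>
#include <omp.h>

/* Parallel sum: `for` shares the loop, `critical` guards the shared total,
 * `barrier` waits for everyone, `master` does the single-threaded report. */
double parallel_sum(const double *x, long n)
{
    double total = 0.0;
    long i;
    #pragma omp parallel
    {
        double local = 0.0;
        #pragma omp for
        for (i = 0; i < n; i++)
            local += x[i];
        #pragma omp critical
        total += local;
        #pragma omp barrier
        #pragma omp master
        printf("summed %ld elements on %d threads\n", n,
               omp_get_num_threads());
    }
    return total;
}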


 But probably if multithreading is desired, openmp sounds like the way to
 go...



Anyone proficient in C or Fortran can learn OpenMP in 10 minutes, and it 
is close to infinitely easier than manual multithreading. And now it 
is supported by the majority of compilers.

Also about array iterators in NumPy's C base (i.e. for doing something 
along an axis): we don't need those. There is a different way of coding 
which leads to faster code.

1. Collect an array of pointers to each subarray (e.g. using 
std::vector<dtype*> or dtype**)
2. Dispatch on the pointer array...

I used this to reimplement lfilter in scipy. It is just a lot faster 
than array iterators. It also allows us to use OpenMP in a very natural 
way. For multidimensional arrays I have a recursive function for 
collecting the array of subarray pointers, but for 2D and 3D we can 
use nested for-loops.
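
A sketch of step 2, assuming something like the get_axis_pointers() routine
posted later in this thread has filled ptrs with count pointers to vectors
of length n and stride s bytes; process_vector is an assumed kernel and an
OpenMP 3.0 compiler is assumed for the non-int loop variable:

#include <stddef.h>

void process_vector(double *v, ptrdiff_t n, ptrdiff_t s);  /* assumed kernel */

/* Each subarray is independent, so OpenMP can load-balance the dispatch. */
void apply_along_axis(double **ptrs, ptrdiff_t count,
                      ptrdiff_t n, ptrdiff_t s)
{
    ptrdiff_t i;
    #pragma omp parallel for schedule(static)
    for (i = 0; i < count; i++)
        process_vector(ptrs[i], n, s);
}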

Another thing I did when reimplementing lfilter was copy-in copy-out 
for strided arrays. This also helps a lot. It might seem paradoxical, as 
we need to iterate over the strided array(s) to make the copies. But in 
practice it is faster by a factor of 2 or 3. Fortran compilers also tend 
to do this optimization for strided arrays. It is even mentioned in the 
Fortran standard as explicitly allowed, as it has consequences for 
array pointers (they can be invalidated by function calls if a 
contiguous copy is made).

Sturla

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy re-factoring project

2010-06-10 Thread Pauli Virtanen
Thu, 10 Jun 2010 18:48:04 +0200, Sturla Molden wrote:
[clip]
 5. Allow OpenMP pragmas in the core. If arrays are above a certain size,
 it should switch to multi-threading.

Some places where Openmp could probably help are in the inner ufunc 
loops. However, improving the memory efficiency of the data access 
pattern is another low-hanging fruit for multidimensional arrays.

I'm not sure how Openmp plays together with thread-safety; especially if 
used in outer constructs. Whether this is possible to do easily is not so 
clear, as Numpy uses e.g. iterator and loop objects, which cannot 
probably be shared between different openmp threads. So one would have to 
do some extra work to parallelize these parts manually.

But probably if multithreading is desired, openmp sounds like the way to 
go...

-- 
Pauli Virtanen

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy re-factoring project

2010-06-10 Thread David Cournapeau
On Fri, Jun 11, 2010 at 7:25 AM, Sturla Molden stu...@molden.no wrote:
On 10.06.2010 22:07, Travis Oliphant wrote:

 2. The core should be a plain DLL, loadable with ctypes. (I know David 
 Cournapeau and Robert Kern are going to hate this.) But if Python can have a 
 custom loader for .pyd files, so can NumPy for its core DLL. For ctypes we 
 just need to specify a fully qualified path to the DLL, which can be read 
 from a config file or whatever.

I am not sure why you say I would hate it - that's exactly what I
would like numpy core to be (a library). Ctypes is an implementation
detail. For reference counting, I am not sure we can get away with it
unless we use something like the Lua stack technique for the core. The big
issue with reference counting is how to deal with non-C VMs - there have
to be known good practices to make libraries reusable from, say, the CLI
or the JVM, but I don't know them.



 By the way: Is Cython mature enough to toss all C out the door?

Maybe not all, but I am convinced most of it could. Today, I would not
have rewritten lfilter in C, for example; I would have just used
Cython. One issue with Cython compared to C is that it makes
compilation much slower altogether, at least with gcc. I have never
been able to pin-point the exact issue (it does not seem like gcc
should be that slow compiling the generated C).

David
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy re-factoring project

2010-06-10 Thread Sturla Molden
On 10.06.2010 22:07, Travis Oliphant wrote:

 2. The core should be a plain DLL, loadable with ctypes. (I know David 
 Cournapeau and Robert Kern are going to hate this.) But if Python can have a 
 custom loader for .pyd files, so can NumPy for its core DLL. For ctypes we 
 just need to specify a fully qualified path to the DLL, which can be read 
 from a config file or whatever.
  
 This approach does not build a new Python type in compiled code.   There are 
 speed disadvantages to this --- especially for the numpy scalars.

There are at least four speed penalties in what I suggested:

- the ctypes overhead is bigger than using Python's C API manually (or 
Cython).
- there is a speed penalty for scalars and small arrays. (Previously 
seen in numarray vs. numeric.)
- bytearray is a bit slower to create than to malloc a buffer.
- arrays with dtype=object

The advantages are:

- the code is easier to read and maintain (clean separation of Python and C)
- more portable code (no Python C API dependence)
- no obscure memory leaks to track down (bytearray instead of 
malloc/free and no manual ref counting).


By the way: Is Cython mature enough to toss all C out the door? Since 
Cython has syntax for PEP 3118 buffer objects, we should theoretically 
be able to implement NumPy in Cython. Then we don't have the speed 
penalty and no difficult C to maintain. But if the idea is to make a 
Python-independent C library from the core, it is probably a bit 
counterproductive...


Sturla

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [Job] research developer job opening at AQR Capital

2010-06-10 Thread Pierre GM
Oops, somebody should double check who's in copy before sending his emails... 
Sorry about the noise all. 
However, feel free to contact me offlist w/ your ideas regarding handling time 
series in numpy.
All my apologies again.
P.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy re-factoring project

2010-06-10 Thread Sturla Molden

On 11.06.2010 00:57, David Cournapeau wrote:

Do you have the code for this ? That's something I wanted to do, but
never took the time to do. Faster generic iterator would be nice, but
very hard to do in general.

   



/* this computes the start address for every vector along a dimension
(axis) of an ndarray */

typedef PyArrayObject ndarray;

inline char *datapointer(ndarray *a, npy_intp *indices)
{
    char *data = a->data;
    int i;
    npy_intp j;
    for (i=0; i < a->nd; i++) {
        j = indices[i];
        data += j * a->strides[i];
    }
    return data;
}

static double ** _get_axis_pointer(npy_intp *indices, int axis,
                   ndarray *a, double **axis_ptr_array, int dim)
{
    /* recursive axis pointer search for 4 dimensions or more */
    npy_intp i;
    double *axis_ptr;
    if (dim == a->nd) {
        /* done recursing dimensions,
           compute address of this axis */
        axis_ptr = (double *) datapointer(a, indices);
        *axis_ptr_array = axis_ptr;
        return (axis_ptr_array + 1);
    } else if (dim == axis) {
        /* use first element only */
        indices[dim] = 0;
        axis_ptr_array = _get_axis_pointer(indices, axis, a,
                             axis_ptr_array, dim+1);
        return axis_ptr_array;
    } else {
        /* iterate range of indices */
        npy_intp length = a->dimensions[dim];
        for (i=0; i < length; i++) {
            indices[dim] = i;
            axis_ptr_array = _get_axis_pointer(indices, axis, a,
                                 axis_ptr_array, dim+1);
        }
        return axis_ptr_array;
    }
}


static double **get_axis_pointers(ndarray *a, int axis, npy_intp *count)
{
    npy_intp indices[NPY_MAXDIMS];
    double **out, **tmp;
    npy_intp i, j;
    j = 1;
    for (i=0; i < a->nd; i++) {
        if (i != axis)
            j *= a->dimensions[i];
    }
    *count = j;

    out = (double **) malloc(*count * sizeof(double*));
    if (out == NULL) {
        *count = 0;
        return NULL;
    }
    tmp = out;
    switch (a->nd) {
    /* for one dimension we just return the data pointer */
    case 1:
        *tmp = (double *)a->data;
        break;
    /* specialized for two dimensions */
    case 2:
        switch (axis) {
        case 0:
            for (i=0; i < a->dimensions[1]; i++)
                *tmp++ = (double *)(a->data + i*a->strides[1]);
            break;
        case 1:
            for (i=0; i < a->dimensions[0]; i++)
                *tmp++ = (double *)(a->data + i*a->strides[0]);
            break;
        }
        break;
    /* specialized for three dimensions */
    case 3:
        switch (axis) {
        case 0:
            for (i=0; i < a->dimensions[1]; i++)
                for (j=0; j < a->dimensions[2]; j++)
                    *tmp++ = (double *)(a->data +
                                 i*a->strides[1] + j*a->strides[2]);
            break;
        case 1:
            for (i=0; i < a->dimensions[0]; i++)
                for (j=0; j < a->dimensions[2]; j++)
                    *tmp++ = (double *)(a->data +
                                 i*a->strides[0] + j*a->strides[2]);
            break;
        case 2:
            for (i=0; i < a->dimensions[0]; i++)
                for (j=0; j < a->dimensions[1]; j++)
                    *tmp++ = (double *)(a->data +
                                 i*a->strides[0] + j*a->strides[1]);
        }
        break;
    /* four or more dimensions: use recursion */
    default:
        for (i=0; i < a->nd; i++) indices[i] = 0;
        _get_axis_pointer(indices, axis, a, out, 0);
    }
done:
    return out;
}



Another thing I did when reimplementing lfilter was copy-in copy-out
for strided arrays.
 

What is copy-in copy-out? I am not familiar with this term.

   


Strided memory access is slow. So it often helps to make a temporary 
copy that is contiguous.


static void copy_in(double *restrict y, double *restrict x, npy_intp n,
                    npy_intp s)
{
    npy_intp i;
    char *tmp;
    if (s == sizeof(double))
        memcpy(y, x, n*sizeof(double));
    else {
        tmp = (char *)x;
        for (i=0; i < n; i++) {
            *y++ = *x;
            tmp += s;
            x = (double *)tmp;
        }
    }
}

static void copy_out(double *restrict y, double *restrict x, npy_intp n,
                     npy_intp s)
{
    npy_intp i;
    char *tmp;
    if (s == sizeof(double))
        memcpy(y, x, n*sizeof(double));
    else {
        tmp = (char *)y;
        for (i=0; i < n; i++) {
            *y = *x++;
            tmp += s;
            y = (double *)tmp;
        }
    }
}
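
A hedged usage sketch for the two helpers above; contiguous_kernel is an
assumed placeholder for whatever contiguous routine does the real work:

#include <stdlib.h>

void contiguous_kernel(double *buf, npy_intp n);   /* assumed placeholder */

/* Copy a strided vector into a contiguous scratch buffer, run the fast
 * contiguous kernel, then copy the result back out to the strided array. */
static int process_strided(double *x, npy_intp n, npy_intp stride_in_bytes)
{
    double *buf = (double *) malloc(n * sizeof(double));
    if (buf == NULL)
        return -1;
    copy_in(buf, x, n, stride_in_bytes);    /* strided    -> contiguous */
    contiguous_kernel(buf, n);
    copy_out(x, buf, n, stride_in_bytes);   /* contiguous -> strided    */
    free(buf);
    return 0;
}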



Sturla

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy re-factoring project

2010-06-10 Thread Charles R Harris
On Thu, Jun 10, 2010 at 6:27 PM, Sturla Molden stu...@molden.no wrote:

  On 11.06.2010 00:57, David Cournapeau wrote:

  Do you have the code for this ? That's something I wanted to do, but
 never took the time to do. Faster generic iterator would be nice, but
 very hard to do in general.





 /* this computes the start address for every vector along a dimension
 (axis) of an ndarray */

 typedef PyArrayObject ndarray;

 inline char *datapointer(ndarray *a, npy_intp *indices)
 {
     char *data = a->data;
     int i;
     npy_intp j;
     for (i=0; i < a->nd; i++) {
         j = indices[i];
         data += j * a->strides[i];
     }
     return data;
 }

 static double ** _get_axis_pointer(npy_intp *indices, int axis,
                    ndarray *a, double **axis_ptr_array, int dim)
 {
     /* recursive axis pointer search for 4 dimensions or more */
     npy_intp i;
     double *axis_ptr;
     if (dim == a->nd) {
         /* done recursing dimensions,
            compute address of this axis */
         axis_ptr = (double *) datapointer(a, indices);
         *axis_ptr_array = axis_ptr;
         return (axis_ptr_array + 1);
     } else if (dim == axis) {
         /* use first element only */
         indices[dim] = 0;
         axis_ptr_array = _get_axis_pointer(indices, axis, a,
                              axis_ptr_array, dim+1);
         return axis_ptr_array;
     } else {
         /* iterate range of indices */
         npy_intp length = a->dimensions[dim];
         for (i=0; i < length; i++) {
             indices[dim] = i;
             axis_ptr_array = _get_axis_pointer(indices, axis, a,
                                  axis_ptr_array, dim+1);
         }
         return axis_ptr_array;
     }
 }


 static double **get_axis_pointers(ndarray *a, int axis, npy_intp *count)
 {
     npy_intp indices[NPY_MAXDIMS];
     double **out, **tmp;
     npy_intp i, j;
     j = 1;
     for (i=0; i < a->nd; i++) {
         if (i != axis)
             j *= a->dimensions[i];
     }
     *count = j;

     out = (double **) malloc(*count * sizeof(double*));
     if (out == NULL) {
         *count = 0;
         return NULL;
     }
     tmp = out;
     switch (a->nd) {
     /* for one dimension we just return the data pointer */
     case 1:
         *tmp = (double *)a->data;
         break;
     /* specialized for two dimensions */
     case 2:
         switch (axis) {
         case 0:
             for (i=0; i < a->dimensions[1]; i++)
                 *tmp++ = (double *)(a->data + i*a->strides[1]);
             break;
         case 1:
             for (i=0; i < a->dimensions[0]; i++)
                 *tmp++ = (double *)(a->data + i*a->strides[0]);
             break;
         }
         break;
     /* specialized for three dimensions */
     case 3:
         switch (axis) {
         case 0:
             for (i=0; i < a->dimensions[1]; i++)
                 for (j=0; j < a->dimensions[2]; j++)
                     *tmp++ = (double *)(a->data + i*a->strides[1] +
                                         j*a->strides[2]);
             break;
         case 1:
             for (i=0; i < a->dimensions[0]; i++)
                 for (j=0; j < a->dimensions[2]; j++)
                     *tmp++ = (double *)(a->data + i*a->strides[0] +
                                         j*a->strides[2]);
             break;
         case 2:
             for (i=0; i < a->dimensions[0]; i++)
                 for (j=0; j < a->dimensions[1]; j++)
                     *tmp++ = (double *)(a->data + i*a->strides[0] +
                                         j*a->strides[1]);
         }
         break;
     /* four or more dimensions: use recursion */
     default:
         for (i=0; i < a->nd; i++) indices[i] = 0;
         _get_axis_pointer(indices, axis, a, out, 0);
     }
 done:
     return out;
 }


   Another thing I did when reimplementing lfilter was copy-in copy-out
 for strided arrays.


  What is copy-in copy-out? I am not familiar with this term.




 Strided memory access is slow. So it often helps to make a temporary copy
 that is contiguous.


But for an initial refactoring it probably falls in the category of
premature optimization. Another thing to avoid on the first go around is
micro-optimization, as it tends to complicate the code and often doesn't do
much for performance.

snip

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy re-factoring project

2010-06-10 Thread David
On 06/11/2010 10:02 AM, Charles R Harris wrote:



 But for an initial refactoring it probably falls in the category of
 premature optimization. Another thing to avoid on the first go around is
 micro-optimization, as it tends to complicate the code and often doesn't
 do much for performance.

I agree it may be difficult to add this in the initial refactoring, 
but I don't think it falls into the micro-optimization category.

The whole idea of converting strided buffers into temporary contiguous 
areas is already in ufunc, though. Maybe a core API to do just that 
would be useful. I am not familiar with the ufunc API at all, so I am 
not sure how it would go.

cheers,

David
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy re-factoring project

2010-06-10 Thread David
On 06/11/2010 09:27 AM, Sturla Molden wrote:


 Strided memory access is slow. So it often helps to make a temporary
 copy that are contiguous.

Ah, ok, I did not know this was called copy-in/copy-out, thanks for the 
explanation. I agree this would be a good direction to pursue, but maybe 
out of scope for the first refactoring,

cheers,

David
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion



Re: [Numpy-discussion] NumPy re-factoring project

2010-06-10 Thread Sturla Molden
On 11.06.2010 04:19, David wrote:

 Ah, ok, I did not know this was called copy-in/copy-out, thanks for the
 explanation. I agree this would be a good direction to pursue, but maybe
 out of scope for the first refactoring,



Copy-in copy-out is actually an implementation detail in Fortran 
compilers. It has more to do with how subroutines are called. But the 
purpose is to turn a strided array into a contiguous one.

Sturla


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] timeseries - dates prior to 1970

2010-06-10 Thread Pierre GM
On Jun 10, 2010, at 7:16 PM, Bevan Jenkins wrote:
 Hello,
 
 I have posted previously about dates prior to 1900, but this seems to be a 
 separate issue.  The error message is definitely different.
 I can not seem to convert a timeseries from one frequency ('D') to another 
 ('H') when I use dates prior to 1970, as shown below.  This works fine when I 
 use a date after 1970.  Is this something that can be easily fixed, or is 
 there a workaround that I can use? Thanks


Argh, major bug indeed... For frequencies below days, the reference is 1 AD. 
Above that, the reference is the unix epoch (1970/01/01). When you try to use 
dates before that epoch with an hourly frequency, you get negative integers 
that are not properly dealt with... I'm gonna work on that.
In the meantime, you can use the following (extremely ugly) trick:

 s_d = ts.time_series(np.arange(5) + 1,
                      start_date=ts.Date('D', "1969/01/01"))
 # This fails:
 # s_d.convert('H')
 # Create a shortcut to dates (to save some __getattr__ time)
 d = s_d.dates
 # Calculate the offset from the unix epoch
 offset = ts.Date('D', year=1970, month=1, day=1) - d[0]
 # Create a new temporary series
 s_h = s_d
 # Add the offset to the dates, so that you're after the epoch
 s_h.dates += offset
 # Convert to HOURLY
 s_h = s_h.convert('H')
 # Subtract the offset (don't forget to convert to 'H')
 s_h.dates -= offset * 24

Your predicament is far from over now, because you won't be able to print the 
dates... But that's another bug.
I'll keep you posted when we have some better solutions.


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy re-factoring project

2010-06-10 Thread Sturla Molden
On 11.06.2010 03:02, Charles R Harris wrote:

 But for an initial refactoring it probably falls in the category of 
 premature optimization. Another thing to avoid on the first go around 
 is micro-optimization, as it tends to complicate the code and often 
 doesn't do much for performance.




First, to refactor NumPy we must get rid of array iterators as Python 
objects. Retaining them as Python objects defies the whole purpose. 
Second, creating an array of pointers has the advantage of being more 
friendly to OpenMP and multi-core CPUs. We don't need to serialize 
access to an iterator, and we allow OpenMP to do load balancing. For 
things like FFT on vectors along an axis in a multidimensional array 
(e.g. in multi-dimensional FFTs), this is not a micro-optimization. Not 
to mention that it is easier to iterate over an array than to use a 
complicated iterator object in C.

Sturla
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy re-factoring project

2010-06-10 Thread Charles R Harris
On Thu, Jun 10, 2010 at 8:40 PM, Sturla Molden stu...@molden.no wrote:

 On 11.06.2010 03:02, Charles R Harris wrote:
 
  But for an initial refactoring it probably falls in the category of
  premature optimization. Another thing to avoid on the first go around
  is micro-optimization, as it tends to complicate the code and often
  doesn't do much for performance.




  First, to refactor NumPy we must get rid of array iterators as Python
  objects. Retaining them as Python objects defies the whole purpose.
 Second, creating an array of pointers has the advantage of being more
 friendly to OpenMP and multi-core CPUs. We don't need to serialize
 access to an iterator, and we allow OpenMP to do load balancing. For
 things like FFT on vectors along an axis in a multidimensional array
 (e.g. in multi-dimensional FFTs), this is not a micro-optimization. Not
 to mention that it is easier to iterate over an array than use a
 complicated iterator object in C.


How does that work when doing FFTs along all axes? What about slicing and
views, axis rolling and transposes? Furthermore, if you are just going to add
two discontiguous arrays, or take the sine, making copies is wasted effort. I
think those things definitely fall in the category of optimizations for many
things and will just be a distraction from the real problems of the
refactoring.

When I say micro-optimization, I am not talking about such things, however.
Micro-optimization is such things as loop unrolling, which while it can
greatly speed up some things adds a lot of obfuscating code when the first
priority is to get things right. Another example would be trying to get a
tad more speed using pointers instead of indexes, an optimization which also
tends to be architecture dependent. Or using a ton of macros instead of
subroutines or inline functions. So on and so forth. I think in these things
the first priority is to have something readable and debuggable with clear
lines of control flow. This is easier said than done, of course.

Chuck
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion