A very radical solution would be to get rid of all C, and go for a pure
Python solution. NumPy could build up a text string with OpenCL code on
the fly, and use the OpenCL driver as a JIT compiler for fast array
expressions. Most GPUs and CPUs will support OpenCL, and thus there will
be no
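The "build a text string with OpenCL code on the fly" step can be sketched in pure Python. This is only the code-generation half (the helper name `make_elementwise_kernel` is hypothetical, not NumPy API); actually compiling and running the string requires an OpenCL driver, e.g. via pyopencl, which is not shown:

```python
def make_elementwise_kernel(name, expr, args):
    """Generate OpenCL C source for an elementwise kernel.

    `expr` is a C expression over the names in `args`; the last
    argument is treated as the output. Illustrative sketch only.
    """
    params = ", ".join("__global double *%s" % a for a in args)
    body = ("    const size_t i = get_global_id(0);\n"
            "    %s[i] = %s;" % (args[-1], expr))
    return "__kernel void %s(%s)\n{\n%s\n}\n" % (name, params, body)

# y = b*x + a, generated as OpenCL source text on the fly
src = make_elementwise_kernel("axpy", "b[i] * x[i] + a[i]",
                              ["a", "b", "x", "y"])
```

The OpenCL driver would then act as the JIT compiler for `src`, as the message proposes.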
On 15.06.2010 18:30, Sturla Molden wrote:
A very radical solution would be to get rid of all C, and go for a
pure Python solution. NumPy could build up a text string with OpenCL
code on the fly, and use the OpenCL driver as a JIT compiler for
fast array expressions. Most GPUs and CPUs will
Sun, 13 Jun 2010 06:54:29 +0200, Sturla Molden wrote:
[clip: memory management only in the interface]
You forgot views: if memory management is done in the interface layer, it
must also make sure that the memory pointed to by a view is never moved
around, and not freed before all the views are
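Today's NumPy already illustrates the constraint being described: a view holds a reference to the array that owns the memory, so the buffer cannot be freed while any view is alive. A small demonstration:

```python
import numpy as np

a = np.arange(10)
v = a[::2]            # a view: shares a's buffer, no copy
assert v.base is a    # the view keeps a reference to its owner

del a                 # drop the owner's only other reference...
v[0] = 99             # ...the buffer survives: the view's base
                      # reference prevents the memory being freed
```

Any refactored memory-management layer has to preserve exactly this guarantee, plus the stronger one that the buffer is never *moved* while views point into it.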
On Thu, Jun 10, 2010 at 6:48 PM, Sturla Molden stu...@molden.no wrote:
I have a few radical suggestions:
1. Use ctypes as glue to the core DLL, so we can completely forget about
refcounts and similar mess. Why put manual reference counting and error
handling in the core? It's stupid.
I
On Sat, Jun 12, 2010 at 10:27 PM, Sebastian Walter
sebastian.wal...@gmail.com wrote:
On Thu, Jun 10, 2010 at 6:48 PM, Sturla Molden stu...@molden.no wrote:
I have a few radical suggestions:
1. Use ctypes as glue to the core DLL, so we can completely forget about
refcounts and similar mess.
On 12.06.2010 15:57, David Cournapeau wrote:
Anything non trivial will require memory allocation and object
ownership conventions. If the goal is interoperation with other
languages and vm, you may want to use something else than plain
malloc, to interact better with the allocation strategies
David Cournapeau wrote:
In the core C numpy library there would be new numpy_array struct
with attributes
numpy_array->buffer
Anything non trivial will require memory allocation and object
ownership conventions.
I totally agree -- I've been thinking for a while about a core array
data
Christopher Barker wrote:
David Cournapeau wrote:
In the core C numpy library there would be new numpy_array struct
with attributes
numpy_array->buffer
Anything non trivial will require memory allocation and object
ownership conventions.
I totally agree -- I've been thinking for a while
On Sat, Jun 12, 2010 at 11:38 AM, Dag Sverre Seljebotn
da...@student.matnat.uio.no wrote:
Christopher Barker wrote:
David Cournapeau wrote:
In the core C numpy library there would be new numpy_array struct
with attributes
numpy_array->buffer
Anything non trivial will require
On Sat, Jun 12, 2010 at 1:35 PM, Charles R Harris charlesr.har...@gmail.com
wrote:
On Sat, Jun 12, 2010 at 11:38 AM, Dag Sverre Seljebotn
da...@student.matnat.uio.no wrote:
Christopher Barker wrote:
David Cournapeau wrote:
In the core C numpy library there would be new numpy_array
2010/6/12 Charles R Harris charlesr.har...@gmail.com
This is more the way I see things, except I would divide the bottom layer
into two parts, views and memory. The memory can come from many places --
memmaps, user supplied buffers, etc. -- but we should provide a simple
reference counted
If I could, I would like to throw out another possible feature that might
need to be taken into consideration for designing the implementation of
numpy arrays.
One thing I found somewhat lacking -- if that is the right term -- is a way
to convolve a numpy array with an arbitrary windowing
On Sat, Jun 12, 2010 at 3:57 PM, David Cournapeau courn...@gmail.com wrote:
On Sat, Jun 12, 2010 at 10:27 PM, Sebastian Walter
sebastian.wal...@gmail.com wrote:
On Thu, Jun 10, 2010 at 6:48 PM, Sturla Molden stu...@molden.no wrote:
I have a few radical suggestions:
1. Use ctypes as glue to
Charles Harris wrote:
On Sat, Jun 12, 2010 at 11:38 AM, Dag Sverre Seljebotn
da...@student.matnat.uio.no wrote:
Christopher Barker wrote:
David Cournapeau wrote:
In the core C numpy library there would be new numpy_array struct
with attributes
numpy_array->buffer
Anything non
On Sat, Jun 12, 2010 at 2:56 PM, Dag Sverre Seljebotn
da...@student.matnat.uio.no wrote:
Charles Harris wrote:
On Sat, Jun 12, 2010 at 11:38 AM, Dag Sverre Seljebotn
da...@student.matnat.uio.no wrote:
Christopher Barker wrote:
David Cournapeau wrote:
In the core C numpy library
On Sun, Jun 13, 2010 at 2:00 AM, Sturla Molden stu...@molden.no wrote:
On 12.06.2010 15:57, David Cournapeau wrote:
Anything non trivial will require memory allocation and object
ownership conventions. If the goal is interoperation with other
languages and vm, you may want to use something
On 13.06.2010 02:39, David Cournapeau wrote:
But the point is to get rid of the python dependency, and if you don't
allow any api call to allocate memory, there is not much left to
implement in the core.
Memory allocation is platform dependent. A CPython version could use
bytearray,
On Sun, Jun 13, 2010 at 11:39 AM, Sturla Molden stu...@molden.no wrote:
If NumPy does not allocate memory on its own, there will be no leaks
due to errors in NumPy.
There is still work to do in the core, i.e. the computational loops in
array operators, broadcasting, ufuncs, copying data
On 13.06.2010 05:47, David Cournapeau wrote:
This only works in simple cases. What do you do when you don't know
the output size ?
First: If you don't know, you don't know. Then you're screwed and C is
not going to help.
Second: If we cannot figure out how much to allocate before starting
On Fri, 11 Jun 2010 00:25:17 +0200, Sturla Molden stu...@molden.no wrote:
On 10.06.2010 22:07, Travis Oliphant wrote:
2. The core should be a plain DLL, loadable with ctypes. (I know David
Cournapeau and Robert Kern are going to hate this.) But if Python can have
a custom loader for
On Friday 11 June 2010 02:27:18, Sturla Molden wrote:
Another thing I did when reimplementing lfilter was copy-in copy-out
for strided arrays.
What is copy-in copy-out? I am not familiar with this term.
Strided memory access is slow. So it often helps to make a temporary
copy that
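The copy-in/copy-out pattern can be sketched at the NumPy level (the helper name `filter_along` is illustrative, not an actual lfilter internal):

```python
import numpy as np

def filter_along(x, func):
    """Apply `func` (which wants contiguous data) to a possibly
    strided array using copy-in/copy-out. Illustrative sketch."""
    tmp = np.ascontiguousarray(x)   # copy-in: contiguous scratch copy
    out = func(tmp)                 # fast computation on contiguous data
    x[...] = out                    # copy-out: write back to strided storage
    return x

a = np.arange(20.0)
view = a[::2]                        # strided view, step of 2 elements
filter_along(view, lambda t: t * 2)  # doubles every other element of a
```

The two extra copies are often cheaper than running the whole computation with strided loads and stores.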
Thu, 10 Jun 2010 23:56:56 +0200, Sturla Molden wrote:
[clip]
Also about array iterators in NumPy's C base (i.e. for doing something
along an axis): we don't need those. There is a different way of coding
which leads to faster code.
1. Collect an array of pointers to each subarray (e.g. using
On Thursday 10 June 2010 22:28:28 Pauli Virtanen wrote:
Some places where Openmp could probably help are in the inner ufunc
loops. However, improving the memory efficiency of the data access
pattern is another low-hanging fruit for multidimensional arrays.
I was about to mention this when the
Fri, 11 Jun 2010 10:29:28 +0200, Hans Meine wrote:
[clip]
Ideally, algorithms would get wrapped in between two additional
pre-/postprocessing steps:
1) Preprocessing: After broadcasting, transpose the input arrays such
that they become C order. More specifically, sort the strides of one
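The preprocessing step described here (transpose so the inputs traverse memory in C order) can be sketched by sorting the axes by decreasing stride; `as_c_order_view` is a hypothetical name for illustration:

```python
import numpy as np

def as_c_order_view(a):
    """Transpose `a` so its strides are non-increasing, i.e. a plain
    C-order traversal visits memory sequentially. Returns a view."""
    order = np.argsort([-s for s in a.strides], kind="stable")
    return a.transpose(order)

f = np.asfortranarray(np.arange(12.0).reshape(3, 4))  # Fortran order
v = as_c_order_view(f)
assert v.flags["C_CONTIGUOUS"]    # cache-friendly traversal restored
```

The postprocessing step would be the inverse transpose on the output.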
On Friday 11 June 2010 10:38:28 Pauli Virtanen wrote:
Fri, 11 Jun 2010 10:29:28 +0200, Hans Meine wrote:
Ideally, algorithms would get wrapped in between two additional
pre-/postprocessing steps:
1) Preprocessing: After broadcasting, transpose the input arrays such
that they become C
On 11.06.2010 10:17, Pauli Virtanen wrote:
1. Collect an array of pointers to each subarray (e.g. using
std::vector<dtype*> or dtype**)
2. Dispatch on the pointer array...
This is actually what the current ufunc code does.
The innermost dimension is handled via the ufunc loop, which
On 11.06.2010 09:14, Sebastien Binet wrote:
it of course depends on the granularity at which you wrap and use
numpy-core but tight loops calling ctypes ain't gonna be pretty
performance-wise.
Tight loops in Python are never pretty.
The purpose of vectorization with NumPy is to avoid
On Fri, Jun 11, 2010 at 8:31 AM, Sturla Molden stu...@molden.no wrote:
It would also make sense to evaluate expressions like y = b*x + a
without a temporary array for b*x. I know roughly how to do it, but
don't have time to look at it before next year. (Yes I know about
numexpr, I am
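What "without a temporary" means can be shown with the ufunc `out=` arguments available today (this is the hand-written workaround, not the automatic expression machinery the message proposes):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 5)
a, b = 2.0, 3.0

# Naive: b*x allocates a temporary, then + a allocates the result.
y1 = b * x + a

# With explicit output buffers: one allocation, no temporary.
y2 = np.empty_like(x)
np.multiply(x, b, out=y2)   # y2 = b*x, written in place
np.add(y2, a, out=y2)       # y2 = b*x + a, still in place

assert np.allclose(y1, y2)
```

An expression evaluator (as in numexpr) does this rewriting automatically, and can additionally block the arrays for cache efficiency.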
On 11 June 2010 11:12, Benjamin Root ben.r...@ou.edu wrote:
On Fri, Jun 11, 2010 at 8:31 AM, Sturla Molden stu...@molden.no wrote:
It would also make sense to evaluate expressions like y = b*x + a
without a temporary array for b*x. I know roughly how to do it, but
don't have time to look
Sturla Molden wrote:
On 11.06.2010 09:14, Sebastien Binet wrote:
it of course depends on the granularity at which you wrap and use
numpy-core but tight loops calling ctypes ain't gonna be pretty
performance-wise.
Tight loops in Python are never pretty.
The purpose of vectorization
On 11.06.2010 17:17, Anne Archibald wrote:
On the other hand, since memory reads are very slow, optimizations
that do more calculation per load/store could make a very big
difference, eliminating temporaries as a side effect.
Yes, that's the main issue, not the extra memory they use.
Fri, 11 Jun 2010 15:31:45 +0200, Sturla Molden wrote:
[clip]
The innermost dimension is handled via the ufunc loop, which is a
simple for loop with constant-size step and is given a number of
iterations. The array iterator objects are used only for stepping
through the outer dimensions. That
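The inner loop being described is roughly the C signature `(char **args, npy_intp *dimensions, npy_intp *steps)`: one 1-D pass with a fixed iteration count and constant steps. A pure-Python model (element steps rather than byte steps, for readability):

```python
def inner_loop_add(n, x, sx, y, sy, out, so):
    """Model of a ufunc inner loop: n iterations, constant steps.
    A step of 0 models a broadcast scalar operand. Illustrative."""
    ix = iy = io = 0
    for _ in range(n):
        out[io] = x[ix] + y[iy]
        ix += sx
        iy += sy
        io += so

x = list(range(5))
y = [10]            # single element, step 0: broadcast against x
out = [0] * 5
inner_loop_add(5, x, 1, y, 0, out, 1)
# out == [10, 11, 12, 13, 14]
```

The array iterators then only have to step these pointers through the outer dimensions, which is the point made above.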
On Wed, Jun 9, 2010 at 5:27 PM, Jason McCampbell
jmccampb...@enthought.comwrote:
Hi everyone,
This is a follow-up to Travis's message on the re-factoring project from
May 25th and the subsequent discussion. For background, I am a developer at
Enthought working on the NumPy re-factoring
On Thu, Jun 10, 2010 at 7:26 AM, Charles R Harris charlesr.har...@gmail.com
wrote:
On Wed, Jun 9, 2010 at 5:27 PM, Jason McCampbell
jmccampb...@enthought.com wrote:
Hi everyone,
This is a follow-up to Travis's message on the re-factoring project from
May 25th and the subsequent
Hi Chuck,
Good questions. Responses inline below...
Jason
On Thu, Jun 10, 2010 at 8:26 AM, Charles R Harris charlesr.har...@gmail.com
wrote:
On Wed, Jun 9, 2010 at 5:27 PM, Jason McCampbell
jmccampb...@enthought.com wrote:
Hi everyone,
This is a follow-up to Travis's message on the
I have a few radical suggestions:
1. Use ctypes as glue to the core DLL, so we can completely forget about
refcounts and similar mess. Why put manual reference counting and error
handling in the core? It's stupid.
2. The core should be a plain DLL, loadable with ctypes. (I know David
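What "ctypes as glue to the core DLL" looks like in practice can be sketched against the C math library (the `libm.so.6` fallback name is a Linux assumption; the core DLL would be loaded the same way):

```python
import ctypes
import ctypes.util

# Locate and load a plain shared library -- no Python C API involved.
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")

# Declare the signature once; ctypes then handles all argument and
# return-value conversion. No refcounting or error-handling
# boilerplate lives on the C side.
libm.cos.restype = ctypes.c_double
libm.cos.argtypes = [ctypes.c_double]

assert abs(libm.cos(0.0) - 1.0) < 1e-12
```

A NumPy core exposed this way would be loadable from any ctypes-capable runtime, which is the portability argument made in this thread.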
On 10.06.2010 18:48, Sturla Molden wrote:
ctypes will also make porting to other Python implementations easier
(or even other languages: Ruby, JavaScript). Not to mention
that it will make NumPy impervious to changes in the Python C API.
Linking is also easier with ctypes. I started
Another suggestion I'd like to make is bytearray as memory buffer for
the ndarray. An ndarray could just store or extend a bytearray, instead
of having to deal with malloc/free and the mess that comes with it.
Python takes care of the reference counts for bytearrays, and ctypes
calls the
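The bytearray-as-buffer idea can be demonstrated with today's NumPy, since `np.frombuffer` over a mutable bytearray yields a writable array:

```python
import numpy as np

buf = bytearray(8 * 4)                    # Python-managed, refcounted memory
a = np.frombuffer(buf, dtype=np.float64)  # ndarray view over the bytearray
a[:] = [1.0, 2.0, 3.0, 4.0]               # writes go straight into buf

# No malloc/free in sight: when `a` and `buf` go away, CPython's
# reference counting frees the memory.
assert bytes(buf[:8]) == np.float64(1.0).tobytes()
```

This is exactly the division of labor proposed: Python owns the memory and its lifetime, the array machinery only computes over it.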
On Fri, Jun 11, 2010 at 6:56 AM, Sturla Molden stu...@molden.no wrote:
Also about array iterators in NumPy's C base (i.e. for doing something
along an axis): we don't need those. There is a different way of coding
which leads to faster code.
1. Collect an array of pointers to each subarray
On Jun 10, 2010, at 11:48 AM, Sturla Molden wrote:
I have a few radical suggestions:
There are some good ideas there. I suspect we can't address all of them in
the course of this re-factoring effort, but I really appreciate you putting
them out there, because they are useful things to
On 10.06.2010 22:28, Pauli Virtanen wrote:
Some places where Openmp could probably help are in the inner ufunc
loops. However, improving the memory efficiency of the data access
pattern is another low-hanging fruit for multidimensional arrays.
Getting the intermediate array out of
Thu, 10 Jun 2010 18:48:04 +0200, Sturla Molden wrote:
[clip]
5. Allow OpenMP pragmas in the core. If arrays are above a certain size,
it should switch to multi-threading.
Some places where Openmp could probably help are in the inner ufunc
loops. However, improving the memory efficiency of the
On Fri, Jun 11, 2010 at 7:25 AM, Sturla Molden stu...@molden.no wrote:
On 10.06.2010 22:07, Travis Oliphant wrote:
2. The core should be a plain DLL, loadable with ctypes. (I know David
Cournapeau and Robert Kern are going to hate this.) But if Python can have a
custom loader for .pyd
On 10.06.2010 22:07, Travis Oliphant wrote:
2. The core should be a plain DLL, loadable with ctypes. (I know David
Cournapeau and Robert Kern are going to hate this.) But if Python can have a
custom loader for .pyd files, so can NumPy for its core DLL. For ctypes we
just need to specify a
On 11.06.2010 00:57, David Cournapeau wrote:
Do you have the code for this ? That's something I wanted to do, but
never took the time to do. Faster generic iterator would be nice, but
very hard to do in general.
/* this computes the start address for every vector along a dimension
(axis)
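The truncated comment above describes collecting the start address of every 1-D subvector along an axis. A Python sketch of that computation from shape and strides (the function name is illustrative):

```python
import numpy as np
from itertools import product

def vector_start_offsets(a, axis):
    """Byte offset (from a's data pointer) of every 1-D subvector
    along `axis` -- a sketch of the pointer-array scheme."""
    shape = [n for i, n in enumerate(a.shape) if i != axis]
    strides = [s for i, s in enumerate(a.strides) if i != axis]
    return [sum(i * s for i, s in zip(idx, strides))
            for idx in product(*map(range, shape))]

a = np.arange(24.0).reshape(2, 3, 4)
offs = vector_start_offsets(a, axis=2)   # 2*3 = 6 vectors of length 4
```

With the offsets in hand, an axis operation becomes a flat loop over base-pointer-plus-offset, with no nested iterator bookkeeping.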
On Thu, Jun 10, 2010 at 6:27 PM, Sturla Molden stu...@molden.no wrote:
On 11.06.2010 00:57, David Cournapeau wrote:
Do you have the code for this ? That's something I wanted to do, but
never took the time to do. Faster generic iterator would be nice, but
very hard to do in general.
On 06/11/2010 10:02 AM, Charles R Harris wrote:
But for an initial refactoring it probably falls in the category of
premature optimization. Another thing to avoid on the first go around is
micro-optimization, as it tends to complicate the code and often doesn't
do much for performance.
I
On 06/11/2010 09:27 AM, Sturla Molden wrote:
Strided memory access is slow. So it often helps to make a temporary
copy that is contiguous.
Ah, ok, I did not know this was called copy-in/copy-out, thanks for the
explanation. I agree this would be a good direction to pursue, but maybe
out of
On 11.06.2010 04:19, David wrote:
Ah, ok, I did not know this was called copy-in/copy-out, thanks for the
explanation. I agree this would be a good direction to pursue, but maybe
out of scope for the first refactoring,
Copy-in copy-out is actually an implementation detail in Fortran
On 11.06.2010 03:02, Charles R Harris wrote:
But for an initial refactoring it probably falls in the category of
premature optimization. Another thing to avoid on the first go around
is micro-optimization, as it tends to complicate the code and often
doesn't do much for performance.
On Thu, Jun 10, 2010 at 8:40 PM, Sturla Molden stu...@molden.no wrote:
On 11.06.2010 03:02, Charles R Harris wrote:
But for an initial refactoring it probably falls in the category of
premature optimization. Another thing to avoid on the first go around
is micro-optimization, as it
Hi everyone,
This is a follow-up to Travis's message on the re-factoring project from May
25th and the subsequent discussion. For background, I am a developer at
Enthought working on the NumPy re-factoring project with Travis and Scott.
The immediate goal from our perspective is to re-factor the