A quick answer to some of the points.
Robert Bradshaw wrote:
> First, it sounds a bit like you're proposing to essentially re-
> implement NumPy. Would it just be slicing and arithmetic, or where
> would you draw the line before it doesn't really belong in a
> compiler, but rather a library. More below:
At this point, I should stress that what I'm after is a roadmap (how do
I deal with requests as they come up, like Hoyt's proposed project when
that came up). I won't have time for doing more than the basic array
stuff (to better avail ourselves of Kurt's work) this time around --
which is still very useful for exchanging array data with C/Fortran
code.
It would be "reimplementing" NumPy as far as the core array API goes,
but the implementation would be totally different, as we have a
compilation stage. If anything, it is reimplementing Fortran; however,
Fortran desperately needs more alternatives :-)
As for the library, the CEP sketches full interoperability with it --
you would likely still do
arr = np.dot(arr,arr)
for matrix multiplication.
What to include? I think slicing, arithmetic (with "broadcasting", i.e.
dimensions of length 1 are repeated to make the arrays conform), and
transpose (arr.T). No "sum()" member function or similar.
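For example, with NumPy-style broadcasting:

import numpy as np
a = np.zeros((3, 1))
b = np.zeros((1, 4))
c = a + b   # the length-1 dimensions are repeated; c.shape == (3, 4)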
In addition, perhaps high-level utilities for looping, such as
"ndenumerate" etc.:
cdef int[:,:] arr = ...
for (i, j), value in cython.array.ndenumerate(arr):
...
These would be quite nicely contained in cython.array, though (and
again, there is no direct NumPy counterpart that isn't horribly slow
due to Python-level iterators).
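For illustration only, a pure-Python model of what such an ndenumerate
could do for the 2D case (a sketch; the actual cython.array utility is
hypothetical at this point and would be compiled):

def ndenumerate(arr):
    # Yield ((i, j), value) for every element of a 2D buffer --
    # a slow Python-level model of the proposed compiled utility.
    for i in range(arr.shape[0]):
        for j in range(arr.shape[1]):
            yield (i, j), arr[i, j]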
>
> On Jun 10, 2009, at 11:57 AM, Dag Sverre Seljebotn wrote:
>
>> Brian Granger wrote:
>>> Dag,
>>>
>>> I quickly glanced through the proposal and have two big picture
>>> questions:
>>>
>>> * What will this make possible that is currently not possible?
>
> This was originally my first question too, but you beat me to it.
>
>> 1) Efficient slices
>
> Is the inefficiency just in the object creation and Python indexing
> semantics? It's still O(1), right? Same with the other operations. (I
> guess there's also a question of result type.)
Well, and then the resulting buffer is acquired and checked for data
type. But essentially, yes.
This is purely a matter of notational convenience -- will it hurt to
have that O(1) overhead in your almost-inner loop when taking a slice
of 50 elements, or do you pass "start" and "end" around in your
function instead?
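To make the notational difference concrete (hypothetical helper names):

# with efficient slices: pass a view, O(1) plus the overhead above
process(arr[i, 10:60])

# without them: thread the bounds through by hand
process_range(arr, i, 10, 60)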
>> 2) Leave the road open for memory bus friendlier arithmetic (re: Hoyt
>> Koepke's project proposal)
>
> Could you provide a bit more of a concrete example here? There is
> avoiding allocation of temporary arrays, is there more?
For simple operations like addition, most systems today are constrained
by memory bus speed, not CPU speed. Consider
B = A + A + A + A + A + A
With NumPy, the data of A has to travel over the bus 6 times; with the
whole expression fused into a single loop it travels only once, as it
would turn into
B = new memory
for idx in ...:
    B[idx] = A[idx] + A[idx] + ...
This can mean a dramatic speed increase as data doesn't have to enter
the cache more than once.
(NumPy can't do this, but there is numexpr, which allows it:
B = numexpr.evaluate("A+A+...")
using a bytecode interpreter and working in cache-sized blocks.)
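For concreteness, here is the fused loop written by hand in the kind of
typed-buffer syntax the CEP proposes (a sketch only; it assumes 1D
double arrays and uses NumPy just to allocate the result):

import numpy as np

def fused_sum(double[:] A):
    # One pass over A: each element crosses the bus once,
    # instead of once per '+' as with NumPy temporaries.
    cdef double[:] B = np.empty(A.shape[0])
    cdef Py_ssize_t i
    for i in range(A.shape[0]):
        B[i] = A[i] + A[i] + A[i] + A[i] + A[i] + A[i]
    return B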
>> 3) Automatic contiguous copies in/out if a function is coded to
>> work on
>> contiguous memory
>>
>> Why can't this be done currently?
>>
>> * Cython contains no in-compiler support for NumPy, and so cannot
>> know
>> how to create new underlying ndarray objects.
>>
>> * All optimizations need to hard-code semantics at compile-time. With
>> single-element indexing it seemed fair to assume the usual
>> semantics of
>> zero-based indexing etc., but with slices things get worse (which
>> kind of
>> object is returned) and with arithmetic downright impossible (what
>> does *
>> do again?)
>>
>> That's not to say there's not other options:
>> 1) We could hard-code support for NumPy only, and only allow
>> ndarray and
>> not subclasses thereof.
>>
>> 2) We could invent some new protocol/syntax for defining compile-time
>> semantics for all relevant operations.
>
> Here I am torn--I don't like defining compile-time semantics because
> it goes against the whole OO style of inheritance (and feels even
> more remote than the very dynamic, late-binding Python runtime). I
> don't like option (1) either though.
>
> Another idea: have you thought of using NumPy as the backend? I.e.
> an int[:,:] is any bufferinfo-supporting object, but if one needs to
> be created, you create it via an ndarray? This could (potentially)
> facilitate a lot more code reuse (especially for operations that are
> more complicated than a single loop over the data). (Might be messier
> than implementing int[:,:] directly though.) Suppose one develops a
> vectorized version of ndarrays, could that be a drop-in replacement?
Aren't ndarrays already vectorized? But subclasses may not be.
I've thought about it. If arithmetic is implemented on arrays, the only
thing that seems to be reused this way is the malloc/free of a memory
area, though. Better to lose the dependency, then.
> The plug-in idea is very interesting, both from the perspective that
> it would allow one to play with different ways to operate on arrays,
> and also it could separate some of the array-processing logic out of
> the core compiler itself.
I can definitely see NumPy as being the first such plugin, i.e. it
would coerce each array to exactly ndarray on arithmetic, to at least
provide some arithmetic support right away.
--
Dag Sverre