On Jul 30, 2008, at 12:14 AM, Dag Sverre Seljebotn wrote:

> Robert Bradshaw wrote:
>> On Jul 29, 2008, at 1:00 PM, Dag Sverre Seljebotn wrote:
>>
>>> There's a lot of email today but as I finished one part I'm staking out
>>> the course again.
>>>
>>> I'm thinking about letting the class author specify default buffer
>>> options. Something like this:
>>>
>>> cdef extern class Image ...:
>>>    __cythonbufferdefaults__ = {'ndim' : 2}
>>>    __cythonbuffermandatory__ = {'dtype': unsigned char, 'indirect': True}
>>>    __cythonbufferalways__ = True
>>>
>>> [1]
>>>
>>> I think this makes a lot of sense, as the author of the Image class
>>> might export exactly this kind of buffer and nothing else. With the
>>> above settings, one would automatically get efficient indexing on all
>>> "cdef Image" instances.
>>>
>>> NumPy is kind of special in the flexibility it provides, but even there,
>>> setting indirect=False can provide a speedup transparently. Once the
>>> indirect option is implemented at all, that is :-) but that *must*
>>> happen to make NumPy as efficient as possible.
>>>
>>> As for priority, I guess this is below most of what I've talked about so
>>> far. Though it might suddenly look like a low-hanging fruit that I go
>>> for when I need a break from other stuff.
>>>
>>> [1] (Well, I consider custom parsing for that last line better than
>>> inventing a wholly new syntax...and we've been talking about new
>>> Python-style type references as well, which fits in here)
>>
>> This almost seems too magical for me; if one wants to use a buffer,
>> perhaps it is better to be explicit. But I'm curious to hear what
>> other people think.
>
> Let me just give one more concrete example for NumPy then. If you do
>
> cdef ndarray[int, 2] buf
>
> currently, then it is going to create code for checking whether it should
> do indirect access, which is one if-test per dimension per lookup -- and
> you *know* that for ndarrays, you can always get by with strided
> access, i.e. something like
>
> cdef ndarray[int, 2, 'strided'] buf
>
> Now, I would kind of like to be able to tell people to use
> "object[int, 2]" to write a generic buffer algorithm, but "ndarray[int, 2]"
> to write something that only works with NumPy in an optimized fashion, and
> that is it -- and the proposal kind of grew out of that, it is a way of
> letting the users not have to type mode="strided" all the time.
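
To make the cost difference concrete, here is a rough Python model of the two lookups. It follows the pointer-walking rule from PEP 3118 (step along a dimension, then dereference if the suboffset is non-negative). The function names and the dict standing in for memory are made up for illustration; this is not the code Cython generates.

```python
def strided_offset(indices, strides):
    """'strided' mode: one multiply-add per dimension, no branches."""
    return sum(i * s for i, s in zip(indices, strides))


def full_lookup(memory, base, indices, strides, suboffsets):
    """'full' mode: same walk, plus one if-test per dimension.

    `memory` is a hypothetical dict mapping an address to the pointer
    stored there; it stands in for dereferencing a char** in C.
    """
    ptr = base
    for i, stride, sub in zip(indices, strides, suboffsets):
        ptr += i * stride
        if sub >= 0:  # the per-dimension, per-lookup check
            ptr = memory[ptr] + sub
    return ptr


# A 2-D array of 4-byte ints with 3 columns, laid out in one block.
print(strided_offset((1, 2), (12, 4)))

# The same element reached through a table of 8-byte row pointers.
row_table = {100: 1000, 108: 2000}
print(full_lookup(row_table, 100, (1, 2), (8, 4), (0, -1)))
```

When every suboffset is negative, full_lookup computes the same address as strided_offset but still pays for the test in each dimension, which is what a "strided" default would avoid.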
>
> (BTW, do you like different mode= strings for this, or should I go with
> "strided=True", "c=True", "fortran=True", etc?

I like just providing strings, which I am assuming map to access flags.
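
Roughly what I have in mind, as a Python sketch. The flag values mirror the PyBUF_* constants in CPython's headers; the table pairing each mode string with a flag is only my assumption about how the modes would map, and flags_for_mode is a made-up helper name.

```python
# Values mirrored from the PyBUF_* constants in CPython's object.h.
PyBUF_ND = 0x0008
PyBUF_STRIDES = 0x0010 | PyBUF_ND
PyBUF_C_CONTIGUOUS = 0x0020 | PyBUF_STRIDES
PyBUF_F_CONTIGUOUS = 0x0040 | PyBUF_STRIDES
PyBUF_ANY_CONTIGUOUS = 0x0080 | PyBUF_STRIDES
PyBUF_INDIRECT = 0x0100 | PyBUF_STRIDES

# Assumed mapping from the proposed mode strings to request flags.
MODE_FLAGS = {
    "full": PyBUF_INDIRECT,
    "strided": PyBUF_STRIDES,
    "c": PyBUF_C_CONTIGUOUS,
    "fortran": PyBUF_F_CONTIGUOUS,
    "contig": PyBUF_ANY_CONTIGUOUS,
}


def flags_for_mode(mode):
    """Return the buffer request flags for a mode string."""
    try:
        return MODE_FLAGS[mode]
    except KeyError:
        raise ValueError("unknown buffer mode: %r" % (mode,))


print(hex(flags_for_mode("strided")))
```

A single string keeps the modes mutually exclusive, whereas separate booleans would allow contradictory combinations such as c=True together with fortran=True.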

> There will be two modes at
> first: "full" and "strided", although if cython.buffer.bufptr is
> introduced, then "c", "fortran", and "contig" will be useful as well.)
>
> Of course I could do this only for the mode, but this doesn't seem to be a
> special case -- though most buffer use cases seem to be even more fixed
> (if you have a JPEG library, why bother to specify the ndim and so on).
>
> As for an option for automatically retrieving a buffer, I might agree.
> Also I see the downside that it allows syntax like "JPEGImage[]" and
> "MultiDimImage[3]".

You have me convinced that providing defaults is a good thing (and I
agree many (most?) libraries/classes will have a fixed dimension/type).
The __cythonbuffermandatory__ is just there to turn what would be a
runtime error into a compile-time error, right? It may fail to be
true for subclasses. __cythonbufferalways__ can be assumed: if there
is enough (default) information to provide a buffer, then do it;
otherwise don't.
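
The resolution rule I am describing could be sketched in Python like this. Everything here is hypothetical: the helper name, the error class, and the assumption that dtype and ndim are what counts as "enough information".

```python
# Assumption: a buffer can be set up once these two options are known.
REQUIRED = ("dtype", "ndim")


class BufferOptionError(Exception):
    """Stands in for a compile-time error."""


def resolve_buffer_options(declared, defaults=None, mandatory=None):
    """Combine the options written at the declaration with the class's
    defaults and mandatory values.

    Returns the final options, or None when there is not enough
    information, in which case no buffer is acquired.
    """
    defaults = defaults or {}
    mandatory = mandatory or {}
    for key, value in mandatory.items():
        if key in declared and declared[key] != value:
            raise BufferOptionError(
                "%s must be %r for this class" % (key, value))
    # Declared options override defaults; mandatory ones always apply.
    options = dict(defaults)
    options.update(declared)
    options.update(mandatory)
    if all(key in options for key in REQUIRED):
        return options
    return None


image_defaults = {"ndim": 2}
image_mandatory = {"dtype": "unsigned char", "indirect": True}

# A plain "cdef Image img" has enough information for a buffer.
print(resolve_buffer_options({}, image_defaults, image_mandatory))
```

Under this rule a class with no defaults resolves to None for a bare declaration, so it behaves as it does today, and no separate __cythonbufferalways__ switch is needed.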

- Robert

