Re: [Cython] Buffer default options

Dag Sverre Seljebotn Wed, 30 Jul 2008 07:45:05 -0700

Robert Bradshaw wrote:
> On Jul 30, 2008, at 12:14 AM, Dag Sverre Seljebotn wrote:
> 
>> Robert Bradshaw wrote:
>>> On Jul 29, 2008, at 1:00 PM, Dag Sverre Seljebotn wrote:
>>>
>>>> There's a lot of email today but as I finished one part I'm staking out
>>>> the course again.
>>>>
>>>> I'm thinking about letting the class author specify default buffer
>>>> options. Something like this:
>>>>
>>>> cdef extern class Image ...:
>>>>    __cythonbufferdefaults__ = {'ndim' : 2}
>>>>    __cythonbuffermandatory__ = {'dtype': unsigned char, 'indirect': True}
>>>>    __cythonbufferalways__ = True
>>>>
>>>> [1]
>>>>
>>>> I think this makes a lot of sense, as the author of the Image class
>>>> might export exactly this kind of buffer and nothing else. With the
>>>> above settings, one would automatically get efficient indexing on all
>>>> "cdef Image" instances.
>>>>
>>>> NumPy is kind of special in the flexibility it provides, but even there,
>>>> setting indirect=False can provide a speedup transparently. Once the
>>>> indirect option is implemented at all, that is :-) but that *must*
>>>> happen to make NumPy as efficient as possible.
>>>>
>>>> As for priority, I guess this is below most of what I've talked about so
>>>> far. Though it might suddenly look like a low hanging fruit that I go
>>>> for when I need a break from other stuff.
>>>>
>>>> [1] (Well, I consider custom parsing for that last line better than
>>>> inventing a wholly new syntax...and we've been talking about new
>>>> Python-style type references as well, which fits in here)
>>> This almost seems too magical for me; if one wants to use a buffer
>>> perhaps it is better to be explicit. But I'm curious to hear what
>>> other people think.
>> Let me just give one more concrete example for NumPy then. If you do
>>
>> cdef ndarray[int, 2] buf
>>
>> currently, then it is going to create code for checking whether it should
>> do indirect access, which is one if-test per dimension per lookup -- and
>> you *know* that for ndarrays, you can always get around with strided
>> access, i.e. something like
>>
>> cdef ndarray[int, 2, 'strided'] buf
>>
>> Now, I would kind of like to be able to tell people to use "object[int, 2]"
>> to write a generic buffer algorithm, but "ndarray[int, 2]" to write
>> something that only works with NumPy in an optimized fashion, and that is
>> it -- and the proposal kind of grew out of that, it is a way of letting
>> the users not have to type mode="strided" all the time.
>>
>> (BTW, do you like mode=different strings for this, or should I go with
>> "strided=True", "c=True", "fortran=True", etc?)
> 
> I like just providing strings, which I am assuming map to access flags.


More or less, but there may not always be a direct mapping, even though 
at first there will be. I am mostly considering usability and efficiency, 
and making use of the PEP to that end, not cloning the semantics of the 
PEP directly.

There is at least one case (the most common form of indirect indexing, 
where the last dimension is strided and the rest indirect... a "lines" 
mode) where I could introduce new assumptions that you cannot express in 
the buffer flags in order to get more optimal code in that case. (I'm 
wondering if this might be wanted for PIL, but I'll think of that at 
some later point -- if PIL is always 2D, introducing a suboffsets 
variable and setting suboffsets=(0,-1) by default will suffice).
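
For readers who want to see the difference between the two access patterns
discussed here, below is a rough sketch. It is an illustration only and not
part of the proposal: it uses plain Python rather than Cython, the helper
names (strided_offset, read_strided, read_lines) are made up for this
example, and the "lines" layout is modelled as a list of separate row
buffers rather than through real suboffsets.

```python
import struct

def strided_offset(indices, strides):
    """Byte offset of an element in a purely strided buffer."""
    return sum(i * s for i, s in zip(indices, strides))

# A 2 x 3 buffer of C ints, viewed through Python's memoryview.
values = (10, 11, 12, 20, 21, 22)
raw = bytearray(struct.pack("6i", *values))
view = memoryview(raw).cast("i", shape=[2, 3])

def read_strided(i, j):
    # Strided mode: one multiply-add per dimension, no pointer chasing
    # and no per-dimension test for indirection.
    off = strided_offset((i, j), view.strides)
    return struct.unpack_from("i", raw, off)[0]

# Hypothetical model of the "lines" mode: the first dimension is an
# array of pointers to rows (indirect), the last dimension is strided.
itemsize = view.itemsize
lines = [
    bytearray(struct.pack("3i", *values[0:3])),
    bytearray(struct.pack("3i", *values[3:6])),
]

def read_lines(i, j):
    row = lines[i]  # indirect step: follow the pointer for row i
    return struct.unpack_from("i", row, j * itemsize)[0]  # strided step

print(read_strided(1, 2), read_lines(1, 2), view[1, 2])
```

A contiguous exporter like the one above reports an empty suboffsets
tuple, which is the situation where generating only the strided code path
is safe.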

-- 
Dag Sverre
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
