Hi,

this discussion was interesting. While I may seem to have en edge below 
I mean it all for the sake of the discussion and hope we can continue to 
have this "in a good spirit".

(Inserting this note was easier than redoing my mail. I'm excited, not 
angry :-) )

Andrew Straw wrote:
> Dag Sverre Seljebotn wrote:
>   
>> Andrew Straw wrote:
>>   
>>     
>>> Hi,
>>>
>>> I tracked down an issue that was giving me segfaults with a poor 
>>> __getbuffer__ implementation and modified the compiler to catch this case.
>>>     
>>>       
>> Background first:
>>
>> The situation is that this patch guards against returning erronous data 
>> from a GetBuffer call. According to the PEP (3118), I think it is pretty 
>> clear that data should either be set or an exception raised, so this 
>> patch only helps detect when you create an invalid __getbuffer__ 
>> implementation.
>>   
>>     
> I understand your reasoning, but Cython's __getbuffer__ works outside 
> the scope of PEP 3118. The PEP doesn't specify Cython behavior, and a 
>   
Only when the C file is compiled under Python 2.x. When compiling the C 
file under Python 3, __getbuffer__ fills in the slot specified by the PEP.

> great aspect of Cython, IMO, is that it makes Python/C bridges easier, 
> not to adhere to the minimum standards specified by a PEP. Thus a small 
> amount of additional error checking seems OK to me. Also, as another 
>   
Well, in this case, the error check should be done by inserting some 
code automatically in the definition of __getbuffer__, so that Python 3 
consumers also get advantage of it, and so that we don't second guess 
how good non-Cython implementations are. So the check is still inserted 
in the wrong place.

(I don't think it is worth it though because you can only check some 
things but not the most important/non-trivial ones like strides. The 
author of __getbuffer__ has to know some dirty details anyway. Also 
__getbuffer__ is supposed to be as close to the metal as possible I 
think, though I'm likely to follow whatever Stefan's opinion is here)

> significant difference, Cython's implementation is working in Python 
> earlier than 2.6, whereas the Python development process will only bring 
> this out with 2.6. (I'm certainly not complaining -- it's just to point 
> out that the PEP doesn't directly apply.)
>   
As it does apply when compiling the Cython code with newer 
implementations, the PEP does apply because the same Cython code should 
have the same semantics everywhere.

> Yes, there is the runtime penalty of checking in C whether a pointer is 
> NULL. Given that using buffers often saves copying large amounts of 
>   
My concerns aren't runtime performance as such (which is negligible in 
this case) but keeping the code and design clean.

If I read that C code (and didn't know about your reasons behind it), my 
first thought would be "why is this check here, this cannot ever be NULL 
by specification". If people start second-guessing implementations of 
hard specifications "just in case" we will have all sorts of 
inconsistency and problems (how much second-guessing do you do?)

Sprinkling tests more or less at random to trap specific corner cases 
never makes for good code --- it is a symptom that the whole approach is 
wrong, and one should test things properly at the right location.

> memory, I think that's an acceptable cost. Furthermore, there is already 
> a runtime check in the code (checking the number of dimesions in the 
> buffer using ndim, line 534 of Cython/Compiler/Buffer.py). According to 
> your reasoning above, that check should be removed, too. I would be 
>   
I disagree. "ndim" is explicitly provided by the PEP so that you can do 
this checking, and is used by end-users in a routinely fashion:

cdef numpy.ndarray[numpy.int_t, ndim=2] arr = numpy.zeros((2,3,4,5,6), 
dtype=numpy.int)

Here an exception is raised, yet there's nothing wrong with the numpy 
implementation of __getbuffer__! Adding try/except to deal with this 
situation dynamically and so on is perfectly OK (and this is intended 
and documented behaviour), while if __getbuffer__ is broken, it is plain 
broken.

> against that -- my position is that a couple sanity checks costing a 
> single C comparison each is worth fighting segfaults and other undefined 
> behavior.
Well, the above being said, there is some interest in a compiler 
directive to insert null-pointer-checks "everywhere". Note that you can 
currently do

cdef ndarray arr = None
print arr.shape[0]

and this will likely segfault too! There's so many ways for a Cython 
program to segfault and this must be dealt with in a more systematic manner.
>> So what I would propose instead (though I'll never get time for it :-) ) 
>> is to have a reusable "buffer tester" library, i.e. use Cython to create 
>> an application which checks that a given buffer can be acquired and 
>> read, and then feed your custom buffer to it. We could thus create a 
>> "standard PEP 3118 test" or similar.
>>   
>>     
> Yes, as mentioned above, I agree this is a good idea, but I think 
> orthogonal to the issue here.
>   
I mentioned it because a proper unit test for __getbuffer__ in the first 
place makes the test your patch does redundant, while it traps 
potentially a lot more and less special cases. And as mentioned above, 
the problem with redundant code is not performance but program 
complexity and clarity.

> Also, don't forget about the tiny issue of the ErrorBuffer 
> implementation in tests/run/bufaccess.pyx addressed in my patch.
>   
Thanks (I did forget!), I'll have a look at it.

About using PEP 3118, make sure to read the docs for this at 
http://wiki.cython.org/enhancements/buffer , there's some limits to how 
far the emulation can get you (i.e. it only works within Cython code and 
you need to do all the proper cimports of the potential buffer classes 
even if you are not using anything in the pxd directly. Oh, and 
inheritance isn't implemented and is likely to break.).

Anyway, enough procrastination for me now :-) I believe we understand 
each other's position, so it is up to other people here to give positive 
votes to counter my -1.

Dag Sverre
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev

Reply via email to