Robert Bradshaw, 09.10.2010 13:21:
> On Fri, Oct 8, 2010 at 11:21 PM, Stefan Behnel wrote:
>> So, for char* values, "s1 == s2" should translate into
>>
>>    (s1 == s2) || (
>>       (s1 != NULL) && (s2 != NULL) && (strcmp(s1, s2) == 0))
>>
>> Note that this allows both pointers to be NULL in the success case. I think
>> it makes sense to include this.
>>
>> This change certainly introduces new corner cases, though, as "ptr == ptr"
>> means "ptr is ptr" for everything else. So this will surprise at least some
>> users and also break existing code. But I think it's at least mostly obvious
>> to the average Python developer what the right fix is.
>
> +0.5 to this. If not, it would at least be worth a warning.

Thinking about this some more, "==" isn't the only operator where this 
makes sense. Basically, all comparison operators (also !=, <, <=, >, >=) 
should potentially behave the same on bytes objects and char* objects. 
However, for the ordering operators, there is no simple replacement 
operator like "is" that explicitly compares the pointers if the need 
arises. A pointer cast to <void*> should do the trick, though.

I personally think that the main use case for comparison operators on char* 
values really *is* the comparison of the string values and not the 
pointers. So, would it be acceptable to give all comparison operators their 
Python bytes string semantics when applied to C string types? That would 
give us the following implementations:

== : (s1 == s2) || ((s1 != NULL) && (s2 != NULL) && (strcmp(s1, s2) == 0))

!= : (s1 != s2) && ((s1 == NULL) || (s2 == NULL) || (strcmp(s1, s2) != 0))

<,<=,>,>= : (s1 != NULL) && (s2 != NULL) && (strcmp(s1, s2)  [<=>]  0)
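
As a rough sketch (this is not actual Cython output), the three variants 
above could be spelled out as plain C helper functions. The function names 
are made up for illustration, and only "<" is shown for the ordering 
operators since the others differ only in the final comparison:

```c
#include <stddef.h>
#include <string.h>

/* "==": true for identical pointers (including two NULLs) or for two
 * non-NULL strings with equal contents. */
int cstr_eq(const char *s1, const char *s2)
{
    return (s1 == s2) ||
           ((s1 != NULL) && (s2 != NULL) && (strcmp(s1, s2) == 0));
}

/* "!=": the exact negation of cstr_eq(). */
int cstr_ne(const char *s1, const char *s2)
{
    return (s1 != s2) &&
           ((s1 == NULL) || (s2 == NULL) || (strcmp(s1, s2) != 0));
}

/* "<": false as soon as either operand is NULL. */
int cstr_lt(const char *s1, const char *s2)
{
    return (s1 != NULL) && (s2 != NULL) && (strcmp(s1, s2) < 0);
}
```

The short-circuit evaluation guarantees that strcmp() is never called with 
a NULL argument.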

Note how NULL is allowed for == and !=, but makes <, <=, > and >= 
evaluate to false. Actually, Python 2 semantics suggest that NULL (like 
None) is smaller than anything, and Python 3 semantics suggest raising a 
TypeError for NULL values. We could potentially emulate that. The Python 2 
semantics would give this:

< : (s2 != NULL) && ((s1 == NULL) || (strcmp(s1,s2) < 0))

<= : (s1 == NULL) || ((s2 != NULL) && (strcmp(s1,s2) <= 0))

 > : (s1 != NULL) && ((s2 == NULL) || (strcmp(s1,s2) > 0))

 >= : (s2 == NULL) || ((s1 != NULL) && (strcmp(s1,s2) >= 0))
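
Written out as C helpers, the four Python 2 style expressions would look 
roughly like this. Again, this is only a sketch and the py2_* names are 
hypothetical, not anything Cython generates:

```c
#include <stddef.h>
#include <string.h>

/* NULL sorts before every non-NULL string; two NULLs compare equal. */

int py2_lt(const char *s1, const char *s2)
{
    return (s2 != NULL) && ((s1 == NULL) || (strcmp(s1, s2) < 0));
}

int py2_le(const char *s1, const char *s2)
{
    return (s1 == NULL) || ((s2 != NULL) && (strcmp(s1, s2) <= 0));
}

int py2_gt(const char *s1, const char *s2)
{
    return (s1 != NULL) && ((s2 == NULL) || (strcmp(s1, s2) > 0));
}

int py2_ge(const char *s1, const char *s2)
{
    return (s2 == NULL) || ((s1 != NULL) && (strcmp(s1, s2) >= 0));
}
```

With these, py2_lt(NULL, "a") is true, py2_lt(NULL, NULL) is false and 
py2_le(NULL, NULL) is true, which matches None sorting first in Python 2.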

And the Python 3 semantics would simply raise a TypeError if one of the two 
values is NULL and otherwise use strcmp(). We could support both depending 
on the -3 switch, at the cost of disallowing the Python 3 semantics in 
nogil sections, since raising an exception requires the GIL.
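
To illustrate the Python 3 variant without pulling in the Python C-API, 
here is a sketch that uses a tri-state return value: 1 for true, 0 for 
false and -1 for "error". The function name and the return convention are 
assumptions made for this example. In real generated code, the error 
branch is where the TypeError would have to be set, and that is the part 
that cannot run without the GIL:

```c
#include <stddef.h>
#include <string.h>

/* Python 3 style "<": returns 1 or 0 for a valid comparison and -1 if
 * either operand is NULL. The caller is expected to turn -1 into a
 * TypeError (hypothetical convention for this sketch). */
int py3_lt(const char *s1, const char *s2)
{
    if (s1 == NULL || s2 == NULL) {
        /* Real code would set the exception here, which needs the GIL. */
        return -1;
    }
    return strcmp(s1, s2) < 0;
}
```

The other ordering operators would follow the same pattern with a 
different final comparison against 0.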

What do you think?

If we go that route, I think it might be worth waiting for a 0.14 release.

Stefan
