Stefan Behnel wrote:
> Hi,
>
> The latest cython-devel can infer the type of a for-loop variable when
> iterating over C arrays, C pointers and Python strings. It will infer
> Py_UNICODE for unicode strings, but plain 'object' for a bytes string,
> because iterating over bytes yields one-character strings in Py2 and
> integers in Py3, so there is no common C type. So the following will
> infer c to be a plain Python object:
>
>      cdef bytes s = b'abcdefg'
>
>      c = s[4]
>      for c in s:
>          pass
>
> However, this:
>
>      c = b'abcdefg'[4]
>      for c in b'abcdefg':
>          pass
>
> will infer 'char' for c, as the bytes literal starts off as a char* string. 
> The main problem here is that 'char' does not behave like a Python bytes 
> object at all. I doubt that iterating over bytes literals is a common use 
> case, but I'm not sure about the 'least surprising' thing to do here.
>
> Should we special case this to prevent breaking Python-2 semantics, or 
> should we expect that users will usually want 'char' as a result anyway?
>
> Both behaviours are easy to get with a simple cast, so this is really only 
> a matter of consistency and least surprise. The thing that really bites me 
> here is that the bytes type in Py3 *does* return integers on iteration. So 
> returning 'char' on indexing and iteration would be both more efficient and 
> more future proof. But it would also be impossible to keep consistent in 
> Python-2, as faking it would mean that an untyped bytes object would return 
> a substring, whereas a typed one would return an integer. And I don't 
> really want to inject a type check branch into each getitem call to 
> override that behaviour...
>
> So ISTM that the only way to make this consistent is to follow Python 2 for 
> now, including literals, and to accept the different (but also consistent) 
> behaviour when running in Python 3.
>
> Opinions?
>   
"In the face of ambiguity, refuse the temptation to guess"?

I.e., I'd just disallow it in the language (that is, require a cast)
because of this issue. I don't see iterating over string literals as
important enough that we can't require a cast.
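
For reference, a minimal plain-Python sketch of the ambiguity and of the explicit conversions that pick one meaning. It assumes Python 3 and is not Cython code; the variable names are illustrative only and do not come from the thread. Under Python 2, indexing and iteration would yield one-character strings instead of integers.

```python
# Plain Python 3 sketch; all names here are illustrative only.
s = b'abcdefg'

# In Python 3, indexing and iteration over bytes both yield integers.
item = s[4]
items = [c for c in s]

# A one-byte slice yields a bytes object, which is what plain
# indexing would have returned under Python 2.
sliced = s[4:5]

# Explicit conversions state which of the two meanings is wanted.
as_int = ord(sliced)      # bytes of length 1 -> int
as_bytes = bytes([item])  # int -> bytes of length 1
```

In Cython the corresponding choice would presumably be spelled as a cast on the literal or on the loop variable, which is the kind of explicitness the suggestion above asks for.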

Dag Sverre