Dag Sverre Seljebotn wrote:
> Stefan Behnel wrote:
>> Hi Lisandro,
>>
>> that idea isn't that wild at all. It also occurred to me while I was
>> thinking about the string encoding stuff. However, there are certain
>> drawbacks to this that do not make it easily suitable.
>>
>> Lisandro Dalcin, 03.12.2009 16:08:
>>
>>> cdef current_encoding = "latin-1"
>>>
>>> cdef unicode charp2unicode(char* p):
>>>     return p.decode(current_encoding)
>>>
>>> and (somehow) add it to a mapping (new syntax likely required):
>>>
>>> C_2_Python['char*'] = charp2unicode
>>>
>>
>> Seeing this in action makes me think that a decorator would fit
>> nicely here:
>>
>> @cython.typemapper(default=True)
>> cdef inline unicode charp2unicode(char* p):
>>     return p.decode(current_encoding)
>>
>> @cython.typemapper
>> cdef inline bytes charp2bytes(char* p):
>>     return <bytes>p
>>
>> The compiler could then collect all such type mappers, make sure that only
>> one of them is declared as the default mapper for an input type, and then
>> just call them to do the type conversion between the types in a given
>> context.
>>
>> Disadvantages:
>>
>> 1) The above works well for a global setup, but I'd expect there's a lot of
>> code that requires different mappings depending on the context, at least
>> for some types. (strings in lxml are certainly an example)
>>
>> 2) The above will not work for the unicode->char* case, for example, as
>> there is no way to store a Python reference outside of the converter
>> function scope. So this is limited to simple coercions that do not create
>> new Python references.
>>
> There's also the possibility of "overloading conversion operators",
> similar to what you can do in C++. This binds it to type instead:
>
> cdef class utf8_charp:
>     cdef char* ptr
>
>     def __init__(self, char* ptr):
>         self.ptr = ptr
>
>     cdef char* __convert__(self):
>         return self.ptr
>     cdef object __convert__(self):
>         return self.ptr.decode('utf-8')
>
>     @classmethod
>     cdef latin1_charp __coercefrom__(self, char* other):
>         return latin1_charp(other)

This should be utf8_charp everywhere, sorry about that. I'm less than
happy with the signature of __coercefrom__, but here is a very slight
improvement:

    @staticmethod
    cdef utf8_charp __convertfrom__(char* other): ...

Just brainstorming though. To lose the object construction overhead, one
could also allow this for structs, which would then tend to simply
decorate a struct containing only a char* with custom conversion rules.
But now I'm really reinventing C++... not that that must be a bad thing...

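To make the shape of the idea concrete, here is a rough plain-Python analogue
of the wrapper type, not Cython and not a proposal for actual syntax. It
assumes a bytes object stands in for the char*, and the names Utf8CharP,
convert_from, to_charp and to_object are all made up for illustration; the
last three mimic __convertfrom__ and the two __convert__ overloads.

```python
class Utf8CharP:
    """Hypothetical wrapper: a bytes object stands in for a C char*."""

    # No per-instance __dict__, loosely mirroring "a struct containing only a char*".
    __slots__ = ("ptr",)

    def __init__(self, ptr: bytes):
        self.ptr = ptr

    @staticmethod
    def convert_from(other: bytes) -> "Utf8CharP":
        # Stands in for the static __convertfrom__(char* other).
        return Utf8CharP(other)

    def to_charp(self) -> bytes:
        # Stands in for the char* __convert__ overload: hand back the raw data.
        return self.ptr

    def to_object(self) -> str:
        # Stands in for the object __convert__ overload: decode as UTF-8.
        return self.ptr.decode("utf-8")


raw = "caf\u00e9".encode("utf-8")
wrapped = Utf8CharP.convert_from(raw)
```

The point is only that the encoding lives in the type: anything that receives
a Utf8CharP knows how to get either the raw bytes or a decoded string without
consulting a global current_encoding.
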
Dag Sverre