Dag Sverre Seljebotn wrote:
> Stefan Behnel wrote:
>> Hi Lisandro,
>>
>> that idea isn't that wild at all. It also occurred to me while I was
>> thinking about the string encoding stuff. However, there are certain
>> drawbacks to this that do not make it easily suitable.
>>
>> Lisandro Dalcin, 03.12.2009 16:08:
>>  
>>> cdef current_encoding = "latin-1"
>>>
>>> cdef unicode charp2unicode(char* p):
>>>     return p.decode(current_encoding)
>>>
>>> and (somehow) add a mapping (new syntax likely required):
>>>
>>> C_2_Python['char*'] = charp2unicode
>>>     
>>
>> Seeing this in action makes me think that a decorator would fit 
>> nicely here:
>>
>>     @cython.typemapper(default=True)
>>     cdef inline unicode charp2unicode(char* p):
>>          return p.decode(current_encoding)
>>
>>     @cython.typemapper
>>     cdef inline bytes charp2bytes(char* p):
>>          return <bytes>p
>>
>> The compiler could then collect all such type mappers, make sure that only
>> one of them is declared as the default mapper for an input type, and then
>> just call them to do the type conversion between the types in a given
>> context.
>>
>> Disadvantages:
>>
>> 1) The above works well for a global setup, but I'd expect there's a lot of
>> code that requires different mappings depending on the context, at least
>> for some types. (strings in lxml are certainly an example)
>>
>> 2) The above will not work for the unicode->char* case, for example, as
>> there is no way to store a Python reference outside of the converter
>> function scope. So this is limited to simple coercions that do not create
>> new Python references.
>>   
> There's also the possibility of "overloading conversion operators", 
> similar to what you can do in C++. This binds it to type instead:
>
> cdef class utf8_charp:
>    cdef char* ptr
>
>    def __init__(self, char* ptr):
>        self.ptr = ptr
>
>    cdef char* __convert__(self):
>        return self.ptr
>    cdef object __convert__(self):
>        return self.ptr.decode('utf-8')
>
>    @classmethod
>    cdef latin1_charp __coercefrom__(self, char* other):
>        return latin1_charp(other)
This should be utf8_charp everywhere, sorry about that. I'm less than 
happy with the signature of __coercefrom__, but here is a very slight
improvement:

@staticmethod
cdef utf8_charp __convertfrom__(char* other): ...

Just brainstorming though. To lose the object construction overhead one 
could also allow this for structs; in practice that would mostly mean 
decorating a struct containing only a char* with custom conversion rules. 
But now I'm really reinventing C++...not that that must be a bad thing...
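
To make the "conversion rules bound to a type" idea a bit more concrete, here is a rough plain-Python analogy. It is only a sketch: the names Utf8CharP, to_charp, to_object and convert_from are made up for illustration, a bytes object stands in for the char*, and none of this is existing Cython syntax.

```python
# Plain-Python analogy of a wrapper type that carries its own conversion
# rules. All names here are hypothetical; bytes stands in for a C char*.

class Utf8CharP:
    """Wraps raw byte data and fixes its encoding to UTF-8."""

    __slots__ = ("ptr",)

    def __init__(self, ptr: bytes):
        self.ptr = ptr

    def to_charp(self) -> bytes:
        # Analogue of `cdef char* __convert__(self)`: return the raw data.
        return self.ptr

    def to_object(self) -> str:
        # Analogue of `cdef object __convert__(self)`: decode with the
        # encoding that this wrapper type stands for.
        return self.ptr.decode("utf-8")

    @staticmethod
    def convert_from(other: bytes) -> "Utf8CharP":
        # Analogue of the proposed static __convertfrom__(char* other).
        return Utf8CharP(other)


wrapped = Utf8CharP.convert_from("søt".encode("utf-8"))
assert wrapped.to_object() == "søt"
assert wrapped.to_charp() == "søt".encode("utf-8")
```

The point of the sketch is that the encoding is chosen by picking the wrapper type, so different contexts can use different wrappers instead of one global default.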

Dag Sverre