Stefan Behnel wrote:
> Hi Lisandro,
>
> that idea isn't that wild at all. It also occurred to me while I was
> thinking about the string encoding stuff. However, it has certain
> drawbacks that keep it from being an easy fit.
>
> Lisandro Dalcin, 03.12.2009 16:08:
>   
>> cdef current_encoding = "latin-1"
>>
>> cdef unicode charp2unicode(char* p):
>>     return p.decode(current_encoding)
>>
>> and (somehow) add it to a mapping (new syntax likely required):
>>
>> C_2_Python['char*'] = charp2unicode
>>     
>
> Seeing this in action makes me think that a decorator would fit nicely here:
>
>     @cython.typemapper(default=True)
>     cdef inline unicode charp2unicode(char* p):
>          return p.decode(current_encoding)
>
>     @cython.typemapper
>     cdef inline bytes charp2bytes(char* p):
>          return <bytes>p
>
> The compiler could then collect all such type mappers, make sure that only
> one of them is declared as the default mapper for an input type, and then
> just call them to do the type conversion between the types in a given context.
>
> Disadvantages:
>
> 1) The above works well for a global setup, but I'd expect there's a lot of
> code that requires different mappings depending on the context, at least
> for some types. (strings in lxml are certainly an example)
>
> 2) The above will not work for the unicode->char* case, for example, as
> there is no way to store a Python reference outside of the converter
> function scope. So this is limited to simple coercions that do not create
> new Python references.
>   
There's also the possibility of "overloading conversion operators", 
similar to what you can do in C++. This binds the conversion to a type
instead of to a global mapping:

cdef class utf8_charp:
    cdef char* ptr

    def __init__(self, char* ptr):
        self.ptr = ptr

    cdef char* __convert__(self):
        return self.ptr
    cdef object __convert__(self):
        return self.ptr.decode('utf-8')

    @classmethod
    cdef utf8_charp __coercefrom__(cls, char* other):
        return utf8_charp(other)

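For comparison, the same idea can be mimicked in plain Python today. This is
only an analogue under my own assumptions: Utf8CharP and coerce_from() are
hypothetical names, and __bytes__/__str__ merely stand in for the two
conversion directions of the proposed (non-existent) __convert__ syntax.

```python
# Plain-Python analogue of a type that carries its own encoding.
# Utf8CharP and coerce_from are hypothetical names for illustration.
class Utf8CharP:
    """Wraps raw bytes and remembers that they are UTF-8 encoded."""

    encoding = "utf-8"

    def __init__(self, data):
        self.data = bytes(data)

    def __bytes__(self):
        # stands in for the char* conversion: return the raw buffer
        return self.data

    def __str__(self):
        # stands in for the object conversion: decode with the bound encoding
        return self.data.decode(self.encoding)

    @classmethod
    def coerce_from(cls, raw):
        # stands in for __coercefrom__: wrap an untyped buffer
        return cls(raw)
```

The encoding travels with the type, so code that receives a Utf8CharP never
has to consult a global setting to decode it.
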
That might leave Sage with a global search-and-replace of char* to sort
out its encoding issues...

Dag Sverre