Re: [Cython] support for numpy bool type / unicode type

Cristi Constantin Mon, 15 Jun 2009 03:18:43 -0700

Good day.
I have been following this discussion too.
How about unicode_t?
I am trying to adapt the numpy tutorial you have on the Cython wiki, but
instead of np.int_t I need the numpy.unicode_ type.
Do you have any idea how I could do that?


Check this Python code:
import numpy as np
a=np.array([0,1,2],np.unicode_)
print a # array([u'0', u'1', u'2'], dtype='<U1')
a[0]=u'\u2588'
a[1]=u'\u00b6'
a[2]=u'\u2248'
a.astype('I')
# UnicodeEncodeError: 'decimal' codec can't encode character u'\u2588' in position 0: invalid decimal Unicode string
a.astype(np.uint32)
# Same unicode error.

What's the best "cast" I can do to convert my Unicode NumPy array?
Thank you in advance.
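
One possible approach to the question above, offered as a sketch rather than an answer given in this thread: astype() appears to fail because it tries to parse each string as a decimal number, whereas what is wanted is the code point of each character. Assuming a one-character unicode dtype ("U1"), where NumPy stores each element as a 4-byte code point, the same memory can be reinterpreted with view(). The snippet assumes Python 3 and a recent NumPy, unlike the Python 2 code above.

```python
import numpy as np

# Build the array directly from the characters; dtype "U1" means one
# Unicode character per element, stored as a 4-byte code point.
a = np.array(["\u2588", "\u00b6", "\u2248"], dtype="U1")

# Reinterpret the same memory as unsigned 32-bit integers. No copy and
# no string parsing takes place, so no codec error can occur.
codes = a.view(np.uint32)   # code points 0x2588, 0xb6, 0x2248
print(codes)

# The reverse view turns the code points back into characters.
back = codes.view("U1")
print(back)
```

Note that view() only lines up this neatly for one-character strings; with a wider dtype such as "U5" each element would expand into five integers.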

--- On Tue, 6/9/09, Robert Bradshaw <[email protected]> wrote:

From: Robert Bradshaw <[email protected]>
Subject: Re: [Cython] support for numpy bool type
To: [email protected]
Date: Tuesday, June 9, 2009, 1:38 AM

On Jun 6, 2009, at 1:22 PM, Dag Sverre Seljebotn wrote:

> Robert Bradshaw wrote:
>> On Jun 6, 2009, at 2:20 AM, Dag Sverre Seljebotn wrote:
>>
>>> Eric Firing wrote:
>>>> Eric Firing wrote:
>>>>> In writing a cython extension to work with numpy masked arrays, I
>>>>> needed to work with the mask, which is dtype('bool').  Therefore I
>>>>> was expecting to be able to use np.bool_t, in analogy to np.float_t.
>>>>> Instead I had to use np.int8_t, and cast the input to np.int8 in a
>>>>> python wrapper.  Can bool_t be added to numpy.pxd?
>>>>>
>>>>> Thanks.
>>>>>
>>>>> Eric
>>> The reason it is not there is that AFAIK Cython has no support for an
>>> 8-bit boolean type. I.e. you want
>>>
>>> print arr[3] # should print "True", not 1.
>>>
>>> and furthermore you want
>>>
>>> arr[3] = obj # should mean arr[3] = bool(obj)
>>>
>>> Which brings up the questions
>>> 1) I suppose the best thing would be to add support for a "bool" which
>>> would be a C99 _Bool if compiling on C99, and "unsigned char" with
>>> appropriate restrictions otherwise. Thoughts?
>>
>> How would this relate to the current bint type? bint is an int
>> because logical operations in C are ints, and also several operations
>> in the Python/C API (and elsewhere) return true/false values as ints.
>
> Well, bint is an int that has different semantics for conversion to/from
> Python,

That is exactly what it is.

> and that seems to be needed here as well (e.g. a "bchar", which I
> think is essentially C99 _Bool).

A C99 _Bool can only have two values; anything non-zero is converted
to exactly 1.

> What happens if you do "cdef bint value = 4"? Is bool(4) called, resulting
> in 1 being placed in value? I think that's the wanted behaviour here...

No, it retains the value 4. (This is one of the reasons I didn't call  
it a bool.)
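
The two behaviours being contrasted here (normalising to 0/1 on assignment versus retaining the integer value) can be illustrated from the Python side with NumPy alone. This is only an analogy and not Cython's bint: it assumes a NumPy bool array stands in for the proposed 8-bit boolean type and an int8 array for a plain integer. The names mask and raw are made up for the example.

```python
import numpy as np

# A bool array converts on assignment, i.e. arr[3] = obj acts like
# arr[3] = bool(obj), and reading the element back gives a boolean.
mask = np.zeros(4, dtype=np.bool_)
mask[3] = 4
print(mask[3])                  # a boolean, not the integer 4

# The byte actually stored is 1, which is what an 8-bit integer view
# of the same memory shows.
print(mask.view(np.uint8)[3])

# A plain 8-bit integer array keeps the value that was assigned.
raw = np.zeros(4, dtype=np.int8)
raw[3] = 4
print(raw[3])
```

The uint8 view is also a copy-free alternative to casting the mask to np.int8 in a Python wrapper, as long as the receiving code only tests for zero versus non-zero.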

> Using bint directly seems to be right out though as one must access the
> data using an 8-bit integer type. (Well, I could hard-code buffers to cast
> back and forth between "bint" and "char", after the pointer dereference,
> but that seems very unclean compared to introducing a new type).
>
> Perhaps we should have "char bint", "short bint", "unsigned long long
> bint" and so on :-) Really, the need for a separate type for bools is
> strictly a Cython feature, because what it affects is Python object
> conversion, so it doesn't hurt IMO that these are not in C. And size and
> booleanness are orthogonal features. (But, this is a puristic approach
> only, and in practice "bchar" would suffice.)

Actually, that might work well. When I first introduced it, I saw no
need to offer various sizes, but if one has arrays of them then it
becomes important. And just when we thought we were simplifying the
integer type system... :)

- Robert

_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
