Re: [Rpy] unicode support

Toby HOCKING Tue, 28 Apr 2009 00:51:33 -0700

Hi again, here is a temporary solution that I am using, in case anyone
encounters the same problem and wants a quick-and-dirty hack:


def encode_utf8(D):
     """Encode all unicode strs in a dict to normal strs.

     The idea is that you are going to call an R function with RPy, and
     you have a dict D that will be the kwargs you use in that function
     call. You pass D to this function and it will return you an
     identical dict but with all the unicode strings changed to normal
     python strings, encoded so that they work in R and represent the
     correct character when you use print() or plot(). Finally call
     your R function using something like

     from rpy import r
     r.your_fun(**encode_utf8(kwargs))

     Only works when values of D are among a few special data types:
     - unicode string
     - list of unicode strings
     - list of dicts

     RPy should do this for us, but it doesn't, so I wrote this hack.

     Works with:
     rpy r736
     R version 2.9.0 (2009-04-17)
     Python 2.5.2 (r252:60911, Jul 31 2008, 17:28:52)


     """
     # Note: D is modified in place and also returned.
     for k,v in D.iteritems():
         if isinstance(v, unicode): # string scalar -> character scalar
             D[k]=v.encode("UTF-8")
         elif isinstance(v, list) and v: # skip empty lists: v[0] would raise IndexError
             if isinstance(v[0], dict): # list of dicts -> call recursively
                 D[k]=[encode_utf8(x) for x in v]
             elif isinstance(v[0], unicode): # string lists -> character vectors
                 D[k]=[x.encode("UTF-8") for x in v]
     return D
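
For values nested more deeply than the cases listed in the docstring, a more general variant could look like the sketch below. It is not part of RPy and not what I actually run: encode_nested is a hypothetical name, and the sample kwargs are made up for illustration. It assumes Python 2.6+ or 3.3+ (for the b"" and u"" literals); on Python 3 the result is bytes, so it only illustrates the conversion there. Unlike encode_utf8 it returns a new structure instead of modifying its argument, and it leaves dict keys untouched.

```python
TEXT_TYPE = type(u"")  # `unicode` on Python 2, `str` on Python 3


def encode_nested(value, encoding="UTF-8"):
    """Return a copy of `value` with every unicode string encoded to bytes.

    Hypothetical helper, not part of RPy. Dict keys are left as they are.
    """
    if isinstance(value, TEXT_TYPE):
        return value.encode(encoding)
    if isinstance(value, dict):
        return dict((k, encode_nested(v, encoding)) for k, v in value.items())
    if isinstance(value, (list, tuple)):
        # Tuples come back as lists; empty sequences are handled naturally.
        return [encode_nested(v, encoding) for v in value]
    return value  # numbers, None, byte strings, ... pass through unchanged


# Made-up keyword arguments, only to exercise each branch.
kwargs = {
    "main": u"Fran\u00e7ois",
    "labels": [u"\u00e9t\u00e9", u"hiver"],
    "panels": [{"xlab": u"\u00e7"}],
    "cex": 1.5,
    "empty": [],
}
encoded = encode_nested(kwargs)
```

With RPy loaded, the call would then follow the same pattern as in the docstring above, i.e. r.your_fun(**encode_nested(kwargs)).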

Quoting Toby HOCKING <[email protected]>:

> Hi fellow RPy-ers, long time no chat, hope everything is going well
> for you all,
>
> Maybe you guys remember me --- I emailed the list a year or so ago and
> I had a problem with unicode string conversion from Python to R using
> RPy1. Here is a webpage I just found with the relevant email
>
> http://article.gmane.org/gmane.comp.python.rpy/422
>
> We resolved the problem by using the "hack" --- that is, converting
> Python unicode strings to normal python strings, then giving those to
> R. This works fine until we start using non-ascii characters in
> unicode strings. I actually did a bunch of tests (illustrated below)
> and I found that passing unicode python strings to R works fine, as
> long as you don't use any non-ascii characters. But right now I am
> working in France and I would like to be able to pass non-ascii
> unicode strings with accent marks from python to R. The example I use
> below is that "c", u"c", and even "ç" all work, but u"ç" does not (and I
> believe it should). I was wondering if there is any possibility to add
> this support in the near future?
>
> Currently I am using Ubuntu 8.04 and these package versions
>
> rpy r736
> R version 2.9.0 (2009-04-17)
> Python 2.5.2 (r252:60911, Jul 31 2008, 17:28:52)
>
> Here is my test case session
>
> thock...@stagiaire-desktop:~/rpy$ R
> > print("François")
> [1] "François"
> > Encoding("François")
> [1] "UTF-8"
> >
> Save workspace image? [y/n/c]: n
>
> thock...@stagiaire-desktop:~/rpy$ python
> >>> from rpy import r
> >>> r.print_('c')
> [1] "c"
> 'c'
> >>> r.print_(u'c')
> [1] "c"
> 'c'
> >>> r.print_('ç')
> [1] "ç"
> '\xc3\xa7'
> >>> r.print_(u'ç')
>
>   *** caught segfault ***
> address 0x4, cause 'memory not mapped'
>
> Possible actions:
> 1: abort (with core dump, if enabled)
> 2: normal R exit
> 3: exit R without saving workspace
> 4: exit R saving workspace
> Selection: 2
> Exception thread.error: 'release unlocked lock' in <class
> 'thread.error'> ignored
> thock...@stagiaire-desktop:~/rpy$
>
> Thanks for any help you can offer,
> Sincerely,
> Toby Dylan Hocking
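
A small check related to the session quoted above: the plain string 'ç' came back from R as '\xc3\xa7', which is the UTF-8 encoding of that character, so encoding a unicode string to UTF-8 before the call should hand R the same bytes. This is only a sketch under the assumption of a UTF-8 terminal, as in the session, and of Python 2.6+ or 3.3+ for the b"" and u"" literals.

```python
text = u"\u00e7"                # the character ç (U+00E7) as a unicode string
encoded = text.encode("UTF-8")  # byte string: str on Python 2, bytes on Python 3

# Compare against the two bytes shown for the plain 'ç' string in the session.
same_bytes = (encoded == b"\xc3\xa7")
# Decoding the bytes again recovers the original character.
round_trip = (encoded.decode("UTF-8") == text)
```

Both comparisons hold, which is consistent with R reporting the encoding of its own "François" string as UTF-8.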


