Hi again, here is a temporary workaround that I am using, in case anyone
encounters the same problem and wants a quick and dirty solution:
def encode_utf8(D):
    """Encode all unicode strs in a dict to normal strs.

    The idea is that you are going to call an R function with RPy, and
    you have a dict D that will be the kwargs you use in that function
    call. You pass D to this function and it will return you an
    identical dict but with all the unicode strings changed to normal
    python strings, encoded so that they work in R and represent the
    correct character when you use print() or plot(). Note that D
    itself is modified in place and then returned. Finally call
    your R function using something like

        from rpy import r
        r.your_fun(**encode_utf8(kwargs))

    Only works when values of D are among a few special data types:
    - unicode string
    - list of unicode strings
    - list of dicts
    Values of any other type are left unchanged.

    RPy should do this for us, but it doesn't so I wrote this hack.

    Works with:
    rpy r736
    R version 2.9.0 (2009-04-17)
    Python 2.5.2 (r252:60911, Jul 31 2008, 17:28:52)
    """
    # items() returns a list copy in Python 2, so assigning to D is safe here
    for k, v in D.items():
        if type(v) == unicode:  # string scalar -> character scalar
            D[k] = v.encode("UTF-8")
        elif type(v) == list and v:  # skip empty lists, where v[0] would fail
            if type(v[0]) == dict:  # list of dicts -> call recursively
                D[k] = [encode_utf8(x) for x in v]
            elif type(v[0]) == unicode:  # string lists -> character vectors
                D[k] = [x.encode("UTF-8") for x in v]
    return D
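
In case it is useful, here is a rough sketch of a variant that builds a
new structure instead of changing D in place, and that also recurses
into nested dicts and tolerates empty lists. The names encode_utf8_copy
and text_type are my own inventions, not part of RPy, and I have only
the Python 2 / RPy setup listed above in mind. The try/except is there
just so the sketch also runs on Python 3, where encode() gives bytes.

```python
# -*- coding: utf-8 -*-
try:
    text_type = unicode  # Python 2, as in the setup above
except NameError:
    text_type = str      # Python 3: str is the unicode type


def encode_utf8_copy(value):
    """Return a copy of value with every unicode string UTF-8 encoded.

    Recurses into dicts and lists; any other value is returned as is.
    Dict keys are not touched.
    """
    if isinstance(value, text_type):
        return value.encode("UTF-8")
    if isinstance(value, dict):
        return dict((k, encode_utf8_copy(v)) for k, v in value.items())
    if isinstance(value, list):
        return [encode_utf8_copy(x) for x in value]
    return value


# Hypothetical kwargs, only for illustration (u"\xe7" is the c-cedilla).
kwargs = {"main": u"Fran\xe7ois", "labels": [u"\xe7", u"c"], "sub": [], "cex": 2}
encoded = encode_utf8_copy(kwargs)
# u"\xe7" is encoded to the two bytes \xc3\xa7, the same byte string
# that the plain 'ç' literal gives in a UTF-8 terminal.
```

With RPy the call would then look like r.your_fun(**encode_utf8_copy(kwargs)),
the same pattern as in the docstring above, while kwargs itself stays
untouched.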
Quoting Toby HOCKING <[email protected]>:
> Hi fellow RPy-ers, long time no chat, hope everything is going well
> for you all,
>
> Maybe you guys remember me --- I emailed the list a year or so ago and
> I had a problem with unicode string conversion from Python to R using
> RPy1. Here is a webpage I just found with the relevant email
>
> http://article.gmane.org/gmane.comp.python.rpy/422
>
> We resolved the problem by using the "hack" --- that is, converting
> Python unicode strings to normal python strings, then giving those to
> R. This works fine until we start using non-ascii characters in
> unicode strings. I actually did a bunch of tests (illustrated below)
> and I found that passing unicode python strings to R works fine, as
> long as you don't use any non-ascii characters. But right now I am
> working in France and I would like to be able to pass non-ascii
> unicode strings with accent marks from python to R. The example I use
> below is that "c", u"c", and even "ç" work, but u"ç" does not (and I
> believe it should). I was wondering if there is any possibility to add
> this support in the near future?
>
> Currently I am using Ubuntu 8.04 and these package versions
>
> rpy r736
> R version 2.9.0 (2009-04-17)
> Python 2.5.2 (r252:60911, Jul 31 2008, 17:28:52)
>
> Here is my test case session
>
> thock...@stagiaire-desktop:~/rpy$ R
> > print("François")
> [1] "François"
> > Encoding("François")
> [1] "UTF-8"
> >
> Save workspace image? [y/n/c]: n
>
> thock...@stagiaire-desktop:~/rpy$ python
> >>> from rpy import r
> >>> r.print_('c')
> [1] "c"
> 'c'
> >>> r.print_(u'c')
> [1] "c"
> 'c'
> >>> r.print_('ç')
> [1] "ç"
> '\xc3\xa7'
> >>> r.print_(u'ç')
>
> *** caught segfault ***
> address 0x4, cause 'memory not mapped'
>
> Possible actions:
> 1: abort (with core dump, if enabled)
> 2: normal R exit
> 3: exit R without saving workspace
> 4: exit R saving workspace
> Selection: 2
> Exception thread.error: 'release unlocked lock' in <class
> 'thread.error'> ignored
> thock...@stagiaire-desktop:~/rpy$
>
> Thanks for any help you can offer,
> Sincerely,
> Toby Dylan Hocking
_______________________________________________
rpy-list mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rpy-list