On Dec 15, 2014, at 1:37 PM, Spencer Graves <spencer.gra...@prodsyse.com> wrote:
> 
> 
>> On Dec 15, 2014, at 10:13 AM, Simon Urbanek <simon.urba...@r-project.org> 
>> wrote:
>> 
>>> 
>>> On Dec 15, 2014, at 12:21 PM, Kurt Hornik <kurt.hor...@wu.ac.at> wrote:
>>> 
>>>>>>>> Spencer Graves writes:
>>> 
>>>> Hello, All:  
>>>>      What would it take to make “iconv” portable?  
>>> 
>>> 
>>>>      I ask, because I want to convert accented characters to
>>>>      vanilla ASCII, thereby converting, e.g., ‘Raúl’ to “Raul”, and
>>>>      Milan Bouchet-Valat suggested on R-help that I use 'iconv(x,
>>>>      "", "ASCII//TRANSLIT")'.  This worked under Windows but failed
>>>>      on Linux and Mac.  It’s part of the “subNonStandardCharacters”
>>>>      function in the Ecfun package. The development version on
>>>>      R-Forge uses this and returns “Raul” under Windows and NA
>>>>      under Mac OS X (and presumably also Linux).
>>> 
>>> Hmm.
>>> 
>>> R> iconv("Raúl", "", "ASCII//TRANSLIT")
>>> [1] "Raul"
>>> 
>>> seems to work for me on Linux ...
>>> 
>> 
>> also on OS X:
>> 
>>> iconv("Raúl", "", "ASCII//TRANSLIT")
>> [1] "Ra'ul"
> 
> 
>         Thanks for the replies.  I should have checked my examples more 
> carefully.  Consider the following example and a slight modification from 
> help("iconv"):  
> 
> 
> > x <- c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher")
> > Encoding(x) <- "latin1"
> > x
> [1] "Ekstrøm"               "Jöreskog"              "bißchen Zürcher"      
> > iconv(x, "latin1", "ASCII//TRANSLIT")  # platform-dependent
> [1] "Ekstrom"            "J\"oreskog"         "bisschen Z\"urcher"
> > 
> > x <- c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher")
> > x
> [1] "Ekstr\xf8m"            "J\xf6reskog"           "bi\xdfchen Z\xfcrcher"
> > iconv(x, "", "ASCII//TRANSLIT")  # platform-dependent
> [1] NA NA NA 
> 
> 
>         This suggests a two-step fix to my problem:  (1) Check Encoding(x) 
> and set to “latin1” if it’s “unknown”.

Well, that depends heavily on your source. The strings above are hand-crafted 
latin1, so if you don't declare that, R assumes the native encoding, which can 
be anything and has nothing to do with the bytes you actually constructed.
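
A minimal sketch of that point, assuming you know the bytes are latin1 (as 
they are in the hand-built example quoted above). The variable name y is only 
for illustration:

```r
# Bytes hand-constructed as latin1, as in the quoted example.
x <- c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher")

# Option 1: declare what the bytes really are, so R does not have to guess.
Encoding(x) <- "latin1"

# Option 2: name the source encoding in the call instead of passing "",
# which would mean "whatever the native encoding of this session is".
y <- iconv(x, from = "latin1", to = "UTF-8")

Encoding(y)  # the converted strings are marked as UTF-8
```

Either way the conversion no longer depends on the locale the session happens 
to run in. If the data come from a file or a database, the right value for 
"from" is whatever that source actually uses, which need not be latin1.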


>  (2) Delete any new \" added by iconv.  
> 

The whole point of translit is to create combinations of ASCII characters that 
represent the Unicode characters, so " is just one of many characters that can be 
used.
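
If the goal is the same plain-ASCII result on every platform, one alternative 
to post-processing the translit output is an explicit replacement table. The 
following is only a sketch under stated assumptions: the input is latin1, the 
helper name to_ascii is hypothetical (it is not part of R or of any package 
mentioned in this thread), and the table covers only the characters that 
appear in the examples above.

```r
# Hypothetical helper: explicit, table-driven replacement instead of relying
# on the platform's //TRANSLIT rules. Covers only the characters used above.
to_ascii <- function(x, from = "latin1") {
  # Convert to UTF-8 first so the table can be written with \u escapes.
  x <- iconv(x, from = from, to = "UTF-8")
  # One-to-one replacements: o-slash, o-umlaut, u-umlaut, u-acute.
  x <- chartr("\u00f8\u00f6\u00fc\u00fa", "oouu", x)
  # One-to-many replacement: sharp s becomes "ss".
  gsub("\u00df", "ss", x, fixed = TRUE)
}

x <- c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher")
res <- to_ascii(x)
```

The trade-off is that every character you care about has to be listed by 
hand, whereas translit covers far more but leaves the choice of ASCII 
combination to the platform's iconv.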

Cheers,
S


> 
>         Thanks again, 
>         Spencer 
> 
>> 
>> 
>> 
>>> -k
>>> 
>>> 
>>>>     The “iconv” R code merely calls compiled code, which I’ve used very 
>>>> little in 30 years.   
>>> 
>>> 
>>>>      Thanks, 
>>>>      Spencer 
>>> 
>>> 
>>> 
>>>>> On Nov 30, 2014, at 2:32 AM, Spencer Graves 
>>>>> <spencer.gra...@structuremonitoring.com> wrote:
>>>>> 
>>>>> Wonderful.  Thanks very much.  Spencer
>>>>> 
>>>>> 
>>>>> On 11/30/2014 2:25 AM, Milan Bouchet-Valat wrote:
>>> 
>>> 
>>>> ______________________________________________
>>>> R-devel@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>> 
> 
