Perhaps this is a good advert for Linux :)  I'm afraid I don't use Windows
but I know we've had some hassles with students' machines and encodings
before.

Stack Overflow isn't very encouraging (see this, for example
<https://stackoverflow.com/questions/46728047/r-rstudio-console-encoding-windows>)
but perhaps you could try setting the locale to a Windows-native encoding that
has the relevant characters?
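
As a rough way to check whether a given encoding can actually hold the
Maltese characters, something like the following might help. It is only a
sketch: is_representable() is a helper name I've made up, it relies on base
R's iconv() and enc2utf8(), and the encoding names ("CP1252", "ISO-8859-3")
are assumptions, since which names iconv() accepts depends on the platform.

```r
# Hypothetical helper (not part of any package): TRUE if the single string `x`
# survives a round trip UTF-8 -> `encoding` -> UTF-8 unchanged.
is_representable <- function(x, encoding) {
  x <- enc2utf8(x)
  tryCatch({
    encoded <- iconv(x, from = "UTF-8", to = encoding)
    decoded <- iconv(encoded, from = encoding, to = "UTF-8")
    # A failed or lossy conversion gives NA or a different string.
    isTRUE(decoded == x)
  }, error = function(e) FALSE)  # unsupported encoding names raise an error
}

# Escapes keep the script itself ASCII-only: \u0127 is h-bar, \u0121 is g-dot.
sentence <- "Mario i\u0127obb jimma\u0121ina?"

is_representable(sentence, "CP1252")      # the code page from the reported locale
is_representable(sentence, "ISO-8859-3")  # Latin-3, which includes both letters
```

Comparing after a round trip, rather than just testing for NA, is deliberate:
some iconv implementations substitute a similar-looking character instead of
failing. Whether Windows offers a locale that uses a suitable code page is a
separate question that I can't answer from here.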

--M

On Mon, 2 Jul 2018 at 12:29 Holger Mitterer <holger.mitte...@um.edu.mt>
wrote:

> Dear Martin and Jan,
>
>
>
> thanks for the quick replies.
>
> What systems are you working on?
>
>
>
> I work on Windows, and I am not able to get it to work.
>
> Sys.getlocale() gives “English_United States.1252” for all.
>
>
>
> The problem is that setlocale does not accept anything containing UTF-8,
>
> so I can do Sys.setlocale("LC_ALL", "German")
>
> but this only changes it to German_Germany.1252, or I can change it back to my
> original with:
>
> Sys.setlocale("LC_CTYPE", "English_US.1252")
>
>
>
>
>
> Anything containing UTF is rejected; both
>
> Sys.setlocale("LC_ALL", "en_GB.UTF-8")
>
> and the more restrictive
>
> Sys.setlocale("LC_CTYPE", "en_GB.UTF-8")
>
>
>
> return the message:
>
> OS reports request to set locale to "en_GB.UTF-8" cannot be honored
>
>
>
> I have not found a way to change the character type locale to anything that
> contains UTF-8.
>
> Any ideas?
>
>
>
> Holger
>
>
>
>
>
>
>
>
>
> *From:* Martin Corley <martin.cor...@ed.ac.uk>
> *Sent:* Monday, July 2, 2018 12:25 PM
> *To:* Holger Mitterer <holger.mitte...@um.edu.mt>
> *Cc:* ling-r-lang-l@mailman.ucsd.edu
> *Subject:* Re: [R-lang] getting non-Ascii characters in and out of R
> unchanged
>
>
>
> What does Sys.getlocale() return?
>
>
>
> I can read your Malti examples fine in a UTF-8 environment...
>
>
>
> > library(readxl)
> > df <- read_excel('~/tmp/test_holg.xlsx')
> > df
> # A tibble: 2 x 1
>   Sentence
>   <chr>
> 1 Mario iħobb jimmaġina?
> 2 Anita tisfen il-ballet?
> > Sys.getlocale()
> [1] "LC_CTYPE=en_GB.UTF-8;LC_NUMERIC=C;LC_TIME=en_GB.UTF-8;LC_COLLATE=en_GB.UTF-8;LC_MONETARY=en_GB.UTF-8;LC_MESSAGES=en_GB.UTF-8;LC_PAPER=en_GB.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_GB.UTF-8;LC_IDENTIFICATION=C"
>
> Best
>
>
>
> --M
>
>
>
> On Mon, 2 Jul 2018 at 11:15 Holger Mitterer <holger.mitte...@um.edu.mt>
> wrote:
>
> Hello List,
>
>
>
> I have a very simple programming task, which is adding ERP codes to input
> files for an EEG experiment.
>
> That is not the problem.
>
>
>
> However, the data contains non-ASCII characters (see below) and I cannot
> get them in and out of R without changes. The files are originally
> .xlsx, but both readxlsx and readxl packages
> seem to ‘normalize’ the input (so that the “ħ” becomes an “h”), and the
> original character is lost.
>
>
>
> If I save the file as csv in Excel, and use read.csv, one of two things
> happens:
>
>
>
> I use fileEncoding = "UTF-8" and again, the special characters are
> converted to their nearest ASCII neighbor.
>
> I do not use fileEncoding and the non-ASCII characters get garbled.
>
>
>
> Any ideas how to get non-American characters in and out of R without such
> changes?
>
>
>
> Best,
>
> Holger
>
>
>
> PS: An example of the input file:
>
>
>
> Run order  Filename                   Condition  OnsetADJ  OnsetChange  Question                 Answer
> 1          MALTESEINCOR0032Idealista  MALINCONG  2.263     2.728        Mario iħobb jimmaġina?   Iva
> 2          MALTESEINCOR0032Grazzjuż   MALINCONG  1.615     2.048        Anita tisfen il-ballet?  Iva
> 3          SEMANOM20036gallettinaCor  SEMCONG    1.901
>
>
>
>
>
> -                      - - - - - - - - - - - - - - - - - - - - - - - - -
> - - - - - - - - -
>
> -                      Prof. Holger Mitterer PhD (Maastricht)
>
> -                      Department of Cognitive Science
>
> -                      Faculty of Media and Knowledge Sciences
>
> -                      University of Malta
>
> -                      +356 2340 3127
>
>
>
> --
>
> Martin Corley
> University of Edinburgh
>
-- 

Martin Corley
University of Edinburgh
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
