On 10/14/2009 2:29 PM, Adrian Dragulescu wrote:
Thank you.

If I use
gsub(" \xad", "-", x)
[1] "NEW YORK-NEW ENGLAND"

I get what I want.

Right, that's simpler than what I suggested.

Duncan Murdoch
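
For readers whose session is not in a cp1252 locale, the same substitution can be sketched at the byte level. This is only an illustration: it rebuilds x from the bytes posted with charToRaw(x) elsewhere in this thread, and the variable names pattern and fixed_x are made up here. With fixed = TRUE and useBytes = TRUE, gsub compares raw bytes, so the result should not depend on the locale.

```r
# Rebuild x from the bytes posted in the thread; 0xad is the stray byte.
x <- rawToChar(as.raw(c(0x4e, 0x45, 0x57, 0x20, 0x59, 0x4f, 0x52, 0x4b,
                        0x20, 0xad,
                        0x4e, 0x45, 0x57, 0x20, 0x45, 0x4e, 0x47, 0x4c,
                        0x41, 0x4e, 0x44)))

# The two-byte sequence to replace: a space (0x20) followed by 0xad.
pattern <- rawToChar(as.raw(c(0x20, 0xad)))

# Literal, byte-wise replacement; no character-set interpretation is involved.
fixed_x <- gsub(pattern, "-", x, fixed = TRUE, useBytes = TRUE)
fixed_x
```

The result contains only ASCII bytes, so it prints the same way in any locale.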


Adrian

sessionInfo()
R version 2.9.2 (2009-08-24)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base


On Wed, 14 Oct 2009, Prof Brian Ripley wrote:

On Wed, 14 Oct 2009, Adrian Dragulescu wrote:

charToRaw(x)
[1] 4e 45 57 20 59 4f 52 4b 20 ad 4e 45 57 20 45 4e 47 4c 41 4e 44
charToRaw(y)
[1] 4e 45 57 20 59 4f 52 4b 20 2d 4e 45 57 20 45 4e 47 4c 41 4e 44


So they are different.
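
A small sketch of how the differing byte can be located mechanically rather than by eye. The two raw vectors are copied from the charToRaw() output above; the names x_raw, y_raw and diff_pos are illustrative only.

```r
# Bytes of x (read from the file) and y (typed in), as posted above.
x_raw <- as.raw(c(0x4e, 0x45, 0x57, 0x20, 0x59, 0x4f, 0x52, 0x4b, 0x20, 0xad,
                  0x4e, 0x45, 0x57, 0x20, 0x45, 0x4e, 0x47, 0x4c, 0x41, 0x4e, 0x44))
y_raw <- as.raw(c(0x4e, 0x45, 0x57, 0x20, 0x59, 0x4f, 0x52, 0x4b, 0x20, 0x2d,
                  0x4e, 0x45, 0x57, 0x20, 0x45, 0x4e, 0x47, 0x4c, 0x41, 0x4e, 0x44))

# Element-wise comparison of the two byte sequences (both have 21 bytes).
diff_pos <- which(x_raw != y_raw)
diff_pos          # 10: the only position where the strings disagree
x_raw[diff_pos]   # ad
y_raw[diff_pos]   # 2d, the ASCII hyphen
```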

We really do need the 'at a minimum' information we asked you for in the posting guide. But in cp1252 (a guess as to what you might be using) \xad is a 'soft hyphen', and that is not the same thing as a hyphen -- you will get the same issues with 'non-breaking space'.

BDR
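
To illustrate the point about cp1252, here is a rough sketch that converts such a string to UTF-8 and inspects its code points. It assumes the input really is cp1252 (a guess, as noted above) and that the platform's iconv recognises the name "CP1252". The helper normalise_cp1252 is hypothetical, not part of R, and mapping a soft hyphen to "-" is a choice that suits this particular data rather than a general rule.

```r
# A short cp1252 byte string: "K", space, soft hyphen (0xad), "N".
s <- rawToChar(as.raw(c(0x4b, 0x20, 0xad, 0x4e)))

# Convert to UTF-8 so the characters can be inspected as code points.
s_utf8 <- iconv(s, from = "CP1252", to = "UTF-8")
utf8ToInt(s_utf8)   # 75 32 173 78; 173 is U+00AD SOFT HYPHEN

# Hypothetical helper: map soft hyphen and non-breaking space to ASCII.
normalise_cp1252 <- function(s) {
  s <- iconv(s, from = "CP1252", to = "UTF-8")
  s <- gsub("\u00ad", "-", s, fixed = TRUE)   # soft hyphen -> hyphen
  gsub("\u00a0", " ", s, fixed = TRUE)        # non-breaking space -> space
}

normalise_cp1252(s)
```

After normalisation, an ordinary pattern such as " -" matches as it did for the typed-in y.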


Adrian

I use R 2.8.1 on WinXP


On Wed, 14 Oct 2009, Duncan Murdoch wrote:

On 10/14/2009 1:30 PM, Adrian Dragulescu wrote:
Hello,

Below is some output that shows my issue.

I have a variable x that I read from a file (more on this below)

x
[1] "NEW YORK NEW ENGLAND"
gsub(" -", "-", x)            # this does not work!
[1] "NEW YORK NEW ENGLAND"

Well, I see no hyphen at all here, but then I am not on Windows.

It looks as though it worked, presumably because something got lost in your email.

Could you post charToRaw(x) so we can see what's in x?

Duncan Murdoch

Encoding(x)                   # is x in a special encoding? no
[1] "unknown"
y = "NEW YORK -NEW ENGLAND"   # I type in variable y
gsub(" -", "-", y)            # and gsub works as expected
[1] "NEW YORK-NEW ENGLAND"


I'm sure the problem has to do with the way I read the variable x. But even if I change the encoding for x to ASCII, I still cannot do the sub. I get x by reading a pdf file with pdftotext so you will not be able to replicate my issue.

Thanks for any suggestions,
Adrian
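
A short sketch of why re-declaring the encoding is unlikely to help on its own: Encoding<- only changes the mark attached to a string and leaves the bytes alone, so a pattern containing an ASCII hyphen still has nothing to match. The string below is a shortened, made-up stand-in for x with the same stray 0xad byte.

```r
# Stand-in for x: "K", space, 0xad, "N".
x <- rawToChar(as.raw(c(0x4b, 0x20, 0xad, 0x4e)))

before <- charToRaw(x)
Encoding(x) <- "latin1"      # changes the declared encoding only
after <- charToRaw(x)
identical(before, after)     # TRUE: the bytes are untouched

# An actual conversion does change the bytes: 0xad becomes c2 ad in UTF-8.
charToRaw(enc2utf8(x))

# The ASCII-hyphen pattern still finds nothing to replace in the raw bytes.
unchanged <- gsub(" -", "-", x, fixed = TRUE, useBytes = TRUE)
identical(charToRaw(unchanged), before)   # TRUE
```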

--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.