Re: [Rd] readLines interaction with gsub different in R-dev

2018-02-19 Thread Tomas Kalibera
Thank you for the report and analysis. Now fixed in R-devel.
Tomas
On 02/17/2018 08:24 PM, William Dunlap via R-devel wrote:
I think the problem in R-devel happens when there are non-ASCII characters in any of the strings passed to gsub.
txt <- vapply(list(as.raw(c(0x41, 0x6d, 0xc3, 0xa9,

Re: [Rd] readLines interaction with gsub different in R-dev

2018-02-17 Thread William Dunlap via R-devel
I think the problem in R-devel happens when there are non-ASCII characters in any of the strings passed to gsub.
txt <- vapply(list(as.raw(c(0x41, 0x6d, 0xc3, 0xa9, 0x6c, 0x69, 0x65)), as.raw(c(0x41, 0x6d, 0x65, 0x6c, 0x69, 0x61))), rawToChar, "")
txt
#[1] "Amélie" "Amelia"
Encoding(txt)
#[1]
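The construction in the message above can be explored on its own. The following is a minimal sketch, not part of the original post: it rebuilds the two strings from raw bytes and then inspects how R marks and counts them. The variable names (bytes, txt_utf8, enc_before and so on) are made up for illustration, and the step that declares the strings as UTF-8 is an assumption about what a reader might want to try, not something the thread does.

```r
# Rebuild the two strings from raw bytes, as in the message above.
bytes <- list(
  as.raw(c(0x41, 0x6d, 0xc3, 0xa9, 0x6c, 0x69, 0x65)),  # "Amélie" as UTF-8 bytes
  as.raw(c(0x41, 0x6d, 0x65, 0x6c, 0x69, 0x61))         # "Amelia", ASCII only
)
txt <- vapply(bytes, rawToChar, "")

# rawToChar() does not declare an encoding for its result.
enc_before <- Encoding(txt)

# Assumption for illustration: declare the bytes to be UTF-8.
# ASCII-only elements are never marked, so only the first element changes.
txt_utf8 <- txt
Encoding(txt_utf8) <- "UTF-8"
enc_after <- Encoding(txt_utf8)

# Once marked, the accented string counts as 7 bytes but 6 characters.
n_bytes <- nchar(txt_utf8, type = "bytes")
n_chars <- nchar(txt_utf8, type = "chars")

# Which inputs contain only ASCII bytes (values below 0x80)?
is_ascii <- vapply(bytes, function(b) all(as.integer(b) < 128L), logical(1))
```

Comparing enc_before with enc_after shows which elements R treats as non-ASCII, which is the distinction the message points to.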

Re: [Rd] readLines interaction with gsub different in R-dev

2018-02-17 Thread Hugh Parsonage
| Confirmed for R-devel (current) on Ubuntu 17.10. But ... isn't the regexp
| you use wrong, ie isn't R-devel giving the correct answer?
No, I don't think R-devel is correct (or at least consistent with the documentation). My interpretation of gsub("(\\w)", "\\U\\1", entry, perl = TRUE) is "Take

Re: [Rd] readLines interaction with gsub different in R-dev

2018-02-17 Thread Dirk Eddelbuettel
On 17 February 2018 at 21:10, Hugh Parsonage wrote:
| I was told to re-raise this issue with R-dev:
|
| In the documentation of R-dev and R-3.4.3, under ?gsub
|
| > replacement
| >... For perl = TRUE only, it can also contain "\U" or "\L" to convert the rest of the replacement to upper or

[Rd] readLines interaction with gsub different in R-dev

2018-02-17 Thread Hugh Parsonage
I was told to re-raise this issue with R-dev:
In the documentation of R-dev and R-3.4.3, under ?gsub
> replacement
>... For perl = TRUE only, it can also contain "\U" or "\L" to convert the
> rest of the replacement to upper or lower case and "\E" to end case
> conversion.
However, the
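As a small illustration of the documented behaviour quoted above, here is a sketch using ASCII-only input, so it does not depend on the locale or on the non-ASCII issue discussed in this thread. The input strings and variable names are made up for the example; only gsub() with perl = TRUE and the "\U", "\L" and "\E" escapes come from the documentation.

```r
# "\\U" upper-cases the rest of the replacement: every word character
# is matched on its own and replaced by its upper-case form.
upper_all <- gsub("(\\w)", "\\U\\1", "hello world", perl = TRUE)
# "HELLO WORLD"

# "\\U" followed by "\\L" switches case part-way through the replacement:
# first letter of each word upper, remainder lower.
title_case <- gsub("(\\w)(\\w*)", "\\U\\1\\L\\2", "hello WORLD", perl = TRUE)
# "Hello World"

# "\\E" ends case conversion, so the second group is copied unchanged.
first_word_upper <- gsub("(\\w+) (\\w+)", "\\U\\1\\E \\2", "hello world", perl = TRUE)
# "HELLO world"
```

Without perl = TRUE these escapes are not interpreted as case conversions, so the argument has to be set explicitly.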