Re: [R] Assigning factor to character vector
On Saturday, April 20, 2013 at 07:22 -0700, Bert Gunter wrote:
> Sorry, I failed to cc: the list.
>
> I also added a slight edit below to clarify my final statement. -- Bert
>
> On Sat, Apr 20, 2013 at 7:17 AM, Bert Gunter wrote:
> > Milan:
> >
> > 1. The R Inferno was written by Pat Burns and is not in any way an
> > "official" R document. So it is a "Pat Burns" not an "R" issue. You
> > can contact him directly, if you wish -- though he monitors this list
> > and almost surely has seen this.

Sure, I did not imply that the R Inferno was an official document.

> > 2. Everything works exactly as documented and expected. See the R
> > Language definition ... and perhaps the "Intro to R" tutorial..

But is there a mention of what precisely happens to assignments from
factors? I could not find it. For example, the R Language Definition
does not mention coercion in the Subset Assignment section [1].

> > Inline comments below.
> >
> > Cheers,
> > Bert
> >
> > On Sat, Apr 20, 2013 at 5:49 AM, Milan Bouchet-Valat
> > wrote:
> >> Hi!
> >>
> >> Yesterday I accidentally discovered this:
> >>> a <- LETTERS[1:5]
> >>> a
> >> [1] "A" "B" "C" "D" "E"
> >
> > a is a character vector.
> >>>
> >>> a[1] <- factor(a[1])
> > The RHS is a vector of integers with additional attributes that define a
> > factor.
> > The replacement of the first element of a, a character vector, by an
> > integer causes the integer to be silently coerced to a character. The
> > default S3 replacement method is used -- see ?UseMethod or the R
> > Intro for info on S3 methods.

Thanks, but I already understand this part. My surprise comes from the
fact that the default replacement method coerces a factor to a character
in a way which is different from calling as.character() on it. It acts
as if attributes were dropped _before_ coercion (and thus everything
happens as if the factor were a mere integer).

> >>> a
> >> [1] "1" "B" "C" "D" "E"
> >>
> >> BUT:
> >>> b <- factor(LETTERS[1:5])
> > b is a factor
> >
> >>> b
> >> [1] A B C D E
> >> Levels: A B C D E
> >>> b[1] <- factor(b[1])
> >>> b
> >> [1] A B C D E
> >> Levels: A B C D E
> >>> b[1] <- as.character(b[1])
> > The replacement method for a factor is used in
> b[1] <- factor(b[1])
> See ?"[<-.factor" .

Yeah, this part was here to show the asymmetry between factor ->
character and character -> factor assignments.


Regards

1: http://cran.r-project.org/doc/manuals/R-lang.html#Subset-assignment

> > Cheers,
> > Bert
> >
> > The replacement
> >>> b
> >> [1] A B C D E
> >> Levels: A B C D E
> >>
> >> I think this would definitely deserve a mention in the R Inferno...
> >>
> >> I guess this is documented somewhere (though I could not find anything
> >> in help("[<-")). Would someone be kind enough to give me the explanation
> >> of this behavior? I suspect this has something to do with the coercion
> >> order, but I do not really get why a[1] does not get assigned the result
> >> of as.character(factor(a[1]))... Probably, there is no special-casing of
> >> factors, which are handled as integer vectors?
> >>
> >> Wouldn't it be useful to print a warning when this happens, since nobody
> >> reasonable would rely on such a special behavior? I wish R had a "safe
> >> mode" where all these tricky implicit coercion cases would warn... :-/
> >>
> >>
> >> Regards

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
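The difference discussed in this message can be sketched in a few lines
of R. This is only an illustration, assuming the behavior reported in
this thread; the names f and a_explicit are introduced here and do not
appear in the original posts. as.character() on a factor uses its
levels, whereas subset assignment into a character vector behaves as if
the class and levels had been dropped first, leaving the integer codes.

```r
a <- LETTERS[1:5]
f <- factor(a[1])            # single level "A", stored as integer code 1

as.character(f)              # "A": coercion goes through the levels
as.character(unclass(f))     # "1": coercion of the underlying integer code

a[1] <- f                    # behaves like the unclass() form
a[1]                         # "1", as reported in the thread

# Converting explicitly before assigning keeps the label.
a_explicit <- LETTERS[1:5]
a_explicit[1] <- as.character(f)
a_explicit[1]                # "A"
```

Calling as.character() on the right-hand side before assigning is one
way to avoid depending on the implicit coercion.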
Re: [R] Assigning factor to character vector
Sorry, I failed to cc: the list.

I also added a slight edit below to clarify my final statement.

-- Bert

On Sat, Apr 20, 2013 at 7:17 AM, Bert Gunter wrote:
> Milan:
>
> 1. The R Inferno was written by Pat Burns and is not in any way an
> "official" R document. So it is a "Pat Burns" not an "R" issue. You
> can contact him directly, if you wish -- though he monitors this list
> and almost surely has seen this.
>
> 2. Everything works exactly as documented and expected. See the R
> Language definition ... and perhaps the "Intro to R" tutorial..
>
> Inline comments below.
>
> Cheers,
> Bert
>
> On Sat, Apr 20, 2013 at 5:49 AM, Milan Bouchet-Valat
> wrote:
>> Hi!
>>
>> Yesterday I accidentally discovered this:
>>> a <- LETTERS[1:5]
>>> a
>> [1] "A" "B" "C" "D" "E"
>
> a is a character vector.
>>>
>>> a[1] <- factor(a[1])
> The RHS is a vector of integers with additional attributes that define a
> factor.
> The replacement of the first element of a, a character vector, by an
> integer causes the integer to be silently coerced to a character. The
> default S3 replacement method is used -- see ?UseMethod or the R
> Intro for info on S3 methods.
>
>>> a
>> [1] "1" "B" "C" "D" "E"
>>
>> BUT:
>>> b <- factor(LETTERS[1:5])
> b is a factor
>
>>> b
>> [1] A B C D E
>> Levels: A B C D E
>>> b[1] <- factor(b[1])
>>> b
>> [1] A B C D E
>> Levels: A B C D E
>>> b[1] <- as.character(b[1])
> The replacement method for a factor is used in
b[1] <- factor(b[1])
See ?"[<-.factor" .
>
> Cheers,
> Bert
>
> The replacement
>>> b
>> [1] A B C D E
>> Levels: A B C D E
>>
>> I think this would definitely deserve a mention in the R Inferno...
>>
>> I guess this is documented somewhere (though I could not find anything
>> in help("[<-")). Would someone be kind enough to give me the explanation
>> of this behavior? I suspect this has something to do with the coercion
>> order, but I do not really get why a[1] does not get assigned the result
>> of as.character(factor(a[1]))... Probably, there is no special-casing of
>> factors, which are handled as integer vectors?
>>
>> Wouldn't it be useful to print a warning when this happens, since nobody
>> reasonable would rely on such a special behavior? I wish R had a "safe
>> mode" where all these tricky implicit coercion cases would warn... :-/
>>
>>
>> Regards

--

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
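The factor replacement method referred to above ("[<-.factor") can be
tried out with a small sketch. It assumes the documented behavior of
that method in base R; the variable b2 is introduced here purely for
illustration. When the left-hand side is a factor, the assigned value is
matched against the existing levels by label, and a value that is not a
level produces NA with a warning.

```r
b <- factor(LETTERS[1:5])

b[1] <- "B"            # character value that matches an existing level
b[2] <- factor("C")    # factor value: matched by its label, not by its code
as.character(b)        # "B" "C" "C" "D" "E"

# A value that is not among the levels becomes NA (with a warning,
# suppressed here so the sketch runs quietly).
b2 <- factor(LETTERS[1:5])
suppressWarnings(b2[1] <- "Z")
is.na(b2[1])           # TRUE
levels(b2)             # levels are unchanged: "A" "B" "C" "D" "E"
```

This is the asymmetry in the thread: dispatch depends on the class of
the object being modified, so a factor on the left gets label matching
while a plain character vector on the left does not.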
[R] Assigning factor to character vector
Hi!

Yesterday I accidentally discovered this:
> a <- LETTERS[1:5]
> a
[1] "A" "B" "C" "D" "E"
>
> a[1] <- factor(a[1])
> a
[1] "1" "B" "C" "D" "E"

BUT:
> b <- factor(LETTERS[1:5])
> b
[1] A B C D E
Levels: A B C D E
> b[1] <- factor(b[1])
> b
[1] A B C D E
Levels: A B C D E
> b[1] <- as.character(b[1])
> b
[1] A B C D E
Levels: A B C D E

I think this would definitely deserve a mention in the R Inferno...

I guess this is documented somewhere (though I could not find anything
in help("[<-")). Would someone be kind enough to give me the explanation
of this behavior? I suspect this has something to do with the coercion
order, but I do not really get why a[1] does not get assigned the result
of as.character(factor(a[1]))... Probably, there is no special-casing of
factors, which are handled as integer vectors?

Wouldn't it be useful to print a warning when this happens, since nobody
reasonable would rely on such a special behavior? I wish R had a "safe
mode" where all these tricky implicit coercion cases would warn... :-/


Regards
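The warning wished for above does not exist in base R as far as this
thread shows, but a user-level check is easy to sketch. The function
safe_replace below is hypothetical (it is not part of R or of any
package mentioned here); it only illustrates the idea of warning when a
factor is about to be inserted into a non-factor vector, and of using
the labels instead of the integer codes.

```r
# Hypothetical helper, not part of base R: replace elements of x and
# warn when a factor would otherwise be inserted as its integer codes.
safe_replace <- function(x, i, value) {
  if (is.factor(value) && !is.factor(x)) {
    warning("factor assigned into a non-factor; using its labels, not its codes")
    value <- as.character(value)   # assumption: labels are what the caller wants
  }
  x[i] <- value
  x
}

a <- LETTERS[1:5]
a <- safe_replace(a, 1, factor("Z"))   # emits the warning defined above
a[1]                                   # "Z"
```

Note that converting to character is itself a design choice: if x were
numeric, the assignment would turn the whole vector into character, so a
stricter variant might stop with an error instead.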