Re: [R] Assigning factor to character vector

2013-04-20 Thread Milan Bouchet-Valat
On Saturday, 20 April 2013 at 07:22 -0700, Bert Gunter wrote:
> Sorry, I failed to cc: the list.
> 
> I also added a slight edit below to clarify my final statement. -- Bert
> 
> On Sat, Apr 20, 2013 at 7:17 AM, Bert Gunter wrote:
> > Milan:
> >
> > 1. The R Inferno was written by Pat Burns and is not in any way an
> > "official" R document. So it is a "Pat Burns" not an "R" issue. You
> > can contact him directly, if you wish -- though he monitors this list
> > and almost surely has seen this.
Sure, I did not imply that the R Inferno was an official document.

> > 2. Everything works exactly as documented and expected. See the R
> > Language definition ... and perhaps the "Intro to R" tutorial.
But is there a mention of what happens precisely to assignments from
factors? I could not find it. For example, the R Language Definition
does not mention coercion in the Subset Assignment section [1].

> > Inline comments below.
> >
> > Cheers,
> > Bert
> >
> > On Sat, Apr 20, 2013 at 5:49 AM, Milan Bouchet-Valat wrote:
> >> Hi!
> >>
> >> Yesterday I accidentally discovered this:
> >>> a <- LETTERS[1:5]
> >>> a
> >> [1] "A" "B" "C" "D" "E"
> >
> > a is a character vector.
> >>>
> >>> a[1] <- factor(a[1])
> > The RHS is a vector of integers with additional attributes that define a
> > factor.
> > The replacement of the first element of a, a character vector, by an
> > integer causes the integer to be silently coerced to a character. The
> > default S3 replacement method is used -- see ?UseMethod or the R
> > Intro for info on S3 methods.
Thanks, but I already understand this part. My surprise comes from the
fact that the default replacement method coerces a factor to a character
in a way which is different from calling as.character() on it. It acts
as if attributes were dropped _before_ coercion (and thus everything
happens as if the factor was a mere integer).
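
To make the difference concrete, here is a minimal sketch in plain base R. The
variable names (f, via_method, via_codes, implicit, explicit) are only
illustrative, and the result noted for the implicit assignment is the one
reported in this thread:

```r
a <- LETTERS[1:5]
f <- factor(a[1])                        # factor with the single level "A"

typeof(f)                                # storage type of a factor: "integer"
via_method <- as.character(f)            # as.character() on a factor gives the label
via_codes  <- as.character(unclass(f))   # with the class removed, the integer code

a[1] <- f                                # implicit coercion inside the assignment
implicit <- a[1]                         # "1", i.e. the same as via_codes

a[1] <- as.character(f)                  # explicit conversion before assigning
explicit <- a[1]                         # "A"
```

Converting explicitly with as.character() before assigning seems to be the
simplest way to avoid the surprise.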

> >>> a
> >> [1] "1" "B" "C" "D" "E"
> >>
> >> BUT:
> >>> b <- factor(LETTERS[1:5])
> > b is a factor
> >
> >>> b
> >> [1] A B C D E
> >> Levels: A B C D E
> >>> b[1] <- factor(b[1])
> >>> b
> >> [1] A B C D E
> >> Levels: A B C D E
> >>> b[1] <- as.character(b[1])
> > The replacement method for a factor is used in
> b[1] <- factor(b[1])
> See ?"[<-.factor" .
Yeah, this part was here to show the asymmetric character of the
factor <-> character assignments.
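
For the other direction, a small sketch of what the factor replacement method
does with the value it is given, as far as I understand ?"[<-.factor". The
names b2 and msg are only illustrative:

```r
b <- factor(LETTERS[1:5])

b[1] <- "B"                   # "B" is an existing level, so it is stored as that level
b[2] <- factor("A")           # a factor value is matched to the levels by its label
as.character(b)               # "B" "A" "C" "D" "E"

# A string that is not an existing level becomes NA, and R warns about it.
b2 <- factor(LETTERS[1:5])
msg <- NULL
withCallingHandlers(
  b2[1] <- "Z",
  warning = function(w) {
    msg <<- conditionMessage(w)      # keep the warning text for inspection
    invokeRestart("muffleWarning")
  }
)
is.na(b2[1])                  # TRUE
```

So assigning into a factor goes through the labels and warns on a mismatch,
whereas assigning a factor into a character vector silently uses the integer
codes.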

Regards


1: http://cran.r-project.org/doc/manuals/R-lang.html#Subset-assignment


> > Cheers,
> > Bert
> >
> > The replacement
> >>> b
> >> [1] A B C D E
> >> Levels: A B C D E
> >>
> >> I think this would definitely deserve a mention in the R Inferno...
> >>
> >> I guess this is documented somewhere (though I could not find anything
> >> in help("[<-")). Would someone be kind enough to give me the explanation
> >> of this behavior? I suspect this has something to do with the coercion
> >> order, but I do not really get why a[1] does not get assigned the result
> >> of as.character(factor(a[1]))... Probably, there is no special-casing of
> >> factors, which are handled as integer vectors?
> >>
> >> Wouldn't it be useful to print a warning when this happens, since nobody
> >> reasonable would rely on such a special behavior? I wish R had a "safe
> >> mode" where all these tricky implicit coercion cases would warn... :-/
> >>
> >>
> >> Regards
> >>
> >> __
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide 
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> >
> >
> > --
> >
> > Bert Gunter
> > Genentech Nonclinical Biostatistics
> >
> > Internal Contact Info:
> > Phone: 467-7374
> > Website:
> > http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
> 
> 
>



Re: [R] Assigning factor to character vector

2013-04-20 Thread Bert Gunter
Sorry, I failed to cc: the list.

I also added a slight edit below to clarify my final statement. -- Bert

On Sat, Apr 20, 2013 at 7:17 AM, Bert Gunter wrote:
> Milan:
>
> 1. The R Inferno was written by Pat Burns and is not in any way an
> "official" R document. So it is a "Pat Burns" not an "R" issue. You
> can contact him directly, if you wish -- though he monitors this list
> and almost surely has seen this.
>
> 2. Everything works exactly as documented and expected. See the R
> Language definition ... and perhaps the "Intro to R" tutorial.
>
> Inline comments below.
>
> Cheers,
> Bert
>
> On Sat, Apr 20, 2013 at 5:49 AM, Milan Bouchet-Valat wrote:
>> Hi!
>>
>> Yesterday I accidentally discovered this:
>>> a <- LETTERS[1:5]
>>> a
>> [1] "A" "B" "C" "D" "E"
>
> a is a character vector.
>>>
>>> a[1] <- factor(a[1])
> The RHS is a vector of integers with additional attributes that define a
> factor.
> The replacement of the first element of a, a character vector, by an
> integer causes the integer to be silently coerced to a character. The
> default S3 replacement method is used -- see ?UseMethod or the R
> Intro for info on S3 methods.
>
>>> a
>> [1] "1" "B" "C" "D" "E"
>>
>> BUT:
>>> b <- factor(LETTERS[1:5])
> b is a factor
>
>>> b
>> [1] A B C D E
>> Levels: A B C D E
>>> b[1] <- factor(b[1])
>>> b
>> [1] A B C D E
>> Levels: A B C D E
>>> b[1] <- as.character(b[1])
> The replacement method for a factor is used in
b[1] <- factor(b[1])
See ?"[<-.factor" .
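
A minimal sketch of why the factor method is used in one case and not the
other: as far as I know, "[<-" dispatches on the object being modified, not on
the value being assigned. The class "loud" and the counter below are made up
purely for illustration:

```r
calls <- 0

# Replacement method for a toy class; it counts how often it is invoked.
`[<-.loud` <- function(x, i, value) {
  calls <<- calls + 1
  NextMethod()
}

x <- structure(1:3, class = "loud")
x[1] <- 10L                                  # x has class "loud": method is called

plain <- 1:3
plain[1] <- structure(10L, class = "loud")   # only the value has the class: no dispatch

calls                                        # 1
```

By the same logic, a[1] <- factor(a[1]) never reaches any factor-specific code,
because a is a plain character vector.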
>
> Cheers,
> Bert
>
> The replacement
>>> b
>> [1] A B C D E
>> Levels: A B C D E
>>
>> I think this would definitely deserve a mention in the R Inferno...
>>
>> I guess this is documented somewhere (though I could not find anything
>> in help("[<-")). Would someone be kind enough to give me the explanation
>> of this behavior? I suspect this has something to do with the coercion
>> order, but I do not really get why a[1] does not get assigned the result
>> of as.character(factor(a[1]))... Probably, there is no special-casing of
>> factors, which are handled as integer vectors?
>>
>> Wouldn't it be useful to print a warning when this happens, since nobody
>> reasonable would rely on such a special behavior? I wish R had a "safe
>> mode" where all these tricky implicit coercion cases would warn... :-/
>>
>>
>> Regards
>>



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



[R] Assigning factor to character vector

2013-04-20 Thread Milan Bouchet-Valat
Hi!

Yesterday I accidentally discovered this:
> a <- LETTERS[1:5]
> a
[1] "A" "B" "C" "D" "E"
> 
> a[1] <- factor(a[1])
> a
[1] "1" "B" "C" "D" "E"

BUT:
> b <- factor(LETTERS[1:5])
> b
[1] A B C D E
Levels: A B C D E
> b[1] <- factor(b[1])
> b
[1] A B C D E
Levels: A B C D E
> b[1] <- as.character(b[1])
> b
[1] A B C D E
Levels: A B C D E

I think this would definitely deserve a mention in the R Inferno...

I guess this is documented somewhere (though I could not find anything
in help("[<-")). Would someone be kind enough to give me the explanation
of this behavior? I suspect this has something to do with the coercion
order, but I do not really get why a[1] does not get assigned the result
of as.character(factor(a[1]))... Probably, there is no special-casing of
factors, which are handled as integer vectors?

Wouldn't it be useful to print a warning when this happens, since nobody
reasonable would rely on such a special behavior? I wish R had a "safe
mode" where all these tricky implicit coercion cases would warn... :-/
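
Until something like that exists, a user-level check is easy to sketch. The
helper below, replace_chr(), is hypothetical (it is not part of R); it only
illustrates the kind of warning I have in mind:

```r
# Hypothetical helper, not part of base R: replace elements of a character
# vector, warning when the value is a factor and using its labels instead
# of its integer codes.
replace_chr <- function(x, i, value) {
  stopifnot(is.character(x))
  if (is.factor(value)) {
    warning("assigning a factor to a character vector; using its labels")
    value <- as.character(value)
  }
  x[i] <- value
  x
}

a <- LETTERS[1:5]
a <- suppressWarnings(replace_chr(a, 1, factor("Z")))  # silenced here for brevity
a                                                      # "Z" "B" "C" "D" "E"
```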


Regards
