On 2013-04-01 13:08, Matthew Lundberg wrote:
> Note the edited subject line! I don't know why I typed it as it was before.
> This says that as.numeric(as.character(f)) will work regardless of the
> implementation, and I agree.
> It's the recommendation to use as.numeric(levels(f))[f] that has me
> wondering about section 2.3.1 of the language definition. I expect that
> this idiom is in widespread use, and perhaps the language definition
> should be changed.
I think that I may be getting an inkling of what your complaint is:
section 2.3.1 talks about
"an integer array to specify the _actual_ levels" [emphasis added]
and
"a second array of _names_ that are mapped to the integers". [ditto]
When you object to the use of "as.numeric(levels(f))[f]", are you
assuming that "levels(f)" is the set of _integers_ or the set of
_names_?
Anyway, it's indeed the set of names, as returned by the levels()
function.
Peter Ehlers
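
The distinction drawn above between the integer codes and the names can be checked directly. The following is only a minimal sketch: the vector x and the factor f are made-up illustrations, not data from this thread, and the behaviour shown is that of the current implementation that section 2.3.1 describes.

```r
# Hypothetical example data (not from the thread).
x <- c(10, 20, 20, 40)
f <- factor(x)

# levels() returns the names: a character vector, one entry per level.
lev <- levels(f)
is.character(lev)

# as.integer() exposes the integer codes that point into the levels.
codes <- as.integer(f)

# as.numeric(f) yields the codes, not the original values ...
as.numeric(f)

# ... whereas converting the names and indexing by the factor recovers x.
recovered <- as.numeric(levels(f))[f]
identical(recovered, x)
```

So as.numeric(levels(f)) converts the names, and the trailing [f] is the only place where the integer codes come into play.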
On Mon, Apr 1, 2013 at 2:58 PM, Bert Gunter <gunter.ber...@gene.com> wrote:
Yup. Note also:
> as.character.factor
function (x, ...)
levels(x)[x]
But of course this is OK, since this can change if the implementation
does. Which is the whole point, of course.
-- Bert
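
The equivalence that as.character.factor relies on can be tried out as follows. This is a sketch under assumptions: the factor f is a made-up example, and the function body quoted above is the one printed in this thread; other versions of R may define as.character.factor differently, which is exactly why calling as.character() is the implementation-independent route.

```r
# Hypothetical factor whose levels are numbers stored as text.
f <- factor(c("3.5", "1.25", "3.5", "7"))

# Indexing the levels by the factor gives the same result as as.character().
via_levels  <- levels(f)[f]
via_generic <- as.character(f)
same_chr <- identical(via_levels, via_generic)

# Both idioms from help(factor) recover the same numeric values.
num_a <- as.numeric(levels(f))[f]
num_b <- as.numeric(as.character(f))
same_num <- identical(num_a, num_b)

# Indexing by f behaves like indexing by its integer codes.
same_idx <- identical(as.numeric(levels(f))[f],
                      as.numeric(levels(f))[as.integer(f)])

c(same_chr, same_num, same_idx)
```

Only the forms that index by f depend on how `[` treats a factor; as.numeric(as.character(f)) goes through the generic and so does not.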
On Mon, Apr 1, 2013 at 12:16 PM, Matthew Lundberg <matthew.k.lundb...@gmail.com> wrote:
>
> When used as an index, the factor is implicitly converted to integer. In
> the expression as.numeric(levels(f))[f], the vector as.numeric(levels(f))
> is indexed by as.integer(f).
>
> This appears to rely on the current implementation, as mentioned in section
> 2.3.1 of the language definition.
>
>
> On Mon, Apr 1, 2013 at 1:49 PM, Peter Ehlers <ehl...@ucalgary.ca> wrote:
>
> > On 2013-04-01 10:48, Matthew Lundberg wrote:
> >
> >> These two seem to be at odds. Is this the case?
> >>
> >> From help(factor) - section Warning:
> >>>
> >>
> >> To transform a factor f to approximately its original numeric values,
> >> as.numeric(levels(f))[f] is recommended and slightly more efficient than
> >> as.numeric(as.character(f)).
> >>
> >> From the language definition - section 2.3.1:
> >>>
> >>
> >> Factors are currently implemented using an integer array to specify the
> >> actual levels and a second array of names that are mapped to the integers.
> >> Rather unfortunately users often make use of the implementation in order to
> >> make some calculations easier. This, however, is an implementation issue and
> >> is not guaranteed to hold in all implementations of R.
> >>
> >
> > Hint:
> >
> > f <- factor(sample(5, 10, TRUE))
> > as.numeric(levels(f))[f]
> >
> > g <- factor(sample(letters[1:5], 10, TRUE))
> > as.numeric(levels(g))[g]
> >
> > Peter Ehlers
> >
> >
> >
> >> [[alternative HTML version deleted]]
> >>
> >> ______________________________________________
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
>
--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm