Re: [Rd] Efficiency of factor objects

Milan Bouchet-Valat Sat, 05 Nov 2011 09:36:23 -0700

On Friday, 04 November 2011 at 19:19 -0400, Stavros Macrakis wrote:
> R factors are the natural way to represent factors -- and should be
> efficient since they use small integers.  But in fact, for many (but
> not all) operations, R factors are considerably slower than integers,
> or even character strings.  This appears to be because whenever a
> factor vector is subsetted, the entire levels vector is copied.
Is it so common for a factor to have so many levels? One can probably
argue that, in that case, using a numeric or character vector is
preferred - factors are no longer the "natural way" of representing this
kind of data.
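
For anyone who wants to check the quoted claim on their own machine, a
rough sketch in R follows. The sizes (n_levels, n_reps) and the helper
time_subset() are assumptions chosen purely for illustration; they do not
come from the original report, and the timings will depend on the R
version and hardware.

```r
# Hypothetical sizes, chosen only for illustration.
n_levels <- 1e5
n_reps   <- 1000

set.seed(1)
codes <- sample.int(n_levels, n_levels, replace = TRUE)

f <- factor(codes, levels = seq_len(n_levels))  # factor with many levels
i <- as.integer(f)                              # plain integer codes
s <- as.character(f)                            # character representation

# Hypothetical helper: time repeated single-element subsetting of x.
time_subset <- function(x) {
  system.time(for (k in seq_len(n_reps)) x[k])[["elapsed"]]
}

timings <- c(factor    = time_subset(f),
             integer   = time_subset(i),
             character = time_subset(s))
print(timings)

# A one-element subset of a factor still carries every level,
# because `[` on a factor does not drop unused levels by default.
print(nlevels(f[1]))
```

If the factor timing is not noticeably worse than the other two for
realistic numbers of levels, that would support the view that the problem
is mostly theoretical.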


Adding code to fix a completely theoretical problem is generally not a
good idea. I think you'd have to come up with a real use case to have any
hope of convincing the developers that a change is needed. There are
probably many more interesting areas where speedups can be gained.


Regards

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
