On Friday, 04 November 2011 at 19:19 -0400, Stavros Macrakis wrote:
> R factors are the natural way to represent factors -- and should be
> efficient since they use small integers. But in fact, for many (but
> not all) operations, R factors are considerably slower than integers,
> or even character strings. This appears to be because whenever a
> factor vector is subsetted, the entire levels vector is copied.

Is it really that common for a factor to have so many levels? One can probably argue that, in that case, a numeric or character vector is preferable: factors are no longer the "natural way" of representing this kind of data.

Adding code to fix a completely theoretical problem is generally not a good idea. I think you'd have to come up with a real use case to have any hope of convincing the developers that a change is needed. There are probably many more interesting areas where speedups can be gained.

Regards

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel