>>>>> "PS" == Petr Savicky <savi...@cs.cas.cz>
>>>>>     on Fri, 8 May 2009 18:10:56 +0200 writes:

    PS> On Fri, May 08, 2009 at 05:14:48PM +0200, Petr Savicky wrote:
    >> Let me suggest to consider the following modification, where match() is done
    >> on the strings, not on the original values.
    >>   levels <- unique(as.character(sort(unique(x))))
    >>   x <- as.character(x)
    >>   f <- match(x, levels)

    PS> An alternative solution is
    >   ind <- order(x)
    >   x <- as.character(x) # or any other conversion to character
    >   levels <- unique(x[ind]) # get unique levels ordered by the original values
    >   f <- match(x, levels)

Yes, that's an interesting, quite different and simple approach.

    PS> The advantage of this over the suggestion from my previous email is that
    PS> the string conversion is applied only once. The conversion need not be only
    PS> as.character(). There may be other choices specified by a parameter. I have
    PS> strong objections against the existing implementation of as.character(),

{(because it is not *accurate* enough, right ?)}

    PS> but still I think that as.character() should be the default for factor()
    PS> for the sake of consistency of the R language.

Hmm... Peter Dalgaard remarked very early in this thread that, at least in
the use of table(..), factor() should not be extremely accurate, and that's
what R-devel's factor has been doing recently.
But then, table(.) could be changed to explicitly call
factor(signif(x, 15), ...) for the case of numeric x.

BTW: I found that practically all the remaining border cases you had
are "solved" by using 14 instead of 15.

I'm currently testing a version of factor() that uses 14, *and* adds an
extra final level test, removing duplicated ones.

Martin
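
For reference, the order()-based recipe quoted above can be written as a small self-contained function. This is only a minimal sketch: the name factor_codes and its conv argument are hypothetical and not part of R, and it is not the version of factor() being tested. The conv argument stands in for the "other choices specified by a parameter" and could, for instance, round to 14 significant digits before converting.

```r
# Hypothetical helper (not part of base R): computes the levels and the
# integer codes following the order()-based recipe from the quoted email.
factor_codes <- function(x, conv = as.character) {
  ind <- order(x)           # permutation that sorts the original values
  s <- conv(x)              # string conversion, applied only once
  lev <- unique(s[ind])     # unique strings, ordered by the original values
  list(codes = match(s, lev), levels = lev)
}

x <- c(3, 1, 2, 1, 3)
res <- factor_codes(x)
print(res$levels)   # "1" "2" "3"
print(res$codes)    # 3 1 2 1 3

# Assumed variant: round to 14 significant digits before converting,
# along the lines of the signif() idea discussed for numeric x.
res14 <- factor_codes(c(0.1 + 0.2, 0.3, 1),
                      conv = function(v) as.character(signif(v, 14)))
print(res14$levels)
```

Because match() is done on the strings, values that convert to the same string share one level, and the unique() step on the ordered strings is what keeps the levels free of duplicates.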