The sort order is documented to depend on your locale. I get
> sort(x)
[1] " A" " B" " C" "A" "B" "C"
in the C locale. The help page does say so:
    The sort order for character vectors will depend on the collating
    sequence of the locale in use: see 'Comparison'.
The default collation sequences for standard locales in Linux distros are
quite unintuitive (and are not character-by-character either). If you
want ASCII order, ask for it by setting LC_COLLATE=C.
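
As a minimal sketch of the point above, the collation locale can also be switched from inside a running R session rather than through the environment. The variable names `old_collate` and `sorted_c` are only illustrative, and this assumes the platform accepts "C" as a collation locale (it normally does).

```r
# The vector from the original question.
x <- c(LETTERS[1:3], paste(" ", LETTERS[1:3], sep = ""))

# Remember the current collation locale so it can be restored afterwards.
old_collate <- Sys.getlocale("LC_COLLATE")

# Switch collation to the C locale, which compares by character code,
# so a leading space (code 32) sorts before any letter.
invisible(Sys.setlocale("LC_COLLATE", "C"))
sorted_c <- sort(x)
print(sorted_c)

# Restore the previous collation setting.
invisible(Sys.setlocale("LC_COLLATE", old_collate))
```

Setting LC_COLLATE=C in the environment before starting R should have the same effect on sort() for the whole session.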
On Thu, 19 Aug 2004 [EMAIL PROTECTED] wrote:
> The following is not what I expected in sorting characters (single letters
> and the same letters with preceding spaces).
> Can someone enlighten me as to why the following might be a correct result
> for sorting?
>
> ; x <- c(LETTERS[1:3], paste(" ", LETTERS[1:3], sep=""))
> ; x
> [1] "A" "B" "C" " A" " B" " C"
> ; sort(x)
> [1] "A" " A" "B" " B" "C" " C"
> ; sort(x, method="shell")
> [1] "A" " A" "B" " B" "C" " C"
> ; sort(x, method="quick")
> [1] "A" " A" "B" " B" "C" " C"
>
> I would expect the result to be " A" " B" " C" "A" "B" "C" instead,
> going by ASCII codes (and a quick check with S-Plus 6.2 shows that this is
> what S-Plus thinks the sorted sequence is).
S-Plus explicitly says it uses ASCII. I believe that is a deficiency they
plan to correct.
--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595