Tricky question, this order issue :-( Thank you so much for the detailed explanation.

Thus, must I conclude that I will have to live with this ASCII order while working in Mac OS X 10.5.2 until the Mac people fix this bug?

You mentioned es_ES.ISO8859-15 on the Mac. Will it do the trick? Yes, as far as I understand. But as I am using R.app, the locale is set by the system preferences. Truly, I am rather confused by this issue. Could I force es_ES.ISO8859-15 as the locale on the Mac?

Sorry if I add another question here... why does Excel order the list correctly? I guess it doesn't rely on the Mac settings.

As an R newbie I must admit that this behaviour, and others, are really hard to deal with. But I've seen, and even done, such an amount of wonderful things with R that it is worth all the effort.

Thanks for your help.

All the best,

Ricardo

Prof Brian Ripley wrote:
> This is a known Mac OS X bug, nothing to do with R which uses the
> system functions (strcoll/wcscoll) for such things.
>
> If you look at the help for sort, it refers you to ?Comparison. Which
> says
>
>    Comparison of strings in character vectors is lexicographic within
>    the strings using the collating sequence of the locale in use: see
>    'locales'. The collating sequence of locales such as 'en_US' is
>    normally different from 'C' (which should use ASCII) and can be
>    surprising. Beware of making _any_ assumptions about the
>    collation order: e.g. in Estonian 'Z' comes between 'S' and 'T',
>    and collation is not necessarily character-by-character - in
>    Danish 'aa' sorts as a single letter, after 'z'. Some platforms
>    may not respect the locale and always sort in ASCII. (String
>    comparison is always for the part of the string up to the first
>    nul if there are embedded nuls.)
>
> Mac OS X (more specifically, 10.5.2 on i386) is one of those
> disrespectful platforms.
>
> > x <- intToUtf8(c(32:127, 160:255), multiple=T)
> > order(x)
>   [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
>  [19]  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36
>  [37]  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54
>  [55]  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72
>  [73]  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
>  [91]  91  92  93  94  95  96  97  98  99 100 101 102 103 104 105 106 107 108
> [109] 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126
> [127] 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144
> [145] 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162
> [163] 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180
> [181] 181 182 183 184 185 186 187 188 189 190 191 192
>
> which is quite different from Linux or Solaris. This may not come
> out, but paste(sort(x), collapse="") includes
>
>    aAªáÁàÀâÂåÅäÄãÃæÆbBcCçÇdDeEéÉèÈêÊëË
>
> on Linux in es_ES.utf8 .
>
> Platforms are a lot worse at sorting in UTF-8 than 8-bit encodings.
> Mac OS X has es_ES.ISO8859-15, and that does do a reasonable job
> including aáàâåäãæ .

-- 
Ricardo Rodríguez
Your XEN ICT Team

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
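
On the question of forcing es_ES.ISO8859-15 from inside R rather than through the system preferences: the following is only a minimal sketch, not something confirmed in this thread to work under R.app. It assumes the locale name "es_ES.ISO8859-15" is installed on the machine, and the sample vector x is made up for illustration. The last step uses sort(method = "radix"), which needs a newer R (3.3.0 or later) and always collates in the C locale, so it shows the plain ASCII order for comparison.

```r
# Remember the current collation locale so it can be restored afterwards.
old <- Sys.getlocale("LC_COLLATE")

# Assumption: "es_ES.ISO8859-15" is available on this system.
# Sys.setlocale() returns "" (with a warning) when the request cannot be honoured.
res <- suppressWarnings(Sys.setlocale("LC_COLLATE", "es_ES.ISO8859-15"))

# Hypothetical sample data; "\u00e1" is a-acute.
x <- c("b", "\u00e1", "a", "B", "A")

if (identical(res, "")) {
  message("Locale not available; collation left as: ", old)
} else {
  # The resulting order depends on the platform's collation tables.
  print(sort(x))
  # Put the original collation locale back.
  Sys.setlocale("LC_COLLATE", old)
}

# For comparison: radix sorting ignores the locale and uses C (ASCII) order,
# so upper-case letters come before lower-case ones.
c_sorted <- sort(c("b", "a", "B", "A"), method = "radix")
print(c_sorted)
```

If the first branch reports that the locale is not available, the collation order is unchanged and the ASCII-like ordering described above would still apply.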