Re: [R] Sorting strings

2012-02-20 Thread statquant2
NICE DDE
It solves my problem !
Awesome stuff

--
View this message in context: 
http://r.789695.n4.nabble.com/Sorting-strings-tp4403696p4404424.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sorting strings

2012-02-20 Thread Ted Harding
On 20-Feb-2012 Petr Savicky wrote:
> On Mon, Feb 20, 2012 at 05:55:30AM -0800, statquant2 wrote:
>> I did, but this does not give the answer to my question...
> >> Does anybody know how to tweak the behaviour of sort, or how else to do this?
> 
> Hi.
> Try this
> 
>   Sys.setlocale("LC_COLLATE", "C") 
> 
> This comes from ?locale and reads there
> 
>   Sys.setlocale("LC_COLLATE", "C")   # turn off locale-specific sorting,
>  #  usually
> 
> See also ?sort
> 
>  The sort order for character vectors will depend on the
>  collating sequence of the locale in use: see 'Comparison'.
> 
> ?Comparison
> 
>  Comparison of strings in character vectors is lexicographic
>  within the strings using the collating sequence of the locale
>  in use: see 'locales'. The collating sequence of locales such
>  as 'en_US' is normally different from 'C' (which should use
>  ASCII) and can be surprising. Beware of making _any_ assumptions
>  about the collation order: ...
> 
> Hope this helps.
> Petr Savicky.

I've been following this thread with interest. I had begun composing
a reply on similar lines to Petr's above, but put it on one side
while waiting to see how the thread would evolve.

In view of the tangle of mixed experiences reported by different
users, I now wonder whether we should have something like "lc_collate"
as a specific parameter for sort(), e.g. so that one can set, for a
particular sorting operation,

   sort(c("X.","X0B"),lc_collate="C")

without affecting the system "LC_COLLATE" setting (i.e. the change
takes effect only within the execution of that sort() command).
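
A per-call setting along these lines can be approximated with a small
wrapper. This is only a sketch: sort_collate() is a hypothetical helper,
not a base R function and not an existing argument of sort(). It assumes
that switching LC_COLLATE temporarily with Sys.setlocale() is acceptable
in the calling code.

```r
# Hypothetical helper (not part of base R): sort a character vector under a
# chosen LC_COLLATE setting and restore the previous setting afterwards.
sort_collate <- function(x, lc_collate = "C", ...) {
  old <- Sys.getlocale("LC_COLLATE")                     # remember current setting
  on.exit(Sys.setlocale("LC_COLLATE", old), add = TRUE)  # restore on exit, even on error
  Sys.setlocale("LC_COLLATE", lc_collate)
  sort(x, ...)
}

# "." (ASCII 46) precedes "0" (ASCII 48), so "X.Z" comes first under "C".
print(sort_collate(c("X0B.Z", "X.Z")))

# The session-wide setting is unchanged after the call.
print(Sys.getlocale("LC_COLLATE"))
```

More recent versions of R also accept sort(x, method = "radix"), which
orders character vectors as in the C locale without touching the locale
settings; whether that is available depends on the R version in use.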

Ted.

-
E-Mail: (Ted Harding) 
Date: 20-Feb-2012  Time: 17:16:47
This message was sent by XFMail



Re: [R] Sorting strings

2012-02-20 Thread De-Jian Zhao

On 2012-2-20 23:15, Rui Barradas wrote:
> Could it be OS related?

Yes, it seems so. I tried it on my local Windows XP machine and on a
Red Hat Linux server, and got different results. I hope it will be fixed
in future versions. In the meantime, we should stay alert and check that
results are consistent when moving code from one platform to another.



> sort(c("X.","X0B"))
[1] "X."  "X0B"
> sort(c("X.Z","X0B.Z"))
[1] "X0B.Z" "X.Z"
> R.version
               _
platform       x86_64-unknown-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status
major          2
minor          9.1
year           2009
month          06
day            26
svn rev        48839
language       R
version.string R version 2.9.1 (2009-06-26)



> sort(c("X.","X0B"))
[1] "X."  "X0B"
> sort(c("X.Z","X0B.Z"))
[1] "X.Z"   "X0B.Z"
> R.version
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          13.0
year           2011
month          04
day            13
svn rev        55427
language       R
version.string R version 2.13.0 (2011-04-13)



Re: [R] Sorting strings

2012-02-20 Thread De-Jian Zhao

Sorry, I made a mistake in my previous message. This is the result from Windows XP:

> sort(c("X.","X0B"))
[1] "X."  "X0B"
> sort(c("X.Z","X0B.Z"))
[1] "X.Z"   "X0B.Z"
> R.version
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          13.0
year           2011
month          04
day            13
svn rev        55427
language       R
version.string R version 2.13.0 (2011-04-13)




Re: [R] Sorting strings

2012-02-20 Thread Petr Savicky
On Mon, Feb 20, 2012 at 04:56:21PM +0100, Petr Savicky wrote:
> On Mon, Feb 20, 2012 at 05:55:30AM -0800, statquant2 wrote:
> > I did, but this does not give the answer to my question...
> > Does anybody know how to tweak the behaviour of sort, or how else to do this?
> 
> Hi.
> 
> Try this
> 
>   Sys.setlocale("LC_COLLATE", "C") 
> 
> 
> This comes from ?locale and reads there

This is not in ?locale, but in ?locales

>  Sys.setlocale("LC_COLLATE", "C")   # turn off locale-specific sorting,
> #  usually

This is in the examples section at the end.

Also check the output of

  Sys.getlocale()

LC_CTYPE can also be relevant:

  Sys.setlocale("LC_CTYPE", "C")
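
As a rough illustration of what the LC_COLLATE change does to the strings
from this thread, the sketch below sorts them under C collation and then
restores the previous setting. The variable names are only illustrative,
and the output of Sys.getlocale() depends on the system.

```r
# Code points of the two characters that decide the comparison under "C".
print(utf8ToInt("."))   # 46
print(utf8ToInt("0"))   # 48

old <- Sys.getlocale("LC_COLLATE")            # remember the current setting
invisible(Sys.setlocale("LC_COLLATE", "C"))   # compare by byte value

c_order <- sort(c("X0B.Z", "X.Z"))
print(c_order)                                # "X.Z" "X0B.Z" under C collation

invisible(Sys.setlocale("LC_COLLATE", old))   # restore the previous setting
```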

Hope this helps.

Petr Savicky.



Re: [R] Sorting strings

2012-02-20 Thread De-Jian Zhao
It seems to be OS-dependent. I got different results when trying it on
Windows XP and Red Hat Linux.



> R.version
               _
platform       x86_64-unknown-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status
major          2
minor          9.1
year           2009
month          06
day            26
svn rev        48839
language       R
version.string R version 2.9.1 (2009-06-26)
> sort(c("X.","X0B"))
[1] "X."  "X0B"
> sort(c("X.Z","X0B.Z"))
[1] "X.Z"   "X0B.Z"


> R.version
               _
platform       x86_64-unknown-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status
major          2
minor          9.1
year           2009
month          06
day            26
svn rev        48839
language       R
version.string R version 2.9.1 (2009-06-26)
> sort(c("X.","X0B"))
[1] "X."  "X0B"
> sort(c("X.Z","X0B.Z"))
[1] "X0B.Z" "X.Z"




Re: [R] Sorting strings

2012-02-20 Thread statquant2
Ok I have :

R) str(R.Version())
List of 13
 $ platform      : chr "x86_64-unknown-linux-gnu"
 $ arch          : chr "x86_64"
 $ os            : chr "linux-gnu"
 $ system        : chr "x86_64, linux-gnu"
 $ status        : chr ""
 $ major         : chr "2"
 $ minor         : chr "12.2"
 $ year          : chr "2011"
 $ month         : chr "02"
 $ day           : chr "25"
 $ svn rev       : chr "54585"
 $ language      : chr "R"
 $ version.string: chr "R version 2.12.2 (2011-02-25)"

R) sort(c("X.","X0B"))
[1] "X."  "X0B"
R) sort(c("X.Z","X0B.Z"))
[1] "X0B.Z" "X.Z"  

I am using Red Hat Linux:
$ uname -a
Linux 2.6.18-238.9.1.el5 #1 SMP Fri Mar 18 12:42:39 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux


--
View this message in context: 
http://r.789695.n4.nabble.com/Sorting-strings-tp4403696p4404298.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Sorting strings

2012-02-20 Thread Rui Barradas
Hello,


statquant2 wrote
> 
> Ok so it changed from 2.12.2 to 2.14.1 ??
> Can somebody tell me how to modify my sort or whatever to get the same
> result that I would get in 2.14.1 ?
> 
> Cheers
> 

I don't know about 2.12.2 but for 2.12.0 I get:

> R.version
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          12.0
year           2010
month          10
day            15
svn rev        53317
language       R
version.string R version 2.12.0 (2010-10-15)
> sort(c("X.","X0B"))
[1] "X."  "X0B"
> sort(c("X.Z","X0B.Z")) 
[1] "X.Z"   "X0B.Z"

And the same for 2.14.1:

> R.version
               _
platform       i386-pc-mingw32
[... deleted...]
version.string R version 2.14.1 (2011-12-22)
> sort(c("X.","X0B"))
[1] "X."  "X0B"
> sort(c("X.Z","X0B.Z")) 
[1] "X.Z"   "X0B.Z"

Could it be OS related?

Rui Barradas.

--
View this message in context: 
http://r.789695.n4.nabble.com/Sorting-strings-tp4403696p4404267.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Sorting strings

2012-02-20 Thread Petr Savicky
On Mon, Feb 20, 2012 at 05:55:30AM -0800, statquant2 wrote:
> I did, but this does not give the answer to my question...
> Does anybody know how to tweak the behaviour of sort, or how else to do this?

Hi.

Try this

  Sys.setlocale("LC_COLLATE", "C") 


This comes from ?locale and reads there

 Sys.setlocale("LC_COLLATE", "C")   # turn off locale-specific sorting,
                                    #  usually

See also ?sort

 The sort order for character vectors will depend on the collating
 sequence of the locale in use: see ‘Comparison’.

?Comparison

 Comparison of strings in character vectors is lexicographic within
 the strings using the collating sequence of the locale in use: see
 ‘locales’.  The collating sequence of locales such as ‘en_US’ is
 normally different from ‘C’ (which should use ASCII) and can be
 surprising.  Beware of making _any_ assumptions about the
 collation order: ...

Hope this helps.

Petr Savicky.



Re: [R] Sorting strings

2012-02-20 Thread R. Michael Weylandt
I don't *think* it's version specific, but rather it depends on your
(still unstated) locale, as the documentation goes to great lengths to
point out. Change that and you might see different behaviors.
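
One way to see this is to sort the same vector under more than one
collation locale. The sketch below is only illustrative:
compare_collation() is a hypothetical helper, and the locale name
"en_US.UTF-8" is an assumption that may not exist on a given system
(Windows uses different locale names). Unavailable locales are reported
as NA rather than guessed at.

```r
# Hypothetical helper: sort x under each requested LC_COLLATE locale.
compare_collation <- function(x, locales = c("C", "en_US.UTF-8")) {
  old <- Sys.getlocale("LC_COLLATE")
  on.exit(Sys.setlocale("LC_COLLATE", old), add = TRUE)  # always restore
  out <- lapply(locales, function(loc) {
    # Sys.setlocale() warns and returns "" when the locale cannot be set.
    ok <- suppressWarnings(Sys.setlocale("LC_COLLATE", loc))
    if (identical(ok, "")) NA_character_ else sort(x)
  })
  names(out) <- locales
  out
}

res <- compare_collation(c("X0B.Z", "X.Z"))
print(res)
```

The "C" entry is ordered by byte value; the other entries depend entirely
on what the operating system provides for that locale.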

Michael

On Mon, Feb 20, 2012 at 8:55 AM, statquant2  wrote:
> I did, but this does not give the answer to my question...
> Does anybody know how to tweak the behaviour of sort, or how else to do this?


Re: [R] Sorting strings

2012-02-20 Thread statquant2
I did, but this does not give the answer to my question...
Does anybody know how to tweak the behaviour of sort, or how else to do this?

--
View this message in context: 
http://r.789695.n4.nabble.com/Sorting-strings-tp4403696p4404091.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Sorting strings

2012-02-20 Thread statquant2
Ok so it changed from 2.12.2 to 2.14.1 ??
Can somebody tell me how to modify my sort or whatever to get the same
result that I would get in 2.14.1 ?

Cheers

--
View this message in context: 
http://r.789695.n4.nabble.com/Sorting-strings-tp4403696p4403858.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Sorting strings

2012-02-20 Thread Enrico Schumann


See ?Comparison, which holds some warnings about what to expect when 
sorting strings.



On 20.02.2012 11:51, Petr Savicky wrote:
> On Mon, Feb 20, 2012 at 02:18:42AM -0800, statquant2 wrote:
>> Hi all, I am having difficulty understanding how R sorts strings:
>>
>> If I do
>> R) sort(c("X.","X0B"))
>> [1] "X."  "X0B"
>>
>> So for me, as far as lexicographic order is concerned, I can add whatever
>> I like to the end and the order will remain the same, but:
>
> Hi.
>
> This need not be true for strings of different lengths.
> For example
>
>    ab
>    abc
>
> become, after appending z to each and sorting,
>
>    abcz
>    abz
>
> Petr Savicky.



--
Enrico Schumann
Lucerne, Switzerland
http://nmof.net/



Re: [R] Sorting strings

2012-02-20 Thread Keith Jewell

"Petr Savicky"  wrote in message 
news:20120220105153.gc21...@cs.cas.cz...
> On Mon, Feb 20, 2012 at 02:18:42AM -0800, statquant2 wrote:
>> Hi all, I am having difficulty understanding how R sorts strings:
>>
>> If I do
>> R) sort(c("X.","X0B"))
>> [1] "X."  "X0B"
>>
>> So for me, as far as lexicographic order is concerned, I can add whatever
>> I like to the end and the order will remain the same, but:
>
> Hi.
>
> This need not be true for strings of different lengths.
> For example
>
>  ab
>  abc
>
> become, after appending z to each and sorting,
>
>  abcz
>  abz
>
> Petr Savicky.
>

That's not the explanation in this case.

The OP isn't telling us everything.
I get [R version 2.14.1 Platform: i386-pc-mingw32/i386 (32-bit)]:
> sort(c("X.","X0B"))
[1] "X."  "X0B"
> sort(c("X.Z","X0B.Z"))
[1] "X.Z"   "X0B.Z"

KJ



Re: [R] Sorting strings

2012-02-20 Thread Petr Savicky
On Mon, Feb 20, 2012 at 02:18:42AM -0800, statquant2 wrote:
> Hi all, I am having difficulty understanding how R sorts strings:
> 
> If I do
> R) sort(c("X.","X0B"))
> [1] "X."  "X0B"
> 
> So for me, as far as lexicographic order is concerned, I can add whatever
> I like to the end and the order will remain the same, but:

Hi.

This need not be true for strings of different lengths.
For example

  ab
  abc

become, after appending z to each and sorting,

  abcz
  abz
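
The counterexample above can be checked directly in R. This is a minimal
sketch that assumes C collation (set temporarily and then restored), so
that strings are compared character by character by byte value; the
variable names are only illustrative.

```r
x <- c("ab", "abc")
y <- paste0(x, "z")                           # "abz" "abcz"

old <- Sys.getlocale("LC_COLLATE")            # remember the current setting
invisible(Sys.setlocale("LC_COLLATE", "C"))   # compare by byte value

sorted_x <- sort(x)   # "ab" is a prefix of "abc", so it sorts first
sorted_y <- sort(y)   # third character decides: "c" precedes "z"
print(sorted_x)       # "ab"   "abc"
print(sorted_y)       # "abcz" "abz"

invisible(Sys.setlocale("LC_COLLATE", old))   # restore the previous setting
```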

Petr Savicky.



[R] Sorting strings

2012-02-20 Thread statquant2
Hi all, I am having difficulty understanding how R sorts strings:

If I do
R) sort(c("X.","X0B"))
[1] "X."  "X0B"

So for me, as far as lexicographic order is concerned, I can add whatever
I like to the end and the order will remain the same, but:
R) sort(c("X.Z","X0B.Z"))
[1] "X0B.Z" "X.Z"

Can somebody give me a trick to make the order lexicographic?



--
View this message in context: 
http://r.789695.n4.nabble.com/Sorting-strings-tp4403696p4403696.html
Sent from the R help mailing list archive at Nabble.com.
