[R] Text data

2009-01-28 Thread Alice Lin

I have a data column of text entries:
26M_AN_C.bmp
22M_AN_C.bmp
20M_HA_O.bmp
20M_AN_C.bmp
26M_HA_O.bmp
22M_HA_O.bmp
31M_AN_C.bmp
38M_HA_O.bmp
.
.
.
.


And I would like to sort by the middle tag: AN, HA, etc.
Is there a way to parse text data in R? 

In Excel, I would have used the LEFT and RIGHT functions to cut out just
the middle two letters and put them into another column to sort by.

Thanks!

-- 
View this message in context: 
http://www.nabble.com/Text-data-tp21714334p21714334.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Text data

2009-01-28 Thread jim holtman
This will sort on those characters:

> x <- readLines(textConnection("26M_AN_C.bmp
+ 22M_AN_C.bmp
+ 20M_HA_O.bmp
+ 20M_AN_C.bmp
+ 26M_HA_O.bmp
+ 22M_HA_O.bmp
+ 31M_AN_C.bmp
+ 38M_HA_O.bmp"))
> closeAllConnections()
> # pick off characters between _
> sortKey <- sub(".*_(.+)_.*", "\\1", x)
> sortKey
[1] "AN" "AN" "HA" "AN" "HA" "HA" "AN" "HA"
> # output sorted list
> x[order(sortKey)]
[1] "26M_AN_C.bmp" "22M_AN_C.bmp" "20M_AN_C.bmp" "31M_AN_C.bmp"
[5] "20M_HA_O.bmp" "26M_HA_O.bmp" "22M_HA_O.bmp" "38M_HA_O.bmp"
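
For something closer to the Excel habit of a helper column, the same key can be stored next to the data and used to sort the rows. This is only a minimal sketch: the data frame `d` and its column name `file` are assumptions for illustration, and it assumes every entry has the form prefix_TAG_suffix.

```r
# Hypothetical data frame; the names `d` and `file` are assumptions.
d <- data.frame(file = c("26M_AN_C.bmp", "22M_AN_C.bmp", "20M_HA_O.bmp",
                         "20M_AN_C.bmp", "26M_HA_O.bmp", "22M_HA_O.bmp",
                         "31M_AN_C.bmp", "38M_HA_O.bmp"),
                stringsAsFactors = FALSE)

# Helper column holding the text between the two underscores
d$tag <- sub(".*_(.+)_.*", "\\1", d$file)

# Sort the rows by the helper column; ties keep their original order
d_sorted <- d[order(d$tag), ]
d_sorted
```

If the tag always sits in the same character positions (5 and 6 in these names), `substr(d$file, 5, 6)` should give the same key and is the nearest counterpart to cutting out characters by position in Excel.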





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


Re: [R] Text data

2009-01-28 Thread Nutter, Benjamin
Jim's solution is more elegant (and probably more efficient), but you
could also try the following, which lets you sort by AN/HA and then by
the number at the start of the filename:

 text <- c("26M_AN_C.bmp", "22M_AN_C.bmp", "20M_HA_O.bmp",
           "20M_AN_C.bmp", "26M_HA_O.bmp", "22M_HA_O.bmp",
           "31M_AN_C.bmp", "38M_HA_O.bmp")

 split <- do.call(rbind, strsplit(text, "_"))

 o <- order(split[,2], split[,1], split[,3])

 text[o]

[1] "20M_AN_C.bmp" "22M_AN_C.bmp" "26M_AN_C.bmp" "31M_AN_C.bmp" "20M_HA_O.bmp"
[6] "22M_HA_O.bmp" "26M_HA_O.bmp" "38M_HA_O.bmp"
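
One caveat: the first field is compared as text here, which works because every prefix in the example has two digits. If the prefixes can differ in length, converting the leading digits to numbers before ordering may be safer. A minimal sketch, assuming the first field always starts with digits; the file names below are made up to show the effect and are not from the original data.

```r
# File names are made up to show prefixes of different lengths.
files <- c("100M_AN_C.bmp", "9M_AN_C.bmp", "26M_HA_O.bmp", "20M_AN_C.bmp")

# One row per file name, one column per underscore-separated field
parts <- do.call(rbind, strsplit(files, "_", fixed = TRUE))

# Leading digits of the first field, converted to numbers
num <- as.numeric(sub("^([0-9]+).*", "\\1", parts[, 1]))

# Sort by the middle tag first, then by the numeric prefix
sorted <- files[order(parts[, 2], num)]
sorted
```

Ordering on `parts[, 1]` directly would compare "100M", "20M" and "9M" character by character, so "100M" would come before "9M".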
