[R] grep won't work finding one column

2014-10-14 Thread Kate Ignatius
I'm having an issue with grep:

I have numerous columns that end with ".at"... when I use grep like so:

df[,grep(".at",colnames(df))]

it works fine.  When I have one column that ends with ".at", it does not
work.  Why is that?  As this is a loop with a varying number of columns
ending in ".at", I would like some code that would work with 1 to n
columns.

Is there something more optimal than grep?

Thanks!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grep won't work finding one column

2014-10-14 Thread John McKown
I can't answer your direct question. But do you realize that your code
does not match your words? The grep shown does not _only_ match columns
whose names end with the characters ".at". It matches all column names
which contain any character followed by the characters "at". To do the
match with only columns whose names end with the characters ".at", you
need: grep("\.at$",colnames(df)).
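
A small sketch of the over-matching described above. The column names are invented for illustration, and the anchored pattern is written with a character class ("[.]") purely so that no backslash escaping is involved; it is one possible spelling, not the only one:

```r
# Hypothetical column names, made up for this illustration.
nm <- c("sample1.at", "sample1.dp", "format", "rate")

# Unanchored ".at": the dot matches any single character,
# so "mat" in "format" and "rat" in "rate" also qualify.
loose <- grep(".at", nm, value = TRUE)

# "[.]at$": a literal period, then "at", at the very end of the name.
strict <- grep("[.]at$", nm, value = TRUE)

print(loose)
print(strict)
```

Only the anchored pattern restricts the result to names that really end in ".at".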

You might want to post an example which fails. Just to be complete, be
sure to use the dput() function so that it is easy for members of the
group to cut'n'paste to get your data into our own R workspace.

-- 
There is nothing more pleasant than traveling and meeting new people!
Genghis Khan

Maranatha! 
John McKown


Re: [R] grep won't work finding one column

2014-10-14 Thread Kate Ignatius
For example,

DF will usually have numerous columns with sample1.at sample1.dp
sample1.fg sample2.at sample2.dp sample2.fg and so on

I'm running this code in R as part of a shell script which runs over
several files of different sizes, so sometimes it will come across a file
with only one sample in it, i.e. sample1. When the R code runs through this
file, trying to grep out the sample1.at column does not work and
it will halt.

Here is some sample data... say I want to get out only the AT_ column:


Sample_1 AT_1
A/A RR
G/G AA
T/T AA
G/A RA
G/G RR
C/C AA
C/C AA
C/T RA
A/A AA
T/G RA

it will have a problem grepping out this single column.


Re: [R] grep won't work finding one column

2014-10-14 Thread Jeff Newmiller
Your question is missing a reproducible example, and you don't say how it does 
not work, so we cannot tell what is going on.

Two things do come to mind, though.

A) Data frame subsets with only one column by default return a vector, which is 
a different type of object than a single-column data frame. You would need to 
read ?"[.data.frame" about the "drop" argument if you wanted to consistently 
get a data frame from this expression.

B) The period is a wildcard in regular expressions. If you expect to limit your 
search to a literal ".at" at the end of the name then you should use the search 
pattern "\\.at$" instead (the first backslash allows the second one to be stored 
by R in the string, and the second one is the only one seen by grep, which it 
reads as making the period not act like a wildcard). You really should read 
about regular expressions before using them. There are many tutorials on the 
web about this topic.
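
The two layers of escaping in point B can be inspected directly. The following is only a sketch with made-up column names; endsWith() is mentioned as a regex-free alternative and assumes R 3.3.0 or later, which is newer than this thread:

```r
pat <- "\\.at$"

# The R parser turns "\\" into one backslash, so the stored string
# has five characters: \ . a t $
nchar(pat)
cat(pat, "\n")   # prints the pattern as the regex engine receives it

nm <- c("sample1.at", "sample1.dp", "format")   # hypothetical names
hits <- grepl(pat, nm)

# endsWith() compares literal suffixes, so nothing needs escaping.
hits_literal <- endsWith(nm, ".at")

print(hits)
print(hits_literal)
```

Both approaches flag only the first name here; which one reads better is a matter of taste.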

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.


Re: [R] grep won't work finding one column

2014-10-14 Thread Ivan Calandra

Shouldn't it be
grep("\\.at$",colnames(df))
with a double backslash?

Ivan

--
Ivan Calandra
University of Reims Champagne-Ardenne
GEGENA² - EA 3795
CREA - 2 esplanade Roland Garros
51100 Reims, France
+33(0)3 26 77 36 89
ivan.calan...@univ-reims.fr
https://www.researchgate.net/profile/Ivan_Calandra


Re: [R] grep won't work finding one column

2014-10-14 Thread John McKown
AT and at are not the same. If you want a case-insensitive comparison
for the characters "at" you need to add ignore.case=TRUE. E.g.:

df[,grep(".at",colnames(df),ignore.case=TRUE)]

That should match the column name you gave, which does not match your
initial description, which said ending with ".at". That name has an embedded
AT. So I am still a bit confused about your needs.
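
A quick check of that call against the two column names from the sample data, offered as a hedged sketch. Because the dot in ".at" must itself match a character, a name that begins with AT leaves nothing for the dot to consume, so the pattern as written may not pick up AT_1 even with ignore.case=TRUE. The "^at" pattern below is an assumption about the intent, not something taken from the thread:

```r
nm <- c("Sample_1", "AT_1")   # column names from the sample data in the thread

# ".at" needs some character *before* "at", so "AT_1" is not matched
# even when case is ignored.
with_dot <- grep(".at", nm, ignore.case = TRUE, value = TRUE)

# "^at" (hypothetical alternative): names that start with AT or at.
at_start <- grep("^at", nm, ignore.case = TRUE, value = TRUE)

print(with_dot)
print(at_start)
```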


Re: [R] grep won't work finding one column

2014-10-14 Thread John McKown
You're right. I don't use regexps in R very much. In most other
languages, a single \ is needed. The R parser is different and I
forgot. Thanks for the heads up.


Re: [R] grep won't work finding one column

2014-10-14 Thread Kate Ignatius
In the sense that it does not work.  It works when there are 50 samples
in the file, but it does not work when there is one.

The usual headings are:  sample1.at sample1.dp
sample1.fg sample2.at sample2.dp sample2.fg and so on to a max of
sample50.at sample50.dp sample50.fg

Using this greps out all the .at columns perfectly:

df[,grep(".at",colnames(df))]

When I come across a file where there is one sample:

sample1.at sample1.dp sample1.fg

Using this:

df[,grep(".at",colnames(df))]

returns nothing.

Oh - AT/at was just an example... that's not my problem...




Re: [R] grep won't work finding one column

2014-10-14 Thread Rolf Turner

You are being (deliberately?) obtuse.

It's *all* your problem.  You have to be precise when working with 
computers and when providing examples.  Don't build examples with 
confusing red herrings.


Your assertion that df[,grep(".at",colnames(df))] returns nothing is 
simply ***INCORRECT***.  It works just fine.  See the (tidy, completely 
reproducible) example in the attached file "kate.txt".


Note that, with a single ".at" column in your data frame, what is 
returned is ***NOT*** a data frame but rather a vector.  If you want a 
(one-column) data frame you need to use "drop=FALSE" in your 
subscripting call.
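
A minimal sketch of the difference, using an invented one-sample data frame. The "[.]at$" pattern is just one way of writing "ends in a literal .at" and is an assumption here, not the pattern used elsewhere in the thread:

```r
# Hypothetical one-sample data frame, for illustration only.
d <- data.frame(sample1.at = 1:3, sample1.dp = 4:6, sample1.fg = 7:9)
idx <- grep("[.]at$", colnames(d))

v  <- d[, idx]                 # one column selected: a plain vector comes back
d1 <- d[, idx, drop = FALSE]   # still a data frame, with one column

print(is.data.frame(v))
print(is.data.frame(d1))
print(dim(d1))
```

Code in a loop that later calls colnames() or ncol() on the result would behave the same for 1 or n matching columns only in the drop=FALSE form.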


You need to study up on R and learn how it works (read the Introduction 
to R) and stop going off half-cocked.


cheers,

Rolf Turner

P.S.  It is a ***bad*** idea to use "df" as the name of a data frame. 
The string "df" is the name of a *function* in base R (it is the 
probability density function for the F distribution).  Although R is 
clever enough to distinguish functions from data objects in *most* 
circumstances, at the very least confusion could arise.
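
To illustrate the kind of confusion meant here, a short sketch. It refers to the function as stats::df so that it behaves the same whether or not a data frame called df happens to exist in the workspace:

```r
# stats::df is the density function of the F distribution.
print(is.function(stats::df))
print(stats::df(1, df1 = 2, df2 = 5))   # density evaluated at x = 1

# If the data frame "df" was never created (e.g. a failed read), then
# df[, 1] tries to subset the function instead, which is an error.
res <- tryCatch(stats::df[, 1], error = function(e) "error")
print(res)
```

The resulting error message talks about subsetting a function rather than about a missing data frame, which can make the real cause harder to spot.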


R. T.

--
Rolf Turner
Technical Editor ANZJS
#
# Check it out.
#

# Data frame with one ".at" column.
d1 <- as.data.frame(matrix(1,ncol=3,nrow=10))
n1 <- c("sample1.at","sample1.dp","sample1.g")
names(d1) <- n1

# Data frame with many ".at" columns.
d2 <- as.data.frame(matrix(1,ncol=50,nrow=10))
set.seed(42)
n2 <- paste("sample",1:50,sample(c(".at",".dp",".fg"),50,TRUE),sep="")
names(d2) <- n2

# Extract the ".at" columns.
print(d1[,grep(".at",colnames(d1))])
print(d2[,grep(".at",colnames(d2))])