Re: [R] regular expression for selection
Hi,

Try grepl instead of sub:

    mena[grepl("m5.", mena)]

HTH,

baptiste

On 14 November 2011 21:45, Petr PIKAL petr.pi...@precheza.cz wrote:
> Dear all
>
> I am again (as usual) lost in regular expression use for selection. Here are my data:
>
> dput(mena)
> c("138516_10g_50ml_50c_250utes1_m53.00-_s1.imp",
>   "138516_10g_50ml_50c_250utes1_m54.00_s1.imp",
>   "138516_10g_50ml_50c_250utes1_m55.00_s1.imp",
>   "138516_10g_50ml_50c_250utes1_m56.00_s1.imp",
>   "138516_10g_50ml_50c_250utes1_m57.00_s1.imp",
>   "138516_10g_50ml_50c_250utes1_m58.00_s1.imp",
>   "138516_10g_50ml_50c_250utes1_m59.00_s1.imp")
>
> I want to select only the values consisting of m followed by the numbers 53 to 59. I used
>
> sub("m5.", "", mena)
>
> which correctly matches those m53 - m59 values but, contrary to my expectation, it replaced the matched values with the specified replacement, in this case an empty string.
>
> What should I use if I want to get rid of everything but m53-m59 in those strings?
>
> Regards
> Petr

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
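A short illustration of why neither call above yields the m5x substrings: sub() replaces (here, deletes) the part it matches, while grepl() only reports which elements contain a match, so subsetting with it returns whole strings. This is a minimal sketch; the vector x and the name "no_match_here.imp" are made up for the example, and the other two elements are taken from mena in the post.

```r
# Two elements from the post plus one hypothetical non-matching name
x <- c("138516_10g_50ml_50c_250utes1_m54.00_s1.imp",
       "138516_10g_50ml_50c_250utes1_m55.00_s1.imp",
       "no_match_here.imp")

# sub() removes the first match and keeps everything else
removed <- sub("m5.", "", x)
# removed[1] is "138516_10g_50ml_50c_250utes1_.00_s1.imp"

# grepl() gives one TRUE/FALSE per element; subsetting keeps whole strings
has_match <- grepl("m5.", x)
kept <- x[has_match]
```

Here kept holds the first two elements unchanged, which filters the vector but does not extract "m54" or "m55".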
Re: [R] regular expression for selection
On 11/14/2011 07:45 PM, Petr PIKAL wrote:
> What should I use if I want to get rid of everything but m53-m59 in those strings?

Hi Petr,

How about:

    grep("m5", mena)

Jim
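For reference, a small sketch of what grep() returns, assuming a three-element vector x in which the third name is invented for the example. By default the result is a vector of integer positions; with value = TRUE it is the matching elements themselves, still as whole strings.

```r
# Two elements from the post plus one hypothetical non-matching name
x <- c("138516_10g_50ml_50c_250utes1_m54.00_s1.imp",
       "138516_10g_50ml_50c_250utes1_m55.00_s1.imp",
       "other_file.imp")

idx  <- grep("m5", x)                # positions of matching elements: 1 2
vals <- grep("m5", x, value = TRUE)  # the matching elements themselves
```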
Re: [R] regular expression for selection
Hi

> Hi,
>
> Try grepl instead of sub,
>
> mena[grepl("m5.", mena)]

It does not select those m5? strings from the character vector. I need as output a vector

    m53, m54, m55, m56, m57, m58, m59

Regards
Petr

> HTH,
>
> baptiste
Re: [R] regular expression for selection
Hi

> Hi Petr,
> How about:
>
> grep("m5", mena)

It gives numeric values, which tell me that there is a match in each string, but as a result I need only the m53-m59 substrings.

Regards
Petr

> Jim
Re: [R] regular expression for selection
On 14.11.2011 10:22, Petr PIKAL wrote:
>> How about:
>>
>> grep("m5", mena)
>
> It gives numeric values, which tell me that there is a match in each string, but as a result I need only the m53-m59 substrings.

    gsub(".*_(m5.).*", "\\1", mena)

Uwe Ligges
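A sketch of how the call above behaves, including one caveat: an element that does not match the pattern is returned unchanged rather than dropped. The vector x, its third element, and the NA-guarded variant are assumptions added for illustration, not part of the reply.

```r
# Two elements from the post plus one hypothetical non-matching name
x <- c("138516_10g_50ml_50c_250utes1_m54.00_s1.imp",
       "138516_10g_50ml_50c_250utes1_m55.00_s1.imp",
       "unrelated_file.imp")

pattern <- ".*_(m5.).*"

# The pattern consumes the whole string; "\\1" puts back only the group
out <- gsub(pattern, "\\1", x)
# out is "m54" "m55" "unrelated_file.imp"

# Optional guard: return NA where the pattern does not match at all
out_na <- ifelse(grepl(pattern, x), sub(pattern, "\\1", x), NA_character_)
```

Because the pattern already spans the entire string, sub() and gsub() give the same result here.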
Re: [R] regular expression for selection
Does

    library(stringr)
    str_extract(mena, "m5[0-9]")

achieve what you are looking for?

Rgds,

Rainer

On Monday 14 November 2011 10:22:09 Petr PIKAL wrote:
> It gives numeric values, which tell me that there is a match in each string, but as a result I need only the m53-m59 substrings.
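If installing stringr is not an option, base R can do a similar extraction with regexpr() and regmatches(). This is a swapped-in alternative, not what the reply proposes; the vector x and its third element are assumptions for the example. Note that regmatches() silently drops elements without a match, so the sketch also shows one way to keep the result aligned with the input.

```r
# Two elements from the post plus one hypothetical non-matching name
x <- c("138516_10g_50ml_50c_250utes1_m54.00_s1.imp",
       "138516_10g_50ml_50c_250utes1_m55.00_s1.imp",
       "unrelated_file.imp")

m <- regexpr("m5[0-9]", x)     # position of first match, -1 if none
found <- regmatches(x, m)      # only the matched substrings: "m54" "m55"

# Keep one result per input element, with NA where nothing matched
hit <- as.integer(m) != -1L
aligned <- rep(NA_character_, length(x))
aligned[hit] <- found
```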
Re: [R] regular expression for selection
Hi

Thank you. It is pure magic, something taught at Unseen University.

This is what I got as help for selecting only the letters from a character vector:

    vzor
     [1] "61A"     "62C/27"  "65A/27"  "66C/29"  "69A/29"  "70C/31"  "73A/31"
     [8] "74C/33"  "77A/33"  "81A/35"  "82C/37"  "85A/37"  "86C/39"  "89A/39"
    [15] "90C/41"  "93A/41"  "94C/43"  "97A/43"  "98C/45"  "101A/45" "102C/47"
    [22] "105A/47" "106C/49" "109A/49" "110C/51" "113A/51"

    gsub("[^A-z]", "", vzor)
     [1] "A" "C" "A" "C" "A" "C" "A" "C" "A" "A" "C" "A" "C" "A" "C" "A" "C"
    [18] "A" "C" "A" "C" "A" "C" "A" "C" "A"

Therefore I expected that

    sub("m5.", "\\1", mena)

or

    sub("m5.", "", mena)

would select what I wanted. But that was not the case.

Could you please correct me where I go wrong in reading your solution?

    gsub(".*_(m5.).*", "\\1", mena)

or

    gsub(".*(m5.).*", "\\1", mena)

.* matches any characters

() negation? or marking a selection for the back reference?

Finally, the expression matches the whole string and evaluates what is matched by the parenthesised part. This evaluation is returned by the backreference.

Is this a correct reading?

Regards
Petr

> gsub(".*_(m5.).*", "\\1", mena)
>
> Uwe Ligges
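A note on why the letter-stripping trick does not carry over. Deleting the complement of a character class works when the wanted text is defined by which characters it contains, but "m followed by 5 and one more character" is a sequence, and a class cannot express order. The sketch below uses the first three elements of vzor and one element of mena; swapping [^A-z] for [^[:alpha:]] is a suggestion added here, since the range A-z also spans the ASCII punctuation between Z and a, and ?regex warns that ranges are locale-dependent.

```r
vzor <- c("61A", "62C/27", "65A/27")

# Keep letters only by deleting everything that is not alphabetic
letters_only <- gsub("[^[:alpha:]]", "", vzor)
# letters_only is "A" "C" "A"

# The same idea applied to "m5": every m and every 5 anywhere in the
# string survives, so the result is not the single substring "m54"
x <- "138516_10g_50ml_50c_250utes1_m54.00_s1.imp"
class_based <- gsub("[^m5]", "", x)
```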
Re: [R] regular expression for selection
On 14.11.2011 11:27, Petr PIKAL wrote:
> Could you please correct me where I go wrong in reading your solution?
>
> gsub(".*_(m5.).*", "\\1", mena)
>
> or
>
> gsub(".*(m5.).*", "\\1", mena)
>
> .* matches any characters

Yes.

> () negation? or marking a selection for the back reference?

The latter. See books about regular expressions. I think it is also mentioned in ?regexp, with an example in ?gsub.

> Finally, the expression matches the whole string and evaluates what is matched by the parenthesised part. This evaluation is returned by the backreference.
>
> Is this a correct reading?

Indeed, where \\1 is the first backreference.

Best,
Uwe
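To see the two pieces discussed above side by side, regexec() together with regmatches() returns the whole match followed by each parenthesised group. This is a minimal sketch on a single element of mena; using regexec() for inspection is an addition here, not something proposed in the thread.

```r
x <- "138516_10g_50ml_50c_250utes1_m54.00_s1.imp"

# regexec() records the full match and every capture group
parts <- regmatches(x, regexec(".*_(m5.).*", x))[[1]]

whole_match <- parts[1]  # the entire string, consumed by .* on both sides
group_one   <- parts[2]  # what (m5.) captured, i.e. what "\\1" refers to

# Replacing the whole match with the first group leaves only the group
result <- sub(".*_(m5.).*", "\\1", x)
```

Since whole_match equals the full input, substituting it with group_one leaves just "m54".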