Re: [R] element wise pattern recognition and string substitution

2016-09-09 Thread Jun Shen
Hi Jeff, I have been trying different methods and found that your approach is the most efficient. I was able to resolve the string-parsing problem. Let me report back to the group. The following example explains what I was trying to achieve. melt.results is where the strings reside, testdata is a

Re: [R] element wise pattern recognition and string substitution

2016-09-09 Thread Ista Zahn
On Sep 9, 2016 12:14 AM, "Jun Shen" wrote:
> Hi Ista,
>
> Imagine we have a data set called "all.exposure" with variables "TX","WTCUT" for a function.
I don't think imagining your situation is the best way. Make an example so we can actually see what you are working

Re: [R] element wise pattern recognition and string substitution

2016-09-08 Thread Jun Shen
Hi Ista, Imagine we have a data set called "all.exposure" with variables "TX","WTCUT" for a function. The concatenated strings are generated by some procedure within the function (the dot is used as separator, I can't change that). Now I want to parse the strings back to the original values as in

Re: [R] element wise pattern recognition and string substitution

2016-09-07 Thread Ista Zahn
On Mon, Sep 5, 2016 at 12:56 PM, Jun Shen wrote:
> Thanks for the reply, Bert.
>
> Your solution solves the example. I actually have a more general situation
> where I have this dot concatenated string from multiple variables. The
> problem is those variables may have

Re: [R] element wise pattern recognition and string substitution

2016-09-07 Thread Ista Zahn
On Tue, Sep 6, 2016 at 11:59 PM, Jun Shen wrote:
> Hi Ista,
>
> Thanks for the suggestion. I didn't know mapply can be used this way! Let me
> take one more step. Instead of defining a pattern for each string, I would
> like to define a set of patterns from all the possible

Re: [R] element wise pattern recognition and string substitution

2016-09-07 Thread Bert Gunter
Jeff: Not sure what you meant by this: "There is no other reason to put parentheses in the pattern... they are not grouping symbols." ... but in fact, from ?regexp "Repetition takes precedence over concatenation, which in turn takes precedence over alternation. A whole subexpression may be

Re: [R] element wise pattern recognition and string substitution

2016-09-07 Thread Ista Zahn
If you want to match each element of 'strings' to a different regex, do it. Here are three ways, using your original example.
pattern1 <- "([^.]*)\\.([^.]*\\.[^.]*)\\.(.*)"
pattern2 <- "([^.]*)\\.([^.]*)\\.(.*)"
patterns <- c(pattern1,pattern2)
strings <- c('TX.WT.CUT.mean','mg.tx.cv')
for(i in
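The message above is cut off before the loop body, so the following is only a minimal sketch of the element-wise idea, not the code that was posted. It assumes the goal is to extract the middle field, so the replacement "\\2" is an assumption; the patterns and strings are the ones quoted above. mapply() pairs each pattern with the string in the same position:

```r
# Patterns and strings as quoted in the message above.
pattern1 <- "([^.]*)\\.([^.]*\\.[^.]*)\\.(.*)"
pattern2 <- "([^.]*)\\.([^.]*)\\.(.*)"
patterns <- c(pattern1, pattern2)
strings  <- c("TX.WT.CUT.mean", "mg.tx.cv")

# Apply the i-th pattern to the i-th string only.
# The replacement "\\2" (keep the middle group) is an assumption.
middle <- mapply(function(p, s) sub(p, "\\2", s),
                 patterns, strings, USE.NAMES = FALSE)
middle
# [1] "WT.CUT" "tx"
```

USE.NAMES = FALSE keeps the result a plain character vector; without it, mapply() would name each element after its pattern.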

Re: [R] element wise pattern recognition and string substitution

2016-09-07 Thread Bert Gunter
Jun: 1. Tell us your desired result from your test vector and maybe someone will help. 2. As we played this game once already (you couldn't do it; I showed you how), this seems to be a function of your limitations with regular expressions. I'm probably not much better, but in any case, I don't

Re: [R] element wise pattern recognition and string substitution

2016-09-07 Thread Jeff Newmiller
My error. However, Jun has been severely abusing them... such use is unusual, and the "(?:" non-capturing group marker was invented because the capture side effect is so central to the use of the regular parenthesis. On Tue, 6 Sep 2016, Bert Gunter wrote: Jeff: Not sure what you meant by

Re: [R] element wise pattern recognition and string substitution

2016-09-07 Thread Jeff Newmiller
Here are some suggestions:
test.string <- c( '240.m.g.>110.kg.geo.mean', '3.mg.kg.>110.kg.P05', '240.m.g.>50-70.kg.geo.mean' )
# based on your literal idea
suggested.pattern1 <-

Re: [R] element wise pattern recognition and string substitution

2016-09-06 Thread Bert Gunter
Jun: "My problem is the pattern has to be dynamically constructed on the input data of the function " What does that mean? How can a pattern be "dynamically constructed" when you have not made clear (at least to me, perhaps also to yourself and/or others) *how* it is to be constructed? Cheers,

Re: [R] element wise pattern recognition and string substitution

2016-09-06 Thread Jun Shen
Hi Ista, Thanks for the suggestion. I didn't know mapply can be used this way! Let me take one more step. Instead of defining a pattern for each string, I would like to define a set of patterns from all the possible combination of the unique values of those variables. Then I need each string to
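One way to read "a set of patterns from all the possible combinations of the unique values" is to build a single pattern in which each variable contributes one group of alternatives. The sketch below is an illustration under assumptions, not the poster's function: tx_values, wt_values and alt() are hypothetical names, the values are taken from the test strings quoted elsewhere in this thread, and only dots are escaped (other regex metacharacters in the values would need escaping too).

```r
# Test strings as they appear elsewhere in this thread.
test.string <- c("240.m.g.>110.kg.geo.mean",
                 "3.mg.kg.>110.kg.P05",
                 "240.m.g.>50-70.kg.geo.mean")

# Hypothetical unique values of the two variables.
tx_values <- c("240.m.g", "3.mg.kg")
wt_values <- c(">110.kg", ">50-70.kg")

# Turn a set of values into one capture group of alternatives.
# Each literal dot becomes "[.]" so it cannot match other characters.
alt <- function(values) {
  escaped <- gsub(".", "[.]", values, fixed = TRUE)
  paste0("(", paste(escaped, collapse = "|"), ")")
}

# One group per variable, separated by literal dots; the rest is group 3.
final_pattern <- paste0("^", alt(tx_values), "[.]", alt(wt_values), "[.](.*)$")

tx   <- sub(final_pattern, "\\1", test.string)
wt   <- sub(final_pattern, "\\2", test.string)
stat <- sub(final_pattern, "\\3", test.string)
```

Because each variable has exactly one capture group, "\\1", "\\2" and "\\3" refer to the same field for every string, whichever alternative matched.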

Re: [R] element wise pattern recognition and string substitution

2016-09-06 Thread Jun Shen
Hi Jeff, Thanks for the reply. I tried your suggestion and it doesn't seem to work. I then tried a simple pattern, as follows, and it works as expected:
sub("(3\\.mg\\.kg)\\.(>50-70\\.kg)\\.(.*)", '\\1', "3.mg.kg.>50-70.kg.P05")
[1] "3.mg.kg"
sub("(3\\.mg\\.kg)\\.(>50-70\\.kg)\\.(.*)", '\\2',

Re: [R] element wise pattern recognition and string substitution

2016-09-06 Thread Jun Shen
Hi Bert, In the final.pattern, there are ten patterns.
> sub(final.pattern, '\\1', test.string)
Expected results: "240.m.g" "3.mg.kg" "240.m.g"
Current results: "" "" "240.m.g"
> sub(final.pattern, '\\2', test.string)
Expected results: ">110.kg" ">110.kg" ">50-70.kg"
Current results: "" ""

Re: [R] element wise pattern recognition and string substitution

2016-09-06 Thread Jeff Newmiller
I am not near my computer today, but each pair of parentheses gets its own result number, so you should put a single pair of parentheses around the whole pattern of alternatives instead of having many parentheses. I recommend thinking in terms of what common information you expect to find in these various strings,
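The numbering point can be seen in a small sketch. The two patterns below are simplified stand-ins written for illustration (the actual final.pattern is not shown in this listing); the strings come from the test vector quoted in the thread.

```r
x <- c("240.m.g.>110.kg.geo.mean", "3.mg.kg.>110.kg.P05")

# One pair of parentheses per alternative: the second alternative is group 2,
# so "\\1" is empty whenever that alternative is the one that matches.
many_groups <- "(240\\.m\\.g)\\..*|(3\\.mg\\.kg)\\..*"
res_many <- sub(many_groups, "\\1", x)
res_many
# [1] "240.m.g" ""

# One pair of parentheses around all alternatives: always group 1.
one_group <- "(240\\.m\\.g|3\\.mg\\.kg)\\..*"
res_one <- sub(one_group, "\\1", x)
res_one
# [1] "240.m.g" "3.mg.kg"
```

The empty string in the first result is the same symptom as the empty "Current results" reported in the thread.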

Re: [R] element wise pattern recognition and string substitution

2016-09-06 Thread Jun Shen
Hi Bert, I still couldn't get the multiple patterns to work. Here is an example. I built the pattern as follows: final.pattern <-

Re: [R] element wise pattern recognition and string substitution

2016-09-06 Thread Bert Gunter
Just noticed: My clumsy do.call() line in my previously posted code below should be replaced with:
pat <- paste(pat,collapse = "|")
> pat <- c(pat1,pat2)
> paste(pat,collapse="|")
[1] "a+\\.*a+|b+\\.*b+"
replace this ** > pat <- do.call(paste,c(as.list(pat),
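As a small illustration of combining patterns with "|", the sketch below reuses pat1 and pat2 as they can be read off the printed output above; the input vector x is hypothetical and chosen only to show each alternative matching once.

```r
# pat1 and pat2 as implied by the printed result "a+\\.*a+|b+\\.*b+".
pat1 <- "a+\\.*a+"
pat2 <- "b+\\.*b+"
pat  <- paste(c(pat1, pat2), collapse = "|")

# Hypothetical input: the first element matches pat1, the second pat2.
x <- c("xx.aa.a.yy", "zz.bbb.w")

# Extract the first match of the combined pattern from each string.
found <- regmatches(x, regexpr(pat, x))
found
# [1] "aa.a" "bbb"
```

regmatches() silently drops elements with no match, so for real data it may be safer to check regexpr(pat, x) > 0 first.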

Re: [R] element wise pattern recognition and string substitution

2016-09-05 Thread Bert Gunter
Jun: You need to provide a clear specification via regular expressions of the patterns you wish to match -- at least for me to decipher it. Others may be smarter than I, though... Jeff: Thanks. I have now convinced myself that it can be done (a "proof" of sorts): If pat1, pat2,..., patn are m

Re: [R] element wise pattern recognition and string substitution

2016-09-05 Thread Jun Shen
Thanks for the reply, Bert. Your solution solves the example. I actually have a more general situation where I have this dot-concatenated string built from multiple variables. The problem is that those variables may have values with dots in them. The number of dots is not consistent for all values of a

Re: [R] element wise pattern recognition and string substitution

2016-09-05 Thread Jeff Newmiller
I am not the one who proved this... I can only respond to your suggested counterexamples.
--
Sent from my phone. Please excuse my brevity.
On September 5, 2016 9:01:12 AM PDT, Bert Gunter wrote:
>Jeff:
>
>It is not obvious to me that the ability to *match* an arbitrary

Re: [R] element wise pattern recognition and string substitution

2016-09-05 Thread Bert Gunter
Jeff: It is not obvious to me that the ability to *match* an arbitrary pattern (including one of several different ones via "|", per the link you included) implies that sub() and friends can extract it, e.g. via the \\N backreference construct or otherwise. I would appreciate it if you or someone else could

Re: [R] element wise pattern recognition and string substitution

2016-09-05 Thread Jeff Newmiller
Yes, sorry I did not look closer... regex can match any finite language, so there are no data sets you can feed to R that cannot be matched. [1] You may find it hard to see the pattern, or you may want to build the pattern programmatically to alleviate tedium for yourself, but regexes are not

Re: [R] element wise pattern recognition and string substitution

2016-09-04 Thread Bert Gunter
Well, he did provide an example, and...
> z <- c('TX.WT.CUT.mean','mg.tx.cv')
> sub("^.+?\\.(.+)\\.[^.]+$","\\1",z)
[1] "WT.CUT" "tx"
## seems to do what was requested.
Jeff would have to amplify on his initial statement however: do you mean that separate patterns can always be combined via

Re: [R] element wise pattern recognition and string substitution

2016-09-04 Thread Jeff Newmiller
Your opening assertion is false. Provide a reproducible example and someone will demonstrate.
--
Sent from my phone. Please excuse my brevity.
On September 4, 2016 9:06:59 PM PDT, Jun Shen wrote:
>Dear list,
>
>I have a vector of strings that cannot be described by one

[R] element wise pattern recognition and string substitution

2016-09-04 Thread Jun Shen
Dear list, I have a vector of strings that cannot be described by one pattern. So let's say I construct a vector of patterns of the same length as the vector of strings; can I do element-wise pattern recognition and string substitution? For example, pattern1 <-