Ok, we decided to have a shot at modifying gregexpr. Let's see how it
works out. If anybody is interested in discussing this please contact
me. R-help doesn't seem like the right place for further discussion.
Is there a default place for discussing things like that?
Thanks everybody for your re
On Wed, Sep 29, 2010 at 1:58 PM, Michael Bedward
wrote:
> How is your C coding ? Bill ? Anyone else ? I could have a go at
> writing some prototype code to test in the next few days, though if
> someone else with decent C skills is itching to do it please speak up.
We have a skilled C- and R-pr
I'd definitely be a customer for it Titus. And it does seem like an
obvious hole in regex processing in R that cries out to be filled.
Um, ggregexpr isn't the sexiest of function names :) Perhaps we can
think of something a little easier ?
How is your C coding ? Bill ? Anyone else ? I could hav
Bill, Michael,
good to see I'm not the only one who sees potential for improvements
in the regexpr domain. Adding a subpattern argument is certainly a
step in the right direction and would make my life much easier.
However, in my application I need to know not only the position of one
group but a
Ah, that's interesting - thanks Bill. That's certainly on the right
track for me (Titus, you too ?) especially if the subpattern argument
accepted a vector of multiple group indices.
As you say, this is straightforward in C. I'd be happy to (try to)
make a patch for the R sources if there was some
> -----Original Message-----
> From: r-help-boun...@r-project.org
> [mailto:r-help-boun...@r-project.org] On Behalf Of Michael Bedward
> Sent: Tuesday, September 28, 2010 12:46 AM
> To: Titus von der Malsburg
> Cc: r-help@r-project.org
> Subject: Re: [R] Regular expressio
On Tue, Sep 28, 2010 at 6:52 AM, Titus von der Malsburg
wrote:
> On Tue, Sep 28, 2010 at 9:46 AM, Michael Bedward
> wrote:
>> What Titus wants to do is akin to retrieving capturing groups from a
>> Matcher object in Java.
>
> Precisely. Here's the description:
>
> http://download.oracle.com/jav
On Tue, Sep 28, 2010 at 9:46 AM, Michael Bedward
wrote:
> What Titus wants to do is akin to retrieving capturing groups from a
> Matcher object in Java.
Precisely. Here's the description:
http://download.oracle.com/javase/1.4.2/docs/api/java/util/regex/Matcher.html#start(int)
Gabor's lookbe
What Titus wants to do is akin to retrieving capturing groups from a
Matcher object in Java. I also thought there must be an existing,
elegant solution to this some time ago and searched for it, including
looking at the sources (albeit with not much expertise) but came up
blank.
I also looked at t
On Mon, Sep 27, 2010 at 1:34 PM, Titus von der Malsburg
wrote:
> On Mon, Sep 27, 2010 at 7:29 PM, Gabor Grothendieck
> wrote:
>> Try this zero width negative lookahead expression:
>>
>>> gregexpr("(?!a+)(b+)", "abcdaabbc", perl = TRUE)
>> [[1]]
>> [1] 2 7
>> attr(,"match.length")
>> [1] 1 2
>
>
You've tried:
gregexpr("b+", "abcdaabbc")
On Mon, Sep 27, 2010 at 12:48 PM, Titus von der Malsburg wrote:
> Dear list!
>
> > gregexpr("a+(b+)", "abcdaabbc")
> [[1]]
> [1] 1 5
> attr(,"match.length")
> [1] 2 4
>
> What I want is the offsets of the matches for the group (b+), i.e. 2
> and 7, not
You could do this:
gregexpr("ab+", "abcdaabbcbb")[[1]] + 1
On Mon, Sep 27, 2010 at 2:25 PM, Titus von der Malsburg
wrote:
> On Mon, Sep 27, 2010 at 7:16 PM, Henrique Dallazuanna
> wrote:
> > You've tried:
> >
> > gregexpr("b+", "abcdaabbc")
>
> But this would match the third occurrence of b+ in
On Mon, Sep 27, 2010 at 7:29 PM, Gabor Grothendieck
wrote:
> Try this zero width negative lookahead expression:
>
>> gregexpr("(?!a+)(b+)", "abcdaabbc", perl = TRUE)
> [[1]]
> [1] 2 7
> attr(,"match.length")
> [1] 1 2
Thanks Gabor, but this gives me the same result as
gregexpr("b+", "abcdaab
On Mon, Sep 27, 2010 at 11:48 AM, Titus von der Malsburg
wrote:
> Dear list!
>
>> gregexpr("a+(b+)", "abcdaabbc")
> [[1]]
> [1] 1 5
> attr(,"match.length")
> [1] 2 4
>
> What I want is the offsets of the matches for the group (b+), i.e. 2
> and 7, not the offsets of the complete matches. Is there
On Mon, Sep 27, 2010 at 7:16 PM, Henrique Dallazuanna wrote:
> You've tried:
>
> gregexpr("b+", "abcdaabbc")
But this would match the third occurrence of b+ in "abcdaabbcbb". But
in this example I'm only interested in b+ if it's preceded by a+.
Titus
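
For readers following the thread, here is a minimal sketch of the requested behaviour: the offsets of the group (b+) only where it is preceded by a+, obtained in a single pass. It assumes an R version in which gregexpr() with perl = TRUE attaches "capture.start" and "capture.length" attributes to its result; that was not available when this thread was written, so treat it as an illustration rather than something the posters had at hand. The variable names are made up for the example.

```r
# Assumption: gregexpr(perl = TRUE) returns capture.start / capture.length
# attributes (one column per capturing group, one row per match).
text <- "abcdaabbcbb"
m <- gregexpr("a+(b+)", text, perl = TRUE)[[1]]

# Offsets and lengths of the first (and only) capturing group, (b+)
group_start  <- as.vector(attr(m, "capture.start"))
group_length <- as.vector(attr(m, "capture.length"))

group_start   # 2 7
group_length  # 1 2

# The trailing "bb" at position 10 is not reported because no a+ precedes it.
substring(text, group_start, group_start + group_length - 1)  # "b" "bb"
```

With several groups in the pattern, each group would occupy its own column of the attribute matrices, which is close to what Matcher.start(int) offers in Java.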
Thank you Jim, but just as the solution that I discussed, your
proposal involves deconstructing the pattern and searching several
times. I'm looking for a general and efficient solution. Internally,
the regexpr engine has all necessary information after one pass
through the string. What I need i
try this:
> x <- gregexpr("a+(b+)", "abcdaabbcaaacaaab")
> justA <- gregexpr("a+", "abcdaabbcaaacaaab")
> # find matches in 'x' for 'justA'
> indx <- which(justA[[1]] %in% x[[1]])
> # now determine where 'b' starts
> justA[[1]][indx] + attr(justA[[1]], 'match.length')[indx]
[1] 2 7 17
>
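
The two-pass idea above can be wrapped in a small function. This is only a sketch: group_starts() is a hypothetical helper, not part of base R or any package, and it assumes the pattern can be split by hand into a prefix and a group, and that a standalone match of the prefix begins at the same position as the full match.

```r
# Hypothetical helper: start offsets of `group` wherever it directly
# follows a match of `prefix` (two regex passes over `text`).
group_starts <- function(prefix, group, text) {
  full <- gregexpr(paste0(prefix, group), text)[[1]]
  if (full[1] == -1) return(integer(0))      # no match at all
  pre  <- gregexpr(prefix, text)[[1]]
  keep <- which(pre %in% full)               # prefix matches that begin a full match
  # the group starts right after the end of the prefix match
  as.vector(pre[keep] + attr(pre, "match.length")[keep])
}

group_starts("a+", "b+", "abcdaabbcaaacaaab")  # 2 7 17
group_starts("a+", "b+", "abcdaabbcbb")        # 2 7
```

It still requires deconstructing the pattern and searching twice, so it does not address the request for a general one-pass solution.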
On M
Dear list!
> gregexpr("a+(b+)", "abcdaabbc")
[[1]]
[1] 1 5
attr(,"match.length")
[1] 2 4
What I want is the offsets of the matches for the group (b+), i.e. 2
and 7, not the offsets of the complete matches. Is there a way in R
to get that?
I know about gsubfn and strapply, but they only give me