Would someone please explain the usage of @=? I am getting
confused by the help file.

\@=     Matches the preceding atom with zero width. {not in Vi}
        Like "(?=pattern)" in Perl.
        Example                 matches ~
        foo\(bar\)\@=           "foo" in "foobar"
        foo\(bar\)\@=foo        nothing


To me, the second example matches nothing because there is
no foo in between the \( and \)

The first example has me all confused.  If someone can
enlighten me, I would be grateful.

The pattern

        \(...\)\@=

is interpreted as "make sure that this matches here, but don't consume any of the characters, so that whatever comes after the '=' starts matching at the same point as this".

In the first example, as stated it matches the "foo" in "foobar" because the "bar" can be found after the "foo", but it doesn't become part of the match. To see this as you're playing around, it's helpful to have

        :set hls

so you can see what matches.
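
Since the help text compares "\@=" to Perl's "(?=pattern)", the same behaviour can also be checked outside Vim. Here is a minimal sketch in Python, on the assumption that its "(?=...)" behaves like Vim's "\@=" for a case this simple (the strings are just the ones from the help example plus a made-up "foobaz"):

```python
import re

# Python's (?=bar) plays the role of Vim's \(bar\)\@= here.
m = re.search(r"foo(?=bar)", "foobar")

# The lookahead checks for "bar" but does not consume it,
# so the match covers only the first three characters.
print(m.group(0))   # foo
print(m.span())     # (0, 3)

# Without "bar" right after "foo", there is no match at all.
no_match = re.search(r"foo(?=bar)", "foobaz")
print(no_match)     # None
```

The span is what ":set hls" shows you visually in Vim: only "foo" is highlighted, even though "bar" had to be there.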

In the second example, the regexp is asking for two disjoint things: "foo" followed by "bar" and also followed by a second "foo". It might be more clear if "foo" wasn't used twice:

        /foo\(bar\)\@=fred

This would match nothing as well, as it asks for "foo" followed immediately by "bar" as well as "foo" followed immediately by "fred".
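
The contradiction is easy to confirm with the Perl-style equivalent. A small sketch in Python, again assuming "(?=...)" stands in for "\@=" (the test strings are made up for illustration):

```python
import re

# Perl-style equivalent of "foo" followed by a zero-width "bar",
# then a literal "fred" starting at that same position.
pattern = re.compile(r"foo(?=bar)fred")

# After "foo", the lookahead demands "bar" exactly where the rest of
# the pattern demands "fred"; no text can satisfy both at once.
texts = ["foobar", "foofred", "foobarfred"]
results = [pattern.search(t) for t in texts]

for text, result in zip(texts, results):
    print(text, result)   # None every time
```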

For most uses, this isn't very helpful and can be more clearly expressed as

        /foo\zebar

where the "\ze" means "and I want the pattern to stop matching here".

I can concoct crazy uses for the "\@=" where it might be useful but most of them are refactorable:

        /foo\([[:print:]]\+\)\@=[a-z]

could become

        /foo[a-z]\ze[[:print:]]*

One could also use it for crazy filtering:

        /foo\(\%(.[aeiou]\)\{5}\)\@=.\{3}a

This would ensure that you have five pairs of "any character (the ".") followed by a vowel" following "foo", and that the 4th character following "foo" is an "a". The above could be written without using "\@=", with \w in place of the ".", as something like

        /foo\w[aeiou]\wa\w[aeiou]\w[aeiou]\w[aeiou]

Readability is in the eye of the beholder. :) With 2 characters times 5 instances plus 3+1+6, they balance out to about the same. As those numbers get larger, using the \@= notation might prove more helpful.
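
To compare the two styles side by side, here is a rough sketch in Python using Perl-style lookahead. It assumes "." for the first character of each pair in both forms, and the sample strings are invented:

```python
import re

# Lookahead form: five (any char, vowel) pairs must follow "foo",
# and the 4th character after "foo" must be "a".
lookahead = re.compile(r"foo(?=(?:.[aeiou]){5}).{3}a")

# Spelled-out form, with the "a" written into the second pair.
explicit = re.compile(r"foo.[aeiou].a.[aeiou].[aeiou].[aeiou]")

samples = [
    "foobecadifogu",   # five pairs, 4th character is "a"
    "foobecedifogu",   # five pairs, 4th character is "e"
    "foobecadifog",    # only four complete pairs
]
for s in samples:
    print(s, bool(lookahead.search(s)), bool(explicit.search(s)))
```

The two forms agree on which strings match, but not on how much gets matched: the lookahead form stops after the "a" (here "foobeca"), while the spelled-out form consumes all five pairs.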

This allows you to do some pattern intersection (in the set-theory sense of "intersection"), which might let you shorten the pattern if you have long stretches of things. It might be helpful in DNA sequencing or something of the like, where one is hunting for certain patterns of A/C/G/T and wants to ensure that a certain repeating pattern exists, and that at a certain point in that pattern a given item is more constrained. One might have an alternating sequence where you know you want something like "agct" followed by at least 75 alternating pairs:

        /agct\(\%([at][cg]\)\{75,}\)\@=

You can then tack on "but position 28 through 30 must be 'gag'" (I might be off-by-one here)

        /agct\(\%([at][cg]\)\{75,}\)\@=\(.\{27}gag\)\@=

The result will only be the "agct", but it will be followed by the context you need, as there might be many other instances of "agct" that you don't care about because they lack this context.
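
The stacked-lookahead idea can be tried out with a Perl-style engine too. This is only a sketch in Python, with "(?=...)" assumed to stand in for "\@=" and with made-up sequence data (75 "ac" pairs, optionally with "gag" patched in):

```python
import re

# Two lookaheads anchored at the same spot: 75+ alternating pairs,
# and "gag" at positions 28 through 30 after "agct".
pattern = re.compile(r"agct(?=(?:[at][cg]){75,})(?=.{27}gag)")

# Hypothetical data: 75 "ac" pairs following "agct".
tail = list("ac" * 75)
plain = "agct" + "".join(tail)

# 0-based indexes 27..29 are positions 28..30 counted from 1.
tail[27:30] = "gag"
patched = "agct" + "".join(tail)

print(pattern.search(plain))              # None
print(pattern.search(patched).group(0))   # agct
```

Skipping 27 characters puts "gag" on positions 28 through 30, and in both cases the match itself is never longer than the "agct".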

(the genetics example chosen as I've seen a couple genetics-searching related questions on the list)

As cautioned, they're fairly contrived instances, but I hope the above ramblings shed more light than they bewilder, and that using ":set hls" helps see what's considered when using the "\@=".

-tim



