Would someone please explain the usage of @=. I am getting
confused by the help file.
\@=     Matches the preceding atom with zero width. {not in Vi}
        Like "(?=pattern)" in Perl.
        Example                 matches ~
        foo\(bar\)\@=           "foo" in "foobar"
        foo\(bar\)\@=foo        nothing
To me, the second example matches nothing because there is
no foo in between the \( and \)
The first example has me all confused. If someone can
enlighten me, I would be grateful.
The pattern
\(...\)\@=
is interpreted as "make sure that this matches here, but don't
consume any of the characters so that things after the '=' begin
at the same point as this".
In the first example, as stated it matches the "foo" in "foobar"
because the "bar" can be found after the "foo", but it doesn't
become part of the match. To see this as you're playing around,
it's helpful to have
:set hls
so you can see what matches.
In the second example, the regexp is asking for two disjoint
things: "foo" followed by "bar" and also followed by a second
"foo". It might be more clear if "foo" wasn't used twice:
/foo\(bar\)\@=fred
This would match nothing as well, as it asks for "foo" followed
immediately by "bar" as well as "foo" followed immediately by "fred".
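If you want to poke at this outside Vim, the same examples can be sketched with the Perl-style lookahead that the help text compares "\@=" to. This is only an illustration using Python's re module, not Vim syntax: "(?=bar)" stands in for "\(bar\)\@=", and the helper name show_match is made up for the demonstration.

```python
import re

def show_match(pattern, text):
    """Return the matched text, or None when the pattern matches nothing."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# Vim: foo\(bar\)\@=      -> "foo", but only where "bar" follows it
first = show_match(r"foo(?=bar)", "foobar")

# Vim: foo\(bar\)\@=foo   -> the text after "foo" would have to start
# with both "bar" and "foo", which cannot happen
second = show_match(r"foo(?=bar)foo", "foobarfoo")

# Vim: foo\(bar\)\@=fred  -> the same contradiction with a different word
third = show_match(r"foo(?=bar)fred", "foobarfred foofred")

print(first, second, third)
```

Only the first pattern finds anything, and what it finds is just "foo": the "bar" is checked but never becomes part of the match.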
For most uses, this isn't very helpful and can be more clearly
expressed as
/foo\zebar
where the "\ze" means "and I want the pattern to stop matching here".
I can concoct crazy uses for the "\@=" where it might be useful
but most of them are refactorable:
/foo\([[:print:]]\+\)\@=[a-z]
could become
/foo[a-z]\ze[[:print:]]*
One could also use it for crazy filtering:
/foo\(\%(.[aeiou]\)\{5}\)\@=...a
This would ensure that five pairs of "a character followed by a
vowel" follow "foo", and that the 4th character after "foo" is
an "a". Restricting each pair to a word character (\w) plus a
vowel, the above could be written without using "\@=" as
something like
/foo\w[aeiou]\wa\w[aeiou]\w[aeiou]\w[aeiou]
Readability is in the eye of the beholder. :) With 2 characters
times 5 instances plus 3+1+6, they balance out to about the same.
As those numbers get larger, using the \@= notation might prove
more helpful.
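To convince yourself that the lookahead form and the spelled-out form accept the same text, here is a rough check using Python's re module, with "(?=...)" standing in for Vim's "\@=". The sample strings are invented for the test, and both patterns use \w for the first character of each pair so that they are directly comparable.

```python
import re

# Lookahead form: five "\w + vowel" pairs must follow "foo",
# and the 4th character after "foo" must be "a".
lookahead = re.compile(r"foo(?=(?:\w[aeiou]){5})...a")

# The same constraints written out pair by pair.
spelled_out = re.compile(r"foo\w[aeiou]\wa\w[aeiou]\w[aeiou]\w[aeiou]")

samples = [
    "foobecaditove",  # five pairs, 4th character is "a"
    "foobecoditove",  # five pairs, but 4th character is "o"
    "foobecadxtove",  # 4th character is "a", but the third pair is broken
]

# For each sample: (does the lookahead form match?, does the long form match?)
results = [
    (bool(lookahead.search(s)), bool(spelled_out.search(s))) for s in samples
]
print(results)
```

The two forms agree on whether a line matches, but not on how much they highlight: the lookahead form stops after the "a", while the spelled-out form swallows all ten characters after "foo".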
This allows you to do some pattern intersection (in the
set-theory definition of "intersection") which might allow you to
shorten the pattern if you have long stretches of things. It
might be helpful in DNA sequencing or something of the like,
where one is hunting for certain patterns of A/C/G/T and wants to
ensure that a certain repeating pattern exists, and then at a
certain point in that pattern a given item is more constrained.
One might have an alternating sequence where you know you want
something like "agct" followed by at least 75 alternating pairs:
/agct\(\%([at][cg]\)\{75,}\)\@=
You can then tack on "but position 28 through 30 must be 'gag'"
(I might be off-by-one here)
/agct\(\%([at][cg]\)\{75,}\)\@=\(.\{27}gag\)\@=
The result will only be the "agct", but it will be followed by
the context you need, as there might be many other instances of
"agct" that you don't care about because they lack this context.
(The genetics example was chosen because I've seen a couple of
genetics-searching questions on the list.)
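A quick way to sanity-check the two stacked conditions is to build a made-up sequence and run the Perl-style equivalent over it. The sketch below uses Python's re module, with "(?=...)" in place of Vim's "\@="; the test data and the helper name make_tail are invented for illustration and are not real sequence data.

```python
import re

# Vim: /agct\(\%([at][cg]\)\{75,}\)\@=\(.\{27}gag\)\@=
# Both lookaheads start right after "agct", so both must hold there.
pattern = re.compile(r"agct(?=(?:[at][cg]){75,})(?=.{27}gag)")

def make_tail(with_gag):
    """Build 75 alternating pairs, optionally forcing 'gag' at chars 28-30."""
    tail = list("ac" * 75)
    if with_gag:
        tail[27:30] = "gag"  # 0-based 27..29 are the 28th to 30th characters
    return "".join(tail)

bad = "agct" + make_tail(False)   # right alternation, no "gag"
good = "agct" + make_tail(True)   # right alternation plus "gag"
text = bad + "nn" + good

m = pattern.search(text)
print(m.group(0), m.start())
```

The first "agct" is skipped because it lacks the "gag", and the match that is found covers only the four characters of the second "agct". With ".{27}" in front of it, "gag" lands on characters 28 through 30 after "agct".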
As cautioned, they're fairly contrived instances, but I hope the
above ramblings shed more light than they bewilder, and that
using ":set hls" helps see what's considered when using the "\@=".
-tim