On Tue, May 02, 2006 at 08:27:49PM +0300, Yakov Lerner wrote:
BTW, can anyone explain why this pattern does *not* work, i.e. why it
does not match only words that do not end with 'ion':
/\i\+\(ion\)\@!/
I thought this pattern would match words not ending with
'ion'. But it matches all words, including words ending
with 'ion'. Why?
The docs in :help /\@! answer this query pretty well. The issue is that
there are many positions at which 'ion' does not match, and \@! succeeds
at every one of them.
Take the word zion.
\i\+ can match either z or zi or zio or zion. It is greedy so it will first
attempt to match zion.
Now, the \(ion\)\@! is applied. The current match position is just before
the EOL and EOL != ion so the entire pattern matches.
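
To make the zion walk-through above concrete, here is a rough sketch in
Python rather than Vim. It assumes that Python's \w is a close enough
stand-in for Vim's \i (which really depends on the 'isident' option) and
that (?!ion) plays the role of \(ion\)\@!. The helper names are made up for
illustration only.

```python
import re

# Python analogue of the Vim pattern \i\+\(ion\)\@!
# Assumption: \w approximates Vim's \i, and (?!ion) stands in for \(ion\)\@!.
GREEDY_THEN_NOT_ION = re.compile(r"\w+(?!ion)")


def lookahead_ok_positions(word):
    """Return the offsets in `word` at which the following text is not 'ion'.

    These are the positions where the negative look-ahead would succeed.
    """
    return [pos for pos in range(1, len(word) + 1)
            if not word.startswith("ion", pos)]


def first_match(word):
    """Return what the greedy pattern matches in `word`, or None."""
    m = GREEDY_THEN_NOT_ION.search(word)
    return m.group(0) if m else None


if __name__ == "__main__":
    # Offsets after "zi", "zio" and "zion" all satisfy the look-ahead.
    print(lookahead_ok_positions("zion"))
    # The greedy part takes the whole word, and nothing follows it,
    # so the look-ahead succeeds at the very first attempt.
    print(first_match("zion"))
```

The only offset in zion where the look-ahead fails is right after the z,
which the greedy match never needs to visit.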
On Tue, May 02, 2006 at 02:03:47PM -0400, James Vega wrote:
/\i\+\(ion\)[EMAIL PROTECTED]>
This one is trying to get past the match position limitation in the wrong
way. Any time you try to describe what should be in the place of a
zero-width match, you must be extremely careful that you are only dealing
with terms that are important to you. In this case, the \i\{3\} isn't
important to you and it causes problems. This isn't a hard and fast rule,
but it can be a useful guideline.
"Matthew Winn" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
That won't match words of fewer than four characters. To match all
words that don't end in "ion" it's better to do:
/\<\(\i*\(ion\)[EMAIL PROTECTED]|\i\i\=\)\>
^^^^ ^ ^^^^^^^^^^
This one attempts to correct the problems of the \i\{3\} by adding even more
inspection of tokens that aren't important to you. It is a good example of
why it is a bad idea to try to dictate what exists in the place of a
zero-width match. :)
On Tue, May 02, 2006 at 08:27:49PM +0300, Yakov Lerner wrote:
Pattern
/\i\+\(ion\)\@<!\>/
matches words that do not end with 'ion'
This one is the best way of solving the presented problem (words not ending
in 'ion'). To break it down:
1. Match as many identifier chars as possible ( \i\+ )
2. Make sure the last three characters behind the current match point
are not 'ion' ( \(ion\)\@<! )
3. Make sure the current match point is a word boundary ( \> )
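This regex first consumes the whole word. For a word that does end in
'ion', the look-behind fails at that point, so the engine backs up and
tries shorter matches for \i\+, but every one of those fails the \>
requirement because the match point is then in the middle of the word.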
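
The breakdown above can be sketched in Python as well. Again this is only an
analogue under assumptions: \w stands in for \i, (?<!ion) for \(ion\)\@<!,
and \b for \> (Python's \b matches at both ends of a word, which makes no
difference here because it follows \w+). The function names are hypothetical,
and the trace only looks at attempts that start at the beginning of the word.

```python
import re

# Python analogue of the Vim pattern \i\+\(ion\)\@<!\>
# Assumption: \w ~ \i, (?<!ion) ~ \(ion\)\@<!, \b ~ \> in this position.
NOT_ENDING_IN_ION = re.compile(r"\w+(?<!ion)\b")


def words_not_ending_in_ion(text):
    """Return every match of the pattern in `text`, left to right."""
    return NOT_ENDING_IN_ION.findall(text)


def trace_backtracking(word):
    """Show what happens at each candidate end of the greedy part.

    `word` is assumed to be a single word. Candidates are listed longest
    first, as (matched text, look-behind ok, at end of word).
    """
    steps = []
    for end in range(len(word), 0, -1):
        candidate = word[:end]
        behind_ok = not candidate.endswith("ion")
        at_word_end = end == len(word)
        steps.append((candidate, behind_ok, at_word_end))
    return steps


if __name__ == "__main__":
    print(words_not_ending_in_ion("zion nation cat lion on vim"))
    for step in trace_backtracking("zion"):
        print(step)
```

In the trace for zion no candidate passes both checks: the full word fails
the look-behind, and every shorter candidate stops in the middle of the
word, so the word is rejected.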