On Tue, May 02, 2006 at 08:27:49PM +0300, Yakov Lerner wrote:
BTW, can anyone explain why this pattern does *not*
work, does not match words that do not end with 'ion' :
   /\i\+\(ion\)\@!/
I thought this pattern would match words not ending with
'ion'. But it matches all words, including words ending
with 'ion'. Why ?

The docs in :help /\@! answer this query pretty well. The issue is that there are many places where the pattern inside \@! (here, ion) doesn't match, and \@! succeeds at every one of them.
Take the word zion.
\i\+ can match z, zi, zio, or zion. It is greedy, so it first attempts to match zion. Then \(ion\)\@! is applied. The current match position is now at the end of the word, just before the EOL, and EOL != ion, so the entire pattern matches.
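
For anyone who wants to try this outside Vim, here is a minimal sketch of the same behaviour using Python's re module. It is only an analogue: \w stands in for Vim's \i and (?!ion) for \(ion\)\@!, and the helper name first_match is made up for illustration.

```python
import re

# Rough Python analogue of /\i\+\(ion\)\@!/
# Assumption: \w stands in for Vim's \i, (?!ion) for \(ion\)\@!
lookahead = re.compile(r"\w+(?!ion)")

def first_match(text):
    """Return the first matched substring, or None if nothing matches."""
    m = lookahead.search(text)
    return m.group(0) if m else None

# Greedy \w+ takes the whole word first; what follows the word is not
# "ion", so the negative lookahead succeeds and the word is matched.
print(first_match("zion"))
print(first_match("nation state"))
```

Both calls return the full word ending in "ion", which is the behaviour Yakov observed.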

On Tue, May 02, 2006 at 02:03:47PM -0400, James Vega wrote:
   /\i\+\(ion\)\@!\i\{3\}\>
This one is trying to get past the match position limitation in the wrong way. Any time you try to describe what should be in the place of a zero-width match, you must be extremely careful that you are only dealing with terms that are important to you. In this case, the \i\{3\} isn't important to you and it causes problems. This isn't a hard-and-fast rule, but it can be a useful guideline.
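
A small sketch of one such problem, again as a Python analogue rather than Vim itself. The pattern below is an assumed translation of the "lookahead followed by \i\{3\} and \>" idea (\w for \i, \b for \>), and the helper name matches is hypothetical.

```python
import re

# Assumed analogue of a negative lookahead followed by \i\{3\} and \>
# (\w replaces \i, \b replaces \>).
fixed_width = re.compile(r"\w+(?!ion)\w{3}\b")

def matches(word):
    """True if the pattern matches anywhere in the given word."""
    return fixed_width.search(word) is not None

for word in ("zion", "word", "cat"):
    print(word, matches(word))
```

Here "zion" is rejected and "word" is accepted, as intended, but "cat" is rejected too: \w+ needs at least one character and \w{3} needs three more, so nothing shorter than four characters can match.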


"Matthew Winn" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
That won't match words of fewer than four characters.  To match all
words that don't end in "ion" it's better to do:

    /\<\(\i*\(ion\)\@!\i\{3\}\|\i\i\=\)\>
     ^^^^  ^                 ^^^^^^^^^^
This one attempts to correct the problems of the \i\{3\} by adding even more inspection of tokens that aren't important to you. This is an example that proves why it is bad to try to dictate what exists in the place of a zero width match. :)

On Tue, May 02, 2006 at 08:27:49PM +0300, Yakov Lerner wrote:
Pattern
   /\i\+\(ion\)\@<!\>/
matches words that do not end with 'ion'

This one is the best way of solving the presented problem (words not ending in 'ion'). To break it down:
   1. Match as many identifier chars as possible ( \i\+ )
   2. Make sure the last three characters behind the current match point are not 'ion' ( \(ion\)\@<! )
   3. Make sure the current match point is a word boundary ( \> )
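This regex will consume the whole word first. For a word that does end in ion, the lookbehind fails at that point, so the engine backs up to shorter matches; those fail as well because of the \> requirement, so the word is not matched at all.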
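
The three steps above can be sketched in Python as well. This is an analogue under stated assumptions, not the Vim engine: \w stands in for \i, (?<!ion) for \(ion\)\@<!, and \b for \>. The function name words_not_ending_in_ion is made up for this example.

```python
import re

# Python analogue of /\i\+\(ion\)\@<!\>/
# Assumptions: \w for \i, (?<!ion) for \(ion\)\@<!, \b for \>
not_ion = re.compile(r"\w+(?<!ion)\b")

def words_not_ending_in_ion(text):
    """Return every match of the lookbehind pattern in text, in order."""
    return not_ion.findall(text)

# Words ending in "ion" are skipped entirely; short words still match.
print(words_not_ending_in_ion("zion nation cat a word"))
```

The lookbehind only inspects the three characters that matter, so no extra width requirement is placed on the word.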
