Re: [pcre-dev] Partial match at end of subject

ph10 Sun, 14 Jul 2019 09:56:05 -0700

On Sat, 13 Jul 2019, ND via Pcre-dev wrote:

> At its core \z is a positive lookahead assertion that wants to inspect the
> next character of the subject.


I must admit I had not thought of it like that. I considered it just to 
be "are we at the end of the subject?".

> I propose the following algorithm (for PARTIAL_HARD only, disregarding the
> existence of PARTIAL_SOFT):
> 
> . Are we at the end of the subject?   If no, backtrack
> . Is partial hard matching allowed?   If no, continue matching
> . Have we inspected any characters?   If yes, return a partial match
>   Else return "no match"
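
For reference, the three steps quoted above can be written down as a small
decision function. This is only a sketch of the proposal as I read it, not
code from the PCRE2 sources: the enum, the function name and the parameters
are all hypothetical names chosen for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcomes of one \z test; these are NOT PCRE2 identifiers. */
typedef enum {
    Z_BACKTRACK,  /* not at end of subject: \z fails here, backtrack */
    Z_CONTINUE,   /* at end, partial hard not requested: keep matching */
    Z_PARTIAL,    /* at end, partial hard, characters inspected */
    Z_NO_MATCH    /* at end, partial hard, nothing inspected */
} z_outcome;

/* Sketch of the proposed decision for \z (PARTIAL_HARD only).
 * pos           - current offset in the subject
 * subject_len   - length of the subject
 * partial_hard  - whether hard partial matching was requested
 * inspected_any - whether this match attempt has inspected any characters
 */
static z_outcome check_end_of_subject(size_t pos, size_t subject_len,
                                      bool partial_hard, bool inspected_any)
{
    if (pos != subject_len)
        return Z_BACKTRACK;            /* step 1 */
    if (!partial_hard)
        return Z_CONTINUE;             /* step 2 */
    return inspected_any ? Z_PARTIAL   /* step 3 */
                         : Z_NO_MATCH;
}
```

With these assumed names, check_end_of_subject(3, 3, true, false) gives
Z_NO_MATCH, which corresponds to the end-of-subject, nothing-inspected case.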

I have been experimenting with trying this out. It "fixes" your first 
example:

/\z/
abc\=ph
No match

Your third example is not a partial matching situation:

/c*/aftertext
ab\=ph 
 0: 
 0+ ab
 
This has found a complete (empty) match right at the start of the subject, 
without reaching the end of the subject. However, when matching starts at 
the end of the subject, the result changes:

/c*/aftertext
ab\=ph,offset=2  
No match
         
Before this change, it would have given a complete match.

Your second example still gives a full match.

/(?!\C)/aftertext
ab\=ph
 0: 
 0+ 
     
The reason is that the end-of-subject test happens inside the negative
assertion, so a "no match" result there means "assertion is true".
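
The inversion can be shown with a toy model. This is a deliberately
simplified sketch, not how the PCRE2 interpreter is written: the helper
names are hypothetical, and \C is modelled as "there is a code unit at this
offset".

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for \C: succeeds only if a code unit exists at pos. */
static bool inner_matches_any_unit(size_t pos, size_t len)
{
    return pos < len;
}

/* A negative lookahead is true exactly when its body fails to match. */
static bool negative_lookahead_holds(size_t pos, size_t len)
{
    return !inner_matches_any_unit(pos, len);
}

/* Scan start offsets left to right and return the first one at which
 * (?!\C) holds. In this model that is always the end of the subject. */
static size_t first_match_offset(const char *subject)
{
    size_t len = strlen(subject);
    for (size_t pos = 0; pos <= len; pos++) {
        if (negative_lookahead_holds(pos, len))
            return pos;
    }
    return len; /* not reached: the assertion always holds at pos == len */
}
```

In this model first_match_offset("ab") is 2: the body fails only at the end
of the subject, the failure is turned into success by the assertion, and the
result is an empty full match with nothing after it.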

I am still not entirely convinced this change should be made. Zoltán, 
what do you think? It would involve making changes to JIT, of course.

Philip

-- 
Philip Hazel
-- 
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev
