Re: [pcre-dev] Limiting the Unicode validity check to the matched-over substring?

2015-08-18 Thread ph10
On Sun, 16 Aug 2015, Giuseppe D'Angelo wrote:

> My idea was that if the lookbehind amount is known at compile time
> (and it /should/ be, since lookbehinds are anyhow fixed-length; plus
> things like \b which need to inspect a fixed amount of data), then the
> check could be limited to the range
> 
> [ max(offset - lookbehind_length, 0) , length )
> 
> instead of spanning the entire subject string.

Yes, but not quite. The maximum lookbehind length is known, but it is
measured in *characters*, not in code units. My idea is to step backwards
from the start offset over the maximum lookbehind number of characters,
without trying to check them, and then do a forwards check from there to
the end.
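
To illustrate the idea for the UTF-8 case, here is a minimal sketch in
Python. It is not the committed PCRE2 code. The function names are
hypothetical, and Python's strict UTF-8 decoder stands in for the
library's own validity check. Stepping back one character means skipping
any continuation bytes (10xxxxxx) until a lead byte is reached, with no
validation done along the way.

```python
def check_start(subject: bytes, offset: int, max_lookbehind: int) -> int:
    """Return the byte index at which the forward validity check begins.

    Steps back at most max_lookbehind characters from offset. The bytes
    passed over are not validated here; they are only classified as
    continuation bytes or not.
    """
    pos = offset
    for _ in range(max_lookbehind):
        if pos == 0:
            break
        pos -= 1
        # Skip UTF-8 continuation bytes (10xxxxxx) to reach a lead byte.
        while pos > 0 and (subject[pos] & 0xC0) == 0x80:
            pos -= 1
    return pos


def is_valid_from(subject: bytes, offset: int, max_lookbehind: int) -> bool:
    """Validate only the part of subject that a match could inspect."""
    start = check_start(subject, offset, max_lookbehind)
    try:
        # Assumption: Python's strict decoder as a stand-in validity check.
        subject[start:].decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# An invalid byte at index 0, then "ab", a two-byte character, then "cd".
subject = b"\xffab" + "\u00e9".encode("utf-8") + b"cd"
print(check_start(subject, 5, 1))     # start of the two-byte character
print(is_valid_from(subject, 5, 1))   # invalid byte is outside the range
print(is_valid_from(subject, 5, 10))  # range now reaches the invalid byte
```

With a lookbehind of one character the check starts at the lead byte of
the two-byte character, so the invalid byte at the front of the subject
is never examined. With a larger lookbehind the range reaches back to
index 0 and the check fails.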

I have now implemented this in PCRE2 and committed the code. It was a
bit more fiddly than expected, and (as always) sorting out some tests 
and updating the documentation took almost as long as working on the 
code.

Philip

-- 
Philip Hazel

-- 
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev 

[pcre-dev] [Bug 1669] Numeric categories missing matches

2015-08-18 Thread admin
https://bugs.exim.org/show_bug.cgi?id=1669

Hermann Zahnweh changed:

          What|Removed |Added
--------------------------------
        Status|NEW     |RESOLVED
    Resolution|---     |INVALID

--- Comment #2 from Hermann Zahnweh ---
Thank you, my bad! -u resolves the issue.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

[pcre-dev] [Bug 1670] --color produces invalid UTF-8 for property matches

2015-08-18 Thread admin
https://bugs.exim.org/show_bug.cgi?id=1670

Hermann Zahnweh changed:

          What|Removed |Added
--------------------------------
        Status|NEW     |RESOLVED
    Resolution|---     |INVALID

--- Comment #2 from Hermann Zahnweh ---
D’oh, I didn’t realize I needed -u. Thanks! I was never aware of that
switch, probably because I expected Unicode to be the default, and I mostly
specified patterns verbatim, so I guess byte-by-byte matching would have
worked there. With -u, everything works as expected. Though I guess I will have
to ping some people who use PCRE as a library.
