Branch: refs/heads/main
  Home:   https://github.com/WebKit/WebKit
  Commit: 1e14cbbdc2f52569760d7b86f3b0cec9709207ce
      
https://github.com/WebKit/WebKit/commit/1e14cbbdc2f52569760d7b86f3b0cec9709207ce
  Author: Michael Saboff <[email protected]>
  Date:   2025-03-12 (Wed, 12 Mar 2025)

  Changed paths:
    A JSTests/stress/regexp-multiple-char-matching.js
    M Source/JavaScriptCore/yarr/YarrJIT.cpp

  Log Message:
  -----------
  [Yarr] Improve processing of adjacent or near adjacent single characters
https://bugs.webkit.org/show_bug.cgi?id=289567
rdar://problem/146795365

Reviewed by Yusuke Suzuki.

Updated the multi-character load, compare and branch code in generatePatternCharacterOnce() to
consider all single character and single-width character class terms as a group. Because these
terms may appear out of order due to other optimizations, we sort them by their character
position in the alternation. We then take the leading contiguous terms together, stopping at the
first term that is not a single-width character or character class, or at the first character
position that has no term (as can happen for a non-BMP code point).
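
The grouping step described above can be sketched roughly as follows. This is an
illustration only, not the code in YarrJIT.cpp: the Term struct, the TermKind enum and
the leadingGroup() helper are hypothetical names invented for this sketch, and the real
implementation works on Yarr's own term representation.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a pattern term (not Yarr's real type).
enum class TermKind { SingleChar, SingleWidthClass, Other };

struct Term {
    TermKind kind;
    std::size_t position; // character offset of the term within the alternative
};

// Sort the terms by character position, then keep the leading run of
// single-width terms whose positions are consecutive.
std::vector<Term> leadingGroup(std::vector<Term> terms)
{
    std::stable_sort(terms.begin(), terms.end(),
        [](const Term& a, const Term& b) { return a.position < b.position; });

    std::vector<Term> group;
    if (terms.empty())
        return group;

    std::size_t expected = terms.front().position;
    for (const Term& term : terms) {
        // Stop at a term that is not single width, or at a gap in positions.
        if (term.kind == TermKind::Other || term.position != expected)
            break;
        group.push_back(term);
        ++expected;
    }
    return group;
}
```

For terms at positions 2, 0, 1 and 4, this sketch groups positions 0 through 2 and
stops at the gap before position 4.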

We then create the widest possible load, compare and branch sequence, which may include masking
out the positions of character class terms that will be handled separately. In some cases, we
perform overlapping loads. For example, if we have 7 characters to process, we can perform two
4-byte loads and compares that overlap at the middle character.
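
As a rough illustration of the overlapping-load and masking ideas, here is a sketch in
plain C++ rather than JIT-emitted code. It assumes 8-bit characters and compares
against a pattern buffer instead of an immediate constant; load32(), matches7() and
matches4Masked() are hypothetical helpers written for this example and do not appear
in YarrJIT.cpp.

```cpp
#include <cstdint>
#include <cstring>

// Load 4 bytes without assuming any alignment.
static std::uint32_t load32(const char* p)
{
    std::uint32_t value;
    std::memcpy(&value, p, sizeof(value));
    return value;
}

// Compare 7 characters with two 4-byte loads. The first load covers bytes
// 0..3 and the second covers bytes 3..6, so byte 3 is checked twice.
// Both buffers must have at least 7 readable bytes.
bool matches7(const char* input, const char* pattern)
{
    if (load32(input) != load32(pattern))
        return false;
    return load32(input + 3) == load32(pattern + 3);
}

// Compare 4 characters while ignoring some positions. maskBytes holds 0xFF
// for positions that must match exactly and 0x00 for positions (such as
// character class terms) that are checked separately. Building the mask
// from bytes keeps the sketch independent of endianness.
bool matches4Masked(const char* input, const char* pattern,
                    const unsigned char maskBytes[4])
{
    std::uint32_t mask;
    std::memcpy(&mask, maskBytes, sizeof(mask));
    return (load32(input) & mask) == (load32(pattern) & mask);
}
```

With this approach, 7 characters need two compare-and-branch steps rather than a
4-byte, a 2-byte and a 1-byte step.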

Overall, this reduces the number of load, compare and branch sequences. This improvement
significantly speeds up the processing of RegExps that contain strings, or strings with embedded
single-width character classes. The SunSpider regexp-dna subtest is a good example of the
benefits of this improvement:

                         Baseline                 GroupChars

regexp-dna            4.1290+-0.1258     ^      2.8660+-0.1270        ^ definitely 1.4407x faster

<arithmetic>          4.1290+-0.1258     ^      2.8660+-0.1270        ^ definitely 1.4407x faster

Added tests that exercise the new code.

* JSTests/stress/regexp-multiple-char-matching.js: Added.
(arrayToString):
(objectToString):
(dumpValue):
(compareArray):
(compareGroups):
(testRegExp):
(testRegExpSyntaxError):
(printErrors):
(let.re.break.case.catch.continue.debugger.default.else.finally.if):
(RegExpTest):
(RegExpTest.prototype.runTest):
(RegExpTestList):
(RegExpTestList.prototype.addTest):
(RegExpTestList.prototype.runTests):
* Source/JavaScriptCore/yarr/YarrJIT.cpp:

Canonical link: https://commits.webkit.org/292003@main


