Branch: refs/heads/main
Home: https://github.com/WebKit/WebKit
Commit: 7f53c2a455abfe907a1fdeae7edf579e04e519cf
https://github.com/WebKit/WebKit/commit/7f53c2a455abfe907a1fdeae7edf579e04e519cf
Author: Sosuke Suzuki <[email protected]>
Date: 2026-02-16 (Mon, 16 Feb 2026)
Changed paths:
A JSTests/stress/yarr-word-boundary-surrogate-pair.js
M Source/JavaScriptCore/yarr/YarrInterpreter.cpp
Log Message:
-----------
[YARR] Fix word boundary assertion corrupting position with surrogate pairs
in bytecode interpreter
https://bugs.webkit.org/show_bug.cgi?id=307962
Reviewed by Yusuke Suzuki.
matchAssertionWordBoundary() used readChecked() to test whether the adjacent
characters are word characters. readChecked() has a side effect: when it
reads a surrogate pair, it calls next(), which advances pos by one. This
corrupts the input position for subsequent matching operations.
For example, /\B./u.exec("\u{10000}\u{10000}") incorrectly returned null:
after \B read U+10000 (encoded as a surrogate pair), pos had advanced, so
the dot read from the trail surrogate position and got errorCodePoint.
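The failing case can be written as a small regression check. This is a sketch only, not the contents of the added test file: the shouldBe helper below is a hypothetical stand-in that borrows nothing but its name from the change log, and the expected values are derived from the regex semantics (U+10000 is not a word character, so \B holds at index 0 and the dot should consume the whole code point).

```javascript
// Hypothetical helper: only the name comes from the change log; the body is
// an assumption made for this sketch.
function shouldBe(actual, expected) {
    if (actual !== expected)
        throw new Error(`bad value: ${String(actual)}, expected: ${String(expected)}`);
}

// Two copies of U+10000, each stored as a surrogate pair (two UTF-16 code units).
const input = "\u{10000}\u{10000}";
const match = /\B./u.exec(input);

// With the position left intact by \B, the dot matches the first code point.
shouldBe(match !== null, true);
shouldBe(match.index, 0);
shouldBe(match[0], "\u{10000}");
shouldBe(match[0].length, 2); // one code point, two code units
```

To exercise the code path this change touches, a check like this would presumably need to run with the bytecode interpreter selected (--useRegExpJIT=false), since the log states that the JIT is unaffected.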
The fix replaces readChecked() with readCheckedDontAdvance(), which decodes
surrogate pairs without advancing pos. This is consistent with how
matchAssertionBOL() and matchAssertionEOL() already handle their reads.
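The difference between an advancing read and a non-advancing read can be illustrated with a toy model. Everything in it is an assumption made for illustration: the Cursor class, readAdvancing(), peek(), and holdsNotWordBoundaryAtStart() are hypothetical names and do not mirror the real YARR input class, and the model only covers position 0, where there is no previous character.

```javascript
// Toy model only; not the YARR implementation.
class Cursor {
    constructor(text) {
        this.text = text; // indexed by UTF-16 code unit
        this.pos = 0;
    }

    // Decode the code point at pos without moving pos.
    peek() {
        return this.text.codePointAt(this.pos);
    }

    // Decode the code point at pos; for a surrogate pair, also step pos
    // forward by one code unit (the side effect the log describes).
    readAdvancing() {
        const cp = this.text.codePointAt(this.pos);
        if (cp > 0xFFFF)
            this.pos += 1;
        return cp;
    }
}

function isWordChar(cp) {
    if (cp === undefined)
        return false;
    return /^[A-Za-z0-9_]$/.test(String.fromCodePoint(cp));
}

// \B at the start of input: the missing previous character counts as non-word,
// so \B holds exactly when the next character is also non-word.
function holdsNotWordBoundaryAtStart(cursor, read) {
    const prevIsWord = false;
    const nextIsWord = isWordChar(read(cursor));
    return prevIsWord === nextIsWord;
}

const buggy = new Cursor("\u{10000}\u{10000}");
const buggyHolds = holdsNotWordBoundaryAtStart(buggy, (c) => c.readAdvancing());
const buggyNext = buggy.peek(); // pos drifted to 1: a lone trail surrogate

const fixed = new Cursor("\u{10000}\u{10000}");
const fixedHolds = holdsNotWordBoundaryAtStart(fixed, (c) => c.peek());
const fixedNext = fixed.peek(); // pos still 0: the full code point
```

In both variants the assertion itself gives the same answer; what differs is the position left behind for the next term. In this model the advancing read leaves the cursor on the trail surrogate 0xDC00, while the non-advancing read leaves it on U+10000.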
This bug only affects the bytecode interpreter (--useRegExpJIT=false) since
the JIT compiler generates different code for word boundary assertions.
Test: JSTests/stress/yarr-word-boundary-surrogate-pair.js
* JSTests/stress/yarr-word-boundary-surrogate-pair.js: Added.
(shouldBe):
* Source/JavaScriptCore/yarr/YarrInterpreter.cpp:
(JSC::Yarr::Interpreter::matchAssertionWordBoundary):
Canonical link: https://commits.webkit.org/307677@main