Branch: refs/heads/main
  Home:   https://github.com/WebKit/WebKit
  Commit: 7f53c2a455abfe907a1fdeae7edf579e04e519cf
      
https://github.com/WebKit/WebKit/commit/7f53c2a455abfe907a1fdeae7edf579e04e519cf
  Author: Sosuke Suzuki <[email protected]>
  Date:   2026-02-16 (Mon, 16 Feb 2026)

  Changed paths:
    A JSTests/stress/yarr-word-boundary-surrogate-pair.js
    M Source/JavaScriptCore/yarr/YarrInterpreter.cpp

  Log Message:
  -----------
  [YARR] Fix word boundary assertion corrupting position with surrogate pairs in bytecode interpreter
https://bugs.webkit.org/show_bug.cgi?id=307962

Reviewed by Yusuke Suzuki.

matchAssertionWordBoundary() used readChecked() to test whether adjacent
characters are word characters. readChecked() has a side effect: when it
reads a surrogate pair, it calls next(), which advances pos by one. This
corrupts the input position for subsequent matching operations.

For example, /\B./u.exec("\u{10000}\u{10000}") incorrectly returned null
because after \B read U+10000 (a surrogate pair), pos was advanced, causing
the dot to read from the trail surrogate position and get errorCodePoint.
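
The case above can be checked with a short script along the following lines.
This is only a sketch: the shouldBe helper is a guess at the usual shape of
such helpers and is not copied from the committed test. To exercise the
bytecode interpreter it would need to run in JavaScriptCore with
--useRegExpJIT=false. Other engines should simply pass.

```javascript
// Hypothetical helper; the shouldBe in the committed test may differ.
function shouldBe(actual, expected) {
    if (actual !== expected)
        throw new Error(`bad value: ${actual}, expected: ${expected}`);
}

const pairInput = "\u{10000}\u{10000}";
const boundaryResult = /\B./u.exec(pairInput);

// U+10000 is not a word character, so \B holds at index 0. The dot
// should then consume the whole code point (two UTF-16 code units).
shouldBe(boundaryResult !== null, true);
shouldBe(boundaryResult.index, 0);
shouldBe(boundaryResult[0], "\u{10000}");
shouldBe(boundaryResult[0].length, 2);
```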

The fix replaces readChecked() with readCheckedDontAdvance(), which decodes
surrogate pairs without advancing pos. This is consistent with how
matchAssertionBOL() and matchAssertionEOL() already handle their reads.
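
The difference between the two reads can be illustrated with a simplified
model. The ModelInput class below is hypothetical and is not the code in
YarrInterpreter.cpp. Only the names readChecked, readCheckedDontAdvance,
next and pos are taken from the description above, and the real functions
take arguments that this sketch omits.

```javascript
// Simplified model of an input cursor over UTF-16 code units.
// Hypothetical: this is not the actual Yarr implementation.
class ModelInput {
    constructor(text) {
        this.text = text;
        this.pos = 0;
    }

    next() {
        this.pos += 1;
    }

    // Decodes the code point at pos, and advances pos by one
    // when that code point was encoded as a surrogate pair.
    readChecked() {
        const codePoint = this.text.codePointAt(this.pos);
        if (codePoint > 0xFFFF)
            this.next();
        return codePoint;
    }

    // Decodes the same code point but leaves pos untouched.
    readCheckedDontAdvance() {
        return this.text.codePointAt(this.pos);
    }
}

const advancing = new ModelInput("\u{10000}\u{10000}");
const peeking = new ModelInput("\u{10000}\u{10000}");

const viaChecked = advancing.readChecked();
const viaPeek = peeking.readCheckedDontAdvance();

// Both reads see U+10000, but only the first one moved the cursor,
// leaving it on the trail surrogate at code unit index 1.
console.log(viaChecked.toString(16), advancing.pos);
console.log(viaPeek.toString(16), peeking.pos);
```

In this model an assertion that peeks with readChecked() hands the next
term a cursor sitting in the middle of a pair, which mirrors the failure
described for the dot.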

This bug only affects the bytecode interpreter (--useRegExpJIT=false) since
the JIT compiler generates different code for word boundary assertions.

Test: JSTests/stress/yarr-word-boundary-surrogate-pair.js

* JSTests/stress/yarr-word-boundary-surrogate-pair.js: Added.
(shouldBe):
* Source/JavaScriptCore/yarr/YarrInterpreter.cpp:
(JSC::Yarr::Interpreter::matchAssertionWordBoundary):

Canonical link: https://commits.webkit.org/307677@main


