Branch: refs/heads/main
  Home:   https://github.com/WebKit/WebKit
  Commit: f05a4503950fc1e7ab2a9a010c12cb2c7407c0cf
      
https://github.com/WebKit/WebKit/commit/f05a4503950fc1e7ab2a9a010c12cb2c7407c0cf
  Author: Chris Dumez <[email protected]>
  Date:   2026-04-05 (Sun, 05 Apr 2026)

  Changed paths:
    A LayoutTests/imported/w3c/web-platform-tests/css/css-pseudo/first-letter-capitalize-supplementary-char-expected.html
    A LayoutTests/imported/w3c/web-platform-tests/css/css-pseudo/first-letter-capitalize-supplementary-char.html
    M Source/WTF/wtf/text/StringView.h
    M Source/WebCore/editing/ICUSearcher.cpp
    M Source/WebCore/rendering/RenderText.cpp
    M Source/WebCore/rendering/RenderTextFragment.cpp

  Log Message:
  -----------
  text-transform: capitalize should handle supplementary Unicode characters
https://bugs.webkit.org/show_bug.cgi?id=311394

Reviewed by Darin Adler.

Fix two issues with text-transform:capitalize when the previous character
is a supplementary Unicode character (encoded as a surrogate pair in UTF-16):

1. RenderTextFragment::previousCharacter() used operator[] to get the last
   character before the fragment start, which returns a single char16_t.
   When the previous character is supplementary (e.g. U+1D400), this returns
   a lone trailing surrogate instead of the full code point. Fix by adding
   a new StringView::codePointBefore() helper that properly decodes
   surrogate pairs using ICU's U16_PREV macro. Adopt the new helper in
   RenderText::previousCharacter(), RenderTextFragment::previousCharacter(),
   and ICUSearcher's isWordStartMatch().

2. capitalize() passed the char32_t previous character through
   convertNoBreakSpaceToSpace() before U16_APPEND_UNSAFE, which silently
   truncated supplementary characters (U+1D400 became U+D400). This caused
   U16_APPEND_UNSAFE to write a single wrong code unit instead of a
   surrogate pair, so ICU's word break iterator saw a different character
   and inserted a spurious word boundary. Fix by moving the
   convertNoBreakSpaceToSpace() call into the loop over the resulting
   char16_t code units, and narrowing its type to char16_t since it now
   only operates on code units.
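
The two issues above can be illustrated with a small standalone sketch. It
does not use ICU or WTF: codePointBefore() and encodePreviousCharacter()
below are hypothetical free functions written for illustration only, and
they are not the code added by this commit. The first decodes the code point
that ends just before an index, joining a surrogate pair where U16_PREV
would. The second follows the ordering described in item 2: encode the code
point to UTF-16 first, then map U+00A0 to a space per code unit.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the helper named in the log (assumption: plain
// std::u16string instead of WTF::StringView, manual decoding instead of U16_PREV).
// Returns the code point ending just before `index`, or 0 if there is none.
char32_t codePointBefore(const std::u16string& text, std::size_t index)
{
    if (!index || index > text.size())
        return 0;
    char16_t trail = text[index - 1];
    bool isTrail = (trail & 0xFC00) == 0xDC00;
    if (isTrail && index >= 2) {
        char16_t lead = text[index - 2];
        if ((lead & 0xFC00) == 0xD800) {
            // Join lead and trail surrogates into one supplementary code point.
            return 0x10000 + ((static_cast<char32_t>(lead) - 0xD800) << 10) + (trail - 0xDC00);
        }
    }
    return trail; // BMP character, or an unpaired surrogate returned as-is.
}

// Hypothetical sketch of the fixed ordering: encode to UTF-16 code units
// first, then replace no-break space on each 16-bit unit.
std::u16string encodePreviousCharacter(char32_t character)
{
    std::u16string units;
    if (character > 0xFFFF) {
        char32_t offset = character - 0x10000;
        units += static_cast<char16_t>(0xD800 + (offset >> 10));
        units += static_cast<char16_t>(0xDC00 + (offset & 0x3FF));
    } else
        units += static_cast<char16_t>(character);

    for (auto& unit : units) {
        if (unit == 0x00A0)
            unit = u' ';
    }
    return units;
}
```

For the string U+1D400 followed by "a", indexing the code unit at position 1
yields the lone trailing surrogate 0xDC00, while codePointBefore(text, 2)
yields 0x1D400. Narrowing 0x1D400 to a 16-bit code unit before encoding
yields 0xD400, which is the truncation described in item 2.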

Test: imported/w3c/web-platform-tests/css/css-pseudo/first-letter-capitalize-supplementary-char.html

* LayoutTests/imported/w3c/web-platform-tests/css/css-pseudo/first-letter-capitalize-supplementary-char-expected.html: Added.
* LayoutTests/imported/w3c/web-platform-tests/css/css-pseudo/first-letter-capitalize-supplementary-char.html: Added.
* Source/WTF/wtf/text/StringView.h:
(WTF::StringView::characterStartingAt const):
(WTF::StringView::codePointBefore const):
* Source/WebCore/editing/ICUSearcher.cpp:
(WebCore::isWordStartMatch):
* Source/WebCore/rendering/RenderText.cpp:
(WebCore::capitalize):
(WebCore::RenderText::previousCharacter const):
* Source/WebCore/rendering/RenderTextFragment.cpp:
(WebCore::RenderTextFragment::previousCharacter const):

Canonical link: https://commits.webkit.org/310597@main


