Branch: refs/heads/main
Home: https://github.com/WebKit/WebKit
Commit: f05a4503950fc1e7ab2a9a010c12cb2c7407c0cf
https://github.com/WebKit/WebKit/commit/f05a4503950fc1e7ab2a9a010c12cb2c7407c0cf
Author: Chris Dumez <[email protected]>
Date: 2026-04-05 (Sun, 05 Apr 2026)
Changed paths:
A LayoutTests/imported/w3c/web-platform-tests/css/css-pseudo/first-letter-capitalize-supplementary-char-expected.html
A LayoutTests/imported/w3c/web-platform-tests/css/css-pseudo/first-letter-capitalize-supplementary-char.html
M Source/WTF/wtf/text/StringView.h
M Source/WebCore/editing/ICUSearcher.cpp
M Source/WebCore/rendering/RenderText.cpp
M Source/WebCore/rendering/RenderTextFragment.cpp
Log Message:
-----------
text-transform: capitalize should handle supplementary Unicode characters
https://bugs.webkit.org/show_bug.cgi?id=311394
Reviewed by Darin Adler.
Fix two issues with text-transform:capitalize when the previous character
is a supplementary Unicode character (encoded as a surrogate pair in UTF-16):
1. RenderTextFragment::previousCharacter() used operator[] to get the last
character before the fragment start, which returns a single char16_t.
When the previous character is supplementary (e.g. U+1D400), this returns
a lone trailing surrogate instead of the full code point. Fix by adding
a new StringView::codePointBefore() helper that properly decodes
surrogate pairs using ICU's U16_PREV macro. Adopt the new helper in
RenderText::previousCharacter(), RenderTextFragment::previousCharacter(),
and ICUSearcher's isWordStartMatch().
2. capitalize() passed the char32_t previous character through
convertNoBreakSpaceToSpace() before U16_APPEND_UNSAFE, which silently
truncated supplementary characters (U+1D400 became U+D400). This caused
U16_APPEND_UNSAFE to write a single wrong code unit instead of a
surrogate pair, so ICU's word break iterator saw a different character
and inserted a spurious word boundary. Fix by moving the
convertNoBreakSpaceToSpace() call into the loop over the resulting
char16_t code units, and narrowing its type to char16_t since it now
only operates on code units.
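The two issues above can be reproduced outside WebKit with a minimal standalone sketch. All names here (codePointBeforeSketch, appendUTF16Sketch, noBreakSpaceToSpaceSketch, encodePreviousBuggy, encodePreviousFixed) are hypothetical and are not the WebKit or ICU implementations. The sketch decodes surrogate pairs by hand instead of using U16_PREV, and it assumes the truncation in issue 2 is equivalent to a plain char32_t to char16_t conversion, since the original signature of convertNoBreakSpaceToSpace() is not shown in this message.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for a "code point before index" helper.
// Returns the code point that ends just before `index` in UTF-16 text.
// Precondition: 0 < index <= text.size().
inline char32_t codePointBeforeSketch(std::u16string_view text, std::size_t index)
{
    char16_t last = text[index - 1];
    bool isTrail = (last & 0xFC00) == 0xDC00;
    if (isTrail && index >= 2) {
        char16_t lead = text[index - 2];
        if ((lead & 0xFC00) == 0xD800) {
            // Combine lead and trail surrogates into one supplementary code point.
            return 0x10000 + ((char32_t(lead) - 0xD800) << 10) + (char32_t(last) - 0xDC00);
        }
    }
    return last; // BMP character, or an unpaired surrogate returned as-is.
}

// Appends `c` to `out` as one or two UTF-16 code units.
// Assumes `c` is a valid code point (no range checks, like an "unsafe" append).
inline void appendUTF16Sketch(std::u16string& out, char32_t c)
{
    if (c <= 0xFFFF) {
        out.push_back(static_cast<char16_t>(c));
        return;
    }
    c -= 0x10000;
    out.push_back(static_cast<char16_t>(0xD800 + (c >> 10)));
    out.push_back(static_cast<char16_t>(0xDC00 + (c & 0x3FF)));
}

// Maps U+00A0 NO-BREAK SPACE to U+0020 SPACE; operates on a single code unit.
inline char16_t noBreakSpaceToSpaceSketch(char16_t unit)
{
    return unit == 0x00A0 ? u' ' : unit;
}

// Order assumed for the bug: convert first (through 16 bits), then encode.
inline std::u16string encodePreviousBuggy(char32_t previous)
{
    std::u16string out;
    // The cast drops the high bits: U+1D400 becomes U+D400.
    appendUTF16Sketch(out, noBreakSpaceToSpaceSketch(static_cast<char16_t>(previous)));
    return out;
}

// Order described by the fix: encode first, then convert each code unit.
inline std::u16string encodePreviousFixed(char32_t previous)
{
    std::u16string out;
    appendUTF16Sketch(out, previous);
    for (char16_t& unit : out)
        unit = noBreakSpaceToSpaceSketch(unit);
    return out;
}
```

With the string u"a\U0001D400b", indexing the code unit at position 2 yields the lone trailing surrogate 0xDC00, while codePointBeforeSketch(text, 3) yields U+1D400. Likewise, encodePreviousBuggy(0x1D400) produces the single code unit 0xD400, whereas encodePreviousFixed(0x1D400) produces the surrogate pair 0xD835 0xDC00.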
Test: imported/w3c/web-platform-tests/css/css-pseudo/first-letter-capitalize-supplementary-char.html
* LayoutTests/imported/w3c/web-platform-tests/css/css-pseudo/first-letter-capitalize-supplementary-char-expected.html: Added.
* LayoutTests/imported/w3c/web-platform-tests/css/css-pseudo/first-letter-capitalize-supplementary-char.html: Added.
* Source/WTF/wtf/text/StringView.h:
(WTF::StringView::characterStartingAt const):
(WTF::StringView::codePointBefore const):
* Source/WebCore/editing/ICUSearcher.cpp:
(WebCore::isWordStartMatch):
* Source/WebCore/rendering/RenderText.cpp:
(WebCore::capitalize):
(WebCore::RenderText::previousCharacter const):
* Source/WebCore/rendering/RenderTextFragment.cpp:
(WebCore::RenderTextFragment::previousCharacter const):
Canonical link: https://commits.webkit.org/310597@main