Branch: refs/heads/main
Home: https://github.com/WebKit/WebKit
Commit: fb1eae0edd1479f4642007ffaa4005d476dfe234
https://github.com/WebKit/WebKit/commit/fb1eae0edd1479f4642007ffaa4005d476dfe234
Author: Wenson Hsieh <[email protected]>
Date: 2025-11-11 (Tue, 11 Nov 2025)
Changed paths:
M Source/WebKit/Shared/TextExtractionToStringConversion.cpp
M Source/WebKit/Shared/TextExtractionToStringConversion.h
M Source/WebKit/UIProcess/API/Cocoa/WKWebView.mm
M Source/WebKit/UIProcess/API/Cocoa/_WKTextExtraction.h
M Source/WebKit/UIProcess/API/Cocoa/_WKTextExtraction.mm
M Source/WebKit/UIProcess/API/Cocoa/_WKTextExtractionInternal.h
M Source/WebKit/UIProcess/Cocoa/TextExtraction/WKWebView+TextExtraction.swift
M Tools/TestWebKitAPI/Tests/WebKitCocoa/TextExtractionTests.mm
Log Message:
-----------
Refactor text extraction filtering logic to be more extensible
https://bugs.webkit.org/show_bug.cgi?id=302301
rdar://164442518
Reviewed by Abrar Rahman Protyasha and Megan Gardner.
Currently, the `TextExtractionOptions` passed into `convertToText` (i.e. converting extraction items
into text) only allow for a single text filtering callback which returns a `NativePromise` (which is
resolved once all relevant text filtering steps have been performed). However, this makes it
somewhat tricky to implement more sophisticated logic around conditionally enabling filtering steps
at both build time and runtime:
1. The `TextExtractionFilter` classifier is behind a compile-time flag, as well as a runtime flag.
   It's now also configurable by the WebKit client, via the new option flag.
2. The text recognition filter is behind the same compile-time and runtime flags. It's also
   configurable by the client, independently of (1).
3. The maximum word limit is configurable by the WebKit client.
To make this filtering callback mechanism more extensible, we convert the callback into a vector of
callbacks that represent a filtering pipeline, where any of the above steps (1-3) can be added as
needed (and any future steps can just be appended to the list as needed).
See below for more details.
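As a rough illustration of the pipeline idea, the following standalone sketch assembles a list of
filtering steps from a set of options. It is not the WebKit implementation: `PipelineOptions`,
`FilterStep`, `buildPipeline`, `applyPipeline` and `truncateToWordLimit` are hypothetical names, the
steps are synchronous (the real callbacks return a `NativePromise`), and the ordering of steps
(1-3) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; none of these names come from the WebKit sources.
using FilterStep = std::function<std::string(const std::string&)>;

struct PipelineOptions {
    bool enableClassifier { false };      // step (1), assumed to be a simple on/off choice
    bool enableTextRecognition { false }; // step (2), toggled independently of (1)
    std::optional<std::size_t> maxWords;  // step (3), absent means "no limit"
};

// Keeps at most `limit` whitespace-separated words, joined by single spaces.
std::string truncateToWordLimit(const std::string& text, std::size_t limit)
{
    std::istringstream stream(text);
    std::string word;
    std::string result;
    std::size_t count = 0;
    while (count < limit && stream >> word) {
        if (!result.empty())
            result += ' ';
        result += word;
        ++count;
    }
    return result;
}

// Appends only the steps that the options ask for. The order used here is an assumption.
std::vector<FilterStep> buildPipeline(const PipelineOptions& options, FilterStep classifier, FilterStep textRecognition)
{
    std::vector<FilterStep> steps;
    if (options.enableClassifier && classifier)
        steps.push_back(std::move(classifier));
    if (options.enableTextRecognition && textRecognition)
        steps.push_back(std::move(textRecognition));
    if (options.maxWords) {
        std::size_t limit = *options.maxWords;
        steps.push_back([limit](const std::string& text) {
            return truncateToWordLimit(text, limit);
        });
    }
    return steps;
}

// Runs the steps in order, feeding each step's output into the next one.
std::string applyPipeline(const std::vector<FilterStep>& steps, std::string text)
{
    for (const auto& step : steps) {
        assert(step != nullptr); // every stored step must be callable
        text = step(text);
    }
    return text;
}
```

With this shape, adding a future step only requires one more conditional `push_back` in
`buildPipeline`; the code that runs the pipeline does not change.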
Test: TextExtractionTests.FilterOptions
Test: Tools/TestWebKitAPI/Tests/WebKitCocoa/TextExtractionTests.mm
* Source/WebKit/Shared/TextExtractionToStringConversion.cpp:
(WebKit::TextExtractionAggregator::filter const):
(WebKit::TextExtractionAggregator::filterRecursive const):
Turn the `filterCallback` into a `Vector` of `filterCallbacks`; the callbacks are invoked in order,
and each callback's output is fed into the next callback as input (unless the promise rejects, in
which case we stop early).
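The chaining behavior described above can be sketched in standalone C++ roughly as follows. This is
an approximation under stated assumptions, not the code in `filterRecursive`: completion callbacks
stand in for `NativePromise`, `std::nullopt` stands in for a rejected promise, forwarding the
rejection to the caller is a guess at what "stop early" means, and `AsyncFilter`, `runFilters` and
`runFiltersFrom` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a promise result: std::nullopt models a rejection.
using FilterResult = std::optional<std::string>;
using Completion = std::function<void(FilterResult)>;
// Each step receives the current text and reports its result through `done`.
using AsyncFilter = std::function<void(const std::string&, Completion done)>;

// Invokes filters[index], then recurses with its output once it completes.
// `filters` is captured by reference, so it must outlive the whole chain.
void runFiltersFrom(const std::vector<AsyncFilter>& filters, std::size_t index, std::string text, Completion completion)
{
    assert(index <= filters.size());
    if (index == filters.size()) {
        // Every step succeeded; hand the fully filtered text to the caller.
        completion(std::move(text));
        return;
    }
    filters[index](text, [&filters, index, completion](FilterResult result) {
        if (!result) {
            // A step rejected: stop early and skip the remaining steps.
            completion(std::nullopt);
            return;
        }
        runFiltersFrom(filters, index + 1, std::move(*result), completion);
    });
}

// Convenience entry point that starts the chain at the first filter.
void runFilters(const std::vector<AsyncFilter>& filters, std::string text, Completion completion)
{
    runFiltersFrom(filters, 0, std::move(text), std::move(completion));
}
```

An empty vector yields the input unchanged, which matches the idea that every step is optional.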
* Source/WebKit/Shared/TextExtractionToStringConversion.h:
(WebKit::TextExtractionOptions::TextExtractionOptions):
* Source/WebKit/UIProcess/API/Cocoa/WKWebView.mm:
(-[WKWebView _debugTextWithConfiguration:completionHandler:]):
(joinAndTruncateLinesToWordLimit): Deleted.
* Source/WebKit/UIProcess/API/Cocoa/_WKTextExtraction.h:
Add a new enum options property, which allows clients to opt in or out of the classifier and/or OCR
filter during extraction.
* Source/WebKit/UIProcess/API/Cocoa/_WKTextExtraction.mm:
(-[_WKTextExtractionConfiguration _initForOnlyVisibleText:]):
* Source/WebKit/UIProcess/API/Cocoa/_WKTextExtractionInternal.h:
* Source/WebKit/UIProcess/Cocoa/TextExtraction/WKWebView+TextExtraction.swift:
* Tools/TestWebKitAPI/Tests/WebKitCocoa/TextExtractionTests.mm:
(TestWebKitAPI::TEST(TextExtractionTests, FilterOptions)):
Canonical link: https://commits.webkit.org/302850@main