Hisoka-X commented on code in PR #9862:
URL: https://github.com/apache/seatunnel/pull/9862#discussion_r2368674895


##########
docs/en/transform-v2/tikadocument.md:
##########
@@ -0,0 +1,257 @@
+# TikaDocument
+
+> TikaDocument Transform Plugin
+
+## Description
+
+The `TikaDocument` transform plugin uses Apache Tika to extract text content and metadata from various document formats including PDF, Microsoft Office documents (Word, Excel, PowerPoint), plain text, HTML, XML, and many other file formats. This transform converts binary document data into structured text content and metadata fields.
+
+The plugin supports comprehensive error handling, content processing options, and can handle both binary data and Base64-encoded document content.

Review Comment:
   Let's link to the Tika docs.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
