[
https://issues.apache.org/jira/browse/TIKA-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18057716#comment-18057716
]
Hudson commented on TIKA-4653:
------------------------------
SUCCESS: Integrated in Jenkins build Tika » tika-branch_3x-jdk11 #2385 (See
[https://ci-builds.apache.org/job/Tika/job/tika-branch_3x-jdk11/2385/])
TIKA-4653 -- fix up extra whitespace (tallison:
[https://github.com/apache/tika/commit/1a3bd2d10434fefbfc9976e90c7360a38b6448b6])
* (edit)
tika-core/src/main/java/org/apache/tika/sax/ToMarkdownContentHandler.java
* (edit)
tika-core/src/test/java/org/apache/tika/sax/ToMarkdownContentHandlerTest.java
> Add markdown contenthandler
> ---------------------------
>
> Key: TIKA-4653
> URL: https://issues.apache.org/jira/browse/TIKA-4653
> Project: Tika
> Issue Type: Task
> Reporter: Tim Allison
> Priority: Major
> Fix For: 4.0.0, 3.3.0
>
>
> Would be nice to save a step in feeding Tika output to an LLM. Markdown seems
> to be the industry standard.
> Maybe [https://github.com/furstenheim/copy-down], but that is aging.
> Maybe: https://github.com/commonmark/commonmark-java ?
--
This message was sent by Atlassian Jira
(v8.20.10#820010)