[ 
https://issues.apache.org/jira/browse/TIKA-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18057716#comment-18057716
 ] 

Hudson commented on TIKA-4653:
------------------------------

SUCCESS: Integrated in Jenkins build Tika » tika-branch_3x-jdk11 #2385 (See 
[https://ci-builds.apache.org/job/Tika/job/tika-branch_3x-jdk11/2385/])
TIKA-4653 -- fix up extra whitespace (tallison: 
[https://github.com/apache/tika/commit/1a3bd2d10434fefbfc9976e90c7360a38b6448b6])
* (edit) 
tika-core/src/main/java/org/apache/tika/sax/ToMarkdownContentHandler.java
* (edit) 
tika-core/src/test/java/org/apache/tika/sax/ToMarkdownContentHandlerTest.java


> Add markdown contenthandler
> ---------------------------
>
>                 Key: TIKA-4653
>                 URL: https://issues.apache.org/jira/browse/TIKA-4653
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>             Fix For: 4.0.0, 3.3.0
>
>
> Would be nice to save a step in feeding Tika output to an LLM. Markdown seems 
> to be the industry standard.
> Maybe https://github.com/furstenheim/copy-down, but that is aging.
> Maybe: https://github.com/commonmark/commonmark-java ?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
