[ 
https://issues.apache.org/jira/browse/TIKA-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979379#comment-13979379
 ] 

Benoit Moreau commented on TIKA-1224:
-------------------------------------

In the debugger, Tika uses org.apache.tika.SourceCodeParser for the 
"x-java-source" MIME type. The parser strips all line endings (why? Is this a 
mistake? readLine() does not return the trailing \n and/or \r), then passes the 
result to JHighlight. JHighlight's output (a complete HTML document) is then 
passed as the argument to the characters() method of the ContentHandler.
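
To illustrate the line-ending point, here is a minimal standalone sketch. It is 
not Tika's actual code: the class name ReadLineDemo and both helper methods are 
hypothetical, made up for this example. It only shows that concatenating the 
results of BufferedReader.readLine() drops the line terminators unless they are 
added back explicitly.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical demo class; not part of Tika.
public class ReadLineDemo {

    // Concatenates lines exactly as readLine() returns them.
    // readLine() strips the line terminator, so all lines run together.
    static String joinWithoutNewlines(String source) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new StringReader(source))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line);
            }
        }
        return sb.toString();
    }

    // Same loop, but re-appends a '\n' after every line so the
    // line structure of the source survives.
    static String joinWithNewlines(String source) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new StringReader(source))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String source = "int a; // first\nint b;\n";
        // The second declaration ends up inside the line comment.
        System.out.println(joinWithoutNewlines(source));
        // Line structure is preserved.
        System.out.print(joinWithNewlines(source));
    }
}
```

With the first variant, a "//" comment swallows whatever followed it on the 
next line, which would matter to any highlighter that tokenizes the text.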

I have only just started with Tika, but I don't think that is right.

> Adding Source code (Java, Groovy, C) parser
> -------------------------------------------
>
>                 Key: TIKA-1224
>                 URL: https://issues.apache.org/jira/browse/TIKA-1224
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.5
>            Reporter: Hong-Thai Nguyen
>            Priority: Minor
>
> We can parse some source code file formats:
> text/x-java-source
> text/x-groovy
> text/x-c
> For HTML rendering of the code, we can use JHighlight: 
> http://www.ohloh.net/p/jhighlight



--
This message was sent by Atlassian JIRA
(v6.2#6252)
