[ https://issues.apache.org/jira/browse/JCR-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting updated JCR-1878:
-------------------------------

    Fix Version/s:     (was: 1.6.0)
                   2.0.0

OK, reverted the changes from 1.x in revision 779501. Targeting the 2.0 release instead.

> Use Apache Tika for text extraction
> -----------------------------------
>
>                 Key: JCR-1878
>                 URL: https://issues.apache.org/jira/browse/JCR-1878
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-text-extractors
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>             Fix For: 2.0.0
>
>
> Once Apache Tika is released with a resolution to TIKA-175 (making Tika
> available to Java 1.4 projects), we should replace our direct parser library
> dependencies with Tika parsers. Ideally we'd just use the Tika
> AutoDetectParser that'll automatically detect the type of a binary and parse
> it accordingly, solving JCR-728.
> I guess we should keep some level of backwards compatibility with existing
> textFilterClasses="..." configurations, perhaps by keeping the existing
> TextExtractor classes as wrappers around respective Tika parsers.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
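
The backwards-compatibility idea in the quoted description (keeping the existing extractor classes as thin wrappers around a parser) could look roughly like the sketch below. It is only an illustration of the wrapper pattern: LegacyTextExtractor and ParserBackend are simplified, hypothetical stand-ins defined here so the example compiles on its own, not the real Jackrabbit or Tika interfaces, and the plain-text backend stands in for whatever Tika parser would actually be plugged in.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for the legacy extractor contract (assumption, not the real interface).
interface LegacyTextExtractor {
    String[] getContentTypes();
    Reader extractText(InputStream stream, String type, String encoding) throws IOException;
}

// Hypothetical stand-in for a parser backend such as a Tika parser (assumption).
interface ParserBackend {
    String parseToString(InputStream stream) throws IOException;
}

// Keeps the legacy contract but delegates the actual parsing to the backend.
class DelegatingTextExtractor implements LegacyTextExtractor {
    private final String[] contentTypes;
    private final ParserBackend backend;

    DelegatingTextExtractor(String[] contentTypes, ParserBackend backend) {
        this.contentTypes = contentTypes.clone();
        this.backend = backend;
    }

    public String[] getContentTypes() {
        return contentTypes.clone();
    }

    public Reader extractText(InputStream stream, String type, String encoding)
            throws IOException {
        try {
            // Unsupported types yield empty text instead of an error.
            if (!Arrays.asList(contentTypes).contains(type)) {
                return new StringReader("");
            }
            return new StringReader(backend.parseToString(stream));
        } finally {
            stream.close();
        }
    }
}

class TikaWrapperSketch {
    // Trivial backend that decodes the stream as UTF-8 text.
    static final ParserBackend PLAIN_TEXT = stream -> {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = stream.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    };

    // Runs the wrapper end to end and returns the extracted text.
    static String extract(String content, String type) {
        LegacyTextExtractor extractor =
                new DelegatingTextExtractor(new String[] {"text/plain"}, PLAIN_TEXT);
        InputStream in = new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
        try (Reader reader = extractor.extractText(in, type, "UTF-8")) {
            StringBuilder text = new StringBuilder();
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
            return text.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(extract("hello from the wrapper", "text/plain"));
    }
}
```

With this shape, an existing textFilterClasses="..." entry could keep naming the same class while the parsing work moves behind the backend interface. How the real classes would be wired to Tika is left open here.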