[ https://issues.apache.org/jira/browse/TIKA-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15096663#comment-15096663 ]
Uwe Schindler commented on TIKA-1830:
-------------------------------------

bq. Speaking of integration with Solr, would you have a chance/any interest in offering feedback on our initial restructuring of the parser bundles for Tika 2.0 (TIKA-1824)? Or more generally, do you and your Solr colleagues have any wishes for the 2.0 roadmap?

As already stated in the past, we would like to bundle only parsers for text document formats, because images, class files, and the like are not really useful for indexing by default. Users who want them can still add the missing parser bundles, and SPI will do the rest.

Currently we have disabled some parsers by removing their JAR files (like asm-all.jar, netcdf.jar), so Tika's SPI loading disables them automatically (because of ClassNotFoundException). This was a bit crude, but it worked. Part of the reason was also version incompatibilities (the ASM version in Tika was old, while Lucene needs the newest one), but ASM is not really useful for indexing anyway!

In Solr we don't use transitive dependencies in Ivy, so we decide for each JAR file whether it gets bundled. That means we check every release during the update anyway.

> Upgrade to PDFBox 1.8.11 when available
> ---------------------------------------
>
>                 Key: TIKA-1830
>                 URL: https://issues.apache.org/jira/browse/TIKA-1830
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>         Attachments: reports_pdfbox_1_8_11-rc1.zip
>
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)