[ https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16291935#comment-16291935 ]
Tim Allison commented on SOLR-11701:
------------------------------------

I merged [~kramachand...@commvault.com]'s mods and made a few updates for Tika 1.17.

I ran an integration test against 643 files in Apache Tika's unit test docs, and I got the same number of documents indexed in Solr as tika-app.jar parsed without exceptions.

{noformat}
public static void main(String[] args) throws Exception {
    // Directory of JSON extracts written by tika-app.jar, one per test file
    Path extracts = Paths.get("C:\\data\\tika_unit_tests_extracts");
    SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/fileupload_passt/").build();
    for (File f : extracts.toFile().listFiles()) {
        try (Reader r = Files.newBufferedReader(f.toPath(), StandardCharsets.UTF_8)) {
            List<Metadata> metadataList = JsonMetadataList.fromJson(r);
            String ex = metadataList.get(0).get(TikaCoreProperties.TIKA_META_EXCEPTION_PREFIX + "runtime");
            // Only check files for which Tika recorded no runtime exception
            if (ex == null) {
                SolrQuery q = new SolrQuery("id: " + f.getName().replace(".json", ""));
                QueryResponse response = client.query(q);
                SolrDocumentList results = response.getResults();
                // Report any file that is not indexed exactly once
                if (results.getNumFound() != 1) {
                    System.err.println(f.getName() + " " + results.getNumFound());
                }
            }
        }
    }
}
{noformat}

I did the usual dance:
{noformat}
ant clean-jars jar-checksums
ant precommit
{noformat}

[~erickerickson], this _should_ be good to go.

> Upgrade to Tika 1.17 when available
> -----------------------------------
>
>                 Key: SOLR-11701
>                 URL: https://issues.apache.org/jira/browse/SOLR-11701
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Tim Allison
>
> Kicking off release process for Tika 1.17 in the next few days. Please let
> us know if you have any requests.