Chetan Mehrotra created OAK-2468:
------------------------------------

             Summary: Index binary only if some Tika parser can support the binary's mimeType
                 Key: OAK-2468
                 URL: https://issues.apache.org/jira/browse/OAK-2468
             Project: Jackrabbit Oak
          Issue Type: Improvement
          Components: oak-lucene
            Reporter: Chetan Mehrotra
            Assignee: Chetan Mehrotra
            Priority: Minor
             Fix For: 1.2

Currently all binaries are passed to Tika for text extraction. However, Tika can only parse a binary if a parser supporting its type is present. The extraction logic should therefore parse a binary only if its mimeType is supported by Tika.

With this change {{jcr:mimeType}} would become a mandatory property for text extraction.

JR2 had a similar check [1].

[1] https://github.com/apache/jackrabbit/blob/trunk/jackrabbit-core/src/main/java/org/apache/jackrabbit/core/query/lucene/NodeIndexer.java#L932
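
The proposed check could look roughly like the sketch below. This is only an illustration and not the Oak implementation: the class name {{MimeTypeGate}} and its methods are hypothetical, and the set of supported types is passed in by hand so the example needs only the JDK. In a real indexer that set would presumably be derived from the configured Tika parser (Tika's {{Parser}} interface exposes {{getSupportedTypes(ParseContext)}}), but that wiring is an assumption here.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper: decides whether a binary should be handed to text extraction. */
final class MimeTypeGate {
    // Normalized mime types for which a parser is assumed to be available.
    private final Set<String> supportedTypes = new HashSet<>();

    MimeTypeGate(Set<String> supported) {
        for (String type : supported) {
            supportedTypes.add(normalize(type));
        }
    }

    /** Strips parameters such as "; charset=UTF-8" and lower-cases the type. */
    static String normalize(String mimeType) {
        int semicolon = mimeType.indexOf(';');
        String base = semicolon >= 0 ? mimeType.substring(0, semicolon) : mimeType;
        return base.trim().toLowerCase(Locale.ROOT);
    }

    /** Returns true only when jcr:mimeType is present and a parser is known for it. */
    boolean shouldExtract(String jcrMimeType) {
        if (jcrMimeType == null || jcrMimeType.trim().isEmpty()) {
            return false; // no jcr:mimeType: the binary is skipped
        }
        return supportedTypes.contains(normalize(jcrMimeType));
    }
}
```

With such a gate in place, a binary whose {{jcr:mimeType}} is missing or unknown to the parser set is never passed to the extractor, which is why the property effectively becomes mandatory.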