[ https://issues.apache.org/jira/browse/CXF-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031641#comment-14031641 ]
Andriy Redko commented on CXF-5549:
-----------------------------------

As agreed, the first step is to create an extractor and a test that:
 - extract metadata and raw content from the provided PDF document using Apache Tika
 - use Apache Lucene (RAM index / StandardAnalyzer) to index the metadata and content
 - use LuceneQueryVisitor on top of the filter expression and the index to match the documents

Thanks.
Andriy

> Introduce Tika Search Visitor
> -----------------------------
>
>                 Key: CXF-5549
>                 URL: https://issues.apache.org/jira/browse/CXF-5549
>             Project: CXF
>          Issue Type: New Feature
>          Components: JAX-RS
>            Reporter: Sergey Beryozkin
>            Assignee: Andriy Redko
>            Priority: Minor
>
> Introduce TikaSearchVisitor which will convert a FIQL/etc. search expression
> into an Apache Tika component that can be used to search the binary data; for
> example, the service can support something like "find all PDF files matching
> a given expression".

--
This message was sent by Atlassian JIRA
(v6.2#6252)
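
The three steps in the comment (extract, index, match) can be illustrated with a minimal, standard-library-only sketch. It is not the CXF, Tika, or Lucene API: `ExtractedDocument` stands in for what Tika would extract from a PDF, `InMemoryIndex` stands in for a Lucene RAM index with a simple lowercase tokenizer, and `match` handles only a tiny FIQL subset (`field==term` conditions joined by `;` for AND) in place of LuceneQueryVisitor. All class names, the `contents` field name, and the sample documents are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class PdfSearchSketch {

    /** Hypothetical stand-in for Tika output: metadata fields plus raw text. */
    static final class ExtractedDocument {
        final String name;
        final Map<String, String> metadata;
        final String content;

        ExtractedDocument(String name, Map<String, String> metadata, String content) {
            this.name = name;
            this.metadata = metadata;
            this.content = content;
        }
    }

    /** Hypothetical stand-in for a Lucene RAM index: field -> term -> document names. */
    static final class InMemoryIndex {
        private final Map<String, Map<String, Set<String>>> postings = new HashMap<>();

        /** Indexes every metadata field and the raw content (under "contents"). */
        void add(ExtractedDocument doc) {
            for (Map.Entry<String, String> e : doc.metadata.entrySet()) {
                indexField(e.getKey(), e.getValue(), doc.name);
            }
            indexField("contents", doc.content, doc.name);
        }

        /** Lowercases the text and splits it on anything that is not a letter or digit. */
        private void indexField(String field, String text, String docName) {
            for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
                if (term.isEmpty()) {
                    continue;
                }
                postings.computeIfAbsent(field, f -> new HashMap<>())
                        .computeIfAbsent(term, t -> new TreeSet<>())
                        .add(docName);
            }
        }

        /** Returns the names of documents whose field contains the given term. */
        Set<String> search(String field, String term) {
            Map<String, Set<String>> terms = postings.get(field);
            if (terms == null) {
                return Collections.emptySet();
            }
            Set<String> hits = terms.get(term.toLowerCase(Locale.ROOT));
            return hits == null ? Collections.<String>emptySet() : hits;
        }
    }

    /** Evaluates "field==term" conditions joined by ';' (AND) against the index. */
    static Set<String> match(InMemoryIndex index, String expression) {
        Set<String> result = null;
        for (String condition : expression.split(";")) {
            String[] parts = condition.split("==", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Unsupported condition: " + condition);
            }
            Set<String> hits = index.search(parts[0].trim(), parts[1].trim());
            if (result == null) {
                result = new TreeSet<>(hits);
            } else {
                result.retainAll(hits);
            }
        }
        return result == null ? Collections.<String>emptySet() : result;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for two parsed PDF files.
        Map<String, String> first = new HashMap<>();
        first.put("author", "Jane Doe");
        Map<String, String> second = new HashMap<>();
        second.put("author", "John Roe");

        InMemoryIndex index = new InMemoryIndex();
        index.add(new ExtractedDocument("report.pdf", first, "Quarterly report on search visitors"));
        index.add(new ExtractedDocument("notes.pdf", second, "Meeting notes about the report"));

        System.out.println(match(index, "contents==report"));              // [notes.pdf, report.pdf]
        System.out.println(match(index, "author==jane;contents==report")); // [report.pdf]
    }
}
```

In the approach the comment describes, Tika would supply the metadata and text, Lucene's StandardAnalyzer would handle tokenization, and LuceneQueryVisitor would translate the full filter expression into a Lucene query; the sketch only mirrors the shape of that pipeline.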