[ 
https://issues.apache.org/jira/browse/CXF-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031641#comment-14031641
 ] 

Andriy Redko commented on CXF-5549:
-----------------------------------

As agreed, the first step is to create an extractor and a test that:
 - extract metadata and raw content from a provided PDF document using Apache 
Tika
 - use Apache Lucene (RAM index / StandardAnalyzer) to index the metadata and 
content 
 - use LuceneQueryVisitor on top of the filter expression and the index to 
match the documents 
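
A rough, dependency-free sketch of the three steps above, in plain Java. It is 
not the planned implementation: the class and method names are hypothetical, 
and simple collections stand in for Tika's extraction output, the Lucene RAM 
index and LuceneQueryVisitor. Only `field==value` clauses joined with `;` 
(FIQL AND) are handled.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the extract -> index -> match flow.
// In the real test, Tika would supply the fields, Lucene would hold the index,
// and LuceneQueryVisitor would build the query from the filter expression.
class ExtractIndexMatchSketch {

    // Step 1 stand-in: what an extractor might yield for one document
    // (metadata fields plus raw text content).
    static Map<String, String> extracted(String docId, String author, String content) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("id", docId);
        fields.put("author", author);
        fields.put("contents", content);
        return fields;
    }

    // Step 2 stand-in: in-memory inverted index, field -> term -> document ids.
    private final Map<String, Map<String, Set<String>>> index = new HashMap<>();

    void add(Map<String, String> fields) {
        String id = fields.get("id");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (e.getKey().equals("id")) {
                continue;
            }
            // Crude analysis: lower-case, then split on non-alphanumeric runs.
            for (String term : e.getValue().toLowerCase(Locale.ROOT).split("[^\\p{Alnum}]+")) {
                if (term.isEmpty()) {
                    continue;
                }
                index.computeIfAbsent(e.getKey(), k -> new HashMap<>())
                     .computeIfAbsent(term, k -> new TreeSet<>())
                     .add(id);
            }
        }
    }

    // Step 3 stand-in: evaluate "field==value" clauses joined with ";" (AND).
    Set<String> match(String expression) {
        Set<String> result = null;
        for (String clause : expression.split(";")) {
            String[] parts = clause.split("==", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Unsupported clause: " + clause);
            }
            Set<String> hits = index
                .getOrDefault(parts[0], Collections.emptyMap())
                .getOrDefault(parts[1].toLowerCase(Locale.ROOT), Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(hits);
            } else {
                result.retainAll(hits); // AND: keep ids present in every clause
            }
        }
        return result == null ? new TreeSet<>() : result;
    }
}
```

With two documents added through `add(extracted(...))`, calling 
`match("author==bertrand;contents==lucene")` would return only the ids of 
documents that satisfy both clauses. The actual test would replace each 
stand-in with the corresponding Tika, Lucene and CXF component.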

Thanks.
Andriy

> Introduce Tika Search Visitor
> -----------------------------
>
>                 Key: CXF-5549
>                 URL: https://issues.apache.org/jira/browse/CXF-5549
>             Project: CXF
>          Issue Type: New Feature
>          Components: JAX-RS
>            Reporter: Sergey Beryozkin
>            Assignee: Andriy Redko
>            Priority: Minor
>
> Introduce TikaSearchVisitor, which will convert a FIQL (or similar) search 
> expression into an Apache Tika component that can be used to search binary 
> data; for example, the service can support something like "find all PDF 
> files matching a given expression".



--
This message was sent by Atlassian JIRA
(v6.2#6252)
