Hi Ulf,

While I understand the desire to index PDFs, I agree that Tika will increase
the jar load of the application quite substantially, so this decision should
be considered carefully. I've avoided Tika myself for this reason (in one
case it would have made my deployable application many times larger than my
own code).

Cheers,

Murray

>     [
> https://issues.apache.org/jira/browse/JSPWIKI-469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800635#comment-16800635
> ]
>
> Ulf Dittmer commented on JSPWIKI-469:
> -------------------------------------
>
> A year later than I had hoped I was finally able to get back to this, and
> it turned out to be easier than I had feared. The TikaParser class is
> attached; it compiles against 2.11.0.M2.
>
> jspwiki-main needs this dependency (which will draw in a LOT of other
> dependencies):
> <dependency>
>     <groupId>org.apache.tika</groupId>
>     <artifactId>tika-parsers</artifactId>
>     <version>1.20</version>
> </dependency>
>
>> Enhance LuceneSearchProvider for other Attachments
>> ---------------------------------------------------
>>
>>                 Key: JSPWIKI-469
>>                 URL: https://issues.apache.org/jira/browse/JSPWIKI-469
>>             Project: JSPWiki
>>          Issue Type: Improvement
>>            Reporter: NicolaFischer
>>            Assignee: Florian Holeczek
>>            Priority: Minor
>>             Fix For: FutureVersion
>>
>>         Attachments: TikaSearchProvider.java, patch.txt
>>
>>
>> LuceneProvider should index more file types than only plain text. This
>> is one attempt to index PDF files.
>> Required jars:
>> * [Apache POI|http://ftp.tpnet.pl/vol/d1/apache/poi/release/bin] (not
>> tested with 3.0.1 final)
>> * [PDFBox|http://www.pdfbox.org]
>> * [FontBox|http://www.fontbox.org]
>> * [OpenDocumentTextInputStream|http://books.evc-cit.info/odf_utils/index.html]
>> Patch attached for 2.8.1
>> Maybe we should check how to index more documents.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v7.6.3#76005)
>
>



...........................................................................
Murray Altheim <murray18 at altheim dot com>                       = =  ===
http://www.altheim.com/murray/                                     ===  ===
                                                                   = =  ===
     In the evening
     The rice leaves in the garden
     Rustle in the autumn wind
     That blows through my reed hut.
            -- Minamoto no Tsunenobu


