On Dec 1, 2008, at 8:01 AM, tiziano bernardi wrote:
I tried to use PDFBox, but it gives me an error saying that the
versions of Lucene and PDFBox are incompatible.
Lucene knows nothing about PDFBox, so I don't see how they could be
incompatible, unless you are referring to PDFBox's Lucene Document
creator, in which case you should ask on the PDFBox mailing list. I
think, however, that it's pretty straightforward to create a Lucene
document from the text PDFBox extracts, so you shouldn't need to rely
on their version.
Personally, I'd have a look at Tika (http://lucene.apache.org/tika),
which wraps PDFBox (and other extraction libraries) and gives you back
SAX-like events via a ContentHandler, which you can then use to create
Lucene documents. Alternatively, I've been working on SOLR-284, which
integrates Tika into Solr; see https://issues.apache.org/jira/browse/SOLR-284
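
To make the ContentHandler idea concrete, here is a rough sketch that uses only JDK classes. The class name TextCollector, the field names "title" and "contents", and the extract() helper are all made up for illustration. The JDK's own SAX parser stands in for Tika here so the example runs without extra jars; the assumption is that a Tika parser would drive the same kind of handler with XHTML events for a PDF.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects text from SAX events into named fields.
// The field names are illustrative, not something Tika or Lucene mandates.
class TextCollector extends DefaultHandler {
    private final StringBuilder title = new StringBuilder();
    private final StringBuilder body = new StringBuilder();
    private boolean inTitle = false;

    // Works whether or not the parser is namespace-aware.
    private static String name(String localName, String qName) {
        return (localName == null || localName.isEmpty()) ? qName : localName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("title".equals(name(localName, qName))) {
            inTitle = true;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String n = name(localName, qName);
        if ("title".equals(n)) {
            inTitle = false;
        } else if ("p".equals(n)) {
            body.append('\n'); // keep paragraphs apart in the extracted text
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        (inTitle ? title : body).append(ch, start, length);
    }

    // The collected fields; in a real indexer each entry would become
    // one field on a Lucene Document before it is handed to an IndexWriter.
    Map<String, String> fields() {
        Map<String, String> f = new LinkedHashMap<String, String>();
        f.put("title", title.toString().trim());
        f.put("contents", body.toString().trim());
        return f;
    }

    // Stand-in driver: parses an XHTML string with the JDK SAX parser.
    static Map<String, String> extract(String xhtml) {
        try {
            TextCollector handler = new TextCollector();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xhtml)), handler);
            return handler.fields();
        } catch (Exception e) {
            throw new RuntimeException("extraction failed", e);
        }
    }
}
```

The point of the design is that the handler never needs to know the text came from a PDF, so the same code would serve for any other format the extraction library can turn into events.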
-Grant
I use PDFBox 0.7.3 and Lucene 2.1.0

> Date: Mon, 1 Dec 2008 11:43:00 +0000
> From: [EMAIL PROTECTED]
> To: java-user@lucene.apache.org
> Subject: Re: Pdf in Lucene?
>
> Hi
>
> Lucene only indexes text so you'll have to get the text out of the PDF
> and feed it to lucene.
>
> Google for lucene pdf, or go straight to http://www.pdfbox.org/
>
> --
> Ian.
>
> 2008/12/1 tiziano bernardi <[EMAIL PROTECTED]>:
> > Hi,
> > I want to index PDF files with lucene is possible?
> > What like?
> > Thanks Tiziano Bernardi
--------------------------
Grant Ingersoll
Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]