Hi Tiziano,

What error did you get? You should be able to extract the text easily with the code shown below.

    // Parse the PDF with PDFBox and extract its plain text
    FileInputStream fi = new FileInputStream(new File("sample.pdf"));
    PDFParser parser = new PDFParser(fi);
    parser.parse();
    COSDocument cd = parser.getDocument();
    PDFTextStripper stripper = new PDFTextStripper();
    String text = stripper.getText(new PDDocument(cd));
    cd.close();  // release the parsed document once the text is extracted

Once you have the value of text, you can simply create the Lucene document:

    Document doc = new Document();
    // Field.Store.NO: the text is tokenized and indexed, but not stored
    doc.add(new Field("content", text, Field.Store.NO, Field.Index.TOKENIZED));

On Thu, Dec 4, 2008 at 6:20 PM, tiziano bernardi <[EMAIL PROTECTED]> wrote:

> Thanks, very kind...
> I've tried that code, but it does not work for me.
> Could you send me a simple working class that uses it, please?
> Thanks
>
> > Date: Thu, 4 Dec 2008 15:19:26 +0530
> > From: [EMAIL PROTECTED]
> > To: java-user@lucene.apache.org
> > Subject: Re: Pdf in Lucene?
> >
> > Hi,
> >
> > In my case I used PDFBox just to extract the text from the PDF document,
> > and then I created the Lucene document from the extracted text. (I didn't
> > use the PDFBox built-in Lucene search engine.) So I didn't get any
> > incompatibility problems.
> >
> > This blog post shows the way:
> > http://kalanir.blogspot.com/2008/08/indexing-pdf-documents-with-lucene.html
> >
> > It worked perfectly for me.
> >
> > Thanks.

-- 
Kalani Ruwanpathirana
Department of Computer Science & Engineering
University of Moratuwa