This may help: http://www.pdfbox.org/userguide/text_extraction.html#Lucene+Integration
ashwin kumar wrote: > hi all i am able to convert a pdf in to a text file using pdfbox. and this > is the code that i used > > import org.pdfbox.pdfparser.PDFParser; > import org.pdfbox.pdmodel.PDDocument; > import org.pdfbox.util.PDFTextStripper; > import org.pdfbox.*; > > import java.io.*; > > public class PDFConvert > { > > public static void main(String [] args) > { > String content = null; > try > { > > String pdfFile=new String ("D:\\ASHWIN\\res\\ashwin.pdf"); > PDDocument doc = PDDocument.load(pdfFile); > PDFTextStripper strip = new PDFTextStripper(); > content = strip.getText(doc); > System.out.println(content); > } > catch(Exception e) > { > e.printStackTrace(); > } > > } > } > > now i want to index this text information with lucene . wat is code > required > for that pls help > > regards > ashwin > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]