hi all i am able to convert a pdf in to a text file using pdfbox. and this
is the code that i used

import org.pdfbox.pdfparser.PDFParser;
import org.pdfbox.pdmodel.PDDocument;
import org.pdfbox.util.PDFTextStripper;
import org.pdfbox.*;

import java.io.*;

public class PDFConvert
{

public static void main(String [] args)
{
String content = null;
try
{

   String pdfFile=new String ("D:\\ASHWIN\\res\\ashwin.pdf");
   PDDocument doc = PDDocument.load(pdfFile);
   PDFTextStripper strip = new PDFTextStripper();
   content = strip.getText(doc);
   System.out.println(content);
}
catch(Exception e)
{
   e.printStackTrace();
}

}
}

now i want to index this text information with lucene . wat is code required
for that pls help

regards
ashwin

Reply via email to