If you just need the text from the document, try ExtractorFactory:
============================
import java.io.File;
import org.apache.poi.extractor.ExtractorFactory;
import org.apache.poi.POITextExtractor;

public class GetTextExample {
    public static void main(String[] args) {
        try {
            File inputFile = new File("c:\\test\\docs\\test.docx");
            // The factory inspects the file and returns a matching extractor
            POITextExtractor extractor =
                ExtractorFactory.createExtractor(inputFile);
            System.out.println("Word Document Text: ");
            System.out.println("====================");
            // getText() returns the document contents as plain text
            System.out.println(extractor.getText());
        }
        catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}
============================
My classpath:
poi-3.5-beta5/poi-3.5-beta5-20090219.jar
poi-3.5-beta5/poi-contrib-3.5-beta5-20090219.jar
poi-3.5-beta5/poi-ooxml-3.5-beta5-20090219.jar
poi-3.5-beta5/poi-scratchpad-3.5-beta5-20090219.jar
poi-3.5-beta5/lib/log4j-1.2.13.jar
poi-3.5-beta5/ooxml-lib/*.jar
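
For what it's worth, a .docx is just a ZIP archive, and the main body text sits in word/document.xml inside it. The rough sketch below pulls that text out with nothing but the JDK, to show what the extractor is reading on your behalf. It is not how POI implements it. The class name DocxPlainText and its methods are made up for illustration, and it assumes it is enough to read the w:t runs inside each w:p paragraph, so headers, footers, footnotes, tabs and line breaks are ignored.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of POI: a JDK-only sketch of docx text extraction.
public class DocxPlainText {
    // Namespace of the WordprocessingML elements (w:p, w:t, ...)
    private static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Finds word/document.xml in the archive and returns one line per paragraph. */
    public static String extract(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("word/document.xml".equals(entry.getName())) {
                    return textOf(readAll(zip));
                }
            }
        }
        throw new IOException("word/document.xml not found, is this a .docx?");
    }

    /** Copies the current ZIP entry into memory so the XML parser cannot close the archive. */
    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    /** Concatenates the w:t text runs of every w:p paragraph. */
    private static String textOf(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // Refuse DOCTYPE declarations, since the file may come from anywhere
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));

        StringBuilder text = new StringBuilder();
        NodeList paragraphs = doc.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            Element paragraph = (Element) paragraphs.item(i);
            NodeList runs = paragraph.getElementsByTagNameNS(W_NS, "t");
            for (int j = 0; j < runs.getLength(); j++) {
                text.append(runs.item(j).getTextContent());
            }
            text.append('\n');
        }
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Expects the path to a .docx file as the only argument
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.print(extract(in));
        }
    }
}
```

Run it with the path to the file as the only argument, e.g. java DocxPlainText c:\test\docs\test.docx. For real documents I would still go through ExtractorFactory, which also copes with the older .doc format.
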
HTH
Leigh
> pof wrote:
> >
> > Hi, I was wondering if someone could provide an example how to
> > parse out the plain text from a docx using poi 3.5 beta5?
> >
> > Cheers, Brett.
> >