Hi all,

I tried to extract text from MS Word 2007 .docx files using the implementation below (with poi-3.9.0) in order to fix an existing bug in API Manager 1.7.0. (The bug causes an error while extracting and indexing .docx files in MSWordIndexer.)

    XWPFDocument doc = new XWPFDocument(new ByteArrayInputStream(fileData.data));
    XWPFWordExtractor extractor = new XWPFWordExtractor(doc);
    String wordText = extractor.getText();

Then I applied the patch and tried to upload and extract a .docx file, but the following error was given.

[2014-07-14 09:46:20,328] ERROR - AsyncIndexer Error while indexing.
java.lang.ExceptionInInitializerError
    at org.openxmlformats.schemas.wordprocessingml.x2006.main.DocumentDocument$Factory.parse(Unknown Source)
    at org.apache.poi.xwpf.usermodel.XWPFDocument.onDocumentRead(XWPFDocument.java:134)
    at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:159)
    at org.apache.poi.xwpf.usermodel.XWPFDocument.<init>(XWPFDocument.java:123)
    at org.wso2.carbon.apimgt.impl.indexing.indexer.MSWordIndexer.getIndexedDocument(MSWordIndexer.java:40)
    at org.wso2.carbon.registry.indexing.solr.SolrClient.indexDocument(SolrClient.java:178)
    at org.wso2.carbon.registry.indexing.AsyncIndexer$IndexingTask.doWork(AsyncIndexer.java:203)
    at org.wso2.carbon.registry.indexing.AsyncIndexer$IndexingTask.run(AsyncIndexer.java:189)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.RuntimeException: Cannot load SchemaTypeSystem. Unable to load class with name schemaorg_apache_xmlbeans.system.sE130CAA0A01A7CDE5A2B4FEB8B311707.TypeSystemHolder. Make sure the generated binary files are on the classpath.
    at org.apache.xmlbeans.XmlBeans.typeSystemForClassLoader(XmlBeans.java:783)
    at org.openxmlformats.schemas.wordprocessingml.x2006.main.DocumentDocument.<clinit>(Unknown Source)
    ... 14 more
Caused by: java.lang.ClassNotFoundException: schemaorg_apache_xmlbeans.system.sE130CAA0A01A7CDE5A2B4FEB8B311707.TypeSystemHolder
    at org.eclipse.osgi.internal.loader.BundleLoader.findClassInternal(BundleLoader.java:501)
    at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:421)
    at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:412)
    at org.eclipse.osgi.internal.baseadaptor.DefaultClassLoader.loadClass(DefaultClassLoader.java:107)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    at org.apache.xmlbeans.XmlBeans.typeSystemForClassLoader(XmlBeans.java:769)
    ... 15 more

To fix the above error, I then put poi-ooxml-schemas-3.9.jar into the repository/components/lib directory. But when I tried to extract a .docx file again, the following ClassCastException occurred.

[2014-07-14 09:52:22,518] ERROR - AsyncIndexer Could not index the resource: path=/_system/governance/apimgt/applicationdata/provider/admin/dwd/2/documentation/files/r.docx, media type=application/msword
java.lang.ClassCastException: org.openxmlformats.schemas.wordprocessingml.x2006.main.impl.DocumentDocumentImpl cannot be cast to org.openxmlformats.schemas.wordprocessingml.x2006.main.DocumentDocument
    at org.openxmlformats.schemas.wordprocessingml.x2006.main.DocumentDocument$Factory.parse(Unknown Source)
    at org.apache.poi.xwpf.usermodel.XWPFDocument.onDocumentRead(XWPFDocument.java:134)
    at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:159)
    at org.apache.poi.xwpf.usermodel.XWPFDocument.<init>(XWPFDocument.java:123)
    at org.wso2.carbon.apimgt.impl.indexing.indexer.MSWordIndexer.getIndexedDocument(MSWordIndexer.java:40)
    at org.wso2.carbon.registry.indexing.solr.SolrClient.indexDocument(SolrClient.java:178)
    at org.wso2.carbon.registry.indexing.AsyncIndexer$IndexingTask.doWork(AsyncIndexer.java:203)
    at org.wso2.carbon.registry.indexing.AsyncIndexer$IndexingTask.run(AsyncIndexer.java:189)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

What may be the issue here?

--
Thilini Shanika
Software Engineer
WSO2, Inc.; http://wso2.com
20, Palmgrove Avenue, Colombo 3
E-mail: tgtshan...@gmail.com
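
P.S. A cast from an implementation class to its own interface can fail when the two types were loaded by different class loaders, so one possible way to narrow this down is to log where each class comes from. The sketch below is only an illustration under that assumption: ClassOriginCheck, describe and sameLoader are hypothetical names, not part of POI or Carbon, and it uses only standard JDK calls.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper (not part of POI or WSO2 Carbon).
public class ClassOriginCheck {

    // Reports which class loader defined the class and where it was loaded from.
    static String describe(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // JDK core classes report a null loader (the bootstrap loader).
        String loaderName = (loader == null) ? "bootstrap" : loader.toString();
        String location = (source == null || source.getLocation() == null)
                ? "unknown"
                : source.getLocation().toString();
        return type.getName() + " loaded by " + loaderName + " from " + location;
    }

    // True when both classes were defined by the same class loader instance.
    static boolean sameLoader(Class<?> a, Class<?> b) {
        return a.getClassLoader() == b.getClassLoader();
    }

    public static void main(String[] args) {
        System.out.println(describe(String.class));
        System.out.println(describe(ClassOriginCheck.class));
    }
}
```

Inside the indexer, the same helper could be called with XWPFDocument.class and the DocumentDocument interface named in the trace; if the reported loaders or jar locations differ, that would point to the schema classes being visible from more than one place.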
_______________________________________________
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev