I have raised an issue and attached a patch. Please confirm whether it helps: https://issues.apache.org/jira/browse/SOLR-964
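On the question quoted below about disabling the DTD fetch at the parser level: here is a minimal sketch of one way to do it with the standard StAX API. It is not necessarily what the SOLR-964 patch does. The class and method names (`DtdSkipSketch`, `firstElementName`) are made up for illustration; `XMLInputFactory.SUPPORT_DTD` and `XMLInputFactory.IS_VALIDATING` are standard StAX properties. Note that with DTD support off, entities declared only in the DTD will no longer resolve, so this may not suit documents that rely on them.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, for illustration only.
final class DtdSkipSketch {

    // Build a factory that neither validates nor processes DTDs, so the
    // parser should not try to open the file named in <!DOCTYPE ... SYSTEM "...">.
    static XMLInputFactory newFactory() {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_VALIDATING, Boolean.FALSE);
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        return factory;
    }

    // Return the local name of the first element in the document,
    // or null if the document contains no element.
    static String firstElementName(String xml) {
        try {
            XMLStreamReader reader =
                newFactory().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        return reader.getLocalName();
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
    }
}
```

Calling `DtdSkipSketch.firstElementName(...)` on a document whose DOCTYPE points at a missing DTD should then read through to the root element instead of failing in the prolog.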
On Fri, Jan 16, 2009 at 3:52 PM, Noble Paul നോബിള് नोब्ळ् <noble.p...@gmail.com> wrote:
> stax parser automatically tries to fetch the DTD. How can we disable
> that at the parser level?
>
> On Fri, Jan 16, 2009 at 3:34 PM, Fergus McMenemie <fer...@twig.me.uk> wrote:
>> Hello all, as the subject says:
>> DIH XPathEntityProcessor fails with docs containing <!DOCTYPE>
>>
>> This is using a solr nightly build from monday.
>>
>> INFO: Server startup in 3623 ms
>> Jan 16, 2009 9:54:12 AM org.apache.solr.handler.dataimport.SolrWriter readIndexerProperties
>> INFO: Read dataimport.properties
>> Jan 16, 2009 9:54:12 AM org.apache.solr.core.SolrCore execute
>> INFO: [jdocs] webapp=/solr path=/walkj params={command=full-import} status=0 QTime=13
>> Jan 16, 2009 9:54:12 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
>> INFO: Starting Full Import
>> Jan 16, 2009 9:54:12 AM org.apache.solr.update.DirectUpdateHandler2 deleteAll
>> INFO: [jdocs] REMOVING ALL DOCUMENTS FROM INDEX
>> Jan 16, 2009 9:54:12 AM org.apache.solr.core.SolrDeletionPolicy onInit
>> INFO: SolrDeletionPolicy.onInit: commits:num=2
>>
>> commit{dir=/Volumes/spare/ts/solrnightlyj/data/index,segFN=segments_c,version=1232026423291,generation=12,filenames=[segments_c, _4.fnm, _4.frq, _4.prx, _4.tis, _4.tii, _4.nrm, _4.fdx, _4.fdt]
>>
>> commit{dir=/Volumes/spare/ts/solrnightlyj/data/index,segFN=segments_d,version=1232026423292,generation=13,filenames=[segments_d]
>> Jan 16, 2009 9:54:12 AM org.apache.solr.core.SolrDeletionPolicy updateCommits
>> INFO: last commit = 1232026423292
>> Jan 16, 2009 9:54:13 AM org.apache.solr.handler.dataimport.DocBuilder buildDocument
>> SEVERE: Exception while processing: jcurrent document : null
>> org.apache.solr.handler.dataimport.DataImportHandlerException: Parsing failed for xml, url:/j/dtd/jxml/data/news/2008/frp70450.xmlrows processed :0 Processing Document # 1
>>     at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
>>     at org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:252)
>>     at org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:177)
>>     at org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:160)
>>     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:313)
>>     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:339)
>>     at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:202)
>>     at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:147)
>>     at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:321)
>>     at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:381)
>>     at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:362)
>> Caused by: java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: (was java.io.FileNotFoundException) /../config/jml-delivery-norm-2.1.dtd (No such file or directory)
>>  at [row,col {unknown-source}]: [3,81]
>>     at org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:85)
>>     at org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:242)
>>     ... 9 more
>> Caused by: com.ctc.wstx.exc.WstxParsingException: (was java.io.FileNotFoundException) /../config/jml-delivery-norm-2.1.dtd (No such file or directory)
>>  at [row,col {unknown-source}]: [3,81]
>>     at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:630)
>>     at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:461)
>>     at com.ctc.wstx.sr.ValidatingStreamReader.findDtdExtSubset(ValidatingStreamReader.java:475)
>>     at com.ctc.wstx.sr.ValidatingStreamReader.finishDTD(ValidatingStreamReader.java:358)
>>     at com.ctc.wstx.sr.BasicStreamReader.skipToken(BasicStreamReader.java:3351)
>>     at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:1988)
>>     at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1069)
>>     at org.apache.solr.handler.dataimport.XPathRecordReader$Node.parse(XPathRecordReader.java:141)
>>     at org.apache.solr.handler.dataimport.XPathRecordReader$Node.access$000(XPathRecordReader.java:89)
>>     at org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:82)
>>     ... 10 more
>> Jan 16, 2009 9:54:13 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
>> SEVERE: Full Import failed
>>
>> A fragment from the top of the failing document is
>>
>> <?xml version="1.0" encoding="ISO-8859-1"?>
>> <?xml-stylesheet type="text/xsl" href="../../../../config/support/j-deliver.xsl"?>
>> <!DOCTYPE j:record SYSTEM "../../../../config/jml-delivery-norm-2.1.dtd">
>> <j:record xmlns:j="http://dtd.j.com/2002/Content/" id="frp70450" urname="record">
>> <j:metadata xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="" urname="metadata" xlink:type="simple">
>> <dc:date xmlns:dc="http://purl.org/dc/elements/1.1/" qualifier="pdate">20080131</dc:date>
>>
>> The DTD does exist at the specified location. Removing the DOCTYPE directive
>> fixes everything. I know that use of DOCTYPE is out of fashion, and it does
>> not exist in our newer documents, however there are lots of older XML docs
>> about!
>>
>> Regards Fergus.
>> --
>>
>> ===============================================================
>> Fergus McMenemie               Email:fer...@twig.me.uk
>> Techmore Ltd                   Phone:(UK) 07721 376021
>>
>> Unix/Mac/Intranets             Analyst Programmer
>> ===============================================================
>>
>
>
>
> --
> --Noble Paul
>
--
--Noble Paul