Shalin, thanks for the speedy response.

>Which version of Solr are you using?

Solr Implementation Version: nightly exported - yonik - 2008-11-13 08:05:48
>
>I think there should be a dataSource="null" in the child entity as well.

OK that had an effect; I now get:-

Jan 13, 2009 4:42:28 PM org.apache.solr.core.SolrDeletionPolicy updateCommits
INFO: last commit = 1231864933487
Jan 13, 2009 4:42:28 PM org.apache.solr.handler.dataimport.DocBuilder buildDocument
SEVERE: Exception while processing: janescurrent document : null
org.apache.solr.handler.dataimport.DataImportHandlerException: Parsing failed for xml, url:/Volumes/spare/ts/janes/dtd/janesxml/data/news/jtic/groups/jwit0009.xmlrows processed :0 Processing Document # 1
        at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
        at org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:252)
        at org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:177)
        at org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:160)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:283)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:309)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:179)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:137)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:337)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:397)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:378)
Caused by: java.lang.RuntimeException: java.lang.NullPointerException
        at org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:85)
        at org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:242)
        ... 9 more
Caused by: java.lang.NullPointerException
        at com.ctc.wstx.io.ReaderBootstrapper.initialLoad(ReaderBootstrapper.java:245)
        at com.ctc.wstx.io.ReaderBootstrapper.bootstrapInput(ReaderBootstrapper.java:132)
        at com.ctc.wstx.stax.WstxInputFactory.doCreateSR(WstxInputFactory.java:543)
        at com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:604)
        at com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:660)
        at com.ctc.wstx.stax.WstxInputFactory.createXMLStreamReader(WstxInputFactory.java:331)
        at org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:81)
        ... 10 more
Jan 13, 2009 4:42:28 PM org.apache.solr.handler.dataimport.DataImporter doFullImport
SEVERE: Full Import failed
org.apache.solr.handler.dataimport.DataImportHandlerException: Parsing failed for xml, url:/Volumes/spare/ts/janes/dtd/janesxml/data/news/jtic/groups/jwit0009.xmlrows processed :0 Processing Document # 1
        at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
        at org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:252)
        at org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:177)
        at org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:160)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:283)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:309)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:179)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:137)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:337)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:397)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:378)
Caused by: java.lang.RuntimeException: java.lang.NullPointerException
        at org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:85)
        at org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:242)
        ... 9 more
Caused by: java.lang.NullPointerException
        at com.ctc.wstx.io.ReaderBootstrapper.initialLoad(ReaderBootstrapper.java:245)
        at com.ctc.wstx.io.ReaderBootstrapper.bootstrapInput(ReaderBootstrapper.java:132)
        at com.ctc.wstx.stax.WstxInputFactory.doCreateSR(WstxInputFactory.java:543)
        at com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:604)
        at com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:660)
        at com.ctc.wstx.stax.WstxInputFactory.createXMLStreamReader(WstxInputFactory.java:331)
        at org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:81)
        ... 10 more

>
>On Tue, Jan 13, 2009 at 9:28 PM, Fergus McMenemie <fer...@twig.me.uk> wrote:
>
>> Hello,
>>
>> I am trying to use DIH with FileListEntityProcessor to walk the
>> disk and read XML documents.
>> I have a dataConfig.xml as follows:-
>>
>> <dataConfig>
>>   <document>
>>     <entity name="jcurrent"
>>             processor="FileListEntityProcessor"
>>             fileName=".*xml"
>>             newerThan="'NOW-1000DAYS'"
>>             recursive="true"
>>             rootEntity="false"
>>             dataSource="null"
>>             baseDir="/Volumes/spare/ts/j/groups">
>>       <entity name="x"
>>               processor="XPathEntityProcessor"
>>               url="${jcurrent.fileAbsolutePath}"
>>               stream="false"
>>               forEach="/record"
>>               transformer="DateFormatTransformer">0
>>         <field column="title" xpath="/record/title"/>
>>         <field column="subject"
>>                xpath="/record/metadata/subje...@qualifier='fullTitle']"/>
>>         <field column="text" xpath="//para"/>
>>         <field column="pubname"
>>                xpath="/record/metadata/subje...@qualifier='publication']"/>
>>         <field column="pubabrev"
>>                xpath="/record/metadata/subje...@qualifier='pubAbbrev']"/>
>>         <field column="pubdate"
>>                xpath="/record/metadata/da...@qualifier='pubDate']"/>
>>
>>       </entity>
>>     </entity>
>>   </document>
>> </dataConfig>
>>
>> But when I try and start the walker I get:-
>>
>> INFO: [jdocs] REMOVING ALL DOCUMENTS FROM INDEX
>> Jan 13, 2009 3:38:11 PM org.apache.solr.core.SolrDeletionPolicy onInit
>> INFO: SolrDeletionPolicy.onInit: commits:num=2
>>
>> commit{dir=/Volumes/spare/ts/solrnightlyj/data/index,segFN=segments_1,version=1231861070710,generation=1,filenames=[segments_1]
>>
>> commit{dir=/Volumes/spare/ts/solrnightlyj/data/index,segFN=segments_2,version=1231861070711,generation=2,filenames=[segments_2]
>> Jan 13, 2009 3:38:11 PM org.apache.solr.core.SolrDeletionPolicy updateCommits
>> INFO: last commit = 1231861070711
>> Jan 13, 2009 3:38:11 PM org.apache.solr.handler.dataimport.DocBuilder buildDocument
>> SEVERE: Exception while processing: jcurrent document : null
>> org.apache.solr.handler.dataimport.DataImportHandlerException: No dataSource :null available for entity :x Processing Document # 1
>>        at org.apache.solr.handler.dataimport.DataImporter.getDataSourceInstance(DataImporter.java:287)
>>        at org.apache.solr.handler.dataimport.ContextImpl.getDataSource(ContextImpl.java:86)
>>        at org.apache.solr.handler.dataimport.XPathEntityProcessor.init(XPathEntityProcessor.java:78)
>>        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:243)
>>        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:309)
>>        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:179)
>>        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:137)
>>        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:337)
>>        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:397)
>>        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:378)
>> Jan 13, 2009 3:38:11 PM org.apache.solr.handler.dataimport.DataImporter doFullImport
>> SEVERE: Full Import failed
>> org.apache.solr.handler.dataimport.DataImportHandlerException: No dataSource :null available for entity :x Processing Document # 1
>>        at org.apache.solr.handler.dataimport.DataImporter.getDataSourceInstance(DataImporter.java:287)
>>        at org.apache.solr.handler.dataimport.ContextImpl.getDataSource(ContextImpl.java:86)
>>        at org.apache.solr.handler.dataimport.XPathEntityProcessor.init(XPathEntityProcessor.java:78)
>>        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:243)
>>        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:309)
>>        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:179)
>>        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:137)
>>        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:337)
>>        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:397)
>>        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:378)
>>
>> Anybody able to point out what I have done wrong?
>>
>> Regards Fergus.
>
>--
>Regards,
>Shalin Shekhar Mangar.

--
===============================================================
Fergus McMenemie               Email:fer...@twig.me.uk
Techmore Ltd                   Phone:(UK) 07721 376021

Unix/Mac/Intranets             Analyst Programmer
===============================================================