Ah, it needs a null check for multi-valued fields. I've committed a fix to trunk, and the next nightly build should include it. You can check out and build from trunk if you need this immediately.
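
For anyone reading along, here is a rough sketch of the kind of guard a row transformer needs when a column may be absent or multi-valued. This is not the change committed to trunk. The class name NullSafeStripSketch, the helper stripTags, and the regex-based tag removal are all made up for illustration and do not use the Solr API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the HTMLStripTransformer source.
public class NullSafeStripSketch {

    // Crude stand-in for real HTML stripping: drops anything that looks like a tag.
    static String stripTags(String value) {
        return value.replaceAll("<[^>]*>", "");
    }

    // Strips markup from one column of a row, tolerating a missing column
    // and multi-valued (List) columns that may contain null entries.
    static void stripColumn(Map<String, Object> row, String column) {
        Object value = row.get(column);
        if (value == null) {
            return; // column not present in this record: nothing to strip
        }
        if (value instanceof List) {
            List<Object> cleaned = new ArrayList<>();
            for (Object item : (List<?>) value) {
                // individual entries of a multi-valued field can be null too
                cleaned.add(item == null ? null : stripTags(item.toString()));
            }
            row.put(column, cleaned);
        } else {
            row.put(column, stripTags(value.toString()));
        }
    }
}
```

The point is simply that the value is checked before anything like a StringReader is built from it, which is where the trace below blows up.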

On Mon, Jan 19, 2009 at 7:02 PM, Fergus McMenemie <fer...@twig.me.uk> wrote:

> Hmmm,
>
> Just to clarify I retested the thing using the nightly as of today
> 18-jan-2009. The problem is still there and this traceback is from
> that nightly.
>
> >>This looks fine. Can you post the stack trace?
> >>
> >Yep, here is the juicy bit. Let me know if you need more.
> >
> >Jan 19, 2009 11:08:03 AM org.apache.catalina.startup.Catalina start
> >INFO: Server startup in 2390 ms
> >Jan 19, 2009 11:14:06 AM org.apache.solr.core.SolrCore execute
> >INFO: [janesdocs] webapp=/solr path=/dataimport params={command=full-import} status=0 QTime=12
> >Jan 19, 2009 11:14:06 AM org.apache.solr.handler.dataimport.SolrWriter readIndexerProperties
> >INFO: Read dataimport.properties
> >Jan 19, 2009 11:14:06 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
> >INFO: Starting Full Import
> >Jan 19, 2009 11:14:06 AM org.apache.solr.update.DirectUpdateHandler2 deleteAll
> >INFO: [janesdocs] REMOVING ALL DOCUMENTS FROM INDEX
> >Jan 19, 2009 11:14:06 AM org.apache.solr.core.SolrDeletionPolicy onInit
> >INFO: SolrDeletionPolicy.onInit: commits:num=2
> >    commit{dir=/Volumes/spare/ts/solrnightlyjanes/data/index,segFN=segments_1,version=1232363283058,generation=1,filenames=[segments_1]
> >    commit{dir=/Volumes/spare/ts/solrnightlyjanes/data/index,segFN=segments_2,version=1232363283059,generation=2,filenames=[segments_2]
> >Jan 19, 2009 11:14:06 AM org.apache.solr.core.SolrDeletionPolicy updateCommits
> >INFO: last commit = 1232363283059
> >Jan 19, 2009 11:14:06 AM org.apache.solr.handler.dataimport.EntityProcessorBase applyTransformer
> >WARNING: transformer threw error
> >java.lang.NullPointerException
> >    at java.io.StringReader.<init>(StringReader.java:33)
> >    at org.apache.solr.handler.dataimport.HTMLStripTransformer.stripHTML(HTMLStripTransformer.java:71)
> >    at org.apache.solr.handler.dataimport.HTMLStripTransformer.transformRow(HTMLStripTransformer.java:54)
> >    at org.apache.solr.handler.dataimport.EntityProcessorBase.applyTransformer(EntityProcessorBase.java:187)
> >    at org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:197)
> >    at org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:160)
> >    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:313)
> >    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:339)
> >    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:202)
> >    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:147)
> >    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:321)
> >    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:381)
> >    at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:362)
> >Jan 19, 2009 11:14:06 AM org.apache.solr.handler.dataimport.DocBuilder buildDocument
> >SEVERE: Exception while processing: janescurrent document : null
> >org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NullPointerException
> >    at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:64)
> >    at org.apache.solr.handler.dataimport.EntityProcessorBase.applyTransformer(EntityProcessorBase.java:203)
> >    at org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:197)
> >    at org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:160)
> >    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:313)
> >    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:339)
> >    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:202)
> >    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:147)
> >    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:321)
> >    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:381)
> >    at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:362)
> >Caused by: java.lang.NullPointerException
> >    at java.io.StringReader.<init>(StringReader.java:33)
> >    at org.apache.solr.handler.dataimport.HTMLStripTransformer.stripHTML(HTMLStripTransformer.java:71)
> >    at org.apache.solr.handler.dataimport.HTMLStripTransformer.transformRow(HTMLStripTransformer.java:54)
> >    at org.apache.solr.handler.dataimport.EntityProcessorBase.applyTransformer(EntityProcessorBase.java:187)
> >    ... 9 more
> >Jan 19, 2009 11:14:06 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
> >SEVERE: Full Import failed
> >org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NullPointerException
> >    at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:64)
> >    at org.apache.solr.handler.dataimport.EntityProcessorBase.applyTransformer(EntityProcessorBase.java:203)
> >    at org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:197)
> >    at org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:160)
> >    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:313)
> >    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:339)
> >    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:202)
> >    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:147)
> >    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:321)
> >    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:381)
> >    at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:362)
> >Caused by: java.lang.NullPointerException
> >    at java.io.StringReader.<init>(StringReader.java:33)
> >    at org.apache.solr.handler.dataimport.HTMLStripTransformer.stripHTML(HTMLStripTransformer.java:71)
> >    at org.apache.solr.handler.dataimport.HTMLStripTransformer.transformRow(HTMLStripTransformer.java:54)
> >    at org.apache.solr.handler.dataimport.EntityProcessorBase.applyTransformer(EntityProcessorBase.java:187)
> >    ... 9 more
> >Jan 19, 2009 11:14:06 AM org.apache.solr.update.DirectUpdateHandler2 rollback
> >INFO: start rollback
> >Jan 19, 2009 11:14:06 AM org.apache.solr.update.DirectUpdateHandler2 rollback
> >INFO: end_rollback
> >
> >
> >>On Mon, Jan 19, 2009 at 4:14 PM, Fergus McMenemie <fer...@twig.me.uk> wrote:
> >>
> >>> Hello all,
> >>>
> >>> I have the following DIH data-config.xml file. Adding
> >>> HTMLStripTransformer and the associated stripHTML on the
> >>> para tag seems to have broke things. I am using a nightly
> >>> build from 12-jan-2009
> >>>
> >>> The /record/sect1/para contains HTML sub tags which need
> >>> to be discarded. Is my use of stripHTML correct?
> >>>
> >>> <dataConfig>
> >>>   <dataSource name="myfilereader" type="FileDataSource"/>
> >>>   <document>
> >>>     <entity name="jcurrent"
> >>>             processor="FileListEntityProcessor"
> >>>             fileName=".*xml"
> >>>             newerThan="'NOW-1000DAYS'"
> >>>             recursive="true"
> >>>             rootEntity="false"
> >>>             dataSource="null"
> >>>             baseDir="/Volumes/spare/ts/jxml/data/news/groups">
> >>>
> >>>       <entity name="x"
> >>>               dataSource="myfilereader"
> >>>               processor="XPathEntityProcessor"
> >>>               url="${jcurrent.fileAbsolutePath}"
> >>>               stream="false"
> >>>               forEach="/record"
> >>>               transformer="DateFormatTransformer,TemplateTransformer,RegexTransformer,HTMLStripTransformer">
> >>>
> >>>         <field column="fileAbsPath"
> >>>                template="${jcurrent.fileAbsolutePath}" />
> >>>         <field column="fileWebPath" regex="/Volumes/spare/ts/(.*)"
> >>>                replaceWith="$1" sourceColName="fileAbsePath"/>
> >>>         <field column="title" xpath="/record/title" />
> >>>         <field column="para" xpath="/record/sect1/para"
> >>>                stripHTML="true" />
> >>>         <field column="subject"
> >>>                xpath="/record/metadata/subje...@qualifier='fullTitle']" />
> >>>         <field column="pubname"
> >>>                xpath="/record/metadata/subje...@qualifier='publication']" />
> >>>         <field column="pubdate"
> >>>                xpath="/record/metadata/da...@qualifier='pubDate']"
> >>>                dateTimeFormat="yyyyMMdd" />
> >>>       </entity>
> >>>     </entity>
> >>>   </document>
> >>> </dataConfig>
> >>>
> >>> --
> >>>
> >>> ===============================================================
> >>> Fergus McMenemie               Email:fer...@twig.me.uk
> >>> Techmore Ltd                   Phone:(UK) 07721 376021
> >>>
> >>> Unix/Mac/Intranets             Analyst Programmer
> >>> ===============================================================
> >>>
> >>
> >>
> >>
> >>--
> >>Regards,
> >>Shalin Shekhar Mangar.
> >
> >--
> >
> >===============================================================
> >Fergus McMenemie               Email:fer...@twig.me.uk
> >Techmore Ltd                   Phone:(UK) 07721 376021
> >
> >Unix/Mac/Intranets             Analyst Programmer
> >===============================================================
>
> --
>
> ===============================================================
> Fergus McMenemie               Email:fer...@twig.me.uk
> Techmore Ltd                   Phone:(UK) 07721 376021
>
> Unix/Mac/Intranets             Analyst Programmer
> ===============================================================
>

--
Regards,
Shalin Shekhar Mangar.