Ah, it needs a null check for multi-valued fields. I've committed a fix to
trunk. The next nightly build should have it. You can check out and build
from trunk if you need this immediately.
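
A rough sketch of the kind of null check described above, in case it helps
anyone reading the trace below. This is not the committed patch: the class
and method names (NullSafeStripSketch, stripField) are made up for
illustration, and the regex is only a crude stand-in for the real HTML
stripping. The one thing taken from the trace is that stripHTML wraps the
value in a StringReader, which throws a NullPointerException when given
null. The assumption here is that a multi-valued field arrives as a List
whose entries may be null.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration; not Solr's HTMLStripTransformer.
class NullSafeStripSketch {

    // Crude tag removal, assumed here only so the example is self-contained.
    static String stripHTML(String value) {
        return value.replaceAll("<[^>]*>", "");
    }

    // Strips HTML from a single value or from each entry of a multi-valued
    // field, passing nulls through instead of handing them to stripHTML.
    static Object stripField(Object value) {
        if (value == null) {
            return null; // nothing to strip
        }
        if (value instanceof List) {
            List<Object> stripped = new ArrayList<>();
            for (Object item : (List<?>) value) {
                // The null check: skip stripping for missing entries.
                stripped.add(item == null ? null : stripHTML(item.toString()));
            }
            return stripped;
        }
        return stripHTML(value.toString());
    }

    public static void main(String[] args) {
        List<Object> paras = new ArrayList<>();
        paras.add("<b>first</b> para");
        paras.add(null); // e.g. an empty para element
        System.out.println(stripField(paras));
    }
}
```

Whether a null entry should be kept or dropped from the list is a design
choice; the sketch keeps it so the positions of the values do not shift.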

On Mon, Jan 19, 2009 at 7:02 PM, Fergus McMenemie <fer...@twig.me.uk> wrote:

> Hmmm,
>
> Just to clarify, I retested this using the nightly as of today,
> 18-jan-2009. The problem is still there and this traceback is from
> that nightly.
>
> >>This looks fine. Can you post the stack trace?
> >>
> >Yep, here is the juicy bit. Let me know if you need more.
> >
> >Jan 19, 2009 11:08:03 AM org.apache.catalina.startup.Catalina start
> >INFO: Server startup in 2390 ms
> >Jan 19, 2009 11:14:06 AM org.apache.solr.core.SolrCore execute
> >INFO: [janesdocs] webapp=/solr path=/dataimport params={command=full-import} status=0 QTime=12
> >Jan 19, 2009 11:14:06 AM org.apache.solr.handler.dataimport.SolrWriter readIndexerProperties
> >INFO: Read dataimport.properties
> >Jan 19, 2009 11:14:06 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
> >INFO: Starting Full Import
> >Jan 19, 2009 11:14:06 AM org.apache.solr.update.DirectUpdateHandler2 deleteAll
> >INFO: [janesdocs] REMOVING ALL DOCUMENTS FROM INDEX
> >Jan 19, 2009 11:14:06 AM org.apache.solr.core.SolrDeletionPolicy onInit
> >INFO: SolrDeletionPolicy.onInit: commits:num=2
> >  commit{dir=/Volumes/spare/ts/solrnightlyjanes/data/index,segFN=segments_1,version=1232363283058,generation=1,filenames=[segments_1]
> >  commit{dir=/Volumes/spare/ts/solrnightlyjanes/data/index,segFN=segments_2,version=1232363283059,generation=2,filenames=[segments_2]
> >Jan 19, 2009 11:14:06 AM org.apache.solr.core.SolrDeletionPolicy updateCommits
> >INFO: last commit = 1232363283059
> >Jan 19, 2009 11:14:06 AM org.apache.solr.handler.dataimport.EntityProcessorBase applyTransformer
> >WARNING: transformer threw error
> >java.lang.NullPointerException
> >       at java.io.StringReader.<init>(StringReader.java:33)
> >       at org.apache.solr.handler.dataimport.HTMLStripTransformer.stripHTML(HTMLStripTransformer.java:71)
> >       at org.apache.solr.handler.dataimport.HTMLStripTransformer.transformRow(HTMLStripTransformer.java:54)
> >       at org.apache.solr.handler.dataimport.EntityProcessorBase.applyTransformer(EntityProcessorBase.java:187)
> >       at org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:197)
> >       at org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:160)
> >       at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:313)
> >       at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:339)
> >       at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:202)
> >       at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:147)
> >       at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:321)
> >       at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:381)
> >       at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:362)
> >Jan 19, 2009 11:14:06 AM org.apache.solr.handler.dataimport.DocBuilder buildDocument
> >SEVERE: Exception while processing: janescurrent document : null
> >org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NullPointerException
> >       at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:64)
> >       at org.apache.solr.handler.dataimport.EntityProcessorBase.applyTransformer(EntityProcessorBase.java:203)
> >       at org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:197)
> >       at org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:160)
> >       at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:313)
> >       at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:339)
> >       at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:202)
> >       at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:147)
> >       at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:321)
> >       at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:381)
> >       at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:362)
> >Caused by: java.lang.NullPointerException
> >       at java.io.StringReader.<init>(StringReader.java:33)
> >       at org.apache.solr.handler.dataimport.HTMLStripTransformer.stripHTML(HTMLStripTransformer.java:71)
> >       at org.apache.solr.handler.dataimport.HTMLStripTransformer.transformRow(HTMLStripTransformer.java:54)
> >       at org.apache.solr.handler.dataimport.EntityProcessorBase.applyTransformer(EntityProcessorBase.java:187)
> >       ... 9 more
> >Jan 19, 2009 11:14:06 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
> >SEVERE: Full Import failed
> >org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.NullPointerException
> >       at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:64)
> >       at org.apache.solr.handler.dataimport.EntityProcessorBase.applyTransformer(EntityProcessorBase.java:203)
> >       at org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:197)
> >       at org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:160)
> >       at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:313)
> >       at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:339)
> >       at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:202)
> >       at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:147)
> >       at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:321)
> >       at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:381)
> >       at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:362)
> >Caused by: java.lang.NullPointerException
> >       at java.io.StringReader.<init>(StringReader.java:33)
> >       at org.apache.solr.handler.dataimport.HTMLStripTransformer.stripHTML(HTMLStripTransformer.java:71)
> >       at org.apache.solr.handler.dataimport.HTMLStripTransformer.transformRow(HTMLStripTransformer.java:54)
> >       at org.apache.solr.handler.dataimport.EntityProcessorBase.applyTransformer(EntityProcessorBase.java:187)
> >       ... 9 more
> >Jan 19, 2009 11:14:06 AM org.apache.solr.update.DirectUpdateHandler2 rollback
> >INFO: start rollback
> >Jan 19, 2009 11:14:06 AM org.apache.solr.update.DirectUpdateHandler2 rollback
> >INFO: end_rollback
> >
> >
> >>On Mon, Jan 19, 2009 at 4:14 PM, Fergus McMenemie <fer...@twig.me.uk> wrote:
> >>
> >>> Hello all,
> >>>
> >>> I have the following DIH data-config.xml file. Adding
> >>> HTMLStripTransformer and the associated stripHTML on the
> >>> para tag seems to have broken things. I am using a nightly
> >>> build from 12-jan-2009.
> >>>
> >>> The /record/sect1/para contains HTML sub tags which need
> >>> to be discarded. Is my use of stripHTML correct?
> >>>
> >>> <dataConfig>
> >>>  <dataSource name="myfilereader" type="FileDataSource"/>
> >>>  <document>
> >>>     <entity name="jcurrent"
> >>>        processor="FileListEntityProcessor"
> >>>        fileName=".*xml"
> >>>        newerThan="'NOW-1000DAYS'"
> >>>        recursive="true"
> >>>        rootEntity="false"
> >>>        dataSource="null"
> >>>        baseDir="/Volumes/spare/ts/jxml/data/news/groups">
> >>>
> >>>        <entity name="x"
> >>>           dataSource="myfilereader"
> >>>           processor="XPathEntityProcessor"
> >>>           url="${jcurrent.fileAbsolutePath}"
> >>>           stream="false"
> >>>           forEach="/record"
> >>>           transformer="DateFormatTransformer,TemplateTransformer,RegexTransformer,HTMLStripTransformer">
> >>>
> >>>           <field column="fileAbsPath"
> >>> template="${jcurrent.fileAbsolutePath}" />
> >>>           <field column="fileWebPath" regex="/Volumes/spare/ts/(.*)"
> >>> replaceWith="$1" sourceColName="fileAbsPath"/>
> >>>           <field column="title"    xpath="/record/title" />
> >>>           <field column="para"     xpath="/record/sect1/para"
> >>> stripHTML="true" />
> >>>           <field column="subject"
> >>>  xpath="/record/metadata/subje...@qualifier='fullTitle']"   />
> >>>           <field column="pubname"
> >>>  xpath="/record/metadata/subje...@qualifier='publication']" />
> >>>           <field column="pubdate"
> >>>  xpath="/record/metadata/da...@qualifier='pubDate']"
> >>> dateTimeFormat="yyyyMMdd"   />
> >>>           </entity>
> >>>        </entity>
> >>>     </document>
> >>>  </dataConfig>
> >>>
> >>> --
> >>>
> >>> ===============================================================
> >>> Fergus McMenemie               Email:fer...@twig.me.uk
> >>> Techmore Ltd                   Phone:(UK) 07721 376021
> >>>
> >>> Unix/Mac/Intranets             Analyst Programmer
> >>> ===============================================================
> >>>
> >>
> >>
> >>
> >>--
> >>Regards,
> >>Shalin Shekhar Mangar.
> >
> >--
> >
> >===============================================================
> >Fergus McMenemie               Email:fer...@twig.me.uk
> >Techmore Ltd                   Phone:(UK) 07721 376021
> >
> >Unix/Mac/Intranets             Analyst Programmer
> >===============================================================
>
> --
>
> ===============================================================
> Fergus McMenemie               Email:fer...@twig.me.uk
> Techmore Ltd                   Phone:(UK) 07721 376021
>
> Unix/Mac/Intranets             Analyst Programmer
> ===============================================================
>



-- 
Regards,
Shalin Shekhar Mangar.
