I'm indexing .xml documents and using the XPathEntityProcessor for data 
importing.  Here is a snippet of my conf file:

      <entity name="meta"
          dataSource="myfilereader"
          processor="XPathEntityProcessor"
          url="${jcurrent.fileAbsolutePath}"
          stream="false"
          forEach="/TEI/teiHeader/fileDesc"
          xsl="xslt/meta.xsl"
          >
          <field column="title" xpath="/TEI/teiHeader//title" flatten="true"/>
          <field column="author" xpath="/TEI/teiHeader//author" />
          <field column="publisher" xpath="/TEI/teiHeader//publisher" />
          <field column="accession" xpath="/TEI/teiHeader//idno" />
          <field column="date" xpath="/TEI/teiHeader//date" flatten="true" />
          <field column="origin" xpath="/TEI/teiHeader//origin" />
          <field column="origPlace" xpath="/TEI/teiHeader//origPlace" />
          <field column="origGeo" xpath="/TEI/teiHeader//origGeo" />
          <field column="settlement" xpath="/TEI/teiHeader//settlement" />
          <field column="region" xpath="/TEI/teiHeader//region" />
          <field column="country" xpath="/TEI/teiHeader//country" />
          <field column="when" xpath="/TEI/teiHeader//when" />
          <field column="when-custom" xpath="/TEI/teiHeader//when-custom" />
          <field column="notAfter" xpath="/TEI/teiHeader//notAfter" />
          <field column="notBefore" xpath="/TEI/teiHeader//notBefore" />
          <field column="note" xpath="/TEI/teiHeader//note" flatten="true" />
          <field column="annotator" xpath="/TEI/teiHeader//annotator" />
          <field column="scribe" xpath="/TEI/teiHeader//scribe" />
          <field column="recipient" xpath="/TEI/teiHeader//recipient" />
       </entity>

I noticed spaces at the ends of my field values when exporting a result to 
JSON or XML.

I thought it was my JavaScript fetch call that was appending the space, but 
looking at the query page on the Solr admin site I can clearly see a trailing 
space.  It doesn't matter whether the field type is string or text_general; 
the result is the same.

Here is a snippet of the query response:

{ "date":"1884-09-09 September 9, 1884 ", "note":"Handwritten by Mary on a postcard from Boston, Massachusetts. ", "country":"USA ", "origGeo":"42.3584308 -71.0597732 ", "author":"Mary ", "authorString":"Mary ", "origin":"1884-09-09 ", "originSort":"1884-09-09 ", "accession":"639P3.65.026 ", "accessionSort":"639P3.65.026 ", "title":"\n Mary to Mary Baker Eddy, \n September 9, 1884 \n \n ", "titleSort":"\n Mary to Mary Baker Eddy, \n September 9, 1884 \n \n ", "when":"1884-09-09 ", "settlement":"Boston ", "recipient":"Mary Baker Eddy", "recipientString":"Mary Baker Eddy", "publisher":"The Mary Baker Eddy Library ", "origPlace":"places.xml#boston_ma ", "region":"MA ", "type":"incoming_correspondence", "places":"Boston ", "placesString":"Boston ", "people":"Mary ", "peopleString":"Mary ", "body":"Paper rec received Thanks, Just looked it over, good . Have moved at last! Will find me at cor: Shawmut Ave. & Pleasant St. a few doors from 66 S. Ave, further downtown. Hope you will find time to come in. Not yet settled, but like much better. Hope you are prospering. Wanted to see you last Sabbath eve but too tired In love Mary – ", "closer":"Boston Sept 9. 1884 . ", "id":"3272bf21-e6c2-4053-85ef-db3ec5a7f0ae", "_version_":1710182653070671872},
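
For what it's worth, here is a minimal Python sketch of how one could check 
which fields are affected.  The sample_doc dict is a hand-trimmed copy of a 
few fields from the response above, and the helper name 
fields_with_edge_whitespace is made up for illustration; it is not part of 
Solr.

```python
def fields_with_edge_whitespace(doc):
    """Return the names of string fields whose value changes when stripped."""
    return sorted(
        name
        for name, value in doc.items()
        if isinstance(value, str) and value != value.strip()
    )


# A few fields copied from the query response (values unchanged).
sample_doc = {
    "author": "Mary ",
    "country": "USA ",
    "recipient": "Mary Baker Eddy",
    "title": "\n Mary to Mary Baker Eddy, \n September 9, 1884 \n \n ",
    "type": "incoming_correspondence",
}

affected = fields_with_edge_whitespace(sample_doc)
print(affected)
```

In this sample, recipient and type come back clean while author, country, 
and title do not, which matches what the response shows.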



I'm guessing it's the XPathEntityProcessor that is doing it, but I'm certainly 
open to pilot error!

Any ideas how I can get rid of the trailing space?

thanks,

Scott

