Alexandre,

        perfect!!!

        There is a built-in whitespace trim factory,
TrimFieldUpdateProcessorFactory, which I added to the default chain, and now
all is good!
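
        For reference, a chain of that kind might look roughly like the
sketch below in solrconfig.xml. This is not the exact config used here: the
chain name "trim-whitespace" is made up for illustration, and a stock
configset may already define its own default chain, in which case the trim
processor would be added to that chain instead.

        <!-- Sketch only; the chain name is hypothetical -->
        <updateRequestProcessorChain name="trim-whitespace" default="true">
            <!-- Trims leading and trailing whitespace from string values -->
            <processor class="solr.TrimFieldUpdateProcessorFactory"/>
            <processor class="solr.LogUpdateProcessorFactory"/>
            <!-- RunUpdateProcessorFactory must come last so documents are indexed -->
            <processor class="solr.RunUpdateProcessorFactory"/>
        </updateRequestProcessorChain>

        With no field selectors configured, the trim processor matches all
fields. It only strips whitespace at the start and end of a value, so line
breaks in the middle of a flattened title would presumably still be there.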

        thanks again,

Scott

On 9/7/21 7:32 AM, Alexandre Rafalovitch wrote:
The general answer is to add an UpdateRequestProcessor pipeline. That gives
you a lot of post-processing flexibility.
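
As a rough illustration of that approach (not tested against this setup), a
named chain can be limited to specific fields and then selected for the
import through the update.chain parameter. The chain name "trim-meta", the
/dataimport handler path, and the data-config.xml file name are assumptions,
not taken from the actual config.

        <!-- Sketch only; names below are hypothetical -->
        <updateRequestProcessorChain name="trim-meta">
            <processor class="solr.TrimFieldUpdateProcessorFactory">
                <!-- Restrict trimming to the listed fields -->
                <str name="fieldName">title</str>
                <str name="fieldName">author</str>
            </processor>
            <processor class="solr.RunUpdateProcessorFactory"/>
        </updateRequestProcessorChain>

        <requestHandler name="/dataimport" class="solr.DataImportHandler">
            <lst name="defaults">
                <str name="config">data-config.xml</str>
                <!-- Route imported documents through the chain above -->
                <str name="update.chain">trim-meta</str>
            </lst>
        </requestHandler>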

But you can also try having the xpath specify  ..../text(), maybe that will
deal with space specifically.  Did not test it myself though, just a
thought.

Regards,
     Alex

On Mon., Sep. 6, 2021, 11:10 p.m. Scott Derrick wrote:

I'm indexing .xml documents and using the XPathEntityProcessor for data
importing.  Here is a snippet of my conf file:

        <entity name="meta"
            dataSource="myfilereader"
            processor="XPathEntityProcessor"
            url="${jcurrent.fileAbsolutePath}"
            stream="false"
            forEach="/TEI/teiHeader/fileDesc"
            xsl="xslt/meta.xsl"
            >
            <field column="title" xpath="/TEI/teiHeader//title" flatten="true" />
            <field column="author" xpath="/TEI/teiHeader//author" />
            <field column="publisher" xpath="/TEI/teiHeader//publisher" />
            <field column="accession" xpath="/TEI/teiHeader//idno" />
            <field column="date" xpath="/TEI/teiHeader//date" flatten="true" />
            <field column="origin" xpath="/TEI/teiHeader//origin" />
            <field column="origPlace" xpath="/TEI/teiHeader//origPlace" />
            <field column="origGeo" xpath="/TEI/teiHeader//origGeo" />
            <field column="settlement" xpath="/TEI/teiHeader//settlement" />
            <field column="region" xpath="/TEI/teiHeader//region" />
            <field column="country" xpath="/TEI/teiHeader//country" />
            <field column="when" xpath="/TEI/teiHeader//when" />
            <field column="when-custom" xpath="/TEI/teiHeader//when-custom" />
            <field column="notAfter" xpath="/TEI/teiHeader//notAfter" />
            <field column="notBefore" xpath="/TEI/teiHeader//notBefore" />
            <field column="note" xpath="/TEI/teiHeader//note" flatten="true" />
            <field column="annotator" xpath="/TEI/teiHeader//annotator" />
            <field column="scribe" xpath="/TEI/teiHeader//scribe" />
            <field column="recipient" xpath="/TEI/teiHeader//recipient" />
         </entity>

I noticed spaces at the ends of my elements when exporting a result into
JSON or XML.

I thought it was my JavaScript fetch call that was appending to the string,
but looking at the query page on the Solr admin site I can clearly see a
trailing space.  It doesn't matter whether the field type is string or
text_general; the result is the same.

Here is a snippet of the query response:

        {
          "date":"1884-09-09 September 9, 1884 ",
          "note":"Handwritten by Mary on a postcard from Boston, Massachusetts. ",
          "country":"USA ",
          "origGeo":"42.3584308 -71.0597732 ",
          "author":"Mary ",
          "authorString":"Mary ",
          "origin":"1884-09-09 ",
          "originSort":"1884-09-09 ",
          "accession":"639P3.65.026 ",
          "accessionSort":"639P3.65.026 ",
          "title":"\n Mary to Mary Baker Eddy, \n September 9, 1884 \n \n ",
          "titleSort":"\n Mary to Mary Baker Eddy, \n September 9, 1884 \n \n ",
          "when":"1884-09-09 ",
          "settlement":"Boston ",
          "recipient":"Mary Baker Eddy",
          "recipientString":"Mary Baker Eddy",
          "publisher":"The Mary Baker Eddy Library ",
          "origPlace":"places.xml#boston_ma ",
          "region":"MA ",
          "type":"incoming_correspondence",
          "places":"Boston ",
          "placesString":"Boston ",
          "people":"Mary ",
          "peopleString":"Mary ",
          "body":"Paper rec received Thanks, Just looked it over, good . Have moved at last! Will find me at cor: Shawmut Ave. & Pleasant St. a few doors from 66 S. Ave, further downtown. Hope you will find time to come in. Not yet settled, but like much better. Hope you are prospering. Wanted to see you last Sabbath eve but too tired In love Mary – ",
          "closer":"Boston Sept 9. 1884 . ",
          "id":"3272bf21-e6c2-4053-85ef-db3ec5a7f0ae",
          "_version_":1710182653070671872
        },



I'm guessing it's the XPathEntityProcessor that is doing it, but I'm
certainly open to pilot error!

Any ideas how I can get rid of the trailing space?

thanks,

Scott





