[ https://issues.apache.org/jira/browse/SOLR-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268456#comment-13268456 ]
david babits commented on SOLR-3434:
------------------------------------
Yes, specifying fieldnames works, and it worked yesterday too; I forgot to
mention it.
To close this out:
My goal is to accept an arbitrary file, generated by an extract from a
database, and load it into Solr.
The database extract comes with its fields aligned, hence the white space in
the header and values.
I do not know the fieldnames ahead of time, so I was hoping to specify
header=true&trim=true and have Solr take care of parsing.
This proved not to work.
Since I have to massage the data anyway to remove spaces, I might as well parse
out the header line at the same time using sed and construct the fieldnames
variable.
I also found that I need
<dynamicField name="*" type="string" multiValued="true" />
in the schema: since I do not know the header up front, I can't rely on
suffixes such as _s, and the load would not work otherwise.
So, trim=true&header=false&skipLines=2&fieldnames=$fieldnames
This is the workaround.
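
A minimal sketch of that workaround, assuming a comma separator and a header
on the first line of the extract. The helper name build_fieldnames and the
sample data are hypothetical, made up for illustration; only the query
parameters come from the comment above.

```shell
# Hypothetical helper: read CSV text on stdin and print its first line
# with the white space around every comma-separated name removed.
build_fieldnames() {
  head -n 1 | sed -e 's/^[[:space:]]*//' \
                  -e 's/[[:space:]]*$//' \
                  -e 's/[[:space:]]*,[[:space:]]*/,/g'
}

# Made-up sample standing in for an aligned database extract (assumption).
fieldnames=$(printf 'id   , name    , city  \n1    , Alice   , Paris \n' | build_fieldnames)

# Query string from the workaround, with the parsed header names filled in.
query="trim=true&header=false&skipLines=2&fieldnames=$fieldnames"
echo "$query"
```

The skipLines value is copied from the workaround as written and would need to
match the number of leading lines in the actual extract. Header names
containing characters that are special in a URL would also need encoding
before being placed in the query string.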
My opinion is that 'trim' should be true by default, and should certainly apply
to both data and header, although I understand that would break backward
compatibility.
Thanks again for your help.
> CSVRequestHandler does not trim header when using header=true&trim=true
> -----------------------------------------------------------------------
>
> Key: SOLR-3434
> URL: https://issues.apache.org/jira/browse/SOLR-3434
> Project: Solr
> Issue Type: Improvement
> Affects Versions: 3.6
> Environment: Linux
> Reporter: david babits
> Labels: CSV, header, separator
>
> When using {{header=true&trim=true}}, the field names in the header row are
> not trimmed.
> This is consistent with the documentation, but that doesn't mean it makes
> sense.
> It would be good to change this so that trim=true also applies to the header
> row (at least by default).