Makes total sense. Thanks to both of you for the clarification!
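
For anyone who finds this thread in the archive later: the first option from Shawn's reply quoted below (an explicit "fl" list that leaves out authorText) might look roughly like the following in the DIH data config on the destination. This is only a sketch. The url, the entity name, and the "id" and "title" fields are placeholders I made up, and "author" is the only field name taken from this thread, so substitute your own source core and field list.

```xml
<dataConfig>
  <document>
    <!-- Sketch only: url, entity name, and the id/title fields are placeholders. -->
    <!-- authorText is deliberately left out of fl, so the destination's          -->
    <!-- copyField (author -> authorText) is the only thing that populates it.    -->
    <entity name="sep"
            processor="SolrEntityProcessor"
            url="http://source-host:8983/solr/collection1"
            query="*:*"
            fl="id,author,title"/>
  </document>
</dataConfig>
```

Any other field that is the destination of a copyField on the new cluster would need to be left out of the list in the same way.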

On 8/18/18, 8:03 AM, "Alexandre Rafalovitch" <arafa...@gmail.com> wrote:

>And part of the issue is that SolrEntityProcessor does not take individual
>field definitions. So that part is ignored and instead just the 'fl' mapping
>is used, as Shawn explained.
>
>So you could also remap authorText in that definition to an ignored field.
>See
>https://github.com/apache/lucene-solr/blob/master/solr/example/example-DIH/solr/solr/conf/solr-data-config.xml
>
>Regards,
>    Alex
>
>On Fri, Aug 17, 2018, 11:50 PM Shawn Heisey, <apa...@elyograg.org> wrote:
>
>> On 8/17/2018 6:15 PM, Zimmermann, Thomas wrote:
>> > I'm trying to track down an odd issue I'm seeing when using the
>> > SolrEntityProcessor to seed some test data from a Solr 4.x cluster to a
>> > Solr 7.x cluster. It seems like strings are being interpreted as
>> > multivalued when passed from a string field to a text field via the
>> > copyField directive. Any clever ideas how to resolve this?
>>
>> What's happening is deceptively simple.
>>
>> In the source system, you're copying from author to authorText.  Both
>> fields are stored.  So if you have "Jeff Hartley" in author, you also
>> have "Jeff Hartley" in authorText. So what's happening is that when the
>> destination system imports from the source system, it gets "Jeff
>> Hartley" in both fields, and then copyField says "put a copy of what's
>> in author into authorText" ... and suddenly there are two copies of
>> "Jeff Hartley" in authorText.
>>
>> There are two ways to deal with this:
>>
>> 1) In the query you're doing with SolrEntityProcessor, add an "fl"
>> parameter and list all the fields *except* authorText and any other
>> field where this same problem is happening.
>>
>> 2) Remove the copyField from the schema until after the import from the
>> source server is done.
>>
>> Thanks,
>> Shawn
>>
>>
