Hello!

Once again, thanks for the response ;) So the solution is to generate
the data files again and either add a space after the doubled
encapsulator or change the encapsulator to a character that does
not occur in the field values (the values of the field that will be
split, of course).
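
For anyone finding this thread later, here is a rough Python sketch of the
double decoding Yonik describes below. It uses Python's csv module in strict
mode only as a stand-in; it is not Solr's CSV parser, so the exact error text
differs. The helper name split_values is made up for this illustration.

```python
import csv

def split_values(field, separator=" ", encapsulator='"'):
    """Second-level decode: split one field value into sub-values.

    Hypothetical helper that mimics f.title.split=true with a given
    separator and encapsulator; it is not part of Solr.
    """
    reader = csv.reader([field], delimiter=separator,
                        quotechar=encapsulator, strict=True)
    return next(reader)

# First-level decode of the raw line (comma separator, double-quote encapsulator).
raw = '"1","aaa ""bbb""ccc"'
doc_id, title = next(csv.reader([raw], strict=True))
# title is now: aaa "bbb"ccc

# Second-level decode with the same encapsulator fails, because the closing
# quote after bbb is followed directly by c instead of the separator.
try:
    split_values(title)
except csv.Error as err:
    print("second decode failed:", err)

# Fix 1: a space after the doubled encapsulator in the data.
print(split_values('aaa "bbb" ccc'))

# Fix 2: a different encapsulator for the sub-values (a single quote here).
print(split_values(title, encapsulator="'"))
```

With the second fix the double quotes stay in the value as ordinary
characters, so whether that is acceptable depends on what values you
actually want indexed.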


-- 
Regards,
 Rafał Kuć
 http://solr.pl

> Multi-valued CSV fields are double encoded.

> We start with: "aaa ""bbb""ccc"
> Then decoding one level, we get:  aaa "bbb"ccc
> Decoding again to get individual values results in a decode error
> because the encapsulator appears unescaped in the middle of the second
> value (i.e. invalid CSV).

> One easier way to fix this is to use a different encapsulator for the
> sub-values of a multi-valued field by adding f.title.encapsulator=%27
> (a single quote char)

> But I can't really tell you exactly how to encode or specify options
> to the CSV loader when I don't know what actual values you want
> after "aaa ""bbb""ccc" is decoded.

> -Yonik
> http://www.lucidimagination.com



> On Mon, Jun 20, 2011 at 5:46 PM, Rafał Kuć <r....@solr.pl> wrote:
>> Hi!
>>
>>  Yonik, thanks for the reply. I just realized that the example I gave
>> was incomplete - the error is returned by Solr only when the field is
>> multivalued and its values are split. For example, the
>> following curl command gives me the mentioned error:
>>
>> curl 'http://localhost:8983/solr/update/csv?fieldnames=id,title&commit=true&encapsulator=%22&f.title.split=true&f.title.separator=%20' \
>>   -H 'Content-type:text/plain' -d '"1","aaa ""bbb""ccc"'
>>
>> while the following is executed without any problem:
>>
>> curl 'http://localhost:8983/solr/update/csv?fieldnames=id,title&commit=true&encapsulator=%22&f.title.split=true&f.title.separator=%20' \
>>   -H 'Content-type:text/plain' -d '"1","aaa ""bbb"" ccc"'
>>
>> The only difference between those two is the additional space
>> character between bbb"" and ccc in the second example.
>>
>> Am I doing something wrong? ;)
>>
>> --
>> Regards,
>>  Rafał Kuć
>>  http://solr.pl
>>
>>> This works fine for me:
>>
>>> curl http://localhost:8983/solr/update/csv -H
>>> 'Content-type:text/plain' -d 'id,name
>>> "1","aaa ""bbb"" ccc"'
>>
>>> -Yonik
>>> http://www.lucidimagination.com
>>
>>
>>> On Mon, Jun 20, 2011 at 3:17 PM, Rafał Kuć <r....@solr.pl> wrote:
>>>> Hello!
>>>>
>>>>  I have a question about the CSV update handler. Let's say I have the
>>>> following file sent to the CSV update handler using curl:
>>>>
>>>> id,name
>>>> "1","aaa ""bbb""ccc"
>>>>
>>>> It throws an error, saying that:
>>>> Error 400 java.io.IOException: (line 0) invalid char between encapsulated 
>>>> token end delimiter
>>>>
>>>> If I change the contents of the file to:
>>>>
>>>> id,name
>>>> "1","aaa ""bbb"" ccc"
>>>>
>>>> it works without a problem. Has anyone encountered this? Is it known
>>>> behavior?
>>>>
>>>> --
>>>> Regards,
>>>>  Rafał Kuć
>>>>
>>>>
>>>>
>>
>>
>>
>>
>>



