Re: Problem with CSV update handler

2011-06-21 Thread Rafał Kuć
Hello!

Once again thanks for the response ;) So the solution is to generate
the data files once again and either add a space after the doubled
encapsulator or change the encapsulator to a character that does
not occur in the field values (for the field that will be split, of
course).


-- 
Regards,
 Rafał Kuć
 http://solr.pl



Re: Problem with CSV update handler

2011-06-21 Thread Yonik Seeley
On Tue, Jun 21, 2011 at 2:15 AM, Rafał Kuć r@solr.pl wrote:
 Hello!

 Once again thanks for the response ;) So the solution is to generate
 the data files once again and either add a space after the doubled
 encapsulator

Maybe...
I can't tell if the file is encoded correctly or not since I don't
know what the decoded values are supposed to be from your example.

-Yonik
http://www.lucidimagination.com

 or change the encapsulator to a character that does
 not occur in the field values (for the field that will be
 split, of course).




Problem with CSV update handler

2011-06-20 Thread Rafał Kuć
Hello!

I have a question about the CSV update handler. Let's say I send the
following file to the CSV update handler using curl:

id,name
1,"aaa ""bbb""ccc"

It throws an error, saying that:
Error 400 java.io.IOException: (line 0) invalid char between encapsulated token end delimiter

If I change the contents of the file to:

id,name
1,"aaa ""bbb"" ccc"

it works without a problem. Has anyone encountered this? Is it known behavior?

-- 
Regards,
 Rafał Kuć




Re: Problem with CSV update handler

2011-06-20 Thread Yonik Seeley
This works fine for me:

curl http://localhost:8983/solr/update/csv -H
'Content-type:text/plain' -d 'id,name
1,"aaa ""bbb"" ccc"'

-Yonik
http://www.lucidimagination.com




Re: Problem with CSV update handler

2011-06-20 Thread Rafał Kuć
Hi!

Yonik, thanks for the reply. I just realized that the example I gave
was incomplete - the error is returned by Solr only when the field is
multivalued and the values in the field are split. For example, the
following curl command gives me the mentioned error:

curl 'http://localhost:8983/solr/update/csv?fieldnames=id,title&commit=true&encapsulator=%22&f.title.split=true&f.title.separator=%20' \
  -H 'Content-type:text/plain' -d '1,"aaa ""bbb""ccc"'

while the following is executed without any problem:
curl 'http://localhost:8983/solr/update/csv?fieldnames=id,title&commit=true&encapsulator=%22&f.title.split=true&f.title.separator=%20' \
  -H 'Content-type:text/plain' -d '1,"aaa ""bbb"" ccc"'

The only difference between those two is the additional space
character in between bbb and ccc in the second example.

Am I doing something wrong ? ;)

-- 
Regards,
 Rafał Kuć
 http://solr.pl
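
The query string in the commands above carries five separate parameters, and each value has to be percent-encoded. As a rough sketch (the helper name csv_update_url is hypothetical, and only parameters that appear in this message are used), the URL can be assembled in Python so the separators and escapes are not typed by hand:

```python
from urllib.parse import urlencode, parse_qs

def csv_update_url(base="http://localhost:8983/solr/update/csv"):
    """Build the CSV update URL from the parameters quoted in this thread."""
    params = [
        ("fieldnames", "id,title"),      # no header row in the posted data
        ("commit", "true"),
        ("encapsulator", '"'),           # sent as %22
        ("f.title.split", "true"),       # split the title field into sub-values
        ("f.title.separator", " "),      # sub-values are separated by a space
    ]
    # urlencode joins the pairs with "&" and escapes each value.
    return base + "?" + urlencode(params)

url = csv_update_url()
print(url)
```

Note that urlencode writes the space as "+" rather than %20; both forms decode to a space in a query string.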



Re: Problem with CSV update handler

2011-06-20 Thread Yonik Seeley
Multi-valued CSV fields are double encoded.

We start with: "aaa ""bbb""ccc"
Then decoding one level, we get: aaa "bbb"ccc
Decoding again to get individual values results in a decode error
because the encapsulator appears unescaped in the middle of the second
value (i.e. invalid CSV).

One easy way to fix this is to use a different encapsulator for the
sub-values of a multi-valued field by adding f.title.encapsulator=%27
(a single-quote character).

But I can't really tell you exactly how to encode or specify options
to the CSV loader when I don't know what actual values you want
after "aaa ""bbb""ccc" is decoded.

-Yonik
http://www.lucidimagination.com
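
The two decoding passes described above can be imitated with Python's csv module. This is only an illustration under stated assumptions, not Solr's actual parser: it assumes the failing record is 1,"aaa ""bbb""ccc" (outer double quotes as the encapsulator, inner quotes doubled), uses strict mode to stand in for Solr's rejection of invalid CSV, and the function name decode_row is hypothetical.

```python
import csv

def decode_row(line, sub_encapsulator='"'):
    """Decode one record twice: first into fields, then title into sub-values."""
    # Pass 1: split on commas; the outer double quotes are removed and
    # doubled quotes inside the field collapse to a single quote.
    doc_id, title = next(csv.reader([line], delimiter=",", quotechar='"', strict=True))
    # Pass 2: split the decoded title on spaces, like f.title.split=true with
    # f.title.separator=%20; sub_encapsulator plays the role of f.title.encapsulator.
    values = next(csv.reader([title], delimiter=" ", quotechar=sub_encapsulator, strict=True))
    return doc_id, values

# Space after the doubled encapsulator: both passes succeed.
print(decode_row('1,"aaa ""bbb"" ccc"'))

# No space: the second pass sees "bbb"ccc, a quoted token followed by text.
try:
    decode_row('1,"aaa ""bbb""ccc"')
except csv.Error as exc:
    print("second-level decode failed:", exc)

# A single quote as the sub-value encapsulator: the double quotes are plain data.
print(decode_row('1,"aaa ""bbb""ccc"', sub_encapsulator="'"))
```

With the single-quote encapsulator the second sub-value keeps its literal double quotes, so whether that is the desired result depends on what the decoded values are meant to be.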


