Dominique Hermsdorff created CASSANDRA-17200:
------------------------------------------------

             Summary: " as first char of a text column (and unique occurrence) 
during COPY with DELIMITER set to |
                 Key: CASSANDRA-17200
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17200
             Project: Cassandra
          Issue Type: Bug
            Reporter: Dominique Hermsdorff


Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.11 | CQL spec 3.4.4 | Native protocol v4]

During an import using | as delimiter, e.g.:

COPY cy4secure_company_demo.demo_contacts_clear 
(record_id,state,city,first_name,last_name,email,salary,credit_rating,debt,ssn,liabilities,comments,picture)
 FROM '/var/www/cy4secure/csv/001_demo_contacts_clear_100000.csv' WITH 
DELIMITER='|' AND NUMPROCESSES=2 AND MAXBATCHSIZE=1 AND CHUNKSIZE=50 AND 
SKIPROWS=2;


Double quotes and single quotes inside a text column are no problem, for example:

<other fields>...|Earth "observation" taken 'during' a day pass by...|...<other fields>

BUT if the value starts with a " and that is its only occurrence, the import fails, for example:

<other fields>...|"Earth observation taken during a day pass by...|...<other fields>

In my case this results in: Failed to import 1 rows: ParseError - Invalid row 
length 12 should be 13,  given up without retries

And the rows that are part of that chunk are not imported.
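
A possible explanation, sketched under the assumption that COPY FROM splits 
rows with a parser that behaves like Python's csv module with " as the quote 
character: a " in the middle of a value is kept literally, but a " as the first 
character opens a quoted field, so the following | is read as part of the value 
and the row ends up one field short. The 3-column rows and the split_fields 
helper below are hypothetical stand-ins for the real 13-column data, not 
cqlsh's own code.

```python
import csv


def split_fields(line, quoting=csv.QUOTE_MINIMAL):
    """Split one '|'-delimited line with Python's csv module."""
    reader = csv.reader([line], delimiter="|", quotechar='"', quoting=quoting)
    return next(reader)


# Hypothetical shortened rows (3 columns instead of 13).
quotes_inside = "1|Earth \"observation\" taken 'during' a day pass by|pic"
quote_first = '1|"Earth observation taken during a day pass by|pic'

# Quotes inside an unquoted value are ordinary characters.
print(len(split_fields(quotes_inside)))  # 3

# A leading quote starts a quoted field; with no closing quote,
# the next delimiter is absorbed into the value.
print(len(split_fields(quote_first)))  # 2

# With quote handling disabled, the same line splits as expected.
print(len(split_fields(quote_first, quoting=csv.QUOTE_NONE)))  # 3
```

This would match the reported error if the affected text column is the 
next-to-last one (12 fields parsed instead of 13).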

My workaround: filter out text values that start with a ".
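
A minimal sketch of such a pre-filter, assuming values never contain the | 
delimiter themselves, so a naive split is good enough to inspect each field. 
The function names are hypothetical, and the check also flags values that are 
correctly wrapped in a pair of quotes, so the suspect lines may need a manual 
look before they are dropped or fixed.

```python
def has_leading_quote_field(line, delimiter="|"):
    """Return True if any field of a raw line starts with a double quote."""
    fields = line.rstrip("\r\n").split(delimiter)
    return any(field.startswith('"') for field in fields)


def partition_lines(lines, delimiter="|"):
    """Separate lines that look safe to import from lines to review first."""
    safe, suspect = [], []
    for line in lines:
        if has_leading_quote_field(line, delimiter):
            suspect.append(line)
        else:
            safe.append(line)
    return safe, suspect


# Hypothetical shortened rows (3 columns instead of 13).
rows = [
    "1|Earth \"observation\" taken 'during' a day pass by|pic\n",
    '2|"Earth observation taken during a day pass by|pic\n',
]
safe, suspect = partition_lines(rows)
print(len(safe), len(suspect))  # 1 1
```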

Otherwise, you'll probably want to take a look at it.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
