[jira] [Commented] (CASSANDRA-12909) cqlsh copy cannot parse strings when counters are present
[ https://issues.apache.org/jira/browse/CASSANDRA-12909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15768907#comment-15768907 ]

Stefania commented on CASSANDRA-12909:
--------------------------------------

CI looks good. The [failed test|http://cassci.datastax.com/job/stef1927-12909-cqlsh-cqlsh-tests/lastCompletedBuild/cython=yes,label=ctool-lab/testReport/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_copy_from_with_wrong_order_or_missing_UDT_fields/] on trunk is the new test added for CASSANDRA-12959, which is expected to fail since I did not rebase the patch branch. I verified that it passes locally.

Committed to 2.2 as f4fd0928e6bc56600b7fd4d2469353c8f9d9da7d and merged upwards.

> cqlsh copy cannot parse strings when counters are present
> ---------------------------------------------------------
>
> Key: CASSANDRA-12909
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12909
> Project: Cassandra
> Issue Type: Bug
> Components: Tools
> Reporter: Stefania
> Assignee: Stefania
> Fix For: 2.2.9, 3.0.11, 3.10
>
> We get the parse error {{Failed to import 1 rows: ParseError - argument for 's' must be a string}} when using the following table and data:
> {code}
> CREATE TABLE ks.test (
>     object_id ascii,
>     user_id timeuuid,
>     counter_id ascii,
>     count counter,
>     PRIMARY KEY ((object_id, user_id), counter_id)
> )
> {code}
> {code}
> EVT:be3bd2d0-a68d-11e6-90d4-1b2a65b8a28a,f7ce3ac0-a66e-11e6-b58e-4e29450fd577,SA,2
> {code}
> The problem is this line [here|https://github.com/apache/cassandra/blob/trunk/pylib/cqlshlib/copyutil.py#L2114]: strings are serialized as unicode rather than ordinary strings, but only for non-prepared statements (unsure why).

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12909) cqlsh copy cannot parse strings when counters are present
[ https://issues.apache.org/jira/browse/CASSANDRA-12909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15768709#comment-15768709 ]

Stefania commented on CASSANDRA-12909:
--------------------------------------

Thanks for the review. I've updated the remaining branches and I am currently running CI on them.
[jira] [Commented] (CASSANDRA-12909) cqlsh copy cannot parse strings when counters are present
[ https://issues.apache.org/jira/browse/CASSANDRA-12909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15766442#comment-15766442 ]

Benjamin Lerer commented on CASSANDRA-12909:
--------------------------------------------

The new patch looks good to me. Thanks.
[jira] [Commented] (CASSANDRA-12909) cqlsh copy cannot parse strings when counters are present
[ https://issues.apache.org/jira/browse/CASSANDRA-12909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15766296#comment-15766296 ]

Stefania commented on CASSANDRA-12909:
--------------------------------------

bq. Therefore, in order to avoid converting values to unicode, we can equivalently encode the query text into utf-8.

I've changed it again so that we encode the column names rather than the query. This covers one more failure that I found today with a new [test case|https://github.com/riptano/cassandra-dtest/pull/1393/files#diff-2d1d32d5724e8f118d09beb660163899R3124].

Only the 2.2 patch is up to date, and CI for 2.2 is pending. I'm waiting for another review before updating the remaining branches.
[jira] [Commented] (CASSANDRA-12909) cqlsh copy cannot parse strings when counters are present
[ https://issues.apache.org/jira/browse/CASSANDRA-12909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15763382#comment-15763382 ]

Stefania commented on CASSANDRA-12909:
--------------------------------------

This is a sample [failure|http://cassci.datastax.com/job/stef1927-12909-cqlsh-2.2-cqlsh-tests/cython=no,label=ctool-lab/2/testReport/junit/junit/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_all_datatypes_round_trip/] that we get without calling {{unicode}} in the protectors. The problem is:

{code}
Traceback (most recent call last):
  File "/home/automaton/cassandra/bin/../pylib/cqlshlib/copyutil.py", line 2262, in make_statement
    return inner_make_statement(query, conv, batch, replicas)
  File "/home/automaton/cassandra/bin/../pylib/cqlshlib/copyutil.py", line 2321, in make_non_prepared_batch_statement
    statement = SimpleStatement(query % (','.join(batch['rows'][0]),), consistency_level=self.consistency_level)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 114: ordinal not in range(128)
{code}

It fails to append arguments to the query string if they are not of type unicode. The query is of type unicode because it is created by concatenating the query text with the column names, which are unicode types. Column names are wrapped by a call to {{protect_name}}. Older versions of the driver converted unicode column names to utf-8, whereas {{protect_name}} no longer encodes strings to utf-8.

Therefore, in order to avoid converting values to unicode, we can equivalently encode the query text into utf-8. This shouldn't be too risky since it was the behavior prior to CASSANDRA-11850. It is also probably a little more efficient, in addition to avoiding the special casing for ascii types.

I only modified the 2.2 [patch|https://github.com/apache/cassandra/compare/trunk...stef1927:12909-cqlsh-2.2]; let me know if you agree with this new approach, [~blerer]. CI for 2.2 is pending.
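The failure mode and the proposed workaround in the comment above can be sketched in isolation. This is only an illustration under stated assumptions, not the cqlshlib code: it is written for Python 3, where the ascii decode that Python 2 performed implicitly has to be spelled out, and the names {{QUERY}}, {{format_as_text}} and {{format_as_bytes}} as well as the table {{ks.t}} are hypothetical.

```python
# Python 3 sketch of the Python 2 behaviour discussed above (illustrative only).
# In Python 2, `unicode_query % byte_string` implicitly decoded the byte string
# with the ascii codec; here that decode is written out explicitly.

# Hypothetical query text; in cqlsh it is built from the column names.
QUERY = "INSERT INTO ks.t (a, b) VALUES (%s)"

# A quoted non-ascii value as utf-8 bytes; its first payload byte is 0xe3.
value = "'\u3042'".encode('utf-8')


def format_as_text(query, row_values):
    """Mimic the implicit ascii decode that happens when the query is unicode."""
    joined = b','.join(row_values)
    return query % (joined.decode('ascii'),)


def format_as_bytes(query, row_values):
    """Approach from the comment: encode the query text to utf-8 instead."""
    return query.encode('utf-8') % (b','.join(row_values),)
```

Calling {{format_as_text(QUERY, [b'1', value])}} raises {{UnicodeDecodeError}}, whereas {{format_as_bytes}} leaves the already-encoded values untouched. Formatting {{bytes}} with {{%}} requires Python 3.5 or later.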
[jira] [Commented] (CASSANDRA-12909) cqlsh copy cannot parse strings when counters are present
[ https://issues.apache.org/jira/browse/CASSANDRA-12909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15763053#comment-15763053 ]

Stefania commented on CASSANDRA-12909:
--------------------------------------

For non-prepared statements, protectors are used more widely than converters: converters are only applied to partition keys to determine the routing, whereas protectors are applied to all values.

The call to {{unicode}} in {{_get_protector}} was added by CASSANDRA-11850 when we upgraded the driver to 3.5. Without it, five tests were failing with Unicode conversion problems, but I cannot recall which ones, and the original test results for 11850 have been deleted. So I don't know whether the call to {{unicode}} can be moved to the converters.

I am repeating the tests for the 2.2 patch above without a {{unicode}} call in the protectors.
[jira] [Commented] (CASSANDRA-12909) cqlsh copy cannot parse strings when counters are present
[ https://issues.apache.org/jira/browse/CASSANDRA-12909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15760896#comment-15760896 ]

Benjamin Lerer commented on CASSANDRA-12909:
--------------------------------------------

It is not clear to me why we call {{unicode}} in {{_get_protector}}. It seems to me that it should have been called in the converters instead. What do you think?
[jira] [Commented] (CASSANDRA-12909) cqlsh copy cannot parse strings when counters are present
[ https://issues.apache.org/jira/browse/CASSANDRA-12909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15744364#comment-15744364 ]

Stefania commented on CASSANDRA-12909:
--------------------------------------

[~blerer], gentle ping.
[jira] [Commented] (CASSANDRA-12909) cqlsh copy cannot parse strings when counters are present
[ https://issues.apache.org/jira/browse/CASSANDRA-12909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15672957#comment-15672957 ]

Stefania commented on CASSANDRA-12909:
--------------------------------------

What's happening is that when we [protect values|https://github.com/apache/cassandra/blob/trunk/pylib/cqlshlib/copyutil.py#L1822] for non-prepared statements, we convert strings to Unicode and neither our [converter|https://github.com/apache/cassandra/blob/trunk/pylib/cqlshlib/copyutil.py#L1867], nor the [python driver when using python version 2|https://github.com/datastax/python-driver/blob/master/cassandra/cqltypes.py#L425], convert them back to ascii. Then as we know, [python 2 won't pack unicode strings|https://bugs.python.org/issue10783], so they must be encoded first.

I've fixed it by introducing an ascii specific converter.

||2.2||3.0||3.X||trunk||
|[patch|https://github.com/stef1927/cassandra/tree/12909-cqlsh-2.2]|[patch|https://github.com/stef1927/cassandra/tree/12909-cqlsh-3.0]|[patch|https://github.com/stef1927/cassandra/tree/12909-cqlsh-3.X]|[patch|https://github.com/stef1927/cassandra/tree/12909-cqlsh]|
|[tests|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-12909-cqlsh-2.2-cqlsh-tests/]|[tests|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-12909-cqlsh-3.0-cqlsh-tests/]|[tests|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-12909-cqlsh-3.X-cqlsh-tests/]|[tests|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-12909-cqlsh-cqlsh-tests/]|

The new test is [here|https://github.com/riptano/cassandra-dtest/pull/1393].
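The packing failure described in the comment above can be reproduced without cqlsh. The following is a minimal sketch, not the converter from the patch: it assumes Python 3, where {{struct.pack}} rejects text strings for the {{s}} format in much the same way that Python 2 rejected {{unicode}} objects, and the helper name {{pack_ascii}} is hypothetical.

```python
import struct


def pack_ascii(value):
    """Pack a string with the 's' format, encoding text to bytes first.

    Hypothetical helper for illustration; it is not part of cqlshlib.
    """
    if isinstance(value, str):
        # Text must be encoded before packing; ascii matches the column type.
        value = value.encode('ascii')
    return struct.pack('%ds' % len(value), value)


# Passing text directly fails, which mirrors the ParseError in the report.
try:
    struct.pack('2s', 'SA')
    packed_text_ok = True
except struct.error:
    packed_text_ok = False

# Encoding first succeeds.
packed = pack_ascii('SA')
```

Here {{'SA'}} is the {{counter_id}} value from the sample row in the issue description. Encoding before packing is the same idea as the ascii specific converter mentioned above, although the real converter may differ in detail.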
[jira] [Commented] (CASSANDRA-12909) cqlsh copy cannot parse strings when counters are present
[ https://issues.apache.org/jira/browse/CASSANDRA-12909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15662472#comment-15662472 ]

Stefania commented on CASSANDRA-12909:
--------------------------------------

This fixes it but I still don't understand why we don't already have a UTF-8 encoded string, and why this only happens with non-prepared statements, given that the converters are identical:

|3.0|[patch|https://github.com/apache/cassandra/compare/trunk...stef1927:12909-cqlsh-3.0]|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-12909-cqlsh-3.0-cqlsh-tests/]|