[ https://issues.apache.org/jira/browse/CASSANDRA-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224545#comment-13224545 ]

paul cannon commented on CASSANDRA-4003:
----------------------------------------

Sure. The CQL driver deserializes column names before the client software 
(cqlsh) sees them, and it does not expose the Cassandra data type of those 
column names. So it was not always possible to tell from the returned column 
names how they were meant to be interpreted: for example, TimeUUIDType could 
not be distinguished from UUIDType, the various integer and counter types 
could not be told apart, and neither could BytesType and AsciiType.
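Roughly, the ambiguity looks like this (illustration only, not driver code):

{noformat}
# Illustration only, not actual driver code: once column names are
# deserialized, distinct Cassandra comparators collapse into the same
# Python type, so cqlsh can no longer tell them apart.
name_from_longtype    = 42        # came from a LongType comparator
name_from_integertype = 42        # came from an IntegerType comparator
assert type(name_from_longtype) is type(name_from_integertype)

name_from_bytestype = 'caf\xe9'   # BytesType: raw bytes, not valid UTF-8
name_from_asciitype = 'cafe'      # AsciiType
# Both arrive as plain str objects; nothing tells cqlsh whether the first
# one should be hex-escaped or decoded as text.
{noformat}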

Cqlsh makes an effort to display data in the most meaningful form and, 
secondarily, to use color to distinguish data that would otherwise be too 
ambiguous. To do that, it needs to know the original column name type.

The CQL driver does not expose that information, so this code reaches into 
the driver's internals to get it. Clearly it would make more sense to expose 
the info from the driver side, and I plan to do that, but it takes some extra 
process and testing. This hack is backwards compatible with older CQL driver 
versions, but possibly not forward compatible.

Maybe it would be best to do a runtime check against the driver, to see 
whether it supports exposing column types, before making this call.
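A rough sketch of what that runtime check could look like (the attribute name 
used here is hypothetical; it would be whatever the fixed driver ends up 
exposing):

{noformat}
# Sketch only: 'name_info' is a hypothetical attribute standing in for
# whatever a fixed driver would actually expose for column-name types.
def column_name_types(cursor, internals_fallback):
    if hasattr(cursor, 'name_info'):
        # Newer driver: name types are exposed directly, no hack needed.
        return cursor.name_info
    # Older driver: fall back to poking at its internals, as the current
    # patch does.
    return internals_fallback(cursor)
{noformat}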
                
> cqlsh still failing to handle decode errors in some column names
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-4003
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4003
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>    Affects Versions: 1.0.8
>            Reporter: paul cannon
>            Assignee: paul cannon
>            Priority: Minor
>              Labels: cqlsh
>             Fix For: 1.0.9
>
>
> Columns which are expected to be text, but which are not valid utf8, cause 
> cqlsh to display an error and not show any output:
> {noformat}
> cqlsh:ks> CREATE COLUMNFAMILY test (a text PRIMARY KEY) WITH comparator = timestamp;
> cqlsh:ks> INSERT INTO test (a, '2012-03-05') VALUES ('val1', 'val2');
> cqlsh:ks> ASSUME test NAMES ARE text;
> cqlsh:ks> select * from test;
> 'utf8' codec can't decode byte 0xe1 in position 4: invalid continuation byte
> {noformat}
> the traceback with cqlsh --debug:
> {noformat}
> Traceback (most recent call last):
>   File "bin/cqlsh", line 581, in onecmd
>     self.handle_statement(st)
>   File "bin/cqlsh", line 606, in handle_statement
>     return custom_handler(parsed)
>   File "bin/cqlsh", line 663, in do_select
>     self.perform_statement_as_tokens(parsed.matched, decoder=decoder)
>   File "bin/cqlsh", line 666, in perform_statement_as_tokens
>     return self.perform_statement(cqlhandling.cql_detokenize(tokens), decoder=decoder)
>   File "bin/cqlsh", line 693, in perform_statement
>     self.print_result(self.cursor)
>   File "bin/cqlsh", line 728, in print_result
>     self.print_static_result(cursor)
>   File "bin/cqlsh", line 742, in print_static_result
>     formatted_names = map(self.myformat_colname, colnames)
>   File "bin/cqlsh", line 413, in myformat_colname
>     wcwidth.wcswidth(name.decode(self.output_codec.name)))
>   File "/usr/local/Cellar/python/2.7.2/lib/python2.7/encodings/utf_8.py", 
> line 16, in decode
>     return codecs.utf_8_decode(input, errors, True)
> UnicodeDecodeError: 'utf8' codec can't decode byte 0xe1 in position 4: invalid continuation byte
> {noformat}
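For what it's worth, the kind of defensive decode myformat_colname needs here 
is roughly this (a sketch, not the actual patch):

{noformat}
# Sketch, not the actual patch: decode a column name for display without
# letting an undecodable byte abort the whole result output.
def safe_decode(name, codec='utf8'):
    try:
        return name.decode(codec)
    except UnicodeDecodeError:
        # Lossy but non-fatal: undecodable bytes become replacement
        # characters instead of raising and hiding the rest of the output.
        return name.decode(codec, 'replace')
{noformat}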

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
