[jira] [Comment Edited] (CASSANDRA-10875) cqlsh fails to decode utf-8 characters for text typed columns.

2015-12-18 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15063908#comment-15063908
 ] 

Paulo Motta edited comment on CASSANDRA-10875 at 12/18/15 12:45 PM:


bq. Furthermore, cqlsh --encoding=utf8 doesn't seem to work correctly.

You're right. Although the encoding is correctly used for encoding the output, 
it's not being used to decode the input (which uses ascii by default).

The use of {{sys.setdefaultencoding('utf-8')}} is 
[discouraged|http://blog.notdot.net/2010/07/Getting-unicode-right-in-Python]. 
Can you try the following approach?

{noformat}
diff --git a/bin/cqlsh b/bin/cqlsh
index 651420d..115cc09 100755
--- a/bin/cqlsh
+++ b/bin/cqlsh
@@ -929,7 +929,7 @@ class Shell(cmd.Cmd):
 
     def get_input_line(self, prompt=''):
         if self.tty:
-            self.lastcmd = raw_input(prompt)
+            self.lastcmd = raw_input(prompt).decode(self.encoding)
             line = self.lastcmd + '\n'
         else:
             self.lastcmd = self.stdin.readline()
{noformat}

If it works, please provide patches for 2.1, 2.2, 3.0 and trunk (if they don't 
merge up correctly).
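
For illustration only, here is a minimal sketch of the idea behind the diff above: decode each input line with an explicitly chosen encoding instead of relying on the ascii default. The helper name {{decode_input_line}} is hypothetical and not part of cqlsh. The sketch is written for Python 3, where {{input()}} already returns text, so a byte string stands in for what Python 2's {{raw_input()}} returns.

```python
def decode_input_line(raw_line, encoding):
    """Decode one line of terminal input using an explicit encoding.

    raw_line stands in for the byte string returned by Python 2's
    raw_input(); this helper is hypothetical and only models the idea
    of the patch, it is not cqlsh code.
    """
    return raw_line.decode(encoding)


# Bytes a UTF-8 terminal would send for the statement in the issue description.
raw = "select * from test.test where txt='日本語';".encode("utf-8")

# Decoding with the configured encoding recovers the original text.
statement = decode_input_line(raw, "utf-8")

# Decoding with ascii fails on the first non-ASCII byte.
try:
    decode_input_line(raw, "ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True
```

The point of passing the encoding explicitly is that the behaviour no longer depends on the interpreter-wide default, which is what {{sys.setdefaultencoding}} would change.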



was (Author: pauloricardomg):
bq. Furthermore, cqlsh --encoding=utf8 doesn't seem to work correctly.

You're right. Although the encoding is correctly used for encoding the output, 
it's not being used to decode the input (which uses ascii by default).

The use of . Can you try the following approach?


> cqlsh fails to decode utf-8 characters for text typed columns.
> --------------------------------------------------------------
>
> Key: CASSANDRA-10875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10875
> Project: Cassandra
> Issue Type: Bug
> Components: Tools
> Reporter: Yasuharu Goto
> Assignee: Yasuharu Goto
> Priority: Minor
> Fix For: 2.1.13, 3.1
>
> Attachments: 10875-2.1.12.txt, 10875-3.1.txt
>
>
> Hi, we've found a bug where cqlsh can't handle unicode text in select 
> conditions even when the column is of type text.
> {noformat}
> $ ./bin/cqlsh
> Connected to Test Cluster at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 3.2-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4]
> Use HELP for help.
> cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> cqlsh> create table test.test(txt text primary key);
> cqlsh> insert into test.test (txt) values('日本語');
> cqlsh> select * from test.test where txt='日本語';
> 'ascii' codec can't decode byte 0xe6 in position 35: ordinal not in range(128)
> cqlsh> 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-10875) cqlsh fails to decode utf-8 characters for text typed columns.

2015-12-17 Thread Yasuharu Goto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15063294#comment-15063294
 ] 

Yasuharu Goto edited comment on CASSANDRA-10875 at 12/18/15 2:14 AM:

Thank you for your response!

I hadn't noticed the --encoding option.
I checked both --help and the --encoding option.

In cassandra-2.1.9, cqlsh doesn't have an --encoding option.
{noformat}

$ cqlsh --help
Usage: cqlsh [options] [host [port]]

CQL Shell for Apache Cassandra

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -C, --color           Always use color output
  --no-color            Never use color output
  -u USERNAME, --username=USERNAME
                        Authenticate as user.
  -p PASSWORD, --password=PASSWORD
                        Authenticate using password.
  -k KEYSPACE, --keyspace=KEYSPACE
                        Authenticate to the given keyspace.
  -f FILE, --file=FILE  Execute commands from FILE, then exit
  -t TRANSPORT_FACTORY, --transport-factory=TRANSPORT_FACTORY
                        Use the provided Thrift transport factory function.
  --debug               Show additional debugging information
  --cqlversion=CQLVERSION
                        Specify a particular CQL version (default: 3).
                        Examples: "2", "3.0.0-beta1"
  -2, --cql2            Shortcut notation for --cqlversion=2
  -3, --cql3            Shortcut notation for --cqlversion=3

Connects to localhost:9160 by default. These defaults can be changed by
setting $CQLSH_HOST and/or $CQLSH_PORT. When a host (and optional port number)
are given on the command line, they take precedence over any defaults.

$ cqlsh --encode=utf8
Usage: cqlsh [options] [host [port]]

cqlsh: error: no such option: --encode
{noformat}

In Cassandra 3.0.0, cqlsh has the option, but the help says the default encoding is already UTF-8.
{noformat}
./cqlsh --help
Usage: cqlsh.py [options] [host [port]]

CQL Shell for Apache Cassandra

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -C, --color           Always use color output
  --no-color            Never use color output
  --ssl                 Use SSL
  -u USERNAME, --username=USERNAME
                        Authenticate as user.
  -p PASSWORD, --password=PASSWORD
                        Authenticate using password.
  -k KEYSPACE, --keyspace=KEYSPACE
                        Authenticate to the given keyspace.
  -f FILE, --file=FILE  Execute commands from FILE, then exit
  --debug               Show additional debugging information
  --encoding=ENCODING   Specify a non-default encoding for output.  If you are
                        experiencing problems with unicode characters, using
                        utf8 may fix the problem. (Default from system
                        preferences: UTF-8)
  --cqlshrc=CQLSHRC     Specify an alternative cqlshrc file location.
  --cqlversion=CQLVERSION
                        Specify a particular CQL version (default: 3.3.1).
                        Examples: "3.0.3", "3.1.0"
  -e EXECUTE, --execute=EXECUTE
                        Execute the statement and quit.
  --connect-timeout=CONNECT_TIMEOUT
                        Specify the connection timeout in seconds (default: 5
                        seconds).

Connects to 127.0.0.1:9042 by default. These defaults can be changed by
setting $CQLSH_HOST and/or $CQLSH_PORT. When a host (and optional port number)
are given on the command line, they take precedence over any defaults.
{noformat}

Furthermore, cqlsh --encoding=utf8 doesn't seem to work correctly.

{noformat}
./cqlsh --encoding=utf8
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.0.0 | CQL spec 3.3.1 | Native protocol v4]
Use HELP for help.
cqlsh> select * from test where id='日本語';
'ascii' codec can't decode byte 0xe6 in position 29: ordinal not in range(128)
cqlsh> 
{noformat}
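
As a rough check of what the error above is pointing at, the following sketch encodes the statement as UTF-8 and then decodes the bytes as ascii. The function name {{ascii_decode_error}} is made up for this illustration, and the sketch assumes the terminal sent UTF-8; it is written for Python 3 and does not use any cqlsh code.

```python
def ascii_decode_error(statement, encoding="utf-8"):
    """Return the error text raised when the encoded statement is decoded
    as ascii, or None if the statement contains only ASCII characters.

    Hypothetical helper for illustration; it only mimics an ascii decode
    of the raw statement bytes.
    """
    raw = statement.encode(encoding)
    try:
        raw.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


# The statement typed in the session above.
message = ascii_decode_error("select * from test where id='日本語';")
print(message)
```

The reported position is the byte offset of the first character inside the quotes, and 0xe6 is the first UTF-8 byte of that character. That is consistent with the statement text itself being decoded as ascii, rather than the stored column value.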


was (Author: yasuharu):
Thank you for your response!

I didn't notice --encoding option.
I checked --help and --encoding option.

In cassandra-2.1.9, cqlsh doesn't have --encoding option.
{noformat}

$ cqlsh --help
Usage: cqlsh [options] [host [port]]

CQL Shell for Apache Cassandra

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -C, --color           Always use color output
  --no-color            Never use color output
  -u USERNAME, --username=USERNAME
                        Authenticate as user.
  -p PASSWORD, --password=PASSWORD
                        Authenticate using password.
  -k KEYSPACE, --keyspace=KEYSPACE
                        Authenticate to the given keyspace.
  -f FILE, --file=FILE  Execute commands from FILE, then exit
  -t TRANSPORT_FACTORY, --transport-factory=TRANSPORT_FACTORY