[ https://issues.apache.org/jira/browse/CASSANDRA-11030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15127558#comment-15127558 ]
Stefania edited comment on CASSANDRA-11030 at 2/2/16 3:10 AM:
--------------------------------------------------------------

You are correct, it finally works.

I think I initially inserted the data by copy and paste in a git bash terminal (launched via ConEmu), the only one where I could paste a unicode character. For this terminal the default encoding was cp1252, since I only worked out today how to change it to cp65001. So even if I had inserted the data with --encoding=UTF-8, it would probably have caused problems. From other terminals (command prompt, PowerShell) I could not paste the character into cqlsh, and trying to insert something like u'\uXXXX' would give a syntax error.

The following works however (unicode.cql is encoded with utf-8):

{code}
chcp 65001

C:\Users\stefania\git\cstar\cassandra>type unicode.cql
INSERT INTO test.test (val) VALUES ('não');

C:\Users\stefania\git\cstar\cassandra>bin\cqlsh.bat --encoding=UTF-8 --file=unicode.cql

C:\Users\stefania\git\cstar\cassandra>bin\cqlsh.bat --encoding=UTF-8
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 2.2.5-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4]
Use HELP for help.
cqlsh> select * from test.test;

 val
-----
 não
{code}

The source command also works *provided the encoding specified via the command line is the same as the file encoding*, otherwise we get a missing character glyph (a square).

Inserting the character directly from git bash also works now, but only because I changed the code page to 65001 for it; otherwise it causes the original problem.

You are probably right regarding changing the default encoding, I'm +1 to change it to 'utf-8' if you want. Also, shouldn't {{do_source}} use the same encoding as the file encoding?

I think we should also stress that whichever terminal people are using on Windows, it should have the same encoding as the one used by cqlsh.

We can commit this ticket as is and open a new ticket re. default encoding or change it here, up to you.
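To illustrate the mismatch described in the comment above, here is a minimal Python sketch of what happens at the byte level when text is written with one encoding and read with another. It is not cqlsh code; the helper name {{roundtrip}} and the variable names are hypothetical, and it only assumes the two encodings mentioned in the comment (utf-8 and cp1252).

```python
# Illustration only: not taken from cqlsh. It shows the byte-level effect
# of writing text with one encoding and reading it with a different one.
TEXT = "n\u00e3o"  # the string 'não' used in the ticket


def roundtrip(text, written_as, read_as):
    """Encode text with one codec, then decode the bytes with another.

    Undecodable bytes are replaced with U+FFFD instead of raising.
    """
    return text.encode(written_as).decode(read_as, errors="replace")


# Same encoding on both sides: the text survives unchanged.
matching = roundtrip(TEXT, "utf-8", "utf-8")

# UTF-8 bytes interpreted by a cp1252 terminal: two wrong characters
# appear in place of the single accented one.
utf8_read_as_cp1252 = roundtrip(TEXT, "utf-8", "cp1252")

# cp1252 bytes interpreted as UTF-8: the lone byte 0xE3 is not valid
# UTF-8, so it becomes the replacement character.
cp1252_read_as_utf8 = roundtrip(TEXT, "cp1252", "utf-8")

# ascii() keeps the output printable on any console code page.
print(ascii(matching))
print(ascii(utf8_read_as_cp1252))
print(ascii(cp1252_read_as_utf8))
```

Only the case where both sides agree preserves the character, which is consistent with the observation that the terminal code page, the {{--encoding}} option and the file encoding all need to match.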
> utf-8 characters incorrectly displayed/inserted on cqlsh on Windows
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-11030
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11030
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Paulo Motta
>            Assignee: Paulo Motta
>            Priority: Minor
>              Labels: cqlsh, windows
>
> {noformat}
> C:\Users\Paulo\Repositories\cassandra [2.2-10948 +6 ~1 -0 !]> .\bin\cqlsh.bat --encoding utf-8
> Connected to test at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 2.2.4-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4]
> Use HELP for help.
> cqlsh> INSERT INTO bla.test (bla ) VALUES ('não') ;
> cqlsh> select * from bla.test;
>  bla
> -----
>  n?o
> (1 rows)
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)