[ https://issues.apache.org/jira/browse/CASSANDRA-8101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tyler Hobbs updated CASSANDRA-8101:
-----------------------------------
    Attachment: 8101.txt

The ascii problem was exactly as the description says, and the changes in AsciiType fix that.

When it comes to UTF-8, the issue runs deeper. There ended up being a netty bug (which I will open a ticket for shortly) that caused characters outside of the specified charset to be replaced (by \uFFFD). Since the native protocol specifies that all strings must be UTF-8, the validation happens in CBUtil.readString().

I've pushed a [dtest|https://github.com/thobbs/cassandra-dtest/tree/CASSANDRA-8101] to cover both cases.

In addition to the attached patch, there's also a [branch|https://github.com/thobbs/cassandra/tree/CASSANDRA-8101].

> Invalid ASCII and UTF-8 chars not rejected in CQL string literals
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-8101
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8101
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Tyler Hobbs
>            Assignee: Tyler Hobbs
>            Priority: Critical
>             Fix For: 2.0.11, 2.1.1
>
>         Attachments: 8101.txt
>
>
> When processing CQL string literals, we ultimately use
> {{String.getBytes(Charset)}}, which has the following note:
> {quote}
> This method always replaces malformed-input and unmappable-character
> sequences with this charset's default replacement byte array. The
> CharsetEncoder class should be used when more control over the encoding
> process is required.
> {quote}
> So, if we insert a non-ASCII character into an ascii string literal, it will
> be replaced with a {{?}} char. Something similar happens for UTF-8.
> For example:
> {noformat}
> cqlsh:ks1> create table badstrings (a int primary key, b ascii);
> cqlsh:ks1> insert into badstrings (a, b) VALUES ( 0, 'ΎΔδϠ');
> cqlsh:ks1> select * from badstrings;
>  a | b
> ---+------
>  0 | ????
> {noformat}
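
For anyone following along, the silent replacement described in the issue can be reproduced with the plain JDK, and a strict CharsetEncoder/CharsetDecoder rejects the same input instead. This is only a sketch to illustrate the behavior; it is not taken from the attached patch, and the class and helper names (StrictCharsets, encodeStrict, decodeStrict) are made up for the example.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictCharsets {
    // Hypothetical helper: encode s, throwing instead of substituting '?'.
    static byte[] encodeStrict(String s, Charset charset) throws CharacterCodingException {
        ByteBuffer buf = charset.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .encode(CharBuffer.wrap(s));
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }

    // Hypothetical helper: decode bytes, throwing instead of substituting \uFFFD.
    static String decodeStrict(byte[] bytes, Charset charset) throws CharacterCodingException {
        return charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    public static void main(String[] args) {
        String greek = "\u038E\u0394\u03B4\u03E0"; // the literal from the issue: 'ΎΔδϠ'

        // Lossy path: String.getBytes(Charset) replaces each unmappable char with '?'.
        byte[] lossy = greek.getBytes(StandardCharsets.US_ASCII);
        System.out.println(new String(lossy, StandardCharsets.US_ASCII));

        // Strict path: the same input is reported as an error.
        try {
            encodeStrict(greek, StandardCharsets.US_ASCII);
            System.out.println("accepted");
        } catch (CharacterCodingException e) {
            System.out.println("rejected on encode: " + e);
        }

        // Decoding side: 0xFF can never appear in well-formed UTF-8.
        byte[] badUtf8 = { (byte) 0xFF };
        try {
            decodeStrict(badUtf8, StandardCharsets.UTF_8);
            System.out.println("accepted");
        } catch (CharacterCodingException e) {
            System.out.println("rejected on decode: " + e);
        }
    }
}
```

The strict helpers throw a CharacterCodingException subclass (UnmappableCharacterException or MalformedInputException), which a caller could translate into an invalid-request error rather than storing replacement characters.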