[ 
https://issues.apache.org/jira/browse/CASSANDRA-8790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14317902#comment-14317902
 ] 

Johnny Miller edited comment on CASSANDRA-8790 at 2/12/15 9:48 AM:
-------------------------------------------------------------------

[~brandon.williams] I appreciate your point, but [~thobbs] example isn't the 
actual problem. Its specifically non-printable unicode characters.

This isn't a problem, when your using the drivers - it is only when using CQLSH.

The problem is that if I start writing data like this via the drivers and then 
for some reason I ever need to query it via CQLSH I will get the wrong answer 
when doing a select.

i.e. SELECT * from testunicode where id = 'state\u001Ccard' will not return any 
results via CQLSH when the data does actually exist in my table and the same 
query via the drivers (java) would actually return a result.

The workaround to use blobs is not great (IMHO) - its a shame to have model 
your data round this specific CQLSH limitation

We should either provide some functionality to handle this this or 
alternatively error if someone enters a non-printable unicode character in 
CQLSH as the answer we get back is incorrect and likely to mislead people.

If you have an example of handling this via CQLSH on any OS, please share it


was (Author: johnny15676):
[~brandon.williams] I appreciate your point, but [~thobbs] example isn't the 
actual problem. Its specifically non-printable unicode characters.

This isn't a problem, when your using the drivers - it is only when using CQLSH.

The problem is that if I start writing data like this via the drivers and then 
for some reason I ever need to query it via CQLSH I will get the wrong answer 
when doing a select.

i.e. SELECT * from testunicode where id = 'state\u001Ccard' will not return any 
results via CQLSH when the data does actually exist in my table and the same 
query via the drivers (java) would actually return a result.

The workaround to use blobs is not great (IMHO) - its a shame to have model 
your data round this specific CQLSH limitation

We should either provide some functionality to handle this this or 
alternatively error if someone enters a non-printable unicode character in 
CQLSH as the answer we get back is incorrect and likely to mislead people.

> Improve handling of non-printable unicode characters in text fields and CQLSH
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8790
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8790
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Johnny Miller
>            Priority: Minor
>
> Currently to store a string/text value that contains a non-printable unicode 
> character and than subsequently be able to query it CQLSH I need to store the 
> field as a blob via the blobAsText and textAsBlob functions. 
> This is not really optimal - it would be better if CQLSH handled this rather 
> than having to model data around this limitation.
> For example:
> {code:title=Example Code|borderStyle=solid}
> String createTableCql = "CREATE TABLE IF NOT EXISTS test_ks.testunicode (id 
> blob PRIMARY KEY, inserted_on timestamp, lorem text)";
> session.execute(createTableCql);
> System.out.println("Table created.");         
>               
> String dimension1 = "state";
> String dimension2 = "card";
> String key = dimension1 + '\u001C' + dimension2;
> Date now = new Date();
> String lorem = "Lorem ipsum dolor sit amet.";
>               
> String insertcql = "INSERT INTO testunicode (id, inserted_on, lorem) VALUES 
> (textAsBlob(?), ?, ?)";
> PreparedStatement ps = session.prepare(insertcql);
> BoundStatement bs = new BoundStatement(ps);
> bs.bind(key, now, lorem);
> session.execute(bs);
> System.out.println("Row inserted with key "+key);             
>               
> String selectcql = "SELECT blobAsText(id) AS id, inserted_on, lorem FROM 
> testunicode WHERE id = textAsBlob(?)";
> PreparedStatement ps2 = session.prepare(selectcql);
> BoundStatement bs2 = new BoundStatement(ps2);
> bs2.bind(key);
> ResultSet results = session.execute(bs2);
>               
> System.out.println("Got results...");
>               
> for (Row row : results) {
>                   System.out.println(String.format("%-30s\t%-20s\t%-20s", 
> row.getString("id"), row.getDate("inserted_on"), row.getString("lorem")));
> }
> {code}
> And to query via CQLSH:
> {code}
> select * from testunicode where id = 0x73746174651c63617264 ;
>  id                     | inserted_on              | lorem
> ------------------------+--------------------------+-----------------------------
>  0x73746174651c63617264 | 2015-02-11 20:32:20+0000 | Lorem ipsum dolor sit 
> amet.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to