[jira] [Commented] (CASSANDRA-8354) A better story for dealing with empty values

2014-11-26 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14225968#comment-14225968
 ] 

Sylvain Lebresne commented on CASSANDRA-8354:
-

Thing is, I'm not sure how we could properly convert people out of those empty 
values completely without breaking thrift compatibility.

In particular, I'm not sure how that {{strict_cql_values}} option would work 
in practice (would that be a global yaml option that affects thrift too, btw?).

That said, I've kind of mixed two issues in this ticket. The main reason I 
opened this was the UDF question, but I realize that this question is actually 
already a problem with {{null}}, so I've created a separate issue for it 
(CASSANDRA-8374). Provided we fix that latter issue, it's probably ok for UDF 
to consider that empty values (for types for which they are not reasonable 
values) are always converted to {{null}} (which is in fact already how it 
works).

Still, it would be nice to change the default for CQL so that empty values are 
refused. I'm just not sure I see how to make that happen in practice without a 
syntax addition.


 A better story for dealing with empty values
 

 Key: CASSANDRA-8354
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8354
 Project: Cassandra
  Issue Type: Improvement
Reporter: Sylvain Lebresne
 Fix For: 3.0


 In CQL, a value of any type can be empty, even for types for which such 
 values don't make any sense (int, uuid, ...). Note that it's different from 
 having no value (i.e. a {{null}}). This is due to historical reasons, and we 
 can't entirely disallow it for backward compatibility, but it's pretty 
 painful when working with CQL since you always need to be defensive about 
 such largely nonsensical values.
 This is particularly annoying with UDF: those empty values are represented as 
 {{null}} for UDF and that plays weirdly with UDF that use unboxed native 
 types.
 So I would suggest that we introduce variations of the types that don't 
 accept empty byte buffers for those types for which it's not a particularly 
 sensible value.
 Ideally we'd use those variants by default, that is:
 {noformat}
 CREATE TABLE foo (k text PRIMARY KEY, v int)
 {noformat}
 would not accept empty values for {{v}}. But
 {noformat}
 CREATE TABLE foo (k text PRIMARY KEY, v int ALLOW EMPTY)
 {noformat}
 would.
 Similarly, for UDF, a function like:
 {noformat}
 CREATE FUNCTION incr(v int) RETURNS int LANGUAGE JAVA AS 'return v + 1';
 {noformat}
 would be guaranteed it can only be applied where no empty values are allowed. 
 A function that wants to handle empty values could be created with:
 {noformat}
 CREATE FUNCTION incr(v int ALLOW EMPTY) RETURNS int ALLOW EMPTY LANGUAGE JAVA 
 AS 'return (v == null) ? null : v + 1';
 {noformat}
 Of course, doing that has the problem of backward compatibility. One option 
 could be to say that if a type doesn't accept empties, but we do have an 
 empty internally, then we convert it to some reasonably sensible default 
 value (0 for numeric values, the smallest possible uuid for uuids, etc.). 
 This way, we could allow conversion of types to and from 'ALLOW EMPTY'. And 
 maybe we'd say that existing compact tables get the 'ALLOW EMPTY' flag for 
 their types by default.
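
A minimal sketch of the default-conversion option described above, assuming values arrive as raw {{ByteBuffer}}s. The class and method names are hypothetical and this is not Cassandra's actual code; it also assumes that "the smallest possible uuid" means the all-zero UUID, which the ticket does not spell out.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical helper: maps an empty serialized value to a sensible default
// for types that do not accept empties.
final class EmptyDefaults {
    // Empty int becomes 0; otherwise read the 4-byte value without moving position.
    static int intOrDefault(ByteBuffer raw) {
        return raw.remaining() == 0 ? 0 : raw.getInt(raw.position());
    }

    // Empty uuid becomes the all-zero UUID (an assumption about "smallest possible").
    static UUID uuidOrDefault(ByteBuffer raw) {
        if (raw.remaining() == 0) {
            return new UUID(0L, 0L);
        }
        long msb = raw.getLong(raw.position());
        long lsb = raw.getLong(raw.position() + 8);
        return new UUID(msb, lsb);
    }

    public static void main(String[] args) {
        ByteBuffer empty = ByteBuffer.allocate(0);
        System.out.println(intOrDefault(empty));  // prints 0
        System.out.println(uuidOrDefault(empty)); // prints 00000000-0000-0000-0000-000000000000
    }
}
```

Note that such a conversion is lossy: once an empty has been read as 0, it can no longer be told apart from a real 0.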



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8354) A better story for dealing with empty values

2014-11-25 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14224854#comment-14224854
 ] 

Sylvain Lebresne commented on CASSANDRA-8354:
-

The one data point is that I clearly remember having a debate with someone who 
argued that using empty values as a way to emulate nulls in clustering columns 
was useful for him. Now, as much as I personally disagree with that, I would 
slightly prefer to take the less radical option of keeping the old behavior 
possible, but hidden behind a non-default flag. Provided we properly document 
it and make it clear that it's a bad idea to use it except for backward 
compatibility's sake, I don't think having it in the syntax is such a big deal. 
In fact, that ALLOW EMPTY (or whatever equivalent syntax) could be useful for 
types like strings or blob when you know an empty string/blob doesn't make 
sense and you want the database to validate it (that is, allowing more precise 
validation server side is not a bad thing imo).

The other thing is that automatically and unconditionally converting empty 
values on upgrade could be a pretty painful upgrade for users that do use those 
empty values.

Anyway, my point is, I wish as much as anyone else that we had had no empty 
values for types for which they don't make sense from day one, but since that's 
not the case, I'd have a preference for the option that gives us the proper 
default while making it as painless as possible for upgraders (even those 
upgraders that we disagree with).



[jira] [Commented] (CASSANDRA-8354) A better story for dealing with empty values

2014-11-25 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14224867#comment-14224867
 ] 

Aleksey Yeschenko commented on CASSANDRA-8354:
--

Orthogonally to the primary question here, can we maybe start allowing explicit 
null in partition key columns/clustering columns? (encode size as -1, as we do 
for tuples and UDTs now).
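
A minimal sketch of the size-prefix idea mentioned above, assuming a 4-byte signed length in front of each component, where -1 marks an explicit null and 0 marks an empty value. The class name and layout are hypothetical; the actual on-disk encoding of partition key and clustering components may differ.

```java
import java.nio.ByteBuffer;

// Hypothetical length-prefixed component codec: size -1 means null,
// size 0 means empty, anything else is the payload length in bytes.
final class NullableComponent {
    static ByteBuffer encode(byte[] value) {
        int payload = value == null ? 0 : value.length;
        ByteBuffer out = ByteBuffer.allocate(4 + payload);
        out.putInt(value == null ? -1 : payload);
        if (value != null) {
            out.put(value);
        }
        out.flip(); // make the buffer readable from the start
        return out;
    }

    static byte[] decode(ByteBuffer in) {
        int size = in.getInt();
        if (size < 0) {
            return null; // explicit null, distinct from an empty value
        }
        byte[] value = new byte[size];
        in.get(value);
        return value;
    }

    public static void main(String[] args) {
        System.out.println(decode(encode(null)) == null);     // prints true
        System.out.println(decode(encode(new byte[0])).length); // prints 0
    }
}
```

With this layout null and empty stay distinguishable on the wire, so empty values would no longer be needed to emulate nulls.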



[jira] [Commented] (CASSANDRA-8354) A better story for dealing with empty values

2014-11-25 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14225824#comment-14225824
 ] 

Jonathan Ellis commented on CASSANDRA-8354:
---

bq. ALLOW EMPTY (or whatever equivalent syntax) could be useful for types 
like strings or blob when you know an empty string/blob doesn't make sense and 
you want the database to validate it

This is too specific a use case to warrant special syntax.  We can certainly 
add CHECK constraints using UDF though.



[jira] [Commented] (CASSANDRA-8354) A better story for dealing with empty values

2014-11-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14223462#comment-14223462
 ] 

Jonathan Ellis commented on CASSANDRA-8354:
---

Is there a way we can avoid permanently enshrining this wart?

What if for instance we added an option {{strict_cql_values}} to 3.0 that 
defaults to false.  When enabled it rejects nonsensical empty values.  For 3.1 
we default to true, and give people a tool to convert empty to null or some 
other value.  For 4.0 it stays permanently true.
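
A minimal sketch of what such a strictness check could look like, assuming a boolean flag mirroring the proposed {{strict_cql_values}} option. The class, the enum, and the choice of which types treat empty as meaningful are all hypothetical, not Cassandra's actual validation code.

```java
import java.nio.ByteBuffer;

// Hypothetical write-time validation gated by a strictness flag.
final class StrictValues {
    enum Kind { INT, UUID, TEXT, BLOB }

    // Assumption: empty is a legitimate value only for text and blob.
    static boolean emptyIsMeaningful(Kind kind) {
        return kind == Kind.TEXT || kind == Kind.BLOB;
    }

    // Rejects a zero-length value for types where it is nonsensical,
    // but only when strict mode is enabled. A null value is not "empty".
    static void validate(Kind kind, ByteBuffer value, boolean strict) {
        boolean isEmpty = value != null && value.remaining() == 0;
        if (strict && isEmpty && !emptyIsMeaningful(kind)) {
            throw new IllegalArgumentException("empty value not allowed for type " + kind);
        }
    }

    public static void main(String[] args) {
        ByteBuffer empty = ByteBuffer.allocate(0);
        validate(Kind.INT, empty, false); // accepted: legacy behavior
        validate(Kind.TEXT, empty, true); // accepted: empty text is meaningful
        try {
            validate(Kind.INT, empty, true);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints "empty value not allowed for type INT"
        }
    }
}
```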



[jira] [Commented] (CASSANDRA-8354) A better story for dealing with empty values

2014-11-24 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14223472#comment-14223472
 ] 

Aleksey Yeschenko commented on CASSANDRA-8354:
--

bq. What if for instance we added an option strict_cql_values to 3.0 that 
defaults to false. When enabled it rejects nonsensical empty values. For 3.1 we 
default to true, and give people a tool to convert empty to null or some other 
value. For 4.0 it stays permanently true.

That. Except it's not just CQL, there is thrift too, where we should enforce 
this, so maybe we should name it 'reject_empty_types' or something. As a tool, 
upgradesstables will probably do.

Don't want to legitimize it on CQL syntax level either.



[jira] [Commented] (CASSANDRA-8354) A better story for dealing with empty values

2014-11-21 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14220930#comment-14220930
 ] 

Robert Stupp commented on CASSANDRA-8354:
-

I'd prefer something like {{ALLOW NULL}} for UDFs since {{null}} and empty 
are equivalent for a UDF (it cannot handle an _empty int_ or _empty uuid_).
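
A minimal sketch of why the two are indistinguishable from inside a UDF, assuming arguments are deserialized from raw {{ByteBuffer}}s before the function body runs. The names are hypothetical and this is not Cassandra's actual UDF plumbing.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of how both null and empty reach a UDF as null.
final class EmptyVsNull {
    // Both "no value" (null) and "empty" (zero remaining bytes) decode to null,
    // since there is no Java Integer that could represent an empty int.
    static Integer decodeInt(ByteBuffer raw) {
        if (raw == null || raw.remaining() == 0) {
            return null;
        }
        return raw.getInt(raw.position()); // absolute read, position unchanged
    }

    // A UDF body written against the unboxed type, as in 'return v + 1'.
    static int incr(int v) {
        return v + 1;
    }

    public static void main(String[] args) {
        ByteBuffer five = ByteBuffer.allocate(4).putInt(0, 5);
        ByteBuffer empty = ByteBuffer.allocate(0);

        System.out.println(incr(decodeInt(five))); // prints 6
        try {
            incr(decodeInt(empty)); // auto-unboxing null throws
        } catch (NullPointerException e) {
            System.out.println("empty value reached an unboxed parameter");
        }
    }
}
```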
