[jira] [Comment Edited] (CASSANDRA-10085) Allow BATCH with conditions to span multiple tables with same partition key

Dennis Roppelt (JIRA) Mon, 07 Sep 2015 00:08:07 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14733311#comment-14733311
 ]


Dennis Roppelt edited comment on CASSANDRA-10085 at 9/7/15 7:06 AM:
--------------------------------------------------------------------

Hello Sylvain,
Thank you for the quick and informative reply.
I already anticipated that this would not be trivial to implement, and might go 
against some design principles created in C*.
I also can see why this won't be implemented soon or at all. I will look into a 
different approach of modelling my data or see if another database is a better 
fit (obviously standard RDBMS will work for the model).

It is a bit disappointing though, that this limitation was not mentioned in the 
documentation. Compared to other NoSQL-DBs, Cassandra usually has an 
outstandingly good documentation, with lots of examples and details.
How would I suggest adding a hint about the limitiation with batch + 
conditional statements across tables? Creating a new issue and link to this one?


was (Author: roppeldidu):
Hello Sylvain,
Thank you for the quick and informative reply.
I already anticipated that this would not be trivial to implement, and might go 
against some design principles created in C*.
I also can see why this won't be implemented soon or at all. I will look into a 
different approach of modelling my data or see if another database is a better 
fit (obviously standard RDBMS will work for the model).

It is a bit disappointing though, that this limitation was not mentioned in the 
documentation. Compared to other NoSQL-DBs, Cassandra has an outstandingly good 
documentation.
How would I suggest adding a hint about the limitiation with batch + 
conditional statements across tables? Creating a new issue and link to this one?

> Allow BATCH with conditions to span multiple tables with same partition key
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-10085
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10085
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Dennis Roppelt
>              Labels: batch, conditional-statement, partitioning
>
> h4. Use case:
> I want to use batch to keep data between tables synchronized with help of 
> conditionals. In particular, I want to have a ONE-to-ONE relationship in my 
> data model. I dont want to have an unintended upsert while trying to maintain 
> a consistent one-to-one relationship (write requests from other clients that 
> try to create a different relationship with one of the keys I inteded to 
> use), which is why I would like the IF NOT EXISTS when inserting data into 
> multiple tables in a batch.
> But when trying to insert data in a batch-block with an IF NOT EXISTS, I get 
> the following response (reproduceable example further below):
> {code}
> cqlsh:testkeyspace> 
> BEGIN BATCH 
> INSERT INTO key1_to_key2 (partKey, key1, key2) values (1,2,3) IF NOT EXISTS; 
> INSERT INTO key2_to_key1 (partKey, key1, key2) values (1,2,3) IF NOT EXISTS; 
> APPLY BATCH;
> Bad Request: Batch with conditions cannot span multiple tables{code}
> The tables have a primary key on (partKey, key1) and (partKey, key2). Which 
> means that partKey in both cases is their partition key. Which in the example 
> also is the same value.
> h4. Why I want to use a BATCH-statement that way:
> In traditional databases, to design a one-to-one relationship, I would create 
> one table with two keys, both with a unique constraint:
> {code:sql}
> CREATE TABLE  `myOneToOneTable` (
>   `ID` int NOT NULL,
>   `KEY_ONE` int NOT NULL,
>   `KEY_TWO` int NOT NULL,
> // some other fields
>   PRIMARY KEY (`ID`),
>   UNIQUE KEY `KEY_ONE`,
>   UNIQUE KEY `KEY_TWO`
> );
> {code}
> (simplified example taken from an [one-to-one example for 
> mysql|http://www.mkyong.com/mysql/how-to-define-one-to-one-relationship-in-mysql/])
> As far as I know, unique constraints are not supported in Cassandra, so 
> another way around it would be to maintain the relationshipt between two 
> tables, each sharing the same partition key. For which I would like Cassandra 
> to make sure that when I insert data, either both are inserted at the same 
> time or none (to prevent other writing client to insert a different 
> one-to-one relationship than me). 
> [According to the documentation, this is what BATCH-Statements are used 
> for|http://docs.datastax.com/en/cql/3.3/cql/cql_using/useBatch.html]
> {quote}
> Instead, using batches to synchronize data to tables is a legitimate 
> operation. 
> {quote}
> I expected it to be able to do it, but as seen above, I cannot enforce INSERT 
> IF NOT EXISTS over multiple tables.
> I do understand the background as to *why* there is a limitation coded that 
> way, so that not too many nodes have to take part in processing of the batch. 
> However, if the operations use
> # primary keys in their statements for each table
> # all primary keys contain the same value, thus resulting in the same hash 
> for partioning
> # tables belong to the same keyspace
> the batch statement should be able to be limited to the same amount of nodes 
> as if the batch contained INSERT IF NOT EXISTs for only one table. 
> h4. Steps to reproduce desired behaviour:
> I used the [cassandra ova on virtual 
> box|http://www.planetcassandra.org/install-cassandra-ova-on-virtualbox/], but 
> also a Windows Setup with 2.2.0
> {code}
> Connected to Cluster on a Stick at localhost:9160.                            
>   
> [cqlsh 4.1.1 | Cassandra 2.0.7 | CQL spec 3.1.1 | Thrift protocol 19.39.0]    
>   
> Use HELP for help.                                                            
>   
> cqlsh> CREATE KEYSPACE conditionalBatch                                       
>   
> WITH REPLICATION = {'class':'SimpleStrategy','replication_factor':1};    
> cqlsh> use conditionalbatch ;                                                 
>   
> cqlsh:conditionalbatch> CREATE TABLE key1_to_key2 (                           
>   
> partKey int,                                            
> key1 int,                                               
> key2 int,                                               
> PRIMARY KEY (partKey, key1)                             
> );                                                      
> cqlsh:conditionalbatch> CREATE TABLE key2_to_key1 ( partKey int, key2 int, 
> key1 
> int, PRIMARY KEY (partKey, key2) 
> );                                             
> cqlsh:conditionalbatch> BEGIN BATCH                                           
>   
> INSERT INTO key1_to_key2 (partKey, key1, key2) VALUES (1,2,3) IF NOT EXISTS;  
>                                                           
> INSERT INTO key2_to_key1 (partKey, Key1, key2) VALUES (1,2,3) IF NOT EXISTS;  
>                                                           
> APPLY BATCH;                                            
> {code}
> h4. What should happen:
> Either succeding, or:
> {code}
>  applied | partkey | key1 | key2|                                             
>  
> -----------+---------+------+------                                           
>   
>      false |       1 |    2 |    3                                            
>   
>      false |       1 |    2 |    3   
> {code}
> h4. What happens instead:
> {code}
> Bad Request: Batch with conditions cannot span multiple tables 
> {code}
> In my opinion, this is a very crucial feature wich enables the users to 
> maintain unique constraints with a relatively small amount of work. There 
> certainly are other ways to maintain unique constrains, but I have not found 
> a trivial way within cassandra that lets me that easily. But maybe I have 
> overlooked something though.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Comment Edited] (CASSANDRA-10085) Allow BATCH with conditions to span multiple tables with same partition key

Reply via email to