[jira] [Commented] (CASSANDRA-8480) Update of primary key should be possible

2014-12-16 Thread Jason Kania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248363#comment-14248363
 ] 

Jason Kania commented on CASSANDRA-8480:


Ultimately as active contributors, you will decide whether you want to do this, 
but the issue is that usability of the DB is seriously hampered. I have read 
many responses from the Cassandra development team that make the statement that 
the schema modeling is incorrect, but without a really comprehensive set of 
examples to explain how one would model in such scenarios, it will drive away 
users. I personally have no idea how I could model other than what I have done 
given the circular dependencies that I stated above. Most of my effort has gone 
into modeling around restrictions, and if you look at the problems many people 
encounter and ask for assistance with, they are usually tied to these 
restrictions.

The problem I stated above, that if you can update a column you can't search 
on it, and if you can search on a column you can't update it, shuts down many 
uses of the database.

 Update of primary key should be possible
 

 Key: CASSANDRA-8480
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8480
 Project: Cassandra
  Issue Type: Improvement
  Components: API
Reporter: Jason Kania

 While attempting to update a column in a row, I encountered the error
 PRIMARY KEY part thingy found in SET part
 The error is not helpful as it doesn't state why this is a problem, so I looked 
 on Google and encountered many, many entries from people who have experienced 
 the issue, including those with single-column tables who have to hack to work 
 around this.
 After looking around further in the documentation, I discovered that it is 
 not possible to update a primary key but I still have not found a good 
 explanation. I suspect that this is because it would change the indexing 
 location of the record, effectively requiring a delete followed by an insert. 
 If the question is one of guaranteeing no update to a deleted row, a client 
 will have the same issue.
 To me, this really should be handled behind the API because:
 1) it is an expected capability in a database to update all columns and 
 having these limitations only puts off potential users especially when they 
 have to discover the limitation after the fact
 2) being able to use a column in a WHERE clause requires it to be part of the 
 primary key so what this limitation means is if you can update a column, you 
 can't search for it, or if you can search on a column, you can't update it 
 which leaves a serious gap in handling a wide range of use cases.
 3) deleting and inserting a row with an updated primary key will mean sucking 
 in all the data from the row up to the client and sending it all back down 
 even when a single column in the primary key was all that was updated.
 Why not document the issue but make the interface more usable by supporting 
 the operation?
 Jason



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8480) Update of primary key should be possible

2014-12-16 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248388#comment-14248388
 ] 

Aleksey Yeschenko commented on CASSANDRA-8480:
--

Don't get me wrong. We obviously want Cassandra to be as usable as possible, 
and CQL as expressive as possible, too.

Yet, there are things that are fundamentally anti-Cassandra, and this request 
is one of those things. One of the core principles of Cassandra is not 
introducing implicit reads to the write path, so that the write path has 
consistent performance characteristics.

Another ticket similar to this one is CASSANDRA-6750. That one would also be 
good to have. We recognize that. That said, being a distributed database with a 
focus on scaling out, we have to restrict certain functionality that doesn't 
fit the overall direction of the project :(



[jira] [Commented] (CASSANDRA-8480) Update of primary key should be possible

2014-12-16 Thread Jason Kania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248421#comment-14248421
 ] 

Jason Kania commented on CASSANDRA-8480:


Thanks for the response and explanation.

I am quite certain that with all your combined efforts to date you are looking 
to make Cassandra as usable as possible. It is just a question of what you rule 
out entirely versus what you document as performance-impacting. I have worked in 
the capacity of performance architect on several large-scale production systems 
and have had to balance user needs versus performance many times. My experience 
is that when choosing between stopping what users can do versus having the big 
flashing danger sign, the big flashing danger sign is usually what the end 
users are looking for.

I would suggest that it might be worth polling users about which approach would 
work versus falling back to core principles that may not be in the best 
interests of wider product adoption.



[jira] [Commented] (CASSANDRA-8480) Update of primary key should be possible

2014-12-16 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248485#comment-14248485
 ] 

Aleksey Yeschenko commented on CASSANDRA-8480:
--

You are not wrong. But it's not just about core principles (which are 
important, but not the only thing that matters). It's also about complexity.

Imagine a query like `UPDATE foo SET partition_key = 'bar' WHERE partition_key 
= 'baz'`. Executing it would involve reading the whole partition (potentially 
from many nodes, depending on CL), and very likely streaming it to a whole new 
set of replicas. This would require an entirely new write code path, and would 
break in spectacular ways from time to time. Operations like that are what 
Spark is for; they are not a good fit for CQL. Besides, it would not be idempotent, and 
writes must be idempotent in C* (with the exception of counters, and let's 
forget about lists for a second).
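
One way to read the idempotency point above is with a toy model in Python. This is only a sketch under stated assumptions: `apply_write` and `move_partition` are hypothetical helpers, the store is a plain in-memory dict, and nothing here reflects Cassandra's actual write path. It shows that replaying a timestamped write leaves the state unchanged, while replaying a read-then-rewrite "move" can change the outcome if another write lands in between.

```python
def apply_write(store, key, column, value, timestamp):
    """Plain write: last-write-wins on the timestamp, no read of other data."""
    row = store.setdefault(key, {})
    current = row.get(column)
    if current is None or timestamp >= current[1]:
        row[column] = (value, timestamp)


def move_partition(store, old_key, new_key):
    """Hypothetical key update: read the old partition, then rewrite it elsewhere."""
    cells = store.pop(old_key, {})              # implicit read plus delete
    store.setdefault(new_key, {}).update(cells)


# Replaying a plain write leaves the state unchanged.
plain = {}
apply_write(plain, "baz", "c1", "v1", 1)
after_first = {k: dict(v) for k, v in plain.items()}
apply_write(plain, "baz", "c1", "v1", 1)
replay_is_noop = plain == after_first


def run(retry):
    """Apply the move once, or once plus a retry after an interleaved write."""
    store = {}
    apply_write(store, "baz", "c1", "v1", 1)
    move_partition(store, "baz", "bar")
    apply_write(store, "baz", "c2", "v2", 2)    # lands between attempt and retry
    if retry:
        move_partition(store, "baz", "bar")
    return store


# The retried move also drags the later write along, so the end states differ.
move_is_idempotent = run(retry=False) == run(retry=True)
```

In this model the retried move depends on whatever happens to be in the old partition at the time it runs, which is the kind of implicit read the write path is designed to avoid.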

Ultimately, this is not an often-requested feature, if at all, and it's not 
something that can be implemented *well* on the Cassandra side, if at all, in a 
general way. So, on balance (complexity of implementation plus fundamental 
incompatibility with Cassandra write path vs. users' desire for the feature) 
the chances of this wish materializing are not high. Sorry.



[jira] [Commented] (CASSANDRA-8480) Update of primary key should be possible

2014-12-15 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246642#comment-14246642
 ] 

Aleksey Yeschenko commented on CASSANDRA-8480:
--

The explanation for this is that primary key columns are internally parts of 
the cell name (or a partition's key), not a cell's value. And while you can 
update a cell's value, you can't update its name (and most certainly not a 
partition key).
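
The distinction between a cell's name and its value can be illustrated with a small Python sketch. It is a toy model, not the storage engine: the `write_cell` helper, the dict layout, and the sample keys are all assumptions made for illustration. A partition is modeled as a dict from cell name (clustering values plus column name) to value.

```python
def write_cell(table, partition_key, clustering, column, value):
    """Upsert one cell; the key columns locate it, the value is the payload."""
    partition = table.setdefault(partition_key, {})
    partition[(clustering, column)] = value


table = {}
write_cell(table, "sensor-1", ("2014-12-15",), "reading", 20)

# Updating a regular column overwrites the value stored under the same name.
write_cell(table, "sensor-1", ("2014-12-15",), "reading", 21)

# "Changing" a clustering column addresses a different cell; the old one remains.
write_cell(table, "sensor-1", ("2014-12-16",), "reading", 21)

cells_in_partition = len(table["sensor-1"])
```

In this model there is no operation that renames an existing cell: writing with a different key simply creates a second cell, and the first one stays until it is deleted separately.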

Indeed, we kinda sorta maybe could implement it by deleting the old record 
entirely and writing a totally new one behind the scenes. I don't see a way to 
make it eventually consistent, however. Plus, it's at best debatable whether or 
not further hiding what's really going on internally is a good thing.

If you need to update part of a primary key, then in Cassandra's view, you've 
modeled your schema incorrectly, and should redesign it.
