[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users

2017-03-09 Thread Ryan Svihla (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903890#comment-15903890
 ] 

Ryan Svihla commented on CASSANDRA-13315:
-

I don't think any should be removed, but nearly every time I dig into an 
EACH_QUORUM use case with someone, it ends up not being what they want. 
EACH_QUORUM doesn't roll back writes, so even if a write fails for lack of 
replicas, the 'failed' write can still be read successfully from the nodes 
where it did succeed. In effect, during DC connection outages, unless you just 
turn writes off you get divergence between the two DCs, and reads show data in 
one DC that does not show up in the other. Also, several customers using 
EACH_QUORUM have had the downgrading retry policy turned on, defeating it 
entirely.
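
The failure mode described above can be shown with a toy model. This is only 
a sketch: it assumes two hypothetical DCs (dc1, dc2) with three replicas each, 
and every name in it is illustrative rather than a Cassandra or driver API. 
The mutation is applied on each reachable replica first, the per-DC quorum 
check happens afterwards, and nothing is undone when the check fails.

```python
# Toy model only: replicas are dicts, DCs are lists of replicas.
# All names here are illustrative, not Cassandra or driver APIs.

def quorum(rf):
    """Smallest majority of rf replicas."""
    return rf // 2 + 1

def each_quorum_write(cluster, reachable, key, value):
    """Apply the write on every reachable replica, then report whether
    every DC acknowledged a quorum. Nothing is rolled back on failure."""
    acks = {}
    for dc, replicas in cluster.items():
        acks[dc] = 0
        for i, replica in enumerate(replicas):
            if (dc, i) in reachable:
                replica[key] = value  # the mutation is applied here
                acks[dc] += 1
    return all(acks[dc] >= quorum(len(cluster[dc])) for dc in cluster)

cluster = {"dc1": [{}, {}, {}], "dc2": [{}, {}, {}]}
# dc2 is cut off: only the three dc1 replicas can be reached.
reachable = {("dc1", 0), ("dc1", 1), ("dc1", 2)}

ok = each_quorum_write(cluster, reachable, "k", "v")
dc1_has_value = any("k" in r for r in cluster["dc1"])
dc2_has_value = any("k" in r for r in cluster["dc2"])
print(ok, dc1_has_value, dc2_has_value)
```

In this model the write is reported as failed, yet dc1 already holds the value 
and dc2 does not, which is the divergence between DCs mentioned above.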

> Consistency is confusing for new users
> --
>
> Key: CASSANDRA-13315
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13315
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Ryan Svihla
>
> New users really struggle with consistency levels and fall into a large 
> number of tarpits trying to decide on the right one.
> 1. There are a LOT of consistency levels and it's up to the end user to 
> reason about which combinations are valid and which one actually matches 
> their intent. Is there any reason why write at ALL and read at CL TWO is 
> better than read at CL ONE? 
> 2. They require a good understanding of failure modes to use well. It's not 
> uncommon for people to use CL ONE and wonder why their data is missing.
> 3. The serial consistency level "bucket" is confusing to even write about and 
> easy to get wrong even for experienced users.
> So I propose the following steps (EDIT based on Jonathan's comment):
> 1. Remove the separate "serial consistency" bucket and just have all 
> consistency levels in one bucket to set; a condition is still required for 
> SERIAL/LOCAL_SERIAL.
> 2. Add 3 new consistency levels that point to existing ones but convey 
> intent much more cleanly:
> EDIT: better names based on comments.
> * EVENTUALLY = LOCAL_ONE reads and writes
> * STRONG = LOCAL_QUORUM reads and writes
> * SERIAL = LOCAL_SERIAL reads and writes (though a ton of folks don't know 
> what SERIAL means, which is why I suggested TRANSACTIONAL even if it's not as 
> correct as I'd like)
> For global versions of these I propose keeping the old ones around; they're 
> rarely used in the field except by accident or by particularly opinionated 
> and advanced users.
> Drivers should put the new consistency levels in a new package and docs 
> should be updated to suggest their use. Likewise, setting the default CL 
> should only offer those three settings and apply to reads and writes at the 
> same time.
> CQLSH, I'm going to suggest, should default to HIGHLY_CONSISTENT. New 
> sysadmins get surprised by this frequently and I can think of a couple of 
> very major escalations because people were confused about what the default 
> behavior was.
> The benefit of all this change is that we greatly shrink the surface area one 
> has to understand when learning Cassandra, and we have far fewer bad initial 
> experiences and surprises. New users will be able to wrap their brains around 
> those 3 ideas more readily than they can "what happens when I have RF2, 
> QUORUM writes and ONE reads". Advanced users still get access to all the 
> existing levels, while new users don't have to learn all the ins and outs of 
> distributed theory just to write data and be able to read it back.
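
The questions quoted in the description ("write at ALL and read at CL TWO" or 
"RF2, QUORUM writes and ONE reads") come down to the usual overlap rule: a 
read is guaranteed to touch at least one replica that took the latest 
successful write when R + W > RF. A minimal sketch of that arithmetic follows. 
The helper names are made up for illustration, only a single-DC view is 
modelled, and failed or timed-out writes are ignored.

```python
def quorum(rf):
    """Smallest majority of rf replicas."""
    return rf // 2 + 1

def replicas_required(level, rf):
    """Replicas that must respond for a level (single-DC view only)."""
    table = {"ONE": 1, "TWO": 2, "THREE": 3, "QUORUM": quorum(rf), "ALL": rf}
    return table[level]

def read_overlaps_write(write_cl, read_cl, rf):
    """True when every read set must intersect every write set (R + W > RF)."""
    w = replicas_required(write_cl, rf)
    r = replicas_required(read_cl, rf)
    return r + w > rf

# Write at ALL: reading at ONE already overlaps, TWO adds nothing to that.
print(read_overlaps_write("ALL", "ONE", 3))
print(read_overlaps_write("ALL", "TWO", 3))
# RF2: QUORUM is 2 of 2, so it overlaps with ONE reads but behaves like ALL.
print(read_overlaps_write("QUORUM", "ONE", 2))
# The classic surprise: ONE writes and ONE reads at RF3 do not overlap.
print(read_overlaps_write("ONE", "ONE", 3))
```

Under these assumptions the first three combinations overlap and the last one 
does not, which is the kind of reasoning the proposal wants to spare new users.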



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users

2017-03-09 Thread Russell Spitzer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903888#comment-15903888
 ] 

Russell Spitzer commented on CASSANDRA-13315:
-

+1 On moving towards semantically meaningful terms instead of technically 
correct ones. 



[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users

2017-03-09 Thread DOAN DuyHai (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903882#comment-15903882
 ] 

DOAN DuyHai commented on CASSANDRA-13315:
-

As long as you don't remove EACH_QUORUM I'm fine. There are very rare cases 
(where 2 DCs are very close geographically to each other) where customers want 
EACH_QUORUM to be sure that mutations have been applied in both DCs.



[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users

2017-03-09 Thread Caleb Rackliffe (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903649#comment-15903649
 ] 

Caleb Rackliffe commented on CASSANDRA-13315:
-

+1 on the idea of synonyms as recipes, and I'll add some naming ideas, because 
why not?

EVENTUAL_SAME_DC
STRONG_SAME_DC
SERIAL_SAME_DC



[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users

2017-03-09 Thread Ryan Svihla (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903481#comment-15903481
 ] 

Ryan Svihla commented on CASSANDRA-13315:
-

For SERIAL/TRANSACTIONAL I think there is a better middle-ground name in 
there, maybe LIGHTWEIGHT_TRANSACTION: more Cassandra specific and a hint for 
new users without being as technically incorrect as TRANSACTIONAL.



[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users

2017-03-09 Thread Ryan Svihla (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903468#comment-15903468
 ] 

Ryan Svihla commented on CASSANDRA-13315:
-

Jeff, how about just tagging it with _DC? Slight pushback: everyone is happily 
calling multi-DC RDBMSs ACID compliant even when that's only true inside a DC.

> Consistency is confusing for new users
> --
>
> Key: CASSANDRA-13315
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13315
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Ryan Svihla
>
> New users really struggle with consistency levels and fall into a large 
> number of tarpits trying to decide on the right one.
> 1. There are a LOT of consistency levels and it's up to the end user to 
> reason about which combinations are valid and which one actually matches 
> their intent. Is there any reason why write at ALL and read at CL TWO is 
> better than read at CL ONE? 
> 2. They require a good understanding of failure modes to use well. It's not 
> uncommon for people to use CL ONE and wonder why their data is missing.
> 3. The serial consistency level "bucket" is confusing to even write about and 
> easy to get wrong even for experienced users.
> So I propose the following steps (EDIT based on Jonathan's comment):
> 1. Remove the separate "serial consistency" bucket and just have all 
> consistency levels in one bucket to set; a condition is still required for 
> SERIAL/LOCAL_SERIAL.
> 2. Add 3 new consistency levels that point to existing ones but convey 
> intent much more cleanly:
> * EVENTUALLY_CONSISTENT = LOCAL_ONE reads and writes
> * HIGHLY_CONSISTENT = LOCAL_QUORUM reads and writes
> * TRANSACTIONALLY_CONSISTENT = LOCAL_SERIAL reads and writes
> For global versions of these I propose keeping the old ones around; they're 
> rarely used in the field except by accident or by particularly opinionated 
> and advanced users.
> Drivers should put the new consistency levels in a new package and docs 
> should be updated to suggest their use. Likewise, setting the default CL 
> should only offer those three settings and apply to reads and writes at the 
> same time.
> CQLSH, I'm going to suggest, should default to HIGHLY_CONSISTENT. New 
> sysadmins get surprised by this frequently and I can think of a couple of 
> very major escalations because people were confused about what the default 
> behavior was.
> The benefit of all this change is that we greatly shrink the surface area one 
> has to understand when learning Cassandra, and we have far fewer bad initial 
> experiences and surprises. New users will be able to wrap their brains around 
> those 3 ideas more readily than they can "what happens when I have RF2, 
> QUORUM writes and ONE reads". Advanced users still get access to all the 
> existing levels, while new users don't have to learn all the ins and outs of 
> distributed theory just to write data and be able to read it back.





[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users

2017-03-09 Thread Benjamin Roth (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903459#comment-15903459
 ] 

Benjamin Roth commented on CASSANDRA-13315:
---

I had the same problems in the beginning, so generally +1. But IMHO this should 
go along with an explanatory section in the official docs.



[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users

2017-03-09 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903457#comment-15903457
 ] 

Jeff Jirsa commented on CASSANDRA-13315:


Bikeshed: Calling anything {{LOCAL_}} highly or strongly consistent is probably 
asking for trouble.



[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users

2017-03-09 Thread Ryan Svihla (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903419#comment-15903419
 ] 

Ryan Svihla commented on CASSANDRA-13315:
-

On the power users: it'll just be up to the driver implementers how they 
handle that, right? (Different packages and methods for the power user, for 
example.)



[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users

2017-03-09 Thread Ryan Svihla (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903404#comment-15903404
 ] 

Ryan Svihla commented on CASSANDRA-13315:
-

Those are better names, +1 on that.

Dual CL: yeah, I misstated that, and we're on the same page with intent. As 
I've said, it's hard to even talk about it in text without getting bewildered. 
As long as we have only a single bucket to set and we require a condition for 
SERIAL, I'm fine.

> Consistency is confusing for new users
> --
>
> Key: CASSANDRA-13315
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13315
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Ryan Svihla
>
> New users really struggle with consistency level and fall into a large number 
> of tarpits trying to decide on the right one.
> 1. There are a LOT of consistency levels and it's up to the end user to 
> reason about what combinations are valid and what is really what they intend 
> it to be. Is there any reason why write at ALL and read at CL TWO is better 
> than read at CL ONE? 
> 2. They require a good understanding of failure modes to do well. It's not 
> uncommon for people to use CL one and wonder why their data is missing.
> 3. The serial consistency level "bucket" is confusing to even write about and 
> easy to get wrong even for experienced users.
> So I propose the following steps:
> 1. Remove the "serial consistency" level of consistency levels and just have 
> all consistency levels in one bucket at the protocol level.
> 2. To enable #1 just reject writes or updates done without a condition when 
> SERIAL/LOCAL_SERIAL is specified.
> 3. add 3 new consistency levels pointing to existing ones but that infer 
> intent much more cleanly:
>* EVENTUALLY_CONSISTENT = LOCAL_ONE reads and writes
>* HIGHLY_CONSISTENT = LOCAL_QUORUM reads and writes
>* TRANSACTIONALLY_CONSISTENT = LOCAL_SERIAL reads and writes
> for global levels of this I propose keeping the old ones around, they're 
> rarely used in the field except by accident or particularly opinionated and 
> advanced users.
> Drivers should put the new consistency levels in a new package and docs 
> should be updated to suggest their use. Likewise setting default CL should 
> only provide those three settings and applying it for reads and writes at the 
> same time.
> I suggest CQLSH should default to HIGHLY_CONSISTENT. New sysadmins are 
> frequently surprised by the current default, and I can think of a couple of 
> very major escalations because people were confused about what the default 
> behavior was.
> The benefit of all this change is that we greatly shrink the surface area one 
> has to understand when learning Cassandra, and we have far fewer bad initial 
> experiences and surprises. New users will be able to wrap their brains around 
> those 3 ideas more readily than they can "what happens when I have RF2, 
> QUORUM writes and ONE reads". Advanced users still get access to all the 
> existing levels, while new users don't have to learn all the ins and outs of 
> distributed theory just to write data and be able to read it back.
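
For reference, the "RF2, QUORUM writes and ONE reads" question in the quoted 
description comes down to replica-count arithmetic. Below is a minimal sketch, 
assuming a single datacenter and the usual quorum size of floor(RF/2) + 1. The 
function names are made up for illustration and are not part of any driver.

```python
def quorum(rf: int) -> int:
    """Replicas needed for QUORUM at replication factor rf."""
    return rf // 2 + 1


def replicas_required(level: str, rf: int) -> int:
    """Replicas that must respond for a given (non-DC-aware) level."""
    required = {"ONE": 1, "TWO": 2, "QUORUM": quorum(rf), "ALL": rf}
    if level not in required:
        raise ValueError(f"level not covered by this sketch: {level}")
    return required[level]


def read_overlaps_write(write_cl: str, read_cl: str, rf: int) -> bool:
    """True when every read set must share a replica with every write set."""
    return replicas_required(write_cl, rf) + replicas_required(read_cl, rf) > rf


def tolerated_down(level: str, rf: int) -> int:
    """How many replicas can be down while the level can still be met."""
    return rf - replicas_required(level, rf)


if __name__ == "__main__":
    for rf in (2, 3):
        print(
            f"RF{rf}: quorum={quorum(rf)}, "
            f"QUORUM/ONE overlap={read_overlaps_write('QUORUM', 'ONE', rf)}, "
            f"QUORUM tolerates {tolerated_down('QUORUM', rf)} down"
        )
```

At RF 2 a quorum is both replicas, so QUORUM writes with ONE reads do overlap, 
but a single node being down fails every QUORUM write. At RF 3 the same 
combination no longer overlaps.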



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13315) Consistency is confusing for new users

2017-03-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15903379#comment-15903379
 ] 

Jonathan Ellis commented on CASSANDRA-13315:


I like this idea a lot.  We have a lot more experience now with how people use 
and misuse CL in the wild so I am comfortable getting a lot more opinionated in 
how we push people towards certain options and away from others.

1/2: The dual CL for serial operations isn't about what to do with no 
condition; it's for the "commit" from the Paxos sandbox to EC (eventually 
consistent) land.  So mandating a condition (don't we already?) doesn't make 
that go away.  But I think we could make the commit CL default to QUORUM and 
call it good.  (I'm having trouble thinking of a situation where you would need 
LWT, which already requires a quorum to participate, but also need a lower CL 
on commit.)

3: I would bikeshed this to

* EVENTUAL
* STRONG
* SERIAL

4. It sounds like we can do all of this at the driver level, except for adding 
some aliases to CQLSH.  I don't see any benefit to adding synonyms at the 
protocol level. 

5. How do we give power users the ability to use classic CL if they need it?
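
To make point 4 concrete, a driver-level alias layer could be as small as a 
lookup table in front of the existing levels. This is only a sketch: the 
mapping is the one from the issue description, while ClassicCL, ALIASES and 
resolve() are hypothetical names, not an existing driver API. Letting classic 
names pass through unchanged is one possible shape for point 5, not something 
settled in this thread. The shorter EVENTUAL / STRONG / SERIAL names could be 
swapped in as keys, though SERIAL would then shadow the classic level of the 
same name.

```python
from enum import Enum


class ClassicCL(Enum):
    """Subset of the existing consistency levels used by this sketch."""
    LOCAL_ONE = "LOCAL_ONE"
    LOCAL_QUORUM = "LOCAL_QUORUM"
    LOCAL_SERIAL = "LOCAL_SERIAL"
    QUORUM = "QUORUM"
    EACH_QUORUM = "EACH_QUORUM"


# Hypothetical alias table; the mapping itself comes from the issue description.
ALIASES = {
    "EVENTUALLY_CONSISTENT": ClassicCL.LOCAL_ONE,
    "HIGHLY_CONSISTENT": ClassicCL.LOCAL_QUORUM,
    "TRANSACTIONALLY_CONSISTENT": ClassicCL.LOCAL_SERIAL,
}


def resolve(name: str) -> ClassicCL:
    """Map an intent-based alias or a classic name to a classic level."""
    key = name.strip().upper()
    if key in ALIASES:
        return ALIASES[key]
    try:
        # Classic names pass through untouched (an assumption of this sketch).
        return ClassicCL[key]
    except KeyError:
        raise ValueError(f"unknown consistency level: {name!r}") from None


if __name__ == "__main__":
    for name in ("highly_consistent", "EACH_QUORUM"):
        print(name, "->", resolve(name).value)
```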



