[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17509053#comment-17509053 ] Ekaterina Dimitrova edited comment on CASSANDRA-17164 at 3/18/22, 7:41 PM: --- I didn't find a ticket for this one and it is consistently failing. Is there a plan? Or a ticket that I am missing? CC our build lead this week [~stefan.miklosovic] was (Author: e.dimitrova): I didn't find a ticket for this one and it is consistently failing. Is there a plan? Or a ticket for it? CC our build lead this week [~stefan.miklosovic] > CEP-14: Paxos Improvements > -- > > Key: CASSANDRA-17164 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17164 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Consistency/Repair >Reporter: Benedict Elliott Smith >Assignee: Benedict Elliott Smith >Priority: Normal > Fix For: 4.1 > > > This ticket encompasses work for [CEP-14| > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements]. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504677#comment-17504677 ] Ekaterina Dimitrova edited comment on CASSANDRA-17164 at 3/11/22, 8:02 PM: --- This commit broke all unit tests - compression and cdc. I believe because of removed empty line in the test cassandra.yaml. I will test now and if I am right I will ninja fix it... [https://jenkins-cm4.apache.org/job/Cassandra-trunk/1002/] was (Author: e.dimitrova): This commit broke all unit tests - compression and cdc. I believe because of removed empty line in the test cassandra.yaml. I will test now and if I am write I will ninja fix it... [https://jenkins-cm4.apache.org/job/Cassandra-trunk/1002/] > CEP-14: Paxos Improvements > -- > > Key: CASSANDRA-17164 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17164 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Consistency/Repair >Reporter: Benedict Elliott Smith >Assignee: Benedict Elliott Smith >Priority: Normal > Fix For: 4.1 > > > This ticket encompasses work for [CEP-14| > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements]. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504677#comment-17504677 ] Ekaterina Dimitrova edited comment on CASSANDRA-17164 at 3/11/22, 1:27 AM: --- This commit broke all unit tests - compression and cdc. I believe because of removed empty line in the test cassandra.yaml. I will test now and if I am write I will ninja fix it... [https://jenkins-cm4.apache.org/job/Cassandra-trunk/1002/] was (Author: e.dimitrova): This commit broke all unit tests - compression and cdc. I believe because of removed empty line is the tests cassandra.yaml. I will test now and if I am write I will ninja fix it... https://jenkins-cm4.apache.org/job/Cassandra-trunk/1002/ > CEP-14: Paxos Improvements > -- > > Key: CASSANDRA-17164 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17164 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Consistency/Repair >Reporter: Benedict Elliott Smith >Assignee: Benedict Elliott Smith >Priority: Normal > Fix For: 4.1 > > > This ticket encompasses work for [CEP-14| > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements]. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17491985#comment-17491985 ] Branimir Lambov edited comment on CASSANDRA-17164 at 2/14/22, 1:14 PM: --- I am ready with my high-level review. The new solution looks good to me and I am convinced of its correctness. It is a great improvement on both the performance and a number of issues of the older solution. I tried to [document|https://github.com/blambov/cassandra/blob/17164-trunk/src/java/org/apache/cassandra/service/paxos/Paxos.md] everything I found non-trivial to find and/or understand. This text will hopefully help all others looking at this code to understand the implementation's logic and should be included in the patch (with any corrections and additions you may want to add). was (Author: blambov): I am ready with my high-level review. The new version looks good to me and I am convinced of its correctness. It is a great improvement on both the performance and a number of issues of the older solution. I tried to [document|https://github.com/blambov/cassandra/blob/17164-trunk/src/java/org/apache/cassandra/service/paxos/Paxos.md] everything I found non-trivial to find and/or understand. This text will hopefully help all others looking at this code to understand the implementation's logic and should be included in the patch (with any corrections and additions you may want to add). > CEP-14: Paxos Improvements > -- > > Key: CASSANDRA-17164 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17164 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Consistency/Repair >Reporter: Benedict Elliott Smith >Assignee: Benedict Elliott Smith >Priority: Normal > Fix For: 4.1 > > > This ticket encompasses work for [CEP-14| > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements]. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17483827#comment-17483827 ] Branimir Lambov edited comment on CASSANDRA-17164 at 1/28/22, 4:03 PM: --- This is okay. I'm writing documentation with proofs, and having them separated makes things a little harder to prove. However, it's not hard to map what we actually do to the proof's definitions, so I added an explanation instead. I've reached this far: [https://github.com/blambov/cassandra/blob/17164-trunk/src/java/org/apache/cassandra/service/paxos/Paxos.md] I need some clarification on read commutativity/concurrent reads. It looks like we use "promised" (i.e. the read promise) when we decide acceptance of a proposal (the JavaDoc description also says so). This is sensible and correct and appears necessary, but does this not mean that no reads can actually execute concurrently? In other words, do we really gain anything by storing read promises separately? If we do skip proposing on a "safe" read, this is not a problem, but this is a separate optimization that does not appear to require read promises. was (Author: blambov): This is okay. I'm writing documentation with proofs, and having them separated makes things a little harder to prove. However, it's not hard to map what we actually do to the proof's definitions, so I added an explanation instead. I've reached this far: [https://github.com/blambov/cassandra/blob/17164-trunk/src/java/org/apache/cassandra/service/paxos/Paxos.md] I need some clarification on read commutativity/concurrent reads. It looks like we use "promised" (i.e. the read promise) when we decide acceptance of a proposal (the JavaDoc description also says so). This is sensible and correct and appears necessary, but does this not mean that no reads can actually execute concurrently? In other words, do we really gain anything by storing read promises separately? If we do skip proposing on a "safe" read, this is not a problem, but this optimization does not appear to require the commutativity. > CEP-14: Paxos Improvements > -- > > Key: CASSANDRA-17164 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17164 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Consistency/Repair >Reporter: Benedict Elliott Smith >Assignee: Benedict Elliott Smith >Priority: Normal > Fix For: 4.1 > > > This ticket encompasses work for [CEP-14| > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements]. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17482995#comment-17482995 ] Benedict Elliott Smith edited comment on CASSANDRA-17164 at 1/27/22, 9:24 AM: -- I think I am leaning towards not implementing these suggestions, for the following reasons 1) To persist to the {{promised}} (actually {{promisedWrite}}) register safely we would need to modify the timestamp behaviour, i.e. writing promises with a lower timestamp than accepts or commits, which harms seamless downgrades and backwards compatibility. Not doing so safely obviates any benefit, as we still need to perform this comparison in memory to resolve the equal timestamp case. 2) Separate {{accept}} and {{commit}} registers are used for essentially the same reason - I agree it would be simpler to maintain just one register here for both the mutation and the ballot, but the current representation is 100% downgradeable, and any modification to persistence would not be. Do you agree? We _could_ front-load these changes to the loading from disk phase, and modify the {{Snapshot}}, however, if that works for you? This could also modify the network representation. I think these changes would be fine, although I think they result in only a modest clarity improvement. was (Author: benedict): 1) To persist to the {{promised}} (actually {{promisedWrite}}) register safely we would need to modify the timestamp behaviour, i.e. writing promises with a lower timestamp than accepts or commits, which harms seamless downgrades and backwards compatibility. Not doing so safely obviates any benefit, as we still need to perform this comparison in memory to resolve the equal timestamp case. 2) Separate {{accept}} and {{commit}} registers are used for essentially the same reason - I agree it would be simpler to maintain just one register here for both the mutation and the ballot, but the current representation is 100% downgradeable, and any modification to persistence would not be. We _could_ front-load these changes to the loading from disk phase, and modify the {{Snapshot}}, however, if that works for you? This could also modify the network representation. I think these changes would be fine, although I think they result in only a modest clarity improvement. > CEP-14: Paxos Improvements > -- > > Key: CASSANDRA-17164 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17164 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Consistency/Repair >Reporter: Benedict Elliott Smith >Assignee: Benedict Elliott Smith >Priority: Normal > Fix For: 4.1 > > > This ticket encompasses work for [CEP-14| > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements]. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17482510#comment-17482510 ] Benedict Elliott Smith edited comment on CASSANDRA-17164 at 1/26/22, 2:20 PM: -- I think this would be fine, I need to think through why I did not do this originally, as I did consider it. It may simply have been to minimise the change in semantics to {{system.paxos}}, without there having been any particularly good reason. Perhaps also for debugging purposes, it helps to be able to see whether something had been promised before it was accepted/committed. was (Author: benedict): I think this would be fine, I need to think through why I did not do this originally, as I did consider it. It may simply have been to minimise the change in semantics to {{system.paxos}}, without there having been any particularly good reason. Perhaps also for debugging purposes, it helps to be able to see whether something had been promised before it was accepted/committed. I think there is perhaps also some additional complexity when a replica has promised a ballot with a timestamp, and then proceeds to accept a different one with the same timestamp. I don't think we would resolve these reliably, so we might not actually overwrite it anyway, and I don't think there is a trivial solution to this problem in the storage layer. I'm not certain if this would actually cause problems, but it would require some careful consideration to rule it out. > CEP-14: Paxos Improvements > -- > > Key: CASSANDRA-17164 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17164 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Consistency/Repair >Reporter: Benedict Elliott Smith >Assignee: Benedict Elliott Smith >Priority: Normal > Fix For: 4.1 > > > This ticket encompasses work for [CEP-14| > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements]. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17482510#comment-17482510 ] Benedict Elliott Smith edited comment on CASSANDRA-17164 at 1/26/22, 2:19 PM: -- I think this would be fine, I need to think through why I did not do this originally, as I did consider it. It may simply have been to minimise the change in semantics to {{system.paxos}}, without there having been any particularly good reason. Perhaps also for debugging purposes, it helps to be able to see whether something had been promised before it was accepted/committed. I think there is perhaps also some additional complexity when a replica has promised a ballot with a timestamp, and then proceeds to accept a different one with the same timestamp. I don't think we would resolve these reliably, so we might not actually overwrite it anyway, and I don't think there is a trivial solution to this problem in the storage layer. I'm not certain if this would actually cause problems, but it would require some careful consideration to rule it out. was (Author: benedict): I think this would be fine, I need to think through why I did not do this originally, as I did consider it. It may simply have been to minimise the change in semantics to {{system.paxos}}, without there having been any particularly good reason. Perhaps also for debugging purposes, it helps to be able to see whether something had been promised before it was accepted/committed. I think there is perhaps also some additional complexity when a replica has promised a ballot with the same timestamp, and then proceed to accept it. I don't think we would resolve these reliably, so we might not actually overwrite it anyway, and I don't think there is a trivial solution to this problem in the storage layer. I'm not certain if this would actually cause problems, but it would require some careful consideration to rule it out. > CEP-14: Paxos Improvements > -- > > Key: CASSANDRA-17164 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17164 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Consistency/Repair >Reporter: Benedict Elliott Smith >Assignee: Benedict Elliott Smith >Priority: Normal > Fix For: 4.1 > > > This ticket encompasses work for [CEP-14| > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements]. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17482510#comment-17482510 ] Benedict Elliott Smith edited comment on CASSANDRA-17164 at 1/26/22, 2:18 PM: -- I think this would be fine, I need to think through why I did not do this originally, as I did consider it. It may simply have been to minimise the change in semantics to {{system.paxos}}, without there having been any particularly good reason. Perhaps also for debugging purposes, it helps to be able to see whether something had been promised before it was accepted/committed. I think there is perhaps also some additional complexity when a replica has promised a ballot with the same timestamp, and then proceed to accept it. I don't think we would resolve these reliably, so we might not actually overwrite it anyway, and I don't think there is a trivial solution to this problem in the storage layer. I'm not certain if this would actually cause problems, but it would require some careful consideration to rule it out. was (Author: benedict): I think this would be fine, I need to think through why I did not do this originally, as I did consider it. It may simply have been to minimise the change in semantics to {{system.paxos}}, without there having been any particularly good reason. Perhaps also for debugging purposes, it helps to be able to see whether something had been promised before it was accepted/committed. I think there is perhaps also some additional complexity when a replica has promised a ballot with the same timestamp, and then proceed to accept it. I don't think we would resolve these reliably, so we might not actually overwrite it anyway, and I don't think there is a trivial solution to this problem in the storage layer. > CEP-14: Paxos Improvements > -- > > Key: CASSANDRA-17164 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17164 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Consistency/Repair >Reporter: Benedict Elliott Smith >Assignee: Benedict Elliott Smith >Priority: Normal > Fix For: 4.1 > > > This ticket encompasses work for [CEP-14| > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements]. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17481981#comment-17481981 ] Benedict Elliott Smith edited comment on CASSANDRA-17164 at 1/25/22, 5:21 PM: -- But this is only an optimisation, I think that would be equivalent to just removing the condition and ternary operator entirely, and simply returning {{latest(promised, latestWriteOrLowBound)}}? i.e. on entry {{latestWriteOrLowBound >= promisedWrite}} so this is just a single instruction short-circuit permitting us to avoid evaluating {{Ballot#compareTo}} was (Author: benedict): But this is only an optimisation, I think that would be equivalent to just removing the condition and ternary operator entirely, and saying {{latest(promised, latestWriteOrLowBound)}}? i.e. {{latestWriteOrLowBound}} already is greater than or equal to {{promisedWrite}} so this is just a single instruction short-circuit permitting us to avoid evaluating {{Ballot#compareTo}} > CEP-14: Paxos Improvements > -- > > Key: CASSANDRA-17164 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17164 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Consistency/Repair >Reporter: Benedict Elliott Smith >Assignee: Benedict Elliott Smith >Priority: Normal > Fix For: 4.1 > > > This ticket encompasses work for [CEP-14| > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements]. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17481966#comment-17481966 ] Benedict Elliott Smith edited comment on CASSANDRA-17164 at 1/25/22, 5:06 PM: -- bq. I don’t think there’s a scenario where that’s not a problem, and we’re checking, so we may as well log a warning. I think it's likely that log messages for out of range operations will be configurable in the future, but you're right that there's no harm in logging warnings in the meantime. bq. Sorry, I meant for the latestWitnessedOrLowBound method. Edit: although now I see this is enforced in the constructor, so nevermind. Although a comment in latestWitnessedOrLowBound explaining that would be helpful Ah, that explains it. Certainly, I'll update with a comment justifying it. was (Author: benedict): bq. I don’t think there’s a scenario where that’s not a problem, and we’re checking, so we may as well log a warning. I think it's likely that log messages for out of range operations will be configurable in the future, but you're right that there's no harm in logging warnings in the meantime. > CEP-14: Paxos Improvements > -- > > Key: CASSANDRA-17164 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17164 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Consistency/Repair >Reporter: Benedict Elliott Smith >Assignee: Benedict Elliott Smith >Priority: Normal > Fix For: 4.1 > > > This ticket encompasses work for [CEP-14| > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements]. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17481962#comment-17481962 ] Blake Eggleston edited comment on CASSANDRA-17164 at 1/25/22, 5:01 PM: --- {quote}should we log out of range requests in isInRangeAndShouldProcess? I think this is something that still hasn't taken full shape in OSS, so I'm not sure what the right course of action is. Happy to do whatever you feel is appropriate until a broader strategy is sorted out. {quote} I don’t think there’s a scenario where that’s not a problem, and we’re checking, so we may as well log a warning. {quote}I’d prefer we use .equals instead of checking pointer equality here. .equals should check for pointer equality, but will also guard against unintended behavior For the assert? That would be fine, but it makes enforcing the identity constraint harder with automated testing. In the other case, it's a pure optimisation so better to keep IMO (else the benefit is likely lost) {quote} Sorry, I meant for the {{latestWitnessedOrLowBound}} method. Edit: although now I see this is enforced in the constructor, so nevermind. Although a comment in {{latestWitnessedOrLowBound}} explaining that would be helpful was (Author: bdeggleston): {quote} should we log out of range requests in isInRangeAndShouldProcess? I think this is something that still hasn't taken full shape in OSS, so I'm not sure what the right course of action is. Happy to do whatever you feel is appropriate until a broader strategy is sorted out. {quote} I don’t think there’s a scenario where that’s not a problem, and we’re checking, so we may as well log a warning. {quote} I’d prefer we use .equals instead of checking pointer equality here. .equals should check for pointer equality, but will also guard against unintended behavior For the assert? That would be fine, but it makes enforcing the identity constraint harder with automated testing. In the other case, it's a pure optimisation so better to keep IMO (else the benefit is likely lost) {quote} Sorry, I meant for the {{latestWitnessedOrLowBound}} method > CEP-14: Paxos Improvements > -- > > Key: CASSANDRA-17164 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17164 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Consistency/Repair >Reporter: Benedict Elliott Smith >Assignee: Benedict Elliott Smith >Priority: Normal > Fix For: 4.1 > > > This ticket encompasses work for [CEP-14| > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements]. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17480311#comment-17480311 ] Benedict Elliott Smith edited comment on CASSANDRA-17164 at 1/22/22, 1:11 AM: -- {quote}Why did you remove the logged warning about non linearizable reads? {quote} -Hopefully unintentionally...- Skimming your edits, it looks like this might have been down to it having been duplicated, as it is also logged at StorageProxy startup? {quote} Why did you change type.decompose to type.decomposeUntyped in makeInternalOptionsWithNowInSec? {quote} I think because dtest-api uses UUID and so a full migration to TimeUUID in all places was kind of tricky, particularly for e.g. upgrade tests? It's hazy after so long. {quote}especially since it’s not checking for the partition key or static row {quote} I'm sure this is an oversight that's just not important for the use case - which unfortunately I forget, so I'll just fix up this oversight. {quote}should we log out of range requests in isInRangeAndShouldProcess? {quote} I think this is something that still hasn't taken full shape in OSS, so I'm not sure what the right course of action is. Happy to do whatever you feel is appropriate until a broader strategy is sorted out. {quote}I’d prefer we use .equals instead of checking pointer equality here. .equals should check for pointer equality, but will also guard against unintended behavior {quote} For the assert? That would be fine, but it makes enforcing the identity constraint harder with automated testing. In the other case, it's a pure optimisation so better to keep IMO (else the benefit is likely lost) {quote}WDYT about returning an empty collection if we don’t know about the table? {quote} I'll have a ponder over the weekend. {quote}WDYT about renaming rawTimestamp to uuidTimestamp? Raw is ambiguous {quote} WFM {quote}These changes seem like they may have been for debugging purposes, are they something that should be committed? {quote} Oops, yep. {quote}A description and explanation of the upgrade procedure should also be included, as well as an entry in NEWS.txt {quote} Yep, agreed. Thanks! was (Author: benedict): {quote}Why did you remove the logged warning about non linearizable reads? {quote} Hopefully unintentionally... {quote} Why did you change type.decompose to type.decomposeUntyped in makeInternalOptionsWithNowInSec? {quote} I think because dtest-api uses UUID and so a full migration to TimeUUID in all places was kind of tricky, particularly for e.g. upgrade tests? It's hazy after so long. {quote}especially since it’s not checking for the partition key or static row {quote} I'm sure this is an oversight that's just not important for the use case - which unfortunately I forget, so I'll just fix up this oversight. {quote}should we log out of range requests in isInRangeAndShouldProcess? {quote} I think this is something that still hasn't taken full shape in OSS, so I'm not sure what the right course of action is. Happy to do whatever you feel is appropriate until a broader strategy is sorted out. {quote}I’d prefer we use .equals instead of checking pointer equality here. .equals should check for pointer equality, but will also guard against unintended behavior {quote} For the assert? That would be fine, but it makes enforcing the identity constraint harder with automated testing. In the other case, it's a pure optimisation so better to keep IMO (else the benefit is likely lost) {quote}WDYT about returning an empty collection if we don’t know about the table? {quote} I'll have a ponder over the weekend. {quote}WDYT about renaming rawTimestamp to uuidTimestamp? Raw is ambiguous {quote} WFM {quote}These changes seem like they may have been for debugging purposes, are they something that should be committed? {quote} Oops, yep. {quote}A description and explanation of the upgrade procedure should also be included, as well as an entry in NEWS.txt {quote} Yep, agreed. Thanks! > CEP-14: Paxos Improvements > -- > > Key: CASSANDRA-17164 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17164 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Consistency/Repair >Reporter: Benedict Elliott Smith >Assignee: Benedict Elliott Smith >Priority: Normal > Fix For: 4.1 > > > This ticket encompasses work for [CEP-14| > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements]. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17480311#comment-17480311 ] Benedict Elliott Smith edited comment on CASSANDRA-17164 at 1/22/22, 1:08 AM: -- {quote}Why did you remove the logged warning about non linearizable reads? {quote} Hopefully unintentionally... {quote} Why did you change type.decompose to type.decomposeUntyped in makeInternalOptionsWithNowInSec? {quote} I think because dtest-api uses UUID and so a full migration to TimeUUID in all places was kind of tricky, particularly for e.g. upgrade tests? It's hazy after so long. {quote}especially since it’s not checking for the partition key or static row {quote} I'm sure this is an oversight that's just not important for the use case - which unfortunately I forget, so I'll just fix up this oversight. {quote}should we log out of range requests in isInRangeAndShouldProcess? {quote} I think this is something that still hasn't taken full shape in OSS, so I'm not sure what the right course of action is. Happy to do whatever you feel is appropriate until a broader strategy is sorted out. {quote}I’d prefer we use .equals instead of checking pointer equality here. .equals should check for pointer equality, but will also guard against unintended behavior {quote} For the assert? That would be fine, but it makes enforcing the identity constraint harder with automated testing. In the other case, it's a pure optimisation so better to keep IMO (else the benefit is likely lost) {quote}WDYT about returning an empty collection if we don’t know about the table? {quote} I'll have a ponder over the weekend. {quote}WDYT about renaming rawTimestamp to uuidTimestamp? Raw is ambiguous {quote} WFM {quote}These changes seem like they may have been for debugging purposes, are they something that should be committed? {quote} Oops, yep. {quote}A description and explanation of the upgrade procedure should also be included, as well as an entry in NEWS.txt {quote} Yep, agreed. Thanks! was (Author: benedict): {quote}Why did you remove the logged warning about non linearizable reads?{quote} Hopefully unintentionally... {quote} Why did you change type.decompose to type.decomposeUntyped in makeInternalOptionsWithNowInSec? {quote} I think because dtest-api uses UUID and so a full migration to TimeUUID in all places was kind of tricky, particularly for e.g. upgrade tests? It's hazy after so long. {quote}especially since it’s not checking for the partition key or static row{quote} I'm sure this is an oversight that's just not important for the use case - which unfortunately I forget, so I'll just fix up this oversight. {quote}should we log out of range requests in isInRangeAndShouldProcess?{quote} I think this is something that still hasn't taken full shape in OSS, so I'm not sure what the right course of action is. Happy to do whatever you feel is appropriate until a broader strategy is sorted out. {quote}I’d prefer we use .equals instead of checking pointer equality here. .equals should check for pointer equality, but will also guard against unintended behavior{quote} For the assert? That would be fine, but it makes enforcing the identity constraint harder with automated testing. In the other case, it's a pure optimisation so better to keep IMO (else the benefit is likely lost) {quote}WDYT about returning an empty collection if we don’t know about the table?{quote} I'll have a ponder over the weekend. {quote}WDYT about renaming rawTimestamp to uuidTimestamp? Raw is ambiguous{quote} WFM {quote}These changes seem like they may have been for debugging purposes, are they something that should be committed?{quote} Oops, yep. Thanks! > CEP-14: Paxos Improvements > -- > > Key: CASSANDRA-17164 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17164 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Consistency/Repair >Reporter: Benedict Elliott Smith >Assignee: Benedict Elliott Smith >Priority: Normal > Fix For: 4.1 > > > This ticket encompasses work for [CEP-14| > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements]. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17479183#comment-17479183 ] Benedict Elliott Smith edited comment on CASSANDRA-17164 at 1/20/22, 9:58 AM: -- {quote}Specifically, what stops the following from happening?{quote} In this example, I believe (3) would not yield {{FOUND_INCOMPLETE_COMMIT}} but {{FOUND_INCOMPLETE_ACCEPTED}} and so would correctly re-propose (2), expending its ballot? {quote}The fact that we tie accepting a promise to a majority with matching "most recent commit" means that we effectively treat it as the identifier of the current session{quote} It is used as an identifier for a commit that was accepted by a majority using that ballot, only to ensure it is durable (made it to at least a quorum) before we propose a new value. We don't tie accepting a promise to matching the "most recent commit" but we cannot safely propose anything without ensuring the prior commit is durable, which is why we may simply "refresh" and continue using the promises we sought... I feel like there may be some minor disconnects here in our language, though, which might be complicating the discussion a little. {quote}It seems the next prepare will return FOUND_INCOMPLETE_ACCEPTED and trigger reproposal and commit{quote} {{hasInProgressProposal}} first checks if {{latestAccepted.update.isEmpty()}} and if so, returns false, i.e. this would not be considered an incomplete proposal. was (Author: benedict): {quote}Specifically, what stops the following from happening?{quote} In this example, I believe (3) would not yield {{FOUND_INCOMPLETE_COMMIT}} but {{FOUND_INCOMPLETE_ACCEPTED}} and so would correctly re-propose (2), expending its ballot? {quote}It seems the next prepare will return FOUND_INCOMPLETE_ACCEPTED and trigger reproposal and commit{quote} {{hasInProgressProposal}} first checks if {{latestAccepted.update.isEmpty()}} and if so, returns false, i.e. this would not be considered an incomplete proposal. > CEP-14: Paxos Improvements > -- > > Key: CASSANDRA-17164 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17164 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Consistency/Repair >Reporter: Benedict Elliott Smith >Assignee: Benedict Elliott Smith >Priority: Normal > Fix For: 4.1 > > > This ticket encompasses work for [CEP-14| > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements]. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17479183#comment-17479183 ] Benedict Elliott Smith edited comment on CASSANDRA-17164 at 1/20/22, 9:46 AM: -- {quote}Specifically, what stops the following from happening?{quote} In this example, I believe (3) would not yield {{FOUND_INCOMPLETE_COMMIT}} but {{FOUND_INCOMPLETE_ACCEPTED}} and so would correctly re-propose (2), expending its ballot? {quote}It seems the next prepare will return FOUND_INCOMPLETE_ACCEPTED and trigger reproposal and commit{quote} {{hasInProgressProposal}} first checks if {{latestAccepted.update.isEmpty()}} and if so, returns false, i.e. this would not be considered an incomplete proposal. was (Author: benedict): {quote}Specifically, what stops the following from happening?{quote} In this example, I believe (3) would not yield {{FOUND_INCOMPLETE_COMMIT}} but {{FOUND_INCOMPLETE_ACCEPTED}} and so would correctly re-propose (2), expending its ballot? {quote}It seems the next prepare will return FOUND_INCOMPLETE_ACCEPTED and trigger reproposal and commit{quote} {{hasInProgressProposal}} first checks if {{latestAccepted.update.isEmpty()}} and if so, returns false, i.e. this would not be considered an incomplete proposal. The fast read optimisation does however leave an incomplete _promise_ that we might want to change to asynchronously send an empty proposal. > CEP-14: Paxos Improvements > -- > > Key: CASSANDRA-17164 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17164 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination, Consistency/Repair >Reporter: Benedict Elliott Smith >Assignee: Benedict Elliott Smith >Priority: Normal > Fix For: 4.1 > > > This ticket encompasses work for [CEP-14| > https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-14%3A+Paxos+Improvements]. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17164) CEP-14: Paxos Improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17479167#comment-17479167 ] Branimir Lambov edited comment on CASSANDRA-17164 at 1/20/22, 8:45 AM: --- {quote}I think this is actually the typical way of implementing Classic Paxos, even though Lamport's paper seems to suggest you must only contact the nodes that responded to the prepare (there may be something else specific about his formulation that necessitates this, I forget, as I dislike his writings on the topic). This is corroborated by Heidi Howard's dissertation, which was the easiest place I could find a straight-forward formulation of Classic Paxos besides that of Lamport. See Algorithm 3 on Page 30. {quote} This is an excellent pointer, thank you. Lamport's proof also works with separate proposal quorum – my only request here is that this should be stated somewhere as other contributors might start with Lamport's original definition too. {quote}I don't quite follow what you mean by this, as this is not limited to "most recent commit", but a ballot directly maps to the instance id of classic paxos, it just avoids pre-splitting the range of integers. {quote} Well, I don't follow this one. A ballot id is something one replica selects and sends it as a prepare message to everyone. At this point some nodes may have committed a proposal, others may have accepted it, and yet others may have never heard of it. From the point of view of a sequence of paxos instances, the first two will definitely make promises for different instances, and likely the third one will be promising for yet another one. The fact that we tie accepting a promise to a majority with matching "most recent commit" means that we effectively treat it as the identifier of the current session. More interestingly, we can then send a proposal to nodes with older most recent commit (i.e. nodes that are effectively working on a previous paxos instance), that proposal will be accepted (overwriting the decided value but not advancing the most recent commit), and this might lead to a decision using just a small number of participants with the right "most recent commit". Specifically, what stops the following from happening? {code:java} A mrcA acceptedA promised B mrc B accepted B promised C mrcC accepted C promised prepare(1) -> ABC- -1- - 1 - -1 propose(1, X) -> ABC - 1,X1- 1,X 1 - 1,X 1 commit(1, X) -> AB 1,X -1 1,X - 1 - -1 prepare(2) -> AB1,X -2 1,X - 2 - 1,X 1 majority AB, no refresh propose(2, Y) -> BC 1,X -2 1,X 2,Y 2 - 2,Y 2 returns successful Y prepare(3) -> AC1,X1,X3 1,X 2,Y 2 1,X -3 refresh stale, then forms majority AC propose(3, Z) -> ABC1,X3,Z3 1,X 3,Z 3 1,X3,Z 3 returns successful Z {code} If this is permitted, there's a further error possibility if B is partitioned after the 2,Y proposal succeeds and a commit is executed in B. Then the committed value can be read from B, and later wiped with a conflicting decision. In LWT terms, if X: {{{}column = X if not exists{}}}, Y: {{column = Y if column == X}} and Z: {{{}column = Z if column == X{}}}, we may be able to read (serially) both Y and Z. {quote}Could you explain what you are referring to here? I think this is all standard stuff for Paxos, we're again just recording the most recently used instance number for each register. {quote} Judging from the above, this may be insufficient. I'm likely missing something here – that something needs to be documented. {quote}The final commit phase is only required to ensure any "decree" (decision) is disseminated. If we have proposed that no decree be made, there is nothing to disseminate, and nothing to complete if another transaction encounters it. This is in some ways an artefact of the feature of Cassandra's implementation, that we initiate a paxos round without knowing if it will do anything, though this feature would I suppose be present for read-only operations anyway. {quote} This is also hard to read from the code: {{Paxos.cas}} does not