[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up
[ https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16617639#comment-16617639 ] Ariel Weisberg commented on CASSANDRA-14705: This got merged as [047bcd7ad171d6a4aa89128c5e6c6ed5f012b1c0|https://github.com/apache/cassandra/commit/047bcd7ad171d6a4aa89128c5e6c6ed5f012b1c0]? Is this issue ready to resolve? > ReplicaLayout follow-up > --- > > Key: CASSANDRA-14705 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14705 > Project: Cassandra > Issue Type: Improvement >Reporter: Benedict >Assignee: Benedict >Priority: Major > > Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for > what we want to do) and {{ReplicaLayout}} (for what we know about the > cluster), with well defined semantics (and comments in the rare cases those > semantics are weird) > Found and fixed some bugs: > * {{commitPaxos}} was using only live nodes, when needed to include down > * We were not writing to pending transient replicas > * On write, we were not hinting to full nodes with transient replication > enabled (since we filtered to {{liveOnly}}, in order to include our transient > replicas above {{blockFor}}) > * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we > would only consult the same node we had speculated too. This also applied to > {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR. > * Transient->Full movements mishandled consistency level upgrade > ** While we need to treat a transitioning node as ‘full’ for writes, so that > it can safely begin serving full data requests once it has finished, we > cannot maintain it in the ‘pending’ collection else we will also increase our > consistency requirements by a node that doesn’t exist. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up
[ https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613475#comment-16613475 ] Ariel Weisberg commented on CASSANDRA-14705: +1 unless you want to do more changes as part of this. > ReplicaLayout follow-up > --- > > Key: CASSANDRA-14705 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14705 > Project: Cassandra > Issue Type: Improvement >Reporter: Benedict >Assignee: Benedict >Priority: Major > > Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for > what we want to do) and {{ReplicaLayout}} (for what we know about the > cluster), with well defined semantics (and comments in the rare cases those > semantics are weird) > Found and fixed some bugs: > * {{commitPaxos}} was using only live nodes, when needed to include down > * We were not writing to pending transient replicas > * On write, we were not hinting to full nodes with transient replication > enabled (since we filtered to {{liveOnly}}, in order to include our transient > replicas above {{blockFor}}) > * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we > would only consult the same node we had speculated too. This also applied to > {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR. > * Transient->Full movements mishandled consistency level upgrade > ** While we need to treat a transitioning node as ‘full’ for writes, so that > it can safely begin serving full data requests once it has finished, we > cannot maintain it in the ‘pending’ collection else we will also increase our > consistency requirements by a node that doesn’t exist. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up
[ https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613459#comment-16613459 ] Benedict commented on CASSANDRA-14705: -- So, I've pushed a squashed and rebased version [here|https://github.com/belliottsmith/cassandra/tree/14705-rebase], however I'm considering on last modification - Alex's most recent patches mean we can perhaps get rid of the ReplicaPlan.Shared concept. I may try, and see how it turns out. Previously we needed to construct the readRepair upfront, but now we could construct it on demand. It might be that it is a useful abstraction anyway, since we at least make it clear that these are modified. But it would actually localise more the scope of who can see what modifications to a single class. > ReplicaLayout follow-up > --- > > Key: CASSANDRA-14705 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14705 > Project: Cassandra > Issue Type: Improvement >Reporter: Benedict >Assignee: Benedict >Priority: Major > > Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for > what we want to do) and {{ReplicaLayout}} (for what we know about the > cluster), with well defined semantics (and comments in the rare cases those > semantics are weird) > Found and fixed some bugs: > * {{commitPaxos}} was using only live nodes, when needed to include down > * We were not writing to pending transient replicas > * On write, we were not hinting to full nodes with transient replication > enabled (since we filtered to {{liveOnly}}, in order to include our transient > replicas above {{blockFor}}) > * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we > would only consult the same node we had speculated too. This also applied to > {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR. > * Transient->Full movements mishandled consistency level upgrade > ** While we need to treat a transitioning node as ‘full’ for writes, so that > it can safely begin serving full data requests once it has finished, we > cannot maintain it in the ‘pending’ collection else we will also increase our > consistency requirements by a node that doesn’t exist. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up
[ https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613130#comment-16613130 ] Benedict commented on CASSANDRA-14705: -- Ah, thanks. I must have gone amiss somehow with the summary GitHub UI. It does look like we can just use the provided replica plan safely, so I've removed the CL parameters. > ReplicaLayout follow-up > --- > > Key: CASSANDRA-14705 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14705 > Project: Cassandra > Issue Type: Improvement >Reporter: Benedict >Assignee: Benedict >Priority: Major > > Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for > what we want to do) and {{ReplicaLayout}} (for what we know about the > cluster), with well defined semantics (and comments in the rare cases those > semantics are weird) > Found and fixed some bugs: > * {{commitPaxos}} was using only live nodes, when needed to include down > * We were not writing to pending transient replicas > * On write, we were not hinting to full nodes with transient replication > enabled (since we filtered to {{liveOnly}}, in order to include our transient > replicas above {{blockFor}}) > * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we > would only consult the same node we had speculated too. This also applied to > {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR. > * Transient->Full movements mishandled consistency level upgrade > ** While we need to treat a transitioning node as ‘full’ for writes, so that > it can safely begin serving full data requests once it has finished, we > cannot maintain it in the ‘pending’ collection else we will also increase our > consistency requirements by a node that doesn’t exist. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up
[ https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612773#comment-16612773 ] Ariel Weisberg commented on CASSANDRA-14705: [This is what I was talking about|https://github.com/apache/cassandra/pull/265/files#diff-0246c72855070863c2fdbee6d97f494dR63]. Couldn't this refer to the CL from the parameter ReplicaPlan? Can you remove the TODO? > ReplicaLayout follow-up > --- > > Key: CASSANDRA-14705 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14705 > Project: Cassandra > Issue Type: Improvement >Reporter: Benedict >Assignee: Benedict >Priority: Major > > Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for > what we want to do) and {{ReplicaLayout}} (for what we know about the > cluster), with well defined semantics (and comments in the rare cases those > semantics are weird) > Found and fixed some bugs: > * {{commitPaxos}} was using only live nodes, when needed to include down > * We were not writing to pending transient replicas > * On write, we were not hinting to full nodes with transient replication > enabled (since we filtered to {{liveOnly}}, in order to include our transient > replicas above {{blockFor}}) > * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we > would only consult the same node we had speculated too. This also applied to > {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR. > * Transient->Full movements mishandled consistency level upgrade > ** While we need to treat a transitioning node as ‘full’ for writes, so that > it can safely begin serving full data requests once it has finished, we > cannot maintain it in the ‘pending’ collection else we will also increase our > consistency requirements by a node that doesn’t exist. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up
[ https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612761#comment-16612761 ] Benedict commented on CASSANDRA-14705: -- So, my personal view on TODOs is that they can be much more useful left inline with their context, for when you or somebody else next comes to touch the surrounding code - whether in the intended JIRA, or another related one. They're also a bit specific for a JIRA, they're more guidance for what to consider when shaping your next solution. They're like documentation, only they document inadequacy, things for the next author to watch out for, and a suggestion that (if time permits) they resolve them at the same time. There are multiple JIRA filed already for these code paths, any of which might hopefully encompass refactoring / fixing these TODOs. {quote}If you could remove that one code comment, maybe clean it up to only refer to the one ReplicaPlan {quote} Which one? The one I saw was already missing from the code, hence the comment being out of date for the PR (and my missing it) on GitHub UI. > ReplicaLayout follow-up > --- > > Key: CASSANDRA-14705 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14705 > Project: Cassandra > Issue Type: Improvement >Reporter: Benedict >Assignee: Benedict >Priority: Major > > Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for > what we want to do) and {{ReplicaLayout}} (for what we know about the > cluster), with well defined semantics (and comments in the rare cases those > semantics are weird) > Found and fixed some bugs: > * {{commitPaxos}} was using only live nodes, when needed to include down > * We were not writing to pending transient replicas > * On write, we were not hinting to full nodes with transient replication > enabled (since we filtered to {{liveOnly}}, in order to include our transient > replicas above {{blockFor}}) > * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we > would only consult the same node we had speculated too. This also applied to > {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR. > * Transient->Full movements mishandled consistency level upgrade > ** While we need to treat a transitioning node as ‘full’ for writes, so that > it can safely begin serving full data requests once it has finished, we > cannot maintain it in the ‘pending’ collection else we will also increase our > consistency requirements by a node that doesn’t exist. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up
[ https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612720#comment-16612720 ] Ariel Weisberg commented on CASSANDRA-14705: If you could remove that one code comment, maybe clean it up to only refer to the one ReplicaPlan and do something about the other TODOs that need to either be done, removed, or JIRAed. That's a good policy for TODOs right? +1 after that. > ReplicaLayout follow-up > --- > > Key: CASSANDRA-14705 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14705 > Project: Cassandra > Issue Type: Improvement >Reporter: Benedict >Assignee: Benedict >Priority: Major > > Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for > what we want to do) and {{ReplicaLayout}} (for what we know about the > cluster), with well defined semantics (and comments in the rare cases those > semantics are weird) > Found and fixed some bugs: > * {{commitPaxos}} was using only live nodes, when needed to include down > * We were not writing to pending transient replicas > * On write, we were not hinting to full nodes with transient replication > enabled (since we filtered to {{liveOnly}}, in order to include our transient > replicas above {{blockFor}}) > * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we > would only consult the same node we had speculated too. This also applied to > {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR. > * Transient->Full movements mishandled consistency level upgrade > ** While we need to treat a transitioning node as ‘full’ for writes, so that > it can safely begin serving full data requests once it has finished, we > cannot maintain it in the ‘pending’ collection else we will also increase our > consistency requirements by a node that doesn’t exist. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up
[ https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612687#comment-16612687 ] Benedict commented on CASSANDRA-14705: -- Not loving the GitHub comments experience at the moment. I've responded to the comments I could find. > ReplicaLayout follow-up > --- > > Key: CASSANDRA-14705 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14705 > Project: Cassandra > Issue Type: Improvement >Reporter: Benedict >Assignee: Benedict >Priority: Major > > Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for > what we want to do) and {{ReplicaLayout}} (for what we know about the > cluster), with well defined semantics (and comments in the rare cases those > semantics are weird) > Found and fixed some bugs: > * {{commitPaxos}} was using only live nodes, when needed to include down > * We were not writing to pending transient replicas > * On write, we were not hinting to full nodes with transient replication > enabled (since we filtered to {{liveOnly}}, in order to include our transient > replicas above {{blockFor}}) > * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we > would only consult the same node we had speculated too. This also applied to > {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR. > * Transient->Full movements mishandled consistency level upgrade > ** While we need to treat a transitioning node as ‘full’ for writes, so that > it can safely begin serving full data requests once it has finished, we > cannot maintain it in the ‘pending’ collection else we will also increase our > consistency requirements by a node that doesn’t exist. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up
[ https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612656#comment-16612656 ] Ariel Weisberg commented on CASSANDRA-14705: These two are outdated so you will need to click view outdated, but there is https://github.com/apache/cassandra/pull/265#pullrequestreview-153925151 and https://github.com/apache/cassandra/pull/265#pullrequestreview-153914279 which both seem to not be responded to. The first is the one that really matters because I think not understanding why there are multiple plans is an issue. > ReplicaLayout follow-up > --- > > Key: CASSANDRA-14705 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14705 > Project: Cassandra > Issue Type: Improvement >Reporter: Benedict >Assignee: Benedict >Priority: Major > > Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for > what we want to do) and {{ReplicaLayout}} (for what we know about the > cluster), with well defined semantics (and comments in the rare cases those > semantics are weird) > Found and fixed some bugs: > * {{commitPaxos}} was using only live nodes, when needed to include down > * We were not writing to pending transient replicas > * On write, we were not hinting to full nodes with transient replication > enabled (since we filtered to {{liveOnly}}, in order to include our transient > replicas above {{blockFor}}) > * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we > would only consult the same node we had speculated too. This also applied to > {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR. > * Transient->Full movements mishandled consistency level upgrade > ** While we need to treat a transitioning node as ‘full’ for writes, so that > it can safely begin serving full data requests once it has finished, we > cannot maintain it in the ‘pending’ collection else we will also increase our > consistency requirements by a node that doesn’t exist. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up
[ https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612298#comment-16612298 ] Alex Petrov commented on CASSANDRA-14705: - Thank you for the changes & discussions. +1, LGTM. > ReplicaLayout follow-up > --- > > Key: CASSANDRA-14705 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14705 > Project: Cassandra > Issue Type: Improvement >Reporter: Benedict >Assignee: Benedict >Priority: Major > > Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for > what we want to do) and {{ReplicaLayout}} (for what we know about the > cluster), with well defined semantics (and comments in the rare cases those > semantics are weird) > Found and fixed some bugs: > * {{commitPaxos}} was using only live nodes, when needed to include down > * We were not writing to pending transient replicas > * On write, we were not hinting to full nodes with transient replication > enabled (since we filtered to {{liveOnly}}, in order to include our transient > replicas above {{blockFor}}) > * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we > would only consult the same node we had speculated too. This also applied to > {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR. > * Transient->Full movements mishandled consistency level upgrade > ** While we need to treat a transitioning node as ‘full’ for writes, so that > it can safely begin serving full data requests once it has finished, we > cannot maintain it in the ‘pending’ collection else we will also increase our > consistency requirements by a node that doesn’t exist. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up
[ https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612251#comment-16612251 ] Benedict commented on CASSANDRA-14705: -- {quote}Maybe it's actually good to fail queries where we have collected half the replicas from one token metadata and the other one from other? Not sure. {quote} Yeah, not sure either, but since we're already exposed (generally) to range ownership races, it's _probably_ better to hold off until we fix those across the board IMO, to least surprise our users. > ReplicaLayout follow-up > --- > > Key: CASSANDRA-14705 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14705 > Project: Cassandra > Issue Type: Improvement >Reporter: Benedict >Assignee: Benedict >Priority: Major > > Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for > what we want to do) and {{ReplicaLayout}} (for what we know about the > cluster), with well defined semantics (and comments in the rare cases those > semantics are weird) > Found and fixed some bugs: > * {{commitPaxos}} was using only live nodes, when needed to include down > * We were not writing to pending transient replicas > * On write, we were not hinting to full nodes with transient replication > enabled (since we filtered to {{liveOnly}}, in order to include our transient > replicas above {{blockFor}}) > * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we > would only consult the same node we had speculated too. This also applied to > {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR. > * Transient->Full movements mishandled consistency level upgrade > ** While we need to treat a transitioning node as ‘full’ for writes, so that > it can safely begin serving full data requests once it has finished, we > cannot maintain it in the ‘pending’ collection else we will also increase our > consistency requirements by a node that doesn’t exist. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up
[ https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611857#comment-16611857 ] Alex Petrov commented on CASSANDRA-14705: - bq. I'm not sure we can - this would be a race condition on the Keyspace. True, I haven't thought of it. We should probably try to make replication factor immutable for the time of query and attached to replica plan at some point in future. bq. Not quite sure what you mean here? That, for an AbstractBounds, we fully cover it? I mostly meant something like [this|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/locator/EndpointsForRange.java#L94-L96]. Maybe it's actually good to fail queries where we have collected half the replicas from one token metadata and the other one from other? Not sure. bq. I guess we could have a shared method that accepts a boolean indicating if we should throw, if you like? We could also leave it for later, really. It's rather minor and I'm not very worried about it right now, just thought it's worth persisting it. bq. ReplicaCount can now use the new count method Sorry, I was thinking about something different, let's leave it as is. bq. so a quick check to confirm sufficiency is, well, sufficient. Right, agreed. Maybe if we used a boolean for check and threw if it returned {{false}}, we could still consolidate those, but this is also not a huge win. > ReplicaLayout follow-up > --- > > Key: CASSANDRA-14705 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14705 > Project: Cassandra > Issue Type: Improvement >Reporter: Benedict >Assignee: Benedict >Priority: Major > > Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for > what we want to do) and {{ReplicaLayout}} (for what we know about the > cluster), with well defined semantics (and comments in the rare cases those > semantics are weird) > Found and fixed some bugs: > - {{commitPaxos}} was using only live nodes, when needed to include down > - We were not writing to pending transient replicas > - On write, we were not hinting to full nodes with transient > replication enabled (since we filtered to {{liveOnly}}, in order to include > our transient replicas above {{blockFor}}) > - If we speculated, in {{maybeSendAdditionalReads}} (in read repair) > we would only consult the same node we had speculated too. This also applied > to {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up
[ https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611774#comment-16611774 ] Benedict commented on CASSANDRA-14705: -- Thanks. I vacillated over a number of these naming decisions myself, so I'm happy to accept your suggestions. I'll respond to just a few points: {quote}we can avoid iterating in {{ReplicaPlan#forWrite}} checking for {{isTransient}} and short-circuit it for non-transient keyspace with {{CL#hasTransient}}. {quote} I'm not sure we can - this would be a race condition on the Keyspace. Very rare, admittedly, but worth avoiding. Perhaps we could cache the value of this inside of Endpoints, though, and introduce a {{hasTransient()}} method? Ideally, we will later cache the {{ReplicaLayout}} (or perhaps even {{ReplicaPlan}}) at which point this would short-circuit this work for all queries. {quote}in {{EndpointsFor(Range|Token)}} we have checks for range/token so they matched for concatenated ranges. Should we apply same checks when constructing layouts? {quote} Not quite sure what you mean here? That, for an AbstractBounds, we fully cover it? I did wonder about this, but there's actually potential race conditions in the code where we generate our ranges to query, and actually query them, where we might fail this check. But this race condition is inherent to the way we do range ownerships anyway, so not a lot of point in killing queries today, I reckon. This obviously doesn't apply to Token, since they are the unit by which range ownerships can move. {quote}I know it was same before the patch and we can do it in a separate one, but {{isSufficientReplicasForRead}} and {{assureSufficientReplicas}} share the logic which potentially makes it prone to modifications just in one place and not the other. Looks like we could combine the two. In same methods, naming is either liveReplicas or allLive for same argument. {quote} I agree, but at present it would be a bit painful to modify, because {{assureSufficientReplicas}} throws exceptions that utilise the computations. I guess we could have a shared method that accepts a boolean indicating if we should throw, if you like? My intention was that all of this would be cleaned up in a later Jira, where we might address the overall management of 'sufficiency' - as this is dotted all over resolvers, write handlers, CL, read-repair... {quote}{{ReplicaCount}} can now use the new {{count}} method {quote} I don't follow? Do you mean {{ReplicaCollection.count()}} - if so, this only returns an int, and we need two ints... {quote}{{ReplicaPlans#maybeMerge}} is currently checking for {{isSufficientReplicasForRead}} and then proceeds with filtering the nodes, unlike token reads, which filter and then check. Since this is the only place {{isSufficientReplicasForRead}}, this might be an opportunity for consolidation. {quote} These checks are a little different - we're testing to see if two sufficient plans can be merged into one, so a quick check to confirm sufficiency is, well, sufficient. We aren't then performing a later check to assert sufficiency, because if they aren't we just use the two separate plans. That said, we could make the structure of the code similar, by checking after construction - but I assume the point of this was originally to avoid generating more garbage than necessary. {quote}looks like you're half a step away from removing all the logic from {{ConsistencyLevel}}, which is great. Theoretically, all static methods from there could be moved to {{ReplicaPlan}} and things like "filterForQuery" could have stronger invariants, such as {{ForRead>}}, same is true for {{isSufficientReplicasForRead/write}}, especially is applicable for write since we have to pass {{Endpoints}} separately there even though they're technically in the same bucket of {{ForWrite}}. {quote} Agreed entirely, CL should ideally return to being mostly just an enum. Since most of its logic is for tracking sufficiency (either in the is/assert checks, or in resolvers, write handlers, etc.), when we address that we can move all of the logic out of CL. I also already have a follow-up bug fix that moves {{filterForQuery}} into {{ReplicaPlans}} {quote}we've discussed it offline and it's probably worth to do in a follow-up ticket, but collecting replicas and {{ReplicaLayouts}} in general might be worth to cover with tests. They're tested indirectly, however mostly through dtests. Maybe flushing out some things, especially related to pending nodes (which are generally hard to test), might be good candidates there. {quote} I will try to rustle up some unit tests for sure, though you may be right that a follow-up ticket is best, so we can get this committed ASAP. > ReplicaLayout follow-up > --- > > Key: CASSANDRA-14705 > URL:
[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up
[ https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611749#comment-16611749 ] Alex Petrov commented on CASSANDRA-14705: - Thank you for the patch. Some comments, all rather minor: * we could rename {{liveOnly}} to {{live}} unless "only" here bears * {{allUncontactedCandidates}} could be just {{uncontactedCandidates}} * {{ForTokenWrite}} can be just {{ForWrite}} since we've removed {{ForRangeWrite}}, similarly in {{forTokenWriteLiveAndDown}} and {{forTokenWrite}} * we can avoid iterating in {{ReplicaPlan#forWrite}} checking for {{isTransient}} and short-circuit it for non-transient keyspace with {{CL#hasTransient}}. * in {{EndpointsFor(Range|Token)}} we have checks for range/token so they matched for concatenated ranges. Should we apply same checks when constructing layouts? * I know it was same before the patch and we can do it in a separate one, but {{isSufficientReplicasForRead}} and {{assureSufficientReplicas}} share the logic which potentially makes it prone to modifications just in one place and not the other. Looks like we could combine the two. In same methods, naming is either liveReplicas or allLive for same argument. * {{assureSufficientReplicas}} might need word "live" in it * {{ReplicaCount}} can now use the new {{count}} method * {{ReplicaPlans#maybeMerge}} is currently checking for {{isSufficientReplicasForRead}} and then proceeds with filtering the nodes, unlike token reads, which filter and then check. Since this is the only place {{isSufficientReplicasForRead}}, this might be an opportunity for consolidation. And some that we've discussed offline for history: * looks like you're half a step away from removing all the logic from {{ConsistencyLevel}}, which is great. Theoretically, all static methods from there could be moved to {{ReplicaPlan}} and things like "filterForQuery" could have stronger invariants, such as {{ForRead>}}, same is true for {{isSufficientReplicasForRead/write}}, especially is applicable for write since we have to pass {{Endpoints}} separately there even though they're technically in the same bucket of {{ForWrite}}. * we've discussed it offline and it's probably worth to do in a follow-up ticket, but collecting replicas and {{ReplicaLayouts}} in general might be worth to cover with tests. They're tested indirectly, however mostly through dtests. Maybe flushing out some things, especially related to pending nodes (which are generally hard to test), might be good candidates there. > ReplicaLayout follow-up > --- > > Key: CASSANDRA-14705 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14705 > Project: Cassandra > Issue Type: Improvement >Reporter: Benedict >Assignee: Benedict >Priority: Major > > Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for > what we want to do) and {{ReplicaLayout}} (for what we know about the > cluster), with well defined semantics (and comments in the rare cases those > semantics are weird) > Found and fixed some bugs: > - {{commitPaxos}} was using only live nodes, when needed to include down > - We were not writing to pending transient replicas > - On write, we were not hinting to full nodes with transient > replication enabled (since we filtered to {{liveOnly}}, in order to include > our transient replicas above {{blockFor}}) > - If we speculated, in {{maybeSendAdditionalReads}} (in read repair) > we would only consult the same node we had speculated too. This also applied > to {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up
[ https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609615#comment-16609615 ] Ariel Weisberg commented on CASSANDRA-14705: [~ifesdjeen] that branch you linked to in your PR is the wrong one, it's 14705 > ReplicaLayout follow-up > --- > > Key: CASSANDRA-14705 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14705 > Project: Cassandra > Issue Type: Improvement >Reporter: Benedict >Assignee: Benedict >Priority: Major > > Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for > what we want to do) and {{ReplicaLayout}} (for what we know about the > cluster), with well defined semantics (and comments in the rare cases those > semantics are weird) > Found and fixed some bugs: > - {{commitPaxos}} was using only live nodes, when needed to include down > - We were not writing to pending transient replicas > - On write, we were not hinting to full nodes with transient > replication enabled (since we filtered to {{liveOnly}}, in order to include > our transient replicas above {{blockFor}}) > - If we speculated, in {{maybeSendAdditionalReads}} (in read repair) > we would only consult the same node we had speculated too. This also applied > to {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up
[ https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607529#comment-16607529 ] Benedict commented on CASSANDRA-14705: -- FYI, my fix for {{maybeSendAdditionalReads}} is incomplete, I will follow up with it on Monday, but it will be very minor. > ReplicaLayout follow-up > --- > > Key: CASSANDRA-14705 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14705 > Project: Cassandra > Issue Type: Improvement >Reporter: Benedict >Assignee: Benedict >Priority: Major > > Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for > what we want to do) and {{ReplicaLayout}} (for what we know about the > cluster), with well defined semantics (and comments in the rare cases those > semantics are weird) > Found and fixed some bugs: > - {{commitPaxos}} was using only live nodes, when needed to include down > - We were not writing to pending transient replicas > - On write, we were not hinting to full nodes with transient > replication enabled (since we filtered to {{liveOnly}}, in order to include > our transient replicas above {{blockFor}}) > - If we speculated, in {{maybeSendAdditionalReads}} (in read repair) > we would only consult the same node we had speculated too. This also applied > to {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up
[ https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607440#comment-16607440 ] Benedict commented on CASSANDRA-14705: -- I've tried to split the patch into digestible chunks. The initial two commits are just to get most of the changes that touch a lot of files out of the way, to reduce the cognitive burden of review. They consist of only a handful of IDE refactors (and a fairly safe find/replace of replicaLayout->replicaPlan). You can probably ignore these commits almost entirely. The "main refactor" commit has the guts of the work. * Introduce {{ReplicaPlan}}, made up of ** two {{ReplicaLayout}}, for the {{live}} and {{liveAndDown}} replicas (maybe we will want to lazily build a separate {{downOnly}} in future, but I've kept it simple for now) ** two Endpoints, containing the candidate replicas and those we intend to contact; there are now comments clearly defining what the contents of each are in any given scenario. It's probably the case there were some bugs introduced wrt totalEndpoints before this, but I didn't waste time corroborating. * There are now separate {{ReplicaPlan}} and {{ReplicaLayout}} for each of read/write and token/range, and we enforce only the correct type be provided to each recipient. {{ReplicaLayout}} only supports {{natural()}} for read operations, so it is impossible to invoke {{pending()}} here, or think you have {{pending()}} nodes when you do not. <*Open question*: should we disable {{all()}} for read queries, and only support it for writes? Haven't tried, and might get a little ugly, but might be conceptually cleaner> * ReplicaCollection ** I have caved and implemented a {{count}} method, because there is none in {{Iterables}}. I'm actually revisiting the review feedback I gave to {{Ariel}} in his version of the patch to not implement all of these, for two reasons: we now use these _extensively_, and we have only one implementation, so it might be OK to implement a {{filterLazily}} and even an {{{any,all,none}Match}} ** I have removed {{select()}} because the correct logic did not benefit from this abstraction. I have, however, renamed the {{keep()}} method in {{Endpoints}} that re-orders the contents to {{select()}}, to make clear the distinction of semantics. Then there are a few follow up commits to either fix bugs or perform some follow up clean ups. Looking forward to your feedback. > ReplicaLayout follow-up > --- > > Key: CASSANDRA-14705 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14705 > Project: Cassandra > Issue Type: Improvement >Reporter: Benedict >Assignee: Benedict >Priority: Major > > Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for > what we want to do) and {{ReplicaLayout}} (for what we know about the > cluster), with well defined semantics (and comments in the rare cases those > semantics are weird) > Found and fixed some bugs: > - {{commitPaxos}} was using only live nodes, when needed to include down > - We were not writing to pending transient replicas > - On write, we were not hinting to full nodes with transient > replication enabled (since we filtered to {{liveOnly}}, in order to include > our transient replicas above {{blockFor}}) > - If we speculated, in {{maybeSendAdditionalReads}} (in read repair) > we would only consult the same node we had speculated too. This also applied > to {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up
[ https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607398#comment-16607398 ] Benedict commented on CASSANDRA-14705: -- [trunk|https://github.com/belliottsmith/cassandra/tree/14705] [CI|https://circleci.com/workflow-run/4f121d12-86b5-48c6-a8e2-b78597ef47c2] > ReplicaLayout follow-up > --- > > Key: CASSANDRA-14705 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14705 > Project: Cassandra > Issue Type: Improvement >Reporter: Benedict >Assignee: Benedict >Priority: Major > > Clarify the new ReplicaLayout code, separating it into ReplicaPlan (for what > we want to do) and ReplicaLayout (for what we know about the cluster), with > well defined semantics (and comments in the rare cases those semantics are > weird) > Found and fixed bugs: > - commitPaxos was using live, when needed all included > - We were not writing to pending transient replicas > - On write, we were not hinting to full nodes with transient > replication enabled (since we filtered to liveOnly, in order to include our > transient replicas above blockFor) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org