[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up

2018-09-17 Thread Ariel Weisberg (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16617639#comment-16617639
 ] 

Ariel Weisberg commented on CASSANDRA-14705:


This got merged as 
[047bcd7ad171d6a4aa89128c5e6c6ed5f012b1c0|https://github.com/apache/cassandra/commit/047bcd7ad171d6a4aa89128c5e6c6ed5f012b1c0]?

Is this issue ready to resolve?

> ReplicaLayout follow-up
> ---
>
> Key: CASSANDRA-14705
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14705
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Assignee: Benedict
>Priority: Major
>
> Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for 
> what we want to do) and {{ReplicaLayout}} (for what we know about the 
> cluster), with well defined semantics (and comments in the rare cases those 
> semantics are weird)
> Found and fixed some bugs:
>  * {{commitPaxos}} was using only live nodes, when needed to include down
>  * We were not writing to pending transient replicas
>  * On write, we were not hinting to full nodes with transient replication 
> enabled (since we filtered to {{liveOnly}}, in order to include our transient 
> replicas above {{blockFor}})
>  * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we 
> would only consult the same node we had speculated too. This also applied to 
> {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR.
>  * Transient->Full movements mishandled consistency level upgrade
>  ** While we need to treat a transitioning node as ‘full’ for writes, so that 
> it can safely begin serving full data requests once it has finished, we 
> cannot maintain it in the ‘pending’ collection else we will also increase our 
> consistency requirements by a node that doesn’t exist.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up

2018-09-13 Thread Ariel Weisberg (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613475#comment-16613475
 ] 

Ariel Weisberg commented on CASSANDRA-14705:


+1 unless you want to do more changes as part of this.

> ReplicaLayout follow-up
> ---
>
> Key: CASSANDRA-14705
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14705
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Assignee: Benedict
>Priority: Major
>
> Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for 
> what we want to do) and {{ReplicaLayout}} (for what we know about the 
> cluster), with well defined semantics (and comments in the rare cases those 
> semantics are weird)
> Found and fixed some bugs:
>  * {{commitPaxos}} was using only live nodes, when needed to include down
>  * We were not writing to pending transient replicas
>  * On write, we were not hinting to full nodes with transient replication 
> enabled (since we filtered to {{liveOnly}}, in order to include our transient 
> replicas above {{blockFor}})
>  * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we 
> would only consult the same node we had speculated too. This also applied to 
> {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR.
>  * Transient->Full movements mishandled consistency level upgrade
>  ** While we need to treat a transitioning node as ‘full’ for writes, so that 
> it can safely begin serving full data requests once it has finished, we 
> cannot maintain it in the ‘pending’ collection else we will also increase our 
> consistency requirements by a node that doesn’t exist.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up

2018-09-13 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613459#comment-16613459
 ] 

Benedict commented on CASSANDRA-14705:
--

So, I've pushed a squashed and rebased version 
[here|https://github.com/belliottsmith/cassandra/tree/14705-rebase], however 
I'm considering on last modification - Alex's most recent patches mean we can 
perhaps get rid of the ReplicaPlan.Shared concept.  I may try, and see how it 
turns out.  Previously we needed to construct the readRepair upfront, but now 
we could construct it on demand.  It might be that it is a useful abstraction 
anyway, since we at least make it clear that these are modified.  But it would 
actually localise more the scope of who can see what modifications to a single 
class.

> ReplicaLayout follow-up
> ---
>
> Key: CASSANDRA-14705
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14705
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Assignee: Benedict
>Priority: Major
>
> Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for 
> what we want to do) and {{ReplicaLayout}} (for what we know about the 
> cluster), with well defined semantics (and comments in the rare cases those 
> semantics are weird)
> Found and fixed some bugs:
>  * {{commitPaxos}} was using only live nodes, when needed to include down
>  * We were not writing to pending transient replicas
>  * On write, we were not hinting to full nodes with transient replication 
> enabled (since we filtered to {{liveOnly}}, in order to include our transient 
> replicas above {{blockFor}})
>  * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we 
> would only consult the same node we had speculated too. This also applied to 
> {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR.
>  * Transient->Full movements mishandled consistency level upgrade
>  ** While we need to treat a transitioning node as ‘full’ for writes, so that 
> it can safely begin serving full data requests once it has finished, we 
> cannot maintain it in the ‘pending’ collection else we will also increase our 
> consistency requirements by a node that doesn’t exist.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up

2018-09-13 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613130#comment-16613130
 ] 

Benedict commented on CASSANDRA-14705:
--

Ah, thanks.  I must have gone amiss somehow with the summary GitHub UI.  It 
does look like we can just use the provided replica plan safely, so I've 
removed the CL parameters.

> ReplicaLayout follow-up
> ---
>
> Key: CASSANDRA-14705
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14705
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Assignee: Benedict
>Priority: Major
>
> Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for 
> what we want to do) and {{ReplicaLayout}} (for what we know about the 
> cluster), with well defined semantics (and comments in the rare cases those 
> semantics are weird)
> Found and fixed some bugs:
>  * {{commitPaxos}} was using only live nodes, when needed to include down
>  * We were not writing to pending transient replicas
>  * On write, we were not hinting to full nodes with transient replication 
> enabled (since we filtered to {{liveOnly}}, in order to include our transient 
> replicas above {{blockFor}})
>  * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we 
> would only consult the same node we had speculated too. This also applied to 
> {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR.
>  * Transient->Full movements mishandled consistency level upgrade
>  ** While we need to treat a transitioning node as ‘full’ for writes, so that 
> it can safely begin serving full data requests once it has finished, we 
> cannot maintain it in the ‘pending’ collection else we will also increase our 
> consistency requirements by a node that doesn’t exist.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up

2018-09-12 Thread Ariel Weisberg (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612773#comment-16612773
 ] 

Ariel Weisberg commented on CASSANDRA-14705:


[This is what I was talking 
about|https://github.com/apache/cassandra/pull/265/files#diff-0246c72855070863c2fdbee6d97f494dR63].
 Couldn't this refer to the CL from the parameter ReplicaPlan? Can you remove 
the TODO?

> ReplicaLayout follow-up
> ---
>
> Key: CASSANDRA-14705
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14705
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Assignee: Benedict
>Priority: Major
>
> Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for 
> what we want to do) and {{ReplicaLayout}} (for what we know about the 
> cluster), with well defined semantics (and comments in the rare cases those 
> semantics are weird)
> Found and fixed some bugs:
>  * {{commitPaxos}} was using only live nodes, when needed to include down
>  * We were not writing to pending transient replicas
>  * On write, we were not hinting to full nodes with transient replication 
> enabled (since we filtered to {{liveOnly}}, in order to include our transient 
> replicas above {{blockFor}})
>  * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we 
> would only consult the same node we had speculated too. This also applied to 
> {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR.
>  * Transient->Full movements mishandled consistency level upgrade
>  ** While we need to treat a transitioning node as ‘full’ for writes, so that 
> it can safely begin serving full data requests once it has finished, we 
> cannot maintain it in the ‘pending’ collection else we will also increase our 
> consistency requirements by a node that doesn’t exist.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up

2018-09-12 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612761#comment-16612761
 ] 

Benedict commented on CASSANDRA-14705:
--

So, my personal view on TODOs is that they can be much more useful left inline 
with their context, for when you or somebody else next comes to touch the 
surrounding code - whether in the intended JIRA, or another related one.  
They're also a bit specific for a JIRA, they're more guidance for what to 
consider when shaping your next solution.

They're like documentation, only they document inadequacy, things for the next 
author to watch out for, and a suggestion that (if time permits) they resolve 
them at the same time.  

There are multiple JIRA filed already for these code paths, any of which might 
hopefully encompass refactoring / fixing these TODOs.
{quote}If you could remove that one code comment, maybe clean it up to only 
refer to the one ReplicaPlan
{quote}
Which one?  The one I saw was already missing from the code, hence the comment 
being out of date for the PR (and my missing it) on GitHub UI.

> ReplicaLayout follow-up
> ---
>
> Key: CASSANDRA-14705
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14705
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Assignee: Benedict
>Priority: Major
>
> Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for 
> what we want to do) and {{ReplicaLayout}} (for what we know about the 
> cluster), with well defined semantics (and comments in the rare cases those 
> semantics are weird)
> Found and fixed some bugs:
>  * {{commitPaxos}} was using only live nodes, when needed to include down
>  * We were not writing to pending transient replicas
>  * On write, we were not hinting to full nodes with transient replication 
> enabled (since we filtered to {{liveOnly}}, in order to include our transient 
> replicas above {{blockFor}})
>  * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we 
> would only consult the same node we had speculated too. This also applied to 
> {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR.
>  * Transient->Full movements mishandled consistency level upgrade
>  ** While we need to treat a transitioning node as ‘full’ for writes, so that 
> it can safely begin serving full data requests once it has finished, we 
> cannot maintain it in the ‘pending’ collection else we will also increase our 
> consistency requirements by a node that doesn’t exist.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up

2018-09-12 Thread Ariel Weisberg (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612720#comment-16612720
 ] 

Ariel Weisberg commented on CASSANDRA-14705:


If you could remove that one code comment, maybe clean it up to only refer to 
the one ReplicaPlan and do something about the other TODOs that need to either 
be done, removed, or JIRAed. That's a good policy for TODOs right?

+1 after that.



> ReplicaLayout follow-up
> ---
>
> Key: CASSANDRA-14705
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14705
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Assignee: Benedict
>Priority: Major
>
> Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for 
> what we want to do) and {{ReplicaLayout}} (for what we know about the 
> cluster), with well defined semantics (and comments in the rare cases those 
> semantics are weird)
> Found and fixed some bugs:
>  * {{commitPaxos}} was using only live nodes, when needed to include down
>  * We were not writing to pending transient replicas
>  * On write, we were not hinting to full nodes with transient replication 
> enabled (since we filtered to {{liveOnly}}, in order to include our transient 
> replicas above {{blockFor}})
>  * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we 
> would only consult the same node we had speculated too. This also applied to 
> {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR.
>  * Transient->Full movements mishandled consistency level upgrade
>  ** While we need to treat a transitioning node as ‘full’ for writes, so that 
> it can safely begin serving full data requests once it has finished, we 
> cannot maintain it in the ‘pending’ collection else we will also increase our 
> consistency requirements by a node that doesn’t exist.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up

2018-09-12 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612687#comment-16612687
 ] 

Benedict commented on CASSANDRA-14705:
--

Not loving the GitHub comments experience at the moment.  I've responded to the 
comments I could find.

> ReplicaLayout follow-up
> ---
>
> Key: CASSANDRA-14705
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14705
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Assignee: Benedict
>Priority: Major
>
> Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for 
> what we want to do) and {{ReplicaLayout}} (for what we know about the 
> cluster), with well defined semantics (and comments in the rare cases those 
> semantics are weird)
> Found and fixed some bugs:
>  * {{commitPaxos}} was using only live nodes, when needed to include down
>  * We were not writing to pending transient replicas
>  * On write, we were not hinting to full nodes with transient replication 
> enabled (since we filtered to {{liveOnly}}, in order to include our transient 
> replicas above {{blockFor}})
>  * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we 
> would only consult the same node we had speculated too. This also applied to 
> {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR.
>  * Transient->Full movements mishandled consistency level upgrade
>  ** While we need to treat a transitioning node as ‘full’ for writes, so that 
> it can safely begin serving full data requests once it has finished, we 
> cannot maintain it in the ‘pending’ collection else we will also increase our 
> consistency requirements by a node that doesn’t exist.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up

2018-09-12 Thread Ariel Weisberg (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612656#comment-16612656
 ] 

Ariel Weisberg commented on CASSANDRA-14705:


These two are outdated so you will need to click view outdated, but there is 
https://github.com/apache/cassandra/pull/265#pullrequestreview-153925151 and 
https://github.com/apache/cassandra/pull/265#pullrequestreview-153914279 which 
both seem to not be responded to.

The first is the one that really matters because I think not understanding why 
there are multiple plans is an issue.

> ReplicaLayout follow-up
> ---
>
> Key: CASSANDRA-14705
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14705
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Assignee: Benedict
>Priority: Major
>
> Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for 
> what we want to do) and {{ReplicaLayout}} (for what we know about the 
> cluster), with well defined semantics (and comments in the rare cases those 
> semantics are weird)
> Found and fixed some bugs:
>  * {{commitPaxos}} was using only live nodes, when needed to include down
>  * We were not writing to pending transient replicas
>  * On write, we were not hinting to full nodes with transient replication 
> enabled (since we filtered to {{liveOnly}}, in order to include our transient 
> replicas above {{blockFor}})
>  * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we 
> would only consult the same node we had speculated too. This also applied to 
> {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR.
>  * Transient->Full movements mishandled consistency level upgrade
>  ** While we need to treat a transitioning node as ‘full’ for writes, so that 
> it can safely begin serving full data requests once it has finished, we 
> cannot maintain it in the ‘pending’ collection else we will also increase our 
> consistency requirements by a node that doesn’t exist.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up

2018-09-12 Thread Alex Petrov (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612298#comment-16612298
 ] 

Alex Petrov commented on CASSANDRA-14705:
-

Thank you for the changes & discussions. 

+1, LGTM.

> ReplicaLayout follow-up
> ---
>
> Key: CASSANDRA-14705
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14705
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Assignee: Benedict
>Priority: Major
>
> Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for 
> what we want to do) and {{ReplicaLayout}} (for what we know about the 
> cluster), with well defined semantics (and comments in the rare cases those 
> semantics are weird)
> Found and fixed some bugs:
>  * {{commitPaxos}} was using only live nodes, when needed to include down
>  * We were not writing to pending transient replicas
>  * On write, we were not hinting to full nodes with transient replication 
> enabled (since we filtered to {{liveOnly}}, in order to include our transient 
> replicas above {{blockFor}})
>  * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we 
> would only consult the same node we had speculated too. This also applied to 
> {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR.
>  * Transient->Full movements mishandled consistency level upgrade
>  ** While we need to treat a transitioning node as ‘full’ for writes, so that 
> it can safely begin serving full data requests once it has finished, we 
> cannot maintain it in the ‘pending’ collection else we will also increase our 
> consistency requirements by a node that doesn’t exist.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up

2018-09-12 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612251#comment-16612251
 ] 

Benedict commented on CASSANDRA-14705:
--

{quote}Maybe it's actually good to fail queries where we have collected half 
the replicas from one token metadata and the other one from other? Not sure.
{quote}
Yeah, not sure either, but since we're already exposed (generally) to range 
ownership races, it's _probably_ better to hold off until we fix those across 
the board IMO, to least surprise our users.

> ReplicaLayout follow-up
> ---
>
> Key: CASSANDRA-14705
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14705
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Assignee: Benedict
>Priority: Major
>
> Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for 
> what we want to do) and {{ReplicaLayout}} (for what we know about the 
> cluster), with well defined semantics (and comments in the rare cases those 
> semantics are weird)
> Found and fixed some bugs:
>  * {{commitPaxos}} was using only live nodes, when needed to include down
>  * We were not writing to pending transient replicas
>  * On write, we were not hinting to full nodes with transient replication 
> enabled (since we filtered to {{liveOnly}}, in order to include our transient 
> replicas above {{blockFor}})
>  * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we 
> would only consult the same node we had speculated too. This also applied to 
> {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR.
>  * Transient->Full movements mishandled consistency level upgrade
>  ** While we need to treat a transitioning node as ‘full’ for writes, so that 
> it can safely begin serving full data requests once it has finished, we 
> cannot maintain it in the ‘pending’ collection else we will also increase our 
> consistency requirements by a node that doesn’t exist.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up

2018-09-12 Thread Alex Petrov (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611857#comment-16611857
 ] 

Alex Petrov commented on CASSANDRA-14705:
-

bq. I'm not sure we can - this would be a race condition on the Keyspace.

True, I haven't thought of it. We should probably try to make replication 
factor immutable for the time of query and attached to replica plan at some 
point in future.

bq. Not quite sure what you mean here?  That, for an AbstractBounds, we fully 
cover it?

I mostly meant something like 
[this|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/locator/EndpointsForRange.java#L94-L96].
 Maybe it's actually good to fail queries where we have collected half the 
replicas from one token metadata and the other one from other? Not sure.

bq.  I guess we could have a shared method that accepts a boolean indicating if 
we should throw, if you like? 

We could also leave it for later, really. It's rather minor and I'm not very 
worried about it right now, just thought it's worth persisting it.

bq. ReplicaCount can now use the new count method

Sorry, I was thinking about something different, let's leave it as is.

bq.  so a quick check to confirm sufficiency is, well, sufficient. 

Right, agreed. Maybe if we used a boolean for check and threw if it returned 
{{false}}, we could still consolidate those, but this is also not a huge win.

> ReplicaLayout follow-up
> ---
>
> Key: CASSANDRA-14705
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14705
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Assignee: Benedict
>Priority: Major
>
> Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for 
> what we want to do) and {{ReplicaLayout}} (for what we know about the 
> cluster), with well defined semantics (and comments in the rare cases those 
> semantics are weird)
> Found and fixed some bugs:
>   - {{commitPaxos}} was using only live nodes, when needed to include down
>   - We were not writing to pending transient replicas
>   - On write, we were not hinting to full nodes with transient 
> replication enabled (since we filtered to {{liveOnly}}, in order to include 
> our transient replicas above {{blockFor}})
> - If we speculated, in {{maybeSendAdditionalReads}} (in read repair) 
> we would only consult the same node we had speculated too.  This also applied 
> to {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up

2018-09-12 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611774#comment-16611774
 ] 

Benedict commented on CASSANDRA-14705:
--

Thanks.  I vacillated over a number of these naming decisions myself, so I'm 
happy to accept your suggestions.  I'll respond to just a few points:
{quote}we can avoid iterating in {{ReplicaPlan#forWrite}} checking for 
{{isTransient}} and short-circuit it for non-transient keyspace with 
{{CL#hasTransient}}.
{quote}
I'm not sure we can - this would be a race condition on the Keyspace.  Very 
rare, admittedly, but worth avoiding. Perhaps we could cache the value of this 
inside of Endpoints, though, and introduce a {{hasTransient()}} method?  
Ideally, we will later cache the {{ReplicaLayout}} (or perhaps even 
{{ReplicaPlan}}) at which point this would short-circuit this work for all 
queries.
{quote}in {{EndpointsFor(Range|Token)}} we have checks for range/token so they 
matched for concatenated ranges. Should we apply same checks when constructing 
layouts?
{quote}
Not quite sure what you mean here?  That, for an AbstractBounds, we fully cover 
it?  I did wonder about this, but there's actually potential race conditions in 
the code where we generate our ranges to query, and actually query them, where 
we might fail this check.  But this race condition is inherent to the way we do 
range ownerships anyway, so not a lot of point in killing queries today, I 
reckon.  This obviously doesn't apply to Token, since they are the unit by 
which range ownerships can move.
{quote}I know it was same before the patch and we can do it in a separate one, 
but {{isSufficientReplicasForRead}} and {{assureSufficientReplicas}} share the 
logic which potentially makes it prone to modifications just in one place and 
not the other. Looks like we could combine the two. In same methods, naming is 
either liveReplicas or allLive for same argument.
{quote}
I agree, but at present it would be a bit painful to modify, because 
{{assureSufficientReplicas}} throws exceptions that utilise the computations.  
I guess we could have a shared method that accepts a boolean indicating if we 
should throw, if you like?  My intention was that all of this would be cleaned 
up in a later Jira, where we might address the overall management of 
'sufficiency' - as this is dotted all over resolvers, write handlers, CL, 
read-repair...
{quote}{{ReplicaCount}} can now use the new {{count}} method
{quote}
I don't follow?  Do you mean {{ReplicaCollection.count()}} - if so, this only 
returns an int, and we need two ints...
{quote}{{ReplicaPlans#maybeMerge}} is currently checking for 
{{isSufficientReplicasForRead}} and then proceeds with filtering the nodes, 
unlike token reads, which filter and then check. Since this is the only place 
{{isSufficientReplicasForRead}}, this might be an opportunity for consolidation.
{quote}
These checks are a little different - we're testing to see if two sufficient 
plans can be merged into one, so a quick check to confirm sufficiency is, well, 
sufficient.  We aren't then performing a later check to assert sufficiency, 
because if they aren't we just use the two separate plans.  That said, we could 
make the structure of the code similar, by checking after construction - but I 
assume the point of this was originally to avoid generating more garbage than 
necessary.
{quote}looks like you're half a step away from removing all the logic from 
{{ConsistencyLevel}}, which is great. Theoretically, all static methods from 
there could be moved to {{ReplicaPlan}} and things like "filterForQuery" could 
have stronger invariants, such as {{ForRead>}}, same is true for 
{{isSufficientReplicasForRead/write}}, especially is applicable for write since 
we have to pass {{Endpoints}} separately there even though they're technically 
in the same bucket of {{ForWrite}}.
{quote}
Agreed entirely, CL should ideally return to being mostly just an enum.  Since 
most of its logic is for tracking sufficiency (either in the is/assert checks, 
or in resolvers, write handlers, etc.), when we address that we can move all of 
the logic out of CL.  I also already have a follow-up bug fix that moves 
{{filterForQuery}} into {{ReplicaPlans}}
{quote}we've discussed it offline and it's probably worth to do in a follow-up 
ticket, but collecting replicas and {{ReplicaLayouts}} in general might be 
worth to cover with tests. They're tested indirectly, however mostly through 
dtests. Maybe flushing out some things, especially related to pending nodes 
(which are generally hard to test), might be good candidates there.
{quote}
I will try to rustle up some unit tests for sure, though you may be right that 
a follow-up ticket is best, so we can get this committed ASAP.

> ReplicaLayout follow-up
> ---
>
> Key: CASSANDRA-14705
> URL: 

[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up

2018-09-12 Thread Alex Petrov (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611749#comment-16611749
 ] 

Alex Petrov commented on CASSANDRA-14705:
-

Thank you for the patch. Some comments, all rather minor:

  * we could rename {{liveOnly}} to {{live}} unless "only" here bears 
  * {{allUncontactedCandidates}} could be just {{uncontactedCandidates}} 
  * {{ForTokenWrite}} can be just {{ForWrite}} since we've removed 
{{ForRangeWrite}}, similarly in {{forTokenWriteLiveAndDown}} and 
{{forTokenWrite}}
  * we can avoid iterating in {{ReplicaPlan#forWrite}} checking for 
{{isTransient}} and short-circuit it for non-transient keyspace with 
{{CL#hasTransient}}.
 * in {{EndpointsFor(Range|Token)}} we have checks for range/token so they 
matched for concatenated ranges. Should we apply same checks when constructing 
layouts?
  * I know it was same before the patch and we can do it in a separate one, but 
{{isSufficientReplicasForRead}} and {{assureSufficientReplicas}} share the 
logic which potentially makes it prone to modifications just in one place and 
not the other. Looks like we could combine the two. In same methods, naming is 
either liveReplicas or allLive for same argument. 
  * {{assureSufficientReplicas}} might need word "live" in it
  * {{ReplicaCount}} can now use the new {{count}} method
  * {{ReplicaPlans#maybeMerge}} is currently checking for 
{{isSufficientReplicasForRead}} and then proceeds with filtering the nodes, 
unlike token reads, which filter and then check. Since this is the only place 
{{isSufficientReplicasForRead}}, this might be an opportunity for consolidation.

And some that we've discussed offline for history:
  * looks like you're half a step away from removing all the logic from 
{{ConsistencyLevel}}, which is great. Theoretically, all static methods from 
there could be moved to {{ReplicaPlan}} and things like "filterForQuery" could 
have stronger invariants, such as {{ForRead>}}, same is true for 
{{isSufficientReplicasForRead/write}}, especially is applicable for write since 
we have to pass {{Endpoints}} separately there even though they're technically 
in the same bucket of {{ForWrite}}.
  * we've discussed it offline and it's probably worth to do in a follow-up 
ticket, but collecting replicas and {{ReplicaLayouts}} in general might be 
worth to cover with tests. They're tested indirectly, however mostly through 
dtests. Maybe flushing out some things, especially related to pending nodes 
(which are generally hard to test), might be good candidates there.

> ReplicaLayout follow-up
> ---
>
> Key: CASSANDRA-14705
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14705
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Assignee: Benedict
>Priority: Major
>
> Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for 
> what we want to do) and {{ReplicaLayout}} (for what we know about the 
> cluster), with well defined semantics (and comments in the rare cases those 
> semantics are weird)
> Found and fixed some bugs:
>   - {{commitPaxos}} was using only live nodes, when needed to include down
>   - We were not writing to pending transient replicas
>   - On write, we were not hinting to full nodes with transient 
> replication enabled (since we filtered to {{liveOnly}}, in order to include 
> our transient replicas above {{blockFor}})
> - If we speculated, in {{maybeSendAdditionalReads}} (in read repair) 
> we would only consult the same node we had speculated too.  This also applied 
> to {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up

2018-09-10 Thread Ariel Weisberg (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609615#comment-16609615
 ] 

Ariel Weisberg commented on CASSANDRA-14705:


[~ifesdjeen] that branch you linked to in your PR is the wrong one, it's 14705

> ReplicaLayout follow-up
> ---
>
> Key: CASSANDRA-14705
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14705
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Assignee: Benedict
>Priority: Major
>
> Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for 
> what we want to do) and {{ReplicaLayout}} (for what we know about the 
> cluster), with well defined semantics (and comments in the rare cases those 
> semantics are weird)
> Found and fixed some bugs:
>   - {{commitPaxos}} was using only live nodes, when needed to include down
>   - We were not writing to pending transient replicas
>   - On write, we were not hinting to full nodes with transient 
> replication enabled (since we filtered to {{liveOnly}}, in order to include 
> our transient replicas above {{blockFor}})
> - If we speculated, in {{maybeSendAdditionalReads}} (in read repair) 
> we would only consult the same node we had speculated too.  This also applied 
> to {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up

2018-09-07 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607529#comment-16607529
 ] 

Benedict commented on CASSANDRA-14705:
--

FYI, my fix for {{maybeSendAdditionalReads}} is incomplete, I will follow up 
with it on Monday, but it will be very minor.

> ReplicaLayout follow-up
> ---
>
> Key: CASSANDRA-14705
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14705
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Assignee: Benedict
>Priority: Major
>
> Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for 
> what we want to do) and {{ReplicaLayout}} (for what we know about the 
> cluster), with well defined semantics (and comments in the rare cases those 
> semantics are weird)
> Found and fixed some bugs:
>   - {{commitPaxos}} was using only live nodes, when needed to include down
>   - We were not writing to pending transient replicas
>   - On write, we were not hinting to full nodes with transient 
> replication enabled (since we filtered to {{liveOnly}}, in order to include 
> our transient replicas above {{blockFor}})
> - If we speculated, in {{maybeSendAdditionalReads}} (in read repair) 
> we would only consult the same node we had speculated too.  This also applied 
> to {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up

2018-09-07 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607440#comment-16607440
 ] 

Benedict commented on CASSANDRA-14705:
--

I've tried to split the patch into digestible chunks.  

The initial two commits are just to get most of the changes that touch a lot of 
files out of the way, to reduce the cognitive burden of review.  They consist 
of only a handful of IDE refactors (and a fairly safe find/replace of 
replicaLayout->replicaPlan).  You can probably ignore these commits almost 
entirely.

The "main refactor" commit has the guts of the work.
* Introduce {{ReplicaPlan}}, made up of
** two {{ReplicaLayout}}, for the {{live}} and {{liveAndDown}} replicas (maybe 
we will want to lazily build a separate {{downOnly}} in future, but I've kept 
it simple for now)
** two Endpoints, containing the candidate replicas and those we intend to 
contact; there are now comments clearly defining what the contents of each are 
in any given scenario.  It's probably the case there were some bugs introduced 
wrt totalEndpoints before this, but I didn't waste time corroborating.
* There are now separate {{ReplicaPlan}} and {{ReplicaLayout}} for each of 
read/write and token/range, and we enforce only the correct type be provided to 
each recipient.  {{ReplicaLayout}} only supports {{natural()}} for read 
operations, so it is impossible to invoke {{pending()}} here, or think you have 
{{pending()}} nodes when you do not.  <*Open question*: should we disable 
{{all()}} for read queries, and only support it for writes?  Haven't tried, and 
might get a little ugly, but might be conceptually cleaner>
* ReplicaCollection
** I have caved and implemented a {{count}} method, because there is none in 
{{Iterables}}.   I'm actually revisiting the review feedback I gave to 
{{Ariel}} in his version of the patch to not implement all of these, for two 
reasons: we now use these _extensively_, and we have only one implementation, 
so it might be OK to implement a {{filterLazily}} and even an 
{{{any,all,none}Match}}
** I have removed {{select()}} because the correct logic did not benefit from 
this abstraction.  I have, however, renamed the {{keep()}} method in 
{{Endpoints}} that re-orders the contents to {{select()}}, to make clear the 
distinction of semantics.

Then there are a few follow up commits to either fix bugs or perform some 
follow up clean ups.

Looking forward to your feedback.

> ReplicaLayout follow-up
> ---
>
> Key: CASSANDRA-14705
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14705
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Assignee: Benedict
>Priority: Major
>
> Clarify the new {{ReplicaLayout}} code, separating it into ReplicaPlan (for 
> what we want to do) and {{ReplicaLayout}} (for what we know about the 
> cluster), with well defined semantics (and comments in the rare cases those 
> semantics are weird)
> Found and fixed some bugs:
>   - {{commitPaxos}} was using only live nodes, when needed to include down
>   - We were not writing to pending transient replicas
>   - On write, we were not hinting to full nodes with transient 
> replication enabled (since we filtered to {{liveOnly}}, in order to include 
> our transient replicas above {{blockFor}})
> - If we speculated, in {{maybeSendAdditionalReads}} (in read repair) 
> we would only consult the same node we had speculated too.  This also applied 
> to {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up

2018-09-07 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607398#comment-16607398
 ] 

Benedict commented on CASSANDRA-14705:
--

[trunk|https://github.com/belliottsmith/cassandra/tree/14705]  
[CI|https://circleci.com/workflow-run/4f121d12-86b5-48c6-a8e2-b78597ef47c2]

> ReplicaLayout follow-up
> ---
>
> Key: CASSANDRA-14705
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14705
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Benedict
>Assignee: Benedict
>Priority: Major
>
> Clarify the new ReplicaLayout code, separating it into ReplicaPlan (for what 
> we want to do) and ReplicaLayout (for what we know about the cluster), with 
> well defined semantics (and comments in the rare cases those semantics are 
> weird)
> Found and fixed bugs:
>   - commitPaxos was using live, when needed all included
>   - We were not writing to pending transient replicas
>   - On write, we were not hinting to full nodes with transient 
> replication enabled (since we filtered to liveOnly, in order to include our 
> transient replicas above blockFor)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org