[jira] [Comment Edited] (CASSANDRA-12884) Batch logic can lead to unbalanced use of system.batches

2017-08-10 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1614#comment-1614
 ] 

Jeff Jirsa edited comment on CASSANDRA-12884 at 8/10/17 8:34 PM:
-

[~iamaleksey] will have a more comprehensive review, I'm sure, but a few notes 
from a very cursory glance:

-1) I don't see the purpose of stubbing out {{BatchlogManager::shuffle}} as a 
helper function here.- (You're overriding it for deterministic testing)

2) In the case where {{validated.keySet().size() == 1}} , shuffling all of the 
IPs in a given rack may not be all that efficient - may be quicker to just pick 
2 random ints, and grab the IPs at those offsets (like we do for the case where 
we have more than 2 racks, 
{{result.add(rackMembers.get(getRandomInt(rackMembers.size())));}} )
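
A rough sketch of the idea in note 2, for illustration only: draw two distinct random offsets instead of shuffling the whole rack. The class {{PickTwo}} and method {{pickTwo}} are hypothetical names, not part of {{BatchlogManager}}, and the sketch assumes the rack members are held in a random-access {{List}}.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

final class PickTwo
{
    // Hypothetical helper, not part of BatchlogManager: returns up to two
    // distinct elements chosen uniformly at random, without shuffling the list.
    static <T> List<T> pickTwo(List<T> members)
    {
        int n = members.size();
        if (n <= 2)
            return new ArrayList<>(members); // nothing to choose between

        ThreadLocalRandom random = ThreadLocalRandom.current();
        int first = random.nextInt(n);
        // Draw from the remaining n - 1 slots and step over `first`,
        // so the second offset can never collide with the first one.
        int second = random.nextInt(n - 1);
        if (second >= first)
            second++;

        List<T> result = new ArrayList<>(2);
        result.add(members.get(first));
        result.add(members.get(second));
        return result;
    }
}
```

Two plain random ints could land on the same offset, which is why the second draw skips over the first. This costs two random draws per call regardless of rack size, whereas a shuffle touches every element.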




was (Author: jjirsa):
[~iamaleksey] will have a more comprehensive review, I'm sure, but a few notes 
from a very cursory glance:

1) I don't see the purpose of stubbing out {{BatchlogManager::shuffle}} as a 
helper function here.

2) In the case where {{validated.keySet().size() == 1}} , shuffling all of the 
IPs in a given rack may not be all that efficient - may be quicker to just pick 
2 random ints, and grab the IPs at those offsets (like we do for the case where 
we have more than 2 racks, 
{{result.add(rackMembers.get(getRandomInt(rackMembers.size())));}} )



> Batch logic can lead to unbalanced use of system.batches
> 
>
> Key: CASSANDRA-12884
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12884
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Adam Hattrell
> Assignee: Daniel Cranford
> Fix For: 3.0.x, 3.11.x
>
> Attachments: 0001-CASSANDRA-12884.patch
>
>
> It looks as though there are some odd edge cases in how we distribute the 
> copies in system.batches.
> The main issue is in the filter method for 
> org.apache.cassandra.batchlog.BatchlogManager
> {code:java}
> if (validated.size() - validated.get(localRack).size() >= 2)
> {
>     // we have enough endpoints in other racks
>     validated.removeAll(localRack);
> }
> if (validated.keySet().size() == 1)
> {
>     // we have only 1 `other` rack
>     Collection<InetAddress> otherRack = Iterables.getOnlyElement(validated.asMap().values());
>     return Lists.newArrayList(Iterables.limit(otherRack, 2));
> }
> {code}
> So with one or two racks we just return the first 2 entries in the list.  
> There's no shuffle or randomisation here.
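
To make the imbalance described above concrete, here is a small self-contained sketch. It is not the Cassandra code: {{FirstTwoSkew}}, {{firstTwo}}, {{shuffledTwo}} and {{tally}} are hypothetical names, endpoints are plain strings, and {{subList}} stands in for Guava's {{Iterables.limit}} so the example needs only the JDK.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

final class FirstTwoSkew
{
    // Stand-in for Lists.newArrayList(Iterables.limit(otherRack, 2)):
    // always returns the first two entries in list order.
    static List<String> firstTwo(List<String> rack)
    {
        return new ArrayList<>(rack.subList(0, Math.min(2, rack.size())));
    }

    // One possible remedy (an assumption, not necessarily the attached patch):
    // shuffle a copy of the rack before taking the first two entries.
    static List<String> shuffledTwo(List<String> rack, Random random)
    {
        List<String> copy = new ArrayList<>(rack);
        Collections.shuffle(copy, random);
        return firstTwo(copy);
    }

    // Counts how often each endpoint is selected over `rounds` selections.
    static Map<String, Integer> tally(List<String> rack, int rounds, Random random, boolean shuffle)
    {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < rounds; i++)
            for (String endpoint : shuffle ? shuffledTwo(rack, random) : firstTwo(rack))
                counts.merge(endpoint, 1, Integer::sum);
        return counts;
    }
}
```

With a four-node rack, {{tally(rack, rounds, random, false)}} only ever records the first two endpoints, so those two nodes would hold every batchlog copy. With {{shuffle}} set to true, the selections spread across all four.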



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-12884) Batch logic can lead to unbalanced use of system.batches

2017-08-09 Thread Daniel Cranford (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119980#comment-16119980
 ] 

Daniel Cranford edited comment on CASSANDRA-12884 at 8/9/17 2:26 PM:
-

Same bug as CASSANDRA-8735. Regression.


was (Author: daniel.cranford):
Same bug. Regression.

> Batch logic can lead to unbalanced use of system.batches
> 
>
> Key: CASSANDRA-12884
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12884
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Adam Hattrell
> Assignee: Joshua McKenzie
> Fix For: 3.0.x, 3.11.x
>
> Attachments: 0001-CASSANDRA-12884.patch


