[GitHub] [hadoop-ozone] timmylicheng commented on issue #668: HDDS-3139. Pipeline placement should max out pipeline usage

2020-04-20 Thread GitBox


timmylicheng commented on issue #668:
URL: https://github.com/apache/hadoop-ozone/pull/668#issuecomment-616920062


   @sodonnel Thanks for the benchmarking effort. That is a valid concern for 
large clusters in the future. I created 
https://issues.apache.org/jira/browse/HDDS-3466 to track this potential 
bottleneck.
   Say we allow 5 pipelines per DN: 10K pipelines then require 10K*3/5 = 6K DNs.
   My team is testing how many DNs each OM and SCM can support. My takeaway is 
that we will have to resolve quite a lot of gRPC and Ratis issues before we can 
scale Ozone up to 6K DNs per SCM, so we may not see this bottleneck in 
production clusters in the short term. However, we definitely want to track 
this and do some testing so that it does not become a blocker.
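
The sizing estimate above can be checked with a few lines of arithmetic. This is only a back-of-envelope sketch: `PipelineSizing` and `datanodesNeeded` are hypothetical names for illustration, not Ozone APIs, and it assumes every pipeline has three members and every DN is filled to its limit.

```java
// Back-of-envelope sizing; class and method names are illustrative, not Ozone APIs.
final class PipelineSizing {
    /** Datanodes needed to host `pipelines` pipelines of `replication` members each. */
    static long datanodesNeeded(long pipelines, int replication, int perNodeLimit) {
        long slots = pipelines * replication;             // one slot per pipeline member
        return (slots + perNodeLimit - 1) / perNodeLimit; // ceiling division
    }

    public static void main(String[] args) {
        // 10K pipelines, replication factor 3, 5 pipelines per DN
        System.out.println(datanodesNeeded(10_000, 3, 5)); // prints 6000
    }
}
```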
   
   Once this becomes a bottleneck, the major impact is that the createPipeline 
process takes a long time, which could cause a slow cold start for SCM and DNs. 
In addition, 10K pipelines produce many pipeline reports and container reports 
for a single SCM. We may ultimately want to limit the overall pipeline count 
per SCM.
   
   @fapifta Yes, I agree with you. Let's use HDDS-3466 to track this; we will 
learn more about SCM's capacity and the best setup as we gradually move up to 
larger clusters. My team is currently testing 200 DNs with a single OM and SCM 
(in a k8s environment). Hopefully we can scale further and get a clearer 
picture.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] timmylicheng commented on issue #668: HDDS-3139. Pipeline placement should max out pipeline usage

2020-04-15 Thread GitBox
timmylicheng commented on issue #668: HDDS-3139. Pipeline placement should max 
out pipeline usage
URL: https://github.com/apache/hadoop-ozone/pull/668#issuecomment-613865710
 
 
   > I think this looks better and I have just a few minor comments. I added a 
few inline plus these two:
   > 
   > At line 243:
   > 
   > ```
   > // First choose an anchor nodes randomly
   > DatanodeDetails anchor = chooseNode(healthyNodes);
   > ```
   > 
   > This no longer picks a random node - it just picks the lowest loaded node. 
I wonder if this first anchor node should be a random node?
   > 
   > Method getHigherLoadNodes() does not seem to be used any longer, so we can 
remove it.
   > 
   > This code has changed a lot with this patch, and this is really my first 
time looking at it. It would be great if @ChenSammi could give this a review 
too, after addressing the minor points I mentioned here?
   
   I've addressed these comments. Thanks.
   I will check whether @ChenSammi has bandwidth to do a review.
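
For the reviewer's question about the anchor, one way to choose it randomly from the healthy list is sketched below. `AnchorChoice` and `chooseRandom` are hypothetical names used only for illustration and are not methods from the patch; the generic element type stands in for `DatanodeDetails`.

```java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Illustrative helper, not part of the patch.
final class AnchorChoice {
    /** Returns a uniformly random element of a non-empty list. */
    static <T> T chooseRandom(List<T> healthyNodes) {
        if (healthyNodes.isEmpty()) {
            throw new IllegalArgumentException("no healthy nodes to choose from");
        }
        int index = ThreadLocalRandom.current().nextInt(healthyNodes.size());
        return healthyNodes.get(index);
    }
}
```

A random anchor spreads the first pick across nodes, while picking the least-loaded node fills nodes more evenly; which one is preferable here is the open question in the review.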





[GitHub] [hadoop-ozone] timmylicheng commented on issue #668: HDDS-3139. Pipeline placement should max out pipeline usage

2020-04-10 Thread GitBox
timmylicheng commented on issue #668: HDDS-3139. Pipeline placement should max 
out pipeline usage
URL: https://github.com/apache/hadoop-ozone/pull/668#issuecomment-611927572
 
 
   > > @sodonnel I rebase the code with minor conflicts in test class, but the 
test won't pass. I took a close look and made some change. But I realize the 
issue that I mention in the last comment about how to leverage with 
chooseNodeFromTopology. Wanna learn your thoughts.
   > 
   > I think one problem is this line:
   > 
   > ```
   > datanodeDetails = nodes.stream().findAny().get();
   > ```
   > 
   > The findAny method does not seem to return a random entry - so the same 
node is returned until it uses up its pipeline allocation.
   > 
   > I am also not sure about the limit calculation in getLowerLoadNodes:
   > 
   > ```
   >  int limit = nodes.size() * heavyNodeCriteria
   > / HddsProtos.ReplicationFactor.THREE.getNumber();
   > ```
   > 
   > Adding debug, I find this method starts to return an empty list when there 
are still available nodes to handle the pipeline.
   > 
   > Also in `filterViableNodes()` via the `meetCriteria()` method, nodes with 
more than the heavy load limit are already filtered out, so you are guaranteed 
your healthy node list contains only nodes with the capacity to take another 
pipeline. So I wonder why we need to filter the nodes further.
   > 
   > > But I realize the issue that I mention in the last comment about how to 
leverage with chooseNodeFromTopology.
   > 
   > There seems to be some inconsistency in how we pick the nodes (not just in 
this PR, but in the wider code). Eg in `chooseNodeBasedOnRackAwareness()` we 
don't call into NetworkTopology(), but instead we use the 
`getNetworkLocation()` method on the `DatanodeDetails` object to find nodes 
that do not match the anchor's location.
   > 
   > Then later in `chooseNodeFromNetworkTopology()` we try to find a node 
where location is equal to the anchor and that is where we call into 
`networkTopology.chooseRandom()`. Could we not avoid that call, and avoid 
generating a new list of nodes and do something similar to 
`chooseNodeBasedOnRackAwareness()`, using the `getNetworkLocation()` method to 
find matching nodes? That would probably be more efficient than the current 
implementation.
   > 
   > As we are also then able to re-use the same list of healthy nodes 
everywhere without more filtering, maybe we could sort that list once by 
pipeline count in filterViableNodes or meetCriteria and then later always pick 
the node with the lowest load, filling the nodes up that way.
   > 
   > I hope this comment makes sense as it is very long.
   
   @sodonnel Thanks for the consideration. I've updated the patch according to 
your example.
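
The suggestion quoted above (filter out full nodes, sort once by pipeline count, then always take the lowest-loaded node) can be sketched as follows. This is a simplified illustration, not the code in the patch: `LeastLoadedPick`, `Node` and `viableSortedByLoad` are hypothetical names, and `Node` is a stand-in for a datanode plus its pipeline count. It also avoids `Stream.findAny()`, which is documented as free to return any element and makes no promise of randomness.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch, not the patch itself.
final class LeastLoadedPick {
    /** Hypothetical stand-in for a datanode and its current pipeline count. */
    static final class Node {
        final String name;
        final int pipelines;

        Node(String name, int pipelines) {
            this.name = name;
            this.pipelines = pipelines;
        }
    }

    /** Drops nodes already at the limit and sorts the rest by ascending pipeline count. */
    static List<Node> viableSortedByLoad(List<Node> nodes, int limit) {
        List<Node> viable = new ArrayList<>();
        for (Node n : nodes) {
            if (n.pipelines < limit) {
                viable.add(n);
            }
        }
        viable.sort(Comparator.comparingInt((Node n) -> n.pipelines));
        return viable;
    }
}
```

With the list sorted once, the first element is always the least-loaded candidate, so no second filtering pass is needed before choosing.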





[GitHub] [hadoop-ozone] timmylicheng commented on issue #668: HDDS-3139 Pipeline placement should max out pipeline usage

2020-04-01 Thread GitBox
timmylicheng commented on issue #668: HDDS-3139 Pipeline placement should max 
out pipeline usage
URL: https://github.com/apache/hadoop-ozone/pull/668#issuecomment-607065266
 
 
   > Hi @timmylicheng this patch is giving some conflicts now, probably as we 
merged the other pipeline related change. Could you rebase against master and 
push it again please?
   
   @sodonnel I rebased the code and resolved minor conflicts in the test class, 
but the test did not pass. I took a close look and made some changes. However, I 
ran into the issue I mentioned in my last comment about how to leverage 
chooseNodeFromTopology. I'd like to hear your thoughts.





[GitHub] [hadoop-ozone] timmylicheng commented on issue #668: HDDS-3139 Pipeline placement should max out pipeline usage

2020-03-25 Thread GitBox
timmylicheng commented on issue #668: HDDS-3139 Pipeline placement should max 
out pipeline usage
URL: https://github.com/apache/hadoop-ozone/pull/668#issuecomment-603840741
 
 
   Before
   pipeline max: 13, pipeline actual: 12
   Pipeline count on node: 1
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   
   After
   pipeline max: 13, pipeline actual: 13
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 4
   Pipeline count on node: 5
   Pipeline count on node: 5
   
   With a larger number of nodes, both versions reach the same pipeline count.
   Before
   pipeline max: 63, pipeline actual: 63
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 4
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   
   After
   pipeline max: 63, pipeline actual: 63
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 4
   Pipeline count on node: 5
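
The `pipeline max` values in these runs are consistent with dividing the total per-node slots by the replication factor: 8 nodes give 13 and 38 nodes give 63 at a limit of 5 pipelines per node. A small sketch of that arithmetic follows. The class and method names are illustrative assumptions, and the formula is inferred from the numbers above rather than taken from the patch.

```java
// Illustrative arithmetic only; names are not from the patch.
final class PipelineBound {
    static final int REPLICATION_FACTOR = 3;

    /** Upper bound on pipelines when each node may join at most `perNodeLimit` of them. */
    static int maxPipelines(int nodes, int perNodeLimit) {
        return nodes * perNodeLimit / REPLICATION_FACTOR; // integer division rounds down
    }

    public static void main(String[] args) {
        System.out.println(maxPipelines(8, 5));  // prints 13
        System.out.println(maxPipelines(38, 5)); // prints 63
    }
}
```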





[GitHub] [hadoop-ozone] timmylicheng commented on issue #668: HDDS-3139 Pipeline placement should max out pipeline usage

2020-03-18 Thread GitBox
timmylicheng commented on issue #668: HDDS-3139 Pipeline placement should max 
out pipeline usage
URL: https://github.com/apache/hadoop-ozone/pull/668#issuecomment-600464593
 
 
   With 5 nodes, each allowed to have 5 pipelines:
   Previous pipeline placement:
   Pipeline count on node: 2
   Pipeline count on node: 5
   Pipeline count on node: 4
   Pipeline count on node: 5
   Pipeline count on node: 5
   
   pipeline max: 8, pipeline actual: 7
   
   After this change:
   Pipeline count on node: 4
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   Pipeline count on node: 5
   
   pipeline max: 8, pipeline actual: 8
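
To illustrate why filling the least-loaded nodes first can reach the maximum, here is a simplified simulation. It is a sketch under strong assumptions: it ignores rack awareness and node health, treats nodes as interchangeable counters, and uses hypothetical names (`GreedyPlacementSketch`, `place`, `pipelineCount`) that do not appear in the patch.

```java
import java.util.Arrays;

// Simplified simulation; not the placement policy from the patch.
final class GreedyPlacementSketch {
    static final int REPLICATION_FACTOR = 3;

    /** Places pipelines on the three least-loaded nodes until fewer than three have room. */
    static int[] place(int nodes, int perNodeLimit) {
        int[] load = new int[nodes];
        if (nodes < REPLICATION_FACTOR) {
            return load; // not enough nodes for a single pipeline
        }
        while (true) {
            Arrays.sort(load); // ascending: least-loaded nodes come first
            if (load[REPLICATION_FACTOR - 1] >= perNodeLimit) {
                break; // the third least-loaded node is full, so no pipeline fits
            }
            for (int i = 0; i < REPLICATION_FACTOR; i++) {
                load[i]++;
            }
        }
        return load;
    }

    /** Each pipeline occupies exactly three slots, so the slot total divides evenly. */
    static int pipelineCount(int[] load) {
        return Arrays.stream(load).sum() / REPLICATION_FACTOR;
    }

    public static void main(String[] args) {
        int[] load = place(5, 5);
        System.out.println(Arrays.toString(load)); // prints [4, 5, 5, 5, 5]
        System.out.println(pipelineCount(load));   // prints 8
    }
}
```

Because the loads never differ by more than one under this rule, at most two slots are left unused when placement stops, which is why the simulated count matches the maximum in the runs reported in this thread.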
   

