[ https://issues.apache.org/jira/browse/IGNITE-9913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16803849#comment-16803849 ]
Amelchev Nikita commented on IGNITE-9913:
-----------------------------------------

I have implemented lightweight PME (based on the PR by Ivan Rakov) for the case when a baseline server node leaves the topology.

I have benchmarked it against master under yardstick load (IgniteGetAndPutTxBenchmark, 6 servers, 2 clients with 64 threads each):

master:
!master_yardstick.png!

with my changes:
!9913_yardstick.png!

PME duration:
master: 1440+-35 ms (servers); 989+-87 ms (clients)
with changes: 117+-10 ms (servers and clients)

The maximum transaction latency also decreased:
master: 1439 ms
with changes: 293 ms

In summary, PME duration decreased by roughly 10 times and the maximum transaction latency by 4-5 times.

TC tests look good (testRebalancingDuringLoad_N can be muted until IGNITE-11623 is resolved).

> Prevent data updates blocking in case of backup BLT server node leave
> ---------------------------------------------------------------------
>
>                 Key: IGNITE-9913
>                 URL: https://issues.apache.org/jira/browse/IGNITE-9913
>             Project: Ignite
>          Issue Type: Improvement
>          Components: general
>            Reporter: Ivan Rakov
>            Assignee: Amelchev Nikita
>            Priority: Major
>             Fix For: 2.8
>
>         Attachments: 9913_yardstick.png, master_yardstick.png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> An Ignite cluster performs a distributed partition map exchange (PME) when
> any server node leaves or joins the topology.
> Distributed PME blocks all updates and may take a long time. If all
> partitions are assigned according to the baseline topology and a server node
> leaves, there is no actual need to perform distributed PME: every cluster
> node is able to recalculate the new affinity assignments and partition
> states locally.
> If we implement such a lightweight PME and handle mapping and lock requests
> on the new topology version correctly, updates won't be stopped (except
> updates of partitions that lost their primary copy).
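
To make the idea from the issue description concrete, here is a minimal sketch of what "recalculating affinity locally" could look like when a baseline node leaves. It is an illustration under simplifying assumptions, not Ignite's actual implementation: the class LocalAffinitySketch and its methods are hypothetical names, node IDs are plain strings, and each partition is modeled as an ordered owner list whose first element is the primary. Every node that knows the previous assignment and the ID of the departed node can compute the same result without exchanging messages, and only partitions whose primary was on the departed node need special handling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical illustration only; none of these names are Ignite APIs. */
public class LocalAffinitySketch {
    /**
     * Returns a new assignment with the departed node removed from every
     * partition's owner list. Owner lists are ordered: index 0 is the
     * primary, so the first backup is promoted if the primary has left.
     */
    static Map<Integer, List<String>> recalculate(
            Map<Integer, List<String>> assignment, String leftNode) {
        Map<Integer, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<Integer, List<String>> e : assignment.entrySet()) {
            List<String> owners = new ArrayList<>(e.getValue());
            owners.remove(leftNode); // removes the node if it owned this partition
            result.put(e.getKey(), owners);
        }
        return result;
    }

    /** Partitions whose primary copy was on the departed node. */
    static Set<Integer> partitionsThatLostPrimary(
            Map<Integer, List<String>> assignment, String leftNode) {
        Set<Integer> lost = new TreeSet<>();
        for (Map.Entry<Integer, List<String>> e : assignment.entrySet()) {
            List<String> owners = e.getValue();
            if (!owners.isEmpty() && owners.get(0).equals(leftNode))
                lost.add(e.getKey());
        }
        return lost;
    }

    public static void main(String[] args) {
        // Three partitions, one backup each, spread over nodes A, B and C.
        Map<Integer, List<String>> assignment = new LinkedHashMap<>();
        assignment.put(0, Arrays.asList("A", "B"));
        assignment.put(1, Arrays.asList("B", "C"));
        assignment.put(2, Arrays.asList("C", "A"));

        // Node B leaves: every node can derive the same new assignment.
        System.out.println(recalculate(assignment, "B"));
        System.out.println(partitionsThatLostPrimary(assignment, "B"));
    }
}
```

In this toy example only partition 1 loses its primary when B leaves, so under the assumptions above updates to partitions 0 and 2 would not need to wait.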