[ https://issues.apache.org/jira/browse/IGNITE-9913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16871096#comment-16871096 ]
Amelchev Nikita commented on IGNITE-9913:
-----------------------------------------

Hi, [~ivan.glukos]. I found two possible blockers for such a lightweight PME without blocking updates:

1. Finalizing partition update counters. It seems that we can't correctly collect counter gaps and process them without completing all transactions first. See the {{GridDhtPartitionTopologyImpl#finalizeUpdateCounters}} method.

2. Applying update counters. We can't correctly set the {{HWM}} counter if the primary left the cluster after sending updates to only part of the backups. Such updates can be processed later and break the {{LWM <= HWM}} guarantee. (A simplified model of both problems is sketched at the end of this message.)

Could you take a look?

> Prevent data updates blocking in case of backup BLT server node leave
> ----------------------------------------------------------------------
>
>                 Key: IGNITE-9913
>                 URL: https://issues.apache.org/jira/browse/IGNITE-9913
>             Project: Ignite
>          Issue Type: Improvement
>          Components: general
>            Reporter: Ivan Rakov
>            Assignee: Amelchev Nikita
>            Priority: Major
>             Fix For: 2.8
>
>         Attachments: 9913_yardstick.png, master_yardstick.png
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Ignite cluster performs a distributed partition map exchange when any server
> node leaves or joins the topology.
> Distributed PME blocks all updates and may take a long time. If all
> partitions are assigned according to the baseline topology and a server node
> leaves, there's no actual need to perform distributed PME: every cluster node
> is able to recalculate the new affinity assignments and partition states locally.
> If we implement such a lightweight PME and handle mapping and lock requests
> on the new topology version correctly, updates won't be blocked (except updates
> to partitions that lost their primary copy).
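Purely as an illustration of the two blockers above, here is a minimal sketch of a per-partition {{LWM}}/{{HWM}} counter pair. All names here ({{UpdateCounterSketch}}, {{reserve}}, {{apply}}, {{finalizeCounters}}) are hypothetical and do not match Ignite's actual update counter implementation; only {{GridDhtPartitionTopologyImpl#finalizeUpdateCounters}} referenced above is real.

{code:java}
import java.util.Map;
import java.util.TreeMap;

/**
 * Minimal model of a per-partition update counter pair, kept only to
 * illustrate the two blockers above. All names are hypothetical and do
 * not match Ignite's actual update counter classes.
 */
class UpdateCounterSketch {
    /** LWM: every update with a counter below this value is applied locally. */
    private long lwm;

    /** HWM: upper bound of the counter range reserved by the primary. */
    private long hwm;

    /** Ranges applied out of order, below HWM: start -> delta. */
    private final TreeMap<Long, Long> outOfOrder = new TreeMap<>();

    /** On the primary: reserve a counter range for a batch of updates. */
    synchronized long reserve(long delta) {
        long start = hwm;

        hwm += delta;

        return start;
    }

    /** On a backup: apply a replicated batch [start, start + delta). */
    synchronized void apply(long start, long delta) {
        // Blocker 2: if the primary died after replicating this batch to only
        // part of the backups, `start` can exceed the HWM this node agreed on
        // during the counters exchange; processing the batch late then breaks
        // the LWM <= HWM guarantee.
        hwm = Math.max(hwm, start + delta);

        if (start > lwm) {
            // Out-of-order apply leaves a gap [lwm, start).
            outOfOrder.put(start, delta);

            return;
        }

        lwm = Math.max(lwm, start + delta);

        // Close gaps already covered by earlier out-of-order ranges.
        while (!outOfOrder.isEmpty() && outOfOrder.firstKey() <= lwm) {
            Map.Entry<Long, Long> e = outOfOrder.pollFirstEntry();

            lwm = Math.max(lwm, e.getKey() + e.getValue());
        }
    }

    /**
     * Blocker 1: the remaining gaps may still be filled by in-flight
     * transactions. Treating them as lost (jumping LWM to HWM) is only safe
     * once all transactions have completed, which is exactly what blocking
     * PME guarantees and a lightweight PME would not.
     */
    synchronized void finalizeCounters() {
        lwm = hwm;

        outOfOrder.clear();
    }
}
{code}

In this model, if a primary reserves counters {{[100, 110)}}, replicates the batch to only one of two backups and then leaves, the counters exchange may fix {{HWM = 100}} on the other backup; an in-flight copy of that batch processed afterwards pushes its {{LWM}} past the agreed {{HWM}}.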