[jira] [Commented] (IGNITE-9913) Prevent data updates blocking in case of backup BLT server node leave

Ilya Lantukh (JIRA) Thu, 28 Mar 2019 07:20:12 -0700


    [ 
https://issues.apache.org/jira/browse/IGNITE-9913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16803972#comment-16803972
 ]


Ilya Lantukh commented on IGNITE-9913:
--------------------------------------

Hi [~NSAmelchev],

 

Thanks for the contribution! I've added some comments to your PR on GitHub.

In general, I think that what you have done doesn't match the ticket's
description. PME should definitely be faster now, because you removed the
distributed exchange phase from it. But cache operations might still be
blocked until PME is finished on all nodes. In large clusters it might take a
significant amount of time for the NODE_LEFT event to reach all nodes, and
during that time some nodes will have topVer == X, while others will still
have topVer == X-1. If a cache operation involves nodes from both subsets, it
will be blocked until the node with the lower version catches up to the
higher one.
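
To make the concern concrete, here is a toy model of the version mismatch. It is not Ignite code: the class TopVerMismatchSketch, its methods, and the node names are hypothetical, and the only assumption carried over from above is that an operation has to wait while any participating node is behind the highest topology version among the participants.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model (assumption, not Ignite internals): each node tracks the topVer it has applied. */
final class TopVerMismatchSketch {
    /** Hypothetical bookkeeping: node name -> locally applied topology version. */
    private final Map<String, Long> topVerByNode = new HashMap<>();

    /** Records that the given node has applied the given topology version. */
    void setTopVer(String node, long topVer) {
        topVerByNode.put(node, topVer);
    }

    /** Highest version seen among the participants; every participant must be registered. */
    long requiredTopVer(Collection<String> participants) {
        long max = Long.MIN_VALUE;
        for (String node : participants)
            max = Math.max(max, topVerByNode.get(node));
        return max;
    }

    /** True only when no participant lags behind the required version. */
    boolean canProceed(Collection<String> participants) {
        long required = requiredTopVer(participants);
        for (String node : participants) {
            if (topVerByNode.get(node) < required)
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        TopVerMismatchSketch cluster = new TopVerMismatchSketch();
        long x = 10;

        cluster.setTopVer("A", x);      // A has already processed NODE_LEFT
        cluster.setTopVer("B", x - 1);  // B has not received the event yet

        System.out.println(cluster.canProceed(List.of("A", "B"))); // false: B is behind

        cluster.setTopVer("B", x);      // NODE_LEFT finally reaches B
        System.out.println(cluster.canProceed(List.of("A", "B"))); // true
    }
}
```

In this model the wait is bounded by event propagation time rather than by the exchange itself, which is the part the PR does not seem to address.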

 

[~ivan.glukos], do you agree with that?

> Prevent data updates blocking in case of backup BLT server node leave
> ---------------------------------------------------------------------
>
>                 Key: IGNITE-9913
>                 URL: https://issues.apache.org/jira/browse/IGNITE-9913
>             Project: Ignite
>          Issue Type: Improvement
>          Components: general
>            Reporter: Ivan Rakov
>            Assignee: Amelchev Nikita
>            Priority: Major
>             Fix For: 2.8
>
>         Attachments: 9913_yardstick.png, master_yardstick.png
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> An Ignite cluster performs a distributed partition map exchange (PME) when
> any server node leaves or joins the topology.
> Distributed PME blocks all updates and may take a long time. If all
> partitions are assigned according to the baseline topology and a server node
> leaves, there's no actual need to perform distributed PME: every cluster node
> is able to recalculate new affinity assignments and partition states locally.
> If we implement such a lightweight PME and handle mapping and lock requests
> on the new topology version correctly, updates won't be stopped (except
> updates of partitions that lost their primary copy).
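
The local recalculation described in the ticket can be sketched as follows. This is a simplified illustration and not Ignite's affinity implementation: the class LocalAffinitySketch and its methods are hypothetical, and the assignment is assumed to be a plain list of owner lists indexed by partition, with the primary at index 0. Real partition states and affinity functions are considerably more involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model (assumption): assignment.get(p) lists the owners of partition p, primary first. */
final class LocalAffinitySketch {
    /**
     * Recomputes the assignment without any coordination: every node can run this
     * locally on the same input and obtain the same result.
     */
    static List<List<String>> recalculate(List<List<String>> assignment, String leftNode) {
        List<List<String>> result = new ArrayList<>();
        for (List<String> owners : assignment) {
            List<String> remaining = new ArrayList<>(owners);
            remaining.remove(leftNode); // drops the departed node if it was an owner
            result.add(remaining);
        }
        return result;
    }

    /** Partitions whose primary was the departed node; only these need special handling. */
    static Set<Integer> partitionsThatLostPrimary(List<List<String>> assignment, String leftNode) {
        Set<Integer> result = new TreeSet<>();
        for (int p = 0; p < assignment.size(); p++) {
            List<String> owners = assignment.get(p);
            if (!owners.isEmpty() && owners.get(0).equals(leftNode))
                result.add(p);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> assignment = List.of(
            List.of("A", "B"),  // partition 0: B is a backup
            List.of("B", "C"),  // partition 1: B is the primary
            List.of("C", "A")); // partition 2: B is not an owner

        System.out.println(recalculate(assignment, "B"));
        System.out.println(partitionsThatLostPrimary(assignment, "B"));
    }
}
```

In this model, if the departed node holds only backup copies, the second method returns an empty set and no partition has to pause updates, which matches the case named in the ticket title.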



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
