Re: Partition map exchange in detail

Ilya Lantukh Fri, 07 Sep 2018 14:18:26 -0700

Hi Eugene,

1) PME happens when the topology changes (TopologyVersion is incremented).
The most common events that trigger it are: node start/stop/failure, cluster
activation/deactivation, dynamic cache start/stop.
2) It is done by a separate ExchangeWorker. Events that trigger PME are
transferred using DiscoverySpi instead of CommunicationSpi.
3) All nodes wait for all pending cache operations to finish and then send
their local partition maps to the coordinator (oldest node). Then
coordinator calculates new global partition maps and sends them to every
node.
4) All cache operations are blocked while the exchange is in progress.
5) Exchange is never retried. The Ignite community is currently working on
PME failure handling that should kick all problematic nodes after a timeout
is reached (see
https://cwiki.apache.org/confluence/display/IGNITE/IEP-25%3A+Partition+Map+Exchange+hangs+resolving
for details), but it isn't done yet.
6) You shouldn't consider a PME failure an error by itself, but rather the
result of some other error. The most common cause of a PME hang is a
pending cache operation that couldn't finish. Check your logs: they should
list pending transactions and atomic updates. Search for the "Found long
running" substring.
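
As a rough illustration of the log check in 6), here is a minimal sketch in
plain Java (no Ignite APIs involved). The class name LongRunningScan, the
method findLongRunning and the default "ignite.log" path are placeholders
made up for this example; only the "Found long running" substring comes from
the actual logs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class LongRunningScan {
    // Substring mentioned in the reply; everything else here is illustrative.
    static final String MARKER = "Found long running";

    /** Returns every line containing the marker, prefixed with its 1-based line number. */
    static List<String> findLongRunning(List<String> lines) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (lines.get(i).contains(MARKER)) {
                hits.add((i + 1) + ": " + lines.get(i));
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default path; pass the real log file as the first argument.
        Path log = Paths.get(args.length > 0 ? args[0] : "ignite.log");
        if (!Files.exists(log)) {
            System.err.println("Log file not found: " + log);
            return;
        }
        for (String hit : findLongRunning(Files.readAllLines(log))) {
            System.out.println(hit);
        }
    }
}
```

Run it against the log of each node; the matching lines should point at the
transactions or atomic updates that were still pending when the exchange
started.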


Hope this helps.

On Fri, Sep 7, 2018 at 11:45 PM, eugene miretsky <eugene.miret...@gmail.com>
wrote:

> Hello,
>
> Our cluster occasionally fails with "partition map exchange failure"
> errors, I have searched around and it seems that a lot of people have had a
> similar issue in the past. My high-level understanding is that when one of
> the nodes fails (out of memory, exception, GC etc.) nodes fail to exchange
> partition maps. However, I have a few questions
> 1) When does partition map exchange happen? Periodically, when a node
> joins, etc.
> 2) Is it done in the same thread as communication SPI, or is a separate
> worker?
> 3) How does the exchange happen? Via a coordinator, peer to peer, etc?
> 4) What does the exchange block?
> 5) When is the exchange retried?
> 6) How to resolve the error? The only thing I have seen online is to
> decrease failureDetectionTimeout
>
> Our settings are
> - Zookeeper SPI
> - Persistence enabled
>
> Cheers,
> Eugene
>



-- 
Best regards,
Ilya
