[ https://issues.apache.org/jira/browse/IGNITE-10218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874269#comment-16874269 ]
Denis Chudov commented on IGNITE-10218:
---------------------------------------

[~ivan.glukos], could you review my fix, please?

> Detecting lost partitions PME phase triggered twice on coordinator
> ------------------------------------------------------------------
>
>                 Key: IGNITE-10218
>                 URL: https://issues.apache.org/jira/browse/IGNITE-10218
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Sergey Antonov
>            Assignee: Denis Chudov
>            Priority: Major
>             Fix For: 2.8
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Scenario: 1 server left.
> The coordinator node seems to detect partition losses twice per exchange:
> {noformat}
> [16:54:22,027][INFO][exchange-worker-#66][time] Finished exchange init [topVer=AffinityTopologyVersion [topVer=13, minorTopVer=0], crd=true]
> [16:54:22,163][INFO][db-checkpoint-thread-#86][GridCacheDatabaseSharedManager] Skipping checkpoint (no pages were modified) [checkpointLockWait=0ms, checkpointLockHoldTime=525ms, reason='timeout']
> [16:54:22,338][INFO][sys-#136][GridDhtPartitionsExchangeFuture] Coordinator received single message [ver=AffinityTopologyVersion [topVer=13, minorTopVer=0], node=2b69b32f-1bea-4c83-a70d-d7ff8ad7e319, allReceived=false]
> [16:54:22,401][INFO][sys-#137][GridDhtPartitionsExchangeFuture] Coordinator received single message [ver=AffinityTopologyVersion [topVer=13, minorTopVer=0], node=933628df-5237-435c-81d3-7d4be20d8cea, allReceived=false]
> [16:54:22,405][INFO][sys-#73][GridDhtPartitionsExchangeFuture] Coordinator received single message [ver=AffinityTopologyVersion [topVer=13, minorTopVer=0], node=549935a6-48b0-47cd-a763-13cef4706960, allReceived=false]
> [16:54:22,413][INFO][sys-#121][GridDhtPartitionsExchangeFuture] Coordinator received single message [ver=AffinityTopologyVersion [topVer=13, minorTopVer=0], node=fbdde4e1-2422-49af-a0bb-d6797cc723fe, allReceived=false]
> [16:54:22,722][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Coordinator received single message [ver=AffinityTopologyVersion [topVer=13, minorTopVer=0], node=d67a748d-9c63-4ede-8dae-064b63dd1586, allReceived=true]
> [16:54:23,493][INFO][db-checkpoint-thread-#86][GridCacheDatabaseSharedManager] Skipping checkpoint (no pages were modified) [checkpointLockWait=0ms, checkpointLockHoldTime=849ms, reason='timeout']
> [16:54:23,494][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Coordinator received all messages, try merge [ver=AffinityTopologyVersion [topVer=13, minorTopVer=0]]
> [16:54:23,494][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Exchanges merging performed in 0 ms.
> [16:54:23,494][INFO][sys-#122][GridDhtPartitionsExchangeFuture] finishExchangeOnCoordinator [topVer=AffinityTopologyVersion [topVer=13, minorTopVer=0], resVer=AffinityTopologyVersion [topVer=13, minorTopVer=0]]
> [16:54:24,223][INFO][sys-#122][CacheAffinitySharedManager] Affinity recalculation (on server left) performed in 729 ms.
> [16:54:24,371][INFO][db-checkpoint-thread-#86][GridCacheDatabaseSharedManager] Skipping checkpoint (no pages were modified) [checkpointLockWait=0ms, checkpointLockHoldTime=726ms, reason='timeout']
> [16:54:25,146][INFO][db-checkpoint-thread-#86][GridCacheDatabaseSharedManager] Skipping checkpoint (no pages were modified) [checkpointLockWait=1ms, checkpointLockHoldTime=493ms, reason='timeout']
> [16:54:26,443][INFO][db-checkpoint-thread-#86][GridCacheDatabaseSharedManager] Skipping checkpoint (no pages were modified) [checkpointLockWait=8ms, checkpointLockHoldTime=776ms, reason='timeout']
> [16:54:26,443][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Affinity changes (coordinator) applied in 2949 ms.
> [16:54:26,758][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Partitions validation performed in 307 ms.
> [16:54:27,398][INFO][db-checkpoint-thread-#86][GridCacheDatabaseSharedManager] Skipping checkpoint (no pages were modified) [checkpointLockWait=0ms, checkpointLockHoldTime=725ms, reason='timeout']
> [16:54:27,646][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Detecting lost partitions performed in 887 ms.
> [16:54:28,908][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Preparing Full Message performed in 1138 ms.
> [16:54:28,908][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Sending Full Message performed in 0 ms.
> [16:54:28,908][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Sending Full Message to all nodes performed in 0 ms.
> [16:54:28,908][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Finish exchange future [startVer=AffinityTopologyVersion [topVer=13, minorTopVer=0], resVer=AffinityTopologyVersion [topVer=13, minorTopVer=0], err=null]
> [16:54:29,171][INFO][db-checkpoint-thread-#86][GridCacheDatabaseSharedManager] Skipping checkpoint (no pages were modified) [checkpointLockWait=0ms, checkpointLockHoldTime=486ms, reason='timeout']
> [16:54:29,316][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Detecting lost partitions performed in 407 ms.
> {noformat}
> Lost partition detection is invoked from two places:
> # GridDhtPartitionsExchangeFuture#finishExchangeOnCoordinator()
> # GridDhtPartitionsExchangeFuture#onDone()
> Do we really need to detect lost partitions twice?
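
One way to confirm the duplicate phase from a coordinator log is to count the "Detecting lost partitions performed in N ms" entries logged for a single exchange. The following is only a minimal sketch of that check, assuming the log text is already available as a string; the class and method names are hypothetical and are not part of Ignite, and the pattern also tolerates the "> " quoting that mail clients insert when wrapping long lines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not an Ignite class): finds lost-partition detection timings in a log. */
public class LostPartitionPhaseCounter {
    // Matches the timing message, allowing whitespace or a "> " mail quote between words.
    private static final Pattern PHASE = Pattern.compile(
        "Detecting(?:\\s*>)?\\s+lost\\s+partitions\\s+performed\\s+in\\s+(\\d+)\\s+ms");

    /** Returns the duration in milliseconds of every detection entry, in log order. */
    static List<Integer> durations(String log) {
        List<Integer> res = new ArrayList<>();
        Matcher m = PHASE.matcher(log);

        while (m.find())
            res.add(Integer.parseInt(m.group(1)));

        return res;
    }

    public static void main(String[] args) {
        // Two entries taken from the log quoted in the issue.
        String log = String.join("\n",
            "[16:54:27,646][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Detecting lost partitions performed in 887 ms.",
            "[16:54:28,908][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Preparing Full Message performed in 1138 ms.",
            "[16:54:29,316][INFO][sys-#122][GridDhtPartitionsExchangeFuture] Detecting lost partitions performed in 407 ms.");

        List<Integer> found = durations(log);

        System.out.println("Detections: " + found.size() + ", durations (ms): " + found);
    }
}
```

More than one match between a "Finished exchange init" entry and the next one for the same topology version would indicate the duplicated phase described in the issue.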