Yakov Zhdanov created IGNITE-5155:
-------------------------------------

             Summary: Need to improve stats dump on exchange timeout
                 Key: IGNITE-5155
                 URL: https://issues.apache.org/jira/browse/IGNITE-5155
             Project: Ignite
          Issue Type: Improvement
            Reporter: Yakov Zhdanov
            Assignee: Stanilovsky Evgeny
             Fix For: 2.1


Currently, on large topologies, the information dumped on "Failed to wait for 
partition map exchange" 
(org/apache/ignite/internal/processors/cache/GridCachePartitionExchangeManager.java:1713)
 floods the log, so we need to reduce the amount of information dumped.

1. Reduce the output for exchange futures that are already done. Keep only the 
event, topology version, server count and client count (anything more?).
2. Do not dump the full communication stats. Instead, send a message to the 
exchange coordinator asking for its status and for the number of messages 
received and acked from the local node.
3. We can think of sending a new message from a cache node to the coordinator 
to signal a possible problem on that node (e.g. unreleased tx locks or 
partitions that are still renting). The coordinator can include this info in 
its status, so that every Ignite node can point to the problem node in its logs.
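
To illustrate item 1, here is a minimal sketch of how a shortened dump for 
completed exchange futures might look. It is not taken from the Ignite code 
base: ExchangeDumpSketch, ExchangeInfo, describe() and the sample values are 
all hypothetical names chosen for illustration, and the real exchange future 
carries far more state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types or names come from the Ignite code base.
final class ExchangeDumpSketch {
    /** Simplified stand-in for the state of a single exchange future. */
    static final class ExchangeInfo {
        final String event;
        final long topVer;
        final int servers;
        final int clients;
        final boolean done;
        final String details; // verbose state, only useful while the exchange is pending

        ExchangeInfo(String event, long topVer, int servers, int clients, boolean done, String details) {
            this.event = event;
            this.topVer = topVer;
            this.servers = servers;
            this.clients = clients;
            this.done = done;
            this.details = details;
        }
    }

    /** Completed futures get a short summary; pending ones also keep their details. */
    static String describe(ExchangeInfo f) {
        StringBuilder sb = new StringBuilder("[evt=").append(f.event)
            .append(", topVer=").append(f.topVer)
            .append(", srvs=").append(f.servers)
            .append(", clients=").append(f.clients)
            .append(", done=").append(f.done);

        if (!f.done)
            sb.append(", details=").append(f.details);

        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        List<ExchangeInfo> futs = new ArrayList<>();
        futs.add(new ExchangeInfo("NODE_JOINED", 41, 100, 20, true, "remaining=[]"));
        futs.add(new ExchangeInfo("NODE_LEFT", 42, 99, 20, false, "remaining=[node-17]"));

        for (ExchangeInfo f : futs)
            System.out.println(describe(f));
    }
}
```

With this shape, the per-future log line stays the same length for completed 
exchanges regardless of topology size, while a pending exchange still shows 
the state needed to diagnose it.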



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)