[ 
https://issues.apache.org/jira/browse/IGNITE-12950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Mironovich reassigned IGNITE-12950:
----------------------------------------

    Assignee: Ivan Mironovich

> Partitions validator must check sizes even if update counters are different
> ---------------------------------------------------------------------------
>
>                 Key: IGNITE-12950
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12950
>             Project: Ignite
>          Issue Type: Improvement
>          Components: cache
>            Reporter: Ivan Mironovich
>            Assignee: Ivan Mironovich
>            Priority: Major
>             Fix For: 2.9
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> We have a method in GridDhtPartitionsStateValidator:
> {code:java}
> public void validatePartitionCountersAndSizes(
>     GridDhtPartitionsExchangeFuture fut,
>     GridDhtPartitionTopology top,
>     Map<UUID, GridDhtPartitionsSingleMessage> messages
> ) throws IgniteCheckedException {
>     final Set<UUID> ignoringNodes = new HashSet<>();
>     // Ignore just joined nodes.
>     for (DiscoveryEvent evt : fut.events().events()) {
>         if (evt.type() == EVT_NODE_JOINED)
>             ignoringNodes.add(evt.eventNode().id());
>     }
>     AffinityTopologyVersion topVer = fut.context().events().topologyVersion();
>     // Validate update counters.
>     Map<Integer, Map<UUID, Long>> result = validatePartitionsUpdateCounters(top, messages, ignoringNodes);
>     if (!result.isEmpty())
>         throw new IgniteCheckedException("Partitions update counters are inconsistent for " + fold(topVer, result));
>     // For sizes validation ignore also nodes which are not able to send cache sizes.
>     for (UUID id : messages.keySet()) {
>         ClusterNode node = cctx.discovery().node(id);
>         if (node != null && node.version().compareTo(SIZES_VALIDATION_AVAILABLE_SINCE) < 0)
>             ignoringNodes.add(id);
>     }
>     if (!cctx.cache().cacheGroup(top.groupId()).mvccEnabled()) { // TODO: Remove "if" clause in IGNITE-9451.
>         // Validate cache sizes.
>         result = validatePartitionsSizes(top, messages, ignoringNodes);
>         if (!result.isEmpty())
>             throw new IgniteCheckedException("Partitions cache sizes are inconsistent for " + fold(topVer, result));
>     }
> }
> {code}
> We should check partition sizes even if update counters are different. This
> would help when debugging problems in production.
> If a partition is in an inconsistent state, we must print information about
> all of its copies. Currently, for a cache group with 3 backups, we can get
> a message like this:
> {code:java}
> Partition states validation has failed for group: CACHEGROUP. Partitions 
> update counters are inconsistent for Part 3415: [10.104.6.10:47500=2577263 
> 10.104.6.12:47500=2577263 10.104.6.23:47500=2577262 10.104.6.9:47500=2577263 
> ] Part 4960: [10.104.6.11:47500=2560994 10.104.6.23:47500=2560993 ]
> {code}
> (Part 4960 lists only 2 copies, although with 3 backups each partition
> should have 4: one primary and three backups.)
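
The requested behaviour could be sketched roughly as follows. This is a standalone illustration, not the Ignite implementation: the class PartitionValidationSketch, its methods, and the plain-map input format are all hypothetical, and the node version check and the MVCC condition from the method above are left out. It runs both checks before failing, and for every inconsistent partition it keeps the values reported by all non-ignored nodes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.UUID;

/** Hypothetical standalone sketch; none of these names are Ignite API. */
final class PartitionValidationSketch {
    /**
     * Finds partitions whose reported values (update counters or sizes) differ between nodes.
     *
     * @param reported Node id -> (partition id -> reported value).
     * @param ignoringNodes Nodes excluded from validation.
     * @return Partition id -> (node id -> value) with ALL reporting copies of each inconsistent partition.
     */
    public static Map<Integer, Map<UUID, Long>> findInconsistent(
        Map<UUID, Map<Integer, Long>> reported,
        Set<UUID> ignoringNodes
    ) {
        Map<Integer, Map<UUID, Long>> byPart = new TreeMap<>();

        // Regroup the per-node reports by partition, skipping ignored nodes.
        for (Map.Entry<UUID, Map<Integer, Long>> nodeEntry : reported.entrySet()) {
            if (ignoringNodes.contains(nodeEntry.getKey()))
                continue;

            for (Map.Entry<Integer, Long> partEntry : nodeEntry.getValue().entrySet()) {
                byPart.computeIfAbsent(partEntry.getKey(), k -> new TreeMap<>())
                    .put(nodeEntry.getKey(), partEntry.getValue());
            }
        }

        // Drop partitions where every copy reports the same value.
        byPart.values().removeIf(vals -> new HashSet<>(vals.values()).size() <= 1);

        return byPart;
    }

    /** Runs both validations and only then fails, so one exception reports both problems. */
    public static void validate(
        Map<UUID, Map<Integer, Long>> counters,
        Map<UUID, Map<Integer, Long>> sizes,
        Set<UUID> ignoringNodes
    ) {
        List<String> errors = new ArrayList<>();

        Map<Integer, Map<UUID, Long>> res = findInconsistent(counters, ignoringNodes);

        if (!res.isEmpty())
            errors.add("Partitions update counters are inconsistent for " + res);

        // Sizes are checked even if the counters check above has already failed.
        res = findInconsistent(sizes, ignoringNodes);

        if (!res.isEmpty())
            errors.add("Partitions cache sizes are inconsistent for " + res);

        if (!errors.isEmpty())
            throw new IllegalStateException(String.join(". ", errors));
    }
}
```

In the real validator the same idea would presumably mean collecting the results of validatePartitionsUpdateCounters and validatePartitionsSizes first and throwing a single IgniteCheckedException afterwards, but that is an assumption about how the fix could look, not a description of the actual patch.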



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
