[GitHub] [ignite] ascherbakoff commented on a change in pull request #6942: IGNITE-9913

2019-10-15 Thread GitBox
ascherbakoff commented on a change in pull request #6942: IGNITE-9913
URL: https://github.com/apache/ignite/pull/6942#discussion_r334233883
 
 

 ##
 File path: 
modules/core/src/main/java/org/apache/ignite/internal/processors/cache/distributed/dht/preloader/GridDhtPartitionsExchangeFuture.java
 ##
 @@ -3839,7 +3969,14 @@ private void finalizePartitionCounters() {
 cctx.kernalContext().getSystemExecutorService(),
 nonLocalCacheGroups(),
 grp -> {
-grp.topology().finalizeUpdateCounters();
+AffinityTopologyVersion topVer = 
sharedContext().exchange().readyAffinityVersion();
+
+// Failed node's primary partitions or just all local 
backups in case of possible exchange merge.
+Set parts = nodeId != null ?
 
 Review comment:
   Is this intended as an optimization ?
   This is not really needed, because here all possible updates are received 
and all remaining gaps could be safely removed. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [ignite] ascherbakoff commented on a change in pull request #6942: IGNITE-9913

2019-10-15 Thread GitBox
ascherbakoff commented on a change in pull request #6942: IGNITE-9913
URL: https://github.com/apache/ignite/pull/6942#discussion_r334781796
 
 

 ##
 File path: 
modules/core/src/main/java/org/apache/ignite/internal/processors/cache/distributed/dht/preloader/GridDhtPartitionsExchangeFuture.java
 ##
 @@ -3826,10 +3952,14 @@ private void assignPartitionsStates() {
 }
 
 /**
- * Removes gaps in the local update counters. Gaps in update counters are 
possible on backup node when primary
- * failed to send update counter deltas to backup.
+ * Removes gaps in the local update counters. Affects specified node's 
primary partitions in case node specified and
+ * all local partitins otherwise.
+ *
+ * Gaps in update counters are possible on backup node when primary failed 
to send update counter deltas to backup.
+ *
+ * @param nodeId Failed node id, {@code null} in case of merged failures.
  */
-private void finalizePartitionCounters() {
+private void finalizePartitionCounters(UUID nodeId) {
 
 Review comment:
   I wouldn't mind )


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [ignite] ascherbakoff commented on a change in pull request #6942: IGNITE-9913

2019-10-14 Thread GitBox
ascherbakoff commented on a change in pull request #6942: IGNITE-9913
URL: https://github.com/apache/ignite/pull/6942#discussion_r334555914
 
 

 ##
 File path: 
modules/core/src/main/java/org/apache/ignite/internal/processors/cache/distributed/dht/preloader/GridDhtPartitionsExchangeFuture.java
 ##
 @@ -3826,10 +3952,14 @@ private void assignPartitionsStates() {
 }
 
 /**
- * Removes gaps in the local update counters. Gaps in update counters are 
possible on backup node when primary
- * failed to send update counter deltas to backup.
+ * Removes gaps in the local update counters. Affects specified node's 
primary partitions in case node specified and
+ * all local partitins otherwise.
+ *
+ * Gaps in update counters are possible on backup node when primary failed 
to send update counter deltas to backup.
+ *
+ * @param nodeId Failed node id, {@code null} in case of merged failures.
  */
-private void finalizePartitionCounters() {
+private void finalizePartitionCounters(UUID nodeId) {
 
 Review comment:
   To emphasize special meaning of null argument. We have lots of use of 
@Nullable annotation in the codebase.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [ignite] ascherbakoff commented on a change in pull request #6942: IGNITE-9913

2019-10-12 Thread GitBox
ascherbakoff commented on a change in pull request #6942: IGNITE-9913
URL: https://github.com/apache/ignite/pull/6942#discussion_r334233104
 
 

 ##
 File path: 
modules/core/src/main/java/org/apache/ignite/internal/processors/cache/ExchangeContext.java
 ##
 @@ -77,6 +95,28 @@ public ExchangeContext(boolean crd, 
GridDhtPartitionsExchangeFuture fut) {
 evts = new ExchangeDiscoveryEvents(fut);
 }
 
+/**
+ * @param fut Future.
+ */
+private boolean localRecoveryNeeded(GridDhtPartitionsExchangeFuture fut) {
+for (CacheGroupContext grp : 
fut.sharedContext().cache().cacheGroups()) {
+if (grp.isLocal())
+continue;
+
+GridAffinityAssignmentCache aff = grp.affinity();
+
+Set failedPrimaries = 
aff.primaryPartitions(fut.exchangeId().eventNode().id(), aff.lastVersion());
+Set loc = grp.topology().localPartitionMap().keySet();
 
 Review comment:
   You should only take into account owning partitions here (avoid renting 
partitions, moving are not possible because previous assignment is ideal).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [ignite] ascherbakoff commented on a change in pull request #6942: IGNITE-9913

2019-10-12 Thread GitBox
ascherbakoff commented on a change in pull request #6942: IGNITE-9913
URL: https://github.com/apache/ignite/pull/6942#discussion_r334233883
 
 

 ##
 File path: 
modules/core/src/main/java/org/apache/ignite/internal/processors/cache/distributed/dht/preloader/GridDhtPartitionsExchangeFuture.java
 ##
 @@ -3839,7 +3969,14 @@ private void finalizePartitionCounters() {
 cctx.kernalContext().getSystemExecutorService(),
 nonLocalCacheGroups(),
 grp -> {
-grp.topology().finalizeUpdateCounters();
+AffinityTopologyVersion topVer = 
sharedContext().exchange().readyAffinityVersion();
+
+// Failed node's primary partitions or just all local 
backups in case of possible exchange merge.
+Set parts = nodeId != null ?
 
 Review comment:
   It this intended as an optimization ?
   This is not really needed, because here all possible updates are received 
and all remaining gaps could be safely removed. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [ignite] ascherbakoff commented on a change in pull request #6942: IGNITE-9913

2019-10-12 Thread GitBox
ascherbakoff commented on a change in pull request #6942: IGNITE-9913
URL: https://github.com/apache/ignite/pull/6942#discussion_r334233052
 
 

 ##
 File path: 
modules/core/src/main/java/org/apache/ignite/internal/processors/cache/ExchangeContext.java
 ##
 @@ -58,7 +66,17 @@
 public ExchangeContext(boolean crd, GridDhtPartitionsExchangeFuture fut) {
 int protocolVer = 
exchangeProtocolVersion(fut.firstEventCache().minimumNodeVersion());
 
-if (compatibilityNode || (crd && fut.localJoinExchange())) {
+GridDiscoveryManager disco = fut.sharedContext().discovery();
+
+if (protocolVer > 2 && fut.exchangeId().isLeft() && 
fut.isFirstEventNodeInBaseline() &&
 
 Review comment:
   Protocol v3 should be disabled if previous assignment is not ideal. 
Otherwise it means rebalancing is not finished and some nodes can see different 
local partition states at the moment of assignment calculation.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [ignite] ascherbakoff commented on a change in pull request #6942: IGNITE-9913

2019-10-12 Thread GitBox
ascherbakoff commented on a change in pull request #6942: IGNITE-9913
URL: https://github.com/apache/ignite/pull/6942#discussion_r334232912
 
 

 ##
 File path: 
modules/core/src/main/java/org/apache/ignite/internal/processors/cache/distributed/dht/preloader/GridDhtPartitionsExchangeFuture.java
 ##
 @@ -3826,10 +3952,14 @@ private void assignPartitionsStates() {
 }
 
 /**
- * Removes gaps in the local update counters. Gaps in update counters are 
possible on backup node when primary
- * failed to send update counter deltas to backup.
+ * Removes gaps in the local update counters. Affects specified node's 
primary partitions in case node specified and
+ * all local partitins otherwise.
+ *
+ * Gaps in update counters are possible on backup node when primary failed 
to send update counter deltas to backup.
+ *
+ * @param nodeId Failed node id, {@code null} in case of merged failures.
  */
-private void finalizePartitionCounters() {
+private void finalizePartitionCounters(UUID nodeId) {
 
 Review comment:
   @Nullable is missing


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [ignite] ascherbakoff commented on a change in pull request #6942: IGNITE-9913

2019-10-12 Thread GitBox
ascherbakoff commented on a change in pull request #6942: IGNITE-9913
URL: https://github.com/apache/ignite/pull/6942#discussion_r334228120
 
 

 ##
 File path: 
modules/core/src/main/java/org/apache/ignite/internal/managers/discovery/GridDiscoveryManager.java
 ##
 @@ -1798,6 +1799,24 @@ public int size() {
 return resolveDiscoCache(CU.cacheId(null), topVer).baselineNodes();
 }
 
+/**
+ * Compares baselines at different topologies.
+ *
+ * @param topVer1 Topology version 1.
+ * @param topVer2 Topology version 2.
+ */
+public boolean baselineChanged(AffinityTopologyVersion topVer1, 
AffinityTopologyVersion topVer2) {
+assert !topVer1.equals(topVer2);
+
+DiscoCache disco1 = discoCache(topVer1);
+DiscoCache disco2 = discoCache(topVer2);
+
+BaselineTopology baseline1 = disco1 != null ? 
disco1.state().baselineTopology() : null;
+BaselineTopology baseline2 = disco2 != null ? 
disco2.state().baselineTopology() : null;
+
+return !Objects.equals(baseline1, baseline2);
+}
+
 
 Review comment:
   Looks like we only care about baseline change comparing to previous version. 
Better to add a flag to DiscoCache indicating what baseline is changed 
comparing to previous version and get rid of internal collections comparison.
   See 
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager#createDiscoCache.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [ignite] ascherbakoff commented on a change in pull request #6942: IGNITE-9913

2019-10-12 Thread GitBox
ascherbakoff commented on a change in pull request #6942: IGNITE-9913
URL: https://github.com/apache/ignite/pull/6942#discussion_r334228034
 
 

 ##
 File path: 
modules/core/src/main/java/org/apache/ignite/internal/processors/cache/CacheAffinitySharedManager.java
 ##
 @@ -1617,6 +1617,23 @@ public void 
onServerJoinWithExchangeMergeProtocol(GridDhtPartitionsExchangeFutur
 return onReassignmentEnforced(fut);
 }
 
+/**
+ * Called on exchange initiated by baseline server node leave.
+ *
+ * @param fut Exchange future.
+ * @param crd Coordinator flag.
+ * @return {@code True} if affinity should be assigned by coordinator 
(local node for this case).
+ */
+public boolean onBaselineNodeLeft(final GridDhtPartitionsExchangeFuture 
fut, boolean crd) {
+assert (fut.events().hasServerLeft() && 
!fut.firstEvent().eventNode().isClient()) : fut.firstEvent();
+
+assert !fut.context().mergeExchanges();
+
+onReassignmentEnforced(fut);
+
+return true;
+}
+
 
 Review comment:
   crd argument is not used
   onReassignmentEnforced calls createAffinityDiffMessages which is not 
necessary for baseline node left scenario.
   I think it can be replaced only by call to 
initAffinityBasedOnPartitionsAvailability.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [ignite] ascherbakoff commented on a change in pull request #6942: IGNITE-9913

2019-10-12 Thread GitBox
ascherbakoff commented on a change in pull request #6942: IGNITE-9913
URL: https://github.com/apache/ignite/pull/6942#discussion_r334233186
 
 

 ##
 File path: 
modules/core/src/main/java/org/apache/ignite/internal/processors/cache/distributed/dht/preloader/GridDhtPartitionsExchangeFuture.java
 ##
 @@ -1357,7 +1370,9 @@ private ExchangeType onServerNodeEvent(boolean crd) 
throws IgniteCheckedExceptio
 
 exchCtx.events().warnNoAffinityNodes(cctx);
 
-centralizedAff = cctx.affinity().onCentralizedAffinityChange(this, 
crd);
+centralizedAff = exchCtx.baselineNodeLeft() ?
 
 Review comment:
   centralizedAff  is a part of old protocol. It's better not to mix old and 
new protocols, some refactoring is needed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [ignite] ascherbakoff commented on a change in pull request #6942: IGNITE-9913

2019-10-12 Thread GitBox
ascherbakoff commented on a change in pull request #6942: IGNITE-9913
URL: https://github.com/apache/ignite/pull/6942#discussion_r334233269
 
 

 ##
 File path: 
modules/core/src/main/java/org/apache/ignite/internal/processors/cache/distributed/dht/preloader/GridDhtPartitionsExchangeFuture.java
 ##
 @@ -2225,7 +2351,7 @@ private String exchangeTimingsLogMessage(String header, 
List timings) {
 
 boolean locNodeNotCrd = crd == null || !crd.isLocal();
 
-if (locNodeNotCrd && (serverNodeDiscoveryEvent() || 
localJoinExchange()))
+if ((locNodeNotCrd && (serverNodeDiscoveryEvent() || 
localJoinExchange())) || exchCtx.baselineNodeLeft())
 
 Review comment:
   Make sure partition loss policy works well with new protocol. I suggest 
removing nodes one by one under load until some subset of partitions will have 
no owners more.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services