Hi, that's it.
06.05.2020, 19:50, "Ivan Rakov" <ivan.glu...@gmail.com>: > Hi, > > IGNITE_WRITE_REBALANCE_PARTITION_DISTRIBUTION_THRESHOLD - threshold >> duration rebalance of cache group after which partitions distribution is >> output, set in milliseconds, default value is 10 minutes. > > Does it mean that if the rebalancing process took less than 10 minutes, > only a short version of the message (with supplier statistics) will show up? > > In general, I have no objections. > > On Mon, May 4, 2020 at 10:38 AM ткаленко кирилл <tkalkir...@yandex.ru> > wrote: > >> Hi, Igniters! >> >> I'd like to share a new small feature in AI [1]. >> >> Current rebalance logging does not allow you to quickly answer following >> questions: >> 1)How long was the balance(divided by supplier)? >> 2)How many records and bytes per supplier were rebalanced? >> 3)How many times did rebalance restart? >> 4)Which partitions were rebalanced and from which nodes did they receive >> them? >> 5)When did rebalance for all cache groups end? >> >> What you can see in logs now: >> >> 1)Starting rebalance with order of cache groups. >> Rebalancing scheduled [order=[ignite-sys-cache, grp1, grp0], >> top=AffinityTopologyVersion [topVer=2, minorTopVer=0], force=false, >> evt=NODE_JOINED, node=c2146a04-dc23-4bc9-870d-dfbb55c00001] >> >> 2)Start rebalance of cache group from a specific supplier, specifying >> partition ids and mode - historical or full. >> Starting rebalance routine [ignite-sys-cache, >> topVer=AffinityTopologyVersion [topVer=2, minorTopVer=0], >> supplier=8c525892-703b-4fc4-b28b-b2f139700000, fullPartitions=[0-99], >> histPartitions=[]] >> >> 3)Getting partial or complete partitions of cache group. >> Completed rebalancing [grp=ignite-sys-cache, >> supplier=8c525892-703b-4fc4-b28b-b2f139700000, >> topVer=AffinityTopologyVersion [topVer=5, minorTopVer=0], progress=1/2] >> Completed (final) rebalancing [grp=ignite-sys-cache, >> supplier=c2146a04-dc23-4bc9-870d-dfbb55c00001, >> topVer=AffinityTopologyVersion [topVer=5, minorTopVer=0], progress=2/2] >> >> 4)End rebalance of cache group. >> Completed rebalance future: RebalanceFuture [grp=CacheGroupContext >> [grp=ignite-sys-cache], topVer=AffinityTopologyVersion [topVer=2, >> minorTopVer=0], rebalanceId=1, routines=1, receivedBytes=1200, >> receivedKeys=0, partitionsLeft=0, startTime=1588519707607, endTime=-1, >> lastCancelledTime=-1] >> >> Rebalance statistics: >> >> To speed up rebalance analysis, statistics will be output for each cache >> group and total for all cache groups. >> If duration rebalance for cache group is greater than threshold value, >> partition distribution is output. >> Statistics will you to analyze duration of the balance for each supplier >> to understand which of them has been transmitting data for longest time. >> >> System properties are used to output statistics: >> >> IGNITE_QUIET - to output statistics, value must be false; >> IGNITE_WRITE_REBALANCE_PARTITION_DISTRIBUTION_THRESHOLD - threshold >> duration rebalance of cache group after which partitions distribution is >> output, set in milliseconds, default value is 10 minutes. >> >> Statistics examples: >> >> Successful full and historical rebalance of group cache, without >> partitions distribution. >> Rebalance information per cache group (successful rebalance): [id=3181548, >> name=grp1, startTime=2020-04-13 10:55:16,117, finishTime=2020-04-13 >> 10:55:16,127, d=10 ms, restarted=0] Supplier statistics: [nodeId=0, p=5, >> d=10 ms] [nodeId=1, p=5, d=10 ms] Aliases: p - partitions, e - entries, b - >> bytes, d - duration, h - historical, nodeId mapping >> (nodeId=id,consistentId) [0=rebalancing.RebalanceStatisticsTest1] >> [1=rebalancing.RebalanceStatisticsTest0] >> Rebalance information per cache group (successful rebalance): [id=3181547, >> name=grp0, startTime=2020-04-13 15:01:44,000, finishTime=2020-04-13 >> 15:01:44,116, d=116 ms, restarted=0] Supplier statistics: [nodeId=0, hp=10, >> he=300, hb=30267, d=116 ms] Aliases: p - partitions, e - entries, b - >> bytes, d - duration, h - historical, nodeId mapping >> (nodeId=id,consistentId) [0=rebalancing.RebalanceStatisticsTest0] >> >> Successful full and historical rebalance of group cache, with partitions >> distribution. >> Rebalance information per cache group (successful rebalance): [id=3181548, >> name=grp1, startTime=2020-04-13 10:55:16,117, finishTime=2020-04-13 >> 10:55:16,127, d=10 ms, restarted=0] Supplier statistics: [nodeId=0, p=5, >> d=10 ms] [nodeId=1, p=5, d=10 ms] Aliases: p - partitions, e - entries, b - >> bytes, d - duration, h - historical, nodeId mapping >> (nodeId=id,consistentId) [0=rebalancing.RebalanceStatisticsTest1] >> [1=rebalancing.RebalanceStatisticsTest0] Rebalance duration was greater >> than 5 ms, printing detailed information about partitions distribution >> (threshold can be changed by setting number of milliseconds into >> IGNITE_WRITE_REBALANCE_PARTITION_DISTRIBUTION_THRESHOLD) 0 = >> [0,bu,su],[1,bu],[2,pr,su] 1 = [0,bu,su],[1,bu],[2,pr,su] 2 = >> [0,bu,su],[1,bu],[2,pr,su] 3 = [0,bu,su],[1,bu],[2,pr,su] 4 = >> [0,bu,su],[1,bu],[2,pr,su] 5 = [0,bu,su],[1,bu],[2,pr,su] 6 = >> [0,bu,su],[1,bu],[2,pr,su] 7 = [0,bu,su],[1,bu],[2,pr,su] 8 = >> [0,bu,su],[1,bu],[2,pr,su] 9 = [0,bu,su],[1,bu],[2,pr,su] Aliases: pr - >> primary, bu - backup, su - supplier node, h - historical, nodeId mapping >> (nodeId=id,consistentId) [0=rebalancing.RebalanceStatisticsTest1] >> [1=rebalancing.RebalanceStatisticsTest2] >> [2=rebalancing.RebalanceStatisticsTest0] >> Rebalance information per cache group (successful rebalance): [id=3181547, >> name=grp0, startTime=2020-04-13 15:01:44,000, finishTime=2020-04-13 >> 15:01:44,116, d=116 ms, restarted=0] Supplier statistics: [nodeId=0, hp=10, >> he=300, hb=30267, d=116 ms] Aliases: p - partitions, e - entries, b - >> bytes, d - duration, h - historical, nodeId mapping >> (nodeId=id,consistentId) [0=rebalancing.RebalanceStatisticsTest0] Rebalance >> duration was greater than 100 ms, printing detailed information about >> partitions distribution (threshold can be changed by setting number of >> milliseconds into IGNITE_WRITE_REBALANCE_PARTITION_DISTRIBUTION_THRESHOLD) >> 0 = [0,pr,su],[1,bu],[2,bu] h 1 = [0,bu,su],[1,bu],[2,pr] h 2 = >> [0,pr,su],[1,bu],[2,bu] h 3 = [0,bu,su],[1,bu],[2,pr] h 4 = >> [0,pr,su],[1,bu],[2,bu] h 5 = [0,pr,su],[1,bu],[2,bu] h 6 = >> [0,bu,su],[1,bu],[2,pr] h 7 = [0,pr,su],[1,bu],[2,bu] h 8 = >> [0,pr,su],[1,bu],[2,bu] h 9 = [0,pr,su],[1,bu],[2,bu] h Aliases: pr - >> primary, bu - backup, su - supplier node, h - historical, nodeId mapping >> (nodeId=id,consistentId) [0=rebalancing.RebalanceStatisticsTest0] >> [1=rebalancing.RebalanceStatisticsTest2] >> [2=rebalancing.RebalanceStatisticsTest1] >> >> Interrupted rebalance of group cache. >> Rebalance information per cache group (interrupted rebalance): >> [id=644280849, name=default2, startTime=2020-04-13 14:55:24,969, >> finishTime=2020-04-13 14:55:24,969, d=0 ms, restarted=0] >> >> Total full and historical rebalance for all cache groups. >> Rebalance total information (including successful and not rebalances): >> [startTime=2020-04-13 10:55:18,726, finishTime=2020-04-13 10:55:18,780, >> d=54 ms] Supplier statistics: [nodeId=0, p=60, e=250, b=25000, d=54 ms] >> [nodeId=1, p=60, e=250, b=24945, d=54 ms] Aliases: p - partitions, e - >> entries, b - bytes, d - duration, h - historical, nodeId mapping >> (nodeId=id,consistentId) [0=rebalancing.RebalanceStatisticsTest1] >> [1=rebalancing.RebalanceStatisticsTest0] >> Rebalance total information (including successful and not rebalances): >> [startTime=2020-04-13 15:01:43,822, finishTime=2020-04-13 15:01:44,116, >> d=294 ms] Supplier statistics: [nodeId=0, hp=20, he=500, hb=50445, d=294 >> ms] Aliases: p - partitions, e - entries, b - bytes, d - duration, h - >> historical, nodeId mapping (nodeId=id,consistentId) >> [0=rebalancing.RebalanceStatisticsTest0] >> >> [1] - https://issues.apache.org/jira/browse/IGNITE-12080