Hi,
I am trying to recover lost data in the event of a partition loss.
In my Ignite configuration native persistence is *off*.
I have registered an event listener for the EVT_CACHE_REBALANCE_PART_DATA_LOST
event. This listener fetches the list of lost partitions using the
cache.lostPartitions() method.
The issue is that the listener is invoked once per lost partition. So if 100
partitions are lost due to a single node termination, the listener is called
100 times, and only the later invocations return the complete list of lost
partitions.
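
For reference, this is roughly how I register the listener (a simplified
sketch; the cache name ASSET_GROUP_CACHE is from my setup and the logging is
only for illustration):

import java.util.Collection;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.events.CacheRebalancingEvent;
import org.apache.ignite.events.EventType;
import org.apache.ignite.lang.IgnitePredicate;

public class LostPartitionListener {
    public static void register(Ignite ignite) {
        IgniteCache<Object, Object> cache = ignite.cache("ASSET_GROUP_CACHE");

        // Note: EVT_CACHE_REBALANCE_PART_DATA_LOST must also be enabled via
        // IgniteConfiguration#setIncludeEventTypes for the listener to fire.
        IgnitePredicate<CacheRebalancingEvent> lsnr = evt -> {
            // Called once per lost partition; evt.partition() is the partition id.
            Collection<Integer> lost = cache.lostPartitions();
            System.out.println("Event for partition " + evt.partition()
                + ", lost partitions so far: " + lost);
            return true; // keep listening
        };

        ignite.events().localListen(lsnr, EventType.EVT_CACHE_REBALANCE_PART_DATA_LOST);
    }
}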

*Let's take a scenario:*
Started two server nodes, Node A and Node B. Started a cache in PARTITIONED
mode with the number of backups set to 0 in order to facilitate simulation
of partition loss scenarios (a configuration sketch follows after the
partition counts below).
Started an event listener on both nodes listening to the
EVT_CACHE_REBALANCE_PART_DATA_LOST event.

Number of partitions on node A = 500
Number of partitions on node B = 524
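
The cache is configured roughly like this (a simplified sketch; the
READ_WRITE_SAFE loss policy is an assumption on my side so that lost
partitions are tracked, the rest matches the setup above):

import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.cache.PartitionLossPolicy;
import org.apache.ignite.configuration.CacheConfiguration;

public class AssetGroupCacheConfig {
    public static CacheConfiguration<Object, Object> create() {
        CacheConfiguration<Object, Object> ccfg =
            new CacheConfiguration<>("ASSET_GROUP_CACHE");

        ccfg.setCacheMode(CacheMode.PARTITIONED);

        // No backups, so stopping one node loses the primary copies it owned.
        ccfg.setBackups(0);

        // Assumption: a *_SAFE loss policy so lost partitions are tracked
        // and reported by IgniteCache#lostPartitions().
        ccfg.setPartitionLossPolicy(PartitionLossPolicy.READ_WRITE_SAFE);

        return ccfg;
    }
}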

Now stop Node B. After Node B terminates, the listener running on Node A is
invoked multiple times, once per partition.
I have printed logs in the listener:

primary partition size after loss:1024
*Lost partion Nos.1*
IgniteThread [compositeRwLockIdx=1, stripe=-1, plc=-1,
name=exchange-worker-#42%springDataNode%]::*[0]*
Event Detail:CacheRebalancingEvent [cacheName=ASSET_GROUP_CACHE, part=0,
discoNode=TcpDiscoveryNode [id=1bb17828-3556-499f-a4e6-98cfdc1d11fb,
addrs=[0:0:0:0:0:0:0:1, 10.113.14.98, 127.0.0.1], sockAddrs=[],
discPort=47501, order=2, intOrder=2, lastExchangeTime=1568357181089,
loc=false, ver=2.6.0#20180710-sha1:669feacc, isClient=false],
discoEvtType=12, discoTs=1568357376683, discoEvtName=NODE_FAILED,
nodeId8=499400ac, msg=Cache rebalancing event.,
type=CACHE_REBALANCE_PART_DATA_LOST, tstamp=1568357376714]
primary partition size after loss:1024
*Lost partion Nos.2*
IgniteThread [compositeRwLockIdx=1, stripe=-1, plc=-1,
name=exchange-worker-#42%springDataNode%]::*[0, 1]*
Event Detail:CacheRebalancingEvent [cacheName=ASSET_GROUP_CACHE, part=1,
discoNode=TcpDiscoveryNode [id=1bb17828-3556-499f-a4e6-98cfdc1d11fb,
addrs=[0:0:0:0:0:0:0:1, 10.113.14.98, 127.0.0.1], sockAddrs=[],
discPort=47501, order=2, intOrder=2, lastExchangeTime=1568357181089,
loc=false, ver=2.6.0#20180710-sha1:669feacc, isClient=false],
discoEvtType=12, discoTs=1568357376683, discoEvtName=NODE_FAILED,
nodeId8=499400ac, msg=Cache rebalancing event.,
type=CACHE_REBALANCE_PART_DATA_LOST, tstamp=1568357376726]
primary partition size after loss:1024
*Lost partion Nos.3*
IgniteThread [compositeRwLockIdx=1, stripe=-1, plc=-1,
name=exchange-worker-#42%springDataNode%]::*[0, 1, 2]*
Event Detail:CacheRebalancingEvent [cacheName=ASSET_GROUP_CACHE, part=2,
discoNode=TcpDiscoveryNode [id=1bb17828-3556-499f-a4e6-98cfdc1d11fb,
addrs=[0:0:0:0:0:0:0:1, 10.113.14.98, 127.0.0.1], sockAddrs=[],
discPort=47501, order=2, intOrder=2, lastExchangeTime=1568357181089,
loc=false, ver=2.6.0#20180710-sha1:669feacc, isClient=false],
discoEvtType=12, discoTs=1568357376683, discoEvtName=NODE_FAILED,
nodeId8=499400ac, msg=Cache rebalancing event.,
type=CACHE_REBALANCE_PART_DATA_LOST, tstamp=1568357376726]
primary partition size after loss:1024
*Lost partion Nos.4*
IgniteThread [compositeRwLockIdx=1, stripe=-1, plc=-1,
name=exchange-worker-#42%springDataNode%]::*[0, 1, 2, 4]*
Event Detail:CacheRebalancingEvent [cacheName=ASSET_GROUP_CACHE, part=4,
discoNode=TcpDiscoveryNode [id=1bb17828-3556-499f-a4e6-98cfdc1d11fb,
addrs=[0:0:0:0:0:0:0:1, 10.113.14.98, 127.0.0.1], sockAddrs=[],
discPort=47501, order=2, intOrder=2, lastExchangeTime=1568357181089,
loc=false, ver=2.6.0#20180710-sha1:669feacc, isClient=false],
discoEvtType=12, discoTs=1568357376683, discoEvtName=NODE_FAILED,
nodeId8=499400ac, msg=Cache rebalancing event.,
type=CACHE_REBALANCE_PART_DATA_LOST, tstamp=1568357376736]
primary partition size after loss:1024
*Lost partion Nos.5*
*.*
*.*
*.*
*.*
IgniteThread [compositeRwLockIdx=1, stripe=-1, plc=-1,
name=exchange-worker-#42%springDataNode%]::[0, 1, 2, 4, 5, 6, 7, 11, 13,
17, 22, 26, 28, 29, 30, 33, 34, 37, 38, 41, 43, 45, 47, 48, 49, 50, 55, 58,
61, 62, 64, 65, 68, 70, 71, 75, 77, 79, 81, 82, 85, 87, 88, 89, 90, 93,
100, 101, 102, 104, 110, 112, 114, 116, 121, 123, 125, 126, 132, 133, 135,
137, 138, 139, 140, 144, 145, 146, 147, 149, 150, 151, 154, 156, 157, 158,
163, 164, 165, 169, 170, 172, 173, 176, 178, 180, 182, 183, 184, 185, 195,
196, 198, 199, 203, 204, 212, 213, 215, 217, 219, 220, 222, 223, 224, 226,
227, 230, 233, 234, 236, 237, 240, 242, 245, 248, 250, 251, 253, 255, 257,
258, 263, 265, 266, 267, 269, 270, 272, 273, 275, 276, 277, 278, 281, 282,
283, 287, 288, 292, 293, 295, 296, 297, 298, 300, 301, 302, 305, 308, 309,
310, 311, 313, 314, 315, 318, 319, 320, 322, 323, 324, 326, 327, 328, 329,
330, 331, 332, 333, 336, 340, 342, 344, 347, 348, 349, 351, 352, 353, 354,
355, 357, 362, 364, 369, 370, 371, 373, 374, 375, 376, 380, 382, 383, 387,
389, 394, 395, 396, 397, 398, 401, 402, 403, 407, 408, 409, 410, 411, 412,
413, 416, 417, 421, 424, 425, 427, 430, 431, 433, 435, 437, 438, 439, 440,
441, 442, 443, 445, 446, 452, 454, 455, 456, 459, 461, 463, 466, 470, 472,
474, 475, 476, 480, 481, 482, 484, 485, 489, 492, 494, 495, 496, 497, 498,
499, 501, 502, 503, 504, 505, 508, 510, 511, 512, 513, 514, 515, 516, 519,
523, 525, 526, 527, 529, 530, 531, 532, 535, 536, 539, 540, 541, 543, 545,
546, 550, 552, 553, 555, 557, 560, 569, 572, 573, 575, 576, 579, 582, 589,
591, 593, 594, 597, 599, 602, 603, 604, 605, 607, 608, 610, 612, 613, 614,
615, 616, 617, 619, 622, 624, 625, 626, 627, 630, 631, 632, 633, 634, 635,
636, 637, 638, 639, 640, 641, 642, 643, 645, 646, 647, 648, 649, 652, 653,
654, 656, 657, 660, 662, 663, 666, 668, 669, 670, 671, 679, 681, 683, 686,
688, 691, 693, 698, 701, 702, 703, 705, 706, 709, 712, 713, 716, 717, 719,
721, 723, 726, 730, 737, 738, 740, 741, 742, 745, 747, 750, 752, 755, 756,
759, 760, 761, 763, 764, 765, 766, 767, 768, 770, 771, 772, 777, 779, 785,
786, 789, 790, 792, 793, 794, 799, 801, 804, 811, 816, 818, 822, 823, 824,
825, 826, 827, 832, 833, 836, 838, 840, 841, 843, 844, 846, 850, 851, 852,
853, 855, 856, 858, 862, 864, 867, 872, 873, 876, 877, 878, 879, 883, 884,
886, 887, 890, 892, 895, 897, 898, 899, 900, 902, 903, 904, 905, 906, 907,
908, 910, 914, 916, 918, 919, 920, 921, 922, 925, 926, 928, 929, 933, 935,
936, 939, 940, 943, 945, 950, 951, 952, 953, 960, 961, 963, 964, 966, 967,
972, 973, 975, 977, 979, 980, 982, 983, 984, 985, 987, 989, 991, 992, 995,
996, 999, 1002, 1003, 1005, 1007, 1011, 1014, 1015, 1016, 1018, 1020, 1021]
Event Detail:CacheRebalancingEvent [cacheName=ASSET_GROUP_CACHE, part=*412*,
discoNode=TcpDiscoveryNode [id=1bb17828-3556-499f-a4e6-98cfdc1d11fb,
addrs=[0:0:0:0:0:0:0:1, 10.113.14.98, 127.0.0.1], sockAddrs=[],
discPort=47501, order=2, intOrder=2, lastExchangeTime=1568357181089,
loc=false, ver=2.6.0#20180710-sha1:669feacc, isClient=false],
discoEvtType=12, discoTs=1568357376683, discoEvtName=NODE_FAILED,
nodeId8=499400ac, msg=Cache rebalancing event.,
type=CACHE_REBALANCE_PART_DATA_LOST, tstamp=1568357423500]
primary partition size after loss:1024

*Lost partion Nos.524*
IgniteThread [compositeRwLockIdx=1, stripe=-1, plc=-1,
name=exchange-worker-#42%springDataNode%]::[0, 1, 2, 4, 5, 6, 7, 11, 13,
17, 22, 26, 28, 29, 30, 33, 34, 37, 38, 41, 43, 45, 47, 48, 49, 50, 55, 58,
61, 62, 64, 65, 68, 70, 71, 75, 77, 79, 81, 82, 85, 87, 88, 89, 90, 93,
100, 101, 102, 104, 110, 112, 114, 116, 121, 123, 125, 126, 132, 133, 135,
137, 138, 139, 140, 144, 145, 146, 147, 149, 150, 151, 154, 156, 157, 158,
163, 164, 165, 169, 170, 172, 173, 176, 178, 180, 182, 183, 184, 185, 195,
196, 198, 199, 203, 204, 212, 213, 215, 217, 219, 220, 222, 223, 224, 226,
227, 230, 233, 234, 236, 237, 240, 242, 245, 248, 250, 251, 253, 255, 257,
258, 263, 265, 266, 267, 269, 270, 272, 273, 275, 276, 277, 278, 281, 282,
283, 287, 288, 292, 293, 295, 296, 297, 298, 300, 301, 302, 305, 308, 309,
310, 311, 313, 314, 315, 318, 319, 320, 322, 323, 324, 326, 327, 328, 329,
330, 331, 332, 333, 336, 340, 342, 344, 347, 348, 349, 351, 352, 353, 354,
355, 357, 362, 364, 369, 370, 371, 373, 374, 375, 376, 380, 382, 383, 387,
389, 394, 395, 396, 397, 398, 401, 402, 403, 407, 408, 409, 410, 411, 412,
413, 416, 417, 421, 424, 425, 427, 430, 431, 433, 435, 437, 438, 439, 440,
441, 442, 443, 445, 446, 452, 454, 455, 456, 459, 461, 463, 466, 470, 472,
474, 475, 476, 480, 481, 482, 484, 485, 489, 492, 494, 495, 496, 497, 498,
499, 501, 502, 503, 504, 505, 508, 510, 511, 512, 513, 514, 515, 516, 519,
523, 525, 526, 527, 529, 530, 531, 532, 535, 536, 539, 540, 541, 543, 545,
546, 550, 552, 553, 555, 557, 560, 569, 572, 573, 575, 576, 579, 582, 589,
591, 593, 594, 597, 599, 602, 603, 604, 605, 607, 608, 610, 612, 613, 614,
615, 616, 617, 619, 622, 624, 625, 626, 627, 630, 631, 632, 633, 634, 635,
636, 637, 638, 639, 640, 641, 642, 643, 645, 646, 647, 648, 649, 652, 653,
654, 656, 657, 660, 662, 663, 666, 668, 669, 670, 671, 679, 681, 683, 686,
688, 691, 693, 698, 701, 702, 703, 705, 706, 709, 712, 713, 716, 717, 719,
721, 723, 726, 730, 737, 738, 740, 741, 742, 745, 747, 750, 752, 755, 756,
759, 760, 761, 763, 764, 765, 766, 767, 768, 770, 771, 772, 777, 779, 785,
786, 789, 790, 792, 793, 794, 799, 801, 804, 811, 816, 818, 822, 823, 824,
825, 826, 827, 832, 833, 836, 838, 840, 841, 843, 844, 846, 850, 851, 852,
853, 855, 856, 858, 862, 864, 867, 872, 873, 876, 877, 878, 879, 883, 884,
886, 887, 890, 892, 895, 897, 898, 899, 900, 902, 903, 904, 905, 906, 907,
908, 910, 914, 916, 918, 919, 920, 921, 922, 925, 926, 928, 929, 933, 935,
936, 939, 940, 943, 945, 950, 951, 952, 953, 960, 961, 963, 964, 966, 967,
972, 973, 975, 977, 979, 980, 982, 983, 984, 985, 987, 989, 991, 992, 995,
996, 999, 1002, 1003, 1005, 1007, 1011, 1014, 1015, 1016, 1018, 1020, 1021]
Event Detail:CacheRebalancingEvent [cacheName=ASSET_GROUP_CACHE, part=*413*,
discoNode=TcpDiscoveryNode [id=1bb17828-3556-499f-a4e6-98cfdc1d11fb,
addrs=[0:0:0:0:0:0:0:1, 10.113.14.98, 127.0.0.1], sockAddrs=[],
discPort=47501, order=2, intOrder=2, lastExchangeTime=1568357181089,
loc=false, ver=2.6.0#20180710-sha1:669feacc, isClient=false],
discoEvtType=12, discoTs=1568357376683, discoEvtName=NODE_FAILED,
nodeId8=499400ac, msg=Cache rebalancing event.,
type=CACHE_REBALANCE_PART_DATA_LOST, tstamp=1568357423500]
primary partition size after loss:1024

*Lost partion No.524*

*The number of lost partitions increments on each consecutive event
invocation, and only the last calls to the listener receive the complete list
of lost partitions.*
*Questions:*
*1) Is there any way to get the complete list of lost partitions? I want to
start cache loading for these partitions, and it is difficult to determine
when to trigger cache loading because the listener is called per partition.*
*2) I want to reset, using resetLostPartitions(), only those partitions that
I have handled in the partition-lost event listener.*
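
This is roughly how I intend to trigger the reload and reset once the full
list is known (a simplified sketch; loadLostPartitions() is a hypothetical
placeholder for my own cache-loading logic, and as far as I can tell
resetLostPartitions() only accepts cache names, not partition ids):

import java.util.Collection;
import java.util.Collections;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;

public class LostPartitionRecovery {
    private static final String CACHE_NAME = "ASSET_GROUP_CACHE";

    public static void recover(Ignite ignite) {
        IgniteCache<Object, Object> cache = ignite.cache(CACHE_NAME);

        // Snapshot of the lost partitions at this point in time.
        Collection<Integer> lost = cache.lostPartitions();

        // Hypothetical helper: reload data for exactly these partitions
        // from the external store.
        loadLostPartitions(cache, lost);

        // Clears the "lost" state; the API works per cache, not per partition.
        ignite.resetLostPartitions(Collections.singleton(CACHE_NAME));
    }

    private static void loadLostPartitions(IgniteCache<Object, Object> cache,
        Collection<Integer> parts) {
        // Placeholder for per-partition cache-loading logic.
    }
}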

Thanks,
Akash
