Github user Ethanlm commented on the issue:

    https://github.com/apache/storm/pull/2270
  
    I did some experiments on ThroughputVsLatency (modified to add some 
Configs) and the initial results seem similar among `shuffle`, `localOrShuffle` 
and `localityAwareShuffle (LocalityASG)`. 
    
    Config:
    `TOPOLOGY.MESSAGE.TIMEOUT: 300`
    `TOPOLOGY_MAX_SPOUT_PENDING: 5000`
    `LoadAware` is enabled by default
    
    Env: Two openstack VM's, 8GB RAM, 4 VCPUs; 1Gbps Ethernet 
    
    Note:  
    1 All numbers in the tables are medians;
    2 The numbers fluctuated slightly every time I ran the experiments. The 
results shown below are sampled from repeated experiments. 
    
    #### Experiment1
    Normal network connection.
    This is trying to compare the performance in normal situation.
    
    Rate | numWorkers | numSpout | numSplit(Bolt) | totalTime(min) | 
ThroughputVsLatency | acked | acked/sec | failed | 99% | 99.90% | min | max | 
mean | stddev | user | sys | gc | mem
    -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | 
-- | -- | -- | --
    100000 | 4 | 16 | 16 | 5 | LocalityASG | 2,204,700 | 73,490.00 | 0 | 
69,726,109,695 | 72,410,464,255 | 51,271,172,096 | 74,423,730,175 | 
59,100,119,611.86 | 5,520,976,189.22 | 176,570 | 33,110 | 19,981 | 1,729.69
    100000 | 4 | 16 | 16 | 5 | LocalOrShuffle | 2,082,000 | 67,333.33 | 0 | 
66,102,231,039 | 66,538,438,655 | 50,096,766,976 | 66,806,874,111 | 
57,631,698,540.71 | 3,643,782,002.52 | 172,500 | 30,060 | 20,803 | 1,175.51
    100000 | 4 | 16 | 16 | 5 | Shuffle | 2,067,640 | 68,812.00 | 0 | 
72,611,790,847 | 73,752,641,535 | 50,298,093,568 | 74,557,947,903 | 
62,488,041,809.45 | 5,480,336,956.59 | 184,020 | 29,220 | 23,905 | 1,363.61
    30000 | 4 | 16 | 16 | 5 | LocalityASG | 906,300 | 30,210.00 | 0 | 
46,727,167 | 119,865,343 | 6,529,024 | 164,364,287 | 15,393,162.30 | 
8,306,088.53 | 102,760 | 46,790 | 1,825 | 498.63
    30000 | 4 | 16 | 16 | 5 | LocalOrShuffle | 906,580 | 30,219.33 | 0 | 
51,838,975 | 135,921,663 | 6,561,792 | 174,063,615 | 16,187,180.85 | 
9,488,799.34 | 103,680 | 46,240 | 1,832 | 742.3
    30000 | 4 | 16 | 16 | 5 | Shuffle | 906,360 | 30,206.00 | 0 | 45,318,143 | 
80,936,959 | 6,680,576 | 132,055,039 | 17,106,693.24 | 7,055,551.93 | 105,420 | 
47,040 | 1,759 | 424.65
    20000 | 4 | 16 | 16 | 5 | LocalityASG | 605,040 | 20,168.00 | 0 | 
31,965,183 | 49,840,127 | 5,500,928 | 96,534,527 | 13,949,066.47 | 4,487,713.39 
| 94,370 | 48,930 | 1,253 | 433.45
    20000 | 4 | 16 | 16 | 5 | LocalOrShuffle | 604,640 | 20,154.67 | 0 | 
32,161,791 | 87,621,631 | 5,525,504 | 124,518,399 | 14,090,342.83 | 
5,387,101.60 | 92,780 | 47,170 | 1,239 | 356.32
    20000 | 4 | 16 | 16 | 5 | Shuffle | 604,840 | 20,151.33 | 0 | 33,406,975 | 
70,909,951 | 4,399,104 | 98,697,215 | 14,103,326.48 | 5,074,175.87 | 90,870 | 
47,230 | 1,193 | 396.6
    10000 | 4 | 16 | 16 | 5 | LocalityASG | 302,260 | 10,072.00 | 0 | 
27,295,743 | 57,901,055 | 4,374,528 | 88,473,599 | 12,366,716.52 | 3,887,709.71 
| 75,670 | 50,950 | 796 | 347.49
    10000 | 4 | 16 | 16 | 5 | LocalOrShuffle | 302,340 | 10,077.33 | 0 | 
27,721,727 | 44,728,319 | 3,999,744 | 79,298,559 | 12,859,667.20 | 3,668,796.69 
| 77,890 | 52,570 | 752 | 349.92
    10000 | 4 | 16 | 16 | 5 | Shuffle | 302,260 | 10,074.00 | 0 | 29,655,039 | 
60,489,727 | 4,378,624 | 83,165,183 | 13,519,393.44 | 4,205,346.88 | 77,580 | 
52,300 | 769 | 367.62
    
    #### Experiment 2
    Add 10ms latency on both VMs:  `tc qdisc add dev eth0 root netem delay 
10ms`, which means 20ms latency in total.
    This is trying to simulate slow network connection
    
    Rate | numWorkers | numSpout | numSplit(Bolt) | totalTime(min) | 
ThroughputVsLatency | acked | acked/sec | failed | 99% | 99.90% | min | max | 
mean | stddev | user | sys | gc | mem
    -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | 
-- | -- | -- | --
    100000 | 4 | 16 | 16 | 5 | LocalityASG | 2,257,220 | 75,240.67 | 0 | 
51,942,260,735 | 53,317,992,447 | 37,446,746,112 | 53,821,308,927 | 
45,735,430,020.04 | 5,110,395,891.55 | 181,100 | 28,930 | 22,157 | 812.64
    100000 | 4 | 16 | 16 | 5 | LocalOrShuffle | 2,428,780 | 80,959.33 | 0 | 
61,773,709,311 | 62,746,787,839 | 3,118,465,024 | 63,082,332,159 | 
40,427,975,429.87 | 15,339,145,604.59 | 176,330 | 33,200 | 16,475 | 743.08
    100000 | 4 | 16 | 16 | 5 | Shuffle | 2,138,040 | 71,268.00 | 0 | 
59,726,888,959 | 60,733,521,919 | 40,500,199,424 | 61,438,164,991 | 
50,071,346,036.42 | 5,670,751,071.76 | 170,910 | 34,900 | 14,231 | 638.34
    30000 | 4 | 16 | 16 | 5 | LocalityASG | 908,140 | 30,271.33 | 0 | 
330,301,439 | 383,254,527 | 7,000,064 | 416,808,959 | 48,647,996.71 | 
55,147,277.57 | 105,160 | 44,800 | 1,975 | 579.67
    30000 | 4 | 16 | 16 | 5 | LocalOrShuffle | 902,680 | 30,089.33 | 0 | 
341,049,343 | 383,254,527 | 6,946,816 | 414,973,951 | 51,019,429.45 | 
58,367,846.16 | 104,980 | 47,240 | 1,895 | 642.05
    30000 | 4 | 16 | 16 | 5 | Shuffle | 908,860 | 30,295.33 | 0 | 375,914,495 | 
492,044,287 | 7,737,344 | 655,884,287 | 60,776,243.04 | 68,004,908.46 | 107,910 
| 45,240 | 1,938 | 422.43
    20000 | 4 | 16 | 16 | 5 | LocalityASG | 605,920 | 20,188.00 | 0 | 
323,485,695 | 386,138,111 | 5,238,784 | 427,294,719 | 45,021,622.26 | 
52,931,457.82 | 91,630 | 47,140 | 1,307 | 442.26
    20000 | 4 | 16 | 16 | 5 | LocalOrShuffle | 605,180 | 20,172.67 | 0 | 
339,476,479 | 391,905,279 | 4,431,872 | 425,721,855 | 46,271,845.50 | 
55,536,624.12 | 93,060 | 47,820 | 1,333 | 467.91
    20000 | 4 | 16 | 16 | 5 | Shuffle | 604,280 | 20,047.33 | 0 | 356,777,983 | 
432,799,743 | 4,493,312 | 576,192,511 | 52,029,378.68 | 57,552,110.96 | 93,570 
| 46,560 | 1,285 | 446.67
    10000 | 4 | 16 | 16 | 5 | LocalityASG | 301,200 | 10,029.33 | 0 | 
330,563,583 | 378,798,079 | 4,399,104 | 413,138,943 | 42,995,014.97 | 
50,182,495.91 | 77,450 | 47,930 | 776 | 301.93
    10000 | 4 | 16 | 16 | 5 | LocalOrShuffle | 303,240 | 10,108.00 | 0 | 
337,379,327 | 383,516,671 | 4,100,096 | 422,838,271 | 43,840,226.34 | 
52,874,295.96 | 80,080 | 47,990 | 764 | 274.21
    10000 | 4 | 16 | 16 | 5 | Shuffle | 300,440 | 10,012.00 | 0 | 333,971,455 | 
396,623,871 | 4,411,392 | 629,669,887 | 49,247,857.01 | 53,009,320.86 | 77,350 
| 46,420 | 726 | 286.33
    
    #### Experiment 3
    Add 10ms latency on both VMs. 
    This is trying to show the difference between `localOrShuffle` and 
`localityAwareShuffle`. Since we 4 `workers`, 4 `spout` and 3 `split (bolt)` 
here, one of the spouts needs to choose a bolt task on other worker/host. 
Current `localityAwareShuffle` implementation will choose the worker in the 
same host. `localOrShuffle` will shuffle among these three `split (bolt)`
    
    Rate | numWorkers | numSpout | numSplit(Bolt) | totalTime(min) | 
ThroughputVsLatency | acked | acked/sec | failed | 99% | 99.90% | min | max | 
mean | stddev | user | sys | gc | mem
    -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | 
-- | -- | -- | --
    100000 | 4 | 4 | 3 | 5 | LocalityASG | 2,178,680 | 72,622.67 | 0 | 
65,699,577,855 | 66,538,438,655 | 8,413,184 | 66,639,101,951 | 
28,354,892,840.64 | 24,151,617,972.76 | 140,830 | 36,650 | 5,041 | 288.98
    100000 | 4 | 4 | 3 | 5 | LocalOrShuffle | 2,260,940 | 75,364.67 | 0 | 
58,082,721,791 | 58,485,374,975 | 36,842,766,336 | 58,619,592,703 | 
45,712,216,573.06 | 5,811,071,103.74 | 145,400 | 35,490 | 6,218 | 304
    100000 | 4 | 4 | 3 | 5 | Shuffle | 1,998,040 | 66,601.33 | 0 | 
61,706,600,447 | 62,310,580,223 | 31,138,512,896 | 62,646,124,543 | 
43,860,244,218.07 | 9,311,454,441.44 | 137,820 | 37,530 | 4,677 | 321.52
    30000 | 4 | 4 | 3 | 5 | LocalityASG | 907,700 | 30,256.67 | 0 | 92,667,903 
| 124,125,183 | 6,455,296 | 148,504,575 | 33,165,419.48 | 10,665,312.03 | 
93,720 | 41,890 | 1,369 | 313.16
    30000 | 4 | 4 | 3 | 5 | LocalOrShuffle | 901,300 | 30,043.33 | 0 | 
87,162,879 | 123,535,359 | 7,098,368 | 192,675,839 | 34,214,948.18 | 
11,589,701.13 | 90,650 | 42,140 | 1,334 | 227.66
    30000 | 4 | 4 | 3 | 5 | Shuffle | 907,720 | 30,006.00 | 0 | 93,126,655 | 
135,921,663 | 7,286,784 | 237,764,607 | 37,632,076.72 | 13,370,589.38 | 92,140 
| 45,050 | 1,458 | 248.57
    20000 | 4 | 4 | 3 | 5 | LocalityASG | 600,640 | 20,021.33 | 0 | 79,495,167 
| 118,292,479 | 4,685,824 | 198,967,295 | 31,632,119.39 | 9,579,664.67 | 84,470 
| 45,580 | 1,038 | 272.13
    20000 | 4 | 4 | 3 | 5 | LocalOrShuffle | 605,200 | 20,173.33 | 0 | 
78,118,911 | 122,552,319 | 7,319,552 | 205,520,895 | 33,266,995.51 | 
10,770,593.95 | 84,670 | 45,980 | 1,108 | 277.52
    20000 | 4 | 4 | 3 | 5 | Shuffle | 605,160 | 20,172.00 | 0 | 84,934,655 | 
129,040,383 | 7,413,760 | 194,510,847 | 38,185,410.13 | 12,666,069.40 | 84,020 
| 43,870 | 1,039 | 248.27
    10000 | 4 | 4 | 3 | 5 | LocalityASG | 300,420 | 10,014.00 | 0 | 70,713,343 
| 118,554,623 | 4,151,296 | 156,368,895 | 30,923,105.91 | 8,611,777.75 | 72,840 
| 48,890 | 733 | 246.86
    10000 | 4 | 4 | 3 | 5 | LocalOrShuffle | 302,600 | 10,086.00 | 0 | 
71,630,847 | 118,882,303 | 3,786,752 | 154,664,959 | 32,406,076.56 | 
10,199,838.54 | 71,170 | 46,670 | 661 | 256.4
    10000 | 4 | 4 | 3 | 5 | Shuffle | 300,340 | 10,010.00 | 0 | 69,861,375 | 
126,615,551 | 5,586,944 | 155,058,175 | 36,010,152.88 | 12,085,641.80 | 75,850 
| 49,210 | 715 | 230.3
    
    The results above show performance similarity between `localOrShuffle` and 
`LocalityAwareShuffleGrouping` should be a good thing since in some certain 
situations they are only different on `chooseTasks()` function. And 
`LocalityAwareShuffleGrouping` is more general than `localOrShuffle`.
    
    I am still doing more performance testing. Posting these results here to 
hopefully get some feedback to help my further testing.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

Reply via email to