Github user revans2 commented on the pull request:

    https://github.com/apache/storm/pull/765#issuecomment-147124664
  
    I have some new test results.  I did a comparison of several different 
branches.  I looked at this branch, the upgraded-disruptor branch #750, 
STORM-855 #694, and apache-master 0.11.0-SNAPSHOT 
(04cf3f6162ce6fdd1ec13b758222d889dafd5749).  I had to make a few modifications 
to get my test to work.  I applied the following patch 
https://gist.github.com/revans2/84301ef0fde0dc4fbe44 to each of the branches.  
For STORM-855 I had to modify the test a bit so it would optionally do 
batching.  In that case batching was enabled on all streams and all spouts and 
bolts.
    
    I then ran the test at various throughputs (100, 200, 400, 800, 1600, 3200,
6400, 10000, 12800, 25600, and possibly a few others when looking for the
maximum throughput) and at different batch sizes.
    
    Each test ran for 5 minutes.  Here are the results, excluding the tests
where the worker could not keep up with the rate.
    
    | 99%-ile ns | 99.9%-ile ns | throughput | branch-batch | mean latency ns | avg service latency ms | std-dev ns |
    |---|---|---|---|---|---|---|
    | 2,613,247 | 4,673,535 | 100 | STORM-855-0 | 2,006,347.25 | 1.26 | 2,675,778.36 |
    | 2,617,343 | 4,423,679 | 200 | STORM-855-0 | 1,991,238.45 | 1.29 | 2,024,687.45 |
    | 2,623,487 | 5,619,711 | 400 | STORM-855-0 | 1,999,926.81 | 1.24 | 1,778,335.92 |
    | 2,627,583 | 4,603,903 | 1600 | STORM-855-0 | 1,971,888.24 | 1.30 | 893,085.40 |
    | 2,635,775 | 8,560,639 | 800 | STORM-855-0 | 2,010,286.65 | 1.35 | 2,134,795.12 |
    | 2,654,207 | 302,252,031 | 3200 | STORM-855-0 | 2,942,360.75 | 2.13 | 16,676,136.60 |
    | 2,684,927 | 124,190,719 | 3200 | batch-v2-1 | 2,154,234.45 | 1.41 | 6,219,057.66 |
    | 2,701,311 | 349,700,095 | 5000 | batch-v2-1 | 2,921,661.67 | 1.78 | 18,274,805.30 |
    | 2,715,647 | 7,356,415 | 100 | storm-base-1 | 2,092,991.53 | 1.30 | 2,447,956.21 |
    | 2,723,839 | 4,587,519 | 400 | storm-base-1 | 2,082,835.21 | 1.31 | 1,978,424.49 |
    | 2,723,839 | 6,049,791 | 100 | dist-upgrade-1 | 2,091,407.68 | 1.31 | 2,222,977.89 |
    | 2,725,887 | 10,403,839 | 1600 | batch-v2-1 | 2,010,694.30 | 1.27 | 2,095,223.90 |
    | 2,725,887 | 4,607,999 | 200 | storm-base-1 | 2,074,784.50 | 1.30 | 1,951,564.93 |
    | 2,727,935 | 4,513,791 | 200 | dist-upgrade-1 | 2,082,025.31 | 1.33 | 2,057,591.08 |
    | 2,729,983 | 4,182,015 | 400 | dist-upgrade-1 | 2,056,282.29 | 1.43 | 862,428.67 |
    | 2,732,031 | 4,632,575 | 800 | storm-base-1 | 2,092,514.39 | 1.27 | 2,231,550.66 |
    | 2,734,079 | 4,472,831 | 800 | dist-upgrade-1 | 2,095,994.08 | 1.28 | 1,870,953.62 |
    | 2,740,223 | 4,192,255 | 200 | batch-v2-1 | 2,011,025.19 | 1.21 | 911,556.19 |
    | 2,742,271 | 4,726,783 | 1600 | storm-base-1 | 2,089,581.40 | 1.35 | 2,410,668.79 |
    | 2,748,415 | 4,444,159 | 400 | batch-v2-1 | 2,055,600.78 | 1.34 | 1,729,257.92 |
    | 2,748,415 | 4,575,231 | 100 | batch-v2-1 | 2,035,920.21 | 1.31 | 1,213,874.52 |
    | 2,754,559 | 16,875,519 | 1600 | dist-upgrade-1 | 2,098,441.13 | 1.35 | 2,279,870.41 |
    | 2,754,559 | 3,969,023 | 800 | batch-v2-1 | 2,026,222.88 | 1.29 | 767,491.71 |
    | 2,793,471 | 53,477,375 | 3200 | storm-base-1 | 2,147,360.05 | 1.42 | 3,668,366.37 |
    | 2,801,663 | 147,062,783 | 3200 | dist-upgrade-1 | 2,358,863.31 | 1.59 | 7,574,577.81 |
    | 13,344,767 | 180,879,359 | 6400 | batch-v2-100 | 11,319,553.69 | 10.62 | 7,777,381.54 |
    | 13,369,343 | 15,122,431 | 3200 | batch-v2-100 | 10,699,832.23 | 10.02 | 1,623,949.38 |
    | 13,418,495 | 15,392,767 | 800 | batch-v2-100 | 10,589,813.17 | 9.86 | 2,439,134.80 |
    | 13,426,687 | 14,680,063 | 400 | batch-v2-100 | 10,738,973.68 | 10.03 | 2,298,229.99 |
    | 13,484,031 | 14,368,767 | 200 | batch-v2-100 | 10,941,653.28 | 10.20 | 2,471,899.43 |
    | 13,508,607 | 14,262,271 | 100 | batch-v2-100 | 11,099,257.68 | 10.35 | 1,658,054.66 |
    | 13,524,991 | 14,376,959 | 1600 | batch-v2-100 | 10,723,471.83 | 10.00 | 1,477,621.07 |
    | 346,554,367 | 977,272,831 | 12800 | batch-v2-100 | 18,596,303.93 | 15.59 | 78,326,501.83 |
    | 710,934,527 | 827,326,463 | 4000 | STORM-855-100 | 351,305,653.90 | 339.28 | 141,283,307.30 |
    | 783,286,271 | 1,268,776,959 | 5000 | STORM-855-100 | 332,417,358.65 | 312.07 | 139,760,316.82 |
    | 888,668,159 | 1,022,361,599 | 3200 | STORM-855-100 | 445,646,342.60 | 431.55 | 179,065,279.65 |
    | 940,048,383 | 1,363,148,799 | 6400 | storm-base-1 | 20,225,300.17 | 17.17 | 134,848,974.52 |
    | 1,043,333,119 | 1,409,286,143 | 10000 | batch-v2-1 | 22,750,840.18 | 6.13 | 146,235,076.73 |
    | 1,209,008,127 | 1,786,773,503 | 6400 | dist-upgrade-1 | 28,588,397.01 | 24.70 | 181,801,409.69 |
    | 1,747,976,191 | 1,946,157,055 | 1600 | STORM-855-100 | 738,741,774.85 | 734.75 | 374,194,675.56 |
    | 2,642,411,519 | 3,124,756,479 | 20000 | batch-v2-100 | 133,706,248.88 | 51.67 | 497,027,226.45 |
    | 3,374,317,567 | 3,892,314,111 | 10000 | dist-upgrade-1 | 141,866,760.39 | 69.39 | 589,014,777.73 |
    | 3,447,717,887 | 3,869,245,439 | 10000 | storm-base-1 | 139,149,514.03 | 56.45 | 609,509,456.98 |
    | 3,456,106,495 | 3,953,131,519 | 22000 | batch-v2-100 | 274,785,584.11 | 93.37 | 743,434,065.83 |
    | 3,512,729,599 | 3,898,605,567 | 800 | STORM-855-100 | 1,354,193,514.47 | 1,361.58 | 779,667,263.64 |
    | 3,963,617,279 | 4,416,602,111 | 5500 | STORM-855-100 | 450,364,286.22 | 415.96 | 575,017,536.40 |
    | 4,185,915,391 | 5,347,737,599 | 4500 | STORM-855-0 | 366,268,233.66 | 259.94 | 995,928,429.75 |
    | 4,919,918,591 | 5,582,618,623 | 6000 | STORM-855-100 | 534,520,242.96 | 497.47 | 758,754,139.61 |
    | 7,071,596,543 | 7,843,348,479 | 400 | STORM-855-100 | 2,652,137,010.52 | 2,630.51 | 1,589,666,333.78 |
    | 14,159,970,303 | 15,653,142,527 | 200 | STORM-855-100 | 5,202,877,719.25 | 5,206.33 | 3,199,275,795.66 |
    | 27,648,851,967 | 31,205,621,759 | 100 | STORM-855-100 | 10,201,124,134.76 | 10,169.37 | 6,289,786,882.10 |
    
    
    I then filtered the list to show the maximum throughput for a given
latency, using several different latency measures.
    
    99th percentile:
    
    | 99%-ile ns | 99.9%-ile ns | throughput | branch-batch | mean latency ns | avg service latency ms | std-dev ns |
    |---|---|---|---|---|---|---|
    | 2,613,247 | 4,673,535 | 100 | STORM-855-0 | 2,006,347.25 | 1.26 | 2,675,778.36 |
    | 2,617,343 | 4,423,679 | 200 | STORM-855-0 | 1,991,238.45 | 1.29 | 2,024,687.45 |
    | 2,623,487 | 5,619,711 | 400 | STORM-855-0 | 1,999,926.81 | 1.24 | 1,778,335.92 |
    | 2,627,583 | 4,603,903 | 1600 | STORM-855-0 | 1,971,888.24 | 1.30 | 893,085.40 |
    | 2,654,207 | 302,252,031 | 3200 | STORM-855-0 | 2,942,360.75 | 2.13 | 16,676,136.60 |
    | 2,701,311 | 349,700,095 | 5000 | batch-v2-1 | 2,921,661.67 | 1.78 | 18,274,805.30 |
    | 13,344,767 | 180,879,359 | 6400 | batch-v2-100 | 11,319,553.69 | 10.62 | 7,777,381.54 |
    | 346,554,367 | 977,272,831 | 12800 | batch-v2-100 | 18,596,303.93 | 15.59 | 78,326,501.83 |
    | 2,642,411,519 | 3,124,756,479 | 20000 | batch-v2-100 | 133,706,248.88 | 51.67 | 497,027,226.45 |
    | 3,456,106,495 | 3,953,131,519 | 22000 | batch-v2-100 | 274,785,584.11 | 93.37 | 743,434,065.83 |
    
    99.9th percentile:
    
    | 99%-ile ns | 99.9%-ile ns | throughput | branch-batch | mean latency ns | avg service latency ms | std-dev ns |
    |---|---|---|---|---|---|---|
    | 2,754,559 | 3,969,023 | 800 | batch-v2-1 | 2,026,222.88 | 1.29 | 767,491.71 |
    | 2,627,583 | 4,603,903 | 1600 | STORM-855-0 | 1,971,888.24 | 1.30 | 893,085.40 |
    | 13,369,343 | 15,122,431 | 3200 | batch-v2-100 | 10,699,832.23 | 10.02 | 1,623,949.38 |
    | 13,344,767 | 180,879,359 | 6400 | batch-v2-100 | 11,319,553.69 | 10.62 | 7,777,381.54 |
    | 346,554,367 | 977,272,831 | 12800 | batch-v2-100 | 18,596,303.93 | 15.59 | 78,326,501.83 |
    | 2,642,411,519 | 3,124,756,479 | 20000 | batch-v2-100 | 133,706,248.88 | 51.67 | 497,027,226.45 |
    | 3,456,106,495 | 3,953,131,519 | 22000 | batch-v2-100 | 274,785,584.11 | 93.37 | 743,434,065.83 |
    
    mean latency:
    
    | 99%-ile ns | 99.9%-ile ns | throughput | branch-batch | mean latency ns | avg service latency ms | std-dev ns |
    |---|---|---|---|---|---|---|
    | 2,627,583 | 4,603,903 | 1600 | STORM-855-0 | 1,971,888.24 | 1.30 | 893,085.40 |
    | 2,793,471 | 53,477,375 | 3200 | storm-base-1 | 2,147,360.05 | 1.42 | 3,668,366.37 |
    | 2,701,311 | 349,700,095 | 5000 | batch-v2-1 | 2,921,661.67 | 1.78 | 18,274,805.30 |
    | 13,344,767 | 180,879,359 | 6400 | batch-v2-100 | 11,319,553.69 | 10.62 | 7,777,381.54 |
    | 346,554,367 | 977,272,831 | 12800 | batch-v2-100 | 18,596,303.93 | 15.59 | 78,326,501.83 |
    | 2,642,411,519 | 3,124,756,479 | 20000 | batch-v2-100 | 133,706,248.88 | 51.67 | 497,027,226.45 |
    | 3,456,106,495 | 3,953,131,519 | 22000 | batch-v2-100 | 274,785,584.11 | 93.37 | 743,434,065.83 |
    
    service latency ms (storm's complete latency):
    
    | 99%-ile ns | 99.9%-ile ns | throughput | branch-batch | mean latency ns | avg service latency ms | std-dev ns |
    |---|---|---|---|---|---|---|
    | 2,740,223 | 4,192,255 | 200 | batch-v2-1 | 2,011,025.19 | 1.21 | 911,556.19 |
    | 2,623,487 | 5,619,711 | 400 | STORM-855-0 | 1,999,926.81 | 1.24 | 1,778,335.92 |
    | 2,725,887 | 10,403,839 | 1600 | batch-v2-1 | 2,010,694.30 | 1.27 | 2,095,223.90 |
    | 2,684,927 | 124,190,719 | 3200 | batch-v2-1 | 2,154,234.45 | 1.41 | 6,219,057.66 |
    | 2,701,311 | 349,700,095 | 5000 | batch-v2-1 | 2,921,661.67 | 1.78 | 18,274,805.30 |
    | 1,043,333,119 | 1,409,286,143 | 10000 | batch-v2-1 | 22,750,840.18 | 6.13 | 146,235,076.73 |
    | 346,554,367 | 977,272,831 | 12800 | batch-v2-100 | 18,596,303.93 | 15.59 | 78,326,501.83 |
    | 2,642,411,519 | 3,124,756,479 | 20000 | batch-v2-100 | 133,706,248.88 | 51.67 | 497,027,226.45 |
    | 3,456,106,495 | 3,953,131,519 | 22000 | batch-v2-100 | 274,785,584.11 | 93.37 | 743,434,065.83 |
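
    The filtering behind the four tables above can be sketched roughly as
follows.  This is a minimal illustration, not the script that produced the
tables: the function name, the dictionary keys, and the `SAMPLE_ROWS` subset
are assumptions made for the example.  A row is kept only if its throughput is
higher than that of every row with a lower latency.

```python
from typing import Dict, List

# A few rows copied from the full results table (99%-ile latency in ns).
SAMPLE_ROWS: List[Dict] = [
    {"branch": "STORM-855-0", "throughput": 100, "p99_ns": 2_613_247},
    {"branch": "STORM-855-0", "throughput": 1600, "p99_ns": 2_627_583},
    {"branch": "STORM-855-0", "throughput": 800, "p99_ns": 2_635_775},
    {"branch": "batch-v2-1", "throughput": 5000, "p99_ns": 2_701_311},
    {"branch": "storm-base-1", "throughput": 3200, "p99_ns": 2_793_471},
    {"branch": "batch-v2-100", "throughput": 6400, "p99_ns": 13_344_767},
]


def max_throughput_frontier(rows: List[Dict], latency_key: str) -> List[Dict]:
    """Return the rows that give the best throughput for their latency.

    Rows are visited in order of increasing latency; a row survives only if
    its throughput beats every row seen so far.
    """
    frontier = []
    best_throughput = float("-inf")
    # On equal latency, visit the higher-throughput row first.
    ordered = sorted(rows, key=lambda r: (r[latency_key], -r["throughput"]))
    for row in ordered:
        if row["throughput"] > best_throughput:
            frontier.append(row)
            best_throughput = row["throughput"]
    return frontier


if __name__ == "__main__":
    for row in max_throughput_frontier(SAMPLE_ROWS, "p99_ns"):
        print(row["p99_ns"], row["throughput"], row["branch"])
```

    Passing a different `latency_key` (for example a mean or 99.9%-ile column)
gives the corresponding table for that measure.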
    
    I also looked at roughly the maximum throughput each branch could handle.
    
    | branch-batch | throughput | mean latency ns | 99%-ile latency ns |
    |---|---|---|---|
    | STORM-855-0 | 4,500 | 366,268,233.66 | 4,185,915,391 |
    | STORM-855-100 | 5,500 | 450,364,286.22 | 3,963,617,279 |
    | storm-base-1 | 10,000 | 139,149,514.03 | 3,447,717,887 |
    | dist-upgrade-1 | 10,000 | 141,866,760.39 | 3,374,317,567 |
    | batch-v2-1 | 10,000 | 22,750,840.18 | 1,043,333,119 |
    | batch-v2-100 | 22,000 | 274,785,584.11 | 3,456,106,495 |
    
    I really would like some feedback here, because these numbers seem to
contradict the STORM-855 results obtained with my original speed of light
test.  I don't really like that test, even though I wrote it, because the
throughput is limited only by storm, so with acking disabled it is measuring
what the latency is when we hit the wall and cannot provide any more
throughput.  No one should run in production that way.  When acking is enabled
and we are using max spout pending for flow control, the throughput is directly
related to the end-to-end latency.  This too shouldn't be the common case in
production, because it means we cannot keep up with the incoming rate and are
falling behind.
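
    The link between max spout pending and throughput follows from Little's
law (tuples in flight = throughput x latency).  A rough sketch, assuming the
spout always has tuples to emit and that `topology.max.spout.pending` is the
only limit; the function name and the numbers are hypothetical and not taken
from the tests above:

```python
def throughput_ceiling(max_spout_pending: int, spout_tasks: int,
                       complete_latency_ms: float) -> float:
    """Upper bound on tuples/second implied by Little's law.

    max_spout_pending caps the un-acked tuples per spout task, so the total
    in flight is at most max_spout_pending * spout_tasks.
    """
    if complete_latency_ms <= 0:
        raise ValueError("complete latency must be positive")
    in_flight = max_spout_pending * spout_tasks
    return in_flight * 1000.0 / complete_latency_ms


if __name__ == "__main__":
    # Hypothetical settings: doubling the latency halves the ceiling.
    print(throughput_ceiling(500, 1, 10.0))  # 50000.0
    print(throughput_ceiling(500, 1, 20.0))  # 25000.0
```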
    
    This seems to indicate that the only time STORM-855 makes sense is when
looking at the 99%-ile latency at a very low throughput, and even then it only
seems to have about a 1/20th of a ms advantage over the others.  In other
cases it looks like the throughput per host it can support is about 1/2 of
that without the change.  This branch, however, has a weakness on the low end:
when batching is enabled it is about 12 ms slower.  On the high end it can
handle more than 2x the throughput with little change to the latency.  If that
12 ms is important, I think we can mitigate it by allowing the batch size to
self-adjust on a per-queue basis.
    
    I really would like others to look at my numbers and my test to see if
there are issues with them that I am missing, because, like I said, they seem
to contradict the numbers from STORM-855.  The only thing I can think of is
that the messaging layer is the bottleneck in the speed of light test, which is
what it was intended to stress test, and STORM-855 is giving a significant
batching advantage there.  If that is the case, then we should look at what
STORM-855 is doing around that and try to combine it with the batching we are
doing here.
    
    @ptgoetz @d2r @rfarivar @mjsax @kishorvpatil @knusbaum please let me know 
what you think.

