[ https://issues.apache.org/jira/browse/STORM-1742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15266097#comment-15266097 ]
ASF GitHub Bot commented on STORM-1742:
---------------------------------------
Github user HeartSaVioR commented on the pull request:
https://github.com/apache/storm/pull/1379#issuecomment-216102534
I've run a performance test with ThroughputVsLatency using the following options:
- throughput: 30000
- topology.backpressure.enable = false (the acked/sec drop in my last tests on
#1362 was caused by back pressure)
- topology.executor.send.buffer.size = 16384
- topology.executor.receive.buffer.size = 16384
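For reference, a minimal sketch of how these options map onto the topology
config (assuming the Storm 1.x `Config` constants; this is not the actual
ThroughputVsLatency launcher code):
```java
import org.apache.storm.Config;

public class BenchmarkConfSketch {
    // Builds the topology config used for the runs above (sketch only).
    public static Config build() {
        Config conf = new Config();
        conf.put(Config.TOPOLOGY_BACKPRESSURE_ENABLE, false);          // disable back pressure
        conf.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 16384);    // executor send buffer
        conf.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 16384); // executor receive buffer
        return conf;
    }
}
```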
<< before >>
```
uptime: 30 acked: 2,080 acked/sec: 69.33 failed: 0 99%: 19,327,352,831 99.9%: 19,495,124,991 min: 790,102,016 max: 19,528,679,423 mean: 9,685,524,721.87 stddev: 5,111,175,911.98 user: 101,830 sys: 3,910 gc: 1,897 mem: 114.88
uptime: 61 acked: 388,940 acked/sec: 12,546.45 failed: 0 99%: 43,251,662,847 99.9%: 46,405,779,455 min: 9,281,536 max: 48,117,055,487 mean: 21,987,872,843.23 stddev: 10,957,140,755.65 user: 302,820 sys: 14,460 gc: 64,774 mem: 158.65
uptime: 91 acked: 1,794,120 acked/sec: 59,804.00 failed: 0 99%: 37,010,538,495 99.9%: 44,962,938,879 min: 7,966,720 max: 47,378,857,983 mean: 4,507,910,229.78 stddev: 8,914,095,623.34 user: 134,520 sys: 15,600 gc: 3,116 mem: 159.69
uptime: 121 acked: 903,860 acked/sec: 30,128.67 failed: 0 99%: 27,262,975 99.9%: 43,712,511 min: 7,401,472 max: 62,980,095 mean: 12,985,499.73 stddev: 3,159,411.39 user: 127,400 sys: 17,410 gc: 944 mem: 160.48
uptime: 151 acked: 926,000 acked/sec: 30,866.67 failed: 0 99%: 23,969,791 99.9%: 37,486,591 min: 7,876,608 max: 56,688,639 mean: 12,825,768.39 stddev: 2,573,168.47 user: 124,080 sys: 18,870 gc: 822 mem: 160.87
uptime: 181 acked: 903,080 acked/sec: 30,102.67 failed: 0 99%: 27,361,279 99.9%: 48,136,191 min: 7,831,552 max: 74,055,679 mean: 12,963,280.87 stddev: 3,265,626.09 user: 122,040 sys: 20,380 gc: 1,004 mem: 161.09
uptime: 211 acked: 902,980 acked/sec: 30,099.33 failed: 0 99%: 27,492,351 99.9%: 41,156,607 min: 8,024,064 max: 56,819,711 mean: 12,958,029.39 stddev: 3,102,467.38 user: 121,460 sys: 20,940 gc: 1,003 mem: 161.33
uptime: 241 acked: 902,680 acked/sec: 30,089.33 failed: 0 99%: 26,181,631 99.9%: 40,599,551 min: 7,909,376 max: 54,820,863 mean: 12,817,104.62 stddev: 2,856,706.78 user: 120,890 sys: 21,580 gc: 991 mem: 161.41
uptime: 271 acked: 902,840 acked/sec: 30,094.67 failed: 0 99%: 28,065,791 99.9%: 45,744,127 min: 7,983,104 max: 77,004,799 mean: 12,979,217.87 stddev: 3,258,876.30 user: 118,320 sys: 23,380 gc: 986 mem: 161.55
uptime: 301 acked: 902,580 acked/sec: 30,086.00 failed: 0 99%: 23,265,279 99.9%: 38,862,847 min: 7,016,448 max: 48,988,159 mean: 12,719,114.89 stddev: 2,443,669.60 user: 119,900 sys: 22,530 gc: 919 mem: 161.60
```
<< applying STORM-1742 >>
```
uptime: 31 acked: 1,800 acked/sec: 58.06 failed: 0 99%: 9,831,448,575 99.9%: 9,923,723,263 min: 2,121,269,248 max: 9,923,723,263 mean: 8,090,873,047.95 stddev: 1,516,468,044.70 user: 54,890 sys: 2,210 gc: 0 mem: 76.53
uptime: 61 acked: 1,266,640 acked/sec: 42,221.33 failed: 0 99%: 30,601,641,983 99.9%: 35,433,480,191 min: 10,469,376 max: 36,574,330,879 mean: 14,938,340,771.32 stddev: 7,628,453,815.97 user: 266,430 sys: 12,530 gc: 46,951 mem: 158.31
uptime: 91 acked: 904,700 acked/sec: 30,156.67 failed: 0 99%: 11,349,786,623 99.9%: 15,485,370,367 min: 7,925,760 max: 16,877,879,295 mean: 518,887,613.80 stddev: 1,972,798,692.60 user: 128,020 sys: 16,080 gc: 1,408 mem: 159.54
uptime: 121 acked: 904,020 acked/sec: 30,134.00 failed: 0 99%: 27,082,751 99.9%: 39,321,599 min: 7,831,552 max: 71,237,631 mean: 12,900,410.36 stddev: 2,969,073.67 user: 123,460 sys: 18,320 gc: 907 mem: 160.18
uptime: 151 acked: 903,620 acked/sec: 30,120.67 failed: 0 99%: 27,754,495 99.9%: 47,448,063 min: 7,860,224 max: 71,696,383 mean: 12,948,431.30 stddev: 3,268,915.88 user: 121,300 sys: 19,830 gc: 877 mem: 160.73
uptime: 181 acked: 925,480 acked/sec: 30,849.33 failed: 0 99%: 26,296,319 99.9%: 40,763,391 min: 7,897,088 max: 57,507,839 mean: 12,821,813.27 stddev: 2,854,751.59 user: 119,540 sys: 20,620 gc: 914 mem: 160.99
uptime: 211 acked: 902,740 acked/sec: 30,091.33 failed: 0 99%: 24,707,071 99.9%: 37,584,895 min: 7,958,528 max: 52,035,583 mean: 12,748,216.82 stddev: 2,608,800.49 user: 118,100 sys: 21,890 gc: 834 mem: 161.21
uptime: 241 acked: 902,760 acked/sec: 30,092.00 failed: 0 99%: 23,855,103 99.9%: 35,880,959 min: 7,970,816 max: 56,721,407 mean: 12,698,972.72 stddev: 2,434,051.28 user: 117,230 sys: 22,680 gc: 797 mem: 161.35
uptime: 271 acked: 902,720 acked/sec: 30,090.67 failed: 0 99%: 23,068,671 99.9%: 36,175,871 min: 7,888,896 max: 46,825,471 mean: 12,700,873.73 stddev: 2,364,935.16 user: 115,810 sys: 23,490 gc: 799 mem: 161.46
uptime: 301 acked: 902,620 acked/sec: 30,087.33 failed: 0 99%: 25,067,519 99.9%: 37,847,039 min: 7,761,920 max: 53,247,999 mean: 12,724,600.61 stddev: 2,600,411.79 user: 116,120 sys: 23,660 gc: 829 mem: 161.56
```
It seems there's no major performance hit. Ready to be reviewed.
> More accurate 'complete latency'
> --------------------------------
>
> Key: STORM-1742
> URL: https://issues.apache.org/jira/browse/STORM-1742
> Project: Apache Storm
> Issue Type: Improvement
> Components: storm-core
> Reporter: Jungtaek Lim
> Assignee: Jungtaek Lim
>
> I already started a discussion thread on the dev@ list. Below is a copy of the
> content of my mail.
> http://mail-archives.apache.org/mod_mbox/storm-dev/201604.mbox/%3CCAF5108gn=rskundfs7-sgy_pd-_prgj2hf2t5e5zppp-knd...@mail.gmail.com%3E
> While thinking about metrics improvements, I began to wonder how many users
> know what 'complete latency' exactly means. In fact, it's somewhat
> complicated, because extra waiting time can be added to the complete latency
> by the spout's single-threaded event loop.
> A long-running nextTuple() / ack() / fail() can inflate the complete latency,
> but this happens behind the scenes: no latency information is exposed, and
> some users aren't even aware of this characteristic. Moreover, calls to
> nextTuple() can be skipped because of max spout pending, which makes this even
> harder to reason about, even if the average latency of nextTuple() were
> provided.
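> To make this concrete, here is a hypothetical sketch of the spout executor's
> single-threaded event loop (illustrative names such as SpoutLoopSketch; the
> real executor in storm-core differs in detail):
> ```java
> import java.util.Queue;
> import java.util.concurrent.atomic.AtomicInteger;
>
> // Illustrative only: acks, fails, and nextTuple() all share ONE thread,
> // so a slow callback delays ack handling and that waiting time is
> // silently folded into the measured complete latency.
> class SpoutLoopSketch {
>     interface Spout { void nextTuple(); void ack(Object id); void fail(Object id); }
>
>     static void run(Spout spout, Queue<Object> acks, Queue<Object> fails,
>                     AtomicInteger pending, int maxSpoutPending) {
>         while (!Thread.currentThread().isInterrupted()) {
>             Object id;
>             while ((id = acks.poll()) != null)  { spout.ack(id);  pending.decrementAndGet(); }
>             while ((id = fails.poll()) != null) { spout.fail(id); pending.decrementAndGet(); }
>             // nextTuple() is skipped while too many tuples are in flight
>             // (max spout pending), making its call rate hard to predict.
>             if (pending.get() < maxSpoutPending) spout.nextTuple();
>         }
>     }
> }
> ```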
> I think separating the threads (moving the tuple handler to a separate
> thread, as JStorm does) would close this gap, but it requires the spout logic
> to be thread-safe, so I'd like to find a workaround first.
> My sketched idea is to let the Acker decide the end time for the root tuple.
> There are then two ways to decide the start time for the root tuple (see the
> sketch after this list):
> 1. When the Spout is about to emit ACK_INIT to the Acker (in other words,
> keep it as it is)
> - The Acker sends the ack / fail message to the Spout with a timestamp, and
> the Spout calculates the time delta.
> - pros: It's the most accurate way, since it respects the definition of
> 'complete latency'.
> - cons: Clock synchronization between machines becomes very important;
> millisecond precision would be required.
> 2. When the Acker receives ACK_INIT from the Spout
> - The Acker calculates the time delta itself and sends the ack / fail
> message to the Spout with the time delta.
> - pros: No need to synchronize clocks between servers so strictly.
> - cons: It doesn't include the latency of sending / receiving ACK_INIT
> between the Spout and the Acker.
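> A minimal sketch of option 2 (hypothetical class and method names, not
> Storm's actual acker code): the Acker stamps the arrival of ACK_INIT on its
> own clock and ships only the elapsed delta back with the ack / fail message.
> ```java
> import java.util.HashMap;
> import java.util.Map;
>
> // Hypothetical Acker-side bookkeeping for option 2 (illustrative only).
> class AckerSketch {
>     // start time per root tuple id, taken on the Acker's own clock
>     private final Map<Object, Long> startTimes = new HashMap<>();
>
>     void onAckInit(Object rootId) {
>         // the clock starts when ACK_INIT arrives at the Acker, so the
>         // Spout -> Acker hop is NOT included (the listed con)
>         startTimes.put(rootId, System.currentTimeMillis());
>     }
>
>     long onAckOrFail(Object rootId) {
>         // the delta is computed on the same clock and sent to the Spout,
>         // so no cross-machine time sync is required (the listed pro)
>         return System.currentTimeMillis() - startTimes.remove(rootId);
>     }
> }
> ```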
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)