Hi!

I am seeing a generally negative impact on speed going from Go 1.13 to 
1.14-RC1.

Running the benchmarks in my deflate package, with the "no change" 
entries removed:

name                        old time/op    new time/op    delta
DecodeDigitsSpeed1e5-12        903µs ± 0%     940µs ± 1%  +4.14%  (p=0.008 n=5+5)
DecodeDigitsSpeed1e6-12       8.97ms ± 0%    9.40ms ± 1%  +4.80%  (p=0.008 n=5+5)
DecodeDigitsDefault1e4-12     93.2µs ± 0%    95.0µs ± 1%  +1.97%  (p=0.008 n=5+5)
DecodeDigitsDefault1e5-12      855µs ± 0%     882µs ± 2%  +3.15%  (p=0.008 n=5+5)
DecodeDigitsDefault1e6-12     8.58ms ± 0%    8.94ms ± 2%  +4.28%  (p=0.008 n=5+5)
DecodeDigitsCompress1e4-12    93.3µs ± 0%    94.6µs ± 1%  +1.37%  (p=0.016 n=4+5)
DecodeDigitsCompress1e5-12     976µs ± 0%     992µs ± 1%  +1.60%  (p=0.008 n=5+5)
DecodeDigitsCompress1e6-12    9.85ms ± 0%    9.97ms ± 1%  +1.21%  (p=0.016 n=4+5)
DecodeTwainSpeed1e4-12        93.7µs ± 0%    98.0µs ± 2%  +4.60%  (p=0.008 n=5+5)
DecodeTwainSpeed1e5-12         896µs ± 0%     902µs ± 0%  +0.68%  (p=0.008 n=5+5)
DecodeTwainDefault1e4-12      93.0µs ± 0%    95.1µs ± 1%  +2.32%  (p=0.008 n=5+5)
DecodeTwainDefault1e5-12       832µs ± 0%     840µs ± 0%  +0.88%  (p=0.008 n=5+5)
DecodeTwainDefault1e6-12      8.17ms ± 0%    8.22ms ± 0%  +0.68%  (p=0.008 n=5+5)
DecodeTwainCompress1e4-12     90.4µs ± 1%    93.1µs ± 1%  +2.99%  (p=0.008 n=5+5)
DecodeTwainCompress1e5-12      790µs ± 0%     802µs ± 0%  +1.55%  (p=0.008 n=5+5)
DecodeRandomSpeed1e4-12        288ns ± 2%     305ns ± 1%  +5.91%  (p=0.008 n=5+5)
DecodeRandomSpeed1e5-12       2.30µs ± 2%    2.24µs ± 1%  -2.40%  (p=0.008 n=5+5)
_tokens_EstimatedBits-12       651ns ± 0%     707ns ± 2%  +8.67%  (p=0.008 n=5+5)
EncodeDigitsConstant1e4-12    28.4µs ± 0%    29.4µs ± 0%  +3.41%  (p=0.016 n=5+4)
EncodeDigitsConstant1e5-12     307µs ± 0%     314µs ± 2%  +2.41%  (p=0.008 n=5+5)
EncodeDigitsConstant1e6-12    2.70ms ± 0%    2.77ms ± 1%  +2.47%  (p=0.008 n=5+5)
EncodeDigitsSpeed1e5-12        966µs ± 0%     988µs ± 0%  +2.34%  (p=0.008 n=5+5)
EncodeDigitsSpeed1e6-12       9.07ms ± 1%    9.22ms ± 1%  +1.67%  (p=0.032 n=5+5)
EncodeDigitsDefault1e5-12     1.63ms ± 0%    1.65ms ± 1%  +1.17%  (p=0.008 n=5+5)
EncodeDigitsCompress1e5-12    3.70ms ± 1%    3.64ms ± 1%  -1.65%  (p=0.008 n=5+5)
EncodeDigitsCompress1e6-12    40.1ms ± 0%    39.4ms ± 2%  -1.61%  (p=0.008 n=5+5)
EncodeDigitsSL1e5-12           955µs ± 0%     992µs ± 1%  +3.79%  (p=0.008 n=5+5)
EncodeDigitsSL1e6-12          9.34ms ± 0%    9.99ms ± 1%  +6.92%  (p=0.008 n=5+5)
EncodeTwainConstant1e4-12     37.6µs ± 2%    38.9µs ± 2%  +3.51%  (p=0.008 n=5+5)
EncodeTwainConstant1e5-12      337µs ± 0%     345µs ± 1%  +2.38%  (p=0.008 n=5+5)
EncodeTwainSpeed1e4-12         101µs ± 0%     102µs ± 0%  +0.62%  (p=0.024 n=5+5)
EncodeTwainSpeed1e5-12         955µs ± 0%     968µs ± 1%  +1.35%  (p=0.008 n=5+5)
EncodeTwainSpeed1e6-12        8.92ms ± 1%    9.09ms ± 1%  +1.94%  (p=0.032 n=5+5)
EncodeTwainDefault1e4-12       152µs ± 1%     160µs ± 1%  +4.69%  (p=0.008 n=5+5)
EncodeTwainDefault1e5-12      1.44ms ± 1%    1.49ms ± 1%  +3.69%  (p=0.008 n=5+5)
EncodeTwainDefault1e6-12      13.7ms ± 1%    14.2ms ± 2%  +3.43%  (p=0.008 n=5+5)
EncodeTwainCompress1e4-12      267µs ± 1%     272µs ± 2%  +1.97%  (p=0.008 n=5+5)
EncodeTwainCompress1e5-12     4.76ms ± 0%    4.81ms ± 0%  +1.11%  (p=0.008 n=5+5)
EncodeTwainCompress1e6-12     52.4ms ± 0%    53.0ms ± 1%  +1.04%  (p=0.008 n=5+5)
EncodeTwainSL1e4-12            101µs ± 1%     105µs ± 1%  +4.48%  (p=0.008 n=5+5)
EncodeTwainSL1e5-12            925µs ± 1%     949µs ± 1%  +2.59%  (p=0.008 n=5+5)
EncodeTwainSL1e6-12           8.86ms ± 1%    9.24ms ± 0%  +4.28%  (p=0.008 n=5+5)

`_tokens_EstimatedBits` is a microbenchmark, so the cause of its regression 
will probably be easier to identify. I will file an issue for that.

Running the benchmarks on my zstd package gives a less clear picture, but 
it still trends towards a performance loss:

name                                        old time/op    new time/op    delta
Decoder_DecoderSmall/kppkn.gtb.zst-12         5.76ms ± 1%    5.87ms ± 2%  +1.98%  (p=0.016 n=5+5)
Decoder_DecoderSmall/geo.protodata.zst-12     1.53ms ± 0%    1.62ms ± 1%  +5.86%  (p=0.008 n=5+5)
Decoder_DecoderSmall/plrabn12.txt.zst-12      19.1ms ± 0%    18.7ms ± 1%  -2.25%  (p=0.008 n=5+5)
Decoder_DecoderSmall/lcet10.txt.zst-12        14.4ms ± 1%    13.6ms ± 0%  -5.65%  (p=0.008 n=5+5)
Decoder_DecoderSmall/html_x_4.zst-12          2.94ms ± 2%    3.00ms ± 0%  +2.21%  (p=0.008 n=5+5)
Decoder_DecoderSmall/paper-100k.pdf.zst-12     473µs ± 1%     511µs ± 1%  +7.94%  (p=0.008 n=5+5)
Decoder_DecoderSmall/fireworks.jpeg.zst-12     485µs ± 2%     511µs ± 4%  +5.29%  (p=0.008 n=5+5)
Decoder_DecoderSmall/html.zst-12              1.65ms ± 1%    1.71ms ± 1%  +4.01%  (p=0.008 n=5+5)
Decoder_DecoderSmall/comp-data.bin.zst-12      191µs ± 1%     206µs ± 1%  +7.70%  (p=0.008 n=5+5)
Decoder_DecodeAll/plrabn12.txt.zst-12         2.21ms ± 1%    2.19ms ± 1%  -0.95%  (p=0.032 n=5+5)
Decoder_DecodeAll/lcet10.txt.zst-12           1.63ms ± 1%    1.65ms ± 0%  +1.20%  (p=0.008 n=5+5)
Decoder_DecodeAll/alice29.txt.zst-12           726µs ± 0%     741µs ± 1%  +2.06%  (p=0.008 n=5+5)
Decoder_DecodeAll/paper-100k.pdf.zst-12       26.2µs ± 2%    28.3µs ± 3%  +8.14%  (p=0.008 n=5+5)
Decoder_DecodeAll/comp-data.bin.zst-12        11.7µs ± 1%    12.1µs ± 3%  +3.21%  (p=0.016 n=5+5)
Encoder_EncodeAllSimple/default-12             496µs ± 1%     491µs ± 1%  -0.85%  (p=0.008 n=5+5)
Encoder_EncodeAllSimple4K/fastest-12          28.6µs ± 1%    29.1µs ± 1%  +1.75%  (p=0.008 n=5+5)
RandomEncodeAllDefault-12                     4.73ms ± 1%    4.82ms ± 1%  +1.82%  (p=0.016 n=5+5)
RandomEncoderFastest-12                       4.30ms ± 3%    4.24ms ± 1%  -1.34%  (p=0.032 n=5+5)
Snappy_ConvertXML-12                          14.1ms ± 0%    13.9ms ± 0%  -1.55%  (p=0.008 n=5+5)


But there are some quite big regressions in there, definitely not what I 
expected.

Finally, the S2 benchmarks with assembly disabled show some variance: some 
rather big losses (mostly encoding) and some rather big wins (mostly 
decoding, which is mostly memory copying).

name                                          old time/op    new time/op     delta
DecodeS2Block/0-html/block-better-12            13.9µs ± 1%     13.6µs ± 0%   -2.45%  (p=0.016 n=5+4)
DecodeS2Block/2-jpg/block-better-12             1.27µs ± 0%     1.24µs ± 1%   -1.78%  (p=0.008 n=5+5)
DecodeS2Block/4-pdf/block-better-12             2.72µs ± 2%     2.65µs ± 1%   -2.72%  (p=0.008 n=5+5)
DecodeS2Block/5-html4/block-better-12           40.2µs ± 4%     37.7µs ± 2%   -6.20%  (p=0.016 n=5+5)
DecodeS2Block/6-txt1/block-better-12            60.4µs ± 1%     63.0µs ± 2%   +4.26%  (p=0.008 n=5+5)
DecodeS2Block/8-txt3/block-better-12             162µs ± 5%      155µs ± 0%   -4.65%  (p=0.008 n=5+5)
DecodeS2Block/9-txt4/block-12                    161µs ± 1%      157µs ± 0%   -2.62%  (p=0.008 n=5+5)
DecodeS2Block/9-txt4/block-better-12             224µs ± 1%      216µs ± 1%   -3.55%  (p=0.008 n=5+5)
DecodeS2Block/10-pb/block-12                    10.7µs ± 0%     10.6µs ± 1%   -1.43%  (p=0.008 n=5+5)
DecodeS2Block/10-pb/block-better-12             12.1µs ± 0%     11.8µs ± 0%   -1.90%  (p=0.008 n=5+5)
DecodeS2Block/11-gaviota/block-12               51.9µs ± 1%     51.2µs ± 0%   -1.36%  (p=0.032 n=5+5)
DecodeS2Block/12-txt1_128b/block-12             18.0ns ± 1%     17.6ns ± 0%   -2.22%  (p=0.000 n=5+4)
DecodeS2Block/12-txt1_128b/block-better-12      18.3ns ± 1%     17.7ns ± 0%   -3.07%  (p=0.016 n=5+4)
DecodeS2Block/13-txt1_1000b/block-12            73.1ns ± 2%     70.2ns ± 0%   -4.02%  (p=0.008 n=5+5)
DecodeS2Block/13-txt1_1000b/block-better-12      183ns ± 5%      174ns ± 1%   -4.92%  (p=0.008 n=5+5)
DecodeS2Block/14-txt1_10000b/block-12           1.45µs ± 1%     1.40µs ± 0%   -3.08%  (p=0.008 n=5+5)
DecodeS2Block/14-txt1_10000b/block-better-12    3.48µs ± 1%     3.68µs ± 0%   +5.86%  (p=0.008 n=5+5)
DecodeS2Block/15-txt1_20000b/block-12           4.36µs ± 0%     4.54µs ± 0%   +4.08%  (p=0.008 n=5+5)
DecodeS2Block/15-txt1_20000b/block-better-12    8.54µs ± 0%     8.27µs ± 0%   -3.20%  (p=0.008 n=5+5)
EncodeS2Block/0-html/block-12                   26.6µs ± 0%     26.4µs ± 0%   -0.65%  (p=0.008 n=5+5)
EncodeS2Block/0-html/block-better-12            46.3µs ± 1%     46.6µs ± 0%   +0.71%  (p=0.016 n=5+5)
EncodeS2Block/1-urls/block-better-12             567µs ± 0%      573µs ± 0%   +1.00%  (p=0.008 n=5+5)
EncodeS2Block/2-jpg/block-better-12             4.61µs ± 3%     4.44µs ± 1%   -3.76%  (p=0.008 n=5+5)
EncodeS2Block/3-jpg_200b/block-12                282ns ± 2%      274ns ± 0%   -2.83%  (p=0.008 n=5+5)
EncodeS2Block/5-html4/block-better-12           59.1µs ± 1%     61.1µs ± 0%   +3.40%  (p=0.008 n=5+5)
EncodeS2Block/6-txt1/block-12                   96.2µs ± 0%     97.1µs ± 0%   +0.98%  (p=0.008 n=5+5)
EncodeS2Block/6-txt1/block-better-12             165µs ± 0%      169µs ± 1%   +2.44%  (p=0.016 n=4+5)
EncodeS2Block/7-txt2/block-12                   80.7µs ± 0%     81.9µs ± 1%   +1.55%  (p=0.016 n=4+5)
EncodeS2Block/7-txt2/block-better-12             148µs ± 0%      152µs ± 1%   +2.26%  (p=0.016 n=5+4)
EncodeS2Block/8-txt3/block-12                    270µs ± 0%      276µs ± 1%   +1.96%  (p=0.008 n=5+5)
EncodeS2Block/8-txt3/block-better-12             434µs ± 0%      441µs ± 0%   +1.72%  (p=0.016 n=4+5)
EncodeS2Block/9-txt4/block-12                    356µs ± 2%      346µs ± 1%   -2.84%  (p=0.008 n=5+5)
EncodeS2Block/10-pb/block-better-12             40.2µs ± 0%     41.2µs ± 2%   +2.49%  (p=0.008 n=5+5)
EncodeS2Block/11-gaviota/block-12               80.7µs ± 0%     81.7µs ± 0%   +1.30%  (p=0.008 n=5+5)
EncodeS2Block/11-gaviota/block-better-12         128µs ± 0%      131µs ± 6%   +2.45%  (p=0.008 n=5+5)
EncodeS2Block/13-txt1_1000b/block-12             607ns ± 2%      598ns ± 0%   -1.52%  (p=0.024 n=5+5)
EncodeS2Block/14-txt1_10000b/block-12           5.49µs ± 2%     5.06µs ± 1%   -7.79%  (p=0.008 n=5+5)
EncodeS2Block/15-txt1_20000b/block-12           12.3µs ± 1%     12.9µs ± 4%   +4.84%  (p=0.016 n=5+5)
EncodeS2Block/15-txt1_20000b/block-better-12    27.2µs ± 0%     28.7µs ± 2%   +5.40%  (p=0.008 n=5+5)
DecodeSnappyBlock/0-html/s2-snappy-12           17.0µs ± 0%     16.8µs ± 0%   -1.09%  (p=0.008 n=5+5)
DecodeSnappyBlock/1-urls/s2-snappy-12            197µs ± 1%      199µs ± 1%   +1.04%  (p=0.032 n=5+5)
DecodeSnappyBlock/2-jpg/s2-snappy-12            1.26µs ± 1%     1.25µs ± 1%   -1.08%  (p=0.032 n=5+5)
DecodeSnappyBlock/3-jpg_200b/snappy-12          29.7ns ± 9%     26.9ns ± 0%   -9.42%  (p=0.008 n=5+5)
DecodeSnappyBlock/3-jpg_200b/s2-snappy-12       28.0ns ± 2%     26.8ns ± 1%   -4.07%  (p=0.008 n=5+5)
DecodeSnappyBlock/5-html4/s2-snappy-12          70.5µs ± 2%     68.2µs ± 1%   -3.26%  (p=0.016 n=4+5)


These are microbenchmarks, which tend to over-emphasize differences, so I 
would say overall it looks like a 2% loss.
