jasonk000 commented on a change in pull request #11721:
URL: https://github.com/apache/kafka/pull/11721#discussion_r798002944
##########
File path: clients/src/main/java/org/apache/kafka/common/utils/ByteUtils.java
##########
@@ -386,18 +386,39 @@ public static void writeDouble(double value, ByteBuffer
buffer) {
buffer.putDouble(value);
}
+ final static int[] LEADING_ZEROS_TO_U_VARINT_SIZE = new int[] {
+ // 32 bits, and each 7-bits adds one byte to the output
+ 5, 5, 5, 5, // 32
+ 4, 4, 4, 4, 4, 4, 4, // 28
+ 3, 3, 3, 3, 3, 3, 3, // 21
+ 2, 2, 2, 2, 2, 2, 2, // 14
+ 1, 1, 1, 1, 1, 1, 1, // 7
+ 1 // 0
+ };
+
+ final static int[] LEADING_ZEROS_TO_U_VARLONG_SIZE = new int[] {
+ // 64 bits, and each 7-bits adds one byte to the output
+ 10, // 64
+ 9, 9, 9, 9, 9, 9, 9, // 63
+ 8, 8, 8, 8, 8, 8, 8, // 56
+ 7, 7, 7, 7, 7, 7, 7, // 49
+ 6, 6, 6, 6, 6, 6, 6, // 42
+ 5, 5, 5, 5, 5, 5, 5, // 35
+ 4, 4, 4, 4, 4, 4, 4, // 28
+ 3, 3, 3, 3, 3, 3, 3, // 21
+ 2, 2, 2, 2, 2, 2, 2, // 14
+ 1, 1, 1, 1, 1, 1, 1, // 7
+ 1 // 0
+ };
+
/**
* Number of bytes needed to encode an integer in unsigned variable-length
format.
*
* @param value The signed value
*/
public static int sizeOfUnsignedVarint(int value) {
- int bytes = 1;
- while ((value & 0xffffff80) != 0L) {
- bytes += 1;
- value >>>= 7;
- }
- return bytes;
+ int leadingZeros = Integer.numberOfLeadingZeros(value);
+ return LEADING_ZEROS_TO_U_VARINT_SIZE[leadingZeros];
Review comment:
Yes, I agree - the arithmetic / bitshift looks intuitively quicker since
it requires no loads. Looking at the compiler output from JDK11, it does get
compiled down to a sequence of operations that are reasonably clean. Perhaps
there is a slight advantage in different use-cases, but, in my case this fix
takes the function completely out of the profiler results so it solves the
problem.
Out of interest sake, here is the sequence of operations according to
perfasm on JDK11.
```
││ 0x00007f26e04797b7: mov 0x10(%r12,%r10,8),%r10d
0.05% ││ 0x00007f26e04797bc: lzcnt %r10d,%r11d ;*invokestatic
numberOfLeadingZeros {reexecute=0 rethrow=0 return_oop=0}
││ ; -
org.apache.kafka.jmh.util.ByteUtilsBenchmark::testSizeOfUnsignedVarintMathOne@6
(line 76)
││ ; -
org.apache.kafka.jmh.util.jmh_generated.ByteUtilsBenchmark_testSizeOfUnsignedVarintMathOne_jmhTest::testSizeOfUnsignedVarintMathOne_thrpt_jmhStub@17
(line 119)
0.03% ││ 0x00007f26e04797c1: mov $0x26,%r8d
0.01% ││ 0x00007f26e04797c7: sub %r11d,%r8d ;*isub
{reexecute=0 rethrow=0 return_oop=0}
││ ; -
org.apache.kafka.jmh.util.ByteUtilsBenchmark::testSizeOfUnsignedVarintMathOne@13
(line 77)
││ ; -
org.apache.kafka.jmh.util.jmh_generated.ByteUtilsBenchmark_testSizeOfUnsignedVarintMathOne_jmhTest::testSizeOfUnsignedVarintMathOne_thrpt_jmhStub@17
(line 119)
0.49% ││ 0x00007f26e04797ca: mov %r11d,%r9d
0.36% ││ 0x00007f26e04797cd: sar $0x1f,%r9d
0.02% ││ 0x00007f26e04797d1: movsxd %r8d,%r10
0.77% ││ 0x00007f26e04797d4: shr $0x1b,%r9d
7.24% ││ 0x00007f26e04797d8: add %r11d,%r9d
0.78% ││ 0x00007f26e04797db: imulq $0x92492493,%r10,%r10
0.05% ││ 0x00007f26e04797e2: sar $0x5,%r9d
2.24% ││ 0x00007f26e04797e6: sar $0x20,%r10
9.93% ││ 0x00007f26e04797ea: movsxd %r9d,%r11
0.12% ││ 0x00007f26e04797ed: mov %r10d,%r10d
││ 0x00007f26e04797f0: add %r8d,%r10d
6.30% ││ 0x00007f26e04797f3: sar $0x1f,%r8d
2.53% ││ 0x00007f26e04797f7: sar $0x2,%r10d
6.21% ││ 0x00007f26e04797fb: movsxd %r8d,%r8
││ 0x00007f26e04797fe: movsxd %r10d,%rdx
7.25% ││ 0x00007f26e0479801: sub %r8,%rdx
7.01% ││ 0x00007f26e0479804: add %r11,%rdx ;*i2l
{reexecute=0 rethrow=0 return_oop=0}
││ ; -
org.apache.kafka.jmh.util.ByteUtilsBenchmark::testSizeOfUnsignedVarintMathOne@22
(line 77)
││ ; -
org.apache.kafka.jmh.util.jmh_generated.ByteUtilsBenchmark_testSizeOfUnsignedVarintMathOne_jmhTest::testSizeOfUnsignedVarintMathOne_thrpt_jmhStub@17
(line 119)
8.11% ││ 0x00007f26e0479807: mov (%rsp),%rsi
```
Lookup table:
```
│ 0x00007fde904792a7: mov 0x10(%r12,%r10,8),%r10d
1.42% │ 0x00007fde904792ac: lzcnt %r10d,%r10d ;*invokestatic
numberOfLeadingZeros {reexecute=0 rethrow=0 return_oop=0}
│ ; -
org.apache.kafka.common.utils.ByteUtils::sizeOfUnsignedVarint@1 (line 420)
│ ; -
org.apache.kafka.jmh.util.ByteUtilsBenchmark::testSizeOfUnsignedVarintOne@6
(line 61)
│ ; -
org.apache.kafka.jmh.util.jmh_generated.ByteUtilsBenchmark_testSizeOfUnsignedVarintOne_jmhTest::testSizeOfUnsignedVarintOne_thrpt_jmhStub@17
(line 119)
1.26% │ 0x00007fde904792b1: cmp $0x21,%r10d
2.65% │ 0x00007fde904792b5: jnb 0x7fde90479332
0.02% │ 0x00007fde904792b7: mov %r8,0x40(%rsp)
0.02% │ 0x00007fde904792bc: movabs $0xfe434bf0,%r11 ;
{oop([I{0x00000000fe434bf0})}
1.68% │ 0x00007fde904792c6: movsxd 0x10(%r11,%r10,4),%rdx ;*i2l
{reexecute=0 rethrow=0 return_oop=0}
│ ; -
org.apache.kafka.jmh.util.ByteUtilsBenchmark::testSizeOfUnsignedVarintOne@9
(line 61)
│ ; -
org.apache.kafka.jmh.util.jmh_generated.ByteUtilsBenchmark_testSizeOfUnsignedVarintOne_jmhTest::testSizeOfUnsignedVarintOne_thrpt_jmhStub@17
(line 119)
37.64% │ 0x00007fde904792cb: mov (%rsp),%rsi
```
It may be that the code in yours is faster when dropped into a full
application.
Perhaps there are a more direct series of shift operations that would be
better suited?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]