wumeibanfa opened a new pull request, #52829: URL: https://github.com/apache/doris/pull/52829
### What problem does this PR solve?

Faster bit_pack

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Build configuration:

```c++
-- Compiler: Clang-16.0.6
-- CXX Standard: 20
-- C Standard: 17
-- CXX Flags: -O0 -g -O0
-- C Flags: -O0 -g -O0
```

Result (`speed` is the ratio of the before time to the after time):

```c++
when bit_width is 1, before is 3294.880000 ns, after is 738.410000 ns, speed is 4.462128
when bit_width is 2, before is 5224.640000 ns, after is 750.420000 ns, speed is 6.962288
when bit_width is 3, before is 7025.990000 ns, after is 800.540000 ns, speed is 8.776563
when bit_width is 4, before is 8864.260000 ns, after is 849.180000 ns, speed is 10.438611
when bit_width is 5, before is 11103.970000 ns, after is 940.930000 ns, speed is 11.801059
when bit_width is 6, before is 13494.090000 ns, after is 1477.850000 ns, speed is 9.130893
when bit_width is 7, before is 15521.960000 ns, after is 1111.380000 ns, speed is 13.966384
when bit_width is 8, before is 17099.260000 ns, after is 1194.690000 ns, speed is 14.312717
when bit_width is 9, before is 19617.930000 ns, after is 1851.810000 ns, speed is 10.593922
when bit_width is 10, before is 21230.220000 ns, after is 1736.980000 ns, speed is 12.222490
when bit_width is 11, before is 23830.230000 ns, after is 1807.110000 ns, speed is 13.186928
when bit_width is 12, before is 25530.010000 ns, after is 1873.100000 ns, speed is 13.629817
when bit_width is 13, before is 28219.560000 ns, after is 1971.700000 ns, speed is 14.312299
when bit_width is 14, before is 29954.840000 ns, after is 2033.920000 ns, speed is 14.727639
when bit_width is 15, before is 31235.600000 ns, after is 2279.610000 ns, speed is 13.702168
when bit_width is 16, before is 32902.600000 ns, after is 2349.790000 ns, speed is 14.002358
when bit_width is 17, before is 35014.840000 ns, after is 3697.410000 ns, speed is 9.470099
when bit_width is 18, before is 37080.690000 ns, after is 4192.660000 ns, speed is 8.844192
when bit_width is 19, before is 39394.830000 ns, after is 4375.110000 ns, speed is 9.004306
when bit_width is 20, before is 40999.150000 ns, after is 3988.830000 ns, speed is 10.278490
when bit_width is 21, before is 43171.530000 ns, after is 4130.000000 ns, speed is 10.453155
when bit_width is 22, before is 45251.360000 ns, after is 4274.880000 ns, speed is 10.585411
when bit_width is 23, before is 75335.650000 ns, after is 5134.930000 ns, speed is 14.671213
when bit_width is 24, before is 75119.890000 ns, after is 4554.030000 ns, speed is 16.495256
when bit_width is 25, before is 51994.840000 ns, after is 4665.730000 ns, speed is 11.143988
when bit_width is 26, before is 54442.400000 ns, after is 4996.010000 ns, speed is 10.897176
when bit_width is 27, before is 54795.190000 ns, after is 4907.710000 ns, speed is 11.165124
when bit_width is 28, before is 57297.020000 ns, after is 5039.520000 ns, speed is 11.369539
when bit_width is 29, before is 60222.040000 ns, after is 5318.350000 ns, speed is 11.323444
when bit_width is 30, before is 60703.280000 ns, after is 5301.010000 ns, speed is 11.451267
when bit_width is 31, before is 62014.760000 ns, after is 5433.120000 ns, speed is 11.414208
when bit_width is 32, before is 65925.390000 ns, after is 5776.760000 ns, speed is 11.412174
when bit_width is 33, before is 66719.330000 ns, after is 6376.230000 ns, speed is 10.463758
when bit_width is 34, before is 68838.450000 ns, after is 6727.810000 ns, speed is 10.231925
when bit_width is 35, before is 70870.760000 ns, after is 6588.620000 ns, speed is 10.756541
when bit_width is 36, before is 74243.660000 ns, after is 7068.240000 ns, speed is 10.503840
when bit_width is 37, before is 73991.530000 ns, after is 7054.550000 ns, speed is 10.488483
when bit_width is 38, before is 77108.970000 ns, after is 6735.970000 ns, speed is 11.447345
when bit_width is 39, before is 79812.320000 ns, after is 7079.080000 ns, speed is 11.274392
when bit_width is 40, before is 81248.290000 ns, after is 7510.010000 ns, speed is 10.818666
when bit_width is 41, before is 82469.520000 ns, after is 7711.570000 ns, speed is 10.694258
when bit_width is 42, before is 83729.900000 ns, after is 7217.040000 ns, speed is 11.601695
when bit_width is 43, before is 87894.120000 ns, after is 7589.720000 ns, speed is 11.580680
when bit_width is 44, before is 88691.490000 ns, after is 8045.350000 ns, speed is 11.023944
when bit_width is 45, before is 93285.010000 ns, after is 7859.350000 ns, speed is 11.869303
when bit_width is 46, before is 94219.430000 ns, after is 8435.480000 ns, speed is 11.169421
when bit_width is 47, before is 96710.040000 ns, after is 7944.720000 ns, speed is 12.172870
when bit_width is 48, before is 97699.690000 ns, after is 8137.250000 ns, speed is 12.006475
when bit_width is 49, before is 101710.410000 ns, after is 8552.680000 ns, speed is 11.892227
when bit_width is 50, before is 102342.820000 ns, after is 8809.110000 ns, speed is 11.617839
when bit_width is 51, before is 103768.860000 ns, after is 8407.980000 ns, speed is 12.341711
when bit_width is 52, before is 103158.160000 ns, after is 9329.240000 ns, speed is 11.057510
when bit_width is 53, before is 108210.390000 ns, after is 8778.270000 ns, speed is 12.327075
when bit_width is 54, before is 108626.620000 ns, after is 9537.640000 ns, speed is 11.389256
when bit_width is 55, before is 112367.080000 ns, after is 8909.410000 ns, speed is 12.612180
when bit_width is 56, before is 112482.250000 ns, after is 9234.500000 ns, speed is 12.180654
when bit_width is 57, before is 115389.060000 ns, after is 9338.950000 ns, speed is 12.355678
when bit_width is 58, before is 117666.920000 ns, after is 9490.280000 ns, speed is 12.398677
when bit_width is 59, before is 119405.150000 ns, after is 9542.360000 ns, speed is 12.513168
when bit_width is 60, before is 119394.540000 ns, after is 9618.860000 ns, speed is 12.412546
when bit_width is 61, before is 125850.860000 ns, after is 9667.090000 ns, speed is 13.018484
when bit_width is 62, before is 123440.010000 ns, after is 10373.070000 ns, speed is 11.900046
when bit_width is 63, before is 128568.310000 ns, after is 10061.470000 ns, speed is 12.778283
when bit_width is 64, before is 127146.070000 ns, after is 10859.810000 ns, speed is 11.707946
when bit_width is 65, before is 132856.090000 ns, after is 10570.510000 ns, speed is 12.568560
when bit_width is 66, before is 131795.120000 ns, after is 11277.570000 ns, speed is 11.686482
when bit_width is 67, before is 134795.710000 ns, after is 10430.060000 ns, speed is 12.923771
when bit_width is 68, before is 135264.780000 ns, after is 10805.470000 ns, speed is 12.518176
when bit_width is 69, before is 139791.760000 ns, after is 11420.050000 ns, speed is 12.240906
when bit_width is 70, before is 140257.180000 ns, after is 11419.590000 ns, speed is 12.282155
when bit_width is 71, before is 144447.500000 ns, after is 11593.390000 ns, speed is 12.459470
when bit_width is 72, before is 142882.680000 ns, after is 12351.200000 ns, speed is 11.568324
when bit_width is 73, before is 146948.850000 ns, after is 11380.370000 ns, speed is 12.912484
when bit_width is 74, before is 149337.960000 ns, after is 11701.290000 ns, speed is 12.762521
when bit_width is 75, before is 148851.730000 ns, after is 11396.000000 ns, speed is 13.061752
when bit_width is 76, before is 151498.540000 ns, after is 11668.360000 ns, speed is 12.983705
when bit_width is 77, before is 155565.760000 ns, after is 11679.210000 ns, speed is 13.319887
when bit_width is 78, before is 158459.180000 ns, after is 12236.490000 ns, speed is 12.949725
when bit_width is 79, before is 159992.310000 ns, after is 12266.280000 ns, speed is 13.043263
when bit_width is 80, before is 161931.990000 ns, after is 12910.100000 ns, speed is 12.543047
when bit_width is 81, before is 162788.940000 ns, after is 12817.100000 ns, speed is 12.700918
when bit_width is 82, before is 164473.530000 ns, after is 12356.120000 ns, speed is 13.311098
when bit_width is 83, before is 166102.780000 ns, after is 12521.520000 ns, speed is 13.265385
when bit_width is 84, before is 167809.270000 ns, after is 13697.380000 ns, speed is 12.251195
when bit_width is 85, before is 172094.270000 ns, after is 12862.120000 ns, speed is 13.379930
when bit_width is 86, before is 172265.020000 ns, after is 13343.230000 ns, speed is 12.910294
when bit_width is 87, before is 174760.640000 ns, after is 12961.020000 ns, speed is 13.483556
when bit_width is 88, before is 176460.170000 ns, after is 14279.770000 ns, speed is 12.357354
when bit_width is 89, before is 178641.440000 ns, after is 15359.610000 ns, speed is 11.630597
when bit_width is 90, before is 181722.670000 ns, after is 13387.270000 ns, speed is 13.574289
when bit_width is 91, before is 183219.300000 ns, after is 14016.480000 ns, speed is 13.071706
when bit_width is 92, before is 183712.720000 ns, after is 13741.870000 ns, speed is 13.368830
when bit_width is 93, before is 187281.590000 ns, after is 13947.090000 ns, speed is 13.428005
when bit_width is 94, before is 188414.200000 ns, after is 14460.270000 ns, speed is 13.029784
when bit_width is 95, before is 190727.280000 ns, after is 14017.250000 ns, speed is 13.606612
when bit_width is 96, before is 193357.990000 ns, after is 14819.410000 ns, speed is 13.047617
when bit_width is 97, before is 194656.850000 ns, after is 14512.690000 ns, speed is 13.412872
when bit_width is 98, before is 195899.460000 ns, after is 14700.380000 ns, speed is 13.326149
when bit_width is 99, before is 198263.920000 ns, after is 14929.920000 ns, speed is 13.279637
when bit_width is 100, before is 199953.350000 ns, after is 15431.690000 ns, speed is 12.957320
when bit_width is 101, before is 201515.640000 ns, after is 15489.910000 ns, speed is 13.009478
when bit_width is 102, before is 205375.680000 ns, after is 15712.300000 ns, speed is 13.071013
when bit_width is 103, before is 206560.910000 ns, after is 15429.880000 ns, speed is 13.387072
when bit_width is 104, before is 207171.020000 ns, after is 15776.680000 ns, speed is 13.131471
when bit_width is 105, before is 211459.020000 ns, after is 16287.280000 ns, speed is 12.983078
when bit_width is 106, before is 212725.000000 ns, after is 15536.460000 ns, speed is 13.691986
when bit_width is 107, before is 215239.390000 ns, after is 16513.340000 ns, speed is 13.034274
when bit_width is 108, before is 215504.480000 ns, after is 16106.450000 ns, speed is 13.380011
when bit_width is 109, before is 217016.800000 ns, after is 16687.330000 ns, speed is 13.004885
when bit_width is 110, before is 222151.920000 ns, after is 16848.360000 ns, speed is 13.185374
when bit_width is 111, before is 223039.520000 ns, after is 16141.470000 ns, speed is 13.817795
when bit_width is 112, before is 225471.600000 ns, after is 17829.100000 ns, speed is 12.646269
when bit_width is 113, before is 226737.690000 ns, after is 16694.340000 ns, speed is 13.581710
when bit_width is 114, before is 229768.270000 ns, after is 17509.110000 ns, speed is 13.122784
when bit_width is 115, before is 233042.630000 ns, after is 17861.260000 ns, speed is 13.047379
when bit_width is 116, before is 231025.630000 ns, after is 18136.170000 ns, speed is 12.738391
when bit_width is 117, before is 235595.480000 ns, after is 16999.570000 ns, speed is 13.858908
when bit_width is 118, before is 237023.120000 ns, after is 17707.760000 ns, speed is 13.385268
when bit_width is 119, before is 242474.950000 ns, after is 18507.250000 ns, speed is 13.101620
when bit_width is 120, before is 241731.480000 ns, after is 19315.490000 ns, speed is 12.514903
when bit_width is 121, before is 242043.730000 ns, after is 18499.660000 ns, speed is 13.083685
when bit_width is 122, before is 243224.170000 ns, after is 17621.990000 ns, speed is 13.802310
when bit_width is 123, before is 244620.780000 ns, after is 18782.770000 ns, speed is 13.023680
when bit_width is 124, before is 249524.870000 ns, after is 18445.390000 ns, speed is 13.527763
when bit_width is 125, before is 252920.830000 ns, after is 19115.260000 ns, speed is 13.231357
when bit_width is 126, before is 252695.320000 ns, after is 19040.970000 ns, speed is 13.271137
when bit_width is 127, before is 256947.780000 ns, after is 18873.720000 ns, speed is 13.614051
```

### Release note

None

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [x] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason? -->

- Behavior changed:
    - [x] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [x] No.
    - [ ] Yes. <!-- Add document PR link here. eg: https://github.com/apache/doris-website/pull/1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->
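
For readers unfamiliar with the operation being benchmarked, the following is a minimal sketch of what packing values at a given `bit_width` means. It is not the code from this PR and not the Doris implementation: the names `pack_bits` and `unpack_at` are hypothetical, the bit order (least significant bit first) is an assumption, and it handles only widths from 1 to 64, whereas the benchmark above goes up to 127. It is deliberately a slow bit-by-bit loop, meant only to show the layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the PR): stores the low `bit_width` bits of
// each value back to back in a byte buffer, least significant bit first.
std::vector<uint8_t> pack_bits(const std::vector<uint64_t>& values, int bit_width) {
    assert(bit_width >= 1 && bit_width <= 64);
    // Round the total bit count up to a whole number of bytes.
    std::vector<uint8_t> out((values.size() * bit_width + 7) / 8, 0);
    size_t bit_pos = 0;
    for (uint64_t v : values) {
        for (int b = 0; b < bit_width; ++b, ++bit_pos) {
            if ((v >> b) & 1u) {
                out[bit_pos / 8] |= static_cast<uint8_t>(1u << (bit_pos % 8));
            }
        }
    }
    return out;
}

// Hypothetical inverse: reads the value stored at position `index`.
uint64_t unpack_at(const std::vector<uint8_t>& packed, int bit_width, size_t index) {
    assert(bit_width >= 1 && bit_width <= 64);
    uint64_t v = 0;
    size_t bit_pos = index * bit_width;
    for (int b = 0; b < bit_width; ++b, ++bit_pos) {
        uint64_t bit = (packed[bit_pos / 8] >> (bit_pos % 8)) & 1u;
        v |= bit << b;
    }
    return v;
}
```

With `bit_width` set to 3, five values occupy 15 bits and therefore fit in two bytes. Faster implementations typically avoid the per-bit loop by shifting whole words, which is the general kind of optimization a benchmark like the one above would measure.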