klion26 commented on PR #9689:
URL: https://github.com/apache/arrow-rs/pull/9689#issuecomment-4301543572

   I can't reproduce the benchmark result on my laptop. Reviewing the change manually, 
it extracts the logic into an inline function so that it can be shared in 
multiple places. I can see two differences that might affect performance:
   - there may be one more branch for the branch predictor to guess
   - `<T::Native as NumCast>::from` changed to `T::Native::from`
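
To make the second difference concrete, here is a minimal sketch, not the arrow-rs code. `NumCastLike`, `convert_qualified` and `convert_shared` are hypothetical names, and the trait is a local stand-in for `num_traits::NumCast` so that the snippet compiles without external crates. In this sketch both spellings name the same trait method, so any measured difference would have to come from elsewhere (inlining decisions, branch layout). Whether the same holds in the PR depends on which traits bound `T::Native` there.

```rust
/// Hypothetical stand-in for `num_traits::NumCast`, kept local so the sketch
/// compiles without external crates. Not the arrow-rs API.
trait NumCastLike: Sized {
    /// Checked conversion: `None` when `n` does not fit in `Self`.
    fn from(n: i128) -> Option<Self>;
}

impl NumCastLike for i8 {
    fn from(n: i128) -> Option<Self> {
        <i8 as std::convert::TryFrom<i128>>::try_from(n).ok()
    }
}

impl NumCastLike for u16 {
    fn from(n: i128) -> Option<Self> {
        <u16 as std::convert::TryFrom<i128>>::try_from(n).ok()
    }
}

/// "Before" shape: the conversion is spelled with fully qualified syntax.
/// `raw` is the unscaled decimal value; the fractional part is truncated.
fn convert_qualified<T: NumCastLike>(raw: i128, scale: u32) -> Option<T> {
    let divisor = 10_i128.checked_pow(scale)?;
    <T as NumCastLike>::from(raw / divisor)
}

/// "After" shape: the same logic in an `#[inline]` helper that several call
/// sites could share, using the short path `T::from`.
#[inline]
fn convert_shared<T: NumCastLike>(raw: i128, scale: u32) -> Option<T> {
    let divisor = 10_i128.checked_pow(scale)?;
    T::from(raw / divisor)
}
```

For example, `convert_qualified::<i8>(12345, 2)` and `convert_shared::<i8>(12345, 2)` both return `Some(123)`, while `convert_shared::<i8>(30000, 2)` returns `None` because 300 does not fit in `i8`.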
   
   <details>
   <summary>all benchmarks on my laptop</summary>
   <p>
   
   ```
   group                                                              main-cast-0935                         variant-cast-decimal-0902
   -----                                                              --------------                         -------------------------
   "cast decimal128 to float32"                                       1.00     20.4±0.27µs        ? ?/sec    1.00     20.5±0.29µs        ? ?/sec
   "cast decimal128 to float64"                                       1.00     20.9±0.27µs        ? ?/sec    1.00     20.9±0.26µs        ? ?/sec
   "cast decimal128 to int16"                                         1.01     44.3±0.71µs        ? ?/sec    1.00     44.1±0.60µs        ? ?/sec
   "cast decimal128 to int32"                                         1.00     38.7±0.51µs        ? ?/sec    1.00     38.7±0.49µs        ? ?/sec
   "cast decimal128 to int64"                                         1.00     38.7±0.60µs        ? ?/sec    1.00     38.7±0.55µs        ? ?/sec
   "cast decimal128 to int8"                                          1.00     41.7±0.66µs        ? ?/sec    1.00     41.8±0.60µs        ? ?/sec
   "cast decimal128 to uint16"                                        1.04     49.5±0.52µs        ? ?/sec    1.00     47.4±0.62µs        ? ?/sec
   "cast decimal128 to uint32"                                        1.00     37.4±0.44µs        ? ?/sec    1.00     37.4±0.39µs        ? ?/sec
   "cast decimal128 to uint64"                                        1.00     38.3±0.50µs        ? ?/sec    1.00     38.3±0.56µs        ? ?/sec
   "cast decimal128 to uint8"                                         1.00     40.5±0.54µs        ? ?/sec    1.00     40.5±0.59µs        ? ?/sec
   "cast decimal256 to float32"                                       1.00     57.0±1.05µs        ? ?/sec    1.00     57.1±1.17µs        ? ?/sec
   "cast decimal256 to float64"                                       1.01     57.6±0.71µs        ? ?/sec    1.00     57.3±0.54µs        ? ?/sec
   "cast decimal256 to int16"                                         1.00    195.8±2.30µs        ? ?/sec    1.01    198.3±3.24µs        ? ?/sec
   "cast decimal256 to int32"                                         1.00    179.9±1.93µs        ? ?/sec    1.01    182.5±2.19µs        ? ?/sec
   "cast decimal256 to int64"                                         1.00    180.5±2.19µs        ? ?/sec    1.00    180.4±2.35µs        ? ?/sec
   "cast decimal256 to int8"                                          1.00    193.4±2.89µs        ? ?/sec    1.01    194.5±2.46µs        ? ?/sec
   "cast decimal256 to uint16"                                        1.00    193.3±2.44µs        ? ?/sec    1.00    193.7±2.07µs        ? ?/sec
   "cast decimal256 to uint32"                                        1.00    174.6±2.43µs        ? ?/sec    1.00    175.2±2.35µs        ? ?/sec
   "cast decimal256 to uint64"                                        1.00    173.4±2.32µs        ? ?/sec    1.00    173.2±2.44µs        ? ?/sec
   "cast decimal256 to uint8"                                         1.00    188.6±1.98µs        ? ?/sec    1.00    188.7±2.18µs        ? ?/sec
   "cast decimal32 to float32"                                        1.00      2.2±0.03µs        ? ?/sec    1.00      2.2±0.04µs        ? ?/sec
   "cast decimal32 to float64"                                        1.00      2.6±0.03µs        ? ?/sec    1.00      2.5±0.03µs        ? ?/sec
   "cast decimal32 to int16"                                          1.00     18.1±0.22µs        ? ?/sec    1.00     18.2±0.24µs        ? ?/sec
   "cast decimal32 to int32"                                          1.00     18.2±0.22µs        ? ?/sec    1.00     18.1±0.19µs        ? ?/sec
   "cast decimal32 to int64"                                          1.00     18.8±0.20µs        ? ?/sec    1.00     18.9±0.27µs        ? ?/sec
   "cast decimal32 to int8"                                           1.00     33.5±0.30µs        ? ?/sec    1.04     34.7±7.43µs        ? ?/sec
   "cast decimal32 to uint16"                                         1.00     18.1±0.22µs        ? ?/sec    1.00     18.2±0.26µs        ? ?/sec
   "cast decimal32 to uint32"                                         1.00     18.1±0.23µs        ? ?/sec    1.00     18.1±0.24µs        ? ?/sec
   "cast decimal32 to uint64"                                         1.00     18.8±0.22µs        ? ?/sec    1.00     18.8±0.22µs        ? ?/sec
   "cast decimal32 to uint8"                                          1.00     40.8±0.46µs        ? ?/sec    1.00     40.9±0.51µs        ? ?/sec
   "cast decimal64 to float32"                                        1.00  1855.0±24.26ns        ? ?/sec    1.00  1853.3±22.62ns        ? ?/sec
   "cast decimal64 to float64"                                        1.01      2.1±0.04µs        ? ?/sec    1.00      2.1±0.05µs        ? ?/sec
   "cast decimal64 to int16"                                          1.00     27.1±0.45µs        ? ?/sec    1.00     27.2±0.34µs        ? ?/sec
   "cast decimal64 to int32"                                          1.00     18.2±0.24µs        ? ?/sec    1.00     18.2±0.28µs        ? ?/sec
   "cast decimal64 to int64"                                          1.00     18.9±0.23µs        ? ?/sec    1.00     19.0±0.33µs        ? ?/sec
   "cast decimal64 to int8"                                           1.00     25.0±0.34µs        ? ?/sec    1.00     25.0±0.34µs        ? ?/sec
   "cast decimal64 to uint16"                                         1.00     29.4±0.52µs        ? ?/sec    1.01     29.8±0.71µs        ? ?/sec
   "cast decimal64 to uint32"                                         1.00     18.2±0.35µs        ? ?/sec    1.00     18.2±0.31µs        ? ?/sec
   "cast decimal64 to uint64"                                         1.00     19.0±0.28µs        ? ?/sec    1.00     18.9±0.26µs        ? ?/sec
   "cast decimal64 to uint8"                                          1.00     25.0±0.37µs        ? ?/sec    1.00     25.1±0.49µs        ? ?/sec
   "cast float32 to decimal128(32, 3)"                                1.00     31.3±0.63µs        ? ?/sec    1.00     31.4±0.63µs        ? ?/sec
   "cast float32 to decimal256(76, 4)"                                1.00   565.2±10.41µs        ? ?/sec    1.00    565.1±9.10µs        ? ?/sec
   "cast float32 to decimal32(9, 2)"                                  1.00     22.6±0.27µs        ? ?/sec    1.00     22.7±0.33µs        ? ?/sec
   "cast float32 to decimal64(18, 2"                                  1.00     21.6±0.32µs        ? ?/sec    1.00     21.7±0.33µs        ? ?/sec
   "cast float64 to decimal128(32, 3)"                                1.00     30.6±0.31µs        ? ?/sec    1.00     30.6±0.37µs        ? ?/sec
   "cast float64 to decimal256(76, 4)"                                1.00    559.3±7.19µs        ? ?/sec    1.00    560.0±7.11µs        ? ?/sec
   "cast float64 to decimal32(9, 2)"                                  1.00     22.4±0.27µs        ? ?/sec    1.00     22.5±0.28µs        ? ?/sec
   "cast float64 to decimal64(18, 2"                                  1.00     21.4±0.21µs        ? ?/sec    1.00     21.5±0.22µs        ? ?/sec
   "cast invalid float32 to decimal128(32, 3)"                        1.00     22.3±0.19µs        ? ?/sec    1.00     22.2±0.25µs        ? ?/sec
   "cast invalid float32 to decimal256(76, 4)"                        1.00     40.4±0.40µs        ? ?/sec    1.00     40.2±0.53µs        ? ?/sec
   "cast invalid float32 to decimal32(9, 2)"                          1.01     20.9±0.24µs        ? ?/sec    1.00     20.8±0.22µs        ? ?/sec
   "cast invalid float32 to decimal64(18, 2"                          1.01     22.0±0.56µs        ? ?/sec    1.00     21.8±0.25µs        ? ?/sec
   "cast invalid float64 to decimal32(9, 2)"                          1.00     21.1±0.24µs        ? ?/sec    1.00     21.1±0.23µs        ? ?/sec
   "cast invalid float64 to to decimal128(32, 3)"                     1.00     22.7±0.35µs        ? ?/sec    1.00     22.7±0.25µs        ? ?/sec
   "cast invalid float64 to to decimal256(76, 4)"                     1.00     39.5±0.43µs        ? ?/sec    1.04     41.1±0.52µs        ? ?/sec
   "cast invalid float64 to to decimal64(18, 2)"                      1.00     22.0±0.24µs        ? ?/sec    1.00     22.0±0.22µs        ? ?/sec
   "cast invalid string to decimal128(38, 3)"                         1.00   794.7±10.77µs        ? ?/sec    1.01    803.4±9.88µs        ? ?/sec
   "cast invalid string to decimal256(76, 4)"                         1.00   798.3±13.31µs        ? ?/sec    1.01    805.6±8.91µs        ? ?/sec
   "cast invalid string to decimal32(9, 2)"                           1.00   752.3±14.47µs        ? ?/sec    1.01    757.7±9.14µs        ? ?/sec
   "cast invalid string to decimal64(18, 2)"                          1.00    753.9±9.34µs        ? ?/sec    1.01   763.4±10.36µs        ? ?/sec
   "cast string to decimal128(38, 3)"                                 1.00    722.1±8.64µs        ? ?/sec    1.01    730.3±9.34µs        ? ?/sec
   "cast string to decimal256(76, 4)"                                 1.00    728.8±7.86µs        ? ?/sec    1.01    734.7±9.10µs        ? ?/sec
   "cast string to decimal32(9, 2)"                                   1.00   894.8±10.82µs        ? ?/sec    1.00   897.8±11.37µs        ? ?/sec
   "cast string to decimal64(18, 2)"                                  1.00   704.6±10.37µs        ? ?/sec    1.01   714.4±16.92µs        ? ?/sec
   cast binary view to string                                         1.00     78.6±1.08µs        ? ?/sec    1.00     78.9±1.27µs        ? ?/sec
   cast binary view to string view                                    1.00     59.8±0.70µs        ? ?/sec    1.00     59.9±0.77µs        ? ?/sec
   cast binary view to wide string                                    1.00     75.5±1.05µs        ? ?/sec    1.00     75.6±1.04µs        ? ?/sec
   cast date32 to date64 512                                          1.00    271.2±3.02ns        ? ?/sec    1.00    271.6±4.18ns        ? ?/sec
   cast date64 to date32 512                                          1.00    289.6±4.27ns        ? ?/sec    1.00    290.4±4.88ns        ? ?/sec
   cast decimal128 to decimal128 512                                  1.00      6.4±0.14µs        ? ?/sec    1.00      6.4±0.08µs        ? ?/sec
   cast decimal128 to decimal128 512 lower precision                  1.00     36.8±0.65µs        ? ?/sec    1.01     37.0±0.67µs        ? ?/sec
   cast decimal128 to decimal128 512 with lower scale (infallible)    1.00     36.1±0.58µs        ? ?/sec    1.00     36.0±0.59µs        ? ?/sec
   cast decimal128 to decimal128 512 with same scale                  1.02     54.3±0.83ns        ? ?/sec    1.00     53.1±0.89ns        ? ?/sec
   cast decimal128 to decimal256 512                                  1.00     21.5±0.29µs        ? ?/sec    1.00     21.6±0.37µs        ? ?/sec
   cast decimal256 to decimal128 512                                  1.00    333.1±4.18µs        ? ?/sec    1.00    332.4±3.16µs        ? ?/sec
   cast decimal256 to decimal256 512                                  1.03    89.3±11.01µs        ? ?/sec    1.00     86.5±1.27µs        ? ?/sec
   cast decimal256 to decimal256 512 with same scale                  1.00     53.4±0.93ns        ? ?/sec    1.00     53.6±0.80ns        ? ?/sec
   cast decimal32 to decimal32 512                                    1.00     21.1±0.27µs        ? ?/sec    1.00     21.1±0.30µs        ? ?/sec
   cast decimal32 to decimal32 512 lower precision                    1.01     42.8±1.45µs        ? ?/sec    1.00     42.3±0.82µs        ? ?/sec
   cast decimal32 to decimal64 512                                    1.00      3.5±0.05µs        ? ?/sec    1.02      3.6±0.06µs        ? ?/sec
   cast decimal64 to decimal32 512                                    1.00     19.9±0.20µs        ? ?/sec    1.00     19.9±0.29µs        ? ?/sec
   cast decimal64 to decimal64 512                                    1.00      3.4±0.06µs        ? ?/sec    1.00      3.4±0.05µs        ? ?/sec
   cast dict to string view                                           1.00     37.5±0.53µs        ? ?/sec    1.00     37.3±0.47µs        ? ?/sec
   cast f32 to string 512                                             1.00     11.5±0.14µs        ? ?/sec    1.00     11.5±0.22µs        ? ?/sec
   cast f64 to string 512                                             1.00     14.1±0.18µs        ? ?/sec    1.00     14.2±0.21µs        ? ?/sec
   cast float32 to int32 512                                          1.00    594.4±7.93ns        ? ?/sec    1.00    595.7±7.76ns        ? ?/sec
   cast float64 to float32 512                                        1.00    546.9±9.78ns        ? ?/sec    1.01   553.3±10.16ns        ? ?/sec
   cast float64 to uint64 512                                         1.00    626.6±9.52ns        ? ?/sec    1.00    625.0±6.96ns        ? ?/sec
   cast i64 to string 512                                             1.00      9.8±0.16µs        ? ?/sec    1.01      9.8±0.15µs        ? ?/sec
   cast int32 to float32 512                                          1.00    553.0±7.63ns        ? ?/sec    1.00    551.1±7.75ns        ? ?/sec
   cast int32 to float64 512                                          1.00    575.3±8.51ns        ? ?/sec    1.00    577.1±9.23ns        ? ?/sec
   cast int32 to int32 512                                            1.05    115.4±5.15ns        ? ?/sec    1.00    110.2±5.03ns        ? ?/sec
   cast int32 to int64 512                                            1.00    571.6±8.65ns        ? ?/sec    1.00    573.6±8.61ns        ? ?/sec
   cast int32 to uint32 512                                           1.00   912.9±15.86ns        ? ?/sec    1.02  935.4±158.71ns        ? ?/sec
   cast int64 to int32 512                                            1.01  1564.3±27.36ns        ? ?/sec    1.00  1551.2±21.00ns        ? ?/sec
   cast no runs of int32s to ree<int32>                               1.00     66.2±0.85µs        ? ?/sec    1.00     66.3±1.15µs        ? ?/sec
   cast runs of 10 string to ree<int32>                               1.00      8.1±0.15µs        ? ?/sec    1.00      8.1±0.10µs        ? ?/sec
   cast runs of 1000 int32s to ree<int32>                             1.00      2.7±0.03µs        ? ?/sec    1.00      2.7±0.04µs        ? ?/sec
   cast string single run to ree<int32>                               1.00     41.4±0.44µs        ? ?/sec    1.00     41.5±0.83µs        ? ?/sec
   cast string to binary view 512                                     1.00      2.3±0.04µs        ? ?/sec    1.01      2.3±0.04µs        ? ?/sec
   cast string view to binary view                                    1.00     39.8±0.77ns        ? ?/sec    1.00     39.7±0.61ns        ? ?/sec
   cast string view to dict                                           1.00    126.5±2.91µs        ? ?/sec    1.00    126.6±2.15µs        ? ?/sec
   cast string view to string                                         1.00     56.5±0.69µs        ? ?/sec    1.00     56.6±0.80µs        ? ?/sec
   cast string view to wide string                                    1.00     55.7±0.85µs        ? ?/sec    1.00     55.6±0.73µs        ? ?/sec
   cast time32s to time32ms 512                                       1.00    108.6±1.64ns        ? ?/sec    1.00    108.4±1.65ns        ? ?/sec
   cast time32s to time64us 512                                       1.00    274.1±3.38ns        ? ?/sec    1.01    275.7±4.03ns        ? ?/sec
   cast time64ns to time32s 512                                       1.00    289.7±3.60ns        ? ?/sec    1.00    289.9±3.99ns        ? ?/sec
   cast timestamp_ms to i64 512                                       1.05    152.2±2.07ns        ? ?/sec    1.00    144.8±2.69ns        ? ?/sec
   cast timestamp_ms to timestamp_ns 512                              1.00  1808.7±29.47ns        ? ?/sec    1.00  1802.3±23.53ns        ? ?/sec
   cast timestamp_ns to timestamp_s 512                               1.00    108.9±3.99ns        ? ?/sec    1.00    108.7±3.93ns        ? ?/sec
   cast utf8 to date32 512                                            1.00      6.5±0.09µs        ? ?/sec    1.00      6.5±0.09µs        ? ?/sec
   cast utf8 to date64 512                                            1.00     31.1±0.57µs        ? ?/sec    1.00     31.2±0.50µs        ? ?/sec
   cast utf8 to f32                                                   1.00     13.2±0.16µs        ? ?/sec    1.00     13.2±0.16µs        ? ?/sec
   cast wide string to binary view 512                                1.00      5.0±0.08µs        ? ?/sec    1.00      5.0±0.10µs        ? ?/sec
   ```
   
   </p>
   </details>
   
   <details>
   <summary>only keep the decimal64 to * benchmarks</summary>
   <p>
   
   ```
   group                              main-cast-0844                         variant-cast-decimal-0847
   -----                              --------------                         -------------------------
   "cast decimal64 to float32"        1.00  1835.6±32.51ns        ? ?/sec    1.01  1860.6±25.18ns        ? ?/sec
   "cast decimal64 to float64"        1.00      2.1±0.02µs        ? ?/sec    1.00      2.1±0.03µs        ? ?/sec
   "cast decimal64 to int16"          1.00     27.2±0.43µs        ? ?/sec    1.00     27.3±0.34µs        ? ?/sec
   "cast decimal64 to int32"          1.00     18.1±0.25µs        ? ?/sec    1.00     18.1±0.22µs        ? ?/sec
   "cast decimal64 to int64"          1.00     19.0±0.32µs        ? ?/sec    1.00     19.0±0.34µs        ? ?/sec
   "cast decimal64 to int8"           1.00     25.1±0.43µs        ? ?/sec    1.00     24.9±0.31µs        ? ?/sec
   "cast decimal64 to uint16"         1.00     29.2±0.34µs        ? ?/sec    1.01     29.5±0.39µs        ? ?/sec
   "cast decimal64 to uint32"         1.00     18.1±0.20µs        ? ?/sec    1.00     18.2±0.31µs        ? ?/sec
   "cast decimal64 to uint64"         1.00     19.0±0.25µs        ? ?/sec    1.00     18.9±0.28µs        ? ?/sec
   "cast decimal64 to uint8"          1.00     25.0±0.28µs        ? ?/sec    1.11     27.6±0.46µs        ? ?/sec
   cast decimal64 to decimal32 512    1.01     19.8±0.32µs        ? ?/sec    1.00     19.6±0.32µs        ? ?/sec
   cast decimal64 to decimal64 512    1.00      3.4±0.07µs        ? ?/sec    1.00      3.4±0.05µs        ? ?/sec
   
   ```
   
   </p>
   </details>

