klion26 commented on PR #9689: URL: https://github.com/apache/arrow-rs/pull/9689#issuecomment-4301543572
I can't reproduce the benchmark result on my laptop. Reviewing the change manually: it extracts the logic into an inline function so that it can be shared in multiple places. I can see two differences that might affect performance:

- there may be one more branch (a guess)
- `<T::Native as NumCast>::from` changed to `T::Native::from`

<details>
<summary>all benchmarks on my laptop</summary>
<p>

```
group  main-cast-0935  variant-cast-decimal-0902
-----  --------------  -------------------------
"cast decimal128 to float32"  1.00  20.4±0.27µs ? ?/sec  1.00  20.5±0.29µs ? ?/sec
"cast decimal128 to float64"  1.00  20.9±0.27µs ? ?/sec  1.00  20.9±0.26µs ? ?/sec
"cast decimal128 to int16"  1.01  44.3±0.71µs ? ?/sec  1.00  44.1±0.60µs ? ?/sec
"cast decimal128 to int32"  1.00  38.7±0.51µs ? ?/sec  1.00  38.7±0.49µs ? ?/sec
"cast decimal128 to int64"  1.00  38.7±0.60µs ? ?/sec  1.00  38.7±0.55µs ? ?/sec
"cast decimal128 to int8"  1.00  41.7±0.66µs ? ?/sec  1.00  41.8±0.60µs ? ?/sec
"cast decimal128 to uint16"  1.04  49.5±0.52µs ? ?/sec  1.00  47.4±0.62µs ? ?/sec
"cast decimal128 to uint32"  1.00  37.4±0.44µs ? ?/sec  1.00  37.4±0.39µs ? ?/sec
"cast decimal128 to uint64"  1.00  38.3±0.50µs ? ?/sec  1.00  38.3±0.56µs ? ?/sec
"cast decimal128 to uint8"  1.00  40.5±0.54µs ? ?/sec  1.00  40.5±0.59µs ? ?/sec
"cast decimal256 to float32"  1.00  57.0±1.05µs ? ?/sec  1.00  57.1±1.17µs ? ?/sec
"cast decimal256 to float64"  1.01  57.6±0.71µs ? ?/sec  1.00  57.3±0.54µs ? ?/sec
"cast decimal256 to int16"  1.00  195.8±2.30µs ? ?/sec  1.01  198.3±3.24µs ? ?/sec
"cast decimal256 to int32"  1.00  179.9±1.93µs ? ?/sec  1.01  182.5±2.19µs ? ?/sec
"cast decimal256 to int64"  1.00  180.5±2.19µs ? ?/sec  1.00  180.4±2.35µs ? ?/sec
"cast decimal256 to int8"  1.00  193.4±2.89µs ? ?/sec  1.01  194.5±2.46µs ? ?/sec
"cast decimal256 to uint16"  1.00  193.3±2.44µs ? ?/sec  1.00  193.7±2.07µs ? ?/sec
"cast decimal256 to uint32"  1.00  174.6±2.43µs ? ?/sec  1.00  175.2±2.35µs ? ?/sec
"cast decimal256 to uint64"  1.00  173.4±2.32µs ? ?/sec  1.00  173.2±2.44µs ? ?/sec
"cast decimal256 to uint8"  1.00  188.6±1.98µs ? ?/sec  1.00  188.7±2.18µs ? ?/sec
"cast decimal32 to float32"  1.00  2.2±0.03µs ? ?/sec  1.00  2.2±0.04µs ? ?/sec
"cast decimal32 to float64"  1.00  2.6±0.03µs ? ?/sec  1.00  2.5±0.03µs ? ?/sec
"cast decimal32 to int16"  1.00  18.1±0.22µs ? ?/sec  1.00  18.2±0.24µs ? ?/sec
"cast decimal32 to int32"  1.00  18.2±0.22µs ? ?/sec  1.00  18.1±0.19µs ? ?/sec
"cast decimal32 to int64"  1.00  18.8±0.20µs ? ?/sec  1.00  18.9±0.27µs ? ?/sec
"cast decimal32 to int8"  1.00  33.5±0.30µs ? ?/sec  1.04  34.7±7.43µs ? ?/sec
"cast decimal32 to uint16"  1.00  18.1±0.22µs ? ?/sec  1.00  18.2±0.26µs ? ?/sec
"cast decimal32 to uint32"  1.00  18.1±0.23µs ? ?/sec  1.00  18.1±0.24µs ? ?/sec
"cast decimal32 to uint64"  1.00  18.8±0.22µs ? ?/sec  1.00  18.8±0.22µs ? ?/sec
"cast decimal32 to uint8"  1.00  40.8±0.46µs ? ?/sec  1.00  40.9±0.51µs ? ?/sec
"cast decimal64 to float32"  1.00  1855.0±24.26ns ? ?/sec  1.00  1853.3±22.62ns ? ?/sec
"cast decimal64 to float64"  1.01  2.1±0.04µs ? ?/sec  1.00  2.1±0.05µs ? ?/sec
"cast decimal64 to int16"  1.00  27.1±0.45µs ? ?/sec  1.00  27.2±0.34µs ? ?/sec
"cast decimal64 to int32"  1.00  18.2±0.24µs ? ?/sec  1.00  18.2±0.28µs ? ?/sec
"cast decimal64 to int64"  1.00  18.9±0.23µs ? ?/sec  1.00  19.0±0.33µs ? ?/sec
"cast decimal64 to int8"  1.00  25.0±0.34µs ? ?/sec  1.00  25.0±0.34µs ? ?/sec
"cast decimal64 to uint16"  1.00  29.4±0.52µs ? ?/sec  1.01  29.8±0.71µs ? ?/sec
"cast decimal64 to uint32"  1.00  18.2±0.35µs ? ?/sec  1.00  18.2±0.31µs ? ?/sec
"cast decimal64 to uint64"  1.00  19.0±0.28µs ? ?/sec  1.00  18.9±0.26µs ? ?/sec
"cast decimal64 to uint8"  1.00  25.0±0.37µs ? ?/sec  1.00  25.1±0.49µs ? ?/sec
"cast float32 to decimal128(32, 3)"  1.00  31.3±0.63µs ? ?/sec  1.00  31.4±0.63µs ? ?/sec
"cast float32 to decimal256(76, 4)"  1.00  565.2±10.41µs ? ?/sec  1.00  565.1±9.10µs ? ?/sec
"cast float32 to decimal32(9, 2)"  1.00  22.6±0.27µs ? ?/sec  1.00  22.7±0.33µs ? ?/sec
"cast float32 to decimal64(18, 2"  1.00  21.6±0.32µs ? ?/sec  1.00  21.7±0.33µs ? ?/sec
"cast float64 to decimal128(32, 3)"  1.00  30.6±0.31µs ? ?/sec  1.00  30.6±0.37µs ? ?/sec
"cast float64 to decimal256(76, 4)"  1.00  559.3±7.19µs ? ?/sec  1.00  560.0±7.11µs ? ?/sec
"cast float64 to decimal32(9, 2)"  1.00  22.4±0.27µs ? ?/sec  1.00  22.5±0.28µs ? ?/sec
"cast float64 to decimal64(18, 2"  1.00  21.4±0.21µs ? ?/sec  1.00  21.5±0.22µs ? ?/sec
"cast invalid float32 to decimal128(32, 3)"  1.00  22.3±0.19µs ? ?/sec  1.00  22.2±0.25µs ? ?/sec
"cast invalid float32 to decimal256(76, 4)"  1.00  40.4±0.40µs ? ?/sec  1.00  40.2±0.53µs ? ?/sec
"cast invalid float32 to decimal32(9, 2)"  1.01  20.9±0.24µs ? ?/sec  1.00  20.8±0.22µs ? ?/sec
"cast invalid float32 to decimal64(18, 2"  1.01  22.0±0.56µs ? ?/sec  1.00  21.8±0.25µs ? ?/sec
"cast invalid float64 to decimal32(9, 2)"  1.00  21.1±0.24µs ? ?/sec  1.00  21.1±0.23µs ? ?/sec
"cast invalid float64 to to decimal128(32, 3)"  1.00  22.7±0.35µs ? ?/sec  1.00  22.7±0.25µs ? ?/sec
"cast invalid float64 to to decimal256(76, 4)"  1.00  39.5±0.43µs ? ?/sec  1.04  41.1±0.52µs ? ?/sec
"cast invalid float64 to to decimal64(18, 2)"  1.00  22.0±0.24µs ? ?/sec  1.00  22.0±0.22µs ? ?/sec
"cast invalid string to decimal128(38, 3)"  1.00  794.7±10.77µs ? ?/sec  1.01  803.4±9.88µs ? ?/sec
"cast invalid string to decimal256(76, 4)"  1.00  798.3±13.31µs ? ?/sec  1.01  805.6±8.91µs ? ?/sec
"cast invalid string to decimal32(9, 2)"  1.00  752.3±14.47µs ? ?/sec  1.01  757.7±9.14µs ? ?/sec
"cast invalid string to decimal64(18, 2)"  1.00  753.9±9.34µs ? ?/sec  1.01  763.4±10.36µs ? ?/sec
"cast string to decimal128(38, 3)"  1.00  722.1±8.64µs ? ?/sec  1.01  730.3±9.34µs ? ?/sec
"cast string to decimal256(76, 4)"  1.00  728.8±7.86µs ? ?/sec  1.01  734.7±9.10µs ? ?/sec
"cast string to decimal32(9, 2)"  1.00  894.8±10.82µs ? ?/sec  1.00  897.8±11.37µs ? ?/sec
"cast string to decimal64(18, 2)"  1.00  704.6±10.37µs ? ?/sec  1.01  714.4±16.92µs ? ?/sec
cast binary view to string  1.00  78.6±1.08µs ? ?/sec  1.00  78.9±1.27µs ? ?/sec
cast binary view to string view  1.00  59.8±0.70µs ? ?/sec  1.00  59.9±0.77µs ? ?/sec
cast binary view to wide string  1.00  75.5±1.05µs ? ?/sec  1.00  75.6±1.04µs ? ?/sec
cast date32 to date64 512  1.00  271.2±3.02ns ? ?/sec  1.00  271.6±4.18ns ? ?/sec
cast date64 to date32 512  1.00  289.6±4.27ns ? ?/sec  1.00  290.4±4.88ns ? ?/sec
cast decimal128 to decimal128 512  1.00  6.4±0.14µs ? ?/sec  1.00  6.4±0.08µs ? ?/sec
cast decimal128 to decimal128 512 lower precision  1.00  36.8±0.65µs ? ?/sec  1.01  37.0±0.67µs ? ?/sec
cast decimal128 to decimal128 512 with lower scale (infallible)  1.00  36.1±0.58µs ? ?/sec  1.00  36.0±0.59µs ? ?/sec
cast decimal128 to decimal128 512 with same scale  1.02  54.3±0.83ns ? ?/sec  1.00  53.1±0.89ns ? ?/sec
cast decimal128 to decimal256 512  1.00  21.5±0.29µs ? ?/sec  1.00  21.6±0.37µs ? ?/sec
cast decimal256 to decimal128 512  1.00  333.1±4.18µs ? ?/sec  1.00  332.4±3.16µs ? ?/sec
cast decimal256 to decimal256 512  1.03  89.3±11.01µs ? ?/sec  1.00  86.5±1.27µs ? ?/sec
cast decimal256 to decimal256 512 with same scale  1.00  53.4±0.93ns ? ?/sec  1.00  53.6±0.80ns ? ?/sec
cast decimal32 to decimal32 512  1.00  21.1±0.27µs ? ?/sec  1.00  21.1±0.30µs ? ?/sec
cast decimal32 to decimal32 512 lower precision  1.01  42.8±1.45µs ? ?/sec  1.00  42.3±0.82µs ? ?/sec
cast decimal32 to decimal64 512  1.00  3.5±0.05µs ? ?/sec  1.02  3.6±0.06µs ? ?/sec
cast decimal64 to decimal32 512  1.00  19.9±0.20µs ? ?/sec  1.00  19.9±0.29µs ? ?/sec
cast decimal64 to decimal64 512  1.00  3.4±0.06µs ? ?/sec  1.00  3.4±0.05µs ? ?/sec
cast dict to string view  1.00  37.5±0.53µs ? ?/sec  1.00  37.3±0.47µs ? ?/sec
cast f32 to string 512  1.00  11.5±0.14µs ? ?/sec  1.00  11.5±0.22µs ? ?/sec
cast f64 to string 512  1.00  14.1±0.18µs ? ?/sec  1.00  14.2±0.21µs ? ?/sec
cast float32 to int32 512  1.00  594.4±7.93ns ? ?/sec  1.00  595.7±7.76ns ? ?/sec
cast float64 to float32 512  1.00  546.9±9.78ns ? ?/sec  1.01  553.3±10.16ns ? ?/sec
cast float64 to uint64 512  1.00  626.6±9.52ns ? ?/sec  1.00  625.0±6.96ns ? ?/sec
cast i64 to string 512  1.00  9.8±0.16µs ? ?/sec  1.01  9.8±0.15µs ? ?/sec
cast int32 to float32 512  1.00  553.0±7.63ns ? ?/sec  1.00  551.1±7.75ns ? ?/sec
cast int32 to float64 512  1.00  575.3±8.51ns ? ?/sec  1.00  577.1±9.23ns ? ?/sec
cast int32 to int32 512  1.05  115.4±5.15ns ? ?/sec  1.00  110.2±5.03ns ? ?/sec
cast int32 to int64 512  1.00  571.6±8.65ns ? ?/sec  1.00  573.6±8.61ns ? ?/sec
cast int32 to uint32 512  1.00  912.9±15.86ns ? ?/sec  1.02  935.4±158.71ns ? ?/sec
cast int64 to int32 512  1.01  1564.3±27.36ns ? ?/sec  1.00  1551.2±21.00ns ? ?/sec
cast no runs of int32s to ree<int32>  1.00  66.2±0.85µs ? ?/sec  1.00  66.3±1.15µs ? ?/sec
cast runs of 10 string to ree<int32>  1.00  8.1±0.15µs ? ?/sec  1.00  8.1±0.10µs ? ?/sec
cast runs of 1000 int32s to ree<int32>  1.00  2.7±0.03µs ? ?/sec  1.00  2.7±0.04µs ? ?/sec
cast string single run to ree<int32>  1.00  41.4±0.44µs ? ?/sec  1.00  41.5±0.83µs ? ?/sec
cast string to binary view 512  1.00  2.3±0.04µs ? ?/sec  1.01  2.3±0.04µs ? ?/sec
cast string view to binary view  1.00  39.8±0.77ns ? ?/sec  1.00  39.7±0.61ns ? ?/sec
cast string view to dict  1.00  126.5±2.91µs ? ?/sec  1.00  126.6±2.15µs ? ?/sec
cast string view to string  1.00  56.5±0.69µs ? ?/sec  1.00  56.6±0.80µs ? ?/sec
cast string view to wide string  1.00  55.7±0.85µs ? ?/sec  1.00  55.6±0.73µs ? ?/sec
cast time32s to time32ms 512  1.00  108.6±1.64ns ? ?/sec  1.00  108.4±1.65ns ? ?/sec
cast time32s to time64us 512  1.00  274.1±3.38ns ? ?/sec  1.01  275.7±4.03ns ? ?/sec
cast time64ns to time32s 512  1.00  289.7±3.60ns ? ?/sec  1.00  289.9±3.99ns ? ?/sec
cast timestamp_ms to i64 512  1.05  152.2±2.07ns ? ?/sec  1.00  144.8±2.69ns ? ?/sec
cast timestamp_ms to timestamp_ns 512  1.00  1808.7±29.47ns ? ?/sec  1.00  1802.3±23.53ns ? ?/sec
cast timestamp_ns to timestamp_s 512  1.00  108.9±3.99ns ? ?/sec  1.00  108.7±3.93ns ? ?/sec
cast utf8 to date32 512  1.00  6.5±0.09µs ? ?/sec  1.00  6.5±0.09µs ? ?/sec
cast utf8 to date64 512  1.00  31.1±0.57µs ? ?/sec  1.00  31.2±0.50µs ? ?/sec
cast utf8 to f32  1.00  13.2±0.16µs ? ?/sec  1.00  13.2±0.16µs ? ?/sec
cast wide string to binary view 512  1.00  5.0±0.08µs ? ?/sec  1.00  5.0±0.10µs ? ?/sec
```

</p>
</details>

<details>
<summary>only keep the decimal64 to * benchmarks</summary>
<p>

```
group  main-cast-0844  variant-cast-decimal-0847
-----  --------------  -------------------------
"cast decimal64 to float32"  1.00  1835.6±32.51ns ? ?/sec  1.01  1860.6±25.18ns ? ?/sec
"cast decimal64 to float64"  1.00  2.1±0.02µs ? ?/sec  1.00  2.1±0.03µs ? ?/sec
"cast decimal64 to int16"  1.00  27.2±0.43µs ? ?/sec  1.00  27.3±0.34µs ? ?/sec
"cast decimal64 to int32"  1.00  18.1±0.25µs ? ?/sec  1.00  18.1±0.22µs ? ?/sec
"cast decimal64 to int64"  1.00  19.0±0.32µs ? ?/sec  1.00  19.0±0.34µs ? ?/sec
"cast decimal64 to int8"  1.00  25.1±0.43µs ? ?/sec  1.00  24.9±0.31µs ? ?/sec
"cast decimal64 to uint16"  1.00  29.2±0.34µs ? ?/sec  1.01  29.5±0.39µs ? ?/sec
"cast decimal64 to uint32"  1.00  18.1±0.20µs ? ?/sec  1.00  18.2±0.31µs ? ?/sec
"cast decimal64 to uint64"  1.00  19.0±0.25µs ? ?/sec  1.00  18.9±0.28µs ? ?/sec
"cast decimal64 to uint8"  1.00  25.0±0.28µs ? ?/sec  1.11  27.6±0.46µs ? ?/sec
cast decimal64 to decimal32 512  1.01  19.8±0.32µs ? ?/sec  1.00  19.6±0.32µs ? ?/sec
cast decimal64 to decimal64 512  1.00  3.4±0.07µs ? ?/sec  1.00  3.4±0.05µs ? ?/sec
```

</p>
</details>

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
