Re: [PR] Reduce size of `Expr` struct [datafusion]
alamb commented on code in PR #16207:
URL: https://github.com/apache/datafusion/pull/16207#discussion_r2133774454
##
datafusion/expr/src/expr.rs:
##
@@ -330,7 +331,7 @@ pub enum Expr {
/// [`ExprFunctionExt`]: crate::expr_fn::ExprFunctionExt
AggregateFunction(AggregateFunction),
/// Call a window function with a set of arguments.
-WindowFunction(WindowFunction),
+WindowFunction(Box),
Review Comment:
Turns out I was doing something else and got the size down to `128`:
- https://github.com/apache/datafusion/pull/16320
Hopefully this will merge in a bit
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
alamb merged PR #16207: URL: https://github.com/apache/datafusion/pull/16207 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
alamb commented on PR #16207: URL: https://github.com/apache/datafusion/pull/16207#issuecomment-2930174694 π -- thanks again @hendrikmakait -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
alamb commented on PR #16207: URL: https://github.com/apache/datafusion/pull/16207#issuecomment-2927412038 I plan to merge this tomorrow so it can be included in DataFusion 48.0.0 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
Dandandan commented on code in PR #16207:
URL: https://github.com/apache/datafusion/pull/16207#discussion_r2116490428
##
datafusion/expr/src/expr.rs:
##
@@ -330,7 +331,7 @@ pub enum Expr {
/// [`ExprFunctionExt`]: crate::expr_fn::ExprFunctionExt
AggregateFunction(AggregateFunction),
/// Call a window function with a set of arguments.
-WindowFunction(WindowFunction),
+WindowFunction(Box),
Review Comment:
Agree - although it seems the size of the changes will be lower than 272 to
144 (I hoped maybe there would be a single one to reduce it from 144 -> 80).
Thanks for checking!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
alamb commented on PR #16207: URL: https://github.com/apache/datafusion/pull/16207#issuecomment-2922908904 I'll leave this open for a few days to gather any remaining feedback -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
alamb commented on PR #16207: URL: https://github.com/apache/datafusion/pull/16207#issuecomment-2922659103 π€: Benchmark completed Details ``` group main reduce-expr-struct-size - --- logical_aggregate_with_join 1.03749.1Β±2.47Β΅s? ?/sec1.00728.8Β±3.55Β΅s? ?/sec logical_select_all_from_1000 1.00123.6Β±0.31ms? ?/sec1.00123.5Β±0.23ms? ?/sec logical_select_one_from_700 1.01422.4Β±1.88Β΅s? ?/sec1.00419.2Β±1.88Β΅s? ?/sec logical_trivial_join_high_numbered_columns1.02385.4Β±5.88Β΅s? ?/sec1.00378.7Β±6.61Β΅s? ?/sec logical_trivial_join_low_numbered_columns 1.01370.3Β±1.93Β΅s? ?/sec1.00365.2Β±3.99Β΅s? ?/sec physical_intersection 1.04 869.0Β±11.57Β΅s? ?/sec1.00831.8Β±5.20Β΅s? ?/sec physical_join_consider_sort 1.03 1402.3Β±6.23Β΅s? ?/sec1.00 1355.0Β±5.68Β΅s? ?/sec physical_join_distinct1.01359.1Β±2.18Β΅s? ?/sec1.00354.8Β±1.38Β΅s? ?/sec physical_many_self_joins 1.02 10.4Β±0.05ms? ?/sec1.00 10.2Β±0.05ms? ?/sec physical_plan_clickbench_all 1.03147.7Β±1.46ms? ?/sec1.00143.3Β±1.44ms? ?/sec physical_plan_clickbench_q1 1.02 1730.9Β±17.67Β΅s? ?/sec1.00 1691.0Β±22.98Β΅s? ?/sec physical_plan_clickbench_q10 1.05 2.5Β±0.02ms? ?/sec1.00 2.4Β±0.04ms? ?/sec physical_plan_clickbench_q11 1.02 2.6Β±0.03ms? ?/sec1.00 2.5Β±0.03ms? ?/sec physical_plan_clickbench_q12 1.03 2.7Β±0.03ms? ?/sec1.00 2.6Β±0.05ms? ?/sec physical_plan_clickbench_q13 1.04 2.4Β±0.03ms? ?/sec1.00 2.3Β±0.03ms? ?/sec physical_plan_clickbench_q14 1.02 2.6Β±0.02ms? ?/sec1.00 2.5Β±0.04ms? ?/sec physical_plan_clickbench_q15 1.05 2.5Β±0.03ms? ?/sec1.00 2.4Β±0.02ms? ?/sec physical_plan_clickbench_q16 1.03 2.3Β±0.03ms? ?/sec1.00 2.3Β±0.02ms? ?/sec physical_plan_clickbench_q17 1.03 2.4Β±0.02ms? ?/sec1.00 2.4Β±0.02ms? ?/sec physical_plan_clickbench_q18 1.02 1998.0Β±14.44Β΅s? ?/sec1.00 1961.3Β±21.79Β΅s? ?/sec physical_plan_clickbench_q19 1.05 2.9Β±0.05ms? ?/sec1.00 2.8Β±0.04ms? ?/sec physical_plan_clickbench_q2 1.02 1953.9Β±22.03Β΅s? ?/sec1.00 1910.3Β±21.29Β΅s? ?/sec physical_plan_clickbench_q20 1.02 1690.4Β±24.95Β΅s? ?/sec1.00 1657.7Β±19.62Β΅s? ?/sec physical_plan_clickbench_q21 1.04 2.0Β±0.06ms? ?/sec1.00 1955.6Β±24.11Β΅s? ?/sec physical_plan_clickbench_q22 1.04 2.6Β±0.04ms? ?/sec1.00 2.5Β±0.03ms? ?/sec physical_plan_clickbench_q23 1.07 3.0Β±0.05ms? ?/sec1.00 2.8Β±0.03ms? ?/sec physical_plan_clickbench_q24 1.04 4.7Β±0.08ms? ?/sec1.00 4.5Β±0.04ms? ?/sec physical_plan_clickbench_q25 1.03 2.0Β±0.02ms? ?/sec1.00 1978.0Β±26.06Β΅s? ?/sec physical_plan_clickbench_q26 1.02 1838.4Β±25.40Β΅s? ?/sec1.00 1801.0Β±23.81Β΅s? ?/sec physical_plan_clickbench_q27 1.03 2.1Β±0.02ms? ?/sec1.00 1996.3Β±19.56Β΅s? ?/sec physical_plan_clickbench_q28 1.05 2.9Β±0.02ms? ?/sec1.00 2.8Β±0.02ms? ?/sec physical_plan_clickbench_q29 1.05 3.6Β±0.04ms? ?/sec1.00 3.5Β±0.02ms? ?/sec physical_plan_clickbench_q3 1.03 1943.5Β±29.11Β΅s? ?/sec1.00 1884.8Β±20.97Β΅s? ?/sec physical_plan_clickbench_q30 1.06 15.2Β±0.16ms? ?/sec1.00 14.3Β±0.12ms? ?/sec physical_plan_clickbench_q31 1.04 2.9Β±0.03ms? ?/sec1.00 2.8Β±0.03ms? ?/sec physical_plan_clickbench_q32 1.04 2.9Β±0.02ms? ?/sec1.00 2.8Β±0.03ms? ?/sec physical_plan_clickbench_q33 1.03 2.5Β±0.02ms? ?/sec1.00 2.4Β±
Re: [PR] Reduce size of `Expr` struct [datafusion]
alamb commented on PR #16207: URL: https://github.com/apache/datafusion/pull/16207#issuecomment-2922755179 π€: Benchmark completed Details ``` Comparing HEAD and reduce-expr-struct-size Benchmark clickbench_extended.json ββββ³β³ββ³ββββ β Queryβ HEAD β reduce-expr-struct-size βChange β β‘ββββββββββ© β QQuery 0 β 1942.63ms β 1966.87ms β no change β β QQuery 1 β 717.15ms β700.88ms β no change β β QQuery 2 β 1467.62ms β 1429.88ms β no change β β QQuery 3 β 695.60ms β685.65ms β no change β β QQuery 4 β 1462.18ms β 1474.42ms β no change β β QQuery 5 β 15721.51ms β 15581.23ms β no change β β QQuery 6 β 2062.32ms β 2036.21ms β no change β β QQuery 7 β 2270.27ms β 2083.30ms β +1.09x faster β β QQuery 8 β 836.18ms β845.40ms β no change β ββββ΄β΄ββ΄ββββ ββ³β β Benchmark Summary ββ β‘ββ© β Total Time (HEAD) β 27175.46ms β β Total Time (reduce-expr-struct-size) β 26803.84ms β β Average Time (HEAD)β 3019.50ms β β Average Time (reduce-expr-struct-size) β 2978.20ms β β Queries Faster β 1 β β Queries Slower β 0 β β Queries with No Change β 8 β ββ΄β Benchmark clickbench_partitioned.json ββββ³β³ββ³ββββ β Queryβ HEAD β reduce-expr-struct-size βChange β β‘ββββββββββ© β QQuery 0 β15.63ms β 15.00ms β no change β β QQuery 1 β33.27ms β 32.47ms β no change β β QQuery 2 β82.50ms β 81.25ms β no change β β QQuery 3 β94.25ms β 98.21ms β no change β β QQuery 4 β 587.93ms β571.78ms β no change β β QQuery 5 β 833.99ms β847.44ms β no change β β QQuery 6 β24.18ms β 23.83ms β no change β β QQuery 7 β39.01ms β 38.25ms β no change β β QQuery 8 β 905.00ms β916.35ms β no change β β QQuery 9 β 1212.46ms β 1222.81ms β no change β β QQuery 10β 260.41ms β262.26ms β no change β β QQuery 11β 292.92ms β298.17ms β no change β β QQuery 12β 895.98ms β925.99ms β no change β β QQuery 13β 1364.30ms β 1194.60ms β +1.14x faster β β QQuery 14β 842.70ms β849.81ms β no change β β QQuery 15β 823.74ms β816.12ms β no change β β QQuery 16β 1750.44ms β 1737.46ms β no change β β QQuery 17β 1612.09ms β 1618.81ms β no change β β QQuery 18β 3077.16ms β 3088.23ms β no change β β QQuery 19β82.83ms β 85.59ms β no change β β QQuery 20β 1191.35ms β 1162.37ms β no change β β QQuery 21β 1333.85ms β 1333.28ms β no change β β QQuery 22β 2210.24ms β 2214.84ms β no change β β QQuery 23β 8055.93ms β 8149.44ms β no change β β QQuery 24β 475.26ms β473.62ms β no change β β QQuery 25β 387.85ms β392.97ms β no change β β QQuery 26β 524.75ms β523.65ms β no change β β QQuery 27β 1607.14ms β 1627.18ms β no change β β QQuery 28β 12685.19ms β 13601.69ms β 1.07x slower β β QQuery 29β 535.98ms β525.12ms β no change β β QQuery 30β 789.38ms β807.62ms β no change β β QQuery 31β 851.48ms β851.02ms β no change β β QQuery 32β 2663.64ms β 2669.92ms β no change β β QQuery 33β 3352.71ms β 3397.47ms β no change β β QQuery 34β 3365.36ms β 3385.66ms β no change β β QQuery 35β 1276.16ms β 1314.40ms β no change β β QQuery 36β 123.82ms β126.31ms β
Re: [PR] Reduce size of `Expr` struct [datafusion]
alamb commented on PR #16207: URL: https://github.com/apache/datafusion/pull/16207#issuecomment-2922659200 π€ `./gh_compare_branch.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh) Running Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubuntu SMP Wed Apr 2 16:34:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Comparing reduce-expr-struct-size (3ad082c3306884d13c795e7b37c70bf79963d355) to 2c2f225926958b6abf06b01fcfb594017531043c [diff](https://github.com/apache/datafusion/compare/2c2f225926958b6abf06b01fcfb594017531043c..3ad082c3306884d13c795e7b37c70bf79963d355) Benchmarks: tpch_mem clickbench_partitioned clickbench_extended Results will be posted here when complete -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
alamb commented on code in PR #16207:
URL: https://github.com/apache/datafusion/pull/16207#discussion_r2116011961
##
datafusion/expr/src/expr.rs:
##
@@ -330,7 +331,7 @@ pub enum Expr {
/// [`ExprFunctionExt`]: crate::expr_fn::ExprFunctionExt
AggregateFunction(AggregateFunction),
/// Call a window function with a set of arguments.
-WindowFunction(WindowFunction),
+WindowFunction(Box),
Review Comment:
Maybe as a follow on experiment we can try to box Alias and
OuterReferenceColumn too -- but I recommend not in this PR
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
alamb commented on PR #16207: URL: https://github.com/apache/datafusion/pull/16207#issuecomment-2922513729 I kicked off some benchmarks for this PR -- thank you @hendrikmakait π -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
alamb commented on PR #16207: URL: https://github.com/apache/datafusion/pull/16207#issuecomment-2922512365 π€ `./gh_compare_branch_bench.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch_bench.sh) Running Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubuntu SMP Wed Apr 2 16:34:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Comparing reduce-expr-struct-size (3ad082c3306884d13c795e7b37c70bf79963d355) to 2c2f225926958b6abf06b01fcfb594017531043c [diff](https://github.com/apache/datafusion/compare/2c2f225926958b6abf06b01fcfb594017531043c..3ad082c3306884d13c795e7b37c70bf79963d355) BENCH_NAME=sql_planner BENCH_COMMAND=cargo bench --bench sql_planner BENCH_FILTER= BENCH_BRANCH_NAME=reduce-expr-struct-size Results will be posted here when complete -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
hendrikmakait commented on code in PR #16207:
URL: https://github.com/apache/datafusion/pull/16207#discussion_r2114253916
##
datafusion/expr/src/expr.rs:
##
@@ -330,7 +331,7 @@ pub enum Expr {
/// [`ExprFunctionExt`]: crate::expr_fn::ExprFunctionExt
AggregateFunction(AggregateFunction),
/// Call a window function with a set of arguments.
-WindowFunction(WindowFunction),
+WindowFunction(Box),
Review Comment:
It looks like `Alias` might be the culprit here (with some padding):
`size_of::()` is 136, followed by `size_of::<(DataType, Column)>()`
(alias `OuterReferenceColumn`) with 128.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
hendrikmakait commented on code in PR #16207:
URL: https://github.com/apache/datafusion/pull/16207#discussion_r2114223556
##
datafusion/expr/src/expr.rs:
##
@@ -330,7 +331,7 @@ pub enum Expr {
/// [`ExprFunctionExt`]: crate::expr_fn::ExprFunctionExt
AggregateFunction(AggregateFunction),
/// Call a window function with a set of arguments.
-WindowFunction(WindowFunction),
+WindowFunction(Box),
Review Comment:
Good question! I'm fairly new to Rust; what's a good way to analyze the enum
variant sizes?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
Dandandan commented on code in PR #16207:
URL: https://github.com/apache/datafusion/pull/16207#discussion_r2113426682
##
datafusion/expr/src/expr.rs:
##
@@ -330,7 +331,7 @@ pub enum Expr {
/// [`ExprFunctionExt`]: crate::expr_fn::ExprFunctionExt
AggregateFunction(AggregateFunction),
/// Call a window function with a set of arguments.
-WindowFunction(WindowFunction),
+WindowFunction(Box),
Review Comment:
I wonder, what is now 144 bytes?``
##
datafusion/expr/src/expr.rs:
##
@@ -330,7 +331,7 @@ pub enum Expr {
/// [`ExprFunctionExt`]: crate::expr_fn::ExprFunctionExt
AggregateFunction(AggregateFunction),
/// Call a window function with a set of arguments.
-WindowFunction(WindowFunction),
+WindowFunction(Box),
Review Comment:
I wonder, what is now 144 bytes?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
alamb closed pull request #14366: Reduce size of `Expr` struct URL: https://github.com/apache/datafusion/pull/14366 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
alamb commented on PR #14366: URL: https://github.com/apache/datafusion/pull/14366#issuecomment-2913489237 Filed https://github.com/apache/datafusion/issues/16199 to track -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
alamb commented on PR #14366: URL: https://github.com/apache/datafusion/pull/14366#issuecomment-2909160571 > If these benchmarks are reliable, then this would be a pretty neat improvement. Oh, man I missed the benchmark results (I forgot to check them). I'll ressurect this PR over the next week or so and rerun benchmarks Thank you very much for the shout @crepererum -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
crepererum commented on PR #14366: URL: https://github.com/apache/datafusion/pull/14366#issuecomment-2909035054 If these benchmarks are reliable, then this would be a pretty neat improvement. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
alamb commented on PR #14366: URL: https://github.com/apache/datafusion/pull/14366#issuecomment-2887593105 π€: Benchmark completed Details ``` group alamb_make_expr_smaller main - --- logical_aggregate_with_join 1.00 1486.0Β±16.67Β΅s? ?/sec1.02 1519.2Β±14.28Β΅s? ?/sec logical_select_all_from_1000 1.00 4.5Β±0.02ms? ?/sec1.01 4.6Β±0.03ms? ?/sec logical_select_one_from_700 1.00 1182.9Β±12.15Β΅s? ?/sec1.03 1213.7Β±19.22Β΅s? ?/sec logical_trivial_join_high_numbered_columns1.00 1148.5Β±16.11Β΅s? ?/sec1.02 1171.8Β±12.08Β΅s? ?/sec logical_trivial_join_low_numbered_columns 1.00 1140.3Β±19.70Β΅s? ?/sec1.02 1164.6Β±17.11Β΅s? ?/sec physical_intersection 1.00 2.4Β±0.02ms? ?/sec1.02 2.5Β±0.02ms? ?/sec physical_join_consider_sort 1.00 3.2Β±0.02ms? ?/sec1.02 3.3Β±0.02ms? ?/sec physical_join_distinct1.00 1148.5Β±15.20Β΅s? ?/sec1.00 1150.0Β±10.96Β΅s? ?/sec physical_many_self_joins 1.01 17.5Β±0.12ms? ?/sec1.00 17.3Β±0.18ms? ?/sec physical_plan_clickbench_all 1.00285.4Β±4.98ms? ?/sec1.14324.2Β±6.52ms? ?/sec physical_plan_clickbench_q1 1.00 4.6Β±0.12ms? ?/sec1.10 5.0Β±0.13ms? ?/sec physical_plan_clickbench_q10 1.00 5.7Β±0.14ms? ?/sec1.16 6.5Β±0.42ms? ?/sec physical_plan_clickbench_q11 1.00 5.8Β±0.22ms? ?/sec1.13 6.6Β±0.21ms? ?/sec physical_plan_clickbench_q12 1.00 6.0Β±0.17ms? ?/sec1.12 6.7Β±0.24ms? ?/sec physical_plan_clickbench_q13 1.00 5.5Β±0.11ms? ?/sec1.12 6.1Β±0.17ms? ?/sec physical_plan_clickbench_q14 1.00 5.8Β±0.17ms? ?/sec1.12 6.5Β±0.15ms? ?/sec physical_plan_clickbench_q15 1.00 5.6Β±0.14ms? ?/sec1.13 6.3Β±0.18ms? ?/sec physical_plan_clickbench_q16 1.00 5.1Β±0.18ms? ?/sec1.09 5.5Β±0.19ms? ?/sec physical_plan_clickbench_q17 1.00 5.0Β±0.13ms? ?/sec1.14 5.8Β±0.15ms? ?/sec physical_plan_clickbench_q18 1.00 4.9Β±0.13ms? ?/sec1.12 5.4Β±0.15ms? ?/sec physical_plan_clickbench_q19 1.00 5.7Β±0.16ms? ?/sec1.12 6.4Β±0.14ms? ?/sec physical_plan_clickbench_q2 1.00 4.9Β±0.13ms? ?/sec1.10 5.4Β±0.12ms? ?/sec physical_plan_clickbench_q20 1.00 4.5Β±0.10ms? ?/sec1.11 5.0Β±0.14ms? ?/sec physical_plan_clickbench_q21 1.00 4.9Β±0.15ms? ?/sec1.13 5.5Β±0.20ms? ?/sec physical_plan_clickbench_q22 1.00 5.7Β±0.17ms? ?/sec1.15 6.6Β±0.17ms? ?/sec physical_plan_clickbench_q23 1.00 6.2Β±0.16ms? ?/sec1.15 7.1Β±0.17ms? ?/sec physical_plan_clickbench_q24 1.00 6.9Β±0.18ms? ?/sec1.12 7.8Β±0.18ms? ?/sec physical_plan_clickbench_q25 1.00 5.1Β±0.13ms? ?/sec1.14 5.8Β±0.12ms? ?/sec physical_plan_clickbench_q26 1.00 4.8Β±0.13ms? ?/sec1.12 5.4Β±0.13ms? ?/sec physical_plan_clickbench_q27 1.00 5.2Β±0.16ms? ?/sec1.11 5.8Β±0.19ms? ?/sec physical_plan_clickbench_q28 1.00 5.9Β±0.18ms? ?/sec1.14 6.7Β±0.20ms? ?/sec physical_plan_clickbench_q29 1.00 6.9Β±0.15ms? ?/sec1.17 8.1Β±0.27ms? ?/sec physical_plan_clickbench_q3 1.00 4.9Β±0.11ms? ?/sec1.10 5.3Β±0.12ms? ?/sec physical_plan_clickbench_q30 1.00 19.5Β±0.32ms? ?/sec1.15 22.3Β±0.25ms? ?/sec physical_plan_clickbench_q31 1.00 6.1Β±0.17ms? ?/sec1.15 7.0Β±0.20ms? ?/sec physical_plan_clickbench_q32 1.00 6.0Β±0.14ms? ?/sec1.15 7.0Β±0.18ms? ?/sec physical_plan_clickbench_q33 1.00 5.6Β±0.17ms? ?/sec1.13 6.3Β±0.17ms? ?/sec physical_plan
Re: [PR] Reduce size of `Expr` struct [datafusion]
alamb commented on PR #14366: URL: https://github.com/apache/datafusion/pull/14366#issuecomment-2887495892 π€ `./gh_compare_branch_bench.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch_bench.sh) Running Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubuntu SMP Wed Apr 2 16:34:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Comparing alamb/make_expr_smaller (279f074291ddb0ea01344a012c710aecef044e18) to cb5d42e13531f250f702c02fb4d8f5d710e863f5 [diff](https://github.com/apache/datafusion/compare/cb5d42e13531f250f702c02fb4d8f5d710e863f5..279f074291ddb0ea01344a012c710aecef044e18) BENCH_NAME=sql_planner BENCH_COMMAND=cargo bench --bench sql_planner BENCH_FILTER= BENCH_BRANCH_NAME=alamb_make_expr_smaller Results will be posted here when complete -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
github-actions[bot] closed pull request #14366: Reduce size of `Expr` struct URL: https://github.com/apache/datafusion/pull/14366 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
github-actions[bot] commented on PR #14366: URL: https://github.com/apache/datafusion/pull/14366#issuecomment-2819872416 Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
alamb commented on PR #14366: URL: https://github.com/apache/datafusion/pull/14366#issuecomment-2668783186 I'll try and find time to run some sql planning benchmarks -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
crepererum commented on PR #14366: URL: https://github.com/apache/datafusion/pull/14366#issuecomment-2662587013 TBH I would be surprised if the build performance is affected by a few `memcpy` calls. I think the question here is rather if our runtime performance changes, potentially due to less stress on the memory allocator. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
alamb commented on PR #14366: URL: https://github.com/apache/datafusion/pull/14366#issuecomment-2652408678 I ran some build benchmarks on a GPC machine and I conclude this change does not improve the build timings ``` Building main + rm -rf target + cargo build --release --timings --lib --quiet real4m28.125s user36m43.568s sys 1m19.993s + rm -rf target + cargo build --release --timings --lib --quiet real4m32.743s user36m47.328s sys 1m19.786s + rm -rf target + cargo build --release --timings --lib --quiet real4m33.336s user36m50.926s sys 1m19.246s + git reset --hard HEAD is now at cb5d42e135 Disable extended tests (#14604) + git checkout alamb/make_expr_smaller Switched to branch 'alamb/make_expr_smaller' Your branch is up to date with 'alamb/alamb/make_expr_smaller'. + echo 'Building make_expr_smaller' Building make_expr_smaller + rm -rf target + cargo build --release --timings --lib --quiet real4m30.234s user37m15.027s sys 1m19.688s + rm -rf target + cargo build --release --timings --lib --quiet real4m30.580s user37m6.178s sys 1m19.885s + rm -rf target + cargo build --release --timings --lib --quiet real4m32.293s user36m54.063s sys 1m19.557s alamb@aal-dev:~/arrow-datafusion$ ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
felipecrv commented on code in PR #14366:
URL: https://github.com/apache/datafusion/pull/14366#discussion_r1935630021
##
datafusion/expr/src/expr.rs:
##
@@ -297,7 +298,7 @@ pub enum Expr {
/// [`ExprFunctionExt`]: crate::expr_fn::ExprFunctionExt
AggregateFunction(AggregateFunction),
/// Represents the call of a window function with arguments.
-WindowFunction(WindowFunction),
+WindowFunction(Box), // Boxed as it is large (272 bytes)
Review Comment:
```suggestion
WindowFunction(Box), // Boxed as it is large (> 272
bytes)
```
##
datafusion/expr/src/expr.rs:
##
@@ -297,7 +298,7 @@ pub enum Expr {
/// [`ExprFunctionExt`]: crate::expr_fn::ExprFunctionExt
AggregateFunction(AggregateFunction),
/// Represents the call of a window function with arguments.
-WindowFunction(WindowFunction),
+WindowFunction(Box), // Boxed as it is large (272 bytes)
Review Comment:
```suggestion
WindowFunction(Box), // Boxed as it is large (>= 272
bytes)
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
crepererum commented on code in PR #14366:
URL: https://github.com/apache/datafusion/pull/14366#discussion_r1935282817
##
datafusion/expr/src/expr.rs:
##
@@ -297,7 +298,7 @@ pub enum Expr {
/// [`ExprFunctionExt`]: crate::expr_fn::ExprFunctionExt
AggregateFunction(AggregateFunction),
/// Represents the call of a window function with arguments.
-WindowFunction(WindowFunction),
+WindowFunction(Box), // Boxed as it is large (272 bytes)
Review Comment:
the concrete byte count is likely out-of-date soon, so I would just exclude
it from the comment
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
alamb commented on code in PR #14366:
URL: https://github.com/apache/datafusion/pull/14366#discussion_r1934699879
##
datafusion/expr/src/expr.rs:
##
@@ -297,7 +298,7 @@ pub enum Expr {
/// [`ExprFunctionExt`]: crate::expr_fn::ExprFunctionExt
AggregateFunction(AggregateFunction),
/// Represents the call of a window function with arguments.
-WindowFunction(WindowFunction),
+WindowFunction(Box), // Boxed as it is large (272 bytes)
Review Comment:
This is the API change in this PR: it moves a "large" structure (272 bytes)
on to the heap
the rest of the changes follow from this one change
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
findepi commented on code in PR #14366:
URL: https://github.com/apache/datafusion/pull/14366#discussion_r1934669162
##
datafusion/expr/src/expr.rs:
##
@@ -3067,4 +3078,19 @@ mod test {
rename: opt_rename,
}
}
+
+#[test]
+fn test_size_of_expr() {
+// because Expr is such a widely used struct in DataFusion
+// it is important to keep its size as small as possible
+//
+// If this test fails when you change `Expr`, please try
+// `Box`ing the fields to make `Expr` smaller
+// See https://github.com/apache/datafusion/issues/14256 for details
+assert_eq!(size_of::(), 112);
+assert_eq!(size_of::(), 64);
+assert_eq!(size_of::(), 24); // 3 ptrs
Review Comment:
does this test pass in debug and release modes?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Reduce size of `Expr` struct [datafusion]
findepi commented on code in PR #14366:
URL: https://github.com/apache/datafusion/pull/14366#discussion_r1934670024
##
datafusion/expr/src/logical_plan/plan.rs:
##
@@ -2420,19 +2420,24 @@ impl Window {
.iter()
.enumerate()
.filter_map(|(idx, expr)| {
-if let Expr::WindowFunction(WindowFunction {
+let Expr::WindowFunction(window_func) = expr else {
+return None;
+};
+let WindowFunction {
fun: WindowFunctionDefinition::WindowUDF(udwf),
partition_by,
..
-}) = expr
-{
-// When there is no PARTITION BY, row number will be unique
-// across the entire table.
-if udwf.name() == "row_number" && partition_by.is_empty() {
-return Some(idx + input_len);
-}
+} = window_func.as_ref()
+else {
+return None;
+};
+// When there is no PARTITION BY, row number will be unique
+// across the entire table.
+if udwf.name() == "row_number" && partition_by.is_empty() {
+return Some(idx + input_len);
+} else {
+None
Review Comment:
inconsistent return vs value
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
