Re: [PR] fix: extend recursive protection to prevent stack overflows in additional functions [datafusion]

2025-07-15 Thread via GitHub


ahmed-mez closed pull request #16506: fix: extend recursive protection to 
prevent stack overflows in additional functions
URL: https://github.com/apache/datafusion/pull/16506


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: extend recursive protection to prevent stack overflows in additional functions [datafusion]

2025-07-15 Thread via GitHub


ahmed-mez commented on PR #16506:
URL: https://github.com/apache/datafusion/pull/16506#issuecomment-3073443155

   Thank you @alamb !
   
   Opened https://github.com/apache/datafusion/issues/16788 & 
https://github.com/apache/datafusion/pull/16787 and will close this PR. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: extend recursive protection to prevent stack overflows in additional functions [datafusion]

2025-07-01 Thread via GitHub


alamb commented on PR #16506:
URL: https://github.com/apache/datafusion/pull/16506#issuecomment-3025182224

   Thank you for working on this


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: extend recursive protection to prevent stack overflows in additional functions [datafusion]

2025-07-01 Thread via GitHub


alamb commented on PR #16506:
URL: https://github.com/apache/datafusion/pull/16506#issuecomment-3025182092

   > Perhaps I can close this PR and open an issue so the community can 
participate in the effort? 
   
   I think that sounds like a great idea to me
   
   > Do we want to commit the reproducer test case only and mark it ignored 
(`#[ignore]`) ?
   
   That would also be a good idea in my opinion, with a link to the ticket / 
issue that was filed
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: extend recursive protection to prevent stack overflows in additional functions [datafusion]

2025-06-30 Thread via GitHub


ahmed-mez commented on PR #16506:
URL: https://github.com/apache/datafusion/pull/16506#issuecomment-3019500939

   Apologies for the delay.
   
   I added a commit with a [reproducer test 
case](https://github.com/apache/datafusion/pull/16506/commits/1a9584b64349acfea70b6e82dbaabbdeaa625758)
 as requested. It passes when the `recursive_protection` feature is enabled.
   
   The reproducer committed is inspired by an original reproducer I mentioned 
in the PR description but it's a smaller and simpler plan, without any custom 
Substrait extensions.
   
   Correct me if I'm wrong @alamb: fixing these stack overflows by 
restructuring the code seems like a massive amount of work as it'd involve 
rewriting the 30+ recursive functions that the current PR tried to protect with 
`recursive`. And we can't be 100% sure it'd solve all the stack overflows.
   
   Is my understanding of the scope correct? If so, what's the preferred path 
forward?
   
   Perhaps I can close this PR and open an issue so the community can 
participate in the effort? Do we want to commit the reproducer test case only 
and mark it ignored (`#[ignore]`) ?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: extend recursive protection to prevent stack overflows in additional functions [datafusion]

2025-06-25 Thread via GitHub


ahmed-mez commented on PR #16506:
URL: https://github.com/apache/datafusion/pull/16506#issuecomment-317936

   Thank you @alamb for reviewing and for running the benchmark.
   
   I will work on adding a test as a first step, that would also help us 
iterate and validate any changes later on.
   
   In general, performance vs reliability trade-offs depend heavily on the 
specific context and use cases. In my opinion, giving users the ability to make 
that decision themselves is a strong and flexible approach.
   
   In this case, the reliability concern isn’t minor. Avoiding stack overflows 
is critical, as they can bring down the entire process. Personally, I would 
gladly accept even more than a 1–2% slowdown in planning time to gain 
protection against such failures. I’d really appreciate it if DataFusion 
provided a "knob" that allowed users to express this kind of preference.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: extend recursive protection to prevent stack overflows in additional functions [datafusion]

2025-06-24 Thread via GitHub


alamb commented on PR #16506:
URL: https://github.com/apache/datafusion/pull/16506#issuecomment-3001108899

   > I’d really appreciate it if DataFusion provided a "knob" that allowed 
users to express this kind of preference.
   
   Yeah this makes sense to me to
   
   I was thinking sometimes we don't have to choose between "panic" and slower 
planning -- by restructuring the code sometimes we can do much better


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: extend recursive protection to prevent stack overflows in additional functions [datafusion]

2025-06-24 Thread via GitHub


alamb commented on PR #16506:
URL: https://github.com/apache/datafusion/pull/16506#issuecomment-2999628483

   🤔  the benchmarks seem to imply this has some non trivial overhead (slows 
down planning by 1-2%). 
   
   I wonder if there is any way to restructure the affected codepaths to avoid 
the recursion (change them into an iterative version, for example, or `Box` 
more structures to reduce the stack size?)
   
   Also, while I appreciate you can't provide the exact reproducer, I think it 
is important that we have some tests for this change, otherwise it is likely 
that over time as the code is refactored we'll either lose the annotations or 
the paths will be different and not annotated


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] fix: extend recursive protection to prevent stack overflows in additional functions [datafusion]

2025-06-23 Thread via GitHub


alamb commented on PR #16506:
URL: https://github.com/apache/datafusion/pull/16506#issuecomment-2997163119

   🤖: Benchmark completed
   
   Details
   
   
   
   ```
   group ahmed_recursion-protection 
main
   - -- 

   logical_aggregate_with_join   1.01632.1±3.78µs? 
?/sec1.00627.7±2.76µs? ?/sec
   logical_select_all_from_1000  1.02 11.2±0.03ms? 
?/sec1.00 11.1±0.03ms? ?/sec
   logical_select_one_from_700   1.00414.0±1.70µs? 
?/sec1.01417.4±7.10µs? ?/sec
   logical_trivial_join_high_numbered_columns1.00374.8±1.41µs? 
?/sec1.00373.5±2.43µs? ?/sec
   logical_trivial_join_low_numbered_columns 1.01362.1±2.71µs? 
?/sec1.00358.7±1.67µs? ?/sec
   physical_intersection 1.00837.1±3.59µs? 
?/sec1.00835.1±6.04µs? ?/sec
   physical_join_consider_sort   1.01   1390.9±8.24µs? 
?/sec1.00   1383.5±3.64µs? ?/sec
   physical_join_distinct1.00352.1±2.40µs? 
?/sec1.00350.7±1.16µs? ?/sec
   physical_many_self_joins  1.00 10.3±0.05ms? 
?/sec1.00 10.3±0.05ms? ?/sec
   physical_plan_clickbench_all  1.01187.7±0.97ms? 
?/sec1.00186.6±1.20ms? ?/sec
   physical_plan_clickbench_q1   1.02  2.5±0.02ms? 
?/sec1.00  2.5±0.02ms? ?/sec
   physical_plan_clickbench_q10  1.01  3.3±0.02ms? 
?/sec1.00  3.3±0.03ms? ?/sec
   physical_plan_clickbench_q11  1.01  3.5±0.03ms? 
?/sec1.00  3.5±0.04ms? ?/sec
   physical_plan_clickbench_q12  1.01  3.7±0.02ms? 
?/sec1.00  3.6±0.04ms? ?/sec
   physical_plan_clickbench_q13  1.02  3.3±0.04ms? 
?/sec1.00  3.3±0.03ms? ?/sec
   physical_plan_clickbench_q14  1.01  3.5±0.02ms? 
?/sec1.00  3.5±0.02ms? ?/sec
   physical_plan_clickbench_q15  1.01  3.4±0.03ms? 
?/sec1.00  3.4±0.03ms? ?/sec
   physical_plan_clickbench_q16  1.01  3.3±0.02ms? 
?/sec1.00  3.2±0.03ms? ?/sec
   physical_plan_clickbench_q17  1.01  3.4±0.03ms? 
?/sec1.00  3.3±0.03ms? ?/sec
   physical_plan_clickbench_q18  1.01  2.9±0.02ms? 
?/sec1.00  2.9±0.02ms? ?/sec
   physical_plan_clickbench_q19  1.02  3.8±0.05ms? 
?/sec1.00  3.8±0.04ms? ?/sec
   physical_plan_clickbench_q2   1.02  2.9±0.02ms? 
?/sec1.00  2.9±0.02ms? ?/sec
   physical_plan_clickbench_q20  1.01  2.6±0.02ms? 
?/sec1.00  2.6±0.02ms? ?/sec
   physical_plan_clickbench_q21  1.01  2.9±0.02ms? 
?/sec1.00  2.9±0.03ms? ?/sec
   physical_plan_clickbench_q22  1.00  3.5±0.02ms? 
?/sec1.00  3.5±0.04ms? ?/sec
   physical_plan_clickbench_q23  1.01  3.8±0.02ms? 
?/sec1.00  3.7±0.02ms? ?/sec
   physical_plan_clickbench_q24  1.00  4.3±0.03ms? 
?/sec1.00  4.3±0.02ms? ?/sec
   physical_plan_clickbench_q25  1.01  3.0±0.03ms? 
?/sec1.00  3.0±0.02ms? ?/sec
   physical_plan_clickbench_q26  1.00  2.9±0.02ms? 
?/sec1.00  2.9±0.06ms? ?/sec
   physical_plan_clickbench_q27  1.01  3.1±0.04ms? 
?/sec1.00  3.0±0.03ms? ?/sec
   physical_plan_clickbench_q28  1.00  3.8±0.03ms? 
?/sec1.00  3.8±0.03ms? ?/sec
   physical_plan_clickbench_q29  1.00  4.4±0.03ms? 
?/sec1.00  4.4±0.03ms? ?/sec
   physical_plan_clickbench_q3   1.01  2.9±0.02ms? 
?/sec1.00  2.8±0.02ms? ?/sec
   physical_plan_clickbench_q30  1.00 12.6±0.09ms? 
?/sec1.00 12.7±0.22ms? ?/sec
   physical_plan_clickbench_q31  1.01  3.8±0.03ms? 
?/sec1.00  3.8±0.03ms? ?/sec
   physical_plan_clickbench_q32  1.01  3.8±0.04ms? 
?/sec1.00  3.8±0.03ms? ?/sec
   physical_plan_clickbench_q33  1.01  3.3±0.03ms? 
?/sec1.00  3.3±0.02ms? ?/sec
   physical_plan

Re: [PR] fix: extend recursive protection to prevent stack overflows in additional functions [datafusion]

2025-06-23 Thread via GitHub


alamb commented on PR #16506:
URL: https://github.com/apache/datafusion/pull/16506#issuecomment-2996991443

   🤖 `./gh_compare_branch_bench.sh` [Benchmark 
Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch_bench.sh)
 Running
   Linux aal-dev 6.11.0-1015-gcp #15~24.04.1-Ubuntu SMP Thu Apr 24 20:41:05 UTC 
2025 x86_64 x86_64 x86_64 GNU/Linux
   Comparing ahmed/recursion-protection 
(fe61624c7e1f1c34848a059962dae02a62e939d5) to 
aa1e6dac7c5dfdfb2c0f52282638886bff194a5d 
[diff](https://github.com/apache/datafusion/compare/aa1e6dac7c5dfdfb2c0f52282638886bff194a5d..fe61624c7e1f1c34848a059962dae02a62e939d5)
   BENCH_NAME=sql_planner
   BENCH_COMMAND=cargo bench --bench sql_planner
   BENCH_FILTER=
   BENCH_BRANCH_NAME=ahmed_recursion-protection
   Results will be posted here when complete
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]