Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-04-05 Thread via GitHub


alamb commented on PR #15201:
URL: https://github.com/apache/datafusion/pull/15201#issuecomment-2740798896

   > I will port the tests on our next iteration, but would appreciate a 
recommendation for a good suite to add them to as I rewrite them - they are 
currently only validating the `InlineTableScan` analyser, so would need to be 
ported to a suite that has access to `DataFrameTableProvider` and can do 
logical planning.
   
   
   Thanks!
   
   I think unit tests on InlineTableScan is likely not as helpful as something 
like building multiple nested levels of `Views` with DataFrame (or SQL or 
however you are creating your plans) 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-20 Thread via GitHub


alamb commented on PR #15201:
URL: https://github.com/apache/datafusion/pull/15201#issuecomment-2738300722

   @aditanase  perhaps you can make a PR with whatever tests were breaking in 
your fork so we can make sure they work here


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-19 Thread via GitHub


aditanase commented on PR #15201:
URL: https://github.com/apache/datafusion/pull/15201#issuecomment-2739305725

   The fork was not compiling, we were on 46.0.0 but the latest changes made my 
rebase more difficult than usual, we'll figure it out.
   
   I will port the tests on our next iteration, but would appreciate a 
recommendation for a good suite to add them to as I rewrite them - they are 
currently only validating the `InlineTableScan` analyser, so would need to be 
ported to a suite that has access to `DataFrameTableProvider` and can do 
logical planning.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-19 Thread via GitHub


jayzhan211 commented on PR #15201:
URL: https://github.com/apache/datafusion/pull/15201#issuecomment-2736677639

   @aditanase 
   
   It works too
   
   ```
   statement count 0
   create table t(a int) as values (1), (2), (3);
   
   statement count 0
   create view v as select a, count(a) from t group by a;
   
   query II rowsort
   select * from v;
   
   1 1
   2 1
   3 1
   
   statement count 0
   create view v2 as select "count(t.a)", a from v;
   
   query II
   select a, "count(t.a)" from v2;
   
   1 1
   3 1
   2 1
   
   statement ok
   create view v3 as select a, "count(t.a)" from v2;
   
   query II
   select * from v2;
   
   1 1
   1 2
   1 3
   
   query II
   select * from v3;
   
   3 1
   2 1
   1 1
   
   query TT
   explain select * from v3;
   
   logical_plan
   01)SubqueryAlias: v3
   02)--Projection: v2.a, v2.count(t.a)
   03)SubqueryAlias: v2
   04)--Projection: v.count(t.a), v.a
   05)SubqueryAlias: v
   06)--Aggregate: groupBy=[[t.a]], aggr=[[count(t.a)]]
   07)TableScan: t projection=[a]
   physical_plan
   01)AggregateExec: mode=FinalPartitioned, gby=[a@0 as a], aggr=[count(t.a)]
   02)--CoalesceBatchesExec: target_batch_size=8192
   03)RepartitionExec: partitioning=Hash([a@0], 4), input_partitions=4
   04)--RepartitionExec: partitioning=RoundRobinBatch(4), input_partitions=1
   05)AggregateExec: mode=Partial, gby=[a@0 as a], aggr=[count(t.a)]
   06)--DataSourceExec: partitions=1, partition_sizes=[1]
   
   query II rowsort
   select "count(t.a)", a from v;
   
   1 1
   1 2
   1 3
   
   statement count 0
   drop view v;
   
   statement count 0
   drop view v2;
   
   statement ok
   drop view v3;
   
   statement count 0
   drop table t;
   
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-19 Thread via GitHub


aditanase commented on PR #15201:
URL: https://github.com/apache/datafusion/pull/15201#issuecomment-273664

   @jayzhan211 can you try with a 3rd view? 2 levels were also working on the 
original inlining code.
   Here you only need to inline v inside v2 and you get the final plan.
   if you had v3 -> v2 -> v -> t, then if you just inline v2, you will get a 
logical plan that will have v as a leaf node instead of t.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-19 Thread via GitHub


jayzhan211 commented on PR #15201:
URL: https://github.com/apache/datafusion/pull/15201#issuecomment-2736240081

   ```
   statement count 0
   create table t(a int) as values (1), (2), (3);
   
   statement count 0
   create view v as select a, count(a) from t group by a;
   
   query II rowsort
   select * from v;
   
   1 1
   2 1
   3 1
   
   statement count 0
   create view v2 as select "count(t.a)", a from v;
   
   query II
   select * from v2;
   
   1 3
   1 1
   1 2
   
   query TT
   explain select * from v2;
   
   logical_plan
   01)SubqueryAlias: v2
   02)--Projection: v.count(t.a), v.a
   03)SubqueryAlias: v
   04)--Aggregate: groupBy=[[t.a]], aggr=[[count(t.a)]]
   05)TableScan: t projection=[a]
   physical_plan
   01)ProjectionExec: expr=[count(t.a)@1 as count(t.a), a@0 as a]
   02)--AggregateExec: mode=FinalPartitioned, gby=[a@0 as a], aggr=[count(t.a)]
   03)CoalesceBatchesExec: target_batch_size=8192
   04)--RepartitionExec: partitioning=Hash([a@0], 4), input_partitions=4
   05)RepartitionExec: partitioning=RoundRobinBatch(4), 
input_partitions=1
   06)--AggregateExec: mode=Partial, gby=[a@0 as a], aggr=[count(t.a)]
   07)DataSourceExec: partitions=1, partition_sizes=[1]
   
   query II rowsort
   select "count(t.a)", a from v;
   
   1 1
   1 2
   1 3
   
   statement count 0
   drop view v;
   
   statement count 0
   drop view v2;
   
   statement count 0
   drop table t;
   
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-19 Thread via GitHub


aditanase commented on PR #15201:
URL: https://github.com/apache/datafusion/pull/15201#issuecomment-2736011205

   Hi @jayzhan211 @alamb, I'm a bit concerned with this patch, I was just in 
the process of submitting an enhancement to `InlineTableScan` that was dealing 
with multiple levels of views registered with `into_view`.
   
   
https://github.com/hstack/arrow-datafusion/commit/0f79c370373cae04c541c91862cb9bf0ab2c737f
   
   The proposal on the original implementation was that we should be 
transforming down instead of up, to allow the child plans to be expanded all 
the way down.
   The downside with not doing inlining all the way to the original scan is 
that the various optimizer stages don't have full visibility and will run in 
separate stages. For example from the 2nd level of nesting, logical planning 
will only be triggered in the physical planning of the 2nd 
`DataFrameTableProvider`.
   
   What are your thoughts around this? How are you dealing with a query plan 
that is built incrementally from 2-3 levels of using `into_view`? (resulting in 
chained `DataFrameTableProvider` sources along the way).
   
   I'm not seeing any tests handling this use case yet, and my first attempt of 
cherry-picking your commit breaks in other places for our fork.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-18 Thread via GitHub


jayzhan211 merged PR #15201:
URL: https://github.com/apache/datafusion/pull/15201


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-18 Thread via GitHub


jayzhan211 commented on PR #15201:
URL: https://github.com/apache/datafusion/pull/15201#issuecomment-2734924833

   Thanks @alamb 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-18 Thread via GitHub


jayzhan211 commented on code in PR #15201:
URL: https://github.com/apache/datafusion/pull/15201#discussion_r2000904176


##
datafusion/core/tests/dataframe/mod.rs:
##
@@ -1571,14 +1571,18 @@ async fn with_column_join_same_columns() -> Result<()> {
 
 assert_snapshot!(
 df_with_column.logical_plan(),
-@r###"
+@r"
 Projection: t1.c1, t2.c1, Boolean(true) AS new_column
   Limit: skip=0, fetch=1
 Sort: t1.c1 ASC NULLS FIRST
   Inner Join: t1.c1 = t2.c1
-TableScan: t1
-TableScan: t2
-"###
+SubqueryAlias: t1

Review Comment:
   I explain it in the above 
https://github.com/apache/datafusion/pull/15201#discussion_r1996487546



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-15 Thread via GitHub


alamb commented on code in PR #15201:
URL: https://github.com/apache/datafusion/pull/15201#discussion_r1995496219


##
datafusion/sqllogictest/test_files/ddl.slt:
##
@@ -855,3 +855,29 @@ DROP TABLE t1;
 
 statement ok
 DROP TABLE t2;
+
+statement count 0
+create table t(a int) as values (1), (2), (3);
+
+statement count 0
+create view v as select a, count(a) from t group by a;
+
+query II rowsort
+select * from v;
+
+1 1
+2 1
+3 1
+
+query II rowsort
+select "count(t.a)", a from v;
+
+1 1
+1 2
+1 3
+
+query error DataFusion error: Execution error: Table 'v' doesn't exist\.

Review Comment:
   I think it should be `drop view v` rather than `drop table`



##
datafusion/expr/src/logical_plan/builder.rs:
##
@@ -465,9 +465,21 @@ impl LogicalPlanBuilder {
 projection: Option>,
 filters: Vec,
 ) -> Result {
-TableScan::try_new(table_name, table_source, projection, filters, None)
-.map(LogicalPlan::TableScan)
-.map(Self::new)
+let table_scan =
+TableScan::try_new(table_name, table_source, projection, filters, 
None)?;
+
+// Inline TableScan
+if table_scan.filters.is_empty() {
+if let Some(p) = table_scan.source.get_logical_plan() {
+let sub_plan = p.into_owned();
+// Ensures that the reference to the inlined table remains the
+// same, meaning we don't have to change any of the parent 
nodes
+// that reference this table.
+return Self::new(sub_plan).alias(table_scan.table_name);
+}
+}
+
+Ok(Self::new(LogicalPlan::TableScan(table_scan)))

Review Comment:
   seems like the same logic should apply to scan_with_filters_fetch below



##
datafusion/core/tests/dataframe/mod.rs:
##
@@ -1563,8 +1563,12 @@ async fn with_column_join_same_columns() -> Result<()> {
 \n  Limit: skip=0, fetch=1\
 \nSort: t1.c1 ASC NULLS FIRST\
 \n  Inner Join: t1.c1 = t2.c1\
-\nTableScan: t1\
-\nTableScan: t2",
+\nSubqueryAlias: t1\

Review Comment:
   I am still worried about this change -- I don't undersrtand why the subquery 
allias has not been removed or if that is a problem 🤔 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-15 Thread via GitHub


jayzhan211 commented on code in PR #15201:
URL: https://github.com/apache/datafusion/pull/15201#discussion_r1996487546


##
datafusion/core/tests/dataframe/mod.rs:
##
@@ -1563,8 +1563,12 @@ async fn with_column_join_same_columns() -> Result<()> {
 \n  Limit: skip=0, fetch=1\
 \nSort: t1.c1 ASC NULLS FIRST\
 \n  Inner Join: t1.c1 = t2.c1\
-\nTableScan: t1\
-\nTableScan: t2",
+\nSubqueryAlias: t1\

Review Comment:
   You can check the `into_optimized_plan` below, we have the consistent result.
   
   It includes analyzer and optimizer. That is why the main branch doesn't have 
subquery before optimization, and we have it after `inline_tablescan` is called.
   In this change, since inline table scan is called inside plan building, so 
we got this subquery before optimization



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-15 Thread via GitHub


alamb commented on code in PR #15201:
URL: https://github.com/apache/datafusion/pull/15201#discussion_r1996808102


##
datafusion/core/tests/dataframe/mod.rs:
##
@@ -1571,14 +1571,18 @@ async fn with_column_join_same_columns() -> Result<()> {
 
 assert_snapshot!(
 df_with_column.logical_plan(),
-@r###"
+@r"
 Projection: t1.c1, t2.c1, Boolean(true) AS new_column
   Limit: skip=0, fetch=1
 Sort: t1.c1 ASC NULLS FIRST
   Inner Join: t1.c1 = t2.c1
-TableScan: t1
-TableScan: t2
-"###
+SubqueryAlias: t1

Review Comment:
   I am concerned about this change -- it seems like a regression to me. 
   
   Maybe it is ok, but I don't understand the difference/ why there are a bunch 
of projections / SubqueryAlias nows
   
   If these are fixed by physical plan time it is fine, but from here it looks 
like something is wrong



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-15 Thread via GitHub


jayzhan211 commented on code in PR #15201:
URL: https://github.com/apache/datafusion/pull/15201#discussion_r1996487546


##
datafusion/core/tests/dataframe/mod.rs:
##
@@ -1563,8 +1563,12 @@ async fn with_column_join_same_columns() -> Result<()> {
 \n  Limit: skip=0, fetch=1\
 \nSort: t1.c1 ASC NULLS FIRST\
 \n  Inner Join: t1.c1 = t2.c1\
-\nTableScan: t1\
-\nTableScan: t2",
+\nSubqueryAlias: t1\

Review Comment:
   You can check the `into_optimized_plan` below, we have the consistent result.
   
   The process includes both an analyzer and an optimizer. This is why the main 
branch does not contain a subquery before optimization, but one appears after 
inline_tablescan is called.
   
   In this change, since inline_tablescan is invoked during plan building, the 
subquery is introduced before the optimization step



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-14 Thread via GitHub


jayzhan211 commented on code in PR #15201:
URL: https://github.com/apache/datafusion/pull/15201#discussion_r1996463077


##
datafusion/core/tests/dataframe/mod.rs:
##
@@ -1563,8 +1563,12 @@ async fn with_column_join_same_columns() -> Result<()> {
 \n  Limit: skip=0, fetch=1\
 \nSort: t1.c1 ASC NULLS FIRST\
 \n  Inner Join: t1.c1 = t2.c1\
-\nTableScan: t1\
-\nTableScan: t2",
+\nSubqueryAlias: t1\

Review Comment:
   I thought the subquery alias is what shows we correctly inline tablescan, 
since adding alias is the only thing we have done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-14 Thread via GitHub


alamb commented on PR #15201:
URL: https://github.com/apache/datafusion/pull/15201#issuecomment-2725551994

   I ran planning performance benchmarks and I would say they showed no 
discernible difference with this branch
   
   Details
   
   
   ```
   group inline-table-mv
 main
   - ---
 
   logical_aggregate_with_join   1.00  1527.0±20.63µs? 
?/sec 1.00  1522.8±16.03µs? ?/sec
   logical_select_all_from_1000  1.00141.6±0.75ms? 
?/sec 1.00142.3±0.19ms? ?/sec
   logical_select_one_from_700   1.00  1201.5±27.95µs? 
?/sec 1.01  1210.6±24.70µs? ?/sec
   logical_trivial_join_high_numbered_columns1.00  1167.5±28.21µs? 
?/sec 1.02  1188.7±29.84µs? ?/sec
   logical_trivial_join_low_numbered_columns 1.00  1149.0±15.67µs? 
?/sec 1.01  1165.3±15.72µs? ?/sec
   physical_intersection 1.05  1846.6±111.33µs? 
?/sec1.00  1752.4±24.26µs? ?/sec
   physical_join_consider_sort   1.02  2.4±0.02ms? 
?/sec 1.00  2.3±0.01ms? ?/sec
   physical_join_distinct1.03  1142.4±13.14µs? 
?/sec 1.00  .5±12.97µs? ?/sec
   physical_many_self_joins  1.01 11.8±0.06ms? 
?/sec 1.00 11.7±0.08ms? ?/sec
   physical_plan_clickbench_all  1.02   255.1±10.58ms? 
?/sec 1.00251.0±1.41ms? ?/sec
   physical_plan_clickbench_q1   1.02  3.9±0.07ms? 
?/sec 1.00  3.8±0.04ms? ?/sec
   physical_plan_clickbench_q10  1.02  4.8±0.06ms? 
?/sec 1.00  4.7±0.07ms? ?/sec
   physical_plan_clickbench_q11  1.02  4.9±0.10ms? 
?/sec 1.00  4.9±0.04ms? ?/sec
   physical_plan_clickbench_q12  1.03  5.1±0.13ms? 
?/sec 1.00  5.0±0.05ms? ?/sec
   physical_plan_clickbench_q13  1.05  4.8±0.10ms? 
?/sec 1.00  4.6±0.05ms? ?/sec
   physical_plan_clickbench_q14  1.05  5.1±0.13ms? 
?/sec 1.00  4.9±0.09ms? ?/sec
   physical_plan_clickbench_q15  1.05  5.0±0.09ms? 
?/sec 1.00  4.7±0.05ms? ?/sec
   physical_plan_clickbench_q16  1.01  4.6±0.08ms? 
?/sec 1.00  4.6±0.06ms? ?/sec
   physical_plan_clickbench_q17  1.02  4.8±0.11ms? 
?/sec 1.00  4.7±0.04ms? ?/sec
   physical_plan_clickbench_q18  1.01  4.2±0.07ms? 
?/sec 1.00  4.2±0.05ms? ?/sec
   physical_plan_clickbench_q19  1.01  5.3±0.10ms? 
?/sec 1.00  5.2±0.06ms? ?/sec
   physical_plan_clickbench_q2   1.02  4.1±0.10ms? 
?/sec 1.00  4.1±0.07ms? ?/sec
   physical_plan_clickbench_q20  1.01  3.8±0.07ms? 
?/sec 1.00  3.8±0.07ms? ?/sec
   physical_plan_clickbench_q21  1.05  4.3±0.20ms? 
?/sec 1.00  4.1±0.04ms? ?/sec
   physical_plan_clickbench_q22  1.01  5.0±0.09ms? 
?/sec 1.00  4.9±0.07ms? ?/sec
   physical_plan_clickbench_q23  1.02  5.4±0.10ms? 
?/sec 1.00  5.3±0.03ms? ?/sec
   physical_plan_clickbench_q24  1.01  7.5±0.10ms? 
?/sec 1.00  7.5±0.05ms? ?/sec
   physical_plan_clickbench_q25  1.01  4.3±0.08ms? 
?/sec 1.00  4.3±0.05ms? ?/sec
   physical_plan_clickbench_q26  1.00  3.9±0.10ms? 
?/sec 1.01  4.0±0.06ms? ?/sec
   physical_plan_clickbench_q27  1.00  4.3±0.08ms? 
?/sec 1.01  4.3±0.03ms? ?/sec
   physical_plan_clickbench_q28  1.00  5.1±0.07ms? 
?/sec 1.02  5.2±0.09ms? ?/sec
   physical_plan_clickbench_q29  1.00  6.1±0.14ms? 
?/sec 1.01  6.2±0.05ms? ?/sec
   physical_plan_clickbench_q3   1.04  4.1±0.07ms? 
?/sec 1.00  4.0±0.04ms? ?/sec
   physical_plan_clickbench_q30  1.00 18.3±0.25ms? 
?/sec 1.01 18.4±0.10ms? ?/sec
   physical_plan_clickbench_q31  1.00  5.1±0.09ms? 
?/sec 1.05  5.3±0.31ms? ?/sec
   physical_plan_clickbench_q32  1.00  5.1±0.09ms? 
?/sec 1.03  5.2±0.09ms? ?/sec
   physical_plan_cli

Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-13 Thread via GitHub


jayzhan211 commented on code in PR #15201:
URL: https://github.com/apache/datafusion/pull/15201#discussion_r1994540603


##
datafusion/sql/src/select.rs:
##
@@ -550,7 +550,29 @@ impl SqlToRel<'_, S> {
 0 => Ok(LogicalPlanBuilder::empty(true).build()?),
 1 => {
 let input = from.remove(0);
-self.plan_table_with_joins(input, planner_context)
+let table = self.plan_table_with_joins(input, 
planner_context)?;

Review Comment:
   both dataframe and sql works well now



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Remove inline table scan analyzer rule [datafusion]

2025-03-13 Thread via GitHub


alamb commented on code in PR #15201:
URL: https://github.com/apache/datafusion/pull/15201#discussion_r1993384623


##
datafusion/sql/src/select.rs:
##
@@ -550,7 +550,29 @@ impl SqlToRel<'_, S> {
 0 => Ok(LogicalPlanBuilder::empty(true).build()?),
 1 => {
 let input = from.remove(0);
-self.plan_table_with_joins(input, planner_context)
+let table = self.plan_table_with_joins(input, 
planner_context)?;

Review Comment:
   This will have the effect of only inlining table scans for SQL
   
   Perhaps we could move the code into LogicalPlanBuilder so it also applies to 
dataframes?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]