alamb commented on issue #16106:
URL: https://github.com/apache/datafusion/issues/16106#issuecomment-2906792842

   Thanks @aditanase  -- in general I would classify this as a request for a 
more sophisticated join reordering algorithm. I am pretty skeptical that we 
will find an algorithm that works well in all cases and thus belongs in 
DataFusion's core.
   
   The theory is that people can use DataFusion's extension APIs to get 
whatever join orders they want.
   
   > adding some sort of join hints in the SQL planner [like we have in 
spark](https://downloads.apache.org/spark/docs/3.0.0/sql-ref-syntax-qry-select-hints.html#join-hints)
   
   I have two potential suggestions:
   
   # Idea 1: Semantic Optimizer
   
   One thing you may be able to do is rely on the fact that DataFusion doesn't 
typically reorder joins (normally it plans the joins in the order they are 
listed syntactically in the query). This is the ultimate form of join hinting.
   
   I expect DataFusion to plan this with `a` as the left input and `b` as the 
right input
   ```sql
   SELECT .. a JOIN b ...
   ```
   
   Likewise, I expect DataFusion to plan this with `b` as the left input and 
`a` as the right input
   ```sql
   SELECT .. b JOIN a ...
   ```
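
   To make "left input" and "right input" concrete, here is a toy sketch of planning joins strictly in the order the tables are written. It is only a model of the behavior described above, not DataFusion's planner: `Plan`, `plan_in_syntactic_order`, and `render` are hypothetical names invented for this illustration.

```rust
// Toy model only: these types are hypothetical and are not DataFusion's API.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(String),
    Join { left: Box<Plan>, right: Box<Plan> },
}

/// Build a left-deep join tree that follows the order the tables were written in.
/// Returns `None` when no tables are given.
fn plan_in_syntactic_order(tables: &[&str]) -> Option<Plan> {
    let mut iter = tables.iter();
    let first = Plan::Scan(iter.next()?.to_string());
    Some(iter.fold(first, |left, name| Plan::Join {
        left: Box::new(left),
        right: Box::new(Plan::Scan(name.to_string())),
    }))
}

/// Render the tree so the left/right roles are easy to read.
fn render(plan: &Plan) -> String {
    match plan {
        Plan::Scan(name) => name.clone(),
        Plan::Join { left, right } => format!("({} JOIN {})", render(left), render(right)),
    }
}

fn main() {
    // Writing the tables in a different order yields a different plan shape.
    let ab = plan_in_syntactic_order(&["a", "b"]).unwrap();
    let ba = plan_in_syntactic_order(&["b", "a"]).unwrap();
    println!("{}", render(&ab));
    println!("{}", render(&ba));
}
```

   In this model the query author controls the join order simply by choosing the order in which the tables appear.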
   
   
   If the built-in optimizer passes can't be disabled now, we should add a 
config setting to do so.
   
   # Idea 2: Custom optimizer
   
   Another thing you could do is add a custom optimizer rule that implements 
the heuristics you describe (e.g. join hints, FK/PK constraints, etc.).
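
   As a rough sketch of what such a rule might do, the toy rewrite below swaps join inputs based on a user-supplied hint. Everything here is an assumption made for illustration: `Plan`, `JoinHints`, `apply_join_hints`, and the "prefer this table on the right" semantics are hypothetical and are not DataFusion's API. A real rule would be written against DataFusion's optimizer extension points and would need to check that swapping is valid for the join type.

```rust
use std::collections::HashSet;

// Toy model only: `Plan`, `JoinHints`, and `apply_join_hints` are hypothetical
// names for illustration and are not part of DataFusion's API.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(String),
    // Assumed to be an inner join, where swapping inputs keeps the same result.
    Join { left: Box<Plan>, right: Box<Plan> },
}

/// Tables the user has asked to be placed on the right side of a join
/// (an assumed hint semantics, chosen only for this sketch).
struct JoinHints {
    prefer_right: HashSet<String>,
}

/// Rewrite the plan bottom-up, swapping join inputs when the left input is a
/// hinted table scan and the right input is not.
fn apply_join_hints(plan: Plan, hints: &JoinHints) -> Plan {
    match plan {
        Plan::Scan(_) => plan,
        Plan::Join { left, right } => {
            let left = apply_join_hints(*left, hints);
            let right = apply_join_hints(*right, hints);
            let hinted = |p: &Plan| matches!(p, Plan::Scan(t) if hints.prefer_right.contains(t));
            if hinted(&left) && !hinted(&right) {
                Plan::Join { left: Box::new(right), right: Box::new(left) }
            } else {
                Plan::Join { left: Box::new(left), right: Box::new(right) }
            }
        }
    }
}

fn main() {
    let scan = |t: &str| Box::new(Plan::Scan(t.to_string()));
    let plan = Plan::Join { left: scan("dim"), right: scan("fact") };
    let hints = JoinHints { prefer_right: HashSet::from(["dim".to_string()]) };
    // The hinted table `dim` moves from the left input to the right input.
    println!("{:?}", apply_join_hints(plan, &hints));
}
```

   The same shape (walk the plan, inspect each join, rewrite it or leave it alone) would apply to other heuristics such as FK/PK constraints, with the decision function swapped out.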
   
   
   
   I wrote about this design choice / limitation here:
   - https://www.influxdata.com/blog/optimizing-sql-dataframes-part-two/


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

