alamb commented on issue #17718: URL: https://github.com/apache/datafusion/issues/17718#issuecomment-3334847958

Here is a note from @pepijnve in Discord with a use case I thought was interesting (basically, wanting to use a custom join type):

> The ST_KNN join query implementation technique in Sedona is really interesting. I had a similar need in my project, where I want to join on the longest prefix in another table efficiently. The actual implementation is a custom LogicalPlan and an ExecutionPlan that builds a trie based on the build side and then probes that.
>
> I couldn't figure out a simple way to plug a custom join into the SQL parser, though, so we ended up punting on that and simply creating the logical plan via the DataFrame API instead. Being able to use SQL would be nice though. There's first-class support for ASOF JOIN in the SQL parser, but I didn't see a way to add a custom <strategy> JOIN. Using a marker function and an optimizer rule is a neat solution, but feels a bit brittle. The comment in the docs about UnsupportedOperationException with Spark seems to kind of confirm that.
>
> Is there any consensus in the community on how to add non-standard join strategies? Is the Sedona ST_KNN approach the way to go, or are there other ways to implement this?

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
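
For readers unfamiliar with the technique mentioned in the quoted note, here is a minimal sketch of a longest-prefix join: build a trie from the build side, then probe it once per probe-side key. This is not the implementation referred to above and it uses no DataFusion APIs. The names `PrefixTrie`, `longest_prefix`, and `longest_prefix_join` are hypothetical, and plain string slices stand in for the Arrow columns that a real `ExecutionPlan` would process.

```rust
use std::collections::HashMap;

/// One trie node; `row` is set when a build-side prefix ends at this node.
#[derive(Default)]
struct Node {
    children: HashMap<u8, Node>,
    row: Option<usize>,
}

/// Hypothetical build-side index: a byte-wise trie over the prefix column.
#[derive(Default)]
struct PrefixTrie {
    root: Node,
}

impl PrefixTrie {
    /// Build phase: insert every build-side prefix together with its row index.
    fn build<'a>(prefixes: impl IntoIterator<Item = &'a str>) -> Self {
        let mut trie = PrefixTrie::default();
        for (row, prefix) in prefixes.into_iter().enumerate() {
            let mut node = &mut trie.root;
            for b in prefix.bytes() {
                node = node.children.entry(b).or_default();
            }
            // On duplicate prefixes keep the first row (an arbitrary choice for this sketch).
            node.row.get_or_insert(row);
        }
        trie
    }

    /// Probe phase: the build-side row whose prefix is the longest prefix of `key`.
    fn longest_prefix(&self, key: &str) -> Option<usize> {
        let mut node = &self.root;
        let mut best = node.row;
        for b in key.bytes() {
            match node.children.get(&b) {
                Some(next) => {
                    node = next;
                    if node.row.is_some() {
                        best = node.row;
                    }
                }
                None => break,
            }
        }
        best
    }
}

/// Inner-join style driver: returns (probe row, build row) pairs.
/// Probe rows with no matching prefix are dropped.
fn longest_prefix_join(build: &[&str], probe: &[&str]) -> Vec<(usize, usize)> {
    let trie = PrefixTrie::build(build.iter().copied());
    probe
        .iter()
        .enumerate()
        .filter_map(|(p, key)| trie.longest_prefix(key).map(|b| (p, b)))
        .collect()
}
```

With build-side prefixes `["ab", "a", "abc", "x"]`, the probe key `"abcd"` pairs with `"abc"` and `"a1"` pairs with `"a"`, while `"zzz"` matches nothing and is dropped. Each probe costs time proportional to the key length, independent of the number of build-side rows, which is presumably the appeal over a generic nested-loop join with a `starts_with` predicate.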
