Kimahriman opened a new pull request #515:
URL: https://github.com/apache/incubator-sedona/pull/515


   ## Is this PR related to a proposed Issue?
   https://issues.apache.org/jira/browse/SEDONA-26
   
   ## What changes were proposed in this PR?
   Adds SQL broadcast join support. The SQL join detector now detects broadcast 
hints and plans a new `BroadcastIndexJoin` (which uses a `SpatialIndex` child 
plan, mostly so the query plan displays nicely).
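
For readers unfamiliar with the technique, here is a minimal sketch of the general broadcast index join idea in plain Python rather than Spark. It is not the code in this PR. It assumes axis-aligned rectangles on the indexed side and points on the streamed side, and a uniform grid stands in for whatever spatial index the real plan builds. The names `build_grid_index` and `broadcast_index_join` are hypothetical.

```python
from collections import defaultdict
from math import floor


def build_grid_index(windows, cell=1.0):
    """Map each grid cell to the ids of the rectangles (xmin, ymin, xmax, ymax) overlapping it."""
    index = defaultdict(list)
    for wid, (xmin, ymin, xmax, ymax) in enumerate(windows):
        for cx in range(floor(xmin / cell), floor(xmax / cell) + 1):
            for cy in range(floor(ymin / cell), floor(ymax / cell) + 1):
                index[(cx, cy)].append(wid)
    return index


def broadcast_index_join(windows, points, cell=1.0):
    """Inner join returning (window_id, point_id) pairs where the window contains the point."""
    # The index is built once on the small side; in a cluster it would be
    # shipped to every executor instead of shuffling both sides.
    index = build_grid_index(windows, cell)
    result = []
    for pid, (x, y) in enumerate(points):  # the large, streamed side
        candidates = index.get((floor(x / cell), floor(y / cell)), [])
        for wid in candidates:
            xmin, ymin, xmax, ymax = windows[wid]
            # The grid only yields candidates, so refine with an exact test.
            if xmin <= x <= xmax and ymin <= y <= ymax:
                result.append((wid, pid))
    return result


windows = [(0.0, 0.0, 2.0, 2.0), (1.5, 1.5, 3.0, 3.0)]
points = [(0.5, 0.5), (1.75, 1.75), (5.0, 5.0)]
pairs = broadcast_index_join(windows, points)
```

The second point falls inside both rectangles and the third inside neither, so an inner join emits three pairs and drops the third point.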
   
   ## How was this patch tested?
   New unit test; more still need to be added.
   
   ## Did this PR include necessary documentation updates?
   Not yet, TODO
   
   I still have some work to do on this, but wanted to get initial feedback on 
the overall approach. I opted to keep things purely in SQL land instead of 
adding more parameters to `spatialJoin` or adding another core package that SQL 
calls out to. This currently works on Spark 3. It compiles on Spark 2, but the 
new test fails there because broadcast hints aren't currently detected in Spark 2.
   
   Also, I tried to better maintain the concept of left and right sides of the 
join, to make it easier to implement join types other than inner in the future. 
Rather than refer to the shapes as left and right where left.contains(right), I 
wanted to use a different naming scheme. I went with window.contains(object), 
but I'm very open to other ideas on how to separate the join sides from the 
<what>.contains(<what>) roles.
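
To illustrate why the separation matters, here is a minimal Python sketch, assuming rectangles as shapes. It is not this PR's implementation, and `ContainsJoin`, `window_is_left`, and `left_outer_join` are hypothetical names. The predicate roles (window, object) are resolved from a flag, so left and right keep their join meaning and an outer join can still tell which side to preserve.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def contains(window: Rect, obj: Rect) -> bool:
    """True when `window` fully contains `obj`."""
    return (window[0] <= obj[0] and window[1] <= obj[1]
            and window[2] >= obj[2] and window[3] >= obj[3])


@dataclass
class ContainsJoin:
    # Which join side supplies the window shape. This is independent of
    # which side is left or right in the join itself.
    window_is_left: bool

    def matches(self, left_shape: Rect, right_shape: Rect) -> bool:
        if self.window_is_left:
            return contains(left_shape, right_shape)
        return contains(right_shape, left_shape)


def left_outer_join(left_rows: List[Rect], right_rows: List[Rect],
                    cond: ContainsJoin) -> List[Tuple[Rect, Optional[Rect]]]:
    """Keep every left row, padding with None when no right row matches."""
    out: List[Tuple[Rect, Optional[Rect]]] = []
    for left in left_rows:
        hits = [right for right in right_rows if cond.matches(left, right)]
        if hits:
            out.extend((left, right) for right in hits)
        else:
            out.append((left, None))
    return out


big, small, far = (0, 0, 10, 10), (1, 1, 2, 2), (20, 20, 21, 21)
# The window is on the right side here, yet left rows are still the preserved ones.
rows = left_outer_join([small, far], [big], ContainsJoin(window_is_left=False))
```

Here `rows` keeps `far` with a `None` partner because it is a left row, even though the containing shape comes from the right side.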
   
   TODOs:
   - [ ] Either figure out how to detect broadcast hints in Spark 2, or update 
the tests to only run the broadcast test on Spark 3
   - [ ] Add support for distance joins
   - [ ] Add this to the documentation somewhere
   - [ ] Probably add more unit tests
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

