To add some rationale: I asked for this thread to be raised on the dev mailing list to find out why full outer join was not added at the time left/right outer join was added.
This is historical knowledge, so I don't know the answer myself. Most likely only a limited number of folks can answer, and I hope we can get some historical information. Note that I don't object to the change; I just want to make sure we don't miss something.

On Sat, Nov 21, 2020 at 4:15 AM real-cheng-su <chen...@fb.com.invalid> wrote:

> Hi,
>
> Stream-stream join in Spark Structured Streaming currently supports the
> INNER, LEFT OUTER, RIGHT OUTER and LEFT SEMI join types. It does not
> support FULL OUTER join, and we are working on adding it in
> https://github.com/apache/spark/pull/30395 .
>
> Given that LEFT OUTER and RIGHT OUTER stream-stream join are supported,
> the code needed for FULL OUTER join is actually quite straightforward:
>
> * For a left side input row, check if there's a match in the right side
>   state store. If there's a match, output the joined row; otherwise
>   output nothing. Put the row in the left side state store.
> * For a right side input row, check if there's a match in the left side
>   state store. If there's a match, output the joined row; otherwise
>   output nothing. Put the row in the right side state store.
> * State store eviction: evict rows from the left/right side state store
>   below the watermark, and output rows never matched before (a
>   combination of left outer and right outer join).
>
> FULL OUTER join consumes the same amount of space in the state store as
> INNER/LEFT OUTER/RIGHT OUTER join, and it is pretty easy to add. I don’t
> see any issue from a system perspective that would prevent adding FULL
> OUTER join.
>
> I am wondering whether there is any major blocker to adding FULL OUTER
> stream-stream join? Asking on the dev mailing list in case we miss
> anything, in addition to the PR review. Thanks.
>
> Cheng Su
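
For reference, the three steps in the quoted message can be sketched as a small in-memory simulation. This is only an illustration, not the code in the PR: the class and method names are hypothetical, the "state store" is a plain dict, and matching is assumed to be on key equality alone (no time-range condition).

```python
class FullOuterJoinSketch:
    """Toy model of a FULL OUTER stream-stream join (hypothetical names)."""

    def __init__(self):
        # One state store per side: key -> list of [event_time, value, matched]
        self.state = {"left": {}, "right": {}}

    def process(self, side, key, event_time, value):
        """Handle one input row; return the joined rows produced right away."""
        other = "right" if side == "left" else "left"
        out = []
        matched = False
        # Check for matches in the other side's state store.
        for row in self.state[other].get(key, []):
            row[2] = True  # the stored row has now been matched
            matched = True
            pair = (value, row[1]) if side == "left" else (row[1], value)
            out.append((key,) + pair)
        # Always keep the input row in this side's state store.
        self.state[side].setdefault(key, []).append([event_time, value, matched])
        return out

    def evict(self, watermark):
        """Drop rows below the watermark; emit those that never matched."""
        out = []
        for side in ("left", "right"):
            store = self.state[side]
            for key in list(store):
                keep = []
                for event_time, value, matched in store[key]:
                    if event_time >= watermark:
                        keep.append([event_time, value, matched])
                    elif not matched:
                        # Null-pad the missing side, as in left/right outer join.
                        if side == "left":
                            out.append((key, value, None))
                        else:
                            out.append((key, None, value))
                if keep:
                    store[key] = keep
                else:
                    del store[key]
        return out
```

In this sketch, the only difference from a left outer join is that `evict` null-pads unmatched rows from both sides instead of one, which is why the state kept per row is the same.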