Hi, Stream-stream join in spark structured streaming right now supports INNER, LEFT OUTER, RIGHT OUTER and LEFT SEMI join type. But it does not support FULL OUTER join and we are working on to add it in https://github.com/apache/spark/pull/30395 . Given LEFT OUTER and RIGHT OUTER stream-stream join is supported, the code needed for FULL OUTER join is actually quite straightforward:
* For left side input row, check if there's a match on right side state store. if there's a match, output the joined row, o.w. output nothing. Put the row in left side state store. * For right side input row, check if there's a match on left side state store. if there's a match, output the joined row, o.w. output nothing. Put the row in right side state store. * State store eviction: evict rows from left/right side state store below watermark, and output rows never matched before (a combination of left outer and right outer join). Given FULL OUTER join consumes same amount of space in state store, compared with INNER/LEFT OUTER/RIGH OUTER join, and pretty easy to add. I don’t see any issues from system perspective that FULL OUTER join should not be added. I am wondering is there any major blocker to add FULL OUTER stream-stream join? Asking in dev mailing list in case we miss anything besides PR review participation, thanks. Cheng Su -- Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/ --------------------------------------------------------------------- To unsubscribe e-mail: dev-unsubscr...@spark.apache.org