According to the documentation
http://spark.apache.org/docs/2.3.2/structured-streaming-programming-guide.html#stream-stream-joins
only Append output mode is supported for stream-stream joins:
*As of Spark 2.3, you can use joins only when the query is in Append output
mode. Other output modes are not yet supported.*
But the code performs no check on the output mode for this case:
// The output mode check is missing here (UnsupportedOperationChecker.scala)
case LeftOuter =>
  if (!left.isStreaming && right.isStreaming) {
    throwError("Left outer join with a streaming DataFrame/Dataset " +
      "on the right and a static DataFrame/Dataset on the left is not supported")
  } else if (left.isStreaming && right.isStreaming) {
    val watermarkInJoinKeys = StreamingJoinHelper.isWatermarkInJoinKeys(subPlan)

    val hasValidWatermarkRange =
      StreamingJoinHelper.getStateValueWatermark(
        left.outputSet, right.outputSet, condition, Some(1000000)).isDefined

    if (!watermarkInJoinKeys && !hasValidWatermarkRange) {
      throwError("Stream-stream outer join between two streaming DataFrame/Datasets " +
        "is not supported without a watermark in the join keys, or a watermark on " +
        "the nullable side and an appropriate range condition")
    }
  }
If the documentation is correct, I can raise a PR to fix the code.
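
To make the idea concrete, here is a minimal, self-contained sketch of the kind of output mode check that seems to be missing. It does not use Spark's real classes: OutputMode, JoinSide, OutputModeCheckSketch and checkStreamStreamJoin are hypothetical stand-ins invented for illustration, and the error wording is only a suggestion. An actual fix would have to live inside UnsupportedOperationChecker and use its existing throwError helper.

```scala
// Simplified stand-ins for illustration only; these are NOT Spark's real types.
sealed trait OutputMode
case object Append extends OutputMode
case object Update extends OutputMode
case object Complete extends OutputMode

// Hypothetical minimal model of one side of a join.
final case class JoinSide(isStreaming: Boolean)

object OutputModeCheckSketch {

  /** Returns Some(error message) if the join should be rejected, None otherwise. */
  def checkStreamStreamJoin(
      left: JoinSide,
      right: JoinSide,
      outputMode: OutputMode): Option[String] = {
    // Only stream-stream joins are restricted; stream-static joins pass through.
    if (left.isStreaming && right.isStreaming && outputMode != Append) {
      Some("Join between two streaming DataFrames/Datasets is not supported " +
        s"in $outputMode output mode, only in Append output mode")
    } else {
      None
    }
  }
}
```

With this sketch, a stream-stream join in Update or Complete mode yields an error message, while the same join in Append mode, or any join with a static side, yields None.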