Github user stczwd commented on the issue:

    https://github.com/apache/spark/pull/22575

@WangTaoTheTonic Adding the 'stream' keyword has two purposes:
- **Mark the entire SQL query as a stream query and generate the SQLStreaming plan tree.**
- **Mark the table type as UnResolvedStreamRelation.** This lets the table be parsed as a StreamingRelation or as another Relation, which matters especially in queries that join a stream with batch data, such as Kafka join MySQL.

**Besides, the keyword 'stream' makes it easier to express Structured Streaming with pure SQL.**

A small example to show the importance of 'stream': read a stream from a Kafka stream table and join it with MySQL to count user messages.
- with 'stream'
  - `select stream kafka_sql_test.name, count(door) from kafka_sql_test inner join mysql_test on kafka_sql_test.name == mysql_test.name group by kafka_sql_test.name`
    - **It will be regarded as a Streaming Query using the Console Sink.** kafka_sql_test will be parsed as a StreamingRelation and mysql_test will be parsed as a JDBCRelation, not a StreamingRelation.
  - `insert into csv_sql_table select stream kafka_sql_test.name, count(door) from kafka_sql_test inner join mysql_test on kafka_sql_test.name == mysql_test.name group by kafka_sql_test.name`
    - **It will be regarded as a Streaming Query using the FileStream Sink.** kafka_sql_test will be parsed as a StreamingRelation and mysql_test will be parsed as a JDBCRelation, not a StreamingRelation.
- without 'stream'
  - `select kafka_sql_test.name, count(door) from kafka_sql_test inner join mysql_test on kafka_sql_test.name == mysql_test.name group by kafka_sql_test.name`
    - **It will be regarded as a Batch Query.** kafka_sql_test will be parsed as a KafkaRelation and mysql_test will be parsed as a JDBCRelation.
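
The rules above can be summarized in a small toy model. This is only a sketch of the behaviour described in this comment, not the parser from the PR: the `classify` function, the `Plan` class, the `TABLE_SOURCES` catalog, and the sink labels `ConsoleSink` / `FileStreamSink` are all hypothetical names introduced for illustration, and the whitespace tokenizer only handles queries shaped like the examples.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical catalog: which source backs each table in the example queries.
TABLE_SOURCES = {"kafka_sql_test": "kafka", "mysql_test": "jdbc"}


@dataclass
class Plan:
    mode: str                  # "streaming" or "batch"
    sink: Optional[str]        # only modelled for streaming queries
    relations: Dict[str, str]  # table name -> relation type


def classify(sql: str) -> Plan:
    """Toy model of the rules described above; not the PR's parser."""
    tokens = sql.lower().split()

    # 'stream' directly after 'select' marks the whole query as streaming.
    streaming = any(
        a == "select" and b == "stream" for a, b in zip(tokens, tokens[1:])
    )

    # Source tables are the names that follow FROM or JOIN.
    tables = [b for a, b in zip(tokens, tokens[1:]) if a in ("from", "join")]

    relations = {}
    for table in tables:
        source = TABLE_SOURCES.get(table)
        if source == "kafka":
            relations[table] = "StreamingRelation" if streaming else "KafkaRelation"
        elif source == "jdbc":
            relations[table] = "JDBCRelation"  # static side, never streaming
        else:
            relations[table] = "UnknownRelation"  # not covered by the comment

    sink = None
    if streaming:
        # INSERT INTO a file-backed table -> file sink; plain SELECT -> console.
        sink = "FileStreamSink" if tokens[:2] == ["insert", "into"] else "ConsoleSink"

    return Plan("streaming" if streaming else "batch", sink, relations)


query = (
    "select stream kafka_sql_test.name, count(door) "
    "from kafka_sql_test inner join mysql_test "
    "on kafka_sql_test.name == mysql_test.name "
    "group by kafka_sql_test.name"
)
print(classify(query))
```

Dropping `stream` from `query` switches the modelled plan to batch mode and turns kafka_sql_test into a KafkaRelation, while mysql_test stays a JDBCRelation in both cases.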