I am trying to create a Spark Structured Streaming job that reads from a
Kafka topic whose events arrive with different schemas (there is no
standard schema for the incoming events).
Sample incoming events:
event1: {timestamp:2018-09-28T15:50:57.2420418+00:00, value:
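One common approach in this situation is to read the Kafka `value` as a plain string and dispatch each record by the fields it actually carries. A minimal sketch of that dispatch idea, stripped of Spark for brevity (the event payloads and field names below are hypothetical, invented for illustration):

```python
import json

# Toy events with no shared schema, like the Kafka payloads above
# (field names here are hypothetical, for illustration only).
raw_events = [
    '{"timestamp": "2018-09-28T15:50:57+00:00", "value": 42}',
    '{"timestamp": "2018-09-28T15:51:02+00:00", "name": "sensor-1", "reading": 3.14}',
]

def parse_event(raw: str) -> dict:
    """Parse one JSON event and tag it with the set of fields it carries,
    so downstream code can route each record to a matching handler."""
    event = json.loads(raw)
    event["_schema"] = tuple(sorted(event))
    return event

parsed = [parse_event(e) for e in raw_events]
```

In an actual Structured Streaming job, the equivalent step would be casting the Kafka `value` column to a string and then applying per-schema JSON parsing to each routed subset.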
I need to create some Hive test tables for PyCharm.
SPARK_HOME is set to
D:\temp\spark-3.0.1-bin-hadoop2.7
HADOOP_HOME is
c:\hadoop\
spark-shell works, but when I try to run spark-sql I get the following errors:
PS C:\tmp\hive> spark-sql
log4j:WARN No appenders could be found for logger
(org.apache.h
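That log4j warning usually means no log4j configuration file was found on the classpath. Spark 3.0.1 uses log4j 1.x and ships a template under `conf/`; a minimal configuration (assuming you place it at `%SPARK_HOME%\conf\log4j.properties`) would look like:

```properties
# Minimal log4j 1.x configuration for Spark's conf directory
log4j.rootCategory=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```

This only silences the warning; it does not by itself fix Hive-related failures from spark-sql on Windows.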
If you use the `flatMap/mapGroupsWithState` API for a "stateful" SS job,
the blacklisting structure can be kept in the user-defined state.
Using a third-party cache would also be a good choice.
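To make the "blacklist in user-defined state" idea concrete, here is a toy, framework-free sketch of the per-key update step (all names are hypothetical; in Spark, `flatMap/mapGroupsWithState` stores and hands you the state object for each key):

```python
# Toy per-key state update: state accumulates blacklisted IDs, and only
# events whose id is not (yet) blacklisted are emitted downstream.
def update_group(state: set, events: list) -> tuple:
    """Process one key's micro-batch of events against its state."""
    out = []
    for ev in events:
        if ev.get("blacklist"):
            state.add(ev["id"])          # remember this id in state
        elif ev["id"] not in state:
            out.append(ev)               # pass through non-blacklisted events
    return state, out

state, out = update_group(set(), [
    {"id": "a"},
    {"id": "b", "blacklist": True},
    {"id": "b"},
])
```

In Spark the returned state would be saved via `GroupState.update`, with a timeout configured so stale keys eventually expire.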
Eric Beabes wrote on Wednesday, November 11, 2020, at 6:54 AM:
> Currently we’ve a “Stateful” Spark Structured Streaming job th
Maybe you can try the `foreachBatch` API in Structured Streaming, which
allows reusing existing batch data sources.
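The contract behind `foreachBatch` is simple: the engine calls your function once per micro-batch with the batch's DataFrame and a batch id, and inside it you may use any ordinary batch writer. A hedged sketch of that callback pattern, with a plain loop standing in for the streaming engine (all names here are hypothetical):

```python
# Stand-in for the foreachBatch contract: the engine invokes
# write_batch(batch, batch_id) once per micro-batch.
written = []

def write_batch(batch, batch_id):
    # In Spark this body could use any batch sink,
    # e.g. batch.write.format("jdbc").save(...)
    written.append((batch_id, list(batch)))

# Driver loop standing in for .writeStream.foreachBatch(write_batch).start()
for batch_id, batch in enumerate([["e1", "e2"], ["e3"]]):
    write_batch(batch, batch_id)
```

Because the same batch can be redelivered after a failure, the write should be idempotent, typically keyed on the batch id.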
--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/