I'm just circling back to this now. Is the commit protocol an acceptable way
of making this configurable? I could make the temp path (currently
"_temporary") configurable, if that is what you are referring to.
Michael Armbrust wrote
> We didn't go this way initially because it doesn't work on st
Hi Mina,
I believe this works differently for the Structured Streaming Kafka source
specifically. I'm assuming you are using Structured Streaming based on the
name of the dependency ("spark-streaming-kafka"). There is a note in the
docs here:
https://spark.apache.org/docs/2.2.0/structured-streaming-kafka-int
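For what it's worth, the two Kafka integrations ship as different artifacts, which is why the dependency name matters. A minimal sketch, assuming Spark 2.2.0 and sbt; the broker address and topic name are placeholders:

```scala
// Sketch only: the Structured Streaming Kafka source lives in the
// spark-sql-kafka artifact; spark-streaming-kafka is the older DStreams
// integration. In sbt, for Spark 2.2.0:
//
//   libraryDependencies += "org.apache.spark" %% "spark-sql-kafka-0-10" % "2.2.0"
//
// Minimal read, per the linked integration guide
// ("host:port" and "topic1" are placeholders):
val df = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "host:port")
  .option("subscribe", "topic1")
  .load()
```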
Hi Priyank,
I have a similar structure, although I am reading from Kafka and sinking to
multiple MySQL tables. My input stream has multiple message types and each
is headed for a different MySQL table.
I've looked for a solution for a few months, and have only come up with two
alternatives:
1. Si
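The routing step described above (one input stream carrying several message types, each bound for its own table) can be sketched without any Spark machinery; `Msg` and its type field are hypothetical stand-ins for whatever the real schema looks like:

```scala
// Hypothetical record type; in practice this would come from parsing
// the Kafka message value.
case class Msg(msgType: String, payload: String)

// Group a micro-batch of records by message type, so each group can be
// written to its own MySQL table (e.g. from inside a per-batch writer).
def routeByType(batch: Seq[Msg]): Map[String, Seq[Msg]] =
  batch.groupBy(_.msgType)
```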
Considering the @transient annotations and the work done in the instance
initializer, not much state is really being broadcast to the executors. It
might be simpler to just create these instances on the executors, rather
than trying to broadcast them?
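The pattern suggested above can be sketched as follows; `Categorizer` and its regex are hypothetical examples of the kind of state involved. A @transient lazy val is skipped during serialization and rebuilt on first use wherever the object is deserialized, so nothing heavy needs to travel with the closure:

```scala
import scala.util.matching.Regex

class Categorizer extends Serializable {
  // @transient: not serialized with the task closure;
  // lazy: rebuilt locally on first use on each executor.
  @transient lazy val typePattern: Regex = "type=(\\w+)".r

  def categorize(message: String): String =
    typePattern.findFirstMatchIn(message).map(_.group(1)).getOrElse("unknown")
}
```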
Hello list,
We have a Spark application that performs a set of ETLs: reading messages
from a Kafka topic, categorizing them, and writing the contents out as
Parquet files on HDFS. After writing, we query the data from HDFS
using Presto's Hive integration. We are having problems because the