Another thing you may want to be aware of: if the sink is not idempotent,
your query result is also not idempotent. For fault tolerance there is a
chance for a record (row) to be replayed (recomputed).
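To make the replay point concrete, here is a small sketch (plain Scala, not Spark API; the sink names are illustrative) simulating what happens when the engine replays a batch against an append-only sink versus an idempotent, keyed sink:

```scala
import scala.collection.mutable

// Append-only sink: replaying a batch produces duplicate rows.
val appendSink = mutable.ArrayBuffer.empty[(Long, String)]
def appendWrite(id: Long, value: String): Unit =
  appendSink += ((id, value))

// Idempotent sink keyed by record id: replaying the same batch is harmless.
val upsertSink = mutable.Map.empty[Long, String]
def upsertWrite(id: Long, value: String): Unit =
  upsertSink(id) = value

val batch = Seq((1L, "a"), (2L, "b"))

// The batch is written once, then replayed (e.g. the engine recovered
// from a failure before the commit was recorded) and written again.
for (_ <- 1 to 2; (id, v) <- batch) {
  appendWrite(id, v)
  upsertWrite(id, v)
}

println(appendSink.size) // 4 rows: the replay duplicated the batch
println(upsertSink.size) // 2 rows: replay left the result unchanged
```

With an idempotent write, replays change nothing, so the query result stays exactly-once even though records may be recomputed.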
-Jungtaek Lim (HeartSaVioR)
On Tue, Apr 24, 2018 at 2:07 PM, Jörn Franke wrote:
> What is your use case?
Hi all,
I am using the following code to persist data into S3 (the AWS keys are
already stored in environment variables):
dataFrame.coalesce(1).write.format("com.databricks.spark.csv").save(fileName)
However, I keep receiving an exception that the file does not exist.
Here's what comes fro
Thanks for any help!
On Mon, Apr 23, 2018 at 11:46 AM, Lian Jiang wrote:
> Hi,
>
> I am using Spark structured streaming, which reads jsonl files and writes
> into parquet files. I am wondering what the process is if the jsonl files'
> schema changes.
>
> Suppose jsonl files are generated in the \jsonl folder
I guess you can wait for termination, catch the exception, and then restart
the query in a loop. Something like:
while (true) {
  try {
    val query = df.writeStream
      …
      .start()
    query.awaitTermination()
  } catch {
    case e: StreamingQueryException =>
      // log the error, then loop around and restart the query
  }
}
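The restart-in-a-loop pattern above can be sketched generically outside Spark (the names here are illustrative, not Spark API): retry a failing task until it succeeds or a retry budget runs out, catching the exception each time instead of letting it kill the process.

```scala
// Generic restart loop: run `task`, catching failures and retrying,
// up to `maxAttempts` times. Returns the number of attempts made.
def runWithRestarts(maxAttempts: Int)(task: () => Unit): Int = {
  var attempts = 0
  var done = false
  while (!done && attempts < maxAttempts) {
    attempts += 1
    try {
      task()
      done = true
    } catch {
      case _: RuntimeException =>
        // log the failure here, then loop around and restart
    }
  }
  attempts
}

// A simulated task that fails twice with a transient error, then succeeds.
var calls = 0
val attempts = runWithRestarts(maxAttempts = 5) { () =>
  calls += 1
  if (calls < 3) throw new RuntimeException("transient failure")
}

println(attempts) // 3: two failed attempts plus the successful one
```

A bounded retry count (rather than `while (true)`) is worth considering in practice, so a permanently broken query does not restart forever.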