Re: Best way to read batch from Kafka and Offsets

2020-02-05 Thread Ruijing Li
Looks like I’m wrong, since I tried that exact snippet and it worked. So to be clear, in the part where I do batchDF.write.parquet, that is not the exact code I’m using. I’m using a custom write function that does something similar to write.parquet but has some added functionality. Somehow my custom write

Re: Best way to read batch from Kafka and Offsets

2020-02-05 Thread Ruijing Li
Hi all, I tried with forEachBatch but got an error. Is this expected? Code is: df.writeStream.trigger(Trigger.Once).forEachBatch { (batchDF, batchId) => batchDF.write.parquet(hdfsPath) } .option("checkPointLocation", anotherHdfsPath) .start() Exception is: Queries with streaming sources must be
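For reference, a minimal sketch of the pattern under discussion, with placeholder names (df, hdfsPath, checkpointPath). Note that in the DataStreamWriter API the method is spelled foreachBatch and the option key is checkpointLocation, which differ from the capitalization in the message above:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.streaming.Trigger

// df is assumed to be a streaming DataFrame, e.g. read via
// spark.readStream.format("kafka")...load()
df.writeStream
  .trigger(Trigger.Once)                        // process available data once, then stop
  .foreachBatch { (batchDF: DataFrame, batchId: Long) =>
    // Inside foreachBatch, batchDF is a plain batch DataFrame,
    // so batch sinks like write.parquet are legal here.
    batchDF.write.mode("append").parquet(hdfsPath)
  }
  .option("checkpointLocation", checkpointPath) // Kafka offsets are tracked here
  .start()
  .awaitTermination()
```

This also matches the follow-up above: the snippet itself works because batchDF inside foreachBatch is no longer a streaming DataFrame, so any failure would come from the custom write function rather than the foreachBatch pattern.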

subscribe

2020-02-05 Thread Cool Joe
subscribe

SparkAppHandle can not stop application in yarn client mode

2020-02-05 Thread Zhang Victor
Hi all, when using SparkLauncher to start an app in yarn-client mode, sparkAppHandle#stop() cannot stop the application. SparkLauncher launcher = new SparkLauncher() .setAppName("My Launcher") .setJavaHome("/usr/bin/hadoop/software/java")

Re: Best way to read batch from Kafka and Offsets

2020-02-05 Thread Gourav Sengupta
Hi Burak, I am not quite used to streaming, but was almost thinking on the same lines :) makes a lot of sense to me now. Regards, Gourav On Wed, Feb 5, 2020 at 1:00 AM Burak Yavuz wrote: > Do you really want to build all of that and open yourself to bugs when you > can just use foreachBatch?