Can you try calling batchDF.unpersist() once the work is done in the loop?
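Something like this, as a rough sketch against your snippet (the persist() call is my addition, the usual pairing with unpersist(); otherwise isEmpty and show each run a separate action over the batch):

    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.streaming.Trigger

    val monitoring_stream = monitoring_df.writeStream
      .trigger(Trigger.ProcessingTime("120 seconds"))
      .foreachBatch { (batchDF: DataFrame, batchId: Long) =>
        batchDF.persist()                  // cache once so both actions below reuse it
        if (!batchDF.isEmpty) batchDF.show()
        batchDF.unpersist()                // release the cached blocks once the batch is done
      }
      .start()

Unpersisting at the end of each batch lets the executors free the blocks before the next trigger fires, instead of letting them accumulate on local disk.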
On Mon, Jul 20, 2020 at 3:38 PM Yong Yuan <yyuankm1...@gmail.com> wrote:

> It seems the following Structured Streaming code keeps consuming the
> usercache until all disk space is occupied.
>
>     val monitoring_stream = monitoring_df.writeStream
>       .trigger(Trigger.ProcessingTime("120 seconds"))
>       .foreachBatch { (batchDF: DataFrame, batchId: Long) =>
>         if (!batchDF.isEmpty) batchDF.show()
>       }
>       .start()
>
> I did not even call batchDF.persist(). Do I really need to save/write
> batchDF somewhere to release the usercache?
>
> I also tried calling spark.catalog.clearCache() explicitly in a loop,
> but that did not solve the problem either.
>
> The figure below also shows the cluster's capacity decreasing while
> this code runs.