Can you try calling batchDF.unpersist() once the work is done in the loop?
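
Roughly like the sketch below, based on your snippet. Note it is only a sketch: I added an explicit batchDF.persist() so there is something for unpersist() to release, and a .start() call just to make the example self-contained; the rest of the query setup is assumed unchanged.

    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.streaming.Trigger

    val monitoring_stream =
        monitoring_df.writeStream
            .trigger(Trigger.ProcessingTime("120 seconds"))
            .foreachBatch {
                (batchDF: DataFrame, batchId: Long) =>
                    // Cache the micro-batch so isEmpty and show() don't recompute it from the source.
                    batchDF.persist()
                    if (!batchDF.isEmpty) batchDF.show()
                    // Release the cached blocks before the next micro-batch arrives.
                    batchDF.unpersist()
            }
            .start()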

On Mon, Jul 20, 2020 at 3:38 PM Yong Yuan <yyuankm1...@gmail.com> wrote:

> It seems the following structured streaming code keeps consuming the
> usercache until all disk space is occupied.
>
> val monitoring_stream =
>         monitoring_df.writeStream
>             .trigger(Trigger.ProcessingTime("120 seconds"))
>             .foreachBatch {
>                 (batchDF: DataFrame, batchId: Long) =>
>                     if (!batchDF.isEmpty) batchDF.show()
>             }
>
>
> I did not even call batchDF.persist(). Do I really need to save/write
> batchDF somewhere to release the usercache?
>
> I also tried calling spark.catalog.clearCache() explicitly in a loop,
> but that does not solve the problem either.
>
> The attached figure also shows the capacity of the cluster decreasing
> while this code runs.
