[
https://issues.apache.org/jira/browse/SPARK-26389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16733815#comment-16733815
]
Fengyu Cao edited comment on SPARK-26389 at 1/4/19 4:03 AM:
------------------------------------------------------------
{quote}Temp checkpoint can be used in one-node scenario and deleted only if the
query didn't fail.
{quote}
Yes, and there are no logs or error messages saying that we *must* set a
non-temp checkpoint when running a framework non-locally.
And if we do this (run non-local with a temp checkpoint), the checkpoint dir on
the executor consumes a lot of space and is not deleted if the query fails, and
this checkpoint can't be used for recovery, as I mentioned above.
I just think that Spark should either prohibit users from using temp
checkpoints when their frameworks are non-local, or be responsible for
cleaning up this useless checkpoint directory even if the query fails.
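The second option (clean up the temp checkpoint even when the query fails) could be sketched like this. This is a pure-Python illustration of the proposed behavior, not Spark's actual code path; `run_query` and the `temporary-` prefix are stand-ins for the real query execution and the `/tmp/temporary-<uuid>` directory mentioned below:

```python
import shutil
import tempfile
from pathlib import Path

def run_with_temp_checkpoint(run_query):
    """Run a (hypothetical) streaming query against a temporary checkpoint
    directory, and delete the directory even if the query fails."""
    checkpoint_dir = Path(tempfile.mkdtemp(prefix="temporary-"))
    try:
        run_query(checkpoint_dir)
    finally:
        # Proposed behavior: a temp checkpoint cannot be used for recovery
        # anyway, so remove it unconditionally, success or failure.
        shutil.rmtree(checkpoint_dir, ignore_errors=True)
```

With a cleanup like this wired into the query's shutdown path, a failed query would no longer leave orphaned checkpoint directories on the executors.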
> temp checkpoint folder at executor should be deleted on graceful shutdown
> -------------------------------------------------------------------------
>
> Key: SPARK-26389
> URL: https://issues.apache.org/jira/browse/SPARK-26389
> Project: Spark
> Issue Type: Bug
> Components: Structured Streaming
> Affects Versions: 2.4.0
> Reporter: Fengyu Cao
> Priority: Major
>
> {{spark-submit --master mesos://<mesos> --conf
> spark.streaming.stopGracefullyOnShutdown=true <structured streaming
> framework>}}
> CTRL-C, framework shutdown
> {{18/12/18 10:27:36 ERROR MicroBatchExecution: Query [id =
> f512e17a-df88-4414-a5cd-a23550cf1e7f, runId =
> 24d99723-8d61-48c0-beab-af432f7a19d3] terminated with error
> org.apache.spark.SparkException: Writing job aborted.}}
> {{/tmp/temporary-<uuid> on executor not deleted due to
> org.apache.spark.SparkException: Writing job aborted., and this temp
> checkpoint can't be used for recovery.}}
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]