[
https://issues.apache.org/jira/browse/SPARK-26389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16733815#comment-16733815
]
Fengyu Cao edited comment on SPARK-26389 at 1/4/19 4:03 AM:
------------------------------------------------------------
{quote}Temp checkpoint can be used in one-node scenario and deleted only if the
query didn't fail.
{quote}
Yes, and there are no logs or error messages saying that we *must* set a
non-temp checkpoint when running a framework non-locally.
And if we do this (run non-local with a temp checkpoint), the checkpoint dir on
the executor consumes a lot of space and is not deleted if the query fails, and
this checkpoint can't be used for recovery, as I mentioned above.
I just think that Spark should either prohibit users from using temp
checkpoints when their frameworks are non-local, or be responsible for
cleaning up this useless checkpoint directory even if the query fails.
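The second option (clean up the temp checkpoint even when the query fails) could be sketched like this. This is a pure-Python illustration of the proposed behavior, not Spark's actual code path; `run_query` and the `temporary-` prefix are stand-ins for the real query execution and the `/tmp/temporary-<uuid>` directory mentioned below:

```python
import shutil
import tempfile
from pathlib import Path

def run_with_temp_checkpoint(run_query):
    """Run a (hypothetical) streaming query against a temporary checkpoint
    directory, and delete the directory even if the query fails."""
    checkpoint_dir = Path(tempfile.mkdtemp(prefix="temporary-"))
    try:
        run_query(checkpoint_dir)
    finally:
        # Proposed behavior: a temp checkpoint cannot be used for recovery
        # anyway, so remove it unconditionally, success or failure.
        shutil.rmtree(checkpoint_dir, ignore_errors=True)
```

With a cleanup like this wired into the query's shutdown path, a failed query would no longer leave orphaned checkpoint directories on the executors.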
> temp checkpoint folder at executor should be deleted on graceful shutdown
> -------------------------------------------------------------------------
>
> Key: SPARK-26389
> URL: https://issues.apache.org/jira/browse/SPARK-26389
> Project: Spark
> Issue Type: Bug
> Components: Structured Streaming
> Affects Versions: 2.4.0
> Reporter: Fengyu Cao
> Priority: Major
>
> {{spark-submit --master mesos://<mesos> --conf
> spark.streaming.stopGracefullyOnShutdown=true <structured streaming
> framework>}}
> CTRL-C, framework shutdown
> {{18/12/18 10:27:36 ERROR MicroBatchExecution: Query [id =
> f512e17a-df88-4414-a5cd-a23550cf1e7f, runId =
> 24d99723-8d61-48c0-beab-af432f7a19d3] terminated with error
> org.apache.spark.SparkException: Writing job aborted.}}
> {{/tmp/temporary-<uuid> on executor not deleted due to
> org.apache.spark.SparkException: Writing job aborted., and this temp
> checkpoint can't be used for recovery.}}
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]