[
https://issues.apache.org/jira/browse/SPARK-48309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dongjoon Hyun updated SPARK-48309:
----------------------------------
Affects Version/s: 4.2.0
(was: 4.0.0)
> Stop am retry, in situations where some errors and retries may not be
> successful
> --------------------------------------------------------------------------------
>
> Key: SPARK-48309
> URL: https://issues.apache.org/jira/browse/SPARK-48309
> Project: Spark
> Issue Type: Improvement
> Components: YARN
> Affects Versions: 4.2.0
> Reporter: guihuawen
> Priority: Major
> Labels: pull-request-available
>
> In YARN cluster mode, spark.yarn.maxAppAttempts can be configured. In our
> production environment it is set to 2, so if the first attempt fails, the
> AM retries. However, in some scenarios the second attempt is likely to
> fail as well.
> For example:
> org.apache.spark.sql.AnalysisException: Table or view not found:
> test.testxxxx_xxxxx; line 1 pos 14;
> Project
> +- UnresolvedRelation [bigdata_qa, testxxxxx_xxxxx], [], false
>
> Another example:
> Caused by: org.apache.hadoop.hdfs.protocol.NSQuotaExceededException: The
> NameSpace quota (directories and files) of directory /tmp/xxx_file/xxxx is
> exceeded: quota=1000000 file count=1000001
> Would it be more appropriate to catch these exceptions and stop retrying
> in such cases?
>
>
>
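The idea in the description can be illustrated with a minimal sketch. It is not Spark's ApplicationMaster code or any existing Spark or YARN API: the function names, the `NON_RETRYABLE` list, and the notion of matching on failure diagnostics text are all assumptions made for illustration. It only shows the decision being asked for, namely skipping a further attempt when the failure names an exception that a retry cannot fix.

```python
# Hypothetical list of exception classes for which a retry is assumed to be
# pointless. The two entries are the examples quoted in the description.
NON_RETRYABLE = (
    "org.apache.spark.sql.AnalysisException",
    "org.apache.hadoop.hdfs.protocol.NSQuotaExceededException",
)


def is_retryable(diagnostics: str) -> bool:
    """Return False if the failure text names a known non-retryable exception."""
    return not any(name in diagnostics for name in NON_RETRYABLE)


def should_retry(diagnostics: str, attempt: int, max_attempts: int) -> bool:
    """Decide whether another attempt is worthwhile.

    `attempt` is the 1-based number of the attempt that just failed and
    `max_attempts` plays the role of spark.yarn.maxAppAttempts.
    """
    return attempt < max_attempts and is_retryable(diagnostics)
```

With `max_attempts` set to 2 as in the description, a first attempt that failed with an `AnalysisException` would not be retried, while one that failed with an unrelated I/O error would still get its second attempt.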
--
This message was sent by Atlassian Jira
(v8.20.10#820010)