[ https://issues.apache.org/jira/browse/SPARK-48309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated SPARK-48309:
-----------------------------------
    Labels: pull-request-available  (was: )

> Stop am retry, in situations where some errors and retries may not be successful
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-48309
>                 URL: https://issues.apache.org/jira/browse/SPARK-48309
>             Project: Spark
>          Issue Type: Improvement
>          Components: YARN
>    Affects Versions: 4.0.0
>            Reporter: guihuawen
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>
> In YARN cluster mode, spark.yarn.maxAppAttempts can be configured. In our
> production environment it is set to 2, so if the first attempt fails, the
> AM retries. However, in some scenarios even the second attempt may fail.
> For example:
> org.apache.spark.sql.AnalysisException: Table or view not found:
> test.testxxxx_xxxxx; line 1 pos 14;
> Project
> +- UnresolvedRelation [bigdata_qa, testxxxxx_xxxxx], [], false
>
> Another example:
> Caused by: org.apache.hadoop.hdfs.protocol.NSQuotaExceededException: The
> NameSpace quota (directories and files) of directory /tmp/xxx_file/xxxx is
> exceeded: quota=1000000 file count=1000001
> Would it be more appropriate to catch these exceptions and stop retrying?
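
To make the proposal above concrete, here is a minimal sketch of the decision it describes: skip the next attempt when the failure names an exception that a re-run cannot fix. It is not Spark or YARN code. The names `NON_RETRYABLE_EXCEPTIONS`, `is_retry_pointless` and `should_retry` are hypothetical, and matching on class names in the error text is an assumption made for illustration. Only the two exception classes and the attempt limit of 2 come from the issue description.

```python
# Hypothetical sketch: none of these names are Spark or YARN APIs.
# The two class names are the examples quoted in the issue description.
NON_RETRYABLE_EXCEPTIONS = (
    "org.apache.spark.sql.AnalysisException",
    "org.apache.hadoop.hdfs.protocol.NSQuotaExceededException",
)


def is_retry_pointless(error_text: str) -> bool:
    """Return True if the failure text names an exception a re-run cannot fix."""
    return any(name in error_text for name in NON_RETRYABLE_EXCEPTIONS)


def should_retry(error_text: str, attempt: int, max_attempts: int) -> bool:
    """Decide whether another application attempt is worth launching.

    attempt is the 1-based number of the attempt that just failed;
    max_attempts plays the role of spark.yarn.maxAppAttempts.
    """
    if attempt >= max_attempts:
        return False  # attempt budget exhausted
    return not is_retry_pointless(error_text)
```

With the limit set to 2 as in the report, `should_retry("org.apache.spark.sql.AnalysisException: Table or view not found", 1, 2)` returns `False`, while an error that names neither listed class would still get its second attempt.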