[jira] [Commented] (SPARK-27891) Long running spark jobs fail because of HDFS delegation token expires
[ https://issues.apache.org/jira/browse/SPARK-27891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16981377#comment-16981377 ]

Jungtaek Lim commented on SPARK-27891:
--------------------------------------

It would be really nice if we had some logs from Spark 2.4.x, since someone added Spark 2.4.1 as an Affected Version.

> Long running spark jobs fail because of HDFS delegation token expires
> ---------------------------------------------------------------------
>
>                 Key: SPARK-27891
>                 URL: https://issues.apache.org/jira/browse/SPARK-27891
>             Project: Spark
>          Issue Type: Bug
>          Components: Security
>    Affects Versions: 2.0.1, 2.1.0, 2.3.1, 2.4.1
>            Reporter: hemshankar sahu
>            Priority: Critical
>         Attachments: application_1559242207407_0001.log, spark_2.3.1_failure.log
>
>
> When the spark job runs on a secured cluster for longer than the time
> specified by the dfs.namenode.delegation.token.renew-interval property of
> hdfs-site.xml, the spark job fails.
>
> The following command was used to submit the spark job:
> bin/spark-submit --principal acekrbuser --keytab ~/keytabs/acekrbuser.keytab
> --master yarn --deploy-mode cluster examples/src/main/python/wordcount.py
> /tmp/ff1.txt
>
> Application Logs attached

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
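Editorial note: the issue description ties the failure to the value of dfs.namenode.delegation.token.renew-interval in hdfs-site.xml. The following is a minimal Python sketch, not part of the original thread, that reads that property from the text of an hdfs-site.xml and checks whether a given job runtime exceeds it. The function names, the sample XML, and the fallback value are illustrative assumptions; the fallback of 86400000 ms (24 hours) is assumed to match the stock Hadoop default and should be checked against the cluster's own hdfs-default.xml.

```python
import xml.etree.ElementTree as ET

RENEW_INTERVAL_KEY = "dfs.namenode.delegation.token.renew-interval"
# Assumption: stock Hadoop default of 24 hours, used when the key is not set.
ASSUMED_DEFAULT_MS = 86_400_000


def renew_interval_ms(hdfs_site_xml: str) -> int:
    """Return the renew interval in milliseconds from hdfs-site.xml text."""
    root = ET.fromstring(hdfs_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == RENEW_INTERVAL_KEY:
            return int(prop.findtext("value", "").strip())
    return ASSUMED_DEFAULT_MS


def outlives_renew_interval(runtime_ms: int, hdfs_site_xml: str) -> bool:
    """True if a job running for runtime_ms would pass the renew interval."""
    return runtime_ms > renew_interval_ms(hdfs_site_xml)


# Hypothetical configuration with a 10-minute renew interval, the kind of
# short value one might set on a test cluster to reproduce the failure quickly.
SAMPLE_HDFS_SITE = """
<configuration>
  <property>
    <name>dfs.namenode.delegation.token.renew-interval</name>
    <value>600000</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    fifteen_minutes_ms = 15 * 60 * 1000
    print(outlives_renew_interval(fifteen_minutes_ms, SAMPLE_HDFS_SITE))
```

The check only tells you whether a run is long enough to reach the condition described in the report; it does not say whether Spark will renew or re-obtain the token in a given release.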
[jira] [Commented] (SPARK-27891) Long running spark jobs fail because of HDFS delegation token expires
[ https://issues.apache.org/jira/browse/SPARK-27891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932981#comment-16932981 ]

avinash v kodikal commented on SPARK-27891:
-------------------------------------------

[~vanzin] - Did you get a chance to look at the latest logs? Please let us know if this can be addressed in the ongoing Spark release.

> Long running spark jobs fail because of HDFS delegation token expires
> ---------------------------------------------------------------------
>
>                 Key: SPARK-27891
>                 URL: https://issues.apache.org/jira/browse/SPARK-27891
>             Project: Spark
>          Issue Type: Bug
>          Components: Security
>    Affects Versions: 2.0.1, 2.1.0, 2.3.1, 2.4.1
>            Reporter: hemshankar sahu
>            Priority: Critical
>         Attachments: application_1559242207407_0001.log, spark_2.3.1_failure.log
>
>
> When the spark job runs on a secured cluster for longer than the time
> specified by the dfs.namenode.delegation.token.renew-interval property of
> hdfs-site.xml, the spark job fails.
>
> The following command was used to submit the spark job:
> bin/spark-submit --principal acekrbuser --keytab ~/keytabs/acekrbuser.keytab
> --master yarn --deploy-mode cluster examples/src/main/python/wordcount.py
> /tmp/ff1.txt
>
> Application Logs attached
[jira] [Commented] (SPARK-27891) Long running spark jobs fail because of HDFS delegation token expires
[ https://issues.apache.org/jira/browse/SPARK-27891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16853140#comment-16853140 ]

hemshankar sahu commented on SPARK-27891:
-----------------------------------------

Attached log for Spark 2.3.1 (spark_2.3.1_failure.log).

> Long running spark jobs fail because of HDFS delegation token expires
> ---------------------------------------------------------------------
>
>                 Key: SPARK-27891
>                 URL: https://issues.apache.org/jira/browse/SPARK-27891
>             Project: Spark
>          Issue Type: Bug
>          Components: Security
>    Affects Versions: 2.0.1, 2.1.0, 2.3.1, 2.4.1
>            Reporter: hemshankar sahu
>            Priority: Major
>         Attachments: application_1559242207407_0001.log, spark_2.3.1_failure.log
>
>
> When the spark job runs on a secured cluster for longer than the time
> specified by the dfs.namenode.delegation.token.renew-interval property of
> hdfs-site.xml, the spark job fails.
>
> The following command was used to submit the spark job:
> bin/spark-submit --principal acekrbuser --keytab ~/keytabs/acekrbuser.keytab
> --master yarn --deploy-mode cluster examples/src/main/python/wordcount.py
> /tmp/ff1.txt
>
> Application Logs attached

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (SPARK-27891) Long running spark jobs fail because of HDFS delegation token expires
[ https://issues.apache.org/jira/browse/SPARK-27891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16852573#comment-16852573 ]

hemshankar sahu commented on SPARK-27891:
-----------------------------------------

Sure, I'll provide logs for 2.3.1 in some time, as the job needs to run for a long time.

> Long running spark jobs fail because of HDFS delegation token expires
> ---------------------------------------------------------------------
>
>                 Key: SPARK-27891
>                 URL: https://issues.apache.org/jira/browse/SPARK-27891
>             Project: Spark
>          Issue Type: Bug
>          Components: Security
>    Affects Versions: 2.0.1, 2.1.0, 2.3.1, 2.4.1
>            Reporter: hemshankar sahu
>            Priority: Major
>         Attachments: application_1559242207407_0001.log
>
>
> When the spark job runs on a secured cluster for longer than the time
> specified by the dfs.namenode.delegation.token.renew-interval property of
> hdfs-site.xml, the spark job fails.
>
> The following command was used to submit the spark job:
> bin/spark-submit --principal acekrbuser --keytab ~/keytabs/acekrbuser.keytab
> --master yarn --deploy-mode cluster examples/src/main/python/wordcount.py
> /tmp/ff1.txt
>
> Application Logs attached
[jira] [Commented] (SPARK-27891) Long running spark jobs fail because of HDFS delegation token expires
[ https://issues.apache.org/jira/browse/SPARK-27891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16852420#comment-16852420 ]

Marcelo Vanzin commented on SPARK-27891:
----------------------------------------

OK, the updated logs show the issue. But they're from Spark 2.2.0, which is EOL. If you can provide logs from the latest 2.3 or 2.4 releases, that would be more helpful (since there have been a few changes in this area).

> Long running spark jobs fail because of HDFS delegation token expires
> ---------------------------------------------------------------------
>
>                 Key: SPARK-27891
>                 URL: https://issues.apache.org/jira/browse/SPARK-27891
>             Project: Spark
>          Issue Type: Bug
>          Components: Security
>    Affects Versions: 2.0.1, 2.1.0, 2.3.1, 2.4.1
>            Reporter: hemshankar sahu
>            Priority: Major
>         Attachments: application_1559242207407_0001.log
>
>
> When the spark job runs on a secured cluster for longer than the time
> specified by the dfs.namenode.delegation.token.renew-interval property of
> hdfs-site.xml, the spark job fails.
>
> The following command was used to submit the spark job:
> bin/spark-submit --principal acekrbuser --keytab ~/keytabs/acekrbuser.keytab
> --master yarn --deploy-mode cluster examples/src/main/python/wordcount.py
> /tmp/ff1.txt
>
> Application Logs attached
[jira] [Commented] (SPARK-27891) Long running spark jobs fail because of HDFS delegation token expires
[ https://issues.apache.org/jira/browse/SPARK-27891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16852408#comment-16852408 ]

Marcelo Vanzin commented on SPARK-27891:
----------------------------------------

{{container_e48_1559242207407_0001_02_01}} tells me that's the second attempt. That makes your problem the same as SPARK-23361.

> Long running spark jobs fail because of HDFS delegation token expires
> ---------------------------------------------------------------------
>
>                 Key: SPARK-27891
>                 URL: https://issues.apache.org/jira/browse/SPARK-27891
>             Project: Spark
>          Issue Type: Bug
>          Components: Security
>    Affects Versions: 2.0.1, 2.1.0, 2.3.1, 2.4.1
>            Reporter: hemshankar sahu
>            Priority: Major
>
> When the spark job runs on a secured cluster for longer than the time
> specified by the dfs.namenode.delegation.token.renew-interval property of
> hdfs-site.xml, the spark job fails.
>
> The following command was used to submit the spark job:
> bin/spark-submit --principal acekrbuser --keytab ~/keytabs/acekrbuser.keytab
> --master yarn --deploy-mode cluster examples/src/main/python/wordcount.py
> /tmp/ff1.txt
>
> Application Logs pasted in Docs Text
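Editorial note: the comment above reads the application attempt number off the container ID. As a rough illustration that is not part of the original thread, the Python sketch below extracts that field, assuming the usual YARN layout container_[e<epoch>_]<clusterTimestamp>_<appId>_<attemptId>_<containerId>. The regular expression and the function name are illustrative, not a YARN API, and the digit counts are left open because the ID quoted in the comment has a shorter final field than YARN normally prints.

```python
import re

# Assumed layout of a YARN container ID (the epoch part is optional):
#   container_[e<epoch>_]<clusterTimestamp>_<appId>_<attemptId>_<containerId>
CONTAINER_ID = re.compile(
    r"^container_(?:e(?P<epoch>\d+)_)?"
    r"(?P<cluster_ts>\d+)_(?P<app>\d+)_(?P<attempt>\d+)_(?P<container>\d+)$"
)


def attempt_number(container_id: str) -> int:
    """Return the application attempt number encoded in a container ID."""
    match = CONTAINER_ID.match(container_id)
    if match is None:
        raise ValueError(f"not a recognizable container ID: {container_id!r}")
    return int(match.group("attempt"))


if __name__ == "__main__":
    # The ID quoted in the comment above; the attempt field is "02".
    print(attempt_number("container_e48_1559242207407_0001_02_01"))  # 2
```

An attempt number greater than 1 means the application master was restarted at least once, which is the condition the comment links to SPARK-23361.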