[jira] [Commented] (SPARK-31685) Spark structured streaming with Kafka fails with HDFS_DELEGATION_TOKEN expiration issue

2020-05-12 Thread Rajeev Kumar (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105309#comment-17105309
 ] 

Rajeev Kumar commented on SPARK-31685:
--

Another ticket is open for a similar issue:

https://issues.apache.org/jira/browse/SPARK-26385

I had posted my comments on that Jira and was asked to create a new one.


[jira] [Commented] (SPARK-31685) Spark structured streaming with Kafka fails with HDFS_DELEGATION_TOKEN expiration issue

2020-05-12 Thread Rajeev Kumar (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105307#comment-17105307
 ] 

Rajeev Kumar commented on SPARK-31685:
--

I did some testing and found an issue in HadoopFSDelegationTokenProvider. I 
might be wrong, so please validate the analysis below.

Spark does not renew the token; rather, it creates a new token at the scheduled 
interval.

Spark needs two HDFS_DELEGATION_TOKENs: one for the resource manager and a 
second for the application user.
{code:java}
val fetchCreds = fetchDelegationTokens(getTokenRenewer(hadoopConf), fsToGetTokens, creds)

// Get the token renewal interval if it is not set. It will only be called once.
if (tokenRenewalInterval == null) {
  tokenRenewalInterval = getTokenRenewalInterval(hadoopConf, sparkConf, fsToGetTokens)
}
{code}
On the first call to obtainDelegationTokens, it correctly creates TWO tokens.

The token for the resource manager is created by the fetchDelegationTokens 
method.

The token for the application user is created inside the 
getTokenRenewalInterval method.
{code:java}
private var tokenRenewalInterval: Option[Long] = null
{code}
{code:java}
sparkConf.get(PRINCIPAL).flatMap { renewer =>
  val creds = new Credentials()
  fetchDelegationTokens(renewer, filesystems, creds)
{code}
But after 18 hours (or whatever the renewal period is), when the scheduled 
thread of AMCredentialRenewer tries to create the HDFS_DELEGATION_TOKEN again, 
it creates only one token (for the resource manager, as a result of the call 
to the fetchDelegationTokens method).

It does not create the HDFS_DELEGATION_TOKEN for the application user, because 
tokenRenewalInterval is NOT null this time. Hence, once the 
HDFS_DELEGATION_TOKEN expires (typically after 24 hrs), Spark fails to update 
the checkpoint directory and the job dies.
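
For reference, the failing update is the per-micro-batch checkpoint check. 
Roughly, paraphrasing the calls from the stack trace (the checkpoint path here 
is hypothetical):
{code:java}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileContext, Path}

// HDFSMetadataLog.add/get first checks whether the batch file already exists
// in the checkpoint directory. This RPC needs a valid HDFS_DELEGATION_TOKEN
// and is what throws InvalidToken once the token expires.
val fc = FileContext.getFileContext(new Configuration())
val batchFile = new Path("hdfs:///user/app/checkpoint/offsets/42")
val exists = fc.util.exists(batchFile)
{code}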

As part of my testing, I simply called getTokenRenewalInterval in an else 
block, and the job is running fine. It did not die after 24 hrs.
{code:java}
if (tokenRenewalInterval == null) {
// I put this custom log
  logInfo("Token Renewal interval is null. Calling getTokenRenewalInterval "
+ getTokenRenewer(hadoopConf))
  tokenRenewalInterval =
getTokenRenewalInterval(hadoopConf, sparkConf, fsToGetTokens)
} else {
// I put this custom log
  logInfo("Token Renewal interval is NOT null. Calling getTokenRenewalInterval "
+ getTokenRenewer(hadoopConf))
  getTokenRenewalInterval(hadoopConf, sparkConf, fsToGetTokens)
}
{code}
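A cleaner shape for this workaround might be to fetch the application user's 
token unconditionally and keep the interval computation one-time only, instead 
of relying on the side effect of getTokenRenewalInterval. This is only a 
sketch of the idea (untested, reusing the names from the snippets above):
{code:java}
// Sketch only: recreate the user-renewable HDFS_DELEGATION_TOKEN on every
// scheduled run, into the same credentials that are distributed to the app.
sparkConf.get(PRINCIPAL).foreach { renewer =>
  fetchDelegationTokens(renewer, fsToGetTokens, creds)
}

// Compute the renewal interval only once, as before.
if (tokenRenewalInterval == null) {
  tokenRenewalInterval = getTokenRenewalInterval(hadoopConf, sparkConf, fsToGetTokens)
}
{code}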
Driver logs (note the entries are roughly 18 hours apart, matching the 
scheduled token creation):
{code:java}
20/05/01 14:36:19 INFO HadoopFSDelegationTokenProvider: Token Renewal interval 
is null. Calling getTokenRenewalInterval rm/host:port
20/05/02 08:36:42 INFO HadoopFSDelegationTokenProvider: Token Renewal interval 
is NOT null. Calling getTokenRenewalInterval rm/host:port
20/05/03 02:37:00 INFO HadoopFSDelegationTokenProvider: Token Renewal interval 
is NOT null. Calling getTokenRenewalInterval rm/host:port
20/05/03 20:37:18 INFO HadoopFSDelegationTokenProvider: Token Renewal interval 
is NOT null. Calling getTokenRenewalInterval rm/host:port
{code}


[jira] [Created] (SPARK-31685) Spark structured streaming with Kafka fails with HDFS_DELEGATION_TOKEN expiration issue

2020-05-12 Thread Rajeev Kumar (Jira)
Rajeev Kumar created SPARK-31685:


 Summary: Spark structured streaming with Kafka fails with 
HDFS_DELEGATION_TOKEN expiration issue
 Key: SPARK-31685
 URL: https://issues.apache.org/jira/browse/SPARK-31685
 Project: Spark
  Issue Type: Bug
  Components: Structured Streaming
Affects Versions: 2.4.4
 Environment: spark-2.4.4-bin-hadoop2.7
Reporter: Rajeev Kumar


I am facing an issue with spark-2.4.4-bin-hadoop2.7. I am using Spark 
structured streaming with Kafka, reading the stream from Kafka and saving it 
to HBase.
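
The job shape is roughly as follows (a minimal sketch, not my exact code; the 
broker, topic, and checkpoint paths are placeholders, and the HBase put logic 
is elided):
{code:java}
import org.apache.spark.sql.{DataFrame, Row, SparkSession}

val spark = SparkSession.builder().appName("KafkaToHBase").getOrCreate()

// Read the stream from Kafka (placeholder broker list and topic).
val stream = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker1:9092")
  .option("subscribe", "events")
  .load()
  .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")

// Save each micro-batch to HBase. The checkpoint lives on HDFS, which is why
// an expired HDFS_DELEGATION_TOKEN eventually kills the query.
val query = stream.writeStream
  .foreachBatch { (batch: DataFrame, batchId: Long) =>
    batch.foreachPartition { rows: Iterator[Row] =>
      // open an HBase connection and write the rows (elided)
    }
  }
  .option("checkpointLocation", "hdfs:///user/app/checkpoint/kafka-to-hbase")
  .start()

query.awaitTermination()
{code}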

I get this error on the driver after 24 hours.

 
{code:java}
Caused by: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
 token (HDFS_DELEGATION_TOKEN token 6972072 for ) is expired
at org.apache.hadoop.ipc.Client.call(Client.java:1475)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2108)
at org.apache.hadoop.fs.Hdfs.getFileStatus(Hdfs.java:130)
at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1169)
at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1165)
at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
at org.apache.hadoop.fs.FileContext.getFileStatus(FileContext.java:1171)
at org.apache.hadoop.fs.FileContext$Util.exists(FileContext.java:1630)
at 
org.apache.spark.sql.execution.streaming.FileContextBasedCheckpointFileManager.exists(CheckpointFileManager.scala:326)
at 
org.apache.spark.sql.execution.streaming.HDFSMetadataLog.get(HDFSMetadataLog.scala:142)
at 
org.apache.spark.sql.execution.streaming.HDFSMetadataLog.add(HDFSMetadataLog.scala:110)
at 
org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1$$anonfun$apply$mcZ$sp$3.apply$mcV$sp(MicroBatchExecution.scala:382)
at 
org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1$$anonfun$apply$mcZ$sp$3.apply(MicroBatchExecution.scala:381)
at 
org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1$$anonfun$apply$mcZ$sp$3.apply(MicroBatchExecution.scala:381)
at 
org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:351)
at 
org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
at 
org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1.apply$mcZ$sp(MicroBatchExecution.scala:381)
at 
org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1.apply(MicroBatchExecution.scala:337)
at 
org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1.apply(MicroBatchExecution.scala:337)
at 
org.apache.spark.sql.execution.streaming.MicroBatchExecution.withProgressLocked(MicroBatchExecution.scala:557)
at 
org.apache.spark.sql.execution.streaming.MicroBatchExecution.org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch(MicroBatchExecution.scala:337)
at 
org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply$mcV$sp(MicroBatchExecution.scala:183)
at 
org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:166)
at 

[jira] [Comment Edited] (SPARK-26385) YARN - Spark Stateful Structured streaming HDFS_DELEGATION_TOKEN not found in cache

2020-05-03 Thread Rajeev Kumar (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-26385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098651#comment-17098651
 ] 

Rajeev Kumar edited comment on SPARK-26385 at 5/4/20, 4:44 AM:
---

Sorry for coming late on this thread. I was doing some testing and found an 
issue in HadoopFSDelegationTokenProvider. I might be wrong, so please validate 
the analysis below.

Spark does not renew the token; rather, it creates a new token at the 
scheduled interval.

Spark needs two HDFS_DELEGATION_TOKENs: one for the resource manager and a 
second for the application user.

 
{code:java}
val fetchCreds = fetchDelegationTokens(getTokenRenewer(hadoopConf), fsToGetTokens, creds)

// Get the token renewal interval if it is not set. It will only be called once.
if (tokenRenewalInterval == null) {
  tokenRenewalInterval = getTokenRenewalInterval(hadoopConf, sparkConf, fsToGetTokens)
}
{code}
On the first call to obtainDelegationTokens, it correctly creates TWO tokens.

The token for the resource manager is created by the fetchDelegationTokens 
method.

The token for the application user is created inside the 
getTokenRenewalInterval method.
{code:java}
// Code snippet
private var tokenRenewalInterval: Option[Long] = null
{code}
{code:java}
sparkConf.get(PRINCIPAL).flatMap { renewer =>
  val creds = new Credentials()
  fetchDelegationTokens(renewer, filesystems, creds)
{code}
But after 18 hours (or whatever the renewal period is), when the scheduled 
thread of AMCredentialRenewer tries to create the HDFS_DELEGATION_TOKEN again, 
it creates only one token (for the resource manager, as a result of the call 
to the fetchDelegationTokens method).

It does not create the HDFS_DELEGATION_TOKEN for the application user, because 
tokenRenewalInterval is NOT null this time. Hence, once the 
HDFS_DELEGATION_TOKEN expires (typically after 24 hrs), Spark fails to update 
the checkpoint directory and the job dies.

As part of my testing, I simply called getTokenRenewalInterval in an else 
block, and the job is running fine. It did not die after 24 hrs.
{code:java}
if (tokenRenewalInterval == null) {
// I put this custom log
  logInfo("Token Renewal interval is null. Calling getTokenRenewalInterval "
+ getTokenRenewer(hadoopConf))
  tokenRenewalInterval =
getTokenRenewalInterval(hadoopConf, sparkConf, fsToGetTokens)
} else {
// I put this custom log
  logInfo("Token Renewal interval is NOT null. Calling getTokenRenewalInterval "
+ getTokenRenewer(hadoopConf))
  getTokenRenewalInterval(hadoopConf, sparkConf, fsToGetTokens)
}
{code}
Logs:
{code:java}
20/05/01 14:36:19 INFO HadoopFSDelegationTokenProvider: Token Renewal interval 
is null. Calling getTokenRenewalInterval rm/host:port
20/05/02 08:36:42 INFO HadoopFSDelegationTokenProvider: Token Renewal interval 
is NOT null. Calling getTokenRenewalInterval rm/host:port
20/05/03 02:37:00 INFO HadoopFSDelegationTokenProvider: Token Renewal interval 
is NOT null. Calling getTokenRenewalInterval rm/host:port
20/05/03 20:37:18 INFO HadoopFSDelegationTokenProvider: Token Renewal interval 
is NOT null. Calling getTokenRenewalInterval rm/host:port
{code}
 

[~kabhwan] Let me know if I need to create a new ticket.


> YARN - Spark Stateful Structured streaming HDFS_DELEGATION_TOKEN not found in 
> cache
> ---
>
> Key: SPARK-26385
> URL: https://issues.apache.org/jira/browse/SPARK-26385
> Project: Spark
>  Issue Type: Bug
>  Components: Structured Streaming
>Affects Versions: 2.4.0
> Environment: Hadoop 2.6.0, Spark 2.4.0
>Reporter: T M
>Priority: Major
>
>  
> Hello,
>  
> I have a Spark Structured Streaming job which is running on YARN (Hadoop 
> 2.6.0, Spark 2.4.0). After 25-26 hours, my job stops working with the 
> following error:
> {code:java}
> 2018-12-16 22:35:17 ERROR 
> org.apache.spark.internal.Logging$class.logError(Logging.scala:91): Query 
> TestQuery[id = a61ce197-1d1b-4e82-a7af-60162953488b, runId = 
> a56878cf-dfc7-4f6a-ad48-02cf738ccc2f] terminated with error 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (token for REMOVED: HDFS_DELEGATION_TOKEN owner=REMOVED, renewer=yarn, 
> realUser=, issueDate=1544903057122, maxDate=1545507857122, 
> sequenceNumber=10314, masterKeyId=344) can't be found in cache
> at org.apache.hadoop.ipc.Client.call(Client.java:1470)
> at org.apache.hadoop.ipc.Client.call(Client.java:1401)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
> at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
> at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 

[jira] [Comment Edited] (SPARK-26385) YARN - Spark Stateful Structured streaming HDFS_DELEGATION_TOKEN not found in cache

2020-04-21 Thread Rajeev Kumar (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-26385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17088785#comment-17088785
 ] 

Rajeev Kumar edited comment on SPARK-26385 at 4/21/20, 3:31 PM:


I am also facing the same issue for spark-2.4.4-bin-hadoop2.7. I am using 
Spark structured streaming with Kafka, reading the stream from Kafka and 
saving it to HBase.

I am putting the logs from my application (after removing the IP and 
username). When the application starts, it prints this log. We can see it is 
creating the HDFS_DELEGATION_TOKEN (token id = 6972072):
{quote}
20/03/17 13:24:09 INFO DFSClient: Created HDFS_DELEGATION_TOKEN token 6972072 for  on ha-hdfs:
20/03/17 13:24:09 INFO HadoopFSDelegationTokenProvider: getting token for: DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_631280530_1, ugi=@ (auth:KERBEROS)]]
20/03/17 13:24:09 INFO DFSClient: Created HDFS_DELEGATION_TOKEN token 6972073 for  on ha-hdfs:
20/03/17 13:24:09 INFO HadoopFSDelegationTokenProvider: Renewal interval is 86400039 for token HDFS_DELEGATION_TOKEN
20/03/17 13:24:10 DEBUG HadoopDelegationTokenManager: Service hive does not require a token. Check your configuration to see if security is disabled or not.
20/03/17 13:24:11 DEBUG HBaseDelegationTokenProvider: Attempting to fetch HBase security token.
20/03/17 13:24:12 INFO HBaseDelegationTokenProvider: Get token from HBase: Kind: HBASE_AUTH_TOKEN, Service: 812777c29d67, Ident: (org.apache.hadoop.hbase.security.token.AuthenticationTokenIdentifier@d1)
20/03/17 13:24:12 INFO AMCredentialRenewer: Scheduling login from keytab in 18.0 h.
{quote}
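
The "18.0 h" in that last line lines up with the renewal interval of 86400039 
ms (~24 h) reported a few lines earlier, assuming AMCredentialRenewer 
schedules the next keytab login at 0.75 of the renewal interval (an assumption 
on my part, but the numbers match):
{code:java}
// Worked check of the scheduling arithmetic from the log above. The 0.75
// factor is an assumption inferred from the numbers, not taken from the code.
val renewalIntervalMs = 86400039L                 // "Renewal interval is 86400039"
val scheduledLoginMs = (renewalIntervalMs * 0.75).toLong
println(scheduledLoginMs / 3600000.0)             // ~18.0 hours -> "Scheduling login from keytab in 18.0 h"
{code}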
After 18 hours, as mentioned in the log, it created new tokens. The token 
number increased (7041621).
{quote}
20/03/18 07:24:10 INFO AMCredentialRenewer: Attempting to login to KDC using principal: 
20/03/18 07:24:10 INFO AMCredentialRenewer: Successfully logged into KDC.
20/03/18 07:24:16 DEBUG HadoopFSDelegationTokenProvider: Delegation token renewer is: rm/@
20/03/18 07:24:16 INFO HadoopFSDelegationTokenProvider: getting token for: DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-2072296893_22, ugi=@  (auth:KERBEROS)]]
20/03/18 07:24:16 INFO DFSClient: Created HDFS_DELEGATION_TOKEN token 7041621 for  on ha-hdfs:
20/03/18 07:24:16 DEBUG HadoopDelegationTokenManager: Service hive does not require a token. Check your configuration to see if security is disabled or not.
20/03/18 07:24:16 DEBUG HBaseDelegationTokenProvider: Attempting to fetch HBase security token.
20/03/18 07:24:16 INFO HBaseDelegationTokenProvider: Get token from HBase: Kind: HBASE_AUTH_TOKEN, Service: 812777c29d67, Ident: (org.apache.hadoop.hbase.security.token.AuthenticationTokenIdentifier@102)
20/03/18 07:24:16 INFO AMCredentialRenewer: Scheduling login from keytab in 18.0 h.
20/03/18 07:24:16 INFO AMCredentialRenewer: Updating delegation tokens.
20/03/18 07:24:17 INFO SparkHadoopUtil: Updating delegation tokens for current user.
20/03/18 07:24:17 DEBUG SparkHadoopUtil: Adding/updating delegation tokens List(Kind: HBASE_AUTH_TOKEN, Service: 812777c29d67, Ident: (org.apache.hadoop.hbase.security.token.AuthenticationTokenIdentifier@102); org.apache.hadoop.hbase.security.token.AuthenticationTokenIdentifier@102, Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:, Ident: (HDFS_DELEGATION_TOKEN token 7041621 for ); HDFS_DELEGATION_TOKEN token 7041621 for ; Renewer: yarn; Issued: 3/18/20 7:24 AM; Max Date: 3/25/20 7:24 AM)
20/03/18 07:24:17 INFO SparkHadoopUtil: Updating delegation tokens for current user.
20/03/18 07:24:17 DEBUG SparkHadoopUtil: Adding/updating delegation tokens List(Kind: HBASE_AUTH_TOKEN, Service: 812777c29d67, Ident: (org.apache.hadoop.hbase.security.token.AuthenticationTokenIdentifier@102); org.apache.hadoop.hbase.security.token.AuthenticationTokenIdentifier@102, Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:, Ident: (HDFS_DELEGATION_TOKEN token 7041621 for ); HDFS_DELEGATION_TOKEN token 7041621 for ; Renewer: yarn; Issued: 3/18/20 7:24 AM; Max Date: 3/25/20 7:24 AM)
{quote}
Everything goes fine until the 24-hour mark. After that I see a LeaseRenewer 
exception, but it is picking up the older token number (6972072). This 
behaviour is the same even if I use "--conf 
spark.hadoop.fs.hdfs.impl.disable.cache=true".
{quote}
20/03/18 13:24:28 WARN Client: Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 6972072 for ) is expired
20/03/18 13:24:28 WARN LeaseRenewer: Failed to renew lease for [DFSClient_NONMAPREDUCE_631280530_1] for 30 seconds.  Will retry shortly ...
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 6972072 for ) is expired
        at org.apache.hadoop.ipc.Client.call(Client.java:1475)
        at 
