Hi all,

To get Spark to properly release memory during batch processing, as a workaround for https://issues.apache.org/jira/browse/SPARK-4927, I tear down and re-initialize the Spark context with:

    context.stop()
    context = new SparkContext()

The problem is that I eventually hit the error below:

    15/01/06 13:52:34 INFO BlockManagerMaster: Updated info of block broadcast_5_piece0
    15/01/06 13:52:34 WARN Client: Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 214318 for zjb238) can't be found in cache
    Exception in thread "main" org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 214318 for zjb238) can't be found in cache

This terminates execution, and I have no idea why it happens. The error appears as soon as I try to access HDFS after restarting the Spark context. It is not deterministic: I am able to run several successful iterations before I see it.

Does anyone know what could be at play here? Any help would be much appreciated.

Thank you.
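
P.S. For reference, the stop-and-recreate pattern I describe above looks roughly like the sketch below. It is a simplified illustration, not my actual job: the object name `BatchRunner`, the helper `withFreshContext`, the app name, and the use of command-line arguments as HDFS input paths are all made up for this example, and the master is assumed to be supplied by spark-submit. It only shows the restart loop; it does not address the delegation token error.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object BatchRunner {
  // Hypothetical helper: runs `body` against a new SparkContext and
  // always stops that context afterwards, even if `body` throws.
  def withFreshContext[A](conf: SparkConf)(body: SparkContext => A): A = {
    val sc = new SparkContext(conf)
    try body(sc)
    finally sc.stop()
  }

  def main(args: Array[String]): Unit = {
    // Assumption: master and other settings come from spark-submit.
    val conf = new SparkConf().setAppName("batch-runner")

    // Assumption: each argument is an HDFS path for one batch iteration.
    args.foreach { path =>
      // Each iteration gets its own context; the previous one is
      // already stopped before the next one is created.
      val lineCount = withFreshContext(conf) { sc =>
        sc.textFile(path).count()
      }
      println(s"$path: $lineCount lines")
    }
  }
}
```

The try/finally is there so that a failed iteration still stops its context before the next one is created.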