I just saw another error after my job had been running for 2 hours:

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /apps/data/vddil/real-time/checkpoint/temp: File does not exist. Holder DFSClient_NONMAPREDUCE_327993456_13 does not have any open files.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2946)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2766)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2674)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:584)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

        at org.apache.hadoop.ipc.Client.call(Client.java:1410)
        at org.apache.hadoop.ipc.Client.call(Client.java:1363)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
        at com.sun.proxy.$Proxy14.addBlock(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor146.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
        at com.sun.proxy.$Proxy14.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:361)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1439)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1261)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:525)
14/07/25 14:45:12 WARN CheckpointWriter: Error in attempt 1 of writing checkpoint to hdfs://gnosis-01-01-01.crl.samsung.com/apps/data/vddil/real-time/checkpoint/checkpoint-1406324700000



All my jobs pass the same directory to the checkpoint function. Could
that be the reason for the error?
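
If the shared directory is the problem, I assume the fix is to give each
job its own checkpoint path. Here is a rough sketch of what I mean (the
job name and batch interval below are made up):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("real-time-job-a") // hypothetical job name
    val ssc = new StreamingContext(conf, Seconds(10))        // batch interval is a guess

    // One checkpoint directory per job, instead of the single
    // /apps/data/vddil/real-time/checkpoint shared by every job.
    ssc.checkpoint("hdfs://gnosis-01-01-01.crl.samsung.com/apps/data/vddil/real-time/checkpoint/job-a")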

I will post the stack trace of the other error once it appears again.
Thanks!


Bill


On Fri, Jul 25, 2014 at 2:57 PM, Tathagata Das <tathagata.das1...@gmail.com>
wrote:

> Can you give a stack trace and logs of the exception? It's hard to say
> anything without the associated stack trace and logs.
>
> TD
>
>
> On Fri, Jul 25, 2014 at 1:32 PM, Bill Jay <bill.jaypeter...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I am running a Spark Streaming job that uses saveAsTextFiles to save
>> results into HDFS files. However, it throws an exception after 20 batches:
>>
>>
>> result-1406312340000/_temporary/0/task_201407251119_0000_m_000003 does not exist.
>>
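>> For context, the job saves each batch with a call along these lines
>> (the input source and prefix below are simplified stand-ins):
>>
>>     import org.apache.spark.SparkConf
>>     import org.apache.spark.streaming.{Seconds, StreamingContext}
>>
>>     val ssc = new StreamingContext(new SparkConf().setAppName("save-job"), Seconds(60))
>>     val lines = ssc.socketTextStream("localhost", 9999) // stand-in input source
>>     // saveAsTextFiles writes one result-<batchTimeMs>/ directory per batch,
>>     // which matches the result-1406312340000 path in the error above.
>>     lines.saveAsTextFiles("result")
>>     ssc.start()
>>     ssc.awaitTermination()
>>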
>>
>> While the job is running, I do not change any files in the folder. Does
>> anyone know why the file cannot be found?
>>
>> Thanks!
>>
>> Bill
>>
>
>