Hi Jason,

Thanks for the info - it's good to hear from somebody else who's run into this :)

I tried again with a bigger box for the master, and wound up with the same results.

I guess the framework could be killing it - but no idea why. This is during a very simple "write out the results" phase, so very high I/O but not much computation, and nothing should be hung.

Any particular configuration values you had to tweak? I'm running this in Elastic MapReduce (EMR) so most settings are whatever they provide by default. I override a few things in my JobConf, but (for example) anything related to HDFS/MR framework will be locked & loaded by the time my job is executing.
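
For reference, the kind of overrides I mean are just programmatic JobConf settings in the job driver, something along these lines (the keys and values below are purely illustrative, not the exact ones I use):

    // Illustrative only - shows where job-level overrides go; the specific
    // keys/values here are examples, not the ones from my actual job.
    import org.apache.hadoop.mapred.JobConf;

    public class JobSetupExample {
        public static JobConf configure() {
            JobConf conf = new JobConf(JobSetupExample.class);
            conf.set("mapred.task.timeout", "1200000"); // example: 20 minute task timeout
            conf.setNumReduceTasks(20);                 // example value
            // Cluster-level HDFS/MR settings (hadoop-site.xml) are fixed by EMR
            // before the job runs, so they can't be changed here.
            return conf;
        }
    }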

Thanks!

-- Ken

On Dec 8, 2009, at 9:34am, Jason Venner wrote:

Is it possible that this is occurring in a task that is being killed by the framework?
Sometimes there is a little lag between the time the tracker 'kills a task' and the time the task fully dies. You could be getting into a situation like that, where the task is in the process of dying but the last write is still in progress.
I see this situation happen when the task tracker machine is heavily loaded. In one case there was a 15 minute lag between the timestamp in the tracker for killing task XYZ and the task actually going away.

It took me a while to work this out as I had to merge the tracker and task
logs by time to actually see the pattern.
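
(In case it's useful: the merge itself is nothing fancy - something like the sketch below, assuming both logs start every line with the default log4j "yyyy-MM-dd HH:mm:ss,SSS" timestamp, which sorts correctly as plain text. Continuation lines such as stack traces have no timestamp and will get shuffled, but the timestamped tracker/task lines are what matter.)

    // Minimal sketch: merge tracker + task logs by their timestamp prefixes.
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    public class MergeLogs {
        public static void main(String[] args) throws IOException {
            List<String> merged = new ArrayList<String>();
            for (String logFile : args) {        // e.g. tasktracker.log task.log
                merged.addAll(Files.readAllLines(Paths.get(logFile)));
            }
            Collections.sort(merged);            // timestamp prefix => chronological order
            for (String line : merged) {
                System.out.println(line);
            }
        }
    }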
The host machines were under very heavy I/O pressure, and may have been paging also. The code and configuration issues that triggered this have been resolved, so I don't see it anymore.

On Tue, Dec 8, 2009 at 8:32 AM, Ken Krugler <kkrugler_li...@transpac.com> wrote:

Hi all,

In searching the mail/web archives, I occasionally see questions from people (like me) who run into the LeaseExpiredException (in my case, on 0.18.3 while running a 50-server cluster in EMR).

Unfortunately I don't see any responses, other than Dennis Kubes saying that he thought some work had been done in this area of Hadoop "a while back". And this was in 2007, so it hopefully doesn't apply to my situation.

I see these LeaseExpiredException errors showing up in the logs around the same time as IOException errors, e.g.:

java.io.IOException: Stream closed.
      at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.isClosed(DFSClient.java:2245)
      at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:2481)
      at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:155)
      at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:132)
      at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:121)
      at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:112)
      at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:86)
      at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
      at java.io.DataOutputStream.write(DataOutputStream.java:90)
      at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.writeBuffer(SequenceFile.java:1260)
      at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.sync(SequenceFile.java:1277)
      at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.close(SequenceFile.java:1295)
      at org.apache.hadoop.mapred.SequenceFileOutputFormat$1.close(SequenceFileOutputFormat.java:73)
      at org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.close(MapTask.java:276)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:238)
      at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2216)

This issue seemed related, but would have been fixed in the 0.18.3 release.

http://issues.apache.org/jira/browse/HADOOP-3760

I saw a similar HBase issue -
https://issues.apache.org/jira/browse/HBASE-529 - but they "fixed" it by
retrying a failure case.
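
For reference, the HBase "fix" amounts to retrying the failed operation. In rough outline it looks like the sketch below (this is only an illustration of the retry approach, not the actual HBASE-529 patch; all names in it are made up for the example):

    // Sketch only - generic retry-on-IOException wrapper with simple backoff.
    import java.io.IOException;

    public class RetryingWrite {
        interface WriteOp { void run() throws IOException; } // hypothetical callback

        static void withRetries(WriteOp op, int maxAttempts) throws IOException {
            IOException last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    op.run();
                    return;
                } catch (IOException e) {                   // e.g. a wrapped LeaseExpiredException
                    last = e;
                    try { Thread.sleep(1000L * attempt); }  // simple backoff between attempts
                    catch (InterruptedException ie) { Thread.currentThread().interrupt(); }
                }
            }
            throw (last != null) ? last : new IOException("no attempts made");
        }
    }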

These exceptions occur during "write storms", where lots of files are being
written out. Though "lots" is relative, e.g. 10-20M.

It's repeatable, in that it fails on the same step of a series of chained
MR jobs.

Is it possible I need to be running a bigger box for my namenode server?
Any other ideas?

Thanks,

-- Ken


On May 25, 2009, at 7:37am, Stas Oskin wrote:

Hi.

I have a process that writes to a file on DFS from time to time, using an OutputStream.
After some time of writing, I start getting the exception below, and the write fails. The DFSClient retries several times, and then fails.

Copying the file from local disk to DFS via CopyLocalFile() works fine.
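
In code terms, the failing path and the working path look roughly like the sketch below (simplified for illustration, and assuming CopyLocalFile() corresponds to FileSystem.copyFromLocalFile()):

    // Rough sketch of the two write paths described above (0.18-era FileSystem API);
    // paths and buffer handling are simplified.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class DfsWriteExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // Failing path: keep an output stream open and write to it from time to time.
            FSDataOutputStream out = fs.create(new Path("/test/test.bin"));
            out.write(new byte[]{1, 2, 3});  // periodic writes; eventually LeaseExpiredException
            out.close();

            // Working path: write the file locally, then copy the finished file into DFS.
            fs.copyFromLocalFile(new Path("/tmp/test.bin"), new Path("/test/test.bin"));
        }
    }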

Can anyone advise on the matter?

I'm using Hadoop 0.18.3.

Thanks in advance.


09/05/25 15:35:35 INFO dfs.DFSClient: org.apache.hadoop.ipc.RemoteException:
org.apache.hadoop.dfs.LeaseExpiredException: No lease on /test/test.bin File does not exist. Holder DFSClient_-951664265 does not have any open files.
         at org.apache.hadoop.dfs.FSNamesystem.checkLease(FSNamesystem.java:1172)
         at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1103)
         at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
         at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
         at java.lang.reflect.Method.invoke(Method.java:597)
         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:890)

         at org.apache.hadoop.ipc.Client.call(Client.java:716)
         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
         at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
         at java.lang.reflect.Method.invoke(Method.java:597)
         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
         at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source)
         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2450)
         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2333)
         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1800(DFSClient.java:1745)
         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1922)


--
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals

--------------------------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g



