This time the re-run passed (although with many failed/retried tasks) with my throttle bandwidth set to 200M (although per iftop, it never reached anywhere near that number). Is there a way to increase the lease expiry time for a low throttle bandwidth on an individual export job?
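For context, my rough mental model of the backported throttle is a sleep-based copy loop along the lines of the sketch below. This is an illustrative sketch only, not the actual HBASE-11083 code; the ThrottledCopy name, the 64 KB chunk size, and the catch-up arithmetic are all assumptions. The point is that the throttle's sleeps land between write() calls, which is exactly the window where Matteo suggests the lease could expire if the gaps stretch out (stuck machine, long GC pause, or a very low cap):

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class ThrottledCopy {
  // Copy `in` to `out`, capping throughput at bytesPerSec by sleeping
  // whenever we are ahead of the allowed rate.
  static void copy(InputStream in, OutputStream out, long bytesPerSec)
      throws IOException, InterruptedException {
    byte[] buf = new byte[64 * 1024];        // hypothetical chunk size
    long start = System.currentTimeMillis();
    long copied = 0;
    int n;
    while ((n = != -1) {
      out.write(buf, 0, n);
      copied += n;
      // Time by which `copied` bytes are allowed to have been written.
      long allowedMillis = copied * 1000L / bytesPerSec;
      long elapsed = System.currentTimeMillis() - start;
      if (allowedMillis > elapsed) {
        // The writer sits idle here; if this gap (for any reason) outlasts
        // the HDFS lease timeout, the NameNode can expire the lease.
        Thread.sleep(allowedMillis - elapsed);
      }
    }
  }
}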
Thanks

Tian-Ying


On Wed, Apr 30, 2014 at 10:17 AM, Tianying Chang <[email protected]> wrote:

> Yes, I am using the bandwidth throttle feature. The export job for this
> table actually succeeded on its first run. When I reran it (for my
> robustness testing) it never seems to pass. I am wondering if it is in
> some weird state (I did clean up the target cluster and even removed the
> /hbase/.archive/rich_pin_data_v1 folder).
>
> It seems that even if I set the throttle value really large, it still
> fails. And I think even after I replace the jar with the one without the
> throttle, the re-run still fails.
>
> Is there some way that I can increase the lease to be very large to test
> it out?
>
>
> On Wed, Apr 30, 2014 at 10:02 AM, Matteo Bertozzi <[email protected]> wrote:
>
>> The file is the file in the export, so you are creating that file.
>> Do you have the bandwidth throttle on?
>>
>> I'm thinking that the file is being written slowly: e.g. write(a few
>> bytes), wait, write(a few bytes), and during the wait your lease expires.
>> Something like that can happen if your MR job is stuck in some way (slow
>> machine or similar) and is not writing within the lease timeout.
>>
>> Matteo
>>
>>
>> On Wed, Apr 30, 2014 at 9:53 AM, Tianying Chang <[email protected]> wrote:
>>
>> > We are using Hadoop 2.0.0-cdh4.2.0 and HBase 0.94.7. We also backported
>> > several snapshot-related JIRAs, e.g. HBASE-10111 (verify snapshot) and
>> > HBASE-11083 (bandwidth throttle in ExportSnapshot).
>> >
>> > I found that when the LeaseExpiredException was first reported, that
>> > file indeed was not there, and the map task retried. I verified a couple
>> > of minutes later that the HFile does exist under /.archive. But the
>> > retried map task still complains with the same file-does-not-exist
>> > error...
>> >
>> > I will check the namenode log for the LeaseExpiredException.
>> >
>> > Thanks
>> >
>> > Tian-Ying
>> >
>> >
>> > On Wed, Apr 30, 2014 at 9:33 AM, Ted Yu <[email protected]> wrote:
>> >
>> > > Can you give us the HBase and Hadoop releases you're using?
>> > >
>> > > Can you check the namenode log around the time the
>> > > LeaseExpiredException was encountered?
>> > >
>> > > Cheers
>> > >
>> > >
>> > > On Wed, Apr 30, 2014 at 9:20 AM, Tianying Chang <[email protected]> wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > When I export a large table with 460+ regions, I see the
>> > > > ExportSnapshot job fail sometimes (not all the time). The error from
>> > > > the map task is below. I verified the file highlighted below, and it
>> > > > does exist. Smaller tables always seem to pass. Any idea? Is it
>> > > > because the table is too big and the session times out?
>> > > >
>> > > > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
>> > > > No lease on
>> > > > /hbase/.archive/rich_pin_data_v1/7713d5331180cb610834ba1c4ebbb9b3/d/eef3642f49244547bb6606d4d0f15f1f
>> > > > File does not exist. Holder DFSClient_NONMAPREDUCE_279781617_1 does
>> > > > not have any open files.
>> > > > at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2396)
>> > > > at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2387)
>> > > > at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2183)
>> > > > at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:481)
>> > > > at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:297)
>> > > > at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44080)
>> > > > at org.apache.hadoop.ipc.ProtobufR
>> > > >
>> > > >
>> > > > Thanks
>> > > >
>> > > > Tian-Ying
>> > >
>> >
>>
>
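A quick way to script the manual check from the thread (that the archived HFile actually exists on the target cluster) is the standard Hadoop FileSystem API. A minimal sketch, assuming the cluster's core-site.xml/hdfs-site.xml are on the classpath, using the path straight from the stack trace above:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckArchivedHFile {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();  // picks up site configs from the classpath
    FileSystem fs = FileSystem.get(conf);
    Path hfile = new Path("/hbase/.archive/rich_pin_data_v1/"
        + "7713d5331180cb610834ba1c4ebbb9b3/d/eef3642f49244547bb6606d4d0f15f1f");
    // Prints whether the NameNode currently knows about the file; running it
    // before and after a failed map attempt shows whether the file comes and goes.
    System.out.println(hfile + " exists: " + fs.exists(hfile));
  }
}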
