Hi,

Well, no, I can't figure out what the problem is, but I saw that someone else had the same problem (see the email "LeaseException despite high hbase.regionserver.lease.period").

What I can tell is the following. Last week the problem was consistent:

1. I set hbase.regionserver.lease.period=300000 (5 mins), restarted the cluster, and still got the problem. The maps got this exception even before the 5 mins had passed (some after 1 min and 20 sec).
2. The problem occurs only on jobs that extract a large number of columns (>150 cols per row).
3. The problem never occurred when only 1 map per server was running (I have 8 CPUs with hyper-threading enabled = 16, so using only 1 map per machine is just a waste). At this stage I was thinking that perhaps there is a multi-threading problem.

This week I got slightly different behavior after restarting the servers. The extracts ran OK in most of the runs, even with 4 maps running per server. I got the exception only once, and the job was not killed as it was in last week's runs.

If you have any insight, I will be happy to hear it.

Mikael.S

On Tue, Feb 14, 2012 at 1:39 AM, Jean-Daniel Cryans <[email protected]> wrote:

> Late answer, did you figure it out?
>
> This exception happens when you don't use your scanner lease for more
> than the lease time (default one minute). AFAIK that didn't change, so
> maybe something else got slow? Or maybe some special configurations
> you had didn't make it during the upgrade?
>
> J-D
>
> On Mon, Feb 6, 2012 at 12:33 AM, Mikael Sitruk <[email protected]>
> wrote:
> > Hi all
> >
> > Recently I have upgraded my cluster from Hbase 0.90.1 to 0.90.4 (using
> > cloudera from cdh3u0 to cdh3u2)
> > Everything was ok till I ran pig extract on the new cluster, from the old
> > cluster everything worked well.
> > Now each time i run the extract in conjunction to other work performed on
> > the cluster I get the following exception during job execution, and the job
> > is finally killed.
> > It appears only on heavy extract (For small extract < minute execution
> > everything is ok).
> >
> > Any suggestion?
> > Below a stacktrace from the task.
> >
> > 2012-02-06 01:18:26,561 INFO org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat: setScan with ranges: 374419525347267552315334599026232694541505353300281 - 374419525347267552315334599026232694541509698599225 ( 4345298944)
> > 2012-02-06 01:19:32,836 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> > 2012-02-06 01:19:32,845 WARN org.apache.hadoop.mapred.Child: Error running child
> > org.apache.hadoop.hbase.regionserver.LeaseException: org.apache.hadoop.hbase.regionserver.LeaseException: lease '-7220618182832784549' does not exist
> >     at org.apache.hadoop.hbase.regionserver.Leases.removeLease(Leases.java:230)
> >     at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1862)
> >     at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >     at java.lang.reflect.Method.invoke(Method.java:597)
> >     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
> >     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
> >     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> >     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> >     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> >     at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> >     at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:96)
> >     at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:83)
> >     at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:38)
> >     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1019)
> >     at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:1151)
> >     at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:133)
> >     at org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:142)
> >     at org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat$HBaseTableRecordReader.nextKeyValue(HBaseTableInputFormat.java:162)
> >     at org.apache.pig.backend.hadoop.hbase.HBaseStorage.getNext(HBaseStorage.java:321)
> >     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:187)
> >     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:456)
> >     at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
> >     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
> >     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
> >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
> >     at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
> >     at java.security.AccessController.doPrivileged(Native Method)
> >     at javax.security.auth.Subject.doAs(Subject.java:396)
> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
> >     at org.apache.hadoop.mapred.Child.main(Child.java:264)
> > 2012-02-06 01:19:32,978 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
> >
> > Thanks
> > Mikael.S
>

--
Mikael.S
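
For readers following the thread, J-D's explanation (the lease expires when the scanner is not used for longer than the lease time) can be illustrated with a rough back-of-the-envelope model. This is only a sketch under stated assumptions: the client is assumed to call next() on the region server once per fetched batch of rows, and the names rows_per_fetch and seconds_per_row are hypothetical inputs, not HBase settings. It does not by itself account for the exception appearing before a 5-minute lease has elapsed, as reported above.

```python
def seconds_between_next_calls(rows_per_fetch: int, seconds_per_row: float) -> float:
    """Approximate time the scanner sits idle on the server between two next() RPCs.

    Assumption: the map task processes every row of a fetched batch before
    asking the region server for the next batch.
    """
    return rows_per_fetch * seconds_per_row


def lease_would_expire(rows_per_fetch: int, seconds_per_row: float,
                       lease_period_ms: int) -> bool:
    """True if the idle gap between next() calls exceeds the lease period."""
    gap_ms = seconds_between_next_calls(rows_per_fetch, seconds_per_row) * 1000.0
    return gap_ms > lease_period_ms


if __name__ == "__main__":
    # Hypothetical workload: 200 rows per fetch, 0.5 s of map work per row.
    for lease_ms in (60000, 300000):  # default one minute, then the 5 mins tried above
        print(lease_ms, lease_would_expire(200, 0.5, lease_ms))
```

Under this model, either a longer lease period or fewer rows per fetch shortens the idle gap relative to the lease; which of the two applies to a given job is something only measurements on the cluster can tell.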
