bq. I am getting OOM exception yet.

Can you confirm whether you got an OOME or not?
bq. Would it be 101 threads of endpoint executing in parallel?

This signifies the need for your endpoint to use memory conservatively.

Cheers

On Wed, Mar 13, 2013 at 8:19 AM, Kumar, Deepak8 <deepak8.ku...@citi.com> wrote:

> Thanks guys for assisting. I am getting OOM exception yet. I have one
> query about Endpoints. As endpoint executes in parallel, so if I have a
> table which is distributed at 101 regions across 5 regionserver. Would it
> be 101 threads of endpoint executing in parallel?
>
> Regards,
> Deepak
>
> From: Gary Helmling [mailto:ghelml...@gmail.com]
> Sent: Tuesday, March 12, 2013 2:14 PM
> To: user@hbase.apache.org
> Cc: lars hofhansl; Kumar, Deepak8 [CCC-OT_IT NE]
> Subject: Re: Regionserver goes down while endpoint execution
>
> To expand on what Himanshu said, your endpoint is doing an unbounded scan
> on the region, so with a region with a lot of rows it's taking more than 60
> seconds to run to the region end, which is why the client side of the call
> is timing out. In addition you're building up an in memory list of all the
> values for that qualifier in that region, which could cause you to bump
> into OOM issues, depending on how big your values are and how sparse the
> given column qualifier is. If you trigger an OOMException, then the region
> server would abort.
>
> For this usage specifically, though -- scanning through a single column
> qualifier for all rows -- you would be better off just doing a normal
> client side scan, ie. HTable.getScanner(). Then you will avoid the client
> timeout and potential server-side memory issues.
>
> On Tue, Mar 12, 2013 at 9:29 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>
> From region server log:
>
> 2013-03-12 03:07:22,605 DEBUG org.apache.hadoop.hdfs.DFSClient: Error
> making BlockReader. Closing stale
> Socket[addr=/10.42.105.112,port=50010,localport=54114]
> java.io.EOFException: Premature EOF: no length prefix available
>     at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:162)
>     at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:407)
>
> What version of HBase and hadoop are you using ?
> Do versions of hadoop on Eclipse machine and in your cluster match ?
>
> Cheers
>
> On Tue, Mar 12, 2013 at 4:46 AM, Kumar, Deepak8 <deepak8.ku...@citi.com> wrote:
>
> > Lars,
> >
> > I am getting following errors at datanode & region servers.
> >
> > Regards,
> > Deepak
> >
> > From: Kumar, Deepak8 [CCC-OT_IT NE]
> > Sent: Tuesday, March 12, 2013 3:00 AM
> > To: Kumar, Deepak8 [CCC-OT_IT NE]; 'user@hbase.apache.org'; 'lars hofhansl'
> > Subject: RE: Regionserver goes down while endpoint execution
> >
> > Lars,
> >
> > It is having following errors when I execute the Endpoint RPC client from
> > eclipse. It seems some of the regions at regionserver
> > vm-8aa9-fe74.nam.nsroot.net is taking more time to respond.
> >
> > Could you guide how to fix it. I don't find any option to set hbase.rpc.timeout
> > from hbase configuration menu in CDH4 CM server for hbase configuration.
> >
> > Regards,
> > Deepak
> >
> > 3/03/12 02:33:12 INFO zookeeper.ClientCnxn: Session establishment complete
> > on server vm-15c2-3bbf.nam.nsroot.net/10.96.172.44:2181, sessionid =
> > 0x53d591b77090026, negotiated timeout = 60000
> > Mar 12, 2013 2:33:13 AM org.apache.hadoop.conf.Configuration warnOnceIfDeprecated
> > WARNING: hadoop.native.lib is deprecated.
> > Instead, use io.native.lib.available
> > Mar 12, 2013 2:44:00 AM org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation processExecs
> > WARNING: Error executing for row 153299:1362780381523:2932572079500658:vm-ab1f-dd21.nam.nsroot.net:
> > java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
> > Tue Mar 12 02:34:15 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2271 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> > Tue Mar 12 02:35:16 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2403 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> > Tue Mar 12 02:36:18 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2465 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> > Tue Mar 12 02:37:20 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2500 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> > Tue Mar 12 02:38:22 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2538 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> > Tue Mar 12 02:39:25 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2572 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> > Tue Mar 12 02:40:30 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2606 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> > Tue Mar 12 02:41:34 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2640 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> > Tue Mar 12 02:42:43 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2677 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> > Tue Mar 12 02:44:00 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2842 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
> >
> >     at java.util.concurrent.FutureTask$Sync.innerGet(Unknown Source)
> >     at java.util.concurrent.FutureTask.get(Unknown Source)
> >     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processExecs(HConnectionManager.java:1466)
> >     at org.apache.hadoop.hbase.client.HTable.coprocessorExec(HTable.java:1577)
> >     at org.apache.hadoop.hbase.client.HTable.coprocessorExec(HTable.java:1557)
> >     at com.citi.sponge.hbase.endpoint.HBaseEndPointClientForElfLog.main(HBaseEndPointClientForElfLog.java:33)
> > Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
> > [snip: the same ten SocketTimeoutException entries as listed above]
> >
> >     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1345)
> >     at org.apache.hadoop.hbase.ipc.ExecRPCInvoker.invoke(ExecRPCInvoker.java:79)
> >     at $Proxy8.getValues(Unknown Source)
> >     at com.citi.sponge.hbase.endpoint.HBaseEndPointClientForElfLog$1.call(HBaseEndPointClientForElfLog.java:38)
> >     at com.citi.sponge.hbase.endpoint.HBaseEndPointClientForElfLog$1.call(HBaseEndPointClientForElfLog.java:1)
> >     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$4.call(HConnectionManager.java:1454)
> >     at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
> >     at java.util.concurrent.FutureTask.run(Unknown Source)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> >     at java.lang.Thread.run(Unknown Source)
> > org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
> > [snip: the same ten SocketTimeoutException entries as listed above]
> >
> >     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1345)
> >     at org.apache.hadoop.hbase.ipc.ExecRPCInvoker.invoke(ExecRPCInvoker.java:79)
> >     at $Proxy8.getValues(Unknown Source)
> >     at com.citi.sponge.hbase.endpoint.HBaseEndPointClientForElfLog$1.call(HBaseEndPointClientForElfLog.java:38)
> >     at com.citi.sponge.hbase.endpoint.HBaseEndPointClientForElfLog$1.call(HBaseEndPointClientForElfLog.java:1)
> >     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$4.call(HConnectionManager.java:1454)
> >     at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
> >     at java.util.concurrent.FutureTask.run(Unknown Source)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> >     at java.lang.Thread.run(Unknown Source)
> >
> > From: Kumar, Deepak8 [CCC-OT_IT NE]
> > Sent: Tuesday, March 12, 2013 2:27 AM
> > To: 'user@hbase.apache.org'; 'lars hofhansl'
> > Subject: RE: Regionserver goes down while endpoint execution
> >
> > Lars,
> >
> > Thanks for your quick response. There is not much info in region server log.
> > I am again executing it with DEBUG log level in region servers.
> >
> > Here is the endpoint code:
> >
> > import java.io.IOException;
> > import java.util.ArrayList;
> > import java.util.List;
> >
> > import org.apache.hadoop.hbase.KeyValue;
> > import org.apache.hadoop.hbase.client.Scan;
> > import org.apache.hadoop.hbase.coprocessor.BaseEndpointCoprocessor;
> > import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
> > import org.apache.hadoop.hbase.regionserver.InternalScanner;
> > import org.apache.hadoop.hbase.util.Bytes;
> >
> > public class ColumnAggregationEndpoint extends BaseEndpointCoprocessor
> >         implements ColumnAggregationProtocol {
> >
> >     @Override
> >     public List<String> getValues(byte[] family, byte[] qualifier,
> >             int batchSize, int cacheSize) throws IOException {
> >         // aggregate at each region
> >         Scan scan = new Scan();
> >         scan.addColumn(family, qualifier);
> >         scan.setCaching(cacheSize);
> >         scan.setBatch(batchSize);
> >         // holds every value of the qualifier in this region
> >         List<String> values = new ArrayList<String>();
> >         RegionCoprocessorEnvironment environment =
> >                 (RegionCoprocessorEnvironment) getEnvironment();
> >
> >         InternalScanner scanner = environment.getRegion().getScanner(scan);
> >         try {
> >             List<KeyValue> curVals = new ArrayList<KeyValue>();
> >             boolean hasMore = false;
> >             do {
> >                 curVals.clear();
> >                 hasMore = scanner.next(curVals);
> >                 // next() can return no cells, so guard before reading the first one
> >                 if (!curVals.isEmpty()) {
> >                     KeyValue kv = curVals.get(0);
> >                     values.add(Bytes.toString(kv.getValue()));
> >                 }
> >             } while (hasMore);
> >         } finally {
> >             scanner.close();
> >         }
> >         return values;
> >     }
> > }
> >
> > The RPC client to invoke the Endpoint is as follows:
> >
> > import java.io.IOException;
> > import java.util.List;
> > import java.util.Map;
> >
> > import org.apache.hadoop.conf.Configuration;
> > import org.apache.hadoop.hbase.HBaseConfiguration;
> > import org.apache.hadoop.hbase.client.HTable;
> > import org.apache.hadoop.hbase.client.HTableInterface;
> > import org.apache.hadoop.hbase.client.Scan;
> > import org.apache.hadoop.hbase.client.coprocessor.Batch;
> >
> > public class HBaseEndPointClientForElfLog {
> >     public static void main(String[] args) {
> >         try {
> >             Configuration conf = HBaseConfiguration.create();
> >             conf.set(
> >                     "hbase.zookeeper.quorum",
> >                     "vm-ab1f-dd21.nam.nsroot.net,vm-cb03-2277.nam.nsroot.net,vm-15c2-3bbf.nam.nsroot.net");
> >             String tableName = "elf_log";
> >             final String columnFamily = "content";
> >             final String columnQualifier = "logFileName";
> >             final String startRowKey =
> >                     "153299:1362780381523:2932572079500658:vm-ab1f-dd21.nam.nsroot.net:";
> >             final String endRowKey = "153299:1362953388000";
> >             HTableInterface table = new HTable(conf, tableName);
> >             Scan scan;
> >             Map<byte[], List<String>> results;
> >
> >             // scan: for all regions
> >             scan = new Scan();
> >
> >             // invoke the endpoint on every region between the two row keys
> >             results = table.coprocessorExec(ColumnAggregationProtocol.class,
> >                     startRowKey.getBytes(), endRowKey.getBytes(),
> >                     new Batch.Call<ColumnAggregationProtocol, List<String>>() {
> >                         public List<String> call(ColumnAggregationProtocol instance)
> >                                 throws IOException {
> >                             return instance.getValues(columnFamily.getBytes(),
> >                                     columnQualifier.getBytes(), 2, 5);
> >                         }
> >                     });
> >
> >             // one map entry per region
> >             for (Map.Entry<byte[], List<String>> e : results.entrySet()) {
> >                 System.out.println("Size of list returned: " + e.getValue().size());
> >                 for (String singleVal : e.getValue()) {
> >                     System.out.println(singleVal);
> >                 }
> >             }
> >         } catch (Throwable throwable) {
> >             throwable.printStackTrace();
> >         }
> >     }
> > }
> >
> > Regards,
> > Deepak
> >
> > -----Original Message-----
> > From: lars hofhansl [mailto:la...@apache.org]
> > Sent: Tuesday, March 12, 2013 2:01 AM
> > To: user@hbase.apache.org
> > Subject: Re: Regionserver goes down while endpoint execution
> >
> > What does the region server log say?
> >
> > Endpoints do not run in a sandbox. You can call System.exit(...) and your
> > RegionServer will happily exit.
> > If you can, please show us your endpoint code.
> >
> > -- Lars
> >
> > ________________________________
> > From: "Kumar, Deepak8 " <deepak8.ku...@citi.com>
> > To: "'user@hbase.apache.org'" <user@hbase.apache.org>
> > Sent: Monday, March 11, 2013 10:51 PM
> > Subject: Regionserver goes down while endpoint execution
> >
> > Hi,
> >
> > I have a table in hbase which has more than 5GB of data; it is distributed
> > over 101 regions on 5 regionservers.
> >
> > When I execute an endpoint which is supposed to fetch a column qualifier
> > value using an endpoint RPC client, the region server goes down. The hbase
> > master log says "Can't connect to region, retrying.." The same endpoint
> > works fine for tables which have 300 records.
> >
> > Could you please help me find the reason the regionserver goes down?
> >
> > Regards,
> > Deepak
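
Following up on the advice in this thread to keep the endpoint's memory use bounded: below is a minimal sketch of that idea in plain Java. It deliberately does not use the HBase API. A TreeMap stands in for one region, and the names PagedColumnValues, Page, nextPage and countAll are made up for illustration only. Each call hands back at most `limit` values plus the last row key it saw, so the caller can resume from there instead of asking for the whole region in one call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustration only: an in-memory sorted map plays the role of one region
// (row key -> value of a single column qualifier). Not HBase code.
final class PagedColumnValues {

    /** One bounded page of values plus the last row key seen, for resuming. */
    static final class Page {
        final List<String> values;
        final String lastRow; // null when the page is empty

        Page(List<String> values, String lastRow) {
            this.values = values;
            this.lastRow = lastRow;
        }
    }

    /**
     * Returns at most `limit` values for rows strictly after `afterRow`
     * (pass null to start from the first row).
     */
    static Page nextPage(NavigableMap<String, String> region, String afterRow, int limit) {
        NavigableMap<String, String> tail =
                (afterRow == null) ? region : region.tailMap(afterRow, false);
        List<String> values = new ArrayList<>();
        String last = null;
        for (Map.Entry<String, String> e : tail.entrySet()) {
            if (values.size() >= limit) {
                break; // stop early: never hold more than `limit` values
            }
            values.add(e.getValue());
            last = e.getKey();
        }
        return new Page(values, last);
    }

    /** Caller-side loop: pulls page after page until an empty page comes back. */
    static int countAll(NavigableMap<String, String> region, int limit) {
        int total = 0;
        String cursor = null;
        while (true) {
            Page page = nextPage(region, cursor, limit);
            if (page.values.isEmpty()) {
                break;
            }
            total += page.values.size();
            cursor = page.lastRow; // resume after the last row already returned
        }
        return total;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> region = new TreeMap<>();
        for (int i = 0; i < 10; i++) {
            region.put(String.format("row%03d", i), "file" + i + ".log");
        }
        System.out.println(countAll(region, 4));
    }
}
```

With a real endpoint the same shape would presumably mean passing a start row and a limit in the RPC and stopping the region scan once the limit is reached, so that each call stays short and small. That wiring is not shown here, and for a plain single-column read the client-side scan suggested earlier in the thread may still be the simpler choice.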