Lars,

I get the following errors when I execute the endpoint RPC client from Eclipse. It seems some of the regions on region server vm-8aa9-fe74.nam.nsroot.net are taking too long to respond.

Could you guide me on how to fix this? I can't find any option to set hbase.rpc.timeout in the HBase configuration menu of the CDH4 CM server.

Regards,
Deepak

3/03/12 02:33:12 INFO zookeeper.ClientCnxn: Session establishment complete on server vm-15c2-3bbf.nam.nsroot.net/10.96.172.44:2181, sessionid = 0x53d591b77090026, negotiated timeout = 60000
Mar 12, 2013 2:33:13 AM org.apache.hadoop.conf.Configuration warnOnceIfDeprecated
WARNING: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
Mar 12, 2013 2:44:00 AM org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation processExecs
WARNING: Error executing for row 153299:1362780381523:2932572079500658:vm-ab1f-dd21.nam.nsroot.net:
java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
Tue Mar 12 02:34:15 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2271 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:35:16 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2403 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:36:18 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2465 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:37:20 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2500 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:38:22 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2538 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:39:25 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2572 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:40:30 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2606 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:41:34 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2640 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:42:43 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2677 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:44:00 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2842 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
    at java.util.concurrent.FutureTask$Sync.innerGet(Unknown Source)
    at java.util.concurrent.FutureTask.get(Unknown Source)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processExecs(HConnectionManager.java:1466)
    at org.apache.hadoop.hbase.client.HTable.coprocessorExec(HTable.java:1577)
    at org.apache.hadoop.hbase.client.HTable.coprocessorExec(HTable.java:1557)
    at com.citi.sponge.hbase.endpoint.HBaseEndPointClientForElfLog.main(HBaseEndPointClientForElfLog.java:33)
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
Tue Mar 12 02:34:15 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2271 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:35:16 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2403 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:36:18 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2465 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:37:20 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2500 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:38:22 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2538 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:39:25 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2572 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:40:30 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2606 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:41:34 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2640 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:42:43 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2677 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:44:00 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2842 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1345)
    at org.apache.hadoop.hbase.ipc.ExecRPCInvoker.invoke(ExecRPCInvoker.java:79)
    at $Proxy8.getValues(Unknown Source)
    at com.citi.sponge.hbase.endpoint.HBaseEndPointClientForElfLog$1.call(HBaseEndPointClientForElfLog.java:38)
    at com.citi.sponge.hbase.endpoint.HBaseEndPointClientForElfLog$1.call(HBaseEndPointClientForElfLog.java:1)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$4.call(HConnectionManager.java:1454)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
Tue Mar 12 02:34:15 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2271 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:35:16 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2403 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:36:18 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2465 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:37:20 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2500 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:38:22 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2538 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:39:25 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2572 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:40:30 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2606 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:41:34 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2640 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:42:43 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2677 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
Tue Mar 12 02:44:00 EDT 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@39443f, java.net.SocketTimeoutException: Call to vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/150.110.96.212:2842 remote=vm-8aa9-fe74.nam.nsroot.net/10.42.105.91:60020]
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1345)
    at org.apache.hadoop.hbase.ipc.ExecRPCInvoker.invoke(ExecRPCInvoker.java:79)
    at $Proxy8.getValues(Unknown Source)
    at com.citi.sponge.hbase.endpoint.HBaseEndPointClientForElfLog$1.call(HBaseEndPointClientForElfLog.java:38)
    at com.citi.sponge.hbase.endpoint.HBaseEndPointClientForElfLog$1.call(HBaseEndPointClientForElfLog.java:1)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$4.call(HConnectionManager.java:1454)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

From: Kumar, Deepak8 [CCC-OT_IT NE]
Sent: Tuesday, March 12, 2013 2:27 AM
To: 'user@hbase.apache.org'; 'lars hofhansl'
Subject: RE: Regionserver goes down while endpoint execution

Lars,

Thanks for your quick response. There is not much info in the region server log. I am executing it again with the DEBUG log level on the region servers.
Here is the endpoint code:

public class ColumnAggregationEndpoint extends BaseEndpointCoprocessor
    implements ColumnAggregationProtocol {

  @Override
  public List<String> getValues(byte[] family, byte[] qualifier, int batchSize, int cacheSize)
      throws IOException {
    // aggregate at each region
    Scan scan = new Scan();
    scan.addColumn(family, qualifier);
    scan.setCaching(cacheSize);
    scan.setBatch(batchSize);
    List<String> values = new ArrayList<String>();
    RegionCoprocessorEnvironment environment = (RegionCoprocessorEnvironment) getEnvironment();
    InternalScanner scanner = environment.getRegion().getScanner(scan);
    try {
      List<KeyValue> curVals = new ArrayList<KeyValue>();
      boolean hasMore = false;
      do {
        curVals.clear();
        hasMore = scanner.next(curVals);
        KeyValue kv = curVals.get(0);
        values.add(Bytes.toString(kv.getValue()));
      } while (hasMore);
    } finally {
      scanner.close();
    }
    return values;
  }
}

The RPC client that invokes the endpoint is as follows:

public class HBaseEndPointClientForElfLog {

  public static void main(String[] args) {
    try {
      Configuration conf = HBaseConfiguration.create();
      conf.set(
          "hbase.zookeeper.quorum",
          "vm-ab1f-dd21.nam.nsroot.net,vm-cb03-2277.nam.nsroot.net,vm-15c2-3bbf.nam.nsroot.net");
      String tableName = "elf_log";
      final String columnFamily = "content";
      final String columnQualifier = "logFileName";
      final String startRowKey = "153299:1362780381523:2932572079500658:vm-ab1f-dd21.nam.nsroot.net:";
      final String endRowKey = "153299:1362953388000";
      HTableInterface table = new HTable(conf, tableName);
      Scan scan;
      Map<byte[], List<String>> results;

      // scan: for all regions
      scan = new Scan();
      results = table.coprocessorExec(ColumnAggregationProtocol.class,
          startRowKey.getBytes(), endRowKey.getBytes(),
          new Batch.Call<ColumnAggregationProtocol, List<String>>() {
            public List<String> call(ColumnAggregationProtocol instance) throws IOException {
              return instance.getValues(columnFamily.getBytes(), columnQualifier.getBytes(), 2, 5);
            }
          });

      for (Map.Entry<byte[], List<String>> e : results.entrySet()) {
        System.out.println("Size of list returned: " + e.getValue().size());
        for (String singleVal : e.getValue()) {
          System.out.println(singleVal);
        }
      }
    } catch (Throwable throwable) {
      throwable.printStackTrace();
    }
  }
}

Regards,
Deepak

-----Original Message-----
From: lars hofhansl [mailto:la...@apache.org]
Sent: Tuesday, March 12, 2013 2:01 AM
To: user@hbase.apache.org
Subject: Re: Regionserver goes down while endpoint execution

What does the region server log say?
Endpoints do not run in a sandbox. You can call System.exit(...) and your RegionServer will happily exit.
If you can, please show us your endpoint code.

-- Lars

________________________________
From: "Kumar, Deepak8" <deepak8.ku...@citi.com>
To: "'user@hbase.apache.org'" <user@hbase.apache.org>
Sent: Monday, March 11, 2013 10:51 PM
Subject: Regionserver goes down while endpoint execution

Hi,

I have a table in HBase with more than 5 GB of data, distributed across 101 regions on 5 region servers. When I execute an endpoint that is supposed to fetch a column qualifier value using an endpoint RPC client, the region server goes down. The HBase master log says "Can't connect to region, retrying.." The same endpoint works fine for tables that have 300 records. Could you please help me find the reason the region server goes down?

Regards,
Deepak
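
A note on the endpoint loop posted in this thread: it calls curVals.get(0) on every pass and keeps every value of the region in one in-memory list. If a call to scanner.next() fills nothing (for example, a region with no matching cells), get(0) throws IndexOutOfBoundsException, and an unbounded list may put memory pressure on the region server for a table of this size. The sketch below shows the guard and a cap in plain Java. It is only an illustration: RowSource, collect, fromBatches and maxValues are hypothetical names, and RowSource merely stands in for the "fill the list, return true if more rows remain" behavior of the scanner; it is not HBase API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Plain-Java sketch. RowSource is a hypothetical stand-in for the
// scanner used in the endpoint; it is not an HBase class.
class SafeScanSketch {

    /** Fills `out` with the cells of one batch; returns true if more batches remain. */
    interface RowSource {
        boolean next(List<String> out);
    }

    /** Collects the first cell of each non-empty batch, stopping once maxValues are held. */
    static List<String> collect(RowSource source, int maxValues) {
        List<String> values = new ArrayList<String>();
        List<String> cur = new ArrayList<String>();
        boolean hasMore;
        do {
            cur.clear();
            hasMore = source.next(cur);
            // Guard: an empty batch would make cur.get(0) throw
            // IndexOutOfBoundsException, so skip it instead.
            if (!cur.isEmpty()) {
                values.add(cur.get(0));
            }
        } while (hasMore && values.size() < maxValues);
        return values;
    }

    /** Builds a RowSource over fixed batches, for demonstration only. */
    static RowSource fromBatches(List<List<String>> batches) {
        final Iterator<List<String>> it = batches.iterator();
        return new RowSource() {
            public boolean next(List<String> out) {
                if (it.hasNext()) {
                    out.addAll(it.next());
                }
                return it.hasNext();
            }
        };
    }

    public static void main(String[] args) {
        List<List<String>> batches = new ArrayList<List<String>>();
        batches.add(Arrays.asList("a", "b"));
        batches.add(new ArrayList<String>()); // empty batch, skipped by the guard
        batches.add(Arrays.asList("c"));
        System.out.println(collect(fromBatches(batches), 10)); // prints [a, c]
        System.out.println(collect(fromBatches(batches), 1));  // prints [a]
    }
}
```

With a cap like this, the client would have to call the endpoint repeatedly with a moving start row to page through a large region; that part is not shown. Separately, as far as I know hbase.rpc.timeout is read from the client's own Configuration, so it could probably be raised in the client with conf.setInt("hbase.rpc.timeout", ...) rather than through Cloudera Manager, although a longer timeout only hides a slow endpoint.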