Including the dev mailing list.

So I let it run, and after about 43 minutes I finally got some exceptions (sorry for the long paste):

org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
Thu Sep 27 14:59:29 EDT 2012, org.apache.hadoop.hbase.client.ScannerCallable@6c6742d0, java.net.SocketTimeoutException: Call to finddev07.dev.oclc.org/192.168.215.7:29319 failed on socket timeout exception: java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.215.7:42711 remote=finddev07.dev.oclc.org/192.168.215.7:29319]
Thu Sep 27 15:01:49 EDT 2012, org.apache.hadoop.hbase.client.ScannerCallable@6c6742d0, java.net.SocketTimeoutException: Call to finddev07.dev.oclc.org/192.168.215.7:29319 failed on socket timeout exception: java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.215.7:55858 remote=finddev07.dev.oclc.org/192.168.215.7:29319]
Thu Sep 27 15:04:09 EDT 2012, org.apache.hadoop.hbase.client.ScannerCallable@6c6742d0, java.net.SocketTimeoutException: Call to finddev07.dev.oclc.org/192.168.215.7:29319 failed on socket timeout exception: java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.215.7:40588 remote=finddev07.dev.oclc.org/192.168.215.7:29319]
Thu Sep 27 15:06:29 EDT 2012, org.apache.hadoop.hbase.client.ScannerCallable@6c6742d0, java.net.SocketTimeoutException: Call to finddev07.dev.oclc.org/192.168.215.7:29319 failed on socket timeout exception: java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.215.7:51399 remote=finddev07.dev.oclc.org/192.168.215.7:29319]
Thu Sep 27 15:08:50 EDT 2012, org.apache.hadoop.hbase.client.ScannerCallable@6c6742d0, java.net.SocketTimeoutException: Call to finddev07.dev.oclc.org/192.168.215.7:29319 failed on socket timeout exception: java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.215.7:41546 remote=finddev07.dev.oclc.org/192.168.215.7:29319]
Thu Sep 27 15:11:11 EDT 2012, org.apache.hadoop.hbase.client.ScannerCallable@6c6742d0, java.net.SocketTimeoutException: Call to finddev07.dev.oclc.org/192.168.215.7:29319 failed on socket timeout exception: java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.215.7:43072 remote=finddev07.dev.oclc.org/192.168.215.7:29319]
Thu Sep 27 15:13:34 EDT 2012, org.apache.hadoop.hbase.client.ScannerCallable@6c6742d0, java.net.SocketTimeoutException: Call to finddev07.dev.oclc.org/192.168.215.7:29319 failed on socket timeout exception: java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.215.7:43809 remote=finddev07.dev.oclc.org/192.168.215.7:29319]
Thu Sep 27 15:15:58 EDT 2012, org.apache.hadoop.hbase.client.ScannerCallable@6c6742d0, java.net.SocketTimeoutException: Call to finddev07.dev.oclc.org/192.168.215.7:29319 failed on socket timeout exception: java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.215.7:53426 remote=finddev07.dev.oclc.org/192.168.215.7:29319]
Thu Sep 27 15:18:25 EDT 2012, org.apache.hadoop.hbase.client.ScannerCallable@6c6742d0, java.net.SocketTimeoutException: Call to finddev07.dev.oclc.org/192.168.215.7:29319 failed on socket timeout exception: java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.215.7:33724 remote=finddev07.dev.oclc.org/192.168.215.7:29319]
Thu Sep 27 15:21:00 EDT 2012, org.apache.hadoop.hbase.client.ScannerCallable@6c6742d0, java.net.SocketTimeoutException: Call to finddev07.dev.oclc.org/192.168.215.7:29319 failed on socket timeout exception: java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.215.7:51604 remote=finddev07.dev.oclc.org/192.168.215.7:29319]

    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1345)
    at org.apache.hadoop.hbase.client.HTable$ClientScanner.nextScanner(HTable.java:1246)
    at org.apache.hadoop.hbase.client.HTable$ClientScanner.initialize(HTable.java:1169)
    at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:670)
    at org.apache.hadoop.hbase.client.HTablePool$PooledHTable.getScanner(HTablePool.java:381)
    at org.oclc.higgins.hbase.util.HBaseUtils.getHBaseRegions(HBaseUtils.java:95)
    at org.oclc.higgins.hbase.snoop.Snoop.getCatalogRowsGroupedByRegionServer(Snoop.java:392)
    at org.oclc.higgins.hbase.snoop.Snoop.watch(Snoop.java:318)
    at org.oclc.higgins.hbase.snoop.Snoop.main(Snoop.java:278)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:197)

From: Espinoza,Carlos
Sent: Thursday, September 27, 2012 2:46 PM
To: user@hbase.apache.org
Subject: Getting scans to timeout

Hi

Thanks for your help. I've been doing this in a pseudo-distributed hbase-0.92.1 environment with one region server. I'm trying to scan a table and see it time out. I'm trying to recreate a scenario where the RS is not responding (for instance due to NIC failure). So I've been issuing a 'kill -STOP' to the region server, and I expected the client to time out, but instead it just blocks at HTable.getScanner(). There is no output, no retries, nothing.

I understand that I'm pausing the execution on the region server, but from a client perspective, I'm thinking that this should not matter. My question is, is this a fair test? And if it is, any idea on how I can get it to not block?

I've been playing around with client-side settings, but with no success. I've tried these settings (10 sec):

    conf.setInt("hbase.rpc.timeout", 10000);
    conf.setInt("hbase.client.operation.timeout", 10000);

I've also tried these:

    HBaseClient.setSocketTimeout(this.conf, 10000);
    HBaseClient.setPingInterval(this.conf, 10000);

This is the jstack output of my application after I "STOP" the region server:

"main" prio=10 tid=0x000000005c812000 nid=0x594e in Object.wait() [0x00000000410c4000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00002aaae205ee80> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
    at java.lang.Object.wait(Object.java:485)
    at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:904)
    - locked <0x00002aaae205ee80> (a org.apache.hadoop.hbase.ipc.HBaseClient$Call)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
    at $Proxy4.openScanner(Unknown Source)
    at org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:120)
    at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:76)
    at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:39)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1325)
    at org.apache.hadoop.hbase.client.HTable$ClientScanner.nextScanner(HTable.java:1246)
    at org.apache.hadoop.hbase.client.HTable$ClientScanner.initialize(HTable.java:1169)
    at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:670)
    at org.apache.hadoop.hbase.client.HTablePool$PooledHTable.getScanner(HTablePool.java:381)
    at org.oclc.higgins.hbase.util.HBaseUtils.getHBaseRegions(HBaseUtils.java:95)
    at org.oclc.higgins.hbase.snoop.Snoop.getCatalogRowsGroupedByRegionServer(Snoop.java:392)
    at org.oclc.higgins.hbase.snoop.Snoop.watch(Snoop.java:318)
    at org.oclc.higgins.hbase.snoop.Snoop.main(Snoop.java:278)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
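
On the question of how to keep the caller from blocking: one approach that does not depend on any HBase setting is to run the blocking call on a separate thread and bound the wait with Future.get. The sketch below is only an illustration, assuming Java 8 or later. BoundedCall and callWithTimeout are hypothetical names, not part of HBase, and the sleeping task merely stands in for a call such as HTable.getScanner().

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: bounds how long the caller waits for a blocking task.
final class BoundedCall {

    static <T> T callWithTimeout(Callable<T> task, long timeoutMillis)
            throws TimeoutException, ExecutionException, InterruptedException {
        // Daemon thread, so an abandoned worker cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-call");
            t.setDaemon(true);
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            // Wait at most timeoutMillis for the result.
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Request interruption; the task may or may not honor it.
            future.cancel(true);
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        try {
            // Stand-in for a call that never returns on its own.
            callWithTimeout(() -> { Thread.sleep(60000); return "never"; }, 200);
        } catch (TimeoutException e) {
            System.out.println("gave up after timeout");
        }
    }
}
```

This only frees the calling thread. If the underlying call ignores interruption, the worker thread stays blocked in the background until the call returns or the process exits, so this is a workaround and not a substitute for a working RPC timeout.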