Hi Ted,

6 minutes is too long :( Will this drop to seconds if more nodes are added to the cluster?
I finally got this exception (I faintly recall something about increasing a timeout parameter while querying, but I didn't want to raise it to a high value):

Apr 19, 2013 1:05:43 PM org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation processExecs
WARNING: Error executing for row
java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
Fri Apr 19 12:56:01 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1770 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:57:02 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1782 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:58:04 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1785 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:59:05 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1794 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:00:08 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1800 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:01:10 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1802 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:02:14 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1804 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:03:19 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1809 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:04:27 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1812 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:05:43 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1829 remote=cldx-1140-1034/172.25.6.71:60020]
    at java.util.concurrent.FutureTask$Sync.innerGet(Unknown Source)
    at java.util.concurrent.FutureTask.get(Unknown Source)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processExecs(HConnectionManager.java:1475)
    at org.apache.hadoop.hbase.client.HTable.coprocessorExec(HTable.java:1236)
    at org.apache.hadoop.hbase.client.coprocessor.AggregationClient.rowCount(AggregationClient.java:216)
    at client.hbase.HBaseCRUD.getTableCount(HBaseCRUD.java:307)
    at client.hbase.HBaseCRUD.main(HBaseCRUD.java:117)
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
Fri Apr 19 12:56:01 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1770 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:57:02 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1782 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:58:04 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1785 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:59:05 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1794 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:00:08 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1800 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:01:10 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1802 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:02:14 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1804 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:03:19 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1809 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:04:27 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1812 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:05:43 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1829 remote=cldx-1140-1034/172.25.6.71:60020]
    at org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:183)
    at org.apache.hadoop.hbase.ipc.ExecRPCInvoker.invoke(ExecRPCInvoker.java:79)
    at $Proxy6.getRowNum(Unknown Source)
    at org.apache.hadoop.hbase.client.coprocessor.AggregationClient$3.call(AggregationClient.java:220)
    at org.apache.hadoop.hbase.client.coprocessor.AggregationClient$3.call(AggregationClient.java:217)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$4.call(HConnectionManager.java:1463)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Apr 19, 2013 1:05:43 PM org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation internalClose
INFO: Closed zookeeper sessionid=0x13e185b8ee8003a
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=10, exceptions:
Fri Apr 19 12:56:01 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1770 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:57:02 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1782 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:58:04 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1785 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 12:59:05 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1794 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:00:08 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1800 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:01:10 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1802 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:02:14 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1804 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:03:19 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1809 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:04:27 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1812 remote=cldx-1140-1034/172.25.6.71:60020]
Fri Apr 19 13:05:43 IST 2013, org.apache.hadoop.hbase.ipc.ExecRPCInvoker$1@1d6e77, java.net.SocketTimeoutException: Call to cldx-1140-1034/172.25.6.71:60020 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/0.0.0.0:1829 remote=cldx-1140-1034/172.25.6.71:60020]
    at org.apache.hadoop.hbase.client.ServerCallable.withRetries(ServerCallable.java:183)
    at org.apache.hadoop.hbase.ipc.ExecRPCInvoker.invoke(ExecRPCInvoker.java:79)
    at $Proxy6.getRowNum(Unknown Source)
    at org.apache.hadoop.hbase.client.coprocessor.AggregationClient$3.call(AggregationClient.java:220)
    at org.apache.hadoop.hbase.client.coprocessor.AggregationClient$3.call(AggregationClient.java:217)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$4.call(HConnectionManager.java:1463)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

Regards,
Omkar Joshi

-----Original Message-----
From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Friday, April 19, 2013 3:00 PM
To: user@hbase.apache.org
Cc: user@hbase.apache.org
Subject: Re: Speeding up the row count

Since there is only one region in your table, using aggregation coprocessor has no advantage.
I think there may be some issue with your cluster - row count should finish within 6 minutes.

Have you checked server logs?

Thanks

On Apr 19, 2013, at 12:33 AM, Omkar Joshi <omkar.jo...@lntinfotech.com> wrote:

> Hi,
>
> I'm having a 2-node(VMs) Hadoop cluster atop which HBase is running in the
> distributed mode.
>
> I'm having a table named ORDERS with >100000 rows.
>
> NOTE : Since my cluster is ultra-small, I didn't pre-split the table.
>
> ORDERS
> rowkey : ORDER_ID
>
> column family : ORDER_DETAILS
> columns : CUSTOMER_ID
>           PRODUCT_ID
>           REQUEST_DATE
>           PRODUCT_QUANTITY
>           PRICE
>           PAYMENT_MODE
>
> The java client code to simply check the count of the records is :
>
> public long getTableCount(String tableName, String columnFamilyName) {
>
>     AggregationClient aggregationClient = new AggregationClient(config);
>     Scan scan = new Scan();
>     scan.addFamily(Bytes.toBytes(columnFamilyName));
>     scan.setFilter(new FirstKeyOnlyFilter());
>
>     long rowCount = 0;
>
>     try {
>         rowCount = aggregationClient.rowCount(Bytes.toBytes(tableName),
>                 null, scan);
>         System.out.println("No. of rows in " + tableName + " is "
>                 + rowCount);
>     } catch (Throwable e) {
>         // TODO Auto-generated catch block
>         e.printStackTrace();
>     }
>
>     return rowCount;
> }
>
> It is running for more than 6 minutes now :(
>
> What shall I do to speed up the execution to milliseconds(at least a couple
> of seconds)?
>
> Regards,
> Omkar Joshi
>
>
> -----Original Message-----
> From: Vedad Kirlic [mailto:kirl...@gmail.com]
> Sent: Thursday, April 18, 2013 12:22 AM
> To: user@hbase.apache.org
> Subject: Re: Speeding up the row count
>
> Hi Omkar,
>
> If you are not interested in occurrences of specific column (e.g. name,
> email ... ), and just want to get total number of rows (regardless of their
> content - i.e. columns), you should avoid adding any columns to the Scan, in
> which case coprocessor implementation for AggregateClient, will add
> FirstKeyOnlyFilter to the Scan, so to avoid loading unnecessary columns, so
> this should result in some speed up.
>
> This is similar approach to what hbase shell 'count' implementation does,
> although reduction in overhead in that case is bigger, since data transfer
> from region server to client (shell) is minimized, whereas in case of
> coprocessor, data does not leave region server, so most of the improvement
> in that case should come from avoiding loading of unnecessary files. Not
> sure how this will apply to your particular case, given that data set per
> row seems to be rather small. Also, in case of AggregateClient you will
> benefit if/when your tables span multiple regions. Essentially, performance
> of this approach will 'degrade' as your table gets bigger, but only to the
> point when it splits, from which point it should be pretty constant. Having
> this in mind, and your type of data, you might consider pre-splitting your
> tables.
>
> DISCLAIMER: this is mostly theoretical, since I'm not an expert in hbase
> internals :), so your best bet is to try it - I'm too lazy to verify impact
> my self ;)
>
> Finally, if your case can tolerate eventual consistency of counters with
> actual number of rows, you can, as already suggested, have RowCounter map
> reduce run every once in a while, write the counter(s) back to hbase, and
> read those when you need to obtain the number of rows.
>
> Regards,
> Vedad
>
>
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/Speeding-up-the-row-count-tp4042378p4042415.html
> Sent from the HBase User mailing list archive at Nabble.com.
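
A note on the retry timestamps in the trace at the top of this thread: each of the ten failures listed under "Failed after attempts=10" is a 60000 ms socket timeout, so the client spent roughly ten minutes retrying before it gave up. Below is a minimal sketch in plain Java that computes the gap between consecutive failure times copied from that trace. The class name RetryGaps and its helper are made up for illustration and are not part of HBase, and reading the seconds beyond 60 as a pause between retries is an assumption on my part, not something the log states.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.List;

public class RetryGaps {
    // Failure times copied from the "exceptions:" list in the trace (IST, 19 Apr 2013).
    static final String[] FAILURES = {
        "12:56:01", "12:57:02", "12:58:04", "12:59:05", "13:00:08",
        "13:01:10", "13:02:14", "13:03:19", "13:04:27", "13:05:43"
    };

    // The "60000 millis timeout" reported by each SocketTimeoutException, in seconds.
    static final long RPC_TIMEOUT_SECONDS = 60;

    /** Returns the number of seconds between each pair of consecutive failures. */
    static List<Long> gapsSeconds(String[] times) {
        List<Long> gaps = new ArrayList<>();
        for (int i = 1; i < times.length; i++) {
            LocalTime previous = LocalTime.parse(times[i - 1]);
            LocalTime next = LocalTime.parse(times[i]);
            gaps.add(Duration.between(previous, next).getSeconds());
        }
        return gaps;
    }

    public static void main(String[] args) {
        long total = 0;
        for (long gap : gapsSeconds(FAILURES)) {
            total += gap;
            // Assumption: anything beyond the 60 s timeout is a pause before the next attempt.
            System.out.println("gap=" + gap + "s, beyond timeout="
                    + (gap - RPC_TIMEOUT_SECONDS) + "s");
        }
        System.out.println("first to last failure: " + total + "s");
    }
}
```

For the timestamps above, the span from the first to the last failure comes to 582 s (9 min 42 s), and the time beyond 60 s per gap ranges from 1 s to 16 s. That suggests every attempt ran into the same 60 s limit rather than failing for ten different reasons.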