Hi Lars,

Thank you for writing. The setup available to me is Cloudera CDH3. Do you have any information about connection pooling in CDH3?

Also, the client machine is Windows XP, so my main concern is the concurrent connection limit at the TCP/IP stack level. And if there is a limitation at the level of the JVM itself, will the whole multithreaded app suffer?
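As a rough illustration of the point Lars makes in the quoted message below (requests being serialized to some extent while a lock is held during network IO), here is a toy Java model. It is not HBase code: `FakeConnection`, `timeCalls`, and the 50 ms round trip are hypothetical names and values, and the model is stricter than the real client because it holds the lock for the entire simulated round trip. It only shows why threads that share one locked connection queue up, while threads spread over several connections do not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Toy model, not HBase code: a connection is locked for a whole simulated round trip. */
class SharedConnectionDemo {

    /** Hypothetical stand-in for one client connection to a server. */
    static final class FakeConnection {
        private final long roundTripMillis;

        FakeConnection(long roundTripMillis) {
            this.roundTripMillis = roundTripMillis;
        }

        // synchronized: only one caller at a time may use this connection.
        synchronized void call() throws InterruptedException {
            Thread.sleep(roundTripMillis); // simulated network IO
        }
    }

    /**
     * Issues one call per thread, assigning threads round-robin to the
     * given number of connections, and returns the elapsed wall time in ms.
     */
    static long timeCalls(int threads, int connections, long roundTripMillis) throws Exception {
        FakeConnection[] pool = new FakeConnection[connections];
        for (int i = 0; i < connections; i++) {
            pool[i] = new FakeConnection(roundTripMillis);
        }

        ExecutorService executor = Executors.newFixedThreadPool(threads);
        List<Future<?>> pending = new ArrayList<>();
        long start = System.nanoTime();
        for (int i = 0; i < threads; i++) {
            FakeConnection conn = pool[i % connections];
            pending.add(executor.submit(() -> {
                conn.call();
                return null;
            }));
        }
        for (Future<?> f : pending) {
            f.get(); // wait for every call to finish
        }
        long elapsed = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        executor.shutdown();
        return elapsed;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("4 threads, 1 connection:  " + timeCalls(4, 1, 50) + " ms");
        System.out.println("4 threads, 4 connections: " + timeCalls(4, 4, 50) + " ms");
    }
}
```

With one connection the four calls should take roughly 4 x 50 ms; with four connections, roughly 50 ms. Exact timings depend on the machine.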
-----Original Message-----
From: lars hofhansl [mailto:[email protected]]
Sent: Tuesday, August 23, 2011 9:55 PM
To: [email protected]
Subject: Re: Multithreaded get

The problem is that the requests are to some extent serialized over the connection (see HBaseClient.sendParam, where a lock is held while network IO is in progress).

HBase trunk has connection pooling (see HBASE-2939 and HBASE-4150). In my tests this sped up requests/sec with multiple threads significantly (sometimes by a factor of 2 or 3).

-- Lars

________________________________
From: Srikanth P. Shreenivas <[email protected]>
To: "[email protected]" <[email protected]>
Sent: Monday, August 22, 2011 11:43 PM
Subject: RE: Multithreaded get

Hi Jimson,

In my experience, as you increase the number of threads, each get/put starts taking more time. The reason is that the same TCP connection is used for all the gets/puts from a single JVM; all requests are multiplexed on the same connection. Hence, the 10 ms in your example is a function of the minimum amount of time a single get takes, so you cannot make it any faster by adding more threads.

I had done some tests in the past with puts. Here are my observations:
http://www.srikanthps.com/2011/06/hbase-benchmarking-for-multi-threaded.html

Regards,
Srikanth

-----Original Message-----
From: Jimson K. James [mailto:[email protected]]
Sent: Tuesday, August 23, 2011 11:40 AM
To: [email protected]
Subject: RE: Multithreaded get

Hi Li Pi,

Thank you for your quick response. What I see here is this: when we read 1000 keys, each holding 1 MB of data, from a total of 5 nodes, one node shows 100% network usage receiving data, while the other 3 nodes show 50% network usage transmitting data (the 5th is just the name node, with little network traffic). It seems the keys are aggregated onto one node before being served? There is no MapReduce involved, just the plain Get operation. Any idea?
Also, with the multithreaded app, the data retrieval speed shows weird behavior. For example, if a single-threaded app takes 10 ms to Get 2 rows, then a two-thread app should take 5 ms, but when tested it takes 10 ms.

From: Li Pi [mailto:[email protected]]
Sent: Tuesday, August 23, 2011 9:38 AM
To: [email protected]
Subject: Re: Multithreaded get

Yes. Even if all keys are on the same region, you'll experience a speedup if multithreaded.

Sort of relevant: read performance test with differing number of reader threads based on where the file is cached.

On Mon, Aug 22, 2011 at 9:04 PM, Jimson K. James <[email protected]> wrote:

Hi All,

Can anyone confirm that when a multithreaded application, say with 10 threads, tries to get 10 different keys from 10 different regions spread over 10 nodes, it takes 1/10th of the total time a single thread takes to fetch the same 10 keys? In other words, if a Get of a single key takes 10 ms, do 10 keys take 10*10 = 100 ms in a single-threaded application and approx. 10 ms in a 10-threaded application? Will the 10 threads retrieve the 10 keys simultaneously?

The target keys are all 1 MB in size and the network is a 10/100 Mbps LAN.

Thanks & Regards,
Jimson K James

***** Confidentiality Statement/Disclaimer *****
This message and any attachments is intended for the sole use of the intended recipient. It may contain confidential information. Any unauthorized use, dissemination or modification is strictly prohibited. If you are not the intended recipient, please notify the sender immediately then delete it from all your systems, and do not copy, use or print. Internet communications are not secure and it is the responsibility of the recipient to make sure that it is virus/malicious code exempt. The company/sender cannot be responsible for any unauthorized alterations or modifications made to the contents. If you require any form of confirmation of the contents, please contact the company/sender.
The company/sender is not liable for any errors or omissions in the content of this message.
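
A back-of-the-envelope check on the numbers in the original question may also help. The sketch below assumes 1 MB means 1,000,000 bytes, ignores protocol overhead and latency, and assumes all values reach the client over a single link; `minTransferMillis` is a hypothetical helper, not part of any HBase API.

```java
/** Lower bound on transfer time from payload size and link speed alone. */
class TransferTimeEstimate {

    /**
     * Minimum time in milliseconds to move payloadBytes over a link of
     * linkMbps megabits per second (assumption: no overhead, no latency).
     */
    static double minTransferMillis(long payloadBytes, double linkMbps) {
        double bits = payloadBytes * 8.0;
        double bitsPerMilli = linkMbps * 1_000_000.0 / 1000.0;
        return bits / bitsPerMilli;
    }

    public static void main(String[] args) {
        long oneMegabyte = 1_000_000L; // assumption: decimal megabyte
        System.out.println("1 key  @ 100 Mbps: " + minTransferMillis(oneMegabyte, 100) + " ms");
        System.out.println("10 keys @ 100 Mbps: " + minTransferMillis(10 * oneMegabyte, 100) + " ms");
        System.out.println("1 key  @ 10 Mbps:  " + minTransferMillis(oneMegabyte, 10) + " ms");
    }
}
```

Under these assumptions a single 1 MB value needs at least 80 ms on a 100 Mbps link, and ten such values need at least 800 ms in total no matter how many threads request them, because they all share the client's link.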
