Hi Josh,

The connection pooling code is attached AS IS (with all the usual legal disclaimers). Note that you will have to modify it a bit to get it to compile, because it depends on some internal libraries we use. In particular, DynamicAppSettings and Log are two internal classes that do what their names imply :)

Make sure you initialize "servers" in the NewConnection() method to an array with your Thrift servers and you should be good to go. Use GetConnection() to get a connection and ReturnConnection() to return it to the pool after you finish using it - make sure you don't close it in the application code.
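[The C# attachment itself isn't included in this thread. For readers following along, here is a minimal sketch of the pooling pattern Eran describes - a fixed-size pool, connections handed out with a get call and handed back rather than closed, servers assigned round-robin when the pool is filled. It's written in Java with a plain Connection stand-in so it stays self-contained; the class and field names are illustrative, not from the attachment.]

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of the described pooling pattern. A real pool would hold Thrift
// client connections; a plain stand-in is used here to stay self-contained.
class ConnectionPool {
    // Hypothetical stand-in for a Thrift client connection.
    static class Connection {
        final String server;
        Connection(String server) { this.server = server; }
    }

    private final BlockingQueue<Connection> pool;
    private final String[] servers;   // fill with your Thrift servers
    private int next = 0;

    ConnectionPool(String[] servers, int size) {
        this.servers = servers;
        this.pool = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            pool.add(newConnection());
        }
    }

    // Round-robin across the configured servers, analogous to initializing
    // "servers" in NewConnection() in the attached code.
    private Connection newConnection() {
        String server = servers[next++ % servers.length];
        return new Connection(server);
    }

    // Blocks until a pooled connection is free.
    Connection getConnection() {
        try {
            return pool.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    // Hand the connection back; application code must never close it.
    void returnConnection(Connection c) {
        pool.offer(c);
    }
}
```

The point of the pattern is exactly what the thread stresses: connections are reused over and over, so there is no open/close overhead per request, and closing one in application code would poison the pool.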
-eran

On Wed, Apr 27, 2011 at 00:30, Josh <j...@schulzone.org> wrote:
> On Tue, Apr 26, 2011 at 3:34 AM, Eran Kutner <e...@gigya.com> wrote:
> > Hi J-D,
> > I don't think it's a Thrift issue. First, I use the TBufferedTransport
> > transport; second, I implemented my own connection pool, so the same
> > connections are reused over and over again,
>
> Hey! I'm using C#->Hbase and high on my list of things to do is
> 'Implement Thrift Connection Pooling in C#'. You have any desire to
> release that code?
>
> > so there is no overhead
> > for opening and closing connections (I've verified that using
> > Wireshark); third, if it were a client capacity issue I would expect to
> > see an increase in throughput as I add more threads or run the test on
> > two servers in parallel. This doesn't seem to happen; the total
> > capacity remains unchanged.
> >
> > As for metrics, I already have them configured and monitored using
> > Zabbix, but it only monitors specific counters, so let me know what
> > information you would like to see. The numbers I quoted before are
> > based on client counters and correlated with server counters ("multi"
> > for writes and "get" for reads).
> >
> > -eran
> >
> > On Thu, Apr 21, 2011 at 20:43, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
> >> Hey Eran,
> >>
> >> Glad you could go back to debugging performance :)
> >>
> >> The scalability issues you are seeing are unknown to me; it sounds
> >> like the client isn't pushing it enough. It reminded me of when we
> >> switched to using the native Thrift PHP extension instead of the
> >> "normal" one and we saw huge speedups. My limited knowledge of Thrift
> >> may be blinding me, but I looked around for C# Thrift performance
> >> issues and found threads like this one:
> >> http://www.mail-archive.com/user@thrift.apache.org/msg00320.html
> >>
> >> As you didn't really debug the speed of Thrift itself in your setup,
> >> this is one more variable in the problem.
> >>
> >> Also, you don't really provide metrics about your system apart from
> >> requests/second. Would it be possible for you to set them up using this
> >> guide? http://hbase.apache.org/metrics.html
> >>
> >> J-D
> >>
> >> On Thu, Apr 21, 2011 at 5:13 AM, Eran Kutner <eran@> wrote:
> >> > Hi J-D,
> >> > After stabilizing the configuration, with your great help, I was able
> >> > to go back to the load tests. I tried using IRC, as you suggested,
> >> > to continue this discussion, but because of the time difference (I'm
> >> > GMT+3) it is quite difficult to find a time when people are present
> >> > and I am available to run long tests, so I'll give the mailing list
> >> > one more try.
> >> >
> >> > I tested again on a clean table using 100 insert threads, each using a
> >> > separate keyspace within the test table. Every row had just one column
> >> > with 128 bytes of data.
> >> > With one server and one region I got about 2300 inserts per second.
> >> > After manually splitting the region I got about 3600 inserts per
> >> > second (still on one machine). After a while the regions were balanced
> >> > and one was moved to another server; that got writes up to around 4500
> >> > writes per second. Additional splits and moves to more servers didn't
> >> > improve this number, and the write performance stabilized at ~4000
> >> > writes/sec per server. This seems pretty low, especially considering
> >> > other numbers I've seen around here.
> >> >
> >> > Read performance is at around 1500 rows per second per server, which
> >> > seems extremely low to me, especially considering that all the working
> >> > set I was querying could fit in the servers' memory. To make the test
> >> > interesting I limited my client to fetch only 1 row (always the same
> >> > one) from each keyspace; that yielded 10K reads per sec per server. So
> >> > I tried increasing the range and read the same 10 rows; now the
> >> > performance dropped to 8500 reads/sec per server.
> >> > Increasing the range
> >> > to 100 rows drops the performance to around 3500 reads per second
> >> > per server.
> >> > Do you have any idea what could explain this behavior, and how do I get
> >> > a decent number of reads from those servers?
> >> >
> >> > -eran
>
> --
> josh
> @schulz
> http://schulzone.org