So I created an array of clients using the following code:

    Clients = [riak.RiakClient(e, port=8087, transport_class=riak.RiakPbcTransport) for e in NODES]

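For readers following along, here is a minimal sketch of this kind of setup in Python 3, using only the standard library: one long-lived client per node, with one worker thread pinned to each client and all workers draining a shared queue. `StubClient`, `load`, and the node names are hypothetical stand-ins so the sketch runs without a cluster; they are not part of the riak library.

```python
import queue
import threading


class StubClient:
    """Hypothetical stand-in for a client bound to one node (not the riak API)."""

    def __init__(self, host):
        self.host = host
        self.stored = []  # keys written through this client

    def store(self, key, value):
        # A real client would send the write over its persistent connection.
        self.stored.append(key)


def load(nodes, items, make_client=StubClient):
    """Write (key, value) pairs using one reusable client and one thread per node."""
    # Clients are created once, up front, so no connection is opened per write.
    clients = [make_client(host) for host in nodes]

    work = queue.Queue()
    for item in items:
        work.put(item)

    def worker(thread_id):
        client = clients[thread_id]  # each thread talks to exactly one node
        while True:
            try:
                key, value = work.get_nowait()
            except queue.Empty:
                return  # queue drained, worker is done
            client.store(key, value)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(len(clients))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return clients


if __name__ == "__main__":
    done = load(["node-a", "node-b", "node-c"], [(str(i), str(i)) for i in range(1000)])
    print(sum(len(c.stored) for c in done), "writes issued")
```

Against a real cluster, `make_client` would wrap a constructor call such as the `riak.RiakClient(e, port=8087, transport_class=riak.RiakPbcTransport)` shown above, and `store` would map to the bucket write. Whether one real client can safely be shared by several threads is not assumed here, which is why the sketch keeps one thread per client.
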
After this I assigned each thread an id ranging from 0 to the number of
nodes, so each thread now communicates with a single node.

Even after this, I am getting <100 writes/sec.

On Wed, Jun 27, 2012 at 5:35 PM, Yousuf Fauzan <yousuffau...@gmail.com> wrote:

> Oh! I think that may be an issue with my code then.
>
> Let me make some changes and get back to you.
>
> On Wed, Jun 27, 2012 at 5:25 PM, Reid Draper <reiddra...@gmail.com> wrote:
>
>> On Jun 27, 2012, at 7:48 AM, Yousuf Fauzan wrote:
>>
>> This is great.
>>
>> I was loading data using Python. My code would spawn 10 threads and put
>> data in a queue. All threads would read data from this queue.
>> However, all threads were hitting the same server/load balancer.
>>
>> I tried a different setup too, where I spawned processes with each
>> process having its own queue. In this case too, all processes were hitting
>> the same server.
>>
>> I just now made a change to my code. So now I have 10 threads randomly
>> selecting a node and storing data in it.
>> Again, I am getting around 50 writes/sec.
>>
>> When the threads randomly pick a node, do they create a new connection to
>> it, or do they pull the connection from a pool? As you saw with the
>> throughput difference between curl and python, persistent connections make
>> a big difference.
>>
>> Could there be something wrong with the way I have written my loader
>> script?
>>
>> On Wed, Jun 27, 2012 at 5:10 PM, Russell Brown <russell.br...@mac.com> wrote:
>>
>>> On 27 Jun 2012, at 12:36, Yousuf Fauzan wrote:
>>>
>>> So I changed concurrency to 10 and put all the IPs of the nodes in basho
>>> bench config.
>>> Throughput is now around 1500.
>>>
>>> I guess you can now try 5 or 15 concurrent workers and see which is
>>> optimal for that set up to get a good feel for the sizing of any connection
>>> pools for your application.
>>>
>>> You can also see how adding nodes and adding workers affects your
>>> results to help you size the cluster you need for your expected usage.
>>>
>>> Cheers
>>>
>>> Russell
>>>
>>> On Wed, Jun 27, 2012 at 4:40 PM, Russell Brown <russell.br...@mac.com> wrote:
>>>
>>>> On 27 Jun 2012, at 12:09, Yousuf Fauzan wrote:
>>>>
>>>> I used examples/riakc_pb.config
>>>>
>>>> {mode, max}.
>>>>
>>>> {duration, 10}.
>>>>
>>>> {concurrent, 1}.
>>>>
>>>> Try upping this. On my local 3 node cluster with 8gb ram and an old,
>>>> cheap quad core per box I'd set concurrency to 10 workers.
>>>>
>>>> {driver, basho_bench_driver_riakc_pb}.
>>>>
>>>> {key_generator, {int_to_bin, {uniform_int, 10000}}}.
>>>>
>>>> {value_generator, {fixed_bin, 10000}}.
>>>>
>>>> {riakc_pb_ips, [{<IP of one of the nodes>}]}.
>>>>
>>>> I add all the IPs here, one entry per node.
>>>>
>>>> {riakc_pb_replies, 1}.
>>>>
>>>> {operations, [{get, 1}, {update, 1}]}.
>>>>
>>>> On Wed, Jun 27, 2012 at 4:37 PM, Russell Brown <russell.br...@mac.com> wrote:
>>>>
>>>>> On 27 Jun 2012, at 12:05, Yousuf Fauzan wrote:
>>>>>
>>>>> I did use basho bench on my clusters. It showed throughput of around
>>>>> 150.
>>>>>
>>>>> Could you share the config you used, please?
>>>>>
>>>>> On Wed, Jun 27, 2012 at 4:24 PM, Russell Brown <russell.br...@mac.com> wrote:
>>>>>
>>>>>> On 27 Jun 2012, at 11:50, Yousuf Fauzan wrote:
>>>>>>
>>>>>> It's not about the difference in throughput in the two approaches I
>>>>>> took. Rather, the issue is that even 200 writes/sec is a bit on the lower
>>>>>> side.
>>>>>> I could be doing something wrong with the configuration because
>>>>>> people are reporting throughputs of 2-3k ops/sec.
>>>>>>
>>>>>> If anyone here could guide me in setting up a cluster which would
>>>>>> give such kind of throughput.
>>>>>>
>>>>>> To get the kind of throughput I use multiple threads / workers. Have
>>>>>> you looked at basho_bench[1]? It is a simple, reliable tool to benchmark
>>>>>> Riak clusters.
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> Russell
>>>>>>
>>>>>> [1] Basho Bench - https://github.com/basho/basho_bench and
>>>>>> http://wiki.basho.com/Benchmarking.html
>>>>>>
>>>>>> Thanks,
>>>>>> Yousuf
>>>>>>
>>>>>> On Wed, Jun 27, 2012 at 4:02 PM, Eric Anderson <ander...@copperegg.com> wrote:
>>>>>>
>>>>>>> On Jun 27, 2012, at 5:13 AM, Yousuf Fauzan <yousuffau...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I set up a 3 machine riak SM cluster. Each machine used 4GB RAM and
>>>>>>> riak OpenSource SmartMachine Image.
>>>>>>>
>>>>>>> Afterwards I tried loading data by following two methods
>>>>>>> 1. Bash script
>>>>>>> #!/bin/bash
>>>>>>> echo $(date)
>>>>>>> for (( c=1; c<=1000; c++ ))
>>>>>>> do
>>>>>>>   curl -s -d 'this is a test' -H "Content-Type: text/plain" http://127.0.0.1:8098/buckets/test/keys
>>>>>>> done
>>>>>>> echo $(date)
>>>>>>>
>>>>>>> 2. Python Riak Client
>>>>>>> c=riak.RiakClient("10.112.2.185")
>>>>>>> b=c.bucket("test")
>>>>>>> for i in xrange(10000):o=b.new(str(i), str(i)).store()
>>>>>>>
>>>>>>> For case 1, throughput was 25 writes/sec
>>>>>>> For case 2, throughput was 200 writes/sec
>>>>>>>
>>>>>>> Maybe I am making a fundamental mistake somewhere. I tried the above
>>>>>>> two scripts on EC2 clusters too and still got the same performance.
>>>>>>>
>>>>>>> Please, someone help
>>>>>>>
>>>>>>> The major difference between these two is the first is executing a
>>>>>>> binary, which has to basically create everything (connection, payload,
>>>>>>> etc) every time through the loop. The second does not - it creates the
>>>>>>> client once, then iterates over it keeping the same client and
>>>>>>> presumably the same connection as well. That makes a huge difference.
>>>>>>>
>>>>>>> I would not use curl to do performance testing. What you probably
>>>>>>> want is something like your python script that will work on many
>>>>>>> threads/processes at once (or fire them up many times).
>>>>>>>
>>>>>>> Eric Anderson
>>>>>>> Co-Founder
>>>>>>> CopperEgg
>>>>>>>
>>>>>> _______________________________________________
>>>>>> riak-users mailing list
>>>>>> riak-users@lists.basho.com
>>>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com