[ 
https://issues.apache.org/jira/browse/CASSANDRA-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905936#action_12905936
 ] 

Jon Hermes edited comment on CASSANDRA-1329 at 9/3/10 11:50 AM:
----------------------------------------------------------------

Process was:
 - stress.py insert 1M rows
 - loop stress.py multiget 100k rows until values stabilized

Note: The first runs have a cold cache (I left the default 200k keys in), and 
100k reads is just enough to occasionally throw me into GC. Also, I'm only 
randomizing over the first 100k block out of the 1M written, so everything 
should be key-cached and there are bound to be more duplicates in the set than 
I planned for.
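
As a rough sense of how many duplicates to expect, here is a minimal sketch. It assumes each of the 100k requested keys is drawn uniformly and independently from the 100k-key block; stress.py's actual key distribution may differ, and the function name is only illustrative. Under that assumption about 63% of the draws are distinct, so roughly 37% are duplicates.

```python
def expected_distinct(draws, key_space):
    """Expected number of distinct keys after `draws` uniform picks
    (with replacement) from `key_space` equally likely keys."""
    return key_space * (1.0 - (1.0 - 1.0 / key_space) ** draws)


draws = 100000      # -n 100000 in the multiget runs
key_space = 100000  # first 100k block of the 1M rows written

distinct = expected_distinct(draws, key_space)
duplicate_fraction = 1.0 - distinct / draws

print("expected distinct keys: %.0f" % distinct)
print("expected duplicate fraction: %.3f" % duplicate_fraction)
```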

// See attached multiget.test and multigetsmall.test

Regardless of the vagaries, the numbers are still comparable, and it looks like 
there is no significant difference in time to process a set versus a list.

      was (Author: jhermes):
    Process was:
 - stress.py insert 1M rows
 - loop stress.py multiget 100k rows until values stabilized

Note: The first runs have a cold cache (I left the default 200k keys in), and 
100k reads is just enough to occasionally throw me into GC. Also, I'm only 
randomizing over the first 100k block out of the 1M written, so everything 
should be key-cached and there are bound to be more duplicates in the set than 
I planned for.
{noformat}
==PRE-PATCH1==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ \
    python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,5.19286876202,7                                           <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.85971493721,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.95732964993,5                                           <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.65282338619,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.60082161903,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.15557718277,5                                           <-- GC

==POST-PATCH1==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ \
    python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,5.42788017273,7                                           <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.03476555347,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,4.36921528816,6                                           <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.69064975262,3
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.66355334282,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.75016436577,5
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.59240380764,4

==PRE-PATCH2==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ \
    python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,5.46314402103,7                                           <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.97970569611,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.58520019531,5                                           <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.65835041046,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.71839766502,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.54171346188,5                                           <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.54564589024,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.70630379677,4

==POST-PATCH2==
$ PYTHONPATH=/usr/lib/python2.6/site-packages/:test:interface/thrift/gen-py/ \
    python contrib/py_stress/stress.py -o multiget -n 100000 -k
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,6.26670286655,8                                           <-- Cold Keycache
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.88729266167,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.75670327663,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.99453821182,6                                           <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.5942284441,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.67040606022,4
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,3.8302997303,5                                            <-- GC
total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
50,5,10000,2.57685221672,3
{noformat}
Regardless of the vagaries, the numbers are still comparable, and it looks like 
there is no significant difference in time to process a set versus a list.
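
One way to sanity-check that reading of the numbers is to average avg_latency over only the runs that carry no cold-cache or GC annotation. The sketch below does that with the data rows copied from the four runs above (header lines omitted); the names RAW and steady_latencies are illustrative only, and with three to five samples per run this is a rough comparison, not a statistical test.

```python
from statistics import mean

# Data rows copied from the runs above.
# Columns: total,interval_op_rate,interval_key_rate,avg_latency,elapsed_time
RAW = {
    "PRE-PATCH1": """
50,5,10000,5.19286876202,7 <-- Cold Keycache
50,5,10000,2.85971493721,4
50,5,10000,3.95732964993,5 <-- GC
50,5,10000,2.65282338619,4
50,5,10000,2.60082161903,4
50,5,10000,3.15557718277,5 <-- GC
""",
    "POST-PATCH1": """
50,5,10000,5.42788017273,7 <-- Cold Keycache
50,5,10000,3.03476555347,4
50,5,10000,4.36921528816,6 <-- GC
50,5,10000,2.69064975262,3
50,5,10000,2.66355334282,4
50,5,10000,2.75016436577,5
50,5,10000,2.59240380764,4
""",
    "PRE-PATCH2": """
50,5,10000,5.46314402103,7 <-- Cold Keycache
50,5,10000,2.97970569611,4
50,5,10000,3.58520019531,5 <-- GC
50,5,10000,2.65835041046,4
50,5,10000,2.71839766502,4
50,5,10000,3.54171346188,5 <-- GC
50,5,10000,2.54564589024,4
50,5,10000,2.70630379677,4
""",
    "POST-PATCH2": """
50,5,10000,6.26670286655,8 <-- Cold Keycache
50,5,10000,2.88729266167,4
50,5,10000,2.75670327663,4
50,5,10000,2.99453821182,6 <-- GC
50,5,10000,2.5942284441,4
50,5,10000,2.67040606022,4
50,5,10000,3.8302997303,5 <-- GC
50,5,10000,2.57685221672,3
""",
}


def steady_latencies(block):
    """Return avg_latency for every row without a '<--' annotation."""
    values = []
    for line in block.strip().splitlines():
        if "<--" in line:
            continue  # skip runs marked as cold key cache or GC
        fields = line.split(",")
        values.append(float(fields[3]))  # fourth column is avg_latency
    return values


means = {name: mean(steady_latencies(block)) for name, block in RAW.items()}

for name, value in means.items():
    print("%s: %.3f" % (name, value))
```

With the annotated runs excluded, the four means land within about 0.05 of one another, which is consistent with the conclusion above.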
  
> make multiget take a set of keys instead of a list
> --------------------------------------------------
>
>                 Key: CASSANDRA-1329
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1329
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jon Hermes
>            Priority: Minor
>             Fix For: 0.7 beta 2
>
>         Attachments: 1329-rebase.txt, 1329-stresspy-multiget.txt, 1329.txt, 
> multiget.test, multigetsmall.test
>
>
> this more correctly sets the expectation that the order of keys in that list 
> doesn't matter, and duplicates don't make sense

