[ https://issues.apache.org/jira/browse/CASSANDRA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181514#comment-13181514 ]
Brandon Williams commented on CASSANDRA-3634:
---------------------------------------------

I have completed my benchmarks, with some interesting results. Each test was run against a 3 node cluster at RF=1, with a separate client machine pointed at it. Unless otherwise noted, each test was repeated 5 times, and the results are based on the aggregate of those runs. Because of this, and for the sake of time, I didn't run all of the tests that Eric did. The following results trim some of the beginning and the end of each run to avoid any warmup/falloff interference. Each run was compared against the others to check for outliers.

h3. Insert 40M rows, 5 columns

h4. Operations
||Type||Mean||SD||10th||25th||50th||75th||95th||99th||
|CQL|45387.2772643|5904.81307513|37315.9|42172.25|46481.0|49434.75|53178.95|56010.6|
|RPC|45931.9053199|6132.87181886|37661.6|42772.0|47025|50124.0|53961.5|57014.84|
|PS|48311.0570284|8291.6760434|37031.6|43918.25|49592.5|54021.0|59505.25|63267.25|
|BB|54603.8995816|11536.6909915|37070.2|48304|57419|62682|69488.6|74195.64|

Latency differences were negligible.

h3. Read 40M rows, 5 columns

h4. Operations
||Type||Mean||SD||10th||25th||50th||75th||95th||99th||
|CQL|51443.1086957|3745.68428455|46905.2|51052.0|52492.0|53435.75|54634.3|55384.79|
|RPC|57503.4226592|4003.63039895|51631.8|56866.5|58758|59848.5|61284.0|62149.58|
|PS|53620.2478142|3863.83989801|48231.9|52604.75|54854.0|56003.0|57381.25|58247.87|
|BB|56141.4066438|3837.89800487|50404.3|55446.25|57226.5|58418.25|59947.45|60845.6|

h4. Latency
||Type||Mean||SD||10th||25th||50th||75th||95th||99th||
|CQL|0.00610705090124|0.000469496778011|0.00581482442558|0.0059168154319|0.00603730067576|0.00620725273782|0.00700885743783|0.00738831808365|
|RPC|0.00533403099278|0.000470016449963|0.00501528631633|0.00512358578504|0.0052437593985|0.00542117815048|0.00627800086141|0.00664429953201|
|PS|0.00527725020587|0.000575478263522|0.00478626820553|0.00496031451654|0.00513961124805|0.00543807678393|0.00645881637227|0.00710907693204|
|BB|0.00550721726674|0.000465152724459|0.0051733781942|0.00529720514525|0.00542902622263|0.00561093846787|0.00644494569622|0.00697081447867|

After this point, I began focusing on PS and BB alone.

h3. Insert 4M rows, 50 columns

h4. Operations
||Type||Mean||SD||10th||25th||50th||75th||95th||99th||
|PS|16646.3925759|5280.09708065|8170.2|14582|18628|19979|21875.4|22844.2|
|BB|18788.0462487|8293.24195773|5300.4|13353|21986|24688|28159.2|29828.76|

h4. Latency
||Type||Mean||SD||10th||25th||50th||75th||95th||99th||
|PS|nan|nan|0.0139596939996|0.0148752865229|0.0160873770492|0.0190529939254|0.0592844420784|nan|
|BB|nan|nan|0.0110781744836|0.0121558163106|0.0138960263181|0.0211759061834|nan|nan|

The NaNs indicate periods where nothing happened, likely caused by CSLM (ConcurrentSkipListMap) garbage from the higher column counts.

h3. Read 4M rows, 50 columns

h4. Operations
||Type||Mean||SD||10th||25th||50th||75th||95th||99th||
|PS|34539.9544554|637.625135771|33708.8|34177|34649|34983|35437.6|35622.2|
|BB|34763.4899598|687.137336912|33898.6|34515.25|34887.5|35226.75|35526.15|35677.27|

Latencies were statistically identical.

h3. Insert 2M rows, 100 columns

h4. Operations
||Type||Mean||SD||10th||25th||50th||75th||95th||99th||
|PS|9942.96043656|3256.29877687|4520.4|9172|11310|12039|12748.0|13093.84|
|BB|11187.3251232|5054.66940811|1868.8|8712.0|13169.0|14822.75|16438.7|17366.29|

h4. Latency
||Type||Mean||SD||10th||25th||50th||75th||95th||99th||
|PS|nan|nan|0.0255763122529|0.0265708752266|0.0281904388031|0.0330053557766|0.124159655014|nan|
|BB|nan|nan|0.0196192202003|0.0211057864458|0.0237350334693|0.0366660241253|nan|nan|

h3. Read 2M rows, 100 columns

h4. Operations
||Type||Mean||SD||10th||25th||50th||75th||95th||99th||
|PS|19457.1524249|63.7626578468|19375.4|19453|19477|19489|19499.0|19525.76|
|BB|19452.0904872|67.9280612887|19391|19441.5|19474|19486.5|19498.0|19518.8|

h4. Latency
||Type||Mean||SD||10th||25th||50th||75th||95th||99th||
|PS|0.0169462953925|0.000645556882921|0.0168879243752|0.0169206096685|0.0169354541263|0.0169647912699|0.0170544990146|0.0171358551454|
|BB|0.016875082754|0.00061442017596|0.0169232071407|0.0169351055969|0.0169496839184|0.0169832991663|0.0170569914419|0.0171009883986|

h3. Read 1M rows, 5 columns, 2Kb values

Note that I'm omitting inserts here, as the machines were obviously bound by the commit log. I also chose this combination carefully, so as not to be network bound. Approximate peak is 922Mb on a gigabit network.

h4. Operations
||Type||Mean||SD||10th||25th||50th||75th||95th||99th||
|PS|12042.5353846|10.7183298178|12028.0|12037|12045|12050|12055.0|12060.76|
|BB|11757.3492537|141.932842405|11651.8|11700.0|11780|11822.5|11939.0|11964.6|

h4. Latency
||Type||Mean||SD||10th||25th||50th||75th||95th||99th||
|PS|0.0273751391287|0.000433847035981|0.0273802118349|0.0273989370536|0.0274172059068|0.0274350859706|0.0274714290518|0.0275139832189|
|BB|0.00452289713894|0.000443022928354|0.00435995387188|0.004620072698|0.00464701897019|0.00468012476814|0.00470964983515|0.00472619195963|

h4. Notes

h5. 40x5
* Outside of this test, measuring inserts gets a bit iffy because GC begins playing a significant role, though this should even out with enough runs
* Even in this test, GC is still likely a factor at this scale, since reads end up being faster than writes
* BB's dominance on inserts is undeniable, though both it and PS have a significantly higher standard deviation, and BB is very bursty
* reads were very consistent across the board here; these are very trustworthy results

h5. 4x50
* reads are again very consistent, but with little difference between PS and BB

h5. 2x100
* reads are statistically identical

h5. 1x5x2Kb
* throughput is roughly the same between them, but BB's standard deviation is roughly 13x higher (141.9 vs. 10.7). Mostly that is a result of PS being *extremely* consistent.
* for whatever reason, PS pays a *huge* latency penalty on large values. This is very consistent across runs.

> compare string vs. binary prepared statement parameters
> -------------------------------------------------------
>
> Key: CASSANDRA-3634
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3634
> Project: Cassandra
> Issue Type: Sub-task
> Components: API, Core
> Reporter: Eric Evans
> Assignee: Eric Evans
> Priority: Minor
> Labels: cql
> Fix For: 1.1
>
>
> Perform benchmarks to compare the performance of string and pre-serialized
> binary parameters to prepared statements.
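
For readers who want to reproduce the Mean, SD and percentile columns from their own stress runs, the aggregation described in the comment (trim the start and end of each run, then summarize the runs together) might be sketched as below. This is only an illustration under stated assumptions: the comment does not say how much of each run was trimmed, how the five runs were combined, or which percentile method was used, so the 10% trim fraction, the pooling of samples, the linear-interpolation percentile, and the function names `trim`, `percentile` and `summarize` are all hypothetical.

```python
import math
import statistics

# Percentile columns used in the benchmark tables.
PERCENTILES = (10, 25, 50, 75, 95, 99)


def trim(samples, fraction=0.1):
    """Drop `fraction` of the samples from each end of one run.

    The 10% default is an assumption; the comment only says "some" of the
    beginning and end was trimmed to avoid warmup/falloff interference.
    """
    k = int(len(samples) * fraction)
    return list(samples[k:len(samples) - k])


def percentile(sorted_samples, p):
    """Percentile via linear interpolation between the two closest ranks
    (an assumed method, not necessarily the one used for the tables)."""
    if not sorted_samples:
        raise ValueError("no samples")
    rank = (len(sorted_samples) - 1) * p / 100.0
    lo = math.floor(rank)
    hi = math.ceil(rank)
    low_value = sorted_samples[lo]
    high_value = sorted_samples[hi]
    return low_value + (high_value - low_value) * (rank - lo)


def summarize(runs, fraction=0.1):
    """Trim every run, pool what is left, and compute one table row."""
    pooled = sorted(x for run in runs for x in trim(run, fraction))
    row = {
        "Mean": statistics.fmean(pooled),
        "SD": statistics.stdev(pooled),  # sample standard deviation
    }
    for p in PERCENTILES:
        row[f"{p}th"] = percentile(pooled, p)
    return row
```

Calling `summarize(runs)` with one list of per-interval samples (operations or latency) per run returns a dict keyed like the table headers. Pooling the trimmed samples is only one way to aggregate repeated runs; averaging per-run statistics would give different SD and tail-percentile values.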