Nice catch, Daniel. The comment from Sylvain explains a lot!

On Tue, Sep 23, 2014 at 11:33 PM, Daniel Chia <danc...@coursera.org> wrote:
> If I had to guess, it might be due in part to inefficiencies in 2.0 with
> regard to CompositeType (which is used in CQL3 tables):
> https://issues.apache.org/jira/browse/CASSANDRA-5417?focusedCommentId=13821243&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13821243
>
> The ticket reports a 45% performance increase in reading slices compared to
> trunk in 2.1
>
> Thanks,
> Daniel
>
> On Tue, Sep 23, 2014 at 5:08 PM, DuyHai Doan <doanduy...@gmail.com> wrote:
>
>> I did some benchmarking in the past when we faced high CPU usage even
>> though the data set was very small and sat entirely in memory. The report
>> is here: https://github.com/doanduyhai/Cassandra_Data_Model_Bench
>>
>> Our *partial* conclusions were:
>>
>> 1) a slice query fetching a page of 64 KB of data and decoding columns is
>> more CPU-expensive than a single read by column
>> 2) the decoding of CompositeType costs more CPU for the CQL3 data model
>> than for the old Thrift column family
>> 3) since the Cell type for all CQL3 tables is forced to BytesType to
>> support any type of data, serialization/deserialization may have a CPU
>> cost.
>>
>> The issue Eric Leleu is facing reminds me of point 1). When he sets the
>> limit to 1, it's a single read by column. The other query, with limit 10,
>> is translated internally to a slice query, which may explain the CPU
>> difference.
>>
>> Now, do not take my words for granted. Those points are just *assumptions*
>> and partial conclusions. I need extensive in-depth debugging to confirm
>> them. I have not had time lately.
>>
>> On Tue, Sep 23, 2014 at 10:46 PM, Chris Lohfink <clohf...@blackbirdit.com>
>> wrote:
>>
>>> CPU consumption may be affected by the cassandra-stress tool in the 2nd
>>> example as well. Running it on a separate system eliminates it as a
>>> possible cause. There is a little extra work but not anything that I
>>> think would be that obvious.
>>> Tracing (can be enabled with nodetool) or profiling (e.g. with
>>> YourKit) can give more exposure to the bottleneck. I'd run the test from
>>> a separate system first.
>>>
>>> ---
>>> Chris Lohfink
>>>
>>>
>>> On Sep 23, 2014, at 12:48 PM, Leleu Eric <eric.le...@worldline.com>
>>> wrote:
>>>
>>> First of all, thanks for your help! :)
>>>
>>> Here are some details:
>>>
>>> "With RF=N=2 you're essentially testing a single machine locally, which
>>> isn't the best indicator long term"
>>>
>>> I will test with more nodes (4 with RF = 2), but for now I'm limited to
>>> 2 nodes for non-technical reasons...
>>>
>>> "Well, first off you shouldn't run the stress tool on the node you're
>>> testing. Give it its own box."
>>>
>>> I performed the test in a new keyspace in order to have a clean dataset.
>>>
>>> "the 2nd query since it's returning 10x the data and there will be more
>>> to go through within the partition"
>>>
>>> I configured cassandra-stress so that each user has only one bucket,
>>> so the amount of data is the same in both cases ("select * from buckets
>>> where name = ? and tenantid = ? limit 1" and "select * from
>>> owner_to_buckets where owner = ? and tenantid = ? limit 10").
>>> Does Cassandra perform extra reads when the limit is bigger than the
>>> available data (even if the partition key contains only a single value in
>>> the clustering column)?
>>> If the amount of data is the same, how can we explain the difference in
>>> CPU consumption?
>>>
>>>
>>> Regards,
>>> Eric
>>>
>>> ________________________________________
>>> From: Chris Lohfink [clohf...@blackbirdit.com]
>>> Sent: Tuesday, 23 September 2014 19:23
>>> To: user@cassandra.apache.org
>>> Subject: Re: CPU consumption of Cassandra
>>>
>>> Well, first off you shouldn't run the stress tool on the node you're
>>> testing. Give it its own box.
>>>
>>> With RF=N=2 you're essentially testing a single machine locally, which
>>> isn't the best indicator long term (optimizations are available when
>>> reading data that's local to the node). 80k/sec on a system is pretty
>>> good though; you're probably seeing slower on the 2nd query since it's
>>> returning 10x the data and there will be more to go through within the
>>> partition. 42k/sec is still acceptable IMHO since these are smaller
>>> boxes. You are probably seeing high CPU because the system is doing a
>>> lot :)
>>>
>>> If you want to get more out of these systems you can probably do some
>>> tuning; enable tracing to see what's actually the bottleneck.
>>>
>>> Collections will very likely hurt more than help.
>>>
>>> ---
>>> Chris Lohfink
>>>
>>> On Sep 23, 2014, at 9:39 AM, Leleu Eric <eric.le...@worldline.com> wrote:
>>>
>>> I tried to run “cassandra-stress” on some of my tables as proposed by
>>> Jake Luciani.
>>>
>>> For a simple table, this tool is able to perform 80000 read op/s with
>>> little CPU consumption if I query the table by the PK (name, tenantid)
>>>
>>> Ex :
>>> TABLE :
>>>
>>> CREATE TABLE IF NOT EXISTS buckets (tenantid varchar,
>>>     name varchar,
>>>     owner varchar,
>>>     location varchar,
>>>     description varchar,
>>>     codeQuota varchar,
>>>     creationDate timestamp,
>>>     updateDate timestamp,
>>>     PRIMARY KEY (name, tenantid));
>>>
>>> QUERY : select * from buckets where name = ? and tenantid = ?
>>> limit 1;
>>>
>>> TOP output for 900 threads on cassandra-stress :
>>> top - 13:17:09 up 173 days, 21:54, 4 users, load average: 11.88, 4.30, 2.76
>>> Tasks: 272 total, 1 running, 270 sleeping, 0 stopped, 1 zombie
>>> Cpu(s): 71.4%us, 14.0%sy, 0.0%ni, 13.1%id, 0.0%wa, 0.0%hi, 1.5%si, 0.0%st
>>> Mem: 98894704k total, 96367436k used, 2527268k free, 15440k buffers
>>> Swap: 0k total, 0k used, 0k free, 88194556k cached
>>>
>>>   PID USER     PR NI  VIRT  RES SHR S  %CPU %MEM    TIME+ COMMAND
>>> 25857 root     20  0 29.7g 1.5g 12m S 693.0  1.6 38:45.58 java <== Cassandra-stress
>>> 29160 cassandr 20  0 16.3g 4.8g 10m S   1.3  5.0 44:46.89 java <== Cassandra
>>>
>>>
>>>
>>> Now, if I run another query on a table that provides a list of buckets
>>> according to the owner, the number of op/s is divided by 2 (42000 op/s)
>>> and CPU consumption goes up.
>>>
>>> Ex :
>>> TABLE :
>>>
>>> CREATE TABLE IF NOT EXISTS owner_to_buckets (tenantid varchar,
>>>     name varchar,
>>>     owner varchar,
>>>     location varchar,
>>>     description varchar,
>>>     codeQuota varchar,
>>>     creationDate timestamp,
>>>     updateDate timestamp,
>>>     PRIMARY KEY ((owner, tenantid), name));
>>>
>>> QUERY : select * from owner_to_buckets where owner = ? and tenantid = ?
>>> limit 10;
>>>
>>> TOP output for 4 threads on cassandra-stress:
>>>
>>> top - 13:49:16 up 173 days, 22:26, 4 users, load average: 1.76, 1.48, 1.17
>>> Tasks: 273 total, 1 running, 271 sleeping, 0 stopped, 1 zombie
>>> Cpu(s): 26.3%us, 8.0%sy, 0.0%ni, 64.7%id, 0.0%wa, 0.0%hi, 1.0%si, 0.0%st
>>> Mem: 98894704k total, 97512156k used, 1382548k free, 14580k buffers
>>> Swap: 0k total, 0k used, 0k free, 90413772k cached
>>>
>>>   PID USER     PR NI  VIRT  RES SHR S  %CPU %MEM    TIME+ COMMAND
>>> 29160 cassandr 20  0 13.6g 4.8g 37m S 186.7  5.1 62:26.77 java <== Cassandra
>>> 50622 root     20  0 28.8g 469m 12m S 102.5  0.5  0:45.84 java <== Cassandra-stress
>>>
>>> TOP output for 271 threads on cassandra-stress:
>>>
>>>
>>> top - 13:57:03 up 173 days, 22:34, 4 users, load average: 4.67, 1.76, 1.25
>>> Tasks: 272 total, 1 running, 270 sleeping, 0 stopped, 1 zombie
>>> Cpu(s): 81.5%us, 14.0%sy, 0.0%ni, 3.1%id, 0.0%wa, 0.0%hi, 1.3%si, 0.0%st
>>> Mem: 98894704k total, 94955936k used, 3938768k free, 15892k buffers
>>> Swap: 0k total, 0k used, 0k free, 85993676k cached
>>>
>>>   PID USER     PR NI  VIRT  RES SHR S  %CPU %MEM    TIME+ COMMAND
>>> 29160 cassandr 20  0 13.6g 4.8g 38m S 430.0  5.1 82:31.80 java <== Cassandra
>>> 50622 root     20  0 29.1g 2.3g 12m S 343.4  2.4 17:51.22 java <== Cassandra-stress
>>>
>>>
>>> I have 4 tables with a composite PRIMARY KEY (two of them have 4 entries:
>>> 2 for the partition key, one for the clustering column and one for the
>>> sort column).
>>> Two of these tables are frequently read by partition key because
>>> we want to list the data of a given user; this should explain my CPU load
>>> according to the simple test done with cassandra-stress …
>>>
>>> How can I avoid this?
>>> Collections could be an option, but the number of items per user is not
>>> limited and can easily exceed 200 entries. According to the Cassandra
>>> documentation, collections have a size limited to 64KB. So it is probably
>>> not a solution in my case.
>>> ☹
>>>
>>>
>>> Regards,
>>> Eric
>>>
>>> From: Chris Lohfink [mailto:clohf...@blackbirdit.com]
>>> Sent: Monday, 22 September 2014 22:03
>>> To: user@cassandra.apache.org
>>> Subject: Re: CPU consumption of Cassandra
>>>
>>> It's going to depend a lot on your data model, but 5-6k is on the low end
>>> of what I would expect. N=RF=2 is not really something I would recommend.
>>> That said, 93GB is not much data, so the bottleneck may exist more in your
>>> data model, queries, or client.
>>>
>>> What profiler are you using? The CPU on the select/read is marked as
>>> RUNNABLE but it's really more of a wait state that may throw some
>>> profilers off; it may be a red herring.
>>>
>>> ---
>>> Chris Lohfink
>>>
>>> On Sep 22, 2014, at 11:39 AM, Leleu Eric <eric.le...@worldline.com> wrote:
>>>
>>>
>>> Hi,
>>>
>>>
>>> I’m currently testing Cassandra 2.0.9 (and, since last week, 2.1)
>>> under some read-heavy load…
>>>
>>> I have 2 Cassandra nodes (RF : 2) running under CentOS 6 with 16GB of
>>> RAM and 8 cores.
>>> I have around 93GB of data per node (one 300GB disk with a SAS
>>> interface and a rotational speed of 10500)
>>>
>>> I have 300 active client threads and they query the C* nodes with a
>>> consistency level set to ONE (I’m using the CQL DataStax driver).
>>>
>>> During my tests I saw a lot of CPU consumption (70% user / 6% sys / 4%
>>> iowait / 20% idle).
>>> C* nodes respond to around 5000 op/s (sometimes up to 6000 op/s)
>>>
>>> I tried to profile a node and, at first look, 60% of the CPU is spent
>>> in the “sun.nio.ch” package (SelectorImpl.select or Channel.read).
>>>
>>> I know that benchmark results are highly dependent on the dataset and
>>> use cases, but from my point of view this CPU consumption is normal
>>> given the load.
>>> Can someone confirm that point?
>>> Given my hardware configuration, can I expect to have more than
>>> 6000 read op/s?
>>>
>>>
>>> Regards,
>>> Eric
>>>
>>>
>>> ________________________________
>>>
>>> This e-mail and the documents attached are confidential and intended
>>> solely for the addressee; it may also be privileged. If you receive this
>>> e-mail in error, please notify the sender immediately and destroy it. As
>>> its integrity cannot be secured on the Internet, the Worldline liability
>>> cannot be triggered for the message content. Although the sender endeavours
>>> to maintain a computer virus-free network, the sender does not warrant that
>>> this transmission is virus-free and will not be liable for any damages
>>> resulting from any virus transmitted.
>>>
>>
>
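
A minimal sketch, not part of the original thread, of how the `top` snapshots quoted above can be read side by side. It only reuses the per-process %CPU figures Eric posted; the labels, the function names `cores_used` and `stress_share`, and the reading of 100% as one fully busy core (top's default mode) are assumptions made for illustration.

```python
# Per-process %CPU values copied from the three `top` snapshots in the thread.
# Assumption: top runs in its default mode, where 100% equals one busy core.
SNAPSHOTS = {
    "buckets, limit 1, 900 stress threads": {
        "cassandra": 1.3, "cassandra-stress": 693.0},
    "owner_to_buckets, limit 10, 4 stress threads": {
        "cassandra": 186.7, "cassandra-stress": 102.5},
    "owner_to_buckets, limit 10, 271 stress threads": {
        "cassandra": 430.0, "cassandra-stress": 343.4},
}


def cores_used(percent_cpu):
    """Convert a top %CPU figure into an equivalent number of busy cores."""
    return percent_cpu / 100.0


def stress_share(sample):
    """Fraction of the two JVMs' combined CPU consumed by the stress tool."""
    total = sample["cassandra"] + sample["cassandra-stress"]
    return sample["cassandra-stress"] / total


def report():
    """Print one summary line per snapshot."""
    for label, sample in SNAPSHOTS.items():
        print(
            f"{label}: "
            f"Cassandra ~{cores_used(sample['cassandra']):.1f} cores, "
            f"cassandra-stress ~{cores_used(sample['cassandra-stress']):.1f} cores, "
            f"stress share {stress_share(sample):.0%}"
        )


if __name__ == "__main__":
    report()
```

In each snapshot the stress client accounts for a sizeable part of the CPU used by the two Java processes on the node under test, which is consistent with the advice in the thread to run cassandra-stress from a separate machine before drawing conclusions about Cassandra's own CPU cost.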