Re: RandomPartitioner is providing a very skewed distribution of keys across a 5-node Solandra cluster
An additional detail is that the CPU utilization on those nodes is proportional to the load below, so machines 9.9.9.1 and 9.9.9.3 experience a fraction of the CPU load compared to the remaining 3 nodes. This might further point to the possibility that the keys are hashing minimally to the token ranges on those nodes. I'm no expert at cryptography, but is it possible that web URLs are not evenly distributed via MD5 hashing due to the common prefixes they contain (such as the http:// prefix, or perhaps a domain name)? What's also interesting is that the distribution is more-or-less even across *alternating* nodes (0, 2, 4 vs. 1, 3).

Thanks,
Safdar

On Sun, Jun 24, 2012 at 6:00 PM, Safdar Kureishy <safdar.kurei...@gmail.com> wrote:

Hi,

I've searched online but was unable to find any leads for the problem below. This mailing list seemed the most appropriate place; apologies in advance if that isn't the case.

I'm running a 5-node Solandra cluster (Solr + Cassandra). I've set up the nodes with tokens *evenly distributed across the token space* for a 5-node cluster (as evidenced below under the effective-ownership column of the nodetool ring output). My data is a set of a few million web pages, crawled using Nutch and indexed using the solrindex command available through Nutch. AFAIK, the key for each document generated from the crawled data is the URL.

Based on the load values for the nodes below, despite adding about 3 million web pages to this index via the HTTP REST API (e.g. http://9.9.9.x:8983/solandra/index/update), some nodes are still empty. Specifically, nodes 9.9.9.1 and 9.9.9.3 hold just a few kilobytes (shown in *bold* below) of the index, while the remaining 3 nodes are consistently getting hammered by all the data. If the RandomPartitioner (which is what I'm using for this cluster) is supposed to achieve an even distribution of keys across the token space, why is the data below skewed in this fashion? Literally, no key has yet been hashed to nodes 9.9.9.1 and 9.9.9.3 below. Could someone possibly shed some light on this absurdity?

[me@hm1 solandra-app]$ bin/nodetool -h hm1 ring
Address  DC           Rack   Status  State   Load        Effective-Owership  Token
                                                                             136112946768375385385349842972707284580
9.9.9.0  datacenter1  rack1  Up      Normal  7.57 GB     20.00%              0
9.9.9.1  datacenter1  rack1  Up      Normal  *21.44 KB*  20.00%              34028236692093846346337460743176821145
9.9.9.2  datacenter1  rack1  Up      Normal  14.99 GB    20.00%              68056473384187692692674921486353642290
9.9.9.3  datacenter1  rack1  Up      Normal  *50.79 KB*  20.00%              102084710076281539039012382229530463435
9.9.9.4  datacenter1  rack1  Up      Normal  15.22 GB    20.00%              136112946768375385385349842972707284580

Thanks in advance.

Regards,
Safdar
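The MD5 question raised above can be checked directly: MD5's avalanche behavior means keys sharing a common prefix do not cluster in the token space. A minimal, self-contained Python sketch (the 128-bit digest stands in for RandomPartitioner's 127-bit token space; the URLs are invented for illustration):

```python
import hashlib

def md5_token(key):
    """Hash a key the way RandomPartitioner does (MD5); the full 128-bit
    digest is used here for simplicity instead of Cassandra's 127 bits."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")

# 10,000 URLs sharing the http:// prefix and even a common domain
urls = [f"http://example.com/page/{i}" for i in range(10000)]

# Bucket tokens into 5 equal ranges, as a balanced 5-node ring would
counts = [0] * 5
for url in urls:
    counts[md5_token(url) * 5 // 2 ** 128] += 1

print(counts)  # each bucket gets close to 2,000 keys despite the shared prefix
```

If the document keys really were full URLs fed to MD5, a skew like the one in the ring output above would be very surprising, which is what the replies below go on to probe.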
Re: RandomPartitioner is providing a very skewed distribution of keys across a 5-node Solandra cluster
If I read what you are saying, you are _not_ using composite keys? That's one thing that could do it, if the first part of the composite key had a very, very low cardinality.

On 06/24/2012 11:00 AM, Safdar Kureishy wrote:
[...]
Re: RandomPartitioner is providing a very skewed distribution of keys across a 5-node Solandra cluster
Hi Dave,

Would you mind elaborating a bit more on that, preferably with an example? AFAIK, Solandra uses the unique id of the Solr document as the input for calculating the MD5 hash for shard/node assignment. In this case the ids are just millions of varied web URLs that do *not* adhere to any regular expression. I'm not sure if that answers your question below?

Thanks,
Safdar

On Sun, Jun 24, 2012 at 8:38 PM, Dave Brosius <dbros...@mebigfatguy.com> wrote:

If I read what you are saying, you are _not_ using composite keys? That's one thing that could do it, if the first part of the composite key had a very, very low cardinality.

On 06/24/2012 11:00 AM, Safdar Kureishy wrote:
[...]
Re: wildcards as both ends
> I'm wondering how or if it's possible to implement efficient wildcards at both ends, e.g. *string*

No.

> if I can get another equality constraint which narrows down the potential result set significantly, I can do a scan. I'm not sure how feasible this is without benchmarks. Does anyone know if I can scan a couple of hundred / thousand in a 3-node, replication factor = 2 cluster quickly?

Not efficiently. If you need full-text capabilities look at Solr, Solandra (the Solr-to-Cassandra port) or DataStax Enterprise.

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 21/06/2012, at 2:20 AM, Sam Z J wrote:

Hi all

I'm wondering how or if it's possible to implement efficient wildcards at both ends, e.g. *string*. I can think of a few options... please comment, thanks =D

- If I can get another equality constraint which narrows down the potential result set significantly, I can do a scan. I'm not sure how feasible this is without benchmarks. Does anyone know if I can scan a couple of hundred / thousand rows in a 3-node, replication factor = 2 cluster quickly?

- For each string I have, index all the prefixes in a column family, e.g. for the string 'string', I'd have rows string, strin, stri, str, st, s, with column values somehow pointing back to the row keys. This almost blows up the storage needed =/ (Also, what do I do if I hit the 2-billion-column row width limit? Is there a way to say 'insert into another row if the current one is full'?)

thanks

--
Zhongshi (Sam) Jiang
sammyjiang...@gmail.com
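Sam's second option (index every prefix) can be sketched in plain Python to make the storage blow-up concrete. A dict stands in for the index column family here; nothing below is a Cassandra API:

```python
from collections import defaultdict

# Toy model of the "index all the prefixes" scheme: maps each prefix back
# to the row keys of the strings that start with it.
prefix_index = defaultdict(set)

def index_string(row_key, value):
    # One index entry per prefix of the stored string
    for i in range(1, len(value) + 1):
        prefix_index[value[:i]].add(row_key)

index_string("row1", "string")
index_string("row2", "strong")

print(sorted(prefix_index["str"]))  # both rows start with 'str'
print(len(prefix_index))            # 9 index rows for just two 6-letter strings
```

Note that as described this only serves trailing wildcards (str*); matching *string* would additionally require indexing the prefixes of every suffix, multiplying the storage yet again, which is why the reply points at Solr/Solandra instead.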
Re: Tiered compaction on two disks
> I have a Cassandra installation where we plan to store 1Tb of data, split between two 1Tb disks.

In general it's a good idea to limit the per-node storage to 300GB to 400GB. This has more to do with operational issues than any particular issue with Cassandra. However, storing a very large number of keys on a single node can result in high memory usage while the server is idling, and reduced read performance.

> I know that tiered compaction needs 50% free disk space for the worst-case situation.

Not really nowadays, but it's a good idea to treat 50% as a soft limit.

> How does this combine with the disk split?

Whenever a new file is written to disk it will use the data directory with the most space. In general we recommend using a single data directory.

Hope that helps.

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 21/06/2012, at 10:56 PM, Flavio Baronti wrote:

Hi,

I have a Cassandra installation where we plan to store 1Tb of data, split between two 1Tb disks. Tiered compaction should be better suited for our workload (append-only, deletion of old data, few reads). I know that tiered compaction needs 50% free disk space for the worst-case situation. How does this combine with the disk split? What happens if I have 500Gb of data in one disk and 500Gb in the other? Won't compaction try to build a single 1Tb file, failing since there are only 500Gb free on each disk?

Flavio
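The worst case Flavio is asking about can be put as back-of-envelope arithmetic, using the thread's numbers (illustrative only):

```python
# Worst-case space check for size-tiered compaction (numbers from the thread).
disk_gb = 1000          # each of the two 1Tb disks
data_per_disk_gb = 500  # 500Gb of SSTables already on each disk

free_gb = disk_gb - data_per_disk_gb

# Worst case: a major compaction rewrites all SSTables in one data directory
# into a single file, so old and new copies coexist until the old ones drop.
needed_gb = data_per_disk_gb

print(free_gb, needed_gb)  # exactly at the limit; any further growth overflows
```

This is why the 50% figure circulates, and why Aaron's reply treats it as a soft limit rather than a hard requirement on modern versions.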
Re: Weird behavior in Cassandra 1.1.0 - throwing unconfigured CF exceptions when the CF is present
I would check if the schemas have diverged; run describe cluster in the cli.

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 22/06/2012, at 12:22 AM, Tharindu Mathew wrote:

Hi,

I'm having issues with Hector 1.1.0 and Cassandra 1.1.0. I'm adding a column family dynamically, and after sleeping for some time and making sure that the column family is created using keyspacedefinition.getCFs, I still get unconfigured column family exceptions. Even after some time, if I try to insert data I still get unconfigured CF exceptions. Below at [1], I have inserted logs to specifically print all the CFs before inserting data. It is present in the CF list, but still it's failing. Note that this does not happen for all data; some data does get inserted. I'm baffled as to what could be the reason. Any help would be really appreciated.

[1] -
[2012-06-21 17:22:21,680] INFO {org.wso2.carbon.eventbridge.streamdefn.cassandra.datastore.CassandraConnector} - Keyspace desc.:
ThriftKsDef[name=EVENT_KS,strategyClass=org.apache.cassandra.locator.SimpleStrategy,strategyOptions={replication_factor=1},cfDefs=[ThriftCfDef[keyspace=EVENT_KS,name=org_wso2_bam_kp,columnType=STANDARD,comparatorType=me.prettyprint.hector.api.ddl.ComparatorType@c89abe1,subComparatorType=null,comparatorTypeAlias=,subComparatorTypeAlias=,comment=,rowCacheSize=0.0,rowCacheSavePeriodInSeconds=0,keyCacheSize=0.0,readRepairChance=1.0,columnMetadata=[],gcGraceSeconds=864000,keyValidationClass=org.apache.cassandra.db.marshal.BytesType,defaultValidationClass=org.apache.cassandra.db.marshal.BytesType,id=1004,maxCompactionThreshold=32,minCompactionThreshold=4,memtableOperationsInMillions=0.0,memtableThroughputInMb=0,memtableFlushAfterMins=0,keyCacheSavePeriodInSeconds=0,replicateOnWrite=true,compactionStrategy=org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy,compactionStrategyOptions={},compressionOptions={sstable_compression=org.apache.cassandra.io.compress.SnappyCompressor},mergeShardsChance=0.0,rowCacheProvider=null,keyAlias=null,rowCacheKeysToSave=0]],durableWrites=true]
[2012-06-21 17:22:21,681] INFO {org.wso2.carbon.eventbridge.streamdefn.cassandra.datastore.CassandraConnector} - CFs present cf name : org_wso2_bam_kp
[2012-06-21 17:22:21,683] ERROR {org.wso2.carbon.eventbridge.streamdefn.cassandra.subscriber.BAMEventSubscriber} - Error processing event.
Event{streamId='org.wso2.bam.kp-1.0.5-6b80ca6c-1ad9-4495-a872-8466c424c5d0', timeStamp=1340279541606, metaData=[external], metaData=null, payloadData=[Orange, 1.0, 520.0, Ivan]}
me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:unconfigured columnfamily org_wso2_bam_kp)
    at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:45)
    at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:264)
    at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:97)
    at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
    at org.wso2.carbon.eventbridge.streamdefn.cassandra.datastore.CassandraConnector.insertEvent(CassandraConnector.java:361)
    at org.wso2.carbon.eventbridge.streamdefn.cassandra.subscriber.BAMEventSubscriber.receive(BAMEventSubscriber.java:42)
    at org.wso2.carbon.eventbridge.core.internal.queue.QueueWorker.run(QueueWorker.java:64)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:680)
Caused by: InvalidRequestException(why:unconfigured columnfamily org_wso2_bam_kp)
    at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20169)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:913)
    at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:899)
    at me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:246)
    at me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:243)
    at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:103)
    at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:258)
    ... 11 more

--
Regards,
Tharindu

blog: http://mackiemathew.com/
Re: Starting cassandra with -D option
> Idea is to avoid having the copies of cassandra code in each node,

If you run Cassandra from the NAS you are adding a single point of failure into the system. Better to use some form of deployment automation and install all the required components onto each node.

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 22/06/2012, at 12:29 AM, Flavio Baronti wrote:

The option must actually include also the name of the yaml file:

-Dcassandra.config=file:///Users/walmart/Downloads/Cassandra/Node2-Cassandra1.1.0/conf/cassandra.yaml

Flavio

Il 6/21/2012 13:16 PM, Roshni Rajagopal ha scritto:

Hi Folks,

We wanted to have a single Cassandra installation, and use it to start Cassandra on other nodes by passing it the Cassandra configuration directories as a parameter. The idea is to avoid having copies of the Cassandra code on each node, and starting each node by getting into bin/cassandra of that node. As per http://www.datastax.com/docs/1.0/references/cassandra, we have an option -D where we can supply some parameters to Cassandra. Has anyone tried this? I'm getting an error as below.

walmarts-MacBook-Pro-2:Node1-Cassandra1.1.0 walmart$ bin/cassandra -Dcassandra.config=file:///Users/walmart/Downloads/Cassandra/Node2-Cassandra1.1.0/conf
walmarts-MacBook-Pro-2:Node1-Cassandra1.1.0 walmart$
INFO 15:38:01,763 Logging initialized
INFO 15:38:01,766 JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_31
INFO 15:38:01,766 Heap size: 1052770304/1052770304
INFO 15:38:01,766 Classpath: bin/../conf:bin/../build/classes/main:bin/../build/classes/thrift:bin/../lib/antlr-3.2.jar:bin/../lib/apache-cassandra-1.1.0.jar:bin/../lib/apache-cassandra-clientutil-1.1.0.jar:bin/../lib/apache-cassandra-thrift-1.1.0.jar:bin/../lib/avro-1.4.0-fixes.jar:bin/../lib/avro-1.4.0-sources-fixes.jar:bin/../lib/commons-cli-1.1.jar:bin/../lib/commons-codec-1.2.jar:bin/../lib/commons-lang-2.4.jar:bin/../lib/compress-lzf-0.8.4.jar:bin/../lib/concurrentlinkedhashmap-lru-1.2.jar:bin/../lib/guava-r08.jar:bin/../lib/high-scale-lib-1.1.2.jar:bin/../lib/jackson-core-asl-1.9.2.jar:bin/../lib/jackson-mapper-asl-1.9.2.jar:bin/../lib/jamm-0.2.5.jar:bin/../lib/jline-0.9.94.jar:bin/../lib/json-simple-1.1.jar:bin/../lib/libthrift-0.7.0.jar:bin/../lib/log4j-1.2.16.jar:bin/../lib/metrics-core-2.0.3.jar:bin/../lib/mx4j-tools-3.0.1.jar:bin/../lib/servlet-api-2.5-20081211.jar:bin/../lib/slf4j-api-1.6.1.jar:bin/../lib/slf4j-log4j12-1.6.1.jar:bin/../lib/snakeyaml-1.6.jar:bin/../lib/snappy-java-1.0.4.1.jar:bin/../lib/snaptree-0.1.jar:bin/../lib/jamm-0.2.5.jar
INFO 15:38:01,768 JNA not found. Native methods will be disabled.
INFO 15:38:01,826 Loading settings from file:/Users/walmart/Downloads/Cassandra/Node2-Cassandra1.1.0/conf
ERROR 15:38:01,873 Fatal configuration error
error Can't construct a java object for tag:yaml.org,2002:org.apache.cassandra.config.Config; exception=No single argument constructor found for class org.apache.cassandra.config.Config
 in reader, line 1, column 1: cassandra.yaml

The other option would be to modify cassandra.in.sh. Has anyone tried this?

Regards,
Roshni

This email and any files transmitted with it are confidential and intended solely for the individual or entity to whom they are addressed. If you have received this email in error destroy it immediately. *** Walmart Confidential ***
Re: RandomPartitioner is providing a very skewed distribution of keys across a 5-node Solandra cluster
Well, it sounds like this doesn't apply to you. If you had set up your column family in CQL as PRIMARY KEY (domain_name, path) or something like that, and were looking at lots and lots of URL pages (domain_name + path) but from a very small number of domain_names, then the partitioner seeing just the domain_name could account for an uneven distribution. But it sounds like your key is just a URL, so that should (in theory) be fine.

On 06/24/2012 01:53 PM, Safdar Kureishy wrote:
[...]
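Dave's cardinality point can be simulated: place the same crawl by hashing the full URL versus only its domain part and compare the resulting distributions. An illustrative Python sketch (node_for is a made-up helper mimicking MD5-based placement, not Cassandra code; the full 128-bit digest is used instead of RandomPartitioner's 127 bits):

```python
import hashlib

def node_for(key, nodes=5):
    # MD5-based placement onto a balanced ring of `nodes` token ranges
    token = int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")
    return token * nodes // (2 ** 128)

# A crawl of 3,000 pages spread over only 3 domains
urls = [f"http://site{d}.com/page/{p}" for d in range(3) for p in range(1000)]

by_full_url = [0] * 5   # key = whole URL (Safdar's stated setup)
by_domain = [0] * 5     # key = domain only (Dave's low-cardinality scenario)
for url in urls:
    by_full_url[node_for(url)] += 1
    by_domain[node_for(url.split("/page/")[0])] += 1

print(by_full_url)  # spread across all 5 nodes
print(by_domain)    # at most 3 nodes receive anything; the rest stay empty
```

With only 3 distinct partition keys, whole nodes sit empty regardless of how evenly the tokens are assigned, which is exactly the ring pattern in the original post.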
Re: Cassandra 1.0.6 data flush query
> memtable_total_space_in_mb: 200

This means Cassandra tries to use less than 200MB of real memory to hold memtables. The problem is Java takes a lot more memory to hold data than it takes to store it on disk. You can see the ratio of serialized to live bytes logged from the Memtable, with messages like "setting live ratio…". It can be anywhere from 1 to 64. So if the live ratio is 10, your 10MB SSTable is taking 100MB in RAM.

In short, add more RAM to the VM.

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 22/06/2012, at 3:58 PM, Roshan wrote:

Hi

I am using Cassandra 1.0.6 in our production system and noticed that Cassandra is flushing the data to SSTables of about 10MB each. Under moderate write load, Cassandra flushes lots of memtables with small sizes, and with this, it is doing lots of compactions.

O/S - CentOS 64-bit
Sun Java 1.6_31
VM size - 2.4GB

The following parameters were changed in the cassandra.yaml file:

flush_largest_memtables_at: 0.45 (reduced from 0.75)
reduce_cache_sizes_at: 0.55 (reduced from 0.85)
reduce_cache_capacity_to: 0.3 (reduced from 0.6)
concurrent_compactors: 1
memtable_total_space_in_mb: 200
in_memory_compaction_limit_in_mb: 16 (from 64MB)
Key cache = 1
Row cache = 0

Could someone please help me on this. Thanks

/Roshan
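Putting numbers to the live-ratio arithmetic in Aaron's reply (the live ratio of 20 below is hypothetical; Cassandra logs the actual per-memtable value):

```python
# Numbers from the thread: memtable_total_space_in_mb = 200 and ~10MB
# SSTables appearing on each flush.
memtable_budget_mb = 200
sstable_size_mb = 10     # serialized size seen on disk after a flush
live_ratio = 20          # in-memory bytes per serialized byte (observed range 1..64)

heap_used_mb = sstable_size_mb * live_ratio
print(heap_used_mb)                        # 200
print(heap_used_mb >= memtable_budget_mb)  # True: the budget is hit, so flushes stay small
```

At a live ratio of 20, a mere 10MB of serialized data already fills the 200MB memtable budget, which would explain the frequent small flushes and the resulting compaction churn.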
Re: Column names overhead
> What is the penalty for using longer column names?

Each column name is stored in each -Data file where a value is stored for it. So if you have muchos overwrites, the column name may be stored in many places.

> Should I sacrifice longer self-explanatory names for shorter cryptic ones to save the disk space?

If you have lots of COBOL programmers around it may be OK. If you are at the extremes of capacity it may also be OK. You may also get some value by storing the schema separately from the data.

> On one hand, I understand that a Cassandra row is a key-value map, but on the other hand, it probably uses compression when storing them.

Compression is (currently) off by default; see http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-compression

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 23/06/2012, at 5:03 AM, Leonid Ilyevsky wrote:

What is the penalty for using longer column names? Should I sacrifice longer self-explanatory names for shorter cryptic ones to save the disk space? On one hand, I understand that a Cassandra row is a key-value map, but on the other hand, it probably uses compression when storing them.
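Since the column name is stored with every cell, a rough pre-compression estimate of the cost is straightforward (the names and row count below are invented for illustration, and per-cell timestamps and length fields are ignored):

```python
# Rough, pre-compression cost of storing a column's name once per row.
rows = 100_000_000
long_name = "customer_last_payment_timestamp"  # 31 bytes
short_name = "clpt"                            # 4 bytes

def name_bytes_gb(name, rows):
    # Bytes spent on the name alone across all rows, in GB
    return len(name) * rows / 1e9

print(round(name_bytes_gb(long_name, rows), 1))   # GB for the descriptive name
print(round(name_bytes_gb(short_name, rows), 1))  # GB for the cryptic name
```

In practice the gap shrinks dramatically once SSTable compression is enabled, since a name repeated in every cell compresses extremely well, which supports Aaron's point that shortening names only pays off at the extremes of capacity.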
Re: Fat Client Commit Log
The fat client would still have some information in the system CF. Are the files big? Are they continually created?

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 23/06/2012, at 8:07 AM, Frank Ng wrote:

Hi All,

We are using the Fat Client and noticed that there are files written to the commit log directory on the Fat Client. Does anyone know what these files are storing? Are they hinted handoff data? The Fat Client has no files in the data directory, as expected.

thanks
Re: RandomPartitioner is providing a very skewed distribution of keys across a 5-node Solandra cluster
Thanks. Oh, I forgot to mention that I'm using Cassandra 1.1.0-beta2, in case that question comes up. Hoping someone can offer some more feedback on the likelihood of this behavior...

Thanks again,
Safdar

On Jun 24, 2012 9:22 PM, Dave Brosius <dbros...@mebigfatguy.com> wrote:
[...]
Re: how to reduce latency?
Hi Yan,

Did you manage to figure out what was causing the increasing latency on your cluster? Was the resolution just to add more nodes, or something else?

Thanks,
Safdar

On Jun 13, 2012 2:40 PM, Yan Chunlu <springri...@gmail.com> wrote:

I have three nodes running Cassandra 0.7.4 for about two years, as shown below:

10.x.x.x  Up  Normal  138.07 GB  33.33%  0
10.x.x.x  Up  Normal  143.97 GB  33.33%  56713727820156410577229101238628035242
10.x.x.x  Up  Normal  137.33 GB  33.33%  113427455640312821154458202477256070484

The commitlog and data directories are on separate disks (Western Digital WD RE3 WD1002FBYS 1TB). As the data size grows, the read and write times keep increasing, slowing down the website frequently.

Based on the experience that every time I use nodetool to maintain the nodes it takes a very long time, consumes a lot of system resources (and often ends nowhere), and makes my web service very unstable, I really have no idea what to do. Upgrading doesn't seem to solve this either: I have a newer cluster with the same configuration but on version 1.0.2, which also shows increasing latency, and the new system also suffers from the same instability...

Just wondering, does that mean I must add more nodes (which is also a painful and slow path)?
Re: Starting cassandra with -D option
I did something similar for my installation, but I used ENV variables: I created a directory on a machine (call this the master) with directories for all of the distributions (call them slaves). So, consider: /master/slave1 /master/slave2 ... /master/slaven then i rdist this to all of my slaves. In the /master directory all of the standard cassandra distribution. In the /master/slave* directory all of the machine dependent stuff. Also in /master I have a .profile with: -bash-4.1$ cat /master/.profile # export CASSANDRA_HOME=$HOME/run SHOST=`hostname | sed s'/\..*//'` export CASSANDRA_CONF=$CASSANDRA_HOME/conf/$SHOST export CASSANDRA_INCLUDE=$CASSANDRA_HOME/conf/$SHOST/cassandra.in.sh . $CASSANDRA_HOME/conf/cassandra-env.sh PATH=$HOME/run/bin:$PATH echo 'to start cassandra type cassandra' this leaves me with this environment on each slave (slave1 example): -bash-4.1$ env | grep CAS CASSANDRA_HOME=/usr/share/cassandra/run CASSANDRA_CONF=/usr/share/cassandra/run/conf/slave1 CASSANDRA_INCLUDE=/usr/share/cassandra/run/conf/slave1/cassandra.in.sh Using this technique I maintain my Cassandra cluster on 1 machine and rdist to the participants.Rdist makes each node independent. -greg On Sun, Jun 24, 2012 at 1:11 PM, aaron morton aa...@thelastpickle.com wrote: Idea is to avoid having the copies of cassandra code in each node, If you run cassandra from the NAS you are adding a single point of failure into the system. Better to use some form of deployment automation and install all the requirement components onto each node. 
Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com

On 22/06/2012, at 12:29 AM, Flavio Baronti wrote: The option must actually also include the name of the yaml file: -Dcassandra.config=file:///Users/walmart/Downloads/Cassandra/Node2-Cassandra1.1.0/conf/cassandra.yaml Flavio

On 6/21/2012 13:16, Roshni Rajagopal wrote: Hi Folks, We wanted to have a single cassandra installation and use it to start cassandra on other nodes by passing it the cassandra configuration directories as a parameter. The idea is to avoid having copies of the cassandra code on each node and starting each node by getting into bin/cassandra of that node. As per http://www.datastax.com/docs/1.0/references/cassandra, there is a -D option where we can supply some parameters to cassandra. Has anyone tried this? I'm getting an error as below.

walmarts-MacBook-Pro-2:Node1-Cassandra1.1.0 walmart$ bin/cassandra -Dcassandra.config=file:///Users/walmart/Downloads/Cassandra/Node2-Cassandra1.1.0/conf
walmarts-MacBook-Pro-2:Node1-Cassandra1.1.0 walmart$ INFO 15:38:01,763 Logging initialized
INFO 15:38:01,766 JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_31
INFO 15:38:01,766 Heap size: 1052770304/1052770304
INFO 15:38:01,766 Classpath:
bin/../conf:bin/../build/classes/main:bin/../build/classes/thrift:bin/../lib/antlr-3.2.jar:bin/../lib/apache-cassandra-1.1.0.jar:bin/../lib/apache-cassandra-clientutil-1.1.0.jar:bin/../lib/apache-cassandra-thrift-1.1.0.jar:bin/../lib/avro-1.4.0-fixes.jar:bin/../lib/avro-1.4.0-sources-fixes.jar:bin/../lib/commons-cli-1.1.jar:bin/../lib/commons-codec-1.2.jar:bin/../lib/commons-lang-2.4.jar:bin/../lib/compress-lzf-0.8.4.jar:bin/../lib/concurrentlinkedhashmap-lru-1.2.jar:bin/../lib/guava-r08.jar:bin/../lib/high-scale-lib-1.1.2.jar:bin/../lib/jackson-core-asl-1.9.2.jar:bin/../lib/jackson-mapper-asl-1.9.2.jar:bin/../lib/jamm-0.2.5.jar:bin/../lib/jline-0.9.94.jar:bin/../lib/json-simple-1.1.jar:bin/../lib/libthrift-0.7.0.jar:bin/../lib/log4j-1.2.16.jar:bin/../lib/metrics-core-2.0.3.jar:bin/../lib/mx4j-tools-3.0.1.jar:bin/../lib/servlet-api-2.5-20081211.jar:bin/../lib/slf4j-api-1.6.1.jar:bin/../lib/slf4j-log4j12-1.6.1.jar:bin/../lib/snakeyaml-1.6.jar:bin/../lib/snappy-java-1.0.4.1.jar:bin/../lib/snaptree-0.1.jar:bin/../lib/jamm-0.2.5.jar
INFO 15:38:01,768 JNA not found. Native methods will be disabled.
INFO 15:38:01,826 Loading settings from file:/Users/walmart/Downloads/Cassandra/Node2-Cassandra1.1.0/conf
ERROR 15:38:01,873 Fatal configuration error
error Can't construct a java object for tag:yaml.org,2002:org.apache.cassandra.config.Config; exception=No single argument constructor found for class org.apache.cassandra.config.Config in reader, line 1, column 1: cassandra.yaml

The other option would be to modify cassandra.in.sh. Has anyone tried this? Regards, Roshni
Consistency Problem with Quorum consistencyLevel configuration
Hi, I ran into a consistency problem even though we use Quorum for both reads and writes. I use MultigetSubSliceQuery to query rows from a super column, limited to 100, then read them, then delete them, and then start another round. But I found that a row which should have been deleted by the previous round still shows up in the next round's query. Also, in a normal column family, I updated the value of one column from status='FALSE' to status='TRUE', and the next time I queried it the status was still 'FALSE'. More detail:
- It does not happen every time (roughly 1 in 10,000).
- The time between two rounds of queries is around 500 ms (but we also found cases where the second query happened 2 seconds after the first and still hit this consistency problem).
- We use ntp as our cluster time synchronization solution.
- We have 6 nodes, and the replication factor is 3.
Some people say Cassandra is expected to have such problems, because a read may be processed before a write inside Cassandra. But for two seconds?! And if so, it would be meaningless to have Quorum or other consistency level configurations. So first of all, is this the correct behavior for Cassandra, and if not, what data do we need to collect for further investigation? BRs Ares
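For reference, the guarantee Ares is relying on can be checked mechanically: with replication factor 3, every possible QUORUM write set (2 replicas) overlaps every possible QUORUM read set (2 replicas), so a read that starts after a successful write should observe it. A small sketch of that overlap argument (illustrative only, not Cassandra code):

```python
from itertools import combinations

N, W, R = 3, 2, 2  # replication factor, write quorum size, read quorum size

replicas = range(N)
# Whenever R + W > N, every write quorum shares at least one replica with
# every read quorum -- that shared replica carries the newest value.
overlaps = all(
    set(w) & set(r)
    for w in combinations(replicas, W)
    for r in combinations(replicas, R)
)
print(overlaps)       # True
print(R + W > N)      # True
```

This is why the reported behavior is surprising: if the writes and deletes genuinely completed at QUORUM, later QUORUM reads should not return stale data, which points toward something like client timestamp skew or failed writes rather than expected behavior.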
Re: RandomPartitioner is providing a very skewed distribution of keys across a 5-node Solandra cluster
Hi Safdar, If you want to get better utilization of the cluster, raise the solandra.shards.at.once param in solandra.properties. -Jake

On Sun, Jun 24, 2012 at 11:00 AM, Safdar Kureishy safdar.kurei...@gmail.com wrote: Hi, I've searched online but was unable to find any leads for the problem below. This mailing list seemed the most appropriate place. Apologies in advance if that isn't the case. I'm running a 5-node Solandra cluster (Solr + Cassandra). I've set up the nodes with tokens *evenly distributed across the token space* for a 5-node cluster (as evidenced below under the effective-ownership column of the nodetool ring output). My data is a set of a few million web pages, crawled using Nutch and indexed using the solrindex command available through Nutch. AFAIK, the key for each document generated from the crawled data is the URL. Based on the load values for the nodes below, despite adding about 3 million web pages to this index via the HTTP REST API (e.g.: http://9.9.9.x:8983/solandra/index/update), some nodes are still empty. Specifically, nodes 9.9.9.1 and 9.9.9.3 hold just a few kilobytes (shown in *bold* below) of the index, while the remaining 3 nodes are consistently getting hammered by all the data. If the RandomPartitioner (which is what I'm using for this cluster) is supposed to achieve an even distribution of keys across the token space, why is the data below skewed in this fashion? Literally, no key has yet been hashed to the nodes 9.9.9.1 and 9.9.9.3 below. Could someone possibly shed some light on this absurdity?
[me@hm1 solandra-app]$ bin/nodetool -h hm1 ring
Address DC Rack Status State Load Effective-Ownership Token
                                                     136112946768375385385349842972707284580
9.9.9.0 datacenter1 rack1 Up Normal 7.57 GB 20.00% 0
9.9.9.1 datacenter1 rack1 Up Normal *21.44 KB* 20.00% 34028236692093846346337460743176821145
9.9.9.2 datacenter1 rack1 Up Normal 14.99 GB 20.00% 68056473384187692692674921486353642290
9.9.9.3 datacenter1 rack1 Up Normal *50.79 KB* 20.00% 102084710076281539039012382229530463435
9.9.9.4 datacenter1 rack1 Up Normal 15.22 GB 20.00% 136112946768375385385349842972707284580

Thanks in advance. Regards, Safdar

-- http://twitter.com/tjake
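On the MD5 question raised in this thread: keys that share a common prefix (such as http:// or a domain name) still hash uniformly under MD5, which a quick sketch can demonstrate. This is illustrative only — it approximates RandomPartitioner's token derivation rather than reusing Cassandra's code, and example.com stands in for real crawled URLs:

```python
import hashlib

TOKEN_SPACE = 2**127  # RandomPartitioner's token range is [0, 2**127)

def token(key: bytes) -> int:
    # Approximation of RandomPartitioner: token derived from MD5 of the key.
    return int.from_bytes(hashlib.md5(key).digest(), "big") % TOKEN_SPACE

# Bucket 10,000 URLs sharing the "http://" prefix and one domain into the
# 5 equal token ranges of the cluster above.
counts = [0] * 5
for i in range(10000):
    url = f"http://example.com/page/{i}".encode()
    counts[token(url) * 5 // TOKEN_SPACE] += 1
print(counts)  # each bucket lands near 2000 despite the shared prefix
```

So common URL prefixes do not explain the skew — which is consistent with Jake's answer that the imbalance comes from Solandra's sharding (solandra.shards.at.once) rather than from MD5 key hashing.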
Re: Limited row cache size
I was using the datastax build. Do they also have a 1.1 build?

On Mon, Jun 18, 2012 at 9:05 AM, aaron morton aa...@thelastpickle.com wrote: cassandra 1.1.1 ships with concurrentlinkedhashmap-lru-1.3.jar. row_cache_size_in_mb starts life as an int but the byte size is stored as a long: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CacheService.java#L143 Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com

On 15/06/2012, at 7:13 PM, Noble Paul നോബിള് नोब्ळ् wrote: hi, I configured my server with row_cache_size_in_mb: 1920. When I started the server and checked JMX, it showed the capacity was set to 1024 MB. I investigated further and found that the version of concurrentlinkedhashmap used is 1.2, which caps the capacity at 1 GB. So in cassandra 1.1 the max cache size I can use is 1 GB. Digging deeper, I realized that throughout the API chain the cache size is passed around as an int, so even if I write my own CacheProvider the max size would be Integer.MAX_VALUE = 2 GB. Unless cassandra upgrades concurrentlinkedhashmap to 1.3 and changes the signature to use a long for the size, we can't have a big cache. In my opinion 1 GB is a really small size, so even if I have bigger machines I can't really use them. -- - Noble Paul -- - Noble Paul
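The arithmetic behind the limits discussed above, under the assumption (taken from this thread, not verified against the library source) that concurrentlinkedhashmap 1.2 caps capacity at 1 GiB:

```python
MB = 1024 * 1024
requested = 1920 * MB    # row_cache_size_in_mb: 1920, in bytes
clhm_cap = 1 << 30       # 1 GiB cap reported for concurrentlinkedhashmap 1.2
int_max = 2**31 - 1      # ceiling anywhere the size travels as a Java int

print(requested)                       # 2013265920 bytes requested
print(min(requested, clhm_cap) // MB)  # 1024 -> the capacity JMX reported
print(int_max // MB)                   # 2047 -> ~2 GB hard ceiling with int sizes
```

This matches both observations in the thread: the configured 1920 MB is clamped to 1024 MB by the library cap, and even a custom CacheProvider could not exceed roughly 2 GB while sizes are passed as int.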
Re: Limited row cache size
sorry I meant 1.1.1 build

On Mon, Jun 25, 2012 at 10:40 AM, Noble Paul നോബിള് नोब्ळ् noble.p...@gmail.com wrote: I was using the datastax build. Do they also have a 1.1 build?

-- - Noble Paul
Re: Weird behavior in Cassandra 1.1.0 - throwing unconfigured CF exceptions when the CF is present
Yes, it seems it was an error on our side. Sorry for the noise.

On Sun, Jun 24, 2012 at 11:38 PM, aaron morton aa...@thelastpickle.com wrote: I would check if the schemas have diverged; run describe cluster in the cli. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com

On 22/06/2012, at 12:22 AM, Tharindu Mathew wrote: Hi, I'm having issues with Hector 1.1.0 and Cassandra 1.1.0. I'm adding a column family dynamically, and after sleeping for some time and making sure that the column family is created using keyspacedefinition.getCFs, I still get unconfigured column family exceptions. Even after some time, if I try to insert data I still get unconfigured CF exceptions. Below at [1], I have inserted logs to specifically print all the CFs before inserting data. The CF is present in the list, but the insert still fails. Note that this does not happen for all data; some data does get inserted. I'm baffled as to what could be the reason. Any help would be really appreciated. [1] - [2012-06-21 17:22:21,680] INFO {org.wso2.carbon.eventbridge.streamdefn.cassandra.datastore.CassandraConnector} - Keyspace desc. 
: ThriftKsDef[name=EVENT_KS,strategyClass=org.apache.cassandra.locator.SimpleStrategy,strategyOptions={replication_factor=1},cfDefs=[ThriftCfDef[keyspace=EVENT_KS,name=org_wso2_bam_kp,columnType=STANDARD,comparatorType=me.prettyprint.hector.api.ddl.ComparatorType@c89abe1 ,subComparatorType=null,comparatorTypeAlias=,subComparatorTypeAlias=,comment=,rowCacheSize=0.0,rowCacheSavePeriodInSeconds=0,keyCacheSize=0.0,readRepairChance=1.0,columnMetadata=[],gcGraceSeconds=864000,keyValidationClass=org.apache.cassandra.db.marshal.BytesType,defaultValidationClass=org.apache.cassandra.db.marshal.BytesType,id=1004,maxCompactionThreshold=32,minCompactionThreshold=4,memtableOperationsInMillions=0.0,memtableThroughputInMb=0,memtableFlushAfterMins=0,keyCacheSavePeriodInSeconds=0,replicateOnWrite=true,compactionStrategy=org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy,compactionStrategyOptions={},compressionOptions={sstable_compression=org.apache.cassandra.io.compress.SnappyCompressor},mergeShardsChance=0.0,rowCacheProvider=null,keyAlias=null,rowCacheKeysToSave=0]],durableWrites=true] *[2012-06-21 17:22:21,681] INFO {org.wso2.carbon.eventbridge.streamdefn.cassandra.datastore.CassandraConnector} - CFs present * *cf name : org_wso2_bam_kp* [2012-06-21 17:22:21,683] ERROR {org.wso2.carbon.eventbridge.streamdefn.cassandra.subscriber.BAMEventSubscriber} - Error processing event. 
Event{streamId='org.wso2.bam.kp-1.0.5-6b80ca6c-1ad9-4495-a872-8466c424c5d0', timeStamp=1340279541606, metaData=[external], metaData=null, payloadData=[Orange, 1.0, 520.0, Ivan]}
me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:unconfigured columnfamily *org_wso2_bam_kp*)
at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:45)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:264)
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:97)
at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
at org.wso2.carbon.eventbridge.streamdefn.cassandra.datastore.CassandraConnector.insertEvent(CassandraConnector.java:361)
at org.wso2.carbon.eventbridge.streamdefn.cassandra.subscriber.BAMEventSubscriber.receive(BAMEventSubscriber.java:42)
at org.wso2.carbon.eventbridge.core.internal.queue.QueueWorker.run(QueueWorker.java:64)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)
Caused by: InvalidRequestException(why:unconfigured columnfamily org_wso2_bam_kp)
at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20169)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:913)
at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:899)
at me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:246)
at me.prettyprint.cassandra.model.MutatorImpl$3.execute(MutatorImpl.java:243)
at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:103)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:258)
... 11 more

-- Regards, Tharindu blog: http://mackiemathew.com/

-- Regards, Tharindu blog: http://mackiemathew.com/