Re: Registration open for Community Over Code North America
unsubscribe On Mon, Aug 28, 2023 at 12:54 PM Rich Bowen wrote: > Hello! Registration is still open for the upcoming Community Over Code > NA event in Halifax, NS! We invite you to register for the event > https://communityovercode.org/registration/ > > Apache Committers, note that you have a special discounted rate for the > conference at US$250. To take advantage of this rate, use the special > code sent to the committers@ list by Brian Proffitt earlier this month. > > If you are in need of an invitation letter, please consult the > information at https://communityovercode.org/visa-letter/ > > Please see https://communityovercode.org/ for more information about > the event, including how to make reservations for discounted hotel > rooms in Halifax. Discounted rates will only be available until Sept. > 5, so reserve soon! > > --Rich, for the event planning team > -- Cheers, -Arya
Re: High GC activity on node with 4TB on data
Sorry to jump on this late. GC is one of my favorite topics. A while ago I wrote a blog post about C* GC tuning and documented several issues that I had experienced. It seems it has helped some people in the past, so I am sharing it here: http://aryanet.com/blog/cassandra-garbage-collector-tuning On Thu, Feb 12, 2015 at 11:08 AM, Jiri Horky wrote: > Number of cores: 2x6Cores x 2(HT). > > I do agree with you that the hardware is certainly overestimated for > just one Cassandra, but we got a very good price since we ordered several > 10s of the same nodes for a different project. That's why we use it for > multiple cassandra instances. > > Jirka H. > > > On 02/12/2015 04:18 PM, Eric Stevens wrote: > > > each node has 256G of memory, 24x1T drives, 2x Xeon CPU > > I don't have first hand experience running Cassandra on such massive > hardware, but it strikes me that these machines are dramatically oversized > to be good candidates for Cassandra (though I wonder how many cores are in > those CPUs; I'm guessing closer to 18 than 2 based on the other hardware). > > A larger cluster of smaller hardware would be a much better shape for > Cassandra. Or several clusters of smaller hardware since you're running > multiple instances on this hardware - best practices have one instance per > host no matter the hardware size. > > On Thu, Feb 12, 2015 at 12:36 AM, Jiri Horky wrote: > >> Hi Chris, >> >> On 02/09/2015 04:22 PM, Chris Lohfink wrote: >> >> - number of tombstones - how can I reliably find it out? >> https://github.com/spotify/cassandra-opstools >> https://github.com/cloudian/support-tools >> >> thanks. >> >> >> If not getting much compression it may be worth trying to disable it; >> it may contribute, but it's very unlikely that it's the cause of the GC >> pressure itself. >> >> 7000 sstables but STCS? Sounds like compactions couldn't keep up. Do >> you have a lot of pending compactions (nodetool)? 
You may want to increase >> your compaction throughput (nodetool) to see if you can catch up a little; >> it would cause a lot of heap overhead to do reads with that many. May even >> need to take more drastic measures if it can't catch back up. >> >> I am sorry, I was wrong. We actually do use LCS (the switch was done >> recently). There are almost no pending compactions. We have increased the >> sstable size to 768M, so it should help as well. >> >> >> May also be good to check `nodetool cfstats` for very wide partitions. >> >> >> There are basically none, this is fine. >> >> It seems that the problem really comes from having so much data in so >> many sstables, so >> org.apache.cassandra.io.compress.CompressedRandomAccessReader classes >> consume more memory than 0.75*HEAP_SIZE, which triggers the CMS over and >> over. >> >> We have turned off the compression and so far, the situation seems to be >> fine. >> >> Cheers >> Jirka H. >> >> >> >> There's a good chance that if you are under load and have over an 8GB heap, your GCs >> could use tuning. The bigger the nodes, the more manual tweaking it will >> require to get the most out of them. >> https://issues.apache.org/jira/browse/CASSANDRA-8150 also has some >> ideas. >> >> Chris >> >> On Mon, Feb 9, 2015 at 2:00 AM, Jiri Horky wrote: >> >>> Hi all, >>> >>> thank you all for the info. >>> >>> To answer the questions: >>> - we have 2 DCs with 5 nodes in each, each node has 256G of memory, >>> 24x1T drives, 2x Xeon CPU - there are multiple cassandra instances running >>> for different projects. The node itself is powerful enough. 
>>> - there are 2 keyspaces, one with 3 replicas per DC, one with 1 replica per >>> DC (because of the amount of data and because it serves more or less like a >>> cache) >>> - there are about 4k/s Request-response, 3k/s Read and 2k/s Mutation >>> requests - numbers are the sum over all nodes >>> - we use STCS (LCS would be quite IO heavy for this amount of data) >>> - number of tombstones - how can I reliably find it out? >>> - the biggest CF (3.6T per node) has 7000 sstables >>> >>> Now, I understand that the best practice for Cassandra is to run "with >>> the minimum size of heap which is enough", which for this case we thought is >>> about 12G - there is always 8G consumed by the SSTable readers. Also, I >>> thought that a high number of tombstones creates pressure in the new space >>> (which can then cause pressure in the old space as well), but this is not what >>> we are seeing. We see continuous GC activity in the Old generation only. >>> >>> Also, I noticed that the biggest CF has a Compression factor of 0.99, which >>> basically means that the data come compressed already. Do you think that >>> turning off the compression should help with memory consumption? >>> >>> Also, I think that tuning CMSInitiatingOccupancyFraction=75 might help >>> here, as it seems that 8G is something that Cassandra needs for bookkeeping >>> this amount of data and that this was slightly above the 75% limit which >>> triggered the CMS again and again. >>> >>> I wi
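The CMSInitiatingOccupancyFraction reasoning in the thread can be made concrete with a little arithmetic. This is an illustrative Python sketch using the numbers quoted above (12G heap, ~8G held by SSTable readers, the 75% CMS trigger under discussion); it only shows why a large steady live set leaves CMS almost no headroom:

```python
def cms_trigger_gb(heap_gb, occupancy_fraction=75):
    """Heap occupancy (in GB) at which CMS kicks off a concurrent cycle."""
    return heap_gb * occupancy_fraction / 100

# Numbers from the thread: a 12 GB heap with ~8 GB permanently held
# by the CompressedRandomAccessReader instances.
heap = 12.0
steady_live_set = 8.0
trigger = cms_trigger_gb(heap)        # CMS starts at 9.0 GB occupancy
headroom = trigger - steady_live_set  # only 1.0 GB of churn before CMS fires again
print(f"CMS triggers at {trigger} GB; headroom above the live set is {headroom} GB")
```

With only ~1 GB of slack, normal allocation churn pushes occupancy over the trigger almost immediately, which matches the "CMS over and over" behavior described in the thread.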
Re: Tombstones
Nodetool cleanup deletes rows that aren't owned by specific tokens (i.e. that shouldn't be on this node), and nodetool repair makes sure data is in sync between all replicas. It is wrong to say either of these commands cleans up tombstones. Tombstones are cleaned up during compactions, and only if they have expired past gc_grace_seconds. Now, it is also incorrect to say that compaction always cleans up tombstones. In fact, there are situations that can lead to tombstones living for a long time. SSTables are immutable, so if the SSTables that hold tombstones aren't part of a compaction, the tombstones don't get cleaned up, so the behavior you are expecting is not 100% predictable. In the case of LCS, if SSTables are promoted to another level, compaction happens and expired tombstones will be cleaned up. Unlike SizeTiered, in LCS there is no easy way to force compaction on SSTables. One hack I have tried in the past was to stop the node, delete the .json file that holds the level manifests, and start the node again; LCS will compact all of them again to figure out the levels. Another way: if you pick smaller SSTable sizes, you may have more compaction churn, but again it is not 100% guaranteed that the tombstones you want will be cleaned up. On Fri, May 16, 2014 at 9:06 AM, Omar Shibli wrote: > Yes, but still you need to run 'nodetool cleanup' from time to time to > make sure all tombstones are deleted. > > > On Fri, May 16, 2014 at 10:11 AM, Dimetrio wrote: > >> Does cassandra delete tombstones during simple LCS compaction or should I >> use >> nodetool repair? >> >> Thanks. >> >> >> >> -- >> View this message in context: >> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Tombstones-tp7594467.html >> Sent from the cassandra-u...@incubator.apache.org mailing list archive >> at Nabble.com. >> > > -- Cheers, -Arya
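The gc_grace_seconds rule described above reduces to a simple age check. A minimal Python sketch; the 10-day value is Cassandra's shipped default for gc_grace_seconds, but your table may set something else, so treat it as an assumption:

```python
import time

GC_GRACE_SECONDS = 864_000  # 10 days: Cassandra's default; check your table's schema

def tombstone_droppable(deletion_ts, now=None, gc_grace=GC_GRACE_SECONDS):
    """A tombstone may be purged by compaction only once it is older than gc_grace.

    Even then, per the thread above, it is actually purged only if the
    SSTable holding it participates in a compaction.
    """
    now = time.time() if now is None else now
    return now - deletion_ts > gc_grace

day = 86_400
# A tombstone written 11 days ago is purgeable; one from yesterday is not.
print(tombstone_droppable(deletion_ts=0, now=11 * day))  # True
print(tombstone_droppable(deletion_ts=0, now=1 * day))   # False
```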
Re: GC eden filled instantly (any size). Dropping messages.
Dimetrio, Look at my last post. I showed you how to turn on all the useful GC logging flags. From there we can get information on why GC has long pauses. From the changes you have made, it seems you are changing things without knowing the effect. Here are a few things to consider:
- Having a 9GB NewGen out of a 16GB heap is one recipe for disaster. I am sure if you turn on GC logs, you will see lots of promotion failures. The standard is for NewGen to be at most 1/4th of your heap, to allow for healthy GC promotions;
- The jstat output suggests that the survivor spaces aren't utilized. This is one sign of premature promotion. Consider increasing MaxTenuringThreshold to a value higher than what it is. The higher it is, the slower things get promoted out of Eden; but we should really examine your GC logs before making this part of the resolution;
- If you are going with a 16GB heap, then reduce your NewGen to 1/4th of it;
- It seems you have lowered compaction so much that SSTables aren't compacting fast enough; tpstats should tell you something about this if my assumption is true.
I also agree with Jonathan about the data model and access pattern issues. It seems your queries are creating long rows with lots of tombstones. If you are deleting lots of columns from a single row while writing more to it, and then fetch lots of columns, you end up having to read a large row, causing it to stay in the heap while being processed and causing long GCs. The GC histograms inside the GC logs (after you enable them) should tell you what is in the heap: either columns from slice queries or columns from compaction (these two are usually the cases, based on my experience of tuning GC pauses). 
Hope this helps. On Mon, Jan 27, 2014 at 4:07 AM, Dimetrio wrote: > None of the advice helped me reduce GC load. > > I tried these: > > MAX_HEAP_SIZE from default (8GB) to 16G with HEAP_NEWSIZE from 400M to 9600M > key cache on/off > compacting memory size and other limits > 15 c3.4xlarge nodes (adding 5 nodes to a 10 node cluster didn't help) > and many others > > Reads ~5000 ops/s > Writes ~5000 ops/s > max batch is 50 > heavy reads and heavy writes (and heavy deletes) > sometimes i have messages: > Read 1001 live and 2691 > Read 12 live and 2796 > >
> sudo jstat -gcutil -h15 `sudo cat /var/run/cassandra/cassandra.pid` 250ms 0
>   S0     S1      E      O      P    YGC    YGCT  FGC    FGCT     GCT
> 18.93   0.00   4.52  75.36  59.77   225  30.119   18  28.361  58.480
>  0.00  13.12   3.78  81.09  59.77   226  30.193   18  28.617  58.810
>  0.00  13.12  39.50  81.09  59.78   226  30.193   18  28.617  58.810
>  0.00  13.12  80.70  81.09  59.78   226  30.193   18  28.617  58.810
> 17.21   9.13   0.66  87.38  59.78   228  30.235   18  28.617  58.852
>  0.00  10.96  29.43  87.89  59.78   228  30.328   18  28.617  58.945
>  0.00  10.96  62.67  87.89  59.78   228  30.328   18  28.617  58.945
>  0.00  10.96  96.62  87.89  59.78   228  30.328   18  28.617  58.945
>  0.00  10.69  10.29  94.56  59.78   230  30.462   18  28.617  59.078
>  0.00  10.69  38.08  94.56  59.78   230  30.462   18  28.617  59.078
>  0.00  10.69  71.70  94.56  59.78   230  30.462   18  28.617  59.078
> 15.91   6.24   0.03  99.96  59.78   232  30.506   18  28.617  59.123
> 15.91   8.02   0.03  99.96  59.78   232  30.506   18  28.617  59.123
> 15.91   8.02   0.03  99.96  59.78   232  30.506   18  28.617  59.123
> 15.91   8.02   0.03  99.96  59.78   232  30.506   18  28.617  59.123
>   S0     S1      E      O      P    YGC    YGCT  FGC    FGCT     GCT
> 15.91   8.02   0.03  99.96  59.78   232  30.506   18  28.617  59.123
> 15.91   8.02   0.03  99.96  59.78   232  30.506   18  28.617  59.123
> 15.91   8.02   0.03  99.96  59.78   232  30.506   18  28.617  59.123
> 15.91   8.02   0.03  99.96  59.78   232  30.506   18  28.617  59.123
> 15.91   8.02   0.03  99.96  59.78   232  30.506   18  28.617  59.123
> 15.91   8.02   0.03  99.96  59.78   232  30.506   18  28.617  59.123
> 15.91   8.02   0.03  99.96  59.78   232  30.506   18  28.617  59.123
> 15.91   8.02   0.03  99.96  59.78   232  30.506   18  28.617  59.123
> 15.91   8.02   0.03  99.96  59.78   232  30.506   18  28.617  59.123
> 15.91   8.02   0.03  99.96  59.78   232  30.506   18  28.617  59.123
> 15.91   8.02   0.03  99.96  59.78   232  30.506   18  28.617  59.123
> 15.91   8.02   0.03  99.96  59.78   232  30.506   18  28.617  59.123
> 15.91   8.02   0.03  99.96  59.78   232  30.506   18  28.617  59.123
> 15.91   8.02   0.03  99.96  59.78   232  30.506   18  28.617  59.123
> >
> $ nodetool cfhistograms Social home_timeline
> Social/home_timeline histograms
> Offset    SSTables  Write Latency  Read Latency  Partition Size  Cell Count
>                     (micros)       (micros)      (bytes)
> 1         10458     0
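The jstat -gcutil rows in the paste above can be checked programmatically once split back into columns (the sample row below is a hand-separated version of one of them, which involves an assumption about how the fused columns originally lined up). A sketch that flags the pattern visible here, an old generation pinned near 100%:

```python
def parse_gcutil(row):
    """Parse one data row of `jstat -gcutil` into named columns."""
    cols = ["S0", "S1", "E", "O", "P", "YGC", "YGCT", "FGC", "FGCT", "GCT"]
    return dict(zip(cols, (float(v) for v in row.split())))

def old_gen_pinned(sample, threshold=99.0):
    """Old gen stuck near 100%: CMS has nothing left to reclaim, so it thrashes."""
    return sample["O"] >= threshold

# Hand-de-fused version of one of the pasted rows (alignment is an assumption):
row = "15.91 8.02 0.03 99.96 59.78 232 30.506 18 28.617 59.123"
sample = parse_gcutil(row)
print(old_gen_pinned(sample))  # True: old gen at 99.96% while Eden sits empty
```

The diagnostic signal is O frozen at 99.96 while E stays near zero across many 250ms samples: new allocations have nowhere to be promoted, which is exactly the dropped-messages symptom in the subject line.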
Re: Cassandra mad GC
Ha! I missed the log line: WARN [ReadStage:70] 2014-01-14 13:03:36,963 SliceQueryFilter.java (line 209) Read 1001 live and 1518 tombstoned cells (see tombstone_warn_threshold) Seems some application code is trying to read a wide row with lots of tombstones. On Thu, Jan 16, 2014 at 8:28 PM, Aaron Morton wrote: > c3.4xlarge > > long par new on a machine like this is not normal. > > Do you have a custom comparator or are you using triggers ? > Do you have a data model that creates a lot of tombstones ? > > Try to return the settings to default and then tune from there, that > includes returning to the default JVM GC settings. If for no other reason > than other people will be able to offer advice. > > Have you changed the compaction_throughput ? Put it back if you have. > If you have enabled multi_threaded compaction disable it. > Consider setting concurrent_compactors to 4 or 8 to reduce compaction > churn. > If you have increased in_memory_compaction_limit put it back. > > Cassandra logs > > Can you provide some of the log messages from GCInspector ? How long are > the pauses ? Is there a lot of CMS or ParNew ? > Do you have monitoring in place ? Is CMS able to return the heap to a low > value e.g. < 3Gb ? > > cpu load > 1000% > > Is this all from cassandra ? > try jvmtop (https://code.google.com/p/jvmtop/) to see what cassandra > threads are doing. > > It’s a lot easier to tune a system with fewer non default settings. > > Cheers > > - > Aaron Morton > New Zealand > @aaronmorton > > Co-Founder & Principal Consultant > Apache Cassandra Consulting > http://www.thelastpickle.com > > On 16/01/2014, at 8:22 am, Arya Goudarzi wrote: > > It is not a good idea to change settings without identifying the root > cause. Chances are what you did masked the problem a bit for you, but the > problem is still there, isn't it? > > > On Wed, Jan 15, 2014 at 1:11 AM, Dimetrio wrote: > >> I set G1 because GC started to work wrong (dropped messages) with standard >> GC >> settings. 
>> In my opinion, Cassandra started to work more stably with G1 (it's getting >> fewer timeouts now) but it's not ideal yet. >> I just want cassandra to work fine. >> >> >> >> -- >> View this message in context: >> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cassandra-mad-GC-tp7592248p7592257.html >> Sent from the cassandra-u...@incubator.apache.org mailing list archive >> at Nabble.com. >> > > > > -- > Cheers, > -Arya > > > -- Cheers, -Arya
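The SliceQueryFilter warning quoted earlier in this thread ("Read 1001 live and 1518 tombstoned cells") is easy to extract from logs for triage. An illustrative Python sketch that pulls the live/tombstone counts out of such lines:

```python
import re

# Matches the tombstone warning exactly as it appears in the thread.
WARN_RE = re.compile(r"Read (\d+) live and (\d+) tombstoned cells")

def tombstone_ratio(log_line):
    """Return (live, tombstoned) cell counts, or None if the line doesn't match."""
    m = WARN_RE.search(log_line)
    if not m:
        return None
    return int(m.group(1)), int(m.group(2))

line = ("WARN [ReadStage:70] 2014-01-14 13:03:36,963 SliceQueryFilter.java "
        "(line 209) Read 1001 live and 1518 tombstoned cells "
        "(see tombstone_warn_threshold)")
live, dead = tombstone_ratio(line)
print(f"{dead} tombstoned cells scanned for {live} live cells")
```

A ratio well above 1 tombstone per live cell, as here, points at the delete-heavy wide-row access pattern discussed in the thread rather than at GC settings.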
Re: Upgrading 1.0.9 to 2.0
Read the upgrade best practices: http://www.datastax.com/docs/1.1/install/upgrading#best-practices You cannot change the partitioner: http://www.datastax.com/documentation/cassandra/1.2/webhelp/cassandra/architecture/architecturePartitionerAbout_c.html On Thu, Jan 16, 2014 at 2:04 AM, Or Sher wrote: > Hi, > > In order to upgrade our env from 1.0.9 to 2.0 I thought about the > following steps: > > - Creating a new 1.0.9 cluster > - Creating the keyspaces and column families > (I need to move one keyspace data to the new cluster and so:) > - Moving all xKS SSTables from old cluster to every node in the new cluster > - compact & cleanup > - upgrading to 1.2.13 (all at once) > -- upgrade sstables? > - upgrading to 2.0 (all at once) > > 1. I'd like to use new features such as Murmur3 Partitioner and Vnodes - > How can I accomplish that? > > 2. Are there any other features that would be "hard to enable"? > > 3. What am I missing in the process? > > Thanks in advance, > -- > Or Sher > -- Cheers, -Arya
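The constraint behind this advice, major upgrades must pass through intermediate release lines rather than jump straight to the target, can be encoded as a small sanity check. Everything below is illustrative: the list of required lines is an assumption based on the 1.0.9 → 1.2.13 → 2.0 plan in this thread, and NEWS.txt for each release is the real authority:

```python
# Hypothetical ordered list of major release lines an upgrade must pass through
# (assumption from the thread's plan; consult NEWS.txt for your target release).
MAJOR_LINES = ["1.0", "1.1", "1.2", "2.0"]

def upgrade_steps(current, target):
    """List the intermediate major lines to step through, in order."""
    i, j = MAJOR_LINES.index(current), MAJOR_LINES.index(target)
    if j <= i:
        raise ValueError("target must be newer than current")
    return MAJOR_LINES[i + 1 : j + 1]

print(upgrade_steps("1.0", "2.0"))  # ['1.1', '1.2', '2.0']
```

Note that the partitioner check above is separate: no sequence of upgrades lets an existing cluster switch from RandomPartitioner to Murmur3Partitioner; that requires loading data into a new cluster.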
Re: Cassandra mad GC
It is not a good idea to change settings without identifying the root cause. Chances are what you did masked the problem a bit for you, but the problem is still there, isn't it? On Wed, Jan 15, 2014 at 1:11 AM, Dimetrio wrote: > I set G1 because GC started to work wrong (dropped messages) with standard > GC > settings. > In my opinion, Cassandra started to work more stably with G1 (it's getting > fewer timeouts now) but it's not ideal yet. > I just want cassandra to work fine. > > > > -- > View this message in context: > http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cassandra-mad-GC-tp7592248p7592257.html > Sent from the cassandra-u...@incubator.apache.org mailing list archive at > Nabble.com. > -- Cheers, -Arya
Re: Cassandra mad GC
Hi, I sympathize with your issue. I recommend adding the following to your JVM flags:
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
JVM_OPTS="$JVM_OPTS -XX:+PrintHeapAtGC"
JVM_OPTS="$JVM_OPTS -XX:+PrintTenuringDistribution"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"
JVM_OPTS="$JVM_OPTS -XX:+PrintPromotionFailure"
JVM_OPTS="$JVM_OPTS -XX:PrintFLSStatistics=1"
JVM_OPTS="$JVM_OPTS -XX:+PrintSafepointStatistics"
JVM_OPTS="$JVM_OPTS -XX:+PrintClassHistogramBeforeFullGC"
JVM_OPTS="$JVM_OPTS -XX:+PrintClassHistogramAfterFullGC"
Provide the compaction settings from your schema and cassandra.yaml. Also try rolling back the JVM config to the defaults shipped with C*. I recall G1 was not recommended for Cassandra. Change SurvivorRatio back to 8 and remove NewRatio. Next time you get long GCs, try to find the segment of gc.log that relates to the long pause. You can do that by grepping the log for the keyword "stopped". Paste anything above it. From there, I can help you explore a few things:
1. Compaction pressure;
2. Potentially reading a fat row (> 10MB);
3. Premature tenuring of objects in the Java heap.
Cheers, -Arya On Tue, Jan 14, 2014 at 5:39 AM, Dimetrio wrote: > iostat is clean > vm.max_map_count = 131072 > > > > > -- > View this message in context: > http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cassandra-mad-GC-tp7592248p7592251.html > Sent from the cassandra-u...@incubator.apache.org mailing list archive at > Nabble.com. > -- Cheers, -Arya
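The "grep for stopped" step refers to the lines that -XX:+PrintGCApplicationStoppedTime emits, which read "Total time for which application threads were stopped: N seconds". A sketch that pulls out the long pauses; the 1-second threshold is an arbitrary illustrative choice:

```python
import re

STOPPED_RE = re.compile(
    r"Total time for which application threads were stopped: ([\d.]+) seconds")

def long_pauses(log_lines, threshold=1.0):
    """Yield stop-the-world pause durations (seconds) above `threshold`."""
    for line in log_lines:
        m = STOPPED_RE.search(line)
        if m and float(m.group(1)) > threshold:
            yield float(m.group(1))

# Two synthetic gc.log lines in the PrintGCApplicationStoppedTime format:
gc_log = [
    "Total time for which application threads were stopped: 0.0042310 seconds",
    "Total time for which application threads were stopped: 8.1273550 seconds",
]
print(list(long_pauses(gc_log)))  # [8.127355]
```

Once a long pause is located this way, the log lines immediately above it (the heap-at-GC dump, tenuring distribution, and class histogram enabled by the flags above) are what identify which of the three suspects is responsible.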
Re: Cassandra + Hadoop - 2 Task attempts with million of rows
We haven't tried using Pig. However, we had a problem where our mapreduce job blew up for a subset of data. It appeared that we had a bug in our code that had generated a row as big as 3GB. It was actually causing long GC pauses and would cause GC thrashing. The Hadoop job of course would time out. Our range batch sizes are 32 and we had wide rows enabled. Your scenario seems similar. Try using nodetool cfstats to see if the column family involved with your job has a max row size which is very large. Also inspect your C* logs looking for long GC pause log lines from GCInspector. You can also refer to heap usage trends if you have them in your monitoring tools. On Thu, Apr 25, 2013 at 7:03 PM, aaron morton wrote: > 2013-04-23 16:09:17,838 INFO > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader: > Current split being processed ColumnFamilySplit((9197470410121435301, '-1] > @[p00nosql02.00, p00nosql01.00]) > Why is it splitting data across two nodes? We have a 6 node cassandra cluster + > hadoop slaves - every task should get a local input split from local > cassandra - am i right? > > My understanding is that it may get it locally, but it's not something > that has to happen. One of the Hadoop guys will have a better idea. 
> > Try reducing the cassandra.range.batch.size and/or if you are using wide > rows enable cassandra.input.widerows > > Cheers > > - > Aaron Morton > Freelance Cassandra Consultant > New Zealand > > @aaronmorton > http://www.thelastpickle.com > > On 25/04/2013, at 7:55 PM, Shamim wrote: > > Hello Aaron, > I have got the following Log from the server (Sorry for being late) > > job_201304231203_0004 > attempt_201304231203_0004_m_000501_0 > > 2013-04-23 16:09:14,196 INFO org.apache.hadoop.util.NativeCodeLoader: > Loaded the native-hadoop library > 2013-04-23 16:09:14,438 INFO > org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating > symlink: > /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/jars/pigContext > <- > /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/attempt_201304231203_0004_m_000501_0/work/pigContext > 2013-04-23 16:09:14,453 INFO > org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating > symlink: > /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/jars/dk > <- > /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/attempt_201304231203_0004_m_000501_0/work/dk > 2013-04-23 16:09:14,456 INFO > org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating > symlink: > /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/jars/META-INF > <- > /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/attempt_201304231203_0004_m_000501_0/work/META-INF > 2013-04-23 16:09:14,459 INFO > org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating > symlink: > /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/jars/org > <- > /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/attempt_201304231203_0004_m_000501_0/work/org > 2013-04-23 16:09:14,469 INFO > 
org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating > symlink: > /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/jars/com > <- > /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/attempt_201304231203_0004_m_000501_0/work/com > 2013-04-23 16:09:14,471 INFO > org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating > symlink: > /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/jars/.job.jar.crc > <- > /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/attempt_201304231203_0004_m_000501_0/work/.job.jar.crc > 2013-04-23 16:09:14,474 INFO > org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating > symlink: > /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/jars/job.jar > <- > /egov/data/hadoop/mapred/local/taskTracker/cassandra/jobcache/job_201304231203_0004/attempt_201304231203_0004_m_000501_0/work/job.jar > 2013-04-23 16:09:17,329 INFO org.apache.hadoop.util.ProcessTree: setsid > exited with exit code 0 > 2013-04-23 16:09:17,387 INFO org.apache.hadoop.mapred.Task: Using > ResourceCalculatorPlugin : > org.apache.hadoop.util.LinuxResourceCalculatorPlugin@256ef705 > 2013-04-23 16:09:17,838 INFO > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader: > Current split being processed ColumnFamilySplit((9197470410121435301, '-1] > @[p00nosql02.00, p00nosql01.00]) > 2013-04-23 16:09:18,088 INFO org.apache.pig.data.SchemaTupleBackend: Key > [pig.schematuple] was not set... will not generate code. > 2013-04-23 16:09:19,784 INFO > org.apache.pig.backend.hadoop.executionengine.mapRedu
Re: Repair Freeze / Gossip Invisibility / EC2 Public IP configuration
We don't use default ports. Woops! Now I advertised mine. I did try disabling internode compression for all in cassandra.yaml but still it did not work. I have to open the insecure storage port to public ips. On Tue, Apr 16, 2013 at 4:59 PM, Edward Capriolo wrote: > So cassandra does inter node compression. I have not checked but this > might be accidentally getting turned on by default. Because the storage > port is typically 7000. Not sure why you are allowing 7100. In any case try > allowing 7000 or with internode compression off. > > > On Tue, Apr 16, 2013 at 6:42 PM, Arya Goudarzi wrote: > >> TL;DR; An EC2 Multi-Region Setup's Repair/Gossip Works with 1.1.10 but >> with 1.2.4, gossip does not see the nodes after restarting all nodes at >> once, and repair gets stuck. >> >> This is a working configuration: >> Cassandra 1.1.10 Cluster with 12 nodes in us-east-1 and 12 nodes in >> us-west-2 >> Using Ec2MultiRegionSnitch and SSL enabled for DC_ONLY and >> NetworkTopologyStrategy with strategy_options: us-east-1:3;us-west-2:3; >> C* instances have a security group called 'cluster1' >> security group 'cluster1' in each region is configured as such >> Allow TCP: >> 7199 from cluster1 (JMX) >> 1024 - 65535 from cluster1 (JMX Random Ports - This supersedes all >> specific ports, but I have the specific ports just for clarity ) >> 7100 from cluster1 (Configured Normal Storage) >> 7103 from cluster1 (Configured SSL Storage) >> 9160 from cluster1 (Configured Thrift RPC Port) >> 9160 from >> foreach node's public IP we also have this rule set to enable cross >> region comminication: >> 7103 from public_ip (Open SSL storage) >> >> The above is a functioning and happy setup. You run repair, and it >> finishes successfully. >> >> Broken Setup: >> >> Upgrade to 1.2.4 without changing any of the above security group >> settings: >> >> Run repair. The repair will get stuck. Thus hanging. 
>> >> Now for each public_ip add a security group rule as such to the cluster1 >> security group: >> >> Allow TCP: 7100 from public_ip >> >> Run repair. Things will work now. Also, after restarting all nodes at the >> same time, gossip will see everyone again. >> >> I was told on https://issues.apache.org/jira/browse/CASSANDRA-5432 that >> nothing in terms of networking was changed. If nothing in terms of ports and >> networking was changed in 1.2, then why is the above happening? I can >> consistently reproduce it. >> >> Please advise. >> >> -Arya >> >> >
Repair Freeze / Gossip Invisibility / EC2 Public IP configuration
TL;DR: An EC2 multi-region setup's repair/gossip works with 1.1.10, but with 1.2.4 gossip does not see the nodes after restarting all nodes at once, and repair gets stuck.

This is a working configuration:
Cassandra 1.1.10 cluster with 12 nodes in us-east-1 and 12 nodes in us-west-2
Using Ec2MultiRegionSnitch and SSL enabled for DC_ONLY, and NetworkTopologyStrategy with strategy_options: us-east-1:3;us-west-2:3;
C* instances have a security group called 'cluster1'.
Security group 'cluster1' in each region is configured as such:
Allow TCP:
7199 from cluster1 (JMX)
1024 - 65535 from cluster1 (JMX random ports - this supersedes all specific ports, but I have the specific ports just for clarity)
7100 from cluster1 (configured normal storage)
7103 from cluster1 (configured SSL storage)
9160 from cluster1 (configured Thrift RPC port)
9160 from
For each node's public IP we also have this rule set to enable cross-region communication:
7103 from public_ip (open SSL storage)

The above is a functioning and happy setup. You run repair, and it finishes successfully.

Broken setup:

Upgrade to 1.2.4 without changing any of the above security group settings.

Run repair. The repair will get stuck, thus hanging.

Now for each public_ip add a security group rule as such to the cluster1 security group:

Allow TCP: 7100 from public_ip

Run repair. Things will work now. Also, after restarting all nodes at the same time, gossip will see everyone again.

I was told on https://issues.apache.org/jira/browse/CASSANDRA-5432 that nothing in terms of networking was changed. If nothing in terms of ports and networking was changed in 1.2, then why is the above happening? I can consistently reproduce it.

Please advise.

-Arya
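The fix that eventually worked, opening the plain storage port to each node's public IP as well as the SSL port, amounts to auditing required vs. currently open ports. An illustrative sketch (the port numbers are the non-default ones from this post, not Cassandra's defaults of 7000/7001):

```python
def missing_rules(required_ports, open_ports):
    """Return the required ports (and their descriptions) not yet allowed.

    required_ports: {port: description}; open_ports: set of ports already open.
    """
    return {p: d for p, d in required_ports.items() if p not in open_ports}

# Ports this cluster needs open from each node's public IP. Before the fix,
# only the SSL storage port (7103 in this setup) was allowed cross-region.
required_from_public_ip = {
    7103: "SSL storage",
    7100: "plain storage (also needed after the 1.2.4 upgrade)",
}
currently_open = {7103}
print(missing_rules(required_from_public_ip, currently_open))
# {7100: 'plain storage (also needed after the 1.2.4 upgrade)'}
```

Running such an audit per region before an upgrade would have surfaced the missing 7100-from-public_ip rule without waiting for repair to hang.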
Re: Problem loading Saved Key-Cache on 1.2.3 after upgrading from 1.1.10
Yes, I know blowing them away would fix it, and that is what I did, but I want to understand why this happens in the first place. I was upgrading from 1.1.10 to 1.2.3. On Fri, Apr 5, 2013 at 2:53 PM, Edward Capriolo wrote: > This has happened before; the saved cache files were not compatible between > 0.6 and 0.7. I have run into this a couple of other times before. The good > news is the saved key cache is just an optimization: you can blow it away > and it is not usually a big deal. > > > > > On Fri, Apr 5, 2013 at 2:55 PM, Arya Goudarzi wrote: > >> Here is a chunk of bloom filter sstable skip messages from the node I >> enabled DEBUG on: >> >> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line >> 737) Bloom filter allows skipping sstable 39459 >> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line >> 737) Bloom filter allows skipping sstable 39483 >> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line >> 737) Bloom filter allows skipping sstable 39332 >> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line >> 737) Bloom filter allows skipping sstable 39335 >> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line >> 737) Bloom filter allows skipping sstable 39438 >> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line >> 737) Bloom filter allows skipping sstable 39478 >> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line >> 737) Bloom filter allows skipping sstable 39456 >> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line >> 737) Bloom filter allows skipping sstable 39469 >> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,451 SSTableReader.java (line >> 737) Bloom filter allows skipping sstable 39334 >> DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,451 SSTableReader.java (line >> 737) Bloom filter allows skipping sstable 39406 >> >> This is the last chunk of log before C* gets stuck, right before I stop >> the 
process, remove key caches and start again (this is from another node >> that I upgraded 2 days ago): >> >> INFO [SSTableBatchOpen:2] 2013-04-03 01:59:39,769 SSTableReader.java >> (line 166) Opening >> /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-316499 >> (5273270 bytes) >> INFO [SSTableBatchOpen:1] 2013-04-03 01:59:39,858 SSTableReader.java >> (line 166) Opening >> /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-314755 >> (5264359 bytes) >> INFO [SSTableBatchOpen:2] 2013-04-03 01:59:39,894 SSTableReader.java >> (line 166) Opening >> /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-314762 >> (5260887 bytes) >> INFO [SSTableBatchOpen:1] 2013-04-03 01:59:39,980 SSTableReader.java >> (line 166) Opening >> /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-315886 >> (5262864 bytes) >> INFO [OptionalTasks:1] 2013-04-03 01:59:40,298 AutoSavingCache.java >> (line 112) reading saved cache >> /var/lib/cassandra/saved_caches/cardspring_production-UniqueIndexes-KeyCache >> >> >> I finally upgraded all 12 nodes in our test environment yesterday. This >> issue seemed to exist on 7 of the 12 nodes. They didn't always get >> stuck on the same CF loading its saved KeyCache. >> >> >> On Fri, Apr 5, 2013 at 9:56 AM, aaron morton wrote: >> >>> skipping sstable due to bloom filter debug messages >>> >>> What were these messages? >>> >>> Do you have the logs from the start up ? >>> >>> Cheers >>> >>>- >>> Aaron Morton >>> Freelance Cassandra Consultant >>> New Zealand >>> >>> @aaronmorton >>> http://www.thelastpickle.com >>> >>> On 4/04/2013, at 6:11 AM, Arya Goudarzi wrote: >>> >>> Hi, >>> >>> I have upgraded 2 nodes out of a 12 node test cluster from 1.1.10 to >>> 1.2.3. 
During startup, while tailing C*'s system.log, I observed a series of >>> SSTable batch load messages and "skipping sstable due to bloom filter" debug >>> messages, which is normal for startup, but when it reached loading saved key >>> caches, it got stuck forever. The I/O wait stays high in the CPU graph and >>> I/O ops are sent to disk, but C* never passes that step of loading the key >>> cache file successfully. The saved key cache file was about 75MB on one >>> node and 125MB on the other node, and they were for different CFs. >>> >>> >>> >>> The CPU I/O wait constantly stayed at ~40% while system.log was stuck at >>> loading one saved key cache file. I have marked that on the graph above. >>> The workaround was to delete the saved cache files, and things loaded fine >>> (see marked Normal Startup). >>> >>> These machines are m1.xlarge EC2 instances. And this issue happened on >>> both nodes upgraded. This did not happen during the exercise of upgrading from >>> 1.1.6 to 1.2.2 using the same snapshot. >>> >>> Should I raise a JIRA? >>> >>> -Arya >>> >>> >> >
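The workaround used in this thread, deleting the saved key cache files so Cassandra rebuilds them, can be scripted. A cautious Python sketch: the function and demo are illustrative, the real directory in the thread is /var/lib/cassandra/saved_caches, and it should only be touched with the node stopped:

```python
import tempfile
from pathlib import Path

def purge_saved_key_caches(saved_caches_dir):
    """Delete *KeyCache* files; Cassandra rebuilds the cache, costing only warm-up time."""
    removed = []
    for f in Path(saved_caches_dir).glob("*KeyCache*"):
        f.unlink()
        removed.append(f.name)
    return removed

# Demo against a throwaway directory so nothing real is deleted.
with tempfile.TemporaryDirectory() as d:
    (Path(d) / "ks-cf-KeyCache").touch()
    (Path(d) / "ks-cf-Data.db").touch()  # data files are left alone
    print(purge_saved_key_caches(d))     # ['ks-cf-KeyCache']
```

As Edward notes above, the saved key cache is only an optimization, so losing it is safe; the cost is cold-cache read latency until it repopulates.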
Re: Problem loading Saved Key-Cache on 1.2.3 after upgrading from 1.1.10
Here is a chunk of bloom filter sstable skip messages from the node I enabled DEBUG on: DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39459 DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39483 DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39332 DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39335 DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39438 DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39478 DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39456 DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39469 DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,451 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39334 DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,451 SSTableReader.java (line 737) Bloom filter allows skipping sstable 39406 This is the last chunk of log before C* gets stuck, right before I stop the process, remove key caches and start again (This is from another node that I upgraded 2 days ago): INFO [SSTableBatchOpen:2] 2013-04-03 01:59:39,769 SSTableReader.java (line 166) Opening /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-316499 (5273270 bytes) INFO [SSTableBatchOpen:1] 2013-04-03 01:59:39,858 SSTableReader.java (line 166) Opening /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-314755 (5264359 bytes) INFO [SSTableBatchOpen:2] 2013-04-03 01:59:39,894 SSTableReader.java (line 166) Opening 
/var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-314762 (5260887 bytes) INFO [SSTableBatchOpen:1] 2013-04-03 01:59:39,980 SSTableReader.java (line 166) Opening /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-315886 (5262864 bytes) INFO [OptionalTasks:1] 2013-04-03 01:59:40,298 AutoSavingCache.java (line 112) reading saved cache /var/lib/cassandra/saved_caches/cardspring_production-UniqueIndexes-KeyCache I finally upgraded all 12 nodes in our test environment yesterday. This issue appeared on 7 of the 12 nodes, and they didn't always get stuck on the same CF while loading its saved KeyCache. On Fri, Apr 5, 2013 at 9:56 AM, aaron morton wrote: > skipping sstable due to bloom filter debug messages > > What were these messages? > > Do you have the logs from the start up ? > > Cheers > > - > Aaron Morton > Freelance Cassandra Consultant > New Zealand > > @aaronmorton > http://www.thelastpickle.com > > On 4/04/2013, at 6:11 AM, Arya Goudarzi wrote: > > Hi, > > I have upgraded 2 nodes out of a 12-node test cluster from 1.1.10 to > 1.2.3. During startup, while tailing C*'s system.log, I observed a series of > SSTable batch load messages and "skipping sstable due to bloom filter" debug > messages, which is normal for startup, but when it reached loading saved key > caches, it got stuck forever. The I/O wait stayed high in the CPU graph and > I/O ops were sent to disk, but C* never passed that step of loading the key > cache file successfully. The saved key cache file was about 75MB on one > node and 125MB on the other node, and they were for different CFs. > > > > The CPU I/O wait constantly stayed at ~40% while system.log was stuck at > loading one saved key cache file. I have marked that on the graph above. > The workaround was to delete the saved cache files, after which things loaded fine > (see marked Normal Startup). > > These machines are m1.xlarge EC2 instances.
And this issue happened on > both upgraded nodes. This did not happen during the exercise of upgrading from > 1.1.6 to 1.2.2 using the same snapshot. > > Should I raise a JIRA? > > -Arya > > >
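For anyone hitting the same hang, the workaround described above (delete the saved cache files while the node is down) can be scripted. A minimal sketch, in Python for illustration; the file naming ("<ks>-<cf>-KeyCache", matching the log paths above) and the default saved_caches_directory location are assumptions, so check cassandra.yaml before pointing it at a real node:

```python
# Sketch of the workaround above: remove saved key-cache files so Cassandra
# skips reloading them on the next start. Run only while the node is stopped.
# File naming and directory layout are assumptions based on the logs above.
import tempfile
from pathlib import Path

def clear_key_caches(caches_dir):
    """Delete *KeyCache* files in caches_dir; return the deleted file names."""
    removed = []
    for f in Path(caches_dir).iterdir():
        if f.is_file() and "KeyCache" in f.name:
            f.unlink()
            removed.append(f.name)
    return sorted(removed)

# Dry run against a scratch directory rather than a live node:
demo = Path(tempfile.mkdtemp())
(demo / "cardspring_production-UniqueIndexes-KeyCache").touch()
(demo / "cardspring_production-UniqueIndexes-RowCache").touch()
deleted = clear_key_caches(demo)
print(deleted)  # ['cardspring_production-UniqueIndexes-KeyCache']
```

Only KeyCache files are touched; any saved row caches are left alone, which mirrors the workaround as described in the thread.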
Re: Incompatible Gossip 1.1.6 to 1.2.1 Upgrade?
Aaron, I added -Dcassandra.load_ring_state=false in cassandra-env.sh and did a rolling restart. With one node on 1.2.3 and the 11 other nodes on 1.1.10, the 1.1.10 nodes saw the 1.2.3 node, but gossip on the 1.2.3 node now only sees itself. Cheers, -Arya On Thu, Mar 28, 2013 at 1:02 PM, Arya Goudarzi wrote: > There has been a little misunderstanding. When all nodes are 1.2.2, they > are fine. But during the rolling upgrade, 1.2.2 nodes see 1.1.10 nodes as > down in the nodetool command despite gossip reporting NORMAL. I will give your > suggestion a try and will report back. > > On Sat, Mar 23, 2013 at 10:37 AM, aaron morton wrote: > >> So all nodes are 1.2 and some are still being marked as down ? >> >> I would try a rolling restart with -Dcassandra.load_ring_state=false >> added as a JVM opt in cassandra-env.sh. There is no guarantee it will fix >> it, but it's a simple thing to try. >> >> Cheers >> >> - >> Aaron Morton >> Freelance Cassandra Consultant >> New Zealand >> >> @aaronmorton >> http://www.thelastpickle.com >> >> On 22/03/2013, at 10:30 AM, Arya Goudarzi wrote: >> >> I took Brandon's suggestion in CASSANDRA-5332 and upgraded to 1.1.10 >> before upgrading to 1.2.2, but the issue with nodetool ring reporting >> machines as down did not resolve. >> >> On Fri, Mar 15, 2013 at 6:35 PM, Arya Goudarzi wrote: >> >>> Thank you very much Aaron. I recall the logs of this node upgraded >>> to 1.2.2 reported seeing others as dead. Brandon suggested in >>> https://issues.apache.org/jira/browse/CASSANDRA-5332 that I should at >>> least upgrade from 1.1.7. So, I decided to try upgrading to 1.1.10 first >>> before upgrading to 1.2.2. I am in the middle of troubleshooting some other >>> issues I had with that upgrade (posted separately); once I am done, I will >>> give your suggestion a try. >>> >>> >>> On Mon, Mar 11, 2013 at 10:34 PM, aaron morton >>> wrote: >>> >>>> > Is this just a display bug in nodetool or does this upgraded node really >>>> see the other ones as dead?
>>>> Is the 1.2.2 node which is see all the others as down processing >>>> requests ? >>>> Is it showing the others as down in the log ? >>>> >>>> I'm not really sure what's happening. But you can try starting the >>>> 1.2.2 node with the >>>> >>>> -Dcassandra.load_ring_state=false >>>> >>>> parameter, append it at the bottom of the cassandra-env.sh file. It >>>> will force the node to get the ring state from the others. >>>> >>>> Cheers >>>> >>>> - >>>> Aaron Morton >>>> Freelance Cassandra Consultant >>>> New Zealand >>>> >>>> @aaronmorton >>>> http://www.thelastpickle.com >>>> >>>> On 8/03/2013, at 10:24 PM, Arya Goudarzi wrote: >>>> >>>> > OK. I upgraded one node from 1.1.6 to 1.2.2 today. Despite some new >>>> problems that I had and I posted them in a separate email, this issue still >>>> exists but now it is only on 1.2.2 node. This means that the nodes running >>>> 1.1.6 see all other nodes including 1.2.2 as Up. Here is the ring and >>>> gossip from nodes with 1.1.6 for example. Bold denotes upgraded node: >>>> > >>>> > Address DC RackStatus State Load >>>> Effective-Ownership Token >>>> > >>>> 141784319550391026443072753098378663700 >>>> > XX.180.36us-east 1b Up Normal 49.47 GB >>>> 25.00% 1808575600 >>>> > XX.231.121 us-east 1c Up Normal 47.08 GB >>>> 25.00% 7089215977519551322153637656637080005 >>>> > XX.177.177 us-east 1d Up Normal 33.64 GB >>>> 25.00% 14178431955039102644307275311465584410 >>>> > XX.7.148us-east 1b Up Normal 41.27 GB >>>> 25.00% 42535295865117307932921825930779602030 >>>> > XX.20.9 us-east 1c Up Normal 38.51 GB >>>> 25.00% 49624511842636859255075463585608106435 >>>> > XX.86.255us-east 1d Up Normal 34.78 GB >>>> 25.00% 56713727820156410577229101240436610840 >>>> > XX.63.230us-east 1b Up
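Aaron's -Dcassandra.load_ring_state=false suggestion, which recurs throughout this thread, amounts to one extra JVM_OPTS line in cassandra-env.sh. A hedged sketch of applying it idempotently; the JVM_OPTS pattern follows the stock cassandra-env.sh convention, but the file's location varies by packaging and is an assumption here:

```python
# Sketch of appending the flag to cassandra-env.sh. Remove the line again
# once the ring state looks sane, so later restarts use saved state normally.
FLAG = 'JVM_OPTS="$JVM_OPTS -Dcassandra.load_ring_state=false"\n'

def add_load_ring_state_flag(env_file):
    """Append the flag once; return True if the file was modified."""
    with open(env_file, "r+") as fh:
        if "cassandra.load_ring_state" in fh.read():
            return False           # already present: do nothing
        fh.write(FLAG)             # file position is at EOF after read()
        return True

# Exercise against a stub file instead of a real cassandra-env.sh:
import tempfile
env = tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False)
env.write("# stub cassandra-env.sh\n")
env.close()
first = add_load_ring_state_flag(env.name)
second = add_load_ring_state_flag(env.name)
print(first, second)  # True False
```

The idempotence check matters because the flag tends to be added during several rolling-restart attempts, as happened in this thread.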
Re: Lots of Deleted Rows Came back after upgrade 1.1.6 to 1.1.10
Filed https://issues.apache.org/jira/browse/CASSANDRA-5411 and https://issues.apache.org/jira/browse/CASSANDRA-5412 However, I don't think that was our issue, as we don't have nodes down for long periods of time. The longest we had a node down was for a day, and it was replaced within a few hours. I have tried very hard to reproduce this issue by putting our production snapshot on our staging cluster and running the upgrade from 1.1.6 to 1.1.10, but I was not successful. So, I proposed to upgrade our cluster in production, and it happened again. Now our production includes lots of these zombies that we have to delete. Luckily our engineers have already written scripts to handle those issues, but why, why, why, C*? After 3 years of developing apps with C* and maintaining it, I have never been this disappointed. Anyway, I'll cut my nagging short: this time, before I had engineers clean up the data, I checked the timestamps of a handful of returned rows. The timestamps were from before they were deleted, so they are officially deleted rows that came back to life. We have repairs running every night; unless repair is incorrectly reporting successful repairs, I really have no clue where to start to find answers for this strange issue, except medicating myself with scotch whiskey so that I can sleep at night, not having to think about what C* is going to bring to my desk the next morning. -Arya On Sun, Mar 31, 2013 at 3:09 AM, aaron morton wrote: > But what if the gc_grace was changed to a lower value as part of a schema > migration after the hints have been marked with TTLs equal to the lower > gc_grace before the migration? > > There would be a chance then if the tombstones had been purged. > Want to raise a ticket ? > > Cheers > > - > Aaron Morton > Freelance Cassandra Consultant > New Zealand > > @aaronmorton > http://www.thelastpickle.com > > On 29/03/2013, at 2:58 AM, Arya Goudarzi wrote: > > I am not familiar with that part of the code yet.
But what if the gc_grace > was changed to a lower value as part of a schema migration after the hints > have been marked with TTLs equal to the lower gc_grace before the > migration? > > From what you've described, I think this is not an issue for us as we did > not have a node down for a long period of time, but just pointing out what > I think could happen based on what you've described. > > On Sun, Mar 24, 2013 at 10:03 AM, aaron morton wrote: > >> I could imagine a scenario where a hint was replayed to a replica after >> all replicas had purged their tombstones >> >> Scratch that, the hints are TTL'd with the lowest gc_grace. >> Ticket closed https://issues.apache.org/jira/browse/CASSANDRA-5379 >> >> Cheers >> >>- >> Aaron Morton >> Freelance Cassandra Consultant >> New Zealand >> >> @aaronmorton >> http://www.thelastpickle.com >> >> On 24/03/2013, at 6:24 AM, aaron morton wrote: >> >> Beside the joke, would hinted handoff really have any role in this issue? >> >> I could imagine a scenario where a hint was replayed to a replica after >> all replicas had purged their tombstones. That seems like a long shot, it >> would need one node to be down for the write and all up for the delete and >> for all of them to have purged the tombstone. But maybe we should have a >> max age on hints so it cannot happen. >> >> Created https://issues.apache.org/jira/browse/CASSANDRA-5379 >> >> Ensuring no hints are in place during an upgrade would work around. I >> tend to make sure hints and commit log are clear during an upgrade. >> >> Cheers >> >>- >> Aaron Morton >> Freelance Cassandra Consultant >> New Zealand >> >> @aaronmorton >> http://www.thelastpickle.com >> >> On 22/03/2013, at 7:54 AM, Arya Goudarzi wrote: >> >> Beside the joke, would hinted handoff really have any role in this issue? >> I have been struggling to reproduce this issue using the snapshot data >> taken from our cluster and following the same upgrade process from 1.1.6 to >> 1.1.10. 
I know snapshots only link to active SSTables. What if these >> returned rows belong to some inactive SSTables and some bug exposed itself >> and marked them as active? What are the possibilities that could lead to >> this? I am eager to find out, as this is blocking our upgrade. >> >> On Tue, Mar 19, 2013 at 2:11 AM, wrote: >> >>> This obscure feature of Cassandra is called “haunted handoff”. >>> >>> Happy (early) April Fools :)
Re: Lots of Deleted Rows Came back after upgrade 1.1.6 to 1.1.10
I am not familiar with that part of the code yet. But what if the gc_grace was changed to a lower value as part of a schema migration after the hints have been marked with TTLs equal to the lower gc_grace before the migration? From what you've described, I think this is not an issue for us, as we did not have a node down for a long period of time; I am just pointing out what I think could happen based on what you've described. On Sun, Mar 24, 2013 at 10:03 AM, aaron morton wrote: > I could imagine a scenario where a hint was replayed to a replica after > all replicas had purged their tombstones > > Scratch that, the hints are TTL'd with the lowest gc_grace. > Ticket closed https://issues.apache.org/jira/browse/CASSANDRA-5379 > > Cheers > > - > Aaron Morton > Freelance Cassandra Consultant > New Zealand > > @aaronmorton > http://www.thelastpickle.com > > On 24/03/2013, at 6:24 AM, aaron morton wrote: > > Beside the joke, would hinted handoff really have any role in this issue? > > I could imagine a scenario where a hint was replayed to a replica after > all replicas had purged their tombstones. That seems like a long shot; it > would need one node to be down for the write and all up for the delete, and > for all of them to have purged the tombstone. But maybe we should have a > max age on hints so it cannot happen. > > Created https://issues.apache.org/jira/browse/CASSANDRA-5379 > > Ensuring no hints are in place during an upgrade would work around it. I tend > to make sure hints and the commit log are clear during an upgrade. > > Cheers > > - > Aaron Morton > Freelance Cassandra Consultant > New Zealand > > @aaronmorton > http://www.thelastpickle.com > > On 22/03/2013, at 7:54 AM, Arya Goudarzi wrote: > > Beside the joke, would hinted handoff really have any role in this issue? > I have been struggling to reproduce this issue using the snapshot data > taken from our cluster and following the same upgrade process from 1.1.6 to > 1.1.10. I know snapshots only link to active SSTables. What if these > returned rows belong to some inactive SSTables and some bug exposed itself > and marked them as active? What are the possibilities that could lead to > this? I am eager to find out, as this is blocking our upgrade. > > On Tue, Mar 19, 2013 at 2:11 AM, wrote: > >> This obscure feature of Cassandra is called “haunted handoff”. >> >> Happy (early) April Fools :) >> >> *From:* aaron morton [mailto:aa...@thelastpickle.com] >> *Sent:* Monday, March 18, 2013 7:45 PM >> *To:* user@cassandra.apache.org >> *Subject:* Re: Lots of Deleted Rows Came back after upgrade 1.1.6 to >> 1.1.10 >> >> As you see, this node thinks lots of ranges are out of sync, which >> shouldn't be the case as successful repairs were done every night prior to >> the upgrade. >> >> Could this be explained by writes occurring during the upgrade process ? >> >> I found this bug which touches timestamps and tombstones which was fixed in >> 1.1.10 but am not 100% sure if it could be related to this issue: >> https://issues.apache.org/jira/browse/CASSANDRA-5153 >> >> Me neither, but the issue was fixed in 1.1.10 >> >> It appears that the repair task that I executed after upgrade brought >> back lots of deleted rows into life. >> >> Was it entire rows or columns in a row? >> >> Do you know if row level or column level deletes were used ? >> >> Can you look at the data in cassandra-cli and confirm the timestamps on >> the columns make sense ? >> >> Cheers >> >> - >> Aaron Morton >> Freelance Cassandra Consultant >> New Zealand >> >> @aaronmorton >> http://www.thelastpickle.com >> >> On 16/03/2013, at 2:31 PM, Arya Goudarzi wrote: >> >> Hi, >> >> I have upgraded our test cluster from 1.1.6 to 1.1.10, followed by >> running repairs. It appears that the repair task that I executed after >> upgrade brought back lots of deleted rows into life. Here are some >> logistics: >> >> - The upgraded cluster started from 1.1.1 -> 1.1.2 -> 1.1.5 -> 1.1.6 >> >> -
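The resurrection scenario Aaron sketches (a stale hint or replica reintroducing data after every replica has purged its tombstone) is easier to see in a toy last-write-wins model. This is illustrative Python, not Cassandra code; the single-replica framing and `GC_GRACE` measured in days are deliberate simplifications:

```python
# Toy model of tombstone purging and late replays. In Cassandra, a delete is
# a write of a tombstone; tombstones older than gc_grace may be purged by
# compaction, after which nothing remains to shadow a replayed stale write.
GC_GRACE = 10  # days, matching the App CF's gc_grace in this thread

class Replica:
    def __init__(self):
        self.cells = {}  # key -> (timestamp, value); value None = tombstone

    def write(self, key, ts, value):
        cur = self.cells.get(key)
        if cur is None or ts > cur[0]:       # last-write-wins reconciliation
            self.cells[key] = (ts, value)

    def delete(self, key, ts):
        self.write(key, ts, None)            # a delete is just a tombstone write

    def purge_tombstones(self, now):
        self.cells = {k: (ts, v) for k, (ts, v) in self.cells.items()
                      if not (v is None and now - ts > GC_GRACE)}

    def live(self, key):
        cur = self.cells.get(key)
        return cur is not None and cur[1] is not None

r = Replica()
r.write("row1", ts=1, value="v1")
r.delete("row1", ts=2)
r.purge_tombstones(now=20)         # tombstone older than gc_grace: purged
r.write("row1", ts=1, value="v1")  # late hint / out-of-sync replica replays
print(r.live("row1"))              # True: the deleted row is back
```

Before the purge, the same replay is harmlessly shadowed by the tombstone's newer timestamp; after the purge there is nothing left to compare against, which is why running repair within gc_grace (and clearing hints before upgrades, as Aaron suggests) matters.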
Re: Incompatible Gossip 1.1.6 to 1.2.1 Upgrade?
There has been a little misunderstanding. When all nodes are 1.2.2, they are fine. But during the rolling upgrade, 1.2.2 nodes see 1.1.10 nodes as down in the nodetool command despite gossip reporting NORMAL. I will give your suggestion a try and will report back. On Sat, Mar 23, 2013 at 10:37 AM, aaron morton wrote: > So all nodes are 1.2 and some are still being marked as down ? > > I would try a rolling restart with -Dcassandra.load_ring_state=false added > as a JVM opt in cassandra-env.sh. There is no guarantee it will fix it, > but it's a simple thing to try. > > Cheers > > - > Aaron Morton > Freelance Cassandra Consultant > New Zealand > > @aaronmorton > http://www.thelastpickle.com > > On 22/03/2013, at 10:30 AM, Arya Goudarzi wrote: > > I took Brandon's suggestion in CASSANDRA-5332 and upgraded to 1.1.10 > before upgrading to 1.2.2, but the issue with nodetool ring reporting > machines as down did not resolve. > > On Fri, Mar 15, 2013 at 6:35 PM, Arya Goudarzi wrote: > >> Thank you very much Aaron. I recall the logs of this node upgraded >> to 1.2.2 reported seeing others as dead. Brandon suggested in >> https://issues.apache.org/jira/browse/CASSANDRA-5332 that I should at >> least upgrade from 1.1.7. So, I decided to try upgrading to 1.1.10 first >> before upgrading to 1.2.2. I am in the middle of troubleshooting some other >> issues I had with that upgrade (posted separately); once I am done, I will >> give your suggestion a try. >> >> >> On Mon, Mar 11, 2013 at 10:34 PM, aaron morton >> wrote: >> >>> > Is this just a display bug in nodetool or does this upgraded node really >>> see the other ones as dead? >>> Is the 1.2.2 node which sees all the others as down processing >>> requests ? >>> Is it showing the others as down in the log ? >>> >>> I'm not really sure what's happening. But you can try starting the 1.2.2 >>> node with the >>> >>> -Dcassandra.load_ring_state=false >>> >>> parameter, append it at the bottom of the cassandra-env.sh file.
It will >>> force the node to get the ring state from the others. >>> >>> Cheers >>> >>> - >>> Aaron Morton >>> Freelance Cassandra Consultant >>> New Zealand >>> >>> @aaronmorton >>> http://www.thelastpickle.com >>> >>> On 8/03/2013, at 10:24 PM, Arya Goudarzi wrote: >>> >>> > OK. I upgraded one node from 1.1.6 to 1.2.2 today. Despite some new >>> problems that I had and I posted them in a separate email, this issue still >>> exists but now it is only on 1.2.2 node. This means that the nodes running >>> 1.1.6 see all other nodes including 1.2.2 as Up. Here is the ring and >>> gossip from nodes with 1.1.6 for example. Bold denotes upgraded node: >>> > >>> > Address DC RackStatus State Load >>> Effective-Ownership Token >>> > >>> 141784319550391026443072753098378663700 >>> > XX.180.36us-east 1b Up Normal 49.47 GB >>> 25.00% 1808575600 >>> > XX.231.121 us-east 1c Up Normal 47.08 GB >>> 25.00% 7089215977519551322153637656637080005 >>> > XX.177.177 us-east 1d Up Normal 33.64 GB >>> 25.00% 14178431955039102644307275311465584410 >>> > XX.7.148us-east 1b Up Normal 41.27 GB >>> 25.00% 42535295865117307932921825930779602030 >>> > XX.20.9 us-east 1c Up Normal 38.51 GB >>> 25.00% 49624511842636859255075463585608106435 >>> > XX.86.255us-east 1d Up Normal 34.78 GB >>> 25.00% 56713727820156410577229101240436610840 >>> > XX.63.230us-east 1b Up Normal 38.11 GB >>> 25.00% 85070591730234615865843651859750628460 >>> > XX.163.36 us-east 1c Up Normal 44.25 GB >>> 25.00% 92159807707754167187997289514579132865 >>> > XX.31.234us-east 1d Up Normal 44.66 GB >>> 25.00% 99249023685273718510150927169407637270 >>> > XX.132.169 us-east 1b Up Normal 44.2 GB >>> 25.00% 127605887595351923798765477788721654890 >>> > XX.71.63 us-east 1c Up Normal 38.74 GB >>> 25.00% 1346951035728714751209191154435501592
Re: Infinite Loop in CompactionExecutor
I did a nodetool rebuild, and it seemed to clear out the pending compactions; the exceptions no longer appear in the log, so that worked around the issue for now. Now it is time to expedite the upgrade. On Wed, Mar 27, 2013 at 1:10 PM, aaron morton wrote: > Is there a workaround besides upgrading? We are not ready to upgrade just > yet. > > Cannot see one. > > Cheers > - > Aaron Morton > Freelance Cassandra Consultant > New Zealand > > @aaronmorton > http://www.thelastpickle.com > > On 26/03/2013, at 7:42 PM, Arya Goudarzi wrote: > > Hi, > > I am experiencing this bug on our 1.1.6 cluster: > > https://issues.apache.org/jira/browse/CASSANDRA-4765 > > The pending compaction count has been stuck at a constant value, so I suppose > something is not compacting due to this. Is there a workaround besides > upgrading? We are not ready to upgrade just yet. > > Thanks, > -Arya > > >
Infinite Loop in CompactionExecutor
Hi, I am experiencing this bug on our 1.1.6 cluster: https://issues.apache.org/jira/browse/CASSANDRA-4765 The pending compaction count has been stuck at a constant value, so I suppose something is not compacting due to this. Is there a workaround besides upgrading? We are not ready to upgrade just yet. Thanks, -Arya
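A stuck pending-compaction count, as described above, is easiest to catch by sampling `nodetool compactionstats` over time. A small helper sketch; the "pending tasks: N" line matches typical compactionstats output of this era, but treat the exact format as an assumption and adapt the regex to your version:

```python
# Helper for watching whether the pending-compaction count is stuck.
# The output format parsed here is an assumption; verify against your
# version of `nodetool compactionstats`.
import re

def pending_tasks(compactionstats_output):
    """Extract the pending-task count from compactionstats output, or None."""
    m = re.search(r"pending tasks:\s*(\d+)", compactionstats_output)
    return int(m.group(1)) if m else None

def looks_stuck(samples, min_samples=3):
    """Stuck = several successive samples with the same nonzero pending count."""
    return (len(samples) >= min_samples
            and len(set(samples)) == 1
            and samples[0] > 0)

sample = "pending tasks: 128\nActive compaction remaining time: n/a\n"
print(pending_tasks(sample))          # 128
print(looks_stuck([128, 128, 128]))   # True
```

Feeding this a sample every few minutes (e.g. from cron) distinguishes a backlog that is draining from the CASSANDRA-4765-style loop where the count never moves.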
Re: Incompatible Gossip 1.1.6 to 1.2.1 Upgrade?
I took Brandon's suggestion in CASSANDRA-5332 and upgraded to 1.1.10 before upgrading to 1.2.2 but the issue with nodetool ring reporting machines as down did not resolve. On Fri, Mar 15, 2013 at 6:35 PM, Arya Goudarzi wrote: > Thank you very much Aaron. I recall from the logs of this upgraded node to > 1.2.2 reported seeing others as dead. Brandon suggested in > https://issues.apache.org/jira/browse/CASSANDRA-5332 that I should at > least upgrade from 1.1.7. So, I decided to try upgrading to 1.1.10 first > before upgrading to 1.2.2. I am in the middle of troubleshooting some other > issues I had with that upgrade (posted separately), once I am done, I will > give your suggestion a try. > > > On Mon, Mar 11, 2013 at 10:34 PM, aaron morton wrote: > >> > Is this just a display bug in nodetool or this upgraded node really >> sees the other ones as dead? >> Is the 1.2.2 node which is see all the others as down processing requests >> ? >> Is it showing the others as down in the log ? >> >> I'm not really sure what's happening. But you can try starting the 1.2.2 >> node with the >> >> -Dcassandra.load_ring_state=false >> >> parameter, append it at the bottom of the cassandra-env.sh file. It will >> force the node to get the ring state from the others. >> >> Cheers >> >> ----- >> Aaron Morton >> Freelance Cassandra Consultant >> New Zealand >> >> @aaronmorton >> http://www.thelastpickle.com >> >> On 8/03/2013, at 10:24 PM, Arya Goudarzi wrote: >> >> > OK. I upgraded one node from 1.1.6 to 1.2.2 today. Despite some new >> problems that I had and I posted them in a separate email, this issue still >> exists but now it is only on 1.2.2 node. This means that the nodes running >> 1.1.6 see all other nodes including 1.2.2 as Up. Here is the ring and >> gossip from nodes with 1.1.6 for example. 
Bold denotes upgraded node: >> > >> > Address DC RackStatus State Load >> Effective-Ownership Token >> > >>141784319550391026443072753098378663700 >> > XX.180.36us-east 1b Up Normal 49.47 GB >> 25.00% 1808575600 >> > XX.231.121 us-east 1c Up Normal 47.08 GB >> 25.00% 7089215977519551322153637656637080005 >> > XX.177.177 us-east 1d Up Normal 33.64 GB >> 25.00% 14178431955039102644307275311465584410 >> > XX.7.148us-east 1b Up Normal 41.27 GB >> 25.00% 42535295865117307932921825930779602030 >> > XX.20.9 us-east 1c Up Normal 38.51 GB >> 25.00% 49624511842636859255075463585608106435 >> > XX.86.255us-east 1d Up Normal 34.78 GB >> 25.00% 56713727820156410577229101240436610840 >> > XX.63.230us-east 1b Up Normal 38.11 GB >> 25.00% 85070591730234615865843651859750628460 >> > XX.163.36 us-east 1c Up Normal 44.25 GB >> 25.00% 92159807707754167187997289514579132865 >> > XX.31.234us-east 1d Up Normal 44.66 GB >> 25.00% 99249023685273718510150927169407637270 >> > XX.132.169 us-east 1b Up Normal 44.2 GB >> 25.00% 127605887595351923798765477788721654890 >> > XX.71.63 us-east 1c Up Normal 38.74 GB >> 25.00% 134695103572871475120919115443550159295 >> > XX.197.209 us-east 1d Up Normal 41.5 GB >> 25.00% 141784319550391026443072753098378663700 >> > >> > /XX.71.63 >> > RACK:1c >> > SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 >> > LOAD:4.1598705272E10 >> > DC:us-east >> > INTERNAL_IP:XX.194.92 >> > STATUS:NORMAL,134695103572871475120919115443550159295 >> > RPC_ADDRESS:XX.194.92 >> > RELEASE_VERSION:1.1.6 >> > /XX.86.255 >> > RACK:1d >> > SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 >> > LOAD:3.734334162E10 >> > DC:us-east >> > INTERNAL_IP:XX.6.195 >> > STATUS:NORMAL,56713727820156410577229101240436610840 >> > RPC_ADDRESS:XX.6.195 >> > RELEASE_VERSION:1.1.6 >> > /XX.7.148 >> > RACK:1b >> > SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 >> > LOAD:4.4316975808E10 >> > DC:us-east >> > INTERNAL_IP:XX.47.250 >> > STATUS:NORMAL,425352958651173079329218259307796020
Re: Lots of Deleted Rows Came back after upgrade 1.1.6 to 1.1.10
Beside the joke, would hinted handoff really have any role in this issue? I have been struggling to reproduce this issue using the snapshot data taken from our cluster and following the same upgrade process from 1.1.6 to 1.1.10. I know snapshots only link to active SSTables. What if these returned rows belong to some inactive SSTables and some bug exposed itself and marked them as active? What are the possibilities that could lead to this? I am eager to find out, as this is blocking our upgrade. On Tue, Mar 19, 2013 at 2:11 AM, wrote: > This obscure feature of Cassandra is called “haunted handoff”. > > Happy (early) April Fools :) > > *From:* aaron morton [mailto:aa...@thelastpickle.com] > *Sent:* Monday, March 18, 2013 7:45 PM > *To:* user@cassandra.apache.org > *Subject:* Re: Lots of Deleted Rows Came back after upgrade 1.1.6 to > 1.1.10 > > As you see, this node thinks lots of ranges are out of sync, which > shouldn't be the case as successful repairs were done every night prior to > the upgrade. > > Could this be explained by writes occurring during the upgrade process ? > > I found this bug which touches timestamps and tombstones which was fixed in > 1.1.10 but am not 100% sure if it could be related to this issue: > https://issues.apache.org/jira/browse/CASSANDRA-5153 > > Me neither, but the issue was fixed in 1.1.10 > > It appears that the repair task that I executed after upgrade brought > back lots of deleted rows into life. > > Was it entire rows or columns in a row? > > Do you know if row level or column level deletes were used ? > > Can you look at the data in cassandra-cli and confirm the timestamps on the > columns make sense ? > > Cheers > > - > Aaron Morton > Freelance Cassandra Consultant > New Zealand > > @aaronmorton > http://www.thelastpickle.com > > On 16/03/2013, at 2:31 PM, Arya Goudarzi wrote: > > Hi, > > I have upgraded our test cluster from 1.1.6 to 1.1.10, followed by running > repairs. It appears that the repair task that I executed after upgrade > brought back lots of deleted rows into life. Here are some logistics: > > - The upgraded cluster started from 1.1.1 -> 1.1.2 -> 1.1.5 -> 1.1.6 > > - Old cluster: 4 nodes, C* 1.1.6 with RF3 using NetworkTopology; > > - Upgraded to: 1.1.10 with all other settings the same; > > - Successful repairs were being done on this cluster every night; > > - Our clients use nanosecond-precision timestamps for cassandra calls; > > - After upgrade, while running repair I saw some log messages like this on > one node: > > system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:54,847 > AntiEntropyService.java (line 1022) [repair > #0990f320-8da9-11e2--e9b2bd8ea1bd] Endpoints /XX.194.60 and / > 23.20.207.56 have 2223 range(s) out of sync for App > > system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:54,877 > AntiEntropyService.java (line 1022) [repair > #0990f320-8da9-11e2--e9b2bd8ea1bd] Endpoints /XX.250.43 and / > 23.20.207.56 have 161 range(s) out of sync for App > > system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:55,097 > AntiEntropyService.java (line 1022) [repair > #0990f320-8da9-11e2--e9b2bd8ea1bd] Endpoints /XX.194.60 and / > 23.20.250.43 have 2294 range(s) out of sync for App > > system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:59,190 > AntiEntropyService.java (line 789) [repair > #0990f320-8da9-11e2--e9b2bd8ea1bd] App is fully synced (13 remaining > column family to sync for this session) > > As you see, this node thinks lots of ranges are out of sync, which > shouldn't be the case as successful repairs were done every night prior to > the upgrade. > > The App CF uses SizeTiered with gc_grace of 10 days. It has caching = > 'ALL', and it is fairly small (11MB on each node). > > I found this bug which touches timestamps and tombstones which was fixed in > 1.1.10 but am not 100% sure if it could be related to this issue: > https://issues.apache.org/jira/browse/CASSANDRA-5153 > > Any advice on how to dig deeper into this would be appreciated. > > Thanks, > > -Arya
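The "range(s) out of sync" counts in those repair logs come from comparing per-range hashes (Merkle trees) between replicas. A toy, non-Cassandra illustration of why a tombstone present on one replica but already purged on another makes a range's hashes disagree, which is exactly the state that lets repair copy old data back:

```python
# Toy illustration (not Cassandra's Merkle-tree code): each replica hashes
# the data it holds for a token range; any difference in cells, including a
# tombstone purged on only one side, makes the range compare as out of sync.
import hashlib

def range_hash(cells):
    """Hash a sorted view of key -> (timestamp, value) cells in one range."""
    h = hashlib.sha256()
    for key in sorted(cells):
        ts, value = cells[key]
        h.update(f"{key}:{ts}:{value}".encode())
    return h.hexdigest()

replica_a = {"row1": (2, None)}   # still holds the tombstone
replica_b = {}                    # has already purged it
print(range_hash(replica_a) == range_hash(replica_b))  # False: "out of sync"
```

When repair then reconciles such a range, whichever side still holds data for the row streams it to the other, which is how a purge/retention mismatch around gc_grace can surface as resurrected rows rather than as an obvious error.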
Re: Lots of Deleted Rows Came back after upgrade 1.1.6 to 1.1.10
Hi Aaron: Thanks for your attention. The cluster in question is a 4-node sandbox cluster we have that does not have much traffic. I was able to chase down this issue on a CF that doesn't change much. That bug was flagged as fixed in 1.1.10. They were row-level deletes. We use nanosecond precision, so they are something like this: 1363379219546536704. Although this one is recent, from last Friday; I have to find one with an old timestamp that came back to life. I am doing this investigation this week, and once I collect more info and reproduce from the snapshot, I'll let you know. Cheers, -Arya On Mon, Mar 18, 2013 at 10:45 AM, aaron morton wrote: > As you see, this node thinks lots of ranges are out of sync, which > shouldn't be the case as successful repairs were done every night prior to > the upgrade. > > Could this be explained by writes occurring during the upgrade process ? > > I found this bug which touches timestamps and tombstones which was fixed in > 1.1.10 but am not 100% sure if it could be related to this issue: > https://issues.apache.org/jira/browse/CASSANDRA-5153 > > Me neither, but the issue was fixed in 1.1.10 > > It appears that the repair task that I executed after upgrade brought > back lots of deleted rows into life. > > Was it entire rows or columns in a row? > Do you know if row level or column level deletes were used ? > > Can you look at the data in cassandra-cli and confirm the timestamps on the > columns make sense ? > > Cheers > > - > Aaron Morton > Freelance Cassandra Consultant > New Zealand > > @aaronmorton > http://www.thelastpickle.com > > On 16/03/2013, at 2:31 PM, Arya Goudarzi wrote: > > Hi, > > I have upgraded our test cluster from 1.1.6 to 1.1.10, followed by running > repairs. It appears that the repair task that I executed after upgrade > brought back lots of deleted rows into life. Here are some logistics: > > - The upgraded cluster started from 1.1.1 -> 1.1.2 -> 1.1.5 -> 1.1.6 > - Old cluster: 4 nodes, C* 1.1.6 with RF3 using NetworkTopology; > - Upgraded to: 1.1.10 with all other settings the same; > - Successful repairs were being done on this cluster every night; > - Our clients use nanosecond-precision timestamps for cassandra calls; > - After upgrade, while running repair I saw some log messages like this on > one node: > > system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:54,847 > AntiEntropyService.java (line 1022) [repair > #0990f320-8da9-11e2--e9b2bd8ea1bd] Endpoints /XX.194.60 and / > 23.20.207.56 have 2223 range(s) out of sync for App > system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:54,877 > AntiEntropyService.java (line 1022) [repair > #0990f320-8da9-11e2--e9b2bd8ea1bd] Endpoints /XX.250.43 and / > 23.20.207.56 have 161 range(s) out of sync for App > system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:55,097 > AntiEntropyService.java (line 1022) [repair > #0990f320-8da9-11e2--e9b2bd8ea1bd] Endpoints /XX.194.60 and / > 23.20.250.43 have 2294 range(s) out of sync for App > system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:59,190 > AntiEntropyService.java (line 789) [repair > #0990f320-8da9-11e2--e9b2bd8ea1bd] App is fully synced (13 remaining > column family to sync for this session) > > As you see, this node thinks lots of ranges are out of sync, which > shouldn't be the case as successful repairs were done every night prior to > the upgrade. > > The App CF uses SizeTiered with gc_grace of 10 days. It has caching = > 'ALL', and it is fairly small (11MB on each node). > > I found this bug which touches timestamps and tombstones which was fixed in > 1.1.10 but am not 100% sure if it could be related to this issue: > https://issues.apache.org/jira/browse/CASSANDRA-5153 > > Any advice on how to dig deeper into this would be appreciated. > > Thanks, > -Arya > > > >
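The timestamp quoted above (1363379219546536704) is a client-side nanosecond-precision value, and the sanity check Aaron suggests ("confirm the timestamps on the columns make sense") boils down to converting such values back to wall-clock time. A small sketch; note Cassandra compares timestamps as opaque longs, so the unit only matters for human inspection:

```python
# Convert a client-supplied nanosecond timestamp back to UTC wall-clock time
# so deletion/write times can be sanity-checked by eye.
from datetime import datetime, timezone

def ns_timestamp_to_utc(ts_ns):
    """Interpret ts_ns as nanoseconds since the Unix epoch."""
    return datetime.fromtimestamp(ts_ns / 1_000_000_000, tz=timezone.utc)

when = ns_timestamp_to_utc(1363379219546536704)
print(when.date())  # 2013-03-15, a Friday, matching "last Friday" above
```

A value that converts to a plausible date in this way is a quick confirmation the client really wrote nanoseconds; a microsecond timestamp fed through the same conversion would land decades in the past, which is one easy way mixed-precision clients give themselves away.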
Re: Incompatible Gossip 1.1.6 to 1.2.1 Upgrade?
Thank you very much Aaron. I recall that the logs of this node, upgraded to 1.2.2, reported seeing the others as dead. Brandon suggested in https://issues.apache.org/jira/browse/CASSANDRA-5332 that I should at least upgrade from 1.1.7. So, I decided to try upgrading to 1.1.10 first before upgrading to 1.2.2. I am in the middle of troubleshooting some other issues I had with that upgrade (posted separately); once I am done, I will give your suggestion a try. On Mon, Mar 11, 2013 at 10:34 PM, aaron morton wrote: > > Is this just a display bug in nodetool or does this upgraded node really see > the other ones as dead? > Is the 1.2.2 node which sees all the others as down processing requests ? > Is it showing the others as down in the log ? > > I'm not really sure what's happening. But you can try starting the 1.2.2 > node with the > > -Dcassandra.load_ring_state=false > > parameter; append it at the bottom of the cassandra-env.sh file. It will > force the node to get the ring state from the others. > > Cheers > > - > Aaron Morton > Freelance Cassandra Consultant > New Zealand > > @aaronmorton > http://www.thelastpickle.com > > On 8/03/2013, at 10:24 PM, Arya Goudarzi wrote: > > > OK. I upgraded one node from 1.1.6 to 1.2.2 today. Despite some new > problems that I had and I posted them in a separate email, this issue still > exists but now it is only on the 1.2.2 node. This means that the nodes running > 1.1.6 see all other nodes including 1.2.2 as Up. Here is the ring and > gossip from nodes with 1.1.6 for example.
Bold denotes upgraded node: > > > > Address DC RackStatus State Load > Effective-Ownership Token > > >141784319550391026443072753098378663700 > > XX.180.36us-east 1b Up Normal 49.47 GB > 25.00% 1808575600 > > XX.231.121 us-east 1c Up Normal 47.08 GB > 25.00% 7089215977519551322153637656637080005 > > XX.177.177 us-east 1d Up Normal 33.64 GB > 25.00% 14178431955039102644307275311465584410 > > XX.7.148us-east 1b Up Normal 41.27 GB > 25.00% 42535295865117307932921825930779602030 > > XX.20.9 us-east 1c Up Normal 38.51 GB > 25.00% 49624511842636859255075463585608106435 > > XX.86.255us-east 1d Up Normal 34.78 GB > 25.00% 56713727820156410577229101240436610840 > > XX.63.230us-east 1b Up Normal 38.11 GB > 25.00% 85070591730234615865843651859750628460 > > XX.163.36 us-east 1c Up Normal 44.25 GB > 25.00% 92159807707754167187997289514579132865 > > XX.31.234us-east 1d Up Normal 44.66 GB > 25.00% 99249023685273718510150927169407637270 > > XX.132.169 us-east 1b Up Normal 44.2 GB > 25.00% 127605887595351923798765477788721654890 > > XX.71.63 us-east 1c Up Normal 38.74 GB > 25.00% 134695103572871475120919115443550159295 > > XX.197.209 us-east 1d Up Normal 41.5 GB > 25.00% 141784319550391026443072753098378663700 > > > > /XX.71.63 > > RACK:1c > > SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 > > LOAD:4.1598705272E10 > > DC:us-east > > INTERNAL_IP:XX.194.92 > > STATUS:NORMAL,134695103572871475120919115443550159295 > > RPC_ADDRESS:XX.194.92 > > RELEASE_VERSION:1.1.6 > > /XX.86.255 > > RACK:1d > > SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 > > LOAD:3.734334162E10 > > DC:us-east > > INTERNAL_IP:XX.6.195 > > STATUS:NORMAL,56713727820156410577229101240436610840 > > RPC_ADDRESS:XX.6.195 > > RELEASE_VERSION:1.1.6 > > /XX.7.148 > > RACK:1b > > SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 > > LOAD:4.4316975808E10 > > DC:us-east > > INTERNAL_IP:XX.47.250 > > STATUS:NORMAL,42535295865117307932921825930779602030 > > RPC_ADDRESS:XX.47.250 > > RELEASE_VERSION:1.1.6 > > /XX.63.230 > > RACK:1b > > 
SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 > > LOAD:4.0918593305E10 > > DC:us-east > > INTERNAL_IP:XX.89.127 > > STATUS:NORMAL,85070591730234615865843651859750628460 > > RPC_ADDRESS:XX.89.127 > > RELEASE_VERSION:1.1.6 > > /XX.132.169 > > RACK:1b > > SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 > > LOAD:4.745883458E10 > > DC:us-east > > INTERNAL_IP:XX.94.161 > > STATUS:NORMAL,1276
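Aaron's `-Dcassandra.load_ring_state=false` suggestion in this thread amounts to one appended line in cassandra-env.sh. A sketch of constructing that line; the actual file path depends on your install layout (e.g. /etc/cassandra/cassandra-env.sh for Debian packages), so it is left as a comment:

```shell
# Build the line to append at the bottom of cassandra-env.sh; on restart the
# node discards its saved ring state and rebuilds its view from gossip.
FLAG='-Dcassandra.load_ring_state=false'
LINE="JVM_OPTS=\"\$JVM_OPTS $FLAG\""
echo "$LINE"
# e.g.: echo "$LINE" | sudo tee -a /etc/cassandra/cassandra-env.sh   # path is an assumption
# then restart the node, e.g.: sudo service cassandra restart
```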
Lots of Deleted Rows Came back after upgrade 1.1.6 to 1.1.10
Hi, I have upgraded our test cluster from 1.1.6 to 1.1.10, followed by running repairs. It appears that the repair task that I executed after the upgrade brought back lots of deleted rows to life. Here are some logistics:

- The upgraded cluster started from 1.1.1 -> 1.1.2 -> 1.1.5 -> 1.1.6
- Old cluster: 4 node, C* 1.1.6 with RF3 using NetworkTopology;
- Upgraded to: 1.1.10 with all other settings the same;
- Successful repairs were being done on this cluster every night;
- Our clients use nanosecond precision timestamps for cassandra calls;
- After the upgrade, while running repair I saw some log messages like this on one node:

system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:54,847 AntiEntropyService.java (line 1022) [repair #0990f320-8da9-11e2--e9b2bd8ea1bd] Endpoints /XX.194.60 and /23.20.207.56 have 2223 range(s) out of sync for App
system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:54,877 AntiEntropyService.java (line 1022) [repair #0990f320-8da9-11e2--e9b2bd8ea1bd] Endpoints /XX.250.43 and /23.20.207.56 have 161 range(s) out of sync for App
system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:55,097 AntiEntropyService.java (line 1022) [repair #0990f320-8da9-11e2--e9b2bd8ea1bd] Endpoints /XX.194.60 and /23.20.250.43 have 2294 range(s) out of sync for App
system.log.5: INFO [AntiEntropyStage:1] 2013-03-15 19:55:59,190 AntiEntropyService.java (line 789) [repair #0990f320-8da9-11e2--e9b2bd8ea1bd] App is fully synced (13 remaining column family to sync for this session)

As you see, this node thinks lots of ranges are out of sync, which shouldn't be the case as successful repairs were done every night prior to the upgrade. The App CF uses SizeTiered with gc_grace of 10 days. It has caching = 'ALL', and it is fairly small (11 MB on each node).
I found this bug which touches timestamps and tombstones which was fixed in 1.1.10, but am not 100% sure if it could be related to this issue: https://issues.apache.org/jira/browse/CASSANDRA-5153 Any advice on how to dig deeper into this would be appreciated. Thanks, -Arya
Re: Can't replace dead node
You may have bumped into this issue: https://github.com/Netflix/Priam/issues/161. Make sure the is_replace_token Priam API call is working for you. On Fri, Mar 8, 2013 at 8:22 AM, aaron morton wrote: > If it does not have the schema check the logs for errors and ensure it is > actually part of the cluster. > > You may have better luck with Priam specific questions on > https://github.com/Netflix/Priam > > Cheers > > - > Aaron Morton > Freelance Cassandra Developer > New Zealand > > @aaronmorton > http://www.thelastpickle.com > > On 7/03/2013, at 11:11 AM, Andrey Ilinykh wrote: > > Hello everybody! > > I used to run cassandra 1.1.5 with Priam. To replace a dead node Priam > launches cassandra with the cassandra.replace_token property. It works smoothly > with 1.1.5. A couple of days ago I moved to 1.1.10 and have a problem now. The new > cassandra successfully starts and joins the ring, but it doesn't see my > keyspaces. It doesn't try to stream data from other nodes. I see only the > system keyspace. Any idea what the difference is between 1.1.5 and 1.1.10? > How am I supposed to replace a dead node? > > Thank you, >Andrey > > >
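For reference, replacing a dead node by hand on 1.1 (which is what Priam automates) means starting the replacement with the dead node's token. A sketch of the line to append to cassandra-env.sh; the token below is a placeholder, not a real value:

```shell
# Placeholder: substitute the dead node's actual token from nodetool ring.
DEAD_TOKEN=42535295865117307932921825930779602030
LINE="JVM_OPTS=\"\$JVM_OPTS -Dcassandra.replace_token=$DEAD_TOKEN\""
echo "$LINE"
# Append to cassandra-env.sh and start the node; it should join at that
# token and stream data from the replicas. If it comes up with only the
# system keyspace, the property was likely never applied, which is the
# symptom described in the Priam issue linked above.
```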
Re: Incompatible Gossip 1.1.6 to 1.2.1 Upgrade?
despite my other issue having to do with the wrong version of cassandra, this one still stands as described. On Fri, Mar 8, 2013 at 10:24 PM, Arya Goudarzi wrote: > OK. I upgraded one node from 1.1.6 to 1.2.2 today. Despite some new > problems that I had and I posted them in a separate email, this issue still > exists but now it is only on 1.2.2 node. This means that the nodes running > 1.1.6 see all other nodes including 1.2.2 as Up. Here is the ring and > gossip from nodes with 1.1.6 for example. Bold denotes upgraded node: > > Address DC RackStatus State Load > Effective-Ownership Token > > 141784319550391026443072753098378663700 > XX.180.36us-east 1b Up Normal 49.47 GB25.00% > 1808575600 > *XX.231.121 us-east 1c Up Normal 47.08 GB > 25.00% 7089215977519551322153637656637080005* > XX.177.177 us-east 1d Up Normal 33.64 GB25.00% > 14178431955039102644307275311465584410 > XX.7.148us-east 1b Up Normal 41.27 GB25.00% > 42535295865117307932921825930779602030 > XX.20.9 us-east 1c Up Normal 38.51 GB25.00% > 49624511842636859255075463585608106435 > XX.86.255us-east 1d Up Normal 34.78 GB25.00% > 56713727820156410577229101240436610840 > XX.63.230us-east 1b Up Normal 38.11 GB25.00% > 85070591730234615865843651859750628460 > XX.163.36 us-east 1c Up Normal 44.25 GB25.00% > 92159807707754167187997289514579132865 > XX.31.234us-east 1d Up Normal 44.66 GB25.00% > 99249023685273718510150927169407637270 > XX.132.169 us-east 1b Up Normal 44.2 GB 25.00% > 127605887595351923798765477788721654890 > XX.71.63 us-east 1c Up Normal 38.74 GB25.00% > 134695103572871475120919115443550159295 > XX.197.209 us-east 1d Up Normal 41.5 GB 25.00% > 141784319550391026443072753098378663700 > > /XX.71.63 > RACK:1c > SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 > LOAD:4.1598705272E10 > DC:us-east > INTERNAL_IP:XX.194.92 > STATUS:NORMAL,134695103572871475120919115443550159295 > RPC_ADDRESS:XX.194.92 > RELEASE_VERSION:1.1.6 > /XX.86.255 > RACK:1d > SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 > 
LOAD:3.734334162E10 > DC:us-east > INTERNAL_IP:XX.6.195 > STATUS:NORMAL,56713727820156410577229101240436610840 > RPC_ADDRESS:XX.6.195 > RELEASE_VERSION:1.1.6 > /XX.7.148 > RACK:1b > SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 > LOAD:4.4316975808E10 > DC:us-east > INTERNAL_IP:XX.47.250 > STATUS:NORMAL,42535295865117307932921825930779602030 > RPC_ADDRESS:XX.47.250 > RELEASE_VERSION:1.1.6 > /XX.63.230 > RACK:1b > SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 > LOAD:4.0918593305E10 > DC:us-east > INTERNAL_IP:XX.89.127 > STATUS:NORMAL,85070591730234615865843651859750628460 > RPC_ADDRESS:XX.89.127 > RELEASE_VERSION:1.1.6 > /XX.132.169 > RACK:1b > SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 > LOAD:4.745883458E10 > DC:us-east > INTERNAL_IP:XX.94.161 > STATUS:NORMAL,127605887595351923798765477788721654890 > RPC_ADDRESS:XX.94.161 > RELEASE_VERSION:1.1.6 > /XX.180.36 > RACK:1b > SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 > LOAD:5.311963027E10 > DC:us-east > INTERNAL_IP:XX.123.112 > STATUS:NORMAL,1808575600 > RPC_ADDRESS:XX.123.112 > RELEASE_VERSION:1.1.6 > /XX.163.36 > RACK:1c > SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 > LOAD:4.7516755022E10 > DC:us-east > INTERNAL_IP:XX.163.180 > STATUS:NORMAL,92159807707754167187997289514579132865 > RPC_ADDRESS:XX.163.180 > RELEASE_VERSION:1.1.6 > /XX.31.234 > RACK:1d > SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 > LOAD:4.7954372912E10 > DC:us-east > INTERNAL_IP:XX.192.159 > STATUS:NORMAL,99249023685273718510150927169407637270 > RPC_ADDRESS:XX.192.159 > RELEASE_VERSION:1.1.6 > /XX.197.209 > RACK:1d > SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 > LOAD:4.4558968005E10 > DC:us-east > INTERNAL_IP:XX.66.205 > STATUS:NORMAL,141784319550391026443072753098378663700 > RPC_ADDRESS:XX.66.205 > RELEASE_VERSION:1.1.6 > /XX.177.177 > RACK:1d > SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 > LOAD:3.6115572697E10 > DC:us-east > INTERNAL_IP:XX.65.57 > STATUS:NORMAL,14178431955039102644307275311465
Re: Startup Exception During Upgrade 1.1.6 to 1.2.2 during LCS Migration and Corrupt Tables
*face palm* You are totally right. I built from the wrong branch. I am so sorry. But at least you got yourself a development bug to figure out. :) The specific commit I built from was this: f3e0aa683f3f310678d62ba8345fe33633b709e0. On Fri, Mar 8, 2013 at 9:14 PM, Yuki Morishita wrote: > Are you sure you are using 1.2.2? > Because LegacyLeveledManifest is from unreleased development version. > > On Friday, March 8, 2013 at 11:02 PM, Arya Goudarzi wrote: > > Hi, > > I am exercising the rolling upgrade from 1.1.6 to 1.2.2. When I upgraded > to 1.2.2 on the first node, during startup I got this exception: > > ERROR [main] 2013-03-09 04:24:30,771 CassandraDaemon.java (line 213) > Could not migrate old leveled manifest. Move away the .json file in the > data directory > java.io.EOFException > at java.io.DataInputStream.readInt(DataInputStream.java:375) > at > org.apache.cassandra.utils.EstimatedHistogram$EstimatedHistogramSerializer.deserialize(EstimatedHistogram.java:265) > at > org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:365) > at > org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:351) > at > org.apache.cassandra.db.compaction.LegacyLeveledManifest.migrateManifests(LegacyLeveledManifest.java:100) > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:209) > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:391) > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:434) > > > This is when it is trying to migrate LCS I believe. I removed the Json > files from data directories: > > data/ $ for i in `find . | grep 'json' | awk '{print $1}'`; do rm -rf $i; > done > > Then during the second attempt at restart, I got the following exception: > > ERROR [main] 2013-03-09 04:24:30,771 CassandraDaemon.java (line 213) > Could not migrate old leveled manifest. 
Move away the .json file in the > data directory > java.io.EOFException > at java.io.DataInputStream.readInt(DataInputStream.java:375) > at > org.apache.cassandra.utils.EstimatedHistogram$EstimatedHistogramSerializer.deserialize(EstimatedHistogram.java:265) > at > org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:365) > at > org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:351) > at > org.apache.cassandra.db.compaction.LegacyLeveledManifest.migrateManifests(LegacyLeveledManifest.java:100) > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:209) > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:391) > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:434) > > OK. It seems it created snapshots prior to the migration step. So it is safe to > remove those, right? > > data/ $ for i in `find . | grep 'pre-sstablemetamigration' | awk '{print > $1}'`; do rm -rf $i; done > > Now on starting up again, I see a bunch of corrupt sstable log messages: > > ERROR [SSTableBatchOpen:1] 2013-03-09 04:55:39,826 SSTableReader.java > (line 242) Corrupt sstable > /var/lib/cassandra/data/keyspace_production/UniqueIndexes/keyspace_production-UniqueIndexes-hf-98318=[Filter.db, > Data.db, CompressionInfo.db, Statistics.db, Index.db]; skipped > java.io.EOFException > at java.io.DataInputStream.readInt(DataInputStream.java:375) > at > org.apache.cassandra.utils.BloomFilterSerializer.deserialize(BloomFilterSerializer.java:45) > at > org.apache.cassandra.utils.Murmur2BloomFilter$Murmur2BloomFilterSerializer.deserialize(Murmur2BloomFilter.java:40) > at > org.apache.cassandra.utils.FilterFactory.deserialize(FilterFactory.java:71) > at > org.apache.cassandra.io.sstable.SSTableReader.loadBloomFilter(SSTableReader.java:334) > at > org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:199) > at >
org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:149) > at > org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:238) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439) > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) > at java.util.concurrent.FutureTask.run(FutureTask.java:138) > at > java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) > at java.lang.Thread.run(Thread.java:662) > > This is worrisome. How should I deal with this situation? scrub maybe? > Should I open a bug? > > Cheers, > Arya > > >
Re: Incompatible Gossip 1.1.6 to 1.2.1 Upgrade?
Down Normal 44.2 GB 25.00% 127605887595351923798765477788721654890 XX.7.1481b Down Normal 41.27 GB25.00% 42535295865117307932921825930779602030 XX.180.361b Down Normal 49.47 GB25.00% 1808575600 XX.63.2301b Down Normal 38.11 GB25.00% 85070591730234615865843651859750628460 *XX.231.121 1c Up Normal 47.25 GB25.00% 7089215977519551322153637656637080005* XX.71.63 1c Down Normal 38.74 GB25.00% 134695103572871475120919115443550159295 XX.177.177 1d Down Normal 33.64 GB25.00% 14178431955039102644307275311465584410 XX.31.2341d Down Normal 44.66 GB25.00% 99249023685273718510150927169407637270 XX.20.9 1c Down Normal 38.51 GB25.00% 49624511842636859255075463585608106435 XX.163.36 1c Down Normal 44.25 GB25.00% 92159807707754167187997289514579132865 XX.197.209 1d Down Normal 41.5 GB 25.00% 141784319550391026443072753098378663700 XX.86.2551d Down Normal 34.78 GB25.00% 56713727820156410577229101240436610840 /XX.71.63 RACK:1c RPC_ADDRESS:XX.194.92 RELEASE_VERSION:1.1.6 INTERNAL_IP:XX.194.92 STATUS:NORMAL,134695103572871475120919115443550159295 SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 DC:us-east LOAD:4.1598705272E10 /XX.86.255 RACK:1d RPC_ADDRESS:XX.6.195 RELEASE_VERSION:1.1.6 INTERNAL_IP:XX.6.195 STATUS:NORMAL,56713727820156410577229101240436610840 SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 DC:us-east LOAD:3.7343205002E10 /XX.7.148 RACK:1b RPC_ADDRESS:XX.47.250 RELEASE_VERSION:1.1.6 INTERNAL_IP:XX.47.250 STATUS:NORMAL,42535295865117307932921825930779602030 SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 DC:us-east LOAD:4.4316975808E10 /XX.63.230 RACK:1b RPC_ADDRESS:XX.89.127 RELEASE_VERSION:1.1.6 INTERNAL_IP:XX.89.127 STATUS:NORMAL,85070591730234615865843651859750628460 SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 DC:us-east LOAD:4.0918456687E10 /XX.132.169 RACK:1b RPC_ADDRESS:XX.94.161 RELEASE_VERSION:1.1.6 INTERNAL_IP:XX.94.161 STATUS:NORMAL,127605887595351923798765477788721654890 SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 DC:us-east LOAD:4.745883458E10 /XX.180.36 RACK:1b 
RPC_ADDRESS:XX.123.112 RELEASE_VERSION:1.1.6 INTERNAL_IP:XX.123.112 STATUS:NORMAL,1808575600 SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 DC:us-east LOAD:5.311963027E10 /XX.163.36 RACK:1c RPC_ADDRESS:XX.163.180 RELEASE_VERSION:1.1.6 INTERNAL_IP:XX.163.180 STATUS:NORMAL,92159807707754167187997289514579132865 SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 DC:us-east LOAD:4.7516755022E10 /XX.31.234 RACK:1d RPC_ADDRESS:XX.192.159 RELEASE_VERSION:1.1.6 INTERNAL_IP:XX.192.159 STATUS:NORMAL,99249023685273718510150927169407637270 SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 DC:us-east LOAD:4.7954372912E10 /XX.197.209 RACK:1d RPC_ADDRESS:XX.66.205 RELEASE_VERSION:1.1.6 INTERNAL_IP:XX.66.205 STATUS:NORMAL,141784319550391026443072753098378663700 SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 DC:us-east LOAD:4.4559013211E10 /XX.177.177 RACK:1d RPC_ADDRESS:XX.65.57 RELEASE_VERSION:1.1.6 INTERNAL_IP:XX.65.57 STATUS:NORMAL,14178431955039102644307275311465584410 SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 DC:us-east LOAD:3.6115572697E10 /XX.20.9 RACK:1c RPC_ADDRESS:XX.33.229 RELEASE_VERSION:1.1.6 INTERNAL_IP:XX.33.229 STATUS:NORMAL,49624511842636859255075463585608106435 SCHEMA:99dce53b-487e-3e7b-a958-a1cc48d9f575 DC:us-east LOAD:4.1352367264E10 */XX.231.121* * HOST_ID:9c765678-d058-4d85-a588-638ce10ff984* * RACK:1c* * RPC_ADDRESS:XX.223.241* * RELEASE_VERSION:1.2.2* * INTERNAL_IP:XX.223.241* * STATUS:NORMAL,7089215977519551322153637656637080005* * NET_VERSION:7* * SCHEMA:8b8948f5-d56f-3a96-8005-b9452e42cd67* * SEVERITY:0.0* * DC:us-east* * LOAD:5.0710624207E10* Is this just a display bug in nodetool or this upgraded node really sees the other ones as dead? -Arya On Mon, Feb 25, 2013 at 8:10 PM, Arya Goudarzi wrote: > No I did not look at nodetool gossipinfo but from the ring on both > pre-upgrade and post upgrade nodes to 1.2.1, what I observed was the > described behavior. 
> > > On Sat, Feb 23, 2013 at 1:26 AM, Michael Kjellman > wrote: > >> This was a bug with 1.2.0 but resolved in 1.2.1. Did you take a capture >> of nodetool gossipinfo and nodetool ring by chance? >> >> On Feb 23, 2013, at 12:26 AM, "Arya Goudarzi" wrote: >> >> > Hi C* users, >> > >> > I just upgraded a 12 node test cluster from 1.1.6 to 1.2.1. What I >> noticed from nodetool ring was that the new upgraded nodes only saw each >> other as Normal and the rest of the cluster which was on 1.1.6 as Down. >> Vice versa was true for the nod
Startup Exception During Upgrade 1.1.6 to 1.2.2 during LCS Migration and Corrupt Tables
Hi, I am exercising the rolling upgrade from 1.1.6 to 1.2.2. When I upgraded to 1.2.2 on the first node, during startup I got this exception: ERROR [main] 2013-03-09 04:24:30,771 CassandraDaemon.java (line 213) Could not migrate old leveled manifest. Move away the .json file in the data directory java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:375) at org.apache.cassandra.utils.EstimatedHistogram$EstimatedHistogramSerializer.deserialize(EstimatedHistogram.java:265) at org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:365) at org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:351) at org.apache.cassandra.db.compaction.LegacyLeveledManifest.migrateManifests(LegacyLeveledManifest.java:100) at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:209) at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:391) at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:434) This is when it is trying to migrate LCS I believe. I removed the Json files from data directories: data/ $ for i in `find . | grep 'json' | awk '{print $1}'`; do rm -rf $i; done Then during the second attempt at restart, I got the following exception: ERROR [main] 2013-03-09 04:24:30,771 CassandraDaemon.java (line 213) Could not migrate old leveled manifest. 
Move away the .json file in the data directory java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:375) at org.apache.cassandra.utils.EstimatedHistogram$EstimatedHistogramSerializer.deserialize(EstimatedHistogram.java:265) at org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:365) at org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:351) at org.apache.cassandra.db.compaction.LegacyLeveledManifest.migrateManifests(LegacyLeveledManifest.java:100) at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:209) at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:391) at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:434) OK. It seems it created snapshots prior to the migration step. So it is safe to remove those, right? data/ $ for i in `find . | grep 'pre-sstablemetamigration' | awk '{print $1}'`; do rm -rf $i; done Now on starting up again, I see a bunch of corrupt sstable log messages: ERROR [SSTableBatchOpen:1] 2013-03-09 04:55:39,826 SSTableReader.java (line 242) Corrupt sstable /var/lib/cassandra/data/keyspace_production/UniqueIndexes/keyspace_production-UniqueIndexes-hf-98318=[Filter.db, Data.db, CompressionInfo.db, Statistics.db, Index.db]; skipped java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:375) at org.apache.cassandra.utils.BloomFilterSerializer.deserialize(BloomFilterSerializer.java:45) at org.apache.cassandra.utils.Murmur2BloomFilter$Murmur2BloomFilterSerializer.deserialize(Murmur2BloomFilter.java:40) at org.apache.cassandra.utils.FilterFactory.deserialize(FilterFactory.java:71) at org.apache.cassandra.io.sstable.SSTableReader.loadBloomFilter(SSTableReader.java:334) at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:199) at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:149) at
org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:238) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) at java.lang.Thread.run(Thread.java:662) This is worrisome. How should I deal with this situation? scrub maybe? Should I open a bug? Cheers, Arya
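The backtick-for-grep loops used above work in a pinch but break on paths containing whitespace and delete without any preview. A safer sketch of the same cleanup, as a function you point at your data directory; run the find commands with -print first as a dry run, and note the /var/lib path in the usage comment is the package default, not a given:

```shell
# Remove old leveled-compaction .json manifests and the
# pre-sstablemetamigration snapshot directories created by the 1.2 upgrade.
clean_manifests() {
  local data_dir="$1"
  # Dry-run variants: swap -delete for -print, and -exec rm -rf for -print.
  find "$data_dir" -name '*.json' -delete
  find "$data_dir" -depth -type d -name '*pre-sstablemetamigration*' -exec rm -rf {} +
}
# clean_manifests /var/lib/cassandra/data   # path is an assumption; adjust to your layout
```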
Re: Incompatible Gossip 1.1.6 to 1.2.1 Upgrade?
No I did not look at nodetool gossipinfo, but from the ring on both pre-upgrade and post-upgrade nodes to 1.2.1, what I observed was the described behavior. On Sat, Feb 23, 2013 at 1:26 AM, Michael Kjellman wrote: > This was a bug with 1.2.0 but resolved in 1.2.1. Did you take a capture of > nodetool gossipinfo and nodetool ring by chance? > > On Feb 23, 2013, at 12:26 AM, "Arya Goudarzi" wrote: > > > Hi C* users, > > > > I just upgraded a 12 node test cluster from 1.1.6 to 1.2.1. What I > noticed from nodetool ring was that the new upgraded nodes only saw each > other as Normal and the rest of the cluster which was on 1.1.6 as Down. > Vice versa was true for the nodes running 1.1.6. They saw each other as > Normal but the 1.2.1 nodes as down. I don't see a note in the upgrade docs that > this would be an issue. Has anyone else observed this problem? > > > > In the debug logs I could see messages saying attempting to connect to > a node IP and then saying it is down. > > > > Cheers, > > -Arya >
Incompatible Gossip 1.1.6 to 1.2.1 Upgrade?
Hi C* users, I just upgraded a 12 node test cluster from 1.1.6 to 1.2.1. What I noticed from nodetool ring was that the new upgraded nodes only saw each other as Normal and the rest of the cluster, which was on 1.1.6, as Down. Vice versa was true for the nodes running 1.1.6: they saw each other as Normal but the 1.2.1 nodes as down. I don't see a note in the upgrade docs that this would be an issue. Has anyone else observed this problem? In the debug logs I could see messages saying attempting to connect to a node IP and then saying it is down. Cheers, -Arya
Re: 1.1.5 Missing Insert! Strange Problem
rcoli helped me investigate this issue. The mystery was that the commit log segment was probably not fsynced to disk, since the setting was periodic with a 10 second delay, and the CRC32 checksum validation failed, skipping the replay; this explains what happened in my scenario. I am going to change our settings to batch mode. Thank you rcoli for your help. On Thu, Sep 27, 2012 at 2:49 PM, Arya Goudarzi wrote: > I was restarting Cassandra nodes again today. 1 hour later my support team > let me know that a customer had reported some missing data. I suppose this > is the same issue. The application logs show that our client got success > from the Thrift call and proceeded with responding to the user, and I could > grep the commit log for the missing record like I did before. > > We have durable writes enabled. To me, it seems like when data is in > memtables and hasn't been flushed to disk, when I restart the node, the > commit log doesn't get replayed correctly. > > Please advise. > > > On Thu, Sep 27, 2012 at 2:43 PM, Arya Goudarzi wrote: > >> Thanks for your reply. I did grep on the commit logs for the offending >> key and grep showed Binary file matches. I am trying to use this tool to >> extract the commitlog and actually confirm that the mutation was a write: >> >> https://github.com/carloscm/cassandra-commitlog-extract.git >> >> >> On Thu, Sep 27, 2012 at 1:45 AM, Sylvain Lebresne >> wrote: >> >>> > I can verify the existence of the key that was inserted in Commitlogs >>> of both replicas however it seems that this record was never inserted. >>> >>> Out of curiosity, how can you verify that? >>> >>> -- >>> Sylvain >>> >> >> >
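The change to batch mode described above amounts to two lines in cassandra.yaml; the window value below is illustrative, not a recommendation:

```yaml
# Default is commitlog_sync: periodic with commitlog_sync_period_in_ms: 10000,
# meaning acknowledged writes can sit un-fsynced for up to 10 seconds and be
# lost on a crash or unclean restart. Batch mode fsyncs before acking.
commitlog_sync: batch
commitlog_sync_batch_window_in_ms: 50   # illustrative; tune against write latency
```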
Re: 1.1.5 Missing Insert! Strange Problem
I was restarting Cassandra nodes again today. 1 hour later my support team let me know that a customer had reported some missing data. I suppose this is the same issue. The application logs show that our client got success from the Thrift call and proceeded with responding to the user, and I could grep the commit log for the missing record like I did before. We have durable writes enabled. To me, it seems like when data is in memtables and hasn't been flushed to disk, when I restart the node, the commit log doesn't get replayed correctly. Please advise. On Thu, Sep 27, 2012 at 2:43 PM, Arya Goudarzi wrote: > Thanks for your reply. I did grep on the commit logs for the offending key > and grep showed Binary file matches. I am trying to use this tool to > extract the commitlog and actually confirm that the mutation was a write: > > https://github.com/carloscm/cassandra-commitlog-extract.git > > > On Thu, Sep 27, 2012 at 1:45 AM, Sylvain Lebresne wrote: > >> > I can verify the existence of the key that was inserted in Commitlogs >> of both replicas however it seems that this record was never inserted. >> >> Out of curiosity, how can you verify that? >> >> -- >> Sylvain >> > >
Re: 1.1.5 Missing Insert! Strange Problem
Thanks for your reply. I did grep on the commit logs for the offending key and grep showed Binary file matches. I am trying to use this tool to extract the commitlog and actually confirm that the mutation was a write: https://github.com/carloscm/cassandra-commitlog-extract.git On Thu, Sep 27, 2012 at 1:45 AM, Sylvain Lebresne wrote: > > I can verify the existence of the key that was inserted in Commitlogs of > both replicas however it seems that this record was never inserted. > > Out of curiosity, how can you verify that? > > -- > Sylvain >
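Short of a full commit-log parser like the tool linked above, grep alone can say a bit more than "Binary file matches". A sketch; find_key_in_commitlogs is a hypothetical helper name, and the commitlog path in the usage comment is the package default:

```shell
# -l lists which binary segments contain the key at all; -a forces grep to
# treat the segment as text and -c counts matching lines, giving a rough
# (line-based, not exact) occurrence count per segment.
find_key_in_commitlogs() {
  local key="$1"; shift
  grep -l  "$key" "$@" 2>/dev/null    # segments containing the key
  grep -ac "$key" "$@" 2>/dev/null    # rough count per segment
}
# find_key_in_commitlogs myrowkey /var/lib/cassandra/commitlog/CommitLog-*.log
```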
Re: 1.1.5 Missing Insert! Strange Problem
Any chance anyone has seen the same mysterious issue? On Wed, Sep 26, 2012 at 12:03 AM, Arya Goudarzi wrote: > No. We don't use TTLs. > > > On Tue, Sep 25, 2012 at 11:47 PM, Roshni Rajagopal < > roshni_rajago...@hotmail.com> wrote: > >> By any chance is a TTL (time to live ) set on the columns... >> >> -- >> Date: Tue, 25 Sep 2012 19:56:19 -0700 >> Subject: 1.1.5 Missing Insert! Strange Problem >> From: gouda...@gmail.com >> To: user@cassandra.apache.org >> >> >> Hi All, >> >> I have a 4 node cluster setup in 2 zones with the NetworkTopology strategy >> and strategy options for writing a copy to each zone, so the effective load >> on each machine is 50%. >> >> Symptom: >> I have a column family that has gc grace seconds of 10 days (the >> default). On the 17th there was an insert done to this column family and from >> our application logs I can see that the client got a successful response >> back with write consistency of ONE. I can verify the existence of the key >> that was inserted in the Commitlogs of both replicas, however it seems that this >> record was never inserted. I used list to get all the column family rows, >> which were about 800ish, and examined them to see if it could possibly be >> deleted by our application. List should have shown them to me, since I have >> not gone beyond gc grace seconds if this record was deleted during past >> days. I could not find it. >> >> Things happened: >> During the same time as this insert was happening, I was performing a >> rolling upgrade of Cassandra from 1.1.3 to 1.1.5 by taking one node down at >> a time, performing the package upgrade and restarting the service and going >> to the next node. I could see from system.log that some mutations were >> replayed during those restarts, so I suppose the memtables were not flushed >> before restart. >> >> >> Could this procedure cause the row insert to disappear? How could I >> troubleshoot, as I am running out of ideas. >> >> Your help is greatly appreciated.
>> >> >> Cheers, >> =Arya >> > >
Re: 1.1.5 Missing Insert! Strange Problem
No. We don't use TTLs. On Tue, Sep 25, 2012 at 11:47 PM, Roshni Rajagopal < roshni_rajago...@hotmail.com> wrote: > By any chance is a TTL (time to live ) set on the columns... > > -- > Date: Tue, 25 Sep 2012 19:56:19 -0700 > Subject: 1.1.5 Missing Insert! Strange Problem > From: gouda...@gmail.com > To: user@cassandra.apache.org > > > Hi All, > > I have a 4 node cluster setup in 2 zones with the NetworkTopology strategy and > strategy options for writing a copy to each zone, so the effective load on > each machine is 50%. > > Symptom: > I have a column family that has gc grace seconds of 10 days (the default). > On the 17th there was an insert done to this column family and from our > application logs I can see that the client got a successful response back > with write consistency of ONE. I can verify the existence of the key that > was inserted in the Commitlogs of both replicas, however it seems that this > record was never inserted. I used list to get all the column family rows, > which were about 800ish, and examined them to see if it could possibly be > deleted by our application. List should have shown them to me, since I have > not gone beyond gc grace seconds if this record was deleted during past > days. I could not find it. > > Things happened: > During the same time as this insert was happening, I was performing a > rolling upgrade of Cassandra from 1.1.3 to 1.1.5 by taking one node down at > a time, performing the package upgrade and restarting the service and going > to the next node. I could see from system.log that some mutations were > replayed during those restarts, so I suppose the memtables were not flushed > before restart. > > > Could this procedure cause the row insert to disappear? How could I > troubleshoot, as I am running out of ideas. > > Your help is greatly appreciated. > > > Cheers, > =Arya >
1.1.5 Missing Insert! Strange Problem
Hi All, I have a 4 node cluster set up in 2 zones with NetworkTopology strategy and strategy options for writing a copy to each zone, so the effective load on each machine is 50%. Symptom: I have a column family that has gc grace seconds of 10 days (the default). On the 17th an insert was made to this column family, and from our application logs I can see that the client got a successful response back with write consistency of ONE. I can verify the existence of the inserted key in the commit logs of both replicas, however it seems this record was never inserted. I used list to get all the column family rows, which were about 800ish, and examined them to see if the row could possibly have been deleted by our application. List should have shown it to me, since I have not gone beyond gc grace seconds if this record was deleted during the past days. I could not find it. Things happened: At the same time as this insert, I was performing a rolling upgrade of Cassandra from 1.1.3 to 1.1.5 by taking one node down at a time, performing the package upgrade, restarting the service, and going to the next node. I could see from system.log that some mutations were replayed during those restarts, so I suppose the memtables were not flushed before restart. Could this procedure cause the row insert to disappear? How can I troubleshoot this? I am running out of ideas. Your help is greatly appreciated. Cheers, =Arya
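For anyone wanting to narrow down which node's SSTables and commit logs to inspect for the key: under RandomPartitioner a key's token can be computed by hand. Here is a rough sketch; the md5-abs token computation reflects my understanding of RandomPartitioner (verify against your version), and the ring tokens/addresses are made-up examples:

```python
import hashlib

def md5_token(key: bytes) -> int:
    # RandomPartitioner (as I understand it): abs() of the signed
    # big-endian md5 of the raw key bytes -- a value in 0 .. 2**127.
    return abs(int.from_bytes(hashlib.md5(key).digest(), "big", signed=True))

def primary_replica(key: bytes, ring):
    # ring: list of (token, address). The owner is the first node whose
    # token is >= the key's token, wrapping around to the lowest token.
    t = md5_token(key)
    for node_token, addr in sorted(ring):
        if t <= node_token:
            return addr
    return min(ring)[1]  # wrapped past the highest token

# Hypothetical 4-node ring with evenly spaced tokens:
ring = [(i * 2**127 // 4, f"10.0.0.{i + 1}") for i in range(4)]
```

With the primary known, the additional replicas follow from the replication strategy (for NetworkTopologyStrategy, the next node in the other zone).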
Steps to Manually Resolve Stuck Schema!! CASSANDRA-4561
Just had a good conversation with rcoli in chat. I want to clarify the steps for resolving this issue and see if there are any pitfalls I am missing. Issue: I upgraded from 1.1.2 to 1.1.3 a while ago, and today I realized I cannot make any schema changes since the fix in https://issues.apache.org/jira/browse/CASSANDRA-4432. Solution: Somehow I have to make Cassandra's system column families forget about those old schemas with nanosecond timestamps. I have to do this either live or with a brief downtime. Please advise of any pitfalls or errors in my steps; I am planning to automate them. Within a short downtime, I have to do this: 1. Take all nodes out of service; 2. Run nodetool flush on each; 3. Stop cassandra on each node; 4. Remove /var/lib/cassandra/data/system 5. Remove /var/lib/cassandra/saved_caches/system-* 6. Start all nodes; 7. cassandra-cli < schema_definition_file on one node only (includes create keyspace and create column family entries); 8. Put the nodes back in service. 9. Done. Please advise if I have the steps correct or if I am missing something. Thanks in advance for your help. Cheers, -Arya
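To sanity-check the ordering before scripting it, a throwaway sketch like this just renders the procedure above as an ordered list of per-node shell steps (node names, the service command, and the schema file name are placeholders; nothing is executed):

```python
def schema_reset_plan(nodes, data_dir="/var/lib/cassandra"):
    """Render the downtime procedure as an ordered list of per-node steps.
    Illustrative only: node names, init command, and schema file are placeholders."""
    plan = [f"{n}: nodetool flush" for n in nodes]                            # step 2
    plan += [f"{n}: service cassandra stop" for n in nodes]                   # step 3
    plan += [f"{n}: rm -rf {data_dir}/data/system" for n in nodes]            # step 4
    plan += [f"{n}: rm -f {data_dir}/saved_caches/system-*" for n in nodes]   # step 5
    plan += [f"{n}: service cassandra start" for n in nodes]                  # step 6
    plan.append(f"{nodes[0]}: cassandra-cli < schema_definition_file")        # step 7: one node only
    return plan
```

Printing schema_reset_plan(["node1", "node2", "node3"]) makes it easy to eyeball that every node is flushed before anything is stopped, and that the schema is reloaded on exactly one node after all nodes are back up.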
Secondary Index Limitation / Does it exist in Cassandra 1.1.2
Hi All, Correct me if I am wrong, but I know that secondary indexes are stored in local column families on each node. Previously, when the default key cache value was 200,000 rows and you couldn't really tune the local index column family, that imposed a low-cardinality requirement on the possible values of the secondary index column. However, in Cassandra 1.1.2 I no longer see the option of tuning the cache per row count; it is solely memory based. I wonder if this eliminates the previous limitations with secondary indexes. Please advise. Cheers, -Arya
Re: cannot build 1.1.2 from source
Thanks for your suggestion; however, I am still unable to build 1.1.2. I found a version of antlr and libantlr installed by apt, which I then removed, but that did not resolve the issue. After some digging in Google, I found some people having similar problems with antlr who suggested increasing the conversion timeout, so I added -Xconversiontimeout 20 to build.xml as an antlr parameter for all 3 antlr-related targets, and that still didn't work. I have added -d to ant and I can see it is using the provided jar, but no luck. Any more tips would be appreciated. Here is the antlr command from the debug output of ant: Execute:Java13CommandLauncher: Executing '/opt/java/64/jdk1.6.0_32/jre/bin/java' with arguments: '-classpath' '/home/arya/workspace/cassandra-1.1.2/lib/antlr-3.2.jar' 'org.antlr.Tool' '/home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g' '-fo' '/home/arya/workspace/cassandra-1.1.2/src/gen-java/org/apache/cassandra/cli/' '-Xconversiontimeout 20' On Tue, Jul 10, 2012 at 5:31 AM, Sylvain Lebresne wrote: > I would check if you don't have a version of antlr installed on your > system that takes > precedence over the one distributed with C* and happens to not be compatible. > > Because I don't remember there having been much change to the Cli grammar between > 1.1.1 > and 1.1.2, and nobody has had that problem so far. > > -- > Sylvain > > On Mon, Jul 9, 2012 at 8:07 PM, Arya Goudarzi wrote: >> Thanks for your response. Yes. I do that every time before I build. >> >> On Sun, Jul 8, 2012 at 11:51 AM, aaron morton >> wrote: >>> Did you try running ant clean first ? 
>>> >>> Cheers >>> >>> - >>> Aaron Morton >>> Freelance Developer >>> @aaronmorton >>> http://www.thelastpickle.com >>> >>> On 8/07/2012, at 1:57 PM, Arya Goudarzi wrote: >>> >>> Hi Fellows, >>> >>> I used to be able to build cassandra 1.1 up to 1.1.1 with the same set >>> of procedures by running ant on the same machine, but now the stuff >>> associated with gen-cli-grammar breaks the build. Any advice will be >>> greatly appreciated. >>> >>> -Arya >>> >>> Source: >>> source tarball for 1.1.2 downloaded from one of the mirrors in >>> cassandra.apache.org >>> OS: >>> Ubuntu 10.04 Precise 64bit >>> Ant: >>> Apache Ant(TM) version 1.8.2 compiled on December 3 2011 >>> Maven: >>> Apache Maven 3.0.3 (r1075438; 2011-02-28 17:31:09+) >>> Java: >>> java version "1.6.0_32" >>> Java(TM) SE Runtime Environment (build 1.6.0_32-b05) >>> Java HotSpot(TM) 64-Bit Server VM (build 20.7-b02, mixed mode) >>> >>> >>> >>> Buildfile: /home/arya/workspace/cassandra-1.1.2/build.xml >>> >>> maven-ant-tasks-localrepo: >>> >>> maven-ant-tasks-download: >>> >>> maven-ant-tasks-init: >>> >>> maven-declare-dependencies: >>> >>> maven-ant-tasks-retrieve-build: >>> >>> init-dependencies: >>> [echo] Loading dependency paths from file: >>> /home/arya/workspace/cassandra-1.1.2/build/build-dependencies.xml >>> >>> init: >>>[mkdir] Created dir: >>> /home/arya/workspace/cassandra-1.1.2/build/classes/main >>>[mkdir] Created dir: >>> /home/arya/workspace/cassandra-1.1.2/build/classes/thrift >>>[mkdir] Created dir: /home/arya/workspace/cassandra-1.1.2/build/test/lib >>>[mkdir] Created dir: >>> /home/arya/workspace/cassandra-1.1.2/build/test/classes >>>[mkdir] Created dir: /home/arya/workspace/cassandra-1.1.2/src/gen-java >>> >>> check-avro-generate: >>> >>> avro-interface-generate-internode: >>> [echo] Generating Avro internode code... 
>>> >>> avro-generate: >>> >>> build-subprojects: >>> >>> check-gen-cli-grammar: >>> >>> gen-cli-grammar: >>> [echo] Building Grammar >>> /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g >>> >>> [java] warning(209): >>> /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:697:1: >>> Multiple token rules can match input such as "'-'": >>> IntegerNegativeLiteral, COMMENT >>> [java] >>> [java] As a result, token(s) COMMENT were disabled for that input &
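One detail worth noting from the Execute:Java13CommandLauncher line quoted earlier in the thread: '-Xconversiontimeout 20' is passed to org.antlr.Tool as a single argv element (the quotes wrap the flag and the value together), so antlr likely sees one unknown option instead of a flag plus a value. The presumed build.xml fix would be two separate <arg value="..."/> elements, or one <arg line="-Xconversiontimeout 20"/>. A tiny illustration of the difference (plain Python, not part of the build):

```python
import shlex

# What ant appears to be passing: one argv element with the space inside it.
fused = ["-Xconversiontimeout 20"]

# What a shell (and antlr's option parser) would expect: two elements.
split = shlex.split("-Xconversiontimeout 20")

print(fused)   # one element, flag and value fused
print(split)   # flag and value as separate elements
```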
Re: cannot build 1.1.2 from source
Thanks for your response. Yes. I do that every time before I build. On Sun, Jul 8, 2012 at 11:51 AM, aaron morton wrote: > Did you try running ant clean first ? > > Cheers > > - > Aaron Morton > Freelance Developer > @aaronmorton > http://www.thelastpickle.com > > On 8/07/2012, at 1:57 PM, Arya Goudarzi wrote: > > Hi Fellows, > > I used to be able to build cassandra 1.1 up to 1.1.1 with the same set > of procedures by running ant on the same machine, but now the stuff > associated with gen-cli-grammar breaks the build. Any advice will be > greatly appreciated. > > -Arya > > Source: > source tarball for 1.1.2 downloaded from one of the mirrors in > cassandra.apache.org > OS: > Ubuntu 10.04 Precise 64bit > Ant: > Apache Ant(TM) version 1.8.2 compiled on December 3 2011 > Maven: > Apache Maven 3.0.3 (r1075438; 2011-02-28 17:31:09+) > Java: > java version "1.6.0_32" > Java(TM) SE Runtime Environment (build 1.6.0_32-b05) > Java HotSpot(TM) 64-Bit Server VM (build 20.7-b02, mixed mode) > > > > Buildfile: /home/arya/workspace/cassandra-1.1.2/build.xml > > maven-ant-tasks-localrepo: > > maven-ant-tasks-download: > > maven-ant-tasks-init: > > maven-declare-dependencies: > > maven-ant-tasks-retrieve-build: > > init-dependencies: > [echo] Loading dependency paths from file: > /home/arya/workspace/cassandra-1.1.2/build/build-dependencies.xml > > init: >[mkdir] Created dir: > /home/arya/workspace/cassandra-1.1.2/build/classes/main >[mkdir] Created dir: > /home/arya/workspace/cassandra-1.1.2/build/classes/thrift >[mkdir] Created dir: /home/arya/workspace/cassandra-1.1.2/build/test/lib >[mkdir] Created dir: > /home/arya/workspace/cassandra-1.1.2/build/test/classes >[mkdir] Created dir: /home/arya/workspace/cassandra-1.1.2/src/gen-java > > check-avro-generate: > > avro-interface-generate-internode: > [echo] Generating Avro internode code... 
> > avro-generate: > > build-subprojects: > > check-gen-cli-grammar: > > gen-cli-grammar: > [echo] Building Grammar > /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g > > [java] warning(209): > /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:697:1: > Multiple token rules can match input such as "'-'": > IntegerNegativeLiteral, COMMENT > [java] > [java] As a result, token(s) COMMENT were disabled for that input > [java] warning(209): > /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: > Multiple token rules can match input such as "'I'": INCR, INDEX, > Identifier > [java] > [java] As a result, token(s) INDEX,Identifier were disabled for that > input > [java] warning(209): > /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: > Multiple token rules can match input such as "'0'..'9'": IP_ADDRESS, > IntegerPositiveLiteral, DoubleLiteral, Identifier > [java] > [java] As a result, token(s) > IntegerPositiveLiteral,DoubleLiteral,Identifier were disabled for that > input > [java] warning(209): > /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: > Multiple token rules can match input such as "'T'": TRUNCATE, TTL, > Identifier > [java] > [java] As a result, token(s) TTL,Identifier were disabled for that input > [java] warning(209): > /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: > Multiple token rules can match input such as "'A'": T__109, > API_VERSION, AND, ASSUME, Identifier > [java] > [java] As a result, token(s) API_VERSION,AND,ASSUME,Identifier > were disabled for that input > [java] warning(209): > /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: > Multiple token rules can match input such as "'E'": EXIT, Identifier > [java] > [java] As a result, token(s) Identifier were disabled for that input > [java] warning(209): > 
/home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: > Multiple token rules can match input such as "'L'": LIST, LIMIT, > Identifier > [java] > [java] As a result, token(s) LIMIT,Identifier were disabled for that > input > [java] warning(209): > /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: > Multiple token rules can match input such as "'B'": BY, Identifier > [java] > [
cannot build 1.1.2 from source
Hi Fellows, I used to be able to build cassandra 1.1 up to 1.1.1 with the same set of procedures by running ant on the same machine, but now the stuff associated with gen-cli-grammar breaks the build. Any advice will be greatly appreciated. -Arya Source: source tarball for 1.1.2 downloaded from one of the mirrors in cassandra.apache.org OS: Ubuntu 10.04 Precise 64bit Ant: Apache Ant(TM) version 1.8.2 compiled on December 3 2011 Maven: Apache Maven 3.0.3 (r1075438; 2011-02-28 17:31:09+) Java: java version "1.6.0_32" Java(TM) SE Runtime Environment (build 1.6.0_32-b05) Java HotSpot(TM) 64-Bit Server VM (build 20.7-b02, mixed mode) Buildfile: /home/arya/workspace/cassandra-1.1.2/build.xml maven-ant-tasks-localrepo: maven-ant-tasks-download: maven-ant-tasks-init: maven-declare-dependencies: maven-ant-tasks-retrieve-build: init-dependencies: [echo] Loading dependency paths from file: /home/arya/workspace/cassandra-1.1.2/build/build-dependencies.xml init: [mkdir] Created dir: /home/arya/workspace/cassandra-1.1.2/build/classes/main [mkdir] Created dir: /home/arya/workspace/cassandra-1.1.2/build/classes/thrift [mkdir] Created dir: /home/arya/workspace/cassandra-1.1.2/build/test/lib [mkdir] Created dir: /home/arya/workspace/cassandra-1.1.2/build/test/classes [mkdir] Created dir: /home/arya/workspace/cassandra-1.1.2/src/gen-java check-avro-generate: avro-interface-generate-internode: [echo] Generating Avro internode code... 
avro-generate: build-subprojects: check-gen-cli-grammar: gen-cli-grammar: [echo] Building Grammar /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g [java] warning(209): /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:697:1: Multiple token rules can match input such as "'-'": IntegerNegativeLiteral, COMMENT [java] [java] As a result, token(s) COMMENT were disabled for that input [java] warning(209): /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: Multiple token rules can match input such as "'I'": INCR, INDEX, Identifier [java] [java] As a result, token(s) INDEX,Identifier were disabled for that input [java] warning(209): /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: Multiple token rules can match input such as "'0'..'9'": IP_ADDRESS, IntegerPositiveLiteral, DoubleLiteral, Identifier [java] [java] As a result, token(s) IntegerPositiveLiteral,DoubleLiteral,Identifier were disabled for that input [java] warning(209): /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: Multiple token rules can match input such as "'T'": TRUNCATE, TTL, Identifier [java] [java] As a result, token(s) TTL,Identifier were disabled for that input [java] warning(209): /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: Multiple token rules can match input such as "'A'": T__109, API_VERSION, AND, ASSUME, Identifier [java] [java] As a result, token(s) API_VERSION,AND,ASSUME,Identifier were disabled for that input [java] warning(209): /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: Multiple token rules can match input such as "'E'": EXIT, Identifier [java] [java] As a result, token(s) Identifier were disabled for that input [java] warning(209): /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: Multiple token rules can match input such 
as "'L'": LIST, LIMIT, Identifier [java] [java] As a result, token(s) LIMIT,Identifier were disabled for that input [java] warning(209): /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: Multiple token rules can match input such as "'B'": BY, Identifier [java] [java] As a result, token(s) Identifier were disabled for that input [java] warning(209): /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: Multiple token rules can match input such as "'O'": ON, Identifier [java] [java] As a result, token(s) Identifier were disabled for that input [java] warning(209): /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:628:1: Multiple token rules can match input such as "'K'": KEYSPACE, KEYSPACES, Identifier [java] [java] As a result, token(s) KEYSPACES,Identifier were disabled for that input [java] warning(209): /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:38:1: Multiple token rules can match input such as "'<'": T__113, T__115 [java] [java] As a result, token(s) T__115 were disabled for that input [java] warning(209): /home/arya/workspace/cassandra-1.1.2/src/java/org/apache/cassandra/cli/Cli.g:693:1: Multiple token rules can match input such as "' '": DoubleLiteral, WS [java] [java]
Re: using jna.jar "Unknown mlockall error 0"
In this case, should one set ulimit -l to at least the heap size? Thanks, -Arya Goudarzi - Original Message - From: "Peter Schuller" To: user@cassandra.apache.org Sent: Saturday, October 9, 2010 1:18:28 AM Subject: Re: using jna.jar "Unknown mlockall error 0" > IIRC, mlockall doesn't work as a non root user on Linux. Memory locking is permitted for non-root on modern Linux, but is subject to resource limitations (ulimit -l). I believe this also applies to mlockall() (and not just mlock()), provided that you lock MCL_CURRENT rather than MCL_FUTURE (Cassandra does the former). -- / Peter Schuller
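For a quick sanity check of that limit from a script, ulimit -l corresponds to RLIMIT_MEMLOCK, so one can compare the soft limit against the heap you intend to lock. A Linux-only sketch (the heap size is whatever -Xmx you run Cassandra with; this only inspects limits, it does not lock anything):

```python
import resource

def memlock_allows(heap_bytes: int) -> bool:
    # ulimit -l maps to RLIMIT_MEMLOCK; mlockall() by a non-root process
    # fails unless the soft limit is unlimited or covers the locked size.
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return soft == resource.RLIM_INFINITY or soft >= heap_bytes
```

For example, memlock_allows(8 * 1024**3) would tell you whether an 8 GB heap could be locked under the current soft limit.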
Re: After loadbalance why does the size increase
You won't get even distribution. All loadbalance does is decommission and auto-bootstrap. http://wiki.apache.org/cassandra/Operations . However, you will get closer to even distribution if you double the size of your cluster, meaning if you had 3 nodes, you add 3 more nodes. In that case, each new node will assume 1/2 of an older node's range. Right now the only way you can get a balanced ring is by doing manual token moves. CASSANDRA-1418 seems to be addressing this issue. I will also look into it. - Original Message - From: "Joe Alex" To: user@cassandra.apache.org Sent: Tuesday, October 26, 2010 2:38:07 PM Subject: Re: After loadbalance why does the size increase I did not. I did try "cleanup" and here is after that. Was expecting an even distribution and total load to be approximately same as before also was reading this https://issues.apache.org/jira/browse/CASSANDRA-1418 Address Status Load Range Ring 127314552263552317194896535803056965704 10.210.32.75 Up 684.52 MB 3261306534628430564231103090162243723 |<--| 10.210.32.92 Up 727.91 MB 38958085454962719903262501903937071633 | | 10.210.32.93 Up 366.47 MB 81410091240220304547331026452103785021 | | 10.210.32.74 Up 415.47 MB 127314552263552317194896535803056965704|-->| On Tue, Oct 26, 2010 at 5:28 PM, Arya Goudarzi wrote: > Do you perform nodetool cleanup after you loadbalance? > > > From: "Joe Alex" > To: cassandra-u...@incubator.apache.org > Sent: Tuesday, October 26, 2010 2:06:58 PM > Subject: After loadbalance why does the size increase > > Hi, > > I have Cassandra 0.6.6 running on 4 nodes with RF=2. I have around 2 > million rows for a simple test. 
> Can somebody tell me why does the size keep increasing after running > "loadbalance" > > Address Status Load Range > Ring > > 127384081359183520765545698844531833520 > 10.210.32.75 Up 458.08 MB > 3261306534628430564231103090162243723 |<--| > 10.210.32.92 Up 407.88 MB > 38958085454962719903262501903937071633 | | > 10.210.32.93 Up 350.51 MB > 81410091240220304547331026452103785021 | | > 10.210.32.74 Up 498.58 MB > 127384081359183520765545698844531833520 |-->| > > > Address Status Load Range > Ring > > 127314552263552317194896535803056965704 > 10.210.32.75 Up 684.52 MB > 3261306534628430564231103090162243723 |<--| > 10.210.32.92 Up 624.67 MB > 38958085454962719903262501903937071633 | | > 10.210.32.93 Up 350.51 MB > 81410091240220304547331026452103785021 | | > 10.210.32.74 Up 924.65 MB > 127314552263552317194896535803056965704 |-->| >
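If you do go the manual-move route, evenly spaced RandomPartitioner tokens are just i * 2**127 / N. A small sketch (this assumes RandomPartitioner's 0 .. 2**127 token space; feed the results to nodetool move one node at a time, then run nodetool cleanup on each node):

```python
def balanced_tokens(node_count: int):
    # Evenly spaced initial tokens for RandomPartitioner (token space 0 .. 2**127).
    return [i * 2**127 // node_count for i in range(node_count)]

for token in balanced_tokens(4):
    print(token)
```

Node i then owns an equal 1/N slice of the ring, which is what loadbalance alone won't give you.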
Re: After loadbalance why does the size increase
Do you perform nodetool cleanup after you loadbalance? From: "Joe Alex" To: cassandra-u...@incubator.apache.org Sent: Tuesday, October 26, 2010 2:06:58 PM Subject: After loadbalance why does the size increase Hi, I have Cassandra 0.6.6 running on 4 nodes with RF=2. I have around 2 million rows for a simple test. Can somebody tell me why does the size keep increasing after running "loadbalance" Address Status Load Range Ring 127384081359183520765545698844531833520 10.210.32.75 Up 458.08 MB 3261306534628430564231103090162243723 |<--| 10.210.32.92 Up 407.88 MB 38958085454962719903262501903937071633 | | 10.210.32.93 Up 350.51 MB 81410091240220304547331026452103785021 | | 10.210.32.74 Up 498.58 MB 127384081359183520765545698844531833520 |-->| Address Status Load Range Ring 127314552263552317194896535803056965704 10.210.32.75 Up 684.52 MB 3261306534628430564231103090162243723 |<--| 10.210.32.92 Up 624.67 MB 38958085454962719903262501903937071633 | | 10.210.32.93 Up 350.51 MB 81410091240220304547331026452103785021 | | 10.210.32.74 Up 924.65 MB 127314552263552317194896535803056965704 |-->|
Re: ERROR saved_caches_directory missing
Never mind, I did not pay attention to the new config change. - Original Message - From: "Arya Goudarzi" To: user@cassandra.apache.org Sent: Friday, October 8, 2010 4:22:34 PM Subject: ERROR saved_caches_directory missing Upgraded code from trunk 10/7 to trunk 10/8 and nodes don't start: ERROR 16:19:41,335 Fatal error: saved_caches_directory missing Please advise. Best Regards, -Arya
ERROR saved_caches_directory missing
Upgraded code from trunk 10/7 to trunk 10/8 and nodes don't start: ERROR 16:19:41,335 Fatal error: saved_caches_directory missing Please advise. Best Regards, -Arya
Fwd: Fix for (some?) of the PHP Thrift problems
Forwarding to the Cassandra users list. This fix addresses the issue of the PHP accelerated module returning a "Cannot Read XX bytes" TException due to a faulty stream being sent to FramedTransport. On Tue, Aug 31, 2010 at 12:13 PM, Bryan Duxbury wrote: > Hey guys, > > I think someone has managed to figure out why the PHP Thrift > accelerator module has been causing issues for users. Does anyone feel like trying > out > https://issues.apache.org/jira/browse/THRIFT-867 to see if that patch > fixes > their issues? Nice! I think you'll find better reception on the cassandra-user list, rather than this newly created one, though. -Brandon
MIGRATION-STAGE: IllegalArgumentException: value already present
While inserting into a 3 node cluster, one of the nodes got this exception in its log: ERROR [MIGRATION-STAGE:1] 2010-08-16 17:46:24,090 CassandraDaemon.java (line 82) Uncaught exception in thread Thread[MIGRATION-STAGE:1,5,main] java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: value already present: 1017 at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252) at java.util.concurrent.FutureTask.get(FutureTask.java:111) at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.afterExecute(DebuggableThreadPoolExecutor.java:87) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1118) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:636) Caused by: java.lang.IllegalArgumentException: value already present: 1017 at com.google.common.base.Preconditions.checkArgument(Preconditions.java:115) at com.google.common.collect.AbstractBiMap.putInBothMaps(AbstractBiMap.java:109) at com.google.common.collect.AbstractBiMap.put(AbstractBiMap.java:94) at com.google.common.collect.HashBiMap.put(HashBiMap.java:83) at org.apache.cassandra.config.CFMetaData.map(CFMetaData.java:170) at org.apache.cassandra.db.migration.AddColumnFamily.applyModels(AddColumnFamily.java:78) at org.apache.cassandra.db.migration.Migration.apply(Migration.java:157) at org.apache.cassandra.thrift.CassandraServer$2.call(CassandraServer.java:729) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) ... 2 more What does it mean? Is it something I should open a JIRA for? -Arya
Re: COMMIT-LOG_WRITER Assertion Error
Sure. https://issues.apache.org/jira/browse/CASSANDRA-1376 - Original Message - From: "Jonathan Ellis" To: user@cassandra.apache.org Sent: Tuesday, August 10, 2010 7:05:31 AM Subject: Re: COMMIT-LOG_WRITER Assertion Error Can you create a ticket for this? On Mon, Aug 9, 2010 at 8:42 PM, Arya Goudarzi wrote: > I've never run 0.6. I have been running of trunc with automatic svn update > and build everyday at 2pm. One of my nodes got this error which lead to the > same last error prior to build and restart today. Hope this helps better: > > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: java.lang.RuntimeException: > java.util.concurrent.ExecutionException: java.lang.AssertionError > at > org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:549) > at > org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:339) > at > org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:174) > at > org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:120) > at > org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:90) > at > org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:224) > Caused by: java.util.concurrent.ExecutionException: > java.lang.RuntimeException: java.lang.RuntimeException: > java.util.concurrent.ExecutionException: java.lang.AssertionError > at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252) > at java.util.concurrent.FutureTask.get(FutureTask.java:111) > at > org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:545) > ... 
5 more > Caused by: java.lang.RuntimeException: java.lang.RuntimeException: > java.util.concurrent.ExecutionException: java.lang.AssertionError > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) > at java.util.concurrent.FutureTask.run(FutureTask.java:166) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) > at java.lang.Thread.run(Thread.java:636) > Caused by: java.lang.RuntimeException: > java.util.concurrent.ExecutionException: java.lang.AssertionError > at > org.apache.cassandra.db.commitlog.CommitLog.discardCompletedSegments(CommitLog.java:408) > at > org.apache.cassandra.db.ColumnFamilyStore$2.runMayThrow(ColumnFamilyStore.java:445) > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30) > ... 6 more > Caused by: java.util.concurrent.ExecutionException: java.lang.AssertionError > at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252) > at java.util.concurrent.FutureTask.get(FutureTask.java:111) > at > org.apache.cassandra.db.commitlog.CommitLog.discardCompletedSegments(CommitLog.java:400) > ... 
8 more > Caused by: java.lang.AssertionError > at > org.apache.cassandra.db.commitlog.CommitLogHeader$CommitLogHeaderSerializer.serialize(CommitLogHeader.java:157) > at > org.apache.cassandra.db.commitlog.CommitLogHeader.writeCommitLogHeader(CommitLogHeader.java:124) > at > org.apache.cassandra.db.commitlog.CommitLogSegment.writeHeader(CommitLogSegment.java:70) > at > org.apache.cassandra.db.commitlog.CommitLog.discardCompletedSegmentsInternal(CommitLog.java:450) > at > org.apache.cassandra.db.commitlog.CommitLog.access$300(CommitLog.java:75) > at > org.apache.cassandra.db.commitlog.CommitLog$6.call(CommitLog.java:394) > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) > at java.util.concurrent.FutureTask.run(FutureTask.java:166) > at > org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:52) > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30) > ... 1 more > > - Original Message - > From: "Jonathan Ellis" > To: user@cassandra.apache.org > Sent: Monday, August 9, 2010 5:18:35 PM > Subject: Re: COMMIT-LOG_WRITER Assertion Error > > Sounds like you upgraded to trunk from 0.6 without draining your > commitlog first? > > On Mon, Aug 9, 2010 at 3:30 PM, Arya Goudarzi > wrote: >> Just throwing this out there as it could be a concern. I had a cluster of 3 >> n
Re: COMMIT-LOG_WRITER Assertion Error
I've never run 0.6. I have been running off trunk with automatic svn update and build every day at 2pm. One of my nodes got this error, which led to the same last error prior to build and restart today. Hope this helps better: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.AssertionError at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:549) at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:339) at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:174) at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:120) at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:90) at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:224) Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.AssertionError at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252) at java.util.concurrent.FutureTask.get(FutureTask.java:111) at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:545) ... 
5 more Caused by: java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.AssertionError at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:636) Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.AssertionError at org.apache.cassandra.db.commitlog.CommitLog.discardCompletedSegments(CommitLog.java:408) at org.apache.cassandra.db.ColumnFamilyStore$2.runMayThrow(ColumnFamilyStore.java:445) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30) ... 6 more Caused by: java.util.concurrent.ExecutionException: java.lang.AssertionError at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252) at java.util.concurrent.FutureTask.get(FutureTask.java:111) at org.apache.cassandra.db.commitlog.CommitLog.discardCompletedSegments(CommitLog.java:400) ... 
8 more Caused by: java.lang.AssertionError at org.apache.cassandra.db.commitlog.CommitLogHeader$CommitLogHeaderSerializer.serialize(CommitLogHeader.java:157) at org.apache.cassandra.db.commitlog.CommitLogHeader.writeCommitLogHeader(CommitLogHeader.java:124) at org.apache.cassandra.db.commitlog.CommitLogSegment.writeHeader(CommitLogSegment.java:70) at org.apache.cassandra.db.commitlog.CommitLog.discardCompletedSegmentsInternal(CommitLog.java:450) at org.apache.cassandra.db.commitlog.CommitLog.access$300(CommitLog.java:75) at org.apache.cassandra.db.commitlog.CommitLog$6.call(CommitLog.java:394) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:52) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30) ... 1 more - Original Message - From: "Jonathan Ellis" To: user@cassandra.apache.org Sent: Monday, August 9, 2010 5:18:35 PM Subject: Re: COMMIT-LOG_WRITER Assertion Error Sounds like you upgraded to trunk from 0.6 without draining your commitlog first? On Mon, Aug 9, 2010 at 3:30 PM, Arya Goudarzi wrote: > Just throwing this out there as it could be a concern. I had a cluster of 3 > nodes running. Over the weekend I updated to trunc (Aug 9th @ 2pm). Today, I > came to run my daily tests and my client kept giving me TSocket timeouts. > Checking the error log of Cassandra servers, all 3 nodes had this and they > all became unresponsive! Not sure how to reproduce this but a restart of all > 3 nodes fixed the issue: > > ERROR [COMMIT-LOG-WRITER] 2010-08-09 11:30:27,722 CassandraDaemon.java (line > 82) Uncaught exception in thread Thread[COMMIT-LOG-WRITER,5,main] > java.lang.AssertionError > at > org.apache.cassandra.db.commitlog.CommitLogHeader$CommitLogHeaderSerializer.serialize(CommitLogHeader.java:157) > at > org.apache.cassandra.db.commitlog.Comm
COMMIT-LOG_WRITER Assertion Error
Just throwing this out there as it could be a concern. I had a cluster of 3 nodes running. Over the weekend I updated to trunk (Aug 9th @ 2pm). Today, I came to run my daily tests and my client kept giving me TSocket timeouts. Checking the error logs of the Cassandra servers, all 3 nodes had this and they all became unresponsive! Not sure how to reproduce this, but a restart of all 3 nodes fixed the issue:

ERROR [COMMIT-LOG-WRITER] 2010-08-09 11:30:27,722 CassandraDaemon.java (line 82) Uncaught exception in thread Thread[COMMIT-LOG-WRITER,5,main]
java.lang.AssertionError
	at org.apache.cassandra.db.commitlog.CommitLogHeader$CommitLogHeaderSerializer.serialize(CommitLogHeader.java:157)
	at org.apache.cassandra.db.commitlog.CommitLogHeader.writeCommitLogHeader(CommitLogHeader.java:124)
	at org.apache.cassandra.db.commitlog.CommitLogSegment.writeHeader(CommitLogSegment.java:70)
	at org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:103)
	at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:521)
	at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:52)
	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
	at java.lang.Thread.run(Thread.java:636)

-Arya
Re: Avro Runtime Exception Bad Index
I pull from trunk and build every day at 2pm PST. So:

Previous Version: trunk, July 28th @ 2pm PST
Broken Startup Version: trunk, July 29th @ 2pm PST

Today, I also ended up getting the following AssertionError when my cron svn-updated and built, and I had to clean up my commitlog and data directories for it to start:

Previous Version: trunk, July 29th @ 2pm PST
Broken Startup Version: trunk, July 30th @ 2pm PST

java.lang.AssertionError
	at org.apache.cassandra.db.ColumnFamily.create(ColumnFamily.java:67)
	at org.apache.cassandra.db.ColumnFamily.create(ColumnFamily.java:57)
	at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:112)
	at org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:372)
	at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:382)
	at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:340)
	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:255)
	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:173)
	at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:120)
	at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:90)
	at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:224)

No one is using this cluster, so there is 0 traffic at the time the update happens.

----- Original Message -----
From: "Stu Hood"
To: user@cassandra.apache.org
Sent: Thursday, July 29, 2010 2:52:48 PM
Subject: RE: Avro Runtime Exception Bad Index

Can you determine approximately what revisions you were running before and after?

-----Original Message-----
From: "Arya Goudarzi"
Sent: Thursday, July 29, 2010 4:42pm
To: user@cassandra.apache.org
Subject: Avro Runtime Exception Bad Index

Just wanted to toss this out there in case this is an issue, or the format really changed and I have to start from a clean slate.
I was running from yesterday's trunk and had some Keyspaces with data. Today's trunk failed server start, giving this exception:

ERROR [main] 2010-07-29 14:05:21,489 AbstractCassandraDaemon.java (line 107) Exception encountered during startup.
org.apache.avro.AvroRuntimeException: Bad index
	at org.apache.cassandra.avro.KsDef.put(KsDef.java:27)
	at org.apache.avro.specific.SpecificDatumReader.setField(SpecificDatumReader.java:47)
	at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:108)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:80)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:71)
	at org.apache.cassandra.io.SerDeUtils.deserialize(SerDeUtils.java:57)
	at org.apache.cassandra.db.DefsTable.loadFromStorage(DefsTable.java:112)
	at org.apache.cassandra.config.DatabaseDescriptor.loadSchemas(DatabaseDescriptor.java:471)
	at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:103)
	at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:90)
	at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:224)

-Arya
Avro Runtime Exception Bad Index
Just wanted to toss this out there in case this is an issue, or the format really changed and I have to start from a clean slate. I was running from yesterday's trunk and had some Keyspaces with data. Today's trunk failed server start, giving this exception:

ERROR [main] 2010-07-29 14:05:21,489 AbstractCassandraDaemon.java (line 107) Exception encountered during startup.
org.apache.avro.AvroRuntimeException: Bad index
	at org.apache.cassandra.avro.KsDef.put(KsDef.java:27)
	at org.apache.avro.specific.SpecificDatumReader.setField(SpecificDatumReader.java:47)
	at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:108)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:80)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:71)
	at org.apache.cassandra.io.SerDeUtils.deserialize(SerDeUtils.java:57)
	at org.apache.cassandra.db.DefsTable.loadFromStorage(DefsTable.java:112)
	at org.apache.cassandra.config.DatabaseDescriptor.loadSchemas(DatabaseDescriptor.java:471)
	at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:103)
	at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:90)
	at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:224)

-Arya
Re: Increasing Replication Factor in 0.7
https://issues.apache.org/jira/browse/CASSANDRA-1285 - Original Message - From: "Gary Dusbabek" To: user@cassandra.apache.org Sent: Friday, July 16, 2010 7:17:48 AM Subject: Re: Increasing Replication Factor in 0.7 Arya, That is not currently possible in trunk. It would be a good feature though. Care to file a ticket? Gary. On Thu, Jul 15, 2010 at 22:13, Arya Goudarzi wrote: > I recall jbellis in his training showing us how to increase the replication > factor and repair data on a cluster in 0.6. How is that possible in 0.7 when > you cannot change schemas from config and there is no alter keyspace method > in api? >
Re: nodetool loadbalance : Streams Continue on Non Acceptance of New Token
Hi Gary,

Thanks for the reply. I tried this again today. Streams get stuck; please read my comment: https://issues.apache.org/jira/browse/CASSANDRA-1221

-arya

----- Original Message -----
From: "Gary Dusbabek"
To: user@cassandra.apache.org
Sent: Wednesday, June 23, 2010 5:40:02 AM
Subject: Re: nodetool loadbalance : Streams Continue on Non Acceptance of New Token

On Tue, Jun 22, 2010 at 20:16, Arya Goudarzi wrote:
> Hi,
>
> Please confirm if this is an issue and should be reported or I am doing
> something wrong. I could not find anything relevant on JIRA:
>
> Playing with 0.7 nightly (today's build), I setup a 3 node cluster this way:
>
> - Added one node;
> - Loaded default schema with RF 1 from YAML using JMX;
> - Loaded 2M keys using py_stress;
> - Bootstrapped a second node;
> - Cleaned up the first node;
> - Bootstrapped a third node;
> - Cleaned up the second node;
>
> I got the following ring:
>
> Address Status Load Range Ring
> 154293670372423273273390365393543806425
> 10.50.26.132 Up 518.63 MB 69164917636305877859094619660693892452 |<--|
> 10.50.26.134 Up 234.8 MB 111685517405103688771527967027648896391 | |
> 10.50.26.133 Up 235.26 MB 154293670372423273273390365393543806425 |-->|
>
> Now I ran:
>
> nodetool --host 10.50.26.132 loadbalance
>
> It's been going for a while. I checked the streams
>
> nodetool --host 10.50.26.134 streams
> Mode: Normal
> Not sending any streams.
> Streaming from: /10.50.26.132 > Keyspace1: > /var/lib/cassandra/data/Keyspace1/Standard1-tmp-d-3-Data.db/[(0,22206096), > (22206096,27271682)] > Keyspace1: > /var/lib/cassandra/data/Keyspace1/Standard1-tmp-d-4-Data.db/[(0,15180462), > (15180462,18656982)] > Keyspace1: > /var/lib/cassandra/data/Keyspace1/Standard1-tmp-d-5-Data.db/[(0,353139829), > (353139829,433883659)] > Keyspace1: > /var/lib/cassandra/data/Keyspace1/Standard1-tmp-d-6-Data.db/[(0,366336059), > (366336059,450095320)] > > nodetool --host 10.50.26.132 streams > Mode: Leaving: streaming data to other nodes > Streaming to: /10.50.26.134 > /var/lib/cassandra/data/Keyspace1/Standard1-d-48-Data.db/[(0,366336059), > (366336059,450095320)] > Not receiving any streams. > > These have been going for the past 2 hours. > > I see in the logs of the node with 134 IP address and I saw this: > > INFO [GOSSIP_STAGE:1] 2010-06-22 16:30:54,679 StorageService.java (line 603) > Will not change my token ownership to /10.50.26.132 A node will give this message when it sees another node (usually for the first time) that is trying to claim the same token but whose startup time is much earlier (i.e., this isn't a token replacement). It would follow that you would see this during a rebalance. > > So, to my understanding from wikis loadbalance supposed to decommission and > re-bootstrap again by sending its tokens to other nodes and then bootstrap > again. It's been stuck in streaming for the past 2 hours and the size of ring > has not changed. The log in the first node says it has started streaming for > the past hours: > > INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,255 StreamOut.java (line 72) > Beginning transfer process to /10.50.26.134 for ranges > (154293670372423273273390365393543806425,69164917636305877859094619660693892452] > INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,255 StreamOut.java (line 82) > Flushing memtables for Keyspace1... 
> INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,266 StreamOut.java (line 128) > Stream context metadata > [/var/lib/cassandra/data/Keyspace1/Standard1-d-48-Data.db/[(0,366336059), > (366336059,450095320)]] 1 sstables. > INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,267 StreamOut.java (line 135) > Sending a stream initiate message to /10.50.26.134 ... > INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,267 StreamOut.java (line 140) > Waiting for transfer to /10.50.26.134 to complete > INFO [FLUSH-TIMER] 2010-06-22 17:36:53,370 ColumnFamilyStore.java (line 359) > LocationInfo has reached its threshold; switching in a fresh Memtable at > CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1277249454413.log', > position=720) > INFO [FLUSH-TIMER] 2010-06-22 17:36:53,370 ColumnFamilyStore.java (line 622) > Enqueuing flush of Memtable(LocationInfo)@1637794189 > INFO [FLUSH-WRITER-POOL:1] 2010-06-22 17:36:53,370 Memtable.java (line 149) > Writing Memtable(LocationInfo)@1637794189 > INFO [FLUSH-WRITER-POOL:1] 2010-06-22 17:36:53,528 Memtable.java (line 163) > Completed flushing /var/lib/cassandra/data/system/LocationInfo
java.lang.NoSuchMethodError: org.apache.cassandra.db.ColumnFamily.id()I
I just built today's trunk successfully and am getting the following exception on startup, which seems bogus to me, as the method exists, but I don't know why:

ERROR 15:27:00,957 Exception encountered during startup.
java.lang.NoSuchMethodError: org.apache.cassandra.db.ColumnFamily.id()I
	at org.apache.cassandra.db.ColumnFamilySerializer.serialize(ColumnFamilySerializer.java:63)
	at org.apache.cassandra.db.RowMutationSerializer.freezeTheMaps(RowMutation.java:351)
	at org.apache.cassandra.db.RowMutationSerializer.serialize(RowMutation.java:362)
	at org.apache.cassandra.db.RowMutationSerializer.serialize(RowMutation.java:340)
	at org.apache.cassandra.db.RowMutation.getSerializedBuffer(RowMutation.java:271)
	at org.apache.cassandra.db.RowMutation.apply(RowMutation.java:196)
	at org.apache.cassandra.db.SystemTable.initMetadata(SystemTable.java:217)
	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:345)
	at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:134)
	at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:90)
	at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:221)

My data/commitlog are clean. Notice the id()I. Where does that "I" come from? Please advise.

-Arya
nodetool loadbalance : Streams Continue on Non Acceptance of New Token
Hi,

Please confirm whether this is an issue that should be reported, or whether I am doing something wrong. I could not find anything relevant on JIRA.

Playing with 0.7 nightly (today's build), I set up a 3 node cluster this way:

- Added one node;
- Loaded default schema with RF 1 from YAML using JMX;
- Loaded 2M keys using py_stress;
- Bootstrapped a second node;
- Cleaned up the first node;
- Bootstrapped a third node;
- Cleaned up the second node;

I got the following ring:

Address       Status  Load       Range                                     Ring
                                 154293670372423273273390365393543806425
10.50.26.132  Up      518.63 MB  69164917636305877859094619660693892452    |<--|
10.50.26.134  Up      234.8 MB   111685517405103688771527967027648896391   |   |
10.50.26.133  Up      235.26 MB  154293670372423273273390365393543806425   |-->|

Now I ran:

nodetool --host 10.50.26.132 loadbalance

It's been going for a while. I checked the streams:

nodetool --host 10.50.26.134 streams
Mode: Normal
Not sending any streams.
Streaming from: /10.50.26.132
  Keyspace1: /var/lib/cassandra/data/Keyspace1/Standard1-tmp-d-3-Data.db/[(0,22206096), (22206096,27271682)]
  Keyspace1: /var/lib/cassandra/data/Keyspace1/Standard1-tmp-d-4-Data.db/[(0,15180462), (15180462,18656982)]
  Keyspace1: /var/lib/cassandra/data/Keyspace1/Standard1-tmp-d-5-Data.db/[(0,353139829), (353139829,433883659)]
  Keyspace1: /var/lib/cassandra/data/Keyspace1/Standard1-tmp-d-6-Data.db/[(0,366336059), (366336059,450095320)]

nodetool --host 10.50.26.132 streams
Mode: Leaving: streaming data to other nodes
Streaming to: /10.50.26.134
  /var/lib/cassandra/data/Keyspace1/Standard1-d-48-Data.db/[(0,366336059), (366336059,450095320)]
Not receiving any streams.

These have been going for the past 2 hours.
I looked in the logs of the node with the .134 IP address and saw this:

INFO [GOSSIP_STAGE:1] 2010-06-22 16:30:54,679 StorageService.java (line 603) Will not change my token ownership to /10.50.26.132

So, to my understanding from the wikis, loadbalance is supposed to decommission (handing its ranges off to other nodes) and then re-bootstrap. It's been stuck in streaming for the past 2 hours and the size of the ring has not changed. The log on the first node says it started streaming hours ago:

INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,255 StreamOut.java (line 72) Beginning transfer process to /10.50.26.134 for ranges (154293670372423273273390365393543806425,69164917636305877859094619660693892452]
INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,255 StreamOut.java (line 82) Flushing memtables for Keyspace1...
INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,266 StreamOut.java (line 128) Stream context metadata [/var/lib/cassandra/data/Keyspace1/Standard1-d-48-Data.db/[(0,366336059), (366336059,450095320)]] 1 sstables.
INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,267 StreamOut.java (line 135) Sending a stream initiate message to /10.50.26.134 ...
INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,267 StreamOut.java (line 140) Waiting for transfer to /10.50.26.134 to complete
INFO [FLUSH-TIMER] 2010-06-22 17:36:53,370 ColumnFamilyStore.java (line 359) LocationInfo has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1277249454413.log', position=720)
INFO [FLUSH-TIMER] 2010-06-22 17:36:53,370 ColumnFamilyStore.java (line 622) Enqueuing flush of Memtable(LocationInfo)@1637794189
INFO [FLUSH-WRITER-POOL:1] 2010-06-22 17:36:53,370 Memtable.java (line 149) Writing Memtable(LocationInfo)@1637794189
INFO [FLUSH-WRITER-POOL:1] 2010-06-22 17:36:53,528 Memtable.java (line 163) Completed flushing /var/lib/cassandra/data/system/LocationInfo-d-9-Data.db
INFO [MEMTABLE-POST-FLUSHER:1] 2010-06-22 17:36:53,529 ColumnFamilyStore.java (line 374) Discarding 1000

Nothing more after this line. Am I doing something wrong?

Best Regards,
-Arya
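As a side note for readers of this thread: ownership in the ring output quoted above follows directly from the tokens. The sketch below is plain Python written for illustration (it is an assumption, not Cassandra source code); it models the RandomPartitioner rule that each node owns the range from its predecessor's token (exclusive) to its own token (inclusive), wrapping around the ring.

```python
# Hedged sketch (plain Python, NOT Cassandra source) of RandomPartitioner
# ownership, using the tokens copied verbatim from the `nodetool ring`
# output quoted in this thread.

RING_MAX = 2 ** 127  # upper bound of the RandomPartitioner token space

ring = {
    "10.50.26.132": 69164917636305877859094619660693892452,
    "10.50.26.134": 111685517405103688771527967027648896391,
    "10.50.26.133": 154293670372423273273390365393543806425,
}

def owner(token, ring):
    """Return the node owning `token`: the first node (in token order)
    whose token is >= it, wrapping to the lowest-token node past the end."""
    nodes = sorted(ring.items(), key=lambda kv: kv[1])
    for node, node_token in nodes:
        if token <= node_token:
            return node
    return nodes[0][0]  # wrapped past the highest token

# A key hashing just past .133's (highest) token wraps around to .132:
assert owner(ring["10.50.26.133"] + 1, ring) == "10.50.26.132"
```

This also explains why .132 carries roughly twice the load in the quoted ring: its range (wrapping from the highest token through 0 up to its own token) is the largest of the three.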
Re: Strange Read Performance 1xN column slice or N column slice
Hey all,

As Jonathan pointed out in CASSANDRA-1199, this issue seems to be related to https://issues.apache.org/jira/browse/THRIFT-788. If you experience slowness with multiget_slice, take a look at that bug.

-Arya

----- Original Message -----
From: "Arya Goudarzi"
To: user@cassandra.apache.org, "jbellis"
Sent: Wednesday, June 9, 2010 4:51:18 PM
Subject: Re: Strange Read Performance 1xN column slice or N column slice

Hi Jonathan,

This issue persists. I have prepared a code sample which you can use to reproduce what I am saying. Please see attached. It is using the Thrift PHP libraries directly. I am running a Cassandra 0.7 build from May 28th. I have tried this on a single host with replication factor 1 and on a 3 node cluster with replication factor 3. The results remain similar:

100 Sequential Writes took: 0.60781407356262 seconds;
100 Sequential Reads took: 0.23204588890076 seconds;
100 Batch Read took: 0.76512885093689 seconds;

Please advise.

Thank You,
-Arya

----- Original Message -----
From: "Jonathan Ellis"
To: user@cassandra.apache.org
Sent: Monday, June 7, 2010 7:26:30 PM
Subject: Re: Strange Read Performance 1xN column slice or N column slice

That would be surprising (and it is not what you said in the first message). I suspect something is wrong with your test methodology.

On Mon, Jun 7, 2010 at 11:23 AM, Arya Goudarzi wrote:
> But I am not comparing reading 1 column vs 100 columns. I am comparing
> reading of 100 columns in loop iterations (100 consecutive calls) vs
> reading all 100 in batch in one call. Doing the loop is faster than
> doing the batch call. Are you saying this is not surprising?
>
> ----- Original Message -----
> From: "Jonathan Ellis"
> To: user@cassandra.apache.org
> Sent: Saturday, June 5, 2010 6:26:46 AM
> Subject: Re: Strange Read Performance 1xN column slice or N column slice
>
> reading 1 column is faster than reading lots of columns. this
> shouldn't be surprising.
> > On Fri, Jun 4, 2010 at 3:52 PM, Arya Goudarzi > > wrote: >> Hi Fellows, >> >> I have the following design for a system which holds basically >> key->value pairs (aka Columns) for each user (SuperColumn Key) in >> different namespaces >> (SuperColumnFamily row key). >> >> Like this: >> >> Namesapce->user->column_name = column_value; >> >> keyspaces: >> - name: NKVP >> replica_placement_strategy: >> org.apache.cassandra.locator.RackUnawareStrategy >> replication_factor: 3 >> column_families: >> - name: Namespaces >> column_type: Super >> compare_with: BytesType >> compare_subcolumns_with: BytesType >> rows_cached: 2 >> keys_cached: 100 >> >> Cluster using random partitioner. >> >> I use multiget_slice() for fetching 1 or many columns inside the >> child supercolumn at the same time. This is an awkward performance >> result I >> get: >> >> 100 sequential reads completed in : 0.383 this uses multiget_slice() >> with 1 key, and 1 column name inside the predicate->column_names >> 100 batch loaded completed in : 0.786 this uses multiget_slice() with >> 1 key, and multiple column names inside the predicate->column_names >> >> read/write consistency are ONE. >> >> Questions: >> >> Why doing 100 sequential reads is faster than doing 100 in batch? >> Is this a good design for my problem? >> Does my issue relate to >> https://issues.apache.org/jira/browse/CASSANDRA-598? >> >> Now on a single node with replication factor 1 I get this: >> >> 100 sequential reads completed in : 0.438 >> 100 batch loaded completed in : 0.800 >> >> Please advice as to why is this happening? >> >> These nodes are VMs. 1 CPU and 1 Gb. >> >> Best Regards, >> =Arya >> >> >> >> >> >> >> >> > > > > -- Jonathan Ellis > Project Chair, Apache Cassandra > co-founder of Riptano, the source for professional Cassandra support > http://riptano.com > -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: Strange Read Performance 1xN column slice or N column slice
But I am not comparing reading 1 column vs 100 columns. I am comparing reading of 100 columns in loop iterations (100 consecutive calls) vs reading all 100 in batch in one call. Doing the loop is faster than doing the batch call. Are you saying this is not surprising? - Original Message - From: "Jonathan Ellis" To: user@cassandra.apache.org Sent: Saturday, June 5, 2010 6:26:46 AM Subject: Re: Strage Read Perfoamnce 1xN column slice or N column slice reading 1 column, is faster than reading lots of columns. this shouldn't be surprising. On Fri, Jun 4, 2010 at 3:52 PM, Arya Goudarzi wrote: > Hi Fellows, > > I have the following design for a system which holds basically > key->value pairs (aka Columns) for each user (SuperColumn Key) in > different namespaces > (SuperColumnFamily row key). > > Like this: > > Namesapce->user->column_name = column_value; > > keyspaces: > - name: NKVP > replica_placement_strategy: > org.apache.cassandra.locator.RackUnawareStrategy > replication_factor: 3 > column_families: > - name: Namespaces > column_type: Super > compare_with: BytesType > compare_subcolumns_with: BytesType > rows_cached: 2 > keys_cached: 100 > > Cluster using random partitioner. > > I use multiget_slice() for fetching 1 or many columns inside the child > supercolumn at the same time. This is an awkward performance result I > get: > > 100 sequential reads completed in : 0.383 this uses multiget_slice() > with 1 key, and 1 column name inside the predicate->column_names > 100 batch loaded completed in : 0.786 this uses multiget_slice() with > 1 key, and multiple column names inside the predicate->column_names > > read/write consistency are ONE. > > Questions: > > Why doing 100 sequential reads is faster than doing 100 in batch? > Is this a good design for my problem? > Does my issue relate to > https://issues.apache.org/jira/browse/CASSANDRA-598? 
> > Now on a single node with replication factor 1 I get this: > > 100 sequential reads completed in : 0.438 > 100 batch loaded completed in : 0.800 > > Please advice as to why is this happening? > > These nodes are VMs. 1 CPU and 1 Gb. > > Best Regards, > =Arya > > > > > > > > -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Strange Read Performance 1xN column slice or N column slice
Hi Fellows,

I have the following design for a system which basically holds key->value pairs (aka Columns) for each user (SuperColumn key) in different namespaces (SuperColumnFamily row key).

Like this:

Namespace->user->column_name = column_value;

keyspaces:
    - name: NKVP
      replica_placement_strategy: org.apache.cassandra.locator.RackUnawareStrategy
      replication_factor: 3
      column_families:
        - name: Namespaces
          column_type: Super
          compare_with: BytesType
          compare_subcolumns_with: BytesType
          rows_cached: 2
          keys_cached: 100

The cluster uses the random partitioner.

I use multiget_slice() for fetching 1 or many columns inside the child supercolumn at the same time. This is the awkward performance result I get:

100 sequential reads completed in: 0.383 (this uses multiget_slice() with 1 key and 1 column name inside the predicate->column_names)
100 batch loaded completed in: 0.786 (this uses multiget_slice() with 1 key and multiple column names inside the predicate->column_names)

Read/write consistency levels are ONE.

Questions:

Why is doing 100 sequential reads faster than doing all 100 in one batch?
Is this a good design for my problem?
Does my issue relate to https://issues.apache.org/jira/browse/CASSANDRA-598?

Now on a single node with replication factor 1 I get this:

100 sequential reads completed in: 0.438
100 batch loaded completed in: 0.800

Please advise as to why this is happening.

These nodes are VMs, 1 CPU and 1 GB.

Best Regards,
=Arya
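For readers following this thread: the result above (one batched call losing to 100 sequential calls) is consistent with a fixed per-response transport stall, which is what THRIFT-788, cited earlier in this archive, turned out to be (Nagle-style delays on small batched responses). The toy cost model below is an illustration written for this note, not the thread's PHP benchmark; all the numbers in it are made-up assumptions chosen only to show the shape of the effect.

```python
# Toy latency model (illustrative assumption, NOT the thread's benchmark):
# when the transport adds a fixed stall to each response, one batched call
# can lose to N cheap sequential calls that avoid the stall.

def sequential_cost(n, rtt, per_column):
    """n round trips, each fetching one column."""
    return n * (rtt + per_column)

def batch_cost(n, rtt, per_column, transport_stall):
    """One round trip fetching all n columns, plus a fixed response stall."""
    return rtt + n * per_column + transport_stall

# With a hypothetical 0.5 s stall on the batched response,
# 100 sequential reads at ~1.1 ms each come out ahead:
seq = sequential_cost(100, rtt=0.001, per_column=0.0001)
bat = batch_cost(100, rtt=0.001, per_column=0.0001, transport_stall=0.5)
assert seq < bat
```

Once the stall is removed (the THRIFT-788 fix), the same model predicts the batch wins, since it pays the round-trip cost only once.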
New Changes in Cass 0.7 Thrift API Interface
Hi Fellows,

I just joined this mailing list, but I've been on the IRC for a while. Pardon if this post is a repeat, but I would like to share with you some of my experiences with the Cassandra Thrift interface that comes with the nightly builds and probably 0.7. I came across an issue last night that I shared on the IRC, and I've found the solution to it too, so read on, as this might be your problem too in the future.

I've noticed fundamental differences in the new Thrift interface from the bleeding-edge version. First of all, if you've coded against the Thrift interface shipped with the 0.6 branch, your code will not work with the new interface shipped with 0.7. This is because, if you look at the function declarations inside the CassandraClient interface, you'll see changes, for example, in the number of parameters to insert() and the way objects are passed. This also makes the examples on the Wiki site obsolete.

If you are using TBinaryProtocolAccelerated in your application, then you won't be able to figure out what the problem is at first glance, because it crashes the web server process with a message like this:

[Wed May 19 16:15:12 2010] [notice] child pid 32414 exit signal Aborted (6)
terminate called after throwing an instance of 'PHPExceptionWrapper'
  what(): PHP exception zval=0x2b1c2a0429a0

This is a Thrift bug in the thrift_protocol.so extension. Luckily I found the patch here: https://issues.apache.org/jira/browse/THRIFT-780

After applying the patch and recompiling the module, exceptions bubbled up to my app. I got complaints about T_STRUCT being different:

TProtocolException Object
(
    [message:protected] => Attempt to send non-object type as a T_STRUCT
    [string:Exception:private] =>
    [code:protected] ...

This was because the way I used to pass stuff to insert() was different from the way it should be in the new API. I discovered this the hard way, and now am sharing the sauce with you.
The example on the Wiki will become something like this:

$column = new cassandra_Column(array('name' => 'email', 'value' => 'exam...@foo.com', 'timestamp' => time()));

$parent = new cassandra_ColumnParent(array('column_family' => 'Standard1'));

// We want the consistency level to be ZERO which means async operations on 1 node
$consistency_level = cassandra_ConsistencyLevel::ZERO;

// Set the keyspace, then add the value to be written with the user key and path.
$client->set_keyspace('Keyspace1');
$client->insert('1', $parent, $column, $consistency_level);

Notice that no keyspace and no ColumnPath are passed to insert() any more. Other functions have changed too.

Good Luck,
-Arya

P.S. By the way, if someone grants me access, I'd like to contribute to the documentation on Apache Cassandra.
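To make the 0.7 call shape above concrete outside PHP, here is a hedged sketch using plain-Python stand-in classes. These classes are written for illustration only; they are NOT the real generated Thrift bindings. What they mirror is the structural change described in the message: the keyspace moves out of insert() into a per-connection set_keyspace(), and a Column struct replaces the old ColumnPath-plus-value arguments.

```python
# Illustrative stand-ins (assumption: plain Python, NOT the generated
# Thrift bindings) for the 0.7-style client call sequence shown above.
import time
from dataclasses import dataclass

@dataclass
class Column:
    name: str
    value: str
    timestamp: int

@dataclass
class ColumnParent:
    column_family: str

class FakeClient07:
    """In-memory stub mimicking the 0.7 client's call sequence."""
    def __init__(self):
        self.keyspace = None
        self.store = {}

    def set_keyspace(self, keyspace):
        # New in 0.7: the keyspace is per-connection state,
        # no longer an argument to every insert() call.
        self.keyspace = keyspace

    def insert(self, key, parent, column, consistency_level):
        # 0.7 shape: insert(key, ColumnParent, Column, ConsistencyLevel)
        path = (self.keyspace, parent.column_family, key, column.name)
        self.store[path] = column.value

client = FakeClient07()
client.set_keyspace("Keyspace1")
col = Column(name="email", value="user@example.com", timestamp=int(time.time()))
client.insert("1", ColumnParent(column_family="Standard1"), col, consistency_level=0)
```

The stub stores writes in a dict keyed by (keyspace, column family, row key, column name), which is enough to show that the same write now takes four call-site pieces instead of the 0.6-era keyspace and ColumnPath arguments.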