nodetool refresh vs. sstableloader
Hi Cassandra users, Cassandra dev,

When recovering using SSTables from a snapshot, I want to know the key differences between using:

1. nodetool refresh, and
2. sstableloader

Does nodetool refresh have restrictions that need to be met? In particular:

- Does nodetool refresh work even if the topology differs between the source cluster and the destination cluster?
- Does it work if the token ranges don't match between the source cluster and the destination cluster?
- Does it work when an old SSTable in the snapshot has a dropped column that is not part of the current schema?

I appreciate any help in advance.

Thanks,
Rajath

Rajath Subramanyam
Re: SSTable Ancestors information in Cassandra 3.0.x
Thanks Jeff.

Rajath Subramanyam

On Thu, Mar 23, 2017 at 4:50 PM, Jeff Jirsa wrote:
> The ancestors were used primarily to clean up leftovers in the case that
> cassandra was killed right as compaction finished, where the
> source/origin/ancestors were still on the disk at the same time as the
> compaction result.
>
> It's not timestamp based, though - that compaction process has moved to
> using a transaction log, which tracks the source/results on a per
> compaction basis, and cassandra uses those logs/journals rather than
> inspecting the ancestors.
>
> - Jeff
>
> On Thu, Mar 23, 2017 at 4:35 PM, Rajath Subramanyam wrote:
> > Thanks, Jeff. Did all the internal tasks and the compaction tasks move
> > to a timestamp-based approach?
> >
> > Regards,
> > Rajath
Re: SSTable Ancestors information in Cassandra 3.0.x
Thanks, Jeff. Did all the internal tasks and the compaction tasks move to a timestamp-based approach?

Regards,
Rajath

Rajath Subramanyam

On Thu, Mar 23, 2017 at 2:12 PM, Jeff Jirsa wrote:
> That information was removed, because it was really meant to be used for a
> handful of internal tasks, most of which were no longer used. The remaining
> use was cleaning up compaction leftovers, and the compaction leftover code
> was rewritten in 3.0 / CASSANDRA-7066 (note, though, that it's somewhat
> incomplete in the upgrade case, so CASSANDRA-13313 may be interesting to
> people who are very very very very very very very sensitive to data
> consistency)
SSTable Ancestors information in Cassandra 3.0.x
Hello Cassandra-Users and Cassandra-dev,

One of the handy features of sstablemetadata in Cassandra 2.1.15 was that it displayed the ancestor information of an SSTable. Here is a sample output of the sstablemetadata tool with the ancestors information in C* 2.1.15:

    [centos@chen-datos test1-b83746000fef11e7bdfc8bb2d6662df7]$ sstablemetadata ks3-test1-ka-2-Statistics.db | grep "Ancestors"
    Ancestors: [1]
    [centos@chen-datos test1-b83746000fef11e7bdfc8bb2d6662df7]$

However, the same tool in Cassandra 3.0.x no longer gives us that information. Here is a sample output of sstablemetadata, grepping for the ancestors information in C* 3.0 (the output is empty since it is no longer available):

    [centos@rj-cassandra-1 elsevier1-ab7389f0fafb11e6ac23e7ccf62f494b]$ sstablemetadata mc-5-big-Statistics.db | grep "Ancestors"
    [centos@rj-cassandra-1 elsevier1-ab7389f0fafb11e6ac23e7ccf62f494b]$

My question: how can I get this information in C* 3.0.x?

Thank you!

Regards,
Rajath

--------
Rajath Subramanyam
Cassandra snapshot JMX commands
Hello Cassandra-users,

I have a question about issuing snapshots using JMX commands.

- Issuing a snapshot on a single column family:

  Nodetool command:
      nodetool snapshot -cf -t **

  The equivalent JMX command is:
      run -d org.apache.cassandra.db -b org.apache.cassandra.db:type=StorageService takeColumnFamilySnapshot <ks_name>

- Issuing a snapshot on multiple column families spanning keyspaces:

  Nodetool command:
      nodetool snapshot -kc .,.,.,. -t

Can anybody help me with the JMX equivalent of the second command?

Thanks in advance.

Regards,
Rajath

----
Rajath Subramanyam
Re: Massive churn in new SSTables getting generated.
Hello Cassandra-users / Cassandra-dev,

(Resending with the correct user mailing list)

We have a C* 2.1.12 cluster with a total size of ~100 GB and 2-3 GB of data being written per day. We are encountering an unusual situation where the SSTables appear to be compacted multiple times within a 6-hour interval. We infer this from seeing a different set of generation numbers on the SSTables each time, while the amount of data stays the same.

Is this a result of compaction or repair? Or is there any other operation that can lead to such a situation? The tables are using STCS.

Thanks in advance for your help.

- Rajath
Re: SSTable generation numbers
Thanks Tyler.

- Rajath

Rajath Subramanyam

On Tue, Jun 28, 2016 at 11:01 AM, Tyler Hobbs wrote:
> 32 bit integer overflow is the only scenario where a single node would wrap
> around.
>
> However, when copying sstables from one node to another, there can easily
> be conflicts, so this is something to be careful about.
>
> On Mon, Jun 27, 2016 at 8:10 PM, Rajath Subramanyam wrote:
> > Hello Cassandra-dev,
> >
> > Are there any scenarios in which the generation numbers of SSTables (i.e.
> > ksname-cfname--Data.db) can wrap around, without the admin
> > dropping and re-creating the CF with the same name?
> >
> > I believe that the answer must be version-agnostic, but in case it matters,
> > I am specifically asking this question for C* 2.0/2.1.
> >
> > Thanks in advance for your help.
> >
> > - Rajath
> >
> > Rajath Subramanyam
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
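Tyler's two points (wrap-around only on 32-bit overflow, and possible clashes when copying SSTables between nodes) can be illustrated with a small sketch. It assumes the file-name layouts that appear elsewhere in this archive (`ks3-test1-ka-2-Statistics.db` for 2.1 and `mc-5-big-Statistics.db` for 3.0) and assumes the generation is held in a signed 32-bit integer. The helper names `generation_of`, `conflicting_generations`, and `next_generation_32bit` are hypothetical and are not part of Cassandra.

```python
import re

# Largest value a signed 32-bit counter can hold before it wraps.
INT32_MAX = 2**31 - 1

# Assumed layouts, inferred from file names quoted in this archive:
#   3.0 style: <version>-<generation>-<format>-<component>.db
#   2.1 style: <ks>-<cf>-<version>-<generation>-<component>.db
_PATTERNS = [
    re.compile(r"^[a-z]{2}-(\d+)-[a-z]+-\w+\.db$"),
    re.compile(r"^.+-[a-z]{2}-(\d+)-\w+\.db$"),
]


def generation_of(filename):
    """Return the generation number embedded in an SSTable file name, or None."""
    for pattern in _PATTERNS:
        match = pattern.match(filename)
        if match:
            return int(match.group(1))
    return None


def conflicting_generations(local_files, incoming_files):
    """Generations present in both listings; copying those files as-is would clash."""
    local = {generation_of(name) for name in local_files}
    incoming = {generation_of(name) for name in incoming_files}
    return sorted((local & incoming) - {None})


def next_generation_32bit(generation):
    """Emulate incrementing a signed 32-bit counter, including the wrap-around."""
    nxt = (generation + 1) & 0xFFFFFFFF
    return nxt - 2**32 if nxt > INT32_MAX else nxt


# Example: generation 6 exists on both nodes for the same table directory.
clashes = conflicting_generations(
    ["mc-5-big-Data.db", "mc-6-big-Data.db"],
    ["mc-6-big-Data.db", "mc-9-big-Data.db"],
)
```

The comparison is only meaningful for files that belong to the same table directory; a check like this would typically run before copying, so that clashing files can be renamed to unused generations first.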
SSTable generation numbers
Hello Cassandra-dev,

Are there any scenarios in which the generation numbers of SSTables (i.e. ksname-cfname--Data.db) can wrap around, without the admin dropping and re-creating the CF with the same name?

I believe that the answer must be version-agnostic, but in case it matters, I am specifically asking this question for C* 2.0/2.1.

Thanks in advance for your help.

- Rajath

Rajath Subramanyam
Re: Statistics.db file in Cassandra 3.0
Thanks Tyler.

Rajath Subramanyam

On Wed, Feb 10, 2016 at 8:45 AM, Tyler Hobbs wrote:
> Take a look at MetadataSerializer and the MetadataComponent subclasses.
>
> On Tue, Feb 9, 2016 at 4:34 PM, Rajath Subramanyam wrote:
> > Hello Cassandra-Dev,
> >
> > I have noticed that in Cassandra 3.0 there is a new file in the
> > // called
> > ma--big-Statistics.db.
> >
> > What does this file contain? Is it compressed? How do I read it?
> >
> > Thanks in advance for sharing some information on this.
> >
> > - Rajath
> >
> > Rajath Subramanyam
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
Statistics.db file in Cassandra 3.0
Hello Cassandra-Dev,

I have noticed that in Cassandra 3.0 there is a new file in the // called ma--big-Statistics.db.

What does this file contain? Is it compressed? How do I read it?

Thanks in advance for sharing some information on this.

- Rajath

Rajath Subramanyam
Re: SSTable format in C* 2.2 and later
Thanks Tyler. CASSANDRA-8099 has a lot of meat. I will get back to you if I have further questions.

- Rajath

Rajath Subramanyam

On Tue, Jan 19, 2016 at 8:27 AM, Tyler Hobbs wrote:
> Primarily, CASSANDRA-8099. If you look at the Version class in
> o.a.c.io.sstable.format.big.BigFormat, there are comments that list the
> different sstable versions along with what changes went into those. You
> can look at git blame to see what the related jira tickets are.
>
> On Mon, Jan 18, 2016 at 7:48 PM, Rajath Subramanyam wrote:
> > Hello Cassandra-dev community,
> >
> > Does anyone know the JIRAs that affected the change in the SSTable format
> > for C* 2.2 and later?
> >
> > Thanks in advance.
> >
> > - Rajath
> >
> > Rajath Subramanyam
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
SSTable format in C* 2.2 and later
Hello Cassandra-dev community,

Does anyone know which JIRAs changed the SSTable format for C* 2.2 and later?

Thanks in advance.

- Rajath

Rajath Subramanyam
Re: Partitioned Counters Design
Hi Aleksey,

Thanks for the response. I read through several JIRAs (CASSANDRA-1072, CASSANDRA-2495, CASSANDRA-4775, CASSANDRA-6504, CASSANDRA-6506). I would appreciate it if you could confirm my understanding of the current state of the art.

I understand that the earlier design, with local shards, remote shards, and logging of only the deltas for local shards, provided good latency but suffered from inconsistencies because replays were not idempotent. Was this primarily due to a lack of internal idempotent updates (replica-replica) or a lack of external idempotent updates (client-coordinator)?

I also have a question about the state of counters before CASSANDRA-6504 (or Counters 1.0, if you like):

- Does the client submit a CounterUpdate?
- During a retry, is the same CounterUpdate submitted again, identical to the original attempt?
- When the coordinator (if it is one of the replicas) or the leader propagates the write to the other replicas, does it include the same CounterUpdate?

Essentially, my question is: can we identify each counter update from a client uniquely, and does the retry use the same unique id?

After the implementation of CASSANDRA-6504 (or the Counters 2.0 design), I understand that local and remote shards have been replaced by global shards and that counters now follow a lock-read-write-unlock-replicate model. Does this guarantee that retries from clients are still idempotent? Also, in Counters 2.0, does the CounterUpdate sent by the client carry a unique id?

I would appreciate your clarification on this.

Thank you!

Regards,
Rajath

Rajath Subramanyam

On Mon, Oct 6, 2014 at 3:38 PM, Aleksey Yeschenko wrote:
> No, there are no unique ids per increment. That was one of the ideas
> suggested in https://issues.apache.org/jira/browse/CASSANDRA-4775, but
> ultimately declined.
>
> Read that ticket, and the one linked to it, for details.
>
> --
> AY
>
> On October 6, 2014 at 10:20:05 PM, Rajath Subramanyam (rajat...@gmail.com) wrote:
> > Hi Cassandra developers,
> >
> > I am working on a project to make counter updates idempotent. I read that
> > via CASSANDRA-1546 assigns unique marker ids to counter updates in 0.7.1.
> > Does this unique marker id hold true in the later versions too ? Or at
> > least in 0.8.1 ?
> >
> > Please let me know.
> >
> > Thank you !
> >
> > Regards,
> > Rajath
> >
> > Rajath Subramanyam
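The replay problem discussed in this thread can be shown with a toy model. This is only a sketch of the general idea and not Cassandra's implementation: `DeltaReplica`, `TaggedReplica`, and the `update_id` parameter are hypothetical names, and, per Aleksey's reply, Cassandra does not attach unique ids to increments.

```python
class DeltaReplica:
    """Applies every delta it receives, so a retried delta is counted twice."""

    def __init__(self):
        self.value = 0

    def apply(self, delta):
        self.value += delta


class TaggedReplica:
    """Hypothetical variant that remembers update ids so a retry is a no-op."""

    def __init__(self):
        self.value = 0
        self.seen = set()

    def apply(self, update_id, delta):
        if update_id in self.seen:
            return  # duplicate delivery of an update that was already applied
        self.seen.add(update_id)
        self.value += delta


# A client sends +1, times out without seeing the ack, and retries.
plain = DeltaReplica()
plain.apply(1)
plain.apply(1)  # the retry is indistinguishable from a second increment

tagged = TaggedReplica()
tagged.apply("req-42", 1)
tagged.apply("req-42", 1)  # the retry carries the same id and is ignored
```

After the retry, `plain.value` is 2 while `tagged.value` is 1. The cost of the tagged approach is that every replica must retain the set of ids it has seen, which grows with the number of updates unless it is pruned somehow.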
Partitioned Counters Design
Hi Cassandra developers,

I am working on a project to make counter updates idempotent. I read that CASSANDRA-1546 assigns unique marker ids to counter updates in 0.7.1. Does this unique marker id still exist in later versions, or at least in 0.8.1?

Please let me know.

Thank you!

Regards,
Rajath

Rajath Subramanyam
Fork Cassandra-0.8
Hi All,

I want to fork Cassandra 0.8 from git. Does anybody have any advice on how to do this? I have already tried the following:

- Forked the latest repo and created a branch to go back to an old commit (around 0.8)
- Forked the latest repo and ran git reset --hard to go back to an old commit (around 0.8)

I am looking for a much cleaner way to fork the 0.8 version directly. Any help is appreciated.

Thanks in advance.

// Rajath

Rajath Subramanyam