Nodetool refresh v/s sstableloader

2018-08-27 Thread Rajath Subramanyam
Hi Cassandra users, Cassandra dev,

When recovering using SSTables from a snapshot, I want to know the
key differences between using:
1. Nodetool refresh and,
2. SSTableloader

Does nodetool refresh have restrictions that need to be met?
Does nodetool refresh work even if there is a change in the topology
between the source cluster and the destination cluster? Does it work if the
token ranges don't match between the source cluster and the destination
cluster? Does it work when an old SSTable in the snapshot has a dropped
column that is not part of the current schema?

I appreciate any help in advance.

Thanks,
Rajath

Rajath Subramanyam


Re: SSTable Ancestors information in Cassandra 3.0.x

2017-03-25 Thread Rajath Subramanyam
Thanks Jeff.


Rajath Subramanyam


On Thu, Mar 23, 2017 at 4:50 PM, Jeff Jirsa  wrote:

> The ancestors were used primarily to clean up leftovers in the case that
> cassandra was killed right as compaction finished, where the
> source/origin/ancestors were still on the disk at the same time as the
> compaction result.
>
> It's not timestamp based, though - that compaction process has moved to
> using a transaction log, which tracks the source/results on a per
> compaction basis, and cassandra uses those logs/journals rather than
> inspecting the ancestors.
>
> - Jeff
>
>
>
> On Thu, Mar 23, 2017 at 4:35 PM, Rajath Subramanyam 
> wrote:
>
> > Thanks, Jeff. Did all the internal tasks and the compaction tasks move to a
> > timestamp-based approach?
> >
> > Regards,
> > Rajath
> >
> > 
> > Rajath Subramanyam
> >
> >
> > On Thu, Mar 23, 2017 at 2:12 PM, Jeff Jirsa  wrote:
> >
> > > That information was removed, because it was really meant to be used for a
> > > handful of internal tasks, most of which were no longer used. The remaining
> > > use was cleaning up compaction leftovers, and the compaction leftover code
> > > was rewritten in 3.0 / CASSANDRA-7066 (note, though, that it's somewhat
> > > incomplete in the upgrade case , so CASSANDRA-13313 may be interesting to
> > > people who are very very very very very very very sensitive to data
> > > consistency)
> > >
> > >
> > > On Thu, Mar 23, 2017 at 2:00 PM, Rajath Subramanyam <rajat...@gmail.com>
> > > wrote:
> > >
> > > > Hello Cassandra-Users and Cassandra-dev,
> > > >
> > > > One of the handy features in sstablemetadata that was part of Cassandra
> > > > 2.1.15 was that it displayed Ancestor information of an SSTable. Here is a
> > > > sample output of the sstablemetadata tool with the ancestors information in
> > > > C* 2.1.15:
> > > > [centos@chen-datos test1-b83746000fef11e7bdfc8bb2d6662df7]$ sstablemetadata ks3-test1-ka-2-Statistics.db | grep "Ancestors"
> > > > Ancestors: [1]
> > > > [centos@chen-datos test1-b83746000fef11e7bdfc8bb2d6662df7]$
> > > >
> > > > However, the same tool in Cassandra 3.0.x no longer gives us that
> > > > information. Here is a sample output of the sstablemetadata grepping for
> > > > Ancestors information in C* 3.0 (the output is empty since it is no longer
> > > > available):
> > > > [centos@rj-cassandra-1 elsevier1-ab7389f0fafb11e6ac23e7ccf62f494b]$
> > > > sstablemetadata mc-5-big-Statistics.db | grep "Ancestors"
> > > > [centos@rj-cassandra-1 elsevier1-ab7389f0fafb11e6ac23e7ccf62f494b]$
> > > >
> > > > My question, how can I get this information in C* 3.0.x ?
> > > >
> > > > Thank you !
> > > >
> > > > Regards,
> > > > Rajath
> > > >
> > > > 
> > > > Rajath Subramanyam
> > > >
> > >
> >
>
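
To illustrate the per-compaction transaction log Jeff describes, here is a
toy Python model. It is a sketch under stated assumptions, not Cassandra's
actual log format or code: the CompactionLog class, its fields, and the rule
"a committed compaction makes its sources the leftovers, an uncommitted one
makes its results the leftovers" are hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class CompactionLog:
    """Toy record of one compaction (hypothetical, not Cassandra's format)."""
    sources: frozenset        # sstables that were fed into the compaction
    results: frozenset        # sstables the compaction wrote
    committed: bool = False   # True once the compaction finished


def leftovers(log, on_disk):
    """Return the files a restart would remove for this compaction.

    If the compaction committed, the sources are obsolete; otherwise the
    partially written results are. No ancestor metadata is consulted.
    """
    doomed = log.sources if log.committed else log.results
    return sorted(doomed & set(on_disk))


if __name__ == "__main__":
    log = CompactionLog(frozenset({"ka-1", "ka-2"}), frozenset({"ka-3"}),
                        committed=True)
    # The process died with sources and result all still on disk.
    print(leftovers(log, ["ka-1", "ka-2", "ka-3"]))  # ['ka-1', 'ka-2']
```

The point of the sketch is only that the decision comes from the log record
for that one compaction, rather than from inspecting each SSTable's
ancestors.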


Re: SSTable Ancestors information in Cassandra 3.0.x

2017-03-23 Thread Rajath Subramanyam
Thanks, Jeff. Did all the internal tasks and the compaction tasks move to a
timestamp-based approach?

Regards,
Rajath


Rajath Subramanyam


On Thu, Mar 23, 2017 at 2:12 PM, Jeff Jirsa  wrote:

> That information was removed, because it was really meant to be used for a
> handful of internal tasks, most of which were no longer used. The remaining
> use was cleaning up compaction leftovers, and the compaction leftover code
> was rewritten in 3.0 / CASSANDRA-7066 (note, though, that it's somewhat
> incomplete in the upgrade case , so CASSANDRA-13313 may be interesting to
> people who are very very very very very very very sensitive to data
> consistency)
>
>
> On Thu, Mar 23, 2017 at 2:00 PM, Rajath Subramanyam 
> wrote:
>
> > Hello Cassandra-Users and Cassandra-dev,
> >
> > One of the handy features in sstablemetadata that was part of Cassandra
> > 2.1.15 was that it displayed Ancestor information of an SSTable. Here is a
> > sample output of the sstablemetadata tool with the ancestors information in
> > C* 2.1.15:
> > [centos@chen-datos test1-b83746000fef11e7bdfc8bb2d6662df7]$ sstablemetadata ks3-test1-ka-2-Statistics.db | grep "Ancestors"
> > Ancestors: [1]
> > [centos@chen-datos test1-b83746000fef11e7bdfc8bb2d6662df7]$
> >
> > However, the same tool in Cassandra 3.0.x no longer gives us that
> > information. Here is a sample output of the sstablemetadata grepping for
> > Ancestors information in C* 3.0 (the output is empty since it is no longer
> > available):
> > [centos@rj-cassandra-1 elsevier1-ab7389f0fafb11e6ac23e7ccf62f494b]$
> > sstablemetadata mc-5-big-Statistics.db | grep "Ancestors"
> > [centos@rj-cassandra-1 elsevier1-ab7389f0fafb11e6ac23e7ccf62f494b]$
> >
> > My question, how can I get this information in C* 3.0.x ?
> >
> > Thank you !
> >
> > Regards,
> > Rajath
> >
> > 
> > Rajath Subramanyam
> >
>


SSTable Ancestors information in Cassandra 3.0.x

2017-03-23 Thread Rajath Subramanyam
Hello Cassandra-Users and Cassandra-dev,

One of the handy features in sstablemetadata that was part of Cassandra
2.1.15 was that it displayed Ancestor information of an SSTable. Here is a
sample output of the sstablemetadata tool with the ancestors information in
C* 2.1.15:
[centos@chen-datos test1-b83746000fef11e7bdfc8bb2d6662df7]$ sstablemetadata ks3-test1-ka-2-Statistics.db | grep "Ancestors"
Ancestors: [1]
[centos@chen-datos test1-b83746000fef11e7bdfc8bb2d6662df7]$

However, the same tool in Cassandra 3.0.x no longer gives us that
information. Here is a sample output of the sstablemetadata grepping for
Ancestors information in C* 3.0 (the output is empty since it is no longer
available):
[centos@rj-cassandra-1 elsevier1-ab7389f0fafb11e6ac23e7ccf62f494b]$ sstablemetadata mc-5-big-Statistics.db | grep "Ancestors"
[centos@rj-cassandra-1 elsevier1-ab7389f0fafb11e6ac23e7ccf62f494b]$

My question: how can I get this information in C* 3.0.x?

Thank you !

Regards,
Rajath

--------
Rajath Subramanyam


Cassandra snapshot JMX commands

2016-10-25 Thread Rajath Subramanyam
Hello Cassandra-users,

I have a question about issuing snapshot using JMX commands.


   - Issuing snapshot on a single column family:

Nodetool command:
nodetool snapshot -cf  -t ** 

The equivalent JMX command is:
run -d org.apache.cassandra.db -b
org.apache.cassandra.db:type=StorageService takeColumnFamilySnapshot <
ks_name>  


   - Issuing snapshot to multiple column families spanning keyspaces:

Nodetool command:
nodetool snapshot -kc .,.,.,. -t


Can anybody help me with the JMX equivalent of the above command?

Thanks in advance.

Regards,
Rajath
----
Rajath Subramanyam


Re: Massive churn in new SSTables getting generated.

2016-08-03 Thread Rajath Subramanyam
Hello Cassandra-users / Cassandra-dev,

(Resending with the correct user mailing list)

We have a C* 2.1.12 cluster holding ~100 GB in total, with 2-3 GB of data
being written per day. We are encountering an unusual situation where the
SSTables appear to be compacted multiple times within a 6-hour interval. We
infer this from seeing a different set of generation numbers on the SSTables
each time, but with the same amount of data. Is this a result of compaction
or repair? Or is there any other operation that can lead to such a
situation? The tables are using STCS.

Thanks in advance for your help.

 - Rajath
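
The observation above (new generation numbers, same amount of data) can be
made concrete by comparing two listings of a table directory. The following
Python sketch is illustrative only: the function name, the input shape (a
dict mapping generation number to Data.db size in bytes), and the sample
numbers are all assumptions, and it does not tell compaction apart from
repair or any other operation.

```python
def describe_churn(before, after):
    """Compare two {generation: size_in_bytes} observations of one table."""
    return {
        "added": sorted(set(after) - set(before)),      # new generations
        "removed": sorted(set(before) - set(after)),    # generations gone
        "size_before": sum(before.values()),
        "size_after": sum(after.values()),
    }


if __name__ == "__main__":
    # Hypothetical listings taken a few hours apart.
    earlier = {101: 40_000, 102: 35_000, 103: 25_000}
    later = {104: 100_000}
    report = describe_churn(earlier, later)
    # Every generation was replaced while the total size stayed the same,
    # which means the data was rewritten rather than newly ingested.
    print(report)
```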


Re: SSTable generation numbers

2016-06-28 Thread Rajath Subramanyam
Thanks Tyler.

- Rajath


Rajath Subramanyam


On Tue, Jun 28, 2016 at 11:01 AM, Tyler Hobbs  wrote:

> 32 bit integer overflow is the only scenario where a single node would wrap
> around.
>
> However, when copying sstables from one node to another, there can easily
> be conflicts, so this is something to be careful about.
>
> On Mon, Jun 27, 2016 at 8:10 PM, Rajath Subramanyam 
> wrote:
>
> > Hello Cassandra-dev,
> >
> > Are there any scenarios in which the generation numbers of SSTables (i.e.
> > ksname-cfname--Data.db) can wrap around, without the admin
> > dropping and re-creating the CF with the same name ?
> >
> > I believe that the answer must be version-agnostic, but in case it matters,
> > I am specifically asking this question for C* 2.0/2.1.
> >
> > Thanks in advance for your help.
> >
> > - Rajath
> > 
> > Rajath Subramanyam
> >
>
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
>
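
Tyler's two points (32-bit wrap-around, and conflicts when copying sstables
between nodes) can be sketched in Python. This is a minimal illustration,
not Cassandra code: the two file-name patterns are inferred only from names
quoted elsewhere in this archive (ks3-test1-ka-2-Statistics.db and
mc-5-big-Statistics.db), and the function names are hypothetical. The
conflict check assumes both lists come from the same table.

```python
import re

# Assumed patterns, inferred from the file names quoted in this archive:
#   ks3-test1-ka-2-Statistics.db  -> generation 2
#   mc-5-big-Statistics.db        -> generation 5
LEGACY = re.compile(r"^\w+-\w+-[a-z]{2}-(?P<gen>\d+)-\w+\.db$")
BIG = re.compile(r"^[a-z]{2}-(?P<gen>\d+)-big-\w+\.db$")


def generation(file_name):
    """Return the generation number in an sstable file name, or None."""
    match = LEGACY.match(file_name) or BIG.match(file_name)
    return int(match.group("gen")) if match else None


def conflicting_generations(local_files, incoming_files):
    """Generations present on both sides, i.e. names that would collide."""
    local = {generation(f) for f in local_files} - {None}
    incoming = {generation(f) for f in incoming_files} - {None}
    return sorted(local & incoming)


def wrap_int32(value):
    """Reduce an integer to signed 32-bit range, as an overflow would."""
    return (value + 2**31) % 2**32 - 2**31


if __name__ == "__main__":
    print(conflicting_generations(
        ["ks3-test1-ka-2-Data.db", "ks3-test1-ka-7-Data.db"],
        ["ks3-test1-ka-2-Data.db", "ks3-test1-ka-9-Data.db"],
    ))                               # [2]
    print(wrap_int32(2**31 - 1 + 1))  # -2147483648
```

When the check reports a collision, the incoming files would need new
generation numbers before being placed next to the local ones.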


SSTable generation numbers

2016-06-27 Thread Rajath Subramanyam
Hello Cassandra-dev,

Are there any scenarios in which the generation numbers of SSTables (i.e.
ksname-cfname--Data.db) can wrap around, without the admin
dropping and re-creating the CF with the same name ?

I believe that the answer must be version-agnostic, but in case it matters,
I am specifically asking this question for C* 2.0/2.1.

Thanks in advance for your help.

- Rajath

Rajath Subramanyam


Re: Statistics.db file in Cassandra 3.0

2016-02-11 Thread Rajath Subramanyam
Thanks Tyler.


Rajath Subramanyam


On Wed, Feb 10, 2016 at 8:45 AM, Tyler Hobbs  wrote:

> Take a look at MetadataSerializer and the MetadataComponent subclasses.
>
> On Tue, Feb 9, 2016 at 4:34 PM, Rajath Subramanyam 
> wrote:
>
> > Hello Cassandra-Dev,
> >
> > I have noticed that in Cassandra 3.0 there is a new file in the
> > // called
> > ma--big-Statistics.db.
> >
> > What does this file contain ? Is it compressed ? How do I read it ?
> >
> > Thanks in advance for sharing some information on this.
> >
> > - Rajath
> > 
> > Rajath Subramanyam
> >
>
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
>


Statistics.db file in Cassandra 3.0

2016-02-09 Thread Rajath Subramanyam
Hello Cassandra-Dev,

I have noticed that in Cassandra 3.0 there is a new file in the
// called
ma--big-Statistics.db.

What does this file contain ? Is it compressed ? How do I read it ?

Thanks in advance for sharing some information on this.

- Rajath

Rajath Subramanyam


Re: SSTable format in C* 2.2 and later

2016-01-20 Thread Rajath Subramanyam
Thanks Tyler. Cassandra-8099 has a lot of meat. I will get back in case of
further questions.

- Rajath


Rajath Subramanyam


On Tue, Jan 19, 2016 at 8:27 AM, Tyler Hobbs  wrote:

> Primarily, CASSANDRA-8099.  If you look at the Version class in
> o.a.c.io.sstable.format.big.BigFormat, there are comments that list the
> different sstable versions along with what changes went into those.  You
> can look at git blame to see what the related jira tickets are.
>
> On Mon, Jan 18, 2016 at 7:48 PM, Rajath Subramanyam 
> wrote:
>
> > Hello Cassandra-dev community,
> >
> > Does anyone know the JIRAs that affected the change in the SSTable format
> > for C* 2.2 and later ?
> >
> > Thanks in advance.
> >
> > - Rajath
> > 
> > Rajath Subramanyam
> >
>
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
>


SSTable format in C* 2.2 and later

2016-01-18 Thread Rajath Subramanyam
Hello Cassandra-dev community,

Does anyone know the JIRAs that affected the change in the SSTable format
for C* 2.2 and later ?

Thanks in advance.

- Rajath

Rajath Subramanyam


Re: Partitioned Counters Design

2014-10-10 Thread Rajath Subramanyam
Hi Aleksey,

Thanks for the response. I read through several JIRAs (CASSANDRA-1072,
CASSANDRA-2495, CASSANDRA-4775, CASSANDRA-6504, CASSANDRA-6506). I would
appreciate it if you could clarify my understanding of the current state of
the art.

I understand that the earlier design of having local shards, remote shards
and logging only deltas for local shards provided good latency but suffered
from inconsistencies because replays were not idempotent. Was this primarily
due to a lack of internal idempotent updates (replica-replica) or a lack of
external idempotent updates (client-coordinator)? Also, I have an
additional question about the state of counters before CASSANDRA-6504 (or
Counters 1.0 if you like). Does the client submit
CounterUpdate ? During the retry is the
same  CounterUpdate submitted again
identically as the original attempt ? When the coordinator (if it is one of
the replica) OR the leader does a write propagation to the other replicas
does it include the same CounterUpdate ?
Essentially, my question is: Can we identify each counter update from a
client uniquely ? and does the retry use the same unique id ?

Also, after the implementation of CASSANDRA-6504 (the Counters 2.0 design), I
understand that local and remote shards have been replaced by global shards
and that counters now follow a lock-read-write-unlock-replicate model. Does
this guarantee that retries from clients are still idempotent? Also, in
Counters 2.0, does the CounterUpdate sent by the client carry a unique id?

I would appreciate your clarification on this.

Thank you !

Regards,
Rajath



Rajath Subramanyam


On Mon, Oct 6, 2014 at 3:38 PM, Aleksey Yeschenko 
wrote:

> No, there are no unique ids per increment. That was one of the ideas
> suggested in https://issues.apache.org/jira/browse/CASSANDRA-4775, but
> ultimately declined.
>
> Read that ticket, and the one linked to it, for details.
>
> --
> AY
>
> On October 6, 2014 at 10:20:05 PM, Rajath Subramanyam (rajat...@gmail.com)
> wrote:
>
> Hi Cassandra developers,
>
> I am working on a project to make counter updates idempotent. I read that
> CASSANDRA-1546 assigns unique marker ids to counter updates in 0.7.1.
> Does this unique marker id hold true in the later versions too ? Or at
> least in 0.8.1 ?
>
> Please let me know.
>
> Thank you !
>
> Regards,
> Rajath
> 
> Rajath Subramanyam
>
>
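
The idempotence question in this thread can be shown with a small Python
sketch. It is a conceptual illustration only and does not reflect how
Cassandra counters are implemented; as Aleksey notes above, per-increment
unique ids were proposed in CASSANDRA-4775 and declined. Both classes and
the update_id parameter are hypothetical.

```python
class DeltaCounter:
    """Applies raw deltas, so replaying an update changes the result."""

    def __init__(self):
        self.value = 0

    def apply(self, delta):
        self.value += delta


class DedupingCounter:
    """Applies each (update_id, delta) at most once (hypothetical design)."""

    def __init__(self):
        self.value = 0
        self._seen = set()

    def apply(self, update_id, delta):
        if update_id in self._seen:
            return False          # retry of an update already applied
        self._seen.add(update_id)
        self.value += delta
        return True


if __name__ == "__main__":
    plain = DeltaCounter()
    plain.apply(1)
    plain.apply(1)                # client retry after a timeout
    print(plain.value)            # 2: the retry over-counted

    dedup = DedupingCounter()
    dedup.apply("u-1", 1)
    dedup.apply("u-1", 1)         # same retry, same id
    print(dedup.value)            # 1: the retry was ignored
```

The second class only works if the retry carries the same id as the original
attempt, which is exactly what the questions above are asking about.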


Partitioned Counters Design

2014-10-06 Thread Rajath Subramanyam
Hi Cassandra developers,

I am working on a project to make counter updates idempotent. I read that
CASSANDRA-1546 assigns unique marker ids to counter updates in 0.7.1.
Does this unique marker id hold true in the later versions too ? Or at
least in 0.8.1 ?

Please let me know.

Thank you !

Regards,
Rajath

Rajath Subramanyam


Fork Cassandra-0.8

2014-09-25 Thread Rajath Subramanyam
Hi All,

I want to fork Cassandra - 0.8 from git. Does anybody have any advice on
how to do this ?

I have already tried the following:
- Forked the latest repo and created a branch to go back to an old commit
(around 0.8)
- Forked the latest repo and ran git reset --hard to go back to an old
commit (around 0.8)

I am looking for a much cleaner way to directly fork the 0.8 version. Any
help is appreciated.

Thanks in advance.

// Rajath

Rajath Subramanyam