Re: Compaction strategy for update heavy workload

2018-06-13 Thread kurt greaves
>
> I wouldn't use TWCS if there are updates; you're going to risk having
> data that's never deleted and really small SSTables sticking around
> forever.

How do you risk having data sticking around forever when everything is
TTL'd?

> If you use really large buckets, what's the point of TWCS?

No one said anything about really large buckets. I'd also note that if the
data per partition is that small, it would be entirely reasonable not to
bucket by partition key (and window), and thus updates would become
irrelevant.

> Honestly this is such a small workload you could easily use STCS or
> LCS and you'd likely never, ever see a problem.


While the numbers sound small, there must be some logical reason to have so
many nodes. In my experience STCS and LCS both have their own drawbacks with
regard to updates, more so when you have high data density, which sounds
like it might be the case here. It's not hard to test these strategies, and
it's important to get them right at the start to save yourself some serious
pain down the track.

On 13 June 2018 at 22:41, Jonathan Haddad  wrote:

> I wouldn't use TWCS if there are updates; you're going to risk having
> data that's never deleted and really small SSTables sticking around
> forever.  If you use really large buckets, what's the point of TWCS?
>
> Honestly this is such a small workload you could easily use STCS or
> LCS and you'd likely never, ever see a problem.
> On Wed, Jun 13, 2018 at 3:34 PM kurt greaves  wrote:
> >
> > TWCS is probably still worth trying. If you mean updating old rows in
> > TWCS, "out of order updates" will only really mean you'll hit more
> > SSTables on read. This might add a bit of complexity in your client if
> > you're bucketing partitions (not strictly necessary), but that's about
> > it. As long as you're not specifying "USING TIMESTAMP" you still get the
> > main benefit of efficient dropping of SSTables - C* only cares about the
> > write timestamp of the data with regard to TTLs, not timestamps stored
> > in your partition/clustering key.
> > Also keep in mind that you can specify the window size in TWCS, so if
> > you can increase it enough to cover the "out of order" updates then that
> > will also solve the problem w.r.t. old buckets.
> >
> > With regard to LCS, the only way to really know if it'll be too much
> > compaction overhead is to test it, but for the most part you should
> > consider your read/write ratio, rather than the total number of
> > reads/writes (unless it's so small that it's irrelevant, which it may
> > well be).
> >
> > On 13 June 2018 at 19:25, manuj singh  wrote:
> >>
> >> Hi all,
> >> I am trying to determine a compaction strategy for our use case.
> >> In our use case we will have updates on a row a few times. And we have
> >> a TTL also defined at the table level.
> >> Our typical workload is less than 1000 writes + reads per second. At
> >> the max it could go up to 2500 per second.
> >> We use SSDs and have around 64 GB of RAM on each node. Our cluster
> >> size is around 70 nodes.
> >>
> >> I looked at time series but we can't guarantee that the updates will
> >> happen within a given time window. And if we have out-of-order updates
> >> it might affect when we remove that data from the disk.
> >>
> >> So I was looking at leveled compaction, which supposedly is good when
> >> you have updates. However it's IO-bound and will affect the writes.
> >> Everywhere I read it says it's not good for write-heavy workloads.
> >> But looking at our write velocity, is it really write heavy?
> >>
> >> I guess what I am trying to find out is whether leveled compaction
> >> will impact the writes in our use case or it will be fine given our
> >> write rate is not that much.
> >> Also, is there anything else I should keep in mind while deciding on
> >> the compaction strategy?
> >>
> >> Thanks!!
> >
> >
>
>
> --
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: Timestamp on hints file and system.hints table data

2018-06-13 Thread kurt greaves
Does the UUID in the filename correspond to a UUID in nodetool status?

Sounds to me like it could be something weird with an old node that no
longer exists, although hints for old nodes are meant to be cleaned up.
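
A quick way to check (a sketch; the hints path assumes a default package
install):

# Hint file names embed the target node's host ID, then a millisecond
# timestamp: <host-id>-<epoch-millis>-<version>.hints
ls /var/lib/cassandra/hints/

# Cross-check that UUID against the "Host ID" column of current members.
nodetool status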

On 14 June 2018 at 01:54, Nitan Kainth  wrote:

> Kurt,
>
> No node has been down for months. And yes, I am surprised by the Unix
> timestamp on the files.
>
>
>
> On Jun 13, 2018, at 6:41 PM, kurt greaves  wrote:
>
> system.hints is not used in Cassandra 3. Can't explain the files though;
> are you referring to the file's timestamp or the Unix timestamp in the file
> name? Is there a node that's been down for several months?
>
> On Wed., 13 Jun. 2018, 23:41 Nitan Kainth,  wrote:
>
>> Hi,
>>
>> I observed a strange behavior with stored hints.
>>
>> The timestamps on the hints files are several months old. I deleted them
>> and saw new hints files created with the same old date. Why is that?
>>
>> Also, I see hints files on disk, but if I query the system.hints table, it
>> shows 0 rows. Why is system.hints not populated?
>>
>> Version 3.11-1
>>
>


Re: Timestamp on hints file and system.hints table data

2018-06-13 Thread Nitan Kainth
Kurt,

No node has been down for months. And yes, I am surprised by the Unix
timestamp on the files.


> On Jun 13, 2018, at 6:41 PM, kurt greaves  wrote:
> 
> system.hints is not used in Cassandra 3. Can't explain the files though; are
> you referring to the file's timestamp or the Unix timestamp in the file name?
> Is there a node that's been down for several months?
> 
>> On Wed., 13 Jun. 2018, 23:41 Nitan Kainth,  wrote:
>> Hi,
>> 
>> I observed a strange behavior with stored hints.
>>
>> The timestamps on the hints files are several months old. I deleted them
>> and saw new hints files created with the same old date. Why is that?
>>
>> Also, I see hints files on disk, but if I query the system.hints table, it
>> shows 0 rows. Why is system.hints not populated?
>> 
>> Version 3.11-1


Re: compaction_throughput: Difference between 0 (unthrottled) and large value

2018-06-13 Thread Joshua Galbraith
Thomas,

This post from Ryan Svihla has a few notes in it that may or may not be
useful to you:

> If you read the original throttling Jira you can see that there is a hurry
up and wait component to unthrottled compaction (CASSANDRA-2156, Compaction
Throttling). Ultimately you will saturate your IO in bursts, backing up
other processes and making different bottlenecks spike up along the way,
potentially causing something OTHER than compaction to get so far behind
that the server becomes unresponsive (such as GC).

via https://medium.com/@foundev/how-i-tune-cassandra-compaction-7c16fb0b1d99
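
For what it's worth, the throttle can also be inspected and changed at runtime
without a restart (values are in MB/s; 0 means unthrottled):

nodetool getcompactionthroughput      # show the current throttle in MB/s
nodetool setcompactionthroughput 0    # unthrottle until the next restart
nodetool setcompactionthroughput 256  # or pick an explicit ceiling instead

The runtime change isn't persisted, so compaction_throughput_mb_per_sec in
cassandra.yaml is still what matters across restarts.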

On Mon, Jun 11, 2018 at 12:05 AM, Steinmaurer, Thomas <
thomas.steinmau...@dynatrace.com> wrote:

> Sorry, I should have looked at the source code first. In the case of 0, it
> is set to Double.MAX_VALUE.
>
>
>
> Thomas
>
>
>
> From: Steinmaurer, Thomas [mailto:thomas.steinmau...@dynatrace.com]
> Sent: Monday, June 11, 2018 08:53
> To: user@cassandra.apache.org
> Subject: compaction_throughput: Difference between 0 (unthrottled) and
> large value
>
>
>
> Hello,
>
>
>
> on a 3 node loadtest cluster with very capable machines (32 physical
> cores, 512G RAM, 20T storage (26 disk RAID)), I’m trying to max out
> compaction, thus currently testing with:
>
>
>
> concurrent_compactors: 16
>
> compaction_throughput_mb_per_sec: 0
>
>
>
> With our simulated incoming load + compaction etc., the Linux volume shows
> ~20 MB/s read IO + 50 MB/s write IO on average, constantly.
>
>
>
>
>
> Setting throughput to 0 should mean unthrottled, right? Is this really
> unthrottled from a throughput perspective, and thus basically limited by
> disk capabilities only? Or would it be better to set a very high value
> instead of 0? Is there any semantic difference here?
>
>
>
>
>
> Thanks,
>
> Thomas
>
>
>
> The contents of this e-mail are intended for the named addressee only. It
> contains information that may be confidential. Unless you are the named
> addressee or an authorized designee, you may not copy or use it, or
> disclose it to anyone else. If you received it in error please notify us
> immediately and then destroy it. Dynatrace Austria GmbH (registration
> number FN 91482h) is a company registered in Linz whose registered office
> is at 4040 Linz, Austria, Freistädterstraße 313
>



-- 
Joshua Galbraith | Senior Software Engineer | New Relic
C: 907-209-1208 | jgalbra...@newrelic.com


Re: Compaction strategy for update heavy workload

2018-06-13 Thread Jonathan Haddad
I wouldn't use TWCS if there are updates; you're going to risk having
data that's never deleted and really small SSTables sticking around
forever.  If you use really large buckets, what's the point of TWCS?

Honestly this is such a small workload you could easily use STCS or
LCS and you'd likely never, ever see a problem.
On Wed, Jun 13, 2018 at 3:34 PM kurt greaves  wrote:
>
> TWCS is probably still worth trying. If you mean updating old rows in TWCS,
> "out of order updates" will only really mean you'll hit more SSTables on
> read. This might add a bit of complexity in your client if you're bucketing
> partitions (not strictly necessary), but that's about it. As long as you're
> not specifying "USING TIMESTAMP" you still get the main benefit of efficient
> dropping of SSTables - C* only cares about the write timestamp of the data
> with regard to TTLs, not timestamps stored in your partition/clustering key.
> Also keep in mind that you can specify the window size in TWCS, so if you can
> increase it enough to cover the "out of order" updates then that will also
> solve the problem w.r.t. old buckets.
>
> With regard to LCS, the only way to really know if it'll be too much
> compaction overhead is to test it, but for the most part you should consider
> your read/write ratio, rather than the total number of reads/writes (unless
> it's so small that it's irrelevant, which it may well be).
>
> On 13 June 2018 at 19:25, manuj singh  wrote:
>>
>> Hi all,
>> I am trying to determine a compaction strategy for our use case.
>> In our use case we will have updates on a row a few times. And we have a
>> TTL also defined at the table level.
>> Our typical workload is less than 1000 writes + reads per second. At the
>> max it could go up to 2500 per second.
>> We use SSDs and have around 64 GB of RAM on each node. Our cluster size is
>> around 70 nodes.
>>
>> I looked at time series but we can't guarantee that the updates will happen
>> within a given time window. And if we have out-of-order updates it might
>> affect when we remove that data from the disk.
>>
>> So I was looking at leveled compaction, which supposedly is good when you
>> have updates. However it's IO-bound and will affect the writes. Everywhere I
>> read it says it's not good for write-heavy workloads.
>> But looking at our write velocity, is it really write heavy?
>>
>> I guess what I am trying to find out is whether leveled compaction will
>> impact the writes in our use case or it will be fine given our write rate is
>> not that much.
>> Also, is there anything else I should keep in mind while deciding on the
>> compaction strategy?
>>
>> Thanks!!
>
>


-- 
Jon Haddad
http://www.rustyrazorblade.com
twitter: rustyrazorblade

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: Timestamp on hints file and system.hints table data

2018-06-13 Thread kurt greaves
system.hints is not used in Cassandra 3. Can't explain the files though; are
you referring to the file's timestamp or the Unix timestamp in the file
name? Is there a node that's been down for several months?

On Wed., 13 Jun. 2018, 23:41 Nitan Kainth,  wrote:

> Hi,
>
> I observed a strange behavior with stored hints.
>
> The timestamps on the hints files are several months old. I deleted them and
> saw new hints files created with the same old date. Why is that?
>
> Also, I see hints files on disk, but if I query the system.hints table, it
> shows 0 rows. Why is system.hints not populated?
>
> Version 3.11-1
>


Re: Compaction strategy for update heavy workload

2018-06-13 Thread kurt greaves
TWCS is probably still worth trying. If you mean updating old rows in TWCS,
"out of order updates" will only really mean you'll hit more SSTables on
read. This might add a bit of complexity in your client if you're bucketing
partitions (not strictly necessary), but that's about it. As long as you're
not specifying "USING TIMESTAMP" you still get the main benefit of
efficient dropping of SSTables - C* only cares about the *write timestamp* of
the data with regard to TTLs, not timestamps stored in your
partition/clustering key.
Also keep in mind that you can specify the window size in TWCS, so if you
can increase it enough to cover the "out of order" updates then that will
also solve the problem w.r.t. old buckets.
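
For example, widening the window to a day would look something like this (a
sketch; keyspace and table names are placeholders):

cqlsh -e "ALTER TABLE my_ks.my_table WITH compaction = {
  'class': 'TimeWindowCompactionStrategy',
  'compaction_window_unit': 'DAYS',
  'compaction_window_size': 1
};"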

With regard to LCS, the only way to really know if it'll be too much
compaction overhead is to test it, but for the most part you should
consider your read/write ratio, rather than the total number of
reads/writes (unless it's so small that it's irrelevant, which it may well
be).
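
If you want a quick approximation before committing, cassandra-stress can
drive roughly the stated peak against a test table (a sketch; the -rate
option names vary slightly between versions):

# ~1:1 read/write mix at ~2500 ops/s, loosely matching the numbers below
cassandra-stress mixed ratio\(write=1,read=1\) duration=30m \
  -rate threads=50 throttle=2500/s \
  -node 127.0.0.1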

On 13 June 2018 at 19:25, manuj singh  wrote:

> Hi all,
> I am trying to determine a compaction strategy for our use case.
> In our use case we will have updates on a row a few times. And we have a
> TTL also defined at the table level.
> Our typical workload is less than 1000 writes + reads per second. At the
> max it could go up to 2500 per second.
> We use SSDs and have around 64 GB of RAM on each node. Our cluster size is
> around 70 nodes.
>
> I looked at time series but we can't guarantee that the updates will
> happen within a given time window. And if we have out-of-order updates it
> might affect when we remove that data from the disk.
>
> So I was looking at leveled compaction, which supposedly is good when you
> have updates. However it's IO-bound and will affect the writes. Everywhere
> I read it says it's not good for write-heavy workloads.
> But looking at our write velocity, is it really write heavy?
>
> I guess what I am trying to find out is whether leveled compaction will
> impact the writes in our use case or it will be fine given our write rate
> is not that much.
> Also, is there anything else I should keep in mind while deciding on the
> compaction strategy?
>
> Thanks!!
>


Re: Migrating to Reaper: Switching From Incremental to Reaper's Full Subrange Repair

2018-06-13 Thread kurt greaves
Not strictly necessary, but probably a good idea, as you don't want two
separate pools of SSTables unnecessarily. Also, if you've set
"only_purge_repaired_tombstones" you'll need to turn that off.

On Wed., 13 Jun. 2018, 23:06 Fd Habash,  wrote:

> For those who are using Reaper …
>
>
>
> Currently, I run repairs via crontab/nodetool using 'repair -pr' on
> 2.2.8, which defaults to incremental. If I migrate to Reaper, do I have to
> mark sstables as un-repaired first? Also, out of the box, does Reaper run
> full parallel repair? If yes, is it not going to cause over-streaming since
> we are repairing ranges multiple times?
>
>
>
> 
> Thank you
>
>
>


Re: G1GC CPU Spike

2018-06-13 Thread Chris Lohfink
There isn't even a 100ms GC pause in that; are you certain there's a problem?

> On Jun 13, 2018, at 3:00 PM, rajpal reddy  wrote:
> 
> Thanks Chris, I did attach the gc logs already; reattaching them
> now.
> 
> It started yesterday around 11:54 PM.
>> On Jun 13, 2018, at 3:56 PM, Chris Lohfink  wrote:
>> 
>>> What are the criteria for picking the value for G1ReservePercent?
>> 
>> 
>> It depends on the object allocation rate vs. the size of the heap. Cassandra
>> would ideally be under 500-600 MB/s of allocations, but it can spike pretty
>> high with something like reading a wide partition or repair streaming, which
>> might exceed what the G1 young GC's tenuring and timing are prepared for
>> from the previous steady rate. Giving it a bigger buffer is a nice safety
>> net for allocation spikes.
>> 
>>> is HEAP_NEWSIZE required only for CMS
>> 
>> 
>> It should only set Xmn from that when using CMS; with G1 it should be
>> ignored, or else yes, it would be bad to set Xmn. The gc logs will give the
>> results of all the bash scripts along with details of what's happening, so
>> sharing them is your best option if you want help.
>> 
>> Chris
>> 
>>> On Jun 13, 2018, at 12:17 PM, Subroto Barua  wrote:
>>> 
>>> Chris,
>>> What are the criteria for picking the value for G1ReservePercent?
>>> 
>>> Subroto 
>>> 
 On Jun 13, 2018, at 6:52 AM, Chris Lohfink  wrote:
 
 G1ReservePercent
>>> 
>>> -
>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>> 
>> 
>> 
>> -
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
>> 
> 
> 
> 
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org


-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: G1GC CPU Spike

2018-06-13 Thread Chris Lohfink
> What are the criteria for picking the value for G1ReservePercent?


It depends on the object allocation rate vs. the size of the heap. Cassandra
would ideally be under 500-600 MB/s of allocations, but it can spike pretty
high with something like reading a wide partition or repair streaming, which
might exceed what the G1 young GC's tenuring and timing are prepared for from
the previous steady rate. Giving it a bigger buffer is a nice safety net for
allocation spikes.

> is HEAP_NEWSIZE required only for CMS


It should only set Xmn from that when using CMS; with G1 it should be ignored,
or else yes, it would be bad to set Xmn. The gc logs will give the results of
all the bash scripts along with details of what's happening, so sharing them is
your best option if you want help.

Chris

> On Jun 13, 2018, at 12:17 PM, Subroto Barua  wrote:
> 
> Chris,
> What are the criteria for picking the value for G1ReservePercent?
> 
> Subroto 
> 
>> On Jun 13, 2018, at 6:52 AM, Chris Lohfink  wrote:
>> 
>> G1ReservePercent
> 
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
> 


-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Compaction strategy for update heavy workload

2018-06-13 Thread manuj singh
Hi all,
I am trying to determine a compaction strategy for our use case.
In our use case we will have updates on a row a few times. And we have a
TTL also defined at the table level.
Our typical workload is less than 1000 writes + reads per second. At the
max it could go up to 2500 per second.
We use SSDs and have around 64 GB of RAM on each node. Our cluster size is
around 70 nodes.

I looked at time series but we can't guarantee that the updates will happen
within a given time window. And if we have out-of-order updates it might
affect when we remove that data from the disk.

So I was looking at leveled compaction, which supposedly is good when you
have updates. However it's IO-bound and will affect the writes. Everywhere I
read it says it's not good for write-heavy workloads.
But looking at our write velocity, is it really write heavy?

I guess what I am trying to find out is whether leveled compaction will
impact the writes in our use case or it will be fine given our write rate
is not that much.
Also, is there anything else I should keep in mind while deciding on the
compaction strategy?

Thanks!!


Re: G1GC CPU Spike

2018-06-13 Thread Subroto Barua
Chris,
What are the criteria for picking the value for G1ReservePercent?

Subroto 

> On Jun 13, 2018, at 6:52 AM, Chris Lohfink  wrote:
> 
> G1ReservePercent

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: G1GC CPU Spike

2018-06-13 Thread rajpal reddy
Does setting HEAP_NEWSIZE="800M" mean the young generation can only use 800M?

> On Jun 13, 2018, at 12:51 PM, Steinmaurer, Thomas  wrote:
> 
> Explicitly setting Xmn with G1 basically results in overriding the target
> pause-time goal, and thus should be avoided.
> http://www.oracle.com/technetwork/articles/java/g1gc-1984535.html 
> 
>  
> Thomas
>  
>  
> From: rajpal reddy [mailto:rajpalreddy...@gmail.com]
> Sent: Wednesday, June 13, 2018 17:27
> To: user@cassandra.apache.org
> Subject: Re: G1GC CPU Spike
>  
> We have these as the heap settings. Is HEAP_NEWSIZE required only for
> CMS? Can we get rid of it for G1GC so that G1 can be used properly?
> MAX_HEAP_SIZE="8192M"
> HEAP_NEWSIZE="800M"
>  
> On Jun 13, 2018, at 11:15 AM, Chris Lohfink  wrote:
>  
> That metric is the total number of seconds spent in GC; it will increase
> over time with every young gc, which is expected. What's interesting is the
> rate of growth, not the fact that it's increasing. If your graphing tool has
> an option to graph the derivative, you should use that instead.
>  
> Chris
> 
> 
> On Jun 13, 2018, at 9:51 AM, rajpal reddy  wrote:
>  
> jvm_gc_collection_seconds_count{gc="G1 Young Generation"} and also the young
> generation seconds count keep increasing
>  
> 
>  
> On Jun 13, 2018, at 9:52 AM, Chris Lohfink  wrote:
>  
> The gc log file is best to share when asking for help with tuning. The top
> of the file has all the computed args it ran with, and it gives details on
> what part of the GC is taking time. I would guess the CPU spike is from full
> GCs, which with that small of a heap is probably from evacuation failures.
> Reserving more of the heap to be free (-XX:G1ReservePercent=25) can help,
> along with increasing the amount of heap. 8GB is pretty small for G1; might
> be better off with CMS.
> 
> Chris
> 
> 
> On Jun 13, 2018, at 8:42 AM, rajpal reddy  wrote:
> 
> Hello,
> 
> We are using G1GC and noticing garbage collection taking a while, and
> during that process we see CPU spiking up to 70-80%. Can you please let us
> know if we have to tune any parameters for that? Attaching the
> cassandra-env file with jvm-options.
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org 
> 
> For additional commands, e-mail: user-h...@cassandra.apache.org 
> 
> 
> 
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org 
> 
> For additional commands, e-mail: user-h...@cassandra.apache.org 
> 
>  
>  
>  
> The contents of this e-mail are intended for the named addressee only. It 
> contains information that may be confidential. Unless you are the named 
> addressee or an authorized designee, you may not copy or use it, or disclose 
> it to anyone else. If you received it in error please notify us immediately 
> and then destroy it. Dynatrace Austria GmbH (registration number FN 91482h) 
> is a company registered in Linz whose registered office is at 4040 Linz, 
> Austria, Freistädterstraße 313



RE: G1GC CPU Spike

2018-06-13 Thread Steinmaurer, Thomas
Explicitly setting Xmn with G1 basically results in overriding the target
pause-time goal, and thus should be avoided.
http://www.oracle.com/technetwork/articles/java/g1gc-1984535.html
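
In practice that means letting G1 size the young generation itself; a sketch
of the relevant jvm.options lines:

-Xms8G
-Xmx8G
-XX:+UseG1GC
-XX:MaxGCPauseMillis=500
## no -Xmn here: G1 sizes the young generation to meet the pause-time goal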

Thomas


From: rajpal reddy [mailto:rajpalreddy...@gmail.com]
Sent: Wednesday, June 13, 2018 17:27
To: user@cassandra.apache.org
Subject: Re: G1GC CPU Spike

We have these as the heap settings. Is HEAP_NEWSIZE required only for CMS?
Can we get rid of it for G1GC so that G1 can be used properly?
MAX_HEAP_SIZE="8192M"
HEAP_NEWSIZE="800M"

On Jun 13, 2018, at 11:15 AM, Chris Lohfink wrote:

That metric is the total number of seconds spent in GC; it will increase over
time with every young gc, which is expected. What's interesting is the rate of
growth, not the fact that it's increasing. If your graphing tool has an option
to graph the derivative, you should use that instead.

Chris


On Jun 13, 2018, at 9:51 AM, rajpal reddy wrote:

jvm_gc_collection_seconds_count{gc="G1 Young Generation"} and also the young
generation seconds count keep increasing



On Jun 13, 2018, at 9:52 AM, Chris Lohfink wrote:

The gc log file is best to share when asking for help with tuning. The top of
the file has all the computed args it ran with, and it gives details on what
part of the GC is taking time. I would guess the CPU spike is from full GCs,
which with that small of a heap is probably from evacuation failures. Reserving
more of the heap to be free (-XX:G1ReservePercent=25) can help, along with
increasing the amount of heap. 8GB is pretty small for G1; might be better off
with CMS.

Chris


On Jun 13, 2018, at 8:42 AM, rajpal reddy wrote:

Hello,

We are using G1GC and noticing garbage collection taking a while, and during
that process we see CPU spiking up to 70-80%. Can you please let us know if we
have to tune any parameters for that? Attaching the cassandra-env file with
jvm-options.
-
To unsubscribe, e-mail: 
user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: 
user-h...@cassandra.apache.org


-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



The contents of this e-mail are intended for the named addressee only. It 
contains information that may be confidential. Unless you are the named 
addressee or an authorized designee, you may not copy or use it, or disclose it 
to anyone else. If you received it in error please notify us immediately and 
then destroy it. Dynatrace Austria GmbH (registration number FN 91482h) is a 
company registered in Linz whose registered office is at 4040 Linz, Austria, 
Freistädterstraße 313


Possible support for datatype timestamp in native aggregate functions ?

2018-06-13 Thread Devopam Mittra
Hi,
Just wondering if it would make sense to have support for the timestamp
datatype in CQL native aggregates like MAX/MIN.


-- 
Devopam Mittra
Life and Relations are not binary


Re: G1GC CPU Spike

2018-06-13 Thread rajpal reddy
We have these as the heap settings. Is HEAP_NEWSIZE required only for CMS?
Can we get rid of it for G1GC so that G1 can be used properly?
MAX_HEAP_SIZE="8192M"
HEAP_NEWSIZE="800M"

> On Jun 13, 2018, at 11:15 AM, Chris Lohfink  wrote:
> 
> That metric is the total number of seconds spent in GC; it will increase
> over time with every young gc, which is expected. What's interesting is the
> rate of growth, not the fact that it's increasing. If your graphing tool has
> an option to graph the derivative, you should use that instead.
> 
> Chris
> 
>> On Jun 13, 2018, at 9:51 AM, rajpal reddy  wrote:
>> 
>> jvm_gc_collection_seconds_count{gc="G1 Young Generation"} and also the
>> young generation seconds count keep increasing
>> 
>> 
>> 
>>> On Jun 13, 2018, at 9:52 AM, Chris Lohfink  wrote:
>>> 
>>> The gc log file is best to share when asking for help with tuning. The top
>>> of the file has all the computed args it ran with, and it gives details on
>>> what part of the GC is taking time. I would guess the CPU spike is from
>>> full GCs, which with that small of a heap is probably from evacuation
>>> failures. Reserving more of the heap to be free (-XX:G1ReservePercent=25)
>>> can help, along with increasing the amount of heap. 8GB is pretty small
>>> for G1; might be better off with CMS.
>>> 
>>> Chris
>>> 
 On Jun 13, 2018, at 8:42 AM, rajpal reddy  wrote:
 
 Hello,
 
 We are using G1GC and noticing garbage collection taking a while, and
 during that process we see CPU spiking up to 70-80%. Can you please
 let us know if we have to tune any parameters for that? Attaching the
 cassandra-env file with jvm-options.
 -
 To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org 
 
 For additional commands, e-mail: user-h...@cassandra.apache.org 
 
>>> 
>>> 
>>> -
>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org 
>>> 
>>> For additional commands, e-mail: user-h...@cassandra.apache.org 
>>> 
>>> 
>> 
> 



Re: G1GC CPU Spike

2018-06-13 Thread rajpal reddy
Chris,

We have 16G total memory on the machine, so we gave 50% to the heap. Initially
we had CMS and saw the same issues, so we thought of changing to G1GC; we still
have the same problem.
> On Jun 13, 2018, at 9:52 AM, Chris Lohfink  wrote:
> 
> The gc log file is best to share when asking for help with tuning. The top
> of the file has all the computed args it ran with, and it gives details on
> what part of the GC is taking time. I would guess the CPU spike is from full
> GCs, which with that small of a heap is probably from evacuation failures.
> Reserving more of the heap to be free (-XX:G1ReservePercent=25) can help,
> along with increasing the amount of heap. 8GB is pretty small for G1; might
> be better off with CMS.
> 
> Chris
> 
>> On Jun 13, 2018, at 8:42 AM, rajpal reddy  wrote:
>> 
>> Hello,
>> 
>> We are using G1GC and noticing garbage collection taking a while, and
>> during that process we see CPU spiking up to 70-80%. Can you please let us
>> know if we have to tune any parameters for that? Attaching the
>> cassandra-env file with jvm-options.
>> -
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
> 
> 
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
> 


-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: G1GC CPU Spike

2018-06-13 Thread Chris Lohfink
That metric is the total number of seconds spent in GC; it will increase over
time with every young gc, which is expected. What's interesting is the rate of
growth, not the fact that it's increasing. If your graphing tool has an option
to graph the derivative, you should use that instead.

Chris
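
If the counter is being scraped by Prometheus (which that metric name
suggests), the rate can be queried directly; a sketch, with host and window
as placeholders:

# per-second derivative of the counter above, averaged over 5m windows
curl -s 'http://prometheus:9090/api/v1/query' \
  --data-urlencode \
  'query=rate(jvm_gc_collection_seconds_count{gc="G1 Young Generation"}[5m])'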

> On Jun 13, 2018, at 9:51 AM, rajpal reddy  wrote:
> 
> jvm_gc_collection_seconds_count{gc="G1 Young Generation"} and also the young
> generation seconds count keep increasing
> 
> 
> 
>> On Jun 13, 2018, at 9:52 AM, Chris Lohfink  wrote:
>> 
>> The gc log file is best to share when asking for help with tuning. The top
>> of the file has all the computed args it ran with, and it gives details on
>> what part of the GC is taking time. I would guess the CPU spike is from full
>> GCs, which with that small of a heap is probably from evacuation failures.
>> Reserving more of the heap to be free (-XX:G1ReservePercent=25) can help,
>> along with increasing the amount of heap. 8GB is pretty small for G1; might
>> be better off with CMS.
>> 
>> Chris
>> 
>>> On Jun 13, 2018, at 8:42 AM, rajpal reddy  wrote:
>>> 
>>> Hello,
>>> 
>>> We are using G1GC and noticing garbage collection taking a while, and
>>> during that process we see CPU spiking up to 70-80%. Can you please let us
>>> know if we have to tune any parameters for that? Attaching the
>>> cassandra-env file with jvm-options.
>>> -
>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org 
>>> 
>>> For additional commands, e-mail: user-h...@cassandra.apache.org 
>>> 
>> 
>> 
>> -
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org 
>> 
>> For additional commands, e-mail: user-h...@cassandra.apache.org 
>> 
>> 
> 



Re: G1GC CPU Spike

2018-06-13 Thread rajpal reddy
jvm_gc_collection_seconds_count{gc="G1 Young Generation"} and also the young
generation seconds count keep increasing



> On Jun 13, 2018, at 9:52 AM, Chris Lohfink  wrote:
> 
> The gc log file is best to share when asking for help with tuning. The top
> of the file has all the computed args it ran with, and it gives details on
> what part of the GC is taking time. I would guess the CPU spike is from full
> GCs, which with that small of a heap is probably from evacuation failures.
> Reserving more of the heap to be free (-XX:G1ReservePercent=25) can help,
> along with increasing the amount of heap. 8GB is pretty small for G1; might
> be better off with CMS.
> 
> Chris
> 
>> On Jun 13, 2018, at 8:42 AM, rajpal reddy  wrote:
>> 
>> Hello,
>> 
>> We are using G1GC and noticing garbage collection taking a while, and
>> during that process we see CPU spiking up to 70-80%. Can you please let us
>> know if we have to tune any parameters for that? Attaching the
>> cassandra-env file with jvm-options.
>> -
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
> 
> 
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
> 



Re: G1GC CPU Spike

2018-06-13 Thread Chris Lohfink
The gc log file is best to share when asking for help with tuning. The top of
the file has all the computed args it ran with, and it gives details on what
part of the GC is taking time. I would guess the CPU spike is from full GCs,
which with that small of a heap is probably from evacuation failures. Reserving
more of the heap to be free (-XX:G1ReservePercent=25) can help, along with
increasing the amount of heap. 8GB is pretty small for G1; might be better off
with CMS.

Chris
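
Concretely, that would be something like the following in jvm.options (a
sketch; the G1ReservePercent default is 10):

-XX:+UseG1GC
-XX:G1ReservePercent=25
## plus a larger -Xms/-Xmx pair than the current 8G, if the box has headroom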

> On Jun 13, 2018, at 8:42 AM, rajpal reddy  wrote:
> 
> Hello,
> 
> We are using G1GC and noticing garbage collection taking a while, and during
> that process we see CPU spiking up to 70-80%. Can you please let us know if
> we have to tune any parameters for that? Attaching the cassandra-env file
> with jvm-options.
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org


-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



G1GC CPU Spike

2018-06-13 Thread rajpal reddy
Hello,

We are using G1GC and noticing garbage collection taking a while, and during
that process we see CPU spiking up to 70-80%. Can you please let us know if we
have to tune any parameters for that? Attaching the cassandra-env file with
jvm-options.

calculate_heap_sizes()
{
case "`uname`" in
Linux)
system_memory_in_mb=`free -m | awk '/:/ {print $2;exit}'`
system_cpu_cores=`egrep -c 'processor([[:space:]]+):.*' /proc/cpuinfo`
;;
FreeBSD)
system_memory_in_bytes=`sysctl hw.physmem | awk '{print $2}'`
system_memory_in_mb=`expr $system_memory_in_bytes / 1024 / 1024`
system_cpu_cores=`sysctl hw.ncpu | awk '{print $2}'`
;;
SunOS)
system_memory_in_mb=`prtconf | awk '/Memory size:/ {print $3}'`
system_cpu_cores=`psrinfo | wc -l`
;;
Darwin)
system_memory_in_bytes=`sysctl hw.memsize | awk '{print $2}'`
system_memory_in_mb=`expr $system_memory_in_bytes / 1024 / 1024`
system_cpu_cores=`sysctl hw.ncpu | awk '{print $2}'`
;;
*)
system_memory_in_mb="2048"
system_cpu_cores="2"
;;
esac

if [ "$system_cpu_cores" -lt "1" ]
then
system_cpu_cores="1"
fi

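# Heap heuristic below: max heap = max(min(RAM/2, 1024MB), min(RAM/4, 8192MB)),
# i.e. roughly half the RAM on small boxes, capped at 8G on larger ones.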
half_system_memory_in_mb=`expr $system_memory_in_mb / 2`
quarter_system_memory_in_mb=`expr $half_system_memory_in_mb / 2`
if [ "$half_system_memory_in_mb" -gt "1024" ]
then
half_system_memory_in_mb="1024"
fi
if [ "$quarter_system_memory_in_mb" -gt "8192" ]
then
quarter_system_memory_in_mb="8192"
fi
if [ "$half_system_memory_in_mb" -gt "$quarter_system_memory_in_mb" ]
then
max_heap_size_in_mb="$half_system_memory_in_mb"
else
max_heap_size_in_mb="$quarter_system_memory_in_mb"
fi
MAX_HEAP_SIZE="${max_heap_size_in_mb}M"

max_sensible_yg_per_core_in_mb="100"
max_sensible_yg_in_mb=`expr $max_sensible_yg_per_core_in_mb "*" $system_cpu_cores`

desired_yg_in_mb=`expr $max_heap_size_in_mb / 4`

if [ "$desired_yg_in_mb" -gt "$max_sensible_yg_in_mb" ]
then
HEAP_NEWSIZE="${max_sensible_yg_in_mb}M"
else
HEAP_NEWSIZE="${desired_yg_in_mb}M"
fi
}

java_ver_output=`"${JAVA:-java}" -version 2>&1`
jvmver=`echo "$java_ver_output" | grep '[openjdk|java] version' | awk -F'"' 'NR==1 {print $2}' | cut -d\- -f1`
JVM_VERSION=${jvmver%_*}

if [ "$JVM_VERSION" \< "1.8" ] ; then
echo "Cassandra 3.0 and later require Java 8u40 or later."
exit 1;
fi

if [ "$JVM_VERSION" \< "1.8" ] && [ "$JVM_PATCH_VERSION" -lt 40 ] ; then
echo "Cassandra 3.0 and later require Java 8u40 or later."
exit 1;
fi

jvm=`echo "$java_ver_output" | grep -A 1 'java version' | awk 'NR==2 {print $1}'`
case "$jvm" in
OpenJDK)
JVM_VENDOR=OpenJDK
JVM_ARCH=`echo "$java_ver_output" | awk 'NR==3 {print $2}'`
;;
"Java(TM)")
JVM_VENDOR=Oracle
JVM_ARCH=`echo "$java_ver_output" | awk 'NR==3 {print $3}'`
;;
*)
JVM_VENDOR=other
JVM_ARCH=unknown
;;
esac

JVM_OPTS="$JVM_OPTS -Xloggc:$CASSANDRA_LOG/gc.log"


JVM_OPTS_FILE=$CASSANDRA_CONF/jvm.options
for opt in `grep "^-" $JVM_OPTS_FILE`
do
  JVM_OPTS="$JVM_OPTS $opt"
done

echo $JVM_OPTS | grep -q Xmn
DEFINED_XMN=$?
echo $JVM_OPTS | grep -q Xmx
DEFINED_XMX=$?
echo $JVM_OPTS | grep -q Xms
DEFINED_XMS=$?
echo $JVM_OPTS | grep -q UseConcMarkSweepGC
USING_CMS=$?
echo $JVM_OPTS | grep -q UseG1GC
USING_G1=$?


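# Hard-coded overrides (the values under discussion in this thread). With both
# set, calculate_heap_sizes() is skipped, and HEAP_NEWSIZE only ends up in
# -Xmn when CMS is in use (see the DEFINED_XMN/USING_CMS checks further down).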
MAX_HEAP_SIZE="8192M"
HEAP_NEWSIZE="800M"

if [ "x$MAX_HEAP_SIZE" = "x" ] && [ "x$HEAP_NEWSIZE" = "x" -o $USING_G1 -eq 0 
]; then
calculate_heap_sizes
elif [ "x$MAX_HEAP_SIZE" = "x" ] ||  [ "x$HEAP_NEWSIZE" = "x" -a $USING_G1 -ne 
0 ]; then
echo "please set or unset MAX_HEAP_SIZE and HEAP_NEWSIZE in pairs when 
using CMS GC (see cassandra-env.sh)"
exit 1
fi

if [ "x$MALLOC_ARENA_MAX" = "x" ] ; then
export MALLOC_ARENA_MAX=4
fi

if [ $DEFINED_XMX -ne 0 ] && [ $DEFINED_XMS -ne 0 ]; then
 JVM_OPTS="$JVM_OPTS -Xms${MAX_HEAP_SIZE}"
 JVM_OPTS="$JVM_OPTS -Xmx${MAX_HEAP_SIZE}"
elif [ $DEFINED_XMX -ne 0 ] || [ $DEFINED_XMS -ne 0 ]; then
 echo "Please set or unset -Xmx and -Xms flags in pairs on jvm.options 
file."
 exit 1
fi

if [ $DEFINED_XMN -eq 0 ] && [ $DEFINED_XMX -ne 0 ]; then
echo "Please set or unset -Xmx and -Xmn flags in pairs on jvm.options file."
exit 1
elif [ $DEFINED_XMN -ne 0 ] && [ $USING_CMS -eq 0 ]; then
JVM_OPTS="$JVM_OPTS -Xmn${HEAP_NEWSIZE}"
fi

if [ "$JVM_ARCH" = "64-Bit" ] && [ $USING_CMS -eq 0 ]; then
JVM_OPTS="$JVM_OPTS -XX:+UseCondCardMark"
fi

JVM_OPTS="$JVM_OPTS -XX:CompileCommandFile=$CASSANDRA_CONF/hotspot_compiler"

JVM_OPTS="$JVM_OPTS -javaagent:$CASSANDRA_HOME/lib/jamm-0.3.0.jar"

if [ "x$CASSANDRA_HEAPDUMP_DIR" != "x" ]; then
JVM_OPTS="$JVM_OPTS 
-XX:HeapDumpPath=$CASSANDRA_HEAPDUMP_DIR/cassandra-`date +%s`-pid$$.hprof"
fi

if [ 

Timestamp on hints file and system.hints table data

2018-06-13 Thread Nitan Kainth
Hi,

I observed a strange behavior with stored hints.

The timestamps on the hints files are several months old. I deleted them and
saw new hints files created with the same old date. Why is that?

Also, I see hints files on disk, but if I query the system.hints table, it
shows 0 rows. Why is system.hints not populated?

Version 3.11-1


Migrating to Reaper: Switching From Incremental to Reaper's Full Subrange Repair

2018-06-13 Thread Fd Habash
For those who are using Reaper …

Currently, I run repairs via crontab/nodetool using 'repair -pr' on 2.2.8,
which defaults to incremental. If I migrate to Reaper, do I have to mark
sstables as un-repaired first? Also, out of the box, does Reaper run full
parallel repair? If yes, is it not going to cause over-streaming since we are
repairing ranges multiple times?


Thank you



RE: Restoring snapshot

2018-06-13 Thread Vishal1.Sharma
On altering the Keyspace, the warning disappears. I think the warning was not 
totally wrong, just slightly inaccurate.


From: Nitan Kainth [mailto:nitankai...@gmail.com]
Sent: Wednesday, June 13, 2018 4:38 PM
To: user@cassandra.apache.org
Subject: Re: Restoring snapshot

Change RF for K2 and then see.
Sent from my iPhone

On Jun 13, 2018, at 7:05 AM, vishal1.sha...@ril.com wrote:
For both K1 and K2, the replication factor is 2 in the new cluster (although
the number of nodes is 1). I can understand the portion of the warning which
says that "only 1 replica could be found", but the question is, why is it
giving the name of keyspace K2 when I was restoring only K1? (It should have
given the warning for K1.)

From: Nitan Kainth [mailto:nitankai...@gmail.com]
Sent: Wednesday, June 13, 2018 4:31 PM
To: user@cassandra.apache.org
Subject: Re: Restoring snapshot

Verify the DC name and replication factor in the CREATE KEYSPACE command in the
new cluster.
Sent from my iPhone

On Jun 13, 2018, at 2:40 AM, vishal1.sha...@ril.com wrote:
Dear Community,

I took a snapshot from a node which was part of a 2-node cluster. There were 2
keyspaces in that cluster, K1 and K2. I took a snapshot of K1 only. Then I
created both keyspaces in another cluster having only one node. When I tried to
restore the snapshot (of keyspace K1) in that cluster using sstableloader, I
got a warning:

"WARN  11:55:48,921 Error while computing token map for keyspace K2 with
datacenter dc1: could not achieve replication factor 2 (found 1 replicas only),
check your keyspace replication settings."

Like I've said above, the new cluster contains only one node, therefore I can
understand the portion of the warning telling me that it 'found 1 replicas
only', but why is it computing the token map for keyspace K2 when I was
restoring sstables of keyspace K1? Also, the same warning (regarding only K2)
is displayed whether I try to restore the snapshot of K1 or K2. Although I'm
able to get the complete data, I would appreciate it if someone could explain
these observations.

Cassandra version: 3.11.2

Thanks and regards,
Vishal Sharma

"Confidentiality Warning: This message and any attachments are intended only 
for the use of the intended recipient(s), are confidential and may be 
privileged. If you are not the intended recipient, you are hereby notified that 
any review, re-transmission, conversion to hard copy, copying, circulation or 
other use of this message and any attachments is strictly prohibited. If you 
are not the intended recipient, please notify the sender immediately by return 
email and delete this message and any attachments from your system.

Virus Warning: Although the company has taken reasonable precautions to ensure 
no viruses are present in this email. The company cannot accept responsibility 
for any loss or damage arising from the use of this email or attachment."


Re: Restoring snapshot

2018-06-13 Thread Nitan Kainth
Change RF for K2 and then see.

Sent from my iPhone

> On Jun 13, 2018, at 7:05 AM,  wrote:
> 
> For both K1 and K2, the replication factor is 2 in the new cluster (although
> the number of nodes is 1). I can understand the portion of the warning which
> says that "only 1 replica could be found", but the question is, why is it
> giving the name of keyspace K2 when I was restoring only K1? (It should have
> given the warning for K1.)
>  
> From: Nitan Kainth [mailto:nitankai...@gmail.com] 
> Sent: Wednesday, June 13, 2018 4:31 PM
> To: user@cassandra.apache.org
> Subject: Re: Restoring snapshot
>  
> Verify the DC name and replication factor in the CREATE KEYSPACE command in
> the new cluster.
> 
> Sent from my iPhone
> 
> On Jun 13, 2018, at 2:40 AM,  wrote:
> 
> Dear Community,
>  
> I took a snapshot from a node which was part of a 2-node cluster. There were
> 2 keyspaces in that cluster, K1 and K2. I took a snapshot of K1 only. Then I
> created both keyspaces in another cluster having only one node. When I tried
> to restore the snapshot (of keyspace K1) in that cluster using
> sstableloader, I got a warning:
>  
> "WARN  11:55:48,921 Error while computing token map for keyspace K2 with
> datacenter dc1: could not achieve replication factor 2 (found 1 replicas
> only), check your keyspace replication settings."
>  
> Like I've said above, the new cluster contains only one node, therefore I
> can understand the portion of the warning telling me that it 'found 1
> replicas only', but why is it computing the token map for keyspace K2 when I
> was restoring sstables of keyspace K1? Also, the same warning (regarding
> only K2) is displayed whether I try to restore the snapshot of K1 or K2.
> Although I'm able to get the complete data, I would appreciate it if someone
> could explain these observations.
>  
> Cassandra version: 3.11.2
>  
> Thanks and regards,
> Vishal Sharma
> 
> "Confidentiality Warning: This message and any attachments are intended only 
> for the use of the intended recipient(s), are confidential and may be 
> privileged. If you are not the intended recipient, you are hereby notified 
> that any review, re-transmission, conversion to hard copy, copying, 
> circulation or other use of this message and any attachments is strictly 
> prohibited. If you are not the intended recipient, please notify the sender 
> immediately by return email and delete this message and any attachments from 
> your system.
> 
> Virus Warning: Although the company has taken reasonable precautions to 
> ensure no viruses are present in this email. The company cannot accept 
> responsibility for any loss or damage arising from the use of this email or 
> attachment."
> 
> 
> "Confidentiality Warning: This message and any attachments are intended only 
> for the use of the intended recipient(s), are confidential and may be 
> privileged. If you are not the intended recipient, you are hereby notified 
> that any review, re-transmission, conversion to hard copy, copying, 
> circulation or other use of this message and any attachments is strictly 
> prohibited. If you are not the intended recipient, please notify the sender 
> immediately by return email and delete this message and any attachments from 
> your system.
> 
> Virus Warning: Although the company has taken reasonable precautions to 
> ensure no viruses are present in this email. The company cannot accept 
> responsibility for any loss or damage arising from the use of this email or 
> attachment."


RE: Restoring snapshot

2018-06-13 Thread Vishal1.Sharma
For both K1 and K2, the replication factor is 2 in the new cluster (although
the number of nodes is 1). I can understand the portion of the warning which
says that "only 1 replica could be found", but the question is, why is it
giving the name of keyspace K2 when I was restoring only K1? (It should have
given the warning for K1.)

From: Nitan Kainth [mailto:nitankai...@gmail.com]
Sent: Wednesday, June 13, 2018 4:31 PM
To: user@cassandra.apache.org
Subject: Re: Restoring snapshot

Verify the DC name and replication factor in the CREATE KEYSPACE command in the
new cluster.
Sent from my iPhone

On Jun 13, 2018, at 2:40 AM, vishal1.sha...@ril.com wrote:
Dear Community,

I took a snapshot from a node which was part of a 2-node cluster. There were 2
keyspaces in that cluster, K1 and K2. I took a snapshot of K1 only. Then I
created both keyspaces in another cluster having only one node. When I tried to
restore the snapshot (of keyspace K1) in that cluster using sstableloader, I
got a warning:

"WARN  11:55:48,921 Error while computing token map for keyspace K2 with
datacenter dc1: could not achieve replication factor 2 (found 1 replicas only),
check your keyspace replication settings."

Like I've said above, the new cluster contains only one node, therefore I can
understand the portion of the warning telling me that it 'found 1 replicas
only', but why is it computing the token map for keyspace K2 when I was
restoring sstables of keyspace K1? Also, the same warning (regarding only K2)
is displayed whether I try to restore the snapshot of K1 or K2. Although I'm
able to get the complete data, I would appreciate it if someone could explain
these observations.

Cassandra version: 3.11.2

Thanks and regards,
Vishal Sharma

"Confidentiality Warning: This message and any attachments are intended only 
for the use of the intended recipient(s), are confidential and may be 
privileged. If you are not the intended recipient, you are hereby notified that 
any review, re-transmission, conversion to hard copy, copying, circulation or 
other use of this message and any attachments is strictly prohibited. If you 
are not the intended recipient, please notify the sender immediately by return 
email and delete this message and any attachments from your system.

Virus Warning: Although the company has taken reasonable precautions to ensure 
no viruses are present in this email. The company cannot accept responsibility 
for any loss or damage arising from the use of this email or attachment."
"Confidentiality Warning: This message and any attachments are intended only 
for the use of the intended recipient(s). 
are confidential and may be privileged. If you are not the intended recipient. 
you are hereby notified that any 
review. re-transmission. conversion to hard copy. copying. circulation or other 
use of this message and any attachments is 
strictly prohibited. If you are not the intended recipient. please notify the 
sender immediately by return email. 
and delete this message and any attachments from your system.

Virus Warning: Although the company has taken reasonable precautions to ensure 
no viruses are present in this email. 
The company cannot accept responsibility for any loss or damage arising from 
the use of this email or attachment."


Re: Restoring snapshot

2018-06-13 Thread Nitan Kainth
Verify the DC name and replication factor in the CREATE KEYSPACE command in the
new cluster.
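
For example (a sketch; keyspace, DC name and RF per your cluster):

cqlsh -e "CREATE KEYSPACE k1 WITH replication =
  {'class': 'NetworkTopologyStrategy', 'dc1': 1};"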

Sent from my iPhone

> On Jun 13, 2018, at 2:40 AM,  wrote:
> 
> Dear Community,
>  
> I took a snapshot from a node which was part of a 2-node cluster. There were
> 2 keyspaces in that cluster, K1 and K2. I took a snapshot of K1 only. Then I
> created both keyspaces in another cluster having only one node. When I tried
> to restore the snapshot (of keyspace K1) in that cluster using
> sstableloader, I got a warning:
>  
> "WARN  11:55:48,921 Error while computing token map for keyspace K2 with
> datacenter dc1: could not achieve replication factor 2 (found 1 replicas
> only), check your keyspace replication settings."
>  
> Like I've said above, the new cluster contains only one node, therefore I
> can understand the portion of the warning telling me that it 'found 1
> replicas only', but why is it computing the token map for keyspace K2 when I
> was restoring sstables of keyspace K1? Also, the same warning (regarding
> only K2) is displayed whether I try to restore the snapshot of K1 or K2.
> Although I'm able to get the complete data, I would appreciate it if someone
> could explain these observations.
>  
> Cassandra version: 3.11.2
>  
> Thanks and regards,
> Vishal Sharma
> 
> "Confidentiality Warning: This message and any attachments are intended only 
> for the use of the intended recipient(s), are confidential and may be 
> privileged. If you are not the intended recipient, you are hereby notified 
> that any review, re-transmission, conversion to hard copy, copying, 
> circulation or other use of this message and any attachments is strictly 
> prohibited. If you are not the intended recipient, please notify the sender 
> immediately by return email and delete this message and any attachments from 
> your system.
> 
> Virus Warning: Although the company has taken reasonable precautions to 
> ensure no viruses are present in this email. The company cannot accept 
> responsibility for any loss or damage arising from the use of this email or 
> attachment."


Restoring snapshot

2018-06-13 Thread Vishal1.Sharma
Dear Community,

I took a snapshot from a node which was part of a 2-node cluster. There were 2
keyspaces in that cluster, K1 and K2. I took a snapshot of K1 only. Then I
created both keyspaces in another cluster having only one node. When I tried to
restore the snapshot (of keyspace K1) in that cluster using sstableloader, I
got a warning:

"WARN  11:55:48,921 Error while computing token map for keyspace K2 with
datacenter dc1: could not achieve replication factor 2 (found 1 replicas only),
check your keyspace replication settings."

Like I've said above, the new cluster contains only one node, therefore I can
understand the portion of the warning telling me that it 'found 1 replicas
only', but why is it computing the token map for keyspace K2 when I was
restoring sstables of keyspace K1? Also, the same warning (regarding only K2)
is displayed whether I try to restore the snapshot of K1 or K2. Although I'm
able to get the complete data, I would appreciate it if someone could explain
these observations.

Cassandra version: 3.11.2

Thanks and regards,
Vishal Sharma
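
P.S. The restore itself was done along these lines (host and path are
placeholders):

sstableloader -d 10.0.0.1 /path/to/restore/K1/table1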
"Confidentiality Warning: This message and any attachments are intended only 
for the use of the intended recipient(s). 
are confidential and may be privileged. If you are not the intended recipient. 
you are hereby notified that any 
review. re-transmission. conversion to hard copy. copying. circulation or other 
use of this message and any attachments is 
strictly prohibited. If you are not the intended recipient. please notify the 
sender immediately by return email. 
and delete this message and any attachments from your system.

Virus Warning: Although the company has taken reasonable precautions to ensure 
no viruses are present in this email. 
The company cannot accept responsibility for any loss or damage arising from 
the use of this email or attachment."