RE: Commit logs building up

2014-04-09 Thread Parag Patel
Nate,

What values on the FlushWriter line would concern you?  What is the
difference between Blocked and All Time Blocked?

Parag

From: Nate McCall [mailto:n...@thelastpickle.com]
Sent: Thursday, February 27, 2014 4:22 PM
To: Cassandra Users
Subject: Re: Commit logs building up

What was the impetus for turning up the commitlog_segment_size_in_mb?

Also, in nodetool tpstats, what are the values for the FlushWriter line?
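
(For reference, a quick way to pull just that line on each node -- this assumes the
1.2/2.0-era tpstats layout of Active / Pending / Completed / Blocked / All time blocked:

    nodetool tpstats | egrep 'Pool Name|FlushWriter'

Blocked is the number of flush tasks blocked at this moment, while All time blocked is
the cumulative count since the node started.)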

On Wed, Feb 26, 2014 at 12:18 PM, Christopher Wirt <chris.w...@struq.com> wrote:
We're running 2.0.5, recently upgraded from 1.2.14.

Sometimes we are seeing CommitLogs starting to build up.

Is this a potential bug? Or a symptom of something else we can easily address?

We have
commitlog_sync: periodic
commitlog_sync_period_in_ms:1
commitlog_segment_size_in_mb: 512


Thanks,
Chris



--
-
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Commitlog questions

2014-04-09 Thread Parag Patel


1)  Why is the default commitlog size 4GB?  Has anyone changed this? What are some
aspects to consider when determining the commitlog size?

2)  If the commitlog is in periodic mode, there is a property to set a time 
interval to flush the incoming mutations to disk.  This implies that there is a 
queue inside Cassandra to hold this data in memory until it is flushed.

a.   Is there a name for this queue?

b.  Is there a limit for this queue?

c.   Are there any tuning parameters for this queue?

Thanks,
Parag


RE: Commitlog questions

2014-04-10 Thread Parag Patel
Oleg,

Thanks for the response.  If the commitlog is in periodic mode and the fsync 
happens every 10 seconds, Cassandra is storing the stuff that needs to be 
sync'd somewhere for a period of 10 seconds.  I'm talking about before it even 
hits any disk.  This has to be in memory, correct?

Parag

-Original Message-
From: Oleg Dulin [mailto:oleg.du...@gmail.com] 
Sent: Wednesday, April 09, 2014 10:42 AM
To: user@cassandra.apache.org
Subject: Re: Commitlog questions

Parag:

To answer your questions:

1) Default is just that, a default. I wouldn't advise raising it though. The
bigger it is, the longer it takes to restart the node.
2) I think they just use fsync. There is no queue. All files in Cassandra use
java.nio buffers, but they need to be fsynced periodically. Look at the
commitlog_sync parameters in the cassandra.yaml file; the comments there explain
how it works. I believe the difference between periodic and batch is just that
-- in periodic mode it fsyncs every 10 seconds, while in batch mode it won't
acknowledge a write until the changes have been fsynced, grouping them within a
small batch window.
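
For reference, the relevant cassandra.yaml knobs are below; the values shown are the
stock 1.2-era defaults as I remember them, so double-check your own file:

    # total space for commitlog segments before the oldest are recycled;
    # this is the 4GB default being asked about (on 64-bit JVMs)
    commitlog_total_space_in_mb: 4096

    # "periodic" acks writes immediately and fsyncs every sync period;
    # "batch" delays the ack until the mutation has been fsynced
    commitlog_sync: periodic
    commitlog_sync_period_in_ms: 10000
    commitlog_segment_size_in_mb: 32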

On 2014-04-09 10:06:52 +0000, Parag Patel said:

>
> 1)  Why is the default commitlog size 4GB?  Has anyone changed this? What are
> some aspects to consider when determining the commitlog size?
> 2)  If the commitlog is in periodic mode, there is a property to set a time
> interval to flush the incoming mutations to disk.  This implies that there is
> a queue inside Cassandra to hold this data in memory until it is flushed.
> a.   Is there a name for this queue?
> b.  Is there a limit for this queue?
> c.   Are there any tuning parameters for this queue?
>
> Thanks,
> Parag


--
Regards,
Oleg Dulin
http://www.olegdulin.com




Cassandra memory consumption

2014-04-10 Thread Parag Patel
We're using Cassandra 1.2.12.  What aspects of the data are stored in off-heap
memory vs heap memory?


RE: Cassandra memory consumption

2014-04-10 Thread Parag Patel
If I'm inserting the following :

Partition key = 8 byte String
Clustering key = 20 byte String
Stored Data = 150 byte byte[]

If the insert is still in the memtable, what portion of the above is in the
memtable?  All of it, or just the keys?  If just the keys, where does the
stored data live?  (Keep in mind that in this scenario the data has not yet been
flushed to the data directory; it has only been added to the commit log.)

Parag

From: DuyHai Doan [mailto:doanduy...@gmail.com]
Sent: Thursday, April 10, 2014 3:35 PM
To: user@cassandra.apache.org
Subject: Re: Cassandra memory consumption

Data structures that are stored off heap:
 1) Row cache (if JNA enabled, otherwise on heap)
 2) Bloom filter
 3) Compression offset
 4) Key Index sample

On heap:
 1) Memtables
 2) Partition Key cache

Hope that I did not forget anything.

Regards

Duy Hai DOAN

On Thu, Apr 10, 2014 at 9:13 PM, Parag Patel <ppa...@clearpoolgroup.com> wrote:
We're using Cassandra 1.2.12.  What aspects of the data are stored in off-heap
memory vs heap memory?



RE: New application - separate column family or separate cluster?

2014-07-09 Thread Parag Patel
In your scenario #1, is the total number of nodes staying the same?  Meaning, 
if you launch multiple clusters for #2, you’d have N total nodes – are we 
assuming #1 has N or less than N?

If #1 and #2 both have N, wouldn’t the performance be the same since 
Cassandra’s performance increases linearly?

From: Tupshin Harper [mailto:tups...@tupshin.com]
Sent: Tuesday, July 08, 2014 11:13 PM
To: user@cassandra.apache.org
Subject: Re: New application - separate column family or separate cluster?


I've seen a lot of deployments, and I think you captured the scenarios and 
reasoning quite well. You can apply other nuances and details to #2 (e.g. 
segment based on SLA or topology), but I agree with all of your reasoning.

-Tupshin
-Global Field Strategy
-Datastax
On Jul 8, 2014 10:54 AM, "Jeremy Jongsma" <jer...@barchart.com> wrote:
Do you prefer purpose-specific Cassandra clusters that support a single 
application's data set, or a single Cassandra cluster that contains column 
families for many applications? I realize there is no ideal answer for every 
situation, but what have your experiences been in this area for cluster 
planning?

My reason for asking is that we have one application with high data volume 
(multiple TB, thousands of writes/sec) that caused us to adopt Cassandra in the 
first place. Now we have the tools and cluster management infrastructure built 
up to the point where it is not a major investment to store smaller sets of 
data for other applications in C* also, and I am debating whether to:

1) Store everything in one large cluster (no isolation, low cost)
2) Use one cluster for the high-volume data, and one for everything else (good 
isolation, medium cost)
3) Give every major service its own cluster, even if they have small amounts of 
data (best isolation, highest cost)

I suspect #2 is the way to go as far as balancing hosting costs and application
performance isolation. Any pros or cons I am missing?

-j


adding more nodes into the cluster

2014-07-16 Thread Parag Patel
Hi,

We have a 12 node cluster with a replication factor of 3 in 1 datacenter.  We
want to add 6 more nodes into the cluster.  I'm trying to see what's better:
bootstrapping all 6 at the same time or doing it one node at a time.

Anybody have any thoughts on this?

Thanks,
Parag


RE: adding more nodes into the cluster

2014-07-16 Thread Parag Patel
Thanks rob

From: Robert Coli [mailto:rc...@eventbrite.com]
Sent: Wednesday, July 16, 2014 2:21 PM
To: user@cassandra.apache.org
Subject: Re: adding more nodes into the cluster

On Wed, Jul 16, 2014 at 9:16 AM, Parag Patel <ppa...@clearpoolgroup.com> wrote:
We have a 12 node cluster with a replication factor of 3 in 1 datacenter.  We
want to add 6 more nodes into the cluster.  I'm trying to see what's better:
bootstrapping all 6 at the same time or doing it one node at a time.

I should really write a blog post on this.

For safety, operators should generally bootstrap one node at a time. There are 
rare cases in non-vnode operation where one can safely bootstrap more than one 
node, but in general one should not do so.

In a future version of Cassandra, you will hopefully be prohibited from
bootstrapping more than one node at a time, because it's a natural thing to do
and Bad Stuff Can Happen.

https://issues.apache.org/jira/browse/CASSANDRA-7069

=Rob


RE: adding more nodes into the cluster

2014-07-16 Thread Parag Patel
Couple more questions about bootstrapping


1)  Should we bootstrap all 6 nodes first and then call cleanup once, or
should cleanup be called after each node is bootstrapped?

2)  Is it safe to kill the cleanup call and expect it to resume the next time
it's called?
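
For context, cleanup is invoked per node once the new nodes have joined, and can be
limited to a single keyspace; the keyspace name below is just a placeholder:

    nodetool cleanup              # all keyspaces on this node
    nodetool cleanup my_keyspace  # or one keyspace at a time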

Thanks,
Parag

From: Parag Patel [mailto:ppa...@clearpoolgroup.com]
Sent: Wednesday, July 16, 2014 5:22 PM
To: user@cassandra.apache.org
Subject: RE: adding more nodes into the cluster

Thanks rob

From: Robert Coli [mailto:rc...@eventbrite.com]
Sent: Wednesday, July 16, 2014 2:21 PM
To: user@cassandra.apache.org
Subject: Re: adding more nodes into the cluster

On Wed, Jul 16, 2014 at 9:16 AM, Parag Patel <ppa...@clearpoolgroup.com> wrote:
We have a 12 node cluster with a replication factor of 3 in 1 datacenter.  We
want to add 6 more nodes into the cluster.  I'm trying to see what's better:
bootstrapping all 6 at the same time or doing it one node at a time.

I should really write a blog post on this.

For safety, operators should generally bootstrap one node at a time. There are 
rare cases in non-vnode operation where one can safely bootstrap more than one 
node, but in general one should not do so.

In a future version of Cassandra, you will hopefully be prohibited from
bootstrapping more than one node at a time, because it's a natural thing to do
and Bad Stuff Can Happen.

https://issues.apache.org/jira/browse/CASSANDRA-7069

=Rob


bootstrapping new nodes on 1.2.12

2014-07-29 Thread Parag Patel
Hi,

It's taking a while to bootstrap a 13th node into a 12 node cluster.  The
average node size is about 1.7TB.  At the beginning of today we were close to
0.9TB on the new node and 12 hours later we're at 1.1TB.  I figured it would
have finished by now because when I was looking at OpsCenter, there were 2
transfers remaining.  One was at 0% and the other was at 2%.  Looking again now,
those same transfers haven't progressed all day.  Instead I see 9 more
transfers (some of which are progressing).


1)  Would anyone be able to help me interpret this information from
OpsCenter?

2)  Is there anything I can do to speed this up?

Thanks,
Parag




dropping secondary indexes

2014-07-30 Thread Parag Patel
Hi,

I've noticed that our data model has many unnecessary secondary indexes. Is
there a recommended procedure to drop a secondary index on a very large table?
Is there any sort of repair/cleanup that should be done after calling the DROP
command?
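
For reference, the drop itself is a single CQL statement -- the index name below is
made up for illustration:

    DROP INDEX users_email_idx;

Dropping the index removes the index definition and its data, but it does not rewrite
the base table's sstables.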

Thanks,
Parag


RE: bootstrapping new nodes on 1.2.12

2014-07-30 Thread Parag Patel
Thanks for the detailed response.  I checked ‘nodetool netstats’ and I see 
there are pending streams, all of which are stuck at 0%.  I was expecting to 
see at least one output that was more than 0%.  Have you seen this before?

Side question – does a new node stream from other nodes in any particular 
order?  Perhaps this is a coincidence, but if I were to sort my hostnames in 
alphabetical order, it’s currently streaming from the last 2.

From: Mark Reddy [mailto:mark.re...@boxever.com]
Sent: Wednesday, July 30, 2014 4:42 AM
To: user@cassandra.apache.org
Subject: Re: bootstrapping new nodes on 1.2.12

Hi Parag,

1)  Would anyone be able to help me interpret this information from
OpsCenter?

At a high level, bootstrapping a new node has two phases: streaming and
secondary index builds. I believe OpsCenter will only report active streams;
pending streams will be listed as such in OpsCenter as well. In OpsCenter,
rather than looking at the Data Size, check the used space on the Storage
Capacity pie chart; this will show how much data is on disk but not necessarily
live on the node yet.

Personally I would check 'nodetool netstats' to see what streams are remaining;
this will list all active / pending streams and what files are to be streamed.
At the moment you might just be streaming some very large files, and once they
complete you will see a dramatic increase in data size.

If streaming is complete and you use secondary indexes, check 'nodetool 
compactionstats' for any secondary index builds that may be taking place.


2)  Is there anything I can do to speed this up?

If you have the capacity you could increase 
stream_throughput_outbound_megabits_per_sec in your cassandra.yaml
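
As a sketch, the yaml default is 200 Mbps, and on recent versions nodetool can also
adjust the same limit at runtime; treat the exact number below as a placeholder:

    # cassandra.yaml on the nodes sending the streams (requires restart)
    stream_throughput_outbound_megabits_per_sec: 400

    # or adjust it live
    nodetool setstreamthroughput 400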

If you don't have the capacity you could add more nodes to spread the data so 
you stream less in future.

Finally you could upgrade to 2.0.x as it contains a complete refactor of 
streaming and should make your streaming sessions more robust and transparent: 
https://issues.apache.org/jira/browse/CASSANDRA-5286


Mark

On Wed, Jul 30, 2014 at 3:15 AM, Parag Patel <ppa...@clearpoolgroup.com> wrote:
Hi,

It’s taking a while to bootstrap a 13th node into a 12 node cluster.  The
average node size is about 1.7TB.  At the beginning of today we were close to 
.9TB on the new node and 12 hours later we’re at 1.1TB.  I figured it would 
have finished by now because when I was looking on OpsCenter, there were 2 
transfers remaining.  1 was at 0% and the other was at 2%.  I look again now 
and those same nodes haven’t progressed all day.  Instead I see 9 more 
transfers (some of which are progressing).


1)  Would anyone be able to help me interpret this information from
OpsCenter?

2)  Is there anything I can do to speed this up?

Thanks,
Parag





RE: bootstrapping new nodes on 1.2.12

2014-07-30 Thread Parag Patel
Mark,

I see this output in my log many times over for 2 nodes.  We have a cron entry
across all clusters that forces a full GC at 2 AM.  Node1 is due to the Full GC
that was scheduled (I can disable this).  Node2 was due to a Full GC that
occurred during our peak operation (these happen occasionally; we've been
working to reduce them).  A few questions:


1)  Will any node leaving the cluster while streaming force us to bootstrap
all over again?  If so, is this addressed in future versions?

2)  We have too much data to migrate for this to run only during
non-production hours.  How do we make it such that full GCs don't impact
bootstrapping?  Should we increase phi_convict_threshold?

Parag



From: Mark Reddy [mailto:mark.re...@boxever.com]
Sent: Wednesday, July 30, 2014 7:58 AM
To: user@cassandra.apache.org
Subject: Re: bootstrapping new nodes on 1.2.12

Thanks for the detailed response.  I checked ‘nodetool netstats’ and I see 
there are pending streams, all of which are stuck at 0%.  I was expecting to 
see at least one output that was more than 0%.  Have you seen this before?

This could indicate that the bootstrap process is hung due to a failed 
streaming session. Can you check your logs for the following line:

AbstractStreamSession.java (line 110) Stream failed because /xxx.xxx.xxx.xxx 
died or was restarted/removed (streams may still be active in background, but 
further streams won't be started)

If that is the case you will need to wipe the node and begin the bootstrapping 
process again


Mark


On Wed, Jul 30, 2014 at 12:03 PM, Parag Patel <ppa...@clearpoolgroup.com> wrote:
Thanks for the detailed response.  I checked ‘nodetool netstats’ and I see 
there are pending streams, all of which are stuck at 0%.  I was expecting to 
see at least one output that was more than 0%.  Have you seen this before?

Side question – does a new node stream from other nodes in any particular 
order?  Perhaps this is a coincidence, but if I were to sort my hostnames in 
alphabetical order, it’s currently streaming from the last 2.

From: Mark Reddy [mailto:mark.re...@boxever.com]
Sent: Wednesday, July 30, 2014 4:42 AM
To: user@cassandra.apache.org
Subject: Re: bootstrapping new nodes on 1.2.12

Hi Parag,

1)  Would anyone be able to help me interpret this information from
OpsCenter?

At a high level bootstrapping a new node has two phases, streaming and 
secondary index builds. I believe OpsCenter will only report active streams, 
the pending stream will be listed as such in OpsCenter as well. In OpsCenter 
rather than looking at the Data Size check the used space on the Storage 
Capacity pie chart, this will show how much data is on disk but not necessarily 
live on the node yet.

Personally I would check 'nodetool netstats' to see what streams are remaining, 
this will list all active / pending stream and what files are to be streamed, 
at the moment you might just be streaming some very large files and once 
complete you will see a dramatic increase in data size.

If streaming is complete and you use secondary indexes, check 'nodetool 
compactionstats' for any secondary index builds that may be taking place.


2)  Is there anything I can do to speed this up?

If you have the capacity you could increase 
stream_throughput_outbound_megabits_per_sec in your cassandra.yaml

If you don't have the capacity you could add more nodes to spread the data so 
you stream less in future.

Finally you could upgrade to 2.0.x as it contains a complete refactor of 
streaming and should make your streaming sessions more robust and transparent: 
https://issues.apache.org/jira/browse/CASSANDRA-5286


Mark

On Wed, Jul 30, 2014 at 3:15 AM, Parag Patel <ppa...@clearpoolgroup.com> wrote:
Hi,

It’s taking a while to bootstrap a 13th node into a 12 node cluster.  The
average node size is about 1.7TB.  At the beginning of today we were close to 
.9TB on the new node and 12 hours later we’re at 1.1TB.  I figured it would 
have finished by now because when I was looking on OpsCenter, there were 2 
transfers remaining.  1 was at 0% and the other was at 2%.  I look again now 
and those same nodes haven’t progressed all day.  Instead I see 9 more 
transfers (some of which are progressing).


1)  Would anyone be able to help me interpret this information from
OpsCenter?

2)  Is there anything I can do to speed this up?

Thanks,
Parag






RE: bootstrapping new nodes on 1.2.12

2014-07-30 Thread Parag Patel
As to why we do it, we need to reevaluate, because the GC optimizations we've
made recently probably don't require it anymore.  However, prior to our
optimizations we observed a benefit at our peak time.  When we force a GC, we
don't remove the node from the ring.  This seems like a fundamental flaw in our
approach; thanks for pointing this out.  For the purposes of bootstrapping, we
will disable the manual GCs to make sure we don't interrupt the joining
process.  However, one unpredictable problem can always remain – a Full GC
happens, causing the node to go offline and causing the bootstrap to fail.  To
solve this, we'll try increasing the phi_convict_threshold.

Our Full GC’s take about 9 seconds.  If we were to increase the 
phi_convict_threshold to not take a node offline for a 9 second unavailability, 
what negative side effects can there be?

Parag


From: Mark Reddy [mailto:mark.re...@boxever.com]
Sent: Wednesday, July 30, 2014 9:06 AM
To: user@cassandra.apache.org
Subject: Re: bootstrapping new nodes on 1.2.12

Hi Parag,

I see this output my log many times over for 2 nodes.  We have a cron entry 
across all clusters that force a full GC at 2 AM.  node1 is due to Full GC that 
was scheduled (I can disable this).  Node2 was due to a Full GC that occurred 
during our peak operation (these happen occasionally, we’ve been working to 
reduce them).

Firstly, why are you forcing a GC? Do you have sufficient evidence that 
Cassandra is not managing the heap in the way in which your application 
requires?

Also how are you accomplishing this full GC? Are you removing the node from the 
ring, forcing a GC and then adding it back in? Or are you forcing a GC while it 
is in the ring?

1)  Will any node leaving the cluster while streaming force us to bootstrap
all over again?  If so, is this addressed in future versions?

If the node that is leaving the ring is streaming data to the bootstrapping
node, then yes, this will break the streaming session and no further streams
will be started from that node. To my knowledge, there is nothing in newer /
future versions that will prevent this.


2)  We have too much data to migrate to run on non-production hours.  How 
do we make it such that full GC’s don’t impact bootstrapping?  Should we 
increase phi_convict_threshold ?

Again, I'll need some more information around these manual GCs. But yes,
increasing the phi value would reduce the chance of a node in the ring being
marked down during a heavy GC cycle.
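
For reference, the knob lives in cassandra.yaml and needs a restart to take effect;
the 1.2-era default is 8, and the value below is only an illustration:

    # higher values make the failure detector more tolerant of long GC pauses,
    # at the cost of taking longer to notice genuinely dead nodes
    phi_convict_threshold: 12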


Mark

On Wed, Jul 30, 2014 at 1:41 PM, Parag Patel <ppa...@clearpoolgroup.com> wrote:
Mark,

I see this output in my log many times over for 2 nodes.  We have a cron entry
across all clusters that forces a full GC at 2 AM.  Node1 is due to the Full GC
that was scheduled (I can disable this).  Node2 was due to a Full GC that
occurred during our peak operation (these happen occasionally; we've been
working to reduce them).  A few questions:


1)  Will any node leaving the cluster while streaming force us to bootstrap
all over again?  If so, is this addressed in future versions?

2)  We have too much data to migrate to run on non-production hours.  How 
do we make it such that full GC’s don’t impact bootstrapping?  Should we 
increase phi_convict_threshold ?

Parag



From: Mark Reddy [mailto:mark.re...@boxever.com]
Sent: Wednesday, July 30, 2014 7:58 AM

To: user@cassandra.apache.org
Subject: Re: bootstrapping new nodes on 1.2.12

Thanks for the detailed response.  I checked ‘nodetool netstats’ and I see 
there are pending streams, all of which are stuck at 0%.  I was expecting to 
see at least one output that was more than 0%.  Have you seen this before?

This could indicate that the bootstrap process is hung due to a failed 
streaming session. Can you check your logs for the following line:

AbstractStreamSession.java (line 110) Stream failed because /xxx.xxx.xxx.xxx 
died or was restarted/removed (streams may still be active in background, but 
further streams won't be started)

If that is the case you will need to wipe the node and begin the bootstrapping 
process again


Mark


On Wed, Jul 30, 2014 at 12:03 PM, Parag Patel <ppa...@clearpoolgroup.com> wrote:
Thanks for the detailed response.  I checked ‘nodetool netstats’ and I see 
there are pending streams, all of which are stuck at 0%.  I was expecting to 
see at least one output that was more than 0%.  Have you seen this before?

Side question – does a new node stream from other nodes in any particular 
order?  Perhaps this is a coincidence, but if I were to sort my hostnames in 
alphabetical order, it’s currently streaming from the last 2.

From: Mark Reddy [mailto:mark.re...@boxever.com]
Sent: Wednesday, July 30, 2014 4:42 AM
To: user@cassandra.apache.org
Subject: Re: bootstrapping new nodes on

RE: bootstrapping new nodes on 1.2.12

2014-07-30 Thread Parag Patel
My understanding of a 9 second GC seems to be very off based on the gossip
logs.  Correct me if I'm wrong, but the "handshaking version" message is just a
log line showing that it is attempting to reconnect to the other node?

Manual FGC

2:01:02 - Node1 full GC
2:01:25 - Node2 detects Node1 DOWN
2:01:27 - Node2 handshaking version with Node1
2:01:32 - Node2 handshaking version with Node1 because failed previously
2:01:37 - Node2 handshaking version with Node1 because failed previously
2:01:39 - Node2 detects Node1 UP

Production FGC

9:30:45 - Node1 full GC
9:30:47 - Node2 detects Node1 DOWN
9:30:55 - Node2 handshaking version with Node1
9:31:00 - Node2 handshaking version with Node1 because failed previously
9:31:05 - Node2 handshaking version with Node1 because failed previously
9:31:10 - Node2 handshaking version with Node1 because failed previously
9:31:15 - Node2 handshaking version with Node1 because failed previously
9:31:20 - Node2 handshaking version with Node1 because failed previously
9:31:37 - Node2 detects Node1 UP


From: Mark Reddy [mailto:mark.re...@boxever.com]
Sent: Wednesday, July 30, 2014 9:41 AM
To: user@cassandra.apache.org
Subject: Re: bootstrapping new nodes on 1.2.12

Our Full GC’s take about 9 seconds.  If we were to increase the 
phi_convict_threshold to not take a node offline for a 9 second unavailability, 
what negative side effects can there be?

When you observe these GC's do you also see the node being marked down and then 
back up ~9 seconds later? GC's can often happen and have no effect on gossip 
marking a node as down, in which case the streaming session will remain intact. 
The side effect of long GC's is increased latency from that node during that 
period.


Mark

On Wed, Jul 30, 2014 at 2:24 PM, Parag Patel <ppa...@clearpoolgroup.com> wrote:
As to why we do it, we need to reevaluate because the GC optimizations we’ve 
made recently probably don’t require it anymore.  However, prior to our 
optimizations we observed a benefit at our peak time. When we force a GC, we 
don’t remove it from the ring.  This seems like a fundamental flaw in our 
approach.  Thanks for pointing this out.  For the purposes of bootstrapping, we 
will disable the manual GC’s to make sure we don’t interrupt the joining 
process.  However, one unpredictable problem can always remain – a Full GC 
happens causing the node to go offline and causing the bootstrap to fail.  To 
solve this, we’ll try increasing the phi_convict_threshold.

Our Full GC’s take about 9 seconds.  If we were to increase the 
phi_convict_threshold to not take a node offline for a 9 second unavailability, 
what negative side effects can there be?

Parag


From: Mark Reddy [mailto:mark.re...@boxever.com]
Sent: Wednesday, July 30, 2014 9:06 AM

To: user@cassandra.apache.org
Subject: Re: bootstrapping new nodes on 1.2.12

Hi Parag,

I see this output my log many times over for 2 nodes.  We have a cron entry 
across all clusters that force a full GC at 2 AM.  node1 is due to Full GC that 
was scheduled (I can disable this).  Node2 was due to a Full GC that occurred 
during our peak operation (these happen occasionally, we’ve been working to 
reduce them).

Firstly, why are you forcing a GC? Do you have sufficient evidence that 
Cassandra is not managing the heap in the way in which your application 
requires?

Also how are you accomplishing this full GC? Are you removing the node from the 
ring, forcing a GC and then adding it back in? Or are you forcing a GC while it 
is in the ring?

1)  Will any node leaving the cluster while streaming force us to bootstrap
all over again?  If so, is this addressed in future versions?

If the node that is leaving the ring is streaming data to  the bootstrapping 
node, yes, this will break the streaming session and no further streams will be 
started from that node. To my knowledge, there is nothing in newer / future 
versions that will prevent this.


2)  We have too much data to migrate to run on non-production hours.  How 
do we make it such that full GC’s don’t impact bootstrapping?  Should we 
increase phi_convict_threshold ?

Again I'll need some more information around these manual GC's. But yes, 
increasing the phi value would reduce the chance of a node in the ring being 
marked down during a heavy gc cycle.


Mark

On Wed, Jul 30, 2014 at 1:41 PM, Parag Patel <ppa...@clearpoolgroup.com> wrote:
Mark,

I see this output my log many times over for 2 nodes.  We have a cron entry 
across all clusters that force a full GC at 2 AM.  node1 is due to Full GC that 
was scheduled (I can disable this).  Node2 was due to a Full GC that occurred 
during our peak operation (these happen occasionally, we’ve been working to 
reduce them).  Few Questions


1)  Will any node leaving the cluster while streaming force us to bootstrap
all over again?  If so, is this addressed in future versions?

2)   

Manually deleting sstables

2014-08-19 Thread Parag Patel
After we dropped a table, we noticed that the sstables are still there.  After 
searching through the forum history, I noticed that this is known behavior.


1)  Is there any negative impact of deleting the sstables off disk and then 
restarting Cassandra?

2)  Are there any other recommended procedures for this?
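
For context, the leftover files live under the table's own directory in the data
path; the keyspace and table names below are placeholders:

    # with the default data_file_directories
    ls /var/lib/cassandra/data/my_keyspace/my_dropped_table/

The orphaned sstables for the dropped table are whatever is still sitting in that
directory.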

Thanks,
Parag


RE: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).

2014-09-19 Thread Parag Patel
Agreed.  We only use secondary indexes for column families that are relatively 
small (~5k rows).  For anything larger, we store the data into a wide row (but 
this depends on your data model) 

-Original Message-
From: jonathan.had...@gmail.com [mailto:jonathan.had...@gmail.com] On Behalf Of 
Jonathan Haddad
Sent: Friday, September 19, 2014 4:01 AM
To: user@cassandra.apache.org
Subject: Re: Slow down of secondary index query with VNODE (C* version 1.2.18, 
jre6).

Keep in mind that secondary indexes in Cassandra are not there to improve
performance, or even really meant to be used in a serious user-facing manner.

Build and maintain your own view of the data, it'll be much faster.
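
As a rough sketch of what "your own view" can look like in CQL -- the table and
column names here are made up, not from this thread:

    -- query-specific lookup table, written alongside the base table on every update
    CREATE TABLE users_by_country (
        country text,
        user_id uuid,
        PRIMARY KEY (country, user_id)
    );

    INSERT INTO users_by_country (country, user_id)
    VALUES ('NZ', 62c36092-82a1-3a00-93d1-46196ee77204);

A query for "all users in a country" is then a single-partition read instead of a
cluster-wide index scan.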



On Thu, Sep 18, 2014 at 6:33 PM, Jay Patel  wrote:
> Hi there,
>
> We are seeing extreme slow down (500ms to 1s) in query on secondary 
> index with vnode. I'm seeing multiple secondary index scans on a given 
> node in trace output when vnode is enabled. Without vnode, everything is good.
>
> Cluster size: 6 nodes
> Replication factor: 3
> Consistency level: local_quorum. Same behavior happens with 
> consistency level of ONE.
>
> Snippet from the trace output. Pls see the attached output1.txt for 
> the full log. Are we hitting any bug? Do not understand why 
> coordinator sends requests multiple times to the same node (e.g. 
> 192.168.51.22 in below
> output) for different token ranges.
>

>
> Executing indexed scan for [min(-9223372036854775808), max(-9193352069377957523)] | 23:11:30,992 | 192.168.51.22
> Executing indexed scan for (max(-9193352069377957523), max(-9136021049555745100)] | 23:11:30,998 | 192.168.51.25
> Executing indexed scan for (max(-9136021049555745100), max(-8959555493872108621)] | 23:11:30,999 | 192.168.51.22
> Executing indexed scan for (max(-8959555493872108621), max(-8929774302283364912)] | 23:11:31,000 | 192.168.51.25
> Executing indexed scan for (max(-8929774302283364912), max(-8854653908608918942)] | 23:11:31,001 | 192.168.51.22
> Executing indexed scan for (max(-8854653908608918942), max(-8762620856967633953)] | 23:11:31,002 | 192.168.51.25
> Executing indexed scan for (max(-8762620856967633953), max(-8668275030769104047)] | 23:11:31,003 | 192.168.51.22
> Executing indexed scan for (max(-8668275030769104047), max(-8659066486210615614)] | 23:11:31,003 | 192.168.51.25
> Executing indexed scan for (max(-8659066486210615614), max(-8419137646248370231)] | 23:11:31,004 | 192.168.51.22
> Executing indexed scan for (max(-8419137646248370231), max(-8416786876632807845)] | 23:11:31,005 | 192.168.51.25
> Executing indexed scan for (max(-8416786876632807845), max(-8315889933848495185)] | 23:11:31,006 | 192.168.51.22
> Executing indexed scan for (max(-8315889933848495185), max(-8270922890152952193)] | 23:11:31,006 | 192.168.51.25
> Executing indexed scan for (max(-8270922890152952193), max(-8260813759533312175)] | 23:11:31,007 | 192.168.51.22
> Executing indexed scan for (max(-8260813759533312175), max(-8234845345932129353)] | 23:11:31,008 | 192.168.51.25
> Executing indexed scan for (max(-8234845345932129353), max(-8216636461332030758)] | 23:11:31,008 | 192.168.51.22
>
> Thanks,
> Jay
>



--
Jon Haddad
http://www.rustyrazorblade.com
twitter: rustyrazorblade


Off heap memory leak?

2015-10-02 Thread Parag Patel
We have a 12-node Cassandra cluster running 1.2.12.  Each node is using 1.1TB
out of 2TB.  Each node has a min+max heap of 24GB and the physical server has
48GB.  Our nodes do not restart during the week, only on the weekend, and we're
observing that the off-heap memory consumed ramps up over the course of the
week.  There were a few occasions lately where it consumed so much that the node
started swapping and the OS eventually killed the process (for good reason).
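
One way we can watch the ramp is to track the Cassandra process's resident size
against the configured heap over the week; a minimal sketch, assuming the usual
process name:

    pid=$(pgrep -f CassandraDaemon)
    ps -o rss= -p "$pid"   # resident set size in KB

Anything well above the 24GB heap is off-heap allocations or mmap'd sstable pages,
which is worth separating out before blaming a leak.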

I know we need to upgrade, but I'd like to evaluate if upgrading will fix the 
problem.  Has anyone experienced this or can anyone provide guidance?

Parag


Read query slows down when a node goes down

2013-09-15 Thread Parag Patel
Hi,

We have a six node cluster running DataStax Community Edition 1.2.9.  From our
app, we use the Netflix Astyanax library to read and write records into our
cluster.  We read and write with QUORUM.  We're experiencing an issue where our
read queries slow down in our app whenever a node goes offline.  This is a
problem that is very reproducible.  Has anybody experienced this before, or do
people have suggestions on what I could try?

Thanks,
Parag


RE: Read query slows down when a node goes down

2013-09-16 Thread Parag Patel
RF=3.  Single dc deployment.  No v-nodes.

Is there a certain amount of time I need to wait from the time the down node is 
started to the point where it's ready to be used?  If so, what's that time?  If 
it's dynamic, how would I know when it's ready?

Thanks,
Parag

From: sankalp kohli [mailto:kohlisank...@gmail.com]
Sent: Sunday, September 15, 2013 4:52 PM
To: user@cassandra.apache.org
Subject: Re: Read query slows down when a node goes down

What is your replication factor? Do you have a multi-DC deployment? Also, are
you using vnodes?

On Sun, Sep 15, 2013 at 7:54 AM, Parag Patel <parag.pa...@fusionts.com> wrote:
Hi,

We have a six node cluster running DataStax Community Edition 1.2.9.  From our 
app, we use the Netflix Astyanax library to read and write records into our 
cluster.  We read and write with QUORUM.  We're experiencing an issue where 
when a node goes down, we see our read queries slowing down in our app whenever 
a node goes offline.  This is a problem that is very reproducible.  Has anybody 
experienced this before or do people have suggestions on what I could try?

Thanks,
Parag



RE: Read query slows down when a node goes down

2013-09-16 Thread Parag Patel
Thanks.  I've noticed that a repair takes a long time to finish.  My data is
quite small, 1.5GB on each node when running nodetool status.  Is there any way
to speed up repairs?  (FYI, I haven't actually seen a repair finish since it
didn't return after 10 mins - I figured I was doing something wrong.)
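
For reference, one option that usually shortens this on an RF=3 cluster is to repair
only each node's primary ranges and run it on every node in turn; the keyspace name
below is a placeholder:

    nodetool repair -pr my_keyspace

Without -pr, each run repairs every range the node replicates, so the same data ends
up being repaired multiple times across the cluster.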

From: sankalp kohli [mailto:kohlisank...@gmail.com]
Sent: Monday, September 16, 2013 1:10 PM
To: user@cassandra.apache.org
Subject: Re: Read query slows down when a node goes down

For how long do the read latencies go up once a machine is down? It takes a
configurable amount of time for machines to detect that another machine is
down. This is done through Gossip. The algorithm used to detect failures is the
Phi accrual failure detector.

Regarding your question: if you are bootstrapping, then the node needs to get
the data from other nodes, and during this time it will not serve any reads but
will accept writes. Once it has all the data, it will start serving reads. In
the logs it will have something like "now serving reads".

If you are bringing back a machine which was offline, then it will start
accepting reads and writes immediately, but then you should run a repair to get
the missing data.





On Mon, Sep 16, 2013 at 8:12 AM, Parag Patel <parag.pa...@fusionts.com> wrote:
RF=3.  Single dc deployment.  No v-nodes.

Is there a certain amount of time I need to wait from the time the down node is 
started to the point where it's ready to be used?  If so, what's that time?  If 
it's dynamic, how would I know when it's ready?

Thanks,
Parag

From: sankalp kohli [mailto:kohlisank...@gmail.com]
Sent: Sunday, September 15, 2013 4:52 PM
To: user@cassandra.apache.org
Subject: Re: Read query slows down when a node goes down

What is your replication factor? Do you have a multi-DC deployment? Also, are
you using vnodes?

On Sun, Sep 15, 2013 at 7:54 AM, Parag Patel <parag.pa...@fusionts.com> wrote:
Hi,

We have a six node cluster running DataStax Community Edition 1.2.9.  From our 
app, we use the Netflix Astyanax library to read and write records into our 
cluster.  We read and write with QUORUM.  We're experiencing an issue where 
when a node goes down, we see our read queries slowing down in our app whenever 
a node goes offline.  This is a problem that is very reproducible.  Has anybody 
experienced this before or do people have suggestions on what I could try?

Thanks,
Parag




Statistics

2013-11-08 Thread Parag Patel
Hi,

I'm looking for a way to view statistics.  Mainly, I'd like to see the 
distribution of writes and reads over the course of a day or a set of days. Is 
there a way to do this through nodetool or by downloading a utility?
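
For what it's worth, the per-column-family Read Count / Write Count figures that
nodetool cfstats reports are cumulative since the node started, so sampling them on a
schedule and diffing successive samples gives the distribution over a day; a minimal
sketch, with the log path made up:

    # crontab entry: snapshot the counters every hour
    0 * * * *  nodetool cfstats >> /var/log/cassandra/cfstats-$(date +\%F).log

nodetool proxyhistograms gives request latency distributions as well, if latency
rather than volume is what you are after.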

Thanks,
Parag


Issue upgrading from 1.2 to 2.0.3

2013-12-19 Thread Parag Patel
Hi,

We are in the process of upgrading from 1.2 to 2.0.3.  We have a four node
cluster and we're upgrading one node at a time.  After upgrading two of the
nodes, we encountered a problem: if we run nodetool status on the 2.0.3 hosts,
they show 2 nodes down and 2 nodes up.  If we run nodetool status on the 1.2
hosts, it shows all nodes up.

Has anyone encountered this?  Perhaps I'm missing a step in my upgrade 
procedure?  Please help as this will prevent us from pushing into production.

Thanks,
Parag


RE: Issue upgrading from 1.2 to 2.0.3

2013-12-19 Thread Parag Patel
Thanks for that link.

Our 1.2 version is 1.2.12

Our 2.0.3 nodes were restarted once.  Before the restart they were running the
1.2.12 binary; after it, 2.0.3.  Immediately after each node was back in the
cluster, we ran nodetool upgradesstables.  We haven't restarted since.

Is a restart required for each node?
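
For reference, the usual per-node sequence for a rolling upgrade looks roughly like
the sketch below; treat it as the general recipe rather than our exact steps:

    nodetool drain            # flush memtables and stop accepting writes
    # stop the service, install the 2.0.3 binaries and updated config, start it again
    nodetool upgradesstables  # rewrite sstables into the new on-disk format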

From: Robert Coli [mailto:rc...@eventbrite.com]
Sent: Thursday, December 19, 2013 4:17 PM
To: user@cassandra.apache.org
Subject: Re: Issue upgrading from 1.2 to 2.0.3

On Thu, Dec 19, 2013 at 1:03 PM, Parag Patel <parag.pa...@fusionts.com> wrote:
We are in the process of upgrading 1.2 to 2.0.3.
 ...
Please help as this will prevent us from pushing into production.

(as a general commentary : 
https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/ )

specific feedback on your question :

Did the 2.0.3 nodes see the 1.2.x (which 1.2.x?) nodes after the first restart?

=Rob



Astyanax - multiple key search with pagination

2013-12-20 Thread Parag Patel
Hi,

I'm using Astyanax and trying to do a search for multiple keys with pagination.
I tried ".getKeySlice" with a list of primary keys, but it doesn't allow
pagination.  Does anyone know how to tackle this issue with Astyanax?

Parag


RE: Issue upgrading from 1.2 to 2.0.3

2013-12-24 Thread Parag Patel
After restarting all the nodes, they all see each other.   I'll try nodetool 
gossipinfo if it happens again.  Thanks.

From: Aaron Morton [mailto:aa...@thelastpickle.com]
Sent: Monday, December 23, 2013 10:19 PM
To: Cassandra User
Subject: Re: Issue upgrading from 1.2 to 2.0.3

If this is still a concern, can you post the output from nodetool gossipinfo?
It will give the details of what each node thinks of the other ones.

A

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 20/12/2013, at 11:38 am, Parag Patel <parag.pa...@fusionts.com> wrote:


Thanks for that link.

Our 1.2 version is 1.2.12

Our 2.0.3 nodes were restarted once.  Before restart, it was the 1.2.12 binary, 
after it was the 2.0.3.  Immediately after the node was back in the cluster, we 
ran nodetool upgradesstables.  We haven't restarted since.

Is a restart required for each node?

From: Robert Coli [mailto:rc...@eventbrite.com]
Sent: Thursday, December 19, 2013 4:17 PM
To: user@cassandra.apache.org
Subject: Re: Issue upgrading from 1.2 to 2.0.3

On Thu, Dec 19, 2013 at 1:03 PM, Parag Patel <parag.pa...@fusionts.com> wrote:
We are in the process of upgrading 1.2 to 2.0.3.
 ...
Please help as this will prevent us from pushing into production.

(as a general commentary : 
https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/ )

specific feedback on your question :

Did the 2.0.3 nodes see the 1.2.x (which 1.2.x?) nodes after the first restart?

=Rob