Re: Unexpected high internode network activity

2016-02-25 Thread daemeon reiydelle
Hmm. From the AWS FAQ: *Q: If I have two instances in different availability zones, how will I be charged for regional data transfer?* Each instance is charged for its data in and data out. Therefore, if data is transferred between these two instances, it is charged out for the first instance
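The billing rule quoted above (each instance is charged for its own data in and data out) can be turned into a rough estimate. This is only a sketch: the function name and the `rate_per_gb` parameter are hypothetical, and no actual AWS price is assumed here.

```python
def cross_az_charge(transferred_gb: float, rate_per_gb: float) -> float:
    """Rough charge for data moved between two instances in different AZs.

    Assumption: the same rate applies to data out (sender) and data in
    (receiver), so every transferred GB is billed twice.
    rate_per_gb is a placeholder; look up the current price yourself.
    """
    charged_out = transferred_gb * rate_per_gb  # billed to the sending instance
    charged_in = transferred_gb * rate_per_gb   # billed to the receiving instance
    return charged_out + charged_in


if __name__ == "__main__":
    # Illustrative numbers only.
    print(cross_az_charge(10.0, 0.5))
```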

RE: CsvReporter not spitting out metrics in cassandra

2016-02-25 Thread Leleu Eric
Hi, I configured this reporter recently with Apache Cassandra v2.1.x and had no trouble. Here are some points to check: - The directory “/etc/dse/cassandra” has to be in the classpath (I’m not a DSE user, so I don’t know whether that is already the case). - If the

Re: Unexpected high internode network activity

2016-02-25 Thread Gianluca Borello
It is indeed very intriguing and I really hope to learn more from the experience of this mailing list. To address your points: - The theory that full data is coming from replicas during reads is not enough to explain the situation. In my scenario, over a time window I had 17.5 GB of intra node

Re: Unexpected high internode network activity

2016-02-25 Thread daemeon reiydelle
Intriguing. It's enough data to look as if full data is coming from the replicas instead of digests when the read of the copy occurs. Are you doing backup/DR? Are directories copied regularly over the network?

Re: Unexpected high internode network activity

2016-02-25 Thread Gianluca Borello
Thank you for your reply. To answer your points: - I fully agree on the write volume; in fact, my isolated tests confirm your estimate - About the reads, I agree as well, but the volume of data is still much higher - I am writing to a single keyspace with RF 3; there's just one keyspace - I

Re: Unexpected high internode network activity

2016-02-25 Thread daemeon reiydelle
If you read and write at quorum, then you write 3 copies of the data and then return to the caller; when reading, you read one copy (assume it is not on the coordinator) and 1 digest (because a read at quorum is 2, not 3). When you insert, how many keyspaces get written to? (Are you using e.g. inverted
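The arithmetic in the message above can be sketched as follows. This is only an illustration of the counts described there (RF 3, quorum reads and writes, coordinator not holding a replica); the function names are hypothetical and nothing here calls Cassandra.

```python
def quorum(replication_factor: int) -> int:
    """Quorum size: a majority of the replicas."""
    return replication_factor // 2 + 1


def replica_messages(replication_factor: int) -> dict:
    """Count replica contacts per request at QUORUM, as described above.

    Assumption (from the message): the coordinator is not itself a replica.
    """
    q = quorum(replication_factor)
    return {
        "quorum": q,
        "write_full_copies": replication_factor,  # the write goes to every replica
        "read_full_copies": 1,                    # one replica returns the data
        "read_digests": q - 1,                    # the others return digests only
    }


if __name__ == "__main__":
    print(replica_messages(3))
```

With RF 3 this gives a quorum of 2, so a read should involve one full copy and one digest, which is why full-data reads from every replica would look anomalous.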

Re: Checking replication status

2016-02-25 Thread Jimmy Lin
So far they are not long, just a config change and a restart. If it is a 2-hour downtime for whatever reason, is a repair a better option than trying to figure out whether the replication sync has finished or not? On Thu, Feb 25, 2016 at 1:09 PM, daemeon reiydelle wrote: > Hmm. What are

Re: how to read parent_repair_history table?

2016-02-25 Thread Jimmy Lin
Hi Paulo, that is right, I forgot there is another table that actually tracks the rest of the repair details. Thanks for the pointers, I will explore more with that info. I am actually surprised that not much documentation out there talks about these two tables, or about other tools or utilities harvesting

Migrating from single node to cluster

2016-02-25 Thread Jason Kania
Hi, I am wondering if there is any documentation on migrating from a single-node Cassandra instance to a multi-node cluster? My searches have been unsuccessful so far, and I have had no luck experimenting with the tools because their output is so terse. I currently use a single node having data that

Unexpected high internode network activity

2016-02-25 Thread Gianluca Borello
Hello, We have a Cassandra 2.1.9 cluster on EC2 for one of our live applications. There's a total of 21 nodes across 3 AWS availability zones, c3.2xlarge instances. The configuration is pretty standard: we use the default settings that come with the DataStax AMI and the driver in our application

Re: how to read parent_repair_history table?

2016-02-25 Thread Paulo Motta
> How does it work when a repair job targets only the local DC vs. all DCs? Are there any columns or flags that tell the difference? Or does it actually matter? You cannot easily find out from the parent_repair_session table whether a repair is local-only or multi-DC. I created

Re: Checking replication status

2016-02-25 Thread daemeon reiydelle
Hmm. What are your processes when a node comes back after "a long offline"? Long enough to take the node offline and do a repair? Run the risk of serving stale data? Parallel repairs? So, what sort of time frames are "a long time"?

CsvReporter not spitting out metrics in cassandra

2016-02-25 Thread Vikram Kone
Hi, I have added the following file on my Cassandra node: /etc/dse/cassandra/metrics-reporter-config.yaml

    csv:
      -
        outdir: '/mnt/cassandra/metrics'
        period: 10
        timeunit: 'SECONDS'
        predicate:
          color: "white"
          useQualifiedName: true
          patterns:
            -

Checking replication status

2016-02-25 Thread Jimmy Lin
Hi all, what are the better ways to check the overall replication status of a Cassandra cluster? Within a single DC, unless a node is down for a long time, I feel it is mostly a non-issue and things are replicated pretty fast. But when a node comes back from a long offline period, is

Re: how to read parent_repair_history table?

2016-02-25 Thread Jimmy Lin
Hi Paulo, one more follow-up... :) I noticed these tables are supposed to be replicated to all nodes in the cluster and are not node-specific. How does it work when a repair job targets only the local DC vs. all DCs? Are there any columns or flags that tell the difference? Or does it actually matter?

Re: Handling uncommitted paxos state

2016-02-25 Thread Robert Coli
On Thu, Feb 25, 2016 at 1:23 AM, Nicholas Wilson < nicholas.wil...@realvnc.com> wrote: > If a WriteTimeoutException with WriteType.SIMPLE is thrown for a CAS > write, that means that the paxos phase was successful, but the data > couldn't be committed during the final 'commit/reset' phase. On the

Re: how to read parent_repair_history table?

2016-02-25 Thread Paulo Motta
> Why will each repair job execution have 2 entries? I thought there would be one entry, beginning with the started_at column filled, and when it completed, the finished_at column would be filled. That's correct, I was mistaken! > Also, if my cluster has more than 1 keyspace, and the way this table is

Re: Handling uncommitted paxos state

2016-02-25 Thread Carl Yeksigian
The paxos state is written to a system table (system.paxos) on each of the paxos coordinators, so it goes through the normal write path, including persisting to the log and being stored in a memtable until being flushed to disk. As such, the state can survive restarts. These states are not treated

Re: how to read parent_repair_history table?

2016-02-25 Thread Jimmy Lin
Hi Paulo, following up on the number-of-entries question... Why will each repair job execution have 2 entries? I thought there would be one entry, beginning with the started_at column filled, and when it completed, the finished_at column would be filled. Also, if my cluster has more than 1 keyspace, and the way

Re: how to read parent_repair_history table?

2016-02-25 Thread Jimmy Lin
Hi Anuj, I never thought of using JMX notifications as a way to check. Partly, I think it requires a live connection or application to keep the notifications flowing in, while the DB approach lets you look up current or past jobs whenever you want. Thanks > On Feb

Re: how to read parent_repair_history table?

2016-02-25 Thread Anuj Wadehra
Hi Jimmy, We are on 2.0.x. We are planning to use JMX notifications for getting repair status. To repair the database, we call the forceTableRepairPrimaryRange JMX operation from our Java client application on each node. You can call other, more recent JMX methods for repair. I would be keen to know the

Re: Consistent read timeouts for bursts of reads

2016-02-25 Thread Emīls Šolmanis
Having had a read through the archives, I missed this at first, but this seems to be *exactly* like what we're experiencing. http://www.mail-archive.com/user@cassandra.apache.org/msg46064.html Only difference is we're getting this for reads and using CQL, but the behaviour is identical. On Thu,

Consistent read timeouts for bursts of reads

2016-02-25 Thread Emīls Šolmanis
Hello, We're having a problem with concurrent requests. It seems that whenever we try resolving more than ~ 15 queries at the same time, one or two get a read timeout and then succeed on a retry. We're running Cassandra 2.2.4 accessed via the 2.1.9 Datastax driver on AWS. What we've found while

Re: how to read parent_repair_history table?

2016-02-25 Thread Paulo Motta
Hello Jimmy, The parent_repair_history table keeps track of the start and finish information of a repair session. The other table, repair_history, keeps track of repair status as it progresses. So, you must first query the parent_repair_history table to check whether a repair started and finished, as well as
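As a rough illustration of the first step described above, the sketch below classifies a repair session from its start and finish timestamps. It assumes the rows have already been fetched and that the two columns are the started_at and finished_at mentioned elsewhere in this thread; the function name and the sample values are hypothetical, and no driver call is shown.

```python
from datetime import datetime
from typing import Optional


def repair_state(started_at: Optional[datetime],
                 finished_at: Optional[datetime]) -> str:
    """Classify one parent_repair_history row by its two timestamps.

    Assumption: a missing finished_at only means no finish was recorded;
    it does not distinguish a running repair from a failed one.
    """
    if started_at is None:
        return "not started"
    if finished_at is None:
        return "started, no finish recorded"
    return "finished"


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    rows = [
        (datetime(2016, 2, 25, 10, 0), datetime(2016, 2, 25, 11, 30)),
        (datetime(2016, 2, 25, 12, 0), None),
    ]
    for started, finished in rows:
        print(repair_state(started, finished))
```

For sessions without a recorded finish, the per-range progress would then have to come from the repair_history table, as the message says.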

Re: Cassandra nodes reduce disks per node

2016-02-25 Thread Alain RODRIGUEZ
You're welcome. If you have some feedback, you can comment on the blog post :-). C*heers, --- Alain Rodriguez - al...@thelastpickle.com France The Last Pickle - Apache Cassandra Consulting http://www.thelastpickle.com 2016-02-25 12:28 GMT+01:00 Anishek Agarwal

Re: Cassandra Data Audit

2016-02-25 Thread Jack Krupansky
There is an open Jira on this exact topic - Change Data Capture (CDC): https://issues.apache.org/jira/browse/CASSANDRA-8844 Unfortunately, open means not yet done. -- Jack Krupansky On Thu, Feb 25, 2016 at 2:13 AM, Charulata Sharma (charshar) < chars...@cisco.com> wrote: > Thanks for the

Re: Cassandra nodes reduce disks per node

2016-02-25 Thread Anishek Agarwal
Nice, thanks! On Thu, Feb 25, 2016 at 1:51 PM, Alain RODRIGUEZ wrote: > For what it is worth, I finally wrote a blog post about this --> > http://thelastpickle.com/blog/2016/02/25/removing-a-disk-mapping-from-cassandra.html > > If you are not done yet, every step is detailed

Handling uncommitted paxos state

2016-02-25 Thread Nicholas Wilson
Hi, I have some questions about the behaviour of 'uncommitted paxos state', as described here: http://www.datastax.com/dev/blog/cassandra-error-handling-done-right If a WriteTimeoutException with WriteType.SIMPLE is thrown for a CAS write, that means that the paxos phase was successful, but

Re: Cassandra nodes reduce disks per node

2016-02-25 Thread Alain RODRIGUEZ
For what it is worth, I finally wrote a blog post about this --> http://thelastpickle.com/blog/2016/02/25/removing-a-disk-mapping-from-cassandra.html If you are not done yet, every step is detailed in there. C*heers, --- Alain Rodriguez - al...@thelastpickle.com France The