Re: Open File Handles for Deleted sstables

2016-09-28 Thread Anuj Wadehra
Restarting may be a temporary workaround but can't be a permanent solution. After some days, the problem will come back again. Thanks, Anuj On Thu, 29 Sep, 2016 at 12:54 AM, sai krishnam raju potturi wrote: restarting the cassandra

Re: New node block in autobootstrap

2016-09-28 Thread Alain RODRIGUEZ
> > Forgot to set replication for new data center :( I was feeling like it could be it :-). From the other thread: > It should be run from DC3 servers, after altering the keyspace to add > replicas in the new datacenter. Is this the way you're doing it? > >- Are all the nodes using the same
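A sketch of the fix being discussed, since the keyspace was missing replication for the new datacenter (keyspace name, DC names, and replication factors below are placeholders, not from the thread):

```shell
# Illustrative only: add replicas for the new DC, then stream data to it.
cqlsh -e "ALTER KEYSPACE my_keyspace WITH replication = \
  {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC3': 3};"
# Then, on each node in the new datacenter:
nodetool rebuild -- DC1
```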

Re: New node block in autobootstrap

2016-09-28 Thread techpyaasa .
Forgot to set replication for new data center :( On Wed, Sep 28, 2016 at 11:33 PM, Jonathan Haddad wrote: > What was the reason? > > On Wed, Sep 28, 2016 at 9:58 AM techpyaasa . wrote: > >> Very sorry...I got the reason for this issue.. >> Please

Re: Open File Handles for Deleted sstables

2016-09-28 Thread Jeff Jirsa
There has been a history of leaks when multiple repairs were run on the same node at the same time ( e.g: https://issues.apache.org/jira/browse/CASSANDRA-11215 ) You’re running a very old version of Cassandra. If you’re able to upgrade to the newest 2.1 or 2.2, it’s likely that

Re: Open File Handles for Deleted sstables

2016-09-28 Thread sai krishnam raju potturi
Restarting the Cassandra service helped get rid of those files in our situation. Thanks, Sai On Wed, Sep 28, 2016 at 3:15 PM, Anuj Wadehra wrote: > Hi, > > We are facing an issue where Cassandra has open file handles for deleted > sstable files. These open file handles

Open File Handles for Deleted sstables

2016-09-28 Thread Anuj Wadehra
Hi, We are facing an issue where Cassandra has open file handles for deleted sstable files. These open file handles keep increasing with time and eventually lead to a disk crisis. This is visible via the lsof command. There are no exceptions in the logs. We are suspecting a race condition where
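The lsof check described above can be reproduced with something like the following (the process match on `CassandraDaemon` is an assumption about how the JVM was started):

```shell
# Count file handles the Cassandra JVM still holds on deleted sstable files.
lsof -p "$(pgrep -f CassandraDaemon)" | grep -c '(deleted)'
```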

Re: TRUNCATE throws OperationTimedOut randomly

2016-09-28 Thread George Sigletos
Even when I set a lower request-timeout in order to trigger a timeout, there is still no WARN or ERROR in the logs. On Wed, Sep 28, 2016 at 8:22 PM, George Sigletos wrote: > Hi Joaquin, > > Unfortunately neither WARN nor ERROR found in the system logs across the > cluster when

Re: TRUNCATE throws OperationTimedOut randomly

2016-09-28 Thread George Sigletos
Hi Joaquin, Unfortunately neither WARN nor ERROR was found in the system logs across the cluster when executing truncate. Sometimes it executes immediately, other times it takes 25 seconds, given that I have connected with --request-timeout=30 seconds. The nodes are a bit busy compacting. On a

Re: New node block in autobootstrap

2016-09-28 Thread Jonathan Haddad
What was the reason? On Wed, Sep 28, 2016 at 9:58 AM techpyaasa . wrote: > Very sorry...I got the reason for this issue.. > Please ignore. > > > On Wed, Sep 28, 2016 at 10:14 PM, techpyaasa . > wrote: > >> @Paulo >> >> We have done changes as you

Re: TRUNCATE throws OperationTimedOut randomly

2016-09-28 Thread Joaquin Casares
Hi George, Try grepping for WARN and ERROR on the system.logs across all nodes when you run the command. Could you post any of the recent stacktraces that you see? Cheers, Joaquin Casares Consultant Austin, TX Apache Cassandra Consulting http://www.thelastpickle.com On Wed, Sep 28, 2016 at

Re: TRUNCATE throws OperationTimedOut randomly

2016-09-28 Thread George Sigletos
Thanks a lot for your reply. I understand that truncate is an expensive operation. But throwing a timeout while truncating a table that is already empty? A workaround is to set a high --request-timeout when connecting. Even 20 seconds is not always enough Kind regards, George On Wed, Sep 28,
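The client-side workaround mentioned above looks like this (the timeout value in seconds is illustrative):

```shell
# Raise cqlsh's client-side timeout so slow truncates don't abort the session.
cqlsh --request-timeout=60 my-cassandra-host
```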

WARN Writing large partition for materialized views

2016-09-28 Thread Robert Sicoie
Hi guys, I run a cluster with 5 nodes, cassandra version 3.0.5. I get this warning: 2016-09-28 17:22:18,480 BigTableWriter.java:171 - Writing large partition... for some materialized view. Some have values over 500MB. How does this affect performance? What can/should be done? I suppose it is a problem

Contains-query leads to error when list in selected row is empty

2016-09-28 Thread Michael Mirwaldt
Hi Cassandra-users, my name is Michael Mirwaldt and I work for financial.com. I have encountered this problem with Cassandra 3.7 running 4 nodes: Given the data model CREATE KEYSPACE mykeyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '2'} AND durable_writes =
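A minimal sketch of the CONTAINS-on-a-list pattern being reported (table and column names are placeholders, not taken from the original report):

```shell
# Hypothetical schema and query illustrating CONTAINS against a list column;
# the report concerns rows where the list is empty.
cqlsh -e "
  CREATE TABLE mykeyspace.items (id int PRIMARY KEY, tags list<text>);
  SELECT * FROM mykeyspace.items WHERE tags CONTAINS 'foo' ALLOW FILTERING;"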

Re: TRUNCATE throws OperationTimedOut randomly

2016-09-28 Thread Edward Capriolo
Truncate does a few things (based on version): truncate takes snapshots; truncate causes a flush in very old versions; truncate causes a schema migration. In newer versions like cassandra 3.4 you have this knob. # How long the coordinator should wait for truncates to complete # (This can be
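The knob being quoted is the coordinator-side truncate timeout in cassandra.yaml; a sketch of the setting (the value shown is the usual default, in milliseconds):

```
# cassandra.yaml
# How long the coordinator should wait for truncates to complete
# (can be much longer, because unless auto_snapshot is disabled
# we need to flush first so we can snapshot before removing the data)
truncate_request_timeout_in_ms: 60000
```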

Re: New node block in autobootstrap

2016-09-28 Thread techpyaasa .
Very sorry...I got the reason for this issue.. Please ignore. On Wed, Sep 28, 2016 at 10:14 PM, techpyaasa . wrote: > @Paulo > > We have done changes as you said > net.ipv4.tcp_keepalive_time=60 > net.ipv4.tcp_keepalive_probes=3 > net.ipv4.tcp_keepalive_intvl=10 > > and

Re: New node block in autobootstrap

2016-09-28 Thread techpyaasa .
@Paulo We have done changes as you said net.ipv4.tcp_keepalive_time=60 net.ipv4.tcp_keepalive_probes=3 net.ipv4.tcp_keepalive_intvl=10 and increased streaming_socket_timeout_in_ms to 48 hours , "phi_convict_threshold : 9". And once again recommissioned new data center (DC3) , ran " nodetool
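The kernel keepalive changes quoted above, applied persistently, would look roughly like this (paths and the reload step assume a typical Linux setup):

```shell
# Persist the TCP keepalive settings from the thread, then reload.
cat >> /etc/sysctl.conf <<'EOF'
net.ipv4.tcp_keepalive_time=60
net.ipv4.tcp_keepalive_probes=3
net.ipv4.tcp_keepalive_intvl=10
EOF
sysctl -p
```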

TRUNCATE throws OperationTimedOut randomly

2016-09-28 Thread George Sigletos
Hello, I keep executing a TRUNCATE command on an empty table and it throws OperationTimedOut randomly: cassandra@cqlsh> truncate test.mytable; OperationTimedOut: errors={}, last_host=cassiebeta-01 cassandra@cqlsh> truncate test.mytable; OperationTimedOut: errors={}, last_host=cassiebeta-01

Re: nodetool rebuild streaming exception

2016-09-28 Thread Alain RODRIGUEZ
Hi techpyaasa, That was one of my teammates, very sorry for it/multiple threads. No big deal :-). *It looks like streams are failing right away when trying to rebuild.?* > No , after partial streaming of data (around 150 GB - we have around 600 > GB of data on each node) streaming is getting

Re: How to get rid of "Cannot start multiple repair sessions over the same sstables" exception

2016-09-28 Thread Alexander Dejanovski
Robert, You can restart them in any order, that doesn't make a difference afaik. Cheers Le mer. 28 sept. 2016 17:10, Robert Sicoie a écrit : > Thanks Alexander, > > Yes, with tpstats I can see the hanging active repair(s) (output > attached). For one there are 31

[RELEASE] Apache Cassandra 2.2.8 released

2016-09-28 Thread Michael Shuler
* NOTICE * This is the first release signed with key 0xA278B781FE4B2BDA by Michael Shuler. Debian users will need to add the key to `apt-key` and the process has been updated on https://wiki.apache.org/cassandra/DebianPackaging and patch created for source docs. Either method will work:
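One common way to import the new signing key named in the notice (the keyserver choice is illustrative; the wiki page linked above has the canonical steps):

```shell
# Fetch the 2.2.8 release-signing key (ID from the announcement) and add it.
gpg --keyserver pgp.mit.edu --recv-keys A278B781FE4B2BDA
gpg --export --armor A278B781FE4B2BDA | sudo apt-key add -
```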

Re: How to get rid of "Cannot start multiple repair sessions over the same sstables" exception

2016-09-28 Thread Robert Sicoie
Thanks Alexander, Yes, with tpstats I can see the hanging active repair(s) (output attached). For one there are 31 pending repairs. On others there are fewer pending repairs (min 12). Is there any recommendation for the restart order? The one with fewer pending repairs first, perhaps? Thanks,

Re: How to get rid of "Cannot start multiple repair sessions over the same sstables" exception

2016-09-28 Thread Alexander Dejanovski
They will show up in nodetool compactionstats : https://issues.apache.org/jira/browse/CASSANDRA-9098 Did you check nodetool tpstats to see if you didn't have any running repair session ? Just to make sure (and if you can actually do it), roll restart the cluster and try again. Repair sessions can
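The checks being suggested can be run per node before deciding to restart; a sketch:

```shell
# Look for active/pending repair work and running anticompactions/validations.
nodetool tpstats | grep -i repair
nodetool compactionstats
```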

Re: nodetool rebuild streaming exception

2016-09-28 Thread Alain RODRIGUEZ
Just saw a very similar question from Laxmikanth (laxmikanth...@gmail.com) on another thread, with the same logs. Would you mind avoiding splitting the discussion across multiple threads, so we can gather the information in one place and better help you from this mailing list? C*heers, 2016-09-28 14:28 GMT+02:00 Alain RODRIGUEZ

Re: nodetool rebuild streaming exception

2016-09-28 Thread Alain RODRIGUEZ
Hi, It looks like streams are failing right away when trying to rebuild. - Could you please share with us the command you used? It should be run from DC3 servers, after altering the keyspace to add replicas in the new datacenter. Is this the way you're doing it? - Are all the nodes using

Re: Repairs at scale in Cassandra 2.1.13

2016-09-28 Thread Paulo Motta
There were a few streaming bugs fixed between 2.1.13 and 2.1.15 (see CHANGES.txt for more details), so I'd recommend upgrading to 2.1.15 in order to avoid them. 2016-09-28 9:08 GMT-03:00 Alain RODRIGUEZ : > Hi Anubhav, > > >> I’m considering doing subrange

Re: Repairs at scale in Cassandra 2.1.13

2016-09-28 Thread Alain RODRIGUEZ
Hi Anubhav, > I’m considering doing subrange repairs (https://github.com/ > BrianGallew/cassandra_range_repair/blob/master/src/range_repair.py) > I used this script a lot, and quite successfully. Another working option that people are using is: https://github.com/spotify/cassandra-reaper
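Under the hood, the linked script drives per-token-range repairs; done by hand, one step looks like this (tokens and keyspace name are placeholders):

```shell
# Subrange repair of one token range; the script automates iterating ranges.
nodetool repair -st <start_token> -et <end_token> my_keyspace
```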

Re: How long/how many days 'nodetool gossipinfo' will have decommissioned nodes info

2016-09-28 Thread Alain RODRIGUEZ
> > I've read from some that the gossip info will stay > around for 72h before being removed. > I've read this one too :-). It is 3 days indeed. This might be of some interest: https://issues.apache.org/jira/browse/CASSANDRA-10371 (Fix Version/s: 2.1.14, 2.2.6, 3.0.4, 3.4) C*heers,
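A quick way to spot decommissioned nodes still lingering in gossip during that 72-hour window (grep pattern is illustrative; decommissioned nodes show a LEFT status):

```shell
# List gossip endpoint states; LEFT entries are decommissioned nodes.
nodetool gossipinfo | grep -i STATUS
```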

Re: How to get rid of "Cannot start multiple repair sessions over the same sstables" exception

2016-09-28 Thread Alexander Dejanovski
Hi, nodetool scrub won't help here, as what you're experiencing is most likely that one SSTable is going through anticompaction, and then another node is asking for a Merkle tree that involves it. For understandable reasons, an SSTable cannot be anticompacted and validation compacted at the same

How to get rid of "Cannot start multiple repair sessions over the same sstables" exception

2016-09-28 Thread Robert Sicoie
Hi guys, I have a cluster of 5 nodes, cassandra 3.0.5. I was running nodetool repair last days, one node at a time, when I first encountered this exception *ERROR [ValidationExecutor:11] 2016-09-27 16:12:20,409 CassandraDaemon.java:195 - Exception in thread Thread[ValidationExecutor:11,1,main]*