Re: Two problems with Cassandra
Update should not be a problem because no read is done, so there is no need to pull the data out. Is that row bigger than your memory capacity (or heap size)? For dealing with large heaps you can refer to this ticket: CASSANDRA-8150. It provides some nice tips. It would be good if someone else could share their experience.

Regards,
Carlos Juzarte Rolo
Cassandra Consultant
Pythian - Love your data
rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
Tel: 1649
www.pythian.com

On Wed, Feb 11, 2015 at 12:05 PM, Pavel Velikhov pavel.velik...@gmail.com wrote:

Hi Carlos,

I tried on a single node and a 4-node cluster. On the 4-node cluster I set up the tables with replication factor = 2. I usually iterate over a subset, but it can be about ~40% right now. Some of my column values could be quite big; I remember I was exporting to CSV and had to change the default CSV max column length. If I just update, there are no problems; it's reading and updating together that kills everything (could it have something to do with the driver?). I'm using the 2.0.8 release right now.

I was trying to tweak memory sizes. If I give Cassandra too much memory (8 or 16 GB) it dies much faster due to GC not being able to keep up. But it consistently dies on a specific row in the single-instance case. Is this enough info to point me somewhere?

Thank you,
Pavel

On Feb 11, 2015, at 1:48 PM, Carlos Rolo r...@pythian.com wrote:

Hello Pavel,

What is the size of the cluster (# of nodes)? And do you need to iterate over the full 1TB every time you do the update, or just parts of it? IMO the information is too short to make any kind of assessment of the problem you are having. I can suggest trying a 2.0.x (or 2.1.1) release to see if you get the same problem.
Regards,
Carlos Juzarte Rolo

On Wed, Feb 11, 2015 at 11:22 AM, Pavel Velikhov pavel.velik...@gmail.com wrote:

Hi,

I'm using Cassandra to store NLP data. The dataset is not that huge (about 1TB), but I need to iterate over it quite frequently, updating the full dataset (each record, but not necessarily each column). I've run into two problems (I'm using the latest Cassandra):

1. I was trying to copy from one Cassandra cluster to another via the python driver, but the driver confused the two instances.
2. While trying to update the full dataset with a simple transformation (again via the python driver), both single-node and clustered Cassandra run out of memory no matter what settings I try, even if I put a lot of sleeps into the mix. However, simpler transformations (updating just one column, especially when there is a lot of processing overhead) work just fine.

I'm really concerned about #2, since we're moving all heavy processing to a Spark cluster and will expand it, and I would expect much heavier traffic to/from Cassandra.

Any hints, war stories, etc. much appreciated!

Thank you,
Pavel Velikhov
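The read-transform-update loop Pavel describes can usually be kept memory-bounded by paging reads and flushing updates in small chunks rather than materializing large result sets. Below is a minimal sketch of that pattern; the driver interaction is simulated with plain dicts so it runs stand-alone (in a real run the rows would come from a paged query, e.g. a SimpleStatement with a small fetch_size in the DataStax python driver), and all names are hypothetical:

```python
import time

def transform_in_chunks(rows, transform, chunk_size=100, pause=0.0):
    """Yield lists of transformed rows, never holding more than one chunk in memory."""
    chunk = []
    for row in rows:
        chunk.append(transform(row))
        if len(chunk) >= chunk_size:
            yield chunk  # caller flushes this chunk of updates, then we continue
            chunk = []
            if pause:
                time.sleep(pause)  # optional throttle so compaction/GC can keep up
    if chunk:
        yield chunk

# Simulated rows; in practice this generator would page over the real table.
rows = ({"id": i, "text": "doc %d" % i} for i in range(250))
chunks = list(transform_in_chunks(rows, lambda r: dict(r, text=r["text"].upper()),
                                  chunk_size=100))
# 250 rows with chunk_size=100 -> chunks of 100, 100 and 50
```

The key point is that only one chunk is resident at a time, so memory use is bounded by chunk_size rather than by the dataset size.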
Re: nodetool status shows large numbers of up nodes are down
Can you run nodetool tpstats and check if there are pending requests on GossipStage? The timeout should not affect gossip (AFAIK). As for problems you can have in this state: if your nodes are marked down for long and you are using hinted handoff, your hints may not be delivered and your data can end up out of sync (this can be fixed by increasing the timeout limit, or during repairs).

Regards,
Carlos Juzarte Rolo

On Tue, Feb 10, 2015 at 8:51 PM, Chris Lohfink clohfin...@gmail.com wrote:

Are you hitting long GCs on your nodes? You can check the GC log, or look in the Cassandra log for GCInspector.

Chris

On Tue, Feb 10, 2015 at 1:28 PM, Cheng Ren cheng@bloomreach.com wrote:

Hi Carlos,

Thanks for your suggestion. We did check the NTP settings and clocks, and they are all working normally. Schema versions are also consistent with peers'. BTW, the only change we made yesterday was to lower some nodes' request timeouts (read_request_timeout, write_request_timeout, range_request_timeout and request_timeout) from 3 to 1 on 6 nodes. Will this affect internode gossip?

Thanks,
Cheng

On Mon, Feb 9, 2015 at 11:07 PM, Carlos Rolo r...@pythian.com wrote:

Hi Cheng,

Are all machines configured with NTP and all clocks in sync? If that is not the case, do it. If your clocks are not in sync it causes some weird issues like the ones you see, but also schema disagreements and, in some cases, corrupted data.

Regards,
Carlos Juzarte Rolo

On Tue, Feb 10, 2015 at 3:40 AM, Cheng Ren cheng@bloomreach.com wrote:

Hi,

We have a two-DC cluster with 21 nodes and 27 nodes in each DC.
Over the past few months, we have seen nodetool status mark 4-8 nodes as down while they are actually functioning. Today in particular we noticed that running nodetool status on some nodes shows a higher number of nodes down than before, while they are actually up and serving requests. For example, on one node it shows 42 nodes down. phi_convict_threshold is set to 12 on all nodes, and we are running Cassandra 2.0.4 on AWS EC2 machines.

Does anyone have a recommendation on identifying the root cause of this? Will this cause any consequences?

Thanks,
Cheng
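Carlos's tpstats check is easy to script if you want to monitor it across the cluster. A hedged sketch that extracts the Pending count for GossipStage from `nodetool tpstats` output; in practice the text would come from something like subprocess.check_output(["nodetool", "tpstats"]), and the column layout (pool name, Active, Pending, ...) is an assumption about the 2.0.x output format, so here a captured sample is parsed instead:

```python
def pending_tasks(tpstats_text, pool_name="GossipStage"):
    """Return the Pending column for a thread pool, or None if the pool is absent."""
    for line in tpstats_text.splitlines():
        parts = line.split()
        if parts and parts[0] == pool_name:
            return int(parts[2])  # assumed columns: name, Active, Pending, ...
    return None

# A made-up sample of tpstats output for illustration.
sample = """Pool Name                    Active   Pending      Completed
GossipStage                       0        37        1048213
MutationStage                     2         0        9921830"""
```

A sustained non-zero Pending value on GossipStage would support the theory that gossip is backed up rather than the peers actually being down.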
Re: how to batch the select query to reduce network communication
Hi,

You can't. Batches are only available for INSERT, UPDATE and DELETE operations. Batches exist to give Cassandra some atomicity: either all operations succeed or all fail.

Regards,
Carlos Juzarte Rolo

On Fri, Feb 6, 2015 at 12:21 PM, diwayou diwa...@vip.qq.com wrote:

create table t { a int, b int, c int }

If I want to execute

select * from t where a = 1 and b = 2 limit 10;
select * from t where a = 1 and b = 3 limit 10;

how can I batch this, and only execute once to get the result?
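Since SELECT statements cannot go into a batch, the usual way to cut round-trip cost is to issue the statements concurrently and merge the results client-side (with the DataStax python driver this would typically be session.execute_async). A runnable sketch of the pattern, with the per-partition lookup stubbed out so no cluster is needed; the table and column names are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_page(b_value, limit=10):
    # Stand-in for:
    #   session.execute("SELECT * FROM t WHERE a = 1 AND b = %s LIMIT %s",
    #                   (b_value, limit))
    return [(1, b_value, c) for c in range(limit)]

b_values = [2, 3]
with ThreadPoolExecutor(max_workers=4) as pool:
    # Run both SELECTs concurrently and keep results keyed by b.
    results = dict(zip(b_values, pool.map(fetch_page, b_values)))
```

The queries still execute as separate statements on the server; concurrency only overlaps the network latency, it does not change the read path.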
Re: Newly added column not visible
Hey Saurabh,

Your issue seems similar to one I have, but mine seems like a timing issue (and is not easy to reproduce). Check the comments on https://issues.apache.org/jira/browse/CASSANDRA-8012 and see if it fits your problem. Otherwise do as Mark recommended and create a new JIRA issue.

Regards,
Carlos

On Thu, Feb 5, 2015 at 10:25 PM, Mark Reddy mark.l.re...@gmail.com wrote:

Hey Saurabh,

I can't say that I have experienced this before; however, if you can reliably reproduce the issue it would be worth commenting on the JIRA issue you linked to, or alternatively creating a new JIRA with as much info (setup, test case, debug logs, etc.) as possible.

Regards,
Mark

On 5 February 2015 at 00:50, Saurabh Sethi saurabh_se...@symantec.com wrote:

I have a 3-node cluster running Cassandra version 2.1.2. Through my unit test, I am creating a column family with 3 columns, inserting a row, asserting that the values got inserted, and then truncating the column family. After that I am adding a fourth column to the column family and inserting a new row with four values, but this insertion is failing because it can't find the newly added column.

If I go via debug mode it works fine, but not otherwise. I also tried putting a Thread.sleep() for 10 seconds after adding the new column, but to no avail.

Does anyone have any idea what might be going on here? I see that a JIRA was filed for a similar issue in May 2014, but it was closed in December 2014 as not reproducible: https://issues.apache.org/jira/browse/CASSANDRA-7186

Thanks,
Saurabh
Re: Anonymous user in permissions system?
Hello Erik,

It seems possible. Refer to the following documentation to see if it fits your needs:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/security/secureInternalAuthenticationTOC.html
http://www.datastax.com/documentation/cassandra/2.0/cassandra/security/secureInternalAuthorizationTOC.html

And you can check the permissions available here:
http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/list_permissions_r.html

I'm assuming you are using CQL.

Regards,
Carlos Juzarte Rolo

On Thu, Feb 5, 2015 at 12:23 PM, Erik Forsberg forsb...@opera.com wrote:

Hi!

Is there such a thing as an anonymous/unauthenticated user in the Cassandra permissions system? What I would like to do is grant SELECT, i.e. provide read-only access, to users who have not presented a username and password, then grant UPDATE/INSERT to other users who have presented a username and (correct) password. Doable?

Regards,
\EF
Re: Upgrading from 1.2 to 2.1 questions
Using Pycassa (https://github.com/pycassa/pycassa) I had no trouble with clients writing/reading from 1.2.x to 2.0.x (I can't recall the minor versions off the top of my head right now).

Regards,
Carlos Juzarte Rolo

On Mon, Feb 2, 2015 at 3:21 PM, Oleg Dulin oleg.du...@gmail.com wrote:

Sure, but the question is really about going from 1.2 to 2.0 ...

On 2015-02-02 13:59:27 +0000, Kai Wang said:

I would not use 2.1.2 for production yet. It doesn't seem stable enough based on the feedback I see here. The newest 2.0.12 may be a better option.

On Feb 2, 2015 8:43 AM, Sibbald, Charles charles.sibb...@bskyb.com wrote:

Hi Oleg,

What is the minor version of 1.2? I am looking to do the same for 1.2.14 in a very large cluster.

Regards,
Charles

On 02/02/2015 13:33, Oleg Dulin oleg.du...@gmail.com wrote:

Dear Distinguished Colleagues:

We'd like to upgrade our cluster from 1.2 to 2.0 and then to 2.1. We are using the Pelops Thrift client, which has long been abandoned by its authors. I've read that 2.x has changes to the Thrift protocol making it incompatible with 1.2 (and of course now the link to that site eludes me). If that is true, we need to first upgrade our Thrift client and then upgrade Cassandra.

Let's start by confirming whether that indeed is the case -- if it is true, I have my work cut out for me. Does anyone know for sure?

Regards,
Oleg

Information in this email including any attachments may be privileged, confidential and is intended exclusively for the addressee. The views expressed may not be official policy, but the personal views of the originator. If you have received it in error, please notify the sender by return e-mail and delete it from your system. You should not reproduce, distribute, store, retransmit, use or disclose its contents to anyone.
Please note we reserve the right to monitor all e-mail communication through our internal and external networks. SKY and the SKY marks are trademarks of British Sky Broadcasting Group plc and Sky International AG and are used under licence. British Sky Broadcasting Limited (Registration No. 2906991), Sky-In-Home Service Limited (Registration No. 2067075) and Sky Subscribers Services Limited (Registration No. 2340150) are direct or indirect subsidiaries of British Sky Broadcasting Group plc (Registration No. 2247735). All of the companies mentioned in this paragraph are incorporated in England and Wales and share the same registered office at Grant Way, Isleworth, Middlesex TW7 5QD.
Re: Cassandra 2.0.11 with stargate-core read writes are slow
Hi Asit,

The only help I'm going to give is on point 3), as I have little experience with 2), and 1) depends on a lot of factors. For testing the workload use this:
http://www.datastax.com/documentation/cassandra/2.1/cassandra/tools/toolsCStress_t.html

It probably covers all your testing needs.

Regards,
Carlos Juzarte Rolo

On Sat, Jan 31, 2015 at 2:49 AM, Asit KAUSHIK asitkaushikno...@gmail.com wrote:

Hi all,

We are testing our logging application on a 3-node cluster; each node is a virtual machine with 4 cores and 8GB RAM running RedHat Enterprise. My question is in 3 parts:

1) Am I using the right hardware? As of now I am testing, say, 10-record reads.
2) I am using Stargate-core for full-text search; is any slowness expected because of that?
3) How can I simulate the write load? I created an application which spawns, say, 20 threads, and in each thread I insert 1000 records; for each thread I open a cluster connection and session, execute the 1000 inserts, and close the connection. This takes a lot of time. Please suggest if I am missing something.
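On point 3), one likely contributor to the slowness is opening and closing a cluster connection per thread: driver Cluster/Session objects are generally designed to be created once and shared by all threads. A sketch of the shared-session shape, with the session stubbed out so the example runs stand-alone (the class and statement below are hypothetical, not real driver API):

```python
import threading

class FakeSession:
    """Stub standing in for a single, shared driver session object."""
    def __init__(self):
        self._lock = threading.Lock()
        self.executed = 0

    def execute(self, statement, params=None):
        # A real session would send the statement to the cluster here.
        with self._lock:
            self.executed += 1

def writer(session, n_records):
    for i in range(n_records):
        session.execute("INSERT INTO logs (id, msg) VALUES (%s, %s)", (i, "m"))

# Create the session ONCE and share it across all 20 writer threads,
# instead of opening/closing a connection inside each thread.
session = FakeSession()
threads = [threading.Thread(target=writer, args=(session, 1000)) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With a real driver the same structure applies: connection setup is expensive, so amortize it over the whole run rather than paying it per thread.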
Re: Upgrading from Cassandra 1.2.14 to Cassandra 2.10
Sorry, my bad. I misread it as 1.2.4 for some reason! You should be able to do it with the two-step upgrade then.

Don't forget that once you upgrade to 2.0.x you need to perform a nodetool upgradesstables, and this is where, in my view, things can get tricky. First make sure you have the space available (it will probably double your space usage, since both new and old SSTables need to coexist for a while; it depends on whether the process runs in parallel or not, which I don't know). And for safety's sake I would do it one node at a time, so if things go bad you only have one node down. Once this process is done, the upgrade to 2.1.x should be simple.

Make sure the new machines joining are 2.0.x, then upgrade the whole cluster. Maybe you can even go straight from your version to 2.1.x since you are already network compatible, but 2.1.x brings new features (the new counters) that could prove problematic if (I assume this is the case) you will keep your cluster running. Some oversimplified documentation can also be found here:
http://www.datastax.com/documentation/upgrade/doc/upgrade/cassandra/upgradeCassandraDetails.html

I have had my fair share of bad upgrades with Cassandra, and also a couple of good ones. The approach I used last time was the safest I ever did but, for sure, the most time-consuming of them all.

Regards,
Carlos Juzarte Rolo

On Thu, Jan 29, 2015 at 4:52 PM, Sibbald, Charles charles.sibb...@bskyb.com wrote:

Hi Carlos,

We are running 1.2.14, which is higher than 1.2.9, so I assume the network layer is compatible -- or do you specifically mean that 2.1.0 is not compatible with 1.2.x?
Regards,
Charles

From: Carlos Rolo r...@pythian.com
Date: Thursday, 29 January 2015 14:47
To: user@cassandra.apache.org
Subject: Re: Upgrading from Cassandra 1.2.14 to Cassandra 2.10

Hello Charles,

I think you have to do a two-step upgrade, given the SSTable and network incompatibilities between versions. You have to upgrade to 2.0 and then to 2.1. According to this
http://www.datastax.com/documentation/upgrade/doc/upgrade/cassandra/upgradeC_c.html
you should even do an upgrade to 1.2.9 first.

For the major upgrade I did from 1.2.x to 2.0.x, I did it by putting up a new ring with the 2.0 machines, having clients write to both, and then moving the old machines from one ring to the other as if they were brand-new machines. The historical data was imported in batches.

Regards,
Carlos Juzarte Rolo

On Thu, Jan 29, 2015 at 3:15 PM, Sibbald, Charles charles.sibb...@bskyb.com wrote:

Hi All,

I am looking into the possibility of upgrading from Cassandra 1.2.14 to Cassandra 2.1 in the following manner. I have a large Cassandra cluster with dozens of nodes, and would like to build new instances at version 2.1 to join the cluster; once they have successfully joined the ring, these should then stream data in. Once they have fully joined the cluster I would like to decommission a single Cassandra 1.2.14 instance, and repeat.

Because our 2.1 installations have a different directory layout, we would like to go with this 'streaming' option for the upgrade rather than an in-place upgrade. Does anyone foresee any issues with this? Thanks in advance.

Regards,
Charles
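The one-node-at-a-time upgradesstables pass recommended in this thread is worth scripting as a dry-run plan before touching anything. A hypothetical sketch that just emits the ordered per-node steps; a real script would execute each step (e.g. via subprocess over ssh) and verify the node is back Up/Normal before moving to the next one. The host names and the exact restart step are made up:

```python
def rolling_upgrade_plan(nodes, target="2.0.x"):
    """Build an ordered command plan: drain, upgrade and upgradesstables one node at a time."""
    plan = []
    for node in nodes:
        plan.append("%s: nodetool drain" % node)
        plan.append("%s: install Cassandra %s and restart" % (node, target))
        plan.append("%s: nodetool upgradesstables" % node)
        plan.append("%s: verify node is Up/Normal before continuing" % node)
    return plan

plan = rolling_upgrade_plan(["cass-01", "cass-02", "cass-03"])
```

Keeping the pass strictly sequential matches the advice above: at any point in time at most one node is at risk, and free disk space is only needed for one node's SSTable rewrite at a time.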
Added nodes to cluster, authentication stopped working
Hi all,

I have a Cassandra cluster running, and we recently duplicated the cluster. After following all the steps, the Cassandra clients started failing with the following message:

AuthenticationException(why='Username and/or password are incorrect')

The problem is that I can't even log in to the CQL shell to check the users, since it says (even with the correct username/password) that 'You have to be logged in and not anonymous to perform this request'. When I created the cluster I disabled the 'cassandra' superuser, and now I can't do anything on my cluster.

Is there any method to reset a user and/or password, or to create a new superuser? Otherwise I need to drop all the data from the cluster, since even with authentication and authorization disabled my clients give errors writing data.
Re: How many BATCH inserts in to many?
Hello,

I have managed to insert up to 63k records without any problem. In certain workloads I found that massive batch inserts perform way better than lots of not-so-massive inserts. I guess it also depends on your setup. Just try it.

Alan Ristić alan.ris...@gmail.com wrote:

Hi,

I'm implementing Facebook-style notifications/activities (e.g. "your friend liked this article") in our app. I considered a message queue for this task before exploring how BATCH insert in C* could perform. Could I drop the queue altogether and just use batch inserts? Are, say, 5000 inserts too much load for this kind of work? And what limits the number of inserts?

Oh, and the data model in C* is straightforward: "give me all notifications for user A".

Lp,
Alan Ristić
m: 040 423 688
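The "it depends on your setup" caveat can be made a bit more concrete: a very large batch puts all its pressure on one coordinator, so a common approach is to keep batches modest and group the statements by partition key before batching. A hypothetical sketch of just the grouping step (the actual BatchStatement construction and execution against a cluster are left out):

```python
from collections import defaultdict

def batches_by_partition(inserts, max_batch_size=100):
    """Group (user_id, notification) pairs by partition key, splitting oversized groups."""
    grouped = defaultdict(list)
    for user_id, notification in inserts:
        grouped[user_id].append(notification)
    batches = []
    for user_id, items in grouped.items():
        # Split any partition's inserts into batches of at most max_batch_size.
        for i in range(0, len(items), max_batch_size):
            batches.append((user_id, items[i:i + max_batch_size]))
    return batches

# 250 notifications for userA and one for userB (made-up data).
inserts = [("userA", "n%d" % i) for i in range(250)] + [("userB", "x")]
batches = batches_by_partition(inserts)
```

Grouping by partition key means each batch lands on one replica set, which is the case where a batch actually saves work instead of just shifting it onto the coordinator.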
RE: cassandra most stable version ?
Hi Pierre,

Using 1.0.2 without any problem so far. 0.8.x had problems for us; never tried 0.8.7 or later, though.

Carlos Rolo

From: Karsten Pappert [mailto:kars...@pappert.de]
Sent: Wednesday, 7 December 2011 16:54
To: user@cassandra.apache.org; pie...@chalamet.net
Subject: RE: cassandra most stable version ?

Hi Pierre,

We started some tests with 1.0.x and had a lot of trouble. Now we are running 0.8.7 and have had no failures and good performance.

-Karsten

From: Pierre Chalamet [mailto:pie...@chalamet.net]
Sent: Wednesday, 7 December 2011 16:37
To: user@cassandra.apache.org
Subject: Re: cassandra most stable version ?

Thanks. Anyone else to share their production version and some feedback?

- Pierre

From: Jahangir Mohammed md.jahangi...@gmail.com
Date: Tue, 6 Dec 2011 17:36:37 -0500
To: user@cassandra.apache.org
Subject: Re: cassandra most stable version ?

We are running 0.8.7. No big issues so far.

Thanks,
Jahangir

On Tue, Dec 6, 2011 at 5:05 PM, Pierre Chalamet pie...@chalamet.net wrote:

Hello,

Recent problems with Cassandra 1.0.x versions seem to suggest it is still not ready for prime time. We are currently using version 0.8.5 on our development cluster; although we have not seen many problems with this one, maybe recent versions of 0.8.x might be safer to use.

So what version are you running in production? What kinds of problems do you encounter, if any?

Thanks,
- Pierre
RE: Client Timeouts on incrementing counters
I have dug a bit more to try to find the root cause of the error, and I have some more information. It seems that it all started after I upgraded Cassandra from 0.8.x to 1.0.0. When I do an incr on the CLI I also get a timeout. row_cache_save_period_in_seconds is set to 60 sec. Could this be a problem from the upgrade? I just did a rolling restart of all nodes, one by one.

From: Tyler Hobbs [mailto:ty...@datastax.com]
Sent: Friday, 11 November 2011 20:18
To: user@cassandra.apache.org
Subject: Re: Client Timeouts on incrementing counters

On Fri, Nov 11, 2011 at 7:17 AM, Carlos Rolo c.r...@ocom.com wrote:

Also, the Cassandra logs now have lots (as in, several times per second) of this message:

INFO 14:15:25,740 Saved ClusterCassandra-CounterFamily-RowCache (52 items) in 1 ms

What does the CLI say the row_cache_save_period_in_seconds for this CF is?

--
Tyler Hobbs
DataStax http://datastax.com/
Client Timeouts on incrementing counters
Hi,

I was having lots of problems with Cassandra 0.8.x running OOM. After moving to Cassandra 1.0.x the OOMs just disappeared, but now my python client is having trouble incrementing counters: two thirds of the time it tries to increment a counter, it gets a Timeout exception. Incrementing on the CLI, I get a null response! I didn't have this problem with 0.8.x.

Also, the Cassandra logs now have lots (as in, several times per second) of this message:

INFO 14:15:25,740 Saved ClusterCassandra-CounterFamily-RowCache (52 items) in 1 ms

Replacing and adding new columns/rows is functioning perfectly. I'm banging my head against the wall trying to figure out where I can tune Cassandra to get rid of this error!

Thx,
Carlos Rolo