RE: [External][RELEASE] Apache Cassandra 5.0.1 released

2024-10-01 Thread Jiri Steuer (EIT)
Hi all, only dummy question, sorry for that. Is it enough to open only issue (e.g. this [CASSANDRA-19971] CRC mismatch - ASF JIRA (apache.org)) or it is useful/necessary to switch the issue to a specific state or assign to relevant role/pe

Re: [EXTERNAL] Cassandra 3.11 - below normal disk read after restart

2024-09-06 Thread Const Eust
RF 6??? Well, the traffic is routing to the other 6 nodes, which likely can serve the traffic with the super high RF, and the newly restarted node not seeing the traffic until gossip settles? On Fri, Sep 6, 2024, 2:54 PM Jeff Jirsa wrote: > The unfortunate reality here is I don’t think anyone i

Re: [EXTERNAL] Cassandra 3.11 - below normal disk read after restart

2024-09-06 Thread Jeff Jirsa
The unfortunate reality here is I don’t think anyone is going to be able to answer with the data provided. Are the disk IOPS from cassandra reads? Or compaction? Or repair? Do they ramp with client reads (is that curve matching your customer traffic?)? Are they from client data reads or from in

Re: [EXTERNAL] Re: About Cassandra stable version having Java 17 support

2024-03-18 Thread Bowen Song via user
Short answer: There's no definite answer to that question. Longer answer: I doubt such date has already been decided. It's largely driven by the time required to fix known issues and any potential new issues discovered during the BETA and RC process. If you want to track the progress, feel

Re: [EXTERNAL] Re: About Cassandra stable version having Java 17 support

2024-03-18 Thread Divyanshi Kaushik via user
Thanks for your reply. As Cassandra has moved to Java 17 in its 5.0-BETA1 (Latest release on 2023-12-05). Can you please let us know when the team is planning to GA Cassandra 5.0 version which has Java 17 support? Regards, Divyanshi From: Bowen Song via user S

RE: [EXTERNAL] Re: Running and Managing Large Cassandra Clusters

2020-10-29 Thread Gediminas Blazys
Hey, @Tom I just wanted to clarify when I mention workload in the last email I meant it as the amount of requests the cluster has to serve. Gediminas From: Tom van der Woerdt Sent: Wednesday, October 28, 2020 14:35 To: user Subject: [EXTERNAL] Re: Running and Managing Large Cassandra Clusters

RE: [EXTERNAL] Re: Running and Managing Large Cassandra Clusters

2020-10-28 Thread Gediminas Blazys
Hey, Thanks chipping in Tomas. Could you describe what sort of workload is the big cluster receiving in terms of local C* reads, writes and client requests as well? You mention repairs, how do you run them? Gediminas From: Tom van der Woerdt Sent: Wednesday, October 28, 2020 14:35 To: user

Re: [EXTERNAL] Re: Adding new DC results in clients failing to connect

2020-10-22 Thread João Reis
; which is slightly undesirable. As we could not afford to keep the DC in > this state we have removed it from our cluster. I’m afraid we cannot > provide you with the info you’ve requested. > > > > Gediminas > > > > *From:* João Reis > *Sent:* Tuesday, May 12, 2020

RE: [EXTERNAL] Re: Adding new DC results in clients failing to connect

2020-05-17 Thread Gediminas Blazys
cluster. I’m afraid we cannot provide you with the info you’ve requested. Gediminas From: João Reis Sent: Tuesday, May 12, 2020 19:58 To: user@cassandra.apache.org Subject: Re: [EXTERNAL] Re: Adding new DC results in clients failing to connect Unfortunately I'm not able to reproduce this.

Re: [EXTERNAL] Re: Adding new DC results in clients failing to connect

2020-05-12 Thread João Reis
ra having to place two replicas on the same rack > maybe placed both the primary and a backup replica on the same node. Hence > a duplicate... > > > > Gediminas > > > > *From:* João Reis > *Sent:* Thursday, May 7, 2020 19:22 > *To:* user@cassandra.apache.org >

RE: [EXTERNAL] Re: Adding new DC results in clients failing to connect

2020-05-08 Thread Gediminas Blazys
t's a theoretical but cassandra having to place two replicas on the same rack maybe placed both the primary and a backup replica on the same node. Hence a duplicate... Gediminas From: João Reis Sent: Thursday, May 7, 2020 19:22 To: user@cassandra.apache.org Subject: Re: [EXTERNAL] Re: Add

Re: [EXTERNAL] Re: Adding new DC results in clients failing to connect

2020-05-07 Thread João Reis
> > > > null > > null > > 192.168.104.111 > > null > > null > > null > > null > > null > > > > Have a wonderful day 😊 > > > > Gediminas > > > >

RE: [EXTERNAL] Re: Adding new DC results in clients failing to connect

2020-05-04 Thread Gediminas Blazys
@cassandra.apache.org Subject: RE: [EXTERNAL] Re: Adding new DC results in clients failing to connect Hello, Thanks for the reply. Following your advice we took a look at system.local for seed nodes and compared that data with nodetool ring. Both sources contain the same tokens for these specific hosts. Will

RE: [EXTERNAL] Re: Adding new DC results in clients failing to connect

2020-05-04 Thread Gediminas Blazys
Hello, Thanks for the reply. Following your advice we took a look at system.local for seed nodes and compared that data with nodetool ring. Both sources contain the same tokens for these specific hosts. Will continue looking into system.peers. We have enabled more verbosity on the C# driver an

RE: [EXTERNAL] Re: Performance of Data Types used for Primary keys

2020-03-06 Thread Durity, Sean R
I agree. Cassandra already hashes the partition key to a numeric token. Sean Durity From: Jon Haddad Sent: Friday, March 6, 2020 9:29 AM To: user@cassandra.apache.org Subject: [EXTERNAL] Re: Performance of Data Types used for Primary keys It's not going to matter at all. On Fri, Mar 6, 2020, 2

Re: [EXTERNAL] Cassandra 3.11.X upgrades

2020-03-05 Thread Hossein Ghiyasi Mehr
There isn't any Rollback in real time systems. It's better to test upgrade sstables and binary on one node. Then - if it was OK, upgrade binary on all nodes then run upgrade sstables one server at a time. OR - If it was OK, upgrade servers (binary+sstables) one by one. *---

RE: [EXTERNAL] Cassandra 3.11.X upgrades

2020-03-04 Thread Durity, Sean R
: Erick Ramirez Sent: Tuesday, March 3, 2020 11:35 PM To: user@cassandra.apache.org Subject: Re: [EXTERNAL] Cassandra 3.11.X upgrades Should upgradesstables not be run after every node is upgraded? If we need to rollback then we will not be able to downgrade sstables to older version You can

Re: [EXTERNAL] Cassandra 3.11.X upgrades

2020-03-03 Thread Erick Ramirez
> > Should upgradesstables not be run after every node is upgraded? If we need > to rollback then we will not be able to downgrade sstables to older version > You can choose to (a) upgrade the SSTables one node at a time as you complete the binary upgrade, or (b) upgrade the binaries on all nodes

Re: [EXTERNAL] Cassandra 3.11.X upgrades

2020-03-03 Thread Anthony Grasso
Manish is correct. Upgrade the Cassandra version of a single node only. If that node is behaving as expected (i.e. is in an Up/Normal state and no errors in the logs), then upgrade the Cassandra version for each node one at a time. Be sure to check that each node is running as expected. Once the C

Re: [EXTERNAL] Cassandra 3.11.X upgrades

2020-03-03 Thread manish khandelwal
Should upgradesstables not be run after every node is upgraded? If we need to rollback then we will not be able to downgrade sstables to older version. Regards Manish On Tue, Mar 3, 2020 at 11:26 PM Hossein Ghiyasi Mehr wrote: > It's more safe to upgrade one node before upgrading another node

Re: [EXTERNAL] Cassandra 3.11.X upgrades

2020-03-03 Thread Hossein Ghiyasi Mehr
It's more safe to upgrade one node before upgrading another node to avoid down time. After upgrading binary and package, run upgradesstables on candidate node then do it on all cluster nodes one by one. *---* *VafaTech :

RE: [EXTERNAL] Re: IN OPERATOR VS BATCH QUERY

2020-02-21 Thread Durity, Sean R
Batches are for atomicity, not performance. I would do single deletes with a prepared statement. An IN clause causes extra work for the coordinator because multiple partitions are being impacted. So, the coordinator has to coordinate all nodes involved in those writes (up to the whole cluster).
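
The "single deletes with a prepared statement" approach could look roughly like the sketch below. It assumes a DataStax Python driver style session object (`prepare` and `execute`); the keyspace, table and column names are hypothetical and not taken from the thread.

```python
from typing import Any, Iterable


def delete_partitions(session: Any, keys: Iterable[str]) -> int:
    """Issue one prepared DELETE per partition key instead of a single IN query."""
    # Hypothetical keyspace/table/column names, for illustration only.
    prepared = session.prepare("DELETE FROM my_ks.my_table WHERE id = ?")
    count = 0
    for key in keys:
        # Each statement touches exactly one partition, so the coordinator
        # does not have to fan out to every replica set named in an IN list.
        session.execute(prepared, (key,))
        count += 1
    return count
```

The statement is prepared once and reused, so only the bound key changes between executions.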

RE: [EXTERNAL] Re: Null values in sasi indexed column

2020-02-21 Thread Durity, Sean R
I would consider building a lookup table instead. Something like: CREATE TABLE new_lookup ( new_lookup_partition text, existing_key text, PRIMARY KEY (new_lookup_partition) ) For me, these are easier to understand and reason through for Cassandra performance and availability. I would use

Re: [EXTERNAL] Cassandra 3.11.X upgrades

2020-02-13 Thread Sergio
- Verify that nodetool upgradesstables has completed successfully on all nodes from any previous upgrade - Turn off repairs and any other streaming operations (add/remove nodes) - Nodetool drain on the node that needs to be stopped (seeds first, preferably) - Stop an un-upgraded n

Re: [EXTERNAL] Re: Cassandra Encyrption between DC

2020-02-13 Thread Jai Bheemsen Rao Dhanwada
thank you On Thu, Feb 13, 2020 at 6:30 AM Durity, Sean R wrote: > I will just add-on that I usually reserve security changes as the primary > exception where app downtime may be necessary with Cassandra. (DSE has some > Transitional tools that are useful, though.) Sometimes a short outage is > p

RE: [EXTERNAL] Cassandra 3.11.X upgrades

2020-02-13 Thread Durity, Sean R
+1 on nodetool drain. I added that to our upgrade automation and it really helps with post-upgrade start-up time. Sean Durity From: Erick Ramirez Sent: Wednesday, February 12, 2020 10:29 PM To: user@cassandra.apache.org Subject: Re: [EXTERNAL] Cassandra 3.11.X upgrades Yes to the steps. The

RE: [EXTERNAL] Re: Cassandra Encyrption between DC

2020-02-13 Thread Durity, Sean R
I will just add-on that I usually reserve security changes as the primary exception where app downtime may be necessary with Cassandra. (DSE has some Transitional tools that are useful, though.) Sometimes a short outage is preferred over a longer, more-complicated attempt to keep the app up. And

Re: [EXTERNAL] Cassandra 3.11.X upgrades

2020-02-12 Thread Erick Ramirez
Yes to the steps. The only thing I would add is to run a nodetool drain before shutting C* down so all mutations are flushed to SSTables and there won't be any commit logs to replay on startup. Also, the usual "backup your cluster and configuration files" boilerplate applies. 😁 >
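
A minimal sketch of the "drain, then stop" ordering described above. The systemd unit name `cassandra` and the use of `sudo systemctl` are assumptions; package and tarball installs may stop the process differently.

```python
from typing import List


def stop_node_commands(service_name: str = "cassandra") -> List[List[str]]:
    """Return the commands to run, in order, to shut one node down cleanly."""
    return [
        # Flush memtables to SSTables so there is no commit log to replay.
        ["nodetool", "drain"],
        # Assumed systemd unit name; adjust for your installation.
        ["sudo", "systemctl", "stop", service_name],
    ]
```

Each command list can be passed to `subprocess.run(cmd, check=True)` so that a failed drain aborts the shutdown instead of being silently skipped.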

Re: [EXTERNAL] Cassandra 3.11.X upgrades

2020-02-12 Thread Sergio
Should I follow the steps above right? Thanks Erick! On Wed, Feb 12, 2020, 6:58 PM Erick Ramirez wrote: > In case you have an hybrid situation with 3.11.3 , 3.11.4 and 3.11.5 that >> it is working and it is in production what do you recommend? > > > You shouldn't end up in this mixed-version sit

Re: [EXTERNAL] Cassandra 3.11.X upgrades

2020-02-12 Thread Erick Ramirez
> > In case you have an hybrid situation with 3.11.3 , 3.11.4 and 3.11.5 that > it is working and it is in production what do you recommend? You shouldn't end up in this mixed-version situation at all. I would highly recommend you upgrade all the nodes to 3.11.5 or whatever the latest version is

Re: [EXTERNAL] Cassandra 3.11.X upgrades

2020-02-12 Thread Sergio
Thanks everyone! In case you have an hybrid situation with 3.11.3 , 3.11.4 and 3.11.5 that it is working and it is in production what do you recommend? On Wed, Feb 12, 2020, 5:55 PM Erick Ramirez wrote: > So unless the sstable format has not been changed I can avoid to do that. > > > Just to

Re: [EXTERNAL] Cassandra 3.11.X upgrades

2020-02-12 Thread Erick Ramirez
> > So unless the sstable format has not been changed I can avoid to do that. Just to reinforce what Jon and Sean already said, the above assumption is dangerous. It is always best to follow the recommended upgrade procedure and mixed-versions is never a good idea unless you've received instructi

RE: [EXTERNAL] Cassandra 3.11.X upgrades

2020-02-12 Thread Durity, Sean R
Ah - I should have looked it up! Thank you for fixing my mistake. Sean Durity -Original Message- From: Michael Shuler Sent: Wednesday, February 12, 2020 3:17 PM To: user@cassandra.apache.org Subject: Re: [EXTERNAL] Cassandra 3.11.X upgrades On 2/12/20 12:58 PM, Durity, Sean R wrote

Re: [EXTERNAL] Cassandra 3.11.X upgrades

2020-02-12 Thread Reid Pinchback
e.org" Subject: Re: [EXTERNAL] Cassandra 3.11.X upgrades Message from External Sender Thanks, everyone! @Jon https://lists.apache.org/thread.html/rd18814bfba487824ca95a58191f4dcdb86f15c9bb66cf2bcc29ddf0b%40%3Cuser.cassandra.apache.org%3E<https://urldefense.proofp

Re: [EXTERNAL] Cassandra 3.11.X upgrades

2020-02-12 Thread Michael Shuler
On 2/12/20 12:58 PM, Durity, Sean R wrote: Check the readme.txt for any upgrade notes Just a quick correction: NEWS.txt (upgrade (and other important) notes) CHANGES.txt (changelog with JIRAs) This is why we list links to these two files in the release announcements. -- Kind regards, Michael

Re: [EXTERNAL] Cassandra 3.11.X upgrades

2020-02-12 Thread Sergio
Thanks, everyone! @Jon https://lists.apache.org/thread.html/rd18814bfba487824ca95a58191f4dcdb86f15c9bb66cf2bcc29ddf0b%40%3Cuser.cassandra.apache.org%3E I have a side response to something that looks to be controversial with the response from Anthony. So is it safe to go to production in a 1TB clus

RE: [EXTERNAL] Cassandra 3.11.X upgrades

2020-02-12 Thread Durity, Sean R
>>A while ago, on my first cluster Understatement used so effectively. Jon is a master. On Wed, Feb 12, 2020 at 11:02 AM Sergio mailto:lapostadiser...@gmail.com>> wrote: Thanks for your reply! So unless the sstable format has not been changed I can avoid to do that. Correct? Best, Sergio

Re: [EXTERNAL] Cassandra 3.11.X upgrades

2020-02-12 Thread Jon Haddad
A while ago, on my first cluster, I decided to do an upgrade by adding nodes running 1.2 to an existing cluster running version 1.1. This was a bad decision, and at that point I decided to always play it safe and always stick to a single version, and never bootstrap in a node running different ver

Re: [EXTERNAL] Cassandra 3.11.X upgrades

2020-02-12 Thread Sergio
Thanks for your reply! So unless the sstable format has not been changed I can avoid to do that. Correct? Best, Sergio On Wed, Feb 12, 2020, 10:58 AM Durity, Sean R wrote: > Check the readme.txt for any upgrade notes, but the basic procedure is to: > >- Verify that nodetool upgradesstabl

RE: [EXTERNAL] Cassandra 3.11.X upgrades

2020-02-12 Thread Durity, Sean R
Check the readme.txt for any upgrade notes, but the basic procedure is to: * Verify that nodetool upgradesstables has completed successfully on all nodes from any previous upgrade * Turn off repairs and any other streaming operations (add/remove nodes) * Stop an un-upgraded node (seed

Re: [EXTERNAL] How to reduce vnodes without downtime

2020-02-11 Thread Erick Ramirez
> > I am seeing some unbalancing and I was worried because I have 256 vnodes > Weird stuff is related to this post where I don't find a match between the > load and du -sh * for the node 10.1.31.60 and I was trying to figure out > the reason, if it was due to the number of vnodes. Out of curiosit

Re: [EXTERNAL] How to reduce vnodes without downtime

2020-02-11 Thread Sergio
e current "default recommendation" is 32 tokens. But there's a >>>> push for 4 in combination with allocate_tokens_for_keyspace from Jon >>>> Haddad & co (based on a paper from Joe Lynch & Josh Snyder). >>>> >>>> If you're satisfie

Re: [EXTERNAL] Re: Running select against cassandra

2020-02-06 Thread Reid Pinchback
performance hit, and later drag on memory on I/O, than a query model that makes you browse through more data than necessary. From: "Durity, Sean R" Reply-To: "user@cassandra.apache.org" Date: Thursday, February 6, 2020 at 4:24 PM To: "user@cassandra.apache.org"

RE: [EXTERNAL] Re: Running select against cassandra

2020-02-06 Thread Durity, Sean R
:10 PM To: user@cassandra.apache.org Subject: Re: [EXTERNAL] Re: Running select against cassandra Abdul, When in doubt, have a query model that immediately feeds you exactly what you are looking for. That’s kind of the data model philosophy that you want to shoot for as much as feasible with C

RE: [EXTERNAL] Re: Running select against cassandra

2020-02-06 Thread Durity, Sean R
From reports on this mailing list, I do not allow materialized views. Sean Durity From: Reid Pinchback Sent: Thursday, February 6, 2020 4:10 PM To: user@cassandra.apache.org Subject: Re: [EXTERNAL] Re: Running select against cassandra Abdul, When in doubt, have a query model that immediately

Re: [EXTERNAL] Re: Running select against cassandra

2020-02-06 Thread Reid Pinchback
. You might want a composite partition key for having an efficient selection of narrow time ranges. From: Abdul Patel Reply-To: "user@cassandra.apache.org" Date: Thursday, February 6, 2020 at 2:42 PM To: "user@cassandra.apache.org" Subject: Re: [EXTERNAL] Re: Running sele

Re: [EXTERNAL] Re: Running select against cassandra

2020-02-06 Thread Abdul Patel
this is the schema similar to what we have , they want to get user connected - concurrent count for every say 1-5 minutes. i am thinking will simple select will have performance issue or we can go for materialized views ? CREATE TABLE usr_session ( userid bigint, session_usr text,

RE: [EXTERNAL] Re: Running select against cassandra

2020-02-06 Thread Durity, Sean R
Do you only need the current count or do you want to keep the historical counts also? By active users, does that mean some kind of user that the application tracks (as opposed to the Cassandra user connected to the cluster)? I would consider a table like this for tracking active users through ti

Re: [EXTERNAL] How to reduce vnodes without downtime

2020-02-03 Thread Sergio
u must test, test, TEST! Cheers! >>> >>> On Sat, Feb 1, 2020 at 5:17 AM Arvinder Dhillon >>> wrote: >>> >>>> What is recommended vnodes now? I read 8 in later cassandra 3.x >>>> Is the new recommendation 4 now even in version 3.x (asking for 3.11)? >

Re: [EXTERNAL] How to reduce vnodes without downtime

2020-02-03 Thread Sergio
Thanks Erick! Best, Sergio On Sun, Feb 2, 2020, 10:07 PM Erick Ramirez wrote: > If you are after more details about the trade-offs between different sized >> token values, please see the discussion on the dev mailing list: "[Discuss] >> num_tokens default in Cassandra 4.0 >>

Re: [EXTERNAL] How to reduce vnodes without downtime

2020-02-03 Thread Maxim Parkachov
Hi guys, thanks a lot for useful tips. I obviously underestimated complexity of such change. Thanks again, Maxim. >

Re: [EXTERNAL] How to reduce vnodes without downtime

2020-02-02 Thread Erick Ramirez
> > If you are after more details about the trade-offs between different sized > token values, please see the discussion on the dev mailing list: "[Discuss] > num_tokens default in Cassandra 4.0 >

Re: [EXTERNAL] How to reduce vnodes without downtime

2020-02-02 Thread Sergio
AM Arvinder Dhillon >>> wrote: >>> >>>> What is recommended vnodes now? I read 8 in later cassandra 3.x >>>> Is the new recommendation 4 now even in version 3.x (asking for 3.11)? >>>> Thanks >>>> >>

Re: [EXTERNAL] How to reduce vnodes without downtime

2020-02-02 Thread Anthony Grasso
ead 8 in later cassandra 3.x >>> Is the new recommendation 4 now even in version 3.x (asking for 3.11)? >>> Thanks >>> >>> On Fri, Jan 31, 2020 at 9:49 AM Durity, Sean R < >>> sean_r_dur...@homedepot.com> wrote: >>> >>>>

Re: [EXTERNAL] How to reduce vnodes without downtime

2020-01-31 Thread Sergio
tions and expansions. >>> >>> >>> >>> Sean Durity >>> >>> >>> >>> *From:* Anthony Grasso >>> *Sent:* Thursday, January 30, 2020 7:25 PM >>> *To:* user >>> *Subject:* Re: [EXTERNAL] How to reduce vno

Re: [EXTERNAL] How to reduce vnodes without downtime

2020-01-31 Thread Erick Ramirez
dation 4 now even in version 3.x (asking for 3.11)? > Thanks > > On Fri, Jan 31, 2020 at 9:49 AM Durity, Sean R < > sean_r_dur...@homedepot.com> wrote: > >> These are good clarifications and expansions. >> >> >> >> Sean Durity >> >>

Re: [EXTERNAL] How to reduce vnodes without downtime

2020-01-31 Thread Arvinder Dhillon
> *From:* Anthony Grasso > *Sent:* Thursday, January 30, 2020 7:25 PM > *To:* user > *Subject:* Re: [EXTERNAL] How to reduce vnodes without downtime > > > > Hi Maxim, > > > > Basically what Sean suggested is the way to do this without downtime. > > >

RE: [EXTERNAL] How to reduce vnodes without downtime

2020-01-31 Thread Durity, Sean R
These are good clarifications and expansions. Sean Durity From: Anthony Grasso Sent: Thursday, January 30, 2020 7:25 PM To: user Subject: Re: [EXTERNAL] How to reduce vnodes without downtime Hi Maxim, Basically what Sean suggested is the way to do this without downtime. To clarify, the

Re: [EXTERNAL] How to reduce vnodes without downtime

2020-01-30 Thread Anthony Grasso
Hi Maxim, Basically what Sean suggested is the way to do this without downtime. To clarify, the *three* steps following the "Decommission each node in the DC you are working on" step should be applied to *only* the decommissioned nodes. So where it says "*all nodes*" or "*every node*" it appli

RE: [EXTERNAL] How to reduce vnodes without downtime

2020-01-30 Thread Durity, Sean R
Your procedure won’t work very well. On the first node, if you switched to 4, you would end up with only a tiny fraction of the data (because the other nodes would still be at 256). I updated a large cluster (over 150 nodes – 2 DCs) to smaller number of vnodes. The basic outline was this: *

Re: [EXTERNAL] Re: sstableloader & num_tokens change

2020-01-27 Thread Voytek Jarnot
Odd. Have you seen this behavior? I ran a test last week, loaded snapshots from 4 nodes to 4 nodes (RF 3 on both ends) and did not notice a spike. That's not to say that it didn't happen, but I think I'd have noticed as I was loading approx 250GB x 4 (although sequentially rather than 4x sstableloa

RE: [EXTERNAL] Re: sstableloader & num_tokens change

2020-01-27 Thread Durity, Sean R
I would suggest to be aware of potential data size expansion. If you load (for example) three copies of the data into a new cluster (because the RF of the origin cluster is 3), it will also get written to the RF of the new cluster (3 more times). So, you could see data expansion of 9x the origin
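
The expansion arithmetic above can be written out as a small worked example. The function name and the 100 GB figure are illustrative assumptions; the 3 x 3 = 9 factor is the one given in the message.

```python
def loaded_size_gb(unique_data_gb: float, source_rf: int, target_rf: int) -> float:
    """Estimate on-disk size right after loading every source replica's SSTables."""
    # Each of the source_rf copies is streamed in, and the target cluster
    # writes every streamed copy target_rf times.
    return unique_data_gb * source_rf * target_rf


# 100 GB of unique data, RF 3 on both ends.
print(loaded_size_gb(100, 3, 3))  # 900
```

Loading snapshots from only as many source nodes as are needed to cover the data once would lower `source_rf` in this estimate.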

Re: [EXTERNAL] Re: COPY command with where condition

2020-01-20 Thread Jean Carlo
Hello Nobody has mentioned but you can use spark cassandra connector also. Preferably if your data set is so big that a simple copy to csv cannot handle it Saludos Jean Carlo "The best way to predict the future is to invent it" Alan Kay On Fri, Jan 17, 2020 at 8:11 PM Durity, Sean R wrote:

Re: [EXTERNAL] Re: *URGENT* Migration across different Cassandra cluster few having same keyspace/table names

2020-01-17 Thread Dor Laor
Another option instead of raw sstables is to use the Spark Migrator [1]. It reads a source cluster, can make some transformations (like table/column naming) and writes to a target cluster. It's a very convenient tool, OSS and free of charge. [1] https://github.com/scylladb/scylla-migrator On Fri,

Re: [EXTERNAL] Re: *URGENT* Migration across different Cassandra cluster few having same keyspace/table names

2020-01-17 Thread Erick Ramirez
> > > *In terms of speed, the sstableloader should be faster correct?Maybe the > DSE BulkLoader finds application when you want a slice of the data and not > the entire cake. Is it correct?* There's no real direct comparison because DSBulk is designed for operating on data in CSV or JSON as a rep

Re: [EXTERNAL] Re: *URGENT* Migration across different Cassandra cluster few having same keyspace/table names

2020-01-17 Thread Sergio
Hi everyone, Is the DSE BulkLoader faster than the sstableloader? Sometimes I need to make a cluster snapshot and replicate a Cluster A to a Cluster B with fewer performance capabilities but the same data size. In terms of speed, the sstableloader should be faster correct? Maybe the DSE BulkLo

RE: [EXTERNAL] Re: COPY command with where condition

2020-01-17 Thread Durity, Sean R
sstablekeys (in the tools directory?) can extract the actual keys from your sstables. You have to run it on each node and then combine and de-dupe the final results, but I have used this technique with a query generator to extract data more efficiently. Sean Durity From: Chris Splinter Sent:
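
A rough sketch of the "combine and de-dupe" step, assuming the `sstablekeys` output from each node has already been collected as text with one key per line. How the output is gathered from the nodes is left out.

```python
from typing import Iterable, List


def merge_keys(per_node_outputs: Iterable[str]) -> List[str]:
    """Combine key listings from several nodes into one sorted, de-duplicated list."""
    unique = set()
    for output in per_node_outputs:
        for line in output.splitlines():
            key = line.strip()
            if key:  # skip blank lines
                unique.add(key)
    return sorted(unique)
```

With a replication factor above 1 the same key shows up on several nodes, which is why the set is needed before feeding the keys to a query generator.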

RE: [EXTERNAL] Re: *URGENT* Migration across different Cassandra cluster few having same keyspace/table names

2020-01-17 Thread Durity, Sean R
To: user@cassandra.apache.org Subject: Re: [EXTERNAL] Re: *URGENT* Migration across different Cassandra cluster few having same keyspace/table names Hi Sean, You got all valid points. Please see my answers below - 1. Reason we want to move from 'A' to 'B' is to get rid

Re: [EXTERNAL] Re: *URGENT* Migration across different Cassandra cluster few having same keyspace/table names

2020-01-17 Thread Ankit Gadhiya
Hi Sean, You got all valid points. Please see my answers below - 1. Reason we want to move from 'A' to 'B' is to get rid of 'A' Azure region completely. 2. Cluster names in 'A' and 'B' are different. 3. DSbulk - Is there anyway I can do online migration? - I still need to get clarity on whethe

Re: [EXTERNAL] Re: *URGENT* Migration across different Cassandra cluster few having same keyspace/table names

2020-01-17 Thread Jeff Jirsa
The migration requirements are impossible given the current state of the database You probably can’t join two distinct clusters without app changes and without downtime unless you’re very lucky (same cluster name, app using quorum but not local quorum, both clusters using NetworkTopologyStrateg

RE: [EXTERNAL] Re: *URGENT* Migration across different Cassandra cluster few having same keyspace/table names

2020-01-17 Thread Durity, Sean R
A couple things to consider: * A separation of apps into their own clusters is typically a better model to avoid later entanglements * Dsbulk (1.4.1) is now available for only open source clusters. It is a great tool for unloading/loading * What data problem are you trying to solve w

RE: [EXTERNAL] Re: Log output when Cassandra is "up"?

2020-01-08 Thread Durity, Sean R
I use a script that calls nodetool info. If nodetool info returns an error (instance isn’t up, on the way up, etc.) then I return that same error code (and I know the node is NOT OK). If nodetool info succeeds, I then parse the output for each protocol to be up. A node can be up, but have gossip
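
A health check along these lines might look like the sketch below. It assumes `nodetool info` prints `Name : value` lines such as `Gossip active` and `Native Transport active`; the exact set of fields varies by Cassandra version, so the `REQUIRED` tuple is an assumption to adjust.

```python
import subprocess
from typing import Dict, Tuple

# Assumed field names; check them against your version's nodetool info output.
REQUIRED: Tuple[str, ...] = ("Gossip active", "Native Transport active")


def parse_nodetool_info(text: str) -> Dict[str, str]:
    """Turn 'Name : value' lines into a dict, splitting on the first colon."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def node_is_ok(text: str, required: Tuple[str, ...] = REQUIRED) -> bool:
    """True only if every required protocol is reported as active."""
    fields = parse_nodetool_info(text)
    return all(fields.get(name) == "true" for name in required)


def check_node() -> int:
    """Return nodetool's own error code on failure, else 0 (OK) or 1 (not OK)."""
    result = subprocess.run(["nodetool", "info"], capture_output=True, text=True)
    if result.returncode != 0:
        return result.returncode
    return 0 if node_is_ok(result.stdout) else 1
```

Keeping the parsing separate from the `subprocess` call makes it possible to test the check against saved output without a running node.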

Re: [EXTERNAL] Re: How bottom of cassandra save data efficiently?

2020-01-02 Thread lampahome
Thank you all. I found a doc on DataStax, and it says compression is enabled by default and uses LZ4Compressor.

RE: [EXTERNAL] Re: How bottom of cassandra save data efficiently?

2020-01-02 Thread Durity, Sean R
100,000 rows is pretty small. Import your data to your cluster, do a nodetool flush on each node, then you can see how much disk space is actually used. There are different compression tools available to you when you create the table. It also matters if the rows are in separate partitions or you

RE: [EXTERNAL] Re: Facing issues while starting Cassandra

2020-01-02 Thread Durity, Sean R
Any read-only file systems? Have you tried to start from the command line (instead of a service)? Sometimes that will give a more helpful error when start-up can’t complete. If your error is literally what you included, it looks like the executable can’t find the cassandra.yaml file. I will ag

Re: [EXTERNAL] Re: Connection Pooling in v4.x Java Driver

2019-12-11 Thread Caravaggio, Kevin
Hi Alexandre, Thank you for the explanation. I understand that reasoning very well now. Jon, appreciate the link, and will follow up there for this sort of thing then. Thanks, Kevin From: Alexandre Dutra Reply-To: "user@cassandra.apache.org" Date: Wednesday, December 11, 2019 at 3:33 AM To

RE: [EXTERNAL] Migration a Keyspace from 3.0.X to 3.11.2 Cluster which already have keyspaces

2019-12-02 Thread Durity, Sean R
The size of the data matters here. Copy to/from is ok if the data is a few million rows per table, but not billions. It is also relatively slow (but with small data or a decent outage window, it could be fine). If the data is large and the outage time matters, you may need custom code to read fr

RE: [EXTERNAL] Re: Upgrade strategy for high number of nodes

2019-12-02 Thread Durity, Sean R
All my upgrades are without downtime for the application. Yes, do the binary upgrade one node at a time. Then run upgradesstables on as many nodes as your app load can handle (maybe you can point the app to a different DC, while another DC is doing upgradesstables). Upgradesstables doesn’t cause
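
The two phases described above (binaries strictly one node at a time, then `upgradesstables` on as many nodes at once as the load allows) can be sketched as a simple planner. The node names and the `sstable_parallelism` parameter are hypothetical.

```python
from typing import List, Tuple


def upgrade_plan(nodes: List[str], sstable_parallelism: int) -> List[Tuple[str, List[str]]]:
    """Return ordered (phase, nodes) steps for a rolling upgrade."""
    if sstable_parallelism < 1:
        raise ValueError("sstable_parallelism must be at least 1")
    # Phase 1: binary upgrade, one node per step.
    plan = [("binary", [node]) for node in nodes]
    # Phase 2: upgradesstables in batches sized to what the app load tolerates.
    for start in range(0, len(nodes), sstable_parallelism):
        plan.append(("upgradesstables", nodes[start:start + sstable_parallelism]))
    return plan
```

For three nodes and a parallelism of 2, this yields three single-node binary steps followed by two `upgradesstables` batches.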

RE: [EXTERNAL] performance

2019-12-02 Thread Durity, Sean R
I’m not sure this is the fully correct question to ask. The size of the data will matter. The importance of high availability matters. Performance can be tuned by taking advantage of Cassandra’s design strengths. In general, you should not be doing queries with a where clause on non-key columns.

RE: [EXTERNAL] Re: Cassandra 3.11.4 Node the load starts to increase after few minutes to 40 on 4 CPU machine

2019-10-31 Thread Durity, Sean R
There is definitely a resource risk to having thousands of open connections to each node. Some of the drivers have (had?) less than optimal default settings, like acquiring 50 connections per Cassandra node. This is usually overkill. I think 5-10/node is much more reasonable. It depends on your
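
To see why 50 connections per node adds up, here is a small worked calculation. It assumes every application instance opens the same number of connections to every Cassandra node; the figure of 200 instances is made up for illustration.

```python
def connections_per_node(app_instances: int, connections_per_host: int) -> int:
    """Connections each Cassandra node must hold open, under the stated assumption."""
    return app_instances * connections_per_host


print(connections_per_node(200, 50))  # 10000
print(connections_per_node(200, 8))   # 1600
```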

RE: [EXTERNAL] n00b q re UPDATE v. INSERT in CQL

2019-10-25 Thread Durity, Sean R
Everything in Cassandra is an insert. So, an update and an insert are functionally equivalent. An update doesn't go update the existing data on disk; it is a new write of the columns involved. So, the difference in your scenario is that with the "targeted" update, you are writing less of the col
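
The "everything is an insert" point can be illustrated with a toy model. This is not Cassandra's storage engine: it keeps one timestamped cell per column and resolves conflicts by latest timestamp, and it ignores tombstones, TTLs and on-disk layout.

```python
from typing import Any, Dict, Tuple

Row = Dict[str, Tuple[int, Any]]  # column name -> (write timestamp, value)


def write(table: Dict[str, Row], key: str, columns: Dict[str, Any], ts: int) -> None:
    """Used for both INSERT and UPDATE in this toy model: write the given cells."""
    row = table.setdefault(key, {})
    for name, value in columns.items():
        current = row.get(name)
        if current is None or ts >= current[0]:
            row[name] = (ts, value)  # newest timestamp wins


def read(table: Dict[str, Row], key: str) -> Dict[str, Any]:
    """Return the current value of every column for a key."""
    return {name: value for name, (ts, value) in table.get(key, {}).items()}


full, targeted = {}, {}
write(full, "k", {"a": 1, "b": 2}, ts=1)
write(full, "k", {"a": 1, "b": 3}, ts=2)   # rewrite every column
write(targeted, "k", {"a": 1, "b": 2}, ts=1)
write(targeted, "k", {"b": 3}, ts=2)       # write only the changed column
print(read(full, "k") == read(targeted, "k"))  # True
```

Both paths read back the same row; the targeted write simply stores fewer new cells.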

Re: [EXTERNAL] Cassandra Export error in COPY command

2019-10-24 Thread Hossein Ghiyasi Mehr
I tested dsbulk too. But there are many errors: "[1710949318] Error writing cancel request. This is not critical (the request will eventually time out server-side)." "Forcing termination of Connection[/127.0.0.1:9042-14, inFlight=1, closed=true]. This should not happen and is likely a bug, please

Re: [EXTERNAL] Re: GC Tuning https://thelastpickle.com/blog/2018/04/11/gc-tuning.html

2019-10-21 Thread Sergio
>> >>> >>> >>> Sergio, also be aware that -XX:+CMSClassUnloadingEnabled probably >>> doesn’t do anything for you. I believe that only applies to CMS, not >>> G1GC. I also wouldn’t take it as gospel truth that -XX:+UseNUMA is a good >>> t

Re: [EXTERNAL] Re: GC Tuning https://thelastpickle.com/blog/2018/04/11/gc-tuning.html

2019-10-21 Thread Jon Haddad
ospel truth that -XX:+UseNUMA is a good >> thing on AWS (or anything virtualized), you’d have to run your own tests >> and find out. >> >> >> >> R >> >> *From: *Jon Haddad >> *Reply-To: *"user@cassandra.apache.org" >> *Date: *Mo

Re: [EXTERNAL] Re: GC Tuning https://thelastpickle.com/blog/2018/04/11/gc-tuning.html

2019-10-21 Thread Reid Pinchback
understand and can refine; understanding comes from knowing how to do your own performance monitoring. From: Sergio Reply-To: "user@cassandra.apache.org" Date: Monday, October 21, 2019 at 1:16 PM To: "user@cassandra.apache.org" Subject: Re: [EXTERNAL] Re: GC Tuning https

Re: [EXTERNAL] Re: GC Tuning https://thelastpickle.com/blog/2018/04/11/gc-tuning.html

2019-10-21 Thread Sergio
> > R > > *From: *Jon Haddad > *Reply-To: *"user@cassandra.apache.org" > *Date: *Monday, October 21, 2019 at 12:06 PM > *To: *"user@cassandra.apache.org" > *Subject: *Re: [EXTERNAL] Re: GC Tuning > https://thelastpickle.com/blog/2018/04/11/gc-tuning.htm

Re: [EXTERNAL] Re: GC Tuning https://thelastpickle.com/blog/2018/04/11/gc-tuning.html

2019-10-21 Thread Reid Pinchback
UMA is a good thing on AWS (or anything virtualized), you’d have to run your own tests and find out. R From: Jon Haddad Reply-To: "user@cassandra.apache.org" Date: Monday, October 21, 2019 at 12:06 PM To: "user@cassandra.apache.org" Subject: Re: [EXTERNAL] Re: GC Tuning ht

Re: [EXTERNAL] Re: GC Tuning https://thelastpickle.com/blog/2018/04/11/gc-tuning.html

2019-10-21 Thread Jon Haddad
One thing to note, if you're going to use a big heap, cap it at 31GB, not 32. Once you go to 32GB, you don't get to use compressed pointers [1], so you get less addressable space than at 31GB. [1] https://blog.codecentric.de/en/2014/02/35gb-heap-less-32gb-java-jvm-memory-oddities/ On Mon, Oct 21

RE: [EXTERNAL] Re: GC Tuning https://thelastpickle.com/blog/2018/04/11/gc-tuning.html

2019-10-21 Thread Durity, Sean R
I don’t disagree with Jon, who has all kinds of performance tuning experience. But for ease of operation, we only use G1GC (on Java 8), because the tuning of ParNew+CMS requires a high degree of knowledge and very repeatable testing harnesses. It isn’t worth our time. As a previous writer mentio

Re: [EXTERNAL] Cassandra Export error in COPY command

2019-09-23 Thread Hossein Ghiyasi Mehr
The table has more than 10 M rows. I used COPY command in a cluster with five machines for this table and everything was OK. I took a backup to a single machine using sstableloader. Now I want to extract rows using COPY command but I can't! On Mon, Sep 23, 2019 at 6:30 AM Durity, Sean R wrote: >

RE: [EXTERNAL] Cassandra Export error in COPY command

2019-09-22 Thread Durity, Sean R
Copy command tries to export all rows in the table, not just the ones on the node. It will eventually time out if the table is large. It is really built for something under 5 million rows or so. Dsbulk (from DataStax) is great for this, if you are a customer. Otherwise, you will probably need to

Re: [EXTERNAL] Re: loading big amount of data to Cassandra

2019-08-06 Thread Amanda Moran
With DataStax bulkloader you can only export from a Cassandra table but not import into Cassandra (only load into DSE cluster). And +1 on the confusing name of batches ... yes it’s for writes but not for loading data. Amanda > On Aug 5, 2019, at 8:14 AM, Durity, Sean R > wrote: > > DataS

Re: [EXTERNAL] Re: loading big amount of data to Cassandra

2019-08-05 Thread Hiroyuki Yamada
cassandra-loader is also useful because you don't need to create sstables. https://github.com/brianmhess/cassandra-loader Hiro On Tue, Aug 6, 2019 at 12:15 AM Durity, Sean R wrote: > > DataStax has a very fast bulk load tool - dsebulk. Not sure if it is > available for open source or not. In my

RE: [EXTERNAL] Re: loading big amount of data to Cassandra

2019-08-05 Thread Durity, Sean R
DataStax has a very fast bulk load tool - dsebulk. Not sure if it is available for open source or not. In my experience so far, I am very impressed with it. Sean Durity – Staff Systems Engineer, Cassandra -Original Message- From: p...@xvalheru.org Sent: Saturday, August 3, 2019 6:06 A

Re: [EXTERNAL] Apache Cassandra upgrade path

2019-07-29 Thread Jai Bheemsen Rao Dhanwada
streaming protocol between > nodes. > > > > > > Sean Durity – Staff Systems Engineer, Cassandra > > > > *From:* Alok Dwivedi > *Sent:* Friday, July 26, 2019 3:21 PM > *To:* user@cassandra.apache.org > *Subject:* Re: [EXTERNAL] Apache Cassandra upgrade path

Re: [EXTERNAL] Apache Cassandra upgrade path

2019-07-27 Thread Romain Hardouin
ystems Engineer, Cassandra   From: Alok Dwivedi Sent: Friday, July 26, 2019 3:21 PM To: user@cassandra.apache.org Subject: Re: [EXTERNAL] Apache Cassandra upgrade path   Hi Sean The recommended practice for upgrade is to explicitly control protocol version in your application during upgrade proc

Re: [EXTERNAL] Apache Cassandra upgrade path

2019-07-26 Thread Jai Bheemsen Rao Dhanwada
between > nodes. > > > > > > Sean Durity – Staff Systems Engineer, Cassandra > > > > *From:* Alok Dwivedi > *Sent:* Friday, July 26, 2019 3:21 PM > *To:* user@cassandra.apache.org > *Subject:* Re: [EXTERNAL] Apache Cassandra upgrade path > >

RE: [EXTERNAL] Apache Cassandra upgrade path

2019-07-26 Thread Durity, Sean R
This would handle client protocol, but not streaming protocol between nodes. Sean Durity – Staff Systems Engineer, Cassandra From: Alok Dwivedi Sent: Friday, July 26, 2019 3:21 PM To: user@cassandra.apache.org Subject: Re: [EXTERNAL] Apache Cassandra upgrade path Hi Sean The recommended

Re: [EXTERNAL] Apache Cassandra upgrade path

2019-07-26 Thread Alok Dwivedi
Hi Sean The recommended practice for upgrade is to explicitly control protocol version in your application during upgrade process. Basically the protocol version is negotiated on first connection and based on chance it can talk to an already upgraded node first which means it will negotiate a highe

Re: [EXTERNAL] Apache Cassandra upgrade path

2019-07-26 Thread Jai Bheemsen Rao Dhanwada
Thanks Sean, In my use case all my clusters are multi DC, and I am trying my best effort to upgrade ASAP, however there is a chance since all machines are VMs. Also my key spaces are not uniform across DCs. some are replicated to all DCs and some of them are just one DC, so I am worried there. Is
