Re: TimedOutException caused by

2012-07-24 Thread J . Amding
ig per node,default configuration (which means 1/3 heap for memtable), replicate number 3, write all, read one. > When I run stress load testing, I got this TimedOutException, and some operation failed, and all traffic hang for a while.  > > And when I have 1G memory 32 bit cassandra

Re: Answering the question of Cassandra Summit 2017

2017-08-11 Thread Ben Bromhead
t. > > There it is. Fire away with your questions, comments. All I ask is keep it > respectful because this is a community of amazing people. You have changed > the world over these years and I know it won’t stop. You know I got a hug > for you wherever we just happen to meet.

Re: Cassandra crashes....

2017-08-22 Thread Thakrar, Jayesh
GC Heap Size: 16 GB Large SSTables: 50 GB to 300+ GB We see the daemons crash after some back-to-back long GCs (1.5 to 3.5 seconds). Note that we had set the target for GC pauses to be 200 ms We have been somewhat able to tame the crashes by updating the TWCS compaction propert

Re: Tuning cassandra (compactions overall)

2012-05-21 Thread Alain RODRIGUEZ
or 16 GB of memory is so different. > > Less frequent and less aggressive garbage collection frees up CPU resources > to run the database. > > Less memory results in frequent and aggressive (i.e. stop the world) GC, and > increase IO pressure. Which reduces read performance and

Re: Read Latency Degradation

2010-12-18 Thread Peter Schuller
ormal operation? > I have considered dropping the heap down to 8gb, but having pained through > many cmf in the past I thought the larger heap should help prevent the stop > the world gc. I'm not sure what got merged to 0.6.8, but you may want to grab the JVM options from th

Re: cassandra and G1 gc

2011-03-09 Thread Peter Schuller
potential to completely eliminate even the occasional stop-the-world full GC even after extended runtime. (Keyword being *potential*.) Now, first of all, G1 is still immature compared to CMS. But even if you are in a position that you are willing to trust G1 in some particular JVM version for yo

Re: Upgrade to a different version?

2011-03-16 Thread Jeremy Hanna
e have > taken a step back in reliability and scalability since so many features > were added without adequate testing. Hopefully, at some point soon, it > will get better and doing a data import job won't take a cassandra > cluster to its knees or we won't experience stop

Re: Data model design question

2010-11-20 Thread Peter Schuller
>  Our team decided to use Cassandra as storage solution to a dataset. > I am very new to the NoSQL world and Cassandra so I am hoping to get > some help from the community: The dataset is pretty simple, we have > for each key a number of columns with values. Each day we compute a >

Re: Cluster status instability

2015-04-08 Thread Erik Forsberg
having when deploying C* on AWS EC2 (keyword to >> look for: phi_convict_threshold) >> >> 2. Heavy load. Node is under heavy load because of massive number of >> reads / writes / bulkloads or e.g. unthrottled compaction etc., which may >> result in extensive GC.

Re: scylladb

2017-03-11 Thread Dor Laor
>> > > *For my workload and probably others I had seen Cassandra was always > been CPU bound.* > Could be. However, try to make it CPU bound on 10 cores, 20 cores and more. The more cores you use, the fewer nodes you need and the overall overhead decreases. > >

Re: scylladb

2017-03-12 Thread Avi Kivity
hine you give it, the happier it will be. There are other factors, like NUMA-friendliness, but in the end it all boils down to efficiency and control. None of this is new btw, it's pretty common in the storage world. Avi On 03/11/2017 11:18 PM, Kant Kodali wrote:

Re: My cluster shows high system load without any apparent reason

2016-07-22 Thread Mark Rose
m for around 18-24 hours. >> - No or very little IO-wait. >> - top shows that around 3-10 threads are running on high cpu, but that >> alone should not cause a load of 20-30. >> - Doesn't seem to be GC load: A system starts to show symptoms so that it >> has ran o

Re: My cluster shows high system load without any apparent reason

2016-07-22 Thread Ryan Svihla
nd 250-300 reads per second and around 200 > > > writes per second. > > > - Restarting node fixes the problem for around 18-24 hours. > > > - No or very little IO-wait. > > > - top shows that around 3-10 threads are running on high cpu, but that > > > a

Re: Is it possible to bootstrap the 1st node of a new DC?

2015-09-10 Thread horschi
aven't been following this thread closely, perhaps I have > missed what exactly it is you're trying to do. > > It's possible you'd need to : > > 1) join the node with auto_bootstrap=false > 2) immediately stop it > 3) re-start it with join_ring=false > > To a

Re: nodetool drain running for days

2016-04-06 Thread Jeff Jirsa
2 is obvious here: if nobody uses it, nobody tests it in the real world to find the bugs not discovered in automated testing. The Datastax folks did some awesome work for 3.0 to extend the unit and distributed tests – they’re MUCH better than they were in 2.2, so hopefully there are fewer surprise bu

Re: Tuning cassandra (compactions overall)

2012-05-22 Thread aaron morton
/5/17 aaron morton : >> What is the benefit of having more memory ? I mean, I don't >> >> understand why having 1, 2, 4, 8 or 16 GB of memory is so different. >> >> Less frequent and less aggressive garbage collection frees up CPU resources >> to

RE: Nodes frozen in GC

2011-03-21 Thread Gregory Szorc
tion between new-gen and > old-gen with CMS, objects aren't being allocated *period* while the "young > generation is being collected" as that is a stop-the-world pause. (This is > also > why I said before that at 10 gig new-gen size, the observed behavior on > y

Re: COUNTER timeout

2021-09-15 Thread Bowen Song
Well, the log says cross node timeout, latency a bit over 44 seconds. Here are a few of the most likely causes: 1. The clocks are not in sync - please check the time on each server, and ensure an NTP client is running on all Cassandra servers 2. Long stop the world GC pauses - please check the GC

Re: COUNTER timeout

2021-09-15 Thread Joe Obernberger
n sync - please check the time on each server, and ensure NTP client is running on all Cassandra servers 2. Long stop the world GC pauses - please check the GC logs and make sure this isn't the case 3. Overload - please monitor the CPU usage and disk IO when timeout happens and make s

Re: Changing the CLI, not a great idea!

2011-07-28 Thread Sylvain Lebresne
> This is part of a much bigger problem, one which has many parts, >> >> >>> among >> >> >>> them: >> >> >>> >> >> >>> 1. Cassandra is complex. Getting a gestalt understanding of it >> >> >>> makes >>

Re: Reduce Cassandra GC

2013-06-18 Thread Joel Samuelsson
itiatingOccupancyFraction=75" >>> JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly" >>> >>> I haven't changed anything in the environment config up until now. >>> >>> > Also can you take a heap dump at 2 diff points so that

Re: Reduce Cassandra GC

2013-06-18 Thread Takenori Sato
x, where >>>> on idle cluster java memory stay on the same value. >>>> >>>> No I am running Cassandra 1.1.8. >>>> >>>> > Can you paste your gc config? >>>> >>>> I believe the relevant configs are these: >>>

Re: scylladb

2017-03-12 Thread Avi Kivity
NUMA-friendliness, but in the end it all boils down to efficiency and control. None of this is new btw, it's pretty common in the storage world. Avi On 03/11/2017 11:18 PM, Kant Kodali wrote: Here is the Java version http://docs.paralleluniverse.co/quasar/

Re: scylladb

2017-03-12 Thread Kant Kodali
ine), and when you profile it, > you see it spending much of its time in atomic operations, or > parking/unparking threads -- fighting with itself. It doesn't scale within > the machine. Scylla will happily utilize all of the cores that it is > assigned (all of them by de

Re: Is it possible to bootstrap the 1st node of a new DC?

2015-09-10 Thread Samuel CARRIERE
d. > > I admit I haven't been following this thread closely, perhaps I have > missed what exactly it is you're trying to do. > > It's possible you'd need to : > > 1) join the node with auto_bootstrap=false > 2) immediately stop it > 3) re-start it with join_

Re: Is it possible to bootstrap the 1st node of a new DC?

2015-09-10 Thread horschi
s. > > > > regards, > > Christian > > > > On Tue, Sep 8, 2015 at 11:51 PM, Robert Coli > wrote: > > On Tue, Sep 8, 2015 at 2:39 PM, horschi wrote: > > I tried to set up a new node with join_ring=false once. In my test > > that node did not pick a token i

Re: GC pauses affecting entire cluster.

2015-06-01 Thread graham sanderson
uster are marked down. > > Thanks > Anuj Wadehra > > > > On Monday, 1 June 2015 7:12 PM, Carl Hu <m...@carlhu.com> wrote: > > > We are running Cassandra version 2.1.5.469 on 15 nodes and are experiencing a > problem where the entire cluster

Re: Finding bottleneck of a cluster

2012-07-05 Thread aaron morton
30%, cassandra is not disk bound (insignificant > read operations and cpu's iowait is around 0.05%) and is not swapping > its memory (around 15 gb RAM is free or inactive). The average gc pause > time for ParNew is 100ms, occurring every second. So cassandra spends > 10% of its time s
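The 10% figure in the snippet above follows from simple arithmetic: a 100 ms pause once per second. A minimal Python sketch of that calculation, where the function name `gc_pause_fraction` is hypothetical and chosen only for illustration:

```python
def gc_pause_fraction(pause_ms: float, interval_ms: float) -> float:
    """Fraction of wall-clock time spent paused, given one pause of
    pause_ms milliseconds every interval_ms milliseconds."""
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    return pause_ms / interval_ms


# The figures quoted in the thread: a 100 ms ParNew pause every second.
print(f"{gc_pause_fraction(100, 1000):.0%}")
```

This only reproduces the poster's estimate; real GC logs would give per-pause durations rather than a flat average.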

Re: Follow-up post on cassandra configuration with some experiments on GC tuning

2010-08-28 Thread Peter Schuller
, the more efficient the GC (if the application behaves according to the weak generational hypothesis - google it if you want a ref) because less data is promoted to old gen and because the overhead of stop-the-world is lessened. (3) The larger the young generation, the longer the pause times to

Re: Reduce Cassandra GC

2013-06-18 Thread Joel Samuelsson
romotion failure". Bingo if it happened at the time. >>> >>> Otherwise, post the relevant portion of the log here. Someone may find a >>> hint. >>> >>> >>> On Mon, Jun 17, 2013 at 5:51 PM, Joel Samuelsson < >>> samuelsson.j..

Re: Reduce Cassandra GC

2013-06-18 Thread Mohit Anchlia
[ScheduledTasks:1] 2013-06-17 08:13:47,520 StatusLogger.java (line >>> 116) OpsCenter.settings0,0 >>> >>> And from gc-1371454124.log I get: >>> 2013-06-17T08:11:22.300+0000: 2551.288: [GC 870971K->216494K(4018176K), >>> 145.188

Re: Random slow read times in Cassandra

2017-03-17 Thread daemeon reiydelle
check for level 2 (stop the world) garbage collections. *...* *Daemeon C.M. Reiydelle USA (+1) 415.501.0198 London (+44) (0) 20 8144 9872* On Fri, Mar 17, 2017 at 11:51 AM, Chuck Reynolds wrote: > I have a large Cassandra 2.1.13 ring (60 nodes) in AWS that has > consistently rando

Re: cassandra OOM

2017-04-03 Thread Gopal, Dhruva
# It is recommended to set min (-Xms) and max (-Xmx) heap sizes to # the same value to avoid stop-the-world GC pauses during resize, and # so that we can lock the heap in memory on startup to prevent any # of it from being swapped out. -Xms16G -Xmx16G # Young generation size is automatically calcu
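The recommendation quoted above (set -Xms and -Xmx to the same value) can be checked mechanically. The following is a rough Python sketch, assuming a jvm.options-style text with one flag per line and `#` comments; the helper names `to_bytes` and `heap_flags` are hypothetical, not part of Cassandra:

```python
import re

# Multipliers for the size suffixes accepted on -Xms/-Xmx (no suffix = bytes).
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3}


def to_bytes(size: str) -> int:
    """Convert a JVM size string such as '16G' or '8192M' to bytes."""
    m = re.fullmatch(r"(\d+)([kKmMgG]?)", size)
    if m is None:
        raise ValueError(f"unrecognised size: {size!r}")
    return int(m.group(1)) * _UNITS[m.group(2).upper()]


def heap_flags(options_text: str) -> dict:
    """Collect -Xms/-Xmx values (in bytes) from options text,
    skipping blank lines and '#' comment lines."""
    flags = {}
    for line in options_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        m = re.fullmatch(r"-(Xms|Xmx)(\S+)", line)
        if m:
            flags[m.group(1)] = to_bytes(m.group(2))
    return flags


sample = """\
# min and max heap sizes, as in the snippet above
-Xms16G
-Xmx16G
"""
flags = heap_flags(sample)
# True when both flags are present and equal, so the heap never resizes.
heap_is_fixed = "Xms" in flags and flags.get("Xms") == flags.get("Xmx")
print(heap_is_fixed)
```

Comparing in bytes rather than as strings means `-Xms16G` and `-Xmx16384M` are treated as equal.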

Re: Finding bottleneck of a cluster

2012-07-05 Thread rohit bhatia
> idle time is around 30%, cassandra is not disk bound (insignificant > read operations and cpu's iowait is around 0.05%) and is not swapping > its memory (around 15 gb RAM is free or inactive). The average gc pause > time for ParNew is 100ms, occurring every second. So cassandra s

Re: Read Latency Degradation

2010-12-18 Thread Wayne
other concurrent activity, that smells of something being wrong > to me. > > Otherwise, is your primary concern worse latency/throughput during > compactions/repairs, or just the overall throughput/latency during > normal operation? > > > I have considered dropping the heap do

Re: Changing the CLI, not a great idea!

2011-07-28 Thread Edward Capriolo
>> >> wrote: > >> >> >>> This is part of a much bigger problem, one which has many parts, > >> >> >>> among > >> >> >>> them: > >> >> >>> > >> >> >>> 1. Cassandr

Re: Cassandra Database Modeling

2011-04-14 Thread aaron morton
have > no idea what you've put in the columns other than bytes. It really depends > on how much data you have per pair, but generally it's easier to pull back > more data than try to get exactly what you need. Downside is you have to > update all the data. >

Re: Debugging high coordinator latencies

2018-07-04 Thread Alain RODRIGUEZ
p -n 20 -o CPU # On my mac/ccm test cluster I ran something like this: java -jar sjk-0.10.1.jar ttop -p $(ps u | grep cassandra | grep -v grep | awk '{print $2}' | head -n 1) -n 25 -o CPU Anything else I can do to conclude whether this is GC related or not ? > In most cases, it is

Re: My cluster shows high system load without any apparent reason

2016-07-22 Thread Juho Mäkinen
eads per second and around 200 > >> writes per second. > >> - Restarting node fixes the problem for around 18-24 hours. > >> - No or very little IO-wait. > >> - top shows that around 3-10 threads are running on high cpu, but that > >> alone should not c

Re: cassandra OOM

2017-04-03 Thread Alexander Dejanovski
hat's the case, uncomment the -Xmx and Xms options below to override > the > > # automatic calculation of JVM heap memory. > > # > > # It is recommended to set min (-Xms) and max (-Xmx) heap sizes to > > # the same value to avoid stop-the-world GC pauses during resiz

Re: Finding bottleneck of a cluster

2012-07-05 Thread rohit bhatia
trying to figure out the bottleneck, >> >> 1) Is using JDK 1.7 any way detrimental to cassandra? >> >> 2) What is the max write operation qps that should be expected. Is the >> netflix benchmark also applicable for counter incrementing tasks? >> >> http:

Re: No node was available to execute query error

2021-03-12 Thread Joe Obernberger
On 3/12/2021 9:07 AM, Bowen Song wrote: Millions of rows in a single query? That sounds like a bad idea to me. Your "NoNodeAvailableException" could be caused by stop-the-world GC pauses, and the GC pauses are likely caused by the query itself. On 12/03/2021 13:39, Joe Obernberger wro

Re: No node was available to execute query error

2021-03-12 Thread Bowen Song
s.hasNext()) where itRs is an iterator over a select query from another table. I'm iterating over a result set from a select and inserting those results via executeAsync. -Joe On 3/12/2021 9:07 AM, Bowen Song wrote: Millions of rows in a single query? That sounds like a bad idea to me.

Re: scylladb

2017-03-11 Thread Edward Capriolo
On Sat, Mar 11, 2017 at 9:41 PM, daemeon reiydelle wrote: > Recall that garbage collection on a busy node can occur minutes or seconds > apart. Note that stop the world GC also happens as frequently as every > couple of minutes on every node. Remove that and do the simple arithmetic.

Re: scylladb

2017-03-12 Thread Bhuvan Rawal
achine? > Cassandra generally doesn't (on a larger machine), and when you profile it, > you see it spending much of its time in atomic operations, or > parking/unparking threads -- fighting with itself. It doesn't scale within > the machine. Scylla will happily utilize all of th

Re: scylladb

2017-03-12 Thread Bhuvan Rawal
arity is on the >>> order of milliseconds. User-level scheduling, because it leaves control in >>> the hands of the application, allows you to both saturate the CPU and >>> maintain low latency. >>> >> >> *For my workload and probably others I had

Re: Cassandra Database Modeling

2011-04-14 Thread csharpplusproject
that would return all 10M pairs ? Or would the queries generally want to get some small fraction of the data set? Again, depends on how the sim runs. If your sim has stop the world pauses where you have a full view of the data space, then you could grab all the points at a certain distance and effici

Re: concurrent_compactors via JMX

2018-07-17 Thread Alain RODRIGUEZ
e Collection* activity. As Cassandra uses the JVM, a badly tuned GC will induce long pauses. Depending on the workload, and I must say for most of the clusters I work on, the default tuning is not that good and can keep the server busy 10-15% of the time with stop the world GC. You might find this post

Re: My cluster shows high system load without any apparent reason

2016-07-22 Thread Mark Rose
> 3.13.0-87-generic >> >> on HVM virtualisation. Cluster has 26 TiB of data stored in two tables. >> >> >> >> Symptoms: >> >> - High load, sometimes up to 30 for a short duration of few minutes, >> >> then >> >> the load

Re:Re: Cassandra Tuning Issue

2015-12-06 Thread xutom
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC" JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled" And I check the gc log: gc.log.0.current, I found there is only one Full GC. The stop-the-world times is low. CMS-initial-mark: 0.2747280 secs CMS-remark: 0.3623090 secs

RE: nodetool drain running for days

2016-04-06 Thread Paco Trujillo
obody tests it in the real world to find the bugs not discovered in automated testing. The Datastax folks did some awesome work for 3.0 to extend the unit and distributed tests – they’re MUCH better than they were in 2.2, so hopefully there are fewer surprise bugs in 3+, but there’s bound to be

Re: Failed disks - correct procedure

2023-01-17 Thread Joe Obernberger
I come from the hadoop world where we have a cluster with probably over 500 drives.  Drives fail all the time; or well several a year anyway.  We remove that single drive from HDFS, HDFS re-balances, and when we get around to it, we swap in a new drive, format it, and add it back to HDFS.  We

Re: scylladb

2017-03-12 Thread Kant Kodali
*For my workload and probably others I had seen Cassandra was always > been CPU bound.* > >> >> > > Yes, but does it consume 100% of all of the cores on your machine? > Cassandra generally doesn't (on a larger machine), and when you profile it, > you see it spending

Re: Pending compactions not going down on some nodes of the cluster

2016-03-25 Thread Alain RODRIGUEZ
ith no compaction. The second one is mainstream it is present in Cassandra, TWCS is not, but has been built to work around some DTCS issues and is heavy usage and maintained by Jeff for now. I drain the node, the compactions completely stop after a few minutes > (like they would normally do o

Re: Cassandra Database Modeling

2011-04-15 Thread Aaron Morton
your query, it depends on how big a slice you want to get how time > critical it is. e.g. Could you be making queries that would return all 10M > pairs ? Or would the queries generally want to get some small fraction of the > data set? Again, depends on how the sim runs. > > If y

Re: Re: Cassandra Tuning Issue

2015-12-06 Thread Jack Krupansky
X:+CMSParallelRemarkEnabled" > And I check the gc log: gc.log.0.current, I found there is only one > Full GC. The stop-the-world times is low. > CMS-initial-mark: 0.2747280 secs > CMS-remark: 0.3623090 secs > > The insert codes in my test client are following: >

Re: cassandra OOM

2017-04-04 Thread Gopal, Dhruva
heap memory. # # It is recommended to set min (-Xms) and max (-Xmx) heap sizes to # the same value to avoid stop-the-world GC pauses during resize, and # so that we can lock the heap in memory on startup to prevent any # of it from being swapped out. -Xms16G -Xmx16G # Young generation size is a

Re: No deletes - is periodic repair needed? I think not...

2014-02-06 Thread Alain RODRIGUEZ
Hi, In a distributed system, such as Cassandra, things can happen (node down, stop the world GC, hardware issue, ...) and desynchronize replicas, isn't repairing also a needed operation to keep replicas up to date at least once a week or once a month ? It is a strong and reliable process to

Re: Intermittent long application pauses on nodes

2014-02-17 Thread Ondřej Černoš
Hi, we tried to switch to G1 because we observed this behaviour on CMS too (27 seconds pause in G1 is quite an advise not to use it). Pauses with CMS were not easily traceable - JVM stopped even without stop-the-world pause scheduled (defragmentation, remarking). We thought the go-to-safepoint

Re: 99.999% uptime - Operations Best Practices?

2011-06-22 Thread C. Scott Andreas
ra's behavior itself; documented APIs such as querying for the count of all columns associated with a key will materialize the row across all nodes being queried. Once when issuing a "count" query for a key that had around 300k columns at CL.QUORUM, we knocked three nodes out of

RE: Adding nodes

2022-07-11 Thread Marc Hoppins
which is unlikely to cause problems? After all, in the modern world of big (how big is big?) data, 600G per node is far less than the real BIG big-data. Marc From: Jeff Jirsa Sent: Friday, July 8, 2022 5:46 PM To: cassandra Cc: Bowen Song Subject: Re: Adding nodes EXTERNAL Having a node UJ

Re: Pending compactions not going down on some nodes of the cluster

2016-03-21 Thread Gianluca Borello
compactors and even putting the compaction throughput to unlimited, and nothing changes. Again, the key here to remember is that if I drain the node, the compactions completely stop after a few minutes (like they would normally do on another "healthy" node), it's just the "pe

Re: Bootstrap failure

2014-02-05 Thread Keith Wright
I did find a stop the world GC on one of the non-bootstrapping nodes during one of our previous bootstrap failures (see below) as well as the flags being passed to the java process. Perhaps this is just a GC tuning issue? From what I’ve read, bootstrap is supposed to be a light wei

Re: Intermittent long application pauses on nodes

2014-02-27 Thread Frank Ng
d other native work shouldn't in >>> anyway inhibit GC activity or other safepoint pause times, unless there's a >>> bug in the VM. These threads will simply enter a safepoint as they return >>> to the VM execution context, and are considered safe for the dur

Re: Intermittent long application pauses on nodes

2014-02-27 Thread Benedict Elliott Smith
>>> GC) >>>> >>>> Note that mmapped file accesses and other native work shouldn't in >>>> anyway inhibit GC activity or other safepoint pause times, unless there's a >>>> bug in the VM. These threads will simply enter a safepoint as the

Re: Intermittent long application pauses on nodes

2014-02-27 Thread Frank Ng
bly that paging is entirely disabled, as the bloom filters and >>>>> other memory won't be locked, although that shouldn't cause pauses during >>>>> GC) >>>>> >>>>> Note that mmapped file accesses and other native work shouldn't in

Re: Failed disks - correct procedure

2023-01-17 Thread C. Scott Andreas
lost. Repairing surviving replicas before bootstrapping a replacement node is necessary to avoid this. — Scott On Jan 17, 2023, at 7:28 AM, Joe Obernberger wrote: I come from the hadoop world where we have a cluster with probably over 500 drives.  Drives fail all the time; or well

Re: concurrent_compactors via JMX

2018-07-18 Thread Chris Lohfink
log/cassandra/system.log) > - Check local latencies (nodetool tablestats / nodetool tablehistograms) and > compare them to the client request latency. At the node level, reads should > probably be a single digit in milliseconds, rather close to 1 ms with SSDs > and writes below the millisec

Re: My cluster shows high system load without any apparent reason

2016-07-23 Thread Juho Mäkinen
>> shown by uptime) without any apparent reason and I'm not quite sure > >> >> what > >> >> could be causing it. > >> >> > >> >> We are running on i2.4xlarge, so we have 16 cores, 120GB of ram, four > >> >>

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Sebastian Estevez
with the latest Java 8 from Oracle. Do *not* set the newgen size for G1 sets it dynamically: # min and max heap sizes should be set to the same value to avoid > # stop-the-world GC pauses during resize, and so that we can lock the > # heap in memory on startup to prevent any of it from be

Re: too many full gc in one node of the cluster

2015-11-13 Thread Jeff Jirsa
(especially from the read path) in young gen rather than promoted to old gen, so they’ll be dropped quickly rather than promoted/cleaned with stop-the-world phase of CMS. It’s often the case that increasing Xmn and setting XX:MaxTenuringThreshold=6 or 8 will reduce the number of long CMS pauses

Re:Re: Re: Cassandra Tuning Issue

2015-12-08 Thread xutom
default for such as: # GC tuning options JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC" JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC" JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled" And I check the gc log: gc.log.0.current, I found there is only one Full GC. The stop-the-w

Re: Re: Re: Cassandra Tuning Issue

2015-12-08 Thread Jack Krupansky
>> # GC tuning options >> JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC" >> JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC" >> JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled" >> And I check the gc log: gc.log.0.current, I found there

Re:

2015-01-07 Thread Ryan Svihla
a separate table what shardIds there are for that particular status (this is not uncommon) - it can be a fixed size say 1 and you can just increment the number by 1 (but make sure as you're updating this you're not introducing any fun state bugs that have two different shards writing

RE: Adding nodes

2022-07-11 Thread Marc Hoppins
all, in the modern world of big (how big is big?) data, 600G per node is far less than the real BIG big-data. Marc From: Jeff Jirsa <jji...@gmail.com> Sent: Friday, July 8, 2022 5:46 PM To: cassandra <user@cassandra.apache.org> Cc: Bowen Song <bo...@bso.ng> S

Re: Intermittent long application pauses on nodes

2014-02-27 Thread Benedict Elliott Smith
you made certain the VM memory is >>>>>> locked >>>>>> (and preferably that paging is entirely disabled, as the bloom filters >>>>>> and >>>>>> other memory won't be locked, although that shouldn't cause paus

Re: scylladb

2017-03-12 Thread Avi Kivity
't scale within the machine. Scylla will happily utilize all of the cores that it is assigned (all of them by default in most configurations), and the bigger the machine you give it, the happier it will be. There are other factors, like NUMA-friendliness, but in the

Re: My cluster shows high system load without any apparent reason

2016-07-25 Thread Mark Rose
>> >> I just recently upgraded our cluster to 2.2.7 and after turning the >> >> >> cluster under production load the instances started to show high >> >> >> load >> >> >> (as >> >> >> shown by uptime) without any ap

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Kunal Gangakhedkar
se it with the latest Java 8 from Oracle. Do *not* set the > newgen size for G1 sets it dynamically: > > # min and max heap sizes should be set to the same value to avoid >> # stop-the-world GC pauses during resize, and so that we can lock the >> # heap in memory on startup to p

Re: Re: Re: Cassandra Tuning Issue

2015-12-08 Thread Anuj Wadehra
PTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"     And I check the gc log: gc.log.0.current, I found there is only one Full GC. The stop-the-world times is low. CMS-initial-mark: 0.2747280 secs CMS-remark: 0.3623090 secs     The insert codes in my test client are following:   

Re:Re: Re: Re: Cassandra Tuning Issue

2015-12-08 Thread xutom
" And I check the gc log: gc.log.0.current, I found there is only one Full GC. The stop-the-world times is low. CMS-initial-mark: 0.2747280 secs CMS-remark: 0.3623090 secs The insert codes in my test client are following: String content = RandomStringUt

Re: Pending compactions not going down on some nodes of the cluster

2016-03-21 Thread Fabrice Facorat
> - This theory doesn't seem to explain why, when doing "nodetool drain", the > compactions completely stop after a few minutes and I get something such as: > > $ nodetool compactionstats > pending tasks: 128 > > So no compactions being executed (since there is no mo

Re: Intermittent long application pauses on nodes

2014-02-17 Thread Benedict Elliott Smith
is quite an advise not to use it). Pauses with CMS were > not easily traceable - JVM stopped even without stop-the-world pause > scheduled (defragmentation, remarking). We thought the go-to-safepoint > waiting time might have been involved (we saw waiting for safepoint > resolution) - e

Re: 99.999% uptime - Operations Best Practices?

2011-06-22 Thread Les Hazlewood
ring happy even under load. We've not found GC tuning to bring > night-and-day differences outside of resolving the STW collections, but the > difference is noticeable. > > Occasionally, these issues will result from Cassandra's behavior itself; > documented APIs such as quer

Re: Read Latency Degradation

2010-12-18 Thread Wayne
that said, the practical reality with Cassandra is not the > theoretical optimum w.r.t. caching. Some things to consider: > > (1) Whatever memory you give the JVM is going to be a direct trade-off > in the sense that said memory is no longer available for the operating > system pa

Re: Adding nodes

2022-07-11 Thread Bowen Song via user
des. If our LOAD per node is around 400-600GB, is there any practical method to speed up adding multiple new nodes which is unlikely to cause problems?  After all, in the modern world of big (how big is big?) data, 600G per node is far less than the real BIG big-data. Marc *From:*Jeff Jirsa

Re: Failed disks - correct procedure

2023-01-23 Thread Joe Obernberger
pping a replacement node is necessary to avoid this. — Scott On Jan 17, 2023, at 7:28 AM, Joe Obernberger wrote:  I come from the hadoop world where we have a cluster with probably over 500 drives.  Drives fail all the time; or well several a year anyway.  We remove that single drive

Re: scylladb

2017-03-12 Thread Kant Kodali
> >>> If you run too few threads, then you will not be able to saturate the >>> CPU resources. This is a common problem with Cassandra -- it's very hard >>> to get it to consume all of the CPU power on even a moderately large >>> machine. On the other ha

Re: scylladb

2017-03-12 Thread Avi Kivity
cores that it is assigned (all of them by default in most configurations), and the bigger the machine you give it, the happier it will be. There are other factors, like NUMA-friendliness, but in the end it all boils down to efficiency and control.

Re: scylladb

2017-03-11 Thread daemeon reiydelle
Recall that garbage collection on a busy node can occur minutes or seconds apart. Note that stop the world GC also happens as frequently as every couple of minutes on every node. Remove that and do the simple arithmetic. sent from my mobile Daemeon Reiydelle skype daemeon.c.m.reiydelle USA

Re: Cassandra OOM on joining existing ring

2015-07-10 Thread Sebastian Estevez
ely (tl;dr G1GC is much simpler than CMS and almost as good as a finely >> tuned CMS). *Note:* Use it with the latest Java 8 from Oracle. Do *not* >> set the newgen size for G1 sets it dynamically: >> >> # min and max heap sizes should be set to the same value to avoid

Re: Re: Re: Re: Cassandra Tuning Issue

2015-12-08 Thread Jack Krupansky
>>> Now I'm using Apache Cassandra 2.1.1 and my JDK is 1.7.0_79, my >>> keyspace replication factor is 2, and I do enable the "token aware". The GC >>> configuration is default for such as: >>> # GC tuning options >>> JVM_OPT

Re: Re:Re: Re: Re: Cassandra Tuning Issue

2015-12-08 Thread Anuj Wadehra
I do enable the "token aware". The GC configuration is default for such as: # GC tuning options JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC" JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC" JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"     And I check the gc log: gc.log.0

RE: cassandra OOM

2017-04-25 Thread Durity, Sean R
92MB # - pick the max # # For production use you may wish to adjust this for your environment. # If that's the case, uncomment the -Xmx and Xms options below to override the # automatic calculation of JVM heap memory. # # It is recommended to set min (-Xms) and max (-Xmx) heap sizes to # the same value

Re: cassandra OOM

2017-04-25 Thread Carlos Rolo
> > # - pick the max > > # > > # For production use you may wish to adjust this for your environment. > > # If that's the case, uncomment the -Xmx and Xms options below to override > the > > # automatic calculation of JVM heap memory. > > # > > # It is re

Re: cassandra OOM

2017-04-26 Thread Jean Carlo
ulated by cassandra-env based on this > > # formula: max(min(1/2 ram, 1024MB), min(1/4 ram, 8GB)) > > # That is: > > # - calculate 1/2 ram and cap to 1024MB > > # - calculate 1/4 ram and cap to 8192MB > > # - pick the max > > # > > # For production use you ma
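The heap-size formula quoted in the snippet above, max(min(1/2 ram, 1024MB), min(1/4 ram, 8GB)), can be worked through with a short Python sketch. The function name `default_max_heap_mb` and the choice of whole megabytes as the unit are assumptions made here for illustration; this is not the actual cassandra-env script:

```python
def default_max_heap_mb(system_ram_mb: int) -> int:
    """Apply the quoted formula:
    max(min(1/2 ram, 1024MB), min(1/4 ram, 8192MB))."""
    half_capped = min(system_ram_mb // 2, 1024)     # 1/2 ram, capped to 1024MB
    quarter_capped = min(system_ram_mb // 4, 8192)  # 1/4 ram, capped to 8192MB
    return max(half_capped, quarter_capped)         # pick the max


# Worked examples for a few machine sizes (RAM given in MB).
for ram_mb in (2048, 16384, 65536):
    print(ram_mb, default_max_heap_mb(ram_mb))
```

With these inputs the second term takes over once a quarter of RAM exceeds 1024MB, and the result stops growing at 8192MB.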

Re: Intermittent long application pauses on nodes

2014-02-20 Thread Joel Samuelsson
2014 12:30, Ondřej Černoš wrote: > >> Hi, >> >> we tried to switch to G1 because we observed this behaviour on CMS too >> (27 seconds pause in G1 is quite an advise not to use it). Pauses with CMS >> were not easily traceable - JVM stopped even without stop-th

Re: Intermittent long application pauses on nodes

2014-02-21 Thread Joel Samuelsson
red safe for the duration they are >> outside. >> >> >> >> >> On 17 February 2014 12:30, Ondřej Černoš wrote: >> >>> Hi, >>> >>> we tried to switch to G1 because we observed this behaviour on CMS too >>> (27 seconds p

Re: underutilized servers

2021-03-05 Thread Attila Wind
age is often an indicator of bad table schema design (e.g.: large partitions) or bad query (e.g. without partition key). Check the Cassandra logs first, is there any long stop-the-world GC? tombstone warning? anything else that's out of ordinary? Check the output from "nodetool tpstats"

Re: Adding nodes

2022-07-11 Thread Bowen Song via user
ikely to cause problems?  After all, in the modern world of big (how big is big?) data, 600G per node is far less than the real BIG big-data. Marc *From:*Jeff Jirsa *Sent:* Friday, July 8, 2022 5:46 PM *To:* cassandra *Cc:* Bowen Song *Subject:* Re: Adding nodes EXTERNAL Having a node UJ
