(sorry for the delay in following up on this thread)

>  Actually, there's a question - is it 'acceptable' do you think
>  for GC to take out a small number of your nodes at a time,
>  so long as the bulk (or at least where RF is > nodes gone
>  on STW GC) of the nodes are okay?  I suspect this is a
>  question peculiar to Amazon EC2, as I've never seen a box
>  rendered non-communicative by a single core flat-lining.

Well, first of all I still find it very strange that GC takes nodes
down at all, unless one is specifically putting sufficient CPU load on
the cluster that e.g. concurrent GC causes a problem. But in
particular, if you're still seeing those crazy long GC pause times,
IMO something is severely wrong and I would not personally recommend
going to production with that unresolved, since whatever the cause is
may suddenly start having other effects. Severely long ParNew pause
times are really not expected; the only two major reasons I can think
of, at least when running on real hardware and barring JVM bugs, are
(1) swapping, and (2) possibly extreme performance penalties
associated with a very full old generation, in which case the solution
is "larger heap". I don't remember whether you indicated any heap
statistics, so I'm not sure whether (2) is a possibility. But I would
expect OutOfMemory errors long before a ParNew takes 300+ *seconds*,
just because of JVM policies w.r.t. acceptable GC efficiency.
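If it helps to gather data on this: turning on GC logging (e.g.
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:<file>
in whatever place you set your JVM options) will show per-collection
pause times and how full each generation is. As a rough sketch only,
something along these lines (standard java.lang.management API; the
exact pool names depend on which collector you run, so treat the
names in the comment as an assumption) would dump similar heap
statistics from inside the process:

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryPoolMXBean;
    import java.lang.management.MemoryUsage;

    public class HeapStats {
        public static void main(String[] args) {
            // Print usage for each heap/non-heap memory pool; with CMS the
            // old generation usually shows up as "CMS Old Gen" (name varies
            // by collector). max may print as -1 if it is undefined.
            for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
                MemoryUsage u = pool.getUsage();
                System.out.println(pool.getName()
                        + ": used=" + (u.getUsed() >> 20) + "MB"
                        + " committed=" + (u.getCommitted() >> 20) + "MB"
                        + " max=" + (u.getMax() >> 20) + "MB");
            }
        }
    }

That would at least tell us whether the old generation is close to
full when the long pauses hit.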

Bottom line: 300+ seconds for a ParNew collection is *way way way*
out there. 300 *milli*-seconds is more along the lines of what one
might expect (usually lower than that). Even if you can seemingly
lessen the impact by using the throughput collector, I wouldn't be
comfortable with shrugging off whatever is happening.

That said, in terms of the effects on the cluster: I have not had much
hands-on experience with this, but I believe you'd expect a definite
visible impact from the point of view of clients. Cassandra is not
optimized for instantly detecting slow nodes and transparently working
around them with zero impact on clients; I don't think it is
recommended to be running a cluster with nodes regularly bouncing in
and out, for whatever reason, if it can be avoided.

Not sure what to say, other than to strongly recommend getting to the
bottom of this problem, which seems non-specific to Cassandra, before
relying on the system in a production setting. The extremity of the
issues you're seeing is far beyond what I would ever expect, even
allowing for "who knows what EC2 is doing or what other people are
running on the machine", except for the hypothesis that they
over-commit memory and the extreme latencies are due to swapping. If
that is what is happening, it just tells me that EC2 is unusable for
this type of thing, but I still think it's far-fetched since the
impact should then be significant for a great number of their
customers.
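If you want to rule swapping in or out, one quick check (purely a
sketch; it assumes a Linux kernel new enough to expose a VmSwap line
in /proc/self/status, otherwise just watch the si/so columns in
vmstat while a pause is happening) is to have the process report how
much of itself is currently swapped out:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    public class SwapCheck {
        public static void main(String[] args) throws IOException {
            // Look for the VmSwap (and VmRSS) lines in the process's own
            // status file; a non-zero VmSwap means part of this JVM's
            // memory has been swapped out.
            BufferedReader r = new BufferedReader(new FileReader("/proc/self/status"));
            try {
                String line;
                while ((line = r.readLine()) != null) {
                    if (line.startsWith("VmSwap") || line.startsWith("VmRSS")) {
                        System.out.println(line);
                    }
                }
            } finally {
                r.close();
            }
        }
    }

If that (or vmstat) shows real swap activity during one of those
pauses, the over-commit theory becomes a lot more plausible.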

I forget, and I didn't find it by briefly sifting through the thread history;
were you running on small EC2 instances or larger ones?

-- 
/ Peter Schuller
