hello,
we are quite inexperienced with cassandra at the moment and are playing
around with a new cluster we built up to get familiar with cassandra and
its possibilities.
while getting familiar with that topic we noticed that repairs in our
cluster take a long time. To get an idea of our c ...
gs, load of the "old nodes" of
the cluster.
This is quite an individual problem that you have to track down case by case.
2017-03-17 22:07 GMT+01:00 Roland Otta <roland.o...@willhaben.at>:
hello,
we are quite inexperienced with cassandra at the moment and are playing
around wit ...
... maybe i should just try increasing the job threads with --job-threads
shame on me
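for reference, a minimal sketch of what i have in mind (the keyspace name
is just a placeholder):

    # run the repair with 4 job threads instead of the default 1
    nodetool repair -j 4 mykeyspace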
On Fri, 2017-03-17 at 21:30 +0000, Roland Otta wrote:
forgot to mention the version we are using:
we are using 3.0.7 - so i guess we should have incremental repairs by default.
it also prints out
i did not notice that so far.
thank you for the hint. i will definitely give it a try.
On Fri, 2017-03-17 at 22:32 +0100, benjamin roth wrote:
The fork from thelastpickle is. I'd recommend giving it a try over pure
nodetool.
2017-03-17 22:30 GMT+01:00 Roland Otta <roland.o...@willhaben.at>:
ummit-2016
From: Roland Otta
Date: Friday, March 17, 2017 at 5:47 PM
To: "user@cassandra.apache.org"
Subject: Re: repair performance
i did not notice that so far.
thank you for the hint. i will definitely give it a try.
On Fri, 2017-03-17 at 22:32 +0100, benjamin roth wrote:
The fork f
we have a datacenter which is currently used exclusively for spark batch
jobs.
when batch jobs are running against that environment we can see very
high peaks in blocked native transport requests (up to 10k / minute).
i am concerned because i guess that will slow down other queries (in case
other ap ...
blocked
threads still exist afterwards.
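for reference, we are watching the blocked requests roughly like this (the
grep pattern is just an assumption about the tpstats output layout):

    # the Native-Transport-Requests line shows active, pending, blocked
    # and all-time-blocked counters
    nodetool tpstats | grep -i native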
On Mon, 2017-03-20 at 08:55 +0100, benjamin roth wrote:
Did you check STW GCs?
You can do that with 'nodetool gcstats', by looking at the gc.log, or by
observing GC-related JMX metrics.
2017-03-20 8:52 GMT+01:00 Roland Otta <roland.o...@willhaben.at>:
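A minimal sketch of both checks (the gc.log path and logging option are just
assumptions, they depend on your JVM startup flags):

    # pause times and collection counts since the last invocation
    nodetool gcstats
    # stop-the-world pauses, if -XX:+PrintGCApplicationStoppedTime is enabled
    grep 'Total time for which application threads were stopped' /var/log/cassandra/gc.log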
hi,
we see the following behaviour in our environment:
the cluster consists of 6 nodes (cassandra version 3.0.7). the keyspace has a
replication factor of 3.
clients are writing data to the keyspace with consistency one.
we are doing parallel, incremental repairs with cassandra reaper.
even if a repair ju ...
... use EACH_QUORUM or LOCAL_QUORUM for both.
Chris
On Thu, Mar 30, 2017 at 1:22 AM, Roland Otta <roland.o...@willhaben.at> wrote:
hi,
we see the following behaviour in our environment:
the cluster consists of 6 nodes (cassandra version 3.0.7). the keyspace has a
replication factor of 3.
clients are w ...
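A minimal cqlsh sketch of what that looks like (keyspace/table names are
placeholders):

    cqlsh> CONSISTENCY LOCAL_QUORUM;
    Consistency level set to LOCAL_QUORUM.
    cqlsh> SELECT * FROM mykeyspace.mytable WHERE id = 1;

With RF 3, quorum writes and quorum reads each touch 2 of the 3 replicas, so
every read overlaps with at least one replica that saw the write, independent
of repairs.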
hi,
we are trying to set up a new datacenter and are initializing the data
with nodetool rebuild.
after some hours it seems that the node stopped streaming (at least
there is no more streaming traffic on the network interface).
nodetool netstats shows that the streaming is still in progress
Mode:
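for reference, we started the rebuild and check the progress roughly like
this (the source datacenter name is a placeholder):

    # stream all data for this node from the existing datacenter
    nodetool rebuild -- DC1
    # shows the receiving streams; in our case the byte counters stop moving
    nodetool netstats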
i checked the pending compactions. there are no pending compactions at the
moment.
bg - roland otta
On Fri, 2017-04-07 at 06:47 -0400, Jacob Shadix wrote:
What version are you running? Do you see any errors in the system.log
(SocketTimeout, for instance)?
And what values do you have for the following
at 7:16 AM, Roland Otta <roland.o...@willhaben.at> wrote:
Hi!
we are on 3.7.
we have some debug messages ... but i guess they are not related to that issue
DEBUG [GossipStage:1] 2017-04-07 13:11:00,440 FailureDetector.java:456 -
Ignoring interval time of 2002469610 for /192.168.0.2
Hi,
we have seen similar issues here.
have you verified that your rebuilds have finished successfully? we have
seen rebuilds that stopped streaming and working but never finished.
what does nodetool netstats output for your newly built up nodes?
br,
roland
On Mon, 2017-04-10 at 17:15
hi,
sometimes we have the problem that we have hinted handoffs (for example
because of network problems between 2 DCs) that do not get processed
even if the connection problem between the DCs recovers. Some of the
files stay in the hints directory until we restart the node that
contains the hints
22565129 n/a
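for reference, this is how we see the stuck hints (the path is just an
assumption, it depends on hints_directory in cassandra.yaml):

    # hint files that should be replayed once the other DC is reachable again
    ls -lh /var/lib/cassandra/hints
    # whether hinted handoff is currently running on this node
    nodetool statushandoff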
On Mon, Apr 10, 2017 at 5:28 PM, Roland Otta <roland.o...@willhaben.at> wrote:
Hi,
we have seen similar issues here.
have you verified that your rebuilds have finished successfully? we have
seen rebuilds that stopped streaming and working but ha ...
Hi Benjamin,
it's unlikely that i can assist you ... but nevertheless ... i'll give it a try ;-)
what's your consistency level for the insert?
what if one or more nodes are marked down and proper consistency can't be
achieved?
of course the error message does not indicate that problem (as it says it's
sorry .. ignore my comment ...
i missed your comment that the record is in the table ...
On Wed, 2017-04-12 at 16:48 +0200, Roland Otta wrote:
Hi Benjamin,
it's unlikely that i can assist you ... but nevertheless ... i'll give it a try ;-)
what's your consistency level for the insert?
what if one ...
hi,
we have the following issue on our 3.10 development cluster.
we are doing regular repairs with thelastpickle's fork of reaper.
sometimes the repair (it is a full repair in that case) hangs because
of a stuck validation compaction
nodetool compactionstats gives me
a1bb45c0-1fc6-11e7-81de-0fb ...
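for reference, a sketch of what we are looking at, plus the workaround we
could try (stopping the validation kills the running repair session):

    # the stuck validation shows up with a progress counter that never moves
    nodetool compactionstats
    # abort all running validation compactions
    nodetool stop -- VALIDATION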
connect to the node with JConsole and see where the compaction
thread is stuck
2017-04-13 8:34 GMT+02:00 Roland Otta <roland.o...@willhaben.at>:
hi,
we have the following issue on our 3.10 development cluster.
we are doing regular repairs with thelastpickle's fork of reaper.
so ...
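If JConsole is not at hand, a thread dump from the shell gives the same
view (the pid lookup is just one way to do it):

    # dump all JVM threads and look at what the validation executor is doing
    jstack $(pgrep -f CassandraDaemon) > /tmp/threads.txt
    grep -A 20 ValidationExecutor /tmp/threads.txt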
/HintedHandOffManagerMBean.html
but every time i try invoking that operation i get an
UnsupportedOperationException (tried it with hostname, ip and host-id
as parameters - every time the same exception)
On Tue, 2017-04-11 at 07:40 +0000, Roland Otta wrote:
> hi,
>
> sometimes we have the pro
oh ... the operation is deprecated according to the docs ...
On Thu, 2017-04-13 at 07:40 +0000, Roland Otta wrote:
> i figured out that there is an mbean
> org.apache.cassandra.db.type=HintedHandoffManager with the operation
> scheduleHintDelivery
>
> i guess that's what i wou ...
... wrote:
There is a nodetool command to resume hints. Maybe that helps?
On 13.04.2017 09:42, "Roland Otta" <roland.o...@willhaben.at> wrote:
oh ... the operation is deprecated according to the docs ...
On Thu, 2017-04-13 at 07:40 +0000, Roland Otta wrote:
> i figured out
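presumably the commands meant here are pausehandoff/resumehandoff; a
minimal sketch:

    # hint delivery can be paused and resumed per node
    nodetool pausehandoff
    nodetool resumehandoff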
Unfortunately I am no caffeine expert. It looks like the read is cached and
after the read caffeine tries to drain the cache and this is stuck. I don't see
the reason from that stack trace.
Someone would have to dig deeper into caffeine to find the root cause.
2017-04-13 9:27 GMT+02:00 Roland Otta <roland.o...@willhaben.at>:
related to my config changes
On Thu, 2017-04-13 at 11:58 +0200, benjamin roth wrote:
If you restart the server, does the same validation complete successfully?
If not, have you tried scrubbing the affected sstables?
2017-04-13 11:43 GMT+02:00 Roland Otta <roland.o...@willhaben.at>:
thank yo ...
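In case scrubbing is the next step, a minimal sketch (keyspace and table
names are placeholders):

    # rewrite the sstables of the affected table, dropping unreadable data
    nodetool scrub mykeyspace mytable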
/899929247.run(Unknown
Source)
java.lang.Thread.run(Thread.java:745)
br,
roland
On Thu, 2017-04-13 at 10:04 +0000, Roland Otta wrote:
i did 2 restarts before, which did not help.
after that i set, for testing purposes, file_cache_size_in_mb: 0 and
buffer_pool_use_heap_if_exhausted: false and ...
reproduction case for the issue - you should copy the
sstable away for further testing. Are you allowed to upload the broken sstable
to JIRA?
2017-04-13 13:15 GMT+02:00 Roland Otta <roland.o...@willhaben.at>:
sorry ... i have to correct myself ... the problem still persists.
tried nodetool ...
you which sstable is being scrubbed.
2017-04-13 15:07 GMT+02:00 Roland Otta <roland.o...@willhaben.at>:
i made a copy and also have the permission to upload sstables for that
particular column_family
is it possible to track down which sstable of that cf is affected or should i
uplo ...
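a minimal sketch of how the sstables belonging to that column family can be
listed, so a single file could be picked out (keyspace/table names are
placeholders; sstableutil ships with 3.x):

    # list all sstable files of the table
    sstableutil mykeyspace mytable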