>>> at com.thinkaurelius.thrift.TDisruptorServer$SelectorThread.processKey(TDisruptorServer.java:569) [thrift-server-0.3.7.jar:na]
>>> Jul 9 03:00:30 cassandra: at com.thinkaurelius.thrift.TDisruptorServer$AbstractSelectorThread.select(TDisruptorServer.jav…
"cat /data/ssd2/data/KeyspaceMetadata/x-x/lb-26203-big-Data.db > /dev/null"
If you get an error message, it's probably a hardware issue.

- Erik -

--
From: Philip Ó Condúin
Sent: Thursday, August 8, 2019 09:58
To: user@cassandra.apache.org
Subject: Re: Datafile Corruption

Hi Jon,
Good question, I'm not sure if we're using NVMe, I don't see /dev/nvme but we
could…
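A quick way to extend Erik's check to every sstable in the keyspace, as a
minimal sketch assuming the data path from his example:

    # Read every data file for the keyspace; any I/O error points at hardware.
    for f in /data/ssd2/data/KeyspaceMetadata/*/*-Data.db; do
        cat "$f" > /dev/null || echo "read error: $f"
    done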
>>> … fnic_handle_fip_timer: 8 callbacks suppressed
>>> Jul 9 03:00:37 kernel: fnic_handle_fip_timer: 8 callbacks suppressed
>>> Jul 9 03:00:43 kernel: fnic_handle_fip_timer: 8 callbacks suppressed
>>>
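Those fnic messages come from the Cisco FC/FCoE NIC driver, so it is worth
scanning the kernel log for storage-path errors around the corruption window;
a hedged sketch assuming syslog-style logging (on journald systems use
`journalctl -k` instead):

    # Look for fnic, SCSI, or I/O errors in the kernel log.
    grep -iE 'fnic|scsi|i/o error|medium error' /var/log/messages | tail -n 50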
>>> On Thu, 8 Aug 2019 at 15:42, ZAIDI, ASAD A wrote:
>>> Did you check that packets are NOT being dropped on the network interfaces
>>> the Cassandra instances are using (ifconfig -a)? Internode compression is
>>> set for all endpoints – maybe the network is playing a role here?
>>>
>>> Is this corruption limited to specific keyspaces/tables? From what you
>>> shared it looked like only a specific keyspace/table is affected – is that
>>> correct?
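One way to answer the dropped-packets question, as a sketch assuming standard
Linux tooling (interface names vary per host):

    # Per-interface error and drop counters.
    ifconfig -a | grep -iE 'errors|dropped'
    # Or, more compactly, one row per interface:
    netstat -i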
>>>
>>> When you remove a corrupted sstable of a certain table, I guess you
>>> verify all nodes for corrupted sstables of the same table (maybe with the
>>> nodetool scrub tool) so as to limit the spread of corruption – right?
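The scrub step Asad mentions is run per table on each node; a minimal sketch
with placeholder keyspace/table names:

    # Rewrite sstables, skipping corrupted rows (run on every node).
    nodetool scrub KeyspaceMetadata affected_table
    # Offline variant if the node is stopped (ships with the Cassandra tools):
    sstablescrub KeyspaceMetadata affected_table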
>>>
>>> Just curious to know – you're not using the lz4/default compressor for all
>>> tables; there must be some reason for it.
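To see which compressor each table is actually configured with, a hedged
sketch (the keyspace name is a placeholder; DESCRIBE output format varies by
cqlsh/Cassandra version):

    # Show the compression clause for every table in the keyspace.
    cqlsh -e "DESCRIBE KEYSPACE KeyspaceMetadata;" | grep -i compression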
>>>
>>> From: Philip Ó Condúin [mailto:philipocond...@gmail.com]
>>> Sent: Thursday, August 08, 2019 6:20 AM
>>> To: user@cassandra.apache.org
>>> Subject: Re: Datafile Corruption
The corrupt block exception from the compressor in 2.1/2.2 is something I don't
recall ever being attributed to anything other than bad hardware, so that seems
by far the most likely option. The corruption that the compressor is catching
says the checksum written immediately after the compress…
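You can force that checksum comparison across whole sstables rather than
waiting for reads to trip over a bad block; a hedged sketch assuming Cassandra
2.2 or later (both tools take placeholder keyspace/table names):

    # Online: verify sstable checksums through the running node.
    nodetool verify KeyspaceMetadata affected_table
    # Offline: verify the files directly.
    sstableverify KeyspaceMetadata affected_table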
Hi All,
Thank you so much for the replies.
Currently, I have the following list of things that can potentially cause some
sort of corruption in a Cassandra cluster:
- Sudden power cut - *we have had no power cuts in the datacenters*
- Network issues - *no network issues from what I can tell*
- …
Repair during upgrade has caused corruption too.
Also, dropping and adding columns with the same name but a different type.

Regards,
Nitan
Cell: 510 449 9629
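The column drop/re-add pattern Nitan mentions can be illustrated in cqlsh; a
hedged sketch with hypothetical keyspace, table, and column names (this is the
pattern to avoid, since old sstables still hold the column under its previous
type):

    # 'payload' was originally of type text.
    cqlsh -e "ALTER TABLE ks.t DROP payload;"
    # Re-adding it with a different type is the dangerous step.
    cqlsh -e "ALTER TABLE ks.t ADD payload blob;"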
> On Aug 7, 2019, at 2:42 PM, Jeff Jirsa wrote:
>
> Is compression enabled?
>
> If not, bit flips on disk can corrupt data files and reads + repair may send
> that corruption to other hosts in the cluster.
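Enabling compression also turns on per-block checksums, which is what lets a
node catch a bit flip on read instead of replicating it; a minimal sketch with
placeholder names (on 2.1 the option key is 'sstable_compression' rather than
'class'):

    # Enable the default LZ4 compressor, and with it checksummed blocks.
    cqlsh -e "ALTER TABLE KeyspaceMetadata.affected_table WITH compression = {'class': 'LZ4Compressor'};"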
A few reasons:
- Sudden power cut
- Disk full
- Issue in the Cassandra version, like CASSANDRA-13752
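The disk-full case is quick to rule out; a hedged sketch, assuming the data
path from earlier in the thread and a default commitlog location:

    # Check free space on the data and commitlog volumes of each node.
    df -h /data/ssd2/data /var/lib/cassandra/commitlog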
On Wed, Aug 7, 2019, 4:16 PM Philip Ó Condúin wrote:

> Hi All,
>
> I am currently experiencing multiple datafile corruptions across most
> nodes in my cluster, there seems to be no pattern to the corruptio…