Re: Corrupted sstables

2019-05-14 Thread Roy Burstein
Hi Alain,
We add 12 tables in a weekly job, and drop the history table.
Before it adds/drops each table, the job checks for schema mismatch by
running "SELECT peer, schema_version, tokens FROM peers" (a minimal sketch
of that check follows the output below).
nodetool describecluster looks OK, only one schema version:
Cluster Information:
    Name: [removed]
    Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
    DynamicEndPointSnitch: enabled
    Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
    Schema versions:
        30cb4963-109c-3077-8bdd-df9bfb313568: [10...output truncated]
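
A minimal sketch of that check (the host and credentials are placeholders; a
node's own schema version is in system.local, its view of the other nodes in
system.peers, and every row should show one and the same UUID):

cqlsh 10.0.0.1 -e "SELECT schema_version FROM system.local;
                   SELECT peer, schema_version FROM system.peers;"
# More than one distinct schema_version means schema disagreement, so the
# job waits and re-checks before issuing the next CREATE/DROP.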

We recently shut down the cluster and started it again (a maintenance job),
but that job runs daily.
I have tried to correlate the job times with the table corruption timestamps
but did not find any relation; still, this direction may be relevant.

Thanks,
Roy


Re: Corrupted sstables

2019-05-10 Thread Alain RODRIGUEZ
Hello Roy,

The name of the table makes me think that you might be doing automated
changes to the schema. I just dug into this topic for someone else, and schema
changes are far less consistent than standard Cassandra operations (see
https://issues.apache.org/jira/browse/CASSANDRA-10699).

> sessions_rawdata/sessions_v2_2019_05_06-9cae0c20585411e99aa867a11519e31c/md-816-big-Index.db

Idea 1: Some of these queries might have failed for multiple reasons on a
node (down for too long, race conditions, ...), leaving the cluster in an
unstable state where there is a schema disagreement. In that case, you
could have trouble when adding a new node; I have seen it happen. Could
you check/share with us the output of 'nodetool describecluster'?

Also, did you recently try to perform a rolling restart? This often helps
synchronise local schemas and 'could' fix the issue. Another option is
'nodetool resetlocalschema' on the node(s) that are out of sync.
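
A sketch of that resync, one out-of-sync node at a time:

nodetool describecluster    # confirm which schema version this node holds
nodetool resetlocalschema   # drop the local schema and re-fetch it from peers
nodetool describecluster    # verify the node has converged before moving on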

Idea 2: If you identified that you have broken secondary indexes, maybe try
running 'nodetool rebuild_index <keyspace> <table> <index>' on all nodes
before adding the next node?
https://cassandra.apache.org/doc/latest/tools/nodetool/rebuild_index.html
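
For example (a hypothetical invocation; the index name is made up, and the
exact form nodetool expects can vary by version):

nodetool rebuild_index sessions_rawdata sessions_v2_2019_05_06 sessions_v2_idx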

Hope this helps,
C*heers,
---
Alain Rodriguez - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com


Re: Corrupted sstables

2019-05-09 Thread Jason Wee
Maybe print the value out into the logfile; that should lead to some clue
about where the problem might be?


Re: Corrupted sstables

2019-05-07 Thread Paul Chandler
Roy, we spent a long time trying to fix it but didn't find a solution. It was
a test cluster, so we ended up rebuilding the cluster rather than spending any
more time trying to fix the corruption. We had worked out what caused it, so
we were happy it wasn't going to occur in production. Sorry that is not much
help, but I am not even sure it is the same issue you have.

Paul


Re: Corrupted sstables

2019-05-07 Thread Roy Burstein
I can say that it happens now as well; currently no node has been
added/removed.
The corrupted sstables are usually the index files, and on some machines the
sstable does not even exist on the filesystem.
On one machine I was able to dump the sstable to a dump file without any
issue. Any idea how to tackle this?


Re: Corrupted sstables

2019-05-06 Thread Paul Chandler
Roy,

I have seen this exception before when a column had been dropped and then
re-added with the same name but a different type. In particular, we dropped a
column and re-created it as static, then got this exception from the old
sstables created prior to the DDL change.
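
A sketch of the kind of DDL sequence involved (table and column names here
are made up, and newer versions may reject the re-add outright):

cqlsh -e "ALTER TABLE ks.sessions DROP flags;"             # old type: int
cqlsh -e "ALTER TABLE ks.sessions ADD flags text static;"  # new type, now static
# sstables written before the DROP still encode 'flags' as a regular int
# column, which is what reads over the old sstables then trip on.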

Not sure if this applies in your case.

Thanks 

Paul


Re: Corrupted sstables

2019-05-06 Thread Nitan Kainth
Can the disk have bad sectors? fsck or something similar can help.

Long shot: a repair or some other operation conflicting. I would leave that
to others.
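
A quick sketch of that disk check (assumes smartmontools is installed;
device names are placeholders):

sudo smartctl -a /dev/sda | grep -i -E 'reallocated|pending|uncorrect'
sudo dmesg | grep -i -E 'i/o error|bad sector'
# fsck needs the filesystem unmounted; '-n' reports without changing anything:
# sudo fsck -n /dev/sda1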


Re: Corrupted sstables

2019-05-06 Thread Roy Burstein
It happens on the same column families, and they have the same DDL (as
already posted). I did not check it after cleanup.


Re: Corrupted sstables

2019-05-06 Thread Nitan Kainth
This is strange; I have never seen this. Does it happen to the same column
family?

Does it happen after cleanup?


Re: Corrupted sstables

2019-05-06 Thread Roy Burstein
Yes.


Re: Corrupted sstables

2019-05-06 Thread Nitan Kainth
Roy,

You mean all nodes show corruption when you add a node to the cluster?


Regards,
Nitan
Cell: 510 449 9629


Re: Corrupted sstables

2019-05-06 Thread Roy Burstein
It happened on all the servers in the cluster every time I have added a
node.
This is a new cluster; nothing was upgraded here. We have a similar cluster
running on C* 2.1.15 with no issues.
We are aware of the scrub utility; it just reproduces every time we add a
node to the cluster.

We have many tables there; the DDL of the corrupted sstables looks the same:
CREATE TABLE rawdata.a1 (
    session_start_time_timeslice bigint,
    uid_bucket int,
    vid_bucket int,
    pid int,
    uid text,
    sid bigint,
    vid bigint,
    data_type text,
    data_id bigint,
    data blob,
    PRIMARY KEY ((session_start_time_timeslice, uid_bucket, vid_bucket),
                 pid, uid, sid, vid, data_type, data_id)
) WITH CLUSTERING ORDER BY (pid ASC, uid ASC, sid ASC, vid ASC,
                            data_type ASC, data_id ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.2
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

CREATE TABLE rawdata.a2 (
    session_start_time_timeslice bigint,
    uid_bucket int,
    vid_bucket int,
    pid int,
    uid text,
    sid bigint,
    vid bigint,
    data_type text,
    data_id bigint,
    data blob,
    PRIMARY KEY ((session_start_time_timeslice, uid_bucket, vid_bucket),
                 pid, uid, sid, vid, data_type, data_id)
) WITH CLUSTERING ORDER BY (pid ASC, uid ASC, sid ASC, vid ASC,
                            data_type ASC, data_id ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.2
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

CREATE TABLE rawdata.a3 (
    session_start_time_timeslice bigint,
    uid_bucket int,
    vid_bucket int,
    pid int,
    uid text,
    sid bigint,
    vid bigint,
    data_type text,
    data_id bigint,
    data blob,
    PRIMARY KEY ((session_start_time_timeslice, uid_bucket, vid_bucket),
                 pid, uid, sid, vid, data_type, data_id)
) WITH CLUSTERING ORDER BY (pid ASC, uid ASC, sid ASC, vid ASC,
                            data_type ASC, data_id ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.2
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

CREATE TABLE rawdata.a4 (
    session_start_time_timeslice bigint,
    uid_bucket int,
    vid_bucket int,
    pid int,
    uid text,
    sid bigint,
    vid bigint,
    data_type text,
    data_id bigint,
    data blob,
    PRIMARY KEY ((session_start_time_timeslice, uid_bucket, vid_bucket),
                 pid, uid, sid, vid, data_type, data_id)
) WITH CLUSTERING ORDER BY (pid ASC, uid ASC, sid ASC, vid ASC,
                            data_type ASC, data_id ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.2
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';


Re: Corrupted sstables

2019-05-06 Thread Jeff Jirsa
Before you scrub, from which version were you upgrading and can you post a(n 
anonymized) schema?
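
One way to grab that (a sketch; the keyspace name matches the DDL elsewhere
in the thread, and the sed rename is only a crude anonymization pass):

cqlsh -e "DESCRIBE KEYSPACE rawdata;" > schema.cql
sed -i 's/rawdata/ks1/g' schema.cql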

-- 
Jeff Jirsa


Re: Corrupted sstables

2019-05-06 Thread Nitan Kainth
Did you try sstablescrub?
If that doesn't work, you can delete all files of this sstable ID and then
run 'nodetool repair -pr' on this node.
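
A sketch of that recovery path (keyspace, table, and sstable generation are
taken from the trace below; adapt per affected node):

nodetool scrub sessions_rawdata sessions_v2_2019_05_06       # online scrub
# sstablescrub sessions_rawdata sessions_v2_2019_05_06       # offline, node stopped
# If scrub cannot fix it, stop the node and remove every component of the
# corrupted sstable (generation 816 below), then repair the primary ranges:
# rm /var/lib/cassandra/data/disk1/sessions_rawdata/sessions_v2_2019_05_06-*/md-816-big-*
# nodetool repair -pr sessions_rawdata sessions_v2_2019_05_06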

On Mon, May 6, 2019 at 9:20 AM Roy Burstein  wrote:

> Hi,
> We are having issues with Cassandra 3.11.4: after adding a node to the
> cluster we get many corrupted files across the cluster (almost all nodes);
> this is reproducible in our environment.
> We have 69 nodes in the cluster, disk_access_mode: standard.
>
> The stack trace:
>
> WARN  [ReadStage-4] 2019-05-06 06:44:19,843 
> AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread 
> Thread[ReadStage-4,5,main]: {}
> java.lang.RuntimeException: 
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: 
> /var/lib/cassandra/data/disk1/sessions_rawdata/sessions_v2_2019_05_06-9cae0c20585411e99aa867a11519e31c/md-816-big-Index.db
> at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2588)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0-zing_19.03.0.0]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
>  [apache-cassandra-3.11.4.jar:3.11.4]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:114) 
> [apache-cassandra-3.11.4.jar:3.11.4]
> at java.lang.Thread.run(Thread.java:748) [na:1.8.0-zing_19.03.0.0]
> Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: 
> Corrupted: 
> /var/lib/cassandra/data/disk1/sessions_rawdata/sessions_v2_2019_05_06-9cae0c20585411e99aa867a11519e31c/md-816-big-Index.db
> at 
> org.apache.cassandra.io.sstable.format.big.BigTableReader.getPosition(BigTableReader.java:275)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.io.sstable.format.SSTableReader.getPosition(SSTableReader.java:1586)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.io.sstable.format.big.BigTableReader.iterator(BigTableReader.java:64)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.initializeIterator(UnfilteredRowIteratorWithLowerBound.java:108)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.maybeInit(LazilyInitializedUnfilteredRowIterator.java:48)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:99)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.computeNext(UnfilteredRowIteratorWithLowerBound.java:119)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.computeNext(UnfilteredRowIteratorWithLowerBound.java:48)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:374)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:186)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:155)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:525)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:385)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIterator.isEmpty(UnfilteredRowIterator.java:67)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.db.SinglePartitionReadCommand.withSSTablesIterated(SinglePartitionReadCommand.java:853)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
> org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndDiskInternal(SinglePartitionReadCommand.java:797)
>  ~[apache-cassandra-3.11.4.jar:3.11.4]
> at 
>