Hi,

One of my 1.1.1 nodes fails to restart because of a stack overflow while
building the interval tree. Increasing the stack size doesn't help. Here's
the stack trace:

https://gist.github.com/2889611

It looks more like an infinite loop in the IntervalNode constructor's logic
than a deep tree, since the DEBUG log shows it looping over the same
intervals:

https://gist.github.com/2889862

Running it with assertions enabled shows a number of sstables for which the
first key > last key, for example:

2012-06-07_16:12:18.18781 java.lang.AssertionError: SSTable first key
DecoratedKey(22540092521493542684444486114339861094,
3730343137317c3438333632333932) > last key
DecoratedKey(22166106697727078019854024428005234814,
313138323637397c3432373931353435)

and lets the node come up without hitting the IntervalNode constructor. I
wonder how invalid sstables get created in the first place. Is there a way
to verify whether other nodes in the cluster are affected as well?
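
To make the assertion above concrete, here is a minimal sketch that parses
the message and checks the ordering. It assumes the partitioner orders keys
by their integer token (as RandomPartitioner does) and that the message has
the same format as the example above; parse_first_last is a hypothetical
helper, not part of Cassandra.

```python
import re

# Matches the DecoratedKey(token, hex_key) text seen in the assertion message.
KEY_RE = re.compile(r"DecoratedKey\((\d+),\s*([0-9a-fA-F]+)\)")


def parse_first_last(message):
    """Return [(token, raw_key_bytes), ...] for the first and last key."""
    keys = KEY_RE.findall(message)
    if len(keys) != 2:
        raise ValueError("expected exactly two DecoratedKey entries")
    return [(int(token), bytes.fromhex(hex_key)) for token, hex_key in keys]


# The assertion text from the log excerpt, joined onto one logical line.
message = (
    "java.lang.AssertionError: SSTable first key "
    "DecoratedKey(22540092521493542684444486114339861094, "
    "3730343137317c3438333632333932) > last key "
    "DecoratedKey(22166106697727078019854024428005234814, "
    "313138323637397c3432373931353435)"
)

(first_token, first_key), (last_token, last_key) = parse_first_last(message)

# Assumption: keys are ordered by token, so first_token must not exceed last_token.
out_of_order = first_token > last_token
print(first_key.decode("ascii"), last_key.decode("ascii"), out_of_order)
```

Decoding the hex also shows the raw row keys behind the two tokens, which
may help when looking for the same rows on other nodes.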

As for getting the node back up without wiping its data and letting it
bootstrap again: if I remove the affected sstables and restart the node,
then run a repair, will the node end up in a consistent state?

The sstables contain counter columns, and leveled compaction is used.

Thanks,
Omid
