Re: Cassandra 1.1.1 stack overflow on an infinite loop building IntervalTree

2012-06-08 Thread Omid Aladini
Thanks. Yes it's exactly the same. Will follow up there.

-- Omid

On Fri, Jun 8, 2012 at 5:55 PM, Sylvain Lebresne wrote:

> Looks a lot like https://issues.apache.org/jira/browse/CASSANDRA-4321.
> Feel free to add a comment on there if you have any additional info.
>
> --
> Sylvain
>
> On Fri, Jun 8, 2012 at 12:06 PM, Omid Aladini wrote:
> > Also looks similar to this ticket:
> >
> > https://issues.apache.org/jira/browse/CASSANDRA-4078
> >
> >
> >
> > On Thu, Jun 7, 2012 at 6:48 PM, Omid Aladini wrote:
> >>
> >> Hi,
> >>
> >> One of my 1.1.1 nodes doesn't restart due to stack overflow on building
> >> the interval tree. Bumping the stack size doesn't help. Here's the stack
> >> trace:
> >>
> >> https://gist.github.com/2889611
> >>
> >> It looks more like an infinite loop in the IntervalNode constructor's logic
> >> than a deep tree, since the DEBUG log shows looping over the same intervals:
> >>
> >> https://gist.github.com/2889862
> >>
> >> Running it with assertions enabled shows a number of sstables in which the
> >> first key > last key, for example:
> >>
> >> 2012-06-07_16:12:18.18781 java.lang.AssertionError: SSTable first key
> >> DecoratedKey(2254009252149354268486114339861094,
> >> 3730343137317c3438333632333932) > last key
> >> DecoratedKey(22166106697727078019854024428005234814,
> >> 313138323637397c3432373931353435)
> >>
> >> and lets the node come up without hitting the IntervalNode constructor. I
> >> wonder how invalid sstables get created in the first place? Is there a way
> >> to verify if other nodes in the cluster are affected as well?
> >>
> >> Speaking of a solution to get the node back up without wiping the data
> >> and letting it bootstrap again, I was wondering: if I remove the affected
> >> sstables and restart the node, followed by a repair, will the node end up
> >> in a consistent state?
> >>
> >> SStables contain counter columns and leveled compaction is used.
> >>
> >> Thanks,
> >> Omid
> >
> >
>


Re: Cassandra 1.1.1 stack overflow on an infinite loop building IntervalTree

2012-06-08 Thread Sylvain Lebresne
Looks a lot like https://issues.apache.org/jira/browse/CASSANDRA-4321.
Feel free to add a comment on there if you have any additional info.

--
Sylvain

On Fri, Jun 8, 2012 at 12:06 PM, Omid Aladini wrote:
> Also looks similar to this ticket:
>
> https://issues.apache.org/jira/browse/CASSANDRA-4078
>
>
>
> On Thu, Jun 7, 2012 at 6:48 PM, Omid Aladini wrote:
>>
>> Hi,
>>
>> One of my 1.1.1 nodes doesn't restart due to stack overflow on building
>> the interval tree. Bumping the stack size doesn't help. Here's the stack
>> trace:
>>
>> https://gist.github.com/2889611
>>
>> It looks more like an infinite loop in the IntervalNode constructor's logic
>> than a deep tree, since the DEBUG log shows looping over the same intervals:
>>
>> https://gist.github.com/2889862
>>
>> Running it with assertions enabled shows a number of sstables in which the
>> first key > last key, for example:
>>
>> 2012-06-07_16:12:18.18781 java.lang.AssertionError: SSTable first key
>> DecoratedKey(2254009252149354268486114339861094,
>> 3730343137317c3438333632333932) > last key
>> DecoratedKey(22166106697727078019854024428005234814,
>> 313138323637397c3432373931353435)
>>
>> and lets the node come up without hitting the IntervalNode constructor. I
>> wonder how invalid sstables get created in the first place? Is there a way to
>> verify if other nodes in the cluster are affected as well?
>>
>> Speaking of a solution to get the node back up without wiping the data
>> and letting it bootstrap again, I was wondering: if I remove the affected
>> sstables and restart the node, followed by a repair, will the node end up in
>> a consistent state?
>>
>> SStables contain counter columns and leveled compaction is used.
>>
>> Thanks,
>> Omid
>
>


Re: Cassandra 1.1.1 stack overflow on an infinite loop building IntervalTree

2012-06-08 Thread Omid Aladini
Also looks similar to this ticket:

https://issues.apache.org/jira/browse/CASSANDRA-4078


On Thu, Jun 7, 2012 at 6:48 PM, Omid Aladini wrote:

> Hi,
>
> One of my 1.1.1 nodes doesn't restart due to stack overflow on building
> the interval tree. Bumping the stack size doesn't help. Here's the stack
> trace:
>
> https://gist.github.com/2889611
>
> It looks more like an infinite loop in the IntervalNode constructor's logic
> than a deep tree, since the DEBUG log shows looping over the same intervals:
>
> https://gist.github.com/2889862
>
> Running it with assertions enabled shows a number of sstables in which the
> first key > last key, for example:
>
> 2012-06-07_16:12:18.18781 java.lang.AssertionError: SSTable first key
> DecoratedKey(2254009252149354268486114339861094,
> 3730343137317c3438333632333932) > last key
> DecoratedKey(22166106697727078019854024428005234814,
> 313138323637397c3432373931353435)
>
> and lets the node come up without hitting the IntervalNode constructor. I
> wonder how invalid sstables get created in the first place? Is there a way
> to verify if other nodes in the cluster are affected as well?
>
> Speaking of a solution to get the node back up without wiping the data
> and letting it bootstrap again, I was wondering: if I remove the affected
> sstables and restart the node, followed by a repair, will the node end up in
> a consistent state?
>
> SStables contain counter columns and leveled compaction is used.
>
> Thanks,
> Omid
>


Cassandra 1.1.1 stack overflow on an infinite loop building IntervalTree

2012-06-07 Thread Omid Aladini
Hi,

One of my 1.1.1 nodes doesn't restart due to stack overflow on building the
interval tree. Bumping the stack size doesn't help. Here's the stack trace:

https://gist.github.com/2889611

It looks more like an infinite loop in the IntervalNode constructor's logic
than a deep tree, since the DEBUG log shows looping over the same intervals:

https://gist.github.com/2889862
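
To illustrate why this would be a loop rather than a deep tree, here is a
minimal Python sketch of a centered interval tree build. It is an assumption
about the general technique, not Cassandra's actual IntervalNode code: the
Node class, its fields, and the max_depth guard are hypothetical names made up
for this example. Each node takes the median endpoint as its center, keeps the
intervals that contain it, and hands the rest to the left or right child. An
interval whose first key sorts after its last key can never contain the
center, so it is passed down unchanged and the recursion makes no progress.

```python
from typing import List, Tuple

# (first, last); a well-formed interval has first <= last.
Interval = Tuple[int, int]


class Node:
    """Simplified centered interval tree node (illustrative only)."""

    def __init__(self, intervals: List[Interval], depth: int = 0,
                 max_depth: int = 50):
        # Hypothetical guard so the demo fails fast instead of
        # overflowing the stack.
        if depth > max_depth:
            raise RecursionError(
                f"no progress after {max_depth} levels: {intervals}")

        # Use the median of all endpoints as the center point.
        points = sorted(p for iv in intervals for p in iv)
        self.center = points[len(points) // 2]

        # Intervals containing the center stay at this node.
        self.here = [iv for iv in intervals
                     if iv[0] <= self.center <= iv[1]]
        # Intervals ending before the center go left,
        # intervals starting after it go right.
        left = [iv for iv in intervals if iv[1] < self.center]
        right = [iv for iv in intervals if iv[0] > self.center]

        self.left = Node(left, depth + 1, max_depth) if left else None
        self.right = Node(right, depth + 1, max_depth) if right else None


# Well-formed intervals: each level keeps at least one interval,
# so the children shrink and the build terminates.
tree = Node([(1, 3), (2, 6), (8, 9)])

# Inverted interval (first > last): it never lands in `here`,
# so the same list is handed down at every level.
try:
    Node([(10, 5)])
except RecursionError as err:
    print("build did not terminate:", err)
```

With well-formed input, the interval that supplied the center always stays at
the current node, so each child gets a strictly smaller list. That guarantee
is what an inverted interval breaks, and a larger stack would not help.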

Running it with assertions enabled shows a number of sstables in which the
first key > last key, for example:

2012-06-07_16:12:18.18781 java.lang.AssertionError: SSTable first key
DecoratedKey(2254009252149354268486114339861094,
3730343137317c3438333632333932) > last key
DecoratedKey(22166106697727078019854024428005234814,
313138323637397c3432373931353435)

and lets the node come up without hitting the IntervalNode constructor. I
wonder how invalid sstables get created in the first place? Is there a way
to verify if other nodes in the cluster are affected as well?
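
One rough way to look for the same symptom elsewhere is to scan each node's
log for this assertion message. The Python sketch below does only that. It
assumes the message format is exactly as in the example above, that the second
field of DecoratedKey(...) is the hex-encoded row key, and that the keys are
ASCII text. The names find_inverted_ranges and BadRange are hypothetical. The
message as shown does not name the sstable file, so this lists key ranges
only, and it only finds anything on nodes running with assertions enabled.

```python
import re
from typing import List, NamedTuple

# Matches the assertion text after line wrapping has been collapsed.
PATTERN = re.compile(
    r"SSTable first key DecoratedKey\((\d+),\s*([0-9a-fA-F]+)\)\s*>\s*"
    r"last key DecoratedKey\((\d+),\s*([0-9a-fA-F]+)\)"
)


class BadRange(NamedTuple):
    first_token: int
    first_key: str
    last_token: int
    last_key: str


def decode_key(hex_key: str) -> str:
    # Assumption: keys are ASCII; other bytes are replaced, not rejected.
    return bytes.fromhex(hex_key).decode("ascii", "replace")


def find_inverted_ranges(log_text: str) -> List[BadRange]:
    # Collapse all whitespace so a message wrapped across lines still matches.
    flat = " ".join(log_text.split())
    found = []
    for match in PATTERN.finditer(flat):
        first_token, first_hex, last_token, last_hex = match.groups()
        found.append(BadRange(int(first_token), decode_key(first_hex),
                              int(last_token), decode_key(last_hex)))
    return found


sample = """2012-06-07_16:12:18.18781 java.lang.AssertionError: SSTable first key
DecoratedKey(2254009252149354268486114339861094,
3730343137317c3438333632333932) > last key
DecoratedKey(22166106697727078019854024428005234814,
313138323637397c3432373931353435)"""

for bad in find_inverted_ranges(sample):
    print(bad.first_key, "...", bad.last_key)
```

For the message above, the two hex fields decode to the row keys
704171|48362392 and 1182679|42791545.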

Speaking of a solution to get the node back up without wiping the data
and letting it bootstrap again, I was wondering: if I remove the affected
sstables and restart the node, followed by a repair, will the node end up in
a consistent state?

SStables contain counter columns and leveled compaction is used.

Thanks,
Omid