[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-03-01 Thread Marco Cadetg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173441#comment-15173441
 ] 

Marco Cadetg commented on CASSANDRA-11172:
--

I had to bring the cluster back into a healthy state, so I deleted the whole 
keyspace that was affected. Note that the problem seemed to start on a single 
node and then spread to other nodes. Also, rate limiting how often this log 
message gets logged would be nice, so that issues remain debuggable. 

Since the deletion of the whole keyspace I haven't seen any issues.
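
The rate-limiting idea mentioned above could look roughly like the following. 
This is only a minimal sketch, not the fix that went into Cassandra: the class 
name RateLimitedLog, the key string and the one-second interval are all 
hypothetical, chosen just to illustrate letting a repeated message through at 
most once per interval.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.LongSupplier;

/** Hypothetical helper: lets a given message key through at most once per interval. */
final class RateLimitedLog {
    private final long minIntervalNanos;
    private final LongSupplier clock; // injectable so the behaviour can be tested
    private final ConcurrentMap<String, Long> lastEmitted = new ConcurrentHashMap<>();

    RateLimitedLog(long minIntervalNanos, LongSupplier clock) {
        this.minIntervalNanos = minIntervalNanos;
        this.clock = clock;
    }

    /** Returns true if the caller should log this key now, false if it is suppressed. */
    boolean shouldLog(String key) {
        long now = clock.getAsLong();
        boolean[] allowed = {false};
        // compute() runs atomically per key, so concurrent callers cannot both pass.
        lastEmitted.compute(key, (k, last) -> {
            if (last == null || now - last >= minIntervalNanos) {
                allowed[0] = true;
                return now;
            }
            return last;
        });
        return allowed[0];
    }

    public static void main(String[] args) {
        // Assumed interval: one second between repeats of the same message.
        RateLimitedLog limiter = new RateLimitedLog(1_000_000_000L, System::nanoTime);
        for (int i = 0; i < 1_000_000; i++) {
            if (limiter.shouldLog("adding-high-level")) {
                System.out.println("Adding high-level (L3) ... to candidates");
            }
        }
    }
}
```

With a guard like this around the log call, a tight loop would still spin, but 
earlier messages such as a preceding exception would not be rotated out as 
quickly.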

> Infinite loop bug adding high-level SSTableReader in compaction
> ---
>
> Key: CASSANDRA-11172
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11172
> Project: Cassandra
> Issue Type: Bug
> Components: Compaction
> Environment: DSE 4.x / Cassandra 2.1.11.969
> Reporter: Jeff Ferland
> Assignee: Marcus Eriksson
> Fix For: 2.1.14, 2.2.6, 3.0.4, 3.4
>
> Attachments: beep.txt, tpstats.txt, tpstats_compaction.txt, 
> trapped_in_compaction.txt, trapped_in_compaction_mixed.txt
>
>
> Observed that after a large repair on LCS, the system will sometimes 
> enter an infinite loop with vast amounts of log lines recording, "Adding 
> high-level (L${LEVEL}) SSTableReader(path='${TABLE}') to candidates".
> This results in an outage of the node and eventual crashing. The log spam 
> quickly rotates out possibly useful earlier debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-26 Thread Marco Cadetg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15169058#comment-15169058
 ] 

Marco Cadetg commented on CASSANDRA-11172:
--

Here is the exception that occurs before the log spamming starts:
{code}
ERROR [CompactionExecutor:5] 2016-02-26 14:05:26,622 CassandraDaemon.java:195 - Exception in thread Thread[CompactionExecutor:5,1,main]
java.lang.AssertionError: null
    at org.apache.cassandra.db.rows.BufferCell.<init>(BufferCell.java:49) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.BufferCell.tombstone(BufferCell.java:88) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.BufferCell.tombstone(BufferCell.java:83) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.BufferCell.purge(BufferCell.java:175) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.ComplexColumnData.lambda$purge$107(ComplexColumnData.java:165) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.ComplexColumnData$$Lambda$126/379481423.apply(Unknown Source) ~[na:na]
    at org.apache.cassandra.utils.btree.BTree$FiltrationTracker.apply(BTree.java:650) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.utils.btree.BTree.transformAndFilter(BTree.java:693) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.utils.btree.BTree.transformAndFilter(BTree.java:668) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.ComplexColumnData.transformAndFilter(ComplexColumnData.java:170) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.ComplexColumnData.purge(ComplexColumnData.java:165) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.ComplexColumnData.purge(ComplexColumnData.java:43) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.BTreeRow.lambda$purge$102(BTreeRow.java:333) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.BTreeRow$$Lambda$125/1572342504.apply(Unknown Source) ~[na:na]
    at org.apache.cassandra.utils.btree.BTree$FiltrationTracker.apply(BTree.java:650) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.utils.btree.BTree.transformAndFilter(BTree.java:693) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.utils.btree.BTree.transformAndFilter(BTree.java:668) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.BTreeRow.transformAndFilter(BTreeRow.java:338) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.rows.BTreeRow.purge(BTreeRow.java:333) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.partitions.PurgeFunction.applyToRow(PurgeFunction.java:88) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:116) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.ColumnIndex$Builder.build(ColumnIndex.java:120) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.ColumnIndex.writeAndBuildIndex(ColumnIndex.java:57) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:153) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:118) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.compaction.writers.MaxSSTableSizeWriter.realAppend(MaxSSTableSizeWriter.java:74) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:132) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:182) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:78) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:264) ~[apache-cassandra-3.3.0.jar:3.3.0]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_45]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_45]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_45]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617
{code}

[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-26 Thread Marco Cadetg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15169049#comment-15169049
 ] 

Marco Cadetg commented on CASSANDRA-11172:
--

I've hit the bug again without doing any repair. Again, the logs fill up with 
the message so quickly that I was unable to capture any message from before the 
LCS message.
{code}
root@cassandra1:/var/log/cassandra# nodetool compactionstats
pending tasks: 41
- ham.raw_sessions: 41
{code}


[jira] [Updated] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-26 Thread Marco Cadetg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Cadetg updated CASSANDRA-11172:
-
Attachment: tpstats.txt


[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-26 Thread Marco Cadetg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15168960#comment-15168960
 ] 

Marco Cadetg commented on CASSANDRA-11172:
--

[~krummas] I've added nodetool tpstats output. Would you like to see some other 
output from nodetool?


[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-26 Thread Marco Cadetg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15168954#comment-15168954
 ] 

Marco Cadetg commented on CASSANDRA-11172:
--

No, I didn't start any repair after restarting the nodes. I have just restarted 
one node and will try to catch the exception thrown in the thread doing the 
compaction. I hope that helps to figure out the root cause.


[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-26 Thread Marco Cadetg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15168938#comment-15168938
 ] 

Marco Cadetg commented on CASSANDRA-11172:
--

Unfortunately I can't provide any exception trace from before the issue starts 
because all 20 zipped logs are already full of the log line above. I also 
started a {{nodetool scrub ham raw_sessions}}, but I'm wondering whether this 
will do any good; there is a lot of data, so it might take a while until it hits 
the sstables causing the issue.


[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-26 Thread Marco Cadetg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15168934#comment-15168934
 ] 

Marco Cadetg commented on CASSANDRA-11172:
--

[~krummas] I've restarted nodes many times and the issue just comes up again. 
Does this mean that it is not related to the fix you've done? The issue happens 
during normal compactions on a just-restarted node after some time (I think when 
it tries to do the first compaction), with the result that system.log gets 
spammed with an endless stream of these messages:
{code}
INFO  [CompactionExecutor:5] 2016-02-26 05:45:56,479 LeveledManifest.java:438 - Adding high-level (L3) BigTableReader(path='/var/lib/cassandra/data2/ham/raw_sessions-417de7c0bb4711e4972d05e7bd5b0c2f/ma-45159-big-Data.db') to candidates
{code}
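
To check whether the spam cycles over a single SSTable or several, the 
repeated line can be tallied by level and path. The following is a minimal 
sketch, assuming each log entry sits on one line in the format of the sample 
above; the class name CandidateSpamTally and the regular expression are 
illustrative only, and other Cassandra versions may word the message differently.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: counts "Adding high-level" log lines per level and SSTable path. */
final class CandidateSpamTally {
    // Pattern derived from the single sample line in this comment (an assumption).
    private static final Pattern LINE = Pattern.compile(
        "Adding high-level \\(L(\\d+)\\)\\s+\\w+\\(path='([^']+)'\\)\\s+to candidates");

    /** Returns a map from "L<level> <path>" to the number of matching lines. */
    static Map<String, Integer> tally(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (m.find()) {
                counts.merge("L" + m.group(1) + " " + m.group(2), 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Feeding it the lines of system.log would show how many distinct SSTables 
appear in the loop and how often each one is re-added.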


[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-26 Thread Marco Cadetg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15168706#comment-15168706
 ] 

Marco Cadetg commented on CASSANDRA-11172:
--

[~krummas] OK, I guess that is to prevent it from occurring, but we've already 
done this. Is there a way to fix the issue afterwards, e.g. would scrubbing the 
tables help?


[jira] [Commented] (CASSANDRA-11172) Infinite loop bug adding high-level SSTableReader in compaction

2016-02-25 Thread Marco Cadetg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15168435#comment-15168435
 ] 

Marco Cadetg commented on CASSANDRA-11172:
--

We are seeing this issue on multiple nodes and it brings down our Cassandra 
cluster. Is there some way to mitigate the issue until 3.4 is out?
