[ 
https://issues.apache.org/jira/browse/CASSANDRA-9021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-9021:
--------------------------------
    Attachment: 9021.txt

OK, so it was a really stupid mistake; the runOnClose now depends on 
ifile/dfile which are closed prior to it being run. Simple patch attached to 
reorder these operations.

> AssertionError and Leak detected during sstable compaction
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-9021
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9021
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Cluster setup:
> - 20-node stress cluster, GCE n1-standard-2
> - 10-node receiver cluster ingesting data, GCE n1-standard-8 
> Platform:
> - Ubuntu 12.0.4 x86_64
> Versions:
> - DSE 4.7.0
> - Cassandra 2.1.3.304
> - Java 1.7.0_45
> DSE Configuration:
> - Xms7540M 
> - Xmx7540M 
> - Xmn800M
> - Ddse.system_cpu_cores=8 -Ddse.system_memory_in_mb=30161 
> - Dcassandra.config.loader=com.datastax.bdp.config.DseConfigurationLoader 
> - ea -javaagent:/usr/local/lib/dse/resources/cassandra/lib/jamm-0.3.0.jar 
> - XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities 
> - XX:ThreadPriorityPolicy=42 -Xms7540M -Xmx7540M -Xmn800M 
> - XX:+HeapDumpOnOutOfMemoryError -Xss256k 
> - XX:StringTableSize=1000003 -XX:+UseParNewGC 
> - XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled 
> - XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 
> - XX:CMSInitiatingOccupancyFraction=75 
> - XX:+UseCMSInitiatingOccupancyOnly -XX:+UseTLAB
>            Reporter: Rocco Varela
>            Assignee: Benedict
>             Fix For: 2.1.4
>
>         Attachments: 9021.txt, system.log
>
>
> After ~3 hours of data ingestion we see assertion errors and 'LEAK DETECTED' 
> errors during what looks like sstable compaction.
> system.log snippets (full log attached):
> {code}
> ...
> INFO  [CompactionExecutor:12] 2015-03-23 02:45:51,770  
> CompactionTask.java:267 - Compacted 4 sstables to 
> [/mnt/cass_data_disks/data1/requests_ks/timeline-       
> 9500fe40d0f611e495675d5ea01541b5/requests_ks-timeline-ka-185,].  65,916,594 
> bytes to 66,159,512 (~100% of original) in 26,554ms = 2.376087MB/s.  983 
> total       partitions merged to 805.  Partition merge counts were {1:627, 
> 2:178, }
> INFO  [CompactionExecutor:11] 2015-03-23 02:45:51,837  
> CompactionTask.java:267 - Compacted 4 sstables to 
> [/mnt/cass_data_disks/data1/system/                     
> compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-119,].
>   426 bytes to 42 (~9% of original) in 82ms = 0.000488MB/s.  5  total 
> partitions merged to 1.  Partition merge counts were {1:1, 2:2, }
> ERROR [NonPeriodicTasks:1] 2015-03-23 02:45:52,251  CassandraDaemon.java:167 
> - Exception in thread Thread[NonPeriodicTasks:1,5,main]
> java.lang.AssertionError: null
>  at 
> org.apache.cassandra.io.compress.CompressionMetadata$Chunk.<init>(CompressionMetadata.java:438)
>  ~[cassandra-all-2.1.3.304.jar:2.1.3.304]
>  at 
> org.apache.cassandra.io.compress.CompressionMetadata.chunkFor(CompressionMetadata.java:228)
>  ~[cassandra-all-2.1.3.304.jar:2.1.3.304]
>  at 
> org.apache.cassandra.io.util.CompressedPoolingSegmentedFile.dropPageCache(CompressedPoolingSegmentedFile.java:80)
>  ~[cassandra-all-2.1.3.304.jar:2.1.3.304]
>  at 
> org.apache.cassandra.io.sstable.SSTableReader$6.run(SSTableReader.java:923) 
> ~[cassandra-all-2.1.3.304.jar:2.1.3.304]
>  at 
> org.apache.cassandra.io.sstable.SSTableReader$InstanceTidier$1.run(SSTableReader.java:2036)
>  ~[cassandra-all-2.1.3.304.jar:2.1.3.304]
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[na:1.7.0_45]
>  at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_45]
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
>  ~[na:1.7.0_45]
>  at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
>  ~[na:1.7.0_45]
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[na:1.7.0_45]
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_45]
>  at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> ...
> INFO  [MemtableFlushWriter:50] 2015-03-23 02:47:29,465  Memtable.java:378 - 
> Completed flushing /mnt/cass_data_disks/data1/requests_ks/timeline-           
>        9500fe40d0f611e495675d5ea01541b5/requests_ks-timeline-ka-188-Data.db 
> (16311981 bytes) for commitlog position 
> ReplayPosition(segmentId=1427071574495,             position=4523631)
> ERROR [Reference-Reaper:1] 2015-03-23 02:47:33,987  Ref.java:181 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@2f59b10) to class 
> org.apache.cassandra.io.sstable.SSTableReader$DescriptorTypeTidy@1251424500:/mnt/cass_data_disks/data1/requests_ks/timeline-9500fe40d0f611e495675d5ea01541b5/
>     requests_ks-timeline-ka-149 was not released before the reference was 
> garbage collected
> INFO  [Service Thread] 2015-03-23 02:47:40,158  GCInspector.java:142 - 
> ConcurrentMarkSweep GC in 12247ms.  CMS Old Gen: 5318987136 -> 457655168; CMS 
> Perm Gen:   44731264 -> 44699160; Par Eden Space: 8597912 -> 418006664; Par 
> Survivor Space: 71865728 -> 59679584
> ...
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to