[ 
https://issues.apache.org/jira/browse/CASSANDRA-14653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16584498#comment-16584498
 ] 

Jeff Jirsa commented on CASSANDRA-14653:
----------------------------------------

On which version was this observed?


> The performance of "NonPeriodicTasks" pools defined in class 
> ScheduledExecutors is low
> --------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-14653
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14653
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>         Environment: Cassandra nodes:
> 3 nodes, 330 GB of physical memory per node, and four data directories (SSD)
> per node.
>            Reporter: Peter Xie
>            Priority: Major
>
> We use Cassandra as backend storage for JanusGraph. While loading a large
> data set (~2 billion vertices, ~10 billion edges), we ran into some problems.
>  
> At first we used STCS as the compaction strategy, but hit the exception
> below. We checked that "max memory lock" is unlimited and "file map count"
> is 1 million, so these values should be enough for loading the data. In the
> end we found that the problem was caused by Cassandra consuming all of the
> virtual memory, so no additional virtual memory could be used by the
> compaction task, and the exception below was thrown.
> {quote}ERROR [CompactionExecutor:267] 2018-08-09 02:28:40,952 
> JVMStabilityInspector.java:74 - OutOfMemory error letting the JVM handle the error:
>  java.lang.OutOfMemoryError: Map failed
> {quote}
> So we changed the compaction strategy to LCS, which seems to resolve the
> virtual memory problem. But we found another problem: many sstables that
> have already been compacted are still retained on disk. These old sstables
> consume so much disk space that there is not enough left for the real data.
> We also found that many files named like "mc_txn_compaction_xxx.log" are
> created under the data directory.
> After some investigation, we found that this problem is caused by the
> "NonPeriodicTasks" thread pool. This pool always uses only one thread to
> process cleanup tasks after compaction. The pool is an instance of class
> DebuggableScheduledThreadPoolExecutor, which inherits from class
> ScheduledThreadPoolExecutor.
> Reading the code of class DebuggableScheduledThreadPoolExecutor, we found
> that it uses an unbounded task queue and a core pool size of 1. I think
> using an unbounded queue here is wrong: with an unbounded queue, the thread
> pool will not add threads even when many tasks are waiting in the queue,
> because an unbounded queue is never full. I think a bounded queue should be
> used here, so that more threads are created when the cleanup load is heavy.
> {quote}
> public DebuggableScheduledThreadPoolExecutor(int corePoolSize, String threadPoolName, int priority)
> {
>     super(corePoolSize, new NamedThreadFactory(threadPoolName, priority));
>     setRejectedExecutionHandler(rejectedExecutionHandler);
> }
>
> public ScheduledThreadPoolExecutor(int corePoolSize, ThreadFactory threadFactory)
> {
>     super(corePoolSize, Integer.MAX_VALUE, 0, NANOSECONDS, new DelayedWorkQueue(), threadFactory);
> }
> {quote}
> Below is an example of a cleanup task after compaction. There is a delay of
> about 3 hours (from 21:22 to 00:28) before file "mc-56525" is removed.
> {quote} 
> TRACE [CompactionExecutor:81] 2018-08-16 21:22:29,664 
> LifecycleTransaction.java:363 - Staging for obsolescence 
> BigTableReader(path='/sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big-Data.db')
>  ..........
>  TRACE [CompactionExecutor:81] 2018-08-16 21:22:41,162 Tracker.java:165 - 
> removing 
> /sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big from 
> list of files tracked for test_2.edgestore
>  ............
>  TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,179 SSTableReader.java:2175 - 
> Async instance tidier for 
> /sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
> before barrier
>  TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,180 SSTableReader.java:2181 - 
> Async instance tidier for 
> /sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
> after barrier
>  TRACE [NonPeriodicTasks:1] 2018-08-17 00:28:47,182 SSTableReader.java:2196 - 
> Async instance tidier for 
> /sdb/data/test_2/edgestore-365b0b70a05911e8806001ebe60a5ce7/mc-56525-big, 
> completed
> {quote}
>  
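
The queue behaviour described in the report can be reproduced with the plain
JDK executors. What follows is only a minimal sketch, not Cassandra code: the
class PoolGrowthDemo, its helper methods, and the sizes used (maximum of 4
threads, queue capacity of 2, 6 blocking tasks) are assumptions chosen for
illustration. It also shows a caveat that follows from the JDK documentation:
ScheduledThreadPoolExecutor behaves as a fixed-size pool of corePoolSize
threads over an unbounded queue, so its maximumPoolSize has no useful effect
and corePoolSize is the setting that changes the thread count.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of Cassandra.
class PoolGrowthDemo {
    static final int TASKS = 6;

    // Submits TASKS tasks that all block on a latch, then reports the
    // largest number of threads the pool has ever had at the same time.
    static int peakThreads(ThreadPoolExecutor pool) {
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < TASKS; i++) {
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return pool.getLargestPoolSize();
        } finally {
            release.countDown(); // unblock the tasks so the pool can drain
            pool.shutdown();
        }
    }

    // Plain ThreadPoolExecutor with core size 1 and maximum size 4.
    static int plainPool(BlockingQueue<Runnable> queue) {
        return peakThreads(new ThreadPoolExecutor(1, 4, 1, TimeUnit.SECONDS, queue));
    }

    // Scheduled pool, which always uses its own unbounded delay queue.
    static int scheduledPool(int corePoolSize) {
        return peakThreads(new ScheduledThreadPoolExecutor(corePoolSize));
    }

    public static void main(String[] args) {
        // Unbounded queue: extra tasks are queued, the pool stays at 1 thread.
        System.out.println("unbounded queue:   " + plainPool(new LinkedBlockingQueue<Runnable>()));
        // Bounded queue of 2: once it is full, the pool grows up to its maximum.
        System.out.println("bounded queue (2): " + plainPool(new LinkedBlockingQueue<Runnable>(2)));
        // Scheduled pool: only corePoolSize determines the number of threads.
        System.out.println("scheduled, core 1: " + scheduledPool(1));
        System.out.println("scheduled, core 4: " + scheduledPool(4));
    }
}
```

Under these assumptions the unbounded-queue pool and the single-core scheduled
pool should stay at one thread, while the bounded-queue pool and the four-core
scheduled pool should reach four. For a pool like "NonPeriodicTasks" this
suggests that a larger core pool size, rather than a bounded queue, is the
setting that would add threads.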



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
