Goodness Ayinmode created CASSANDRA-19959:
---------------------------------------------

             Summary:  Out of memory (OOM) risks due to unbound growth in 
collections
                 Key: CASSANDRA-19959
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19959
             Project: Cassandra
          Issue Type: Improvement
            Reporter: Goodness Ayinmode


I noticed some methods with collections that could cause OOM issues. For 
example, 
[Keyspace.getValidColumnFamilies|https://github.com/apache/cassandra/blob/02f38208b15b119b3038482c5e36f05c14e2a4cf/src/java/org/apache/cassandra/db/Keyspace.java#L707]
 retrieves the set of valid ColumnFamilyStore objects for the provided column 
family names. When cfNames.length == 0, it iterates over all the column family 
stores returned by getColumnFamilyStores() and adds each one to the valid set. 
For each cfStore, if autoAddIndexes is true, 
getIndexColumnFamilyStores(cfStore) is called and the resulting index column 
family stores are also added to valid. Because the set grows with the number 
of column families and indexes, adding a large number of them at once could 
consume significant memory and increase the risk of OOM errors. 
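
To make the growth pattern concrete, here is a simplified model of the accumulation described above. It is only a sketch: Store and collectAll are hypothetical stand-ins, not Cassandra's actual ColumnFamilyStore or Keyspace code, and the real method does more than this.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model; not the actual Cassandra classes.
final class ValidStoresSketch {

    // Stand-in for a column family store together with its index stores.
    static final class Store {
        final String name;
        final List<Store> indexStores;

        Store(String name, List<Store> indexStores) {
            this.name = name;
            this.indexStores = indexStores;
        }
    }

    // Mirrors the described "cfNames.length == 0" path: every store is added,
    // plus its index stores when autoAddIndexes is true.
    static Set<Store> collectAll(Collection<Store> stores, boolean autoAddIndexes) {
        Set<Store> valid = new HashSet<>();
        for (Store store : stores) {
            valid.add(store);
            if (autoAddIndexes)
                valid.addAll(store.indexStores);
        }
        return valid;
    }
}
```

In this model the result holds one reference per table plus one per index, so its size is proportional to the total number of tables and indexes in the keyspace.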

The same risk appears in 
[Sets$Literal.prepare|https://github.com/apache/cassandra/blob/662ce36a7be5a03560bb0395a4bced09d3c34a0c/src/java/org/apache/cassandra/cql3/Sets.java#L136],
 
[PendingAntiCompaction$AcquisitionCallback.apply|https://github.com/apache/cassandra/blob/662ce36a7be5a03560bb0395a4bced09d3c34a0c/src/java/org/apache/cassandra/db/repair/PendingAntiCompaction.java#L291],
 
[RepairSession.start|https://github.com/apache/cassandra/blob/662ce36a7be5a03560bb0395a4bced09d3c34a0c/src/java/org/apache/cassandra/repair/RepairSession.java#L272],
 
[RepairedState.addAll|https://github.com/apache/cassandra/blob/02f38208b15b119b3038482c5e36f05c14e2a4cf/src/java/org/apache/cassandra/repair/consistent/RepairedState.java#L208],
 
[SEPExecutor.addTask|https://github.com/apache/cassandra/blob/662ce36a7be5a03560bb0395a4bced09d3c34a0c/src/java/org/apache/cassandra/concurrent/SEPExecutor.java#L119],
 
[SystemDistributedKeyspace.startRepairs|https://github.com/apache/cassandra/blob/662ce36a7be5a03560bb0395a4bced09d3c34a0c/src/java/org/apache/cassandra/schema/SystemDistributedKeyspace.java#L226],
 
[SingleTableUpdatesCollector.toMutations|https://github.com/apache/cassandra/blob/02f38208b15b119b3038482c5e36f05c14e2a4cf/src/java/org/apache/cassandra/cql3/statements/SingleTableUpdatesCollector.java#L95],
 
[AbstractReplicaCollection.filter|https://github.com/apache/cassandra/blob/02f38208b15b119b3038482c5e36f05c14e2a4cf/src/java/org/apache/cassandra/locator/AbstractReplicaCollection.java#L504],
 
[BatchMessage.execute|https://github.com/apache/cassandra/blob/02f38208b15b119b3038482c5e36f05c14e2a4cf/src/java/org/apache/cassandra/transport/messages/BatchMessage.java#L173]
 and 
[SystemKeyspace.tokensAsSet|https://github.com/apache/cassandra/blob/ea801625f64bdebf78cf03634e30a1fde037f965/src/java/org/apache/cassandra/db/SystemKeyspace.java#L887].
 Each of these methods builds a collection whose growth appears to be 
unbounded and could cause OOM issues. 

If processing all elements at once is not essential, one optimization would be 
to process them in batches: split the elements into smaller chunks and 
accumulate the results per batch. Another would be to assign fixed sizes to 
these collections when initializing them. 
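
As a rough illustration of the batching idea, the sketch below hands elements to a handler in fixed-size chunks so that at most one chunk is accumulated at a time. BatchSketch and forEachBatch are hypothetical names used only for this example; nothing here is existing Cassandra API, and whether any of the listed methods can be restructured this way would need to be checked case by case.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper for illustration; not part of Cassandra.
final class BatchSketch {

    // Passes the elements of source to handler in lists of at most batchSize
    // elements and returns the number of batches delivered.
    static <T> int forEachBatch(Iterable<T> source, int batchSize, Consumer<List<T>> handler) {
        if (batchSize <= 0)
            throw new IllegalArgumentException("batchSize must be positive");

        List<T> batch = new ArrayList<>(batchSize);
        int batches = 0;
        for (T item : source) {
            batch.add(item);
            if (batch.size() == batchSize) {
                handler.accept(batch);
                // Start a fresh list so a handler that keeps the old one is unaffected.
                batch = new ArrayList<>(batchSize);
                batches++;
            }
        }
        if (!batch.isEmpty()) {
            // Deliver the final, possibly smaller, batch.
            handler.accept(batch);
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        forEachBatch(List.of(1, 2, 3, 4, 5), 2, chunk -> System.out.println(chunk));
    }
}
```

Memory stays bounded only if the handler does not itself retain every batch. Note also that passing an initial capacity to a Java collection such as ArrayList or HashSet avoids repeated resizing but does not cap how large the collection can grow, so a hard limit would need an explicit size check.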

Please let me know if my analysis is wrong, or if you have any comments 
regarding the optimization suggestion. 

Thank you



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
