[ https://issues.apache.org/jira/browse/CASSANDRA-12268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15415777#comment-15415777 ]
T Jake Luciani commented on CASSANDRA-12268: -------------------------------------------- I think you need to restart those tests since they didn't run for 3.0 but the trunk version has [View test failures|https://cassci.datastax.com/view/Dev/view/carlyeks/job/carlyeks-ticket-12268-testall/lastCompletedBuild/testReport/] > Make MV Index creation robust for wide referent rows > ---------------------------------------------------- > > Key: CASSANDRA-12268 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12268 > Project: Cassandra > Issue Type: Improvement > Components: Core > Reporter: Jonathan Shook > Assignee: Carl Yeksigian > Fix For: 3.0.x, 3.x > > Attachments: 12268.py > > > When creating an index for a materialized view for extant data, heap pressure > is very dependent on the cardinality of of rows associated with each index > value. With the way that per-index value rows are created within the index, > this can cause unbounded heap pressure, which can cause OOM. This appears to > be a side-effect of how each index row is applied atomically as with batches. > The commit logs can accumulate enough during the process to prevent the node > from being restarted. Given that this occurs during global index creation, > this can happen on multiple nodes, making stable recovery of a node set > difficult, as co-replicas become unavailable to assist in back-filling data > from commitlogs. > While it is understandable that you want to avoid having relatively wide rows > even in materialized views, this represents a particularly difficult > scenario for triage. > The basic recommendation for improving this is to sub-group the index > creation into smaller chunks internally, providing a maximal bound against > the heap pressure when it is needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)