[ https://issues.apache.org/jira/browse/CASSANDRA-12268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeremy Hanna updated CASSANDRA-12268: ------------------------------------- Component/s: Materialized Views > Make MV Index creation robust for wide referent rows > ---------------------------------------------------- > > Key: CASSANDRA-12268 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12268 > Project: Cassandra > Issue Type: Improvement > Components: Core, Materialized Views > Reporter: Jonathan Shook > Assignee: Carl Yeksigian > Fix For: 3.0.10, 3.10, 4.0 > > Attachments: 12268.py > > > When creating an index for a materialized view for extant data, heap pressure > is very dependent on the cardinality of of rows associated with each index > value. With the way that per-index value rows are created within the index, > this can cause unbounded heap pressure, which can cause OOM. This appears to > be a side-effect of how each index row is applied atomically as with batches. > The commit logs can accumulate enough during the process to prevent the node > from being restarted. Given that this occurs during global index creation, > this can happen on multiple nodes, making stable recovery of a node set > difficult, as co-replicas become unavailable to assist in back-filling data > from commitlogs. > While it is understandable that you want to avoid having relatively wide rows > even in materialized views, this represents a particularly difficult > scenario for triage. > The basic recommendation for improving this is to sub-group the index > creation into smaller chunks internally, providing a maximal bound against > the heap pressure when it is needed. -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org