[ https://issues.apache.org/jira/browse/CASSANDRA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16134733#comment-16134733 ]

ZhaoYang edited comment on CASSANDRA-13299 at 8/21/17 11:26 AM:
----------------------------------------------------------------

[trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13299-trunk]
[dtest|https://github.com/riptano/cassandra-dtest/commits/CASSANDRA-13299]

Changes:

1. Throttle by the number of base unfiltereds; the default batch size is 100.
2. A pair of open/close range-tombstone markers can enclose any number of 
unshadowed rows. In this patch, when a batch reaches its limit while a 
range-tombstone marker is still open, a corresponding close marker is 
generated to bound the batch (see the sketch below).
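A minimal sketch of the batching idea, assuming a simplified stand-in API 
(the Unfiltered/marker types below are illustrative only, not the classes 
actually touched by the patch):

{code:java}
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Illustrative sketch only: all types are simplified stand-ins for
// Cassandra's unfiltereds (rows and range-tombstone markers).
final class UnfilteredBatcher
{
    interface Unfiltered {}
    static final class Row implements Unfiltered {}
    static final class OpenMarker implements Unfiltered {}   // opens a range tombstone
    static final class CloseMarker implements Unfiltered {}  // closes a range tombstone

    private static final int BATCH_SIZE = 100; // default throttle from the patch

    static List<List<Unfiltered>> batches(Iterator<Unfiltered> source)
    {
        List<List<Unfiltered>> out = new ArrayList<>();
        List<Unfiltered> current = new ArrayList<>(BATCH_SIZE);
        boolean openTombstone = false;

        while (source.hasNext())
        {
            Unfiltered u = source.next();
            current.add(u);
            if (u instanceof OpenMarker)
                openTombstone = true;
            else if (u instanceof CloseMarker)
                openTombstone = false;

            if (current.size() >= BATCH_SIZE)
            {
                // If a range tombstone is still open at the batch boundary,
                // emit a synthetic close marker so the batch is self-contained...
                if (openTombstone)
                    current.add(new CloseMarker());
                out.add(current);
                current = new ArrayList<>(BATCH_SIZE);
                // ...and re-open it so the next batch continues the deletion.
                if (openTombstone)
                    current.add(new OpenMarker());
            }
        }
        if (!current.isEmpty())
            out.add(current);
        return out;
    }
}
{code}

Cutting the batch with a synthetic close marker bounds the memory held per 
batch regardless of how many rows fall inside the tombstone range.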



Note:
A single partition deletion or a range deletion can cause a huge number of 
view rows to be removed, so a view mutation may still fail to apply due to a 
WriteTimeoutException (WTE) or max_mutation_size; that can be resolved 
separately in CASSANDRA-12783. Here I only address the issue of holding an 
entire partition in memory when repairing a base table with an MV.


was (Author: jasonstack):
[trunk|https://github.com/jasonstack/cassandra/commits/CASSANDRA-13299-trunk]
[dtest|https://github.com/riptano/cassandra-dtest/commits/CASSANDRA-13299]

Changes:

1. Throttle by the number of base unfiltereds; the default batch size is 100.
2. A pair of open/close range-tombstone markers can enclose any number of 
unshadowed rows. In the patch, the range tombstones are simply cached to 
avoid exceeding the limit, and the cached range tombstones are applied in 
the next batch.

Note:
A single partition deletion or a range deletion can cause a huge number of 
view rows to be removed, so a view mutation may still fail to apply due to a 
WriteTimeoutException (WTE) or max_mutation_size; that can be resolved 
separately in CASSANDRA-12783. Here I only address the issue of holding an 
entire partition in memory when repairing a base table with an MV.

> Potential OOMs and lock contention in write path streams
> --------------------------------------------------------
>
>                 Key: CASSANDRA-13299
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13299
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benjamin Roth
>            Assignee: ZhaoYang
>
> I see a potential OOM when a stream (e.g. repair) goes through the write 
> path, as it does with MVs.
> StreamReceiveTask gets a bunch of SSTableReaders. These produce row 
> iterators, which in turn produce mutations. So every partition creates a 
> single mutation, which in the case of (very) big partitions can result in 
> (very) big mutations. Those are created on heap and stay there until they 
> finish processing.
> I don't think it is necessary to create a single mutation for each 
> partition. Why don't we implement a PartitionUpdateGeneratorIterator that 
> takes an UnfilteredRowIterator and a max size and spits out 
> PartitionUpdates to be used to create and apply mutations?
> The max size should be something like min(reasonable_absolute_max_size, 
> max_mutation_size, commitlog_segment_size / 2). 
> reasonable_absolute_max_size could be something like 16 MB.
> A mutation shouldn't be too large, as it also affects MV partition 
> locking. The longer an MV partition is locked during a stream, the higher 
> the chance that WTEs occur during streams.
> I could also imagine that a max number of updates per mutation, regardless 
> of size in bytes, could make sense to avoid lock contention.
> Love to get feedback and suggestions, incl. naming suggestions.
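
The cap proposed in the description, purely as a hypothetical illustration 
(the names mirror the ticket text above; this is not actual Cassandra code):

{code:java}
// Hypothetical: computes the proposed upper bound on mutation size.
// The constant and parameter names follow the ticket text, not a real
// implementation.
static long maxMutationSizeBytes(long maxMutationSize, long commitlogSegmentSize)
{
    final long REASONABLE_ABSOLUTE_MAX_SIZE = 16L << 20; // e.g. 16 MB
    return Math.min(REASONABLE_ABSOLUTE_MAX_SIZE,
                    Math.min(maxMutationSize, commitlogSegmentSize / 2));
}
{code}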


