[ https://issues.apache.org/jira/browse/KAFKA-7297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16721960#comment-16721960 ]

Jun Rao commented on KAFKA-7297:
--------------------------------

[~lindong], Ismael pointed out that ConcurrentSkipListMap supports weakly 
consistent iterators, which guarantee the following:

"they are guaranteed to traverse elements as they existed upon construction 
exactly once, and may (but are not guaranteed to) reflect any modifications 
subsequent to construction"

In the common case where we just add new segments to the end, iterating the 
segments w/o a lock seems ok.
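For illustration, a minimal sketch of that weakly consistent behavior in plain 
Java (hypothetical String values stand in for LogSegment objects; Log.segments 
is keyed by base offset):

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

public class WeaklyConsistentDemo {
    public static void main(String[] args) {
        ConcurrentSkipListMap<Long, String> segments = new ConcurrentSkipListMap<>();
        segments.put(0L, "segment-0");
        segments.put(100L, "segment-100");

        Iterator<Map.Entry<Long, String>> it = segments.entrySet().iterator();
        System.out.println(it.next().getKey()); // prints 0

        // Appending a new segment mid-iteration does NOT throw
        // ConcurrentModificationException; the weakly consistent iterator may
        // (but is not guaranteed to) reflect the new entry.
        segments.put(200L, "segment-200");

        while (it.hasNext()) {
            System.out.println(it.next().getKey());
        }
    }
}
```

This is why iterating segments without a lock is safe in the append-only case: 
concurrent additions never invalidate the iterator.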

The only remaining case is the segment-replacing issue that you mentioned in 
the description. That may not be causing a problem now. For example, readers 
that iterate the segments, such as flush() and offsetForTimestamp, seem to be 
ok with overlapping segments. So for now, maybe we should at least (1) make 
both implementations of logSegments() consistent, i.e., not take a lock to get 
the iterator, and (2) document the behavior of the returned iterator, i.e., the 
underlying elements could change and there could be potential segment overlap.

> Both read/write access to Log.segments should be protected by lock
> ------------------------------------------------------------------
>
>                 Key: KAFKA-7297
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7297
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Dong Lin
>            Assignee: Zhanxiang (Patrick) Huang
>            Priority: Major
>
> Log.replaceSegments() updates segments in two steps. It first adds new 
> segments and then removes old segments. Though this operation is protected by 
> a lock, read accesses such as Log.logSegments do not grab the lock, and thus 
> these methods may return an inconsistent view of the segments.
> As an example, say Log.replaceSegments() intends to replace segments [0, 
> 100) and [100, 200) with a new segment [0, 200). In this case, if 
> Log.logSegments is called right after the new segment is added, the method 
> may return segments [0, 200) and [100, 200), and messages in the range [100, 
> 200) may be duplicated if the caller chooses to enumerate all messages in all 
> segments returned by the method.
> The solution is probably to protect read/write access to Log.segments with a 
> read/write lock.
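The two-step replacement described above can be sketched as follows (a 
hypothetical model: a ConcurrentSkipListMap keyed by base offset, with String 
values standing in for LogSegment objects; a reader iterating between the two 
steps observes overlapping ranges):

```java
import java.util.concurrent.ConcurrentSkipListMap;

public class ReplaceSegmentsRace {
    public static void main(String[] args) {
        ConcurrentSkipListMap<Long, String> segments = new ConcurrentSkipListMap<>();
        segments.put(0L, "[0, 100)");
        segments.put(100L, "[100, 200)");

        // Step 1: add the merged segment. It is keyed by its base offset 0,
        // so it replaces the old [0, 100) entry but leaves [100, 200) behind.
        segments.put(0L, "[0, 200)");

        // A concurrent logSegments() caller iterating at this point sees
        // overlapping segments:
        System.out.println(segments.values()); // prints [[0, 200), [100, 200)]

        // Step 2: remove the superseded old segment.
        segments.remove(100L);
        System.out.println(segments.values()); // prints [[0, 200)]
    }
}
```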



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
