I forgot to mention this in response to question 2 c). If logEndOffset < offsetToPurge, the leader will set low_watermark to min(leader's logEndOffset, max(low_watermark, offsetToPurge)) = logEndOffset. I have made this change in the KIP (diff here
<https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=67636826&selectedPageVersions=7&selectedPageVersions=8>). A sketch of the full case handling follows the quoted thread at the bottom of this mail.
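To make the arithmetic concrete, here is a minimal sketch of the update rule. The names follow the KIP's wording; this is illustrative, not the actual broker code.

    // Illustrative only: the names (lowWatermark, logEndOffset, offsetToPurge)
    // follow the KIP's wording, not the broker implementation.
    final class LowWatermarkRule {
        // low_watermark never moves backwards (the max) and never past the
        // end of the leader's log (the min).
        static long next(long lowWatermark, long logEndOffset, long offsetToPurge) {
            return Math.min(logEndOffset, Math.max(lowWatermark, offsetToPurge));
        }
    }

When logEndOffset < offsetToPurge, the inner max picks offsetToPurge and the outer min clamps the result back to logEndOffset, which gives the equality above.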
On Tue, Jan 17, 2017 at 9:54 PM, Dong Lin <lindon...@gmail.com> wrote:

> Hi Jun,
>
> Thank you. Please see my answers below. The KIP is updated to answer these
> questions (see here
> <https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=67636826&selectedPageVersions=5&selectedPageVersions=6>).
>
> 1. Yes, in this KIP we wait for all replicas. This is the same as if a
> producer sends a message with ack=all and isr=all_replicas. So it seems
> that the comparison is OK?
>
> 2. Good point! I hadn't thought about the case where the user-specified
> offset > logEndOffset. Please see the answers below.
>
> a) If offsetToPurge < lowWatermark, the first condition of
> DelayedOperationPurgatory will be satisfied immediately when the broker
> receives the PurgeRequest. The broker will send a PurgeResponse to the
> admin client immediately. The response maps this partition to the
> lowWatermark.
>
> This case is covered as the first condition of DelayedOperationPurgatory
> in the current KIP.
>
> b) If highWatermark < offsetToPurge < logEndOffset, the leader will send
> a FetchResponse with low_watermark=offsetToPurge. Each follower records
> offsetToPurge as its low_watermark and sends a FetchRequest to the leader
> with the new low_watermark. The leader will then send a PurgeResponse to
> the admin client which maps this partition to the new low_watermark. The
> data in the range [highWatermark, offsetToPurge] will still be replicated
> from the leader to the followers but will not be exposed to consumers,
> and for a short period the low_watermark on the followers will be higher
> than their highWatermark.
>
> This case is also covered in the current KIP, so no change is required.
>
> c) If logEndOffset < offsetToPurge, the leader will send a PurgeResponse
> to the admin client immediately. The response maps this partition to
> OffsetOutOfRangeException.
>
> This case is not covered by the current KIP. I just added it as the
> second condition for the PurgeRequest to be removed from
> DelayedOperationPurgatory (in the Proposed Changes section). Since the
> PurgeRequest is satisfied immediately when the leader receives it, it
> actually won't be put into the DelayedOperationPurgatory.
>
> 3. Yes, lowWatermark will be returned when smallest_offset is used in the
> ListOffsetRequest. I just updated the Proposed Changes section to specify
> this.
>
> Thanks,
> Dong
>
> On Tue, Jan 17, 2017 at 6:53 PM, Jun Rao <j...@confluent.io> wrote:
>
>> Hi, Dong,
>>
>> Thanks for the KIP. Looks good overall. Just a few more comments.
>>
>> 1. "Note that the way broker handles PurgeRequest is similar to how it
>> handles ProduceRequest with ack = -1 and isr=all_replicas". It seems
>> that the implementation is a bit different. In this KIP, we wait for
>> all replicas. But in the producer, acks=all means waiting for all
>> in-sync replicas.
>>
>> 2. Could you describe the behavior when the specified offsetToPurge is
>> (a) smaller than lowWatermark, (b) larger than highWatermark but smaller
>> than the log end offset, (c) larger than the log end offset?
>>
>> 3. In the ListOffsetRequest, will lowWatermark be returned when the
>> smallest_offset option is used?
>>
>> Thanks,
>>
>> Jun
>>
>>
>> On Wed, Jan 11, 2017 at 1:01 PM, Dong Lin <lindon...@gmail.com> wrote:
>>
>> > Hi all,
>> >
>> > It seems that there is no further concern with KIP-107. At this point
>> > we would like to start the voting process. The KIP can be found at
>> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-107%3A+Add+purgeDataBefore%28%29+API+in+AdminClient.
>> >
>> > Thanks,
>> > Dong
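P.S. For anyone skimming the quoted thread, here is a minimal sketch of the three cases for offsetToPurge discussed in 2 a)-c), with case c) updated per the amendment at the top of this mail. The class, field, and method names are made up for illustration and are not the broker implementation.

    // Illustrative decision logic for a PurgeRequest on the leader.
    // All names are hypothetical; only the case split mirrors the KIP discussion.
    final class PurgeDecision {
        final boolean completeImmediately; // true => respond without entering the purgatory
        final long lowWatermarkToReport;   // offset the PurgeResponse will carry

        PurgeDecision(boolean completeImmediately, long lowWatermarkToReport) {
            this.completeImmediately = completeImmediately;
            this.lowWatermarkToReport = lowWatermarkToReport;
        }

        static PurgeDecision decide(long offsetToPurge, long lowWatermark, long logEndOffset) {
            if (offsetToPurge <= lowWatermark) {
                // Case a): the data is already purged; answer immediately
                // with the current lowWatermark.
                return new PurgeDecision(true, lowWatermark);
            }
            if (offsetToPurge <= logEndOffset) {
                // Case b): park the request in DelayedOperationPurgatory until
                // low_watermark (advanced via follower FetchRequests) reaches
                // offsetToPurge.
                return new PurgeDecision(false, offsetToPurge);
            }
            // Case c), as amended at the top of this mail: clamp to the
            // leader's logEndOffset instead of failing the request.
            return new PurgeDecision(false, logEndOffset);
        }
    }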