Hi Guozhang,

Have you had a chance to look at my reply?
How do you feel about this?

Kind Regards,
Luís Cabral
    On Monday, April 30, 2018, 9:27:15 AM GMT+2, Luís Cabral 
<luis_cab...@yahoo.com> wrote:  
 
  Hi Guozhang,

I understand the argument, but this is a hazardous compromise for using Kafka 
as an event store (which is my original intention).

I expect to have many duplicated messages in Kafka, as the overall architecture 
being used allows the producer to re-send a fresh state of the backed data 
into Kafka. This scenario is not common, since the intention is for Kafka to 
bear the weight of replaying all the records for new consumers, but it will 
occasionally happen.

As there are plenty of records that are not updated frequently, this would 
leave the topic with a surplus of quite a few million duplicate records (and 
the number grows every time the above-mentioned re-send happens).

I would rather accept the temporary memory footprint of 8 bytes per record 
whenever compaction runs (and only when not in 'offset' mode) than allow the 
topic to run into this state.
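To put a rough number on it: even with, say, 10 million unique keys in the 
dirty section, that is only around 80 MB of extra map space, and only for the 
duration of a cleaning pass.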

What do you think? Is this scenario too specific to my use case, or do you 
believe that it could happen to other users as well?

Thanks again for the continued discussion!
Cheers,
Luis

On Friday, April 27, 2018, 8:21:13 PM GMT+2, Guozhang Wang 
<wangg...@gmail.com> wrote:  
 
 Hello Luis,

When the version comparison returns `equal`, the original proposal is to
use the offset as the tie-breaker. My previous comment was that:

1) when we build the map by calling `put`, if there is already an entry for
the key, we compare against its stored version and replace it if the put
record's version is "no smaller than" the stored record's: this works because
when building the map we always go from smaller offsets to larger ones.

2) when making a second pass to determine whether each record should be
retained based on the map, we do not try to break the tie if the map's
returned version is the same, but always treat it as "keep". In this case,
when we compare a record with itself as stored in the offset map, the version
comparison returns `equal`. As I mentioned in the PR, one caveat is that we
may indeed retain multiple records with the same key and the same version,
but they will be deleted once a record with a newer version is appended.
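
To make this concrete, here is a rough sketch of the idea (hypothetical names
only, not the actual LogCleaner / OffsetMap code, and it uses the raw key in a
plain map rather than the hashed, fixed-size slots the real cleaner uses):

import scala.collection.mutable

// Map from record key to the highest version seen so far for that key.
class VersionMap {
  private val versions = mutable.Map.empty[String, Long]

  // Pass 1: called in offset order, so replacing when the new version is
  // "no smaller than" the stored one keeps the latest of equal versions,
  // without needing to store the offset itself.
  def put(key: String, version: Long): Unit = {
    if (versions.get(key).forall(stored => version >= stored)) {
      versions.put(key, version)
    }
  }

  // Pass 2: retain a record when its version is no smaller than the mapped
  // one; a record compared against itself (equal versions) is thus kept.
  def shouldRetain(key: String, version: Long): Boolean =
    versions.get(key).forall(stored => version >= stored)
}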


Does that make sense?

Guozhang


On Fri, Apr 27, 2018 at 4:20 AM, Luís Cabral <luis_cab...@yahoo.com.invalid>
wrote:

>  Hi,
>
> I was updating the PR to match the latest decisions and noticed (or
> rather, the integration tests noticed) that without storing the offset,
> the cache doesn't know when to keep the record itself.
>
> This is because, after the cache is populated, all the records are
> compared against the stored ones, so "Record{key:A,offset:1,version:1}"
> will compare against itself and be flagged as "don't keep", since we only
> compared based on the version and didn't check to see if the offset was the
> same or not.
>
> This sort of invalidates not storing the offset in the cache,
> unfortunately, and the binary footprint increases two-fold when "offset" is
> not used as a compaction strategy.
>
> Guozhang: Is it ok with you if we go back on this decision and leave the
> offset as a tie-breaker?
>
>
> Kind Regards,
> Luis
>
>    On Friday, April 27, 2018, 11:11:55 AM GMT+2, Luís Cabral
> <luis_cab...@yahoo.com.INVALID> wrote:
>
>  Hi,
>
> The KIP is now updated with the results of the byte array discussion.
>
> This is my first contribution to Kafka, so I'm not sure what the process
> is. Is it now acceptable to take this to a vote, or should I ask for more
> contributors to join the discussion first?
>
> Kind Regards,
> Luis
>
> On Friday, April 27, 2018, 6:12:03 AM GMT+2, Guozhang
> Wang <wangg...@gmail.com> wrote:
>
>  Hello Luís,
>
> > Offset is an integer? I've only noticed it being resolved as a long so
> > far.
>
> You are right, offset is a long.
>
> As for timestamp / other types, I left a comment in your PR about handling
> tie breakers.
>
> > Given these arguments, is this point something that you absolutely must
> have?
>
> No, I do not have a strong use case in mind for going with arbitrary byte
> arrays; I was just thinking that if we are going to enhance log compaction,
> why not generalize it more :)
>
> Your concern about the memory usage makes sense. I'm happy to take my
> suggestion back and enforce only long typed fields.
>
>
> Guozhang
>
>
>
>
>
> On Thu, Apr 26, 2018 at 1:44 AM, Luís Cabral
> <luis_cab...@yahoo.com.invalid> wrote:
>
> >  Hi,
> >
> > bq. have a integer typed OffsetMap (for offset)
> >
> > Offset is an integer? I've only noticed it being resolved as a long so
> > far.
> >
> >
> > bq. long typed OffsetMap (for timestamp)
> >
> > We would still need to store the offset, as it is functioning as a
> > tie-breaker. Not that this is a big deal; we can easily have both (as
> > currently done in the PR).
> >
> >
> > bq. For the byte array typed offset map, we can use a general hashmap,
> > where the hashmap's CAPACITY will be reasoned from the given "val memory:
> > Int" parameter
> >
> > If you have a map with a 128-byte capacity, then store a value of 16
> > bytes and another of 32 bytes, how many free slots do you have left in
> > this map?
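> >
> > (As a rough illustration: with a fixed entry size of, say, 24 bytes, a
> > 128-byte map simply has floor(128 / 24) = 5 slots; with variable-length
> > values there is no fixed entry size to divide by, so the remaining
> > capacity cannot be reasoned about up front.)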
> >
> > You can make this work, but I think you would need to re-design the whole
> > log cleaner approach, which implies changing some of the existing
> > configurations (like "log.cleaner.io.buffer.load.factor"). I would rather
> > maintain backwards compatibility as much as possible in this KIP, and if
> > this means that using "foo" / "bar" or "2.1-a" / "3.20-b" as record
> > versions is not viable, then so be it.
> >
> > Given these arguments, is this point something that you absolutely must
> > have? I'm still sort of hoping that you are just entertaining the idea and
> > are ok with having a long (now conceded to be unsigned, so the byte arrays
> > can be compared directly).
> >
> >
> > Kind Regards,
> > Luis
> >
>
>
>
> --
> -- Guozhang
>



-- 
-- Guozhang    
