Re: HighwaterMarkCheckpoint

Jay Kreps Mon, 26 Nov 2012 17:21:21 -0800

Also I don't see any unit test coverage for this class directly, is that
right? If so I will add it...


I do see two big integration tests HighwatermarkPersistenceTest and
LogRecoveryTest both use it. What is the difference between these?

-Jay


On Mon, Nov 26, 2012 at 5:17 PM, Jay Kreps <jay.kr...@gmail.com> wrote:

> Does this capture the intention of this code:
> https://gist.github.com/4151743
> ?
>
> The idea is to rename it to OffsetCheckpoint so it is generic. To do this
> 1. I make the file name and argument
> 2. read now returns a Map of all values. To get the old behavior you would
> do chkpt.read.getOrElse(TopicPartition(t,p), 0L). This makes sense as
> defaulting to 0L may only make sense for highwatermarks.
> 3. Throw and exception rather than calling shutdown (always a good idea)
>
> Some things I am confused about:
> 1. Parsing logic--there is very complex being done but I don't know why.
> Are we just parsing three space-separated values? Why all the indexOf stuff?
> 2. Why do we store the number of entries in the file?
>
> Cheers,
>
> -Jay
>
>
> On Mon, Nov 26, 2012 at 4:14 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>
>> The api for the HighwaterMarkCheckpoint is:
>>
>>   def write(highwaterMarksPerPartition: Map[TopicAndPartition, Long])
>>
>>   def read(topic: String, partition: Int): Long
>>
>> Pretty weird, no? The write method writes a map of all values,
>> obliterating previous values. The read method internally reads back that
>> whole map, then throws away all the values but one and returns that one. It
>> seems like the natural way to do it would be to either read and write one
>> key/value pair or read and write them all together.
>>
>> Is there a rationale for this? Can we change it?
>>
>> It would be nice to reuse this class to track log flushes, and the
>> cleaner point. However in its current form it isn't particularly useful.
>>
>> -Jay
>>
>
>

Re: HighwaterMarkCheckpoint

Reply via email to