Hi Neha,

   6. It seems like #4 can be avoided by using Map<TopicPartition,
>> Long> or Map<TopicPartition, TopicPartitionOffset> as the argument type.
>
> How? lastCommittedOffsets() is independent of positions(). I'm not sure I
> understood your suggestion.

I think of subscription as you're subscribing to a Set of TopicPartitions.
Because the argument to positions() is TopicPartitionOffset ... it's
conceivable that the method can be called with two offsets for the same
TopicPartition. One way to handle this, is to accept either the first or
the last offset for a TopicPartition. However, if the argument type is
changed to Map<TopicPartition, Long> it precludes the possibility of
getting duplicate offsets of the same TopicPartition.

   7. To address #3, maybe we can return List<TopicPartitionOffset> that are
>> invalid.
>
> I don't particularly see the advantage of returning a list of invalid

partitions from position(). It seems a bit awkward to return a list to

indicate what is obviously a bug. Prefer throwing an error since the user
> should just fix that logic.

I'm not sure if an Exception is needed or desirable here. I don't see this
as a catastrophic failure or a non-recoverable failure. Even if we just
write the bad offsets to a log file and call it a day, I'm ok with that.
But my main goal is to communicate to the API users somehow that they've
provided bad offests which are simply being ignored.

Hi Jay,

I would also like to shorten the name TopicOffsetPosition. Offset and
> Position are duplicative of each other. So perhaps we could call it a
> PartitionOffset or a TopicPosition or something like that. In general class
> names that are just a concatenation of the fields (e.g.
> TopicAndPartitionAndOffset) seem kind of lazy to me since the name doesn't
> really describe it just enumerates. But that is more of a nit pick.


   1. Did you mean to say TopicPartitionOffset instead of
   TopicOffsetPosition?
   2. +1 on PartitionOffset

The lastCommittedPosition is particular bothersome because:
> 1. The name is weird and long
> 2. It returns a list of results. But how can you use the list? The only way
> to use the list is to make a map of tp=>offset and then look up results in
> this map (or do a for loop over the list for the partition you want).

This is sort of what I was talking about in my previous email. My
suggestion was to change the return type to Map<TopicPartition, Long>.

What if we made it:
>    long position(TopicPartition tp)
>    void seek(TopicOffsetPosition p)
>    long committed(TopicPartition tp)
>    void commit(TopicOffsetPosition...);


   1. Absolutely love the idea of position(TopicPartition tp).
   2. I think we also need to provide a method for accessing all positions
   positions() which maybe returns a Map<TopicPartition, Long>?
   3. What is the difference between position(TopicPartition tp) and
committed(TopicPartition
   tp)?
   4. +1 on commit(PartitionOffset...)
   5. +1 on seek(PartitionOffset p)
   6. We should also provide a seek(PartitionOffset... offsets)

Finally, in all the methods where we're using varargs, we should use an
appropriate Collection data structure. For example, for the
subscribe(TopicPartition...
partitions) method, I think a more accurate API would be
subscribe(Set<TopicPartition>
partitions). This allows for the code to be self-documenting.

Reply via email to