Hi Omnia, nice design. I have few questions: FV1: Looking at PayloadStore and PayloadResponse, "id" and "path" seems to be the same thing, which is a reference to the data in the external store. If my interpretation is correct, could we converge to using one of the two terms?
FV2: I think PayloadResponse javadoc or PayloadStore interface may need some updates. It says "Response from publish / download", but PayloadStore.download returns a byte array, not a PayloadResponse. FV3: In PayloadStore interface, public and abstract are implicit modifiers, so I think we can remove them. Thanks Fede On Fri, Jul 25, 2025 at 3:28 PM Omnia Ibrahim <o.g.h.ibra...@gmail.com> wrote: > > Hi Jun, thanks for having the time to review this > > > JR1. While the KIP is potentially useful, I am wondering who is responsible > > for retention for the objects in the payload store. Once a message with a > > reference is deleted, the key of the external object is lost and the object > > may never be deleted. > > The `ttl` in the object store is the responsibility of the owner of this > store it should be configured in away that is reasonable with the retention > config in Kafka. > I have updated the KIP with `Consideration` section. > > > > JR2. Configs: For all new configs, it would be useful to list their types. > Updated the KIP now > > > > JR3. value.serializers: Why is this required? If a user doesn't set it, we > > should just use value.serializer, right? Ditto for key.serializers. > No you right this was copy/past mistake > > > > > JR4. For all new public interfaces such as LargeMessageSerializer, > > PayloadStore and PayloadResponse, it would be useful to include the full > > package name. > Updated the KIP now > > > > JR5. large.message.payload.store.retry.max.backoff.ms and > > large.message.payload.store.retry.delay.backoff.ms: Is the intention to > > implement exponential backoff on retries? If so, it's more consistent if we > > can follow the existing naming convention like retry.backoff.max.ms > > <http://retry.backoff.max.ms/> and > > retry.backoff.ms <http://retry.backoff.ms/>. > I have removed these to simplify the config more (as Luke suggested > initially) and added these to the consideration section. > > > > > JR6. large.message.skip.not.found.error : If the reference can't be found, > > what value does the deserializer return? Note that null has a special > > meaning for tombstone in compacted topics. > The deserialiser will return `new byte[0]` not null. > > > > JR7. PayloadResponse: Why do we have both responseCode and > > PayloadStoreException? > We can do without responseCode, the initial though was to report response > code form payload store. > Update the KIP. > > JR8. Why do we need PayloadStore.metrics? Note that we could monitor the > > metrics in a plugin through the Monitorable interface. > Oh nice, I didn’t know about this interface before. Updated the KIP with this > now. > > > > JR9. Why do we need the protected field PayloadStoreException.isRetryable? > Initial thought here was the serializer can retry the upload. But I have > removed all the retry logic from serializer and it will be up to the > PayloadStore provider to implement this if they need it. > > > > JR10. As Luke mentioned earlier, we could turn PayloadStore to an interface. > It is updated now to interface. > > Hope the last version of the KIP is more simpler now > > Thanks > Omnia > > > On 23 Jul 2025, at 00:43, Jun Rao <j...@confluent.io.INVALID> wrote: > > > > Thanks for the KIP. A few comments. > > > > JR1. While the KIP is potentially useful, I am wondering who is responsible > > for retention for the objects in the payload store. Once a message with a > > reference is deleted, the key of the external object is lost and the object > > may never be deleted. > > > > JR2. Configs: For all new configs, it would be useful to list their types. > > > > JR3. value.serializers: Why is this required? If a user doesn't set it, we > > should just use value.serializer, right? Ditto for key.serializers. > > > > JR4. For all new public interfaces such as LargeMessageSerializer, > > PayloadStore and PayloadResponse, it would be useful to include the full > > package name. > > > > JR5. large.message.payload.store.retry.max.backoff.ms and > > large.message.payload.store.retry.delay.backoff.ms: Is the intention to > > implement exponential backoff on retries? If so, it's more consistent if we > > can follow the existing naming convention like retry.backoff.max.ms > > <http://retry.backoff.max.ms/> and > > retry.backoff.ms <http://retry.backoff.ms/>. > > > > JR6. large.message.skip.not.found.error : If the reference can't be found, > > what value does the deserializer return? Note that null has a special > > meaning for tombstone in compacted topics. > > > > JR7. PayloadResponse: Why do we have both responseCode and > > PayloadStoreException? > > > > JR8. Why do we need PayloadStore.metrics? Note that we could monitor the > > metrics in a plugin through the Monitorable interface. > > > > JR9. Why do we need the protected field PayloadStoreException.isRetryable? > > > > JR10. As Luke mentioned earlier, we could turn PayloadStore to an interface. > > > > Thanks, >