Hi,

I am actually not sure if using `Record` is really the right thing? While `Record` is technically public API, it does not seem to be intended to be exposed to end users?

But I guess my main question is really about what metadata we really want to add to `RecordDeserializationException`? `Record` expose all kind of internal (serialization) metadata like `keySize()`, `valueSize()` and many more. For the DLQ use-case it seems we don't really want any of these? So I am wondering if just adding key/value/ts/headers would be sufficient?

The motivation section of the KIP is somewhat spare about DLQ details, so it's hard to judge what is needed / useful and would would be a leaky abstraction?

About "when we cannot construct a `ConsumerRecord` -- I was not really digging into it until know, and was just following Kirks comment blindly. But looking into the code, we would only not be able to construct a `CosumerRecord` when either key or value deserialization fails? But as we would pass in byte[] type it would not matter. -- Kirk did also mention a corrupted batch, but it seems for this case we might not even hit the deserialization code path, but would error out earlier?

I was also looking into the build setup, and I think the idea of the import control is to have some sanity check about import dependencies. I currently don't see why we should not add an allow rule for Record.

But if we decide to not pass in Record/ConsumerRecord both questions are void anyway. Of course, for this case, we would need to add a getter method for each metadata field we add (but I think that would be totally ok?)

I also seen know, that the old constructor is deprecated, and thus, I think using `Optional` a return type is not required (already reflected on the wiki page).

Bottom line seems to be: the motivation about what metadata is needed for the DLQ use-case is not described in much detail and thus it's hard to judge what the right design might be?

The wiki account thing is unfortunately nothing we can fix on our side. We did file a ticket with INFRA team, but need to wait for them to address it... In the meantime, if you can provide the missing information, and what you want to get edited, I can help to update the wiki page accordingly.


-Matthias

On 4/16/24 11:18 AM, Sophie Blee-Goldman wrote:
Also ignore everything I said about Streams earlier. I didn't look closely
enough on my first pass over the KIP and thought this was changing the
DeserializationExceptionHandler in Streams. I see now that this is
actually about the consumer client's DeserializationException so everything
I said about using a ByteArray Deserialization and Kafka Streams doesn't
apply here. The important thing is just that it still deserializes one
record
at a time, and essentially throws this when it fails to convert the Record
type into a ConsumerRecord type. So there's always only one record
at a type to consider.

Sorry for any confusion I caused

On Tue, Apr 16, 2024 at 11:15 AM Sophie Blee-Goldman <sop...@responsive.dev>
wrote:

Ah, thanks for the additional context. I should have looked at the code
before I opened my mouth (so to speak)

In that case, I fully agree that using Record instead of ConsumerRecord
makes sense. It does indeed seem like by definition, if there is a
DeserializationException then there is no ConsumerRecord since this
is where/how it gets thrown:

try {
...
return new ConsumerRecord<>(...);
} catch (RuntimeException e) {
...
throw new RecordDeserializationException(...);
}

As you mentioned the Record is an input to the method so we definitely have
one of those, and imo, it makes sense to use. As far as I can tell it's
just
a regular public interface so exposing it shouldn't be an issue just based
on
the class itself. But I'm still a bit concerned about the checkstyle
complaint.

I'll try to find someone who can explain why or if we should avoid
returning
a Record type here. Other than that, I'd say the KIP LGTM as-is and we
could kick off voting

On Tue, Apr 16, 2024 at 10:47 AM Frédérik Rouleau
<froul...@confluent.io.invalid> wrote:

Thanks Sophie,

I can write something in the KIP on how KStreams solves that issue, but as
I can't create a Wiki account, I will have to find someone to do this on
my
behalf (if someone can work on solving that wiki account creation, it
would
be great).

The biggest difference between Record and ConsumerRecord is that data are
stored respectively using ByteBuffer and Byte array.

For the Record option, the object already exists in the parsing method, so
it's roughly just a parameter type change in the Exception. The point is
just about exposing the Record class externally. By the way, the name
Record is also making some IDE a bit crazy by confusing it with the new
Java Record feature. An alternative could be to create another wrapper
type
of just include key and value ByteBuffer in the
RecordDeserializationException itself.

For the ConsumerRecord option, it requires to allocate Byte arrays, even
if
the user does not need it (skip the poison pill for example). This might
have some extra cost on GC for some specific use case.

Fred



Reply via email to