Re: [DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2020-01-15 Thread Bill Bejeck
Great all of that sounds good to me.

Since we are all in agreement, we can move to a vote.

-Bill

On Wed, Jan 15, 2020 at 11:51 AM am  wrote:

> I agree WARN seems like it would be fine given we already log at that
> level for the handler. It also seems reasonable to exclude the
> ClassCastException as that can indicate something other then a simple
> serialization exception and would keep the current behavior.
>
> anna
>
> On Tue, Jan 14, 2020 at 11:41 PM Matthias J. Sax 
> wrote:
> >
> > I was just checking the existing code, and we currently log at WARN
> > level if the handler returns CONTINUE -- we did not have any complaints
> > about it, hence, I don't see an issue with WARN -- and we should keep it
> > consistent.
> >
> > I think the explicit mentioning of `ClassCastException` is based on the
> > current code that catches this exception to rethrow it -- this was a
> > minor improvement to help people to more easily detect miss-configure
> > serdes.
> >
> > In think, we can just catch all exception and the handler can decide
> > what to do. Thinking about this once more, it might actually be better
> > if we could _exclude_ `ClassCastException` as it may indicate a miss
> > configured Serde?
> >
> >
> > -Matthias
> >
> > On 1/14/20 4:15 PM, Bill Bejeck wrote:
> > > Hi Anna,
> > >
> > > Thanks for getting this KIP going again.
> > >
> > > I agree with pushing forward on option 0 as well.  I a couple of
> questions
> > > about the KIP as written.
> > >
> > > The KIP states that any {{ClassCastException}} thrown plus any other
> > > unchecked exceptions will result in a log statement and not stop
> processing
> > > if the handler returns CONTINUE.
> > >
> > >1. I'm wondering if DEBUG is the correct level vs. a WARN,
> although, at
> > >WARN, we could end up spamming the log file.
> > >2. Are allowing all unchecked exceptions to proceed to permissive?
> I
> > >could be overly cautious here, but I'm afraid of masking a serious
> > >problem.
> > >
> > > Overall I'm in favor of this KIP and if you feel it's good as is, I
> > > wouldn't block on these questions  I just wanted to throw in my 2
> cents.
> > >
> > > Thanks,
> > > Bill
> > >
> > > On Sat, Jan 11, 2020 at 7:02 PM Mitchell  wrote:
> > >
> > >> I'm happy to have the serialization handler now.  I've hit this issue
> > >> a number of times in the past.
> > >>
> > >> I think the other options are also large enough they probably deserve
> > >> their own KIPs to properly document the changes.
> > >> -mitch
> > >>
> > >> On Fri, Jan 10, 2020 at 7:33 PM am  wrote:
> > >>>
> > >>> Hello,
> > >>>
> > >>> I would like to re-restart the discussion of KIP-399
> > >>>
> > >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-399%3A+Extend+ProductionExceptionHandler+to+cover+serialization+exceptions
> > >>>
> > >>> The last conversation centered on if this KIP should address the
> issues
> > >>> around state store/change log divergence with Matthias presenting
> three
> > >>> options:
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> *To move this KIP forward, maybe we can just (0) add the handler
> > >>> forserialization exceptions when writing into any topic and consider
> it
> > >>> anincremental improvement. Ie, (1) we keep the door open to let state
> > >>> andchangelog topic diverge (current status) (2) we allow people to
> > >>> violateEOS (current state) (3) and we don't improve the handling of
> DSL
> > >>> statestore serialization exceptions.We could address (1), (2),
> and/or (3)
> > >>> in follow up KIPs.Thoughts? Let us know if you only want to address
> (0),
> > >> or
> > >>> extend thecurrent KIP to include any of (1-3).*
> > >>>
> > >>>
> > >>> I would like to propose we go with option 0 and treat this as an
> > >>> incremental improvement that applies to any topic and address the
> issue
> > >> of
> > >>> divergence in future KIP(s).
> > >>>
> > >>> Feedback, thoughts and musings appreciated,
> > >>>
> > >>> anna
> > >>
> > >
> >
>


Re: [DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2020-01-15 Thread am
I agree WARN seems like it would be fine given we already log at that
level for the handler. It also seems reasonable to exclude the
ClassCastException as that can indicate something other then a simple
serialization exception and would keep the current behavior.

anna

On Tue, Jan 14, 2020 at 11:41 PM Matthias J. Sax  wrote:
>
> I was just checking the existing code, and we currently log at WARN
> level if the handler returns CONTINUE -- we did not have any complaints
> about it, hence, I don't see an issue with WARN -- and we should keep it
> consistent.
>
> I think the explicit mentioning of `ClassCastException` is based on the
> current code that catches this exception to rethrow it -- this was a
> minor improvement to help people to more easily detect miss-configure
> serdes.
>
> In think, we can just catch all exception and the handler can decide
> what to do. Thinking about this once more, it might actually be better
> if we could _exclude_ `ClassCastException` as it may indicate a miss
> configured Serde?
>
>
> -Matthias
>
> On 1/14/20 4:15 PM, Bill Bejeck wrote:
> > Hi Anna,
> >
> > Thanks for getting this KIP going again.
> >
> > I agree with pushing forward on option 0 as well.  I a couple of questions
> > about the KIP as written.
> >
> > The KIP states that any {{ClassCastException}} thrown plus any other
> > unchecked exceptions will result in a log statement and not stop processing
> > if the handler returns CONTINUE.
> >
> >1. I'm wondering if DEBUG is the correct level vs. a WARN, although, at
> >WARN, we could end up spamming the log file.
> >2. Are allowing all unchecked exceptions to proceed to permissive?  I
> >could be overly cautious here, but I'm afraid of masking a serious
> >problem.
> >
> > Overall I'm in favor of this KIP and if you feel it's good as is, I
> > wouldn't block on these questions  I just wanted to throw in my 2 cents.
> >
> > Thanks,
> > Bill
> >
> > On Sat, Jan 11, 2020 at 7:02 PM Mitchell  wrote:
> >
> >> I'm happy to have the serialization handler now.  I've hit this issue
> >> a number of times in the past.
> >>
> >> I think the other options are also large enough they probably deserve
> >> their own KIPs to properly document the changes.
> >> -mitch
> >>
> >> On Fri, Jan 10, 2020 at 7:33 PM am  wrote:
> >>>
> >>> Hello,
> >>>
> >>> I would like to re-restart the discussion of KIP-399
> >>>
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-399%3A+Extend+ProductionExceptionHandler+to+cover+serialization+exceptions
> >>>
> >>> The last conversation centered on if this KIP should address the issues
> >>> around state store/change log divergence with Matthias presenting three
> >>> options:
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> *To move this KIP forward, maybe we can just (0) add the handler
> >>> forserialization exceptions when writing into any topic and consider it
> >>> anincremental improvement. Ie, (1) we keep the door open to let state
> >>> andchangelog topic diverge (current status) (2) we allow people to
> >>> violateEOS (current state) (3) and we don't improve the handling of DSL
> >>> statestore serialization exceptions.We could address (1), (2), and/or (3)
> >>> in follow up KIPs.Thoughts? Let us know if you only want to address (0),
> >> or
> >>> extend thecurrent KIP to include any of (1-3).*
> >>>
> >>>
> >>> I would like to propose we go with option 0 and treat this as an
> >>> incremental improvement that applies to any topic and address the issue
> >> of
> >>> divergence in future KIP(s).
> >>>
> >>> Feedback, thoughts and musings appreciated,
> >>>
> >>> anna
> >>
> >
>


Re: [DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2020-01-14 Thread Matthias J. Sax
I was just checking the existing code, and we currently log at WARN
level if the handler returns CONTINUE -- we did not have any complaints
about it, hence, I don't see an issue with WARN -- and we should keep it
consistent.

I think the explicit mentioning of `ClassCastException` is based on the
current code that catches this exception to rethrow it -- this was a
minor improvement to help people to more easily detect miss-configure
serdes.

In think, we can just catch all exception and the handler can decide
what to do. Thinking about this once more, it might actually be better
if we could _exclude_ `ClassCastException` as it may indicate a miss
configured Serde?


-Matthias

On 1/14/20 4:15 PM, Bill Bejeck wrote:
> Hi Anna,
> 
> Thanks for getting this KIP going again.
> 
> I agree with pushing forward on option 0 as well.  I a couple of questions
> about the KIP as written.
> 
> The KIP states that any {{ClassCastException}} thrown plus any other
> unchecked exceptions will result in a log statement and not stop processing
> if the handler returns CONTINUE.
> 
>1. I'm wondering if DEBUG is the correct level vs. a WARN, although, at
>WARN, we could end up spamming the log file.
>2. Are allowing all unchecked exceptions to proceed to permissive?  I
>could be overly cautious here, but I'm afraid of masking a serious
>problem.
> 
> Overall I'm in favor of this KIP and if you feel it's good as is, I
> wouldn't block on these questions  I just wanted to throw in my 2 cents.
> 
> Thanks,
> Bill
> 
> On Sat, Jan 11, 2020 at 7:02 PM Mitchell  wrote:
> 
>> I'm happy to have the serialization handler now.  I've hit this issue
>> a number of times in the past.
>>
>> I think the other options are also large enough they probably deserve
>> their own KIPs to properly document the changes.
>> -mitch
>>
>> On Fri, Jan 10, 2020 at 7:33 PM am  wrote:
>>>
>>> Hello,
>>>
>>> I would like to re-restart the discussion of KIP-399
>>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-399%3A+Extend+ProductionExceptionHandler+to+cover+serialization+exceptions
>>>
>>> The last conversation centered on if this KIP should address the issues
>>> around state store/change log divergence with Matthias presenting three
>>> options:
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> *To move this KIP forward, maybe we can just (0) add the handler
>>> forserialization exceptions when writing into any topic and consider it
>>> anincremental improvement. Ie, (1) we keep the door open to let state
>>> andchangelog topic diverge (current status) (2) we allow people to
>>> violateEOS (current state) (3) and we don't improve the handling of DSL
>>> statestore serialization exceptions.We could address (1), (2), and/or (3)
>>> in follow up KIPs.Thoughts? Let us know if you only want to address (0),
>> or
>>> extend thecurrent KIP to include any of (1-3).*
>>>
>>>
>>> I would like to propose we go with option 0 and treat this as an
>>> incremental improvement that applies to any topic and address the issue
>> of
>>> divergence in future KIP(s).
>>>
>>> Feedback, thoughts and musings appreciated,
>>>
>>> anna
>>
> 



signature.asc
Description: OpenPGP digital signature


Re: [DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2020-01-14 Thread Bill Bejeck
Hi Anna,

Thanks for getting this KIP going again.

I agree with pushing forward on option 0 as well.  I a couple of questions
about the KIP as written.

The KIP states that any {{ClassCastException}} thrown plus any other
unchecked exceptions will result in a log statement and not stop processing
if the handler returns CONTINUE.

   1. I'm wondering if DEBUG is the correct level vs. a WARN, although, at
   WARN, we could end up spamming the log file.
   2. Are allowing all unchecked exceptions to proceed to permissive?  I
   could be overly cautious here, but I'm afraid of masking a serious
   problem.

Overall I'm in favor of this KIP and if you feel it's good as is, I
wouldn't block on these questions  I just wanted to throw in my 2 cents.

Thanks,
Bill

On Sat, Jan 11, 2020 at 7:02 PM Mitchell  wrote:

> I'm happy to have the serialization handler now.  I've hit this issue
> a number of times in the past.
>
> I think the other options are also large enough they probably deserve
> their own KIPs to properly document the changes.
> -mitch
>
> On Fri, Jan 10, 2020 at 7:33 PM am  wrote:
> >
> > Hello,
> >
> > I would like to re-restart the discussion of KIP-399
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-399%3A+Extend+ProductionExceptionHandler+to+cover+serialization+exceptions
> >
> > The last conversation centered on if this KIP should address the issues
> > around state store/change log divergence with Matthias presenting three
> > options:
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > *To move this KIP forward, maybe we can just (0) add the handler
> > forserialization exceptions when writing into any topic and consider it
> > anincremental improvement. Ie, (1) we keep the door open to let state
> > andchangelog topic diverge (current status) (2) we allow people to
> > violateEOS (current state) (3) and we don't improve the handling of DSL
> > statestore serialization exceptions.We could address (1), (2), and/or (3)
> > in follow up KIPs.Thoughts? Let us know if you only want to address (0),
> or
> > extend thecurrent KIP to include any of (1-3).*
> >
> >
> > I would like to propose we go with option 0 and treat this as an
> > incremental improvement that applies to any topic and address the issue
> of
> > divergence in future KIP(s).
> >
> > Feedback, thoughts and musings appreciated,
> >
> > anna
>


Re: [DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2020-01-11 Thread Mitchell
I'm happy to have the serialization handler now.  I've hit this issue
a number of times in the past.

I think the other options are also large enough they probably deserve
their own KIPs to properly document the changes.
-mitch

On Fri, Jan 10, 2020 at 7:33 PM am  wrote:
>
> Hello,
>
> I would like to re-restart the discussion of KIP-399
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-399%3A+Extend+ProductionExceptionHandler+to+cover+serialization+exceptions
>
> The last conversation centered on if this KIP should address the issues
> around state store/change log divergence with Matthias presenting three
> options:
>
>
>
>
>
>
>
>
>
>
>
> *To move this KIP forward, maybe we can just (0) add the handler
> forserialization exceptions when writing into any topic and consider it
> anincremental improvement. Ie, (1) we keep the door open to let state
> andchangelog topic diverge (current status) (2) we allow people to
> violateEOS (current state) (3) and we don't improve the handling of DSL
> statestore serialization exceptions.We could address (1), (2), and/or (3)
> in follow up KIPs.Thoughts? Let us know if you only want to address (0), or
> extend thecurrent KIP to include any of (1-3).*
>
>
> I would like to propose we go with option 0 and treat this as an
> incremental improvement that applies to any topic and address the issue of
> divergence in future KIP(s).
>
> Feedback, thoughts and musings appreciated,
>
> anna


[DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2020-01-10 Thread am
Hello,

I would like to re-restart the discussion of KIP-399
https://cwiki.apache.org/confluence/display/KAFKA/KIP-399%3A+Extend+ProductionExceptionHandler+to+cover+serialization+exceptions

The last conversation centered on if this KIP should address the issues
around state store/change log divergence with Matthias presenting three
options:











*To move this KIP forward, maybe we can just (0) add the handler
forserialization exceptions when writing into any topic and consider it
anincremental improvement. Ie, (1) we keep the door open to let state
andchangelog topic diverge (current status) (2) we allow people to
violateEOS (current state) (3) and we don't improve the handling of DSL
statestore serialization exceptions.We could address (1), (2), and/or (3)
in follow up KIPs.Thoughts? Let us know if you only want to address (0), or
extend thecurrent KIP to include any of (1-3).*


I would like to propose we go with option 0 and treat this as an
incremental improvement that applies to any topic and address the issue of
divergence in future KIP(s).

Feedback, thoughts and musings appreciated,

anna


Re: [DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2019-10-22 Thread Matthias J. Sax
Some more details:

Changelog Topics: in the DSL, serialization happens in the store level
(MeteredXxxStore) and hence any serialization exception should be thrown
to the `Processor` before the message is written to the changelog topic
or underlying store. When the actually write to the changelog topic
happens, no serialization exception can occur any longer as we already
have `` key-value-pairs at hand.

Note, that for the existing ProductionExceptionHandler, the handler
might still swallow an exception currently and state store and changelog
topic may diverge. If we believe that this is an issue we should fix, it
would be in addition to allow handling serialization exceptions -- on
the other hand, one could argue, that it's the users responsibility to
check the topic name on the exception handler callback and don't swallow
for changelog topics. If we follow this argument, we don't need to
extend the scope of the KIP but push this onto the users (more
flexibility == more responsibility).

Getting back to the DSL serialization exception handler, there is still
the problem that users don't have any chance to catch such an exception
atm. For PAPI users, this issue does not persist as they can add
try-catch blocks to react to serialization exceptions and customize
their code accordingly.

Similar for EOS, users can register a handler, but it would of course
violate the guarantees if they swallow an exception. Again, we could
disallow it, or argue that it's a users own responsibility.

To move this KIP forward, maybe we can just (0) add the handler for
serialization exceptions when writing into any topic and consider it an
incremental improvement. Ie, (1) we keep the door open to let state and
changelog topic diverge (current status) (2) we allow people to violate
EOS (current state) (3) and we don't improve the handling of DSL state
store serialization exceptions.

We could address (1), (2), and/or (3) in follow up KIPs.

Thoughts? Let us know if you only want to address (0), or extend the
current KIP to include any of (1-3).

Would love the hear from others, too.


-Matthias

On 10/17/19 2:52 PM, Walker Carlson wrote:
> I have read through the logs and I am of the opinion that while we should
> not let the store and the change Log diverge. However, it is not obvious
> how we would be able to do that and allow the custom serializer to effect
> that topic. In order to get around this we can create a shell around the
> handler that will not trigger the custom handler if it is a changeLog
> topic. This is also a problem that we can fix in the general handler, so I
> think we can add that fix to this kip.
> 
> As for the repartition topics I can not see any reason to not apply the
> custom logic. By default it fails in any case and does not have to be
> changed from that.
> 
> For the DSL I think that that can come later as this change would not
> affect the status quo. EOS and this handler can be made
> mutually exclusive or we could give users the option to use both with a
> warning. I would be interested in hearing other people's suggestions about
> that.Walker
> 
> On Tue, Oct 15, 2019 at 11:50 PM Matthias J. Sax 
> wrote:
> 
>> Walker,
>>
>> thanks for picking up this KIP. Did you read the previous discussion?
>> It's still unclear if we want to apply the handler to repartition topics
>> or not, and how errors for stores and changelog topic should be handled.
>> For the Processor API, users could catch `SerializationException` but
>> for the DSL this is not possible. Even if there is no serialization
>> problem, it's questionable if we should allow to swallow error when
>> writing into the changelog topic, because the store content and the
>> topic content could diverge, what is especially critical for the EOS case.
>>
>>
>> -Matthias
>>
>> On 10/15/19 10:25 AM, Walker Carlson wrote:
>>> Hello all,
>>>
>>> I would like to restart the discussion of this KIP 399
>>> <
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-399%3A+Extend+ProductionExceptionHandler+to+cover+serialization+exceptions
>>> .
>>> I think it is some low hanging fruit that could be quite beneficial.
>>>
>>> Thanks,
>>> Walker
>>>
>>
>>
> 



signature.asc
Description: OpenPGP digital signature


Re: [DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2019-10-17 Thread Walker Carlson
I have read through the logs and I am of the opinion that while we should
not let the store and the change Log diverge. However, it is not obvious
how we would be able to do that and allow the custom serializer to effect
that topic. In order to get around this we can create a shell around the
handler that will not trigger the custom handler if it is a changeLog
topic. This is also a problem that we can fix in the general handler, so I
think we can add that fix to this kip.

As for the repartition topics I can not see any reason to not apply the
custom logic. By default it fails in any case and does not have to be
changed from that.

For the DSL I think that that can come later as this change would not
affect the status quo. EOS and this handler can be made
mutually exclusive or we could give users the option to use both with a
warning. I would be interested in hearing other people's suggestions about
that.Walker

On Tue, Oct 15, 2019 at 11:50 PM Matthias J. Sax 
wrote:

> Walker,
>
> thanks for picking up this KIP. Did you read the previous discussion?
> It's still unclear if we want to apply the handler to repartition topics
> or not, and how errors for stores and changelog topic should be handled.
> For the Processor API, users could catch `SerializationException` but
> for the DSL this is not possible. Even if there is no serialization
> problem, it's questionable if we should allow to swallow error when
> writing into the changelog topic, because the store content and the
> topic content could diverge, what is especially critical for the EOS case.
>
>
> -Matthias
>
> On 10/15/19 10:25 AM, Walker Carlson wrote:
> > Hello all,
> >
> > I would like to restart the discussion of this KIP 399
> > <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-399%3A+Extend+ProductionExceptionHandler+to+cover+serialization+exceptions
> >.
> > I think it is some low hanging fruit that could be quite beneficial.
> >
> > Thanks,
> > Walker
> >
>
>


Re: [DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2019-10-16 Thread Matthias J. Sax
Walker,

thanks for picking up this KIP. Did you read the previous discussion?
It's still unclear if we want to apply the handler to repartition topics
or not, and how errors for stores and changelog topic should be handled.
For the Processor API, users could catch `SerializationException` but
for the DSL this is not possible. Even if there is no serialization
problem, it's questionable if we should allow to swallow error when
writing into the changelog topic, because the store content and the
topic content could diverge, what is especially critical for the EOS case.


-Matthias

On 10/15/19 10:25 AM, Walker Carlson wrote:
> Hello all,
> 
> I would like to restart the discussion of this KIP 399
> .
> I think it is some low hanging fruit that could be quite beneficial.
> 
> Thanks,
> Walker
> 



signature.asc
Description: OpenPGP digital signature


[DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2019-10-15 Thread Walker Carlson
Hello all,

I would like to restart the discussion of this KIP 399
.
I think it is some low hanging fruit that could be quite beneficial.

Thanks,
Walker


Re: [DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2019-10-07 Thread Matthias J. Sax
Alaa,

what is the status of this KIP? Are you still working on it?

It would also be good to get feedback from other about the open questions.


-Matthias

On 9/17/19 5:33 PM, Matthias J. Sax wrote:
> Thanks for picking up this KIP.
> 
> My personal take about repartition topics is, that it seems to be ok to
> apply the handler for those, too (in addition to output topics). It
> seems to be more flexible and also simplifies the code. In the end, the
> topic name is passed via `ProducerRecord` into the handler and thus
> users can decide on a per-topic basis what to do.
> 
> About stores and changelogs: yes, serialization happens first. Hence,
> when we put() into RocksDB and also send() to the changelog topic (in
> that case we use `ByteArraySerializer`) no serialization error should
> happen (if there would have been a problem, it would have happened earlier).
> 
> However, in KIP-210, we did not consider the case that a send() might
> fail for a changelog topic while the put() into the store was already
> successfully applies. Hence, it's possible atm, to skip a failed write
> into the changelog topic, even if the put() into the store was
> successful. This seems to be a bug to me and we might want to create a
> separate Jira for it -- it's related to this KIP but should not be
> mangled into the KIP IMHO.
> 
> For the KIP itself, what we _could_ do is, to apply the handler to the
> serialization that happens before the data is put() into the store.
> However, I am not sure if we should allow this -- atm, I tend to think
> we should not allow it and exclude store serialization for the handler.
> 
> 
> 
> -Matthias
> 
> On 9/10/19 1:46 AM, Alaa Zbair wrote:
>> Hi,
>>
>> I have checked the KIP-399 and the discussion and also KIP-210.
>>
>> So the question we need to answer is whether it's okay to also skip
>> writing the record in the internal topics, the current implementation of
>> 'ProductionExceptionHandler' is applied for all topics and if we decided
>> to keep it that way, how to ensure that there will be no divergence in
>> local store and changelog topic ?
>>
>> I would like to get input from others on what they think about this.
>>
>> There is a point that I don't understand which is: why in case of a
>> serialization error do we have the choice of either skipping it or
>> putting in the store ? shouldn't the record be correctly serialized
>> before putting it into the store ?
>>
>> On 13/12/2018 14:13, Matthias J. Sax wrote:
>>> For store updates, records are first serialized and afterwards put into
>>> the store and written into the changelog topic.
>>>
>>> In the current implementation, if the send() into the changelog topic
>>> produces an error and the handler skips over it, the local store content
>>> and the changelog topic diverge. This seems to be a correctness issue.
>>>
>>> For serialization error, it would not happen that store and changelog
>>> diverge, because serialization happens before and put/send. Thus, with
>>> this KIP we could skip both put() and send(). However, I am still
>>> wondering, if it would be ok to skip a store update for this case? (Btw:
>>> the current PR does not address this atm, and a serialization error for
>>> a store write would not be covered but kill the instance).
>>>
>>> IIRC, the original idea of the KIP was to allow skipping over record for
>>> output topics only. That's why I am wondering if it's ok to allow
>>> skipper over record in repartitions topics, too.
>>>
>>> In the end, it's some data loss for all 3 cases, so maybe it's ok to
>>> allow skipping for all 3 cases. However, we should not allow that local
>>> store and changelog topic diverge IMHO (what might been an orthogonal
>>> bug thought).
>>>
>>> I also don't have an answer or preference. Just think, it's important to
>>> touch on those cases and get input how people think about it.
>>>
>>>
>>> -Matthias
>>>
>>>
>>>
>>> On 12/11/18 11:43 AM, Kamal Chandraprakash wrote:
 Matthias,

 For changelog topics, I think it does not make sense to allow skipping
 records if serialization fails? For internal repartitions topics, I am
 not sure if we should allow it or not. Would you agree with this? We
 should discuss the implication to derive a sound design.

 Can you explain the issue that happens when records are skipped to
 changelog / internal-repartition topics ? So, that I can look into it.

 On Fri, Dec 7, 2018 at 12:07 AM Matthias J. Sax 
 wrote:

>>> To accept different types of records from multiple topologies, I
>>> have to
>>> define the ProducerRecord without generics.
> Yes. It does make sense. My point was, that the KIP should
> mention/explain this explicitly to allow other not familiar with the
> code base to understand it more easily :)
>
>
>
> About `ClassCastException`: seems to be an implementation detail. No
> need to make it part of the KIP discussion.
>
>
>
> One 

Re: [DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2019-09-17 Thread Matthias J. Sax
Thanks for picking up this KIP.

My personal take about repartition topics is, that it seems to be ok to
apply the handler for those, too (in addition to output topics). It
seems to be more flexible and also simplifies the code. In the end, the
topic name is passed via `ProducerRecord` into the handler and thus
users can decide on a per-topic basis what to do.

About stores and changelogs: yes, serialization happens first. Hence,
when we put() into RocksDB and also send() to the changelog topic (in
that case we use `ByteArraySerializer`) no serialization error should
happen (if there would have been a problem, it would have happened earlier).

However, in KIP-210, we did not consider the case that a send() might
fail for a changelog topic while the put() into the store was already
successfully applies. Hence, it's possible atm, to skip a failed write
into the changelog topic, even if the put() into the store was
successful. This seems to be a bug to me and we might want to create a
separate Jira for it -- it's related to this KIP but should not be
mangled into the KIP IMHO.

For the KIP itself, what we _could_ do is, to apply the handler to the
serialization that happens before the data is put() into the store.
However, I am not sure if we should allow this -- atm, I tend to think
we should not allow it and exclude store serialization for the handler.



-Matthias

On 9/10/19 1:46 AM, Alaa Zbair wrote:
> Hi,
> 
> I have checked the KIP-399 and the discussion and also KIP-210.
> 
> So the question we need to answer is whether it's okay to also skip
> writing the record in the internal topics, the current implementation of
> 'ProductionExceptionHandler' is applied for all topics and if we decided
> to keep it that way, how to ensure that there will be no divergence in
> local store and changelog topic ?
> 
> I would like to get input from others on what they think about this.
> 
> There is a point that I don't understand which is: why in case of a
> serialization error do we have the choice of either skipping it or
> putting in the store ? shouldn't the record be correctly serialized
> before putting it into the store ?
> 
> On 13/12/2018 14:13, Matthias J. Sax wrote:
>> For store updates, records are first serialized and afterwards put into
>> the store and written into the changelog topic.
>>
>> In the current implementation, if the send() into the changelog topic
>> produces an error and the handler skips over it, the local store content
>> and the changelog topic diverge. This seems to be a correctness issue.
>>
>> For serialization error, it would not happen that store and changelog
>> diverge, because serialization happens before and put/send. Thus, with
>> this KIP we could skip both put() and send(). However, I am still
>> wondering, if it would be ok to skip a store update for this case? (Btw:
>> the current PR does not address this atm, and a serialization error for
>> a store write would not be covered but kill the instance).
>>
>> IIRC, the original idea of the KIP was to allow skipping over record for
>> output topics only. That's why I am wondering if it's ok to allow
>> skipper over record in repartitions topics, too.
>>
>> In the end, it's some data loss for all 3 cases, so maybe it's ok to
>> allow skipping for all 3 cases. However, we should not allow that local
>> store and changelog topic diverge IMHO (what might been an orthogonal
>> bug thought).
>>
>> I also don't have an answer or preference. Just think, it's important to
>> touch on those cases and get input how people think about it.
>>
>>
>> -Matthias
>>
>>
>>
>> On 12/11/18 11:43 AM, Kamal Chandraprakash wrote:
>>> Matthias,
>>>
>>> For changelog topics, I think it does not make sense to allow skipping
>>> records if serialization fails? For internal repartitions topics, I am
>>> not sure if we should allow it or not. Would you agree with this? We
>>> should discuss the implication to derive a sound design.
>>>
>>> Can you explain the issue that happens when records are skipped to
>>> changelog / internal-repartition topics ? So, that I can look into it.
>>>
>>> On Fri, Dec 7, 2018 at 12:07 AM Matthias J. Sax 
>>> wrote:
>>>
>> To accept different types of records from multiple topologies, I
>> have to
>> define the ProducerRecord without generics.
 Yes. It does make sense. My point was, that the KIP should
 mention/explain this explicitly to allow other not familiar with the
 code base to understand it more easily :)



 About `ClassCastException`: seems to be an implementation detail. No
 need to make it part of the KIP discussion.



 One more thing that came to my mind. We use the `RecordCollector` to
 write into all topics, ie, user output topics and internal repartition
 and changelog topics.

 For changelog topics, I think it does not make sense to allow skipping
 records if serialization fails? For internal repartitions topics, I am
 not 

Re: [DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2019-09-10 Thread Alaa Zbair

Hi,

I have checked the KIP-399 and the discussion and also KIP-210.

So the question we need to answer is whether it's okay to also skip
writing the record in the internal topics, the current implementation of
'ProductionExceptionHandler' is applied for all topics and if we decided
to keep it that way, how to ensure that there will be no divergence in
local store and changelog topic ?

I would like to get input from others on what they think about this.

There is a point that I don't understand which is: why in case of a
serialization error do we have the choice of either skipping it or
putting in the store ? shouldn't the record be correctly serialized
before putting it into the store ?

On 13/12/2018 14:13, Matthias J. Sax wrote:

For store updates, records are first serialized and afterwards put into
the store and written into the changelog topic.

In the current implementation, if the send() into the changelog topic
produces an error and the handler skips over it, the local store content
and the changelog topic diverge. This seems to be a correctness issue.

For serialization error, it would not happen that store and changelog
diverge, because serialization happens before and put/send. Thus, with
this KIP we could skip both put() and send(). However, I am still
wondering, if it would be ok to skip a store update for this case? (Btw:
the current PR does not address this atm, and a serialization error for
a store write would not be covered but kill the instance).

IIRC, the original idea of the KIP was to allow skipping over record for
output topics only. That's why I am wondering if it's ok to allow
skipper over record in repartitions topics, too.

In the end, it's some data loss for all 3 cases, so maybe it's ok to
allow skipping for all 3 cases. However, we should not allow that local
store and changelog topic diverge IMHO (what might been an orthogonal
bug thought).

I also don't have an answer or preference. Just think, it's important to
touch on those cases and get input how people think about it.


-Matthias



On 12/11/18 11:43 AM, Kamal Chandraprakash wrote:

Matthias,

For changelog topics, I think it does not make sense to allow skipping
records if serialization fails? For internal repartitions topics, I am
not sure if we should allow it or not. Would you agree with this? We
should discuss the implication to derive a sound design.

Can you explain the issue that happens when records are skipped to
changelog / internal-repartition topics ? So, that I can look into it.

On Fri, Dec 7, 2018 at 12:07 AM Matthias J. Sax 
wrote:


To accept different types of records from multiple topologies, I have to
define the ProducerRecord without generics.

Yes. It does make sense. My point was, that the KIP should
mention/explain this explicitly to allow other not familiar with the
code base to understand it more easily :)



About `ClassCastException`: seems to be an implementation detail. No
need to make it part of the KIP discussion.



One more thing that came to my mind. We use the `RecordCollector` to
write into all topics, ie, user output topics and internal repartition
and changelog topics.

For changelog topics, I think it does not make sense to allow skipping
records if serialization fails? For internal repartitions topics, I am
not sure if we should allow it or not. Would you agree with this? We
should discuss the implication to derive a sound design.

I was also just double checking the code, and it seems that the current
`ProductionExceptionHandler` is applied for all topics. This seems to be
incorrect to me. Seems we missed this case when doing KIP-210? (Or did
we discuss this and I cannot remember? Might be worth to double check.)

Last thought: of course, the handler will know which topic is affected
and can provide a corresponding implementation. Was just wondering if we
should be more strict?


-Matthias

On 12/6/18 10:01 AM, Kamal Chandraprakash wrote:

Matt,
 I agree with Matthias on not to altering the serializer as it's used

by

multiple components.

Matthias,

  - the proposed method accepts a `ProducerRecord` -- it might be good to
explain why this cannot be done in a type safe way (ie, missing generics)

To accept different types of records from multiple topologies, I have to
define the ProducerRecord without generics.

- `AlwaysProductionExceptionHandler` ->
`AlwaysContinueProductionExceptionHandler`

Updated the typo error in KIP.

  - `DefaultProductionExceptionHandler` is not mentioned

The `handleSerializationException` method in the
`ProductionExceptionHandler` interface will have default implementation
that is set to FAIL by default.
This is done to avoid any changes in the user implementation. So, I

didn't

mentioned the `DefaultProductionExceptionHandler` class. Updated the KIP.

- Why do you distinguish between `ClassCastException` and "any other
unchecked exception? Both second case seems to include the first one?

In SinkNode.java#93
<


[DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2019-09-09 Thread alaa

Hi Matthias,

I have checked the KIP-399 and the discussion and also KIP-210.

So the question we need to answer is whether it's okay to also skip 
writing the record in the internal topics, the current implementation of 
'ProductionExceptionHandler' is applied for all topics and if we decided 
to keep it that way, how to ensure that there will be no divergence in 
local store and changelog topic ?


I would like to get input from others on what they think about this.

There is a point I don't understand which is: why in case of a 
serialization error do we have the choice of either skipping it or 
putting in the store ? shouldn't the record be correctly serialized 
before putting it into the store ?


Alaa.





Re: [DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2018-12-13 Thread Matthias J. Sax
For store updates, records are first serialized and afterwards put into
the store and written into the changelog topic.

In the current implementation, if the send() into the changelog topic
produces an error and the handler skips over it, the local store content
and the changelog topic diverge. This seems to be a correctness issue.

For serialization error, it would not happen that store and changelog
diverge, because serialization happens before and put/send. Thus, with
this KIP we could skip both put() and send(). However, I am still
wondering, if it would be ok to skip a store update for this case? (Btw:
the current PR does not address this atm, and a serialization error for
a store write would not be covered but kill the instance).

IIRC, the original idea of the KIP was to allow skipping over record for
output topics only. That's why I am wondering if it's ok to allow
skipper over record in repartitions topics, too.

In the end, it's some data loss for all 3 cases, so maybe it's ok to
allow skipping for all 3 cases. However, we should not allow that local
store and changelog topic diverge IMHO (what might been an orthogonal
bug thought).

I also don't have an answer or preference. Just think, it's important to
touch on those cases and get input how people think about it.


-Matthias



On 12/11/18 11:43 AM, Kamal Chandraprakash wrote:
> Matthias,
> 
> For changelog topics, I think it does not make sense to allow skipping
> records if serialization fails? For internal repartitions topics, I am
> not sure if we should allow it or not. Would you agree with this? We
> should discuss the implication to derive a sound design.
> 
> Can you explain the issue that happens when records are skipped to
> changelog / internal-repartition topics ? So, that I can look into it.
> 
> On Fri, Dec 7, 2018 at 12:07 AM Matthias J. Sax 
> wrote:
> 
 To accept different types of records from multiple topologies, I have to
 define the ProducerRecord without generics.
>>
>> Yes. It does make sense. My point was, that the KIP should
>> mention/explain this explicitly to allow other not familiar with the
>> code base to understand it more easily :)
>>
>>
>>
>> About `ClassCastException`: seems to be an implementation detail. No
>> need to make it part of the KIP discussion.
>>
>>
>>
>> One more thing that came to my mind. We use the `RecordCollector` to
>> write into all topics, ie, user output topics and internal repartition
>> and changelog topics.
>>
>> For changelog topics, I think it does not make sense to allow skipping
>> records if serialization fails? For internal repartitions topics, I am
>> not sure if we should allow it or not. Would you agree with this? We
>> should discuss the implication to derive a sound design.
>>
>> I was also just double checking the code, and it seems that the current
>> `ProductionExceptionHandler` is applied for all topics. This seems to be
>> incorrect to me. Seems we missed this case when doing KIP-210? (Or did
>> we discuss this and I cannot remember? Might be worth to double check.)
>>
>> Last thought: of course, the handler will know which topic is affected
>> and can provide a corresponding implementation. Was just wondering if we
>> should be more strict?
>>
>>
>> -Matthias
>>
>> On 12/6/18 10:01 AM, Kamal Chandraprakash wrote:
>>> Matt,
>>> I agree with Matthias on not to altering the serializer as it's used
>> by
>>> multiple components.
>>>
>>> Matthias,
>>>
>>>  - the proposed method accepts a `ProducerRecord` -- it might be good to
>>> explain why this cannot be done in a type safe way (ie, missing generics)
>>>
>>> To accept different types of records from multiple topologies, I have to
>>> define the ProducerRecord without generics.
>>>
>>> - `AlwaysProductionExceptionHandler` ->
>>> `AlwaysContinueProductionExceptionHandler`
>>>
>>> Updated the typo error in KIP.
>>>
>>>  - `DefaultProductionExceptionHandler` is not mentioned
>>>
>>> The `handleSerializationException` method in the
>>> `ProductionExceptionHandler` interface will have default implementation
>>> that is set to FAIL by default.
>>> This is done to avoid any changes in the user implementation. So, I
>> didn't
>>> mentioned the `DefaultProductionExceptionHandler` class. Updated the KIP.
>>>
>>> - Why do you distinguish between `ClassCastException` and "any other
>>> unchecked exception? Both second case seems to include the first one?
>>>
>>> In SinkNode.java#93
>>> <
>> https://github.com/apache/kafka/blob/87cc31c4e7ea36e7e832a1d02d71480a91a75293/streams/src/main/java/org/apache/kafka/streams/processor/internals/SinkNode.java#L93
>>>
>>> on
>>> hitting `ClassCastException`, we are halting the streams as it's a fatal
>>> error.
>>> To keep the original behavior, I've to distinguish the exceptions.
>>>
>>>
>>> On Thu, Dec 6, 2018 at 10:44 PM Matthias J. Sax 
>>> wrote:
>>>
 Well, that's exactly the point. The serializer should not be altered
 IMHO because this would have impact on other 

Re: [DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2018-12-11 Thread Matt Farmer
Sorry for the delay in reply here. My personal life is a bit interesting at
the moment so keeping up with email seems more and more impossible. :)

That all makes sense to me! Thanks for entertaining the questions!

On Thu, Dec 6, 2018 at 1:01 PM Kamal Chandraprakash <
kamal.chandraprak...@gmail.com> wrote:

> Matt,
> I agree with Matthias on not to altering the serializer as it's used by
> multiple components.
>
> Matthias,
>
>  - the proposed method accepts a `ProducerRecord` -- it might be good to
> explain why this cannot be done in a type safe way (ie, missing generics)
>
> To accept different types of records from multiple topologies, I have to
> define the ProducerRecord without generics.
>
> - `AlwaysProductionExceptionHandler` ->
> `AlwaysContinueProductionExceptionHandler`
>
> Updated the typo error in KIP.
>
>  - `DefaultProductionExceptionHandler` is not mentioned
>
> The `handleSerializationException` method in the
> `ProductionExceptionHandler` interface will have default implementation
> that is set to FAIL by default.
> This is done to avoid any changes in the user implementation. So, I didn't
> mentioned the `DefaultProductionExceptionHandler` class. Updated the KIP.
>
> - Why do you distinguish between `ClassCastException` and "any other
> unchecked exception? Both second case seems to include the first one?
>
> In SinkNode.java#93
> <
> https://github.com/apache/kafka/blob/87cc31c4e7ea36e7e832a1d02d71480a91a75293/streams/src/main/java/org/apache/kafka/streams/processor/internals/SinkNode.java#L93
> >
> on
> hitting `ClassCastException`, we are halting the streams as it's a fatal
> error.
> To keep the original behavior, I've to distinguish the exceptions.
>
>
> On Thu, Dec 6, 2018 at 10:44 PM Matthias J. Sax 
> wrote:
>
> > Well, that's exactly the point. The serializer should not be altered
> > IMHO because this would have impact on other components. Also, for
> > applications that use KafkaProducer directly, they can catch any
> > serialization exception and react to it. Hence, I don't don't see a
> > reason to change the serializer interface.
> >
> > Instead, it seems better to solve this issue in Streams by allowing to
> > skip over a record for this case.
> >
> > Some more comments on the KIP:
> >
> >  - the proposed method accepts a `ProducerRecord` -- it might be good to
> > explain why this cannot be done in a type safe way (ie, missing generics)
> >
> >  - `AlwaysProductionExceptionHandler` ->
> > `AlwaysContinueProductionExceptionHandler`
> >
> >  - `DefaultProductionExceptionHandler` is not mentioned
> >
> >  - Why do you distinguish between `ClassCastException` and "any other
> > unchecked exception? Both second case seems to include the first one?
> >
> >
> >
> > -Matthias
> >
> > On 12/6/18 8:35 AM, Matt Farmer wrote:
> > > Ah, good point.
> > >
> > > Should we consider altering the serializer interface to permit not
> > sending
> > > the record?
> > >
> > > On Wed, Dec 5, 2018 at 9:23 PM Kamal Chandraprakash <
> > > kamal.chandraprak...@gmail.com> wrote:
> > >
> > >> Matt,
> > >>
> > >> That's a good point. If these cases are handled in the serializer,
> > then
> > >> one cannot continue the stream processing by skipping the record.
> > >> To continue, you may have to send a empty record serialized key/value
> > (new
> > >> byte[0]) to the downstream on hitting the error which may cause
> > un-intended
> > >> results.
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> On Wed, Dec 5, 2018 at 8:41 PM Matt Farmer  wrote:
> > >>
> > >>> Hi there,
> > >>>
> > >>> Thanks for this KIP.
> > >>>
> > >>> What’s the thinking behind doing this in ProductionExceptionHandler
> > >> versus
> > >>> handling these cases in your serializer implementation?
> > >>>
> > >>> On Mon, Dec 3, 2018 at 1:09 AM Kamal Chandraprakash <
> > >>> kamal.chandraprak...@gmail.com> wrote:
> > >>>
> >  Hello dev,
> > 
> >    I hope to initiate the discussion for KIP-399: Extend
> >  ProductionExceptionHandler to cover serialization exceptions.
> > 
> >  KIP: <
> > 
> > 
> > >>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-399%3A+Extend+ProductionExceptionHandler+to+cover+serialization+exceptions
> > >
> >  JIRA: https://issues.apache.org/jira/browse/KAFKA-7499
> > 
> >  All feedbacks will be highly appreciated.
> > 
> >  Thanks,
> >  Kamal Chandraprakash
> > 
> > >>>
> > >>
> > >
> >
> >
>


Re: [DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2018-12-11 Thread Kamal Chandraprakash
Matthias,

For changelog topics, I think it does not make sense to allow skipping
records if serialization fails? For internal repartitions topics, I am
not sure if we should allow it or not. Would you agree with this? We
should discuss the implication to derive a sound design.

Can you explain the issue that happens when records are skipped to
changelog / internal-repartition topics ? So, that I can look into it.

On Fri, Dec 7, 2018 at 12:07 AM Matthias J. Sax 
wrote:

> >> To accept different types of records from multiple topologies, I have to
> >> define the ProducerRecord without generics.
>
> Yes. It does make sense. My point was, that the KIP should
> mention/explain this explicitly to allow other not familiar with the
> code base to understand it more easily :)
>
>
>
> About `ClassCastException`: seems to be an implementation detail. No
> need to make it part of the KIP discussion.
>
>
>
> One more thing that came to my mind. We use the `RecordCollector` to
> write into all topics, ie, user output topics and internal repartition
> and changelog topics.
>
> For changelog topics, I think it does not make sense to allow skipping
> records if serialization fails? For internal repartitions topics, I am
> not sure if we should allow it or not. Would you agree with this? We
> should discuss the implication to derive a sound design.
>
> I was also just double checking the code, and it seems that the current
> `ProductionExceptionHandler` is applied for all topics. This seems to be
> incorrect to me. Seems we missed this case when doing KIP-210? (Or did
> we discuss this and I cannot remember? Might be worth to double check.)
>
> Last thought: of course, the handler will know which topic is affected
> and can provide a corresponding implementation. Was just wondering if we
> should be more strict?
>
>
> -Matthias
>
> On 12/6/18 10:01 AM, Kamal Chandraprakash wrote:
> > Matt,
> > I agree with Matthias on not to altering the serializer as it's used
> by
> > multiple components.
> >
> > Matthias,
> >
> >  - the proposed method accepts a `ProducerRecord` -- it might be good to
> > explain why this cannot be done in a type safe way (ie, missing generics)
> >
> > To accept different types of records from multiple topologies, I have to
> > define the ProducerRecord without generics.
> >
> > - `AlwaysProductionExceptionHandler` ->
> > `AlwaysContinueProductionExceptionHandler`
> >
> > Updated the typo error in KIP.
> >
> >  - `DefaultProductionExceptionHandler` is not mentioned
> >
> > The `handleSerializationException` method in the
> > `ProductionExceptionHandler` interface will have default implementation
> > that is set to FAIL by default.
> > This is done to avoid any changes in the user implementation. So, I
> didn't
> > mentioned the `DefaultProductionExceptionHandler` class. Updated the KIP.
> >
> > - Why do you distinguish between `ClassCastException` and "any other
> > unchecked exception? Both second case seems to include the first one?
> >
> > In SinkNode.java#93
> > <
> https://github.com/apache/kafka/blob/87cc31c4e7ea36e7e832a1d02d71480a91a75293/streams/src/main/java/org/apache/kafka/streams/processor/internals/SinkNode.java#L93
> >
> > on
> > hitting `ClassCastException`, we are halting the streams as it's a fatal
> > error.
> > To keep the original behavior, I've to distinguish the exceptions.
> >
> >
> > On Thu, Dec 6, 2018 at 10:44 PM Matthias J. Sax 
> > wrote:
> >
> >> Well, that's exactly the point. The serializer should not be altered
> >> IMHO because this would have impact on other components. Also, for
> >> applications that use KafkaProducer directly, they can catch any
> >> serialization exception and react to it. Hence, I don't don't see a
> >> reason to change the serializer interface.
> >>
> >> Instead, it seems better to solve this issue in Streams by allowing to
> >> skip over a record for this case.
> >>
> >> Some more comments on the KIP:
> >>
> >>  - the proposed method accepts a `ProducerRecord` -- it might be good to
> >> explain why this cannot be done in a type safe way (ie, missing
> generics)
> >>
> >>  - `AlwaysProductionExceptionHandler` ->
> >> `AlwaysContinueProductionExceptionHandler`
> >>
> >>  - `DefaultProductionExceptionHandler` is not mentioned
> >>
> >>  - Why do you distinguish between `ClassCastException` and "any other
> >> unchecked exception? Both second case seems to include the first one?
> >>
> >>
> >>
> >> -Matthias
> >>
> >> On 12/6/18 8:35 AM, Matt Farmer wrote:
> >>> Ah, good point.
> >>>
> >>> Should we consider altering the serializer interface to permit not
> >> sending
> >>> the record?
> >>>
> >>> On Wed, Dec 5, 2018 at 9:23 PM Kamal Chandraprakash <
> >>> kamal.chandraprak...@gmail.com> wrote:
> >>>
>  Matt,
> 
>  That's a good point. If these cases are handled in the serializer,
> >> then
>  one cannot continue the stream processing by skipping the record.
>  To continue, you may have to send a empty record 

Re: [DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2018-12-06 Thread Matthias J. Sax
>> To accept different types of records from multiple topologies, I have to
>> define the ProducerRecord without generics.

Yes. It does make sense. My point was, that the KIP should
mention/explain this explicitly to allow other not familiar with the
code base to understand it more easily :)



About `ClassCastException`: seems to be an implementation detail. No
need to make it part of the KIP discussion.



One more thing that came to my mind. We use the `RecordCollector` to
write into all topics, ie, user output topics and internal repartition
and changelog topics.

For changelog topics, I think it does not make sense to allow skipping
records if serialization fails? For internal repartitions topics, I am
not sure if we should allow it or not. Would you agree with this? We
should discuss the implication to derive a sound design.

I was also just double checking the code, and it seems that the current
`ProductionExceptionHandler` is applied for all topics. This seems to be
incorrect to me. Seems we missed this case when doing KIP-210? (Or did
we discuss this and I cannot remember? Might be worth to double check.)

Last thought: of course, the handler will know which topic is affected
and can provide a corresponding implementation. Was just wondering if we
should be more strict?


-Matthias

On 12/6/18 10:01 AM, Kamal Chandraprakash wrote:
> Matt,
> I agree with Matthias on not to altering the serializer as it's used by
> multiple components.
> 
> Matthias,
> 
>  - the proposed method accepts a `ProducerRecord` -- it might be good to
> explain why this cannot be done in a type safe way (ie, missing generics)
> 
> To accept different types of records from multiple topologies, I have to
> define the ProducerRecord without generics.
> 
> - `AlwaysProductionExceptionHandler` ->
> `AlwaysContinueProductionExceptionHandler`
> 
> Updated the typo error in KIP.
> 
>  - `DefaultProductionExceptionHandler` is not mentioned
> 
> The `handleSerializationException` method in the
> `ProductionExceptionHandler` interface will have default implementation
> that is set to FAIL by default.
> This is done to avoid any changes in the user implementation. So, I didn't
> mentioned the `DefaultProductionExceptionHandler` class. Updated the KIP.
> 
> - Why do you distinguish between `ClassCastException` and "any other
> unchecked exception? Both second case seems to include the first one?
> 
> In SinkNode.java#93
> 
> on
> hitting `ClassCastException`, we are halting the streams as it's a fatal
> error.
> To keep the original behavior, I've to distinguish the exceptions.
> 
> 
> On Thu, Dec 6, 2018 at 10:44 PM Matthias J. Sax 
> wrote:
> 
>> Well, that's exactly the point. The serializer should not be altered
>> IMHO because this would have impact on other components. Also, for
>> applications that use KafkaProducer directly, they can catch any
>> serialization exception and react to it. Hence, I don't don't see a
>> reason to change the serializer interface.
>>
>> Instead, it seems better to solve this issue in Streams by allowing to
>> skip over a record for this case.
>>
>> Some more comments on the KIP:
>>
>>  - the proposed method accepts a `ProducerRecord` -- it might be good to
>> explain why this cannot be done in a type safe way (ie, missing generics)
>>
>>  - `AlwaysProductionExceptionHandler` ->
>> `AlwaysContinueProductionExceptionHandler`
>>
>>  - `DefaultProductionExceptionHandler` is not mentioned
>>
>>  - Why do you distinguish between `ClassCastException` and "any other
>> unchecked exception? Both second case seems to include the first one?
>>
>>
>>
>> -Matthias
>>
>> On 12/6/18 8:35 AM, Matt Farmer wrote:
>>> Ah, good point.
>>>
>>> Should we consider altering the serializer interface to permit not
>> sending
>>> the record?
>>>
>>> On Wed, Dec 5, 2018 at 9:23 PM Kamal Chandraprakash <
>>> kamal.chandraprak...@gmail.com> wrote:
>>>
 Matt,

 That's a good point. If these cases are handled in the serializer,
>> then
 one cannot continue the stream processing by skipping the record.
 To continue, you may have to send a empty record serialized key/value
>> (new
 byte[0]) to the downstream on hitting the error which may cause
>> un-intended
 results.





 On Wed, Dec 5, 2018 at 8:41 PM Matt Farmer  wrote:

> Hi there,
>
> Thanks for this KIP.
>
> What’s the thinking behind doing this in ProductionExceptionHandler
 versus
> handling these cases in your serializer implementation?
>
> On Mon, Dec 3, 2018 at 1:09 AM Kamal Chandraprakash <
> kamal.chandraprak...@gmail.com> wrote:
>
>> Hello dev,
>>
>>   I hope to initiate the discussion for KIP-399: Extend
>> ProductionExceptionHandler to cover serialization exceptions.
>>
>> 

Re: [DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2018-12-06 Thread Kamal Chandraprakash
Matt,
I agree with Matthias on not to altering the serializer as it's used by
multiple components.

Matthias,

 - the proposed method accepts a `ProducerRecord` -- it might be good to
explain why this cannot be done in a type safe way (ie, missing generics)

To accept different types of records from multiple topologies, I have to
define the ProducerRecord without generics.

- `AlwaysProductionExceptionHandler` ->
`AlwaysContinueProductionExceptionHandler`

Updated the typo error in KIP.

 - `DefaultProductionExceptionHandler` is not mentioned

The `handleSerializationException` method in the
`ProductionExceptionHandler` interface will have default implementation
that is set to FAIL by default.
This is done to avoid any changes in the user implementation. So, I didn't
mentioned the `DefaultProductionExceptionHandler` class. Updated the KIP.

- Why do you distinguish between `ClassCastException` and "any other
unchecked exception? Both second case seems to include the first one?

In SinkNode.java#93

on
hitting `ClassCastException`, we are halting the streams as it's a fatal
error.
To keep the original behavior, I've to distinguish the exceptions.


On Thu, Dec 6, 2018 at 10:44 PM Matthias J. Sax 
wrote:

> Well, that's exactly the point. The serializer should not be altered
> IMHO because this would have impact on other components. Also, for
> applications that use KafkaProducer directly, they can catch any
> serialization exception and react to it. Hence, I don't don't see a
> reason to change the serializer interface.
>
> Instead, it seems better to solve this issue in Streams by allowing to
> skip over a record for this case.
>
> Some more comments on the KIP:
>
>  - the proposed method accepts a `ProducerRecord` -- it might be good to
> explain why this cannot be done in a type safe way (ie, missing generics)
>
>  - `AlwaysProductionExceptionHandler` ->
> `AlwaysContinueProductionExceptionHandler`
>
>  - `DefaultProductionExceptionHandler` is not mentioned
>
>  - Why do you distinguish between `ClassCastException` and "any other
> unchecked exception? Both second case seems to include the first one?
>
>
>
> -Matthias
>
> On 12/6/18 8:35 AM, Matt Farmer wrote:
> > Ah, good point.
> >
> > Should we consider altering the serializer interface to permit not
> sending
> > the record?
> >
> > On Wed, Dec 5, 2018 at 9:23 PM Kamal Chandraprakash <
> > kamal.chandraprak...@gmail.com> wrote:
> >
> >> Matt,
> >>
> >> That's a good point. If these cases are handled in the serializer,
> then
> >> one cannot continue the stream processing by skipping the record.
> >> To continue, you may have to send a empty record serialized key/value
> (new
> >> byte[0]) to the downstream on hitting the error which may cause
> un-intended
> >> results.
> >>
> >>
> >>
> >>
> >>
> >> On Wed, Dec 5, 2018 at 8:41 PM Matt Farmer  wrote:
> >>
> >>> Hi there,
> >>>
> >>> Thanks for this KIP.
> >>>
> >>> What’s the thinking behind doing this in ProductionExceptionHandler
> >> versus
> >>> handling these cases in your serializer implementation?
> >>>
> >>> On Mon, Dec 3, 2018 at 1:09 AM Kamal Chandraprakash <
> >>> kamal.chandraprak...@gmail.com> wrote:
> >>>
>  Hello dev,
> 
>    I hope to initiate the discussion for KIP-399: Extend
>  ProductionExceptionHandler to cover serialization exceptions.
> 
>  KIP: <
> 
> 
> >>>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-399%3A+Extend+ProductionExceptionHandler+to+cover+serialization+exceptions
> >
>  JIRA: https://issues.apache.org/jira/browse/KAFKA-7499
> 
>  All feedbacks will be highly appreciated.
> 
>  Thanks,
>  Kamal Chandraprakash
> 
> >>>
> >>
> >
>
>


Re: [DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2018-12-06 Thread Matthias J. Sax
Well, that's exactly the point. The serializer should not be altered
IMHO because this would have impact on other components. Also, for
applications that use KafkaProducer directly, they can catch any
serialization exception and react to it. Hence, I don't don't see a
reason to change the serializer interface.

Instead, it seems better to solve this issue in Streams by allowing to
skip over a record for this case.

Some more comments on the KIP:

 - the proposed method accepts a `ProducerRecord` -- it might be good to
explain why this cannot be done in a type safe way (ie, missing generics)

 - `AlwaysProductionExceptionHandler` ->
`AlwaysContinueProductionExceptionHandler`

 - `DefaultProductionExceptionHandler` is not mentioned

 - Why do you distinguish between `ClassCastException` and "any other
unchecked exception? Both second case seems to include the first one?



-Matthias

On 12/6/18 8:35 AM, Matt Farmer wrote:
> Ah, good point.
> 
> Should we consider altering the serializer interface to permit not sending
> the record?
> 
> On Wed, Dec 5, 2018 at 9:23 PM Kamal Chandraprakash <
> kamal.chandraprak...@gmail.com> wrote:
> 
>> Matt,
>>
>> That's a good point. If these cases are handled in the serializer, then
>> one cannot continue the stream processing by skipping the record.
>> To continue, you may have to send a empty record serialized key/value (new
>> byte[0]) to the downstream on hitting the error which may cause un-intended
>> results.
>>
>>
>>
>>
>>
>> On Wed, Dec 5, 2018 at 8:41 PM Matt Farmer  wrote:
>>
>>> Hi there,
>>>
>>> Thanks for this KIP.
>>>
>>> What’s the thinking behind doing this in ProductionExceptionHandler
>> versus
>>> handling these cases in your serializer implementation?
>>>
>>> On Mon, Dec 3, 2018 at 1:09 AM Kamal Chandraprakash <
>>> kamal.chandraprak...@gmail.com> wrote:
>>>
 Hello dev,

   I hope to initiate the discussion for KIP-399: Extend
 ProductionExceptionHandler to cover serialization exceptions.

 KIP: <


>>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-399%3A+Extend+ProductionExceptionHandler+to+cover+serialization+exceptions
>
 JIRA: https://issues.apache.org/jira/browse/KAFKA-7499

 All feedbacks will be highly appreciated.

 Thanks,
 Kamal Chandraprakash

>>>
>>
> 



signature.asc
Description: OpenPGP digital signature


Re: [DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2018-12-06 Thread Matt Farmer
Ah, good point.

Should we consider altering the serializer interface to permit not sending
the record?

On Wed, Dec 5, 2018 at 9:23 PM Kamal Chandraprakash <
kamal.chandraprak...@gmail.com> wrote:

> Matt,
>
> That's a good point. If these cases are handled in the serializer, then
> one cannot continue the stream processing by skipping the record.
> To continue, you may have to send a empty record serialized key/value (new
> byte[0]) to the downstream on hitting the error which may cause un-intended
> results.
>
>
>
>
>
> On Wed, Dec 5, 2018 at 8:41 PM Matt Farmer  wrote:
>
> > Hi there,
> >
> > Thanks for this KIP.
> >
> > What’s the thinking behind doing this in ProductionExceptionHandler
> versus
> > handling these cases in your serializer implementation?
> >
> > On Mon, Dec 3, 2018 at 1:09 AM Kamal Chandraprakash <
> > kamal.chandraprak...@gmail.com> wrote:
> >
> > > Hello dev,
> > >
> > >   I hope to initiate the discussion for KIP-399: Extend
> > > ProductionExceptionHandler to cover serialization exceptions.
> > >
> > > KIP: <
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-399%3A+Extend+ProductionExceptionHandler+to+cover+serialization+exceptions
> > > >
> > > JIRA: https://issues.apache.org/jira/browse/KAFKA-7499
> > >
> > > All feedbacks will be highly appreciated.
> > >
> > > Thanks,
> > > Kamal Chandraprakash
> > >
> >
>


Re: [DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2018-12-05 Thread Kamal Chandraprakash
Matt,

That's a good point. If these cases are handled in the serializer, then
one cannot continue the stream processing by skipping the record.
To continue, you may have to send a empty record serialized key/value (new
byte[0]) to the downstream on hitting the error which may cause un-intended
results.





On Wed, Dec 5, 2018 at 8:41 PM Matt Farmer  wrote:

> Hi there,
>
> Thanks for this KIP.
>
> What’s the thinking behind doing this in ProductionExceptionHandler versus
> handling these cases in your serializer implementation?
>
> On Mon, Dec 3, 2018 at 1:09 AM Kamal Chandraprakash <
> kamal.chandraprak...@gmail.com> wrote:
>
> > Hello dev,
> >
> >   I hope to initiate the discussion for KIP-399: Extend
> > ProductionExceptionHandler to cover serialization exceptions.
> >
> > KIP: <
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-399%3A+Extend+ProductionExceptionHandler+to+cover+serialization+exceptions
> > >
> > JIRA: https://issues.apache.org/jira/browse/KAFKA-7499
> >
> > All feedbacks will be highly appreciated.
> >
> > Thanks,
> > Kamal Chandraprakash
> >
>


Re: [DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2018-12-05 Thread Matt Farmer
Hi there,

Thanks for this KIP.

What’s the thinking behind doing this in ProductionExceptionHandler versus
handling these cases in your serializer implementation?

On Mon, Dec 3, 2018 at 1:09 AM Kamal Chandraprakash <
kamal.chandraprak...@gmail.com> wrote:

> Hello dev,
>
>   I hope to initiate the discussion for KIP-399: Extend
> ProductionExceptionHandler to cover serialization exceptions.
>
> KIP: <
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-399%3A+Extend+ProductionExceptionHandler+to+cover+serialization+exceptions
> >
> JIRA: https://issues.apache.org/jira/browse/KAFKA-7499
>
> All feedbacks will be highly appreciated.
>
> Thanks,
> Kamal Chandraprakash
>


[DISCUSS] KIP-399: Extend ProductionExceptionHandler to cover serialization exceptions

2018-12-02 Thread Kamal Chandraprakash
Hello dev,

  I hope to initiate the discussion for KIP-399: Extend
ProductionExceptionHandler to cover serialization exceptions.

KIP: <
https://cwiki.apache.org/confluence/display/KAFKA/KIP-399%3A+Extend+ProductionExceptionHandler+to+cover+serialization+exceptions
>
JIRA: https://issues.apache.org/jira/browse/KAFKA-7499

All feedbacks will be highly appreciated.

Thanks,
Kamal Chandraprakash