[ 
https://issues.apache.org/jira/browse/KAFKA-12739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

A. Sophie Blee-Goldman updated KAFKA-12739:
-------------------------------------------
    Description: 
Within a task, there will typically be a number of records that have been 
successfully processed through the subtopology but not yet committed. If the 
next record to be picked up hits an unexpected exception, we dirty-close the 
entire task and essentially throw away all the work done on those previous 
records. We should be able to drop only the corrupted record and commit the 
offsets up to that point.
For some exceptions, such as de/serialization or user-code errors, this should 
be straightforward, since the thread/task is otherwise in a healthy state. 
Other cases, such as an error in the Producer, will need to be tackled 
separately, since a Producer error cannot be isolated to a single task.
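As a minimal sketch of the idea (not the actual Streams task-management code; 
the class and method names here are illustrative only): process records one at 
a time, and on the first failure stop and report only the offsets of the 
records that completed cleanly, so their work survives instead of being thrown 
away with a dirty close.

```java
import java.util.List;
import java.util.function.Consumer;

class PartialCommit {
    // Process records in order starting at baseOffset. Returns the offset up to
    // which records were cleanly processed; the corrupted record and everything
    // after it are excluded, so only clean work gets committed.
    static long processAndCommit(List<String> records, long baseOffset,
                                 Consumer<String> processor) {
        long committed = baseOffset;
        for (String record : records) {
            try {
                processor.accept(record);
                committed++;                 // cleanly processed: safe to commit
            } catch (RuntimeException e) {
                break;                       // corrupted record: stop, keep prior work
            }
        }
        return committed;
    }
}
```

With records ("a", "b", "bad", "c") and a processor that throws on "bad", this 
returns baseOffset + 2: the two clean records are committed and the rest are 
dropped.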

The challenge here will be in handling records sent to the changelog while 
processing the record that hits an error: we may need to buffer those records 
so they aren’t sent to the RecordCollector until the record has been fully 
processed; otherwise they will be committed and break EOS semantics (unless we 
can immediately implement 
[KAFKA-12740|https://issues.apache.org/jira/browse/KAFKA-12740]).
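The buffering idea could look roughly like the following (a hypothetical 
sketch, not the real RecordCollector API; BufferingChangelogCollector and its 
methods are made up for illustration): changelog writes from the in-flight 
record are held back, forwarded only once the record completes cleanly, and 
discarded if it fails, so nothing from the corrupted record can be committed.

```java
import java.util.ArrayList;
import java.util.List;

class BufferingChangelogCollector {
    private final List<String> sent = new ArrayList<>();     // stands in for the RecordCollector
    private final List<String> pending = new ArrayList<>();  // writes from the in-flight record

    void send(String changelogRecord) {
        pending.add(changelogRecord);     // buffer instead of forwarding immediately
    }

    void recordProcessed() {              // record finished cleanly: flush its writes
        sent.addAll(pending);
        pending.clear();
    }

    void recordFailed() {                 // record hit an error: drop its writes
        pending.clear();
    }

    List<String> committed() {
        return sent;
    }
}
```

The cost of this approach is an extra copy of each record's changelog output 
held in memory until that record completes, which is why avoiding the buffer 
entirely (KAFKA-12740) would be preferable if it can be done first.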



> 2. Commit any cleanly-processed records within a corrupted task
> ---------------------------------------------------------------
>
>                 Key: KAFKA-12739
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12739
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: streams
>            Reporter: A. Sophie Blee-Goldman
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
