[jira] [Commented] (KAFKA-6490) JSON SerializationException Stops Connect
[ https://issues.apache.org/jira/browse/KAFKA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16355012#comment-16355012 ]

Prasanna Subburaj commented on KAFKA-6490:
------------------------------------------

[~ewencp]: Thanks for giving me permissions. I am interested in working on this improvement, and yes, we need to discuss the dead letter queue further. After creating the page I will start the discussion thread on the mailing list.

> JSON SerializationException Stops Connect
> -----------------------------------------
>
>                 Key: KAFKA-6490
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6490
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>    Affects Versions: 1.0.0
>            Reporter: William R. Speirs
>            Assignee: Prasanna Subburaj
>            Priority: Major
>         Attachments: KAFKA-6490_v1.patch
>
>
> If you configure KafkaConnect to parse JSON messages, and you send it a
> non-JSON message, the SerializationException message will bubble up to the
> top, and stop KafkaConnect. While I understand sending non-JSON to a JSON
> serializer is a bad idea, I think that a single malformed message stopping
> all of KafkaConnect is even worse.
> The data exception is thrown here:
> [https://github.com/apache/kafka/blob/trunk/connect/json/src/main/java/org/apache/kafka/connect/json/JsonConverter.java#L305]
>
> From the call here:
> [https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L476]
> This bubbles all the way up to the top, and KafkaConnect simply stops with
> the message: {{ERROR WorkerSinkTask\{id=elasticsearch-sink-0} Task threw an
> uncaught and unrecoverable exception
> (org.apache.kafka.connect.runtime.WorkerTask:172)}}
> Thoughts on adding a {{try/catch}} around the {{for}} loop in
> WorkerSinkTask's {{convertMessages}} so messages that don't properly parse
> are logged, but simply ignored? This way KafkaConnect can keep working even
> when it encounters a message it cannot decode?

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
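
The skip-and-log idea from the issue description can be sketched roughly as below. This is not the actual WorkerSinkTask code and not the attached patch: the class name, the ConversionException type (a stand-in for Connect's DataException) and the toy converter are all hypothetical, chosen so the example runs without Kafka on the classpath. The try/catch sits inside the loop, as William's follow-up comment in this thread describes, so that only the failing record is dropped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Sketch only: skip records that fail conversion instead of aborting the batch. */
final class SkipBadRecordsSketch {

    /** Hypothetical stand-in for the exception a converter throws on bad input. */
    static final class ConversionException extends RuntimeException {
        ConversionException(String message) {
            super(message);
        }
    }

    /** Toy converter: accepts only text shaped like a JSON object. Not a real parser. */
    static String toyJsonConverter(String raw) {
        String trimmed = raw.trim();
        if (!(trimmed.startsWith("{") && trimmed.endsWith("}"))) {
            throw new ConversionException("not a JSON object: " + raw);
        }
        return trimmed;
    }

    /** Converts each record; a failing record is logged and skipped, the rest continue. */
    static <I, O> List<O> convertMessages(List<I> rawRecords, Function<I, O> converter) {
        List<O> converted = new ArrayList<>();
        for (I record : rawRecords) {
            try {
                converted.add(converter.apply(record));
            } catch (ConversionException e) {
                // Log and move on; the task itself keeps running.
                System.err.println("WARN skipping record that failed conversion: " + e.getMessage());
            }
        }
        return converted;
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList("{\"id\": 1}", "not json", "{\"id\": 2}");
        System.out.println(convertMessages(batch, SkipBadRecordsSketch::toyJsonConverter));
    }
}
```

Note that this silently drops data, which is exactly the behavior change the later comments say needs a KIP and a configuration option.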
[jira] [Commented] (KAFKA-6490) JSON SerializationException Stops Connect
[ https://issues.apache.org/jira/browse/KAFKA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354959#comment-16354959 ]

Ewen Cheslack-Postava commented on KAFKA-6490:
----------------------------------------------

[~prasanna1433] I've given you wiki permissions, you should be able to create a page now.

The dead letter queue is something I've specifically heard from a number of users, so it's definitely in demand. The list I gave is based on a ton of real user feedback, so I feel pretty confident that it is both a) covering important use cases and b) comprehensive enough to address the vast majority of use cases. But I'm of course open to discussion of the options. I suspect *more* options would be the result rather than removing some. If you want to take on this improvement, we can of course discuss further in the KIP thread :)

With respect to the version, new features should almost universally be worked on in trunk – older release branches are reserved for bug fixes. In this case, since we just cut 1.1 branches, this would be a candidate for 1.2 / 2.0 and can simply be developed against trunk.
[jira] [Commented] (KAFKA-6490) JSON SerializationException Stops Connect
[ https://issues.apache.org/jira/browse/KAFKA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354957#comment-16354957 ]

Prasanna Subburaj commented on KAFKA-6490:
------------------------------------------

[~ewencp]: Thanks for the feedback. What you describe makes sense: we should give users options to choose from, because each use case is different. I feel that the discard-and-log option can be provided to the user, but I am skeptical about the dead letter queue.

Also, which version should this be worked on? Can I please get access to the Confluence wiki ([https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals]) as well?
[jira] [Commented] (KAFKA-6490) JSON SerializationException Stops Connect
[ https://issues.apache.org/jira/browse/KAFKA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354945#comment-16354945 ]

Ewen Cheslack-Postava commented on KAFKA-6490:
----------------------------------------------

A change in behavior like that would definitely require a KIP – existing users would not expect this at all. Connect started with the current behavior because for many users losing data is worse than suffering some downtime. However, it's clear some alternatives are warranted; this question comes up from time to time on mailing lists.

Generally there are only a few options that seem to make sense:

* Stop processing (current behavior) and log
* Log and retry (really only makes sense for unusual edge cases where data got corrupted in flight between Kafka and Connect)
* Discard and log (I care about uptime more than a bit of lost data)
* Dead letter queue (or some other fallback handler)

The retry case is probably the least important here as it will rarely make a difference, so the other 3 are the ones I think we'd want to implement.

A KIP for this should be straightforward, though the implementation will require care to make sure we handle all places errors can occur (in the producer/consumer, during deserialization, during transformations, etc).
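
One way to picture the four options listed above is as a policy switch around the conversion step. The sketch below is purely illustrative: the Policy enum, the process method and the in-memory dead letter list are hypothetical names, not a proposed Connect API, and a real dead letter queue would more plausibly be a Kafka topic than a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Sketch only: the four error-handling options expressed as a policy. */
final class ErrorPolicySketch {

    /** Hypothetical policy names mirroring the options in the comment above. */
    enum Policy { FAIL, RETRY, DISCARD, DEAD_LETTER }

    /** Toy converter: parses an integer, throwing NumberFormatException on bad input. */
    static Integer toyConvert(String raw) {
        return Integer.valueOf(raw.trim());
    }

    static <I, O> List<O> process(List<I> records, Function<I, O> converter,
                                  Policy policy, int maxRetries, List<I> deadLetters) {
        List<O> converted = new ArrayList<>();
        for (I record : records) {
            int attempts = 0;
            while (true) {
                try {
                    converted.add(converter.apply(record));
                    break; // success, go to the next record
                } catch (RuntimeException e) {
                    attempts++;
                    if (policy == Policy.RETRY && attempts <= maxRetries) {
                        System.err.println("WARN retrying record after failure: " + e.getMessage());
                        continue; // try the same record again
                    }
                    if (policy == Policy.DISCARD) {
                        System.err.println("WARN discarding record: " + e.getMessage());
                        break;
                    }
                    if (policy == Policy.DEAD_LETTER) {
                        deadLetters.add(record); // keep the raw record for later inspection
                        break;
                    }
                    throw e; // FAIL, or RETRY with retries exhausted: stop processing
                }
            }
        }
        return converted;
    }

    public static void main(String[] args) {
        List<String> deadLetters = new ArrayList<>();
        List<Integer> ok = process(Arrays.asList("1", "oops", "3"),
                ErrorPolicySketch::toyConvert, Policy.DEAD_LETTER, 0, deadLetters);
        System.out.println("converted=" + ok + " deadLetters=" + deadLetters);
    }
}
```

The sketch covers only the conversion step; as the comment notes, a real implementation would also need to cover errors in the producer/consumer and in transformations.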
[jira] [Commented] (KAFKA-6490) JSON SerializationException Stops Connect
[ https://issues.apache.org/jira/browse/KAFKA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354908#comment-16354908 ]

Prasanna Subburaj commented on KAFKA-6490:
------------------------------------------

Thanks for your response, [~mjsax]. I think we need a similar feature in Connect, because right now if we get a malformed JSON message the connector fails and does not process any of the messages that come after it. I can work on a KIP to solve this issue if the folks you tagged agree with what I am saying. Can you also please add me to the contributors list?
[jira] [Commented] (KAFKA-6490) JSON SerializationException Stops Connect
[ https://issues.apache.org/jira/browse/KAFKA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354852#comment-16354852 ]

Matthias J. Sax commented on KAFKA-6490:
----------------------------------------

I am not too familiar with the details of Connect. However, it sounds like this change might require a KIP. There was a similar issue for Streams, and we added a config so people can choose to resume or fail in this case: [https://cwiki.apache.org/confluence/display/KAFKA/KIP-161%3A+streams+deserialization+exception+handlers]

cc [~wicknicks] [~rhauch] [~kkonstantine]
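
For reference, the Streams setting that KIP-161 added can be sketched as below. The property key and the two handler class names are written as plain strings as I understand them from KIP-161 and the Kafka 1.0 Streams documentation, so the example needs no Kafka dependency; newer releases may name the setting differently. The application id and broker address are placeholders.

```java
import java.util.Properties;

/** Sketch only: choosing the Streams deserialization exception handler from KIP-161. */
final class StreamsHandlerConfigSketch {

    static Properties streamsProps(boolean continueOnBadRecord) {
        Properties props = new Properties();
        props.put("application.id", "example-app");        // placeholder value
        props.put("bootstrap.servers", "localhost:9092");  // placeholder value
        // Resume (log and continue) or fail (log and stop) when a record cannot be deserialized.
        props.put("default.deserialization.exception.handler",
                continueOnBadRecord
                        ? "org.apache.kafka.streams.errors.LogAndContinueExceptionHandler"
                        : "org.apache.kafka.streams.errors.LogAndFailExceptionHandler");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(streamsProps(true));
    }
}
```

This applies to Kafka Streams only; at the time of this thread Connect had no equivalent setting, which is what the issue is about.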
[jira] [Commented] (KAFKA-6490) JSON SerializationException Stops Connect
[ https://issues.apache.org/jira/browse/KAFKA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354805#comment-16354805 ]

Prasanna Subburaj commented on KAFKA-6490:
------------------------------------------

[~mjsax] Can you please help [~wspeirs] with this ticket?
[jira] [Commented] (KAFKA-6490) JSON SerializationException Stops Connect
[ https://issues.apache.org/jira/browse/KAFKA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16345031#comment-16345031 ]

William R. Speirs commented on KAFKA-6490:
------------------------------------------

Seems like a simple fix. I wrapped the inside of the {{for}} loop with a {{try/catch}} for a {{DataException}} and reported the exception as a warning. This will allow other messages to be processed.

Only hitch is that if you have an entire topic full of non-JSON messages, then you'll fill up your logs with a bunch of these messages. One way to remedy this would be to put a counter such that after say 100 of these warnings, it stops reporting... however, that might be too clever. Thoughts?
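
The counter idea from the comment above might look roughly like this. It is a sketch under assumptions, not what the attached patch does: CappedWarningLogger is a hypothetical helper, and it writes to System.err where a real task would use its logger.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Sketch only: emit at most a fixed number of warnings, then go quiet. */
final class CappedWarningLogger {
    private final long maxWarnings;
    private final AtomicLong seen = new AtomicLong();

    CappedWarningLogger(long maxWarnings) {
        this.maxWarnings = maxWarnings;
    }

    /** Returns true if the warning was printed, false if it was suppressed. */
    boolean warn(String message) {
        long count = seen.incrementAndGet();
        if (count > maxWarnings) {
            return false; // over the cap: count it, but do not log it
        }
        System.err.println("WARN " + message);
        if (count == maxWarnings) {
            System.err.println("WARN warning limit of " + maxWarnings
                    + " reached; further conversion warnings are suppressed");
        }
        return true;
    }

    /** Total number of warnings requested, including suppressed ones. */
    long totalSeen() {
        return seen.get();
    }

    public static void main(String[] args) {
        CappedWarningLogger logger = new CappedWarningLogger(100);
        for (int i = 0; i < 250; i++) {
            logger.warn("could not convert record " + i);
        }
        System.out.println("failures seen: " + logger.totalSeen());
    }
}
```

Keeping the total count even after the cap means the number of dropped records can still be reported, for example at shutdown, without flooding the log.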
[jira] [Commented] (KAFKA-6490) JSON SerializationException Stops Connect
[ https://issues.apache.org/jira/browse/KAFKA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16341854#comment-16341854 ]

Prasanna Subburaj commented on KAFKA-6490:
------------------------------------------

Yes, I also faced this issue. I feel that the connector should log a warning and move on to the next message. Still, I would like other contributors to comment on this.