[jira] [Commented] (KAFKA-6490) JSON SerializationException Stops Connect

2018-02-06 Thread Prasanna Subburaj (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16355012#comment-16355012
 ] 

Prasanna Subburaj commented on KAFKA-6490:
--

[~ewencp]: Thanks for giving me permissions. 

I am interested in working on this improvement, and yes, we need to discuss the 
dead letter queue further.

After creating the wiki page, I will start a discussion thread on the mailing 
list.

> JSON SerializationException Stops Connect
> -
>
> Key: KAFKA-6490
> URL: https://issues.apache.org/jira/browse/KAFKA-6490
> Project: Kafka
> Issue Type: Bug
> Components: KafkaConnect
> Affects Versions: 1.0.0
> Reporter: William R. Speirs
> Assignee: Prasanna Subburaj
> Priority: Major
> Attachments: KAFKA-6490_v1.patch
>
>
> If you configure KafkaConnect to parse JSON messages, and you send it a 
> non-JSON message, the SerializationException message will bubble up to the 
> top, and stop KafkaConnect. While I understand sending non-JSON to a JSON 
> serializer is a bad idea, I think that a single malformed message stopping 
> all of KafkaConnect is even worse.
> The data exception is thrown here: 
> [https://github.com/apache/kafka/blob/trunk/connect/json/src/main/java/org/apache/kafka/connect/json/JsonConverter.java#L305]
>  
> From the call here: 
> [https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSinkTask.java#L476]
> This bubbles all the way up to the top, and KafkaConnect simply stops with 
> the message: {{ERROR WorkerSinkTask\{id=elasticsearch-sink-0} Task threw an 
> uncaught and unrecoverable exception 
> (org.apache.kafka.connect.runtime.WorkerTask:172)}}
> Thoughts on adding a {{try/catch}} around the {{for}} loop in 
> WorkerSinkTask's {{convertMessages}} so messages that don't properly parse 
> are logged, but simply ignored? This way KafkaConnect can keep working even 
> when it encounters a message it cannot decode?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6490) JSON SerializationException Stops Connect

2018-02-06 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354959#comment-16354959
 ] 

Ewen Cheslack-Postava commented on KAFKA-6490:
--

[~prasanna1433] I've given you wiki permissions, so you should be able to create 
a page now.

The dead letter queue is something a number of users have specifically asked 
for, so it's definitely in demand. The list I gave is based on a ton of real 
user feedback, so I feel pretty confident that it is both a) covering important 
use cases and b) comprehensive enough to address the vast majority of use 
cases. But I'm of course open to discussion of the options. I suspect *more* 
options would be the result rather than removing some. If you want to take on 
this improvement, we can of course discuss further in the KIP thread :)

 

With respect to the version, new features should almost universally be worked 
on in trunk – older release branches are reserved for bug fixes. In this case, 
since we just cut 1.1 branches, this would be a candidate for 1.2 / 2.0 and can 
simply be developed against trunk.



[jira] [Commented] (KAFKA-6490) JSON SerializationException Stops Connect

2018-02-06 Thread Prasanna Subburaj (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354957#comment-16354957
 ] 

Prasanna Subburaj commented on KAFKA-6490:
--

[~ewencp]: Thanks for the feedback. What you are saying makes sense: we should 
give users options to choose from, because each use case is different. I think 
the discard-and-log option can be offered to users, but I am skeptical about the 
dead letter queue.

Also, which version should this bug be worked on?

Can I please get access to Confluence 
([https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals]) 
as well?



[jira] [Commented] (KAFKA-6490) JSON SerializationException Stops Connect

2018-02-06 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354945#comment-16354945
 ] 

Ewen Cheslack-Postava commented on KAFKA-6490:
--

A change in behavior like that would definitely require a KIP – existing users 
would not expect this at all.

Connect started with the current behavior because for many users losing data is 
worse than suffering some downtime. However, it's clear some alternatives are 
warranted; this question comes up from time to time on mailing lists. Generally 
there are only a few options that seem to make sense:
 * Stop processing (current behavior) and log
 * Log and retry (really only makes sense for unusual edge cases where data got 
corrupted in flight between Kafka and Connect)
 * Discard and log (I care about uptime more than a bit of lost data)
 * Dead letter queue (or some other fallback handler)

The retry case is probably the least important here as it will rarely make a 
difference, so the other 3 are the ones I think we'd want to implement. A KIP 
for this should be straightforward, though the implementation will require care 
to make sure we handle all places errors can occur (in the producer/consumer, 
during deserialization, during transformations, etc).
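As a rough illustration of the four options listed above, here is a minimal, 
self-contained Java sketch. It is not Connect code: the names 
{{ErrorPolicySketch}}, {{Policy}}, {{Result}} and {{process}} are hypothetical, 
records are plain strings, and the dead letter queue is just an in-memory list 
standing in for a fallback topic or handler.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of the four error-handling options; not Connect's API.
final class ErrorPolicySketch {

    // One constant per option from the list above.
    enum Policy { FAIL, RETRY, SKIP, DEAD_LETTER }

    // What happened to the input records.
    static final class Result {
        final List<String> delivered = new ArrayList<>();
        final List<String> deadLetters = new ArrayList<>(); // stand-in for a fallback topic
        int skipped = 0;
    }

    static Result process(List<String> records,
                          Function<String, String> convert,
                          Policy policy,
                          int maxRetries) {
        Result result = new Result();
        for (String record : records) {
            int attempts = 0;
            while (true) {
                try {
                    result.delivered.add(convert.apply(record));
                    break; // converted, move on to the next record
                } catch (RuntimeException e) {
                    attempts++;
                    System.err.println("Conversion failed (attempt " + attempts + "): " + e.getMessage());
                    if (policy == Policy.RETRY && attempts <= maxRetries) {
                        continue; // log and retry the same record
                    }
                    if (policy == Policy.SKIP) {
                        result.skipped++; // discard and log
                        break;
                    }
                    if (policy == Policy.DEAD_LETTER) {
                        result.deadLetters.add(record); // hand off to the fallback
                        break;
                    }
                    throw e; // FAIL, or retries exhausted: stop processing
                }
            }
        }
        return result;
    }
}
```

With {{Policy.FAIL}} the first bad record stops the whole batch, which mirrors 
the current behavior; the other policies trade some data loss or extra 
bookkeeping for uptime.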



[jira] [Commented] (KAFKA-6490) JSON SerializationException Stops Connect

2018-02-06 Thread Prasanna Subburaj (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354908#comment-16354908
 ] 

Prasanna Subburaj commented on KAFKA-6490:
--

Thanks for your response, [~mjsax]. I think we need a similar feature in 
Connect, because right now a malformed JSON message makes the connector fail, 
and it does not process any of the messages that come after it. I can work on a 
KIP to solve this issue if the folks you tagged agree with what I am saying. 
Can you also please add me to the contributors list?



[jira] [Commented] (KAFKA-6490) JSON SerializationException Stops Connect

2018-02-06 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354852#comment-16354852
 ] 

Matthias J. Sax commented on KAFKA-6490:


I am not too familiar with the details of Connect. However, it sounds like this 
change might require a KIP. There was a similar issue for Streams, and we added 
a config so people can choose whether to continue or fail in this case: 
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-161%3A+streams+deserialization+exception+handlers]

cc [~wicknicks] [~rhauch] [~kkonstantine]
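For reference, a minimal sketch of how the Streams config from KIP-161 is set. 
The property key and handler class names below are the ones introduced by that 
KIP as far as I know; the wrapper class {{StreamsHandlerConfigSketch}} and its 
method are hypothetical, and the sketch deliberately uses plain strings so it 
does not need the Streams jar on the classpath. A real application would also 
set {{application.id}}, {{bootstrap.servers}} and so on.

```java
import java.util.Properties;

// Hypothetical helper; only the property key and handler class names come from KIP-161.
final class StreamsHandlerConfigSketch {

    static final String HANDLER_KEY = "default.deserialization.exception.handler";

    // Log the bad record and keep processing.
    static final String LOG_AND_CONTINUE =
            "org.apache.kafka.streams.errors.LogAndContinueExceptionHandler";

    // Log the bad record and stop (the default in Streams).
    static final String LOG_AND_FAIL =
            "org.apache.kafka.streams.errors.LogAndFailExceptionHandler";

    static Properties continueOnBadRecords() {
        Properties props = new Properties();
        props.put(HANDLER_KEY, LOG_AND_CONTINUE);
        return props;
    }
}
```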



[jira] [Commented] (KAFKA-6490) JSON SerializationException Stops Connect

2018-02-06 Thread Prasanna Subburaj (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354805#comment-16354805
 ] 

Prasanna Subburaj commented on KAFKA-6490:
--

[~mjsax] Can you please help [~wspeirs] with this ticket?

 
 



[jira] [Commented] (KAFKA-6490) JSON SerializationException Stops Connect

2018-01-30 Thread William R. Speirs (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16345031#comment-16345031
 ] 

William R. Speirs commented on KAFKA-6490:
--

Seems like a simple fix. I wrapped the inside of the {{for}} loop with a 
{{try/catch}} for a {{DataException}} and reported the exception as a warning. 
This will allow other messages to be processed. Only hitch is that if you have 
an entire topic full of non-JSON messages, then you'll fill up your logs with a 
bunch of these messages. One way to remedy this would be to put a counter such 
that after say 100 of these warnings, it stops reporting... however, that might 
be too clever. Thoughts?
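A minimal sketch of the idea described above, not the attached patch: the 
conversion of each message is wrapped in a {{try/catch}}, failures are logged as 
warnings, and logging stops after 100 warnings. Everything here is a 
hypothetical stand-in so the example compiles on its own: {{DataException}} and 
{{Converter}} are local types rather than the Connect classes, {{toyJson}} is a 
toy check rather than a JSON parser, and {{convertMessages}} only borrows the 
name of the method in WorkerSinkTask.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical sketch of "catch, warn, keep going"; not the real WorkerSinkTask.
final class SkipBadRecordsSketch {
    private static final Logger LOG = Logger.getLogger("SkipBadRecordsSketch");

    /** Local stand-in for Connect's DataException. */
    static final class DataException extends RuntimeException {
        DataException(String message) { super(message); }
    }

    /** Local stand-in for a converter: raw bytes in, converted value out. */
    interface Converter {
        Object toConnectData(byte[] raw);
    }

    static final int MAX_WARNINGS = 100; // the "say 100" cap from the comment

    private int warningsLogged = 0;
    private int skipped = 0;

    List<Object> convertMessages(List<byte[]> rawMessages, Converter converter) {
        List<Object> converted = new ArrayList<>();
        for (byte[] raw : rawMessages) {
            try {
                converted.add(converter.toConnectData(raw));
            } catch (DataException e) {
                skipped++; // always counted, even once logging has stopped
                if (warningsLogged < MAX_WARNINGS) {
                    warningsLogged++;
                    LOG.warning("Skipping message that failed conversion: " + e.getMessage());
                }
            }
        }
        return converted;
    }

    int skipped() { return skipped; }

    int warningsLogged() { return warningsLogged; }

    /** Toy converter: accepts only text that looks like a JSON object. */
    static Object toyJson(byte[] raw) {
        String text = new String(raw, StandardCharsets.UTF_8).trim();
        if (!(text.startsWith("{") && text.endsWith("}"))) {
            throw new DataException("not JSON: " + text);
        }
        return text;
    }
}
```

Keeping a separate {{skipped}} counter means the total number of dropped 
messages is still available after the warnings have been capped.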



[jira] [Commented] (KAFKA-6490) JSON SerializationException Stops Connect

2018-01-26 Thread Prasanna Subburaj (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-6490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16341854#comment-16341854
 ] 

Prasanna Subburaj commented on KAFKA-6490:
--

Yes, I have also faced this issue. I feel that the connector should log a 
warning and move on to the next message. Still, I would like other 
contributors to comment on this.
