[jira] [Assigned] (KAFKA-4388) Connect key and value converters are listed without default values
[ https://issues.apache.org/jira/browse/KAFKA-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Evgeny Veretennikov reassigned KAFKA-4388:
------------------------------------------

    Assignee: Evgeny Veretennikov

> Connect key and value converters are listed without default values
> ------------------------------------------------------------------
>
>                 Key: KAFKA-4388
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4388
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>    Affects Versions: 0.10.1.0
>            Reporter: Ewen Cheslack-Postava
>            Assignee: Evgeny Veretennikov
>            Priority: Minor
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> KIP-75 added per-connector converters. This exposes the settings on a
> per-connector basis via the validation API. However, the way this is
> specified for each connector is via a config value with no default value.
> This means the validation API implies there is no setting unless you provide
> one.
> It would be much better to include the default value extracted from the
> WorkerConfig instead, so it's clear you shouldn't need to override the default.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KAFKA-5511) ConfigDef.define() overloads take too many parameters
Evgeny Veretennikov created KAFKA-5511:
---------------------------------------

             Summary: ConfigDef.define() overloads take too many parameters
                 Key: KAFKA-5511
                 URL: https://issues.apache.org/jira/browse/KAFKA-5511
             Project: Kafka
          Issue Type: Improvement
          Components: clients
            Reporter: Evgeny Veretennikov
            Priority: Minor

The builder pattern could help get rid of all these {{define()}} overloads. I think it's better to create some {{ConfigKeyBuilder}} class to construct keys.
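[Editor's note] A minimal sketch of what such a builder could look like. The class name {{ConfigKeyBuilder}} comes from the ticket, but every method and field below is hypothetical, not part of the Kafka API:

```java
// Hypothetical sketch: replaces positional ConfigDef.define(name, type,
// defaultValue, importance, doc, ...) overloads with named, chainable setters.
public final class ConfigKeyBuilder {
    public enum Type { STRING, INT, LONG, BOOLEAN }
    public enum Importance { HIGH, MEDIUM, LOW }

    private final String name;
    private Type type = Type.STRING;
    private Object defaultValue;
    private Importance importance = Importance.MEDIUM;
    private String documentation = "";

    private ConfigKeyBuilder(String name) { this.name = name; }

    // Entry point; each setter returns this, so calls chain fluently.
    public static ConfigKeyBuilder key(String name) { return new ConfigKeyBuilder(name); }

    public ConfigKeyBuilder type(Type type) { this.type = type; return this; }
    public ConfigKeyBuilder defaultValue(Object v) { this.defaultValue = v; return this; }
    public ConfigKeyBuilder importance(Importance i) { this.importance = i; return this; }
    public ConfigKeyBuilder documentation(String doc) { this.documentation = doc; return this; }

    public String name() { return name; }
    public Object defaultValue() { return defaultValue; }

    public static void main(String[] args) {
        ConfigKeyBuilder k = ConfigKeyBuilder.key("fetch.max.wait.ms")
                .type(Type.INT)
                .defaultValue(500)
                .importance(Importance.MEDIUM)
                .documentation("Maximum time the server blocks before answering a fetch");
        System.out.println(k.name() + "=" + k.defaultValue());
    }
}
```

The caller only names the parameters it needs, so adding a new optional attribute does not multiply overloads.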
[jira] [Commented] (KAFKA-4750) KeyValueIterator returns null values
[ https://issues.apache.org/jira/browse/KAFKA-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16062770#comment-16062770 ]

Michal Borowiecki commented on KAFKA-4750:
------------------------------------------

The Deserializer.deserialize javadoc says:

* @param data serialized bytes; may be null; implementations are recommended to handle null by +returning a value+ or null rather than throwing an exception.

If it's fair for the deserializer to return a value when a null byte array is provided, then, by symmetry, the serializer should be able to return null when such an object is provided.

> KeyValueIterator returns null values
> ------------------------------------
>
>                 Key: KAFKA-4750
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4750
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 0.10.1.1, 0.11.0.0, 0.10.2.1
>            Reporter: Michal Borowiecki
>            Assignee: Evgeny Veretennikov
>              Labels: newbie
>         Attachments: DeleteTest.java
>
>
> The API for the ReadOnlyKeyValueStore.range method promises the returned
> iterator will not return null values. However, after upgrading from 0.10.0.0
> to 0.10.1.1 we found null values are returned, causing NPEs on our side.
> I found this happens after removing entries from the store, and I found a
> resemblance to the SAMZA-94 defect. The problem seems to be, as it was there,
> that when deleting entries with a serializer that does not return null when
> null is passed in, the state store doesn't actually delete that key/value
> pair, but the iterator will return a null value for that key.
> When I modified our serializer to return null when null is passed in, the
> problem went away. However, I believe this should be fixed in Kafka Streams,
> perhaps with a similar approach as SAMZA-94.
[jira] [Comment Edited] (KAFKA-4750) KeyValueIterator returns null values
[ https://issues.apache.org/jira/browse/KAFKA-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16062770#comment-16062770 ]

Michal Borowiecki edited comment on KAFKA-4750 at 6/26/17 8:33 AM:
-------------------------------------------------------------------

The Deserializer.deserialize javadoc says:

* @param data serialized bytes; may be null; implementations are recommended to handle null by +returning a value+ or null rather than throwing an exception.

If it's fair for the deserializer to return a value when a null byte array is provided, then, by symmetry, the serializer should be able to return null when such an object is provided and achieve correct round-tripping.

was (Author: mihbor):
The Deserializer.deserialize javadoc says:

* @param data serialized bytes; may be null; implementations are recommended to handle null by +returning a value+ or null rather than throwing an exception.

If it's fair for the deserializer to return a value when a null byte array is provided, then, by symmetry, the serializer should be able to return null when such an object is provided.
[jira] [Comment Edited] (KAFKA-4750) KeyValueIterator returns null values
[ https://issues.apache.org/jira/browse/KAFKA-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16062770#comment-16062770 ]

Michal Borowiecki edited comment on KAFKA-4750 at 6/26/17 8:42 AM:
-------------------------------------------------------------------

The Deserializer.deserialize javadoc says:

* @param data serialized bytes; may be null; implementations are recommended to handle null by +returning a value+ or null rather than throwing an exception.

If it's fair for the deserializer to return a value when a null byte array is provided, then, by symmetry, the serializer should be able to return null when such an object is provided and achieve correct round-tripping.

[~evis] In your example, whether it's intuitive or not depends, I guess, on people's intuition. To me it is intuitive, since null on a changelog topic is a tombstone marker. If I serialize a value to null, then to me it's intuitive that it's equivalent to deletion, so it should not be found in the following get.

An example use case would be if the value types are collections. In order to avoid dealing with nulls in the application layer, one might want to deserialize null into an empty collection and vice versa. Would that be a valid use case? I can't see why not, but I may be wrong.

was (Author: mihbor):
The Deserializer.deserialize javadoc says:

* @param data serialized bytes; may be null; implementations are recommended to handle null by +returning a value+ or null rather than throwing an exception.

If it's fair for the deserializer to return a value when a null byte array is provided, then, by symmetry, the serializer should be able to return null when such an object is provided and achieve correct round-tripping.

[~evis] In your example, whether it's intuitive or not depends, I guess, on people's intuition. To me it is intuitive, since null on a changelog topic is a tombstone marker. If I serialize a value to null, then it's intuitive that it's equivalent to deletion. An example use case would be if the value types are collections. In order to avoid dealing with nulls in the application layer, one might want to deserialize null into an empty collection and vice versa. Would that be a valid use case? I can't see why not, but I may be wrong.
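[Editor's note] The empty-collection use case above can be sketched as a pair of functions. This is a simplified stand-in, not the Kafka Serializer/Deserializer interfaces, and it assumes element values contain no commas:

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of a serde where an empty list round-trips through null bytes,
// matching the tombstone semantics discussed in the comment: serializing an
// empty collection produces null (a delete), and deserializing null produces
// an empty collection (the application never sees null).
public class ListSerde {
    public static byte[] serialize(List<String> value) {
        if (value == null || value.isEmpty()) {
            return null; // null bytes act as a tombstone in the changelog
        }
        return String.join(",", value).getBytes(StandardCharsets.UTF_8);
    }

    public static List<String> deserialize(byte[] data) {
        if (data == null) {
            return new ArrayList<>(); // shield the application layer from null
        }
        return new ArrayList<>(Arrays.asList(
                new String(data, StandardCharsets.UTF_8).split(",")));
    }

    public static void main(String[] args) {
        System.out.println(serialize(new ArrayList<>()) == null);                // true
        System.out.println(deserialize(null).isEmpty());                         // true
        System.out.println(deserialize(serialize(Arrays.asList("a", "b"))));     // [a, b]
    }
}
```

With such a serde, storing an empty collection and deleting the key become the same operation, which is exactly why the store must treat a null serialization result as a delete.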
[jira] [Comment Edited] (KAFKA-4750) KeyValueIterator returns null values
[ https://issues.apache.org/jira/browse/KAFKA-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16062770#comment-16062770 ]

Michal Borowiecki edited comment on KAFKA-4750 at 6/26/17 8:42 AM:
-------------------------------------------------------------------

The Deserializer.deserialize javadoc says:

* @param data serialized bytes; may be null; implementations are recommended to handle null by +returning a value+ or null rather than throwing an exception.

If it's fair for the deserializer to return a value when a null byte array is provided, then, by symmetry, the serializer should be able to return null when such an object is provided and achieve correct round-tripping.

[~evis] In your example, whether it's intuitive or not depends, I guess, on people's intuition. To me it is intuitive, since null on a changelog topic is a tombstone marker. If I serialize a value to null, then it's intuitive that it's equivalent to deletion. An example use case would be if the value types are collections. In order to avoid dealing with nulls in the application layer, one might want to deserialize null into an empty collection and vice versa. Would that be a valid use case? I can't see why not, but I may be wrong.

was (Author: mihbor):
The Deserializer.deserialize javadoc says:

* @param data serialized bytes; may be null; implementations are recommended to handle null by +returning a value+ or null rather than throwing an exception.

If it's fair for the deserializer to return a value when a null byte array is provided, then, by symmetry, the serializer should be able to return null when such an object is provided and achieve correct round-tripping.
[jira] [Commented] (KAFKA-4750) KeyValueIterator returns null values
[ https://issues.apache.org/jira/browse/KAFKA-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16062837#comment-16062837 ]

ASF GitHub Bot commented on KAFKA-4750:
---------------------------------------

GitHub user evis opened a pull request:

    https://github.com/apache/kafka/pull/3430

    KAFKA-4750: RocksDBStore always deletes null values

    @guozhangwang @mjsax

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/evis/kafka rocksdbstore_null_values

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/3430.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #3430

----
commit 47b4b2b2a5fe8af4ca4a60c2cae97dcab4e7eec3
Author: Evgeny Veretennikov
Date:   2017-06-26T09:21:35Z

    KAFKA-4750: RocksDBStore always deletes null values

    Deletion occurs even if the serde serializes a null value to a non-null byte array.
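[Editor's note] The behaviour described in the commit message can be illustrated with a simplified in-memory stand-in; this is not the actual RocksDBStore code, and the class and method names are only for the example:

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the store-side fix: a null value is treated as a
// delete before serialization, so the decision no longer depends on whether
// the serde maps null to null bytes. The iterator can then never yield a null
// value for a key that was "deleted" via a null put.
public class StoreSketch {
    private final Map<String, byte[]> store = new HashMap<>();

    public void put(String key, String value) {
        if (value == null) {
            store.remove(key); // tombstone: delete regardless of serde behaviour
        } else {
            store.put(key, value.getBytes(StandardCharsets.UTF_8));
        }
    }

    public boolean contains(String key) {
        return store.containsKey(key);
    }

    public static void main(String[] args) {
        StoreSketch s = new StoreSketch();
        s.put("k", "v");
        s.put("k", null); // deletes, even with a serde that never returns null
        System.out.println(s.contains("k")); // false
    }
}
```

Checking the value object rather than its serialized form is what makes the store robust to serdes that do not round-trip null.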
[jira] [Assigned] (KAFKA-5372) Unexpected state transition Dead to PendingShutdown
[ https://issues.apache.org/jira/browse/KAFKA-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eno Thereska reassigned KAFKA-5372:
-----------------------------------

    Assignee: Eno Thereska

> Unexpected state transition Dead to PendingShutdown
> ---------------------------------------------------
>
>                 Key: KAFKA-5372
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5372
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: streams
>            Reporter: Jason Gustafson
>            Assignee: Eno Thereska
>             Fix For: 0.11.1.0
>
>
> I often see this running integration tests:
> {code}
> [2017-06-02 15:36:03,411] WARN stream-thread
> [appId-1-c382ef0a-adbd-422b-9717-9b2bc52b55eb-StreamThread-13] Unexpected
> state transition from DEAD to PENDING_SHUTDOWN.
> (org.apache.kafka.streams.processor.internals.StreamThread:976)
> {code}
> Maybe a race condition on shutdown or something?
[jira] [Created] (KAFKA-5512) KafkaConsumer: High memory allocation rate when idle
Stephane Roset created KAFKA-5512:
----------------------------------

             Summary: KafkaConsumer: High memory allocation rate when idle
                 Key: KAFKA-5512
                 URL: https://issues.apache.org/jira/browse/KAFKA-5512
             Project: Kafka
          Issue Type: Bug
          Components: consumer
    Affects Versions: 0.10.1.1
            Reporter: Stephane Roset

Hi,

We noticed in our application that the memory allocation rate increased significantly when we have no Kafka messages to consume. We isolated the issue by using a JVM that simply runs 128 Kafka consumers. These consumers consume 128 partitions (so each consumer consumes one partition). The partitions are empty and no message was sent during the test. The consumers were configured with default values (session.timeout.ms=3, fetch.max.wait.ms=500, receive.buffer.bytes=65536, heartbeat.interval.ms=3000, max.poll.interval.ms=30, max.poll.records=500). The Kafka cluster was made of 3 brokers. Within this context, the allocation rate was about 55 MiB/s. This high allocation rate generates a lot of GC activity (to collect the young heap) and was an issue for our project.

We profiled the JVM with JProfiler. We noticed that there was a huge quantity of ArrayList$Itr instances in memory. These collections were mainly instantiated by the methods handleCompletedReceives, handleCompletedSends, handleConnections and handleDisconnections of the class NetworkClient. We also noticed that we had a lot of calls to the method pollOnce of the class KafkaConsumer.

So we decided to run only one consumer and to profile the calls to the method pollOnce. We noticed that regularly a huge number of calls is made to this method, up to 268000 calls within 100 ms. The pollOnce method calls the NetworkClient.handle* methods. These methods iterate over collections (even if they are empty), which explains the huge number of iterators in memory.

The large number of calls is related to the heartbeat mechanism. The pollOnce method calculates the poll timeout; if a heartbeat needs to be done, the timeout will be set to 0. The problem is that the heartbeat thread checks every 100 ms (default value of retry.backoff.ms) whether a heartbeat should be sent, so the KafkaConsumer will call the poll method in a loop without timeout until the heartbeat thread awakes. For example: the heartbeat thread just started to wait and will awake in 99 ms. So during those 99 ms, the KafkaConsumer will call the pollOnce method in a loop with a timeout of 0. That explains how we can have 268000 calls within 100 ms.

The heartbeat thread calls AbstractCoordinator.wait() to sleep, so I think the Kafka consumer should wake the heartbeat thread with a notify when needed.

We made two quick fixes to solve this issue:
- In NetworkClient.handle*(), we don't iterate over collections if they are empty (to avoid unnecessary iterator instantiations).
- In KafkaConsumer.pollOnce(), if the poll timeout is equal to 0, we notify the heartbeat thread to wake it (a dirty fix, because we don't handle the autocommit case).

With these 2 quick fixes and 128 consumers, the allocation rate drops from 55 MiB/s to 4 MiB/s.
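[Editor's note] The first quick fix can be sketched in isolation. The method name mirrors one of the NetworkClient handlers mentioned in the report, but the class below is a simplified example, not the actual Kafka code:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the empty-check quick fix: an enhanced for loop over an ArrayList
// allocates an ArrayList$Itr object on every call, even when the list is
// empty. On an idle consumer that polls hundreds of thousands of times per
// second, those throwaway iterators dominate the allocation rate. Guarding
// with isEmpty() skips the allocation on the idle hot path.
public class HandleSketch {
    static int handleCompletedReceives(List<String> responses) {
        if (responses.isEmpty()) {
            return 0; // fast path: no iterator allocated when there is nothing to do
        }
        int handled = 0;
        for (String response : responses) {
            handled++; // process the response here
        }
        return handled;
    }

    public static void main(String[] args) {
        System.out.println(handleCompletedReceives(new ArrayList<>())); // idle poll
        System.out.println(handleCompletedReceives(List.of("r1", "r2"))); // busy poll
    }
}
```

The guard changes no observable behaviour; it only removes a per-call allocation, which is why it is safe as a hot-path micro-optimisation.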
[jira] [Updated] (KAFKA-5512) KafkaConsumer: High memory allocation rate when idle
[ https://issues.apache.org/jira/browse/KAFKA-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ismael Juma updated KAFKA-5512:
-------------------------------
    Labels: performance  (was: )
[jira] [Updated] (KAFKA-5512) KafkaConsumer: High memory allocation rate when idle
[ https://issues.apache.org/jira/browse/KAFKA-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ismael Juma updated KAFKA-5512:
-------------------------------
    Fix Version/s: 0.11.0.1
[jira] [Commented] (KAFKA-5512) KafkaConsumer: High memory allocation rate when idle
[ https://issues.apache.org/jira/browse/KAFKA-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16062859#comment-16062859 ]

Ismael Juma commented on KAFKA-5512:
------------------------------------

Nice catch. Are you intending to submit a PR with your fixes?
[jira] [Created] (KAFKA-5513) Contradicting scalaDoc for AdminUtils.assignReplicasToBrokers
Charly Molter created KAFKA-5513:
---------------------------------

             Summary: Contradicting scalaDoc for AdminUtils.assignReplicasToBrokers
                 Key: KAFKA-5513
                 URL: https://issues.apache.org/jira/browse/KAFKA-5513
             Project: Kafka
          Issue Type: Improvement
          Components: core
            Reporter: Charly Molter
            Priority: Trivial

The documentation for AdminUtils.assignReplicasToBrokers seems to contradict itself.

It says in the description: "As the result, if the number of replicas is equal to or greater than the number of racks, it will ensure that each rack will get at least one replica." This means it is possible to get an assignment where there are multiple replicas in a rack (if there are fewer racks than the replication factor).

However, the throws clause says: "@throws AdminOperationException If rack information is supplied but it is incomplete, or if it is not possible to assign each replica to a unique rack." This seems to contradict the first claim.

In practice it doesn't throw when RF > #racks, so that point in the @throws clause should probably be removed.

https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/admin/AdminUtils.scala#L121-L130
[jira] [Commented] (KAFKA-4563) State transitions error PARTITIONS_REVOKED to NOT_RUNNING
[ https://issues.apache.org/jira/browse/KAFKA-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16062911#comment-16062911 ]

Eno Thereska commented on KAFKA-4563:
-------------------------------------

This will be fixed as part of KAFKA-5372.

> State transitions error PARTITIONS_REVOKED to NOT_RUNNING
> ---------------------------------------------------------
>
>                 Key: KAFKA-4563
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4563
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: streams
>    Affects Versions: 0.10.2.0
>            Reporter: Eno Thereska
>            Assignee: Eno Thereska
>             Fix For: 0.11.1.0
>
>
> When starting and stopping streams quickly, the following exception is thrown:
> java.lang.IllegalStateException: Incorrect state transition from PARTITIONS_REVOKED to NOT_RUNNING
>     at org.apache.kafka.streams.processor.internals.StreamThread.setState(StreamThread.java:164)
>     at org.apache.kafka.streams.processor.internals.StreamThread.shutdown(StreamThread.java:414)
>     at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:366)
> A temporary fix is to convert the exception into a warning, since clearly not all state transitions are thought through yet.
[jira] [Resolved] (KAFKA-4563) State transitions error PARTITIONS_REVOKED to NOT_RUNNING
[ https://issues.apache.org/jira/browse/KAFKA-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eno Thereska resolved KAFKA-4563.
---------------------------------
    Resolution: Duplicate

This will be fixed as part of KAFKA-5372.
[jira] [Commented] (KAFKA-5372) Unexpected state transition Dead to PendingShutdown
[ https://issues.apache.org/jira/browse/KAFKA-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16062996#comment-16062996 ] ASF GitHub Bot commented on KAFKA-5372: --- GitHub user enothereska opened a pull request: https://github.com/apache/kafka/pull/3432 KAFKA-5372: fixes to state transitions You can merge this pull request into a Git repository by running: $ git pull https://github.com/enothereska/kafka KAFKA-5372-state-transitions Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3432.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3432 commit 5f57558ef293351d9f5db11edb62089367e39b76 Author: Eno Thereska Date: 2017-06-26T11:04:20Z Checkpoint commit ac372b196998052a024aac47af64dbd803a65733 Author: Eno Thereska Date: 2017-06-26T12:11:26Z Some fixes > Unexpected state transition Dead to PendingShutdown > --- > > Key: KAFKA-5372 > URL: https://issues.apache.org/jira/browse/KAFKA-5372 > Project: Kafka > Issue Type: Sub-task > Components: streams >Reporter: Jason Gustafson >Assignee: Eno Thereska > Fix For: 0.11.1.0 > > > I often see this running integration tests: > {code} > [2017-06-02 15:36:03,411] WARN stream-thread > [appId-1-c382ef0a-adbd-422b-9717-9b2bc52b55eb-StreamThread-13] Unexpected > state transition from DEAD to PENDING_SHUTDOWN. > (org.apache.kafka.streams.processor.internals.StreamThread:976) > {code} > Maybe a race condition on shutdown or something? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KAFKA-5514) KafkaConsumer ignores default values in Properties object because of incorrect use of Properties object.
Geert Schuring created KAFKA-5514: - Summary: KafkaConsumer ignores default values in Properties object because of incorrect use of Properties object. Key: KAFKA-5514 URL: https://issues.apache.org/jira/browse/KAFKA-5514 Project: Kafka Issue Type: Bug Components: clients Affects Versions: 0.10.2.1 Reporter: Geert Schuring When setting default values in a Properties object the KafkaConsumer ignores these values because the Properties object is being treated as a Map. The ConsumerConfig object uses the putAll method to copy properties from the incoming object to its local copy. (See https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java#L471) This is incorrect because it only copies the explicit properties and ignores the default values also present in the properties object. (Also see: https://stackoverflow.com/questions/2004833/how-to-merge-two-java-util-properties-objects) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
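The defaults-vs-explicit-entries distinction described above can be reproduced outside Kafka. A minimal sketch (class and method names are illustrative, not from the Kafka codebase):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class PropertiesCopyDemo {
    // Mirrors the buggy pattern: Map.putAll sees only the Hashtable entries,
    // not the defaults chained in via new Properties(defaults).
    static Map<Object, Object> copyViaPutAll(Properties props) {
        Map<Object, Object> copy = new HashMap<>();
        copy.putAll(props);
        return copy;
    }

    // stringPropertyNames() walks the default chain, so defaults survive the copy.
    static Map<String, String> copyViaPropertyNames(Properties props) {
        Map<String, String> copy = new HashMap<>();
        for (String name : props.stringPropertyNames()) {
            copy.put(name, props.getProperty(name));
        }
        return copy;
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty("bootstrap.servers", "localhost:9092");
        Properties props = new Properties(defaults); // stored as defaults, not entries
        props.setProperty("group.id", "demo");

        System.out.println(copyViaPutAll(props).containsKey("bootstrap.servers"));        // false
        System.out.println(copyViaPropertyNames(props).containsKey("bootstrap.servers")); // true
    }
}
```

The copy via `putAll` silently drops `bootstrap.servers`, which is exactly the symptom the reporter describes.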
[jira] [Created] (KAFKA-5515) Consider removing date formatting from Segments class
Bill Bejeck created KAFKA-5515: -- Summary: Consider removing date formatting from Segments class Key: KAFKA-5515 URL: https://issues.apache.org/jira/browse/KAFKA-5515 Project: Kafka Issue Type: Improvement Components: streams Reporter: Bill Bejeck Currently the {{Segments}} class uses a date when calculating the segment id and uses {{SimpleDateFormat}} for formatting the segment id. However this is a high volume code path and creating a new {{SimpleDateFormat}} for each segment id is expensive. We should look into removing the date from the segment id or at a minimum use a faster alternative to {{SimpleDateFormat}} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KAFKA-5515) Consider removing date formatting from Segments class
[ https://issues.apache.org/jira/browse/KAFKA-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bejeck updated KAFKA-5515: --- Description: Currently the {{Segments}} class uses a date when calculating the segment id and uses {{SimpleDateFormat}} for formatting the segment id. However this is a high volume code path and creating a new {{SimpleDateFormat}} for each segment id is expensive. We should look into removing the date from the segment id or at a minimum use a faster alternative to {{SimpleDateFormat}}. We should also consider keeping a lookup of existing segments to avoid as many string operations as possible. (was: Currently the {{Segments}} class uses a date when calculating the segment id and uses {{SimpleDateFormat}} for formatting the segment id. However this is a high volume code path and creating a new {{SimpleDateFormat}} for each segment id is expensive. We should look into removing the date from the segment id or at a minimum use a faster alternative to {{SimpleDateFormat}} ) > Consider removing date formatting from Segments class > - > > Key: KAFKA-5515 > URL: https://issues.apache.org/jira/browse/KAFKA-5515 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Bill Bejeck > Labels: performance > > Currently the {{Segments}} class uses a date when calculating the segment id > and uses {{SimpleDateFormat}} for formatting the segment id. However this is > a high volume code path and creating a new {{SimpleDateFormat}} for each > segment id is expensive. We should look into removing the date from the > segment id or at a minimum use a faster alternative to {{SimpleDateFormat}}. > We should also consider keeping a lookup of existing segments to avoid as > many string operations as possible. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KAFKA-5515) Consider removing date formatting from Segments class
[ https://issues.apache.org/jira/browse/KAFKA-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bejeck updated KAFKA-5515: --- Description: Currently the {{Segments}} class uses a date when calculating the segment id and uses {{SimpleDateFormat}} for formatting the segment id. However this is a high volume code path and creating a new {{SimpleDateFormat}} and formatting each segment id is expensive. We should look into removing the date from the segment id or at a minimum use a faster alternative to {{SimpleDateFormat}}. We should also consider keeping a lookup of existing segments to avoid as many string operations as possible. (was: Currently the {{Segments}} class uses a date when calculating the segment id and uses {{SimpleDateFormat}} for formatting the segment id. However this is a high volume code path and creating a new {{SimpleDateFormat}} for each segment id is expensive. We should look into removing the date from the segment id or at a minimum use a faster alternative to {{SimpleDateFormat}}. We should also consider keeping a lookup of existing segments to avoid as many string operations as possible.) > Consider removing date formatting from Segments class > - > > Key: KAFKA-5515 > URL: https://issues.apache.org/jira/browse/KAFKA-5515 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Bill Bejeck > Labels: performance > > Currently the {{Segments}} class uses a date when calculating the segment id > and uses {{SimpleDateFormat}} for formatting the segment id. However this is > a high volume code path and creating a new {{SimpleDateFormat}} and > formatting each segment id is expensive. We should look into removing the > date from the segment id or at a minimum use a faster alternative to > {{SimpleDateFormat}}. We should also consider keeping a lookup of existing > segments to avoid as many string operations as possible. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
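One faster alternative hinted at above is to drop the formatted date entirely and derive the segment name arithmetically from the segment id. A hypothetical sketch contrasting the two approaches (names are illustrative, not the actual {{Segments}} API):

```java
public class SegmentNaming {
    // Expensive pattern: a new SimpleDateFormat allocated and used per call.
    static String dateBasedName(String storeName, long segmentId, long segmentIntervalMs) {
        java.text.SimpleDateFormat fmt = new java.text.SimpleDateFormat("yyyyMMddHHmm");
        fmt.setTimeZone(java.util.TimeZone.getTimeZone("UTC"));
        return storeName + "-" + fmt.format(new java.util.Date(segmentId * segmentIntervalMs));
    }

    // Cheap alternative: encode the segment's base timestamp as a plain long,
    // avoiding both the allocation and the formatting work.
    static String numericName(String storeName, long segmentId, long segmentIntervalMs) {
        return storeName + "." + segmentId * segmentIntervalMs;
    }
}
```

The numeric form is also trivially parseable back into a segment id, which helps if a lookup table of existing segments is added later.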
[jira] [Created] (KAFKA-5516) Formatting verifiable producer/consumer output in a similar fashion
Paolo Patierno created KAFKA-5516: - Summary: Formatting verifiable producer/consumer output in a similar fashion Key: KAFKA-5516 URL: https://issues.apache.org/jira/browse/KAFKA-5516 Project: Kafka Issue Type: Improvement Components: tools Reporter: Paolo Patierno Assignee: Paolo Patierno Priority: Trivial Hi, following the proposal, the verifiable producer and consumer should provide similar output where "timestamp" is always the first column, followed by the event "name" and then all the data specific to that event. It includes refactoring the verifiable producer to match the format of the verifiable consumer. Thanks, Paolo -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5516) Formatting verifiable producer/consumer output in a similar fashion
[ https://issues.apache.org/jira/browse/KAFKA-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063049#comment-16063049 ] ASF GitHub Bot commented on KAFKA-5516: --- GitHub user ppatierno opened a pull request: https://github.com/apache/kafka/pull/3434 KAFKA-5516: Formatting verifiable producer/consumer output in a similar fashion You can merge this pull request into a Git repository by running: $ git pull https://github.com/ppatierno/kafka verifiable-consumer-producer Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3434.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3434 commit 6e6d728dce83ea689061a196e1b6811b447d4db7 Author: ppatierno Date: 2017-06-26T12:07:08Z Modified JSON order attributes in a more readable fashion commit 9235aadfccf446415ffbfd5d90c8d4faeddecc08 Author: ppatierno Date: 2017-06-26T12:56:07Z Fixed documentation about old request.required.acks producer parameter Modified JSON order attributes in a more readable fashion Refactoring on verifiable producer to be like the verifiable consumer > Formatting verifiable producer/consumer output in a similar fashion > --- > > Key: KAFKA-5516 > URL: https://issues.apache.org/jira/browse/KAFKA-5516 > Project: Kafka > Issue Type: Improvement > Components: tools >Reporter: Paolo Patierno >Assignee: Paolo Patierno >Priority: Trivial > > Hi, > following the proposal to have verifiable producer/consumer providing a very > similar output where the "timestamp" is always the first column followed by > "name" event and then all the specific data for such event. > It includes a verifiable producer refactoring for having that in the same way > as verifiable consumer. > Thanks, > Paolo -- This message was sent by Atlassian JIRA (v6.4.14#64029)
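The proposed column ordering can be sketched with an insertion-ordered map (the field names here are illustrative, not taken from the actual tool output):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EventFormat {
    // Build an event record with "timestamp" first, "name" second,
    // then the event-specific fields, as the proposal describes.
    static Map<String, Object> event(long timestampMs, String name, Map<String, Object> data) {
        Map<String, Object> out = new LinkedHashMap<>();
        out.put("timestamp", timestampMs);
        out.put("name", name);
        out.putAll(data);
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> data = new LinkedHashMap<>();
        data.put("topic", "test");
        data.put("offset", 42L);
        System.out.println(event(1498478828000L, "record_acked", data).keySet());
        // [timestamp, name, topic, offset]
    }
}
```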
[jira] [Commented] (KAFKA-4388) Connect key and value converters are listed without default values
[ https://issues.apache.org/jira/browse/KAFKA-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063107#comment-16063107 ] ASF GitHub Bot commented on KAFKA-4388: --- GitHub user evis opened a pull request: https://github.com/apache/kafka/pull/3435 KAFKA-4388 Recommended values for converters from plugins Questions to reviewers: 1. Should we cache `converterRecommenders.validValues()`, `SinkConnectorConfig.configDef()` and `SourceConnectorConfig.configDef()` results? 2. What is appropriate place for testing new `ConnectorConfig.configDef(plugins)` functionality? cc @ewencp You can merge this pull request into a Git repository by running: $ git pull https://github.com/evis/kafka converters_values Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3435.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3435 commit 069a1ba832f844c20224598c622fa19576b0ba61 Author: Evgeny Veretennikov Date: 2017-06-26T13:44:40Z KAFKA-4388 Recommended values for converters from plugins ConnectorConfig.configDef() takes Plugins parameter now. List of recommended values for converters is taken from plugins.converters() > Connect key and value converters are listed without default values > -- > > Key: KAFKA-4388 > URL: https://issues.apache.org/jira/browse/KAFKA-4388 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 0.10.1.0 >Reporter: Ewen Cheslack-Postava >Assignee: Evgeny Veretennikov >Priority: Minor > Original Estimate: 24h > Remaining Estimate: 24h > > KIP-75 added per connector converters. This exposes the settings on a > per-connector basis via the validation API. However, the way this is > specified for each connector is via a config value with no default value. > This means the validation API implies there is no setting unless you provide > one. 
> It would be much better to include the default value extracted from the > WorkerConfig instead so it's clear you shouldn't need to override the default. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5413) Log cleaner fails due to large offset in segment file
[ https://issues.apache.org/jira/browse/KAFKA-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063228#comment-16063228 ] Jun Rao commented on KAFKA-5413: Merged https://github.com/apache/kafka/pull/3397 to 0.10.2. > Log cleaner fails due to large offset in segment file > - > > Key: KAFKA-5413 > URL: https://issues.apache.org/jira/browse/KAFKA-5413 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 0.10.2.0, 0.10.2.1 > Environment: Ubuntu 14.04 LTS, Oracle Java 8u92, kafka_2.11-0.10.2.0 >Reporter: Nicholas Ngorok >Assignee: Kelvin Rutt >Priority: Critical > Labels: reliability > Fix For: 0.11.0.0, 0.10.2.2 > > Attachments: .index.cleaned, > .log, .log.cleaned, > .timeindex.cleaned, 002147422683.log, > kafka-5413.patch > > > The log cleaner thread in our brokers is failing with the trace below > {noformat} > [2017-06-08 15:49:54,822] INFO {kafka-log-cleaner-thread-0} Cleaner 0: > Cleaning segment 0 in log __consumer_offsets-12 (largest timestamp Thu Jun 08 > 15:48:59 PDT 2017) into 0, retaining deletes. (kafka.log.LogCleaner) > [2017-06-08 15:49:54,822] INFO {kafka-log-cleaner-thread-0} Cleaner 0: > Cleaning segment 2147343575 in log __consumer_offsets-12 (largest timestamp > Thu Jun 08 15:49:06 PDT 2017) into 0, retaining deletes. > (kafka.log.LogCleaner) > [2017-06-08 15:49:54,834] ERROR {kafka-log-cleaner-thread-0} > [kafka-log-cleaner-thread-0], Error due to (kafka.log.LogCleaner) > java.lang.IllegalArgumentException: requirement failed: largest offset in > message set can not be safely converted to relative offset. 
> at scala.Predef$.require(Predef.scala:224) > at kafka.log.LogSegment.append(LogSegment.scala:109) > at kafka.log.Cleaner.cleanInto(LogCleaner.scala:478) > at > kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:405) > at > kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:401) > at scala.collection.immutable.List.foreach(List.scala:381) > at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:401) > at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:363) > at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:362) > at scala.collection.immutable.List.foreach(List.scala:381) > at kafka.log.Cleaner.clean(LogCleaner.scala:362) > at > kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:241) > at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:220) > at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63) > [2017-06-08 15:49:54,835] INFO {kafka-log-cleaner-thread-0} > [kafka-log-cleaner-thread-0], Stopped (kafka.log.LogCleaner) > {noformat} > This seems to point at the specific line [here| > https://github.com/apache/kafka/blob/0.11.0/core/src/main/scala/kafka/log/LogSegment.scala#L92] > in the kafka src where the difference is actually larger than MAXINT as both > baseOffset and offset are of type long. 
It was introduced in this [pr| > https://github.com/apache/kafka/pull/2210/files/56d1f8196b77a47b176b7bbd1e4220a3be827631] > These were the outputs of dumping the first two log segments > {noformat} > :~$ /usr/bin/kafka-run-class kafka.tools.DumpLogSegments --deep-iteration > --files /kafka-logs/__consumer_offsets-12/000 > 0.log > Dumping /kafka-logs/__consumer_offsets-12/.log > Starting offset: 0 > offset: 1810054758 position: 0 NoTimestampType: -1 isvalid: true payloadsize: > -1 magic: 0 compresscodec: NONE crc: 3127861909 keysize: 34 > :~$ /usr/bin/kafka-run-class kafka.tools.DumpLogSegments --deep-iteration > --files /kafka-logs/__consumer_offsets-12/000 > 0002147343575.log > Dumping /kafka-logs/__consumer_offsets-12/002147343575.log > Starting offset: 2147343575 > offset: 2147539884 position: 0 NoTimestampType: -1 isvalid: true paylo > adsize: -1 magic: 0 compresscodec: NONE crc: 2282192097 keysize: 34 > {noformat} > My guess is that since 2147539884 is larger than MAXINT, we are hitting this > exception. Was there a specific reason, this check was added in 0.10.2? > E.g. if the first offset is a key = "key 0" and then we have MAXINT + 1 of > "key 1" following, wouldn't we run into this situation whenever the log > cleaner runs? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
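The failing requirement boils down to whether {{offset - baseOffset}} fits in an Int, since relative offsets are stored as 4-byte values in the index. A standalone sketch of that check, using the offsets from the dump above (this is not the actual {{LogSegment}} code):

```java
public class RelativeOffsetCheck {
    // A relative offset is stored as a 4-byte value, so the delta from the
    // segment's base offset must fit in Integer.MAX_VALUE.
    static boolean safeToRelativize(long largestOffset, long baseOffset) {
        long delta = largestOffset - baseOffset;
        return delta >= 0 && delta <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        // Values from the DumpLogSegments output: cleaning into base offset 0
        // while the segment holds offset 2147539884 overflows the delta.
        System.out.println(safeToRelativize(2147539884L, 0L));          // false
        // Against the segment's own base offset the delta is small.
        System.out.println(safeToRelativize(2147539884L, 2147343575L)); // true
    }
}
```

This matches the reporter's reading: 2147539884 relative to base offset 0 exceeds MAXINT, so the cleaner's `require` fails.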
[jira] [Created] (KAFKA-5517) Support linking to particular configuration parameters
Tom Bentley created KAFKA-5517: -- Summary: Support linking to particular configuration parameters Key: KAFKA-5517 URL: https://issues.apache.org/jira/browse/KAFKA-5517 Project: Kafka Issue Type: Improvement Components: documentation Reporter: Tom Bentley Assignee: Tom Bentley Priority: Minor Currently the configuration parameters are documented in long tables, and it's only possible to link to the heading before a particular table. When discussing configuration parameters on forums, it would be helpful to be able to link to the particular parameter under discussion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5517) Support linking to particular configuration parameters
[ https://issues.apache.org/jira/browse/KAFKA-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063294#comment-16063294 ] ASF GitHub Bot commented on KAFKA-5517: --- GitHub user tombentley opened a pull request: https://github.com/apache/kafka/pull/3436 KAFKA-5517: Add id to config HTML tables to allow linking You can merge this pull request into a Git repository by running: $ git pull https://github.com/tombentley/kafka KAFKA-5517 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3436.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3436 > Support linking to particular configuration parameters > -- > > Key: KAFKA-5517 > URL: https://issues.apache.org/jira/browse/KAFKA-5517 > Project: Kafka > Issue Type: Improvement > Components: documentation >Reporter: Tom Bentley >Assignee: Tom Bentley >Priority: Minor > > Currently the configuration parameters are documented long tables, and it's > only possible to link to the heading before a particular table. When > discussing configuration parameters on forums it would be helpful to be able > to link to the particular parameter under discussion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KAFKA-5517) Support linking to particular configuration parameters
[ https://issues.apache.org/jira/browse/KAFKA-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Bentley updated KAFKA-5517: --- Labels: patch-available (was: ) > Support linking to particular configuration parameters > -- > > Key: KAFKA-5517 > URL: https://issues.apache.org/jira/browse/KAFKA-5517 > Project: Kafka > Issue Type: Improvement > Components: documentation >Reporter: Tom Bentley >Assignee: Tom Bentley >Priority: Minor > Labels: patch-available > > Currently the configuration parameters are documented long tables, and it's > only possible to link to the heading before a particular table. When > discussing configuration parameters on forums it would be helpful to be able > to link to the particular parameter under discussion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
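The idea in the PR above can be sketched as emitting an {{id}} per row when generating the config HTML table, so that {{#parameter-name}} URL fragments resolve. A hypothetical generator, not the actual {{ConfigDef.toHtmlTable}} implementation:

```java
public class ConfigTableHtml {
    // Emit one <tr> per parameter with an id anchor, so docs pages can be
    // linked as e.g. configuration.html#bootstrap.servers
    static String row(String paramName, String doc) {
        return "<tr id=\"" + paramName + "\"><td><a href=\"#" + paramName + "\">"
                + paramName + "</a></td><td>" + doc + "</td></tr>";
    }

    public static void main(String[] args) {
        System.out.println(row("bootstrap.servers", "Initial list of brokers."));
    }
}
```

The self-referencing {{<a href="#...">}} also gives readers a copyable deep link directly from the rendered table.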
[jira] [Created] (KAFKA-5518) General Kafka connector performance workload
Chen He created KAFKA-5518: -- Summary: General Kafka connector performance workload Key: KAFKA-5518 URL: https://issues.apache.org/jira/browse/KAFKA-5518 Project: Kafka Issue Type: Bug Components: KafkaConnect Affects Versions: 0.10.2.1 Reporter: Chen He Sorry, this is my first time creating a Kafka JIRA. Just curious whether there is a general-purpose performance workload for Kafka connectors (HDFS, S3, etc.). Then we can set up a standard and evaluate the performance of further connectors, such as Swift. Please feel free to comment, or mark this as a duplicate if there is already a JIRA tracking it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5473) handle ZK session expiration properly when a new session can't be established
[ https://issues.apache.org/jira/browse/KAFKA-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063437#comment-16063437 ] Jun Rao commented on KAFKA-5473: [~prasincs], there are a couple cases to consider. (1) This is the case that the broker knows for sure that it's not registered with ZK. In this case, it seems failing the broker is better since from the ZK server's perspective, the broker is down and failing the broker will make the behavior of the broker consistent with what's in ZK server. This is the issue that this particular jira is trying to solve. I think we can just wait up to zk.connection.time.ms and do a clean shutdown. (2) There is another case that the broker is partitioned off from ZK server. In this case, ZK server may have expired the session of the broker. However, until the network connection is back, the broker doesn't know that its session has expired. In that mode, currently the broker doesn't shut down and just wait until the network connection is back. > handle ZK session expiration properly when a new session can't be established > - > > Key: KAFKA-5473 > URL: https://issues.apache.org/jira/browse/KAFKA-5473 > Project: Kafka > Issue Type: Sub-task >Affects Versions: 0.9.0.0 >Reporter: Jun Rao >Assignee: Prasanna Gautam > > In https://issues.apache.org/jira/browse/KAFKA-2405, we change the logic in > handling ZK session expiration a bit. If a new ZK session can't be > established after session expiration, we just log an error and continue. > However, this can leave the broker in a bad state since it's up, but not > registered from the controller's perspective. Replicas on this broker may > never to be in sync. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-3554) Generate actual data with specific compression ratio and add multi-thread support in the ProducerPerformance tool.
[ https://issues.apache.org/jira/browse/KAFKA-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063452#comment-16063452 ] Chen He commented on KAFKA-3554: Thank you for the quick reply [~becket_qin]. This work is really valuable. It gives us a tool that can explore a Kafka system's capacity: for example, we can get the lowest latency by using only 1 thread, and by increasing the thread count we can find the maximum throughput of a Kafka cluster. Only one question: I applied this patch to the latest Kafka and compared results with the old ProducerPerformance.java file. I found that if we set ack=all with snappy compression, with 100M records (100B each), it does not perform as well as the old ProducerPerformance.java file. > Generate actual data with specific compression ratio and add multi-thread > support in the ProducerPerformance tool. > -- > > Key: KAFKA-3554 > URL: https://issues.apache.org/jira/browse/KAFKA-3554 > Project: Kafka > Issue Type: Improvement >Affects Versions: 0.9.0.1 >Reporter: Jiangjie Qin >Assignee: Jiangjie Qin > Fix For: 0.11.1.0 > > > Currently the ProducerPerformance always generate the payload with same > bytes. This does not quite well to test the compressed data because the > payload is extremely compressible no matter how big the payload is. > We can make some changes to make it more useful for compressed messages. > Currently I am generating the payload containing integer from a given range. > By adjusting the range of the integers, we can get different compression > ratios. > API wise, we can either let user to specify the integer range or the expected > compression ratio (we will do some probing to get the corresponding range for > the users) > Besides that, in many cases, it is useful to have multiple producer threads > when the producer threads themselves are bottleneck. 
Admittedly people can > run multiple ProducerPerformance to achieve similar result, but it is still > different from the real case when people actually use the producer. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
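The integer-range idea described in the issue reads like this in code. A sketch under the assumption that a narrower value range yields a more repetitive, hence more compressible, payload (this is not the actual patch):

```java
import java.util.Random;
import java.util.zip.Deflater;

public class PayloadGenerator {
    // Fill the payload with values drawn from [0, bound): the smaller the
    // bound, the more repetitive (and compressible) the payload becomes.
    static byte[] payload(int size, int bound, Random rnd) {
        byte[] p = new byte[size];
        for (int i = 0; i < size; i++) {
            p[i] = (byte) rnd.nextInt(bound);
        }
        return p;
    }

    // Deflate-compress the payload fully and return the compressed size,
    // as a cheap proxy for the compression ratio a producer would see.
    static int compressedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buf = new byte[input.length * 2 + 64];
        int n = deflater.deflate(buf);
        deflater.end();
        return n;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        int narrow = compressedSize(payload(100_000, 4, rnd));  // highly compressible
        int wide = compressedSize(payload(100_000, 256, rnd));  // near-incompressible
        System.out.println(narrow < wide);
    }
}
```

Probing for the bound that hits a target compression ratio, as the issue suggests, would then be a simple search over `bound` using `compressedSize`.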
[jira] [Commented] (KAFKA-5007) Kafka Replica Fetcher Thread- Resource Leak
[ https://issues.apache.org/jira/browse/KAFKA-5007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063456#comment-16063456 ] Joseph Aliase commented on KAFKA-5007: -- I believe that's the error we are seeing in the log. Let me reproduce the issue today. Will confirm thanks [~huxi_2b] > Kafka Replica Fetcher Thread- Resource Leak > --- > > Key: KAFKA-5007 > URL: https://issues.apache.org/jira/browse/KAFKA-5007 > Project: Kafka > Issue Type: Bug > Components: core, network >Affects Versions: 0.10.0.0, 0.10.1.1, 0.10.2.0 > Environment: Centos 7 > Jave 8 >Reporter: Joseph Aliase >Priority: Critical > Labels: reliability > Attachments: jstack-kafka.out, jstack-zoo.out, lsofkafka.txt, > lsofzookeeper.txt > > > Kafka is running out of open file descriptor when system network interface is > done. > Issue description: > We have a Kafka Cluster of 5 node running on version 0.10.1.1. The open file > descriptor for the account running Kafka is set to 10. > During an upgrade, network interface went down. Outage continued for 12 hours > eventually all the broker crashed with java.io.IOException: Too many open > files error. > We repeated the test in a lower environment and observed that Open Socket > count keeps on increasing while the NIC is down. > We have around 13 topics with max partition size of 120 and number of replica > fetcher thread is set to 8. > Using an internal monitoring tool we observed that Open Socket descriptor > for the broker pid continued to increase although NIC was down leading to > Open File descriptor error. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5413) Log cleaner fails due to large offset in segment file
[ https://issues.apache.org/jira/browse/KAFKA-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063514#comment-16063514 ] Nicholas Ngorok commented on KAFKA-5413: Thanks [~Kelvinrutt] for getting this solved very quickly! [~junrao] is there a timeline/plans for the 0.10.2.2 release? > Log cleaner fails due to large offset in segment file > - > > Key: KAFKA-5413 > URL: https://issues.apache.org/jira/browse/KAFKA-5413 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 0.10.2.0, 0.10.2.1 > Environment: Ubuntu 14.04 LTS, Oracle Java 8u92, kafka_2.11-0.10.2.0 >Reporter: Nicholas Ngorok >Assignee: Kelvin Rutt >Priority: Critical > Labels: reliability > Fix For: 0.11.0.0, 0.10.2.2 > > Attachments: .index.cleaned, > .log, .log.cleaned, > .timeindex.cleaned, 002147422683.log, > kafka-5413.patch > > > The log cleaner thread in our brokers is failing with the trace below > {noformat} > [2017-06-08 15:49:54,822] INFO {kafka-log-cleaner-thread-0} Cleaner 0: > Cleaning segment 0 in log __consumer_offsets-12 (largest timestamp Thu Jun 08 > 15:48:59 PDT 2017) into 0, retaining deletes. (kafka.log.LogCleaner) > [2017-06-08 15:49:54,822] INFO {kafka-log-cleaner-thread-0} Cleaner 0: > Cleaning segment 2147343575 in log __consumer_offsets-12 (largest timestamp > Thu Jun 08 15:49:06 PDT 2017) into 0, retaining deletes. > (kafka.log.LogCleaner) > [2017-06-08 15:49:54,834] ERROR {kafka-log-cleaner-thread-0} > [kafka-log-cleaner-thread-0], Error due to (kafka.log.LogCleaner) > java.lang.IllegalArgumentException: requirement failed: largest offset in > message set can not be safely converted to relative offset. 
> at scala.Predef$.require(Predef.scala:224) > at kafka.log.LogSegment.append(LogSegment.scala:109) > at kafka.log.Cleaner.cleanInto(LogCleaner.scala:478) > at > kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:405) > at > kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:401) > at scala.collection.immutable.List.foreach(List.scala:381) > at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:401) > at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:363) > at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:362) > at scala.collection.immutable.List.foreach(List.scala:381) > at kafka.log.Cleaner.clean(LogCleaner.scala:362) > at > kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:241) > at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:220) > at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63) > [2017-06-08 15:49:54,835] INFO {kafka-log-cleaner-thread-0} > [kafka-log-cleaner-thread-0], Stopped (kafka.log.LogCleaner) > {noformat} > This seems to point at the specific line [here| > https://github.com/apache/kafka/blob/0.11.0/core/src/main/scala/kafka/log/LogSegment.scala#L92] > in the kafka src where the difference is actually larger than MAXINT as both > baseOffset and offset are of type long. 
It was introduced in this [pr| > https://github.com/apache/kafka/pull/2210/files/56d1f8196b77a47b176b7bbd1e4220a3be827631] > These were the outputs of dumping the first two log segments > {noformat} > :~$ /usr/bin/kafka-run-class kafka.tools.DumpLogSegments --deep-iteration > --files /kafka-logs/__consumer_offsets-12/000 > 0.log > Dumping /kafka-logs/__consumer_offsets-12/.log > Starting offset: 0 > offset: 1810054758 position: 0 NoTimestampType: -1 isvalid: true payloadsize: > -1 magic: 0 compresscodec: NONE crc: 3127861909 keysize: 34 > :~$ /usr/bin/kafka-run-class kafka.tools.DumpLogSegments --deep-iteration > --files /kafka-logs/__consumer_offsets-12/000 > 0002147343575.log > Dumping /kafka-logs/__consumer_offsets-12/002147343575.log > Starting offset: 2147343575 > offset: 2147539884 position: 0 NoTimestampType: -1 isvalid: true paylo > adsize: -1 magic: 0 compresscodec: NONE crc: 2282192097 keysize: 34 > {noformat} > My guess is that since 2147539884 is larger than MAXINT, we are hitting this > exception. Was there a specific reason, this check was added in 0.10.2? > E.g. if the first offset is a key = "key 0" and then we have MAXINT + 1 of > "key 1" following, wouldn't we run into this situation whenever the log > cleaner runs? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-3465) kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode
[ https://issues.apache.org/jira/browse/KAFKA-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063569#comment-16063569 ] Colin P. McCabe commented on KAFKA-3465: Should we rename the tool or add documentation to make it clear, when people find this class, that it is only for the old consumer? > kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode > -- > > Key: KAFKA-3465 > URL: https://issues.apache.org/jira/browse/KAFKA-3465 > Project: Kafka > Issue Type: Improvement > Components: core >Affects Versions: 0.9.0.0 >Reporter: BrianLing > > 1. When we enable MirrorMaker to migrate Kafka events from one cluster to another in > "new.consumer" mode: > java -Xmx2G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 > -XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC > -Djava.awt.headless=true -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Dkafka.logs.dir=/kafka/kafka-app-logs > -Dlog4j.configuration=file:/kafka/kafka_2.10-0.9.0.0/bin/../config/tools-log4j.properties > -cp :/kafka/kafka_2.10-0.9.0.0/bin/../libs/* > -Dkafka.logs.filename=lvs-slca-mm.log kafka.tools.MirrorMaker lvs-slca-mm.log > --consumer.config ../config/consumer.properties --new.consumer --num.streams > 4 --producer.config ../config/producer-slca.properties --whitelist risk.* > 2. When we use the ConsumerOffsetChecker tool, we notice that the lag doesn't change and > the owner is none. > bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --broker-info > --group lvs.slca.mirrormaker --zookeeper lvsdmetlvm01.lvs.paypal.com:2181 > --topic > Group Topic Pid Offset logSize > Lag Owner > lvs.slca.mirrormaker 0 418578332 418678347 100015 > none > lvs.slca.mirrormaker 1 418598026 418698338 100312 > none > [Root Cause] > I think it's due to the 0.9.0 new feature that switches the zookeeper dependency to Kafka > internals to store offset & consumer owner information.
> Does it mean we cannot use the command below to check the new consumer's > lag, since the current lag formula is: lag = logSize – offset > https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala#L80 > > https://github.com/apache/kafka/blob/0.9.0/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala#L174-L182 > => the offset is fetched from ZooKeeper instead of from Kafka -- This message was sent by Atlassian JIRA (v6.4.14#64029)
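The lag formula quoted above is just the difference between the log end offset (logSize) and the committed offset. A minimal sketch of that computation, checked against the numbers in the pasted checker output (the class name is hypothetical, not part of the tool):

```java
public class LagDemo {
    // lag = logSize - offset, as in the ConsumerOffsetChecker formula quoted above.
    static long lag(long logSize, long offset) {
        return logSize - offset;
    }

    public static void main(String[] args) {
        // Partition 0 from the output above: logSize 418678347, committed offset 418578332.
        System.out.println(lag(418678347L, 418578332L)); // prints 100015, matching the Lag column
    }
}
```

The stale lag reported in the issue follows directly: the old tool reads the committed offset from ZooKeeper, where the new consumer never commits, so the offset term never advances.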
[jira] [Commented] (KAFKA-3806) Adjust default values of log.retention.hours and offsets.retention.minutes
[ https://issues.apache.org/jira/browse/KAFKA-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063579#comment-16063579 ] Jun Rao commented on KAFKA-3806: [~shangdi], good point on the issue with mirrormaker. I have no objection to increasing the default offsets.retention. However, as long as offsets.retention is not infinite, it seems that the issue with mirrormaker can still happen, i.e., the consumer in MM is still running, but no new offsets are updated since no new data is coming. For that case, it seems that we need a different solution (e.g. tracking when an offset is obsolete based on the activity of the group). > Adjust default values of log.retention.hours and offsets.retention.minutes > -- > > Key: KAFKA-3806 > URL: https://issues.apache.org/jira/browse/KAFKA-3806 > Project: Kafka > Issue Type: Improvement > Components: config >Affects Versions: 0.9.0.1, 0.10.0.0 >Reporter: Michal Turek >Priority: Minor > > The combination of the default values of log.retention.hours (168 hours = 7 days) and > offsets.retention.minutes (1440 minutes = 1 day) may be dangerous in special > cases. Offset retention should always be greater than log retention. > We have observed the following scenario and issue: > - Producing of data to a topic was disabled two days ago by a producer update; the > topic wasn't deleted. > - The consumer consumed all data and properly committed offsets to Kafka. > - The consumer made no more offset commits for that topic because there was no > more incoming data and there was nothing to confirm. (We have auto-commit > disabled; I'm not sure how enabled auto-commit behaves.) > - After one day: Kafka cleared the too-old offsets according to > offsets.retention.minutes. > - After two days: The long-running consumer was restarted after an update; it > didn't find any committed offsets for that topic, since they were deleted per > offsets.retention.minutes, so it started consuming from the beginning.
> - The messages were still in Kafka due to the larger log.retention.hours; about 5 > days of messages were read again. > Known workaround to solve this issue: > - Explicitly configure log.retention.hours and offsets.retention.minutes; > don't use the defaults. > Proposals: > - Prolong the default value of offsets.retention.minutes to be at least twice > as large as log.retention.hours. > - Check these values during Kafka startup and log a warning if > offsets.retention.minutes is smaller than log.retention.hours. > - Add a note to the migration guide about differences between storing offsets > in ZooKeeper and in Kafka (http://kafka.apache.org/documentation.html#upgrade). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
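The startup-check proposal above amounts to a simple unit comparison between the two retention settings. A minimal sketch of that check (a hypothetical helper, not actual broker code):

```java
public class RetentionCheck {
    // Returns true when the proposed startup warning should fire, i.e. when
    // committed offsets can expire while the corresponding log data is still retained.
    static boolean shouldWarn(long logRetentionHours, long offsetsRetentionMinutes) {
        return offsetsRetentionMinutes < logRetentionHours * 60L;
    }

    public static void main(String[] args) {
        // The defaults discussed in the issue: 168 hours (7 days) vs 1440 minutes (1 day).
        System.out.println(shouldWarn(168L, 1440L)); // prints true: offsets expire before the log
    }
}
```

With the proposed doubling (offsets retained for 14 days, i.e. 20160 minutes, against the 7-day log default), the check passes and no warning would be logged.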
[jira] [Created] (KAFKA-5519) Support for multiple certificates in a single keystore
Alla Tumarkin created KAFKA-5519: Summary: Support for multiple certificates in a single keystore Key: KAFKA-5519 URL: https://issues.apache.org/jira/browse/KAFKA-5519 Project: Kafka Issue Type: New Feature Components: security Affects Versions: 0.10.2.1 Reporter: Alla Tumarkin Background Currently, we need to have a keystore exclusive to the component, with exactly one key in it. Looking at the JSSE Reference Guide, it seems we would need to introduce our own KeyManager into the SSLContext which selects a configurable key alias name. https://docs.oracle.com/javase/7/docs/api/javax/net/ssl/X509KeyManager.html has methods for dealing with aliases. The goal here is to use a specific certificate (with proper ACLs set for this client), and not just the first one that matches. It looks like this requires a code change to the SSLChannelBuilder. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
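The custom KeyManager idea described above could be sketched as a wrapper that delegates everything except alias selection. This is only an illustration of the JSSE mechanism, not a proposed Kafka patch; the class name and the fixed-alias policy are assumptions:

```java
import java.net.Socket;
import java.security.Principal;
import java.security.PrivateKey;
import java.security.cert.X509Certificate;
import javax.net.ssl.X509ExtendedKeyManager;

/** Sketch: wraps an existing key manager and always selects a configured alias. */
public class AliasSelectingKeyManager extends X509ExtendedKeyManager {
    private final X509ExtendedKeyManager delegate;
    private final String alias;

    public AliasSelectingKeyManager(X509ExtendedKeyManager delegate, String alias) {
        this.delegate = delegate;
        this.alias = alias;
    }

    @Override
    public String chooseClientAlias(String[] keyType, Principal[] issuers, Socket socket) {
        return alias; // override the default "first matching key" behaviour
    }

    @Override
    public String chooseServerAlias(String keyType, Principal[] issuers, Socket socket) {
        return alias;
    }

    // All remaining lookups are delegated unchanged.
    @Override
    public String[] getClientAliases(String keyType, Principal[] issuers) {
        return delegate.getClientAliases(keyType, issuers);
    }

    @Override
    public String[] getServerAliases(String keyType, Principal[] issuers) {
        return delegate.getServerAliases(keyType, issuers);
    }

    @Override
    public X509Certificate[] getCertificateChain(String alias) {
        return delegate.getCertificateChain(alias);
    }

    @Override
    public PrivateKey getPrivateKey(String alias) {
        return delegate.getPrivateKey(alias);
    }
}
```

An instance of this class would be passed to `SSLContext.init(...)` in place of the key manager returned by the default `KeyManagerFactory`, which is presumably where the SSLChannelBuilder change would go.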
[jira] [Commented] (KAFKA-3465) kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode
[ https://issues.apache.org/jira/browse/KAFKA-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063618#comment-16063618 ] Vahid Hashemian commented on KAFKA-3465: [~cmccabe] I have opened a PR to update the relevant documentation [here|https://github.com/apache/kafka/pull/3405]. Would that address your concern? Thanks. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-3465) kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode
[ https://issues.apache.org/jira/browse/KAFKA-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063629#comment-16063629 ] Colin P. McCabe commented on KAFKA-3465: What about adding a comment in {{ConsumerOffsetChecker.scala}}? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-3465) kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode
[ https://issues.apache.org/jira/browse/KAFKA-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063675#comment-16063675 ] Vahid Hashemian commented on KAFKA-3465: Yeah, we could improve the warning message of the tool. I can submit a PR for it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KAFKA-5520) Extend Consumer Group Reset Offset tool for Stream Applications
Jorge Quilcate created KAFKA-5520: - Summary: Extend Consumer Group Reset Offset tool for Stream Applications Key: KAFKA-5520 URL: https://issues.apache.org/jira/browse/KAFKA-5520 Project: Kafka Issue Type: Improvement Components: core, tools Reporter: Jorge Quilcate Fix For: 0.11.1.0 KIP documentation: TODO -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KAFKA-5520) Extend Consumer Group Reset Offset tool for Stream Applications
[ https://issues.apache.org/jira/browse/KAFKA-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Quilcate updated KAFKA-5520: -- Description: KIP documentation: https://cwiki.apache.org/confluence/display/KAFKA/KIP-171+-+Extend+Consumer+Group+Reset+Offset+for+Stream+Application (was: KIP documentation: TODO) > Extend Consumer Group Reset Offset tool for Stream Applications > --- > > Key: KAFKA-5520 > URL: https://issues.apache.org/jira/browse/KAFKA-5520 > Project: Kafka > Issue Type: Improvement > Components: core, tools >Reporter: Jorge Quilcate > Labels: kip > Fix For: 0.11.1.0 > > > KIP documentation: > https://cwiki.apache.org/confluence/display/KAFKA/KIP-171+-+Extend+Consumer+Group+Reset+Offset+for+Stream+Application -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-3465) kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode
[ https://issues.apache.org/jira/browse/KAFKA-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063725#comment-16063725 ] ASF GitHub Bot commented on KAFKA-3465: --- GitHub user vahidhashemian opened a pull request: https://github.com/apache/kafka/pull/3438 KAFKA-3465: Clarify warning message of ConsumerOffsetChecker Add that the tool works with the old consumer only. You can merge this pull request into a Git repository by running: $ git pull https://github.com/vahidhashemian/kafka KAFKA-3465 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3438.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3438 commit fa68749267b40d156b3811368e2f49c5c8813a43 Author: Vahid Hashemian Date: 2017-06-26T20:24:46Z KAFKA-3465: Clarify warning message of ConsumerOffsetChecker Add that the tool works with the old consumer only. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-4849) Bug in KafkaStreams documentation
[ https://issues.apache.org/jira/browse/KAFKA-4849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063739#comment-16063739 ] Guozhang Wang commented on KAFKA-4849: -- We have resolved all the reported issues in this JIRA except the one in web docs, which is covered in KAFKA-4705 already. Resolving this ticket for now. > Bug in KafkaStreams documentation > - > > Key: KAFKA-4849 > URL: https://issues.apache.org/jira/browse/KAFKA-4849 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.10.2.0 >Reporter: Seweryn Habdank-Wojewodzki >Assignee: Matthias J. Sax >Priority: Minor > > At the page: https://kafka.apache.org/documentation/streams > > In the chapter titled Application Configuration and Execution, in the example > there is a line: > > settings.put(StreamsConfig.ZOOKEEPER_CONNECT_CONFIG, "zookeeper1:2181"); > > but ZOOKEEPER_CONNECT_CONFIG is deprecated in the Kafka version 0.10.2.0. > > Also the table on the page: > https://kafka.apache.org/0102/documentation/#streamsconfigs is a bit > misleading. > 1. Again zookeeper.connect is deprecated. > 2. The client.id and zookeeper.connect are marked by high importance, > but according to http://docs.confluent.io/3.2.0/streams/developer-guide.html > none of them are important to initialize the stream. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
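For reference, a minimal sketch of a 0.10.2-era Streams configuration without the deprecated zookeeper.connect setting; the application id and broker address are placeholder values, and plain string keys are used here instead of the StreamsConfig constants:

```java
import java.util.Properties;

public class StreamsConfigSketch {
    public static Properties minimalSettings() {
        Properties settings = new Properties();
        // The two settings needed to initialize a stream: an application id and the brokers.
        settings.put("application.id", "my-streams-app");   // placeholder value
        settings.put("bootstrap.servers", "broker1:9092");  // placeholder value
        // zookeeper.connect is deprecated as of 0.10.2.0 and intentionally omitted.
        return settings;
    }
}
```

In real code these keys would normally be referenced via `StreamsConfig.APPLICATION_ID_CONFIG` and `StreamsConfig.BOOTSTRAP_SERVERS_CONFIG` rather than string literals.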
[jira] [Commented] (KAFKA-3806) Adjust default values of log.retention.hours and offsets.retention.minutes
[ https://issues.apache.org/jira/browse/KAFKA-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063798#comment-16063798 ] James Cheng commented on KAFKA-3806: [~junrao], your example of "track when an offset is obsolete based on the activity of the group" is being tracked in https://issues.apache.org/jira/browse/KAFKA-4682 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5464) StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG
[ https://issues.apache.org/jira/browse/KAFKA-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063937#comment-16063937 ] ASF GitHub Bot commented on KAFKA-5464: --- GitHub user mjsax opened a pull request: https://github.com/apache/kafka/pull/3439 KAFKA-5464: StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG You can merge this pull request into a Git repository by running: $ git pull https://github.com/mjsax/kafka kafka-5464-streamskafkaclient-poll Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/3439.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3439 commit 609ceeb0d546f02b9da2638545784f050dc9558a Author: Matthias J. Sax Date: 2017-06-26T22:52:56Z KAFKA-5464: StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG > StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG > -- > > Key: KAFKA-5464 > URL: https://issues.apache.org/jira/browse/KAFKA-5464 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.10.2.1 >Reporter: Matthias J. Sax >Assignee: Matthias J. Sax > > In {{StreamsKafkaClient}} we use a {{NetworkClient}} internally and call > {{poll}} using {{StreamsConfig.POLL_MS_CONFIG}} as the timeout. > However, {{StreamsConfig.POLL_MS_CONFIG}} is solely meant to be applied to > {{KafkaConsumer.poll()}}, and it's incorrect to use it for the > {{NetworkClient}}. If the config is increased, client-side rebalance time increases as well, and thus the client > is not able to meet broker-enforced timeouts anymore, which can lead to an infinite > rebalance loop. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KAFKA-5464) StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG
[ https://issues.apache.org/jira/browse/KAFKA-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax updated KAFKA-5464: --- Fix Version/s: 0.11.0.1 0.10.2.2 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KAFKA-5464) StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG
[ https://issues.apache.org/jira/browse/KAFKA-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthias J. Sax updated KAFKA-5464: --- Fix Version/s: 0.11.1.0 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-4750) KeyValueIterator returns null values
[ https://issues.apache.org/jira/browse/KAFKA-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063945#comment-16063945 ] Guozhang Wang commented on KAFKA-4750: -- [~mjsax][~evis] [~mihbor] Thanks for your comments. I would like to think a bit more about the general resolution for this case, though, before reviewing [~evis]'s patch: 1. In Kafka messages, "null" byte arrays indicate tombstones; note that this means that if a user's serde decides to serialize some object into null for a log-compacted topic (e.g. a changelog topic of a state store), it is meant to delete the record from the store. 2. In Kafka Streams state stores, we did NOT enforce in the javadoc that "null" indicates deletion: {code} /** * Update the value associated with this key * * @param key The key to associate the value to * @param value The value, it can be null. * @throws NullPointerException If null is used for key. */ void put(K key, V value); {code} However, our implementation did treat a value-typed "null" (note it is not a "null" byte array as in serialized messages) as a deletion, since we implement {{delete(key)}} as {{put(key, null)}}. As Evgeny / Michal mentioned, it is intuitive if our {{put}} semantics align with Java's map operations: {code}
// store initialized as empty store.get(key); // returns null store.put(key, value); store.delete(key); store.get(key); // returns null store.put(key, value); store.put(key, null); // we can interpret it as "associate the key with null" or simply delete this key store.get(key); // returns null, though generally speaking it could indicate either that the key is associated with null or that the key does not exist {code} Now assume you have a customized serde that maps "null" objects to non-null byte arrays; in this case the above would still hold: {code} store.put(key, value); store.put(key, null); // now the "null" object is just a special value that does not indicate deletion store.get(key); // returns null, but this should be interpreted as "the key is associated with null" {code} Now assume you have a customized serde that maps some non-null object to "null" byte arrays; in this case that object is really interpreted as a dummy value, and the above still holds: {code} store.put(key, value); store.put(key, MY_DUMMY); // serialized into "null" byte arrays store.get(key); // returns MY_DUMMY, as "null" byte arrays are deserialized symmetrically {code} So I think if we want to allow the above customized interpretations, then we should not implement {{delete()}} as {{put(key, null)}}, since "null" objects may not indicate deletions; if we want to be more strict, then we should emphasize in the javadoc above that "@param value The value; it can be null, which indicates deletion of the key". WDYT? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
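To make the asymmetry discussed above concrete, here is a toy model in plain Java (not the actual Streams store code; all names are hypothetical) of a byte store where delete is implemented as put(key, null):

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class TombstoneDemo {
    static final Map<String, byte[]> store = new HashMap<>();

    // A symmetric serializer: null object -> null bytes, so the tombstone works.
    static byte[] safeSerialize(String v) {
        return v == null ? null : v.getBytes(StandardCharsets.UTF_8);
    }

    // An asymmetric serializer: null object -> empty (non-null) bytes.
    static byte[] unsafeSerialize(String v) {
        return (v == null ? "" : v).getBytes(StandardCharsets.UTF_8);
    }

    static String deserialize(byte[] bytes) {
        // Symmetric with unsafeSerialize: empty or null bytes read back as null.
        return (bytes == null || bytes.length == 0) ? null : new String(bytes, StandardCharsets.UTF_8);
    }

    // delete(key) is modelled as put(key, null), as in the store semantics discussed above.
    static void put(String key, String value, boolean safeSerde) {
        byte[] bytes = safeSerde ? safeSerialize(value) : unsafeSerialize(value);
        if (bytes == null) {
            store.remove(key);     // a real tombstone: the key/value pair is gone
        } else {
            store.put(key, bytes); // non-null bytes survive the intended "delete"
        }
    }

    public static void main(String[] args) {
        put("k", "v", true);
        put("k", null, true);      // safe serde: the key is really deleted
        boolean deleted = !store.containsKey("k");

        put("k", "v", false);
        put("k", null, false);     // unsafe serde: the key is still present...
        String value = deserialize(store.get("k")); // ...and its value reads back as null
        System.out.println(deleted + " " + (value == null)); // prints "true true"
    }
}
```

The second half of main reproduces the bug report: an iterator over the store would still yield "k", but with a null value.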
[jira] [Commented] (KAFKA-4750) KeyValueIterator returns null values
[ https://issues.apache.org/jira/browse/KAFKA-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063956#comment-16063956 ] Matthias J. Sax commented on KAFKA-4750: I think [~damianguy] or [~enothereska] can comment best on this. AFAIK, we use {{put(key,null)}} with delete semantics all over the place, also for {{KTable}} caches. As it aligns with changelog delete semantics, I also think it makes sense to keep it this way. I would rather educate users who plug in a Serde to not return {{null}} if the input is not {{null}}. We can also add checks to all {{Serde}} calls: (1) never call the Serde for {{null}}, as we know the result must be {{null}} anyway; (2) if we call the Serde with not-null, make sure it does not return {{null}} -- otherwise throw an exception. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
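The two checks suggested in that comment could be sketched as a small wrapper around the serializer call (the class and method names are hypothetical, not the actual Streams code):

```java
import java.util.function.Function;

public class CheckedSerde {
    // Wraps any serializer so that (1) it is never invoked for null input and
    // (2) a null result for non-null input fails fast instead of silently
    // producing an accidental tombstone.
    static <T> byte[] checkedSerialize(Function<T, byte[]> serializer, T value) {
        if (value == null) {
            return null; // check (1): the result must be null anyway, skip the serde
        }
        byte[] bytes = serializer.apply(value);
        if (bytes == null) {
            // check (2): a non-null value must not serialize to null
            throw new IllegalStateException("Serializer returned null for a non-null value");
        }
        return bytes;
    }
}
```

A misbehaving serde like the one in the bug report would then surface as an immediate exception at put() time, rather than as null values from a later iterator.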
[jira] [Resolved] (KAFKA-5367) Producer should not expire topic from metadata cache if accumulator still has data for this topic
[ https://issues.apache.org/jira/browse/KAFKA-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dong Lin resolved KAFKA-5367. - Resolution: Invalid > Producer should not expire topic from metadata cache if accumulator still has > data for this topic > - > > Key: KAFKA-5367 > URL: https://issues.apache.org/jira/browse/KAFKA-5367 > Project: Kafka > Issue Type: Bug >Reporter: Dong Lin >Assignee: Dong Lin > > To be added. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KAFKA-5521) Support replicas movement between log directories (KIP-113)
Dong Lin created KAFKA-5521: --- Summary: Support replicas movement between log directories (KIP-113) Key: KAFKA-5521 URL: https://issues.apache.org/jira/browse/KAFKA-5521 Project: Kafka Issue Type: Bug Reporter: Dong Lin Assignee: Dong Lin -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KAFKA-5522) ListOffset should take LSO into account when searching by timestamp
Jason Gustafson created KAFKA-5522: -- Summary: ListOffset should take LSO into account when searching by timestamp Key: KAFKA-5522 URL: https://issues.apache.org/jira/browse/KAFKA-5522 Project: Kafka Issue Type: Sub-task Affects Versions: 0.11.0.0 Reporter: Jason Gustafson Assignee: Jason Gustafson Fix For: 0.11.0.1 For a normal read_uncommitted consumer, we bound the offset returned from ListOffsets by the high watermark. For read_committed consumers, we should similarly bound offsets by the LSO. Currently we only handle the case of fetching the end offset. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
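The intended bounding behaviour can be sketched roughly as follows. This is a simplification for illustration only, not the actual broker code; the method and class names are hypothetical:

```java
public class ListOffsetBound {
    // For read_uncommitted consumers the lookup result is capped at the high
    // watermark; for read_committed consumers it should instead be capped at
    // the last stable offset (LSO).
    static long boundLookup(long foundOffset, long highWatermark, long lso, boolean readCommitted) {
        long upperBound = readCommitted ? lso : highWatermark;
        return Math.min(foundOffset, upperBound);
    }
}
```

The issue notes that the existing code applies the LSO bound only when fetching the end offset, not for the general search-by-timestamp path that this sketch covers.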
[jira] [Commented] (KAFKA-5153) KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting
[ https://issues.apache.org/jira/browse/KAFKA-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064176#comment-16064176 ] Mahesh Veerabathiran commented on KAFKA-5153: - Having the same issue. Does anyone have a resolution? > KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting > --- > > Key: KAFKA-5153 > URL: https://issues.apache.org/jira/browse/KAFKA-5153 > Project: Kafka > Issue Type: Bug >Affects Versions: 0.10.2.0 > Environment: RHEL 6 > Java Version 1.8.0_91-b14 >Reporter: Arpan >Priority: Critical > Attachments: server_1_72server.log, server_2_73_server.log, > server_3_74Server.log, server.properties, ThreadDump_1493564142.dump, > ThreadDump_1493564177.dump, ThreadDump_1493564249.dump > > > Hi Team, > I was earlier referring to issue KAFKA-4477 because the problem I am facing > is similar. I tried to search for the same reference in the release docs as well but > did not find anything in 0.10.1.1 or 0.10.2.0. I am currently using > 2.11_0.10.2.0. > I have a 3-node cluster for Kafka and a cluster for ZK as well on the same set > of servers, both in cluster mode. We have around 240GB of data transferred > through Kafka every day. What we observe is servers disconnecting from the > cluster and the ISR shrinking, which starts impacting service. > I have also observed the file descriptor count increasing a bit; in normal > circumstances we have not observed an FD count of more than 500, but when the issue > started we observed it in the range of 650-700 on all 3 servers. > Attaching thread dumps of all 3 servers from when we started facing the issue > recently. > The issue vanishes once you bounce the nodes, and the setup does not run > more than 5 days without this issue. Attaching server logs as well. > Kindly let me know if you need any additional information. Attaching > server.properties as well for one of the servers (it's similar on all 3 > servers) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-4344) Exception when accessing partition, offset and timestamp in processor class
[ https://issues.apache.org/jira/browse/KAFKA-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064379#comment-16064379 ] Nishkam Ravi commented on KAFKA-4344: - [~guozhang] We are encountering the same error (for topic()). The code is written in Scala and is launched using sbt (Spring isn't involved). Here's the code sketch:

class GenericProcessor[T <: ThriftStruct](serDe: ThriftStructSerDe[T], decrypt: Boolean, config: Config)
    extends AbstractProcessor[Array[Byte], Array[Byte]] with LazyLogging {

  private var hsmClient: HSMClient = _

  override def init(processorContext: ProcessorContext): Unit = {
    super.init(processorContext)
    hsmClient = HSMClient(config).getOrElse(null)
  }

  override def process(key: Array[Byte], value: Array[Byte]): Unit = {
    val topic: String = this.context.topic() // exception thrown here
    val partition: Int = this.context.partition()
    val offset: Long = this.context.offset()
    val timestamp: Long = this.context.timestamp()
    // business logic
  }
}

The exception is thrown only in the multi-consumer case (when the number of partitions for a topic is > 1 and parallelism is > 1). This should be easy to reproduce. > Exception when accessing partition, offset and timestamp in processor class > --- > > Key: KAFKA-4344 > URL: https://issues.apache.org/jira/browse/KAFKA-4344 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.10.1.0 >Reporter: saiprasad mishra >Assignee: Guozhang Wang >Priority: Minor > > I have a Kafka Streams pipeline like below > source topic stream -> filter for null value -> map to make it keyed by id > -> custom processor to mystore -> to another topic -> ktable > I am hitting the below type of exception in a custom processor class if I try > to access offset() or partition() or timestamp() from the ProcessorContext in > the process() method.
I was hoping it would return the partition and offset > for the enclosing topic (in this case the source topic) where it's consuming from, > or -1, based on the API docs. > java.lang.IllegalStateException: This should not happen as offset() should > only be called while a record is processed > at > org.apache.kafka.streams.processor.internals.ProcessorContextImpl.offset(ProcessorContextImpl.java:181) > ~[kafka-streams-0.10.1.0.jar!/:?] > at com.sai.repo.MyStore.process(MyStore.java:72) ~[classes!/:?] > at com.sai.repo.MyStore.process(MyStore.java:39) ~[classes!/:?] > at > org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:82) > ~[kafka-streams-0.10.1.0.jar!/:?] > at > org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:204) > ~[kafka-streams-0.10.1.0.jar!/:?] > at > org.apache.kafka.streams.kstream.internals.KStreamMap$KStreamMapProcessor.process(KStreamMap.java:43) > ~[kafka-streams-0.10.1.0.jar!/:?] > at > org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:82) > ~[kafka-streams-0.10.1.0.jar!/:?] > at > org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:204) > ~[kafka-streams-0.10.1.0.jar!/:?] > at > org.apache.kafka.streams.kstream.internals.KStreamFilter$KStreamFilterProcessor.process(KStreamFilter.java:44) > ~[kafka-streams-0.10.1.0.jar!/:?] > at > org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:82) > ~[kafka-streams-0.10.1.0.jar!/:?] > at > org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:204) > ~[kafka-streams-0.10.1.0.jar!/:?] > at > org.apache.kafka.streams.processor.internals.SourceNode.process(SourceNode.java:66) > ~[kafka-streams-0.10.1.0.jar!/:?] > at > org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:181) > ~[kafka-streams-0.10.1.0.jar!/:?]
> at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:436) > ~[kafka-streams-0.10.1.0.jar!/:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:242) > [kafka-streams-0.10.1.0.jar!/:?] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
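The guard that produces the exception in the stack trace above can be shown in miniature. This is an illustrative Java sketch only, mirroring the behavior the message describes, not Kafka's actual ProcessorContextImpl: record metadata is defined only while a record is in flight, so topic()/partition()/offset()/timestamp() must be called inside process(), never cached from init().

```java
// Minimal illustration of the guard described above -- not Kafka code.
// Metadata accessors throw when no record is currently being processed.
final class SketchProcessorContext {
    private String currentTopic; // null except while a record is in flight

    void recordStarted(String topic) { currentTopic = topic; }
    void recordFinished()            { currentTopic = null; }

    String topic() {
        if (currentTopic == null) {
            throw new IllegalStateException(
                "This should not happen as topic() should only be called while a record is processed");
        }
        return currentTopic;
    }
}
```

In this model, the Scala sketch above is doing the right thing by reading the values inside process(); the reported failure in the multi-partition case suggests the real context's record state is not set when expected, which is the bug under discussion.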