[jira] [Assigned] (KAFKA-4388) Connect key and value converters are listed without default values

2017-06-26 Thread Evgeny Veretennikov (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Evgeny Veretennikov reassigned KAFKA-4388:
--

Assignee: Evgeny Veretennikov

> Connect key and value converters are listed without default values
> --
>
> Key: KAFKA-4388
> URL: https://issues.apache.org/jira/browse/KAFKA-4388
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.1.0
>Reporter: Ewen Cheslack-Postava
>Assignee: Evgeny Veretennikov
>Priority: Minor
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> KIP-75 added per connector converters. This exposes the settings on a 
> per-connector basis via the validation API. However, the way this is 
> specified for each connector is via a config value with no default value. 
> This means the validation API implies there is no setting unless you provide 
> one.
> It would be much better to include the default value extracted from the 
> WorkerConfig instead so it's clear you shouldn't need to override the default.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5511) ConfigDef.define() overloads take too many parameters

2017-06-26 Thread Evgeny Veretennikov (JIRA)
Evgeny Veretennikov created KAFKA-5511:
--

 Summary: ConfigDef.define() overloads take too many parameters
 Key: KAFKA-5511
 URL: https://issues.apache.org/jira/browse/KAFKA-5511
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Reporter: Evgeny Veretennikov
Priority: Minor


A builder pattern can be helpful to get rid of all these {{define()}} overloads. 
I think it's better to create some {{ConfigKeyBuilder}} class to construct 
keys.
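
For illustration, a minimal sketch of what such a builder might look like. {{ConfigKeyBuilder}} and its method names are hypothetical, not an existing Kafka API; it only collects the settings and then delegates to one of the existing {{define()}} overloads:

{code}
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.common.config.ConfigDef.Importance;
import org.apache.kafka.common.config.ConfigDef.Type;

// Hypothetical fluent builder; names are illustrative only.
public final class ConfigKeyBuilder {
    private final String name;
    private Type type = Type.STRING;
    private Object defaultValue = ConfigDef.NO_DEFAULT_VALUE;
    private Importance importance = Importance.MEDIUM;
    private String documentation = "";

    private ConfigKeyBuilder(String name) { this.name = name; }

    public static ConfigKeyBuilder key(String name) { return new ConfigKeyBuilder(name); }

    public ConfigKeyBuilder type(Type type) { this.type = type; return this; }
    public ConfigKeyBuilder defaultValue(Object value) { this.defaultValue = value; return this; }
    public ConfigKeyBuilder importance(Importance importance) { this.importance = importance; return this; }
    public ConfigKeyBuilder documentation(String documentation) { this.documentation = documentation; return this; }

    // Register the collected settings through one of the existing define() overloads.
    public ConfigDef defineIn(ConfigDef def) {
        return def.define(name, type, defaultValue, importance, documentation);
    }
}
{code}

Usage could then read, for example, {{ConfigKeyBuilder.key("retries").type(Type.INT).defaultValue(0).importance(Importance.HIGH).documentation("...").defineIn(def)}} instead of picking the matching overload by hand.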



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-4750) KeyValueIterator returns null values

2017-06-26 Thread Michal Borowiecki (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16062770#comment-16062770
 ] 

Michal Borowiecki commented on KAFKA-4750:
--

The Deserializer.deserialize javadoc says:
 * @param data serialized bytes; may be null; implementations are 
recommended to handle null by +returning a value+ or null rather than throwing 
an exception.

If it's fair for a deserializer to return a value when a null byte array is 
provided, then by symmetry the serializer should be able to return null when 
such an object is provided.

> KeyValueIterator returns null values
> 
>
> Key: KAFKA-4750
> URL: https://issues.apache.org/jira/browse/KAFKA-4750
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.1.1, 0.11.0.0, 0.10.2.1
>Reporter: Michal Borowiecki
>Assignee: Evgeny Veretennikov
>  Labels: newbie
> Attachments: DeleteTest.java
>
>
> The API for ReadOnlyKeyValueStore.range method promises the returned iterator 
> will not return null values. However, after upgrading from 0.10.0.0 to 
> 0.10.1.1 we found null values are returned causing NPEs on our side.
> I found this happens after removing entries from the store and I found 
> resemblance to SAMZA-94 defect. The problem seems to be as it was there, when 
> deleting entries and having a serializer that does not return null when null 
> is passed in, the state store doesn't actually delete that key/value pair but 
> the iterator will return null value for that key.
> When I modified our serializer to return null when null is passed in, the 
> problem went away. However, I believe this should be fixed in kafka streams, 
> perhaps with a similar approach as SAMZA-94.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (KAFKA-4750) KeyValueIterator returns null values

2017-06-26 Thread Michal Borowiecki (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16062770#comment-16062770
 ] 

Michal Borowiecki edited comment on KAFKA-4750 at 6/26/17 8:33 AM:
---

The Deserializer.deserialize javadoc says:
 * @param data serialized bytes; may be null; implementations are 
recommended to handle null by +returning a value+ or null rather than throwing 
an exception.

If it's fair for a deserializer to return a value when a null byte array is 
provided, then by symmetry the serializer should be able to return null when 
such an object is provided and achieve correct round-tripping.


was (Author: mihbor):
The Deserializer.deserialize javadoc says:
 * @param data serialized bytes; may be null; implementations are 
recommended to handle null by +returning a value+ or null rather than throwing 
an exception.

If it's fair for a deserializer to return a value when a null byte array is 
provided, then by symmetry the serializer should be able to return null when 
such an object is provided.

> KeyValueIterator returns null values
> 
>
> Key: KAFKA-4750
> URL: https://issues.apache.org/jira/browse/KAFKA-4750
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.1.1, 0.11.0.0, 0.10.2.1
>Reporter: Michal Borowiecki
>Assignee: Evgeny Veretennikov
>  Labels: newbie
> Attachments: DeleteTest.java
>
>
> The API for ReadOnlyKeyValueStore.range method promises the returned iterator 
> will not return null values. However, after upgrading from 0.10.0.0 to 
> 0.10.1.1 we found null values are returned causing NPEs on our side.
> I found this happens after removing entries from the store and I found 
> resemblance to SAMZA-94 defect. The problem seems to be as it was there, when 
> deleting entries and having a serializer that does not return null when null 
> is passed in, the state store doesn't actually delete that key/value pair but 
> the iterator will return null value for that key.
> When I modified our serializer to return null when null is passed in, the 
> problem went away. However, I believe this should be fixed in kafka streams, 
> perhaps with a similar approach as SAMZA-94.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (KAFKA-4750) KeyValueIterator returns null values

2017-06-26 Thread Michal Borowiecki (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16062770#comment-16062770
 ] 

Michal Borowiecki edited comment on KAFKA-4750 at 6/26/17 8:42 AM:
---

The Deserializer.deserialize javadoc says:
 * @param data serialized bytes; may be null; implementations are 
recommended to handle null by +returning a value+ or null rather than throwing 
an exception.

If it's fair for a deserializer to return a value when a null byte array is 
provided, then by symmetry the serializer should be able to return null when 
such an object is provided and achieve correct round-tripping.

[~evis] In your example, whether it's intuitive or not depends, I guess, on 
people's intuition. To me it is intuitive, since null on a changelog topic is a 
tombstone marker. If I serialize a value to null, then to me it's intuitive 
that it's equivalent to deletion, so it should not be found by the following get.

An example use case would be if the value types are collections. In order to 
avoid dealing with nulls in the application layer, one might want to 
deserialize null into an empty collection and vice-versa. Would that be a valid 
use-case? I can't see why not, but I may be wrong.
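
For illustration only, a minimal sketch of such a collection serde, assuming list-of-string values; {{StringListSerde}} is a made-up class name:

{code}
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import org.apache.kafka.common.serialization.Deserializer;
import org.apache.kafka.common.serialization.Serializer;

// Illustrative serde: an empty list round-trips through null bytes and back.
public class StringListSerde implements Serializer<List<String>>, Deserializer<List<String>> {

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) { }

    @Override
    public byte[] serialize(String topic, List<String> data) {
        if (data == null || data.isEmpty())
            return null;                                   // empty collection becomes a tombstone
        return String.join(",", data).getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public List<String> deserialize(String topic, byte[] data) {
        if (data == null)
            return new ArrayList<>();                      // null bytes become an empty collection
        return new ArrayList<>(Arrays.asList(new String(data, StandardCharsets.UTF_8).split(",")));
    }

    @Override
    public void close() { }
}
{code}

With a serde like this, serializing an empty collection produces null, which is exactly the tombstone/deletion semantics discussed above.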


was (Author: mihbor):
The Deserializer.deserialize javadoc says:
 * @param data serialized bytes; may be null; implementations are 
recommended to handle null by +returning a value+ or null rather than throwing 
an exception.

If it's fair for a deserializer to return a value when a null byte array is 
provided, then by symmetry the serializer should be able to return null when 
such an object is provided and achieve correct round-tripping.

[~evis] In your example, whether it's intuitive or not depends, I guess, on 
people's intuition. To me it is intuitive, since null on a changelog topic is a 
tombstone marker. If I serialize a value to null, then it's intuitive that it's 
equivalent to deletion.

An example use case would be if the value types are collections. In order to 
avoid dealing with nulls in the application layer, one might want to 
deserialize null into an empty collection and vice-versa. Would that be a valid 
use-case? I can't see why not, but I may be wrong.

> KeyValueIterator returns null values
> 
>
> Key: KAFKA-4750
> URL: https://issues.apache.org/jira/browse/KAFKA-4750
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.1.1, 0.11.0.0, 0.10.2.1
>Reporter: Michal Borowiecki
>Assignee: Evgeny Veretennikov
>  Labels: newbie
> Attachments: DeleteTest.java
>
>
> The API for ReadOnlyKeyValueStore.range method promises the returned iterator 
> will not return null values. However, after upgrading from 0.10.0.0 to 
> 0.10.1.1 we found null values are returned causing NPEs on our side.
> I found this happens after removing entries from the store and I found 
> resemblance to SAMZA-94 defect. The problem seems to be as it was there, when 
> deleting entries and having a serializer that does not return null when null 
> is passed in, the state store doesn't actually delete that key/value pair but 
> the iterator will return null value for that key.
> When I modified our serializer to return null when null is passed in, the 
> problem went away. However, I believe this should be fixed in kafka streams, 
> perhaps with a similar approach as SAMZA-94.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (KAFKA-4750) KeyValueIterator returns null values

2017-06-26 Thread Michal Borowiecki (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16062770#comment-16062770
 ] 

Michal Borowiecki edited comment on KAFKA-4750 at 6/26/17 8:42 AM:
---

The Deserializer.deserialize javadoc says:
 * @param data serialized bytes; may be null; implementations are 
recommended to handle null by +returning a value+ or null rather than throwing 
an exception.

If it's fair for a deserializer to return a value when a null byte array is 
provided, then by symmetry the serializer should be able to return null when 
such an object is provided and achieve correct round-tripping.

[~evis] In your example, whether it's intuitive or not depends, I guess, on 
people's intuition. To me it is intuitive, since null on a changelog topic is a 
tombstone marker. If I serialize a value to null, then it's intuitive that it's 
equivalent to deletion.

An example use case would be if the value types are collections. In order to 
avoid dealing with nulls in the application layer, one might want to 
deserialize null into an empty collection and vice-versa. Would that be a valid 
use-case? I can't see why not, but I may be wrong.


was (Author: mihbor):
The Deserializer.deserialize javadoc says:
 * @param data serialized bytes; may be null; implementations are 
recommended to handle null by +returning a value+ or null rather than throwing 
an exception.

If it's fair for a deserializer to return a value when a null byte array is 
provided, then by symmetry the serializer should be able to return null when 
such an object is provided and achieve correct round-tripping.

> KeyValueIterator returns null values
> 
>
> Key: KAFKA-4750
> URL: https://issues.apache.org/jira/browse/KAFKA-4750
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.1.1, 0.11.0.0, 0.10.2.1
>Reporter: Michal Borowiecki
>Assignee: Evgeny Veretennikov
>  Labels: newbie
> Attachments: DeleteTest.java
>
>
> The API for ReadOnlyKeyValueStore.range method promises the returned iterator 
> will not return null values. However, after upgrading from 0.10.0.0 to 
> 0.10.1.1 we found null values are returned causing NPEs on our side.
> I found this happens after removing entries from the store and I found 
> resemblance to SAMZA-94 defect. The problem seems to be as it was there, when 
> deleting entries and having a serializer that does not return null when null 
> is passed in, the state store doesn't actually delete that key/value pair but 
> the iterator will return null value for that key.
> When I modified our serializer to return null when null is passed in, the 
> problem went away. However, I believe this should be fixed in kafka streams, 
> perhaps with a similar approach as SAMZA-94.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-4750) KeyValueIterator returns null values

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16062837#comment-16062837
 ] 

ASF GitHub Bot commented on KAFKA-4750:
---

GitHub user evis opened a pull request:

https://github.com/apache/kafka/pull/3430

KAFKA-4750: RocksDBStore always deletes null values

@guozhangwang @mjsax 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/evis/kafka rocksdbstore_null_values

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3430.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3430


commit 47b4b2b2a5fe8af4ca4a60c2cae97dcab4e7eec3
Author: Evgeny Veretennikov 
Date:   2017-06-26T09:21:35Z

KAFKA-4750: RocksDBStore always deletes null values

Deletion occurs even if serde serializes null value
to non-null byte array.
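
A toy sketch of the idea (not the actual patch): check the value object itself for null and delete, instead of relying on the serde to produce null bytes. The class and field names below are made up.

{code}
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy stand-in for a key-value store, illustrating only the null-value handling.
public class NullDeletingStore<K, V> {
    private final Map<K, byte[]> inner = new HashMap<>();
    private final Function<V, byte[]> valueSerializer;

    public NullDeletingStore(Function<V, byte[]> valueSerializer) {
        this.valueSerializer = valueSerializer;
    }

    public void put(K key, V value) {
        if (value == null) {
            inner.remove(key);                            // a null value object is a delete...
        } else {
            inner.put(key, valueSerializer.apply(value)); // ...decided before the serde ever runs
        }
    }
}
{code}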




> KeyValueIterator returns null values
> 
>
> Key: KAFKA-4750
> URL: https://issues.apache.org/jira/browse/KAFKA-4750
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.1.1, 0.11.0.0, 0.10.2.1
>Reporter: Michal Borowiecki
>Assignee: Evgeny Veretennikov
>  Labels: newbie
> Attachments: DeleteTest.java
>
>
> The API for ReadOnlyKeyValueStore.range method promises the returned iterator 
> will not return null values. However, after upgrading from 0.10.0.0 to 
> 0.10.1.1 we found null values are returned causing NPEs on our side.
> I found this happens after removing entries from the store and I found 
> resemblance to SAMZA-94 defect. The problem seems to be as it was there, when 
> deleting entries and having a serializer that does not return null when null 
> is passed in, the state store doesn't actually delete that key/value pair but 
> the iterator will return null value for that key.
> When I modified our serializer to return null when null is passed in, the 
> problem went away. However, I believe this should be fixed in kafka streams, 
> perhaps with a similar approach as SAMZA-94.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (KAFKA-5372) Unexpected state transition Dead to PendingShutdown

2017-06-26 Thread Eno Thereska (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eno Thereska reassigned KAFKA-5372:
---

Assignee: Eno Thereska

> Unexpected state transition Dead to PendingShutdown
> ---
>
> Key: KAFKA-5372
> URL: https://issues.apache.org/jira/browse/KAFKA-5372
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: Jason Gustafson
>Assignee: Eno Thereska
> Fix For: 0.11.1.0
>
>
> I often see this running integration tests:
> {code}
> [2017-06-02 15:36:03,411] WARN stream-thread 
> [appId-1-c382ef0a-adbd-422b-9717-9b2bc52b55eb-StreamThread-13] Unexpected 
> state transition from DEAD to PENDING_SHUTDOWN. 
> (org.apache.kafka.streams.processor.internals.StreamThread:976)
> {code}
> Maybe a race condition on shutdown or something?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5512) KafkaConsumer: High memory allocation rate when idle

2017-06-26 Thread Stephane Roset (JIRA)
Stephane Roset created KAFKA-5512:
-

 Summary: KafkaConsumer: High memory allocation rate when idle
 Key: KAFKA-5512
 URL: https://issues.apache.org/jira/browse/KAFKA-5512
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Affects Versions: 0.10.1.1
Reporter: Stephane Roset


Hi,

We noticed in our application that the memory allocation rate increased 
significantly when we have no Kafka messages to consume. We isolated the issue 
by using a JVM that simply runs 128 Kafka consumers. These consumers consume 
128 partitions (so each consumer consumes one partition). The partitions are 
empty and no message has been sent during the test. The consumers were 
configured with default values (session.timeout.ms=3, 
fetch.max.wait.ms=500, receive.buffer.bytes=65536, heartbeat.interval.ms=3000, 
max.poll.interval.ms=30, max.poll.records=500). The Kafka cluster was made 
of 3 brokers. Within this context, the allocation rate was about 55 MiB/s. This 
high allocation rate generates a lot of GC activity (to collect the young 
generation) and was an issue for our project.

We profiled the JVM with JProfiler. We noticed that there was a huge quantity 
of ArrayList$Itr instances in memory. These iterators were mainly instantiated 
by the handleCompletedReceives, handleCompletedSends, handleConnections and 
handleDisconnections methods of the NetworkClient class. We also noticed that 
we had a lot of calls to the pollOnce method of the KafkaConsumer class.

So we decided to run only one consumer and to profile the calls to the pollOnce 
method. We noticed that a huge number of calls is regularly made to this 
method, up to 268000 calls within 100ms. The pollOnce method calls the 
NetworkClient.handle* methods. These methods iterate over their collections 
even if they are empty, which explains the huge number of iterators in memory.
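
An illustrative sketch of the guard described in the first quick fix below; the class and method names are made up and this is not the actual NetworkClient code:

{code}
import java.util.List;

// Skip the enhanced-for loop (and its ArrayList$Itr allocation) when the list is empty.
final class CompletedReceivesHandler {

    void handleCompletedReceives(List<String> completedReceives) {
        if (completedReceives.isEmpty())
            return;                                   // no iterator is allocated on the idle path
        for (String receive : completedReceives) {
            process(receive);
        }
    }

    private void process(String receive) {
        // handle one completed receive
    }
}
{code}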

The large number of calls is related to the heartbeat mechanism. The pollOnce 
method calculates the poll timeout; if a heartbeat needs to be sent, the 
timeout will be set to 0. The problem is that the heartbeat thread only checks 
every 100 ms (the default value of retry.backoff.ms) whether a heartbeat should 
be sent, so the KafkaConsumer will call the poll method in a loop with no 
timeout until the heartbeat thread wakes up. For example: if the heartbeat 
thread has just started to wait and will wake up in 99 ms, then during those 
99 ms the KafkaConsumer will call pollOnce in a loop with a timeout of 0. That 
explains how we can have 268000 calls within 100ms.

The heartbeat thread calls AbstractCoordinator.wait() to sleep, so I think the 
Kafka consumer should wake the heartbeat thread with a notify when needed.
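
A generic illustration of that wait/notify idea, not the coordinator code; all names are made up:

{code}
// The polling thread wakes the sleeping heartbeat thread instead of spinning with a zero timeout.
final class HeartbeatSignal {
    private final Object lock = new Object();

    // Called from the heartbeat thread between checks (backoffMs ~ retry.backoff.ms).
    void awaitNextCheck(long backoffMs) throws InterruptedException {
        synchronized (lock) {
            lock.wait(backoffMs);
        }
    }

    // Called from the polling thread when a heartbeat is already due (poll timeout == 0).
    void wakeUp() {
        synchronized (lock) {
            lock.notify();
        }
    }
}
{code}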

We made two quick fixes to solve this issue:
  - In NetworkClient.handle*(), we don't iterate over collections if they are 
empty (to avoid unnecessary iterator instantiations).
  - In KafkaConsumer.pollOnce(), if the poll timeout is equal to 0 we notify 
the heartbeat thread to wake it (a dirty fix because we don't handle the 
autocommit case).

With these 2 quick fixes and 128 consumers, the allocation rate drops down from 
55 MiB/s to 4 MiB/s.








--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5512) KafkaConsumer: High memory allocation rate when idle

2017-06-26 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5512:
---
Labels: performance  (was: )

> KafkaConsumer: High memory allocation rate when idle
> 
>
> Key: KAFKA-5512
> URL: https://issues.apache.org/jira/browse/KAFKA-5512
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.1.1
>Reporter: Stephane Roset
>  Labels: performance
> Fix For: 0.11.0.1
>
>
> Hi,
> We noticed in our application that the memory allocation rate increased 
> significantly when we have no Kafka messages to consume. We isolated the 
> issue by using a JVM that simply runs 128 Kafka consumers. These consumers 
> consume 128 partitions (so each consumer consumes one partition). The 
> partitions are empty and no message has been sent during the test. The 
> consumers were configured with default values (session.timeout.ms=3, 
> fetch.max.wait.ms=500, receive.buffer.bytes=65536, 
> heartbeat.interval.ms=3000, max.poll.interval.ms=30, 
> max.poll.records=500). The Kafka cluster was made of 3 brokers. Within this 
> context, the allocation rate was about 55 MiB/s. This high allocation rate 
> generates a lot of GC activity (to garbage the young heap) and was an issue 
> for our project.
> We profiled the JVM with JProfiler. We noticed that there were a huge 
> quantity of ArrayList$Itr in memory. These collections were mainly 
> instantiated by the methods handleCompletedReceives, handleCompletedSends, 
> handleConnections and handleDisconnections of the class NetworkClient. We also 
> noticed that we had a lot of calls to the method pollOnce of the class 
> KafkaConsumer. 
> So we decided to run only one consumer and to profile the calls to the method 
> pollOnce. We noticed that regularly a huge number of calls is made to this 
> method, up to 268000 calls within 100ms. The pollOnce method calls the 
> NetworkClient.handle* methods. These methods iterate on collections (even if 
> they are empty), so that explains the huge number of iterators in memory.
> The large number of calls is related to the heartbeat mechanism. The pollOnce 
> method calculates the poll timeout; if a heartbeat needs to be done, the 
> timeout will be set to 0. The problem is that the heartbeat thread checks 
> every 100 ms (default value of retry.backoff.ms) if a heartbeat should be 
> sent, so the KafkaConsumer will call the poll method in a loop without 
> timeout until the heartbeat thread awakes. For example: the heartbeat thread 
> just started to wait and will awake in 99ms. So during 99ms, the 
> KafkaConsumer will call in a loop the pollOnce method and will use a timeout 
> of 0. That explains how we can have 268000 calls within 100ms. 
> The heartbeat thread calls the method AbstractCoordinator.wait() to sleep, so 
> I think the Kafka consumer should awake the heartbeat thread with a notify 
> when needed.
> We made two quick fixes to solve this issue:
>   - In NetworkClient.handle*(), we don't iterate on collections if they are 
> empty (to avoid unnecessary iterators instantiations).
>   - In KafkaConsumer.pollOnce(), if the poll timeout is equal to 0 we notify 
> the heartbeat thread to awake it (dirty fix because we don't handle the 
> autocommit case).
> With these 2 quick fixes and 128 consumers, the allocation rate drops down 
> from 55 MiB/s to 4 MiB/s.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5512) KafkaConsumer: High memory allocation rate when idle

2017-06-26 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5512:
---
Fix Version/s: 0.11.0.1

> KafkaConsumer: High memory allocation rate when idle
> 
>
> Key: KAFKA-5512
> URL: https://issues.apache.org/jira/browse/KAFKA-5512
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.1.1
>Reporter: Stephane Roset
>  Labels: performance
> Fix For: 0.11.0.1
>
>
> Hi,
> We noticed in our application that the memory allocation rate increased 
> significantly when we have no Kafka messages to consume. We isolated the 
> issue by using a JVM that simply runs 128 Kafka consumers. These consumers 
> consume 128 partitions (so each consumer consumes one partition). The 
> partitions are empty and no message has been sent during the test. The 
> consumers were configured with default values (session.timeout.ms=3, 
> fetch.max.wait.ms=500, receive.buffer.bytes=65536, 
> heartbeat.interval.ms=3000, max.poll.interval.ms=30, 
> max.poll.records=500). The Kafka cluster was made of 3 brokers. Within this 
> context, the allocation rate was about 55 MiB/s. This high allocation rate 
> generates a lot of GC activity (to garbage the young heap) and was an issue 
> for our project.
> We profiled the JVM with JProfiler. We noticed that there were a huge 
> quantity of ArrayList$Itr in memory. These collections were mainly 
> instantiated by the methods handleCompletedReceives, handleCompletedSends, 
> handleConnections and handleDisconnections of the class NetworkClient. We also 
> noticed that we had a lot of calls to the method pollOnce of the class 
> KafkaConsumer. 
> So we decided to run only one consumer and to profile the calls to the method 
> pollOnce. We noticed that regularly a huge number of calls is made to this 
> method, up to 268000 calls within 100ms. The pollOnce method calls the 
> NetworkClient.handle* methods. These methods iterate on collections (even if 
> they are empty), so that explains the huge number of iterators in memory.
> The large number of calls is related to the heartbeat mechanism. The pollOnce 
> method calculates the poll timeout; if a heartbeat needs to be done, the 
> timeout will be set to 0. The problem is that the heartbeat thread checks 
> every 100 ms (default value of retry.backoff.ms) if a heartbeat should be 
> sent, so the KafkaConsumer will call the poll method in a loop without 
> timeout until the heartbeat thread awakes. For example: the heartbeat thread 
> just started to wait and will awake in 99ms. So during 99ms, the 
> KafkaConsumer will call in a loop the pollOnce method and will use a timeout 
> of 0. That explains how we can have 268000 calls within 100ms. 
> The heartbeat thread calls the method AbstractCoordinator.wait() to sleep, so 
> I think the Kafka consumer should awake the heartbeat thread with a notify 
> when needed.
> We made two quick fixes to solve this issue:
>   - In NetworkClient.handle*(), we don't iterate on collections if they are 
> empty (to avoid unnecessary iterators instantiations).
>   - In KafkaConsumer.pollOnce(), if the poll timeout is equal to 0 we notify 
> the heartbeat thread to awake it (dirty fix because we don't handle the 
> autocommit case).
> With these 2 quick fixes and 128 consumers, the allocation rate drops down 
> from 55 MiB/s to 4 MiB/s.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5512) KafkaConsumer: High memory allocation rate when idle

2017-06-26 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16062859#comment-16062859
 ] 

Ismael Juma commented on KAFKA-5512:


Nice catch. Are you intending to submit a PR with your fixes?

> KafkaConsumer: High memory allocation rate when idle
> 
>
> Key: KAFKA-5512
> URL: https://issues.apache.org/jira/browse/KAFKA-5512
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.10.1.1
>Reporter: Stephane Roset
>  Labels: performance
> Fix For: 0.11.0.1
>
>
> Hi,
> We noticed in our application that the memory allocation rate increased 
> significantly when we have no Kafka messages to consume. We isolated the 
> issue by using a JVM that simply runs 128 Kafka consumers. These consumers 
> consume 128 partitions (so each consumer consumes one partition). The 
> partitions are empty and no message has been sent during the test. The 
> consumers were configured with default values (session.timeout.ms=3, 
> fetch.max.wait.ms=500, receive.buffer.bytes=65536, 
> heartbeat.interval.ms=3000, max.poll.interval.ms=30, 
> max.poll.records=500). The Kafka cluster was made of 3 brokers. Within this 
> context, the allocation rate was about 55 MiB/s. This high allocation rate 
> generates a lot of GC activity (to garbage the young heap) and was an issue 
> for our project.
> We profiled the JVM with JProfiler. We noticed that there were a huge 
> quantity of ArrayList$Itr in memory. These collections were mainly 
> instantiated by the methods handleCompletedReceives, handleCompletedSends, 
> handleConnections and handleDisconnections of the class NetworkClient. We also 
> noticed that we had a lot of calls to the method pollOnce of the class 
> KafkaConsumer. 
> So we decided to run only one consumer and to profile the calls to the method 
> pollOnce. We noticed that regularly a huge number of calls is made to this 
> method, up to 268000 calls within 100ms. The pollOnce method calls the 
> NetworkClient.handle* methods. These methods iterate on collections (even if 
> they are empty), so that explains the huge number of iterators in memory.
> The large number of calls is related to the heartbeat mechanism. The pollOnce 
> method calculates the poll timeout; if a heartbeat needs to be done, the 
> timeout will be set to 0. The problem is that the heartbeat thread checks 
> every 100 ms (default value of retry.backoff.ms) if a heartbeat should be 
> sent, so the KafkaConsumer will call the poll method in a loop without 
> timeout until the heartbeat thread awakes. For example: the heartbeat thread 
> just started to wait and will awake in 99ms. So during 99ms, the 
> KafkaConsumer will call in a loop the pollOnce method and will use a timeout 
> of 0. That explains how we can have 268000 calls within 100ms. 
> The heartbeat thread calls the method AbstractCoordinator.wait() to sleep, so 
> I think the Kafka consumer should awake the heartbeat thread with a notify 
> when needed.
> We made two quick fixes to solve this issue:
>   - In NetworkClient.handle*(), we don't iterate on collections if they are 
> empty (to avoid unnecessary iterators instantiations).
>   - In KafkaConsumer.pollOnce(), if the poll timeout is equal to 0 we notify 
> the heartbeat thread to awake it (dirty fix because we don't handle the 
> autocommit case).
> With these 2 quick fixes and 128 consumers, the allocation rate drops down 
> from 55 MiB/s to 4 MiB/s.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5513) Contradicting scalaDoc for AdminUtils.assignReplicasToBrokers

2017-06-26 Thread Charly Molter (JIRA)
Charly Molter created KAFKA-5513:


 Summary: Contradicting scalaDoc for 
AdminUtils.assignReplicasToBrokers
 Key: KAFKA-5513
 URL: https://issues.apache.org/jira/browse/KAFKA-5513
 Project: Kafka
  Issue Type: Improvement
  Components: core
Reporter: Charly Molter
Priority: Trivial


The documentation for AdminUtils.assignReplicasToBrokers seems to contradict 
itself.

It says in the description: "As the result, if the number of replicas is equal 
to or greater than the number of racks, it will ensure that each rack will get 
at least one replica."

This means that it is possible to get an assignment where there are multiple 
replicas in a rack (if there are fewer racks than the replication factor).

However, the throws clause says: "@throws AdminOperationException If rack 
information is supplied but it is incomplete, or if it is not possible to 
assign each replica to a unique rack."

This seems to contradict the first claim.

In practice it doesn't throw when there are fewer racks than the replication 
factor, so that point in the @throws clause should probably be removed.

https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/admin/AdminUtils.scala#L121-L130



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-4563) State transitions error PARTITIONS_REVOKED to NOT_RUNNING

2017-06-26 Thread Eno Thereska (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16062911#comment-16062911
 ] 

Eno Thereska commented on KAFKA-4563:
-

This will be fixed together as part of KAFKA-5372

> State transitions error PARTITIONS_REVOKED to NOT_RUNNING
> -
>
> Key: KAFKA-4563
> URL: https://issues.apache.org/jira/browse/KAFKA-4563
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Eno Thereska
>Assignee: Eno Thereska
> Fix For: 0.11.1.0
>
>
> When starting and stopping streams quickly, the following exception is thrown:
> java.lang.IllegalStateException: Incorrect state transition from 
> PARTITIONS_REVOKED to NOT_RUNNING
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.setState(StreamThread.java:164)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.shutdown(StreamThread.java:414)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:366)
> A temporary fix is to convert the exception into a warning, since clearly not 
> all state transitions are thought through yet.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-4563) State transitions error PARTITIONS_REVOKED to NOT_RUNNING

2017-06-26 Thread Eno Thereska (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eno Thereska resolved KAFKA-4563.
-
Resolution: Duplicate

This will be fixed as part of KAFKA-5372

> State transitions error PARTITIONS_REVOKED to NOT_RUNNING
> -
>
> Key: KAFKA-4563
> URL: https://issues.apache.org/jira/browse/KAFKA-4563
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Eno Thereska
>Assignee: Eno Thereska
> Fix For: 0.11.1.0
>
>
> When starting and stopping streams quickly, the following exception is thrown:
> java.lang.IllegalStateException: Incorrect state transition from 
> PARTITIONS_REVOKED to NOT_RUNNING
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.setState(StreamThread.java:164)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.shutdown(StreamThread.java:414)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:366)
> A temporary fix is to convert the exception into a warning, since clearly not 
> all state transitions are thought through yet.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5372) Unexpected state transition Dead to PendingShutdown

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16062996#comment-16062996
 ] 

ASF GitHub Bot commented on KAFKA-5372:
---

GitHub user enothereska opened a pull request:

https://github.com/apache/kafka/pull/3432

KAFKA-5372: fixes to state transitions



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/enothereska/kafka KAFKA-5372-state-transitions

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3432.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3432


commit 5f57558ef293351d9f5db11edb62089367e39b76
Author: Eno Thereska 
Date:   2017-06-26T11:04:20Z

Checkpoint

commit ac372b196998052a024aac47af64dbd803a65733
Author: Eno Thereska 
Date:   2017-06-26T12:11:26Z

Some fixes




> Unexpected state transition Dead to PendingShutdown
> ---
>
> Key: KAFKA-5372
> URL: https://issues.apache.org/jira/browse/KAFKA-5372
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: Jason Gustafson
>Assignee: Eno Thereska
> Fix For: 0.11.1.0
>
>
> I often see this running integration tests:
> {code}
> [2017-06-02 15:36:03,411] WARN stream-thread 
> [appId-1-c382ef0a-adbd-422b-9717-9b2bc52b55eb-StreamThread-13] Unexpected 
> state transition from DEAD to PENDING_SHUTDOWN. 
> (org.apache.kafka.streams.processor.internals.StreamThread:976)
> {code}
> Maybe a race condition on shutdown or something?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5514) KafkaConsumer ignores default values in Properties object because of incorrect use of Properties object.

2017-06-26 Thread Geert Schuring (JIRA)
Geert Schuring created KAFKA-5514:
-

 Summary: KafkaConsumer ignores default values in Properties object 
because of incorrect use of Properties object.
 Key: KAFKA-5514
 URL: https://issues.apache.org/jira/browse/KAFKA-5514
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 0.10.2.1
Reporter: Geert Schuring


When setting default values in a Properties object the KafkaConsumer ignores 
these values because the Properties object is being treated as a Map. The 
ConsumerConfig object uses the putAll method to copy properties from the 
incoming object to its local copy. (See 
https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/ConsumerConfig.java#L471)

This is incorrect because it only copies the explicit properties and ignores 
the default values also present in the properties object. (Also see: 
https://stackoverflow.com/questions/2004833/how-to-merge-two-java-util-properties-objects)
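
A small self-contained illustration of the behaviour (the config names are only examples): putAll copies just the explicit entries, while stringPropertyNames() together with getProperty() also sees the defaults.

{code}
import java.util.Properties;

public class PropertiesDefaultsDemo {
    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty("auto.offset.reset", "earliest");

        Properties props = new Properties(defaults);   // defaults live in a separate table
        props.setProperty("bootstrap.servers", "localhost:9092");

        Properties copy = new Properties();
        copy.putAll(props);                            // copies only the explicit entry

        System.out.println(copy.getProperty("auto.offset.reset"));    // null
        System.out.println(props.getProperty("auto.offset.reset"));   // earliest

        // A copy that preserves the defaults:
        Properties merged = new Properties();
        for (String name : props.stringPropertyNames())
            merged.setProperty(name, props.getProperty(name));
        System.out.println(merged.getProperty("auto.offset.reset"));  // earliest
    }
}
{code}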



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5515) Consider removing date formatting from Segments class

2017-06-26 Thread Bill Bejeck (JIRA)
Bill Bejeck created KAFKA-5515:
--

 Summary: Consider removing date formatting from Segments class
 Key: KAFKA-5515
 URL: https://issues.apache.org/jira/browse/KAFKA-5515
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Reporter: Bill Bejeck


Currently the {{Segments}} class uses a date when calculating the segment id 
and uses {{SimpleDateFormat}} for formatting the segment id.  However, this is a 
high-volume code path and creating a new {{SimpleDateFormat}} for each segment 
id is expensive.  We should look into removing the date from the segment id or, 
at a minimum, use a faster alternative to {{SimpleDateFormat}}.
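
One possible direction, sketched here with hypothetical names: reuse a single formatter per thread (since {{SimpleDateFormat}} is not thread-safe) instead of creating a new one per segment id, or drop the date from the name entirely.

{code}
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Illustrative alternative: one formatter per thread instead of one per segment id.
final class SegmentNames {
    private static final ThreadLocal<SimpleDateFormat> FORMATTER =
            ThreadLocal.withInitial(() -> {
                SimpleDateFormat format = new SimpleDateFormat("yyyyMMddHHmm");
                format.setTimeZone(TimeZone.getTimeZone("UTC"));
                return format;
            });

    static String segmentName(String storeName, long segmentId, long segmentIntervalMs) {
        // Same formatting, but the expensive formatter is created once per thread.
        return storeName + "-" + FORMATTER.get().format(new Date(segmentId * segmentIntervalMs));
    }

    // Or avoid date formatting altogether and use the numeric segment id.
    static String plainSegmentName(String storeName, long segmentId) {
        return storeName + "." + segmentId;
    }
}
{code}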



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5515) Consider removing date formatting from Segments class

2017-06-26 Thread Bill Bejeck (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Bejeck updated KAFKA-5515:
---
Description: Currently the {{Segments}} class uses a date when calculating 
the segment id and uses {{SimpleDateFormat}} for formatting the segment id.  
However this is a high volume code path and creating a new {{SimpleDateFormat}} 
for each segment id is expensive.  We should look into removing the date from 
the segment id or at a minimum use a faster alternative to 
{{SimpleDateFormat}}.  We should also consider keeping a lookup of existing 
segments to avoid as many string operations as possible.  (was: Currently the 
{{Segments}} class uses a date when calculating the segment id and uses 
{{SimpleDateFormat}} for formatting the segment id.  However this is a high 
volume code path and creating a new {{SimpleDateFormat}} for each segment id is 
expensive.  We should look into removing the date from the segment id or at a 
minimum use a faster alternative to {{SimpleDateFormat}} )

> Consider removing date formatting from Segments class
> -
>
> Key: KAFKA-5515
> URL: https://issues.apache.org/jira/browse/KAFKA-5515
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Bill Bejeck
>  Labels: performance
>
> Currently the {{Segments}} class uses a date when calculating the segment id 
> and uses {{SimpleDateFormat}} for formatting the segment id.  However this is 
> a high volume code path and creating a new {{SimpleDateFormat}} for each 
> segment id is expensive.  We should look into removing the date from the 
> segment id or at a minimum use a faster alternative to {{SimpleDateFormat}}.  
> We should also consider keeping a lookup of existing segments to avoid as 
> many string operations as possible.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5515) Consider removing date formatting from Segments class

2017-06-26 Thread Bill Bejeck (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Bejeck updated KAFKA-5515:
---
Description: Currently the {{Segments}} class uses a date when calculating 
the segment id and uses {{SimpleDateFormat}} for formatting the segment id.  
However this is a high volume code path and creating a new {{SimpleDateFormat}} 
and formatting each segment id is expensive.  We should look into removing the 
date from the segment id or at a minimum use a faster alternative to 
{{SimpleDateFormat}}.  We should also consider keeping a lookup of existing 
segments to avoid as many string operations as possible.  (was: Currently the 
{{Segments}} class uses a date when calculating the segment id and uses 
{{SimpleDateFormat}} for formatting the segment id.  However this is a high 
volume code path and creating a new {{SimpleDateFormat}} for each segment id is 
expensive.  We should look into removing the date from the segment id or at a 
minimum use a faster alternative to {{SimpleDateFormat}}.  We should also 
consider keeping a lookup of existing segments to avoid as many string 
operations as possible.)

> Consider removing date formatting from Segments class
> -
>
> Key: KAFKA-5515
> URL: https://issues.apache.org/jira/browse/KAFKA-5515
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Bill Bejeck
>  Labels: performance
>
> Currently the {{Segments}} class uses a date when calculating the segment id 
> and uses {{SimpleDateFormat}} for formatting the segment id.  However this is 
> a high volume code path and creating a new {{SimpleDateFormat}} and 
> formatting each segment id is expensive.  We should look into removing the 
> date from the segment id or at a minimum use a faster alternative to 
> {{SimpleDateFormat}}.  We should also consider keeping a lookup of existing 
> segments to avoid as many string operations as possible.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5516) Formatting verifiable producer/consumer output in a similar fashion

2017-06-26 Thread Paolo Patierno (JIRA)
Paolo Patierno created KAFKA-5516:
-

 Summary: Formatting verifiable producer/consumer output in a 
similar fashion
 Key: KAFKA-5516
 URL: https://issues.apache.org/jira/browse/KAFKA-5516
 Project: Kafka
  Issue Type: Improvement
  Components: tools
Reporter: Paolo Patierno
Assignee: Paolo Patierno
Priority: Trivial


Hi,
this follows the proposal to have the verifiable producer and consumer provide 
very similar output, where the "timestamp" is always the first column, followed 
by the event "name" and then all the data specific to that event.
It includes a refactoring of the verifiable producer so that it produces output 
in the same way as the verifiable consumer.

Thanks,
Paolo



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5516) Formatting verifiable producer/consumer output in a similar fashion

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063049#comment-16063049
 ] 

ASF GitHub Bot commented on KAFKA-5516:
---

GitHub user ppatierno opened a pull request:

https://github.com/apache/kafka/pull/3434

KAFKA-5516: Formatting verifiable producer/consumer output in a similar 
fashion



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ppatierno/kafka verifiable-consumer-producer

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3434.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3434


commit 6e6d728dce83ea689061a196e1b6811b447d4db7
Author: ppatierno 
Date:   2017-06-26T12:07:08Z

Modified JSON order attributes in a more readable fashion

commit 9235aadfccf446415ffbfd5d90c8d4faeddecc08
Author: ppatierno 
Date:   2017-06-26T12:56:07Z

Fixed documentation about old request.required.acks producer parameter
Modified JSON order attributes in a more readable fashion
Refactoring on verifiable producer to be like the verifiable consumer




> Formatting verifiable producer/consumer output in a similar fashion
> ---
>
> Key: KAFKA-5516
> URL: https://issues.apache.org/jira/browse/KAFKA-5516
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Reporter: Paolo Patierno
>Assignee: Paolo Patierno
>Priority: Trivial
>
> Hi,
> following the proposal to have verifiable producer/consumer providing a very 
> similar output where the "timestamp" is always the first column followed by 
> "name" event and then all the specific data for such event.
> It includes a verifiable producer refactoring for having that in the same way 
> as verifiable consumer.
> Thanks,
> Paolo



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-4388) Connect key and value converters are listed without default values

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063107#comment-16063107
 ] 

ASF GitHub Bot commented on KAFKA-4388:
---

GitHub user evis opened a pull request:

https://github.com/apache/kafka/pull/3435

KAFKA-4388 Recommended values for converters from plugins

Questions to reviewers:
1. Should we cache `converterRecommenders.validValues()`, 
`SinkConnectorConfig.configDef()` and `SourceConnectorConfig.configDef()` 
results?
2. What is the appropriate place for testing the new 
`ConnectorConfig.configDef(plugins)` functionality?

cc @ewencp 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/evis/kafka converters_values

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3435.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3435


commit 069a1ba832f844c20224598c622fa19576b0ba61
Author: Evgeny Veretennikov 
Date:   2017-06-26T13:44:40Z

KAFKA-4388 Recommended values for converters from plugins

ConnectorConfig.configDef() takes Plugins parameter now. List of
recommended values for converters is taken from plugins.converters()
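
For illustration, a minimal sketch of a recommender along these lines; how the converter class names are obtained from the Plugins instance is an assumption here and is not shown:

{code}
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import org.apache.kafka.common.config.ConfigDef;

// Illustrative recommender that surfaces converter class names as recommended values.
public class ConverterRecommender implements ConfigDef.Recommender {
    private final List<Object> converterClassNames;

    public ConverterRecommender(Collection<String> converterClassNames) {
        this.converterClassNames = new ArrayList<>(converterClassNames);
    }

    @Override
    public List<Object> validValues(String name, Map<String, Object> parsedConfig) {
        return converterClassNames;   // e.g. the classes discovered via plugins.converters()
    }

    @Override
    public boolean visible(String name, Map<String, Object> parsedConfig) {
        return true;
    }
}
{code}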




> Connect key and value converters are listed without default values
> --
>
> Key: KAFKA-4388
> URL: https://issues.apache.org/jira/browse/KAFKA-4388
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.1.0
>Reporter: Ewen Cheslack-Postava
>Assignee: Evgeny Veretennikov
>Priority: Minor
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> KIP-75 added per connector converters. This exposes the settings on a 
> per-connector basis via the validation API. However, the way this is 
> specified for each connector is via a config value with no default value. 
> This means the validation API implies there is no setting unless you provide 
> one.
> It would be much better to include the default value extracted from the 
> WorkerConfig instead so it's clear you shouldn't need to override the default.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5413) Log cleaner fails due to large offset in segment file

2017-06-26 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063228#comment-16063228
 ] 

Jun Rao commented on KAFKA-5413:


Merged https://github.com/apache/kafka/pull/3397 to 0.10.2.

> Log cleaner fails due to large offset in segment file
> -
>
> Key: KAFKA-5413
> URL: https://issues.apache.org/jira/browse/KAFKA-5413
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.2.0, 0.10.2.1
> Environment: Ubuntu 14.04 LTS, Oracle Java 8u92, kafka_2.11-0.10.2.0
>Reporter: Nicholas Ngorok
>Assignee: Kelvin Rutt
>Priority: Critical
>  Labels: reliability
> Fix For: 0.11.0.0, 0.10.2.2
>
> Attachments: .index.cleaned, 
> .log, .log.cleaned, 
> .timeindex.cleaned, 002147422683.log, 
> kafka-5413.patch
>
>
> The log cleaner thread in our brokers is failing with the trace below
> {noformat}
> [2017-06-08 15:49:54,822] INFO {kafka-log-cleaner-thread-0} Cleaner 0: 
> Cleaning segment 0 in log __consumer_offsets-12 (largest timestamp Thu Jun 08 
> 15:48:59 PDT 2017) into 0, retaining deletes. (kafka.log.LogCleaner)
> [2017-06-08 15:49:54,822] INFO {kafka-log-cleaner-thread-0} Cleaner 0: 
> Cleaning segment 2147343575 in log __consumer_offsets-12 (largest timestamp 
> Thu Jun 08 15:49:06 PDT 2017) into 0, retaining deletes. 
> (kafka.log.LogCleaner)
> [2017-06-08 15:49:54,834] ERROR {kafka-log-cleaner-thread-0} 
> [kafka-log-cleaner-thread-0], Error due to  (kafka.log.LogCleaner)
> java.lang.IllegalArgumentException: requirement failed: largest offset in 
> message set can not be safely converted to relative offset.
> at scala.Predef$.require(Predef.scala:224)
> at kafka.log.LogSegment.append(LogSegment.scala:109)
> at kafka.log.Cleaner.cleanInto(LogCleaner.scala:478)
> at 
> kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:405)
> at 
> kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:401)
> at scala.collection.immutable.List.foreach(List.scala:381)
> at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:401)
> at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:363)
> at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:362)
> at scala.collection.immutable.List.foreach(List.scala:381)
> at kafka.log.Cleaner.clean(LogCleaner.scala:362)
> at 
> kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:241)
> at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:220)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> [2017-06-08 15:49:54,835] INFO {kafka-log-cleaner-thread-0} 
> [kafka-log-cleaner-thread-0], Stopped  (kafka.log.LogCleaner)
> {noformat}
> This seems to point at the specific line [here| 
> https://github.com/apache/kafka/blob/0.11.0/core/src/main/scala/kafka/log/LogSegment.scala#L92]
>  in the kafka src where the difference is actually larger than MAXINT as both 
> baseOffset and offset are of type long. It was introduced in this [pr| 
> https://github.com/apache/kafka/pull/2210/files/56d1f8196b77a47b176b7bbd1e4220a3be827631]
> These were the outputs of dumping the first two log segments
> {noformat}
> :~$ /usr/bin/kafka-run-class kafka.tools.DumpLogSegments --deep-iteration 
> --files /kafka-logs/__consumer_offsets-12/000
> 0.log
> Dumping /kafka-logs/__consumer_offsets-12/.log
> Starting offset: 0
> offset: 1810054758 position: 0 NoTimestampType: -1 isvalid: true payloadsize: 
> -1 magic: 0 compresscodec: NONE crc: 3127861909 keysize: 34
> :~$ /usr/bin/kafka-run-class kafka.tools.DumpLogSegments --deep-iteration 
> --files /kafka-logs/__consumer_offsets-12/000
> 0002147343575.log
> Dumping /kafka-logs/__consumer_offsets-12/002147343575.log
> Starting offset: 2147343575
> offset: 2147539884 position: 0 NoTimestampType: -1 isvalid: true paylo
> adsize: -1 magic: 0 compresscodec: NONE crc: 2282192097 keysize: 34
> {noformat}
> My guess is that since 2147539884 is larger than MAXINT, we are hitting this 
> exception. Was there a specific reason, this check was added in 0.10.2?
> E.g. if the first offset is a key = "key 0" and then we have MAXINT + 1 of 
> "key 1" following, wouldn't we run into this situation whenever the log 
> cleaner runs?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5517) Support linking to particular configuration parameters

2017-06-26 Thread Tom Bentley (JIRA)
Tom Bentley created KAFKA-5517:
--

 Summary: Support linking to particular configuration parameters
 Key: KAFKA-5517
 URL: https://issues.apache.org/jira/browse/KAFKA-5517
 Project: Kafka
  Issue Type: Improvement
  Components: documentation
Reporter: Tom Bentley
Assignee: Tom Bentley
Priority: Minor


Currently the configuration parameters are documented in long tables, and it's 
only possible to link to the heading before a particular table. When discussing 
configuration parameters on forums it would be helpful to be able to link to 
the particular parameter under discussion.
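
A minimal sketch of the idea (not the {{ConfigDef.toHtmlTable()}} implementation): give each table row an id equal to the parameter name so a URL fragment can point straight at it.

{code}
// Illustrative only: build one table row with an id anchor for deep linking.
public class ConfigRowHtml {

    static String row(String name, String documentation, String defaultValue) {
        return "<tr id=\"" + name + "\">"
                + "<td><a href=\"#" + name + "\">" + name + "</a></td>"
                + "<td>" + documentation + "</td>"
                + "<td>" + defaultValue + "</td>"
                + "</tr>";
    }

    public static void main(String[] args) {
        // A link such as configuration.html#some.config.name would then jump to this row.
        System.out.println(row("some.config.name", "What the parameter does.", "its default"));
    }
}
{code}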



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5517) Support linking to particular configuration parameters

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063294#comment-16063294
 ] 

ASF GitHub Bot commented on KAFKA-5517:
---

GitHub user tombentley opened a pull request:

https://github.com/apache/kafka/pull/3436

KAFKA-5517: Add id to config HTML tables to allow linking



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tombentley/kafka KAFKA-5517

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3436.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3436






> Support linking to particular configuration parameters
> --
>
> Key: KAFKA-5517
> URL: https://issues.apache.org/jira/browse/KAFKA-5517
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Tom Bentley
>Assignee: Tom Bentley
>Priority: Minor
>
> Currently the configuration parameters are documented in long tables, and it's 
> only possible to link to the heading before a particular table. When 
> discussing configuration parameters on forums it would be helpful to be able 
> to link to the particular parameter under discussion.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5517) Support linking to particular configuration parameters

2017-06-26 Thread Tom Bentley (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom Bentley updated KAFKA-5517:
---
Labels: patch-available  (was: )

> Support linking to particular configuration parameters
> --
>
> Key: KAFKA-5517
> URL: https://issues.apache.org/jira/browse/KAFKA-5517
> Project: Kafka
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Tom Bentley
>Assignee: Tom Bentley
>Priority: Minor
>  Labels: patch-available
>
> Currently the configuration parameters are documented in long tables, and it's 
> only possible to link to the heading before a particular table. When 
> discussing configuration parameters on forums it would be helpful to be able 
> to link to the particular parameter under discussion.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5518) General Kafka connector performance workload

2017-06-26 Thread Chen He (JIRA)
Chen He created KAFKA-5518:
--

 Summary: General Kafka connector performance workload
 Key: KAFKA-5518
 URL: https://issues.apache.org/jira/browse/KAFKA-5518
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 0.10.2.1
Reporter: Chen He


Sorry, this is my first time creating a Kafka JIRA. I am just curious whether 
there is a general-purpose performance workload for Kafka connectors (HDFS, S3, 
etc.). Then we could set up a standard and evaluate the performance of further 
connectors such as Swift.

Please feel free to comment or mark this as a duplicate if there is already a 
JIRA tracking this.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5473) handle ZK session expiration properly when a new session can't be established

2017-06-26 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063437#comment-16063437
 ] 

Jun Rao commented on KAFKA-5473:


[~prasincs], there are a couple of cases to consider. (1) The broker knows for sure 
that it's not registered with ZK. In this case, failing the broker seems better: from 
the ZK server's perspective the broker is down, and failing it makes the broker's 
behavior consistent with what's in the ZK server. This is the issue that this 
particular jira is trying to solve. I think we can just wait up to 
zookeeper.connection.timeout.ms and do a clean shutdown. (2) The broker is partitioned 
off from the ZK server. In this case, the ZK server may have expired the broker's 
session. However, until the network connection is back, the broker doesn't know that 
its session has expired. In that mode, the broker currently doesn't shut down and just 
waits until the network connection is back.
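
For case (1), a minimal sketch of the proposed handling (retry re-establishing the 
session up to the connection timeout, then shut down cleanly) could look as follows; 
tryReconnect() and startControlledShutdown() are hypothetical placeholders for 
illustration, not the actual broker code:

{code}
// Sketch only: wait up to the connection timeout for a new ZK session,
// then fail the broker cleanly instead of running while unregistered.
public final class SessionExpiryHandler {

    private final long connectionTimeoutMs;

    public SessionExpiryHandler(long connectionTimeoutMs) {
        this.connectionTimeoutMs = connectionTimeoutMs;
    }

    /** Invoked when the ZK session has expired and must be re-established. */
    public void onSessionExpired() throws InterruptedException {
        long deadline = System.currentTimeMillis() + connectionTimeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (tryReconnect()) {
                return; // new session established, broker re-registers itself
            }
            Thread.sleep(1000); // back off before retrying
        }
        // No session within the timeout: do a clean shutdown rather than staying up
        // while unregistered from the controller's perspective.
        startControlledShutdown();
    }

    private boolean tryReconnect() { return false; }  // hypothetical placeholder
    private void startControlledShutdown() { }        // hypothetical placeholder
}
{code}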

> handle ZK session expiration properly when a new session can't be established
> -
>
> Key: KAFKA-5473
> URL: https://issues.apache.org/jira/browse/KAFKA-5473
> Project: Kafka
>  Issue Type: Sub-task
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Prasanna Gautam
>
> In https://issues.apache.org/jira/browse/KAFKA-2405, we change the logic in 
> handling ZK session expiration a bit. If a new ZK session can't be 
> established after session expiration, we just log an error and continue. 
> However, this can leave the broker in a bad state since it's up, but not 
> registered from the controller's perspective. Replicas on this broker may 
> never to be in sync.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-3554) Generate actual data with specific compression ratio and add multi-thread support in the ProducerPerformance tool.

2017-06-26 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063452#comment-16063452
 ] 

Chen He commented on KAFKA-3554:


Thank you for the quick reply [~becket_qin]. This work is really valuable. It 
provides us with a tool that can probe a Kafka system's capacity. For example, we 
can get the lowest latency by using only 1 thread, and by increasing the thread 
count we can find the maximum throughput of a Kafka cluster.

Only one question: I applied this patch to the latest Kafka and compared the 
results with the old ProducerPerformance.java file. I found that if we set ack=all 
with snappy compression and 100M records (100B each), it does not work as well 
as the old ProducerPerformance.java file.
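
For reference, a minimal sketch of the producer settings being compared (ack=all with 
snappy compression and 100B records), expressed against the Java producer API; the 
bootstrap address and topic name are made-up examples, and this is not the patched 
performance tool itself:

{code}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;

public class AckAllSnappyProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // example address
        props.put(ProducerConfig.ACKS_CONFIG, "all");                         // ack=all as in the test
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "snappy");          // snappy compression
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());

        try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
            byte[] payload = new byte[100]; // 100B records as in the benchmark
            producer.send(new ProducerRecord<>("perf-test", payload));
        }
    }
}
{code}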

> Generate actual data with specific compression ratio and add multi-thread 
> support in the ProducerPerformance tool.
> --
>
> Key: KAFKA-3554
> URL: https://issues.apache.org/jira/browse/KAFKA-3554
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.9.0.1
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.11.1.0
>
>
> Currently the ProducerPerformance always generates the payload with the same 
> bytes. This does not work well for testing compressed data because the 
> payload is extremely compressible no matter how big it is.
> We can make some changes to make it more useful for compressed messages. 
> Currently I am generating a payload containing integers from a given range. 
> By adjusting the range of the integers, we can get different compression 
> ratios. 
> API-wise, we can either let users specify the integer range or the expected 
> compression ratio (we will do some probing to get the corresponding range for 
> the users).
> Besides that, in many cases it is useful to have multiple producer threads 
> when the producer threads themselves are the bottleneck. Admittedly people can 
> run multiple ProducerPerformance instances to achieve a similar result, but it 
> is still different from the real case when people actually use the producer.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5007) Kafka Replica Fetcher Thread- Resource Leak

2017-06-26 Thread Joseph Aliase (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063456#comment-16063456
 ] 

Joseph Aliase commented on KAFKA-5007:
--

I believe that's the error we are seeing in the log. Let me reproduce the issue 
today; I will confirm.

thanks [~huxi_2b]

> Kafka Replica Fetcher Thread- Resource Leak
> ---
>
> Key: KAFKA-5007
> URL: https://issues.apache.org/jira/browse/KAFKA-5007
> Project: Kafka
>  Issue Type: Bug
>  Components: core, network
>Affects Versions: 0.10.0.0, 0.10.1.1, 0.10.2.0
> Environment: Centos 7
> Jave 8
>Reporter: Joseph Aliase
>Priority: Critical
>  Labels: reliability
> Attachments: jstack-kafka.out, jstack-zoo.out, lsofkafka.txt, 
> lsofzookeeper.txt
>
>
> Kafka is running out of open file descriptors when the system network interface is 
> down.
> Issue description:
> We have a Kafka cluster of 5 nodes running on version 0.10.1.1. The open file 
> descriptor limit for the account running Kafka is set to 10.
> During an upgrade, the network interface went down. The outage continued for 12 
> hours; eventually all the brokers crashed with a java.io.IOException: Too many open 
> files error.
> We repeated the test in a lower environment and observed that the open socket 
> count keeps increasing while the NIC is down.
> We have around 13 topics with a max partition count of 120, and the number of 
> replica fetcher threads is set to 8.
> Using an internal monitoring tool we observed that the open socket descriptor count 
> for the broker pid continued to increase although the NIC was down, leading to the 
> open file descriptor error. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5413) Log cleaner fails due to large offset in segment file

2017-06-26 Thread Nicholas Ngorok (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063514#comment-16063514
 ] 

Nicholas Ngorok commented on KAFKA-5413:


Thanks [~Kelvinrutt] for getting this solved very quickly! [~junrao], is there a 
timeline or plan for the 0.10.2.2 release?

> Log cleaner fails due to large offset in segment file
> -
>
> Key: KAFKA-5413
> URL: https://issues.apache.org/jira/browse/KAFKA-5413
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.2.0, 0.10.2.1
> Environment: Ubuntu 14.04 LTS, Oracle Java 8u92, kafka_2.11-0.10.2.0
>Reporter: Nicholas Ngorok
>Assignee: Kelvin Rutt
>Priority: Critical
>  Labels: reliability
> Fix For: 0.11.0.0, 0.10.2.2
>
> Attachments: .index.cleaned, 
> .log, .log.cleaned, 
> .timeindex.cleaned, 002147422683.log, 
> kafka-5413.patch
>
>
> The log cleaner thread in our brokers is failing with the trace below
> {noformat}
> [2017-06-08 15:49:54,822] INFO {kafka-log-cleaner-thread-0} Cleaner 0: 
> Cleaning segment 0 in log __consumer_offsets-12 (largest timestamp Thu Jun 08 
> 15:48:59 PDT 2017) into 0, retaining deletes. (kafka.log.LogCleaner)
> [2017-06-08 15:49:54,822] INFO {kafka-log-cleaner-thread-0} Cleaner 0: 
> Cleaning segment 2147343575 in log __consumer_offsets-12 (largest timestamp 
> Thu Jun 08 15:49:06 PDT 2017) into 0, retaining deletes. 
> (kafka.log.LogCleaner)
> [2017-06-08 15:49:54,834] ERROR {kafka-log-cleaner-thread-0} 
> [kafka-log-cleaner-thread-0], Error due to  (kafka.log.LogCleaner)
> java.lang.IllegalArgumentException: requirement failed: largest offset in 
> message set can not be safely converted to relative offset.
> at scala.Predef$.require(Predef.scala:224)
> at kafka.log.LogSegment.append(LogSegment.scala:109)
> at kafka.log.Cleaner.cleanInto(LogCleaner.scala:478)
> at 
> kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:405)
> at 
> kafka.log.Cleaner$$anonfun$cleanSegments$1.apply(LogCleaner.scala:401)
> at scala.collection.immutable.List.foreach(List.scala:381)
> at kafka.log.Cleaner.cleanSegments(LogCleaner.scala:401)
> at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:363)
> at kafka.log.Cleaner$$anonfun$clean$4.apply(LogCleaner.scala:362)
> at scala.collection.immutable.List.foreach(List.scala:381)
> at kafka.log.Cleaner.clean(LogCleaner.scala:362)
> at 
> kafka.log.LogCleaner$CleanerThread.cleanOrSleep(LogCleaner.scala:241)
> at kafka.log.LogCleaner$CleanerThread.doWork(LogCleaner.scala:220)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> [2017-06-08 15:49:54,835] INFO {kafka-log-cleaner-thread-0} 
> [kafka-log-cleaner-thread-0], Stopped  (kafka.log.LogCleaner)
> {noformat}
> This seems to point at the specific line [here| 
> https://github.com/apache/kafka/blob/0.11.0/core/src/main/scala/kafka/log/LogSegment.scala#L92]
>  in the kafka src where the difference is actually larger than MAXINT as both 
> baseOffset and offset are of type long. It was introduced in this [pr| 
> https://github.com/apache/kafka/pull/2210/files/56d1f8196b77a47b176b7bbd1e4220a3be827631]
> These were the outputs of dumping the first two log segments
> {noformat}
> :~$ /usr/bin/kafka-run-class kafka.tools.DumpLogSegments --deep-iteration 
> --files /kafka-logs/__consumer_offsets-12/000
> 0.log
> Dumping /kafka-logs/__consumer_offsets-12/.log
> Starting offset: 0
> offset: 1810054758 position: 0 NoTimestampType: -1 isvalid: true payloadsize: 
> -1 magic: 0 compresscodec: NONE crc: 3127861909 keysize: 34
> :~$ /usr/bin/kafka-run-class kafka.tools.DumpLogSegments --deep-iteration 
> --files /kafka-logs/__consumer_offsets-12/000
> 0002147343575.log
> Dumping /kafka-logs/__consumer_offsets-12/002147343575.log
> Starting offset: 2147343575
> offset: 2147539884 position: 0 NoTimestampType: -1 isvalid: true paylo
> adsize: -1 magic: 0 compresscodec: NONE crc: 2282192097 keysize: 34
> {noformat}
> My guess is that since 2147539884 is larger than MAXINT, we are hitting this 
> exception. Was there a specific reason this check was added in 0.10.2?
> E.g. if the first offset has key = "key 0" and is then followed by MAXINT + 1 
> records with "key 1", wouldn't we run into this situation whenever the log 
> cleaner runs?
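
For illustration, the arithmetic behind the failed requirement, using the offsets from 
the dumps above (a sketch only, not the cleaner code):

{code}
// The cleaned segment is written with base offset 0, but the largest offset in the
// second segment is 2147539884, so the relative offset no longer fits in an Int.
public class RelativeOffsetCheck {
    public static void main(String[] args) {
        long baseOffset = 0L;
        long largestOffset = 2147539884L;
        long relativeOffset = largestOffset - baseOffset;
        // prints true -> "largest offset in message set can not be safely converted"
        System.out.println(relativeOffset > Integer.MAX_VALUE);
    }
}
{code}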



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-3465) kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode

2017-06-26 Thread Colin P. McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063569#comment-16063569
 ] 

Colin P. McCabe commented on KAFKA-3465:


Should we rename the tool or add documentation to make it clear when people 
find this class, that it is only for the old consumer?

> kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode
> --
>
> Key: KAFKA-3465
> URL: https://issues.apache.org/jira/browse/KAFKA-3465
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: BrianLing
>
> 1. When we enable MirrorMaker to migrate Kafka events from one cluster to another 
> in "new.consumer" mode:
> java -Xmx2G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 
> -XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC 
> -Djava.awt.headless=true -Dcom.sun.management.jmxremote 
> -Dcom.sun.management.jmxremote.authenticate=false 
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dkafka.logs.dir=/kafka/kafka-app-logs 
> -Dlog4j.configuration=file:/kafka/kafka_2.10-0.9.0.0/bin/../config/tools-log4j.properties
>  -cp :/kafka/kafka_2.10-0.9.0.0/bin/../libs/* 
> -Dkafka.logs.filename=lvs-slca-mm.log kafka.tools.MirrorMaker lvs-slca-mm.log 
> --consumer.config ../config/consumer.properties --new.consumer --num.streams 
> 4 --producer.config ../config/producer-slca.properties --whitelist risk.*
> 2. When we use the ConsumerOffsetChecker tool, we notice the lag won't change and 
> the owner is none.
> bin/kafka-run-class.sh  kafka.tools.ConsumerOffsetChecker --broker-info 
> --group lvs.slca.mirrormaker --zookeeper lvsdmetlvm01.lvs.paypal.com:2181 
> --topic 
> Group   Topic  Pid Offset  logSize
>  Lag Owner
> lvs.slca.mirrormaker   0   418578332   418678347   100015 
>  none
> lvs.slca.mirrormaker  1   418598026   418698338   100312  
> none
> [Root Cause]
> I think it's due to the 0.9.0 feature that switches the zookeeper dependency to 
> kafka internals to store offset & consumer owner information. 
>   Does it mean we cannot use the below command to check a new consumer’s 
> lag, since the current lag formula is: lag = logSize – offset 
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala#L80
>   
> https://github.com/apache/kafka/blob/0.9.0/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala#L174-L182
>  => offSet Fetch from zookeeper instead of from Kafka



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-3806) Adjust default values of log.retention.hours and offsets.retention.minutes

2017-06-26 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063579#comment-16063579
 ] 

Jun Rao commented on KAFKA-3806:


[~shangdi], good point on the issue with mirrormaker. I have no objection to 
increasing the default offsets.retention. However, as long as offsets.retention 
is not infinite, it seems that the issue with mirrormaker can still happen, 
i.e., the consumer in MM is still running, but no new offsets are updated since 
no new data is coming. For that case, it seems that we need a different 
solution (e.g. tracking when an offset is obsolete based on the activity of the 
group). 

> Adjust default values of log.retention.hours and offsets.retention.minutes
> --
>
> Key: KAFKA-3806
> URL: https://issues.apache.org/jira/browse/KAFKA-3806
> Project: Kafka
>  Issue Type: Improvement
>  Components: config
>Affects Versions: 0.9.0.1, 0.10.0.0
>Reporter: Michal Turek
>Priority: Minor
>
> Combination of default values of log.retention.hours (168 hours = 7 days) and 
> offsets.retention.minutes (1440 minutes = 1 day) may be dangerous in special 
> cases. Offset retention should always be greater than log retention.
> We have observed the following scenario and issue:
> - Producing of data to a topic was disabled two days ago by producer update, 
> topic wasn't deleted.
> - Consumer consumed all data and properly committed offsets to Kafka.
> - Consumer made no more offset commits for that topic because there was no 
> more incoming data and there was nothing to confirm. (We have auto-commit 
> disabled; I'm not sure how enabled auto-commit behaves.)
> - After one day: Kafka cleared too old offsets according to 
> offsets.retention.minutes.
> - After two days: Long-term running consumer was restarted after update, it 
> didn't find any committed offsets for that topic since they were deleted by 
> offsets.retention.minutes so it started consuming from the beginning.
> - The messages were still in Kafka due to larger log.retention.hours, about 5 
> days of messages were read again.
> Known workaround to solve this issue:
> - Explicitly configure log.retention.hours and offsets.retention.minutes, 
> don't use defaults.
> Proposals:
> - Prolong the default value of offsets.retention.minutes to be at least twice 
> as large as log.retention.hours.
> - Check these values during Kafka startup and log a warning if 
> offsets.retention.minutes is smaller than log.retention.hours.
> - Add a note to migration guide about differences between storing of offsets 
> in ZooKeeper and Kafka (http://kafka.apache.org/documentation.html#upgrade).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5519) Support for multiple certificates in a single keystore

2017-06-26 Thread Alla Tumarkin (JIRA)
Alla Tumarkin created KAFKA-5519:


 Summary: Support for multiple certificates in a single keystore
 Key: KAFKA-5519
 URL: https://issues.apache.org/jira/browse/KAFKA-5519
 Project: Kafka
  Issue Type: New Feature
  Components: security
Affects Versions: 0.10.2.1
Reporter: Alla Tumarkin


Background
Currently, we need to have a keystore exclusive to the component with exactly 
one key in it. Looking at the JSSE Reference guide, it seems like we would need 
to introduce our own KeyManager into the SSLContext which selects a 
configurable key alias name.
https://docs.oracle.com/javase/7/docs/api/javax/net/ssl/X509KeyManager.html 
has methods for dealing with aliases.
The goal here is to use a specific certificate (with proper ACLs set for this 
client), and not just the first one that matches.
It looks like this requires a code change to the SSLChannelBuilder.
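
A minimal sketch of such a key manager, which delegates to an existing 
X509ExtendedKeyManager but always chooses a configured alias; SingleAliasKeyManager is 
a hypothetical name used for illustration, not an existing Kafka class:

{code}
import java.net.Socket;
import java.security.Principal;
import java.security.PrivateKey;
import java.security.cert.X509Certificate;
import javax.net.ssl.X509ExtendedKeyManager;

/** Delegates to the default key manager but always selects the configured alias. */
public class SingleAliasKeyManager extends X509ExtendedKeyManager {

    private final X509ExtendedKeyManager delegate;
    private final String alias;

    public SingleAliasKeyManager(X509ExtendedKeyManager delegate, String alias) {
        this.delegate = delegate;
        this.alias = alias;
    }

    @Override
    public String chooseClientAlias(String[] keyType, Principal[] issuers, Socket socket) {
        return alias; // pin the configured alias instead of the first match
    }

    @Override
    public String chooseServerAlias(String keyType, Principal[] issuers, Socket socket) {
        return alias;
    }

    @Override
    public String[] getClientAliases(String keyType, Principal[] issuers) {
        return delegate.getClientAliases(keyType, issuers);
    }

    @Override
    public String[] getServerAliases(String keyType, Principal[] issuers) {
        return delegate.getServerAliases(keyType, issuers);
    }

    @Override
    public X509Certificate[] getCertificateChain(String alias) {
        return delegate.getCertificateChain(alias);
    }

    @Override
    public PrivateKey getPrivateKey(String alias) {
        return delegate.getPrivateKey(alias);
    }
}
{code}

Such a key manager would then be installed into the SSLContext in place of the default 
one, with the alias coming from a new configuration option.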



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-3465) kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode

2017-06-26 Thread Vahid Hashemian (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063618#comment-16063618
 ] 

Vahid Hashemian commented on KAFKA-3465:


[~cmccabe] I have opened a PR to update the relevant documentation 
[here|https://github.com/apache/kafka/pull/3405]. Would that address your 
concern? Thanks. 

> kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode
> --
>
> Key: KAFKA-3465
> URL: https://issues.apache.org/jira/browse/KAFKA-3465
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: BrianLing
>
> 1. When we enable MirrorMaker to migrate Kafka events from one cluster to another 
> in "new.consumer" mode:
> java -Xmx2G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 
> -XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC 
> -Djava.awt.headless=true -Dcom.sun.management.jmxremote 
> -Dcom.sun.management.jmxremote.authenticate=false 
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dkafka.logs.dir=/kafka/kafka-app-logs 
> -Dlog4j.configuration=file:/kafka/kafka_2.10-0.9.0.0/bin/../config/tools-log4j.properties
>  -cp :/kafka/kafka_2.10-0.9.0.0/bin/../libs/* 
> -Dkafka.logs.filename=lvs-slca-mm.log kafka.tools.MirrorMaker lvs-slca-mm.log 
> --consumer.config ../config/consumer.properties --new.consumer --num.streams 
> 4 --producer.config ../config/producer-slca.properties --whitelist risk.*
> 2. When we use the ConsumerOffsetChecker tool, we notice the lag won't change and 
> the owner is none.
> bin/kafka-run-class.sh  kafka.tools.ConsumerOffsetChecker --broker-info 
> --group lvs.slca.mirrormaker --zookeeper lvsdmetlvm01.lvs.paypal.com:2181 
> --topic 
> Group   Topic  Pid Offset  logSize
>  Lag Owner
> lvs.slca.mirrormaker   0   418578332   418678347   100015 
>  none
> lvs.slca.mirrormaker  1   418598026   418698338   100312  
> none
> [Root Cause]
> I think it's due to the 0.9.0 feature that switches the zookeeper dependency to 
> kafka internals to store offset & consumer owner information. 
>   Does it mean we cannot use the below command to check a new consumer’s 
> lag, since the current lag formula is: lag = logSize – offset 
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala#L80
>   
> https://github.com/apache/kafka/blob/0.9.0/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala#L174-L182
>  => offSet Fetch from zookeeper instead of from Kafka



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-3465) kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode

2017-06-26 Thread Colin P. McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063629#comment-16063629
 ] 

Colin P. McCabe commented on KAFKA-3465:


What about adding a comment in {{ConsumerOffsetChecker.scala}}?

> kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode
> --
>
> Key: KAFKA-3465
> URL: https://issues.apache.org/jira/browse/KAFKA-3465
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: BrianLing
>
> 1. When we enable MirrorMaker to migrate Kafka events from one cluster to another 
> in "new.consumer" mode:
> java -Xmx2G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 
> -XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC 
> -Djava.awt.headless=true -Dcom.sun.management.jmxremote 
> -Dcom.sun.management.jmxremote.authenticate=false 
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dkafka.logs.dir=/kafka/kafka-app-logs 
> -Dlog4j.configuration=file:/kafka/kafka_2.10-0.9.0.0/bin/../config/tools-log4j.properties
>  -cp :/kafka/kafka_2.10-0.9.0.0/bin/../libs/* 
> -Dkafka.logs.filename=lvs-slca-mm.log kafka.tools.MirrorMaker lvs-slca-mm.log 
> --consumer.config ../config/consumer.properties --new.consumer --num.streams 
> 4 --producer.config ../config/producer-slca.properties --whitelist risk.*
> 2. When we use the ConsumerOffsetChecker tool, we notice the lag won't change and 
> the owner is none.
> bin/kafka-run-class.sh  kafka.tools.ConsumerOffsetChecker --broker-info 
> --group lvs.slca.mirrormaker --zookeeper lvsdmetlvm01.lvs.paypal.com:2181 
> --topic 
> Group   Topic  Pid Offset  logSize
>  Lag Owner
> lvs.slca.mirrormaker   0   418578332   418678347   100015 
>  none
> lvs.slca.mirrormaker  1   418598026   418698338   100312  
> none
> [Root Cause]
> I think it's due to the 0.9.0 feature that switches the zookeeper dependency to 
> kafka internals to store offset & consumer owner information. 
>   Does it mean we cannot use the below command to check a new consumer’s 
> lag, since the current lag formula is: lag = logSize – offset 
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala#L80
>   
> https://github.com/apache/kafka/blob/0.9.0/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala#L174-L182
>  => offSet Fetch from zookeeper instead of from Kafka



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-3465) kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode

2017-06-26 Thread Vahid Hashemian (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063675#comment-16063675
 ] 

Vahid Hashemian commented on KAFKA-3465:


Yeah, we could improve the warning message of the tool. I can submit a PR for 
it.

> kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode
> --
>
> Key: KAFKA-3465
> URL: https://issues.apache.org/jira/browse/KAFKA-3465
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: BrianLing
>
> 1. When we enable MirrorMaker to migrate Kafka events from one cluster to another 
> in "new.consumer" mode:
> java -Xmx2G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 
> -XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC 
> -Djava.awt.headless=true -Dcom.sun.management.jmxremote 
> -Dcom.sun.management.jmxremote.authenticate=false 
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dkafka.logs.dir=/kafka/kafka-app-logs 
> -Dlog4j.configuration=file:/kafka/kafka_2.10-0.9.0.0/bin/../config/tools-log4j.properties
>  -cp :/kafka/kafka_2.10-0.9.0.0/bin/../libs/* 
> -Dkafka.logs.filename=lvs-slca-mm.log kafka.tools.MirrorMaker lvs-slca-mm.log 
> --consumer.config ../config/consumer.properties --new.consumer --num.streams 
> 4 --producer.config ../config/producer-slca.properties --whitelist risk.*
> 2. When we use the ConsumerOffsetChecker tool, we notice the lag won't change and 
> the owner is none.
> bin/kafka-run-class.sh  kafka.tools.ConsumerOffsetChecker --broker-info 
> --group lvs.slca.mirrormaker --zookeeper lvsdmetlvm01.lvs.paypal.com:2181 
> --topic 
> Group   Topic  Pid Offset  logSize
>  Lag Owner
> lvs.slca.mirrormaker   0   418578332   418678347   100015 
>  none
> lvs.slca.mirrormaker  1   418598026   418698338   100312  
> none
> [Root Cause]
> I think it's due to the 0.9.0 feature that switches the zookeeper dependency to 
> kafka internals to store offset & consumer owner information. 
>   Does it mean we cannot use the below command to check a new consumer’s 
> lag, since the current lag formula is: lag = logSize – offset 
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala#L80
>   
> https://github.com/apache/kafka/blob/0.9.0/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala#L174-L182
>  => offSet Fetch from zookeeper instead of from Kafka



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5520) Extend Consumer Group Reset Offset tool for Stream Applications

2017-06-26 Thread Jorge Quilcate (JIRA)
Jorge Quilcate created KAFKA-5520:
-

 Summary: Extend Consumer Group Reset Offset tool for Stream 
Applications
 Key: KAFKA-5520
 URL: https://issues.apache.org/jira/browse/KAFKA-5520
 Project: Kafka
  Issue Type: Improvement
  Components: core, tools
Reporter: Jorge Quilcate
 Fix For: 0.11.1.0


KIP documentation: TODO



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5520) Extend Consumer Group Reset Offset tool for Stream Applications

2017-06-26 Thread Jorge Quilcate (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jorge Quilcate updated KAFKA-5520:
--
Description: KIP documentation: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-171+-+Extend+Consumer+Group+Reset+Offset+for+Stream+Application
  (was: KIP documentation: TODO)

> Extend Consumer Group Reset Offset tool for Stream Applications
> ---
>
> Key: KAFKA-5520
> URL: https://issues.apache.org/jira/browse/KAFKA-5520
> Project: Kafka
>  Issue Type: Improvement
>  Components: core, tools
>Reporter: Jorge Quilcate
>  Labels: kip
> Fix For: 0.11.1.0
>
>
> KIP documentation: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-171+-+Extend+Consumer+Group+Reset+Offset+for+Stream+Application



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-3465) kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063725#comment-16063725
 ] 

ASF GitHub Bot commented on KAFKA-3465:
---

GitHub user vahidhashemian opened a pull request:

https://github.com/apache/kafka/pull/3438

KAFKA-3465: Clarify warning message of ConsumerOffsetChecker

Add that the tool works with the old consumer only.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vahidhashemian/kafka KAFKA-3465

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3438.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3438


commit fa68749267b40d156b3811368e2f49c5c8813a43
Author: Vahid Hashemian 
Date:   2017-06-26T20:24:46Z

KAFKA-3465: Clarify warning message of ConsumerOffsetChecker

Add that the tool works with the old consumer only.




> kafka.tools.ConsumerOffsetChecker won't align with kafka New Consumer mode
> --
>
> Key: KAFKA-3465
> URL: https://issues.apache.org/jira/browse/KAFKA-3465
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: BrianLing
>
> 1. When we enable MirrorMaker to migrate Kafka events from one cluster to another 
> in "new.consumer" mode:
> java -Xmx2G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 
> -XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC 
> -Djava.awt.headless=true -Dcom.sun.management.jmxremote 
> -Dcom.sun.management.jmxremote.authenticate=false 
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dkafka.logs.dir=/kafka/kafka-app-logs 
> -Dlog4j.configuration=file:/kafka/kafka_2.10-0.9.0.0/bin/../config/tools-log4j.properties
>  -cp :/kafka/kafka_2.10-0.9.0.0/bin/../libs/* 
> -Dkafka.logs.filename=lvs-slca-mm.log kafka.tools.MirrorMaker lvs-slca-mm.log 
> --consumer.config ../config/consumer.properties --new.consumer --num.streams 
> 4 --producer.config ../config/producer-slca.properties --whitelist risk.*
> 2. When we use the ConsumerOffsetChecker tool, we notice the lag won't change and 
> the owner is none.
> bin/kafka-run-class.sh  kafka.tools.ConsumerOffsetChecker --broker-info 
> --group lvs.slca.mirrormaker --zookeeper lvsdmetlvm01.lvs.paypal.com:2181 
> --topic 
> Group   Topic  Pid Offset  logSize
>  Lag Owner
> lvs.slca.mirrormaker   0   418578332   418678347   100015 
>  none
> lvs.slca.mirrormaker  1   418598026   418698338   100312  
> none
> [Root Cause]
> I think it's due to the 0.9.0 feature that switches the zookeeper dependency to 
> kafka internals to store offset & consumer owner information. 
>   Does it mean we cannot use the below command to check a new consumer’s 
> lag, since the current lag formula is: lag = logSize – offset 
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala#L80
>   
> https://github.com/apache/kafka/blob/0.9.0/core/src/main/scala/kafka/tools/ConsumerOffsetChecker.scala#L174-L182
>  => offSet Fetch from zookeeper instead of from Kafka



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-4849) Bug in KafkaStreams documentation

2017-06-26 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063739#comment-16063739
 ] 

Guozhang Wang commented on KAFKA-4849:
--

We have resolved all the reported issues in this JIRA except the one in web 
docs, which is covered in KAFKA-4705 already. Resolving this ticket for now.

> Bug in KafkaStreams documentation
> -
>
> Key: KAFKA-4849
> URL: https://issues.apache.org/jira/browse/KAFKA-4849
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Seweryn Habdank-Wojewodzki
>Assignee: Matthias J. Sax
>Priority: Minor
>
> At the page: https://kafka.apache.org/documentation/streams
>  
> In the chapter titled Application Configuration and Execution, in the example 
> there is a line:
>  
> settings.put(StreamsConfig.ZOOKEEPER_CONNECT_CONFIG, "zookeeper1:2181");
>  
> but ZOOKEEPER_CONNECT_CONFIG is deprecated in the Kafka version 0.10.2.0.
>  
> Also the table on the page: 
> https://kafka.apache.org/0102/documentation/#streamsconfigs is a bit 
> misleading.
> 1. Again zookeeper.connect is deprecated.
> 2. The client.id and zookeeper.connect are marked as high importance, 
> but according to http://docs.confluent.io/3.2.0/streams/developer-guide.html 
> neither of them is important for initializing the stream.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-3806) Adjust default values of log.retention.hours and offsets.retention.minutes

2017-06-26 Thread James Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063798#comment-16063798
 ] 

James Cheng commented on KAFKA-3806:


[~junrao], your example of "track when an offset is obsolete based on the 
activity of the group" is being tracked in 
https://issues.apache.org/jira/browse/KAFKA-4682

> Adjust default values of log.retention.hours and offsets.retention.minutes
> --
>
> Key: KAFKA-3806
> URL: https://issues.apache.org/jira/browse/KAFKA-3806
> Project: Kafka
>  Issue Type: Improvement
>  Components: config
>Affects Versions: 0.9.0.1, 0.10.0.0
>Reporter: Michal Turek
>Priority: Minor
>
> Combination of default values of log.retention.hours (168 hours = 7 days) and 
> offsets.retention.minutes (1440 minutes = 1 day) may be dangerous in special 
> cases. Offset retention should always be greater than log retention.
> We have observed the following scenario and issue:
> - Producing of data to a topic was disabled two days ago by producer update, 
> topic wasn't deleted.
> - Consumer consumed all data and properly committed offsets to Kafka.
> - Consumer made no more offset commits for that topic because there was no 
> more incoming data and there was nothing to confirm. (We have auto-commit 
> disabled; I'm not sure how enabled auto-commit behaves.)
> - After one day: Kafka cleared too old offsets according to 
> offsets.retention.minutes.
> - After two days: Long-term running consumer was restarted after update, it 
> didn't find any committed offsets for that topic since they were deleted by 
> offsets.retention.minutes so it started consuming from the beginning.
> - The messages were still in Kafka due to larger log.retention.hours, about 5 
> days of messages were read again.
> Known workaround to solve this issue:
> - Explicitly configure log.retention.hours and offsets.retention.minutes, 
> don't use defaults.
> Proposals:
> - Prolong the default value of offsets.retention.minutes to be at least twice 
> as large as log.retention.hours.
> - Check these values during Kafka startup and log a warning if 
> offsets.retention.minutes is smaller than log.retention.hours.
> - Add a note to migration guide about differences between storing of offsets 
> in ZooKeeper and Kafka (http://kafka.apache.org/documentation.html#upgrade).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5464) StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063937#comment-16063937
 ] 

ASF GitHub Bot commented on KAFKA-5464:
---

GitHub user mjsax opened a pull request:

https://github.com/apache/kafka/pull/3439

KAFKA-5464: StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mjsax/kafka kafka-5464-streamskafkaclient-poll

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3439.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3439


commit 609ceeb0d546f02b9da2638545784f050dc9558a
Author: Matthias J. Sax 
Date:   2017-06-26T22:52:56Z

KAFKA-5464: StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG




> StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG
> --
>
> Key: KAFKA-5464
> URL: https://issues.apache.org/jira/browse/KAFKA-5464
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.10.2.1
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>
> In {{StreamsKafkaClient}} we use a {{NetworkClient}} internally and call 
> {{poll}} using {{StreamsConfig.POLL_MS_CONFIG}} as the timeout.
> However, {{StreamsConfig.POLL_MS_CONFIG}} is solely meant to be applied to 
> {{KafkaConsumer.poll()}} and it's incorrect to use it for the 
> {{NetworkClient}}. If the config is increased, this can lead to an infinite 
> rebalance, because the poll time on the client side is increased and thus the 
> client is not able to meet broker-enforced timeouts anymore.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5464) StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG

2017-06-26 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-5464:
---
Fix Version/s: 0.11.0.1
   0.10.2.2

> StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG
> --
>
> Key: KAFKA-5464
> URL: https://issues.apache.org/jira/browse/KAFKA-5464
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.10.2.1
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
> Fix For: 0.10.2.2, 0.11.0.1
>
>
> In {{StreamsKafkaClient}} we use a {{NetworkClient}} internally and call 
> {{poll}} using {{StreamsConfig.POLL_MS_CONFIG}} as the timeout.
> However, {{StreamsConfig.POLL_MS_CONFIG}} is solely meant to be applied to 
> {{KafkaConsumer.poll()}} and it's incorrect to use it for the 
> {{NetworkClient}}. If the config is increased, this can lead to an infinite 
> rebalance, because the poll time on the client side is increased and thus the 
> client is not able to meet broker-enforced timeouts anymore.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (KAFKA-5464) StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG

2017-06-26 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-5464:
---
Fix Version/s: 0.11.1.0

> StreamsKafkaClient should not use StreamsConfig.POLL_MS_CONFIG
> --
>
> Key: KAFKA-5464
> URL: https://issues.apache.org/jira/browse/KAFKA-5464
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.10.2.1
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
> Fix For: 0.11.1.0, 0.10.2.2, 0.11.0.1
>
>
> In {{StreamsKafkaClient}} we use a {{NetworkClient}} internally and call 
> {{poll}} using {{StreamsConfig.POLL_MS_CONFIG}} as the timeout.
> However, {{StreamsConfig.POLL_MS_CONFIG}} is solely meant to be applied to 
> {{KafkaConsumer.poll()}} and it's incorrect to use it for the 
> {{NetworkClient}}. If the config is increased, this can lead to an infinite 
> rebalance, because the poll time on the client side is increased and thus the 
> client is not able to meet broker-enforced timeouts anymore.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-4750) KeyValueIterator returns null values

2017-06-26 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063945#comment-16063945
 ] 

Guozhang Wang commented on KAFKA-4750:
--

[~mjsax][~evis] [~mihbor] Thanks for your comments. I would like to think a bit 
more on the general resolution for this case though before reviewing [~evis]'s 
patch:

1. In Kafka messages, "null" byte arrays indicate tombstones. Note that this 
means that if a user's serde decides to serialize any object into null for a log 
compacted topic (e.g. a changelog topic of a state store), it is meant to delete 
the record from the store.

2. In Kafka Streams state stores, we do NOT enforce that "null" indicates 
deletion in the javadoc:

{code}
/**
 * Update the value associated with this key
 *
 * @param key The key to associate the value to
 * @param value The value, it can be null.
 * @throws NullPointerException If null is used for key.
 */
void put(K key, V value);
{code}

However, our implementation does treat a value-typed "null" (note it is not a "null" 
byte array as in serialized messages) as a deletion, since we implement 
{{delete(key)}} as {{put(key, null)}}. As Evgeny / Michal mentioned, it would be 
intuitive if our {{put}} semantics aligned with Java's map operations:

{code}
...  // store initialized as empty

store.get(key); // returns null

store.put(key, value);
store.delete(key);
store.get(key);  // returns null

store.put(key, value);
store.put(key, null);  // we can interpret it as "associate the key with null" 
or simply delete this key
store.get(key);  // returns null, though generally speaking it could indicate 
either the key is associated with value or the key does not exist
{code}

Now assuming you have a customized serde that maps "null" object to "not-null" 
byte arrays, in this case the above would still hold:

{code}
store.put(key, value);
store.put(key, null);  // now "null" object is just a special value that do not 
indicate deletion
store.get(key);  // returns null, but this should be interpreted as "the key is 
associated with null"
{code}

Now assuming you have a customized serde that maps "not null" object to "null" 
byte arrays, in this case the "not-null" object is really interpreted as a 
dummy value that the above still holds

{code}
store.put(key, value);
store.put(key, MY_DUMMY);  // serialized into "null" byte arrays
store.get(key);  // returns MY_DUMMY as "null" byte arrays is deserialized 
symmetrically
{code}

So I think if we want to allow the above customized interpretation then we 
should not implement {{delete()}} as {{put(key, null)}}, since "null" objects 
may not indicate deletions; if we want to be more restrictive then we should 
emphasize in the javadoc above that "@param value The value, it can be 
null which indicates deletion of the key".

WDYT?

> KeyValueIterator returns null values
> 
>
> Key: KAFKA-4750
> URL: https://issues.apache.org/jira/browse/KAFKA-4750
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.1.1, 0.11.0.0, 0.10.2.1
>Reporter: Michal Borowiecki
>Assignee: Evgeny Veretennikov
>  Labels: newbie
> Attachments: DeleteTest.java
>
>
> The API for ReadOnlyKeyValueStore.range method promises the returned iterator 
> will not return null values. However, after upgrading from 0.10.0.0 to 
> 0.10.1.1 we found null values are returned causing NPEs on our side.
> I found this happens after removing entries from the store and I found 
> resemblance to SAMZA-94 defect. The problem seems to be as it was there, when 
> deleting entries and having a serializer that does not return null when null 
> is passed in, the state store doesn't actually delete that key/value pair but 
> the iterator will return null value for that key.
> When I modified our serializer to return null when null is passed in, the 
> problem went away. However, I believe this should be fixed in kafka streams, 
> perhaps with a similar approach as SAMZA-94.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-4750) KeyValueIterator returns null values

2017-06-26 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063956#comment-16063956
 ] 

Matthias J. Sax commented on KAFKA-4750:


I think [~damianguy] or [~enothereska] can comment best on this. AFAIK, we use 
{{put(key,null)}} with delete semantics all over the place, also for {{KTable}} 
caches. As it aligns with changelog delete semantics, I also think it makes 
sense to keep it this way. I would rather educate users who plug in a Serde to 
not return {{null}} if the input is not {{null}}. We can also add checks to all 
{{Serde}} calls: (1) never call the Serde for {{null}} as we know the result must 
be {{null}} anyway; (2) if we call the Serde with not-null, make sure it does not 
return {{null}} -- otherwise throw an exception.
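
A minimal sketch of check (2) as a wrapping serializer; NullGuardSerializer is a 
hypothetical name used for illustration, not an existing Streams class:

{code}
import java.util.Map;
import org.apache.kafka.common.serialization.Serializer;

/** Wraps a user serializer and rejects null output for non-null input. */
public class NullGuardSerializer<T> implements Serializer<T> {

    private final Serializer<T> inner;

    public NullGuardSerializer(Serializer<T> inner) {
        this.inner = inner;
    }

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
        inner.configure(configs, isKey);
    }

    @Override
    public byte[] serialize(String topic, T data) {
        if (data == null) {
            return null; // check (1): null input keeps its tombstone meaning
        }
        byte[] bytes = inner.serialize(topic, data);
        if (bytes == null) {
            // check (2): a non-null value must not serialize to a null byte array
            throw new IllegalStateException(
                "Serializer returned null for a non-null value on topic " + topic);
        }
        return bytes;
    }

    @Override
    public void close() {
        inner.close();
    }
}
{code}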

> KeyValueIterator returns null values
> 
>
> Key: KAFKA-4750
> URL: https://issues.apache.org/jira/browse/KAFKA-4750
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.1.1, 0.11.0.0, 0.10.2.1
>Reporter: Michal Borowiecki
>Assignee: Evgeny Veretennikov
>  Labels: newbie
> Attachments: DeleteTest.java
>
>
> The API for ReadOnlyKeyValueStore.range method promises the returned iterator 
> will not return null values. However, after upgrading from 0.10.0.0 to 
> 0.10.1.1 we found null values are returned causing NPEs on our side.
> I found this happens after removing entries from the store and I found 
> resemblance to SAMZA-94 defect. The problem seems to be as it was there, when 
> deleting entries and having a serializer that does not return null when null 
> is passed in, the state store doesn't actually delete that key/value pair but 
> the iterator will return null value for that key.
> When I modified our serializer to return null when null is passed in, the 
> problem went away. However, I believe this should be fixed in kafka streams, 
> perhaps with a similar approach as SAMZA-94.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (KAFKA-5367) Producer should not expire topic from metadata cache if accumulator still has data for this topic

2017-06-26 Thread Dong Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Lin resolved KAFKA-5367.
-
Resolution: Invalid

> Producer should not expire topic from metadata cache if accumulator still has 
> data for this topic
> -
>
> Key: KAFKA-5367
> URL: https://issues.apache.org/jira/browse/KAFKA-5367
> Project: Kafka
>  Issue Type: Bug
>Reporter: Dong Lin
>Assignee: Dong Lin
>
> To be added.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5521) Support replicas movement between log directories (KIP-113)

2017-06-26 Thread Dong Lin (JIRA)
Dong Lin created KAFKA-5521:
---

 Summary: Support replicas movement between log directories 
(KIP-113)
 Key: KAFKA-5521
 URL: https://issues.apache.org/jira/browse/KAFKA-5521
 Project: Kafka
  Issue Type: Bug
Reporter: Dong Lin
Assignee: Dong Lin






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-5522) ListOffset should take LSO into account when searching by timestamp

2017-06-26 Thread Jason Gustafson (JIRA)
Jason Gustafson created KAFKA-5522:
--

 Summary: ListOffset should take LSO into account when searching by 
timestamp
 Key: KAFKA-5522
 URL: https://issues.apache.org/jira/browse/KAFKA-5522
 Project: Kafka
  Issue Type: Sub-task
Affects Versions: 0.11.0.0
Reporter: Jason Gustafson
Assignee: Jason Gustafson
 Fix For: 0.11.0.1


For a normal read_uncommitted consumer, we bound the offset returned from 
ListOffsets by the high watermark. For read_committed consumers, we should 
similarly bound offsets by the LSO. Currently we only handle the case of 
fetching the end offset.
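
As a rough sketch of the intended bounding (illustrative values only, not the broker 
code):

{code}
/** Illustrative only: bound a ListOffsets timestamp lookup by the isolation-level limit. */
public final class ListOffsetBound {

    /** For read_uncommitted the limit is the high watermark; for read_committed it is the LSO. */
    static long boundOffset(long foundOffset, long isolationLimit) {
        return Math.min(foundOffset, isolationLimit);
    }

    public static void main(String[] args) {
        long foundOffset = 120L;      // offset located by timestamp (example value)
        long highWatermark = 150L;    // example value
        long lastStableOffset = 100L; // example value
        System.out.println(boundOffset(foundOffset, highWatermark));    // 120 for read_uncommitted
        System.out.println(boundOffset(foundOffset, lastStableOffset)); // 100 for read_committed
    }
}
{code}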



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-5153) KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting

2017-06-26 Thread Mahesh Veerabathiran (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064176#comment-16064176
 ] 

Mahesh Veerabathiran commented on KAFKA-5153:
-

Having the same issue. Does anyone have a resolution?

> KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting
> ---
>
> Key: KAFKA-5153
> URL: https://issues.apache.org/jira/browse/KAFKA-5153
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.0
> Environment: RHEL 6
> Java Version  1.8.0_91-b14
>Reporter: Arpan
>Priority: Critical
> Attachments: server_1_72server.log, server_2_73_server.log, 
> server_3_74Server.log, server.properties, ThreadDump_1493564142.dump, 
> ThreadDump_1493564177.dump, ThreadDump_1493564249.dump
>
>
> Hi Team, 
> I was earlier referring to issue KAFKA-4477 because the problem I am facing 
> is similar. I tried to search for the same reference in the release docs as well 
> but did not find anything in 0.10.1.1 or 0.10.2.0. I am currently using 
> 2.11_0.10.2.0.
> I have a 3-node cluster for Kafka and a cluster for ZK as well, on the same set 
> of servers, in cluster mode. We have around 240GB of data getting 
> transferred through Kafka every day. What we are observing is the server 
> disconnecting from the cluster and the ISR getting reduced, which starts 
> impacting the service.
> I have also observed the file descriptor count increasing a bit; in normal 
> circumstances we have not observed an FD count of more than 500, but when the 
> issue started we were observing it in the range of 650-700 on all 3 servers. 
> Attaching thread dumps of all 3 servers from when we started facing the issue 
> recently.
> The issue vanishes once you bounce the nodes, and the setup does not run more 
> than 5 days without this issue. Attaching server logs as well.
> Kindly let me know if you need any additional information. Attaching 
> server.properties as well for one of the servers (it's similar on all 3 
> servers).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (KAFKA-4344) Exception when accessing partition, offset and timestamp in processor class

2017-06-26 Thread Nishkam Ravi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064379#comment-16064379
 ] 

Nishkam Ravi commented on KAFKA-4344:
-

[~guozhang] We are encountering the same error (for topic()). The code is 
written in Scala and is being launched using sbt (spring isn't involved). 
Here's the code sketch:

class GenericProcessor[T <: ThriftStruct](serDe: ThriftStructSerDe[T], decrypt: Boolean, config: Config)
  extends AbstractProcessor[Array[Byte], Array[Byte]] with LazyLogging {

  private var hsmClient: HSMClient = _

  override def init(processorContext: ProcessorContext): Unit = {
    super.init(processorContext)
    hsmClient = HSMClient(config).getOrElse(null)
  }

  override def process(key: Array[Byte], value: Array[Byte]): Unit = {
    val topic: String = this.context.topic() // exception thrown here
    val partition: Int = this.context.partition()
    val offset: Long = this.context.offset()
    val timestamp: Long = this.context.timestamp()
    // business logic
  }
}

The exception is thrown only in the multi-consumer case (when the number of 
partitions for a topic is > 1 and parallelism is > 1). This should be easy to 
reproduce. 

> Exception when accessing partition, offset and timestamp in processor class
> ---
>
> Key: KAFKA-4344
> URL: https://issues.apache.org/jira/browse/KAFKA-4344
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.1.0
>Reporter: saiprasad mishra
>Assignee: Guozhang Wang
>Priority: Minor
>
> I have a kafka streams pipeline like below:
> source topic stream -> filter for null value -> map to make it keyed by id 
> -> custom processor to mystore -> to another topic -> ktable
> I am hitting the below type of exception in a custom processor class if I try 
> to access offset() or partition() or timestamp() from the ProcessorContext in 
> the process() method. I was hoping it would return the partition and offset 
> for the enclosing topic (in this case the source topic) it's consuming from, 
> or -1, based on the API docs.
> java.lang.IllegalStateException: This should not happen as offset() should 
> only be called while a record is processed
>   at 
> org.apache.kafka.streams.processor.internals.ProcessorContextImpl.offset(ProcessorContextImpl.java:181)
>  ~[kafka-streams-0.10.1.0.jar!/:?]
>   at com.sai.repo.MyStore.process(MyStore.java:72) ~[classes!/:?]
>   at com.sai.repo.MyStore.process(MyStore.java:39) ~[classes!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:82)
>  ~[kafka-streams-0.10.1.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:204)
>  ~[kafka-streams-0.10.1.0.jar!/:?]
>   at 
> org.apache.kafka.streams.kstream.internals.KStreamMap$KStreamMapProcessor.process(KStreamMap.java:43)
>  ~[kafka-streams-0.10.1.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:82)
>  ~[kafka-streams-0.10.1.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:204)
>  ~[kafka-streams-0.10.1.0.jar!/:?]
>   at 
> org.apache.kafka.streams.kstream.internals.KStreamFilter$KStreamFilterProcessor.process(KStreamFilter.java:44)
>  ~[kafka-streams-0.10.1.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:82)
>  ~[kafka-streams-0.10.1.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:204)
>  ~[kafka-streams-0.10.1.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.SourceNode.process(SourceNode.java:66)
>  ~[kafka-streams-0.10.1.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:181)
>  ~[kafka-streams-0.10.1.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:436)
>  ~[kafka-streams-0.10.1.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:242)
>  [kafka-streams-0.10.1.0.jar!/:?]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)