[jira] [Commented] (KAFKA-1944) Rename LogCleaner and related classes to LogCompactor
[ https://issues.apache.org/jira/browse/KAFKA-1944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105959#comment-16105959 ]

Jiangjie Qin commented on KAFKA-1944:
-------------------------------------

I agree with [~hachikuji] that we can deprecate {{log.cleaner.enable}}. The name "log.cleaner" appears in a handful of other configurations related to the log cleaner. We will need to change those following the deprecation process as well. It would likely be a quick KIP.

> Rename LogCleaner and related classes to LogCompactor
> -----------------------------------------------------
>
> Key: KAFKA-1944
> URL: https://issues.apache.org/jira/browse/KAFKA-1944
> Project: Kafka
> Issue Type: Bug
> Reporter: Gwen Shapira
> Assignee: Aravind Selvan
> Labels: newbie
>
> Following a mailing list discussion:
> "the name LogCleaner is seriously misleading. It's more of a log compactor.
> Deleting old logs happens elsewhere from what I've seen."
> Note that this may require renaming related classes, objects, configs and
> metrics.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (KAFKA-5670) Add Topology and deprecate TopologyBuilder
[ https://issues.apache.org/jira/browse/KAFKA-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105869#comment-16105869 ]

ASF GitHub Bot commented on KAFKA-5670:
---------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/kafka/pull/3590

> Add Topology and deprecate TopologyBuilder
> ------------------------------------------
>
> Key: KAFKA-5670
> URL: https://issues.apache.org/jira/browse/KAFKA-5670
> Project: Kafka
> Issue Type: Sub-task
> Components: streams
> Reporter: Matthias J. Sax
> Assignee: Matthias J. Sax
> Fix For: 1.0.0
[jira] [Updated] (KAFKA-5666) Need feedback to user if consumption fails due to offsets.topic.replication.factor=3
[ https://issues.apache.org/jira/browse/KAFKA-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Gustafson updated KAFKA-5666:
-----------------------------------

Labels: newbie usability (was: newbie)

> Need feedback to user if consumption fails due to offsets.topic.replication.factor=3
> ------------------------------------------------------------------------------------
>
> Key: KAFKA-5666
> URL: https://issues.apache.org/jira/browse/KAFKA-5666
> Project: Kafka
> Issue Type: Improvement
> Components: clients, consumer
> Affects Versions: 0.11.0.0
> Reporter: Yeva Byzek
> Labels: newbie, usability
>
> Introduced in 0.11: the offsets.topic.replication.factor broker config is now
> enforced upon auto topic creation. Internal auto topic creation will fail
> with a GROUP_COORDINATOR_NOT_AVAILABLE error until the cluster size meets
> this replication factor requirement.
>
> Issue: the default is offsets.topic.replication.factor=3, but in development
> and Docker environments where there is only one broker, the offsets topic
> will fail to be created when a consumer tries to consume, and no records will
> be returned. As a result, the user experience is bad. The user may have no
> idea about this setting change and its enforcement; they just see that
> `kafka-console-consumer` hangs with zero output. It is true that the broker
> log file will provide a message (e.g. {{ERROR [KafkaApi-1] Number of alive
> brokers '1' does not meet the required replication factor '3' for the offsets
> topic (configured via 'offsets.topic.replication.factor'). This error can be
> ignored if the cluster is starting up and not all brokers are up yet.
> (kafka.server.KafkaApis)}}), but many users do not have access to the log
> files or know how to get them.
>
> Suggestion: give feedback to the user/app if the offsets topic cannot be
> created, for example after some timeout.
>
> Workaround: set offsets.topic.replication.factor=1 in single-broker
> environments.
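For reference, the single-broker workaround described in the ticket can be sketched as a broker config fragment (the offsets-topic key is the real broker config; the transaction-log keys are an assumption that the same dev cluster may also use the 0.11 transactional producer, which has analogous settings):

```properties
# server.properties for a one-broker development cluster: let the internal
# __consumer_offsets topic be created with replication factor 1 so that
# consumption does not hang waiting for 3 brokers.
offsets.topic.replication.factor=1

# Assumption: only needed if transactional producers are used in this
# dev cluster; the 0.11 transaction log has the same enforcement.
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
```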
[jira] [Commented] (KAFKA-1944) Rename LogCleaner and related classes to LogCompactor
[ https://issues.apache.org/jira/browse/KAFKA-1944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105845#comment-16105845 ]

Jason Gustafson commented on KAFKA-1944:
----------------------------------------

The name LogCleaner is somewhat entrenched. If this is just a matter of renaming {{LogCleaner}} to {{LogCompactor}}, that could be done in a MINOR PR, but it probably only makes sense if we change the configuration {{log.cleaner.enable}} as well. On the other hand, since we have changed its default value to true, and since two key components now depend on it (i.e. the new consumer and the transactional producer), maybe we should consider deprecating the config instead?

> Rename LogCleaner and related classes to LogCompactor
> -----------------------------------------------------
>
> Key: KAFKA-1944
> URL: https://issues.apache.org/jira/browse/KAFKA-1944
> Project: Kafka
> Issue Type: Bug
> Reporter: Gwen Shapira
> Assignee: Aravind Selvan
> Labels: newbie
>
> Following a mailing list discussion:
> "the name LogCleaner is seriously misleading. It's more of a log compactor.
> Deleting old logs happens elsewhere from what I've seen."
> Note that this may require renaming related classes, objects, configs and
> metrics.
[jira] [Created] (KAFKA-5676) MockStreamsMetrics should be in o.a.k.test
Guozhang Wang created KAFKA-5676:
---------------------------------

Summary: MockStreamsMetrics should be in o.a.k.test
Key: KAFKA-5676
URL: https://issues.apache.org/jira/browse/KAFKA-5676
Project: Kafka
Issue Type: Bug
Components: streams
Reporter: Guozhang Wang

{{MockStreamsMetrics}}'s package should be `o.a.k.test`, not `o.a.k.streams.processor.internals`. In addition, it should not require a {{Metrics}} parameter in its constructor, since that parameter is only needed by the base class it extends; the right way of mocking is to implement {{StreamsMetrics}} with mock behavior rather than extending a real implementation such as {{StreamsMetricsImpl}}.
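The general pattern the ticket advocates, mocking against the interface instead of extending the concrete class, can be sketched as follows. Note the interface and method names here are invented for illustration; they are not the real StreamsMetrics API:

```java
// Hypothetical stand-in for an interface like StreamsMetrics. A mock that
// implements the interface needs no real metrics registry, unlike a mock
// that extends the concrete implementation.
interface MetricsFacade {
    void recordLatency(String sensorName, long startNs, long endNs);
}

class MockMetricsFacade implements MetricsFacade {
    // Stub state: remembers what was recorded instead of updating sensors.
    long lastLatencyNs = -1;

    @Override
    public void recordLatency(String sensorName, long startNs, long endNs) {
        lastLatencyNs = endNs - startNs;
    }
}
```

A test can then assert on the recorded stub state directly, with no dependency on the real metrics machinery.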
[jira] [Commented] (KAFKA-5233) Changes to punctuate semantics (KIP-138)
[ https://issues.apache.org/jira/browse/KAFKA-5233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105814#comment-16105814 ]

Guozhang Wang commented on KAFKA-5233:
--------------------------------------

[~mihbor] While reviewing some other PRs I realized that some places in unit tests still refer to the deprecated `punctuate` function, for example in {{KStreamTestDriver}}. Could you file a follow-up PR to clean them up as well? cc [~damianguy] [~mjsax]

> Changes to punctuate semantics (KIP-138)
> ----------------------------------------
>
> Key: KAFKA-5233
> URL: https://issues.apache.org/jira/browse/KAFKA-5233
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Reporter: Michal Borowiecki
> Assignee: Michal Borowiecki
> Labels: kip
> Fix For: 1.0.0
>
> This ticket is to track implementation of [KIP-138: Change punctuate semantics|https://cwiki.apache.org/confluence/display/KAFKA/KIP-138%3A+Change+punctuate+semantics]
[jira] [Updated] (KAFKA-4763) Handle disk failure for JBOD (KIP-112)
[ https://issues.apache.org/jira/browse/KAFKA-4763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guozhang Wang updated KAFKA-4763:
---------------------------------

Fix Version/s: 1.0.0

> Handle disk failure for JBOD (KIP-112)
> --------------------------------------
>
> Key: KAFKA-4763
> URL: https://issues.apache.org/jira/browse/KAFKA-4763
> Project: Kafka
> Issue Type: Improvement
> Reporter: Dong Lin
> Assignee: Dong Lin
> Fix For: 1.0.0
>
> See https://cwiki.apache.org/confluence/display/KAFKA/KIP-112%3A+Handle+disk+failure+for+JBOD
> for motivation and design.
[jira] [Updated] (KAFKA-4133) Provide a configuration to control consumer max in-flight fetches
[ https://issues.apache.org/jira/browse/KAFKA-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guozhang Wang updated KAFKA-4133:
---------------------------------

Fix Version/s: 1.0.0

> Provide a configuration to control consumer max in-flight fetches
> -----------------------------------------------------------------
>
> Key: KAFKA-4133
> URL: https://issues.apache.org/jira/browse/KAFKA-4133
> Project: Kafka
> Issue Type: Improvement
> Components: consumer
> Reporter: Jason Gustafson
> Assignee: Mickael Maison
> Fix For: 1.0.0
>
> With KIP-74, we now have a good way to limit the size of fetch responses, but
> it may still be difficult for users to control overall memory since the
> consumer will send fetches in parallel to all the brokers which own
> partitions that it is subscribed to. To give users finer control, it might
> make sense to add a `max.in.flight.fetches` setting to limit the total number
> of concurrent fetches at any time. This would require a KIP since it's a new
> configuration.
[jira] [Updated] (KAFKA-1696) Kafka should be able to generate Hadoop delegation tokens
[ https://issues.apache.org/jira/browse/KAFKA-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guozhang Wang updated KAFKA-1696:
---------------------------------

Fix Version/s: 1.0.0

> Kafka should be able to generate Hadoop delegation tokens
> ---------------------------------------------------------
>
> Key: KAFKA-1696
> URL: https://issues.apache.org/jira/browse/KAFKA-1696
> Project: Kafka
> Issue Type: New Feature
> Components: security
> Reporter: Jay Kreps
> Assignee: Parth Brahmbhatt
> Fix For: 1.0.0
>
> For access from MapReduce/etc jobs run on behalf of a user.
[jira] [Commented] (KAFKA-5663) LogDirFailureTest system test fails
[ https://issues.apache.org/jira/browse/KAFKA-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105750#comment-16105750 ]

ASF GitHub Bot commented on KAFKA-5663:
---------------------------------------

GitHub user lindong28 opened a pull request:

    https://github.com/apache/kafka/pull/3594

    KAFKA-5663; Fix LogDirFailureTest system test

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lindong28/kafka KAFKA-5663

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/3594.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #3594

commit 1a2f429d4af2e0f77706cdcea1801f7ddc4195c8
Author: Dong Lin
Date: 2017-07-28T21:55:44Z

    KAFKA-5663; Fix LogDirFailureTest system test

> LogDirFailureTest system test fails
> -----------------------------------
>
> Key: KAFKA-5663
> URL: https://issues.apache.org/jira/browse/KAFKA-5663
> Project: Kafka
> Issue Type: Bug
> Reporter: Apurva Mehta
> Assignee: Dong Lin
>
> The recently added JBOD system test failed last night.
> {noformat}
> Producer failed to produce messages for 20s.
> Traceback (most recent call last):
>   File "/home/jenkins/workspace/system-test-kafka-trunk/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py", line 123, in run
>     data = self.run_test()
>   File "/home/jenkins/workspace/system-test-kafka-trunk/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/tests/runner_client.py", line 176, in run_test
>     return self.test_context.function(self.test)
>   File "/home/jenkins/workspace/system-test-kafka-trunk/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/mark/_mark.py", line 321, in wrapper
>     return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File "/home/jenkins/workspace/system-test-kafka-trunk/kafka/tests/kafkatest/tests/core/log_dir_failure_test.py", line 166, in test_replication_with_disk_failure
>     self.start_producer_and_consumer()
>   File "/home/jenkins/workspace/system-test-kafka-trunk/kafka/tests/kafkatest/tests/produce_consume_validate.py", line 75, in start_producer_and_consumer
>     self.producer_start_timeout_sec)
>   File "/home/jenkins/workspace/system-test-kafka-trunk/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.6.0-py2.7.egg/ducktape/utils/util.py", line 36, in wait_until
>     raise TimeoutError(err_msg)
> TimeoutError: Producer failed to produce messages for 20s.
> {noformat}
> Complete logs here:
> http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2017-07-26--001.1501074756--apache--trunk--91c207c/LogDirFailureTest/test_replication_with_disk_failure/bounce_broker=False.security_protocol=PLAINTEXT.broker_type=follower/48.tgz
[jira] [Commented] (KAFKA-5664) Disable auto offset commit in ConsoleConsumer if no group is provided
[ https://issues.apache.org/jira/browse/KAFKA-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105744#comment-16105744 ]

ASF GitHub Bot commented on KAFKA-5664:
---------------------------------------

GitHub user vahidhashemian opened a pull request:

    https://github.com/apache/kafka/pull/3593

    KAFKA-5664: Disable auto offset commit in ConsoleConsumer if no group is provided

    This is to avoid polluting the Consumer Coordinator cache as the auto-generated group and its offsets are unlikely to be reused.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vahidhashemian/kafka KAFKA-5664

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/3593.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #3593

commit 3d708fabb7609de794fc90c86c4fb00d47118fb5
Author: Vahid Hashemian
Date: 2017-07-28T21:47:45Z

    KAFKA-5664: Disable auto offset commit in ConsoleConsumer if no group is provided

    This is to avoid polluting the Consumer Coordinator cache as the group and its offsets are unlikely to be reused.

> Disable auto offset commit in ConsoleConsumer if no group is provided
> ---------------------------------------------------------------------
>
> Key: KAFKA-5664
> URL: https://issues.apache.org/jira/browse/KAFKA-5664
> Project: Kafka
> Issue Type: Improvement
> Reporter: Jason Gustafson
> Assignee: Vahid Hashemian
>
> In ConsoleConsumer, if no group is provided, we generate a random groupId:
> {code}
> consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, s"console-consumer-${new Random().nextInt(10)}")
> {code}
> In this case, since the group is not likely to be used again, we should
> disable automatic offset commits. This avoids polluting the coordinator cache
> with offsets that will never be used.
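The behavior the ticket proposes can be sketched as follows. This is not the actual ConsoleConsumer code (which is Scala); the class name and the bound on the random suffix are illustrative only:

```java
import java.util.Properties;
import java.util.Random;

// Sketch of the proposed ConsoleConsumer behavior: when no group is
// supplied, generate a throwaway group id and turn off auto commit so
// the generated group leaves no offsets behind in the coordinator cache.
class ConsoleConsumerProps {
    static Properties build(String userGroupId) {
        Properties props = new Properties();
        if (userGroupId == null) {
            // Auto-generated group, as in the snippet quoted above;
            // the random bound here is illustrative.
            props.put("group.id", "console-consumer-" + new Random().nextInt(100000));
            // Proposed fix: never commit offsets for a group that will
            // not be reused.
            props.put("enable.auto.commit", "false");
        } else {
            // A user-supplied group keeps the consumer's defaults.
            props.put("group.id", userGroupId);
        }
        return props;
    }
}
```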
[jira] [Resolved] (KAFKA-4868) Optimize RocksDb config for fast recovery/bulk load
[ https://issues.apache.org/jira/browse/KAFKA-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Bejeck resolved KAFKA-4868.
--------------------------------

Resolution: Fixed

Resolved with the merge of KIP-167.

> Optimize RocksDb config for fast recovery/bulk load
> ---------------------------------------------------
>
> Key: KAFKA-4868
> URL: https://issues.apache.org/jira/browse/KAFKA-4868
> Project: Kafka
> Issue Type: Sub-task
> Components: streams
> Affects Versions: 0.10.2.0
> Reporter: Eno Thereska
> Assignee: Bill Bejeck
> Fix For: 1.0.0
>
> RocksDb can be tuned to bulk-load data fast. Kafka Streams bulk-loads records
> during recovery. It is likely we can use a different config to make recovery
> faster, then revert to another config for normal operations like put/get. See
> https://github.com/facebook/rocksdb/wiki/performance-benchmarks for examples.
> Would be good to measure the performance gain as part of addressing this JIRA.
[jira] [Created] (KAFKA-5675) Possible worker_id duplication in Connect
Dustin Cote created KAFKA-5675:
-------------------------------

Summary: Possible worker_id duplication in Connect
Key: KAFKA-5675
URL: https://issues.apache.org/jira/browse/KAFKA-5675
Project: Kafka
Issue Type: Bug
Components: KafkaConnect
Affects Versions: 0.10.2.1
Reporter: Dustin Cote
Priority: Minor

It's possible to set non-unique host/port combinations for workers via *rest.advertised.host.name* and *rest.advertised.port* (e.g. localhost:8083). While this isn't typically advisable, it can result in weird behavior for containerized deployments, where localhost might end up being mapped to something that is externally facing. The worker_id today appears to be set to this host/port combination, so you end up with duplicate worker_ids, causing long rebalances, presumably because task assignment gets confused. It would be good to either change how the worker_id is generated or find a way to not let a worker start if a worker with an identical worker_id already exists. In the short term, we should document the requirement of unique advertised host/port combinations for workers to avoid debugging a somewhat tricky scenario.
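Until that is addressed, the short-term workaround the ticket suggests amounts to giving every worker a unique advertised host/port pair. A sketch of two worker config files (the key names are the standard Connect worker configs; the host names are examples):

```properties
# worker-1.properties
rest.advertised.host.name=connect-worker-1.internal
rest.advertised.port=8083

# worker-2.properties -- must not collide with worker-1's host/port pair,
# or both workers derive the same worker_id and rebalances misbehave.
rest.advertised.host.name=connect-worker-2.internal
rest.advertised.port=8083
```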
[jira] [Commented] (KAFKA-5363) Add ability to batch restore and receive restoration stats.
[ https://issues.apache.org/jira/browse/KAFKA-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105458#comment-16105458 ]

ASF GitHub Bot commented on KAFKA-5363:
---------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/kafka/pull/3325

> Add ability to batch restore and receive restoration stats.
> -----------------------------------------------------------
>
> Key: KAFKA-5363
> URL: https://issues.apache.org/jira/browse/KAFKA-5363
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Reporter: Bill Bejeck
> Assignee: Bill Bejeck
> Labels: kip
> Fix For: 1.0.0
>
> Currently, when restoring a state store in a Kafka Streams application, we
> put one key-value at a time into the store.
> This task aims to make this recovery more efficient by creating a new
> interface with "restoreAll" functionality allowing for bulk writes by the
> underlying state store implementation.
> The proposal will also add "beginRestore" and "endRestore" callback methods,
> potentially used for:
> * tracking when the bulk restoration process begins and ends;
> * keeping track of the number of records and last offset restored.
> KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-167%3A+Add+interface+for+the+state+store+restoration+process
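The callback shape the ticket describes, bulk writes plus begin/end hooks that can track restoration stats, can be sketched as below. The interface and method names are illustrative only; they are not the final API that KIP-167 merged:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical batch-restore callback: one bulk write per batch instead of
// one put per record, with begin/end hooks for tracking stats.
interface BatchRestoreCallback {
    void beginRestore();
    void restoreAll(byte[][] keys, byte[][] values);
    void endRestore();
}

class CountingRestore implements BatchRestoreCallback {
    final Map<String, String> store = new HashMap<>();
    long restored = 0;

    public void beginRestore() {
        // Hook: restoration is starting; reset the stats.
        restored = 0;
    }

    public void restoreAll(byte[][] keys, byte[][] values) {
        // Bulk write: the underlying store could batch these into a
        // single write, rather than committing each record separately.
        for (int i = 0; i < keys.length; i++) {
            store.put(new String(keys[i]), new String(values[i]));
            restored++;
        }
    }

    public void endRestore() {
        // Hook: restoration finished; `restored` now holds the count.
    }
}
```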
[jira] [Resolved] (KAFKA-5363) Add ability to batch restore and receive restoration stats.
[ https://issues.apache.org/jira/browse/KAFKA-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guozhang Wang resolved KAFKA-5363.
----------------------------------

Resolution: Fixed

Issue resolved by pull request 3325
[https://github.com/apache/kafka/pull/3325]

> Add ability to batch restore and receive restoration stats.
> -----------------------------------------------------------
>
> Key: KAFKA-5363
> URL: https://issues.apache.org/jira/browse/KAFKA-5363
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Reporter: Bill Bejeck
> Assignee: Bill Bejeck
> Labels: kip
> Fix For: 1.0.0
>
> Currently, when restoring a state store in a Kafka Streams application, we
> put one key-value at a time into the store.
> This task aims to make this recovery more efficient by creating a new
> interface with "restoreAll" functionality allowing for bulk writes by the
> underlying state store implementation.
> The proposal will also add "beginRestore" and "endRestore" callback methods,
> potentially used for:
> * tracking when the bulk restoration process begins and ends;
> * keeping track of the number of records and last offset restored.
> KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-167%3A+Add+interface+for+the+state+store+restoration+process
[jira] [Commented] (KAFKA-5341) Add UnderMinIsrPartitionCount and per-partition UnderMinIsr metrics
[ https://issues.apache.org/jira/browse/KAFKA-5341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105453#comment-16105453 ]

ASF GitHub Bot commented on KAFKA-5341:
---------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/kafka/pull/3583

> Add UnderMinIsrPartitionCount and per-partition UnderMinIsr metrics
> -------------------------------------------------------------------
>
> Key: KAFKA-5341
> URL: https://issues.apache.org/jira/browse/KAFKA-5341
> Project: Kafka
> Issue Type: New Feature
> Reporter: Dong Lin
> Assignee: Dong Lin
> Fix For: 1.0.0
>
> We currently have under replicated partitions, but we do not have a metric to
> track the number of partitions whose in-sync replicas count < minIsr.
> Partitions whose in-sync replicas count < minIsr will be unavailable to those
> producers who use ack = all. It is important for Kafka operators to be
> notified of the existence of such partitions because their existence reduces
> the availability of the Kafka service.
> More specifically, we can define a per-broker metric
> UnderMinIsrPartitionCount as "the number of partitions that this broker leads
> for which in-sync replicas count < minIsr". So if the RF is 3 and min ISR
> is 2, then when there are 2 replicas in ISR this partition would be in the
> under replicated partitions count. When there is 1 replica in ISR, this
> partition would also be in the UnderMinIsrPartitionCount.
> See https://cwiki.apache.org/confluence/display/KAFKA/KIP-164-+Add+UnderMinIsrPartitionCount+and+per-partition+UnderMinIsr+metrics
> for more detail.
[jira] [Commented] (KAFKA-5673) Refactor KeyValueStore hierarchy so that MeteredKeyValueStore is the outermost store
[ https://issues.apache.org/jira/browse/KAFKA-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105248#comment-16105248 ]

ASF GitHub Bot commented on KAFKA-5673:
---------------------------------------

GitHub user dguy opened a pull request:

    https://github.com/apache/kafka/pull/3592

    KAFKA-5673: refactor KeyValueStore hierarchy to make MeteredKeyValueStore outermost

    Refactor StateStoreSuppliers such that a `MeteredKeyValueStore` is the outermost store.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dguy/kafka key-value-store-refactor

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/3592.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #3592

> Refactor KeyValueStore hierarchy so that MeteredKeyValueStore is the outermost store
> ------------------------------------------------------------------------------------
>
> Key: KAFKA-5673
> URL: https://issues.apache.org/jira/browse/KAFKA-5673
> Project: Kafka
> Issue Type: Sub-task
> Reporter: Damian Guy
> Assignee: Damian Guy
>
> MeteredKeyValueStore is currently not the outermost store. Further, it needs
> to have the inner store as {{}} to allow easy pluggability of custom storage
> engines.
[jira] [Created] (KAFKA-5674) max.connections.per.ip minimum value to be zero to allow IP address blocking
Tristan Stevens created KAFKA-5674:
-----------------------------------

Summary: max.connections.per.ip minimum value to be zero to allow IP address blocking
Key: KAFKA-5674
URL: https://issues.apache.org/jira/browse/KAFKA-5674
Project: Kafka
Issue Type: Improvement
Affects Versions: 0.11.0.0
Reporter: Tristan Stevens

Currently the max.connections.per.ip (KAFKA-1512) config has a minimum value of 1. However, as suggested in https://issues.apache.org/jira/browse/KAFKA-1512?focusedCommentId=14051914, allowing a minimum value of zero would enable IP-based filtering of inbound connections (effectively prohibiting those IP addresses from connecting altogether).
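If the minimum were lowered to zero as proposed, IP blocking could be expressed purely in broker config by combining the new value with the existing per-IP overrides setting. A hypothetical sketch (max.connections.per.ip.overrides is a real broker config and its ip:count syntax is real; the zero default shown here is exactly the not-yet-allowed behavior this ticket requests, and the addresses are examples):

```properties
# Hypothetical, assuming this ticket's change: deny all IPs by default...
max.connections.per.ip=0
# ...then allow only known clients via the existing overrides config.
max.connections.per.ip.overrides=10.0.0.5:100,10.0.0.6:100
```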
[jira] [Commented] (KAFKA-5007) Kafka Replica Fetcher Thread- Resource Leak
[ https://issues.apache.org/jira/browse/KAFKA-5007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105048#comment-16105048 ]

Joseph Aliase commented on KAFKA-5007:
--------------------------------------

[~huxi_2b] [~junrao] Sorry for the delay. I'm working on this today. Will update before EOD today.

> Kafka Replica Fetcher Thread - Resource Leak
> --------------------------------------------
>
> Key: KAFKA-5007
> URL: https://issues.apache.org/jira/browse/KAFKA-5007
> Project: Kafka
> Issue Type: Bug
> Components: core, network
> Affects Versions: 0.10.0.0, 0.10.1.1, 0.10.2.0
> Environment: CentOS 7
> Java 8
> Reporter: Joseph Aliase
> Priority: Critical
> Labels: reliability
> Attachments: jstack-kafka.out, jstack-zoo.out, lsofkafka.txt, lsofzookeeper.txt
>
> Kafka is running out of open file descriptors when the system network
> interface is down.
> Issue description:
> We have a Kafka cluster of 5 nodes running version 0.10.1.1. The open file
> descriptor limit for the account running Kafka is set to 10.
> During an upgrade, the network interface went down. The outage continued for
> 12 hours; eventually all the brokers crashed with a java.io.IOException: Too
> many open files error.
> We repeated the test in a lower environment and observed that the open socket
> count keeps increasing while the NIC is down.
> We have around 13 topics with a max partition count of 120, and the number of
> replica fetcher threads is set to 8.
> Using an internal monitoring tool, we observed that the open socket descriptor
> count for the broker pid continued to increase although the NIC was down,
> leading to the open file descriptor error.
[jira] [Assigned] (KAFKA-5673) Refactor KeyValueStore hierarchy so that MeteredKeyValueStore is the outermost store
[ https://issues.apache.org/jira/browse/KAFKA-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Damian Guy reassigned KAFKA-5673:
---------------------------------

Assignee: Damian Guy

> Refactor KeyValueStore hierarchy so that MeteredKeyValueStore is the outermost store
> ------------------------------------------------------------------------------------
>
> Key: KAFKA-5673
> URL: https://issues.apache.org/jira/browse/KAFKA-5673
> Project: Kafka
> Issue Type: Sub-task
> Reporter: Damian Guy
> Assignee: Damian Guy
>
> MeteredKeyValueStore is currently not the outermost store. Further, it needs
> to have the inner store as {{}} to allow easy pluggability of custom storage
> engines.
[jira] [Created] (KAFKA-5673) Refactor KeyValueStore hierarchy so that MeteredKeyValueStore is the outermost store
Damian Guy created KAFKA-5673:
------------------------------

Summary: Refactor KeyValueStore hierarchy so that MeteredKeyValueStore is the outermost store
Key: KAFKA-5673
URL: https://issues.apache.org/jira/browse/KAFKA-5673
Project: Kafka
Issue Type: Sub-task
Reporter: Damian Guy

MeteredKeyValueStore is currently not the outermost store. Further, it needs to have the inner store as {{}} to allow easy pluggability of custom storage engines.
[jira] [Updated] (KAFKA-5649) Producer is being closed generating ssl exception
[ https://issues.apache.org/jira/browse/KAFKA-5649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pablo Panero updated KAFKA-5649:
--------------------------------

Priority: Major (was: Minor)

> Producer is being closed generating ssl exception
> -------------------------------------------------
>
> Key: KAFKA-5649
> URL: https://issues.apache.org/jira/browse/KAFKA-5649
> Project: Kafka
> Issue Type: Bug
> Components: producer
> Affects Versions: 0.10.2.1
> Environment: Spark 2.2.0 and kafka 0.10.2.0
> Reporter: Pablo Panero
>
> On a streaming job using the built-in kafka source and sink (over SSL), I am
> getting the following exception.
>
> Config of the source:
> {code:java}
> val df = spark.readStream
>   .format("kafka")
>   .option("kafka.bootstrap.servers", config.bootstrapServers)
>   .option("failOnDataLoss", value = false)
>   .option("kafka.connections.max.idle.ms", 360)
>   // SSL: this only applies to communication between Spark and Kafka
>   // brokers; you are still responsible for separately securing Spark
>   // inter-node communication.
>   .option("kafka.security.protocol", "SASL_SSL")
>   .option("kafka.sasl.mechanism", "GSSAPI")
>   .option("kafka.sasl.kerberos.service.name", "kafka")
>   .option("kafka.ssl.truststore.location", "/etc/pki/java/cacerts")
>   .option("kafka.ssl.truststore.password", "changeit")
>   .option("subscribe", config.topicConfigList.keys.mkString(","))
>   .load()
> {code}
> Config of the sink:
> {code:java}
> .writeStream
>   .option("checkpointLocation", s"${config.checkpointDir}/${topicConfig._1}/")
>   .format("kafka")
>   .option("kafka.bootstrap.servers", config.bootstrapServers)
>   .option("kafka.connections.max.idle.ms", 360)
>   // SSL: this only applies to communication between Spark and Kafka
>   // brokers; you are still responsible for separately securing Spark
>   // inter-node communication.
>   .option("kafka.security.protocol", "SASL_SSL")
>   .option("kafka.sasl.mechanism", "GSSAPI")
>   .option("kafka.sasl.kerberos.service.name", "kafka")
>   .option("kafka.ssl.truststore.location", "/etc/pki/java/cacerts")
>   .option("kafka.ssl.truststore.password", "changeit")
>   .start()
> {code}
> In some cases it throws the exception, making the Spark job stuck in that
> step. The exception stack trace is the following:
> {code:java}
> 17/07/18 10:11:58 WARN SslTransportLayer: Failed to send SSL Close message
> java.io.IOException: Broken pipe
>     at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>     at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>     at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
>     at sun.nio.ch.IOUtil.write(IOUtil.java:65)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
>     at org.apache.kafka.common.network.SslTransportLayer.flush(SslTransportLayer.java:195)
>     at org.apache.kafka.common.network.SslTransportLayer.close(SslTransportLayer.java:163)
>     at org.apache.kafka.common.utils.Utils.closeAll(Utils.java:731)
>     at org.apache.kafka.common.network.KafkaChannel.close(KafkaChannel.java:54)
>     at org.apache.kafka.common.network.Selector.doClose(Selector.java:540)
>     at org.apache.kafka.common.network.Selector.close(Selector.java:531)
>     at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:378)
>     at org.apache.kafka.common.network.Selector.poll(Selector.java:303)
>     at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:349)
>     at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:226)
>     at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1047)
>     at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995)
>     at org.apache.spark.sql.kafka010.CachedKafkaConsumer.poll(CachedKafkaConsumer.scala:298)
>     at org.apache.spark.sql.kafka010.CachedKafkaConsumer.org$apache$spark$sql$kafka010$CachedKafkaConsumer$$fetchData(CachedKafkaConsumer.scala:206)
>     at org.apache.spark.sql.kafka010.CachedKafkaConsumer$$anonfun$get$1.apply(CachedKafkaConsumer.scala:117)
>     at org.apache.spark.sql.kafka010.CachedKafkaConsumer$$anonfun$get$1.apply(CachedKafkaConsumer.scala:106)
>     at org.apache.spark.util.UninterruptibleThread.runUninterruptibly(UninterruptibleThread.scala:85)
>     at org.apache.spark.sql.kafka010.CachedKafkaConsumer.runUninterruptiblyIfPossible(CachedKafkaConsumer.scala:68)
>     at org.apache.spark.sql.kafka010.CachedKafkaConsumer.get(CachedKafkaConsumer.scala:106)
>
[jira] [Updated] (KAFKA-5599) ConsoleConsumer : --new-consumer option as deprecated
[ https://issues.apache.org/jira/browse/KAFKA-5599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paolo Patierno updated KAFKA-5599:
----------------------------------

Fix Version/s: 1.0.0

> ConsoleConsumer : --new-consumer option as deprecated
> -----------------------------------------------------
>
> Key: KAFKA-5599
> URL: https://issues.apache.org/jira/browse/KAFKA-5599
> Project: Kafka
> Issue Type: Bug
> Components: tools
> Reporter: Paolo Patierno
> Assignee: Paolo Patierno
> Fix For: 1.0.0
>
> Hi,
> it seems to me that the --new-consumer option on the ConsoleConsumer is
> useless.
> The useOldConsumer var is tied to specifying --zookeeper on the command line,
> in which case the --bootstrap-server option (or --new-consumer) can't be used.
> If you use the --bootstrap-server option then the new consumer is used
> automatically, so there is no need for --new-consumer.
> It turns out that using the old or new consumer is determined solely by the
> --zookeeper or --bootstrap-server option (which can't be used together, so I
> can't use the new consumer connecting to zookeeper).
> It's also clear when you use --zookeeper for the old consumer and the output
> from help says:
> "Consider using the new consumer by passing [bootstrap-server] instead of
> [zookeeper]"
> Before removing the --new-consumer option, this JIRA is for marking it as
> deprecated in the next release (then moving, in a subsequent release, to
> removing the option).
> Thanks,
> Paolo.
[jira] [Created] (KAFKA-5672) Move measureLatencyNs from StreamsMetricsImpl to StreamsMetrics
Damian Guy created KAFKA-5672:
------------------------------

Summary: Move measureLatencyNs from StreamsMetricsImpl to StreamsMetrics
Key: KAFKA-5672
URL: https://issues.apache.org/jira/browse/KAFKA-5672
Project: Kafka
Issue Type: Bug
Reporter: Damian Guy

StreamsMetricsImpl currently has the method {{measureLatencyNs}}, but it is not on {{StreamsMetrics}}; it should be moved to the interface so we can stop depending on the impl. Further, the {{Runnable}} argument passed to {{measureLatencyNs}} should be changed to some functional interface that can also return a value. As this is a public API change, it will require a KIP.
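The API change the ticket proposes, swapping the void {{Runnable}} for a value-returning functional interface, can be sketched as follows. The class name and the choice of {{Supplier}} are illustrative assumptions, not the eventual Kafka API:

```java
import java.util.function.Supplier;

// Sketch: time an action that returns a value, so callers can both
// measure latency and keep the result (impossible with a Runnable).
class LatencyMeasurement {
    static long lastLatencyNs;

    static <T> T measureLatencyNs(Supplier<T> action) {
        long start = System.nanoTime();
        T result = action.get();
        // Record how long the action took; a real implementation would
        // feed this into a latency sensor instead of a static field.
        lastLatencyNs = System.nanoTime() - start;
        return result;
    }
}
```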
[jira] [Updated] (KAFKA-5619) Make --new-consumer option as deprecated in all tools
[ https://issues.apache.org/jira/browse/KAFKA-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paolo Patierno updated KAFKA-5619:
----------------------------------

Fix Version/s: 1.0.0

> Make --new-consumer option as deprecated in all tools
> -----------------------------------------------------
>
> Key: KAFKA-5619
> URL: https://issues.apache.org/jira/browse/KAFKA-5619
> Project: Kafka
> Issue Type: Bug
> Components: tools
> Reporter: Paolo Patierno
> Assignee: Paolo Patierno
> Fix For: 1.0.0
>
> Hi,
> as already described in [KAFKA-5599|https://issues.apache.org/jira/browse/KAFKA-5599],
> it's useful to mark the --new-consumer option as deprecated in all the other
> tools which use it (ConsumerPerformance and ConsumerGroupCommand). It will
> remain available for the next major release, before moving to remove the
> option in the subsequent release cycle.
> There is also the following
> [KIP-176|https://cwiki.apache.org/confluence/display/KAFKA/KIP-176%3A+Remove+deprecated+new-consumer+option+for+tools]
> proposed for removing the option in a new release after the deprecation.
> Thanks,
> Paolo.