[jira] [Updated] (KAFKA-13517) Update Admin::describeConfigs to allow fetching specific configurations
[ https://issues.apache.org/jira/browse/KAFKA-13517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Singh updated KAFKA-13517:

Description:

A list of {{ConfigResource}} objects is passed as an argument to the {{AdminClient::describeConfigs}} API to indicate which entities' configurations to fetch. The {{ConfigResource}} class is made up of two fields: the name and the type of the entity. Kafka returns *all* configurations for the entities provided to the admin client API.

This admin API in turn uses the {{DescribeConfigsRequest}} Kafka API to get the configuration for the entities in question. In addition to the name and type of the entity whose configuration to fetch, the Kafka {{DescribeConfigsResource}} structure also lets users provide a {{ConfigurationKeys}} list, which allows fetching only the configurations that are needed. However, this field isn't exposed in the {{ConfigResource}} class used by AdminClient, so AdminClient users have no way to ask for a specific configuration; the API always returns *all* configurations. The user of {{AdminClient::describeConfigs}} must then go over the returned list and filter out the config keys they are interested in. This results in boilerplate code for every user of the {{AdminClient::describeConfigs}} API, in addition to being a wasteful use of resources. It becomes painful in large clusters, where fetching one configuration of all topics requires fetching all configurations of all topics, which can be huge in size.

Creating this Jira to track the changes proposed in KIP-823: https://confluentinc.atlassian.net/wiki/spaces/~953215099/pages/2707357918/KIP+Admin+describeConfigs+should+allow+fetching+specific+configurations

was:

A list of {{ConfigResource}} objects is passed as an argument to the {{AdminClient::describeConfigs}} API to indicate which entities' configurations to fetch. The {{ConfigResource}} class is made up of two fields: the name and the type of the entity. Kafka returns *all* configurations for the entities provided to the admin client API. This admin API in turn uses the {{DescribeConfigsRequest}} Kafka API to get the configuration for the entities in question. In addition to the name and type of the entity whose configuration to fetch, the Kafka {{DescribeConfigsResource}} structure also lets users provide a {{ConfigurationKeys}} list, which allows fetching only the configurations that are needed. However, this field isn't exposed in the {{ConfigResource}} class used by AdminClient, so AdminClient users have no way to ask for a specific configuration; the API always returns *all* configurations. The user of {{AdminClient::describeConfigs}} must then go over the returned list and filter out the config keys they are interested in. This results in boilerplate code for every user of the {{AdminClient::describeConfigs}} API, in addition to being a wasteful use of resources. It becomes painful in large clusters, where fetching one configuration of all topics requires fetching all configurations of all topics, which can be huge in size. Creating this Jira to add the same field (i.e. {{ConfigurationKeys}}) to the {{ConfigResource}} structure to bring it to parity with the {{DescribeConfigsResource}} Kafka API structure. There should be no backward-compatibility issue, as the field will be optional and will behave the same way if it is not specified (i.e. by passing null to the backend Kafka API).

> Update Admin::describeConfigs to allow fetching specific configurations
> ---
>
> Key: KAFKA-13517
> URL: https://issues.apache.org/jira/browse/KAFKA-13517
> Project: Kafka
> Issue Type: Improvement
> Components: clients
> Affects Versions: 2.8.1, 3.0.0
> Reporter: Vikas Singh
> Assignee: Vikas Singh
> Priority: Major
>
> A list of {{ConfigResource}} objects is passed as an argument to the {{AdminClient::describeConfigs}} API to indicate which entities' configurations to fetch. The {{ConfigResource}} class is made up of two fields: the name and the type of the entity.
> Kafka returns *all* configurations for the entities provided to the admin client API.
> This admin API in turn uses the {{DescribeConfigsRequest}} Kafka API to get the configuration for the entities in question. In addition to the name and type of the entity whose configuration to fetch, the Kafka {{DescribeConfigsResource}} structure also lets users provide a {{ConfigurationKeys}} list, which allows fetching only the configurations that are needed.
> However, this field isn't exposed in the {{ConfigResource}} class used by AdminClient, so AdminClient users have no way to ask for a specific configuration. The API always returns *all* configurations. The user of {{AdminClient::describeConfigs}} must then go over the returned list and filter out the config keys that they are interested in.
[jira] [Updated] (KAFKA-13517) Update Admin::describeConfigs to allow fetching specific configurations
[ https://issues.apache.org/jira/browse/KAFKA-13517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Singh updated KAFKA-13517: Summary: Update Admin::describeConfigs to allow fetching specific configurations (was: Add ConfigurationKeys to ConfigResource class)

> Update Admin::describeConfigs to allow fetching specific configurations
> ---
>
> Key: KAFKA-13517
> URL: https://issues.apache.org/jira/browse/KAFKA-13517
> Project: Kafka
> Issue Type: Improvement
> Components: clients
> Affects Versions: 2.8.1, 3.0.0
> Reporter: Vikas Singh
> Assignee: Vikas Singh
> Priority: Major
>
> A list of {{ConfigResource}} objects is passed as an argument to the {{AdminClient::describeConfigs}} API to indicate which entities' configurations to fetch. The {{ConfigResource}} class is made up of two fields: the name and the type of the entity. Kafka returns *all* configurations for the entities provided to the admin client API.
> This admin API in turn uses the {{DescribeConfigsRequest}} Kafka API to get the configuration for the entities in question. In addition to the name and type of the entity whose configuration to fetch, the Kafka {{DescribeConfigsResource}} structure also lets users provide a {{ConfigurationKeys}} list, which allows fetching only the configurations that are needed.
> However, this field isn't exposed in the {{ConfigResource}} class used by AdminClient, so AdminClient users have no way to ask for a specific configuration. The API always returns *all* configurations. The user of {{AdminClient::describeConfigs}} must then go over the returned list and filter out the config keys that they are interested in.
> This results in boilerplate code for every user of the {{AdminClient::describeConfigs}} API, in addition to being a wasteful use of resources. It becomes painful in large clusters, where fetching one configuration of all topics requires fetching all configurations of all topics, which can be huge in size.
> Creating this Jira to add the same field (i.e. {{ConfigurationKeys}}) to the {{ConfigResource}} structure to bring it to parity with the {{DescribeConfigsResource}} Kafka API structure. There should be no backward-compatibility issue, as the field will be optional and will behave the same way if it is not specified (i.e. by passing null to the backend Kafka API).

-- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (KAFKA-13517) Add ConfigurationKeys to ConfigResource class
Vikas Singh created KAFKA-13517: --- Summary: Add ConfigurationKeys to ConfigResource class Key: KAFKA-13517 URL: https://issues.apache.org/jira/browse/KAFKA-13517 Project: Kafka Issue Type: Improvement Components: clients Affects Versions: 3.0.0, 2.8.1 Reporter: Vikas Singh Assignee: Vikas Singh Fix For: 2.8.1

A list of {{ConfigResource}} objects is passed as an argument to the {{AdminClient::describeConfigs}} API to indicate which entities' configurations to fetch. The {{ConfigResource}} class is made up of two fields: the name and the type of the entity. Kafka returns *all* configurations for the entities provided to the admin client API.

This admin API in turn uses the {{DescribeConfigsRequest}} Kafka API to get the configuration for the entities in question. In addition to the name and type of the entity whose configuration to fetch, the Kafka {{DescribeConfigsResource}} structure also lets users provide a {{ConfigurationKeys}} list, which allows fetching only the configurations that are needed. However, this field isn't exposed in the {{ConfigResource}} class used by AdminClient, so AdminClient users have no way to ask for a specific configuration; the API always returns *all* configurations. The user of {{AdminClient::describeConfigs}} must then go over the returned list and filter out the config keys they are interested in. This results in boilerplate code for every user of the {{AdminClient::describeConfigs}} API, in addition to being a wasteful use of resources. It becomes painful in large clusters, where fetching one configuration of all topics requires fetching all configurations of all topics, which can be huge in size.

Creating this Jira to add the same field (i.e. {{ConfigurationKeys}}) to the {{ConfigResource}} structure to bring it to parity with the {{DescribeConfigsResource}} Kafka API structure. There should be no backward-compatibility issue, as the field will be optional and will behave the same way if it is not specified (i.e. by passing null to the backend Kafka API).

-- This message was sent by Atlassian Jira (v8.20.1#820001)
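The client-side boilerplate the issue describes can be sketched in isolation. This is an illustrative stand-in, not actual Kafka code: a plain Map replaces the per-resource Config object that describeConfigs returns, and `filterConfigs` is a hypothetical helper showing what every caller currently has to write by hand.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class ConfigFilterSketch {

    // The broker sends back every configuration entry for the resource;
    // the caller must filter the full result down to the keys it wanted.
    static Map<String, String> filterConfigs(Map<String, String> allConfigs,
                                             Set<String> wantedKeys) {
        Map<String, String> filtered = new HashMap<>();
        for (String key : wantedKeys) {
            if (allConfigs.containsKey(key)) {
                filtered.put(key, allConfigs.get(key));
            }
        }
        return filtered;
    }

    public static void main(String[] args) {
        // All configs come back, even though only one is needed.
        Map<String, String> all = new HashMap<>();
        all.put("retention.ms", "604800000");
        all.put("cleanup.policy", "delete");
        all.put("min.insync.replicas", "2");
        System.out.println(filterConfigs(all, Set.of("retention.ms")));
    }
}
```

With the {{ConfigurationKeys}} field exposed, this filtering (and the transfer of the unwanted entries) would happen on the broker side instead.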
[jira] [Commented] (KAFKA-13432) ApiException should provide a way to capture stacktrace
[ https://issues.apache.org/jira/browse/KAFKA-13432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17438341#comment-17438341 ] Vikas Singh commented on KAFKA-13432: - One simple solution could be to fill in the stacktrace based on the log level, something like this:
{code:java}
/* Avoid the expensive and usually useless stack trace for API
 * exceptions, but capture it when debug logging is enabled. */
@Override
public Throwable fillInStackTrace() {
    if (LOG.isDebugEnabled()) {
        return super.fillInStackTrace();
    } else {
        return this;
    }
}
{code}

> ApiException should provide a way to capture stacktrace
> ---
>
> Key: KAFKA-13432
> URL: https://issues.apache.org/jira/browse/KAFKA-13432
> Project: Kafka
> Issue Type: Improvement
> Components: core
> Reporter: Vikas Singh
> Priority: Major
>
> ApiException doesn't fill in the stacktrace; it overrides `fillInStackTrace` to make it a no-op. Here is the code:
> https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/errors/ApiException.java#L45,L49
> However, there are times when a full stacktrace will be helpful in finding out what went wrong on the client side. We should provide a way to make this behavior configurable, so that if an error is hit multiple times, we can switch the behavior and find out what code is causing it.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-13432) ApiException should provide a way to capture stacktrace
Vikas Singh created KAFKA-13432: --- Summary: ApiException should provide a way to capture stacktrace Key: KAFKA-13432 URL: https://issues.apache.org/jira/browse/KAFKA-13432 Project: Kafka Issue Type: Improvement Components: core Reporter: Vikas Singh

ApiException doesn't fill in the stacktrace; it overrides `fillInStackTrace` to make it a no-op. Here is the code: https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/errors/ApiException.java#L45,L49

However, there are times when a full stacktrace will be helpful in finding out what went wrong on the client side. We should provide a way to make this behavior configurable, so that if an error is hit multiple times, we can switch the behavior and find out what code is causing it.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Issue Comment Deleted] (KAFKA-9370) Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in progress
[ https://issues.apache.org/jira/browse/KAFKA-9370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Singh updated KAFKA-9370: --- Comment: was deleted (was: https://github.com/apache/kafka/pull/9347) > Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in progress > -- > > Key: KAFKA-9370 > URL: https://issues.apache.org/jira/browse/KAFKA-9370 > Project: Kafka > Issue Type: Bug >Reporter: Vikas Singh >Assignee: Vikas Singh >Priority: Major > > `KafkaApis::handleCreatePartitionsRequest` returns `INVALID_TOPIC_EXCEPTION` > if the topic is getting deleted. Change it to return > `UNKNOWN_TOPIC_OR_PARTITION` instead. After the delete topic api returns, > client should see the topic as deleted. The fact that we are processing > deletion in background shouldn't have any impact. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-9370) Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in progress
[ https://issues.apache.org/jira/browse/KAFKA-9370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203527#comment-17203527 ] Vikas Singh commented on KAFKA-9370: https://github.com/apache/kafka/pull/9347 > Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in progress > -- > > Key: KAFKA-9370 > URL: https://issues.apache.org/jira/browse/KAFKA-9370 > Project: Kafka > Issue Type: Bug >Reporter: Vikas Singh >Assignee: Vikas Singh >Priority: Major > > `KafkaApis::handleCreatePartitionsRequest` returns `INVALID_TOPIC_EXCEPTION` > if the topic is getting deleted. Change it to return > `UNKNOWN_TOPIC_OR_PARTITION` instead. After the delete topic api returns, > client should see the topic as deleted. The fact that we are processing > deletion in background shouldn't have any impact. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KAFKA-10531) KafkaBasedLog can sleep for negative values
[ https://issues.apache.org/jira/browse/KAFKA-10531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Singh reassigned KAFKA-10531: --- Assignee: Vikas Singh

> KafkaBasedLog can sleep for negative values
> ---
>
> Key: KAFKA-10531
> URL: https://issues.apache.org/jira/browse/KAFKA-10531
> Project: Kafka
> Issue Type: Bug
> Components: core
> Affects Versions: 2.6.0
> Reporter: Vikas Singh
> Assignee: Vikas Singh
> Priority: Major
> Fix For: 2.6.1
>
> {{time.milliseconds}} is not monotonic, so this code can throw {{java.lang.IllegalArgumentException: timeout value is negative}}:
> {code:java}
> long started = time.milliseconds();
> while (partitionInfos == null && time.milliseconds() - started < CREATE_TOPIC_TIMEOUT_MS) {
>     partitionInfos = consumer.partitionsFor(topic);
>     Utils.sleep(Math.min(time.milliseconds() - started, 1000));
> }
> {code}
> We need to check for a negative value before sleeping.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-10531) KafkaBasedLog can sleep for negative values
Vikas Singh created KAFKA-10531: --- Summary: KafkaBasedLog can sleep for negative values Key: KAFKA-10531 URL: https://issues.apache.org/jira/browse/KAFKA-10531 Project: Kafka Issue Type: Bug Components: core Affects Versions: 2.6.0 Reporter: Vikas Singh Fix For: 2.6.1

{{time.milliseconds}} is not monotonic, so this code can throw {{java.lang.IllegalArgumentException: timeout value is negative}}:
{code:java}
long started = time.milliseconds();
while (partitionInfos == null && time.milliseconds() - started < CREATE_TOPIC_TIMEOUT_MS) {
    partitionInfos = consumer.partitionsFor(topic);
    Utils.sleep(Math.min(time.milliseconds() - started, 1000));
}
{code}
We need to check for a negative value before sleeping.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
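The guard the issue calls for can be sketched as a small clamp. This is an illustrative sketch (the method name `boundedSleepMs` is hypothetical, not the actual KafkaBasedLog fix): since the wall clock can step backwards, the elapsed value must be clamped to a non-negative range before being passed to a sleep call, which rejects negative timeouts.

```java
public class SafeSleep {

    // time.milliseconds() is not monotonic, so elapsedMs may be negative
    // after a backwards clock step. Clamp to [0, maxMs] so Thread.sleep
    // (or Utils.sleep) never sees a negative timeout.
    static long boundedSleepMs(long elapsedMs, long maxMs) {
        return Math.max(0L, Math.min(elapsedMs, maxMs));
    }

    public static void main(String[] args) throws InterruptedException {
        long started = System.currentTimeMillis();
        long elapsed = System.currentTimeMillis() - started; // could be < 0 with a raw clock
        Thread.sleep(boundedSleepMs(elapsed, 1000L));        // always safe
    }
}
```

Using a monotonic source such as `System.nanoTime()` for elapsed-time measurement would avoid the problem at its root; the clamp is the minimal defensive fix.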
[jira] [Updated] (KAFKA-10175) MetadataCache::getClusterMetadata returns null for offline replicas
[ https://issues.apache.org/jira/browse/KAFKA-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Singh updated KAFKA-10175: Summary: MetadataCache::getClusterMetadata returns null for offline replicas (was: MetadataCache::getCluster returns null for offline replicas)

> MetadataCache::getClusterMetadata returns null for offline replicas
> ---
>
> Key: KAFKA-10175
> URL: https://issues.apache.org/jira/browse/KAFKA-10175
> Project: Kafka
> Issue Type: Bug
> Reporter: Vikas Singh
> Priority: Major
>
> This line in the code always returns null:
> [MetadataCache::getClusterMetadata|https://github.com/apache/kafka/blob/2.6/core/src/main/scala/kafka/server/MetadataCache.scala#L272]
> The reason is that the `map(node)` part uses `aliveNodes` to create the `Node` object, defaulting to `null` otherwise. Offline replicas thus always end up as null.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-10175) MetadataCache::getCluster returns null for offline replicas
[ https://issues.apache.org/jira/browse/KAFKA-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Singh updated KAFKA-10175: Description: This line in the code always returns null: [MetadataCache::getClusterMetadata|https://github.com/apache/kafka/blob/2.6/core/src/main/scala/kafka/server/MetadataCache.scala#L272] The reason is that the `map(node)` part uses `aliveNodes` to create `Node` object otherwise default to `null`. Offline replicas thus end up always as null. was: This line in the code always returns null: [[MetadataCache|https://github.com/apache/kafka/blob/2.6/core/src/main/scala/kafka/server/MetadataCache.scala#L272]::getClusterMetadata|[https://github.com/apache/kafka/blob/2.6/core/src/main/scala/kafka/server/MetadataCache.scala#L272]] The reason is that the `map(node)` part uses `aliveNodes` to create `Node` object otherwise default to `null`. Offline replicas thus end up always as null. > MetadataCache::getCluster returns null for offline replicas > --- > > Key: KAFKA-10175 > URL: https://issues.apache.org/jira/browse/KAFKA-10175 > Project: Kafka > Issue Type: Bug >Reporter: Vikas Singh >Priority: Major > > This line in the code always returns null: > [MetadataCache::getClusterMetadata|https://github.com/apache/kafka/blob/2.6/core/src/main/scala/kafka/server/MetadataCache.scala#L272] > The reason is that the `map(node)` part uses `aliveNodes` to create `Node` > object otherwise default to `null`. Offline replicas thus end up always as > null. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-10175) MetadataCache::getCluster returns null for offline replicas
[ https://issues.apache.org/jira/browse/KAFKA-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Singh updated KAFKA-10175: Description: This line in the code always returns null: [[MetadataCache|https://github.com/apache/kafka/blob/2.6/core/src/main/scala/kafka/server/MetadataCache.scala#L272]::getClusterMetadata|[https://github.com/apache/kafka/blob/2.6/core/src/main/scala/kafka/server/MetadataCache.scala#L272]] The reason is that the `map(node)` part uses `aliveNodes` to create `Node` object otherwise default to `null`. Offline replicas thus end up always as null. was: This line in the code always returns null: > https://github.com/apache/kafka/blob/2.6/core/src/main/scala/kafka/server/MetadataCache.scala#L272 The reason is that the `map(node)` part uses `aliveNodes` to create `Node` object otherwise default to `null`. Offline replicas thus end up always as null. > MetadataCache::getCluster returns null for offline replicas > --- > > Key: KAFKA-10175 > URL: https://issues.apache.org/jira/browse/KAFKA-10175 > Project: Kafka > Issue Type: Bug >Reporter: Vikas Singh >Priority: Major > > This line in the code always returns null: > [[MetadataCache|https://github.com/apache/kafka/blob/2.6/core/src/main/scala/kafka/server/MetadataCache.scala#L272]::getClusterMetadata|[https://github.com/apache/kafka/blob/2.6/core/src/main/scala/kafka/server/MetadataCache.scala#L272]] > The reason is that the `map(node)` part uses `aliveNodes` to create `Node` > object otherwise default to `null`. Offline replicas thus end up always as > null. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-10175) MetadataCache::getCluster returns null for offline replicas
[ https://issues.apache.org/jira/browse/KAFKA-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Singh updated KAFKA-10175: Description: This line in the code always returns null: > https://github.com/apache/kafka/blob/2.6/core/src/main/scala/kafka/server/MetadataCache.scala#L272 The reason is that the `map(node)` part uses `aliveNodes` to create `Node` object otherwise default to `null`. Offline replicas thus end up always as null. was: This line in the code always returns null: https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/MetadataCache.scala#L272 The reason is that the `map(node)` part uses `aliveNodes` to create `Node` object otherwise default to `null`. Offline replicas thus end up always as null. > MetadataCache::getCluster returns null for offline replicas > --- > > Key: KAFKA-10175 > URL: https://issues.apache.org/jira/browse/KAFKA-10175 > Project: Kafka > Issue Type: Bug >Reporter: Vikas Singh >Priority: Major > > This line in the code always returns null: > > https://github.com/apache/kafka/blob/2.6/core/src/main/scala/kafka/server/MetadataCache.scala#L272 > The reason is that the `map(node)` part uses `aliveNodes` to create `Node` > object otherwise default to `null`. Offline replicas thus end up always as > null. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-10175) MetadataCache::getCluster returns null for offline replicas
Vikas Singh created KAFKA-10175: --- Summary: MetadataCache::getCluster returns null for offline replicas Key: KAFKA-10175 URL: https://issues.apache.org/jira/browse/KAFKA-10175 Project: Kafka Issue Type: Bug Reporter: Vikas Singh

This line in the code always returns null: https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/MetadataCache.scala#L272

The reason is that the `map(node)` part uses `aliveNodes` to create the `Node` object, defaulting to `null` otherwise. Offline replicas thus always end up as null.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
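The behavior boils down to a map-lookup pattern that can be shown in isolation. This is a hedged Java sketch, not the real Scala MetadataCache code: the nested `Node` type is a stand-in for Kafka's `org.apache.kafka.common.Node`, and the method names are illustrative.

```java
import java.util.Map;

public class ReplicaNodeLookup {

    // Minimal stand-in for org.apache.kafka.common.Node.
    static final class Node {
        final int id;
        Node(int id) { this.id = id; }
    }

    // Sentinel analogous to Node.noNode() in the Kafka client library.
    static final Node NO_NODE = new Node(-1);

    // The buggy pattern: a replica id that is not in the alive-nodes map
    // (i.e. an offline broker) resolves to null.
    static Node lookupBuggy(Map<Integer, Node> aliveNodes, int brokerId) {
        return aliveNodes.get(brokerId);
    }

    // A safer pattern: fall back to a "no node" sentinel instead of null,
    // so callers never see null entries in the replica list.
    static Node lookupSafe(Map<Integer, Node> aliveNodes, int brokerId) {
        return aliveNodes.getOrDefault(brokerId, NO_NODE);
    }
}
```

The issue's point is exactly the first pattern: building the replica list only from `aliveNodes` means every offline replica surfaces as a null `Node`.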
[jira] [Updated] (KAFKA-9254) Updating Kafka Broker configuration dynamically twice reverts log configuration to default
[ https://issues.apache.org/jira/browse/KAFKA-9254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Singh updated KAFKA-9254: --- Summary: Updating Kafka Broker configuration dynamically twice reverts log configuration to default (was: Topic level configuration failed)

> Updating Kafka Broker configuration dynamically twice reverts log configuration to default
> ---
>
> Key: KAFKA-9254
> URL: https://issues.apache.org/jira/browse/KAFKA-9254
> Project: Kafka
> Issue Type: Bug
> Components: config, log, replication
> Affects Versions: 2.0.1
> Reporter: fenghong
> Assignee: huxihx
> Priority: Critical
>
> We are engineers at Huobi and have encountered a Kafka bug: modifying DynamicBrokerConfig more than twice invalidates unrelated topic-level configuration.
> The reproduction steps are as follows:
> # Set min.insync.replicas=3 in the broker config (server.properties).
> # Create topic test-1 and set the topic-level config min.insync.replicas=2.
> # Dynamically modify the configuration twice as shown below:
> {code:java}
> bin/kafka-configs.sh --bootstrap-server xxx:9092 --entity-type brokers --entity-default --alter --add-config log.message.timestamp.type=LogAppendTime
> bin/kafka-configs.sh --bootstrap-server xxx:9092 --entity-type brokers --entity-default --alter --add-config log.retention.ms=60480
> {code}
> # Stop a Kafka server and observe the exception shown below:
> org.apache.kafka.common.errors.NotEnoughReplicasException: Number of insync replicas for partition test-1-0 is [2], below required minimum [3]

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9370) Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in progress
Vikas Singh created KAFKA-9370: -- Summary: Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in progress Key: KAFKA-9370 URL: https://issues.apache.org/jira/browse/KAFKA-9370 Project: Kafka Issue Type: Bug Reporter: Vikas Singh

`KafkaApis::handleCreatePartitionsRequest` returns `INVALID_TOPIC_EXCEPTION` if the topic is getting deleted. Change it to return `UNKNOWN_TOPIC_OR_PARTITION` instead. After the delete-topic API returns, the client should see the topic as deleted; the fact that deletion is processed in the background shouldn't have any impact.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KAFKA-9370) Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in progress
[ https://issues.apache.org/jira/browse/KAFKA-9370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Singh reassigned KAFKA-9370: -- Assignee: Vikas Singh > Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in progress > -- > > Key: KAFKA-9370 > URL: https://issues.apache.org/jira/browse/KAFKA-9370 > Project: Kafka > Issue Type: Bug >Reporter: Vikas Singh >Assignee: Vikas Singh >Priority: Major > > `KafkaApis::handleCreatePartitionsRequest` returns `INVALID_TOPIC_EXCEPTION` > if the topic is getting deleted. Change it to return > `UNKNOWN_TOPIC_OR_PARTITION` instead. After the delete topic api returns, > client should see the topic as deleted. The fact that we are processing > deletion in background shouldn't have any impact. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (KAFKA-9330) Calling AdminClient.close in the AdminClient completion callback causes deadlock
[ https://issues.apache.org/jira/browse/KAFKA-9330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Singh reassigned KAFKA-9330: -- Assignee: Vikas Singh > Calling AdminClient.close in the AdminClient completion callback causes > deadlock > > > Key: KAFKA-9330 > URL: https://issues.apache.org/jira/browse/KAFKA-9330 > Project: Kafka > Issue Type: Bug >Reporter: Vikas Singh >Assignee: Vikas Singh >Priority: Major > > The close method calls `Thread.join` to wait for AdminClient thread to die, > but that doesn't happen as the thread calling join is the AdminClient thread. > This causes the thread to block forever, causing a deadlock where it forever > waits for itself to die. > `AdminClient.close` should check if the thread calling close is same as > current thread, then skip the join. The thread will check for close condition > in the main loop and exit. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9330) Calling AdminClient.close in the AdminClient completion callback causes deadlock
Vikas Singh created KAFKA-9330: -- Summary: Calling AdminClient.close in the AdminClient completion callback causes deadlock Key: KAFKA-9330 URL: https://issues.apache.org/jira/browse/KAFKA-9330 Project: Kafka Issue Type: Bug Reporter: Vikas Singh

The close method calls `Thread.join` to wait for the AdminClient thread to die, but that never happens when the thread calling join *is* the AdminClient thread. The thread blocks forever, a deadlock in which it waits for itself to die.

`AdminClient.close` should check whether the thread calling close is the same as the current thread and, if so, skip the join. The thread will then observe the close condition in its main loop and exit on its own.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
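The proposed guard can be sketched as follows. This is an illustrative sketch, not the actual KafkaAdminClient code; `shouldJoin` and the `close` shape are hypothetical names for the idea the issue describes.

```java
public class SelfJoinGuard {

    // If close() runs on the client's own background thread, calling
    // ioThread.join() would mean the thread waits for itself forever.
    static boolean shouldJoin(Thread ioThread) {
        return Thread.currentThread() != ioThread;
    }

    static void close(Thread ioThread) throws InterruptedException {
        // ... signal the background thread to shut down (elided) ...
        if (shouldJoin(ioThread)) {
            ioThread.join(); // safe: we are not the io thread
        }
        // else: the io thread sees the shutdown flag in its main loop
        // and exits on its own, so no join is needed (or possible).
    }
}
```

The same self-join hazard applies to any "close from a completion callback" pattern where callbacks execute on the client's single I/O thread.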
[jira] [Assigned] (KAFKA-9329) KafkaController::replicasAreValid should return error
[ https://issues.apache.org/jira/browse/KAFKA-9329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Singh reassigned KAFKA-9329: -- Assignee: Vikas Singh > KafkaController::replicasAreValid should return error > - > > Key: KAFKA-9329 > URL: https://issues.apache.org/jira/browse/KAFKA-9329 > Project: Kafka > Issue Type: Bug >Reporter: Vikas Singh >Assignee: Vikas Singh >Priority: Major > > The method currently returns a boolean indicating if replicas are valid or > not. But the failure condition loses any context on why replicas are not > valid. We should return the error condition along with success/failure. > Maybe change method name to something like `validateReplicas` too. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-9329) KafkaController::replicasAreValid should return error
Vikas Singh created KAFKA-9329: -- Summary: KafkaController::replicasAreValid should return error Key: KAFKA-9329 URL: https://issues.apache.org/jira/browse/KAFKA-9329 Project: Kafka Issue Type: Bug Reporter: Vikas Singh The method currently returns a boolean indicating if replicas are valid or not. But the failure condition loses any context on why replicas are not valid. We should return the error condition along with success/failure. Maybe change method name to something like `validateReplicas` too. -- This message was sent by Atlassian Jira (v8.3.4#803005)
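The reshaping the issue suggests, returning a reason rather than a bare boolean, can be sketched generically. This is a hedged illustration, not the actual KafkaController API: the method signature, parameter types, and error strings here are all hypothetical.

```java
import java.util.List;
import java.util.Optional;

public class ReplicaValidation {

    // Instead of a boolean that loses context, return an empty Optional on
    // success and the failure reason otherwise, so callers can surface why
    // the replica list was rejected.
    static Optional<String> validateReplicas(List<Integer> replicas,
                                             List<Integer> liveBrokers) {
        if (replicas.isEmpty()) {
            return Optional.of("replica list is empty");
        }
        for (int id : replicas) {
            if (!liveBrokers.contains(id)) {
                return Optional.of("replica " + id + " is not a live broker");
            }
        }
        return Optional.empty(); // valid
    }
}
```

An `Either`-style result or a dedicated error enum would work equally well; the point is that the failure branch carries information instead of collapsing to `false`.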
[jira] [Created] (KAFKA-9265) kafka.log.Log instances are leaking on log delete
Vikas Singh created KAFKA-9265: -- Summary: kafka.log.Log instances are leaking on log delete Key: KAFKA-9265 URL: https://issues.apache.org/jira/browse/KAFKA-9265 Project: Kafka Issue Type: Bug Reporter: Vikas Singh

KAFKA-8448 fixed a problem with a similar leak. The {{Log}} objects were being held by the {{ScheduledExecutor}} {{PeriodicProducerExpirationCheck}} callback. The fix in KAFKA-8448 was to change the policy of the {{ScheduledExecutor}} to remove a scheduled task when it gets cancelled (by calling {{setRemoveOnCancelPolicy(true)}}).

This works when a log is closed via the {{close()}} method. But when a log is deleted, either because the topic gets deleted or because a rebalancing operation moves the replica away from the broker, the {{delete()}} operation is invoked instead. {{Log.delete()}} doesn't cancel the pending scheduled task, which leaks the Log instance.

The fix is to cancel the scheduled task in the {{Log.delete()}} method too.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
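The executor behavior that KAFKA-8448 relied on can be demonstrated in isolation. A small sketch (the helper method is illustrative, not Kafka code): with {{setRemoveOnCancelPolicy(true)}}, cancelling a periodic task removes it from the executor's work queue immediately, releasing whatever the task's closure was holding (the Log instance, in the issue's scenario); without it, the cancelled task lingers in the queue until its scheduled time.

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CancelPolicyDemo {

    // Schedule a far-future periodic task, cancel it, and report how many
    // tasks remain queued. removeOnCancel=true drops the task at once.
    static int queueSizeAfterCancel(boolean removeOnCancel) {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
        executor.setRemoveOnCancelPolicy(removeOnCancel);
        ScheduledFuture<?> task = executor.scheduleAtFixedRate(
            () -> { /* would capture the Log instance */ }, 1, 1, TimeUnit.HOURS);
        task.cancel(false);
        int size = executor.getQueue().size();
        executor.shutdownNow();
        return size;
    }
}
```

This only helps when the task is actually cancelled, which is exactly the issue's point: {{Log.delete()}} never cancels the task, so the remove-on-cancel policy never fires.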
[jira] [Assigned] (KAFKA-9265) kafka.log.Log instances are leaking on log delete
[ https://issues.apache.org/jira/browse/KAFKA-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Singh reassigned KAFKA-9265: -- Assignee: Vikas Singh > kafka.log.Log instances are leaking on log delete > - > > Key: KAFKA-9265 > URL: https://issues.apache.org/jira/browse/KAFKA-9265 > Project: Kafka > Issue Type: Bug >Reporter: Vikas Singh >Assignee: Vikas Singh >Priority: Major > > KAFKA-8448 fixes problem with similar leak. The {{Log}} objects are being > held in {{ScheduledExecutor}} {{PeriodicProducerExpirationCheck}} callback. > The fix in KAFKA-8448 was to change the policy of {{ScheduledExecutor}} to > remove the scheduled task when it gets canceled (by calling > {{setRemoveOnCancelPolicy(true)}}). > This works when a log is closed using {{close()}} method. But when a log is > deleted either when the topic gets deleted or when the rebalancing operation > moves the replica away from broker, the {{delete()}} operation is invoked. > {{Log.delete()}} doesn't close the pending scheduled task and that leaks Log > instance. > Fix is to close the scheduled task in the {{Log.delete()}} method too. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (KAFKA-8988) Replace CreatePartitions request/response with automated protocol
Vikas Singh created KAFKA-8988: -- Summary: Replace CreatePartitions request/response with automated protocol Key: KAFKA-8988 URL: https://issues.apache.org/jira/browse/KAFKA-8988 Project: Kafka Issue Type: Sub-task Reporter: Vikas Singh Assignee: Vikas Singh -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (KAFKA-8813) Race condition when creating topics and changing their configuration
[ https://issues.apache.org/jira/browse/KAFKA-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Singh updated KAFKA-8813: --- Description: In Partition.createLog we do:
{code:java}
val props = stateStore.fetchTopicConfig()
val config = LogConfig.fromProps(logManager.currentDefaultConfig.originals, props)
val log = logManager.getOrCreateLog(topicPartition, config, isNew, isFutureReplica)
{code}
[https://github.com/apache/kafka/blob/33d06082117d971cdcddd4f01392006b543f3c01/core/src/main/scala/kafka/cluster/Partition.scala#L314-L316]
Config changes that arrive after configs are loaded from ZK, but before LogManager has added the partition to `futureLogs` or `currentLogs` (where the dynamic config handler picks up topics to update their configs), will be lost.

was: In Partition.createLog we do:
{code:java}
val props = stateStore.fetchTopicConfig()
val config = LogConfig.fromProps(logManager.currentDefaultConfig.originals, props)
val log = logManager.getOrCreateLog(topicPartition, config, isNew, isFutureReplica)
{code}
[https://github.com/apache/kafka/blob/33d06082117d971cdcddd4f01392006b543f3c01/core/src/main/scala/kafka/cluster/Partition.scala#L315-L316]
Config changes that arrive after configs are loaded from ZK, but before LogManager has added the partition to `futureLogs` or `currentLogs` (where the dynamic config handler picks up topics to update their configs), will be lost.
> Race condition when creating topics and changing their configuration > > > Key: KAFKA-8813 > URL: https://issues.apache.org/jira/browse/KAFKA-8813 > Project: Kafka > Issue Type: Bug >Reporter: Gwen Shapira >Assignee: Vikas Singh >Priority: Major > > In Partition.createLog we do: > {code:java} > val props = stateStore.fetchTopicConfig() > val config = LogConfig.fromProps(logManager.currentDefaultConfig.originals, > props) > val log = logManager.getOrCreateLog(topicPartition, config, isNew, > isFutureReplica) > {code} > [https://github.com/apache/kafka/blob/33d06082117d971cdcddd4f01392006b543f3c01/core/src/main/scala/kafka/cluster/Partition.scala#L314-L316|https://github.com/apache/kafka/blob/33d06082117d971cdcddd4f01392006b543f3c01/core/src/main/scala/kafka/cluster/Partition.scala#L314-L316] > Config changes that arrive after configs are loaded from ZK, but before > LogManager added the partition to `futureLogs` or `currentLogs` where the > dynamic config handlers picks up topics to update their configs, will be lost. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Assigned] (KAFKA-8813) Race condition when creating topics and changing their configuration
[ https://issues.apache.org/jira/browse/KAFKA-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Singh reassigned KAFKA-8813: -- Assignee: Vikas Singh > Race condition when creating topics and changing their configuration > > > Key: KAFKA-8813 > URL: https://issues.apache.org/jira/browse/KAFKA-8813 > Project: Kafka > Issue Type: Bug >Reporter: Gwen Shapira >Assignee: Vikas Singh >Priority: Major > > In Partition.createLog we do: > {code:java} > val config = LogConfig.fromProps(logManager.currentDefaultConfig.originals, > props) > val log = logManager.getOrCreateLog(topicPartition, config, isNew, > isFutureReplica) > {code} > https://github.com/apache/kafka/blob/33d06082117d971cdcddd4f01392006b543f3c01/core/src/main/scala/kafka/cluster/Partition.scala#L315-L316 > Config changes that arrive after configs are loaded from ZK, but before > LogManager added the partition to `futureLogs` or `currentLogs` where the > dynamic config handlers picks up topics to update their configs, will be lost. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (KAFKA-8813) Race condition when creating topics and changing their configuration
[ https://issues.apache.org/jira/browse/KAFKA-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916232#comment-16916232 ] Vikas Singh commented on KAFKA-8813: I am working on a fix for this issue. Expect a PR sometime this week. > Race condition when creating topics and changing their configuration > > > Key: KAFKA-8813 > URL: https://issues.apache.org/jira/browse/KAFKA-8813 > Project: Kafka > Issue Type: Bug >Reporter: Gwen Shapira >Assignee: Vikas Singh >Priority: Major > > In Partition.createLog we do: > {code:java} > val config = LogConfig.fromProps(logManager.currentDefaultConfig.originals, > props) > val log = logManager.getOrCreateLog(topicPartition, config, isNew, > isFutureReplica) > {code} > https://github.com/apache/kafka/blob/33d06082117d971cdcddd4f01392006b543f3c01/core/src/main/scala/kafka/cluster/Partition.scala#L315-L316 > Config changes that arrive after configs are loaded from ZK, but before > LogManager added the partition to `futureLogs` or `currentLogs` where the > dynamic config handlers picks up topics to update their configs, will be lost. -- This message was sent by Atlassian Jira (v8.3.2#803003)
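The lost-update window described in this issue can be modeled in a few lines. The sketch below is a hedged, single-threaded illustration, not Kafka code: the maps and the names LostUpdateDemo and createLogRacy are hypothetical stand-ins for the ZK-backed config store and LogManager's registry, and the racing dynamic-config update is injected at the exact point in the window.

```java
import java.util.HashMap;
import java.util.Map;

// Single-threaded model of the race; all names here are illustrative,
// not actual Kafka classes.
class LostUpdateDemo {
    // stands in for the ZK-backed topic config store
    static final Map<String, String> store = new HashMap<>();
    // stands in for LogManager's currentLogs/futureLogs registry that the
    // dynamic config handler scans when applying updates
    static final Map<String, Map<String, String>> registeredLogs = new HashMap<>();

    static Map<String, String> createLogRacy(String topic) {
        // step 1: snapshot the topic config (stateStore.fetchTopicConfig)
        Map<String, String> snapshot = new HashMap<>(store);
        // race window: this update arrives after the snapshot but before the
        // log is registered, so no config handler ever re-applies it
        store.put("retention.ms", "1000");
        // step 2: register the log with the now-stale snapshot
        registeredLogs.put(topic, snapshot);
        return registeredLogs.get(topic);
    }
}
```

Any update that lands in the window stays in the store but never reaches the registered log's config, which is exactly the behavior the report describes.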
[jira] [Commented] (KAFKA-8001) Fetch from future replica stalls when local replica becomes a leader
[ https://issues.apache.org/jira/browse/KAFKA-8001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16868153#comment-16868153 ] Vikas Singh commented on KAFKA-8001: Comment from Jason on the PR that needs to be taken care of: {noformat} I was discussing with @apovzner and we realized that there may be a few more issues here. For a normal replica, whenever we observe an epoch bump, we go through a reconciliation protocol to find a truncation point at which we know it is safe to resume fetching. Basically it works like this:
1. Follower observes epoch bump and enters Truncating state.
2. Follower sends OffsetsForLeaderEpoch query to leader with the latest epoch from its log.
3. Leader looks in its local log to find the largest epoch less than or equal to the requested epoch and returns its end offset.
4. The follower will truncate to this offset and then possibly go back to 2 if additional truncation is needed.
5. Once truncation is complete, the follower enters the Fetching state.
For a future replica, the protocol is basically the same, but rather than sending OffsetsForLeaderEpoch to the leader, we use the state from the local replica which may or may not be the leader of the bumped epoch. The basic problem we realized is that this log reconciliation is not safe while the local replica is in the Truncating state and we have nothing at the moment to guarantee that it is not in that state. Basically we have to wait until it has reached step 5 before the future replica can do its own truncation. I suspect probably what we have to do to fix this problem is move the state of the fetcher (i.e. Truncating|Fetching and the current epoch) out of AbstractFetcherThread and into something which can be accessed by the alter log dir fetcher. The most obvious candidate seems like kafka.cluster.Replica. 
Unfortunately, this may not be a small work, but I cannot think how to make this protocol work unless we do it.{noformat} > Fetch from future replica stalls when local replica becomes a leader > > > Key: KAFKA-8001 > URL: https://issues.apache.org/jira/browse/KAFKA-8001 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 2.1.0, 2.1.1 >Reporter: Anna Povzner >Assignee: Vikas Singh >Priority: Critical > > With KIP-320, fetch from follower / future replica returns > FENCED_LEADER_EPOCH if current leader epoch in the request is lower than the > leader epoch known to the leader (or local replica in case of future replica > fetching). In case of future replica fetching from the local replica, if > local replica becomes the leader of the partition, the next fetch from future > replica fails with FENCED_LEADER_EPOCH and fetching from future replica is > stopped until the next leader change. > Proposed solution: on local replica leader change, future replica should > "become a follower" again, and go through the truncation phase. Or we could > optimize it, and just update partition state of the future replica to reflect > the updated current leader epoch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
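The reconciliation protocol quoted above can be sketched as a tiny state machine. This is a hedged illustration under hypothetical names (FetcherSketch, truncateRound); it models only the Truncating/Fetching transition, not the real AbstractFetcherThread.

```java
// Hypothetical sketch of the Truncating -> Fetching state machine from the
// comment above; truncateRound models one pass through steps 2-4.
class FetcherSketch {
    enum State { TRUNCATING, FETCHING }

    State state = State.TRUNCATING;   // step 1: epoch bump observed
    long logEndOffset;

    FetcherSketch(long logEndOffset) { this.logEndOffset = logEndOffset; }

    // leaderEndOffsetForEpoch is the answer to the OffsetsForLeaderEpoch
    // query (steps 2-3). Returns true once truncation is complete (step 5).
    boolean truncateRound(long leaderEndOffsetForEpoch) {
        if (state != State.TRUNCATING)
            throw new IllegalStateException("already fetching");
        if (logEndOffset > leaderEndOffsetForEpoch) {
            logEndOffset = leaderEndOffsetForEpoch; // truncate (step 4)
            return false;                           // may loop back to step 2
        }
        state = State.FETCHING;                     // step 5
        return true;
    }
}
```

The point of the comment is that a future replica must not run its own truncateRound against the local replica until the local replica's state has reached FETCHING, which is why the state needs to live somewhere both fetchers can see.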
[jira] [Commented] (KAFKA-8525) Make log in Partion non-optional
[ https://issues.apache.org/jira/browse/KAFKA-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16865789#comment-16865789 ] Vikas Singh commented on KAFKA-8525: #2 was already done as part of KAFKA-8457. This Jira will only track the first case above in description. > Make log in Partion non-optional > > > Key: KAFKA-8525 > URL: https://issues.apache.org/jira/browse/KAFKA-8525 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 2.3.0 >Reporter: Vikas Singh >Assignee: Vikas Singh >Priority: Minor > > While moving log out of replica to partition as part of KAFKA-8457 cleaned a > bunch of code by removing code like "if (!localReplica) throw), there are > still couple of additional cleanups that can be done: > # The log object in Partition can be made non-optional. As it doesn't make > sense to have a partition w/o log. Here is comment on PR for KAFKA-8457: > {{I think it shouldn't be possible to have a Partition without a > corresponding Log. Once this is merged, I think we can look into whether we > can replace the optional log field in this class with a concrete instance.}} > # The LocalReplica class can be removed simplifying replica class. Here is > another comment on the PR: {{it might be possible to turn Replica into a > trait and then let Log implement it directly. Then we could get rid of > LocalReplica. That would also help us clean up RemoteReplica, since the usage > of LogOffsetMetadata only makes sense for the local replica.}} > Creating this JIRA to track these refactoring tasks for future. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (KAFKA-8457) Remove Log dependency from Replica
[ https://issues.apache.org/jira/browse/KAFKA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Singh resolved KAFKA-8457. Resolution: Fixed Fixed in commit 57baa4079d9fc14103411f790b9a025c9f2146a4 > Remove Log dependency from Replica > -- > > Key: KAFKA-8457 > URL: https://issues.apache.org/jira/browse/KAFKA-8457 > Project: Kafka > Issue Type: Bug > Components: core >Reporter: Vikas Singh >Assignee: Vikas Singh >Priority: Major > > A partition can have one log but many replicas. Putting log in replica meant > that we have to have if-else each time we need to access log. Moving the log > out of replica and in partition will make code simpler and it will also help > in testing where mocks will get simplified. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (KAFKA-8525) Make log in Partion non-optional
[ https://issues.apache.org/jira/browse/KAFKA-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Singh reassigned KAFKA-8525: -- Assignee: Vikas Singh > Make log in Partion non-optional > > > Key: KAFKA-8525 > URL: https://issues.apache.org/jira/browse/KAFKA-8525 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 2.3.0 >Reporter: Vikas Singh >Assignee: Vikas Singh >Priority: Minor > > While moving log out of replica to partition as part of KAFKA-8457 cleaned a > bunch of code by removing code like "if (!localReplica) throw), there are > still couple of additional cleanups that can be done: > # The log object in Partition can be made non-optional. As it doesn't make > sense to have a partition w/o log. Here is comment on PR for KAFKA-8457: > {{I think it shouldn't be possible to have a Partition without a > corresponding Log. Once this is merged, I think we can look into whether we > can replace the optional log field in this class with a concrete instance.}} > # The LocalReplica class can be removed simplifying replica class. Here is > another comment on the PR: {{it might be possible to turn Replica into a > trait and then let Log implement it directly. Then we could get rid of > LocalReplica. That would also help us clean up RemoteReplica, since the usage > of LogOffsetMetadata only makes sense for the local replica.}} > Creating this JIRA to track these refactoring tasks for future. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-8525) Make log in Partion non-optional
Vikas Singh created KAFKA-8525: -- Summary: Make log in Partion non-optional Key: KAFKA-8525 URL: https://issues.apache.org/jira/browse/KAFKA-8525 Project: Kafka Issue Type: Bug Components: core Affects Versions: 2.3.0 Reporter: Vikas Singh While moving the log out of Replica into Partition as part of KAFKA-8457 cleaned up a bunch of code by removing checks like "if (!localReplica) throw", there are still a couple of additional cleanups that can be done: # The log object in Partition can be made non-optional, as it doesn't make sense to have a partition without a log. Here is a comment on the PR for KAFKA-8457: {{I think it shouldn't be possible to have a Partition without a corresponding Log. Once this is merged, I think we can look into whether we can replace the optional log field in this class with a concrete instance.}} # The LocalReplica class can be removed, simplifying the replica class. Here is another comment on the PR: {{it might be possible to turn Replica into a trait and then let Log implement it directly. Then we could get rid of LocalReplica. That would also help us clean up RemoteReplica, since the usage of LogOffsetMetadata only makes sense for the local replica.}} Creating this JIRA to track these refactoring tasks for the future. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
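The second cleanup quoted above (turn Replica into a trait and let Log implement it directly) can be sketched in a few lines. The interface and class shapes below are hypothetical illustrations of the idea, not the actual Kafka classes.

```java
// Sketch: Replica as an interface that Log implements directly, which is
// what lets the LocalReplica wrapper class go away. Names are illustrative.
interface Replica {
    long logEndOffset();
}

// the local replica is just the log itself
class Log implements Replica {
    private long endOffset;
    Log(long endOffset) { this.endOffset = endOffset; }
    public long logEndOffset() { return endOffset; }
    void append(long records) { endOffset += records; }
}

// remote replicas keep only the offset metadata reported by the follower
class RemoteReplica implements Replica {
    private final long reportedEndOffset;
    RemoteReplica(long reportedEndOffset) { this.reportedEndOffset = reportedEndOffset; }
    public long logEndOffset() { return reportedEndOffset; }
}
```

With this shape there is no Option-typed log to unwrap for the local case, and the offset-tracking machinery stays confined to the remote-replica class.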
[jira] [Created] (KAFKA-8457) Remove Log dependency from Replica
Vikas Singh created KAFKA-8457: -- Summary: Remove Log dependency from Replica Key: KAFKA-8457 URL: https://issues.apache.org/jira/browse/KAFKA-8457 Project: Kafka Issue Type: Bug Components: core Reporter: Vikas Singh Assignee: Vikas Singh A partition can have one log but many replicas. Putting the log in the replica meant we had to have an if-else each time we needed to access the log. Moving the log out of Replica and into Partition will make the code simpler, and it will also help in testing, where mocks will get simplified. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (KAFKA-8341) AdminClient should retry coordinator lookup after NOT_COORDINATOR error
[ https://issues.apache.org/jira/browse/KAFKA-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Singh resolved KAFKA-8341. Resolution: Fixed Fixed in commit 46a02f3231cd6d340c622636159b9f59b4b3cb6e > AdminClient should retry coordinator lookup after NOT_COORDINATOR error > --- > > Key: KAFKA-8341 > URL: https://issues.apache.org/jira/browse/KAFKA-8341 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Vikas Singh >Priority: Major > > If a group operation (e.g. DescribeGroup) fails because the coordinator has > moved, the AdminClient should lookup the coordinator before retrying the > operation. Currently we will either fail or just retry anyway. This is > similar in some ways to controller rediscovery after getting NOT_CONTROLLER > errors. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
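The retry behavior described above (rediscover the coordinator on NOT_COORDINATOR instead of failing or retrying blindly) can be sketched as follows. This is a hedged sketch: Errors, lookupCoordinator, and describeGroup are stand-ins, not real AdminClient internals.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Illustrative retry loop: on NOT_COORDINATOR, re-run coordinator lookup
// before retrying the group operation. Names are hypothetical.
class CoordinatorRetrySketch {
    enum Errors { NONE, NOT_COORDINATOR }

    static String describeWithRetry(Supplier<String> lookupCoordinator,
                                    Function<String, Errors> describeGroup,
                                    int maxRetries) {
        String coordinator = lookupCoordinator.get();
        for (int i = 0; i < maxRetries; i++) {
            Errors err = describeGroup.apply(coordinator);
            if (err == Errors.NONE)
                return coordinator;              // operation succeeded here
            if (err == Errors.NOT_COORDINATOR)
                coordinator = lookupCoordinator.get(); // re-lookup, then retry
        }
        throw new RuntimeException("retries exhausted");
    }
}
```

This mirrors the controller-rediscovery pattern the issue mentions for NOT_CONTROLLER errors: the stale node identity is refreshed before the request is re-sent.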
[jira] [Created] (KAFKA-8398) NPE when unmapping files after moving log directories using AlterReplicaLogDirs
Vikas Singh created KAFKA-8398: -- Summary: NPE when unmapping files after moving log directories using AlterReplicaLogDirs Key: KAFKA-8398 URL: https://issues.apache.org/jira/browse/KAFKA-8398 Project: Kafka Issue Type: Bug Components: core Affects Versions: 2.2.0 Reporter: Vikas Singh Attachments: AlterReplicaLogDirs.txt The NPE occurs after the AlterReplicaLogDirs command completes successfully and when unmapping older regions. The relevant part of log is in attached log file. Here is the stacktrace (which is repeated for both index files): {code:java} [2019-05-20 14:08:13,999] ERROR Error unmapping index /tmp/kafka-logs/test-0.567a0d8ff88b45ab95794020d0b2e66f-delete/.index (kafka.log.OffsetIndex) java.lang.NullPointerException at org.apache.kafka.common.utils.MappedByteBuffers.unmap(MappedByteBuffers.java:73) at kafka.log.AbstractIndex.forceUnmap(AbstractIndex.scala:318) at kafka.log.AbstractIndex.safeForceUnmap(AbstractIndex.scala:308) at kafka.log.AbstractIndex.$anonfun$closeHandler$1(AbstractIndex.scala:257) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251) at kafka.log.AbstractIndex.closeHandler(AbstractIndex.scala:257) at kafka.log.AbstractIndex.deleteIfExists(AbstractIndex.scala:226) at kafka.log.LogSegment.$anonfun$deleteIfExists$6(LogSegment.scala:597) at kafka.log.LogSegment.delete$1(LogSegment.scala:585) at kafka.log.LogSegment.$anonfun$deleteIfExists$5(LogSegment.scala:597) at kafka.utils.CoreUtils$.$anonfun$tryAll$1(CoreUtils.scala:115) at kafka.utils.CoreUtils$.$anonfun$tryAll$1$adapted(CoreUtils.scala:114) at scala.collection.immutable.List.foreach(List.scala:392) at kafka.utils.CoreUtils$.tryAll(CoreUtils.scala:114) at kafka.log.LogSegment.deleteIfExists(LogSegment.scala:599) at kafka.log.Log.$anonfun$delete$3(Log.scala:1762) at kafka.log.Log.$anonfun$delete$3$adapted(Log.scala:1762) at scala.collection.Iterator.foreach(Iterator.scala:941) at 
scala.collection.Iterator.foreach$(Iterator.scala:941) at scala.collection.AbstractIterator.foreach(Iterator.scala:1429) at scala.collection.IterableLike.foreach(IterableLike.scala:74) at scala.collection.IterableLike.foreach$(IterableLike.scala:73) at scala.collection.AbstractIterable.foreach(Iterable.scala:56) at kafka.log.Log.$anonfun$delete$2(Log.scala:1762) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at kafka.log.Log.maybeHandleIOException(Log.scala:2013) at kafka.log.Log.delete(Log.scala:1759) at kafka.log.LogManager.deleteLogs(LogManager.scala:761) at kafka.log.LogManager.$anonfun$deleteLogs$6(LogManager.scala:775) at kafka.utils.KafkaScheduler.$anonfun$schedule$2(KafkaScheduler.scala:114) at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:63) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
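The stack trace shows MappedByteBuffers.unmap throwing when it is handed something null after the segment has already been torn down. A defensive guard along these lines (a hypothetical sketch, not the actual Kafka fix) would turn the redundant unmap into a no-op instead of an NPE.

```java
import java.nio.ByteBuffer;

// Defensive unmap guard (illustrative only): bail out rather than throw
// when there is nothing left to unmap.
class UnmapGuard {
    static boolean safeUnmap(ByteBuffer buffer) {
        // a null reference or a heap-backed buffer has no mapped region
        if (buffer == null || !buffer.isDirect())
            return false;
        // platform-specific cleaner invocation would go here
        return true;
    }
}
```

The same guard pattern also protects against the double-unmap that the "repeated for both index files" note suggests, since the second call simply returns false.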
[jira] [Assigned] (KAFKA-8001) Fetch from future replica stalls when local replica becomes a leader
[ https://issues.apache.org/jira/browse/KAFKA-8001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Singh reassigned KAFKA-8001: -- Assignee: Vikas Singh (was: Colin Hicks) > Fetch from future replica stalls when local replica becomes a leader > > > Key: KAFKA-8001 > URL: https://issues.apache.org/jira/browse/KAFKA-8001 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 2.1.0, 2.1.1 >Reporter: Anna Povzner >Assignee: Vikas Singh >Priority: Critical > > With KIP-320, fetch from follower / future replica returns > FENCED_LEADER_EPOCH if current leader epoch in the request is lower than the > leader epoch known to the leader (or local replica in case of future replica > fetching). In case of future replica fetching from the local replica, if > local replica becomes the leader of the partition, the next fetch from future > replica fails with FENCED_LEADER_EPOCH and fetching from future replica is > stopped until the next leader change. > Proposed solution: on local replica leader change, future replica should > "become a follower" again, and go through the truncation phase. Or we could > optimize it, and just update partition state of the future replica to reflect > the updated current leader epoch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)