[jira] [Created] (KAFKA-13432) ApiException should provide a way to capture stacktrace

2021-11-03 Thread Vikas Singh (Jira)
Vikas Singh created KAFKA-13432:
---

 Summary: ApiException should provide a way to capture stacktrace
 Key: KAFKA-13432
 URL: https://issues.apache.org/jira/browse/KAFKA-13432
 Project: Kafka
  Issue Type: Improvement
  Components: core
Reporter: Vikas Singh


ApiException doesn't fill in the stacktrace: it overrides `fillInStackTrace` to 
make it a no-op. Here is the code: 
https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/errors/ApiException.java#L45,L49

However, there are times when a full stacktrace is helpful in finding out 
what went wrong on the client side. We should make this behavior configurable, 
so that if an error is hit repeatedly, we can switch the behavior and find out 
which code is causing it.
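
For reference, the override in question is essentially the following:
{code:java}
/* avoid the expensive and useless stack trace for api exceptions */
@Override
public Throwable fillInStackTrace() {
    return this;
}
{code}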



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-13432) ApiException should provide a way to capture stacktrace

2021-11-03 Thread Vikas Singh (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17438341#comment-17438341
 ] 

Vikas Singh commented on KAFKA-13432:
-

One simple solution would be to fill in the stacktrace based on the log level, 
something like this:
{code:java}
/* avoid the expensive and useless stack trace for api exceptions */
@Override
public Throwable fillInStackTrace() {
    if (LOG.isDebugEnabled()) {
        return super.fillInStackTrace();
    } else {
        return this;
    }
}
{code}

> ApiException should provide a way to capture stacktrace
> ---
>
> Key: KAFKA-13432
> URL: https://issues.apache.org/jira/browse/KAFKA-13432
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Reporter: Vikas Singh
>Priority: Major
>
> ApiException doesn't fill in the stacktrace: it overrides `fillInStackTrace` 
> to make it a no-op. Here is the code: 
> https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/errors/ApiException.java#L45,L49
> However, there are times when a full stacktrace is helpful in finding out 
> what went wrong on the client side. We should make this behavior 
> configurable, so that if an error is hit repeatedly, we can switch the 
> behavior and find out which code is causing it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-13517) Add ConfigurationKeys to ConfigResource class

2021-12-07 Thread Vikas Singh (Jira)
Vikas Singh created KAFKA-13517:
---

 Summary: Add ConfigurationKeys to ConfigResource class
 Key: KAFKA-13517
 URL: https://issues.apache.org/jira/browse/KAFKA-13517
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Affects Versions: 3.0.0, 2.8.1
Reporter: Vikas Singh
Assignee: Vikas Singh
 Fix For: 2.8.1


A list of {{ConfigResource}} objects is passed as an argument to the 
{{AdminClient::describeConfigs}} api to indicate the entities whose 
configuration should be fetched. The {{ConfigResource}} class is made up of two 
fields: the name and the type of the entity. Kafka returns *all* configurations 
for the entities provided to the admin client api.

This admin api in turn uses the {{DescribeConfigsRequest}} Kafka api to get the 
configuration for the entities in question. In addition to the name and type of 
the entity whose configuration to get, the Kafka {{DescribeConfigsResource}} 
structure also lets users provide a {{ConfigurationKeys}} list, which allows 
them to fetch only the configurations that are needed.

However, this field isn't exposed in the {{ConfigResource}} class used by 
AdminClient, so users of AdminClient have no way to ask for specific 
configurations. The API always returns *all* configurations, and users of 
{{AdminClient::describeConfigs}} then have to go over the returned list and 
filter out the config keys they are interested in, as illustrated below.
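
For illustration, here is a minimal sketch of the client-side filtering this 
forces today (the bootstrap server, topic name, and config key are placeholders):
{code:java}
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class DescribeOneConfig {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my-topic");
            // The broker returns *every* configuration of the topic ...
            Map<ConfigResource, Config> all =
                admin.describeConfigs(Collections.singleton(topic)).all().get();
            // ... and the caller filters out the single key it actually wanted.
            ConfigEntry minIsr = all.get(topic).get("min.insync.replicas");
            System.out.println(minIsr.name() + " = " + minIsr.value());
        }
    }
}
{code}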

This results in boilerplate code for all users of the 
{{AdminClient::describeConfigs}} api, in addition to being a wasteful use of 
resources. It becomes painful in large clusters, where fetching one 
configuration of all topics requires fetching every configuration of all 
topics, which can be huge in size. 

Creating this Jira to add the same field (i.e. {{ConfigurationKeys}}) to the 
{{ConfigResource}} structure to bring it to parity with the 
{{DescribeConfigsResource}} Kafka API structure. There should be no backward 
compatibility issue, as the field will be optional and the API will behave the 
same way when it is not specified (i.e. by passing null to the backend kafka 
api). 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (KAFKA-13517) Update Admin::describeConfigs to allow fetching specific configurations

2022-02-20 Thread Vikas Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Singh updated KAFKA-13517:

Summary: Update Admin::describeConfigs to allow fetching specific 
configurations  (was: Add ConfigurationKeys to ConfigResource class)

> Update Admin::describeConfigs to allow fetching specific configurations
> ---
>
> Key: KAFKA-13517
> URL: https://issues.apache.org/jira/browse/KAFKA-13517
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 2.8.1, 3.0.0
>Reporter: Vikas Singh
>Assignee: Vikas Singh
>Priority: Major
>
> A list of {{ConfigResource}} objects is passed as an argument to the 
> {{AdminClient::describeConfigs}} api to indicate the entities whose 
> configuration should be fetched. The {{ConfigResource}} class is made up of 
> two fields: the name and the type of the entity. Kafka returns *all* 
> configurations for the entities provided to the admin client api.
> This admin api in turn uses the {{DescribeConfigsRequest}} Kafka api to get 
> the configuration for the entities in question. In addition to the name and 
> type of the entity whose configuration to get, the Kafka 
> {{DescribeConfigsResource}} structure also lets users provide a 
> {{ConfigurationKeys}} list, which allows them to fetch only the 
> configurations that are needed.
> However, this field isn't exposed in the {{ConfigResource}} class used by 
> AdminClient, so users of AdminClient have no way to ask for specific 
> configurations. The API always returns *all* configurations, and users of 
> {{AdminClient::describeConfigs}} then have to go over the returned list and 
> filter out the config keys they are interested in.
> This results in boilerplate code for all users of the 
> {{AdminClient::describeConfigs}} api, in addition to being a wasteful use of 
> resources. It becomes painful in large clusters, where fetching one 
> configuration of all topics requires fetching every configuration of all 
> topics, which can be huge in size. 
> Creating this Jira to add the same field (i.e. {{ConfigurationKeys}}) to the 
> {{ConfigResource}} structure to bring it to parity with the 
> {{DescribeConfigsResource}} Kafka API structure. There should be no backward 
> compatibility issue, as the field will be optional and the API will behave 
> the same way when it is not specified (i.e. by passing null to the backend 
> kafka api). 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (KAFKA-13517) Update Admin::describeConfigs to allow fetching specific configurations

2022-02-20 Thread Vikas Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Singh updated KAFKA-13517:

Description: 
A list of {{ConfigResource}} objects is passed as an argument to the 
{{AdminClient::describeConfigs}} api to indicate the entities whose 
configuration should be fetched. The {{ConfigResource}} class is made up of two 
fields: the name and the type of the entity. Kafka returns *all* configurations 
for the entities provided to the admin client api.

This admin api in turn uses the {{DescribeConfigsRequest}} Kafka api to get the 
configuration for the entities in question. In addition to the name and type of 
the entity whose configuration to get, the Kafka {{DescribeConfigsResource}} 
structure also lets users provide a {{ConfigurationKeys}} list, which allows 
them to fetch only the configurations that are needed.

However, this field isn't exposed in the {{ConfigResource}} class used by 
AdminClient, so users of AdminClient have no way to ask for specific 
configurations. The API always returns *all* configurations, and users of 
{{AdminClient::describeConfigs}} then have to go over the returned list and 
filter out the config keys they are interested in.

This results in boilerplate code for all users of the 
{{AdminClient::describeConfigs}} api, in addition to being a wasteful use of 
resources. It becomes painful in large clusters, where fetching one 
configuration of all topics requires fetching every configuration of all 
topics, which can be huge in size. 

Creating this Jira to track changes proposed in KIP-823: 
https://confluentinc.atlassian.net/wiki/spaces/~953215099/pages/2707357918/KIP+Admin+describeConfigs+should+allow+fetching+specific+configurations

  was:
A list of {{ConfigResource}} objects is passed as an argument to the 
{{AdminClient::describeConfigs}} api to indicate the entities whose 
configuration should be fetched. The {{ConfigResource}} class is made up of two 
fields: the name and the type of the entity. Kafka returns *all* configurations 
for the entities provided to the admin client api.

This admin api in turn uses the {{DescribeConfigsRequest}} Kafka api to get the 
configuration for the entities in question. In addition to the name and type of 
the entity whose configuration to get, the Kafka {{DescribeConfigsResource}} 
structure also lets users provide a {{ConfigurationKeys}} list, which allows 
them to fetch only the configurations that are needed.

However, this field isn't exposed in the {{ConfigResource}} class used by 
AdminClient, so users of AdminClient have no way to ask for specific 
configurations. The API always returns *all* configurations, and users of 
{{AdminClient::describeConfigs}} then have to go over the returned list and 
filter out the config keys they are interested in.

This results in boilerplate code for all users of the 
{{AdminClient::describeConfigs}} api, in addition to being a wasteful use of 
resources. It becomes painful in large clusters, where fetching one 
configuration of all topics requires fetching every configuration of all 
topics, which can be huge in size. 

Creating this Jira to add the same field (i.e. {{ConfigurationKeys}}) to the 
{{ConfigResource}} structure to bring it to parity with the 
{{DescribeConfigsResource}} Kafka API structure. There should be no backward 
compatibility issue, as the field will be optional and the API will behave the 
same way when it is not specified (i.e. by passing null to the backend kafka 
api). 


> Update Admin::describeConfigs to allow fetching specific configurations
> ---
>
> Key: KAFKA-13517
> URL: https://issues.apache.org/jira/browse/KAFKA-13517
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 2.8.1, 3.0.0
>Reporter: Vikas Singh
>Assignee: Vikas Singh
>Priority: Major
>
> A list of {{ConfigResource}} objects is passed as an argument to the 
> {{AdminClient::describeConfigs}} api to indicate the entities whose 
> configuration should be fetched. The {{ConfigResource}} class is made up of 
> two fields: the name and the type of the entity. Kafka returns *all* 
> configurations for the entities provided to the admin client api.
> This admin api in turn uses the {{DescribeConfigsRequest}} Kafka api to get 
> the configuration for the entities in question. In addition to the name and 
> type of the entity whose configuration to get, the Kafka 
> {{DescribeConfigsResource}} structure also lets users provide a 
> {{ConfigurationKeys}} list, which allows them to fetch only the 
> configurations that are needed.
> However, this field isn't exposed in the {{ConfigResource}} class used by 
> AdminClient, so users of AdminClient have no way to ask for specific 
> configurations. The API always returns *all* configurations, and users of 
> {{AdminClient::describeConfigs}} then have to go over the returned list and 
> filter out the config keys they are interested in.

[jira] [Created] (KAFKA-9329) KafkaController::replicasAreValid should return error

2019-12-23 Thread Vikas Singh (Jira)
Vikas Singh created KAFKA-9329:
--

 Summary: KafkaController::replicasAreValid should return error
 Key: KAFKA-9329
 URL: https://issues.apache.org/jira/browse/KAFKA-9329
 Project: Kafka
  Issue Type: Bug
Reporter: Vikas Singh


The method currently returns a boolean indicating whether the replicas are 
valid. But in the failure case, any context on why the replicas are not valid 
is lost. We should return the error condition along with success/failure.

Maybe change the method name to something like `validateReplicas` too.
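
A hypothetical Java sketch of the idea (the real method lives in the Scala 
KafkaController; the error names and parameters here are invented for 
illustration): return the reason for failure instead of a bare boolean.
{code:java}
import java.util.Optional;

// An empty Optional means "replicas are valid"; otherwise the value says why not.
enum ReplicaValidationError { REPLICA_NOT_ASSIGNED, REPLICA_OFFLINE }

class ReplicaValidator {
    // Before: boolean replicasAreValid(...)  -- a false return loses all context.
    Optional<ReplicaValidationError> validateReplicas(boolean assigned, boolean online) {
        if (!assigned) return Optional.of(ReplicaValidationError.REPLICA_NOT_ASSIGNED);
        if (!online)   return Optional.of(ReplicaValidationError.REPLICA_OFFLINE);
        return Optional.empty();
    }
}
{code}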



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-9329) KafkaController::replicasAreValid should return error

2019-12-23 Thread Vikas Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Singh reassigned KAFKA-9329:
--

Assignee: Vikas Singh

> KafkaController::replicasAreValid should return error
> -
>
> Key: KAFKA-9329
> URL: https://issues.apache.org/jira/browse/KAFKA-9329
> Project: Kafka
>  Issue Type: Bug
>Reporter: Vikas Singh
>Assignee: Vikas Singh
>Priority: Major
>
> The method currently returns a boolean indicating whether the replicas are 
> valid. But in the failure case, any context on why the replicas are not 
> valid is lost. We should return the error condition along with 
> success/failure.
> Maybe change the method name to something like `validateReplicas` too.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9330) Calling AdminClient.close in the AdminClient completion callback causes deadlock

2019-12-23 Thread Vikas Singh (Jira)
Vikas Singh created KAFKA-9330:
--

 Summary: Calling AdminClient.close in the AdminClient completion 
callback causes deadlock
 Key: KAFKA-9330
 URL: https://issues.apache.org/jira/browse/KAFKA-9330
 Project: Kafka
  Issue Type: Bug
Reporter: Vikas Singh


The close method calls `Thread.join` to wait for the AdminClient thread to die, 
but that never happens when the thread calling join *is* the AdminClient 
thread. This causes the thread to block forever: a deadlock where it waits for 
itself to die.

`AdminClient.close` should check whether the thread calling close is the 
AdminClient thread itself and, if so, skip the join. The thread will then see 
the close condition in its main loop and exit, as sketched below.
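
A minimal, self-contained illustration of the pattern (a hypothetical class, 
not the actual KafkaAdminClient code): close() joins the service thread only 
when invoked from a different thread.
{code:java}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class CallbackSafeClient implements AutoCloseable {
    private final BlockingQueue<Runnable> callbacks = new LinkedBlockingQueue<>();
    private final AtomicBoolean closing = new AtomicBoolean(false);
    private final Thread serviceThread = new Thread(this::runLoop, "service-thread");

    public CallbackSafeClient() {
        serviceThread.start();
    }

    public void submit(Runnable callback) {
        callbacks.add(callback);
    }

    private void runLoop() {
        try {
            while (!closing.get()) {
                Runnable cb = callbacks.poll(100, TimeUnit.MILLISECONDS);
                if (cb != null) {
                    cb.run();                          // the callback may itself call close()
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    @Override
    public void close() throws InterruptedException {
        closing.set(true);                             // signal the main loop to exit
        if (Thread.currentThread() != serviceThread) { // the guard this ticket proposes
            serviceThread.join();                      // safe: we are not waiting on ourselves
        }
        // When close() runs on the service thread (inside a callback), the join is
        // skipped; the loop observes `closing` and exits after the callback returns.
    }

    public static void main(String[] args) throws Exception {
        CallbackSafeClient client = new CallbackSafeClient();
        client.submit(() -> {
            try {
                client.close();                        // close from inside a callback: no deadlock
            } catch (InterruptedException ignored) { }
        });
        Thread.sleep(300);                             // give the service thread time to exit
    }
}
{code}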



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-9330) Calling AdminClient.close in the AdminClient completion callback causes deadlock

2019-12-23 Thread Vikas Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Singh reassigned KAFKA-9330:
--

Assignee: Vikas Singh

> Calling AdminClient.close in the AdminClient completion callback causes 
> deadlock
> 
>
> Key: KAFKA-9330
> URL: https://issues.apache.org/jira/browse/KAFKA-9330
> Project: Kafka
>  Issue Type: Bug
>Reporter: Vikas Singh
>Assignee: Vikas Singh
>Priority: Major
>
> The close method calls `Thread.join` to wait for the AdminClient thread to 
> die, but that never happens when the thread calling join *is* the AdminClient 
> thread. This causes the thread to block forever: a deadlock where it waits 
> for itself to die. 
> `AdminClient.close` should check whether the thread calling close is the 
> AdminClient thread itself and, if so, skip the join. The thread will then see 
> the close condition in its main loop and exit.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9370) Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in progress

2020-01-06 Thread Vikas Singh (Jira)
Vikas Singh created KAFKA-9370:
--

 Summary: Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in 
progress
 Key: KAFKA-9370
 URL: https://issues.apache.org/jira/browse/KAFKA-9370
 Project: Kafka
  Issue Type: Bug
Reporter: Vikas Singh


`KafkaApis::handleCreatePartitionsRequest` returns `INVALID_TOPIC_EXCEPTION` if 
the topic is being deleted. Change it to return `UNKNOWN_TOPIC_OR_PARTITION` 
instead. After the delete topic api returns, the client should see the topic as 
deleted. The fact that deletion is processed in the background shouldn't have 
any impact.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-9370) Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in progress

2020-01-06 Thread Vikas Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Singh reassigned KAFKA-9370:
--

Assignee: Vikas Singh

> Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in progress
> --
>
> Key: KAFKA-9370
> URL: https://issues.apache.org/jira/browse/KAFKA-9370
> Project: Kafka
>  Issue Type: Bug
>Reporter: Vikas Singh
>Assignee: Vikas Singh
>Priority: Major
>
> `KafkaApis::handleCreatePartitionsRequest` returns `INVALID_TOPIC_EXCEPTION` 
> if the topic is being deleted. Change it to return 
> `UNKNOWN_TOPIC_OR_PARTITION` instead. After the delete topic api returns, the 
> client should see the topic as deleted. The fact that deletion is processed 
> in the background shouldn't have any impact.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9254) Updating Kafka Broker configuration dynamically twice reverts log configuration to default

2020-01-24 Thread Vikas Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Singh updated KAFKA-9254:
---
Summary: Updating Kafka Broker configuration dynamically twice reverts log 
configuration to default  (was: Topic level configuration failed)

> Updating Kafka Broker configuration dynamically twice reverts log 
> configuration to default
> --
>
> Key: KAFKA-9254
> URL: https://issues.apache.org/jira/browse/KAFKA-9254
> Project: Kafka
>  Issue Type: Bug
>  Components: config, log, replication
>Affects Versions: 2.0.1
>Reporter: fenghong
>Assignee: huxihx
>Priority: Critical
>
> We are engineers at Huobi and have encountered a Kafka bug: modifying 
> DynamicBrokerConfig more than 2 times invalidates unrelated topic-level 
> configuration.
> The bug can be reproduced as follows:
>  # Set the broker config min.insync.replicas=3 in server.properties
>  # Create topic test-1 and set the topic-level config min.insync.replicas=2
>  # Dynamically modify the configuration twice as shown below
> {code:java}
> bin/kafka-configs.sh --bootstrap-server xxx:9092 --entity-type brokers 
> --entity-default --alter --add-config log.message.timestamp.type=LogAppendTime
> bin/kafka-configs.sh --bootstrap-server xxx:9092 --entity-type brokers 
> --entity-default --alter --add-config log.retention.ms=60480
> {code}
>  # Stop a Kafka server and observe the exception shown below:
>  org.apache.kafka.common.errors.NotEnoughReplicasException: Number of insync 
> replicas for partition test-1-0 is [2], below required minimum [3]
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-8813) Race condition when creating topics and changing their configuration

2019-08-26 Thread Vikas Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Singh reassigned KAFKA-8813:
--

Assignee: Vikas Singh

> Race condition when creating topics and changing their configuration
> 
>
> Key: KAFKA-8813
> URL: https://issues.apache.org/jira/browse/KAFKA-8813
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gwen Shapira
>Assignee: Vikas Singh
>Priority: Major
>
> In Partition.createLog we do:
> {code:java}
> val config = LogConfig.fromProps(logManager.currentDefaultConfig.originals, 
> props)
> val log = logManager.getOrCreateLog(topicPartition, config, isNew, 
> isFutureReplica)
> {code}
> https://github.com/apache/kafka/blob/33d06082117d971cdcddd4f01392006b543f3c01/core/src/main/scala/kafka/cluster/Partition.scala#L315-L316
> Config changes that arrive after configs are loaded from ZK, but before 
> LogManager adds the partition to `futureLogs` or `currentLogs` (where the 
> dynamic config handler picks up topics whose configs need updating), will be lost.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (KAFKA-8813) Race condition when creating topics and changing their configuration

2019-08-26 Thread Vikas Singh (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916232#comment-16916232
 ] 

Vikas Singh commented on KAFKA-8813:


I am working on a fix for this issue. Expect a PR sometime this week.

> Race condition when creating topics and changing their configuration
> 
>
> Key: KAFKA-8813
> URL: https://issues.apache.org/jira/browse/KAFKA-8813
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gwen Shapira
>Assignee: Vikas Singh
>Priority: Major
>
> In Partition.createLog we do:
> {code:java}
> val config = LogConfig.fromProps(logManager.currentDefaultConfig.originals, 
> props)
> val log = logManager.getOrCreateLog(topicPartition, config, isNew, 
> isFutureReplica)
> {code}
> https://github.com/apache/kafka/blob/33d06082117d971cdcddd4f01392006b543f3c01/core/src/main/scala/kafka/cluster/Partition.scala#L315-L316
> Config changes that arrive after configs are loaded from ZK, but before 
> LogManager adds the partition to `futureLogs` or `currentLogs` (where the 
> dynamic config handler picks up topics whose configs need updating), will be lost.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (KAFKA-8813) Race condition when creating topics and changing their configuration

2019-09-03 Thread Vikas Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Singh updated KAFKA-8813:
---
Description: 
In Partition.createLog we do:
{code:java}
val props = stateStore.fetchTopicConfig()
val config = LogConfig.fromProps(logManager.currentDefaultConfig.originals, 
props)
val log = logManager.getOrCreateLog(topicPartition, config, isNew, 
isFutureReplica)
{code}
[https://github.com/apache/kafka/blob/33d06082117d971cdcddd4f01392006b543f3c01/core/src/main/scala/kafka/cluster/Partition.scala#L315-L316]

Config changes that arrive after configs are loaded from ZK, but before 
LogManager adds the partition to `futureLogs` or `currentLogs` (where the 
dynamic config handler picks up topics whose configs need updating), will be lost.

  was:
In Partition.createLog we do:
{code:java}
val config = LogConfig.fromProps(logManager.currentDefaultConfig.originals, 
props)
val log = logManager.getOrCreateLog(topicPartition, config, isNew, 
isFutureReplica)
{code}
https://github.com/apache/kafka/blob/33d06082117d971cdcddd4f01392006b543f3c01/core/src/main/scala/kafka/cluster/Partition.scala#L315-L316

Config changes that arrive after configs are loaded from ZK, but before 
LogManager adds the partition to `futureLogs` or `currentLogs` (where the 
dynamic config handler picks up topics whose configs need updating), will be lost.


> Race condition when creating topics and changing their configuration
> 
>
> Key: KAFKA-8813
> URL: https://issues.apache.org/jira/browse/KAFKA-8813
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gwen Shapira
>Assignee: Vikas Singh
>Priority: Major
>
> In Partition.createLog we do:
> {code:java}
> val props = stateStore.fetchTopicConfig()
> val config = LogConfig.fromProps(logManager.currentDefaultConfig.originals, 
> props)
> val log = logManager.getOrCreateLog(topicPartition, config, isNew, 
> isFutureReplica)
> {code}
> [https://github.com/apache/kafka/blob/33d06082117d971cdcddd4f01392006b543f3c01/core/src/main/scala/kafka/cluster/Partition.scala#L315-L316]
> Config changes that arrive after configs are loaded from ZK, but before 
> LogManager adds the partition to `futureLogs` or `currentLogs` (where the 
> dynamic config handler picks up topics whose configs need updating), will be lost.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (KAFKA-8813) Race condition when creating topics and changing their configuration

2019-09-03 Thread Vikas Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Singh updated KAFKA-8813:
---
Description: 
In Partition.createLog we do:
{code:java}
val props = stateStore.fetchTopicConfig()
val config = LogConfig.fromProps(logManager.currentDefaultConfig.originals, 
props)
val log = logManager.getOrCreateLog(topicPartition, config, isNew, 
isFutureReplica)
{code}
[https://github.com/apache/kafka/blob/33d06082117d971cdcddd4f01392006b543f3c01/core/src/main/scala/kafka/cluster/Partition.scala#L314-L316|https://github.com/apache/kafka/blob/33d06082117d971cdcddd4f01392006b543f3c01/core/src/main/scala/kafka/cluster/Partition.scala#L314-L316]

Config changes that arrive after configs are loaded from ZK, but before 
LogManager adds the partition to `futureLogs` or `currentLogs` (where the 
dynamic config handler picks up topics whose configs need updating), will be lost.

  was:
In Partition.createLog we do:
{code:java}
val props = stateStore.fetchTopicConfig()
val config = LogConfig.fromProps(logManager.currentDefaultConfig.originals, 
props)
val log = logManager.getOrCreateLog(topicPartition, config, isNew, 
isFutureReplica)
{code}
[https://github.com/apache/kafka/blob/33d06082117d971cdcddd4f01392006b543f3c01/core/src/main/scala/kafka/cluster/Partition.scala#L314-L316|https://github.com/apache/kafka/blob/33d06082117d971cdcddd4f01392006b543f3c01/core/src/main/scala/kafka/cluster/Partition.scala#L315-L316]

Config changes that arrive after configs are loaded from ZK, but before 
LogManager adds the partition to `futureLogs` or `currentLogs` (where the 
dynamic config handler picks up topics whose configs need updating), will be lost.


> Race condition when creating topics and changing their configuration
> 
>
> Key: KAFKA-8813
> URL: https://issues.apache.org/jira/browse/KAFKA-8813
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gwen Shapira
>Assignee: Vikas Singh
>Priority: Major
>
> In Partition.createLog we do:
> {code:java}
> val props = stateStore.fetchTopicConfig()
> val config = LogConfig.fromProps(logManager.currentDefaultConfig.originals, 
> props)
> val log = logManager.getOrCreateLog(topicPartition, config, isNew, 
> isFutureReplica)
> {code}
> [https://github.com/apache/kafka/blob/33d06082117d971cdcddd4f01392006b543f3c01/core/src/main/scala/kafka/cluster/Partition.scala#L314-L316|https://github.com/apache/kafka/blob/33d06082117d971cdcddd4f01392006b543f3c01/core/src/main/scala/kafka/cluster/Partition.scala#L314-L316]
> Config changes that arrive after configs are loaded from ZK, but before 
> LogManager adds the partition to `futureLogs` or `currentLogs` (where the 
> dynamic config handler picks up topics whose configs need updating), will be lost.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (KAFKA-8813) Race condition when creating topics and changing their configuration

2019-09-03 Thread Vikas Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Singh updated KAFKA-8813:
---
Description: 
In Partition.createLog we do:
{code:java}
val props = stateStore.fetchTopicConfig()
val config = LogConfig.fromProps(logManager.currentDefaultConfig.originals, 
props)
val log = logManager.getOrCreateLog(topicPartition, config, isNew, 
isFutureReplica)
{code}
[https://github.com/apache/kafka/blob/33d06082117d971cdcddd4f01392006b543f3c01/core/src/main/scala/kafka/cluster/Partition.scala#L314-L316|https://github.com/apache/kafka/blob/33d06082117d971cdcddd4f01392006b543f3c01/core/src/main/scala/kafka/cluster/Partition.scala#L315-L316]

Config changes that arrive after configs are loaded from ZK, but before 
LogManager adds the partition to `futureLogs` or `currentLogs` (where the 
dynamic config handler picks up topics whose configs need updating), will be lost.

  was:
In Partition.createLog we do:
{code:java}
val props = stateStore.fetchTopicConfig()
val config = LogConfig.fromProps(logManager.currentDefaultConfig.originals, 
props)
val log = logManager.getOrCreateLog(topicPartition, config, isNew, 
isFutureReplica)
{code}
[https://github.com/apache/kafka/blob/33d06082117d971cdcddd4f01392006b543f3c01/core/src/main/scala/kafka/cluster/Partition.scala#L315-L316]

Config changes that arrive after configs are loaded from ZK, but before 
LogManager adds the partition to `futureLogs` or `currentLogs` (where the 
dynamic config handler picks up topics whose configs need updating), will be lost.


> Race condition when creating topics and changing their configuration
> 
>
> Key: KAFKA-8813
> URL: https://issues.apache.org/jira/browse/KAFKA-8813
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gwen Shapira
>Assignee: Vikas Singh
>Priority: Major
>
> In Partition.createLog we do:
> {code:java}
> val props = stateStore.fetchTopicConfig()
> val config = LogConfig.fromProps(logManager.currentDefaultConfig.originals, 
> props)
> val log = logManager.getOrCreateLog(topicPartition, config, isNew, 
> isFutureReplica)
> {code}
> [https://github.com/apache/kafka/blob/33d06082117d971cdcddd4f01392006b543f3c01/core/src/main/scala/kafka/cluster/Partition.scala#L314-L316|https://github.com/apache/kafka/blob/33d06082117d971cdcddd4f01392006b543f3c01/core/src/main/scala/kafka/cluster/Partition.scala#L315-L316]
> Config changes that arrive after configs are loaded from ZK, but before 
> LogManager adds the partition to `futureLogs` or `currentLogs` (where the 
> dynamic config handler picks up topics whose configs need updating), will be lost.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (KAFKA-8988) Replace CreatePartitions request/response with automated protocol

2019-10-04 Thread Vikas Singh (Jira)
Vikas Singh created KAFKA-8988:
--

 Summary: Replace CreatePartitions request/response with automated 
protocol
 Key: KAFKA-8988
 URL: https://issues.apache.org/jira/browse/KAFKA-8988
 Project: Kafka
  Issue Type: Sub-task
Reporter: Vikas Singh
Assignee: Vikas Singh






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-9265) kafka.log.Log instances are leaking on log delete

2019-12-03 Thread Vikas Singh (Jira)
Vikas Singh created KAFKA-9265:
--

 Summary: kafka.log.Log instances are leaking on log delete
 Key: KAFKA-9265
 URL: https://issues.apache.org/jira/browse/KAFKA-9265
 Project: Kafka
  Issue Type: Bug
Reporter: Vikas Singh


KAFKA-8448 fixes a problem with a similar leak. The {{Log}} objects are being 
held by the {{ScheduledExecutor}} through the 
{{PeriodicProducerExpirationCheck}} callback. The fix in KAFKA-8448 was to 
change the policy of the {{ScheduledExecutor}} to remove the scheduled task 
when it gets canceled (by calling {{setRemoveOnCancelPolicy(true)}}).

This works when a log is closed using the {{close()}} method. But when a log is 
deleted, either because the topic gets deleted or because a rebalancing 
operation moves the replica away from the broker, the {{delete()}} operation is 
invoked instead. {{Log.delete()}} doesn't cancel the pending scheduled task, 
and that leaks the Log instance.

The fix is to cancel the scheduled task in the {{Log.delete()}} method too.
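
A small, self-contained Java illustration of the mechanism (generic code, not 
the actual kafka.log.Log, which is Scala): the executor's queue holds the 
periodic task, and the task's closure holds the object, so every disposal path 
has to cancel the task before the reference can be dropped.
{code:java}
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ScheduledTaskLeakDemo {
    private final ScheduledFuture<?> periodicCheck;

    ScheduledTaskLeakDemo(ScheduledThreadPoolExecutor executor) {
        // The scheduled lambda captures `this`, so the executor's queue pins this instance.
        this.periodicCheck = executor.scheduleAtFixedRate(this::check, 1, 1, TimeUnit.SECONDS);
    }

    private void check() { /* periodic work, e.g. a producer-expiration check */ }

    void close()  { periodicCheck.cancel(false); }  // the path KAFKA-8448 already covers
    void delete() { periodicCheck.cancel(false); }  // this ticket: delete() must cancel as well

    public static void main(String[] args) {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
        executor.setRemoveOnCancelPolicy(true);      // without this, cancelled tasks linger in the queue
        ScheduledTaskLeakDemo log = new ScheduledTaskLeakDemo(executor);
        log.delete();                                // cancelling lets the executor drop its reference
        executor.shutdown();
    }
}
{code}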



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-9265) kafka.log.Log instances are leaking on log delete

2019-12-03 Thread Vikas Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Singh reassigned KAFKA-9265:
--

Assignee: Vikas Singh

> kafka.log.Log instances are leaking on log delete
> -
>
> Key: KAFKA-9265
> URL: https://issues.apache.org/jira/browse/KAFKA-9265
> Project: Kafka
>  Issue Type: Bug
>Reporter: Vikas Singh
>Assignee: Vikas Singh
>Priority: Major
>
> KAFKA-8448 fixes a problem with a similar leak. The {{Log}} objects are being 
> held by the {{ScheduledExecutor}} through the 
> {{PeriodicProducerExpirationCheck}} callback. The fix in KAFKA-8448 was to 
> change the policy of the {{ScheduledExecutor}} to remove the scheduled task 
> when it gets canceled (by calling {{setRemoveOnCancelPolicy(true)}}).
> This works when a log is closed using the {{close()}} method. But when a log 
> is deleted, either because the topic gets deleted or because a rebalancing 
> operation moves the replica away from the broker, the {{delete()}} operation 
> is invoked instead. {{Log.delete()}} doesn't cancel the pending scheduled 
> task, and that leaks the Log instance.
> The fix is to cancel the scheduled task in the {{Log.delete()}} method too.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10531) KafkaBasedLog can sleep for negative values

2020-09-28 Thread Vikas Singh (Jira)
Vikas Singh created KAFKA-10531:
---

 Summary: KafkaBasedLog can sleep for negative values
 Key: KAFKA-10531
 URL: https://issues.apache.org/jira/browse/KAFKA-10531
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 2.6.0
Reporter: Vikas Singh
 Fix For: 2.6.1


{{time.milliseconds}} is not monotonic, so this code can throw:

{{java.lang.IllegalArgumentException: timeout value is negative}}

 
{code:java}
long started = time.milliseconds();
while (partitionInfos == null && time.milliseconds() - started < 
CREATE_TOPIC_TIMEOUT_MS) {
partitionInfos = consumer.partitionsFor(topic);
Utils.sleep(Math.min(time.milliseconds() - started, 1000));
}
{code}

We need to check for a negative value before sleeping; a sketch of one possible 
guard follows.
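
For illustration, a minimal sketch of such a guard applied to the loop quoted 
above (same surrounding fields assumed; not necessarily the actual fix):
{code:java}
long started = time.milliseconds();
while (partitionInfos == null && time.milliseconds() - started < CREATE_TOPIC_TIMEOUT_MS) {
    partitionInfos = consumer.partitionsFor(topic);
    // Clamp to [0, 1000] ms so a backwards clock step can never yield a negative timeout.
    long elapsed = time.milliseconds() - started;
    Utils.sleep(Math.min(Math.max(0L, elapsed), 1000L));
}
{code}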



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-10531) KafkaBasedLog can sleep for negative values

2020-09-28 Thread Vikas Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Singh reassigned KAFKA-10531:
---

Assignee: Vikas Singh

> KafkaBasedLog can sleep for negative values
> ---
>
> Key: KAFKA-10531
> URL: https://issues.apache.org/jira/browse/KAFKA-10531
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.6.0
>Reporter: Vikas Singh
>Assignee: Vikas Singh
>Priority: Major
> Fix For: 2.6.1
>
>
> {{time.milliseconds}} is not monotonic, so this code can throw:
> {{java.lang.IllegalArgumentException: timeout value is negative}}
>  
> {code:java}
> long started = time.milliseconds();
> while (partitionInfos == null && time.milliseconds() - started < 
> CREATE_TOPIC_TIMEOUT_MS) {
> partitionInfos = consumer.partitionsFor(topic);
> Utils.sleep(Math.min(time.milliseconds() - started, 1000));
> }
> {code}
> We need to check for a negative value before sleeping.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-9370) Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in progress

2020-09-28 Thread Vikas Singh (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203527#comment-17203527
 ] 

Vikas Singh commented on KAFKA-9370:


https://github.com/apache/kafka/pull/9347

> Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in progress
> --
>
> Key: KAFKA-9370
> URL: https://issues.apache.org/jira/browse/KAFKA-9370
> Project: Kafka
>  Issue Type: Bug
>Reporter: Vikas Singh
>Assignee: Vikas Singh
>Priority: Major
>
> `KafkaApis::handleCreatePartitionsRequest` returns `INVALID_TOPIC_EXCEPTION` 
> if the topic is being deleted. Change it to return 
> `UNKNOWN_TOPIC_OR_PARTITION` instead. After the delete topic api returns, the 
> client should see the topic as deleted. The fact that deletion is processed 
> in the background shouldn't have any impact.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Issue Comment Deleted] (KAFKA-9370) Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in progress

2020-09-28 Thread Vikas Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Singh updated KAFKA-9370:
---
Comment: was deleted

(was: https://github.com/apache/kafka/pull/9347)

> Return UNKNOWN_TOPIC_OR_PARTITION if topic deletion is in progress
> --
>
> Key: KAFKA-9370
> URL: https://issues.apache.org/jira/browse/KAFKA-9370
> Project: Kafka
>  Issue Type: Bug
>Reporter: Vikas Singh
>Assignee: Vikas Singh
>Priority: Major
>
> `KafkaApis::handleCreatePartitionsRequest` returns `INVALID_TOPIC_EXCEPTION` 
> if the topic is being deleted. Change it to return 
> `UNKNOWN_TOPIC_OR_PARTITION` instead. After the delete topic api returns, the 
> client should see the topic as deleted. The fact that deletion is processed 
> in the background shouldn't have any impact.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10175) MetadataCache::getCluster returns null for offline replicas

2020-06-16 Thread Vikas Singh (Jira)
Vikas Singh created KAFKA-10175:
---

 Summary: MetadataCache::getCluster returns null for offline 
replicas
 Key: KAFKA-10175
 URL: https://issues.apache.org/jira/browse/KAFKA-10175
 Project: Kafka
  Issue Type: Bug
Reporter: Vikas Singh


This line in the code always returns null:

https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/MetadataCache.scala#L272

The reason is that the `map(node)` part uses `aliveNodes` to create the `Node` 
object, defaulting to `null` otherwise. Offline replicas thus always end up as 
null. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-10175) MetadataCache::getCluster returns null for offline replicas

2020-06-16 Thread Vikas Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Singh updated KAFKA-10175:

Description: 
This line in the code always returns null:

> https://github.com/apache/kafka/blob/2.6/core/src/main/scala/kafka/server/MetadataCache.scala#L272

The reason is that the `map(node)` part uses `aliveNodes` to create the `Node` 
object, defaulting to `null` otherwise. Offline replicas thus always end up as 
null. 

  was:
This line in the code always returns null:

https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/MetadataCache.scala#L272

The reason is that the `map(node)` part uses `aliveNodes` to create the `Node` 
object, defaulting to `null` otherwise. Offline replicas thus always end up as 
null. 


> MetadataCache::getCluster returns null for offline replicas
> ---
>
> Key: KAFKA-10175
> URL: https://issues.apache.org/jira/browse/KAFKA-10175
> Project: Kafka
>  Issue Type: Bug
>Reporter: Vikas Singh
>Priority: Major
>
> This line in the code always returns null:
> > https://github.com/apache/kafka/blob/2.6/core/src/main/scala/kafka/server/MetadataCache.scala#L272
> The reason is that the `map(node)` part uses `aliveNodes` to create the 
> `Node` object, defaulting to `null` otherwise. Offline replicas thus always 
> end up as null. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-10175) MetadataCache::getCluster returns null for offline replicas

2020-06-16 Thread Vikas Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Singh updated KAFKA-10175:

Description: 
This line in the code always returns null:

[MetadataCache::getClusterMetadata|https://github.com/apache/kafka/blob/2.6/core/src/main/scala/kafka/server/MetadataCache.scala#L272]

The reason is that the `map(node)` part uses `aliveNodes` to create the `Node` 
object, defaulting to `null` otherwise. Offline replicas thus always end up as null.

  was:
This line in the code always returns null:

> https://github.com/apache/kafka/blob/2.6/core/src/main/scala/kafka/server/MetadataCache.scala#L272

The reason is that the `map(node)` part uses `aliveNodes` to create the `Node` 
object, defaulting to `null` otherwise. Offline replicas thus always end up as 
null. 


> MetadataCache::getCluster returns null for offline replicas
> ---
>
> Key: KAFKA-10175
> URL: https://issues.apache.org/jira/browse/KAFKA-10175
> Project: Kafka
>  Issue Type: Bug
>Reporter: Vikas Singh
>Priority: Major
>
> This line in the code always returns null:
> [MetadataCache::getClusterMetadata|https://github.com/apache/kafka/blob/2.6/core/src/main/scala/kafka/server/MetadataCache.scala#L272]
> The reason is that the `map(node)` part uses `aliveNodes` to create the 
> `Node` object, defaulting to `null` otherwise. Offline replicas thus always 
> end up as null.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-10175) MetadataCache::getClusterMetadata returns null for offline replicas

2020-06-16 Thread Vikas Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Singh updated KAFKA-10175:

Summary: MetadataCache::getClusterMetadata returns null for offline 
replicas  (was: MetadataCache::getCluster returns null for offline replicas)

> MetadataCache::getClusterMetadata returns null for offline replicas
> ---
>
> Key: KAFKA-10175
> URL: https://issues.apache.org/jira/browse/KAFKA-10175
> Project: Kafka
>  Issue Type: Bug
>Reporter: Vikas Singh
>Priority: Major
>
> This line in the code always returns null:
> [MetadataCache::getClusterMetadata|https://github.com/apache/kafka/blob/2.6/core/src/main/scala/kafka/server/MetadataCache.scala#L272]
> The reason is that the `map(node)` part uses `aliveNodes` to create the 
> `Node` object, defaulting to `null` otherwise. Offline replicas thus always 
> end up as null.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-10175) MetadataCache::getCluster returns null for offline replicas

2020-06-16 Thread Vikas Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Singh updated KAFKA-10175:

Description: 
This line in the code always returns null:

[MetadataCache::getClusterMetadata|https://github.com/apache/kafka/blob/2.6/core/src/main/scala/kafka/server/MetadataCache.scala#L272]

The reason is that the `map(node)` part uses `aliveNodes` to create the `Node` 
object, defaulting to `null` otherwise. Offline replicas thus always end up as null.

  was:
This line in the code always returns null:

[MetadataCache::getClusterMetadata|https://github.com/apache/kafka/blob/2.6/core/src/main/scala/kafka/server/MetadataCache.scala#L272]

The reason is that the `map(node)` part uses `aliveNodes` to create the `Node` 
object, defaulting to `null` otherwise. Offline replicas thus always end up as null.


> MetadataCache::getCluster returns null for offline replicas
> ---
>
> Key: KAFKA-10175
> URL: https://issues.apache.org/jira/browse/KAFKA-10175
> Project: Kafka
>  Issue Type: Bug
>Reporter: Vikas Singh
>Priority: Major
>
> This line in the code always returns null:
> [MetadataCache::getClusterMetadata|https://github.com/apache/kafka/blob/2.6/core/src/main/scala/kafka/server/MetadataCache.scala#L272]
> The reason is that the `map(node)` part uses `aliveNodes` to create the 
> `Node` object, defaulting to `null` otherwise. Offline replicas thus always 
> end up as null.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KAFKA-8001) Fetch from future replica stalls when local replica becomes a leader

2019-05-09 Thread Vikas Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Singh reassigned KAFKA-8001:
--

Assignee: Vikas Singh  (was: Colin Hicks)

> Fetch from future replica stalls when local replica becomes a leader
> 
>
> Key: KAFKA-8001
> URL: https://issues.apache.org/jira/browse/KAFKA-8001
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.1.0, 2.1.1
>Reporter: Anna Povzner
>Assignee: Vikas Singh
>Priority: Critical
>
> With KIP-320, fetch from follower / future replica returns 
> FENCED_LEADER_EPOCH if current leader epoch in the request is lower than the 
> leader epoch known to the leader (or local replica in case of future replica 
> fetching). In case of future replica fetching from the local replica, if 
> local replica becomes the leader of the partition, the next fetch from future 
> replica fails with FENCED_LEADER_EPOCH and fetching from future replica is 
> stopped until the next leader change. 
> Proposed solution: on local replica leader change, future replica should 
> "become a follower" again, and go through the truncation phase. Or we could 
> optimize it, and just update partition state of the future replica to reflect 
> the updated current leader epoch. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8398) NPE when unmapping files after moving log directories using AlterReplicaLogDirs

2019-05-20 Thread Vikas Singh (JIRA)
Vikas Singh created KAFKA-8398:
--

 Summary: NPE when unmapping files after moving log directories 
using AlterReplicaLogDirs
 Key: KAFKA-8398
 URL: https://issues.apache.org/jira/browse/KAFKA-8398
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 2.2.0
Reporter: Vikas Singh
 Attachments: AlterReplicaLogDirs.txt

The NPE occurs after the AlterReplicaLogDirs command completes successfully, 
when unmapping older regions. The relevant part of the log is in the attached 
log file. Here is the stacktrace (which is repeated for both index files):

 
{code:java}
[2019-05-20 14:08:13,999] ERROR Error unmapping index 
/tmp/kafka-logs/test-0.567a0d8ff88b45ab95794020d0b2e66f-delete/.index
 (kafka.log.OffsetIndex)
java.lang.NullPointerException
at 
org.apache.kafka.common.utils.MappedByteBuffers.unmap(MappedByteBuffers.java:73)
at kafka.log.AbstractIndex.forceUnmap(AbstractIndex.scala:318)
at kafka.log.AbstractIndex.safeForceUnmap(AbstractIndex.scala:308)
at kafka.log.AbstractIndex.$anonfun$closeHandler$1(AbstractIndex.scala:257)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251)
at kafka.log.AbstractIndex.closeHandler(AbstractIndex.scala:257)
at kafka.log.AbstractIndex.deleteIfExists(AbstractIndex.scala:226)
at kafka.log.LogSegment.$anonfun$deleteIfExists$6(LogSegment.scala:597)
at kafka.log.LogSegment.delete$1(LogSegment.scala:585)
at kafka.log.LogSegment.$anonfun$deleteIfExists$5(LogSegment.scala:597)
at kafka.utils.CoreUtils$.$anonfun$tryAll$1(CoreUtils.scala:115)
at kafka.utils.CoreUtils$.$anonfun$tryAll$1$adapted(CoreUtils.scala:114)
at scala.collection.immutable.List.foreach(List.scala:392)
at kafka.utils.CoreUtils$.tryAll(CoreUtils.scala:114)
at kafka.log.LogSegment.deleteIfExists(LogSegment.scala:599)
at kafka.log.Log.$anonfun$delete$3(Log.scala:1762)
at kafka.log.Log.$anonfun$delete$3$adapted(Log.scala:1762)
at scala.collection.Iterator.foreach(Iterator.scala:941)
at scala.collection.Iterator.foreach$(Iterator.scala:941)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
at scala.collection.IterableLike.foreach(IterableLike.scala:74)
at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
at kafka.log.Log.$anonfun$delete$2(Log.scala:1762)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at kafka.log.Log.maybeHandleIOException(Log.scala:2013)
at kafka.log.Log.delete(Log.scala:1759)
at kafka.log.LogManager.deleteLogs(LogManager.scala:761)
at kafka.log.LogManager.$anonfun$deleteLogs$6(LogManager.scala:775)
at kafka.utils.KafkaScheduler.$anonfun$schedule$2(KafkaScheduler.scala:114)
at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:63)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-8341) AdminClient should retry coordinator lookup after NOT_COORDINATOR error

2019-05-24 Thread Vikas Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Singh resolved KAFKA-8341.

Resolution: Fixed

Fixed in commit 46a02f3231cd6d340c622636159b9f59b4b3cb6e.

> AdminClient should retry coordinator lookup after NOT_COORDINATOR error
> ---
>
> Key: KAFKA-8341
> URL: https://issues.apache.org/jira/browse/KAFKA-8341
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Assignee: Vikas Singh
>Priority: Major
>
> If a group operation (e.g. DescribeGroup) fails because the coordinator has 
> moved, the AdminClient should lookup the coordinator before retrying the 
> operation. Currently we will either fail or just retry anyway. This is 
> similar in some ways to controller rediscovery after getting NOT_CONTROLLER 
> errors.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8457) Remove Log dependency from Replica

2019-05-31 Thread Vikas Singh (JIRA)
Vikas Singh created KAFKA-8457:
--

 Summary: Remove Log dependency from Replica
 Key: KAFKA-8457
 URL: https://issues.apache.org/jira/browse/KAFKA-8457
 Project: Kafka
  Issue Type: Bug
  Components: core
Reporter: Vikas Singh
Assignee: Vikas Singh


A partition can have one log but many replicas. Putting the log in the replica 
meant that we needed an if-else check each time we accessed the log. Moving the 
log out of the replica and into the partition will make the code simpler, and 
it will also help in testing, where mocks get simplified.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8525) Make log in Partition non-optional

2019-06-11 Thread Vikas Singh (JIRA)
Vikas Singh created KAFKA-8525:
--

 Summary: Make log in Partition non-optional
 Key: KAFKA-8525
 URL: https://issues.apache.org/jira/browse/KAFKA-8525
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 2.3.0
Reporter: Vikas Singh


While moving the log out of the replica and into the partition as part of 
KAFKA-8457 cleaned up a bunch of code by removing checks like 
"if (!localReplica) throw", there are still a couple of additional cleanups 
that can be done:
 # The log object in Partition can be made non-optional, as it doesn't make 
sense to have a partition without a log. Here is a comment on the PR for 
KAFKA-8457: {{I think it shouldn't be possible to have a Partition without a 
corresponding Log. Once this is merged, I think we can look into whether we can 
replace the optional log field in this class with a concrete instance.}}
 # The LocalReplica class can be removed, simplifying the replica class. Here 
is another comment on the PR: {{it might be possible to turn Replica into a 
trait and then let Log implement it directly. Then we could get rid of 
LocalReplica. That would also help us clean up RemoteReplica, since the usage 
of LogOffsetMetadata only makes sense for the local replica.}}

Creating this JIRA to track these refactoring tasks for the future.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KAFKA-8525) Make log in Partition non-optional

2019-06-11 Thread Vikas Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Singh reassigned KAFKA-8525:
--

Assignee: Vikas Singh

> Make log in Partition non-optional
> 
>
> Key: KAFKA-8525
> URL: https://issues.apache.org/jira/browse/KAFKA-8525
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.3.0
>Reporter: Vikas Singh
>Assignee: Vikas Singh
>Priority: Minor
>
> While moving the log out of the replica and into the partition as part of 
> KAFKA-8457 cleaned up a bunch of code by removing checks like 
> "if (!localReplica) throw", there are still a couple of additional cleanups 
> that can be done:
>  # The log object in Partition can be made non-optional, as it doesn't make 
> sense to have a partition without a log. Here is a comment on the PR for 
> KAFKA-8457: {{I think it shouldn't be possible to have a Partition without a 
> corresponding Log. Once this is merged, I think we can look into whether we 
> can replace the optional log field in this class with a concrete instance.}}
>  # The LocalReplica class can be removed, simplifying the replica class. Here 
> is another comment on the PR: {{it might be possible to turn Replica into a 
> trait and then let Log implement it directly. Then we could get rid of 
> LocalReplica. That would also help us clean up RemoteReplica, since the usage 
> of LogOffsetMetadata only makes sense for the local replica.}}
> Creating this JIRA to track these refactoring tasks for the future.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-8457) Remove Log dependency from Replica

2019-06-17 Thread Vikas Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Singh resolved KAFKA-8457.

Resolution: Fixed

Fixed in commit 57baa4079d9fc14103411f790b9a025c9f2146a4

> Remove Log dependency from Replica
> --
>
> Key: KAFKA-8457
> URL: https://issues.apache.org/jira/browse/KAFKA-8457
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: Vikas Singh
>Assignee: Vikas Singh
>Priority: Major
>
> A partition can have one log but many replicas. Putting the log in the 
> replica meant that we needed an if-else check each time we accessed the log. 
> Moving the log out of the replica and into the partition will make the code 
> simpler, and it will also help in testing, where mocks get simplified.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8525) Make log in Partition non-optional

2019-06-17 Thread Vikas Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16865789#comment-16865789
 ] 

Vikas Singh commented on KAFKA-8525:


#2 was already done as part of KAFKA-8457. This Jira will only track the first 
case in the description above.

> Make log in Partition non-optional
> 
>
> Key: KAFKA-8525
> URL: https://issues.apache.org/jira/browse/KAFKA-8525
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.3.0
>Reporter: Vikas Singh
>Assignee: Vikas Singh
>Priority: Minor
>
> While moving the log out of the replica and into the partition as part of 
> KAFKA-8457 cleaned up a bunch of code by removing checks like 
> "if (!localReplica) throw", there are still a couple of additional cleanups 
> that can be done:
>  # The log object in Partition can be made non-optional, as it doesn't make 
> sense to have a partition without a log. Here is a comment on the PR for 
> KAFKA-8457: {{I think it shouldn't be possible to have a Partition without a 
> corresponding Log. Once this is merged, I think we can look into whether we 
> can replace the optional log field in this class with a concrete instance.}}
>  # The LocalReplica class can be removed, simplifying the replica class. Here 
> is another comment on the PR: {{it might be possible to turn Replica into a 
> trait and then let Log implement it directly. Then we could get rid of 
> LocalReplica. That would also help us clean up RemoteReplica, since the usage 
> of LogOffsetMetadata only makes sense for the local replica.}}
> Creating this JIRA to track these refactoring tasks for the future.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8001) Fetch from future replica stalls when local replica becomes a leader

2019-06-19 Thread Vikas Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16868153#comment-16868153
 ] 

Vikas Singh commented on KAFKA-8001:


Comment from Jason on the PR that needs to be taken care of:
{noformat}
I was discussing with @apovzner and we realized that there may be a few more 
issues here. For a normal replica, whenever we observe an epoch bump, we go 
through a reconciliation protocol to find a truncation point at which we know 
it is safe to resume fetching. Basically it works like this:

1. Follower observes epoch bump and enters Truncating state.
2. Follower sends OffsetsForLeaderEpoch query to leader with the latest epoch 
from its log.
3. Leader looks in its local log to find the largest epoch less than or equal 
to the requested epoch and returns its end offset.
4. The follower will truncate to this offset and then possibly go back to 2 if 
additional truncation is needed.
5. Once truncation is complete, the follower enters the Fetching state.

For a future replica, the protocol is basically the same, but rather than 
sending OffsetsForLeaderEpoch to the leader, we use the state from the local 
replica which may or may not be the leader of the bumped epoch. The basic 
problem we realized is that this log reconciliation is not safe while the local 
replica is in the Truncating state and we have nothing at the moment to 
guarantee that it is not in that state. Basically we have to wait until it has 
reached step 5 before the future replica can do its own truncation.

I suspect probably what we have to do to fix this problem is move the state of 
the fetcher (i.e. Truncating|Fetching and the current epoch) out of 
AbstractFetcherThread and into something which can be accessed by alter log dir 
fetcher. The most obvious candidate seems like kafka.cluster.Replica. 
Unfortunately, this may not be a small work, but I cannot think how to make 
this protocol work unless we do it.{noformat}

> Fetch from future replica stalls when local replica becomes a leader
> 
>
> Key: KAFKA-8001
> URL: https://issues.apache.org/jira/browse/KAFKA-8001
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.1.0, 2.1.1
>Reporter: Anna Povzner
>Assignee: Vikas Singh
>Priority: Critical
>
> With KIP-320, fetch from follower / future replica returns 
> FENCED_LEADER_EPOCH if current leader epoch in the request is lower than the 
> leader epoch known to the leader (or local replica in case of future replica 
> fetching). In case of future replica fetching from the local replica, if 
> local replica becomes the leader of the partition, the next fetch from future 
> replica fails with FENCED_LEADER_EPOCH and fetching from future replica is 
> stopped until the next leader change. 
> Proposed solution: on local replica leader change, future replica should 
> "become a follower" again, and go through the truncation phase. Or we could 
> optimize it, and just update partition state of the future replica to reflect 
> the updated current leader epoch. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)