junrao commented on code in PR #19964:
URL: https://github.com/apache/kafka/pull/19964#discussion_r2156322849
##########
clients/src/main/java/org/apache/kafka/common/TopicIdPartition.java:
##########
@@ -78,6 +80,21 @@ public TopicPartition topicPartition() {
         return topicPartition;
     }

+    /**
+     * Checking if TopicIdPartition meant to be the same reference to same this object but doesn't have all the data.

Review Comment:
   same this object => this object



##########
core/src/test/scala/integration/kafka/api/ProducerSendWhileDeletionTest.scala:
##########
@@ -81,6 +86,59 @@ class ProducerSendWhileDeletionTest extends IntegrationTestHarness {
     assertEquals(topic, producer.send(new ProducerRecord(topic, null, "value".getBytes(StandardCharsets.UTF_8))).get.topic())
   }

+  @Test
+  def testSendWhileTopicGetRecreated(): Unit = {
+    val numRecords = 10
+    val topic = "topic"
+    val admin = createAdminClient()
+    val producer = createProducer()
+
+    try {
+      val fs = CompletableFuture.runAsync(() => {
+        for (_ <- 1 to 20) {
+          recreateTopic(admin, topic)
+        }
+      })
+      val producerFutures = new util.ArrayList[CompletableFuture[Void]]
+      for (_ <- 0 until numRecords) {
+        producerFutures.add(CompletableFuture.runAsync(() => {
+          for (i <- 0 until numRecords) {
+            producer.send(new ProducerRecord(topic, null, ("value" + i).getBytes(StandardCharsets.UTF_8)), new Callback() {
+              override def onCompletion(metadata: RecordMetadata, exception: Exception): Unit = {
+                assertNotNull(metadata)
+                assertNotEquals(metadata.offset, -1L)

Review Comment:
   Since the topic could be deleted, it's possible for the metadata to have -1, right? Also, could we verify that every record's callback is called?
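On the second question, one way to assert that every record's callback is invoked is to count completions against a latch. A minimal Java sketch, illustrative only and not the PR's code; the `sendAndVerify` helper and the 30-second timeout are assumptions:

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

final class CallbackCountSketch {
    // Sends numRecords records and asserts that exactly one callback fired per
    // record, whether the send completed with metadata or with an exception.
    static void sendAndVerify(Producer<byte[], byte[]> producer, String topic, int numRecords)
            throws InterruptedException {
        CountDownLatch allCallbacks = new CountDownLatch(numRecords);
        AtomicInteger fired = new AtomicInteger();
        for (int i = 0; i < numRecords; i++) {
            byte[] value = ("value" + i).getBytes(StandardCharsets.UTF_8);
            producer.send(new ProducerRecord<>(topic, null, value), (metadata, exception) -> {
                // The callback is invoked on success and on failure alike.
                fired.incrementAndGet();
                allCallbacks.countDown();
            });
        }
        producer.flush();
        if (!allCallbacks.await(30, TimeUnit.SECONDS))
            throw new AssertionError("only " + fired.get() + " of " + numRecords + " callbacks fired");
    }
}
```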
##########
core/src/test/scala/integration/kafka/api/ProducerSendWhileDeletionTest.scala:
##########
@@ -81,6 +86,59 @@ class ProducerSendWhileDeletionTest extends IntegrationTestHarness {
     assertEquals(topic, producer.send(new ProducerRecord(topic, null, "value".getBytes(StandardCharsets.UTF_8))).get.topic())
   }

+  @Test
+  def testSendWhileTopicGetRecreated(): Unit = {
+    val numRecords = 10
+    val topic = "topic"
+    val admin = createAdminClient()
+    val producer = createProducer()
+
+    try {
+      val fs = CompletableFuture.runAsync(() => {
+        for (_ <- 1 to 20) {
+          recreateTopic(admin, topic)
+        }
+      })
+      val producerFutures = new util.ArrayList[CompletableFuture[Void]]
+      for (_ <- 0 until numRecords) {
+        producerFutures.add(CompletableFuture.runAsync(() => {
+          for (i <- 0 until numRecords) {
+            producer.send(new ProducerRecord(topic, null, ("value" + i).getBytes(StandardCharsets.UTF_8)), new Callback() {
+              override def onCompletion(metadata: RecordMetadata, exception: Exception): Unit = {
+                assertNotNull(metadata)
+                assertNotEquals(metadata.offset, -1L)
+                assertNotEquals(metadata.timestamp, RecordBatch.NO_TIMESTAMP)
+              }
+            }).get()
+          }
+        }))
+      }
+      fs.join()
+      producerFutures.forEach(_.join)
+    } catch {
+      case e: TopicExistsException =>

Review Comment:
   Hmm, where would a TopicExistsException be thrown?



##########
core/src/test/scala/integration/kafka/api/ProducerSendWhileDeletionTest.scala:
##########
@@ -81,6 +86,59 @@ class ProducerSendWhileDeletionTest extends IntegrationTestHarness {
     assertEquals(topic, producer.send(new ProducerRecord(topic, null, "value".getBytes(StandardCharsets.UTF_8))).get.topic())
   }

+  @Test
+  def testSendWhileTopicGetRecreated(): Unit = {
+    val numRecords = 10
+    val topic = "topic"
+    val admin = createAdminClient()
+    val producer = createProducer()
+
+    try {
+      val fs = CompletableFuture.runAsync(() => {
+        for (_ <- 1 to 20) {
+          recreateTopic(admin, topic)
+        }
+      })
+      val producerFutures = new util.ArrayList[CompletableFuture[Void]]
+      for (_ <- 0 until numRecords) {
+        producerFutures.add(CompletableFuture.runAsync(() => {
+          for (i <- 0 until numRecords) {
+            producer.send(new ProducerRecord(topic, null, ("value" + i).getBytes(StandardCharsets.UTF_8)), new Callback() {
+              override def onCompletion(metadata: RecordMetadata, exception: Exception): Unit = {
+                assertNotNull(metadata)
+                assertNotEquals(metadata.offset, -1L)
+                assertNotEquals(metadata.timestamp, RecordBatch.NO_TIMESTAMP)
+              }
+            }).get()
+          }
+        }))
+      }
+      fs.join()
+      producerFutures.forEach(_.join)
+    } catch {
+      case e: TopicExistsException =>
+        admin.deleteTopics(util.List.of(topic)).all().get()
+    } finally {
+      admin.close()
+      producer.close()
+    }
+  }
+
+  def recreateTopic(admin: Admin, topic: String): Unit = {
+    waitForCondition(() => {
+      val topicId = if (admin.listTopics().names().get().contains(topic)) {
+        topicMetadata(admin, topic).topicId()
+      } else Uuid.ZERO_UUID
+      // don't wait for the physical delete
+      deleteTopicWithAdminRaw(admin, topic)

Review Comment:
   Hmm, `deleteTopicWithAdminRaw()` doesn't wait for the metadata propagation to the brokers. However, the producer only sees the deleted topic after the metadata is propagated. Is this test effective?
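If the intent is for the producer to actually observe each recreation, the test could wait until the brokers report a new topic id before starting the next round. A hedged Java sketch against the public Admin API; the helper name, partition count, replication factor, and timeout are illustrative assumptions, not the PR's code:

```java
import java.util.List;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.Uuid;

final class RecreateAndAwaitSketch {
    // Deletes and recreates a topic, then polls until describeTopics reports a
    // topic id different from the old one, i.e. the brokers see the new topic.
    static void recreateAndAwait(Admin admin, String topic, Uuid oldId) throws Exception {
        admin.deleteTopics(List.of(topic)).all().get();
        admin.createTopics(List.of(new NewTopic(topic, 2, (short) 1))).all().get();
        long deadline = System.currentTimeMillis() + 30_000;
        while (System.currentTimeMillis() < deadline) {
            try {
                Uuid newId = admin.describeTopics(List.of(topic))
                    .allTopicNames().get().get(topic).topicId();
                if (!newId.equals(oldId))
                    return; // recreation is visible to the brokers
            } catch (Exception e) {
                // e.g. UnknownTopicOrPartitionException while the delete/create settles
            }
            Thread.sleep(100);
        }
        throw new AssertionError("topic " + topic + " was not recreated in time");
    }
}
```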
##########
clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java:
##########
@@ -609,7 +607,18 @@ private void handleProduceResponse(ClientResponse response, Map<TopicPartition,
                         .collect(Collectors.toList()),
                     p.errorMessage(),
                     p.currentLeader());
-                ProducerBatch batch = batches.get(tp);
+                // Version 13 drop topic name and add support to topic id.
+                // We need to find batch based on topic id and partition index only as
+                // topic name in the response might be empty.
+                TopicIdPartition tpId = new TopicIdPartition(r.topicId(), p.index(), r.name());
+                ProducerBatch batch = batches.entrySet().stream()

Review Comment:
   This changes a map lookup to an iteration. Could we do some produce perf test (with multiple topic/partitions) to verify there is no performance degradation?
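If a perf test does show a regression, one option is to index the in-flight batches once per response so each partition lookup stays constant-time whether the response carries a topic name (v12 and below) or only a topic id (v13+). A sketch under assumptions; `BatchIndex` and its generic `Batch` parameter are hypothetical stand-ins, not the PR's code:

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.common.TopicIdPartition;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.Uuid;

final class BatchIndex<Batch> {
    private final Map<TopicPartition, Batch> byName = new HashMap<>();
    private final Map<Uuid, Map<Integer, Batch>> byId = new HashMap<>();

    // Built once per produce response from the in-flight batches map.
    BatchIndex(Map<TopicIdPartition, Batch> batches) {
        batches.forEach((k, v) -> {
            byName.put(k.topicPartition(), v);
            byId.computeIfAbsent(k.topicId(), id -> new HashMap<>()).put(k.partition(), v);
        });
    }

    // O(1) per response partition: try the topic name first, then the topic id.
    Batch find(String topicName, Uuid topicId, int partition) {
        if (topicName != null && !topicName.isEmpty())
            return byName.get(new TopicPartition(topicName, partition));
        Map<Integer, Batch> partitions = byId.get(topicId);
        return partitions == null ? null : partitions.get(partition);
    }
}
```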
##########
core/src/test/scala/integration/kafka/api/ProducerSendWhileDeletionTest.scala:
##########
@@ -81,6 +86,59 @@ class ProducerSendWhileDeletionTest extends IntegrationTestHarness {
     assertEquals(topic, producer.send(new ProducerRecord(topic, null, "value".getBytes(StandardCharsets.UTF_8))).get.topic())
   }

+  @Test
+  def testSendWhileTopicGetRecreated(): Unit = {
+    val numRecords = 10
+    val topic = "topic"
+    val admin = createAdminClient()
+    val producer = createProducer()
+
+    try {
+      val fs = CompletableFuture.runAsync(() => {
+        for (_ <- 1 to 20) {
+          recreateTopic(admin, topic)
+        }
+      })
+      val producerFutures = new util.ArrayList[CompletableFuture[Void]]
+      for (_ <- 0 until numRecords) {
+        producerFutures.add(CompletableFuture.runAsync(() => {
+          for (i <- 0 until numRecords) {
+            producer.send(new ProducerRecord(topic, null, ("value" + i).getBytes(StandardCharsets.UTF_8)), new Callback() {
+              override def onCompletion(metadata: RecordMetadata, exception: Exception): Unit = {
+                assertNotNull(metadata)
+                assertNotEquals(metadata.offset, -1L)
+                assertNotEquals(metadata.timestamp, RecordBatch.NO_TIMESTAMP)
+              }
+            }).get()
+          }
+        }))
+      }
+      fs.join()
+      producerFutures.forEach(_.join)
+    } catch {
+      case e: TopicExistsException =>
+        admin.deleteTopics(util.List.of(topic)).all().get()
+    } finally {
+      admin.close()
+      producer.close()
+    }
+  }
+
+  def recreateTopic(admin: Admin, topic: String): Unit = {

Review Comment:
   Could this be private?



##########
clients/src/main/java/org/apache/kafka/common/TopicIdPartition.java:
##########
@@ -78,6 +80,21 @@ public TopicPartition topicPartition() {
         return topicPartition;
     }

+    /**
+     * Checking if TopicIdPartition meant to be the same reference to same this object but doesn't have all the data.
+     * If topic name is empty and topic id is persisted then the method will rely on topic id only
+     * otherwise the method will rely on topic name.
+     * @return true if topic has same topicId and partition index as topic names some time might be empty.
+     */
+    public boolean same(TopicIdPartition tpId) {
+        if (Utils.isBlank(tpId.topic()) && !tpId.topicId.equals(Uuid.ZERO_UUID)) {
+            return topicId.equals(tpId.topicId) &&
+                topicPartition.partition() == tpId.partition();
+        } else {
+            return topicPartition.equals(tpId.topicPartition());

Review Comment:
   In the rare case that `Sender::topicIdsForBatches` returns 0 topic id (e.g. topic is deleted), we will pass along topicName -> 0 to handleProduceResponse(). The response will include empty topic and 0 topic id. It's important that we find a match in this case to avoid IllegalStateException. I am thinking that we should first try to do the comparison on topic name, if it's not empty. Otherwise, just do the comparison on topic id even if it's zero.



##########
clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java:
##########
@@ -609,7 +607,18 @@ private void handleProduceResponse(ClientResponse response, Map<TopicPartition,
                         .collect(Collectors.toList()),
                     p.errorMessage(),
                     p.currentLeader());
-                ProducerBatch batch = batches.get(tp);
+                // Version 13 drop topic name and add support to topic id.
+                // We need to find batch based on topic id and partition index only as
+                // topic name in the response might be empty.
+                TopicIdPartition tpId = new TopicIdPartition(r.topicId(), p.index(), r.name());
+                ProducerBatch batch = batches.entrySet().stream()
+                    .filter(entry -> entry.getKey().same(tpId))
+                    .map(Map.Entry::getValue).findFirst().orElse(null);
+
+                if (batch == null) {
+                    throw new IllegalStateException("batch created for " + tpId + " can't be found, " +
+                        "topic might be recreated after the batch creation.");

Review Comment:
   Hmm, the recreation of the topic shouldn't hit this IllegalStateException, right?
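For reference, a sketch of the name-first comparison suggested above. It also covers the empty-name, zero-id response, so a deleted or recreated topic should not trip the `IllegalStateException`. The wrapper class is a hypothetical stand-in for a method on `TopicIdPartition`, not the PR's code:

```java
import org.apache.kafka.common.TopicIdPartition;
import org.apache.kafka.common.utils.Utils;

final class TopicIdPartitionMatch {
    // Compare on the topic name when the response carries one; otherwise fall
    // back to the topic id, even when it is Uuid.ZERO_UUID, so the rare
    // "empty name, zero id" response still finds its in-flight batch.
    static boolean same(TopicIdPartition batchKey, TopicIdPartition fromResponse) {
        if (!Utils.isBlank(fromResponse.topic()))
            return batchKey.topicPartition().equals(fromResponse.topicPartition());
        return batchKey.topicId().equals(fromResponse.topicId())
            && batchKey.partition() == fromResponse.partition();
    }
}
```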