[jira] [Updated] (KAFKA-6291) Cannot close EmbeddedZookeeper on Windows
[ https://issues.apache.org/jira/browse/KAFKA-6291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viliam Durina updated KAFKA-6291: - Description: We created {{EmbeddedZookeeper}} and {{ZkClient}} for various tests using this code: {code:java} EmbeddedZookeeper zkServer = new EmbeddedZookeeper(); ZkClient zkClient = new ZkClient("127.0.0.1:" + zkServer.port(), 3, 3, ZKStringSerializer$.MODULE$); zkClient.close(); zkServer.shutdown(); {code} This works fine on Linux, but on Windows, the {{zkServer.shutdown()}} call fails with this exception: {code} [Thread-1] ERROR org.apache.kafka.test.TestUtils - Error deleting C:\Users\vilo\AppData\Local\Temp\kafka-5521621147877426083 java.nio.file.FileSystemException: C:\Users\vilo\AppData\Local\Temp\kafka-5521621147877426083\version-2\log.1: The process cannot access the file because it is being used by another process. at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102) at sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269) at sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103) at java.nio.file.Files.delete(Files.java:1126) at org.apache.kafka.common.utils.Utils$1.visitFile(Utils.java:630) at org.apache.kafka.common.utils.Utils$1.visitFile(Utils.java:619) at java.nio.file.Files.walkFileTree(Files.java:2670) at java.nio.file.Files.walkFileTree(Files.java:2742) at org.apache.kafka.common.utils.Utils.delete(Utils.java:619) at org.apache.kafka.test.TestUtils$1.run(TestUtils.java:184) java.nio.file.FileSystemException: C:\Users\vilo\AppData\Local\Temp\kafka-5521621147877426083\version-2\log.1: The process cannot access the file because it is being used by another process. 
at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102) at sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269) at sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103) at java.nio.file.Files.delete(Files.java:1126) at org.apache.kafka.common.utils.Utils$1.visitFile(Utils.java:630) at org.apache.kafka.common.utils.Utils$1.visitFile(Utils.java:619) at java.nio.file.Files.walkFileTree(Files.java:2670) at java.nio.file.Files.walkFileTree(Files.java:2742) at org.apache.kafka.common.utils.Utils.delete(Utils.java:619) at kafka.zk.EmbeddedZookeeper.shutdown(EmbeddedZookeeper.scala:53) at com.hazelcast.jet.KafkaTest.test(KafkaTest.java:32) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.junit.runner.JUnitCore.run(JUnitCore.java:137) at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68) at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47) at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242) at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70) {code} My workaround is to comment
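The deletion fails because ZooKeeper still holds an open handle to its transaction log (`version-2\log.1`) when the temp directory is removed, and Windows refuses to delete open files. One common mitigation in tests is to retry the delete with a short backoff so the handle has time to be released. The sketch below is a hypothetical standalone helper (not part of Kafka's `TestUtils` or the actual fix), shown only to illustrate the retry pattern:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class RetryingDelete {
    /**
     * Attempts to delete a directory tree, retrying to ride out transient
     * Windows file locks. Hypothetical helper, not Kafka API.
     */
    public static boolean deleteWithRetries(Path root, int attempts, long backoffMs)
            throws InterruptedException {
        for (int i = 0; i < attempts; i++) {
            try (Stream<Path> walk = Files.walk(root)) {
                // Delete children before parents (reverse depth order).
                walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                    try {
                        Files.deleteIfExists(p);
                    } catch (IOException ignored) {
                        // Likely still locked; a later attempt may succeed.
                    }
                });
            } catch (IOException ignored) {
                // Root may already be gone or unreadable; fall through to check.
            }
            if (Files.notExists(root)) {
                return true;
            }
            Thread.sleep(backoffMs);
        }
        return Files.notExists(root);
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("kafka-test-");
        Files.createFile(dir.resolve("log.1"));
        System.out.println("deleted=" + deleteWithRetries(dir, 3, 100));
    }
}
```

On Linux the first pass succeeds immediately, so the retry loop costs nothing there.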
[jira] [Created] (KAFKA-6291) Cannot close EmbeddedZookeeper on Windows
Viliam Durina created KAFKA-6291: Summary: Cannot close EmbeddedZookeeper on Windows Key: KAFKA-6291 URL: https://issues.apache.org/jira/browse/KAFKA-6291 Project: Kafka Issue Type: Bug Components: zkclient Affects Versions: 1.0.0, 0.11.0.0 Environment: Windows 10 (doesn't reproduce on Linux) JDK 8 Reporter: Viliam Durina We created {{EmbeddedZookeeper}} and {{ZkClient}} for various tests using this code: {{ EmbeddedZookeeper zkServer = new EmbeddedZookeeper(); ZkClient zkClient = new ZkClient("127.0.0.1" + ':' + zkServer.port(), 3, 3, ZKStringSerializer$.MODULE$); zkClient.close(); zkServer.shutdown(); }} This works fine on Linux, but on Windows, the {{zkServer.shutdown()}} call fails with this exception: {{ [Thread-1] ERROR org.apache.kafka.test.TestUtils - Error deleting C:\Users\vilo\AppData\Local\Temp\kafka-5521621147877426083 java.nio.file.FileSystemException: C:\Users\vilo\AppData\Local\Temp\kafka-5521621147877426083\version-2\log.1: The process cannot access the file because it is being used by another process. 
at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102) at sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269) at sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103) at java.nio.file.Files.delete(Files.java:1126) at org.apache.kafka.common.utils.Utils$1.visitFile(Utils.java:630) at org.apache.kafka.common.utils.Utils$1.visitFile(Utils.java:619) at java.nio.file.Files.walkFileTree(Files.java:2670) at java.nio.file.Files.walkFileTree(Files.java:2742) at org.apache.kafka.common.utils.Utils.delete(Utils.java:619) at org.apache.kafka.test.TestUtils$1.run(TestUtils.java:184) java.nio.file.FileSystemException: C:\Users\vilo\AppData\Local\Temp\kafka-5521621147877426083\version-2\log.1: The process cannot access the file because it is being used by another process. 
at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86) at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97) at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102) at sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269) at sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103) at java.nio.file.Files.delete(Files.java:1126) at org.apache.kafka.common.utils.Utils$1.visitFile(Utils.java:630) at org.apache.kafka.common.utils.Utils$1.visitFile(Utils.java:619) at java.nio.file.Files.walkFileTree(Files.java:2670) at java.nio.file.Files.walkFileTree(Files.java:2742) at org.apache.kafka.common.utils.Utils.delete(Utils.java:619) at kafka.zk.EmbeddedZookeeper.shutdown(EmbeddedZookeeper.scala:53) at com.hazelcast.jet.KafkaTest.test(KafkaTest.java:32) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.junit.runner.JUnitCore.run(JUnitCore.java:137) at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68) at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(Idea
[jira] [Updated] (KAFKA-6290) Kafka Connect fails when using cast transformation
[ https://issues.apache.org/jira/browse/KAFKA-6290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sudhir Pradhan updated KAFKA-6290: -- Description: I am facing same issue when consuming from KAFKA to HDFS with CAST TRANSFORMS. Any pointer please. My Connector : * {code:java} { "name": "hdfs-sink-avro-cast-test-stndln", "config": { "key.converter": "io.confluent.connect.avro.AvroConverter", "key.converter.schema.registry.url": "http://localhost:8081";, "value.converter": "io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url": "http://localhost:8081";, "key.converter.schemas.enable": "true", "value.converter.schemas.enable": "true", "internal.key.converter": "org.apache.kafka.connect.json.JsonConverter", "internal.value.converter": "org.apache.kafka.connect.json.JsonConverter", "internal.key.converter.schemas.enable": "false", "internal.value.converter.schemas.enable": "false", "offset.storage.file.filename": "/tmp/connect.offsets.avroHdfsConsumer.casttest.stndln", "offset.flush.interval.ms": "500", "parse.key": "true", "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector", "hadoop.home": "/usr/lib/hadoop", "hdfs.url": "hdfs://ip-127-34-56-789.us-east-1.compute.interna:8020", "topics": "avro_raw_KFK_SRP_USER_TEST_V,avro_raw_KFK_SRP_PG_HITS_TEST_V", "tasks.max": "1", "topics.dir": "/home/hadoop/kafka/data/streams/in/raw/casttest1", "logs.dir": "/home/hadoop/kafka/wal/streams/in/raw/casttest1", "hive.integration": "true", "hive.metastore.uris": "thrift://ip-127-34-56-789.us-east-1.compute.internal:9083", "schema.compatibility": "BACKWARD", "flush.size": "1", "rotate.interval.ms": "1000", "mode": "timestamp", "transforms": "Cast", "transforms.Cast.type": "org.apache.kafka.connect.transforms.Cast$Value", "transforms.Cast.spec": "residuals:float64,comp:float64" } } {code} Exception : * {code:java} [2017-11-16 01:14:39,719] ERROR Task hdfs-sink-avro-cast-test-stndln-0 threw an uncaught and unrecoverable exception 
(org.apache.kafka.connect.runtime.WorkerTask:148) org.apache.kafka.connect.errors.DataException: Invalid Java object for schema type INT64: class java.util.Date for field: "null" at org.apache.kafka.connect.data.ConnectSchema.validateValue(ConnectSchema.java:239) at org.apache.kafka.connect.data.ConnectSchema.validateValue(ConnectSchema.java:209) at org.apache.kafka.connect.data.Struct.put(Struct.java:214) at org.apache.kafka.connect.transforms.Cast.applyWithSchema(Cast.java:152) at org.apache.kafka.connect.transforms.Cast.apply(Cast.java:108) at org.apache.kafka.connect.runtime.TransformationChain.apply(TransformationChain.java:38) at org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:414) at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:250) at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:180) at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:148) at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:146) at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:190) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) [2017-11-16 01:14:39,719] ERROR Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:149) [2017-11-16 01:14:39,719] INFO Shutting down Hive executor service. (io.confluent.connect.hdfs.DataWriter:309) {code}
[jira] [Created] (KAFKA-6290) Kafka Connect fails when using cast transformation
Sudhir Pradhan created KAFKA-6290: - Summary: Kafka Connect fails when using cast transformation Key: KAFKA-6290 URL: https://issues.apache.org/jira/browse/KAFKA-6290 Project: Kafka Issue Type: Task Components: KafkaConnect Affects Versions: 0.11.0.0 Reporter: Sudhir Pradhan I am facing same issue when consuming from KAFKA to HDFS with CAST TRANSFORMS. Any pointer please. My Connector : * ``` { "name": "hdfs-sink-avro-cast-test-stndln", "config": { "key.converter": "io.confluent.connect.avro.AvroConverter", "key.converter.schema.registry.url": "http://localhost:8081";, "value.converter": "io.confluent.connect.avro.AvroConverter", "value.converter.schema.registry.url": "http://localhost:8081";, "key.converter.schemas.enable": "true", "value.converter.schemas.enable": "true", "internal.key.converter": "org.apache.kafka.connect.json.JsonConverter", "internal.value.converter": "org.apache.kafka.connect.json.JsonConverter", "internal.key.converter.schemas.enable": "false", "internal.value.converter.schemas.enable": "false", "offset.storage.file.filename": "/tmp/connect.offsets.avroHdfsConsumer.casttest.stndln", "offset.flush.interval.ms": "500", "parse.key": "true", "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector", "hadoop.home": "/usr/lib/hadoop", "hdfs.url": "hdfs://ip-127-34-56-789.us-east-1.compute.interna:8020", "topics": "avro_raw_KFK_SRP_USER_TEST_V,avro_raw_KFK_SRP_PG_HITS_TEST_V", "tasks.max": "1", "topics.dir": "/home/hadoop/kafka/data/streams/in/raw/casttest1", "logs.dir": "/home/hadoop/kafka/wal/streams/in/raw/casttest1", "hive.integration": "true", "hive.metastore.uris": "thrift://ip-127-34-56-789.us-east-1.compute.internal:9083", "schema.compatibility": "BACKWARD", "flush.size": "1", "rotate.interval.ms": "1000", "mode": "timestamp", "transforms": "Cast", "transforms.Cast.type": "org.apache.kafka.connect.transforms.Cast$Value", "transforms.Cast.spec": "residuals:float64,comp:float64" } } ``` Exception : * ``` [2017-11-16 01:14:39,719] ERROR 
Task hdfs-sink-avro-cast-test-stndln-0 threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:148) org.apache.kafka.connect.errors.DataException: Invalid Java object for schema type INT64: class java.util.Date for field: "null" at org.apache.kafka.connect.data.ConnectSchema.validateValue(ConnectSchema.java:239) at org.apache.kafka.connect.data.ConnectSchema.validateValue(ConnectSchema.java:209) at org.apache.kafka.connect.data.Struct.put(Struct.java:214) at org.apache.kafka.connect.transforms.Cast.applyWithSchema(Cast.java:152) at org.apache.kafka.connect.transforms.Cast.apply(Cast.java:108) at org.apache.kafka.connect.runtime.TransformationChain.apply(TransformationChain.java:38) at org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:414) at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:250) at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:180) at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:148) at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:146) at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:190) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) [2017-11-16 01:14:39,719] ERROR Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:149) [2017-11-16 01:14:39,719] INFO Shutting down Hive executor service. (io.confluent.connect.hdfs.DataWriter:309) ``` -- This message was sent by Atlassian JIRA (v6.4.14#64029)
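The `DataException` occurs because `Cast` does not handle logical types: the Avro field decodes to a `java.util.Date`, which `Cast` then tries to write into a plain INT64 schema. One commonly suggested alternative (available from Kafka 1.0.0 onward) is the `TimestampConverter` SMT, which understands Date/Timestamp logical types. A hedged sketch of the relevant config fragment, using a hypothetical field name `event_ts`; the numeric casts stay with `Cast`:

```json
{
  "transforms": "TsConv,Cast",
  "transforms.TsConv.type": "org.apache.kafka.connect.transforms.TimestampConverter$Value",
  "transforms.TsConv.field": "event_ts",
  "transforms.TsConv.target.type": "unix",
  "transforms.Cast.type": "org.apache.kafka.connect.transforms.Cast$Value",
  "transforms.Cast.spec": "residuals:float64,comp:float64"
}
```

Running `TimestampConverter` first turns the logical date into a plain int64 epoch value, so the subsequent transforms never see a `java.util.Date`.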
[jira] [Updated] (KAFKA-6290) Kafka Connect fails when using cast transformation
[ https://issues.apache.org/jira/browse/KAFKA-6290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sudhir Pradhan updated KAFKA-6290: -- Labels: confluent-connect confluent-kafka confluentic connect connect-api connect-transformation kafka kafka-connect transform (was: connect connect-api connect-transformation kafka kafka-connect transform) > Kafka Connect fails when using cast transformation > -- > > Key: KAFKA-6290 > URL: https://issues.apache.org/jira/browse/KAFKA-6290 > Project: Kafka > Issue Type: Task > Components: KafkaConnect >Affects Versions: 0.11.0.0 >Reporter: Sudhir Pradhan > Labels: confluent-connect, confluent-kafka, confluentic, > connect, connect-api, connect-transformation, kafka, kafka-connect, transform > > I am facing same issue when consuming from KAFKA to HDFS with CAST > TRANSFORMS. Any pointer please. > My Connector : > * > ``` > { > "name": "hdfs-sink-avro-cast-test-stndln", > "config": { > "key.converter": "io.confluent.connect.avro.AvroConverter", > "key.converter.schema.registry.url": "http://localhost:8081";, > "value.converter": "io.confluent.connect.avro.AvroConverter", > "value.converter.schema.registry.url": "http://localhost:8081";, > "key.converter.schemas.enable": "true", > "value.converter.schemas.enable": "true", > "internal.key.converter": "org.apache.kafka.connect.json.JsonConverter", > "internal.value.converter": "org.apache.kafka.connect.json.JsonConverter", > "internal.key.converter.schemas.enable": "false", > "internal.value.converter.schemas.enable": "false", > "offset.storage.file.filename": > "/tmp/connect.offsets.avroHdfsConsumer.casttest.stndln", > "offset.flush.interval.ms": "500", > "parse.key": "true", > "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector", > "hadoop.home": "/usr/lib/hadoop", > "hdfs.url": "hdfs://ip-127-34-56-789.us-east-1.compute.interna:8020", > "topics": "avro_raw_KFK_SRP_USER_TEST_V,avro_raw_KFK_SRP_PG_HITS_TEST_V", > "tasks.max": "1", > "topics.dir": 
"/home/hadoop/kafka/data/streams/in/raw/casttest1", > "logs.dir": "/home/hadoop/kafka/wal/streams/in/raw/casttest1", > "hive.integration": "true", > "hive.metastore.uris": > "thrift://ip-127-34-56-789.us-east-1.compute.internal:9083", > "schema.compatibility": "BACKWARD", > "flush.size": "1", > "rotate.interval.ms": "1000", > "mode": "timestamp", > "transforms": "Cast", > "transforms.Cast.type": "org.apache.kafka.connect.transforms.Cast$Value", > "transforms.Cast.spec": "residuals:float64,comp:float64" > } > } > ``` > Exception : > * > ``` > [2017-11-16 01:14:39,719] ERROR Task hdfs-sink-avro-cast-test-stndln-0 threw > an uncaught and unrecoverable exception > (org.apache.kafka.connect.runtime.WorkerTask:148) > org.apache.kafka.connect.errors.DataException: Invalid Java object for schema > type INT64: class java.util.Date for field: "null" > at > org.apache.kafka.connect.data.ConnectSchema.validateValue(ConnectSchema.java:239) > at > org.apache.kafka.connect.data.ConnectSchema.validateValue(ConnectSchema.java:209) > at org.apache.kafka.connect.data.Struct.put(Struct.java:214) > at > org.apache.kafka.connect.transforms.Cast.applyWithSchema(Cast.java:152) > at org.apache.kafka.connect.transforms.Cast.apply(Cast.java:108) > at > org.apache.kafka.connect.runtime.TransformationChain.apply(TransformationChain.java:38) > at > org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:414) > at > org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:250) > at > org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:180) > at > org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:148) > at > org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:146) > at > org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:190) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > [2017-11-16 01:14:39,719] ERROR Task is being killed and will not recover > until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:149) > [2017-11-16 01:14:39,719] INFO Shutting down Hive executor service. > (io.confluent.connect.hdfs.DataWriter:309) > ``` -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-6289) NetworkClient should not return internal failed api version responses from poll
[ https://issues.apache.org/jira/browse/KAFKA-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273949#comment-16273949 ] ASF GitHub Bot commented on KAFKA-6289: --- GitHub user hachikuji opened a pull request: https://github.com/apache/kafka/pull/4280 KAFKA-6289: NetworkClient should not expose failed internal ApiVersions requests The NetworkClient internally sends ApiVersion requests to each broker following connection establishment. If this request happens to fail (perhaps due to an incompatible broker), the NetworkClient includes the response in the result of poll(). Applications will generally not be expecting this response, which may lead to failed assertions (or, in the case of AdminClient, an obscure log message). I've added test cases which await the ApiVersion request sent by NetworkClient to be in-flight, then disconnect the connection and verify that the response is not included in the result of poll(). ### Committer Checklist (excluded from commit message) - [ ] Verify design and implementation - [ ] Verify test coverage and CI build status - [ ] Verify documentation (including upgrade notes) You can merge this pull request into a Git repository by running: $ git pull https://github.com/hachikuji/kafka KAFKA-6289 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/4280.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4280 commit 3c7420b2ab51f8d918037681ed9862f0d2676ba7 Author: Jason Gustafson Date: 2017-12-01T05:03:54Z KAFKA-6289: NetworkClient should not expose failed internal ApiVersion requests > NetworkClient should not return internal failed api version responses from > poll > --- > > Key: KAFKA-6289 > URL: https://issues.apache.org/jira/browse/KAFKA-6289 > Project: Kafka > Issue Type: Bug >Reporter: Jason Gustafson >Assignee: Jason Gustafson > > In the AdminClient, if the 
initial ApiVersion request sent to the broker > fails, we see the following obscure message: > {code} > [2017-11-30 17:18:48,677] ERROR Internal server error on -2: server returned > information about unknown correlation ID 0. requestHeader = > {api_key=18,api_version=1,correlation_id=0,client_id=adminclient-3} > (org.apache.kafka.clients.admin.KafkaAdminClient) > {code} > What's happening is that the response to the internal ApiVersion request > which is received in NetworkClient is mistakenly being sent to the upper > layer (the admin client in this case). The admin wasn't expecting it, so we > see this message. Instead, the request should be handled internally in > NetworkClient. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KAFKA-6289) NetworkClient should not return internal failed api version responses from poll
Jason Gustafson created KAFKA-6289: -- Summary: NetworkClient should not return internal failed api version responses from poll Key: KAFKA-6289 URL: https://issues.apache.org/jira/browse/KAFKA-6289 Project: Kafka Issue Type: Bug Reporter: Jason Gustafson Assignee: Jason Gustafson In the AdminClient, if the initial ApiVersion request sent to the broker fails, we see the following obscure message: {code} [2017-11-30 17:18:48,677] ERROR Internal server error on -2: server returned information about unknown correlation ID 0. requestHeader = {api_key=18,api_version=1,correlation_id=0,client_id=adminclient-3} (org.apache.kafka.clients.admin.KafkaAdminClient) {code} What's happening is that the response to the internal ApiVersion request which is received in NetworkClient is mistakenly being sent to the upper layer (the admin client in this case). The admin wasn't expecting it, so we see this message. Instead, the request should be handled internally in NetworkClient. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
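The intended behavior can be modeled with a small amount of bookkeeping: each in-flight request records whether the client itself initiated it, and poll() filters completed responses accordingly. The following is a simplified, self-contained model of that idea (hypothetical classes and integer "responses", not the actual NetworkClient API):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class InternalResponseFilter {
    // Tracks, per correlation id, whether the client itself initiated the
    // request (e.g. the ApiVersions probe sent on connection setup).
    private final Map<Integer, Boolean> internalById = new HashMap<>();

    public void send(int correlationId, boolean internal) {
        internalById.put(correlationId, internal);
    }

    /**
     * Returns only the responses the application should see; responses to
     * internal requests (including failed ones) are swallowed here.
     */
    public List<Integer> poll(List<Integer> completedCorrelationIds) {
        List<Integer> forApp = new ArrayList<>();
        for (int id : completedCorrelationIds) {
            Boolean internal = internalById.remove(id);
            if (internal != null && !internal) {
                forApp.add(id);
            }
            // An internal response (e.g. a failed ApiVersions exchange)
            // would be processed here rather than surfaced to the caller.
        }
        return forApp;
    }

    public static void main(String[] args) {
        InternalResponseFilter client = new InternalResponseFilter();
        client.send(0, true);   // internal ApiVersions request
        client.send(1, false);  // application request
        System.out.println(client.poll(List.of(0, 1))); // prints [1]
    }
}
```

Without the filter, correlation id 0 would reach the application layer, which never sent it, producing exactly the kind of "unknown correlation ID" confusion described above.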
[jira] [Comment Edited] (KAFKA-6288) Broken symlink interrupts scanning the plugin path
[ https://issues.apache.org/jira/browse/KAFKA-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273872#comment-16273872 ] Ted Yu edited comment on KAFKA-6288 at 12/1/17 3:29 AM: How does the patch look (just for reference) ? Edit: Just saw this is taken by Konstantin. was (Author: yuzhih...@gmail.com): Cloning Kafka repo is slow (in China). How does the patch look ? > Broken symlink interrupts scanning the plugin path > -- > > Key: KAFKA-6288 > URL: https://issues.apache.org/jira/browse/KAFKA-6288 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 1.0.0 >Reporter: Yeva Byzek >Assignee: Konstantine Karantasis > Attachments: 6288.v1.txt > > > KAFKA-6087 introduced support for scanning relative symlinks in the plugin > path. However, if a relative symlink points to a target that doesn't exist, > then scanning the plugin path is interrupted. The consequence is that the > unscanned connectors in the plugin path may effectively not be usable. > Desired behavior is that the symlink with the non-existent target is skipped > and scanning the plugin path continues. > Example of error message: > {noformat} > [2017-11-30 20:19:26,226] ERROR Could not get listing for plugin path: > /usr/share/java. Ignoring. 
> (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:170) > java.nio.file.NoSuchFileException: /usr/share/java/name.jar > at > sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) > at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) > at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) > at sun.nio.fs.UnixPath.toRealPath(UnixPath.java:837) > at > org.apache.kafka.connect.runtime.isolation.PluginUtils.pluginUrls(PluginUtils.java:241) > at > org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.registerPlugin(DelegatingClassLoader.java:181) > at > org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:153) > at > org.apache.kafka.connect.runtime.isolation.Plugins.(Plugins.java:47) > at > org.apache.kafka.connect.cli.ConnectDistributed.main(ConnectDistributed.java:70) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KAFKA-6288) Broken symlink interrupts scanning the plugin path
[ https://issues.apache.org/jira/browse/KAFKA-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated KAFKA-6288: -- Attachment: 6288.v1.txt Cloning Kafka repo is slow (in China). How does the patch look? > Broken symlink interrupts scanning the plugin path > -- > > Key: KAFKA-6288 > URL: https://issues.apache.org/jira/browse/KAFKA-6288 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 1.0.0 >Reporter: Yeva Byzek >Assignee: Konstantine Karantasis > Attachments: 6288.v1.txt > > > KAFKA-6087 introduced support for scanning relative symlinks in the plugin > path. However, if a relative symlink points to a target that doesn't exist, > then scanning the plugin path is interrupted. The consequence is that the > unscanned connectors in the plugin path may effectively not be usable. > Desired behavior is that the symlink with the non-existent target is skipped > and scanning the plugin path continues. > Example of error message: > {noformat} > [2017-11-30 20:19:26,226] ERROR Could not get listing for plugin path: > /usr/share/java. Ignoring. 
> (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:170) > java.nio.file.NoSuchFileException: /usr/share/java/name.jar > at > sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) > at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) > at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) > at sun.nio.fs.UnixPath.toRealPath(UnixPath.java:837) > at > org.apache.kafka.connect.runtime.isolation.PluginUtils.pluginUrls(PluginUtils.java:241) > at > org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.registerPlugin(DelegatingClassLoader.java:181) > at > org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:153) > at > org.apache.kafka.connect.runtime.isolation.Plugins.(Plugins.java:47) > at > org.apache.kafka.connect.cli.ConnectDistributed.main(ConnectDistributed.java:70) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KAFKA-6288) Broken symlink interrupts scanning the plugin path
[ https://issues.apache.org/jira/browse/KAFKA-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yeva Byzek updated KAFKA-6288: -- Description: KAFKA-6087 introduced support for scanning relative symlinks in the plugin path. However, if a relative symlink points to a target that doesn't exist, then scanning the plugin path is interrupted. The consequence is that the unscanned connectors in the plugin path may effectively not be usable. Desired behavior is that the symlink with the non-existent target is skipped and scanning the plugin path continues. Example of error message: {noformat} [2017-11-30 20:19:26,226] ERROR Could not get listing for plugin path: /usr/share/java. Ignoring. (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:170) java.nio.file.NoSuchFileException: /usr/share/java/name.jar at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) at sun.nio.fs.UnixPath.toRealPath(UnixPath.java:837) at org.apache.kafka.connect.runtime.isolation.PluginUtils.pluginUrls(PluginUtils.java:241) at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.registerPlugin(DelegatingClassLoader.java:181) at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:153) at org.apache.kafka.connect.runtime.isolation.Plugins.(Plugins.java:47) at org.apache.kafka.connect.cli.ConnectDistributed.main(ConnectDistributed.java:70) {noformat} was: KAFKA-6087 introduced support for scanning relative symlinks in the plugin path. However, if a relative symlink points to a target that doesn't exist, then scanning the plugin path is interrupted. The consequence is that the unscanned connectors in the plugin path may effectively not be usable. Desired behavior is that the symlink is skipped and scanning the plugin path continues. 
Example of error message: {noformat} [2017-11-30 20:19:26,226] ERROR Could not get listing for plugin path: /usr/share/java. Ignoring. (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:170) java.nio.file.NoSuchFileException: /usr/share/java/name.jar at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) at sun.nio.fs.UnixPath.toRealPath(UnixPath.java:837) at org.apache.kafka.connect.runtime.isolation.PluginUtils.pluginUrls(PluginUtils.java:241) at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.registerPlugin(DelegatingClassLoader.java:181) at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:153) at org.apache.kafka.connect.runtime.isolation.Plugins.(Plugins.java:47) at org.apache.kafka.connect.cli.ConnectDistributed.main(ConnectDistributed.java:70) {noformat} > Broken symlink interrupts scanning the plugin path > -- > > Key: KAFKA-6288 > URL: https://issues.apache.org/jira/browse/KAFKA-6288 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 1.0.0 >Reporter: Yeva Byzek >Assignee: Konstantine Karantasis > > KAFKA-6087 introduced support for scanning relative symlinks in the plugin > path. However, if a relative symlink points to a target that doesn't exist, > then scanning the plugin path is interrupted. The consequence is that the > unscanned connectors in the plugin path may effectively not be usable. > Desired behavior is that the symlink with the non-existent target is skipped > and scanning the plugin path continues. > Example of error message: > {noformat} > [2017-11-30 20:19:26,226] ERROR Could not get listing for plugin path: > /usr/share/java. Ignoring. 
> (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:170) > java.nio.file.NoSuchFileException: /usr/share/java/name.jar > at > sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) > at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) > at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) > at sun.nio.fs.UnixPath.toRealPath(UnixPath.java:837) > at > org.apache.kafka.connect.runtime.isolation.PluginUtils.pluginUrls(PluginUtils.java:241) > at > org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.registerPlugin(DelegatingClassLoader.java:181) > at > org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(Delegatin
[jira] [Updated] (KAFKA-6288) Broken symlink interrupts scanning the plugin path
[ https://issues.apache.org/jira/browse/KAFKA-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yeva Byzek updated KAFKA-6288: -- Description: KAFKA-6087 introduced support for scanning relative symlinks in the plugin path. However, if a relative symlink points to a target that doesn't exist, then scanning the plugin path is interrupted. The consequence is that the unscanned connectors in the plugin path may effectively not be usable. Desired behavior is that the symlink is skipped and scanning the plugin path continues. Example of error message: {noformat} [2017-11-30 20:19:26,226] ERROR Could not get listing for plugin path: /usr/share/java. Ignoring. (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:170) java.nio.file.NoSuchFileException: /usr/share/java/name.jar at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) at sun.nio.fs.UnixPath.toRealPath(UnixPath.java:837) at org.apache.kafka.connect.runtime.isolation.PluginUtils.pluginUrls(PluginUtils.java:241) at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.registerPlugin(DelegatingClassLoader.java:181) at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:153) at org.apache.kafka.connect.runtime.isolation.Plugins.(Plugins.java:47) at org.apache.kafka.connect.cli.ConnectDistributed.main(ConnectDistributed.java:70) {noformat} was: KAFKA-6087 introduced support for scanning relative symlinks in the plugin path. However, if a relative symlink points to a target that doesn't exist, then scanning the plugin path is interrupted. The consequence is that the unscanned connectors in the plugin path may effectively not usable. Desired behavior is that the symlink is skipped and scanning the plugin path continues. 
Example of error message: {noformat} [2017-11-30 20:19:26,226] ERROR Could not get listing for plugin path: /usr/share/java. Ignoring. (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:170) java.nio.file.NoSuchFileException: /usr/share/java/name.jar at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) at sun.nio.fs.UnixPath.toRealPath(UnixPath.java:837) at org.apache.kafka.connect.runtime.isolation.PluginUtils.pluginUrls(PluginUtils.java:241) at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.registerPlugin(DelegatingClassLoader.java:181) at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:153) at org.apache.kafka.connect.runtime.isolation.Plugins.(Plugins.java:47) at org.apache.kafka.connect.cli.ConnectDistributed.main(ConnectDistributed.java:70) {noformat} > Broken symlink interrupts scanning the plugin path > -- > > Key: KAFKA-6288 > URL: https://issues.apache.org/jira/browse/KAFKA-6288 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 1.0.0 >Reporter: Yeva Byzek >Assignee: Konstantine Karantasis > > KAFKA-6087 introduced support for scanning relative symlinks in the plugin > path. However, if a relative symlink points to a target that doesn't exist, > then scanning the plugin path is interrupted. The consequence is that the > unscanned connectors in the plugin path may effectively not be usable. > Desired behavior is that the symlink is skipped and scanning the plugin path > continues. > Example of error message: > {noformat} > [2017-11-30 20:19:26,226] ERROR Could not get listing for plugin path: > /usr/share/java. Ignoring. 
> (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:170) > java.nio.file.NoSuchFileException: /usr/share/java/name.jar > at > sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) > at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) > at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) > at sun.nio.fs.UnixPath.toRealPath(UnixPath.java:837) > at > org.apache.kafka.connect.runtime.isolation.PluginUtils.pluginUrls(PluginUtils.java:241) > at > org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.registerPlugin(DelegatingClassLoader.java:181) > at > org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:153) > at > org.apache.kafka.connect
[jira] [Updated] (KAFKA-6288) Broken symlink interrupts scanning the plugin path
[ https://issues.apache.org/jira/browse/KAFKA-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yeva Byzek updated KAFKA-6288: -- Description: KAFKA-6087 introduced support for scanning relative symlinks in the plugin path. However, if a relative symlink points to a target that doesn't exist, then scanning the plugin path is interrupted. The consequence is that the unscanned connectors in the plugin path may effectively not usable. Desired behavior is that the symlink is skipped and scanning the plugin path continues. Example of error message: {noformat} [2017-11-30 20:19:26,226] ERROR Could not get listing for plugin path: /usr/share/java. Ignoring. (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:170) java.nio.file.NoSuchFileException: /usr/share/java/name.jar at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) at sun.nio.fs.UnixPath.toRealPath(UnixPath.java:837) at org.apache.kafka.connect.runtime.isolation.PluginUtils.pluginUrls(PluginUtils.java:241) at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.registerPlugin(DelegatingClassLoader.java:181) at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:153) at org.apache.kafka.connect.runtime.isolation.Plugins.(Plugins.java:47) at org.apache.kafka.connect.cli.ConnectDistributed.main(ConnectDistributed.java:70) {noformat} was: KAFKA-6087 introduced support for scanning relative symlinks in the plugin path. However, if a relative symlink points to a target that doesn't exist, then scanning the plugin path is interrupted. Desired behavior is that the symlink is skipped and scanning the plugin path continues. Example of error message: {noformat} [2017-11-30 20:19:26,226] ERROR Could not get listing for plugin path: /usr/share/java. Ignoring. 
(org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:170) java.nio.file.NoSuchFileException: /usr/share/java/name.jar at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) at sun.nio.fs.UnixPath.toRealPath(UnixPath.java:837) at org.apache.kafka.connect.runtime.isolation.PluginUtils.pluginUrls(PluginUtils.java:241) at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.registerPlugin(DelegatingClassLoader.java:181) at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:153) at org.apache.kafka.connect.runtime.isolation.Plugins.(Plugins.java:47) at org.apache.kafka.connect.cli.ConnectDistributed.main(ConnectDistributed.java:70) {noformat} > Broken symlink interrupts scanning the plugin path > -- > > Key: KAFKA-6288 > URL: https://issues.apache.org/jira/browse/KAFKA-6288 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 1.0.0 >Reporter: Yeva Byzek >Assignee: Konstantine Karantasis > > KAFKA-6087 introduced support for scanning relative symlinks in the plugin > path. However, if a relative symlink points to a target that doesn't exist, > then scanning the plugin path is interrupted. The consequence is that the > unscanned connectors in the plugin path may effectively not usable. > Desired behavior is that the symlink is skipped and scanning the plugin path > continues. > Example of error message: > {noformat} > [2017-11-30 20:19:26,226] ERROR Could not get listing for plugin path: > /usr/share/java. Ignoring. 
> (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:170) > java.nio.file.NoSuchFileException: /usr/share/java/name.jar > at > sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) > at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) > at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) > at sun.nio.fs.UnixPath.toRealPath(UnixPath.java:837) > at > org.apache.kafka.connect.runtime.isolation.PluginUtils.pluginUrls(PluginUtils.java:241) > at > org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.registerPlugin(DelegatingClassLoader.java:181) > at > org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:153) > at > org.apache.kafka.connect.runtime.isolation.Plugins.(Plugins.java:47) > at > org.apache.kafka.connect.cli.ConnectDistribu
[jira] [Assigned] (KAFKA-6288) Broken symlink interrupts scanning the plugin path
[ https://issues.apache.org/jira/browse/KAFKA-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantine Karantasis reassigned KAFKA-6288: - Assignee: Konstantine Karantasis > Broken symlink interrupts scanning the plugin path > -- > > Key: KAFKA-6288 > URL: https://issues.apache.org/jira/browse/KAFKA-6288 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 1.0.0 >Reporter: Yeva Byzek >Assignee: Konstantine Karantasis > > KAFKA-6087 introduced support for scanning relative symlinks in the plugin > path. However, if a relative symlink points to a target that doesn't exist, > then scanning the plugin path is interrupted. > Desired behavior is that the symlink is skipped and scanning the plugin path > continues. > Example of error message: > {noformat} > [2017-11-30 20:19:26,226] ERROR Could not get listing for plugin path: > /usr/share/java. Ignoring. > (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:170) > java.nio.file.NoSuchFileException: /usr/share/java/name.jar > at > sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) > at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) > at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) > at sun.nio.fs.UnixPath.toRealPath(UnixPath.java:837) > at > org.apache.kafka.connect.runtime.isolation.PluginUtils.pluginUrls(PluginUtils.java:241) > at > org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.registerPlugin(DelegatingClassLoader.java:181) > at > org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:153) > at > org.apache.kafka.connect.runtime.isolation.Plugins.(Plugins.java:47) > at > org.apache.kafka.connect.cli.ConnectDistributed.main(ConnectDistributed.java:70) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KAFKA-6288) Broken symlink interrupts scanning the plugin path
Yeva Byzek created KAFKA-6288: - Summary: Broken symlink interrupts scanning the plugin path Key: KAFKA-6288 URL: https://issues.apache.org/jira/browse/KAFKA-6288 Project: Kafka Issue Type: Bug Components: KafkaConnect Affects Versions: 1.0.0 Reporter: Yeva Byzek KAFKA-6087 introduced support for scanning relative symlinks in the plugin path. However, if a relative symlink points to a target that doesn't exist, then scanning the plugin path is interrupted. Desired behavior is that the symlink is skipped and scanning the plugin path continues. Example of error message: {noformat} [2017-11-30 20:19:26,226] ERROR Could not get listing for plugin path: /usr/share/java. Ignoring. (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:170) java.nio.file.NoSuchFileException: /usr/share/java/name.jar at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) at sun.nio.fs.UnixPath.toRealPath(UnixPath.java:837) at org.apache.kafka.connect.runtime.isolation.PluginUtils.pluginUrls(PluginUtils.java:241) at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.registerPlugin(DelegatingClassLoader.java:181) at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:153) at org.apache.kafka.connect.runtime.isolation.Plugins.(Plugins.java:47) at org.apache.kafka.connect.cli.ConnectDistributed.main(ConnectDistributed.java:70) {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
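The desired behavior described above can be sketched with plain `java.nio.file` calls. This is an illustrative sketch, not the actual PluginUtils code: the key point is that `Files.exists` follows symlinks by default, so a dangling symlink reports "does not exist" and can be skipped before `toRealPath()` (which is what throws the `NoSuchFileException` in the report) is ever called.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Sketch of the desired scanning behavior: list a plugin directory, resolve
// each entry to its real path, but skip symlinks whose target no longer
// exists instead of aborting the whole scan with NoSuchFileException.
class PluginPathScanner {

    static List<Path> scan(Path pluginPath) throws IOException {
        List<Path> plugins = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(pluginPath)) {
            for (Path entry : stream) {
                // Files.exists follows links, so a broken symlink is "absent".
                if (Files.isSymbolicLink(entry) && !Files.exists(entry)) {
                    continue;  // dangling symlink: skip it, keep scanning
                }
                plugins.add(entry.toRealPath());  // safe now: target exists
            }
        }
        return plugins;
    }
}
```

With this shape, one broken symlink in `/usr/share/java` no longer prevents the remaining plugins from being registered.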
[jira] [Commented] (KAFKA-6282) exactly_once semantics breaks demo application
[ https://issues.apache.org/jira/browse/KAFKA-6282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273817#comment-16273817 ] Guozhang Wang commented on KAFKA-6282: -- [~mjsax] I think we can update the config docs of `StreamsConfig` for this property to mention that exactly-once depends on transactional messaging with the default replication factor; users who want to turn on this config need to either make sure the number of brokers is no less than the replication factor, or override this broker-side config. We'd suggest the first approach, since in practice having fewer replicas (or none) is unsafe for persistence even though consistency is maintained. > exactly_once semantics breaks demo application > -- > > Key: KAFKA-6282 > URL: https://issues.apache.org/jira/browse/KAFKA-6282 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 1.0.0 > Environment: Tested on i7+24GB Ubuntu16.04 and i7+8GB Windows 7, with > cluster 1.0.0 and 0.11.0.0 and streams 1.0.0 and 0.11.0.0 >Reporter: Romans Markuns > Attachments: WordCountDemo.java, server.properties > > > +What I try to achieve+ > Do successful run of Kafka streams app with setting "processing.guarantee" > set to "exactly_once" > +How+ > Use Kafka quickstart example > (https://kafka.apache.org/10/documentation/streams/quickstart) and modify > only configuration parameters. > Things I've changed: > 1) Add one line to WordCountDemo: > {code:java} > props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, > StreamsConfig.EXACTLY_ONCE); > {code} > 2) Modify server.properties to be the same as we use in QA: set broker id to > 1, allow deleting topics via admin client and set initial rebalance delay to > 3 s. > +What I expect+ > Modified demo app works exactly as the original as presented in link above. > +What I get+ > 1) Original app works fine. Output topic after each line is submitted via > console producer. 
> 2) Modified app does not process topic record after it is submitted via > console producer. Streams remain in state REBALANCING, no errors or warnings > printed. MAIN thread forever blocks waiting TransactionCoordinator response > (CountdownLatch.await()) and this message getting printed: > [kafka-producer-network-thread | > streams-wordcount-client-StreamThread-1-0_0-producer] DEBUG > org.apache.kafka.clients.producer.internals.TransactionManager - [Producer > clientId=streams-wordcount-client-StreamThread-1-0_0-producer, > transactionalId=streams-wordcount-0_0] Enqueuing transactional request > (type=FindCoordinatorRequest, coordinatorKey=streams-wordcount-0_0, > coordinatorType=TRANSACTION) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
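The configuration mismatch Guozhang describes can be made concrete. Exactly-once relies on the transaction state log, whose replication settings come from broker-side configs (`transaction.state.log.replication.factor`, default 3, and `transaction.state.log.min.isr`, default 2); on a single-broker dev cluster like the quickstart's, either add brokers or lower those overrides. The sketch below shows both sides of the configuration; the helper class and method names are illustrative, not part of any Kafka API.

```java
import java.util.Properties;

// Illustrative sketch of the two-sided configuration for exactly-once on a
// single-broker dev cluster: broker-side overrides in server.properties plus
// the one Streams property the reporter added to WordCountDemo.
class ExactlyOnceConfig {

    // Broker side (server.properties): lower the transaction state log
    // requirements so a single broker can host it. Not safe for production.
    static Properties brokerOverrides() {
        Properties p = new Properties();
        p.setProperty("transaction.state.log.replication.factor", "1");
        p.setProperty("transaction.state.log.min.isr", "1");
        return p;
    }

    // Streams side: equivalent of
    // props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);
    static Properties streamsProps() {
        Properties p = new Properties();
        p.setProperty("processing.guarantee", "exactly_once");
        return p;
    }
}
```

Without the broker-side overrides, the transaction coordinator's internal topic cannot be created on a one-broker cluster, which matches the reporter's symptom of the app blocking on FindCoordinatorRequest.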
[jira] [Commented] (KAFKA-6282) exactly_once semantics breaks demo application
[ https://issues.apache.org/jira/browse/KAFKA-6282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273546#comment-16273546 ] Matthias J. Sax commented on KAFKA-6282: [~hachikuji] I'll take care of it. > exactly_once semantics breaks demo application > -- > > Key: KAFKA-6282 > URL: https://issues.apache.org/jira/browse/KAFKA-6282 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 1.0.0 > Environment: Tested on i7+24GB Ubuntu16.04 and i7+8GB Windows 7, with > cluster 1.0.0 and 0.11.0.0 and streams 1.0.0 and 0.11.0.0 >Reporter: Romans Markuns > Attachments: WordCountDemo.java, server.properties > > > +What I try to achieve+ > Do successful run of Kafka streams app with setting "processing.guarantee" > set to "exactly_once" > +How+ > Use Kafka quickstart example > (https://kafka.apache.org/10/documentation/streams/quickstart) and modify > only configuration parameters. > Things I've changed: > 1) Add one line to WordCountDemo: > {code:java} > props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, > StreamsConfig.EXACTLY_ONCE); > {code} > 2) Modify server.properties to be the same as we use in QA: set broker id to > 1, allow deleting topics via admin client and set initial rebalance delay to > 3 s. > +What I expect+ > Modified demo app works exactly as the original as presented in link above. > +What I get+ > 1) Original app works fine. Output topic after each line is submitted via > console producer. > 2) Modified app does not process topic record after it is submitted via > console producer. Streams remain in state REBALANCING, no errors or warnings > printed. 
MAIN thread forever blocks waiting TransactionCoordinator response > (CountdownLatch.await()) and this message getting printed: > [kafka-producer-network-thread | > streams-wordcount-client-StreamThread-1-0_0-producer] DEBUG > org.apache.kafka.clients.producer.internals.TransactionManager - [Producer > clientId=streams-wordcount-client-StreamThread-1-0_0-producer, > transactionalId=streams-wordcount-0_0] Enqueuing transactional request > (type=FindCoordinatorRequest, coordinatorKey=streams-wordcount-0_0, > coordinatorType=TRANSACTION) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-6284) System Test failed: ConnectRestApiTest
[ https://issues.apache.org/jira/browse/KAFKA-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273539#comment-16273539 ] ASF GitHub Bot commented on KAFKA-6284: --- Github user asfgit closed the pull request at: https://github.com/apache/kafka/pull/4279 > System Test failed: ConnectRestApiTest > --- > > Key: KAFKA-6284 > URL: https://issues.apache.org/jira/browse/KAFKA-6284 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 1.1.0 >Reporter: Mikkin Patel >Assignee: Mikkin Patel > Fix For: 1.1.0 > > > KAFKA-3073 introduced topic regex support for Connect sinks. The > ConnectRestApiTest failed to verify configdef with expected response. > {noformat} > Traceback (most recent call last): > File > "/home/jenkins/workspace/system-test-kafka-ci_trunk-STZ4BWS25LWQAJN6JJZHKK6GP4YF7HCVYRFRFXB2KMAQ4PC5H4ZA/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.7.1-py2.7.egg/ducktape/tests/runner_client.py", > line 132, in run > data = self.run_test() > File > "/home/jenkins/workspace/system-test-kafka-ci_trunk-STZ4BWS25LWQAJN6JJZHKK6GP4YF7HCVYRFRFXB2KMAQ4PC5H4ZA/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.7.1-py2.7.egg/ducktape/tests/runner_client.py", > line 185, in run_test > return self.test_context.function(self.test) > File > "/home/jenkins/workspace/system-test-kafka-ci_trunk-STZ4BWS25LWQAJN6JJZHKK6GP4YF7HCVYRFRFXB2KMAQ4PC5H4ZA/kafka/tests/kafkatest/tests/connect/connect_rest_test.py", > line 92, in test_rest_api > self.verify_config(self.FILE_SINK_CONNECTOR, self.FILE_SINK_CONFIGS, > configs) > File > "/home/jenkins/workspace/system-test-kafka-ci_trunk-STZ4BWS25LWQAJN6JJZHKK6GP4YF7HCVYRFRFXB2KMAQ4PC5H4ZA/kafka/tests/kafkatest/tests/connect/connect_rest_test.py", > line 200, in verify_config > assert config_def == set(config_names) > AssertionError > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (KAFKA-6284) System Test failed: ConnectRestApiTest
[ https://issues.apache.org/jira/browse/KAFKA-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava resolved KAFKA-6284. -- Resolution: Fixed Issue resolved by pull request 4279 [https://github.com/apache/kafka/pull/4279] > System Test failed: ConnectRestApiTest > --- > > Key: KAFKA-6284 > URL: https://issues.apache.org/jira/browse/KAFKA-6284 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 1.1.0 >Reporter: Mikkin Patel >Assignee: Mikkin Patel > Fix For: 1.1.0 > > > KAFKA-3073 introduced topic regex support for Connect sinks. The > ConnectRestApiTest failed to verify configdef with expected response. > {noformat} > Traceback (most recent call last): > File > "/home/jenkins/workspace/system-test-kafka-ci_trunk-STZ4BWS25LWQAJN6JJZHKK6GP4YF7HCVYRFRFXB2KMAQ4PC5H4ZA/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.7.1-py2.7.egg/ducktape/tests/runner_client.py", > line 132, in run > data = self.run_test() > File > "/home/jenkins/workspace/system-test-kafka-ci_trunk-STZ4BWS25LWQAJN6JJZHKK6GP4YF7HCVYRFRFXB2KMAQ4PC5H4ZA/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.7.1-py2.7.egg/ducktape/tests/runner_client.py", > line 185, in run_test > return self.test_context.function(self.test) > File > "/home/jenkins/workspace/system-test-kafka-ci_trunk-STZ4BWS25LWQAJN6JJZHKK6GP4YF7HCVYRFRFXB2KMAQ4PC5H4ZA/kafka/tests/kafkatest/tests/connect/connect_rest_test.py", > line 92, in test_rest_api > self.verify_config(self.FILE_SINK_CONNECTOR, self.FILE_SINK_CONFIGS, > configs) > File > "/home/jenkins/workspace/system-test-kafka-ci_trunk-STZ4BWS25LWQAJN6JJZHKK6GP4YF7HCVYRFRFXB2KMAQ4PC5H4ZA/kafka/tests/kafkatest/tests/connect/connect_rest_test.py", > line 200, in verify_config > assert config_def == set(config_names) > AssertionError > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-6282) exactly_once semantics breaks demo application
[ https://issues.apache.org/jira/browse/KAFKA-6282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273527#comment-16273527 ] Jason Gustafson commented on KAFKA-6282: [~bigmcmark] [~mjsax] Would either of you be willing to submit a patch to update the documentation? > exactly_once semantics breaks demo application > -- > > Key: KAFKA-6282 > URL: https://issues.apache.org/jira/browse/KAFKA-6282 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 1.0.0 > Environment: Tested on i7+24GB Ubuntu16.04 and i7+8GB Windows 7, with > cluster 1.0.0 and 0.11.0.0 and streams 1.0.0 and 0.11.0.0 >Reporter: Romans Markuns > Attachments: WordCountDemo.java, server.properties > > > +What I try to achieve+ > Do successful run of Kafka streams app with setting "processing.guarantee" > set to "exactly_once" > +How+ > Use Kafka quickstart example > (https://kafka.apache.org/10/documentation/streams/quickstart) and modify > only configuration parameters. > Things I've changed: > 1) Add one line to WordCountDemo: > {code:java} > props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, > StreamsConfig.EXACTLY_ONCE); > {code} > 2) Modify server.properties to be the same as we use in QA: set broker id to > 1, allow deleting topics via admin client and set initial rebalance delay to > 3 s. > +What I expect+ > Modified demo app works exactly as the original as presented in link above. > +What I get+ > 1) Original app works fine. Output topic after each line is submitted via > console producer. > 2) Modified app does not process topic record after it is submitted via > console producer. Streams remain in state REBALANCING, no errors or warnings > printed. 
MAIN thread forever blocks waiting TransactionCoordinator response > (CountdownLatch.await()) and this message getting printed: > [kafka-producer-network-thread | > streams-wordcount-client-StreamThread-1-0_0-producer] DEBUG > org.apache.kafka.clients.producer.internals.TransactionManager - [Producer > clientId=streams-wordcount-client-StreamThread-1-0_0-producer, > transactionalId=streams-wordcount-0_0] Enqueuing transactional request > (type=FindCoordinatorRequest, coordinatorKey=streams-wordcount-0_0, > coordinatorType=TRANSACTION) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-4669) KafkaProducer.flush hangs when NetworkClient.handleCompletedReceives throws exception
[ https://issues.apache.org/jira/browse/KAFKA-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273501#comment-16273501 ] Jason Gustafson commented on KAFKA-4669: Rajini is right that this is probably caused by some request failure on the broker (not the client as I suggested above). One interesting note from the logs provided by [~aartigupta] is that we appear to continue to use the connection in the client after we encounter this error. The correlationId of each subsequent request on the same connection is also off by one. To make the client more resilient, we probably should just close the connection. > KafkaProducer.flush hangs when NetworkClient.handleCompletedReceives throws > exception > - > > Key: KAFKA-4669 > URL: https://issues.apache.org/jira/browse/KAFKA-4669 > Project: Kafka > Issue Type: Bug > Components: clients >Affects Versions: 0.9.0.1 >Reporter: Cheng Ju >Assignee: Rajini Sivaram >Priority: Critical > Labels: reliability > Fix For: 0.11.0.1, 1.0.0 > > > There is no try catch in NetworkClient.handleCompletedReceives. If an > exception is thrown after inFlightRequests.completeNext(source), then the > corresponding RecordBatch's done will never get called, and > KafkaProducer.flush will hang on this RecordBatch. > I've checked 0.10 code and think this bug does exist in 0.10 versions. > A real case. 
First a correlation id mismatch exception happens: > 13 Jan 2017 17:08:24,059 ERROR [kafka-producer-network-thread | producer-21] > (org.apache.kafka.clients.producer.internals.Sender.run:130) - Uncaught > error in kafka producer I/O thread: > java.lang.IllegalStateException: Correlation id for response (703766) does > not match request (703764) > at > org.apache.kafka.clients.NetworkClient.correlate(NetworkClient.java:477) > at > org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:440) > at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:265) > at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:216) > at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:128) > at java.lang.Thread.run(Thread.java:745) > Then jstack shows the thread is hanging on: > at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) > at > org.apache.kafka.clients.producer.internals.ProduceRequestResult.await(ProduceRequestResult.java:57) > at > org.apache.kafka.clients.producer.internals.RecordAccumulator.awaitFlushCompletion(RecordAccumulator.java:425) > at > org.apache.kafka.clients.producer.KafkaProducer.flush(KafkaProducer.java:544) > at org.apache.flume.sink.kafka.KafkaSink.process(KafkaSink.java:224) > at > org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67) > at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145) > at java.lang.Thread.run(Thread.java:745) > client code -- This message was sent by Atlassian JIRA (v6.4.14#64029)
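Jason's suggestion above — closing the connection on a correlation id mismatch instead of continuing to read responses that are now off by one — can be sketched as follows. This is an illustrative model, not Kafka's actual NetworkClient internals; all names are hypothetical:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: track in-flight correlation ids per connection and
// mark the connection for close as soon as a response fails to match,
// rather than letting every later response be misattributed.
public class CorrelationCheck {
    private final Deque<Integer> inFlight = new ArrayDeque<>();
    private boolean closed = false;

    /** Record the correlation id of a request sent on this connection. */
    public void send(int correlationId) {
        inFlight.addLast(correlationId);
    }

    /** Returns true if the response matched the oldest in-flight request;
     *  otherwise flags the connection for close. */
    public boolean receive(int responseCorrelationId) {
        Integer expected = inFlight.pollFirst();
        if (expected == null || expected != responseCorrelationId) {
            closed = true;  // drop the connection instead of poisoning later requests
            return false;
        }
        return true;
    }

    public boolean isClosed() {
        return closed;
    }
}
```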
[jira] [Commented] (KAFKA-6282) exactly_once semantics breaks demo application
[ https://issues.apache.org/jira/browse/KAFKA-6282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273302#comment-16273302 ] Matthias J. Sax commented on KAFKA-6282: Glad it's resolved. Agreed that we should update some docs here. Thanks for the feedback! Really appreciate it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (KAFKA-6287) Inconsistent protocol type for empty consumer groups
Ryan Leslie created KAFKA-6287: -- Summary: Inconsistent protocol type for empty consumer groups Key: KAFKA-6287 URL: https://issues.apache.org/jira/browse/KAFKA-6287 Project: Kafka Issue Type: Bug Components: consumer Affects Versions: 1.0.0 Reporter: Ryan Leslie Priority: Minor When a consumer is created for a new group, the group metadata's protocol type is set to 'consumer' and this is stored both in __consumer_offsets as well as in the coordinator's local cache. If the consumer leaves the group and the group becomes empty, ListGroups requests will continue to show the group as type 'consumer', and as such kafka-consumer-groups.sh will show it via --list. However, if the coordinator (broker) node is killed and a new coordinator is elected, when the GroupMetadataManager loads the group from __consumer_offsets into its cache, it will not set the protocolType if there are no active consumers. As a result, the group's protocolType will now become the empty string (UNKNOWN_PROTOCOL_TYPE), and kafka-consumer-groups.sh will no longer show the group. Ideally bouncing a broker should not result in the group's protocol changing. protocolType can be set in GroupMetadataManager.readGroupMessageValue() to always reflect what's present in the persistent metadata (__consumer_offsets) regardless if there are active members. In general, things can get confusing when distinguishing between 'consumer' and non-consumer groups. For example, DescribeGroups and OffsetFetchRequest does not filter on protocol type, so it's possible for kafka-consumer-groups.sh to describe groups (--describe) without actually being able to list them. In the protocol guide, OffsetFetchRequest / OffsetCommitRequest have their parameters listed as 'ConsumerGroup', but in reality these can be used for groups of unknown type as well. For instance, a consumer group can be copied by finding a coordinator (GroupCoordinatorRequest / FindCoordinatorRequest) and sending an OffsetCommitRequest. 
The group will be auto-created with an empty protocol. Although this is arguably correct, the group will now exist but not be a proper 'consumer' group until later when there is a JoinGroupRequest. Again, this can be confusing as far as categorization / visibility of the group is concerned. A group can instead be copied more directly by creating a consumer and calling commitSync (as kafka-consumer-groups.sh does), but this involves extra connections / requests and so is a little slower when trying to keep a large number of groups in sync in real time across clusters. If we want to make it easier to keep consumer groups consistent, options include allowing groups to be created explicitly with a protocol type instead of only lazily, or adding a broker config that sets a default protocol type for groups created outside of JoinGroupRequest. This would default to 'consumer'. This sort of setting might be confusing for users though, since implicitly created groups are certainly not the norm. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
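The fix proposed above — having GroupMetadataManager.readGroupMessageValue keep the persisted protocol type even when the group has no active members — might look roughly like this. A hedged sketch in Java rather than the broker's actual Scala; the class and method names are illustrative:

```java
// Hypothetical sketch of the proposed fix: when reloading group metadata
// from __consumer_offsets after coordinator failover, preserve the persisted
// protocol type instead of falling back to the empty string for empty groups.
public class GroupMetadataLoader {
    static final String UNKNOWN_PROTOCOL_TYPE = "";

    // The membership check that caused the reported inconsistency is
    // intentionally absent: the persisted type wins regardless of whether
    // the group currently has active members.
    static String resolveProtocolType(String persistedType, boolean hasActiveMembers) {
        if (persistedType != null && !persistedType.isEmpty()) {
            return persistedType;  // e.g. "consumer", even for an empty group
        }
        return UNKNOWN_PROTOCOL_TYPE;  // group was created outside JoinGroupRequest
    }
}
```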
[jira] [Assigned] (KAFKA-6284) System Test failed: ConnectRestApiTest
[ https://issues.apache.org/jira/browse/KAFKA-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikkin Patel reassigned KAFKA-6284: --- Assignee: Mikkin Patel > System Test failed: ConnectRestApiTest > --- > > Key: KAFKA-6284 > URL: https://issues.apache.org/jira/browse/KAFKA-6284 > Project: Kafka > Issue Type: Bug > Components: KafkaConnect >Affects Versions: 1.1.0 >Reporter: Mikkin Patel >Assignee: Mikkin Patel > Fix For: 1.1.0 > > > KAFKA-3073 introduced topic regex support for Connect sinks. The > ConnectRestApiTest failed to verify the configdef against the expected response. > {noformat} > Traceback (most recent call last): > File > "/home/jenkins/workspace/system-test-kafka-ci_trunk-STZ4BWS25LWQAJN6JJZHKK6GP4YF7HCVYRFRFXB2KMAQ4PC5H4ZA/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.7.1-py2.7.egg/ducktape/tests/runner_client.py", > line 132, in run > data = self.run_test() > File > "/home/jenkins/workspace/system-test-kafka-ci_trunk-STZ4BWS25LWQAJN6JJZHKK6GP4YF7HCVYRFRFXB2KMAQ4PC5H4ZA/kafka/venv/local/lib/python2.7/site-packages/ducktape-0.7.1-py2.7.egg/ducktape/tests/runner_client.py", > line 185, in run_test > return self.test_context.function(self.test) > File > "/home/jenkins/workspace/system-test-kafka-ci_trunk-STZ4BWS25LWQAJN6JJZHKK6GP4YF7HCVYRFRFXB2KMAQ4PC5H4ZA/kafka/tests/kafkatest/tests/connect/connect_rest_test.py", > line 92, in test_rest_api > self.verify_config(self.FILE_SINK_CONNECTOR, self.FILE_SINK_CONFIGS, > configs) > File > "/home/jenkins/workspace/system-test-kafka-ci_trunk-STZ4BWS25LWQAJN6JJZHKK6GP4YF7HCVYRFRFXB2KMAQ4PC5H4ZA/kafka/tests/kafkatest/tests/connect/connect_rest_test.py", > line 200, in verify_config > assert config_def == set(config_names) > AssertionError > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-4669) KafkaProducer.flush hangs when NetworkClient.handleCompletedReceives throws exception
[ https://issues.apache.org/jira/browse/KAFKA-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273064#comment-16273064 ] Nick Travers commented on KAFKA-4669: - [~rsivaram] - thanks for the response. It is definitely possible that non-kafka clients are sending data to the port. For example, we have services fuzzing all the time, port scanners, etc. Good to know this is improved in 1.0. Please let me know if there's any other information we can gather. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-6284) System Test failed: ConnectRestApiTest
[ https://issues.apache.org/jira/browse/KAFKA-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273047#comment-16273047 ] ASF GitHub Bot commented on KAFKA-6284: --- GitHub user mikkin opened a pull request: https://github.com/apache/kafka/pull/4279 KAFKA-6284: Fixed system test for Connect REST API `topics.regex` was added in KAFKA-3073. This change fixes the test that invokes `/validate` to ensure that all the configdefs are returned as expected. You can merge this pull request into a Git repository by running: $ git pull https://github.com/mikkin/kafka KAFKA-6284 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/kafka/pull/4279.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4279 commit 426dfa9b888acbe6f8c5f90b1c24dad62f92ded7 Author: Mikkin Date: 2017-11-30T17:37:56Z KAFKA-6284: Fixed system test for Connect REST API -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-6269) KTable state restore fails after rebalance
[ https://issues.apache.org/jira/browse/KAFKA-6269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272972#comment-16272972 ] Andreas Schroeder commented on KAFKA-6269: -- [~guozhang] it's okay for my team to wait for the 1.0.1 release. Until then, we'll stick to the 0.11.0.1 version we are currently using. The reason to migrate to 1.0.0 was that we are experiencing some unfair task assignment across our stream processor nodes, which leads to some nodes crashing (and immediately being recreated). So our current system runs, and we can wait for 1.0.1. Thanks however for giving suggestions on how to proceed! I'll try [~mjsax]'s suggestion on hiding the null value :) > KTable state restore fails after rebalance > -- > > Key: KAFKA-6269 > URL: https://issues.apache.org/jira/browse/KAFKA-6269 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 1.0.0 >Reporter: Andreas Schroeder >Priority: Blocker > Fix For: 1.1.0, 1.0.1 > > > I have the following kafka streams topology: > entity-B -> map step -> entity-B-exists (with state store) > entity-A -> map step -> entity-A-exists (with state store) > (entity-B-exists, entity-A-exists) -> outer join with state store. 
> The topology building code looks like this (some data type, serde, > valuemapper, and joiner code omitted): > {code} > def buildTable[V](builder: StreamsBuilder, > sourceTopic: String, > existsTopic: String, > valueSerde: Serde[V], > valueMapper: ValueMapper[String, V]): > KTable[String, V] = { > val stream: KStream[String, String] = builder.stream[String, > String](sourceTopic) > val transformed: KStream[String, V] = stream.mapValues(valueMapper) > transformed.to(existsTopic, Produced.`with`(Serdes.String(), valueSerde)) > val inMemoryStoreName = s"$existsTopic-persisted" > val materialized = > Materialized.as(Stores.inMemoryKeyValueStore(inMemoryStoreName)) > .withKeySerde(Serdes.String()) > .withValueSerde(valueSerde) > .withLoggingDisabled() > builder.table(existsTopic, materialized) > } > val builder = new StreamsBuilder > val mapToEmptyString: ValueMapper[String, String] = (value: String) => if > (value != null) "" else null > val entitiesB: KTable[String, EntityBInfo] = > buildTable(builder, > "entity-B", > "entity-B-exists", > EntityBInfoSerde, > ListingImagesToEntityBInfo) > val entitiesA: KTable[String, String] = > buildTable(builder, "entity-A", "entity-A-exists", Serdes.String(), > mapToEmptyString) > val joiner: ValueJoiner[String, EntityBInfo, EntityDiff] = (a, b) => > EntityDiff.fromJoin(a, b) > val materialized = > Materialized.as(Stores.inMemoryKeyValueStore("entity-A-joined-with-entity-B")) > .withKeySerde(Serdes.String()) > .withValueSerde(EntityDiffSerde) > .withLoggingEnabled(new java.util.HashMap[String, String]()) > val joined: KTable[String, EntityDiff] = entitiesA.outerJoin(entitiesB, > joiner, materialized) > {code} > We run 4 processor machines with 30 stream threads each; each topic has 30 > partitions so that there is a total of 4 x 30 = 120 partitions to consume. > The initial launch of the processor works fine, but when killing one > processor and letting him re-join the stream threads leads to some faulty > behaviour. 
> First, the total number of assigned partitions over all processor machines is > larger than 120 (sometimes 157, sometimes just 132), so the partition / task > assignment seems to assign the same job to different stream threads. > The processor machines trying to re-join the consumer group fail constantly > with the error message of 'Detected a task that got migrated to another > thread.' We gave the processor half an hour to recover; usually, rebuilding > the KTable states takes around 20 seconds (with Kafka 0.11.0.1). > Here are the details of the errors we see: > stream-thread [kafka-processor-6-StreamThread-9] Detected a task that got > migrated to another thread. This implies that this thread missed a rebalance > and dropped out of the consumer group. Trying to rejoin the consumer group > now. > {code} > org.apache.kafka.streams.errors.TaskMigratedException: Log end offset of > entity-B-exists-0 should not change while restoring: old end offset 4750539, > current offset 4751388 > > StreamsTask taskId: 1_0 > > > ProcessorTopology: > > KSTREAM-SOURCE-08: > > topics: [entity-A-exists] > > children: [KTABLE-SOURCE-09] > > KTABLE-SOURCE-09: > > states: [entity-A-exists-persisted] > > childr
[jira] [Commented] (KAFKA-4669) KafkaProducer.flush hangs when NetworkClient.handleCompletedReceives throws exception
[ https://issues.apache.org/jira/browse/KAFKA-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272967#comment-16272967 ] Rajini Sivaram commented on KAFKA-4669: --- [~nickt] That is interesting. The oversized messages should just get warned and should not cause any other failures. But BufferUnderflowException can unfortunately cause failures in other connections too in 0.11.0.1. Is it possible at all that a client that is not a Kafka producer/consumer sent some bytes to the broker? That could result in the BufferUnderflowException. And with 0.11.0.1, that could also result in the failures in the producer (in completely unrelated connections). We have improved error handling in SocketServer in 1.0 (KAFKA-5607) so that one connection with invalid data doesn't affect processing of other connections. That would avoid the producer/consumer correlation id errors. But it will be good to understand if the BufferUnderflowException was caused by a Kafka client or not (if it was a Kafka client, we need to figure out why). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
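The mechanics behind the BufferUnderflowException in the broker log can be reproduced in isolation: the stack trace shows the request header parser calling getShort (the two-byte api key field), so any connection whose payload delivers fewer than two bytes underflows. A minimal demonstration, illustrative rather than actual broker code:

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

// Demonstrates why a too-short request payload produces the
// BufferUnderflowException seen above: reading a two-byte field from a
// buffer with fewer than two remaining bytes throws.
public class UnderflowDemo {
    static boolean underflows(byte[] payload) {
        ByteBuffer buf = ByteBuffer.wrap(payload);
        try {
            buf.getShort();  // analogous to reading the request header's api key
            return false;
        } catch (BufferUnderflowException e) {
            return true;
        }
    }
}
```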
[jira] [Commented] (KAFKA-4669) KafkaProducer.flush hangs when NetworkClient.handleCompletedReceives throws exception
[ https://issues.apache.org/jira/browse/KAFKA-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272900#comment-16272900 ] Nick Travers commented on KAFKA-4669: - [~rsivaram] [~hachikuji] - I did some digging on the broker side. While we usually see a steady stream of exceptions in the broker logs due to oversized messages, here's another exception from around the time we saw the exception in the producer client. There were over 100 of these exceptions on the same broker network thread within the same second. A few milliseconds after this, the exception was seen in the producer client. Hard to say definitively on the ordering, given clock skew between the different broker and client hosts. {code} 2017-11-28 09:18:20,599 kafka-network-thread-28-ListenerName(PLAINTEXT)-PLAINTEXT-4 Processor got uncaught exception. java.nio.BufferUnderflowException at java.nio.Buffer.nextGetIndex(Buffer.java:506) at java.nio.HeapByteBuffer.getShort(HeapByteBuffer.java:310) at kafka.network.RequestChannel$Request.(RequestChannel.scala:70) at kafka.network.Processor$$anonfun$processCompletedReceives$1.apply(SocketServer.scala:518) at kafka.network.Processor$$anonfun$processCompletedReceives$1.apply(SocketServer.scala:511) at scala.collection.Iterator$class.foreach(Iterator.scala:893) at scala.collection.AbstractIterator.foreach(Iterator.scala:1336) at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) at scala.collection.AbstractIterable.foreach(Iterable.scala:54) at kafka.network.Processor.processCompletedReceives(SocketServer.scala:511) at kafka.network.Processor.run(SocketServer.scala:436) at java.lang.Thread.run(Thread.java:748) {code} Here's an example of the oversized message exceptions in the broker: {code} 2017-11-28 09:18:12,379 kafka-network-thread-28-ListenerName(PLAINTEXT)-PLAINTEXT-4 Unexpected error from /10.4.12.70; closing connection org.apache.kafka.common.network.InvalidReceiveException: Invalid receive (size = 
335544320 larger than 104857600) at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:95) at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:75) at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:203) at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:167) at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:381) at org.apache.kafka.common.network.Selector.poll(Selector.java:326) at kafka.network.Processor.poll(SocketServer.scala:500) at kafka.network.Processor.run(SocketServer.scala:435) at java.lang.Thread.run(Thread.java:748) {code}
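A side note on the oversized-receive log line above: the broker interprets the first four bytes on a connection as a big-endian frame length, so stray bytes from a non-Kafka client that happen to begin 0x14 0x00 0x00 0x00 decode exactly to the rejected size of 335544320. A small sketch of that decoding (illustrative names, assumed framing):

```java
import java.nio.ByteBuffer;

// Illustrates how arbitrary bytes from a non-Kafka client (port scanner,
// fuzzer, etc.) become an absurd frame size: the first four bytes of the
// connection are read as a big-endian int length prefix.
public class FrameSizeDemo {
    static int frameSize(byte[] firstFourBytes) {
        return ByteBuffer.wrap(firstFourBytes).getInt();  // big-endian by default
    }
}
```

For example, a payload starting with byte 0x14 (ASCII DC4) yields 0x14000000 = 335544320, which exceeds the 104857600-byte (socket.request.max.bytes default) limit and gets the connection closed.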
[jira] [Comment Edited] (KAFKA-5882) NullPointerException in StreamTask
[ https://issues.apache.org/jira/browse/KAFKA-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272747#comment-16272747 ] Seweryn Habdank-Wojewodzki edited comment on KAFKA-5882 at 11/30/17 2:47 PM: - [~mjsax] In mean while I had ported the code to {{1.0.0}} :-). I will try do my best. was (Author: habdank): [~mjsax] In mean while I had ported the code to {1.0.0} :-). I will try do my best. > NullPointerException in StreamTask > -- > > Key: KAFKA-5882 > URL: https://issues.apache.org/jira/browse/KAFKA-5882 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0 >Reporter: Seweryn Habdank-Wojewodzki > > It seems bugfix [KAFKA-5073|https://issues.apache.org/jira/browse/KAFKA-5073] > is made, but introduce some other issue. > In some cases (I am not sure which ones) I got NPE (below). > I would expect that even in case of FATAL error anythink except NPE is thrown. > {code} > 2017-09-12 23:34:54 ERROR ConsumerCoordinator:269 - User provided listener > org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener > for group streamer failed on partition assignment > java.lang.NullPointerException: null > at > org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:123) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:1234) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:294) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:254) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:1313) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.access$1100(StreamThread.java:73) > ~[myapp-streamer.jar:?] 
> at > org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener.onPartitionsAssigned(StreamThread.java:183) > ~[myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:265) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:363) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:310) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:297) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1078) > [myapp-streamer.jar:?] > at > org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1043) > [myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:582) > [myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:553) > [myapp-streamer.jar:?] > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:527) > [myapp-streamer.jar:?] > 2017-09-12 23:34:54 INFO StreamThread:1040 - stream-thread > [streamer-3a44578b-faa8-4b5b-bbeb-7a7f04639563-StreamThread-1] Shutting down > 2017-09-12 23:34:54 INFO KafkaProducer:972 - Closing the Kafka producer with > timeoutMillis = 9223372036854775807 ms. > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5882) NullPointerException in StreamTask
[ https://issues.apache.org/jira/browse/KAFKA-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272747#comment-16272747 ] Seweryn Habdank-Wojewodzki commented on KAFKA-5882:
---
[~mjsax] In mean while I had ported the code to {1.0.0} :-). I will try do my best.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (KAFKA-6260) AbstractCoordinator not clearly handles NULL Exception
[ https://issues.apache.org/jira/browse/KAFKA-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272741#comment-16272741 ] Seweryn Habdank-Wojewodzki edited comment on KAFKA-6260 at 11/30/17 2:44 PM:
-
[~hachikuji]
* Can you tell me exactly (by exact setting name) which settings are logically combined, so I can set them to match our timings, including our wait for end-client results, please?
* Yes, I can test any hot fixes; the only question is how I can get them into our build process :-).

was (Author: habdank): [~hachikuji] * Can you tell me exactly which settings are logically combined, so I can set them respectively to our timings including our wait for end-clients results, please? * Yes I can test any hot fixes, the only matter is how can I get them into our build process :-).

> AbstractCoordinator not clearly handles NULL Exception
> --
>
> Key: KAFKA-6260
> URL: https://issues.apache.org/jira/browse/KAFKA-6260
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 1.0.0
> Environment: RedHat Linux
> Reporter: Seweryn Habdank-Wojewodzki
> Assignee: Jason Gustafson
> Fix For: 1.1.0, 1.0.1
>
> The error reporting is not clear, but it seems that the Kafka heartbeat shuts down the application due to a NULL exception caused by "fake" disconnections.
> One more comment. We are processing messages in the stream, but sometimes we have to block processing for minutes, as the consumers cannot handle too much load. Is it possible that while the stream is waiting, the heartbeat is blocked as well?
> Can you check that? 
> {code} > 2017-11-23 23:54:47 DEBUG AbstractCoordinator:177 - [Consumer > clientId=kafka-endpoint-be51569b-8795-4709-8ec8-28c9cd099a31-StreamThread-1-consumer, > groupId=kafka-endpoint] Received successful Heartbeat response > 2017-11-23 23:54:50 DEBUG AbstractCoordinator:183 - [Consumer > clientId=kafka-endpoint-be51569b-8795-4709-8ec8-28c9cd099a31-StreamThread-1-consumer, > groupId=kafka-endpoint] Sending Heartbeat request to coordinator > cljp01.eb.lan.at:9093 (id: 2147483646 rack: null) > 2017-11-23 23:54:50 TRACE NetworkClient:135 - [Consumer > clientId=kafka-endpoint-be51569b-8795-4709-8ec8-28c9cd099a31-StreamThread-1-consumer, > groupId=kafka-endpoint] Sending HEARTBEAT > {group_id=kafka-endpoint,generation_id=3834,member_id=kafka-endpoint-be51569b-8795-4709-8ec8-28c9cd099a31-StreamThread-1-consumer-94f18be5-e49a-4817-9e5a-fe82a64e0b08} > with correlation id 24 to node 2147483646 > 2017-11-23 23:54:50 TRACE NetworkClient:135 - [Consumer > clientId=kafka-endpoint-be51569b-8795-4709-8ec8-28c9cd099a31-StreamThread-1-consumer, > groupId=kafka-endpoint] Completed receive from node 2147483646 for HEARTBEAT > with correlation id 24, received {throttle_time_ms=0,error_code=0} > 2017-11-23 23:54:50 DEBUG AbstractCoordinator:177 - [Consumer > clientId=kafka-endpoint-be51569b-8795-4709-8ec8-28c9cd099a31-StreamThread-1-consumer, > groupId=kafka-endpoint] Received successful Heartbeat response > 2017-11-23 23:54:52 DEBUG NetworkClient:183 - [Consumer > clientId=kafka-endpoint-be51569b-8795-4709-8ec8-28c9cd099a31-StreamThread-1-consumer, > groupId=kafka-endpoint] Disconnecting from node 1 due to request timeout. 
> 2017-11-23 23:54:52 TRACE NetworkClient:135 - [Consumer > clientId=kafka-endpoint-be51569b-8795-4709-8ec8-28c9cd099a31-StreamThread-1-consumer, > groupId=kafka-endpoint] Cancelled request > {replica_id=-1,max_wait_time=6000,min_bytes=1,max_bytes=52428800,isolation_level=0,topics=[{topic=clj_internal_topic,partitions=[{partition=6,fetch_offset=211558395,log_start_offset=-1,max_bytes=1048576},{partition=8,fetch_offset=210178209,log_start_offset=-1,max_bytes=1048576},{partition=0,fetch_offset=209353523,log_start_offset=-1,max_bytes=1048576},{partition=2,fetch_offset=209291462,log_start_offset=-1,max_bytes=1048576},{partition=4,fetch_offset=210728595,log_start_offset=-1,max_bytes=1048576}]}]} > with correlation id 21 due to node 1 being disconnected > 2017-11-23 23:54:52 DEBUG ConsumerNetworkClient:195 - [Consumer > clientId=kafka-endpoint-be51569b-8795-4709-8ec8-28c9cd099a31-StreamThread-1-consumer, > groupId=kafka-endpoint] Cancelled FETCH request RequestHeader(apiKey=FETCH, > apiVersion=6, > clientId=kafka-endpoint-be51569b-8795-4709-8ec8-28c9cd099a31-StreamThread-1-consumer, > correlationId=21) with correlation id 21 due to node 1 being disconnected > 2017-11-23 23:54:52 DEBUG Fetcher:195 - [Consumer > clientId=kafka-endpoint-be51569b-8795-4709-8ec8-28c9cd099a31-StreamThread-1-consumer, > groupId=kafka-endpoint] Fetch request > {clj_internal_topic-6=(offset=211558395, logStartOffset=-1, > maxBytes=1048576), clj_internal_topic-8=(offset=210178209, logStartOffset=-1, > maxBytes
[jira] [Commented] (KAFKA-6260) AbstractCoordinator not clearly handles NULL Exception
[ https://issues.apache.org/jira/browse/KAFKA-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272741#comment-16272741 ] Seweryn Habdank-Wojewodzki commented on KAFKA-6260:
---
[~hachikuji] * Can you tell me exactly which settings are logically combined, so I can set them respectively to our timings including our wait for end-clients results, please? * Yes I can test any hot fixes, the only matter is how can I get them into our build process :-).
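The "logically combined" settings asked about in this thread are, for clients of this era, the coordinator and poll timeouts. A hedged sketch of the usual relationships between them; the names are the standard consumer configs, but the values here are purely illustrative and must be tuned to the deployment's own latencies:

{code}
# Illustrative values only -- the constraints, not the numbers, are the point.
heartbeat.interval.ms=3000      # keep well below session.timeout.ms (rule of thumb: ~1/3 of it)
session.timeout.ms=10000        # broker evicts the member after this much heartbeat silence
request.timeout.ms=40000        # client-side timeout for in-flight requests; keep above session.timeout.ms
max.poll.interval.ms=600000     # maximum allowed gap between poll() calls while records are processed
{code}

Note that since 0.10.1 heartbeats are sent from a background thread, so blocking inside record processing does not by itself stop heartbeats; instead, exceeding max.poll.interval.ms is what forces the member out of the group.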
[jira] [Commented] (KAFKA-6193) ReassignPartitionsClusterTest.shouldPerformMultipleReassignmentOperationsOverVariousTopics fails sometimes
[ https://issues.apache.org/jira/browse/KAFKA-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272718#comment-16272718 ] Ismael Juma commented on KAFKA-6193: Another failure: https://builds.apache.org/job/kafka-trunk-jdk9/lastCompletedBuild/testReport/kafka.admin/ReassignPartitionsClusterTest/shouldPerformMultipleReassignmentOperationsOverVariousTopics/ Looking into it. > ReassignPartitionsClusterTest.shouldPerformMultipleReassignmentOperationsOverVariousTopics > fails sometimes > -- > > Key: KAFKA-6193 > URL: https://issues.apache.org/jira/browse/KAFKA-6193 > Project: Kafka > Issue Type: Test >Reporter: Ted Yu >Assignee: Ismael Juma > Fix For: 1.1.0 > > Attachments: 6193.out > > > From > https://builds.apache.org/job/kafka-trunk-jdk8/2198/testReport/junit/kafka.admin/ReassignPartitionsClusterTest/shouldPerformMultipleReassignmentOperationsOverVariousTopics/ > : > {code} > java.lang.AssertionError: expected: but was: > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:834) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:144) > at > kafka.admin.ReassignPartitionsClusterTest.shouldPerformMultipleReassignmentOperationsOverVariousTopics(ReassignPartitionsClusterTest.scala:524) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-6260) AbstractCoordinator not clearly handles NULL Exception
[ https://issues.apache.org/jira/browse/KAFKA-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272570#comment-16272570 ] Ismael Juma commented on KAFKA-6260:
---
Thanks for the investigation of the impact [~hachikuji]. The fact that a combination of settings is required probably explains why others are not seeing the issue.
[jira] [Updated] (KAFKA-6260) AbstractCoordinator not clearly handles NULL Exception
[ https://issues.apache.org/jira/browse/KAFKA-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismael Juma updated KAFKA-6260:
---
Fix Version/s: 1.0.1
               1.1.0
[jira] [Commented] (KAFKA-6281) Kafka JavaAPI Producer failed with NotLeaderForPartitionException
[ https://issues.apache.org/jira/browse/KAFKA-6281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272566#comment-16272566 ] Manikumar commented on KAFKA-6281:
--
[~anilkumar...@gmail.com] As mentioned in the previous comment, this may be due to a network issue or a long GC pause on the broker (check the GC logs). Based on that analysis, you can try increasing the ZooKeeper session timeout and tuning the GC settings.

> Kafka JavaAPI Producer failed with NotLeaderForPartitionException
> -
>
> Key: KAFKA-6281
> URL: https://issues.apache.org/jira/browse/KAFKA-6281
> Project: Kafka
> Issue Type: Bug
> Reporter: Anil
> Attachments: server1-controller.log, server2-controller.log
>
> We are running Kafka (version kafka_2.11-0.10.1.0) in a 2-node cluster. We have 2 producers (Java API) acting on different topics. Each topic has a single partition. The topic where we had this issue has one consumer running. This setup had been running fine for 3 months before we saw this issue. All the suggested causes/solutions for this issue in other forums don't seem to apply to my scenario.
> Exception at producer:
> {code}
> -2017-11-25T17:40:33,035 [kafka-producer-network-thread | producer-1] ERROR client.producer.BingLogProducerCallback - Encountered exception in sending message ; org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
> {code}
> We haven't enabled retries for the messages, because this is transactional data and we want to maintain the order.
> Producer config:
> {code}
> bootstrap.servers : server1ip:9092
> acks : all
> retries : 0
> linger.ms : 0
> buffer.memory : 1024
> max.request.size : 1024000
> key.serializer : org.apache.kafka.common.serialization.StringSerializer
> value.serializer : org.apache.kafka.common.serialization.StringSerializer
> {code}
> We are connecting to server1 at both producer and consumer. 
> The controller log at server2 indicates that some shutdown happened around the same time, but I don't understand why this happened.
> {color:red}[2017-11-25 17:31:44,776] DEBUG [Controller 2]: topics not in preferred replica Map() (kafka.controller.KafkaController)
> [2017-11-25 17:31:44,776] TRACE [Controller 2]: leader imbalance ratio for broker 2 is 0.00 (kafka.controller.KafkaController)
> [2017-11-25 17:31:44,776] DEBUG [Controller 2]: topics not in preferred replica Map() (kafka.controller.KafkaController)
> [2017-11-25 17:31:44,776] TRACE [Controller 2]: leader imbalance ratio for broker 1 is 0.00 (kafka.controller.KafkaController)
> [2017-11-25 17:34:18,314] INFO [SessionExpirationListener on 2], ZK expired; shut down all controller components and try to re-elect (kafka.controller.KafkaController$SessionExpirationListener)
> [2017-11-25 17:34:18,317] DEBUG [Controller 2]: Controller resigning, broker id 2 (kafka.controller.KafkaController)
> [2017-11-25 17:34:18,317] DEBUG [Controller 2]: De-registering IsrChangeNotificationListener (kafka.controller.KafkaController)
> [2017-11-25 17:34:18,317] INFO [delete-topics-thread-2], Shutting down (kafka.controller.TopicDeletionManager$DeleteTopicsThread)
> [2017-11-25 17:34:18,317] INFO [delete-topics-thread-2], Stopped (kafka.controller.TopicDeletionManager$DeleteTopicsThread)
> [2017-11-25 17:34:18,318] INFO [delete-topics-thread-2], Shutdown completed (kafka.controller.TopicDeletionManager$DeleteTopicsThread)
> [2017-11-25 17:34:18,318] INFO [Partition state machine on Controller 2]: Stopped partition state machine (kafka.controller.PartitionStateMachine)
> [2017-11-25 17:34:18,318] INFO [Replica state machine on controller 2]: Stopped replica state machine (kafka.controller.ReplicaStateMachine)
> [2017-11-25 17:34:18,318] INFO [Controller-2-to-broker-2-send-thread], Shutting down (kafka.controller.RequestSendThread)
> [2017-11-25 17:34:18,318] INFO 
[Controller-2-to-broker-2-send-thread], Stopped > (kafka.controller.RequestSendThread) [2017-11-25 17:34:18,319] INFO > [Controller-2-to-broker-2-send-thread], Shutdown completed > (kafka.controller.RequestSendThread) [2017-11-25 17:34:18,319] INFO > [Controller-2-to-broker-1-send-thread], Shutting down > (kafka.controller.RequestSendThread) [2017-11-25 17:34:18,319] INFO > [Controller-2-to-broker-1-send-thread], Stopped > (kafka.controller.RequestSendThread) [2017-11-25 17:34:18,319] INFO > [Controller-2-to-broker-1-send-thread], Shutdown completed > (kafka.controller.RequestSendThread) [2017-11-25 17:34:18,319] INFO > [Controller 2]: Broker 2 resigned as the controller > (kafka.controller.KafkaController) [2017-11-25 17:34:18,353] DEBUG > [IsrChangeNotificationListener] Fired!!! > (kafka.controller.IsrChangeNotificationListener) [2017-11-25 17
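Because the reporter disabled retries to preserve ordering, every transient leader change surfaces as a send failure in the producer callback. If ordering is the only reason retries were disabled, a commonly used alternative (a hedged sketch, not from the ticket; values illustrative) is to allow retries while capping in-flight requests at one, so retried sends cannot be reordered within a partition:

{code}
# Hedged sketch, values illustrative: let the producer retry through a
# leader election without giving up per-partition ordering.
retries=3
max.in.flight.requests.per.connection=1
retry.backoff.ms=100   # wait between attempts, giving the new leader time to be elected
{code}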
[jira] [Commented] (KAFKA-6266) Kafka 1.0.0 : Repeated occurrence of WARN Resetting first dirty offset of __consumer_offsets-xx to log start offset 203569 since the checkpointed offset 120955 is inval
[ https://issues.apache.org/jira/browse/KAFKA-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272553#comment-16272553 ] VinayKumar commented on KAFKA-6266:
---
Hi, the log.cleanup.policy is 'delete':
log.cleanup.policy = [delete]

> Kafka 1.0.0 : Repeated occurrence of WARN Resetting first dirty offset of __consumer_offsets-xx to log start offset 203569 since the checkpointed offset 120955 is invalid. (kafka.log.LogCleanerManager$)
> --
>
> Key: KAFKA-6266
> URL: https://issues.apache.org/jira/browse/KAFKA-6266
> Project: Kafka
> Issue Type: Bug
> Components: offset manager
> Affects Versions: 1.0.0
> Environment: CentOS 7, Apache kafka_2.12-1.0.0
> Reporter: VinayKumar
>
> I upgraded Kafka from 0.10.2.1 to 1.0.0. Since then, I see the warnings below in the log.
> They appear continuously, and I want them fixed so that they won't repeat. Can someone please help me fix the warnings below?
> WARN Resetting first dirty offset of __consumer_offsets-17 to log start offset 3346 since the checkpointed offset 3332 is invalid. (kafka.log.LogCleanerManager$)
> WARN Resetting first dirty offset of __consumer_offsets-23 to log start offset 4 since the checkpointed offset 1 is invalid. (kafka.log.LogCleanerManager$)
> WARN Resetting first dirty offset of __consumer_offsets-19 to log start offset 203569 since the checkpointed offset 120955 is invalid. (kafka.log.LogCleanerManager$)
> WARN Resetting first dirty offset of __consumer_offsets-35 to log start offset 16957 since the checkpointed offset 7 is invalid. (kafka.log.LogCleanerManager$)
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (KAFKA-6282) exactly_once semantics breaks demo application
[ https://issues.apache.org/jira/browse/KAFKA-6282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Romans Markuns resolved KAFKA-6282.
---
Resolution: Not A Bug

> exactly_once semantics breaks demo application
> --
>
> Key: KAFKA-6282
> URL: https://issues.apache.org/jira/browse/KAFKA-6282
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 0.11.0.0, 1.0.0
> Environment: Tested on i7+24GB Ubuntu16.04 and i7+8GB Windows 7, with cluster 1.0.0 and 0.11.0.0 and streams 1.0.0 and 0.11.0.0
> Reporter: Romans Markuns
> Attachments: WordCountDemo.java, server.properties
>
> +What I try to achieve+
> A successful run of a Kafka Streams app with the "processing.guarantee" setting set to "exactly_once".
> +How+
> Use the Kafka quickstart example (https://kafka.apache.org/10/documentation/streams/quickstart) and modify only configuration parameters.
> Things I've changed:
> 1) Add one line to WordCountDemo:
> {code:java}
> props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);
> {code}
> 2) Modify server.properties to be the same as we use in QA: set the broker id to 1, allow deleting topics via the admin client, and set the initial rebalance delay to 3 s.
> +What I expect+
> The modified demo app works exactly like the original, as presented in the link above.
> +What I get+
> 1) The original app works fine. The output topic is updated after each line is submitted via the console producer.
> 2) The modified app does not process a topic record after it is submitted via the console producer. Streams remains in state REBALANCING, with no errors or warnings printed. 
MAIN thread forever blocks waiting TransactionCoordinator response > (CountdownLatch.await()) and this message getting printed: > [kafka-producer-network-thread | > streams-wordcount-client-StreamThread-1-0_0-producer] DEBUG > org.apache.kafka.clients.producer.internals.TransactionManager - [Producer > clientId=streams-wordcount-client-StreamThread-1-0_0-producer, > transactionalId=streams-wordcount-0_0] Enqueuing transactional request > (type=FindCoordinatorRequest, coordinatorKey=streams-wordcount-0_0, > coordinatorType=TRANSACTION) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-6282) exactly_once semantics breaks demo application
[ https://issues.apache.org/jira/browse/KAFKA-6282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272527#comment-16272527 ] Romans Markuns commented on KAFKA-6282: --- Thank you [~mjsax]! That did the trick. I'm closing this as it is not a bug; however, I think the documentation of the ```processing.guarantee``` property should contain a link to ```transaction.state.log.replication.factor```. > exactly_once semantics breaks demo application > -- > > Key: KAFKA-6282 > URL: https://issues.apache.org/jira/browse/KAFKA-6282 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
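For reference, the setting mentioned in this thread controls the replication factor of the internal __transaction_state topic, which cannot be created when the cluster has fewer brokers than that factor; this matches the symptom described (Streams stuck in REBALANCING while FindCoordinatorRequest is retried). A hedged sketch of the likely server.properties change for a single-broker test cluster (values are illustrative; the shipped defaults are 3 and 2):

```properties
# Allow the internal __transaction_state topic to be created on a
# one-broker development cluster. With the defaults (replication
# factor 3, min ISR 2) the transaction coordinator never becomes
# available, and exactly_once Streams apps hang in REBALANCING.
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
```

These relaxed values are only appropriate for development; a production cluster should keep the replicated defaults.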
[jira] [Comment Edited] (KAFKA-6282) exactly_once semantics breaks demo application
[ https://issues.apache.org/jira/browse/KAFKA-6282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272527#comment-16272527 ] Romans Markuns edited comment on KAFKA-6282 at 11/30/17 11:04 AM: -- Thank you [~mjsax]! That did the trick. I'm closing this as it is not a bug; however, I think the documentation of the processing.guarantee property should contain a link to transaction.state.log.replication.factor. was (Author: bigmcmark): Thank you [~mjsax]! That did the trick. I'm closing this as it is not a bug, however, I think documentation of ```processing.guarantee``` property should contain link to ```transaction.state.log.replication.factor```. > exactly_once semantics breaks demo application > -- > > Key: KAFKA-6282 > URL: https://issues.apache.org/jira/browse/KAFKA-6282 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5165) Kafka Logs Cleanup Not happening, Huge File Growth - Windows
[ https://issues.apache.org/jira/browse/KAFKA-5165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272524#comment-16272524 ] Zeeshan Haider commented on KAFKA-5165: --- Have you found any solution or workaround for this problem? > Kafka Logs Cleanup Not happening, Huge File Growth - Windows > > > Key: KAFKA-5165 > URL: https://issues.apache.org/jira/browse/KAFKA-5165 > Project: Kafka > Issue Type: Bug > Affects Versions: 0.9.0.1 > Environment: windows, Kafka Server(Version: 0.9.0.1) > Reporter: Manikandan P > > We set the configuration below in server.properties of the Kafka server (version 0.9.0.1): retention hours of 1 and retention bytes of 150 MB, and also modified the other settings as given below. > log.dirs=/tmp/kafka-logs > log.retention.hours=1 > log.retention.bytes=157286400 > log.segment.bytes=1073741824 > log.retention.check.interval.ms=30 > log.cleaner.enable=true > log.cleanup.policy=delete > After a few days, the Kafka log folder had grown to 13.2 GB. We have seen that the topic offset gets updated and the old data is ignored, but the log files do not shrink: they still hold all the old data and keep growing. Could you help us find out why Kafka is not deleting the logs physically? Do we need to change any configuration? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-4669) KafkaProducer.flush hangs when NetworkClient.handleCompletedReceives throws exception
[ https://issues.apache.org/jira/browse/KAFKA-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272507#comment-16272507 ] Rajini Sivaram commented on KAFKA-4669: --- [~nickt] [~aartigupta] It will be useful to know if there were any errors in the broker logs around the time of the exception. An exception in the broker while processing requests can cause this scenario (the last time we saw this, it was due to an NPE in the broker). > KafkaProducer.flush hangs when NetworkClient.handleCompletedReceives throws > exception > - > > Key: KAFKA-4669 > URL: https://issues.apache.org/jira/browse/KAFKA-4669 > Project: Kafka > Issue Type: Bug > Components: clients >Affects Versions: 0.9.0.1 >Reporter: Cheng Ju >Assignee: Rajini Sivaram >Priority: Critical > Labels: reliability > Fix For: 0.11.0.1, 1.0.0 > > > There is no try catch in NetworkClient.handleCompletedReceives. If an > exception is thrown after inFlightRequests.completeNext(source), then the > corresponding RecordBatch's done will never get called, and > KafkaProducer.flush will hang on this RecordBatch. > I've checked 0.10 code and think this bug does exist in 0.10 versions. > A real case. 
First, a correlation-id mismatch exception occurs: > 13 Jan 2017 17:08:24,059 ERROR [kafka-producer-network-thread | producer-21] > (org.apache.kafka.clients.producer.internals.Sender.run:130) - Uncaught > error in kafka producer I/O thread: > java.lang.IllegalStateException: Correlation id for response (703766) does > not match request (703764) > at > org.apache.kafka.clients.NetworkClient.correlate(NetworkClient.java:477) > at > org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:440) > at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:265) > at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:216) > at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:128) > at java.lang.Thread.run(Thread.java:745) > Then jstack shows the thread is hanging on: > at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) > at > org.apache.kafka.clients.producer.internals.ProduceRequestResult.await(ProduceRequestResult.java:57) > at > org.apache.kafka.clients.producer.internals.RecordAccumulator.awaitFlushCompletion(RecordAccumulator.java:425) > at > org.apache.kafka.clients.producer.KafkaProducer.flush(KafkaProducer.java:544) > at org.apache.flume.sink.kafka.KafkaSink.process(KafkaSink.java:224) > at > org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67) > at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145) > at java.lang.Thread.run(Thread.java:745) > client code -- This message was sent by Atlassian JIRA (v6.4.14#64029)
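The hang mechanism described in this report can be illustrated without Kafka's internals. Below is a minimal, hypothetical model (PendingBatch, buggyIoLoop, and fixedIoLoop are invented names, not Kafka classes): a pending batch is completed via done(), and flush() waits on a CountDownLatch. If the I/O thread throws before calling done(), the waiter blocks forever; the defensive pattern is for the I/O loop's catch block to complete the batch exceptionally so waiters always wake up.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class FlushHangSketch {
    static final class PendingBatch {
        final CountDownLatch latch = new CountDownLatch(1);
        volatile Exception error;

        // Must be called exactly once, on success or failure.
        void done(Exception e) {
            error = e;
            latch.countDown();
        }

        // Stands in for ProduceRequestResult.await() behind flush().
        boolean awaitFlush(long ms) throws InterruptedException {
            return latch.await(ms, TimeUnit.MILLISECONDS);
        }
    }

    // Buggy shape: the exception propagates before done() is called,
    // so any thread blocked in awaitFlush() is stuck forever.
    static void buggyIoLoop(PendingBatch batch) {
        throw new IllegalStateException("Correlation id mismatch");
    }

    // Fixed shape: the catch completes the batch exceptionally,
    // so waiters return with the error instead of hanging.
    static void fixedIoLoop(PendingBatch batch) {
        try {
            throw new IllegalStateException("Correlation id mismatch");
        } catch (RuntimeException e) {
            batch.done(e);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        PendingBatch buggy = new PendingBatch();
        try { buggyIoLoop(buggy); } catch (RuntimeException ignored) { }
        System.out.println("buggy flush completed: " + buggy.awaitFlush(100));

        PendingBatch fixed = new PendingBatch();
        fixedIoLoop(fixed);
        System.out.println("fixed flush completed: " + fixed.awaitFlush(100));
    }
}
```

The buggy case prints false (the await times out) and the fixed case prints true, which mirrors why wrapping the receive-handling path in a try/catch that completes outstanding batches resolves the hang.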
[jira] [Commented] (KAFKA-6193) ReassignPartitionsClusterTest.shouldPerformMultipleReassignmentOperationsOverVariousTopics fails sometimes
[ https://issues.apache.org/jira/browse/KAFKA-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272505#comment-16272505 ] Ismael Juma commented on KAFKA-6193: Some of the log errors seen when the test fails have been fixed by https://github.com/apache/kafka/pull/4219/. If anyone sees this failure in a branch that includes PR #4219, please post the link to the failure to this JIRA. > ReassignPartitionsClusterTest.shouldPerformMultipleReassignmentOperationsOverVariousTopics > fails sometimes > -- > > Key: KAFKA-6193 > URL: https://issues.apache.org/jira/browse/KAFKA-6193 > Project: Kafka > Issue Type: Test >Reporter: Ted Yu >Assignee: Ismael Juma > Fix For: 1.1.0 > > Attachments: 6193.out > > > From > https://builds.apache.org/job/kafka-trunk-jdk8/2198/testReport/junit/kafka.admin/ReassignPartitionsClusterTest/shouldPerformMultipleReassignmentOperationsOverVariousTopics/ > : > {code} > java.lang.AssertionError: expected: but was: > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:834) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:144) > at > kafka.admin.ReassignPartitionsClusterTest.shouldPerformMultipleReassignmentOperationsOverVariousTopics(ReassignPartitionsClusterTest.scala:524) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KAFKA-6193) ReassignPartitionsClusterTest.shouldPerformMultipleReassignmentOperationsOverVariousTopics fails sometimes
[ https://issues.apache.org/jira/browse/KAFKA-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismael Juma updated KAFKA-6193: --- Fix Version/s: (was: 1.0.1) > ReassignPartitionsClusterTest.shouldPerformMultipleReassignmentOperationsOverVariousTopics > fails sometimes > -- > > Key: KAFKA-6193 > URL: https://issues.apache.org/jira/browse/KAFKA-6193 > Project: Kafka > Issue Type: Test >Reporter: Ted Yu > Fix For: 1.1.0 > > Attachments: 6193.out > > > From > https://builds.apache.org/job/kafka-trunk-jdk8/2198/testReport/junit/kafka.admin/ReassignPartitionsClusterTest/shouldPerformMultipleReassignmentOperationsOverVariousTopics/ > : > {code} > java.lang.AssertionError: expected: but was: > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:834) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:144) > at > kafka.admin.ReassignPartitionsClusterTest.shouldPerformMultipleReassignmentOperationsOverVariousTopics(ReassignPartitionsClusterTest.scala:524) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (KAFKA-6193) ReassignPartitionsClusterTest.shouldPerformMultipleReassignmentOperationsOverVariousTopics fails sometimes
[ https://issues.apache.org/jira/browse/KAFKA-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismael Juma reassigned KAFKA-6193: -- Assignee: Ismael Juma > ReassignPartitionsClusterTest.shouldPerformMultipleReassignmentOperationsOverVariousTopics > fails sometimes > -- > > Key: KAFKA-6193 > URL: https://issues.apache.org/jira/browse/KAFKA-6193 > Project: Kafka > Issue Type: Test >Reporter: Ted Yu >Assignee: Ismael Juma > Fix For: 1.1.0 > > Attachments: 6193.out > > > From > https://builds.apache.org/job/kafka-trunk-jdk8/2198/testReport/junit/kafka.admin/ReassignPartitionsClusterTest/shouldPerformMultipleReassignmentOperationsOverVariousTopics/ > : > {code} > java.lang.AssertionError: expected: but was: > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:834) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:144) > at > kafka.admin.ReassignPartitionsClusterTest.shouldPerformMultipleReassignmentOperationsOverVariousTopics(ReassignPartitionsClusterTest.scala:524) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-6266) Kafka 1.0.0 : Repeated occurrence of WARN Resetting first dirty offset of __consumer_offsets-xx to log start offset 203569 since the checkpointed offset 120955 is inval
[ https://issues.apache.org/jira/browse/KAFKA-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16272422#comment-16272422 ] huxihx commented on KAFKA-6266: --- [~VinayKumar] what's the value for broker-side config `log.cleanup.policy`? > Kafka 1.0.0 : Repeated occurrence of WARN Resetting first dirty offset of > __consumer_offsets-xx to log start offset 203569 since the checkpointed > offset 120955 is invalid. (kafka.log.LogCleanerManager$) > -- > > Key: KAFKA-6266 > URL: https://issues.apache.org/jira/browse/KAFKA-6266 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
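The warnings in this issue mean the partitions' log start offsets have moved past the offsets recorded in each log directory's cleaner-offset-checkpoint file. To inspect what the cleaner has checkpointed, the file can be parsed directly; this is a hedged sketch assuming the plain-text layout Kafka's offset checkpoint files used at the time (line 1 = version, line 2 = entry count, then one "topic partition offset" triple per line), and the /tmp/kafka-logs path is only an example:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Reads a cleaner-offset-checkpoint file into a map of
// "topic-partition" -> checkpointed cleaner offset.
public class CleanerCheckpointReader {
    static Map<String, Long> read(Path checkpoint) throws IOException {
        List<String> lines = Files.readAllLines(checkpoint);
        // lines.get(0) is the format version; lines.get(1) is the entry count.
        int count = Integer.parseInt(lines.get(1).trim());
        Map<String, Long> offsets = new LinkedHashMap<>();
        for (int i = 0; i < count; i++) {
            // Each entry line: "<topic> <partition> <offset>"
            String[] parts = lines.get(2 + i).trim().split("\\s+");
            offsets.put(parts[0] + "-" + parts[1], Long.parseLong(parts[2]));
        }
        return offsets;
    }

    public static void main(String[] args) throws IOException {
        Map<String, Long> offsets =
            read(Paths.get("/tmp/kafka-logs/cleaner-offset-checkpoint"));
        offsets.forEach((tp, off) -> System.out.println(tp + " -> " + off));
    }
}
```

Comparing these values against each partition's current log start offset shows exactly which checkpoints the WARN is complaining about; a checkpointed offset below the log start offset is what the cleaner calls "invalid" before resetting it.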