acdn-mpreston opened a new issue #9391: 0.17.0 Kafka Supervisor becomes "unhealthy" due to missing inputFormat when Overlord is restarted
URL: https://github.com/apache/druid/issues/9391

### Affected Version

0.17.0

### Description

When provisioning a Kafka Supervisor in Druid 0.17.0, a payload using the inputFormat property in the ioConfig is accepted, and the supervisor works properly until you restart the Overlord service. After the Overlord service is restarted, the supervisor fails to create tasks because the ioConfig no longer contains any "inputFormat" data. This is due to the way the JSON marshalling is handled in the configuration persistence.

When you issue a command to create a supervisor with an ioConfig like this:

    "ioConfig": {
      "topic": "some-cool-topic",
      "inputFormat": {
        "type": "json"
      },
      "replicas": 1,
      "taskCount": 2,
      "taskDuration": "PT1800S",
      "consumerProperties": {
        "bootstrap.servers": "kafka:9092"
      },
      "pollTimeout": 100,
      "startDelay": "PT5S",
      "period": "PT30S",
      "useEarliestOffset": true,
      "completionTimeout": "PT3600S",
      "lateMessageRejectionPeriod": null,
      "earlyMessageRejectionPeriod": "PT3600S",
      "stream": "npav-ts-metrics",
      "useEarliestSequenceNumber": true
    },

It will be accepted, but any time you retrieve this supervisor, you get back an ioConfig that looks like this:

    "ioConfig": {
      "topic": "some-cool-topic",
      "replicas": 1,
      "taskCount": 2,
      "taskDuration": "PT1800S",
      "consumerProperties": {
        "bootstrap.servers": "kafka:9092"
      },
      "pollTimeout": 100,
      "startDelay": "PT5S",
      "period": "PT30S",
      "useEarliestOffset": true,
      "completionTimeout": "PT3600S",
      "lateMessageRejectionPeriod": null,
      "earlyMessageRejectionPeriod": "PT3600S",
      "lateMessageRejectionStartDateTime": null,
      "stream": "npav-ts-metrics",
      "useEarliestSequenceNumber": true,
      "givenInputFormat": {
        "type": "json",
        "flattenSpec": {
          "useFieldDiscovery": true,
          "fields": []
        },
        "featureSpec": {}
      }
    },

Notice that inputFormat is gone and givenInputFormat is there instead. This supervisor will work properly until you restart the Overlord, at which point the supervisor ioConfig will contain the following:

    "ioConfig": {
      "topic": "some-cool-topic",
      "replicas": 1,
      "taskCount": 2,
      "taskDuration": "PT1800S",
      "consumerProperties": {
        "bootstrap.servers": "kafka:9092"
      },
      "pollTimeout": 100,
      "startDelay": "PT5S",
      "period": "PT30S",
      "useEarliestOffset": true,
      "completionTimeout": "PT3600S",
      "lateMessageRejectionPeriod": null,
      "earlyMessageRejectionPeriod": "PT3600S",
      "lateMessageRejectionStartDateTime": null,
      "stream": "npav-ts-metrics",
      "useEarliestSequenceNumber": true,
      "givenInputFormat": null
    },

Notice that there is no longer a valid inputFormat, nor is there a valid givenInputFormat. In this state, the supervisor is unable to create new ingestion tasks.
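The three payloads above are consistent with a config object that is read from one JSON key and written under another. The following is a minimal toy sketch of that kind of asymmetric round trip, not Druid's actual code: the `ToyIoConfig` class and its methods are hypothetical, and the assumption that the reader accepts only `inputFormat` while the writer emits `givenInputFormat` is inferred from the payloads rather than confirmed against the source.

```python
import json
from typing import Any, Dict, Optional


class ToyIoConfig:
    """Hypothetical stand-in for a supervisor ioConfig (not Druid's class)."""

    def __init__(self, topic: str, input_format: Optional[Dict[str, Any]]):
        self.topic = topic
        self.input_format = input_format

    @classmethod
    def from_json(cls, text: str) -> "ToyIoConfig":
        data = json.loads(text)
        # Assumption: the reader only looks for the "inputFormat" key.
        return cls(data["topic"], data.get("inputFormat"))

    def to_json(self) -> str:
        # Assumption: the writer emits the same value as "givenInputFormat".
        return json.dumps(
            {"topic": self.topic, "givenInputFormat": self.input_format}
        )


# 1. The payload as submitted by the user.
submitted = json.dumps(
    {"topic": "some-cool-topic", "inputFormat": {"type": "json"}}
)

# 2. What gets persisted and returned while the process stays up.
persisted = ToyIoConfig.from_json(submitted).to_json()

# 3. What is left after reloading the persisted form (a "restart").
after_restart = ToyIoConfig.from_json(persisted).to_json()

print(persisted)      # {"topic": "some-cool-topic", "givenInputFormat": {"type": "json"}}
print(after_restart)  # {"topic": "some-cool-topic", "givenInputFormat": null}
```

In this toy model the value survives only as long as the in-memory object does. Once the persisted form is read back, the reader finds no `inputFormat` key and the format is lost, which matches the sequence of payloads in the report.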