acdn-mpreston opened a new issue #9391: 0.17.0 Kafka Supervisor becomes 
"unhealthy" due to missing inputFormat when Overlord is restarted
URL: https://github.com/apache/druid/issues/9391
 
 
   
   ### Affected Version
   
   0.17.0
   
   
   ### Description
   
   When you provision a Kafka supervisor in Druid 0.17.0, a payload that sets 
the inputFormat property in the ioConfig is accepted, and the supervisor works 
properly until the Overlord service is restarted. After the restart, the 
supervisor fails to create tasks because the ioConfig no longer contains any 
"inputFormat" data.
   
   This is due to the way JSON marshalling is handled when the configuration 
is persisted. When you issue a command to create a supervisor with an ioConfig 
like this:
   
   "ioConfig": {
           "topic": "some-cool-topic",
           "inputFormat": {
               "type": "json"
           },
           "replicas": 1,
           "taskCount": 2,
           "taskDuration": "PT1800S",
           "consumerProperties": {
               "bootstrap.servers": "kafka:9092"
           },
           "pollTimeout": 100,
           "startDelay": "PT5S",
           "period": "PT30S",
           "useEarliestOffset": true,
           "completionTimeout": "PT3600S",
           "lateMessageRejectionPeriod": null,
           "earlyMessageRejectionPeriod": "PT3600S",
           "stream": "npav-ts-metrics",
           "useEarliestSequenceNumber": true
       },
   
   The payload is accepted, but any time you retrieve this supervisor, you 
get back an ioConfig that looks like this:
   
   "ioConfig": {
       "topic": "some-cool-topic",
       "replicas": 1,
       "taskCount": 2,
       "taskDuration": "PT1800S",
       "consumerProperties": {
         "bootstrap.servers": "kafka:9092"
       },
       "pollTimeout": 100,
       "startDelay": "PT5S",
       "period": "PT30S",
       "useEarliestOffset": true,
       "completionTimeout": "PT3600S",
       "lateMessageRejectionPeriod": null,
       "earlyMessageRejectionPeriod": "PT3600S",
       "lateMessageRejectionStartDateTime": null,
       "stream": "npav-ts-metrics",
       "useEarliestSequenceNumber": true,
       "givenInputFormat": {
         "type": "json",
         "flattenSpec": {
           "useFieldDiscovery": true,
           "fields": []
         },
         "featureSpec": {}
       }
     },
   
   Notice that the inputFormat property is gone and givenInputFormat is there 
instead. This supervisor will work properly until you restart the Overlord, at 
which point the supervisor ioConfig will contain the following:
   
   "ioConfig": {
       "topic": "some-cool-topic",
       "replicas": 1,
       "taskCount": 2,
       "taskDuration": "PT1800S",
       "consumerProperties": {
         "bootstrap.servers": "kafka:9092"
       },
       "pollTimeout": 100,
       "startDelay": "PT5S",
       "period": "PT30S",
       "useEarliestOffset": true,
       "completionTimeout": "PT3600S",
       "lateMessageRejectionPeriod": null,
       "earlyMessageRejectionPeriod": "PT3600S",
       "lateMessageRejectionStartDateTime": null,
       "stream": "npav-ts-metrics",
       "useEarliestSequenceNumber": true,
       "givenInputFormat": null
     },
   
   Notice that there is no longer a valid inputFormat, nor is there a valid 
givenInputFormat. When in this state, the supervisor is unable to create new 
ingestion tasks.
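
The three states above are consistent with a round trip in which the value is read from one property name and written under another. The following is a minimal sketch of that suspected mechanism, not Druid's actual code: the class `IoConfigSketch` and its methods are hypothetical, and the assumption that the reader only recognizes "inputFormat" while the writer emits "givenInputFormat" is inferred from the payloads shown in this issue.

```python
import json


class IoConfigSketch:
    """Hypothetical stand-in for the supervisor ioConfig (not Druid's class)."""

    def __init__(self, topic, input_format=None):
        self.topic = topic
        self.input_format = input_format

    @classmethod
    def from_json(cls, text):
        data = json.loads(text)
        # Assumed reader behaviour: only the "inputFormat" key is recognized,
        # so a "givenInputFormat" key in the stored payload is ignored.
        return cls(data["topic"], data.get("inputFormat"))

    def to_json(self):
        # Assumed writer behaviour: the value is emitted under a different key.
        return json.dumps(
            {"topic": self.topic, "givenInputFormat": self.input_format}
        )


# 1. The payload as submitted by the user.
submitted = '{"topic": "some-cool-topic", "inputFormat": {"type": "json"}}'

# 2. The in-memory supervisor still has the format, so it works for now.
in_memory = IoConfigSketch.from_json(submitted)
persisted = in_memory.to_json()

# 3. A restart re-reads the persisted payload and loses the format.
after_restart = IoConfigSketch.from_json(persisted)

print(persisted)
print(after_restart.to_json())
```

In this sketch the first printed payload carries the format only under "givenInputFormat", and the second carries `null` there, which mirrors the second and third ioConfig listings above. If the real cause is the same asymmetry, serializing the value under the same name it is read from would make the round trip lossless.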
