pchang388 commented on issue #12701: URL: https://github.com/apache/druid/issues/12701#issuecomment-1185869951
Since the Peon seems unable to pause within a reasonable timeframe, and at times appears unresponsive/hung, I took a look at metrics for the common actions it performs during a task lifecycle. According to the docs:

> An indexing task starts running and building a new segment. It must determine the identifier of the segment before it starts building it. For a task that is appending (like a Kafka task, or an index task in append mode) this is done by calling an "allocate" API on the Overlord to potentially add a new partition to an existing set of segments. For a task that is overwriting (like a Hadoop task, or an index task not in append mode) this is done by locking an interval and creating a new version number and new set of segments. When the indexing task has finished reading data for the segment, it pushes it to deep storage and then publishes it by writing a record into the metadata store.

So during the `READING` phase, the task talks to the Overlord via the allocate API to add new partitions to an existing set of segments (we did see a few tasks fail because they didn't pause while still in the `READING` phase; I gave an example earlier). During the `PUBLISH` phase it pushes segments to the object store and writes to the metadata DB. Looking at the general state of both:

1. SQL read/write/update performance in our metadata DB (we are using YugabyteDB, a distributed/HA Postgres, in Kubernetes due to VM capacity constraints on our side). Not much data yet since I only recently enabled Prometheus scraping for it:
    * Select and delete latencies appear to be quite high. Depending on the application, select statements should usually return quickly (< 1 second), especially for user-visible processes, and ours seem quite slow, though I'm unsure how much of an effect this has on tasks. Write performance seems to be okay. (See the `pg_stat_statements` sketch below for how I plan to dig into this.)
2. Object storage pushes and persists by the Peon: some of the larger segments take longer than expected, especially if multipart upload is involved, though I'm unsure whether Druid uses it (see the upload baseline sketch below):
   ```
   Segment[REDACT_2022-07-14T18:00:00.000Z_2022-07-14T19:00:00.000Z_2022-07-14T19:24:01.698Z_14] of 274,471,389 bytes built from 27 incremental persist(s) in 42,830ms; pushed to deep storage in 47,408ms
   Segment[REDACT_2022-07-14T17:00:00.000Z_2022-07-14T18:00:00.000Z_2022-07-14T17:47:19.425Z_32] of 42,782,021 bytes built from 12 incremental persist(s) in 4,177ms; pushed to deep storage in 5,958ms
   Segment[REDACT_2022-07-14T19:00:00.000Z_2022-07-14T20:00:00.000Z_2022-07-14T20:41:44.206Z_4] of 224,815,291 bytes built from 22 incremental persist(s) in 33,123ms; pushed to deep storage in 40,514ms
   ```

I hope this background information provides more detail on our setup/configuration and makes it easier to spot the potential bottleneck (the Overlord looks like a candidate; see the probe sketch below). I really appreciate the help @abhishekagarwal87 and @AmatyaAvadhanula. My next step is to get flame graphs for the Peons to see what the threads are doing. Please let me know if you have any further suggestions, things to try, or other information I should provide.
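As a quick first data point on whether the Overlord itself is slow to respond (rather than just the allocate action), here is a rough probe sketch. It only assumes the standard `GET /druid/indexer/v1/runningTasks` Overlord endpoint; the host/port are placeholders for our environment:

```python
# Rough probe: repeatedly time a cheap Overlord API call to see whether the
# Overlord is slow/unresponsive in general, not just for segment allocation.
# The host/port below are placeholders for our environment.
import time

import requests

OVERLORD = "http://overlord:8090"  # placeholder

for _ in range(10):
    start = time.monotonic()
    resp = requests.get(f"{OVERLORD}/druid/indexer/v1/runningTasks", timeout=30)
    elapsed = time.monotonic() - start
    print(f"HTTP {resp.status_code}: {len(resp.json())} running tasks in {elapsed:.2f}s")
    time.sleep(5)
```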
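To quantify the slow selects/deletes on the metadata tables, something like the following could pull per-statement timings straight from the database. This is a sketch assuming the `pg_stat_statements` extension is enabled on the YugabyteDB side; the column names assume PostgreSQL 13+ naming (`mean_exec_time`; older versions use `mean_time`), and the connection parameters are placeholders:

```python
# Sketch: list the slowest statements touching the Druid metadata tables
# (druid_segments, druid_pendingsegments, druid_tasks, ...) via
# pg_stat_statements. Connection parameters are placeholders.
import psycopg2

conn = psycopg2.connect(host="metadata-db", port=5433,
                        dbname="druid", user="druid", password="changeme")
with conn, conn.cursor() as cur:
    cur.execute("""
        SELECT query, calls, round(mean_exec_time::numeric, 2) AS mean_ms
        FROM pg_stat_statements
        WHERE query ILIKE '%druid_%'
        ORDER BY mean_exec_time DESC
        LIMIT 20
    """)
    for query, calls, mean_ms in cur.fetchall():
        print(f"{mean_ms:>10} ms avg  x{calls:<8} {query[:120]}")
conn.close()
```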
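And to separate Druid overhead from raw object-store throughput on the segment pushes (the 274 MB segment above took ~47s), I could time a baseline upload of a similarly sized file from the same network, outside Druid. This sketch uses boto3 with its default multipart settings; the endpoint, bucket, and file path are placeholders, and it doesn't claim to match whatever upload path Druid actually uses:

```python
# Sketch: time a multipart upload of a ~270 MB file to the same bucket, to
# compare raw object-store throughput against Druid's ~47s push time.
# Endpoint, bucket, and file path are placeholders.
import time

import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3", endpoint_url="https://object-store.example")
cfg = TransferConfig(multipart_threshold=8 * 1024 * 1024,  # boto3 defaults
                     multipart_chunksize=8 * 1024 * 1024)

start = time.monotonic()
s3.upload_file("/tmp/segment_sample.bin", "druid-deep-storage",
               "baseline/segment_sample.bin", Config=cfg)
print(f"uploaded in {time.monotonic() - start:.1f}s")
```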
