We noticed that historicals take a long time to download segments from
deep storage (in our case, S3). Looking closer at the code in
ZKCoordinator, I noticed that segment downloads happen on a single thread:
they run in the single-threaded ExecutorService used by the
PathChildrenCache. According to the discussion on
https://github.com/apache/incubator-druid/issues/4421 and
https://github.com/apache/incubator-druid/issues/3202, the executor service
used in PathChildrenCache can only be single-threaded.

My proposal is to add a multi-threaded ExecutorService that acts on the
events and performs the downloads. The single-threaded ExecutorService in
PathChildrenCache would then simply delegate each download task to this
new executor service.
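
A minimal sketch of the delegation idea, using only java.util.concurrent.
It is not Druid or Curator code: the class name DelegatingSegmentLoader,
the methods onSegmentAssigned, download and loadAll, and the 50 ms sleep
that stands in for an S3 fetch are all hypothetical. The single-threaded
executor here only plays the role of the one PathChildrenCache uses to
deliver events.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration; none of these names exist in Druid.
class DelegatingSegmentLoader {
    // Stand-in for the single-threaded executor that delivers cache events.
    private final ExecutorService eventExecutor = Executors.newSingleThreadExecutor();
    // Proposed multi-threaded pool that performs the actual downloads.
    private final ExecutorService downloadExecutor;
    private final Set<String> loaded = ConcurrentHashMap.newKeySet();

    DelegatingSegmentLoader(int downloadThreads) {
        this.downloadExecutor = Executors.newFixedThreadPool(downloadThreads);
    }

    // The event thread only hands the work off, so it is free to
    // process the next event while the download is still running.
    CompletableFuture<Void> onSegmentAssigned(String segmentId) {
        CompletableFuture<Void> done = new CompletableFuture<>();
        eventExecutor.execute(() -> downloadExecutor.execute(() -> {
            try {
                download(segmentId);
                loaded.add(segmentId);
                done.complete(null);
            } catch (Throwable t) {
                done.completeExceptionally(t);
            }
        }));
        return done;
    }

    // Assumption: simulates a slow fetch from deep storage.
    void download(String segmentId) throws InterruptedException {
        Thread.sleep(50);
    }

    void shutdown() {
        eventExecutor.shutdown();
        downloadExecutor.shutdown();
    }

    // Submits one event per segment, waits for all downloads to finish,
    // and returns the set of segment ids that were loaded.
    static Set<String> loadAll(List<String> segmentIds, int downloadThreads) {
        DelegatingSegmentLoader loader = new DelegatingSegmentLoader(downloadThreads);
        try {
            List<CompletableFuture<Void>> pending = new ArrayList<>();
            for (String id : segmentIds) {
                pending.add(loader.onSegmentAssigned(id));
            }
            for (CompletableFuture<Void> f : pending) {
                f.join();
            }
            return loader.loaded;
        } finally {
            loader.shutdown();
        }
    }
}
```

For example, DelegatingSegmentLoader.loadAll(ids, 4) runs up to four
simulated downloads at a time. One caveat with any such pool: tasks may
complete in a different order than their events arrived, so a real
implementation would need to make sure that, say, a load and a later drop
of the same segment cannot be reordered.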

Does that sound feasible? IMO, if this turns out to be functionally
correct, it should significantly reduce the time historicals take to
download all of their assigned segments.

I would be more than happy to contribute this enhancement to the community.

Thanks,
Samarth
