Hi all,

Our team has been examining the performance characteristics of Apache Drill and 
observed that calls to `getQueryPlan()` can contribute measurable latency. In a 
local fork, we added a basic in-memory cache for query plans within the 
`DrillSqlWorker` class. This change consistently reduced total query response 
time by approximately 1500ms in scenarios where the overall execution time was 
around 3600ms.

### Proposal: Cache getQueryPlan() Results

We propose introducing a caching mechanism for the output of `getQueryPlan()` 
in cases where:
- The input SQL query is the same as a previously seen one
- The schema or related metadata used for planning has not changed
- The cached result has not expired, based on a configurable time-to-live (TTL)
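To make the three conditions concrete, here is a minimal sketch of the lookup logic. It is not the code from our fork, and every name in it (`PlanCacheSketch`, `getOrPlan`, the schema fingerprint string) is hypothetical rather than an existing Drill API. It assumes the caller can supply some fingerprint of the schema state, and it uses Java records (Java 16+) only for brevity.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical sketch; none of these names exist in Drill. */
final class PlanCacheSketch<P> {

  /** Cache key: exact SQL text plus a caller-supplied schema fingerprint. */
  record Key(String sql, String schemaFingerprint) {}

  /** A cached plan together with the time it was created. */
  private record Entry<T>(T plan, long createdAtMillis) {}

  private final Map<Key, Entry<P>> entries = new ConcurrentHashMap<>();
  private final long ttlMillis;

  PlanCacheSketch(long ttlMillis) {
    this.ttlMillis = ttlMillis;
  }

  /**
   * Returns the cached plan when the SQL text matches, the schema
   * fingerprint matches, and the entry is younger than the TTL.
   * Otherwise it invokes the planner and stores the fresh result.
   */
  P getOrPlan(String sql, String schemaFingerprint, long nowMillis, Supplier<P> planner) {
    Key key = new Key(sql, schemaFingerprint);
    Entry<P> cached = entries.get(key);
    if (cached != null && nowMillis - cached.createdAtMillis() < ttlMillis) {
      return cached.plan();
    }
    P fresh = planner.get();
    entries.put(key, new Entry<>(fresh, nowMillis));
    return fresh;
  }
}
```

Folding the schema fingerprint into the key means a schema change simply misses the cache instead of requiring explicit invalidation. The sketch never evicts stale entries and may plan the same query twice under concurrent access, so a real implementation would need a size bound and a decision on whether duplicate planning is acceptable.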

### Proposed Caching Features

- Toggle to enable or disable query plan caching
- Configurable TTL-based invalidation
- Optional schema metadata verification to detect changes in the underlying 
data sources
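As a rough illustration of how these three knobs could interact, the sketch below models them as a small settings object. The class and its fields are hypothetical and are not existing Drill options; how they would map onto Drill's option system is left open for the design discussion.

```java
import java.time.Duration;

/** Hypothetical settings holder; these are not existing Drill options. */
final class PlanCacheSettings {
  final boolean enabled;       // master toggle for plan caching
  final Duration ttl;          // maximum age of a reusable entry
  final boolean verifySchema;  // whether to also require an unchanged schema

  PlanCacheSettings(boolean enabled, Duration ttl, boolean verifySchema) {
    this.enabled = enabled;
    this.ttl = ttl;
    this.verifySchema = verifySchema;
  }

  /**
   * Decides whether a cached entry may be reused.
   * schemaUnchanged is only consulted when verifySchema is on.
   */
  boolean mayReuse(Duration entryAge, boolean schemaUnchanged) {
    if (!enabled) {
      return false;
    }
    if (entryAge.compareTo(ttl) >= 0) {
      return false;
    }
    return !verifySchema || schemaUnchanged;
  }
}
```

With schema verification switched off, the TTL is the only protection against serving a plan built on stale metadata, which is the trade-off we would most like feedback on.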

### Motivation

- Reduce planner overhead when query structure and schema remain stable
- Improve performance for repeated calls to `getQueryPlan()`
- Provide an optimization option that can be enabled when performance 
improvements are needed

### Related Discussions

This topic is related to prior discussions on metadata performance, such as the 
INFO_SCHEMA performance thread from February 2022. In that context, metadata 
introspection was identified as a source of overhead, and caching was suggested 
as one possible mitigation. A similar approach at the planning level may be 
applicable here.

### Next Steps

We’d love to hear thoughts from the community on:
- Whether this caching mechanism is appropriate for inclusion in Drill
- Any potential issues or edge cases, such as schema volatility or query plan 
invalidation strategies

If there’s general support, I’d be happy to draft a more detailed design and 
open a JIRA issue for it.

Thanks for your time. I look forward to your feedback and suggestions.

-  Vincent
