Hi all,

Our team has been examining the performance characteristics of Apache Drill and observed that calls to `getQueryPlan()` can contribute measurable latency. In a local fork, we added a basic in-memory cache for query plans within the `DrillSqlWorker` class. This change resulted in consistent reductions in total query response time of approximately 1500ms in scenarios where the overall execution time was around 3600ms.
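To make the idea concrete, here is a minimal, standalone sketch of the kind of cache we have in mind. It is illustrative only and is not the code from our fork: `PlanCacheSketch`, `getOrPlan`, the `schemaVersion` token, and the injected clock are all hypothetical names, and nothing here uses actual Drill APIs. Keying on the raw SQL string is also a simplification; a real key would probably need to include session options and the default schema as well.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical sketch of a TTL-based plan cache; none of these names exist in Drill. */
final class PlanCacheSketch<P> {

  /** A cached plan plus the data needed to decide whether it is still valid. */
  private static final class Entry<P> {
    final P plan;
    final long schemaVersion;   // assumed token that changes when schema/metadata changes
    final long createdAtMillis;

    Entry(P plan, long schemaVersion, long createdAtMillis) {
      this.plan = plan;
      this.schemaVersion = schemaVersion;
      this.createdAtMillis = createdAtMillis;
    }
  }

  private final Map<String, Entry<P>> entries = new ConcurrentHashMap<>();
  private final boolean enabled;   // toggle to enable or disable caching
  private final long ttlMillis;    // configurable time-to-live
  private final LongSupplier clock; // injected so expiry can be tested deterministically

  PlanCacheSketch(boolean enabled, long ttlMillis, LongSupplier clock) {
    this.enabled = enabled;
    this.ttlMillis = ttlMillis;
    this.clock = clock;
  }

  /**
   * Returns the cached plan when the SQL text matches, the schema version is
   * unchanged, and the entry has not expired; otherwise invokes the planner.
   * Concurrent misses for the same query may each run the planner once.
   */
  P getOrPlan(String sql, long schemaVersion, Function<String, P> planner) {
    if (!enabled) {
      return planner.apply(sql);
    }
    long now = clock.getAsLong();
    Entry<P> cached = entries.get(sql);
    if (cached != null
        && cached.schemaVersion == schemaVersion
        && now - cached.createdAtMillis < ttlMillis) {
      return cached.plan;
    }
    P plan = planner.apply(sql);
    entries.put(sql, new Entry<>(plan, schemaVersion, now));
    return plan;
  }

  public static void main(String[] args) {
    long[] now = {0L};
    int[] plannerCalls = {0};
    PlanCacheSketch<String> cache = new PlanCacheSketch<>(true, 1000L, () -> now[0]);
    Function<String, String> planner = sql -> {
      plannerCalls[0]++;
      return "plan for: " + sql;
    };

    cache.getOrPlan("SELECT 1", 1L, planner); // miss: first time this query is seen
    cache.getOrPlan("SELECT 1", 1L, planner); // hit: same SQL, same schema, within TTL
    cache.getOrPlan("SELECT 1", 2L, planner); // miss: schema version changed
    now[0] = 5000L;
    cache.getOrPlan("SELECT 1", 2L, planner); // miss: entry older than the TTL

    System.out.println("planner calls: " + plannerCalls[0]); // prints "planner calls: 3"
  }
}
```

The sketch does no size bounding or eviction of stale entries, which a real implementation would need. How to obtain a cheap and reliable schema version is the part we are least sure about.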
### Proposal: Cache getQueryPlan() Results

We propose introducing a caching mechanism for the output of `getQueryPlan()` in cases where:

- The input SQL query is the same as a previously seen one
- The schema or related metadata used for planning has not changed
- The cached result has not expired, based on a configurable time-to-live (TTL)

### Proposed Caching Features

- Toggle to enable or disable query plan caching
- Configurable TTL-based invalidation
- Optional schema metadata verification to detect changes in the underlying data sources

### Motivation

- Reduce planner overhead when query structure and schema remain stable
- Improve performance for repeated calls to `getQueryPlan()`
- Provide an optimization option that can be enabled when performance improvements are needed

### Related Discussions

This topic is related to prior discussions on metadata performance, such as the INFO_SCHEMA performance thread from February 2022. In that context, metadata introspection was identified as a source of overhead, and caching was suggested as one possible mitigation. A similar approach at the planning level may be applicable here.

### Next Steps

We’d love to hear thoughts from the community on:

- Whether this caching mechanism is appropriate for inclusion in Drill
- Any potential issues or edge cases, such as schema volatility or query plan invalidation strategies

If there’s general support, I’d be happy to draft a more detailed design and open a JIRA issue for it.

Thanks for your time, looking forward to your feedback and suggestions.

- Vincent