gianm commented on issue #12262: URL: https://github.com/apache/druid/issues/12262#issuecomment-1182883739
Even though the current work is focused on batch ingestion, I wanted to write down some thoughts about low-latency queries. I expect two pieces of the current work will be especially useful: frame format/processors, and the modeling of multi-stage queries as DAGs of processors. First, goals. We want to retain the good properties of the current low-latency query stack, and improve in areas where we can improve. Good properties we should keep: - The ability to do 1000+ QPS & beyond. - Segment caching and cache-aware work assignment. - The concept of a processing thread pool, where we get to decide what code runs on the CPU and what doesn't. - The ability to interleave small work items (like processing one individual segment) for different queries on the processing thread pool. - Query priorities and the ability for higher-priority queries to preferentially gain access to the processing thread pool. Things we can improve: - The obvious: make things multi-stage, and use frames for efficient data transfer between servers 🙂 - No, or reduced, need for server thread pool size and connection pool size tuning. Today the server thread pool sizes on Brokers and data servers, and the Broker-to-data-server connection pool size, dictate the number of queries that can run concurrently. Each query needs to acquire a connection from the Broker-to-data-server pool, and a server thread from the Broker and all relevant data servers, in order to run. High QPS workloads require tuning these three parameters. Too low and the hardware isn't maxed out; too high and the system can suffer from excessive memory use or context switching. It'd be better to arrive at good throughput and stability without needing to adjust these parameters. - In the out-of-box configuration, query priorities apply only to the processing thread pools. They don't apply to the resources mentioned in the prior bullet. So, it's possible for low priority long-running queries to starve out high priority queries. Query laning helps with this, but it isn't configured out of the box, and the 429 error codes make it more difficult to write clients since retries are necessary. I think we can come up with a better solution. I suspect we'll want to decouple server-to-server queries from the http request/response structure. - Query types today are monolithic and handle many functions internally (like aggregation, ordering, limiting) in a fixed structure. Splitting these up into smaller logical operators would simplify planning and make execution more flexible. It will also allow us to factor the execution code into more modular units. Some discussion in #12641. - Query profiling today relies on emitted metrics and Java profilers. A built-in query profiling facility would be useful: an ability for users to receive profiling information in such a way that doesn't require a metrics system or additional tools. Second, form factors. There's four reasonably natural form factors the low-latency stuff can take: (1) the Historicals do the leaf stages [the ones that directly process segments] and Brokers do intermediate and final stages; (2) the Historicals do all work [leaf and intermediate], and the Broker only receives and returns final results; (3) the Historicals do the leaf stages, Brokers receive and return final results, and we introduce a third service type for intermediate stages; and (4) we introduce a completely new service type that is neither a Historical nor a Broker. We should choose the path that gets us to the goals quickest and makes life easiest for users. (1) or (2) seem ideal because they provide a clean migration path. I don't like (3), because two service types seems like enough already, and adding a third would add significant operational complexity. (4) seems like a second-choice option if (1) or (2) don't work out. Welcome any additional thoughts. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
