shauryachats opened a new issue, #16456: URL: https://github.com/apache/pinot/issues/16456
## 🧩 Problem Statement

At **Uber**, Apache Pinot powers critical observability use cases such as **logging, tracing, and metrics**. A key requirement for these scenarios is the ability to **query across multiple Pinot clusters, regions, or data centers** to provide a **global view** of logs, traces, metrics, and more.

However, Pinot currently lacks native support for such cross-cluster queries. As a result, client applications must send individual queries to each cluster and manually merge the results, leading to increased complexity, duplication of logic, and inconsistent behavior across applications.

---

## ✅ Requirements

For the **initial iteration** of cross-cluster query federation, the following capabilities are planned:

- The **same table name** existing across two or more clusters should be queryable from **any broker**.
- Both **Single-Stage Engine (SSE)** and **Multi-Stage Engine (MSE)** queries must be supported.
- Federation can be **enabled via a query option**, or **globally via broker config**.
- **Schema mismatches** across clusters are detected: if federation is enabled and a column is missing in any cluster, the query **should fail**.
- **Partial results** should be supported and surfaced when applicable.
- **Hybrid tables are out of scope** for the first phase of federation support.

---

## 💡 Proposed Solution

The proposed solution involves making brokers **cross-cluster aware** by enabling them to connect to secondary Pinot clusters via ZooKeeper:

- Brokers are configured to connect to **secondary ZooKeeper clusters** during startup (`BaseBrokerStarter`) in addition to their primary cluster.
- Brokers act as both `PARTICIPANT` (to receive offline–online transitions) and `SPECTATOR` (to receive IdealState and ExternalView updates) for the secondary clusters.
- A new component, `FederatedRoutingManager`, implementing `RoutingManager`, is introduced.
  It aggregates routing metadata from the primary cluster and all secondary clusters using multiple `BrokerRoutingManager` instances internally.
- The unified `RoutingTable` returned to SSE or MSE engines is **cluster-agnostic**: downstream code can treat the segments and servers as if they are from a single logical cluster.

### ✅ Pros

- Minimal changes limited to the **routing layer**, preserving downstream planning and scatter-gather logic.
- When federation is not enabled via query options, `FederatedRoutingManager` behaves identically to a primary-only routing manager.
- Compatible with both **SSE and MSE** query flows.
- Preserves Pinot’s existing broker-centric model of Helix transitions.

### ❌ Cons

- Breaks strict **cluster isolation** at the broker level.
- Potential exposure to failures or delays due to **secondary ZooKeeper malfunction**, though guardrails will be introduced to mitigate this.

---

## 🔄 Alternate Solution

In the alternate design, a new optional component called `pinot-router` is introduced:

- This `pinot-router` component (extending from `pinot-broker`) receives federated queries.
- It connects to **all participating clusters’ ZooKeeper instances**, maintaining metadata like IdealState, ExternalView, and table configs per cluster-table pair.
- The router parses and rewrites the query appropriately, then scatters sub-queries to brokers in different clusters.
- It **aggregates the results** from these brokers and returns the final merged response to the client.

### ✅ Pros

- Preserves **strict cluster isolation**: Pinot clusters remain unaware of any cross-cluster communication.
- Provides a **modular federation layer** external to the core Pinot deployment.

### ❌ Cons

- **Incompatible with MSE**, as planning and execution would be split across clusters.
- **Increased likelihood of partial results**, especially when `LIMIT` clauses are pushed to each sub-query.
- Introduces **additional latency** due to the extra routing hop and result aggregation layer.
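
### Illustrative sketch: federated routing (Proposed Solution)

To make the routing-layer idea from the Proposed Solution more concrete, here is a minimal sketch of how per-cluster routing metadata might be merged into one cluster-agnostic view. It is not Pinot code: `ClusterRouting`, `StaticCluster`, and `route` are hypothetical stand-ins, not the real `RoutingManager` or `BrokerRoutingManager` APIs. The sketch assumes server names are unique across clusters and that a cluster which does not host the table is simply skipped; neither point is specified in this issue.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class FederatedRoutingSketch {

    /** Hypothetical per-cluster view; NOT Pinot's actual RoutingManager interface. */
    interface ClusterRouting {
        String clusterName();
        /** Server -> segments for the table, or null if this cluster does not host it. */
        Map<String, List<String>> routingFor(String table);
        /** Column names of the table, or null if this cluster does not host it. */
        Set<String> columnsOf(String table);
    }

    /** In-memory cluster that hosts a single table, for illustration only. */
    static final class StaticCluster implements ClusterRouting {
        private final String name;
        private final String table;
        private final Map<String, List<String>> serverToSegments;
        private final Set<String> columns;

        StaticCluster(String name, String table,
                      Map<String, List<String>> serverToSegments, Set<String> columns) {
            this.name = name;
            this.table = table;
            this.serverToSegments = serverToSegments;
            this.columns = columns;
        }

        public String clusterName() { return name; }
        public Map<String, List<String>> routingFor(String t) {
            return table.equals(t) ? serverToSegments : null;
        }
        public Set<String> columnsOf(String t) {
            return table.equals(t) ? columns : null;
        }
    }

    /** Merges routing from the primary and, if federation is enabled, all secondaries. */
    static Map<String, List<String>> route(ClusterRouting primary, List<ClusterRouting> secondaries,
                                           String table, Set<String> queriedColumns,
                                           boolean federationEnabled) {
        List<ClusterRouting> clusters = new ArrayList<>();
        clusters.add(primary);
        if (federationEnabled) {
            clusters.addAll(secondaries);
        }
        Map<String, List<String>> merged = new LinkedHashMap<>();
        for (ClusterRouting cluster : clusters) {
            Map<String, List<String>> routing = cluster.routingFor(table);
            if (routing == null) {
                continue; // assumption: clusters without the table are skipped
            }
            if (federationEnabled) {
                // Requirement from the issue: a missing column in any cluster fails the query.
                for (String column : queriedColumns) {
                    if (!cluster.columnsOf(table).contains(column)) {
                        throw new IllegalStateException(
                            "Column '" + column + "' missing in cluster " + cluster.clusterName());
                    }
                }
            }
            for (Map.Entry<String, List<String>> e : routing.entrySet()) {
                merged.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).addAll(e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        ClusterRouting primary = new StaticCluster("dc1", "logs",
            Map.of("dc1-server-0", List.of("logs_0", "logs_1")), Set.of("ts", "message"));
        ClusterRouting secondary = new StaticCluster("dc2", "logs",
            Map.of("dc2-server-0", List.of("logs_7")), Set.of("ts", "message"));
        // Federation off: only the primary cluster's servers are returned.
        System.out.println(route(primary, List.of(secondary), "logs", Set.of("ts"), false));
        // Federation on: servers from both clusters appear in one merged map.
        System.out.println(route(primary, List.of(secondary), "logs", Set.of("ts"), true));
    }
}
```

With federation disabled the result contains only the primary's servers, which mirrors the "behaves identically to a primary-only routing manager" point. With it enabled, downstream code sees a single server-to-segments map and does not need to know which cluster each server belongs to.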
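
### Illustrative sketch: router-side merge with `LIMIT` (Alternate Solution)

The scatter-and-merge step of the alternate `pinot-router` design can be illustrated for a query shaped like `ORDER BY ts DESC LIMIT n`. This is a sketch under assumptions, not a proposed implementation: `ClusterResponse`, `MergedResponse`, and `mergeTopN` are hypothetical names, each sub-query is assumed to carry the same `LIMIT n`, and a cluster that fails to respond is modeled as a response with no rows.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class RouterMergeSketch {

    /** Hypothetical per-cluster response; rows hold the ORDER BY column, null means failure. */
    static final class ClusterResponse {
        final String cluster;
        final List<Long> rows;

        ClusterResponse(String cluster, List<Long> rows) {
            this.cluster = cluster;
            this.rows = rows;
        }

        boolean failed() { return rows == null; }
    }

    /** Hypothetical merged response returned to the client. */
    static final class MergedResponse {
        final List<Long> rows;
        final boolean partial;

        MergedResponse(List<Long> rows, boolean partial) {
            this.rows = rows;
            this.partial = partial;
        }
    }

    /** Combines per-cluster top-n rows, re-sorts descending, and trims to the limit. */
    static MergedResponse mergeTopN(List<ClusterResponse> responses, int limit) {
        List<Long> all = new ArrayList<>();
        boolean partial = false;
        for (ClusterResponse response : responses) {
            if (response.failed()) {
                partial = true; // surface that at least one cluster contributed nothing
                continue;
            }
            all.addAll(response.rows);
        }
        all.sort(Comparator.reverseOrder());
        List<Long> top = new ArrayList<>(all.subList(0, Math.min(limit, all.size())));
        return new MergedResponse(top, partial);
    }

    public static void main(String[] args) {
        List<ClusterResponse> responses = List.of(
            new ClusterResponse("dc1", List.of(50L, 40L, 30L)),
            new ClusterResponse("dc2", List.of(45L, 35L, 25L)));
        MergedResponse merged = mergeTopN(responses, 3);
        System.out.println(merged.rows + " partial=" + merged.partial);
    }
}
```

Even in this simplified form the router has to collect up to `n` rows from every cluster before it can trim, and a single unresponsive cluster turns the answer into a partial one. Aggregations and joins would need considerably more merge logic than this sketch shows.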
