shauryachats opened a new issue, #16456:
URL: https://github.com/apache/pinot/issues/16456

   ## 🧩 Problem Statement
   
   At **Uber**, Apache Pinot powers critical observability use cases such as 
**logging, tracing, and metrics**. A key requirement for these scenarios is the 
ability to **query across multiple Pinot clusters, regions, or data centers** 
to provide a **global view** of logs, traces, metrics, and more.
   
   However, Pinot currently lacks native support for such cross-cluster 
queries. As a result, client applications must send individual queries to each 
cluster and manually merge the results—leading to increased complexity, 
duplication of logic, and inconsistent behavior across applications.
   
   ---
   
   ## ✅ Requirements
   
   For the **initial iteration** of cross-cluster query federation, the 
following capabilities are planned:
   
   - A table that exists under the **same name** in two or more clusters 
should be queryable from **any broker**.
   - Both **Single-Stage Engine (SSE)** and **Multi-Stage Engine (MSE)** 
queries must be supported.
   - Federation can be **enabled via a query option**, or **globally via broker 
config**.
   - **Schema mismatches** across clusters are detected: if federation is 
enabled and a column is missing in any cluster, the query **should fail**.
   - **Partial results** should be supported and surfaced when applicable.
   - **Hybrid tables are out of scope** for the first phase of federation 
support.
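
The schema-mismatch requirement above can be sketched as a pre-flight check. This is a minimal illustration, not Pinot code: `FederatedSchemaCheck`, `findMissingColumns`, and `validate` are hypothetical names, and each cluster's schema is modeled as a plain set of column names rather than a real Pinot `Schema` object.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative only: none of these names come from the Pinot codebase. */
final class FederatedSchemaCheck {
  private FederatedSchemaCheck() {}

  /**
   * Returns, per cluster, the queried columns that the cluster's schema lacks.
   * Clusters that have every queried column are omitted from the result.
   */
  static Map<String, Set<String>> findMissingColumns(
      Map<String, Set<String>> columnsByCluster, List<String> queriedColumns) {
    Map<String, Set<String>> missing = new LinkedHashMap<>();
    for (Map.Entry<String, Set<String>> entry : columnsByCluster.entrySet()) {
      // Start from the queried columns and drop the ones this cluster has.
      Set<String> absent = new TreeSet<>(queriedColumns);
      absent.removeAll(entry.getValue());
      if (!absent.isEmpty()) {
        missing.put(entry.getKey(), absent);
      }
    }
    return missing;
  }

  /** Fails the query if any participating cluster lacks a queried column. */
  static void validate(
      Map<String, Set<String>> columnsByCluster, List<String> queriedColumns) {
    Map<String, Set<String>> missing =
        findMissingColumns(columnsByCluster, queriedColumns);
    if (!missing.isEmpty()) {
      throw new IllegalStateException(
          "Federated query references columns missing in some clusters: " + missing);
    }
  }
}
```

Under these assumptions, a broker would call `validate` once per federated query before routing, so a mismatch surfaces as a query failure rather than as silently incomplete rows.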
   
   ---
   
   ## 💡 Proposed Solution
   
   The proposed solution involves making brokers **cross-cluster aware** by 
enabling them to connect to secondary Pinot clusters via ZooKeeper:
   
   - Brokers are configured to connect to **secondary ZooKeeper clusters** 
during startup (`BaseBrokerStarter`) in addition to their primary cluster.
   - Brokers act as both `PARTICIPANT` (to receive offline–online transitions) 
and `SPECTATOR` (to receive IdealState and ExternalView updates) for the 
secondary clusters.
   - A new component, `FederatedRoutingManager`, implementing `RoutingManager`, 
is introduced. It aggregates routing metadata from the primary cluster and all 
secondary clusters using multiple `BrokerRoutingManager` instances internally.
   - The unified `RoutingTable` returned to SSE or MSE engines is 
**cluster-agnostic**—downstream code can treat the segments and servers as if 
they are from a single logical cluster.
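
The aggregation idea behind `FederatedRoutingManager` can be sketched roughly as follows. This is a simplified model under stated assumptions: `ClusterRouting` is a hypothetical stand-in for a per-cluster `BrokerRoutingManager`, a routing table is reduced to a server-to-segments map, and the method names do not reflect the actual `RoutingManager` interface.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative model only; types and methods here are not Pinot APIs. */
final class FederatedRoutingSketch {
  /** Hypothetical per-cluster routing view: server instance -> segments to query. */
  interface ClusterRouting {
    Map<String, List<String>> serverToSegments(String tableName);
  }

  private final ClusterRouting primary;
  private final List<ClusterRouting> secondaries;

  FederatedRoutingSketch(ClusterRouting primary, List<ClusterRouting> secondaries) {
    this.primary = primary;
    this.secondaries = List.copyOf(secondaries);
  }

  /**
   * Builds one cluster-agnostic routing map. With federation disabled, only the
   * primary cluster contributes, matching primary-only behavior.
   */
  Map<String, List<String>> route(String tableName, boolean federationEnabled) {
    Map<String, List<String>> merged = new LinkedHashMap<>();
    mergeInto(merged, primary.serverToSegments(tableName));
    if (federationEnabled) {
      for (ClusterRouting secondary : secondaries) {
        mergeInto(merged, secondary.serverToSegments(tableName));
      }
    }
    return merged;
  }

  private static void mergeInto(
      Map<String, List<String>> target, Map<String, List<String>> source) {
    // Append segments per server so the same server name is never duplicated.
    source.forEach((server, segments) ->
        target.computeIfAbsent(server, k -> new ArrayList<>()).addAll(segments));
  }
}
```

Because the merged map carries no cluster identity, downstream scatter-gather code in this model sees only servers and segments, which is the property the proposal relies on.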
   
   ### ✅ Pros
   
   - Minimal changes limited to the **routing layer**, preserving downstream 
planning and scatter-gather logic.
   - When federation is not enabled via query options, 
`FederatedRoutingManager` behaves identically to a primary-only routing manager.
   - Compatible with both **SSE and MSE** query flows.
   - Preserves Pinot’s existing broker-centric model of Helix transitions.
   
   ### ❌ Cons
   
   - Breaks strict **cluster isolation** at the broker level.
   - Potential exposure to failures or delays due to **secondary ZooKeeper 
malfunction**—though guardrails will be introduced to mitigate this.
   
   ---
   
   ## 🔄 Alternate Solution
   
   In the alternate design, a new optional component called `pinot-router` is 
introduced:
   
   - This `pinot-router` component (extending `pinot-broker`) receives 
federated queries.
   - It connects to **all participating clusters’ ZooKeeper instances**, 
maintaining metadata like IdealState, ExternalView, and table configs per 
cluster-table pair.
   - The router parses and rewrites the query appropriately, then scatters 
sub-queries to brokers in different clusters.
   - It **aggregates the results** from these brokers and returns the final 
merged response to the client.
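
The router's gather step can be sketched for one simple case. The example assumes a query shaped like `ORDER BY ts DESC LIMIT n`, where each cluster's broker returns at most `n` timestamps and the router re-applies the limit after merging. `RouterMergeSketch` and `mergeTopN` are hypothetical names; a real router would merge full result rows and handle aggregations as well.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative only: a toy merge for "ORDER BY ts DESC LIMIT n" sub-query results. */
final class RouterMergeSketch {
  private RouterMergeSketch() {}

  /**
   * Combines per-cluster results, sorts them newest-first, and keeps the global
   * top {@code limit} values. Each inner list is one cluster's sub-query result.
   */
  static List<Long> mergeTopN(List<List<Long>> perClusterRows, int limit) {
    List<Long> all = new ArrayList<>();
    for (List<Long> rows : perClusterRows) {
      all.addAll(rows);
    }
    all.sort(Comparator.reverseOrder());
    // Copy the sublist so the result does not keep a view over the full list.
    return new ArrayList<>(all.subList(0, Math.min(limit, all.size())));
  }
}
```

For example, merging `[9, 7, 1]` and `[8, 6]` with a limit of 3 yields `[9, 8, 7]`. Pushing the limit down is only safe here because every cluster returns its own top `n`; other query shapes would need different merge logic in the router.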
   
   ### ✅ Pros
   
   - Preserves **strict cluster isolation**: Pinot clusters remain unaware of 
any cross-cluster communication.
   - Provides a **modular federation layer** external to the core Pinot 
deployment.
   
   ### ❌ Cons
   
   - **Incompatible with MSE**, as planning and execution would be split across 
clusters.
   - **Increased likelihood of partial results**, especially when `LIMIT` 
clauses are pushed to each sub-query.
   - Introduces **additional latency** due to the extra routing hop and result 
aggregation layer.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

