aicam opened a new issue, #4190: URL: https://github.com/apache/texera/issues/4190
### Task Summary

# Proposal: Unify Proxy Architecture with Envoy Gateway

## Summary

We are proposing a migration from our current dual-proxy setup (Ingress Nginx + Envoy) to a unified **Envoy Gateway** architecture. This change aims to simplify our infrastructure, adopt the modern Kubernetes Gateway API, and natively support the dynamic routing requirements of our ephemeral Computing Units.

## Motivation & Current Limitations

Our system currently relies on **Ingress Nginx** for static routing and a separate **Envoy** instance for dynamic routing. While Ingress Nginx is a standard solution, it presents significant architectural limitations for our specific workload, particularly regarding the **Computing Units** in the `texera-workflow-computing-unit-pool` namespace.

### The Problems with the Current Stack

1. **Lack of Dynamic Routing:** Ingress Nginx operates on a configuration reload model. Every time an upstream service changes, Nginx must reload its configuration. This is inefficient for our Computing Units, which are highly transient (dynamically created and terminated).
2. **Dual-Proxy Complexity:** To bypass Nginx's limitations, we introduced a secondary Envoy proxy to handle WebSocket and HTTP connections to these dynamic units. This resulted in a "double-hop" architecture (User -> Ingress -> Envoy -> Compute), increasing latency and operational maintenance.
3. **Legacy Architecture:** Nginx's process-based architecture is less suited for high-churn service discovery compared to Envoy's modern, threaded architecture which utilizes the xDS protocol for seamless, hot-restart-free configuration updates.

## Proposed Solution: Envoy Gateway

We propose replacing both the Ingress Controller and the standalone Envoy proxy with **Envoy Gateway**.

Envoy Gateway is a Kubernetes-native implementation of the **Gateway API**. It manages Envoy proxies as the data plane, allowing us to handle both static system routes (Web App, Config Service) and dynamic ephemeral routes (Computing Units) in a single, unified layer.

### Key Benefits

* **Unified Architecture:** Eliminates the maintenance overhead of managing two different proxy technologies.
* **Native Dynamic Routing:** Envoy natively supports service discovery via xDS, allowing it to route to ephemeral pods in the `texera-workflow-computing-unit-pool` without the need for constant configuration reloads.
* **Modern Standard:** Adopting the [Kubernetes Gateway API](https://gateway-api.sigs.k8s.io/) future-proofs our networking stack with standard resources (`Gateway`, `HTTPRoute`) rather than vendor-specific annotations.
* **Protocol Support:** Seamless support for both HTTP/2 and WebSockets, which are critical for the interactive nature of our workflow system.

## Architecture Comparison

### Current Architecture

*Traffic flows through Ingress for static routes, but requires a secondary Envoy hop for dynamic Compute Units.*

Current architecture:

<img width="1119" height="509" alt="Screenshot from 2026-01-29 13-16-45" src="https://github.com/user-attachments/assets/b73d1403-f973-4153-804e-b330c2824c19" />

### Target Architecture

*A single Envoy Gateway layer handles all ingress traffic, routing directly to services and dynamically discovering Compute Units.*

<img width="1253" height="559" alt="Screenshot from 2026-01-29 13-21-07" src="https://github.com/user-attachments/assets/1006d958-3c16-4acd-a902-30e4501e6361" />

### Priority

P2 – Medium

### Task Type

- [ ] Code Implementation
- [ ] Documentation
- [ ] Refactor / Cleanup
- [ ] Testing / QA
- [ ] DevOps / Deployment
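
### Illustrative Sketch: One `HTTPRoute` per Computing Unit

To make the `Gateway` / `HTTPRoute` resources mentioned under Key Benefits more concrete, here is a minimal sketch of what a per-unit route object might look like, built as a plain Python dict. Only the `texera-workflow-computing-unit-pool` namespace comes from this proposal. The gateway name (`texera-gateway`), its namespace (`texera`), the `computing-unit-<id>` Service naming, the `/computing-unit/<id>` path prefix, and port `8085` are hypothetical placeholders, not the project's actual configuration.

```python
import json

# Namespace named in the proposal; everything else below is a placeholder.
POOL_NAMESPACE = "texera-workflow-computing-unit-pool"


def computing_unit_route(
    unit_id: int,
    gateway_name: str = "texera-gateway",  # hypothetical Gateway name
    gateway_namespace: str = "texera",     # hypothetical Gateway namespace
    port: int = 8085,                      # hypothetical Service port
) -> dict:
    """Build a Gateway API HTTPRoute manifest for one computing unit."""
    service_name = f"computing-unit-{unit_id}"  # hypothetical naming scheme
    return {
        "apiVersion": "gateway.networking.k8s.io/v1",
        "kind": "HTTPRoute",
        "metadata": {
            "name": f"{service_name}-route",
            "namespace": POOL_NAMESPACE,
        },
        "spec": {
            # Attach this route to the shared Gateway.
            "parentRefs": [
                {"name": gateway_name, "namespace": gateway_namespace}
            ],
            "rules": [
                {
                    # Match requests under the unit's path prefix.
                    "matches": [
                        {
                            "path": {
                                "type": "PathPrefix",
                                "value": f"/computing-unit/{unit_id}",
                            }
                        }
                    ],
                    # Forward them to the unit's Service in the same namespace.
                    "backendRefs": [{"name": service_name, "port": port}],
                }
            ],
        },
    }


if __name__ == "__main__":
    # Print the manifest as JSON, which Kubernetes tooling accepts alongside YAML.
    print(json.dumps(computing_unit_route(42), indent=2))
```

Under these assumptions, whatever component creates and terminates Computing Units would create or delete one such object per unit, and Envoy Gateway would translate the change into xDS updates for the managed Envoy proxies. Because the route lives in a different namespace from the Gateway in this sketch, the Gateway's listeners would also need to allow routes from that namespace.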
