yrenat opened a new issue, #3685:
URL: https://github.com/apache/texera/issues/3685
## Overview
To set the context: the Texera project involves three key phases to achieve broad public access and scalability:
1. **Phase 1: Local Deployment**
- Initially, all services are deployed on a large, powerful local
cluster.
- **Challenges**:
- Handling large files.
2. **Phase 2: AWS Deployment**
- We run all the services on remote AWS VMs to support a limited number
of users.
- **Challenges**:
- Minimizing costs when user activity is low or nonexistent.
- High latency when new users initiate computing units.
3. **Phase 3: User-Credentialed Local Deployment**
- The final goal is to run the brain layer locally, where users provide their own AWS credentials for computation. We will create VMs on their behalf and use their resources to run computations for them.
This issue focuses on **Phase 2**, specifically addressing the cost issues
related to deploying on AWS.
---
## Framework
To better understand the structure of **Phase 2**, here is an overview of the planned framework (the Texera project is already divided into **8 microservices**):
<img width="1804" height="407" alt="Image"
src="https://github.com/user-attachments/assets/2d69b0c3-3670-4283-9d97-b3a6bd5a9317"
/>
When deploying on AWS, we do not want all services running all the time, since that would be too costly. Instead, we want them to **start and stop on request**: when a new request comes in, one or more services "wake up" and respond; when there are no requests, no services should be running.
In this model, we **only pay AWS when computation requests are made**, and
we do not incur costs when the system is idle. To achieve this, we plan to use
**AWS ECS (Elastic Container Service)**. Here's how it works:
- **Resource Threshold**:
- We set a threshold on resource usage (e.g., only 70% of CPU should be
used).
- If total resource usage exceeds the threshold (e.g., CPU usage reaches 85%), AWS ECS will automatically launch a new task to absorb the additional load (a **task** in ECS is a running instance of a task definition: one or more containers with the CPU and memory allocated to them).
- **Scaling Conditions**:
- The threshold for scaling can be based on CPU usage, RAM usage, or the
number of incoming requests.
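The threshold-based scaling above corresponds to ECS target tracking, whose behavior is roughly: pick a task count that brings the average metric back down to the target. A minimal Python sketch of that arithmetic (the numbers below are the illustrative values from the text, not Texera's actual configuration):

```python
import math

def desired_task_count(running_tasks: int, current_cpu_pct: float,
                       target_cpu_pct: float) -> int:
    """Approximate ECS target-tracking math: choose a task count that
    brings average CPU utilization back down to the target."""
    if running_tasks == 0:
        # Scale out from zero as soon as any load appears.
        return 1 if current_cpu_pct > 0 else 0
    return math.ceil(running_tasks * current_cpu_pct / target_cpu_pct)
```

With a 70% CPU target and 2 tasks running at 85% average CPU, this yields `ceil(2 * 85 / 70) = 3` tasks, matching the scale-out scenario described above.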
### Challenges
This approach, while cost-efficient, introduces a latency problem: if all microservices are idle, they must be started when a user makes a request, and that initialization time delays the response and harms the user experience.
### Observations and Guesses
The **8 microservices** do not serve a user request simultaneously but instead operate in a **sequential order**. For example, key microservices such as the **Config Service** and the **Texera WebApplication** are the first to handle a user request. If these are already running, the remaining microservices can be started while they respond to the user, which should greatly reduce user waiting time.
### Current Solution
Based on these observations, our current plan is:
- **Keep a minimum number of microservices running at all times** to improve
responsiveness.
- While these critical services handle the initial user request, other
microservices can start in the background.
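In ECS terms, this maps to setting each service's minimum capacity: 1 task for services on the critical first-request path, 0 for everything that can cold-start in the background. A minimal Python sketch of that split (the two critical service names come from the text; the remaining names are placeholders, since the full list of the 8 microservices is not given here):

```python
# Services on the critical first-request path stay warm (min 1 task);
# all others scale to zero when idle.
CRITICAL_SERVICES = {"config-service", "texera-web-application"}

# Placeholder names: only the two critical services are named in the text.
ALL_SERVICES = ["config-service", "texera-web-application",
                "service-3", "service-4", "service-5",
                "service-6", "service-7", "service-8"]

def min_running_tasks(service: str) -> int:
    """Minimum task count to keep warm for a given microservice."""
    return 1 if service in CRITICAL_SERVICES else 0

capacity_plan = {svc: min_running_tasks(svc) for svc in ALL_SERVICES}
# Only the two critical services incur idle cost; the other six scale to zero.
```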
---
## Approach
To address the challenges in **Phase 2**, we plan to conduct experiments and
analyze the trade-offs between cost and performance.