yrenat opened a new issue, #3685:
URL: https://github.com/apache/texera/issues/3685

   ## Overview
   
   The Texera project involves three key phases to achieve broad, general 
public access and scalability:
   
   1. **Phase 1: Local Deployment**  
      - Initially, all services are deployed on a large, powerful local 
cluster.  
      - **Challenges**:  
        - Handling large file issues.  
   
   2. **Phase 2: AWS Deployment**  
      - We run all the services on remote AWS VMs to support a limited number 
of users.  
      - **Challenges**:  
        - Determining the minimum costs when user activity is low or 
non-existent.  
        - High latency when new users initiate computing units.  
   
   3. **Phase 3: User-Credentialed Local Deployment**  
      - The final goal is to run the brain layer locally, with users providing 
their own AWS credentials for computation. We will create VMs on their behalf 
and use their resources to run computations for them. 
   
   This issue focuses on **Phase 2**, specifically addressing the cost issues 
related to deploying on AWS.
   
   ---
   
   ## Framework
   
   To better understand the structure in **Phase 2**, here's an overview of the 
planned framework (The Texera project is already divided into **8 
microservices**):
   
   <img width="1804" height="407" alt="Image" 
src="https://github.com/user-attachments/assets/2d69b0c3-3670-4283-9d97-b3a6bd5a9317" />  
   
   When deploying on AWS, we do not want all the services to be running all the 
time, because that would be too costly. Instead, we want them to **start and 
stop on request**: when a new request comes in, one or more services "wake up" 
and respond; when there are no requests, no services should be running.  
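This start/stop-on-request behavior can be modeled in a few lines. The sketch below is a toy simulation (the 300-second idle timeout is an illustrative value, not Texera's configuration): a service is "running" while requests arrive within the timeout, and a request that finds it stopped pays a cold-start penalty.

```python
from typing import Optional


class OnDemandService:
    """Toy model of start/stop-on-request: the service wakes when a request
    arrives and counts as stopped once it has been idle past the timeout.
    Real ECS scaling would use CloudWatch metrics; this only models timing."""

    def __init__(self, idle_timeout: float = 300.0):
        self.idle_timeout = idle_timeout
        self.last_request: Optional[float] = None

    def is_running(self, now: float) -> bool:
        # Running iff a request was seen within the idle window.
        return (self.last_request is not None
                and now - self.last_request < self.idle_timeout)

    def handle_request(self, now: float) -> bool:
        # Returns True when the request hits a stopped service,
        # i.e. the user experiences cold-start latency.
        cold_start = not self.is_running(now)
        self.last_request = now
        return cold_start


svc = OnDemandService(idle_timeout=300.0)
print(svc.handle_request(0.0))    # first request: cold start
print(svc.handle_request(100.0))  # within the idle window: warm
print(svc.is_running(500.0))      # idle past the timeout: stopped again
```

This captures the cost/latency trade-off discussed below: a longer idle timeout means fewer cold starts but more paid idle time.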
   
   In this model, we **only pay AWS when computation requests are made**, and 
we do not incur costs when the system is idle. To achieve this, we plan to use 
**AWS ECS (Elastic Container Service)**. Here's how it works:  
   
   - **Resource Threshold**:  
     - We set a threshold on resource usage (e.g., at most 70% of the CPU 
should be used).  
     - If total resource usage exceeds the threshold (e.g., CPU usage reaches 
85%), AWS ECS automatically launches a new task to absorb the additional load 
(a **task** in ECS is a running instance of one or more containers, with 
resources allocated by ECS, that can serve multiple requests).  
   
   - **Scaling Conditions**:  
     - The scaling threshold can be based on CPU usage, RAM usage, or the 
number of incoming requests.  
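As a rough illustration of the threshold mechanism, ECS target-tracking scaling adjusts the task count roughly in proportion to how far the observed metric is from the target. The exact AWS algorithm is internal to the service; the function below is only a simplified proportional model with example numbers.

```python
import math


def desired_task_count(current_tasks: int, metric_value: float,
                       target_value: float) -> int:
    """Simplified model of target-tracking scaling: resize the task count
    so that the per-task load returns to the target value.
    E.g., 2 tasks at 85% CPU with a 70% target -> ceil(2 * 85 / 70) = 3."""
    if current_tasks == 0:
        # Target tracking cannot scale out from zero tasks on its own;
        # something must first raise the desired count above zero.
        return 0
    return max(1, math.ceil(current_tasks * metric_value / target_value))


print(desired_task_count(2, 85.0, 70.0))  # scale out under load -> 3
print(desired_task_count(4, 30.0, 70.0))  # scale in when lightly used -> 2
</<!---->```

Note the zero-task case: it is one reason the "keep a minimum number of services running" idea discussed later matters, since an entirely stopped service cannot scale itself back up on metrics alone.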
   
   ### Challenges
   This approach, while cost-efficient, introduces a latency problem: if all 
microservices are idle and must start when a user makes a request, 
initialization takes time. This delays the response and hurts the user 
experience.  
   
   ### Observations and Guesses
   The **8 microservices** do not all serve a user request simultaneously; 
instead, they operate in a **sequential order**. For example, a few key 
microservices, such as the **"Config Service"** and the **"Texera 
WebApplication"**, are the first to handle user requests. If these 
microservices are already running, the remaining microservices can be started 
while the first ones are responding to the users. This should greatly reduce 
the user's waiting time. 
   
   
   ### Current Solution
   Based on these observations:  
   - **Keep a minimum number of microservices running at all times** to improve 
responsiveness.  
   - While these critical services handle the initial user request, other 
microservices can start in the background.  
   
   ---
   
   ## Approach
   
   To address the challenges in **Phase 2**, we plan to conduct experiments and 
analyze the trade-offs between cost and performance. 

