Re: [I] airflow workers and scheduler memory leak [airflow]

2024-12-06 Thread via GitHub


zsdyx commented on issue #28740:
URL: https://github.com/apache/airflow/issues/28740#issuecomment-2522721849

   > > Hi @rinzool, do you know why this happens? Is it caused by Linux kernel
   > > behavior? This is the first time we have hit this problem, and I don't
   > > understand why memory is reclaimed only after setting a limit.
   > 
   > Honestly, no. My guess is that the scheduler has an internal process that
   > releases some memory when it reaches a limit (for example, 90% of available
   > memory), and setting a memory limit on the pod helps Airflow trigger that
   > process. But honestly I don't know if that's the real reason. All I know is
   > that it fixed my problem 🤷
   
   We use the same version. I see this problem in both the scheduler and the
   triggerer, but when I look at the Airflow code there is no GC job, which is
   strange.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [I] airflow workers and scheduler memory leak [airflow]

2024-12-06 Thread via GitHub


rinzool commented on issue #28740:
URL: https://github.com/apache/airflow/issues/28740#issuecomment-2522706809

   > Hi @rinzool, do you know why this happens? Is it caused by Linux kernel
   > behavior? This is the first time we have hit this problem, and I don't
   > understand why memory is reclaimed only after setting a limit.
   
   Honestly, no. My guess is that the scheduler has an internal process that
   releases some memory when it reaches a limit (for example, 90% of available
   memory), and setting a memory limit on the pod helps Airflow trigger that
   process. But honestly I don't know if that's the real reason. All I know is
   that it fixed my problem 🤷





Re: [I] airflow workers and scheduler memory leak [airflow]

2024-12-06 Thread via GitHub


zsdyx commented on issue #28740:
URL: https://github.com/apache/airflow/issues/28740#issuecomment-2522698364

   Hi @rinzool, do you know why this happens? Is it caused by Linux kernel
   behavior? This is the first time we have hit this problem, and I don't
   understand why memory is reclaimed only after setting a limit.





Re: [I] airflow workers and scheduler memory leak [airflow]

2024-12-03 Thread via GitHub


danielstefanrt commented on issue #28740:
URL: https://github.com/apache/airflow/issues/28740#issuecomment-2516311371

   @zsdyx No, I haven't created a new issue. The problem is still present in
   our cluster, but we have not had much time for a deeper investigation.





Re: [I] airflow workers and scheduler memory leak [airflow]

2024-11-29 Thread via GitHub


zsdyx commented on issue #28740:
URL: https://github.com/apache/airflow/issues/28740#issuecomment-2507304166

   > Hi, @potiuk . Thanks for your quick reply. Sure, I'll do as you suggested
   > and create a new issue with my findings. There is no frustration here :) I
   > just saw that the outcome was the same and I did not want to open a new
   > issue when there was a similar one.
   
   Hi @danielstefanrt, I am seeing the same behaviour as you. Have you created
   a new issue?





Re: [I] airflow workers and scheduler memory leak [airflow]

2024-11-05 Thread via GitHub


danielstefanrt commented on issue #28740:
URL: https://github.com/apache/airflow/issues/28740#issuecomment-2456441110

   Hi,
   
   I'm facing the same issue on version 2.10.2. The scheduler memory keeps
   growing by ~1 GB each day.
   I tried setting memory limits, and in Kubernetes the pod ran out of memory
   (its status is OOMKilled and I can see that it restarted).





Re: [I] airflow workers and scheduler memory leak [airflow]

2024-11-05 Thread via GitHub


danielstefanrt commented on issue #28740:
URL: https://github.com/apache/airflow/issues/28740#issuecomment-2456539749

   Hi, @potiuk . Thanks for your quick reply. Sure, I'll do as you suggested
   and create a new issue with my findings. There is no frustration here :) I
   just saw that the outcome was the same, and I did not want to open a new
   issue when a similar one already existed.




Re: [I] airflow workers and scheduler memory leak [airflow]

2024-11-05 Thread via GitHub


potiuk commented on issue #28740:
URL: https://github.com/apache/airflow/issues/28740#issuecomment-2456527266

   > Hi,
   > 
   > I'm facing the same issue on version 2.10.2. The scheduler memory keeps
   > growing by ~1 GB each day. I tried setting memory limits, and in
   > Kubernetes the pod ran out of memory (its status is OOMKilled and I can
   > see that it restarted).
   
   Can you please use memray (see the instructions at
   https://github.com/bloomberg/memray) to monitor and check what memory is
   leaking, and open a new issue explaining your configuration, what you
   investigated, and the possible causes you can guess from analysing your
   usage? You clearly have a DIFFERENT issue than the one above, where we are
   talking about file-system memory that grows naturally and clears itself.
   
   Just commenting that you have a similar (i.e. memory-related) problem on a
   closed issue does not add much value and does not help anyone to help you.
   I understand you are frustrated with your problem, but the best course of
   action is to gather enough evidence and explain your circumstances in
   enough detail for someone to actually be able to do something about it (of
   course, only if there is enough evidence - this is a free forum where
   people try to help others when they can, so helping them to help you is a
   good idea).
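   For reference, a hedged sketch of what such a memray capture could look
   like from Python. The `Tracker` context manager is memray's documented
   Python API; `leaky` is a hypothetical stand-in for whatever code path is
   suspected, and the sketch degrades to a plain run when memray is not
   installed:
   
   ```python
   # Sketch only: profile a suspect code path with memray's Python API.
   # `leaky` is an illustrative placeholder, not real Airflow code.
   try:
       from memray import Tracker   # pip install memray
   except ImportError:
       Tracker = None
   
   def leaky():
       # stand-in for the code path suspected of leaking
       return [object() for _ in range(10_000)]
   
   if Tracker is not None:
       # Record every allocation made inside the block to scheduler.bin;
       # inspect it afterwards with: memray flamegraph scheduler.bin
       with Tracker("scheduler.bin"):
           leaky()
   else:
       leaky()  # memray not installed: run unprofiled
   ```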





Re: [I] airflow workers and scheduler memory leak [airflow]

2024-08-28 Thread via GitHub


rinzool commented on issue #28740:
URL: https://github.com/apache/airflow/issues/28740#issuecomment-2314575437

   Hi,
   If it may help someone: I faced the same issue. My scheduler memory (the
   `container_memory_working_set_bytes` metric, precisely) grew from 600MB to
   1.3GB in 4 days.
   Using Airflow 2.10.0 and chart 1.15.0.
   
![image](https://github.com/user-attachments/assets/9e7f6705-ad0d-42cf-ab02-20f93cea6b0d)
   
   I tried a lot of things (exploring processes with `ps` and `htop`, cleaning
   logs manually, etc.); nothing worked.
   
   After re-re-re-reading the last message from @potiuk I focused on:
   > Does it eventually cause your containers to crash with OOM (Out of
   > Memory), or does the memory clear itself out when needed?
   
   So I added a memory limit to see if the pod goes OOM or not
   ```yaml
   scheduler:
     resources:
       requests:
         memory: 500Mi
       limits:
         memory: 1Gi
   ```
   Here is the result: 
   
![image](https://github.com/user-attachments/assets/919e8cf6-21d0-4c6e-b064-cdaf206a79a4)
   
   No OOM: the pod reclaims memory on its own when usage gets too close to the
   limit.
   
   Hope this message can help!
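   The "clears itself" behaviour described above fits how this metric is
   defined: `container_memory_working_set_bytes` is cgroup memory usage minus
   inactive file cache, and the kernel reclaims that cache under pressure as
   usage nears the limit. A rough stdlib sketch of the computation, assuming
   cgroup-v2 paths (it returns `None` when run outside a container):
   
   ```python
   from pathlib import Path
   
   def working_set_bytes(base: str = "/sys/fs/cgroup"):
       """Approximate container_memory_working_set_bytes for a cgroup-v2
       container: total usage minus inactive file cache (the part the
       kernel can reclaim cheaply when nearing the memory limit)."""
       current = Path(base, "memory.current")
       stat = Path(base, "memory.stat")
       if not (current.exists() and stat.exists()):
           return None  # not running under cgroup v2
       usage = int(current.read_text())
       inactive_file = 0
       for line in stat.read_text().splitlines():
           key, _, value = line.partition(" ")
           if key == "inactive_file":
               inactive_file = int(value)
       return max(usage - inactive_file, 0)
   
   print(working_set_bytes())
   ```
   
   This is why the metric can keep growing harmlessly: the file-cache portion
   counts toward it until the kernel has a reason to drop it.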

