Re: [I] airflow workers and scheduler memory leak [airflow]
zsdyx commented on issue #28740: URL: https://github.com/apache/airflow/issues/28740#issuecomment-2522721849

> > Hi @rinzool, do you know why this happens? Is it caused by the behavior of the Linux kernel? This is the first time I have encountered this problem; why is memory reclaimed only after setting a limit?
>
> Honestly, no. My supposition is that the scheduler has an internal process that releases some memory when it reaches a limit (for example, 90% of available memory), and defining a memory limit on the pod helps Airflow trigger this process. But honestly I don't know if that's the real reason. All I know is that it fixed my problem 🤷

We use the same version, and I see this problem in both the scheduler and the triggerer. But when I look at the Airflow code, there is no GC job, which is strange.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
rinzool commented on issue #28740: URL: https://github.com/apache/airflow/issues/28740#issuecomment-2522706809

> Hi @rinzool, do you know why this happens? Is it caused by the behavior of the Linux kernel? This is the first time I have encountered this problem; why is memory reclaimed only after setting a limit?

Honestly, no. My supposition is that the scheduler has an internal process that releases some memory when it reaches a limit (for example, 90% of available memory), and defining a memory limit on the pod helps Airflow trigger this process. But honestly I don't know if that's the real reason. All I know is that it fixed my problem 🤷
zsdyx commented on issue #28740: URL: https://github.com/apache/airflow/issues/28740#issuecomment-2522698364

Hi @rinzool, do you know why this happens? Is it caused by the behavior of the Linux kernel? This is the first time I have encountered this problem; why is memory reclaimed (GC'd) only after setting a limit?
danielstefanrt commented on issue #28740: URL: https://github.com/apache/airflow/issues/28740#issuecomment-2516311371

@zsdyx No, I haven't created a new issue. The problem is still present in our cluster, but we have not had much time for a deeper investigation.
zsdyx commented on issue #28740: URL: https://github.com/apache/airflow/issues/28740#issuecomment-2507304166

> Hi, @potiuk. Thanks for your quick reply. Sure, I'll do as you suggested and create a new issue with my findings. There is no frustration here :) I just saw that the outcome was the same and I did not want to open a new issue when there was a similar one.

Hi @danielstefanrt, I am seeing the same phenomenon as you. Have you created a new issue?
danielstefanrt commented on issue #28740: URL: https://github.com/apache/airflow/issues/28740#issuecomment-2456441110

Hi,

I'm facing the same issue on version 2.10.2. The scheduler memory keeps growing by ~1GB each day. I tried setting memory limits, and I saw in Kubernetes that the pod ran out of memory (its status is OOMKilled and I can see that it restarted).
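For anyone wanting to confirm the same symptom, the container's last termination reason can be read directly from the pod status. A hedged sketch; the pod name and container index here are hypothetical and need to be adjusted to the actual deployment:

```shell
# Hypothetical pod name; adjust to your release/namespace.
kubectl get pod airflow-scheduler-0 \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
# A value of "OOMKilled" means the kernel killed the process
# at the cgroup memory limit, as described in this comment.
```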
danielstefanrt commented on issue #28740: URL: https://github.com/apache/airflow/issues/28740#issuecomment-2456539749

Hi, @potiuk. Thanks for your quick reply. Sure, I'll do as you suggested and create a new issue with my findings. There is no frustration here :) I just saw that the outcome was the same and I did not want to open a new issue when there was a similar one.
potiuk commented on issue #28740: URL: https://github.com/apache/airflow/issues/28740#issuecomment-2456527266

> Hi,
>
> I'm facing the same issue on version 2.10.2. The scheduler memory keeps growing by ~1GB each day. I've tried to set the limits and I saw in Kubernetes that the pod ran out of memory (status is OOMKilled and I can see that it restarted).

Can you please use memray (see instructions at https://github.com/bloomberg/memray) to monitor and check what memory is leaking, and open a new issue explaining your configuration and your investigation: what you looked at and the possible causes you can guess from analysing your usage. You clearly have a DIFFERENT issue than the one above, where we are talking about file-system memory that grows naturally and clears itself.

Just commenting that you have a similar (i.e. memory-related) issue on a closed issue does not add much value and does not help anyone help you. I understand you are frustrated with your problem, but the best course of action is to gather enough evidence and explain your circumstances in enough detail for someone to actually be able to do something about it (of course, only if there is enough evidence; this is a free forum where people try to help others when they can, so helping them to help you is a good idea).
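The memray workflow suggested above might look roughly like the following. This is a hedged sketch: it assumes memray can be installed in the scheduler image (or a debug container), and the PID and output paths are placeholders:

```shell
# Install memray in the scheduler container (or a debug sidecar).
pip install memray

# Option 1: attach to the already-running scheduler process
# (requires ptrace permissions inside the container).
memray attach <scheduler-pid> -o /tmp/scheduler.bin

# Option 2: start the scheduler under memray from the beginning.
memray run -o /tmp/scheduler.bin -m airflow scheduler

# After collecting for a while, render the capture as a flamegraph
# to see which allocation sites hold the growing memory.
memray flamegraph /tmp/scheduler.bin
```

The flamegraph output is what a new issue would ideally include, since it shows whether the growth is in Python allocations at all.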
rinzool commented on issue #28740: URL: https://github.com/apache/airflow/issues/28740#issuecomment-2314575437

Hi,

If it may help someone: I faced the same issue. My scheduler memory increased from 600MB to 1.3GB in 4 days (the `container_memory_working_set_bytes` metric, precisely). Using Airflow 2.10.0 and chart 1.15.0.

[screenshot: scheduler memory growth]

I tried a lot of things (exploring processes using `ps` and `htop`, cleaning logs manually, etc.); nothing worked. After re-re-re-reading the last message from @potiuk I focused on:

> Does it eventually cause your containers to crash with OOM (Out of Memory), or does the memory clear itself out when needed?

So I added a memory limit to see whether the pod goes OOM or not:

```yaml
scheduler:
  resources:
    requests:
      memory: 500Mi
    limits:
      memory: 1Gi
```

Here is the result:

[screenshot: memory usage after setting the limit]

No OOM; the pod clears memory itself when it gets too close to the limit. Hope this message can help!
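The behavior described above (memory that clears itself near the limit) can be checked from inside the container by looking at the cgroup accounting. A hedged sketch, assuming cgroup v2 is mounted at /sys/fs/cgroup in the pod; on cgroup v1 the file names differ:

```shell
# Total memory charged to the container, including page cache.
cat /sys/fs/cgroup/memory.current

# Breakdown: "anon" is process heap, "file" is page cache, and
# "inactive_file" is the portion the kernel reclaims first under pressure.
grep -E '^(anon|file|inactive_file|active_file) ' /sys/fs/cgroup/memory.stat
```

If most of the growth shows up under `file`, the apparent leak is reclaimable cache, which is consistent with the pod shedding memory as it approaches the limit rather than being OOMKilled.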
