Re: [I] Airflow dag processor exits with too many open files after sometime [airflow]
kaxil commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2877549548 I have a fix in https://github.com/apache/airflow/pull/50558 too that should fix the DAG File processor. I added notes in the PR description but adding them here too: In a follow-up PR, I will add some cleanup to `InProcessExecutionAPI` itself so our tests have no side-effects when used with `dag.test` (which was just ported over a few days back). The Triggerer is the other process that uses it, but because it is a long-running process that doesn't spawn other processes like the DAG File processor does, it remains unaffected. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
tirkarthi commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2877308467 Thanks @zachliu for the validation. From my understanding this could also cause issues in the triggerer for certain triggers that access variables. I will validate and raise a PR. https://github.com/apache/airflow/blob/e6430c25cdb00d8cf56093deb1b1561ae615a5f6/airflow-core/src/airflow/jobs/triggerer_job_runner.py#L353
potiuk commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2873858441 Good job @tirkarthi !
zachliu commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2873852497 comrades, i'm happy to report that adding `@functools.cache` on top of `def in_process_api_server()` is able to solve both this issue and the [memory issue](https://github.com/apache/airflow/issues/49887#issuecomment-2852313264)! 🎉 
zachliu commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2872974178 @tirkarthi 👍 gonna test it out today on top of airflow 3.0.1
tirkarthi commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2868630859

@zachliu I was checking on this yesterday. Can you please try the patch below? I tried the sample DAG with 100 copies. My assumption as well is that the transport object, which is created per client, could be what keeps the socket open. That's why this is prevalent in DAGs that use connections, variables, etc., which trigger client creation and subsequently transport creation, which I feel could be cached.

```diff
diff --git a/airflow-core/src/airflow/dag_processing/processor.py b/airflow-core/src/airflow/dag_processing/processor.py
index 73d2c23c7f..19dd3e018c 100644
--- a/airflow-core/src/airflow/dag_processing/processor.py
+++ b/airflow-core/src/airflow/dag_processing/processor.py
@@ -213,6 +213,7 @@ class DagFileParsingResult(BaseModel):
     type: Literal["DagFileParsingResult"] = "DagFileParsingResult"
 
+@functools.cache
 def in_process_api_server() -> InProcessExecutionAPI:
     from airflow.api_fastapi.execution_api.app import InProcessExecutionAPI
```

```python
# 1.py
from airflow.sdk import Variable
from airflow.models.dag import DAG

email = Variable.get(
    "email",
    default=["[email protected]"],
    deserialize_json=False,
)
```

100 copies of the same file in the dags folder:

```
seq 1 100 | xargs -I{} cp ~/airflow/dags/1.py ~/airflow/dags/{}.py
```

```
pgrep -f dag-processor | xargs -I{} ls -l /proc/{}/fd/ | grep -i socket
```
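The effect of the one-line patch above can be sketched without Airflow at all. This is a minimal illustration (the `FakeAPIServer` class is a stand-in, not Airflow's `InProcessExecutionAPI`): `functools.cache` makes every caller share one instance of whatever the factory builds, so per-call transport/socket creation stops.

```python
import functools


class FakeAPIServer:
    """Stand-in for InProcessExecutionAPI; the real class owns sockets and threads."""

    instances = 0

    def __init__(self):
        FakeAPIServer.instances += 1


@functools.cache
def in_process_api_server() -> FakeAPIServer:
    # With the cache, the body runs once; every later call returns the same
    # object instead of constructing a new server (and its transport).
    return FakeAPIServer()


# Three "clients" asking for the server only ever construct it once.
a, b, c = in_process_api_server(), in_process_api_server(), in_process_api_server()
assert a is b is c
assert FakeAPIServer.instances == 1
```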
uranusjr commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2868084869 It’s plausible, `ASGIMiddleware` basically starts a thread that runs forever in the background. Not sure if the thread ever ends or how.
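The concern here can be illustrated generically (this sketch is not `ASGIMiddleware` itself): a background thread with no termination signal runs for the life of the process and pins whatever resources it references, so if one is started per parsed DAG file, they accumulate.

```python
import threading
import time

# A stop signal is the piece the comment above is asking about: without one,
# the worker loop below would run until the process exits.
stop = threading.Event()


def run_forever():
    # Stand-in for a background server loop; it only exits when told to.
    while not stop.is_set():
        time.sleep(0.01)


t = threading.Thread(target=run_forever, daemon=True)
t.start()
assert t.is_alive()  # the thread (and anything it holds) stays alive

stop.set()           # the open question in the thread: who ever does this?
t.join(timeout=1)
assert not t.is_alive()
```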
zachliu commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2867335186

> Or… could it be this?
>
> https://github.com/apache/airflow/blob/ef2da6f1efd6606e424964dec2a184a3a8521e27/task-sdk/src/airflow/sdk/execution_time/supervisor.py#L1446
>
> `log = structlog.get_logger(logger_name="supervisor")`
>
> Getting a logger triggers its creation, and that might create a handle to the log file without closing. (I have no idea, just want to keep all possibilities open.)

@uranusjr i commented them out yet `lsof` still keeps growing
zachliu commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2867184079

> No secret backend — I’m using a fairly standard Airflow setup. The DAGs are stored in a Git repository, and the metadata database is Postgres.

thanks man, this saves me the trouble of removing and testing mine 👍

```
[secrets]
backend = airflow.providers.amazon.aws.secrets.secrets_manager.SecretsManagerBackend
```
dramis commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2866235236 You're right, my initial implementation wasn't great — using an HTTP connection just to pass a username and password was a bit of an overkill. I've updated it to use a Jinja template variable instead. It's much cleaner and more appropriate now.
uranusjr commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2866700424 Note that @zachliu’s repro does not involve `get_connection`, but `Variable.get`. Reading the implementation of each, my current top suspicion is the secrets backend (both eventually do the same cache check - load secret backends - use backend - save to cache combo). It’s not yet clear which part of the process is problematic.
dramis commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2866842417 No secret backend — I’m using a fairly standard Airflow setup. The DAGs are stored in a Git repository, and the metadata database is Postgres.
uranusjr commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2866765467 Or… could it be this? https://github.com/apache/airflow/blob/ef2da6f1efd6606e424964dec2a184a3a8521e27/task-sdk/src/airflow/sdk/execution_time/supervisor.py#L1446 Getting a logger triggers its creation, and that might create a handle to the log file without closing. (I have no idea, just want to raise all possibilities.)
potiuk commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2866708492 > Note that [@zachliu](https://github.com/zachliu)’s repro does not involve `get_connection`, but `Variable.get`. Reading the implementation of each, my current top suspicion is the secrets backend, most likely somewhere in `ensure_secrets_backend_loaded` or secret backend classes. Yeah. That's a good clue - do you have Secrets Backends configured @zachliu @dramis ?
dramis commented on issue #49887:
URL: https://github.com/apache/airflow/issues/49887#issuecomment-2865907609
Here is a minimal DAG that triggers the issue:
```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.utils.task_group import TaskGroup
from airflow.providers.standard.operators.bash import BashOperator
from airflow.providers.standard.operators.python import PythonOperator
from airflow.hooks.base import BaseHook

default_args = {
}


# === Utility function ===
def _get_credential_from_conn(conn_id: str, **op_kwargs):
    now = datetime.now().isoformat()
    with open('/tmp/runnint.txt', 'a') as f:
        f.write(f"running {now}\n")
    conn = BaseHook.get_connection(conn_id)
    return f'-u {conn.login} -p {conn.password}'


# === DAG Definition ===
with DAG(
    dag_id="test-error",
    start_date=datetime(2025, 4, 28),
    schedule="15 6 * * *",
    catchup=True,
    default_args=default_args,
    render_template_as_native_obj=True,
    tags=['elasticsearch', 'drk-1'],
) as dag:
    get_hydroqc_data_hourly = BashOperator(
        task_id="get_hydroqc_data_hourly",
        bash_command=f"echo {_get_credential_from_conn('test_endpoint')}",
    )

    get_hydroqc_data_hourly
```
The `test_endpoint` connection is an HTTP connection with only the login and password fields filled in. A line is appended to `/tmp/runnint.txt` only when the DAG is actually executed. However, as soon as the DAG processor loads the DAG file, the number of open files on the system gradually increases.
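One lightweight way to watch that growth from inside the container (Linux only, since it reads `/proc`; this snippet is a suggestion, not from the thread) is to count the entries under a process's `fd` directory before and after each re-parse:

```python
import os


def open_fd_count(pid: str = "self") -> int:
    """Count open file descriptors of a process via /proc (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


# Simulate a leak with a pipe: two new descriptors appear, and closing both
# ends brings the count back down. Sampling open_fd_count() for the real
# dag-processor PID would show a count that only ever goes up.
before = open_fd_count()
r, w = os.pipe()
leaked = open_fd_count() - before
os.close(r)
os.close(w)

assert leaked == 2
assert open_fd_count() == before
```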
uranusjr commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2865089634 @dramis Is that utility function you posted run at parse time? (i.e. at top level, not inside a task function) Some example code showing how you’re calling the function would be very helpful to identify the pattern.
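The parse-time vs task-time distinction being asked about can be shown in isolation. This is an illustrative contrast, not code from the thread (the helper and the Jinja field names are hypothetical): an f-string resolves the lookup the moment the DAG file is imported, i.e. on every re-parse by the DAG processor, while a Jinja template is stored as a plain string and only rendered when the task runs.

```python
def _get_credentials(conn_id: str) -> str:
    # Stand-in for BaseHook.get_connection(conn_id); the real call would open
    # a client connection during parsing.
    return f"-u user -p secret ({conn_id})"


# Parse-time: evaluated immediately, on every re-parse of the file.
parse_time_command = f"echo {_get_credentials('test_endpoint')}"

# Runtime: kept as an unrendered template until the task instance executes.
runtime_command = (
    "echo -u {{ conn.test_endpoint.login }} -p {{ conn.test_endpoint.password }}"
)

assert "secret" in parse_time_command  # credentials already resolved
assert "{{" in runtime_command         # still an unrendered template
```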
dramis commented on issue #49887:
URL: https://github.com/apache/airflow/issues/49887#issuecomment-2864753569
I apologize, but the issue is not related to the use of variables in the
Jinja template. I ran a few tests on my side using a similar DAG and wasn't
able to reproduce the issue initially. Then I reinstalled a fresh Airflow 3
instance and added my DAGs one by one, which helped me identify the source of
the problem.
In some of my DAGs, I was using the following function to retrieve
credentials from a connection, and passing the result to a BashOperator:
```python
# --- Utilities ---
def _get_credential_from_conn(conn_id: str, **op_kwargs):
    conn = BaseHook.get_connection(conn_id)
    return f"-u {conn.login} -p {conn.password}"
```
This worked fine in Airflow 2, but it seems to break in Airflow 3. I've since updated those DAGs to use a variable in a Jinja template instead, and everything works properly.
I can also confirm that I’m seeing the same behavior reported by @zachliu,
even with just a single DAG using Variable.get() on a clean Airflow 3 instance.
zachliu commented on issue #49887:
URL: https://github.com/apache/airflow/issues/49887#issuecomment-2864362630
@dramis i was unable to reproduce the issue using this minimal DAG (the
output of `lsof` is small and stable):
```python
"""Test"""
# airflow DAG
from airflow.models.dag import DAG
from airflow.providers.standard.operators.bash import BashOperator
from airflow.timetables.trigger import CronTriggerTimetable
from pendulum import datetime

LOCAL_TZ = "UTC"

with DAG(
    dag_id="dummy_daily",
    schedule=CronTriggerTimetable("@daily", timezone=LOCAL_TZ),
    start_date=datetime(2025, 5, 1, tz=LOCAL_TZ),
    max_active_runs=3,
    catchup=False,
    tags=["flag"],
    default_args={"email": "{{ var.value.email }}"},
):
    BashOperator(
        task_id="print",
        bash_command="echo Hello!",
    )
```
however i was able to reproduce using this DAG (the output of `lsof` grows
at every `reparse`):
```python
"""Test"""
# airflow DAG
from airflow.sdk import Variable
from airflow.models.dag import DAG
from airflow.providers.standard.operators.bash import BashOperator
from airflow.timetables.trigger import CronTriggerTimetable
from pendulum import datetime

LOCAL_TZ = "UTC"

email = Variable.get(
    "email",
    default=["[email protected]"],
    deserialize_json=False,
)

with DAG(
    dag_id="dummy_daily",
    schedule=CronTriggerTimetable("@daily", timezone=LOCAL_TZ),
    start_date=datetime(2025, 5, 1, tz=LOCAL_TZ),
    max_active_runs=3,
    catchup=False,
    tags=["flag"],
    default_args={"email": email},
):
    BashOperator(
        task_id="print",
        bash_command="echo Hello!",
    )
```
ashb commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2864242203 @dramis Do you have a minimal reproduction DAG you can share by any chance?
dramis commented on issue #49887:
URL: https://github.com/apache/airflow/issues/49887#issuecomment-2864075758
i’m having the same problem, all my dags use a variable in a jinja template like this: `on_failure_callback=send_smtp_notification_failure(to='{{ var.value.default_to_email }}')`, and i do not have any import of Variable.
zachliu commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2863690475 oh 💩 wtf, `Variable.get()` has 2 different interfaces: * `from airflow.models import Variable` has the `default_var` arg * `from airflow.sdk import Variable` has the `default` arg that's why using `from airflow.sdk import Variable` causes no "leaks" because there is an import error... 🤦
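The interface mismatch above is easy to reproduce in miniature. These classes are minimal stand-ins (the real ones are `airflow.models.Variable` and `airflow.sdk.Variable`); the point is that passing the legacy `default_var` keyword to the SDK signature raises a `TypeError` at parse time, which is what masked the leak:

```python
class LegacyVariable:
    """Stand-in for airflow.models.Variable."""

    @staticmethod
    def get(key, default_var=None, deserialize_json=False):
        return default_var  # legacy keyword: default_var


class SdkVariable:
    """Stand-in for airflow.sdk.Variable."""

    @staticmethod
    def get(key, default=None, deserialize_json=False):
        return default      # Task SDK keyword: default


assert LegacyVariable.get("email", default_var=["x"]) == ["x"]
assert SdkVariable.get("email", default=["x"]) == ["x"]

# Mixing the keywords fails immediately, so the DAG never parses far enough
# to trigger the FD growth.
try:
    SdkVariable.get("email", default_var=["x"])
except TypeError:
    pass
else:
    raise AssertionError("expected TypeError")
```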
zachliu commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2863511541 @potiuk > do you have any custom configuration (logging for example) in your airflow settings? no, i already turned off all my custom configs (logging, oauth, etc.) in order to properly identify the root cause > do you have the setting for variable caching turned on ? ditto just to be clear, the import alone doesn't "leak" anything, you'd have to call the `Variable.get(...)` somewhere in the dag
potiuk commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2862104421 > It’s likely something in `airflow/__init__.py`, that file is quite dirty and has a ton of side effects. I doubt it's directly `__init__.py` - maybe some lazy thing that gets initialized later - the init file is always imported anyway by the sheer fact that we use DAG classes - basically airflow's init is imported the first time we use any class from airflow - including the Task SDK. We should indeed get rid of the side effects of this module - but it's not likely the cause
potiuk commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2862081171 Also - do you have the setting for variable caching turned on ? https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#use-cache - and if yes - what happens when you disable it ?
potiuk commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2862071575 Do you have any custom configuration (logging for example) in your airflow settings @zachliu ?
uranusjr commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2862002951 > can't believe `from airflow.models import Variable` is the root cause Interesting… It’s likely something in `airflow/__init__.py`, that file is quite dirty and has a ton of side effects.
zachliu commented on issue #49887:
URL: https://github.com/apache/airflow/issues/49887#issuecomment-2859981811
i think i've found the problem: as long as `Variable` is used, `lsof` results (`type=STREAM (CONNECTED)`) keep growing:
```python
"""Test"""
# airflow DAG
from airflow.models import Variable
from airflow.models.dag import DAG
from airflow.operators.empty import EmptyOperator
from airflow.timetables.trigger import CronTriggerTimetable
from pendulum import datetime

LOCAL_TZ = "UTC"

accounts = Variable.get(
    "dummy_accounts",
    default_var=["dummy"],
    deserialize_json=True,
)

with DAG(
    dag_id="dummy_daily",
    schedule=CronTriggerTimetable("@daily", timezone=LOCAL_TZ),
    start_date=datetime(2025, 5, 1, tz=LOCAL_TZ),
    max_active_runs=3,
    catchup=False,
    tags=["flag"],
):
    for account in accounts:
        step_1 = EmptyOperator(task_id=f"{account}_step_1")
        step_2 = EmptyOperator(task_id=f"{account}_step_2")
        step_1 >> step_2
```
replace the `Variable.get()` call with `accounts = ["dummy", "drummer"]` and everything is hunky-dory
ashb commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2854067874 Does anyone have any reproduction dags that you can share so we can reproduce this? I think that `close_unused_sockets` path is okay (but I have not confirmed as such) since if that doesn't work then the sockets are kept open by the parent, and they never send an EoF and thus the dag processors would fail to break this loop https://github.com/apache/airflow/blob/3198aad1bfd7cbf8b3de56c482dc9d96e00d3a69/task-sdk/src/airflow/sdk/execution_time/supervisor.py#L879 At least that is what my head says right now, but I don't fully trust it yet after a week off.
potiuk commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2853440427

I might take a look later, also to get familiar with the code, but possibly you will know by heart what could be happening - I saw a few cases where we are closing some sockets forcibly:

```python
self._close_unused_sockets(self.stdin)
# Put a message in the viewable task logs
```

```python
@staticmethod
def _close_unused_sockets(*sockets):
    """Close unused ends of sockets after fork."""
    for sock in sockets:
        if isinstance(sock, SocketIO):
            # If we have the socket IO object, we need to close the underlying socket forcibly here too,
            # else we get unclosed socket warnings, and likely leaking FDs too
            sock._sock.close()
        sock.close()
```

With the FD leaking possibility mentioned there, I guess that's just another instance of that one.
potiuk commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2853436137 @kaxil @ashb @amoghrajesh -> From the above it looks like we are leaking unclosed pipes for the communication between supervisor and the dag processor. - I have not looked very closely, as I am not familiar with the new implementation.
zachliu commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2852313264 `lsof > files_$(date +%Y%m%d_%H%M%S).txt` in the dag-processor container at 40~50 second intervals from the start: [files_20250505_185948.txt](https://github.com/user-attachments/files/20047885/files_20250505_185948_masked.txt) [files_20250505_190012.txt](https://github.com/user-attachments/files/20047884/files_20250505_190012_masked.txt) [files_20250505_190100.txt](https://github.com/user-attachments/files/20047891/files_20250505_190100_masked.txt) [files_20250505_190151.txt](https://github.com/user-attachments/files/20047889/files_20250505_190151_masked.txt) at this point, 50% of the entries are `type=STREAM (CONNECTED)`; the file size doubles every minute or so, and the growth rate roughly matches the memory increase 
zachliu commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2851979694 sure, i'm updating my Dockerfile to allow `root` and install `lsof` 👍
vatsrahul1001 commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2850418286 > Another suggestion: can anyone who experiences the growth of open files run (with root) `lsof` a few times before the limit is exceeded, and dump it somewhere with information on timing and growth? There are quite a few options that can be passed to `lsof`, but seeing the "growth" - i.e. seeing what kinds of open files are growing, and clearly that they are growing - might help to narrow down the reason. And if we see the kinds of open files growing, and possibly what is opening them, we can likely run a few more detailed `lsof` commands to get more details. @zachliu can you try this?
potiuk commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2850010814 Another suggestion: can anyone who experiences the growth of open files run (with root) `lsof` a few times before the limit is exceeded, and dump it somewhere with information on timing and growth? There are quite a few options that can be passed to `lsof`, but seeing the "growth" - i.e. seeing what kinds of open files are growing, and clearly that they are growing - might help to narrow down the reason. And if we see the kinds of open files growing, and possibly what is opening them, we can likely run a few more detailed `lsof` commands to get more details.
vatsrahul1001 commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2849992816 > [@asasisekar](https://github.com/asasisekar) [@vatsrahul1001](https://github.com/vatsrahul1001) have you also experienced high cpu/memory usage in the meantime? I did not notice in my env
Re: [I] Airflow dag processor exits with too many open files after sometime [airflow]
zachliu commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2847388916 @asasisekar @vatsrahul1001 have you also experienced high cpu/memory usage in the meantime?
Re: [I] Airflow dag processor exits with too many open files after sometime [airflow]
vatsrahul1001 commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2846636804 I wasn’t able to reproduce this issue in the Breeze environment, even after executing 100 DAGs with a sleep interval. I also used a variable within the DAG to confirm the behavior. I am still investigating and trying to reproduce the issue. [throughput.txt](https://github.com/user-attachments/files/20010251/throughput.txt)
Re: [I] Airflow dag processor exits with too many open files after sometime [airflow]
uranusjr commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2838232601 It would be helpful if we had some DAGs that could reproduce similar behaviour.
Re: [I] Airflow dag processor exits with too many open files after sometime [airflow]
zachliu commented on issue #49887:
URL: https://github.com/apache/airflow/issues/49887#issuecomment-2836560479
Running `watch "pgrep -f dag-processor | xargs -I{} ls -l /proc/{}/fd/ >> fd.log"` accumulated 1.3 MB of `fd.log` with 18k rows in 20 seconds, 12k of which are `socket: [...]`. I have merely ~20 active DAGs plus ~300 disabled DAGs. This might explain why the new dag-processor is both a CPU hog and a memory hog (see my Datadog metric screenshot at https://github.com/apache/airflow/issues/49650#issuecomment-2836013393).
Re: [I] Airflow dag processor exits with too many open files after sometime [airflow]
asasisekar commented on issue #49887:
URL: https://github.com/apache/airflow/issues/49887#issuecomment-2835726633
@tirkarthi When running with 100 DAGs, the DAG processor fails with "too many open files" within a few minutes, with around 390 files open:
```
lrwx------ 1 svc svc 64 Apr 28 16:26 390 -> socket:[2348769229]
lrwx------ 1 svc svc 64 Apr 28 16:26 392 -> socket:[2348769231]
```
The DAGs use a lot of variables. Is that creating too many open files? The DAG processor log contains a lot of HTTP calls like these:
```
[2025-04-28T16:25:37.142+0100] {_client.py:1026} INFO - HTTP Request: GET http://in-process.invalid./variables/ "HTTP/1.1 200 OK"
[2025-04-28T16:25:37.152+0100] {_client.py:1026} INFO - HTTP Request: GET http://in-process.invalid./variables/ "HTTP/1.1 200 OK"
```
Re: [I] Airflow dag processor exits with too many open files after sometime [airflow]
vatsrahul1001 commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2835247984 @tirkarthi This looks like the same issue as https://github.com/apache/airflow/issues/46048.
Re: [I] Airflow dag processor exits with too many open files after sometime [airflow]
boring-cyborg[bot] commented on issue #49887: URL: https://github.com/apache/airflow/issues/49887#issuecomment-2835158216 Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise a PR to address this issue, please do so; no need to wait for approval.
Re: [I] Airflow dag processor exits with too many open files after sometime [airflow]
tirkarthi closed issue #46048: Airflow dag processor exits with too many open files after sometime URL: https://github.com/apache/airflow/issues/46048
Re: [I] Airflow dag processor exits with too many open files after sometime [airflow]
tirkarthi commented on issue #46048: URL: https://github.com/apache/airflow/issues/46048#issuecomment-2710423921 It looks like the log file is opened here but not closed properly, leading to `structlog` holding on to the file: https://github.com/apache/airflow/blob/b2c646af408d2ea70c82aa47fb3f2f68c777fd7e/airflow/dag_processing/manager.py#L830-L834
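The open-but-never-closed pattern described above is easy to reproduce and to fix in miniature. A minimal sketch (the path and handle counts are illustrative, not Airflow's actual layout):

```python
import os

LOG_PATH = "/tmp/example_dag.py.log"  # illustrative path, not Airflow's

def fds_for(path):
    """Count this process's descriptors pointing at `path` (Linux /proc)."""
    real = os.path.realpath(path)
    fd_dir = "/proc/self/fd"
    n = 0
    for fd in os.listdir(fd_dir):
        try:
            if os.readlink(os.path.join(fd_dir, fd)) == real:
                n += 1
        except OSError:
            pass  # descriptor vanished mid-scan
    return n

# Leaky pattern: a fresh handle per parse round that nothing ever closes.
handles = [open(LOG_PATH, "a") for _ in range(5)]
open_count = fds_for(LOG_PATH)

# Fix: close the handle when the round ends, or scope it in a with-block.
for h in handles:
    h.close()
with open(LOG_PATH, "a") as log:
    log.write("parse round done\n")
closed_count = fds_for(LOG_PATH)
print(open_count, closed_count)
```

One leaked handle per parsed file per round matches the symptom in the next comment: the same `.py.log` target appearing under many descriptor numbers.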
Re: [I] Airflow dag processor exits with too many open files after sometime [airflow]
tirkarthi commented on issue #46048: URL: https://github.com/apache/airflow/issues/46048#issuecomment-2710355051 It looks like some log files are not being closed. https://unix.stackexchange.com/questions/66235/how-to-display-open-file-descriptors-but-not-using-lsof-command
```
ls -l 14300/fd/ | grep -i example_assets
l-wx------ 1 karthikeyan karthikeyan 64 Mar 10 17:30 122 -> /home/karthikeyan/airflow/logs/scheduler/2025-03-10/example_dags/example_assets.py.log
l-wx------ 1 karthikeyan karthikeyan 64 Mar 10 17:31 214 -> /home/karthikeyan/airflow/logs/scheduler/2025-03-10/example_dags/example_assets.py.log
l-wx------ 1 karthikeyan karthikeyan 64 Mar 10 17:29 30 -> /home/karthikeyan/airflow/logs/scheduler/2025-03-10/example_dags/example_assets.py.log
l-wx------ 1 karthikeyan karthikeyan 64 Mar 10 17:31 306 -> /home/karthikeyan/airflow/logs/scheduler/2025-03-10/example_dags/example_assets.py.log
l-wx------ 1 karthikeyan karthikeyan 64 Mar 10 17:31 398 -> /home/karthikeyan/airflow/logs/scheduler/2025-03-10/example_dags/example_assets.py.log
```
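Duplicate targets like the ones above can be surfaced automatically from `ls -l /proc/<pid>/fd` output. A small sketch (the sample paths below are shortened stand-ins for the real log paths):

```python
from collections import Counter

def duplicate_targets(ls_output):
    """Return symlink targets that more than one descriptor points at."""
    counts = Counter()
    for line in ls_output.splitlines():
        # Symlink lines in `ls -l` look like: "... 122 -> /path/to/file.log"
        if " -> " in line:
            counts[line.split(" -> ", 1)[1].strip()] += 1
    return {target: n for target, n in counts.items() if n > 1}

sample = """\
l-wx------ 1 u u 64 Mar 10 17:30 122 -> /logs/example_assets.py.log
l-wx------ 1 u u 64 Mar 10 17:31 214 -> /logs/example_assets.py.log
lrwx------ 1 u u 64 Mar 10 17:31 3 -> socket:[123]
"""
print(duplicate_targets(sample))  # {'/logs/example_assets.py.log': 2}
```

Any target that keeps gaining entries across snapshots is the file (or socket inode) being leaked.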
Re: [I] Airflow dag processor exits with too many open files after sometime [airflow]
tirkarthi commented on issue #46048: URL: https://github.com/apache/airflow/issues/46048#issuecomment-2710344142 Reopening this since I am still seeing it after running the dag-processor locally on main for some time. It would be helpful if anyone can confirm whether the file count stops increasing in the watch command as the dag-processor runs for a while.
Re: [I] Airflow dag processor exits with too many open files after sometime [airflow]
ashb closed issue #46048: Airflow dag processor exits with too many open files after sometime URL: https://github.com/apache/airflow/issues/46048
Re: [I] Airflow dag processor exits with too many open files after sometime [airflow]
tirkarthi commented on issue #46048: URL: https://github.com/apache/airflow/issues/46048#issuecomment-2706531568 @ashb I can see the resource warnings are gone with #47462, but the count still keeps increasing in the watch command for the dag-processor pid, and the process eventually exits. I am on Ubuntu 20.04.3, on main branch commit 85c3fbac2df1fd4c6209b80ac2fd034dbdb131b6. Are you able to reproduce the file count increase?
Re: [I] Airflow dag processor exits with too many open files after sometime [airflow]
jedcunningham commented on issue #46048: URL: https://github.com/apache/airflow/issues/46048#issuecomment-2614705629 Thanks @tirkarthi. I'll look into it this coming week.
Re: [I] Airflow dag processor exits with too many open files after sometime [airflow]
tirkarthi commented on issue #46048: URL: https://github.com/apache/airflow/issues/46048#issuecomment-2614198634 cc: @kaxil @ashb @jedcunningham
