potiuk opened a new issue, #35254: URL: https://github.com/apache/airflow/issues/35254
### Body The test `airfow/tests/jobs/test_triggerrer_job.py::test_trigger_logging_sensitive_info` has a weird and race'y behaviour - that got exposed when implementing #83221. As a result it's been quarantined until we diagnose/fix it. It's very easy to reproduce the racy behaviour, but the root cause is not yet certain: 1) Enter Breeze (might be any Python version and DB but it has been confirmed with Python 3.8 and Sqlite, Postgres) 2) Run `pytest tests/jobs/test_triggerer_job.py::test_trigger_logging_sensitive_info` 3) The test fails because the logs that the test gets are empty 4) Run it again `tests/jobs/test_triggerer_job.py::test_trigger_logging_sensitive_info` 5) it succeds (and continues doing so until you restart breeze or delete /root/airflow/.airflow_db_initialised 6) When you delete the `/root/airflow/.airflow_db_initialised` the test fails again The presence of /root/airflow/.airflow_db_initialised means that airflow DB has been "reset" at least once by the tests. We have pytest fixture that checks if the file is created at least once and in case it has not been created it will run `initial_db_init` while setting up the tests. In `tests/conftest.py`. This avoids the problem that if semeone never initialized the DB, they will get strange DB errors (missing columns/indexes) and it is very confusing for first time users or when you delete local test DB. This is done in this code: ```python @pytest.fixture(autouse=True, scope="session") def initialize_airflow_tests(request): ... ,,, elif not os.path.exists(lock_file): print( "Initializing the DB - first time after entering the container.\n" "You can force re-initialization the database by adding --with-db-init switch to run-tests." ) initial_db_init() # Create pid file with open(lock_file, "w+"): pass else: ``` In some cases / some machines just commenting out `db.resetdb()` that is run inside `initial_db_init` cause the test to suceed even the first time, but this behaviour is inconsistent - sometims it does not help, which suggest that this is some kind of "startup" race of triggerer log handler - where simply adding more intialization/CPU/disk usage at the startup of tests triggers the handler to either miss or loose the connection. Also there is mysterious stacktrace printed (but it is printed in both cases - when test works and does not work, that sugggests that this is the case and that it is connected with some race in the in-memory handler for logs, either wiht not catching or dropping logs bacause of some race at startup. I tried to debug it but did not have much luck so far - except being able to narrow it down and produce a very esily reproducible scenario. ```python tests/jobs/test_triggerer_job.py::test_trigger_logging_sensitive_info /usr/local/lib/python3.8/site-packages/_pytest/threadexception.py:73: PytestUnhandledThreadExceptionWarning: Exception in thread Thread-3 Traceback (most recent call last): File "/usr/local/lib/python3.8/threading.py", line 932, in _bootstrap_inner self.run() File "/usr/local/lib/python3.8/threading.py", line 870, in run self._target(*self._args, **self._kwargs) File "/usr/local/lib/python3.8/logging/handlers.py", line 1487, in _monitor self.handle(record) File "/usr/local/lib/python3.8/logging/handlers.py", line 1468, in handle handler.handle(record) File "/opt/airflow/airflow/utils/log/trigger_handler.py", line 104, in handle self.emit(record) File "/opt/airflow/airflow/utils/log/trigger_handler.py", line 93, in emit h = self._get_or_create_handler(record.trigger_id, record.task_instance) File "/opt/airflow/airflow/utils/log/trigger_handler.py", line 89, in _get_or_create_handler self.handlers[trigger_id] = self._make_handler(ti) File "/opt/airflow/airflow/utils/log/trigger_handler.py", line 84, in _make_handler h.set_context(ti=ti) File "/opt/airflow/airflow/utils/log/file_task_handler.py", line 185, in set_context local_loc = self._init_file(ti) File "/opt/airflow/airflow/utils/log/file_task_handler.py", line 478, in _init_file full_path = self.add_triggerer_suffix(full_path=full_path, job_id=ti.triggerer_job.id) AttributeError: 'NoneType' object has no attribute 'id' ``` cc: @dstandish @hussein-awala Also see https://github.com/apache/airflow/pull/35160#discussion_r1375463230 ### Committer - [X] I acknowledge that I am a maintainer/committer of the Apache Airflow project. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org