potiuk opened a new issue, #35254:
URL: https://github.com/apache/airflow/issues/35254

   ### Body
   
   The test 
`airfow/tests/jobs/test_triggerrer_job.py::test_trigger_logging_sensitive_info` 
has a weird and race'y behaviour - that got exposed when implementing #83221. 
As a result it's been quarantined until we diagnose/fix it.
   
   It's very easy to reproduce the racy behaviour, but the root cause is not 
yet certain:
   
   1) Enter Breeze (might be any Python version and DB but it has been 
confirmed with Python 3.8 and Sqlite, Postgres)
   2) Run `pytest 
tests/jobs/test_triggerer_job.py::test_trigger_logging_sensitive_info` 
   3) The test fails because the logs that the test gets are empty 
   4) Run it again 
`tests/jobs/test_triggerer_job.py::test_trigger_logging_sensitive_info` 
   5) it succeds (and continues doing so until you restart breeze or delete 
/root/airflow/.airflow_db_initialised
   6) When you delete the `/root/airflow/.airflow_db_initialised` the test 
fails again 
   
   The presence of /root/airflow/.airflow_db_initialised means that airflow DB 
has been "reset" at least once by the tests. We have pytest fixture that checks 
if the file is created at least once and in case it has not been created it 
will run `initial_db_init` while setting up the tests.  In `tests/conftest.py`. 
This avoids the problem that if semeone never initialized the DB, they will get 
strange DB errors (missing columns/indexes) and it is very confusing for first 
time users or when you delete local test DB.
   
   This is done in this code:
   
   ```python
   @pytest.fixture(autouse=True, scope="session")
   def initialize_airflow_tests(request):
        ...
          ,,,
           elif not os.path.exists(lock_file):
           print(
               "Initializing the DB - first time after entering the 
container.\n"
               "You can force re-initialization the database by adding 
--with-db-init switch to run-tests."
           )
           initial_db_init()
           # Create pid file
           with open(lock_file, "w+"):
               pass
       else:
   ```
   
   In some cases / some machines just commenting out `db.resetdb()` that is run 
inside `initial_db_init` cause the test to suceed even the first time, but this 
behaviour is inconsistent - sometims it does not help, which suggest that this 
is some kind of "startup" race of triggerer log handler - where simply adding 
more intialization/CPU/disk usage at the startup of tests triggers the handler 
to either miss or loose the connection.
   
   Also there is mysterious stacktrace printed (but it is printed in both cases 
- when test works and does not work, that sugggests that this is the case and 
that it is connected with some race in the in-memory handler for logs, either 
wiht not catching or dropping logs bacause of some race at startup. I tried to 
debug it but did not have much luck so far - except being able to narrow it 
down and produce a very esily reproducible scenario.
   
   ```python
   tests/jobs/test_triggerer_job.py::test_trigger_logging_sensitive_info
     /usr/local/lib/python3.8/site-packages/_pytest/threadexception.py:73: 
PytestUnhandledThreadExceptionWarning: Exception in thread Thread-3
   
     Traceback (most recent call last):
       File "/usr/local/lib/python3.8/threading.py", line 932, in 
_bootstrap_inner
         self.run()
       File "/usr/local/lib/python3.8/threading.py", line 870, in run
         self._target(*self._args, **self._kwargs)
       File "/usr/local/lib/python3.8/logging/handlers.py", line 1487, in 
_monitor
         self.handle(record)
       File "/usr/local/lib/python3.8/logging/handlers.py", line 1468, in handle
         handler.handle(record)
       File "/opt/airflow/airflow/utils/log/trigger_handler.py", line 104, in 
handle
         self.emit(record)
       File "/opt/airflow/airflow/utils/log/trigger_handler.py", line 93, in 
emit
         h = self._get_or_create_handler(record.trigger_id, 
record.task_instance)
       File "/opt/airflow/airflow/utils/log/trigger_handler.py", line 89, in 
_get_or_create_handler
         self.handlers[trigger_id] = self._make_handler(ti)
       File "/opt/airflow/airflow/utils/log/trigger_handler.py", line 84, in 
_make_handler
         h.set_context(ti=ti)
       File "/opt/airflow/airflow/utils/log/file_task_handler.py", line 185, in 
set_context
         local_loc = self._init_file(ti)
       File "/opt/airflow/airflow/utils/log/file_task_handler.py", line 478, in 
_init_file
         full_path = self.add_triggerer_suffix(full_path=full_path, 
job_id=ti.triggerer_job.id)
     AttributeError: 'NoneType' object has no attribute 'id'
   ```
   
   cc: @dstandish @hussein-awala 
   
   Also see https://github.com/apache/airflow/pull/35160#discussion_r1375463230
   
   ### Committer
   
   - [X] I acknowledge that I am a maintainer/committer of the Apache Airflow 
project.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to