potiuk commented on PR #35181:
URL: https://github.com/apache/airflow/pull/35181#issuecomment-1788770992

   > This doesn't flush the existing metrics when you create a new instance? 
For example, graphs still have their data from before the flush??
   
   I believe there are virtually no metrics before that point. Daemonisation 
happens right at the very start of any component that enables it, and I think 
the problem this PR mainly solves is that the Statsd integration, when imported 
and initialized, saves some state and possibly starts a thread, both of which 
are then "lost" when daemonizing.
   
   Daemonisation does a few things:
   
   1) forks the process (twice I believe) to make sure that it detaches from 
the process that started it and moves itself to have "init" process as parent 
so that any signals (SIGTERM/SIGHUP) from the original terminal process do not 
kill the background process
   
   2) closes stdin/stdout and any open files to make sure it does not continue 
writing to / reading from the terminal that was used to start it and that the 
files are not accessed from multiple processes at once.
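
   The two steps above can be sketched roughly as follows. This is a minimal illustration of the classic double-fork pattern on a POSIX system, not the code Airflow or the daemonize library actually runs; the function name `daemonize_sketch` is made up for this example.

```python
import os


def daemonize_sketch():
    """Detach the calling process. Returns only in the final daemon process."""
    # First fork: the original process exits, so the child is re-parented
    # (typically to init) and the launching shell gets its prompt back.
    if os.fork() > 0:
        os._exit(0)

    # New session: the process no longer has the old controlling terminal,
    # so terminal-generated signals such as SIGHUP do not reach it.
    os.setsid()

    # Second fork: the session leader exits, so the surviving process
    # cannot reacquire a controlling terminal.
    if os.fork() > 0:
        os._exit(0)

    # Point stdin/stdout/stderr at /dev/null instead of the terminal.
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):
        os.dup2(devnull, fd)
    if devnull > 2:
        os.close(devnull)
    # Real daemonisation libraries also close other inherited descriptors
    # at this point; that step is omitted from this sketch.
```

   Anything the process opened before calling such a function is inherited across the forks, but a library that also closes inherited descriptors will invalidate it.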
   
   I think what happens here is that Statsd manages to open a socket / file 
descriptor before daemonization happens, and that socket gets closed when 
forking/daemonising. When daemonization happens, the parent/intermediate 
processes exit right after the fork. So even if Statsd managed to open 
the socket before daemonization, it gets closed right after (and by 
reinitialising Statsd in this PR we are reopening it in the forked process).
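
   The suspected failure mode can be reproduced in a single process, without forking. In this sketch a Unix datagram socket pair stands in for the Statsd client socket and its server (an assumption made so the example needs no network), and `os.close` plays the role of the daemoniser closing an inherited descriptor. The helper `try_send` and the payload are made up for illustration.

```python
import os
import socket


def try_send(sock, payload=b"example.counter:1|c"):
    """Return True if the datagram could be written to the socket."""
    try:
        sock.send(payload)
        return True
    except OSError:
        return False


# Stand-in for the socket the metrics client opens at import/initialisation.
client, server = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
sent_before = try_send(client)

# Simulate daemonisation closing the descriptor behind the client's back:
# the Python object survives, but the descriptor it wraps is gone.
os.close(client.fileno())
sent_after_close = try_send(client)
client.detach()  # forget the dead descriptor so it is not closed twice
server.close()

# "Reinitialising" amounts to opening a fresh socket after the close.
client, server = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
sent_after_reinit = try_send(client)

client.close()
server.close()
```

   If this is what happens to Statsd, the send after the close fails while the send after reinitialisation works, which would match the behaviour the PR fixes.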
   
   But writing this comment also makes me think: if we manage to find out which 
resource (likely a socket) Statsd opens and get hold of the file descriptor for 
it, there is a better way of handling it: 
https://daemonize.readthedocs.io/en/latest/ - you can add such a file descriptor 
to the "keep_fds" list of descriptors that should NOT be closed when forking, so 
that their ownership is passed to the forked process. So @pavansharma36 - maybe 
you can investigate what Statsd does under the hood that makes it stop working, 
find the descriptor it opens at initialization, and pass it in the keep_fds 
parameter?
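
   One possible way to do that investigation, sketched with the standard library only: snapshot which descriptors are sockets before and after the client is initialised, and treat the difference as the candidate keep_fds list. The helpers `open_socket_fds` and `fds_opened_by`, and the stand-in `init_metrics_client`, are hypothetical names for this example; the real Statsd client may open its socket lazily or in a different place.

```python
import os
import socket
import stat


def open_socket_fds(max_fd=1024):
    """Return the descriptor numbers in this process that refer to sockets."""
    fds = set()
    for fd in range(max_fd):
        try:
            mode = os.fstat(fd).st_mode
        except OSError:
            continue  # descriptor is not open
        if stat.S_ISSOCK(mode):
            fds.add(fd)
    return fds


def fds_opened_by(init):
    """Run init() and return (its result, socket fds that appeared meanwhile)."""
    before = open_socket_fds()
    result = init()
    return result, open_socket_fds() - before


def init_metrics_client():
    # Stand-in for initialising the Statsd client (assumed to open a UDP socket).
    return socket.socket(socket.AF_INET, socket.SOCK_DGRAM)


client, keep_fds = fds_opened_by(init_metrics_client)

# The result could then be handed to the daemonize library, along the lines of:
#   Daemonize(app="example", pid="/tmp/example.pid", action=main,
#             keep_fds=sorted(keep_fds))
# where app, pid and main are placeholders.
```

   This only finds sockets opened during initialisation; threads started by the client would still not survive the fork and would need separate handling.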

