lalalastella opened a new pull request, #5791:
URL: https://github.com/apache/texera/pull/5791

   ### What changes were proposed in this PR?
   
   **Root cause:** `time.time_ns()` is a wall-clock call that can go backward 
during NTP adjustments or inside VMs/containers.  In `MainLoop._switch_context` 
and `_process_dcm`, elapsed time is computed as `end_time - start_time` where 
both values come from `time.time_ns()`.  When the clock steps backward between 
the two calls, the difference is negative.  Passing a negative value to 
`StatisticsManager.increase_data_processing_time` raised `ValueError: Time must 
be non-negative`, which propagated into the DataProcessor thread and left the 
worker in a deadlocked wait.
   
   **Before → After**
   
   ```
   Before:
     start_time = time.time_ns()   # wall clock — can go backward
     ...
     end_time   = time.time_ns()   # can be < start_time after NTP step
     statistics_manager.increase_data_processing_time(end_time - start_time)
     #  → ValueError: Time must be non-negative
     #  → worker thread hangs (unhandled exception)
   
   After:
     start_time = time.monotonic_ns()   # guaranteed non-decreasing
     ...
     end_time   = time.monotonic_ns()   # always ≥ start_time
     statistics_manager.increase_data_processing_time(end_time - start_time)
     #  → always ≥ 0, no exception
   ```
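
To make the failure mode concrete, here is a minimal sketch, not code from this PR. `measure_elapsed_ns` and `SteppingWallClock` are hypothetical names introduced for illustration. The fake clock simulates a wall clock that steps backward between two reads, which is what produced the negative elapsed time.

```python
import time
from typing import Callable, Iterable


def measure_elapsed_ns(clock: Callable[[], int], work: Callable[[], None]) -> int:
    """Read the clock before and after `work` and return the difference in ns."""
    start = clock()
    work()
    return clock() - start


class SteppingWallClock:
    """Hypothetical fake wall clock that replays a fixed list of readings."""

    def __init__(self, readings: Iterable[int]) -> None:
        self._readings = iter(readings)

    def __call__(self) -> int:
        return next(self._readings)


# Simulated NTP step: the second reading is 1 ms earlier than the first.
fake_wall = SteppingWallClock([1_000_000_000, 999_000_000])
wall_elapsed = measure_elapsed_ns(fake_wall, lambda: None)

# The real monotonic clock cannot go backward between the two reads.
mono_elapsed = measure_elapsed_ns(time.monotonic_ns, lambda: None)

print(wall_elapsed)       # -1000000
print(mono_elapsed >= 0)  # True
```

Injecting the clock as a callable is only a device for simulating the backward step; the PR itself swaps the call in place.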
   
   **Changes (3 files):**
   
   | File | Change |
   |------|--------|
   | `core/runnables/main_loop.py` | Replace all 6 `time.time_ns()` calls with `time.monotonic_ns()` |
   | `core/architecture/managers/statistics_manager.py` | Defense-in-depth: log a warning and return early for negative elapsed times instead of raising `ValueError` |
   | `test/python/core/architecture/managers/test_statistics_manager.py` | Update `test_negative_time_raises` → `test_negative_time_is_clamped_to_zero` to match new behavior |
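
The defense-in-depth guard could look roughly like the sketch below. It is an assumption about the shape of the change, not the actual `StatisticsManager` code: the class name `StatisticsManagerSketch`, the attribute `data_processing_time_ns`, and the log message are all hypothetical. Returning early leaves the running total unchanged, which has the same effect as clamping the negative value to zero.

```python
import logging

logger = logging.getLogger("statistics_manager_sketch")


class StatisticsManagerSketch:
    """Hypothetical stand-in for illustration; not the real StatisticsManager."""

    def __init__(self) -> None:
        self.data_processing_time_ns = 0

    def increase_data_processing_time(self, elapsed_ns: int) -> None:
        # A negative value should no longer occur with monotonic timestamps,
        # but if it does, warn and skip instead of raising ValueError.
        if elapsed_ns < 0:
            logger.warning("Ignoring negative elapsed time: %d ns", elapsed_ns)
            return
        self.data_processing_time_ns += elapsed_ns


manager = StatisticsManagerSketch()
manager.increase_data_processing_time(500)
manager.increase_data_processing_time(-200)  # logs a warning, total unchanged
manager.increase_data_processing_time(300)
print(manager.data_processing_time_ns)  # 800
```

Warning instead of raising keeps a bad statistics sample from taking down the thread that records it, at the cost of silently dropping that sample from the total.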
   
   ### Any related issues, documentation, discussions?
   
   Closes #3768
   
   ### How was this PR tested?
   
   ```bash
   cd amber && pytest src/test/python/core/architecture/managers/test_statistics_manager.py -v
   ```
   
   (Requires `bin/python-proto-gen.sh` to have been run first, as in CI.)
   
   All existing `TestStatisticsManager*` tests pass.  The updated test 
`test_negative_time_is_clamped_to_zero` replaces `test_negative_time_raises` 
and covers the new clamp-and-warn behavior.
   
   ### Was this PR authored or co-authored using generative AI tooling?
   
   Generated-by: Claude Sonnet 4.6

