This is an automated email from the ASF dual-hosted git repository.

Yicong-Huang pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/texera.git


The following commit(s) were added to refs/heads/main by this push:
     new bb795a0c8f fix(amber): avoid AtomicInteger.get_and_set deadlock (#5010)
bb795a0c8f is described below

commit bb795a0c8febbdaecd6878dcbb81498e8b5c0338
Author: nathant27 <[email protected]>
AuthorDate: Wed May 13 01:17:37 2026 -0700

    fix(amber): avoid AtomicInteger.get_and_set deadlock (#5010)
    
    ### What changes were proposed in this PR?
    Fixed deadlock happening in AtomicInteger.get_and_set
    (amber/src/main/python/core/util/atomic.py).
    
    Before, get_and_set acquired non-reentrant lock and then accessed value
    property, which attempts to grab the same lock, causing a deadlock.
    
    After the change, get_and_set now accesses the objects self._value
    property directly inside the existing critical section, which avoids the
    nested lock acquisition.
    -Chose to do the inline change instead of changing to Reentrant lock to
    avoid potentially unnecessary overhead and because it seems like the
    more appropriate way to access in this context, since we're already
    grabbing the lock anyway in get_and_set
    
    ### Any related issues, documentation, or discussions?
    Fixes #4794
    
    ### How was this PR tested?
    
    in src/test/python/core/util/test_atomic.py
    test_get_and_set_currently_deadlocks_on_non_reentrant_lock, bug pinned
    test
    after the fixes now fails on both asserts lines 99 to 106 on removed
    test(test_get_and_set_currently_deadlocks_on_non_reentrant_lock)
    ```
            assert worker.is_alive(), (
                "worker thread exited unexpectedly — get_and_set neither 
deadlocked "
                "nor completed; the test no longer pins the documented bug."
            )
            assert not completed.is_set(), (
                "get_and_set unexpectedly returned — the deadlock bug appears 
fixed; "
                "delete this pinned test along with the xfail below."
            )
    ```
    This is expected behavior because worker should not be alive after
    fixing deadlock, and should be completed.
    
    #### 2 changes to test_atomic.py
    **REPLACED**
        test_get_and_set_currently_deadlocks_on_non_reentrant_lock
        WITH
        test_get_and_set_does_not_deadlock_on_non_reentrant_lock
    - Mostly the same functionality of old test, but instead of checking if
    deadlocks, checks if it does not deadlock by asserting "not
    worker.is_alive()" and "completed.is_set" from the thread. Basically
    just swapped the two asserts at the end
    
    **UPDATED** test_get_and_set_should_return_old_value_and_replace_state
    - REMOVED the xfail with strict = true, because now passes
    
    
    ### Was this PR authored or co-authored using generative AI tooling?
    No
---
 amber/src/main/python/core/util/atomic.py      |  2 +-
 amber/src/test/python/core/util/test_atomic.py | 26 +++-----------------------
 2 files changed, 4 insertions(+), 24 deletions(-)

diff --git a/amber/src/main/python/core/util/atomic.py 
b/amber/src/main/python/core/util/atomic.py
index a73619489f..55fecb0f45 100644
--- a/amber/src/main/python/core/util/atomic.py
+++ b/amber/src/main/python/core/util/atomic.py
@@ -53,6 +53,6 @@ class AtomicInteger:
 
     def get_and_set(self, v):
         with self._lock:
-            old_value = self.value
+            old_value = self._value
             self._value = int(v)
             return old_value
diff --git a/amber/src/test/python/core/util/test_atomic.py 
b/amber/src/test/python/core/util/test_atomic.py
index 0f6b393233..824b25b671 100644
--- a/amber/src/test/python/core/util/test_atomic.py
+++ b/amber/src/test/python/core/util/test_atomic.py
@@ -65,13 +65,7 @@ class TestAtomicIntegerSingleThreaded:
         a.value = "100"
         assert a.value == 100
 
-    def test_get_and_set_currently_deadlocks_on_non_reentrant_lock(self):
-        # Bug pin: get_and_set acquires self._lock and then reads self.value,
-        # which is a property that ALSO tries to acquire self._lock. The lock
-        # is a non-reentrant threading.Lock, so the call deadlocks the moment
-        # it is invoked. Document via thread + timeout so the test surfaces
-        # the deadlock without hanging the whole suite, and pair it with an
-        # xfail-strict test below that asserts the intended contract.
+    def test_get_and_set_does_not_deadlock_on_non_reentrant_lock(self):
         a = AtomicInteger(10)
         started = threading.Event()
         completed = threading.Event()
@@ -96,23 +90,9 @@ class TestAtomicIntegerSingleThreaded:
         assert not errors, (
             f"get_and_set raised before reaching the deadlock spin: 
{errors[0]!r}"
         )
-        assert worker.is_alive(), (
-            "worker thread exited unexpectedly — get_and_set neither 
deadlocked "
-            "nor completed; the test no longer pins the documented bug."
-        )
-        assert not completed.is_set(), (
-            "get_and_set unexpectedly returned — the deadlock bug appears 
fixed; "
-            "delete this pinned test along with the xfail below."
-        )
+        assert not worker.is_alive()
+        assert completed.is_set()
 
-    @pytest.mark.xfail(
-        strict=True,
-        reason=(
-            "Known bug: AtomicInteger.get_and_set deadlocks because it holds "
-            "the non-reentrant lock while accessing the value property. "
-            "This xfail flips to XPASS when the bug is fixed."
-        ),
-    )
     @pytest.mark.timeout(2)
     def test_get_and_set_should_return_old_value_and_replace_state(self):
         a = AtomicInteger(10)

Reply via email to