TomTJarosz commented on code in PR #39314:
URL: https://github.com/apache/arrow/pull/39314#discussion_r1445530513


##########
python/pyarrow/pandas-shim.pxi:
##########
@@ -96,13 +98,17 @@ cdef class _PandasAPIShim(object):
         self.has_sparse = False
 
     cdef inline _check_import(self, bint raise_=True):
-        if self._tried_importing_pandas:
-            if not self._have_pandas and raise_:
-                self._import_pandas(raise_)
-            return
-
-        self._tried_importing_pandas = True
-        self._import_pandas(raise_)
+        if not self._tried_importing_pandas:
+            with self._lock:
+                if not self._tried_importing_pandas:
+                    try:
+                        self._import_pandas(raise_)
+                    finally:
+                        self._tried_importing_pandas = True
+                    return
+
+        if not self._have_pandas and raise_:
+            self._import_pandas(raise_)

Review Comment:
   This implementation of `_check_import` differs from your suggested change as 
(I believe) the suggested change would introduce another race condition, 
resulting in us not raising when we do not have pandas and `raise_=True`.
   
   Consider 2 threads concurrently call `_check_import` and thread 1 acquires 
lock while thread 2 waits to acquire the lock. Also assume the current env does 
not have pandas. If this method had the following implementation:
   ```py
       cdef inline _check_import(self, bint raise_=True):
           if self._tried_importing_pandas:
               if not self._have_pandas and raise_:
                   self._import_pandas(raise_)
               return
           with self._lock:
               if not self._tried_importing_pandas:
                   try:
                       self._import_pandas(raise_)
                   finally:
                       self._tried_importing_pandas = True
   ```
   thread 2 would _not_ raise, but it should. I believe the committed 
implementation of `_check_import` addresses this race condition, along with the 
original race condition this PR addresses.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to