josh-gree commented on issue #36439:
URL: https://github.com/apache/arrow/issues/36439#issuecomment-2890809416

   I am facing a similar issue on an AWS SageMaker notebook, using uv for package management. I get the following error:
   
   ```
   sagemaker-user@default:~/temp$ uv tree
   Resolved 9 packages in 3ms
   temp v0.1.0
   ├── certifi v2025.4.26
   ├── numpy v1.26.4
   ├── pandas v2.2.3
   │   ├── numpy v1.26.4
   │   ├── python-dateutil v2.9.0.post0
   │   │   └── six v1.17.0
   │   ├── pytz v2025.2
   │   └── tzdata v2025.2
   └── pyarrow v20.0.0
   sagemaker-user@default:~/temp$ uv run python
   Python 3.10.16 (main, Feb 12 2025, 14:50:02) [Clang 19.1.6 ] on linux
   Type "help", "copyright", "credits" or "license" for more information.
   >>> import pandas as pd
   >>> pd.read_parquet("s3://<PATH>.parquet")
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/home/sagemaker-user/temp/.venv/lib/python3.10/site-packages/pandas/io/parquet.py", line 667, in read_parquet
       return impl.read(
     File "/home/sagemaker-user/temp/.venv/lib/python3.10/site-packages/pandas/io/parquet.py", line 267, in read
       path_or_handle, handles, filesystem = _get_path_or_handle(
     File "/home/sagemaker-user/temp/.venv/lib/python3.10/site-packages/pandas/io/parquet.py", line 117, in _get_path_or_handle
       fs, path_or_handle = pa_fs.FileSystem.from_uri(path)
     File "pyarrow/_fs.pyx", line 475, in pyarrow._fs.FileSystem.from_uri
     File "pyarrow/error.pxi", line 155, in pyarrow.lib.pyarrow_internal_check_status
     File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
   OSError: When resolving region for bucket '<BUCKET>': AWS Error NETWORK_CONNECTION during HeadBucket operation: curlCode: 60, SSL peer certificate or SSH remote key was not OK
   ```
   
   But if I set `SSL_CERT_FILE` to the certifi bundle, the read succeeds:
   
   ```
   sagemaker-user@default:~/temp$ SSL_CERT_FILE=$(uv run python -m certifi) uv run python
   Python 3.10.16 (main, Feb 12 2025, 14:50:02) [Clang 19.1.6 ] on linux
   Type "help", "copyright", "credits" or "license" for more information.
   >>> import pandas as pd
   >>> pd.read_parquet("s3://imu-prod-ew1-main-dagster-storage/v2/sagemaker/1db62dd3-aa98-4fa9-8ccf-290bb7238b8c/3efb3229-b176-4313-8535-5515c5627ca3/events.parquet")
           Autofluorescence_1:Area  ...  root/TIME/CELLS/CELLS2/FSINGLE/SSINGLE/WBC_2/LIVE_CELLS_2/SSChi/CD14n/EOSINOPHIL_2/PDL1p_2
   0                     -0.004086  ...  False
   1                      0.387499  ...  False
   2                      0.088288  ...  False
   3                      0.084076  ...  False
   4                      0.063045  ...  False
   ...                         ...  ...  ...
   282507                 0.183296  ...  False
   282508                 0.278997  ...  False
   282509                 0.432743  ...  False
   282510                 0.401240  ...  False
   282511                 0.387509  ...  False
   
   [282512 rows x 158 columns]
   ```
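
For anyone who cannot change how the interpreter is launched, here is a minimal sketch of the same workaround applied from inside Python. The helper names `find_ca_bundle` and `ensure_ssl_cert_file` are hypothetical, and the sketch assumes that setting `SSL_CERT_FILE` in `os.environ` before the first S3 access is picked up the same way as exporting it in the shell. The session above only demonstrates the shell variant, so treat the in-process version as untested.

```python
import os
import ssl


def find_ca_bundle():
    """Return a CA bundle path, preferring certifi's, or None if none is found."""
    try:
        import certifi  # certifi.where() is the path that `python -m certifi` prints
        return certifi.where()
    except ImportError:
        # Fall back to the interpreter's default OpenSSL CA file (may be None).
        return ssl.get_default_verify_paths().cafile


def ensure_ssl_cert_file(environ=os.environ):
    """Set SSL_CERT_FILE unless it is already set; return the resulting value."""
    if not environ.get("SSL_CERT_FILE"):
        bundle = find_ca_bundle()
        if bundle:
            environ["SSL_CERT_FILE"] = bundle
    return environ.get("SSL_CERT_FILE")


# Assumption: this must run before anything touches S3 through pyarrow.
ensure_ssl_cert_file()

# import pandas as pd
# df = pd.read_parquet("s3://<PATH>.parquet")
```

The helper leaves an existing `SSL_CERT_FILE` untouched, so a value exported in the shell still takes precedence.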
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
