westonpace commented on issue #36587:
URL: https://github.com/apache/arrow/issues/36587#issuecomment-1632748371

   I am a little confused then.
   
   When the S3 filesystem runs any operation, the AWS SDK attempts to 
determine credentials for that action.  Typically it does this by looking in 
the user's config file (e.g. ~/.aws/config).  If no configuration file is 
found, it attempts to contact a special IP address, available on EC2 machines, 
that tells the machine what its configuration is.
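
   To see which local sources the SDK might find before it falls back to the 
EC2 lookup, a rough check like the one below may help.  This is only a sketch, 
not the SDK's actual logic: `local_credential_sources` is a hypothetical 
helper, it only looks at the standard environment variables and the two default 
files, and the real provider chain has more steps than this.

```python
import os
from pathlib import Path


def local_credential_sources(env=None, home=None):
    """Hypothetical helper: list local AWS credential sources that exist.

    Simplified on purpose. It does not parse the files or check profiles.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    found = []

    # Static credentials supplied through the environment.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        found.append("environment variables")

    # Shared credentials and config files; these two variables override
    # the default locations under ~/.aws.
    cred = Path(env.get("AWS_SHARED_CREDENTIALS_FILE", home / ".aws" / "credentials"))
    conf = Path(env.get("AWS_CONFIG_FILE", home / ".aws" / "config"))
    for path in (cred, conf):
        if path.is_file():
            found.append(str(path))

    return found


if __name__ == "__main__":
    sources = local_credential_sources()
    print(sources or "nothing found locally; the SDK may try the EC2 lookup")
```

   An empty result suggests the SDK has nothing local to use and would move on 
to the slower lookup.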
   
   This attempt to contact the special IP address can be very slow, depending 
on the network configuration of the machine (sometimes it spends minutes 
waiting for a timeout).  Setting the environment variable 
`AWS_EC2_METADATA_DISABLED` to `true` disables the check, but that should only 
affect your connection if you are on an EC2 machine to begin with.  So I do not 
understand how setting that variable to true can cause connection issues to S3.
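
   For reference, one way to set the variable from inside a script is shown 
below.  This is a minimal sketch: setting it before pyarrow is imported is the 
conservative choice, and the region in the commented-out lines is only a 
placeholder.

```python
import os

# Ask the AWS SDK to skip the EC2 metadata lookup.  Set this before any
# S3 filesystem is created so the SDK sees it when resolving credentials.
os.environ["AWS_EC2_METADATA_DISABLED"] = "true"

# Afterwards, use pyarrow as usual (placeholder region, adjust as needed):
# from pyarrow import fs
# s3 = fs.S3FileSystem(region="us-east-1")
```

   Exporting the variable in the shell before starting Python has the same 
effect.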
   
   Can you add these lines to the **top** of your script (they must come 
before you import any other pyarrow module)?  They enable trace-level S3 
logging, which might help us understand what is happening:
   
   ```
   import pyarrow._s3fs
   pyarrow._s3fs.initialize_s3(pyarrow._s3fs.S3LogLevel.Trace)
   ```

