danfran opened a new issue, #8691:
URL: https://github.com/apache/hudi/issues/8691

   **_Tips before filing an issue_**
   
   - Have you gone through our [FAQs](https://hudi.apache.org/learn/faq/)?
   
   - Join the mailing list to engage in conversations and get faster support at dev-subscr...@hudi.apache.org.
   
   - If you have triaged this as a bug, then file an [issue](https://issues.apache.org/jira/projects/HUDI/issues) directly.
   
   **Describe the problem you faced**
   
   I am trying to run some tests in Docker using the image `amazon/aws-glue-libs:glue_libs_4.0.0_image_01` with `localstack` as the AWS environment. No matter what I try, every time the tests run, Hudi connects to the remote AWS instead of pointing to LocalStack. This is the configuration I am using at the moment:
   
   ```python
   from awsglue.context import GlueContext
   from pyspark import SparkConf, SparkContext

   packages = [
       '/home/glue_user/spark/jars/spark-avro_2.12-3.3.0-amzn-1.jar',
       '/home/glue_user/aws-glue-libs/datalake-connectors/hudi-0.12.1/hudi-spark3-bundle_2.12-0.12.1.jar',
       '/home/glue_user/aws-glue-libs/jars/aws-java-sdk-1.12.128.jar',
       '/home/glue_user/aws-glue-libs/jars/aws-java-sdk-glue-1.12.128.jar',
       '/home/glue_user/spark/jars/hadoop-aws-3.3.3-amzn-0.jar',
   ]

   conf = SparkConf() \
       .set('spark.jars', ','.join(packages)) \
       .set('spark.serializer', 'org.apache.spark.serializer.KryoSerializer') \
       .set('spark.sql.catalog.spark_catalog', 'org.apache.spark.sql.hudi.catalog.HoodieCatalog') \
       .set('spark.sql.extensions', 'org.apache.spark.sql.hudi.HoodieSparkSessionExtension')

   spark_context = SparkContext(conf=conf)
   glue_context = GlueContext(spark_context)
   spark_session = glue_context.spark_session

   # HUDI S3 ACCESS
   spark_session.conf.set('fs.defaultFS', 's3://mybucket')
   spark_session.conf.set('fs.s3.awsAccessKeyId', 'test')
   spark_session.conf.set('fs.s3.awsSecretAccessKey', 'test')
   spark_session.conf.set('fs.s3a.awsAccessKeyId', 'test')
   spark_session.conf.set('fs.s3a.awsSecretAccessKey', 'test')
   spark_session.conf.set('fs.s3a.endpoint', 'http://localstack:4566')
   spark_session.conf.set('fs.s3a.connection.ssl.enabled', 'false')
   spark_session.conf.set('fs.s3a.path.style.access', 'true')
   spark_session.conf.set('fs.s3a.signing-algorithm', 'S3SignerType')
   spark_session.conf.set('spark.sql.legacy.setCommandRejectsSparkCoreConfs', 'false')

   # SPARK CONF
   spark_session.conf.set('spark.sql.shuffle.partitions', '2')
   spark_session.conf.set('spark.sql.crossJoin.enabled', 'true')
   ```
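
   For context, as far as I understand `spark_session.conf.set()` only updates the Spark SQL runtime configuration, while the `fs.s3a.*` endpoint and credential options have to reach the Hadoop configuration that the S3A connector actually reads. One variant I could try (sketched below, untested in my setup; note that S3A's own property names are `fs.s3a.access.key` / `fs.s3a.secret.key`) is passing the same LocalStack settings with the `spark.hadoop.*` prefix when building the `SparkConf`:

   ```python
   # A sketch only (not verified in my environment): pass the S3A options with the
   # `spark.hadoop.` prefix so they are copied into the Hadoop configuration used by
   # the S3A connector, instead of only into the Spark SQL runtime conf.
   from pyspark import SparkConf

   conf = SparkConf() \
       .set('spark.hadoop.fs.s3a.endpoint', 'http://localstack:4566') \
       .set('spark.hadoop.fs.s3a.access.key', 'test') \
       .set('spark.hadoop.fs.s3a.secret.key', 'test') \
       .set('spark.hadoop.fs.s3a.path.style.access', 'true') \
       .set('spark.hadoop.fs.s3a.connection.ssl.enabled', 'false')

   # spark_context / glue_context / spark_session would then be created from this conf
   # exactly as in the snippet above.
   ```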
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   
   **Expected behavior**
   
   How can I make it point to my local environment (http://localstack:4566) instead of the remote AWS?
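
   Another variant I am considering (again just a sketch, not verified): writing the same S3A options directly into the `SparkContext`'s Hadoop configuration after it is created, via the internal `_jsc` handle:

   ```python
   # Sketch: set the S3A options on the underlying Hadoop configuration, which is what
   # the S3A filesystem consults when Hudi calls getFileStatus on the table path.
   hadoop_conf = spark_context._jsc.hadoopConfiguration()
   hadoop_conf.set('fs.s3a.endpoint', 'http://localstack:4566')
   hadoop_conf.set('fs.s3a.access.key', 'test')
   hadoop_conf.set('fs.s3a.secret.key', 'test')
   hadoop_conf.set('fs.s3a.path.style.access', 'true')
   hadoop_conf.set('fs.s3a.connection.ssl.enabled', 'false')
   ```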
   
   **Environment Description**
   
   * Hudi version : 0.12
   
   * Spark version : 3.3.0
   
   * Hive version :
   
   * Hadoop version : 3.3.3
   
   * Storage (HDFS/S3/GCS..) : S3
   
   * Running on Docker? (yes/no) : yes
   
   
   **Additional context**
   
   
   **Stacktrace**
   
   
   ```
   An error occurred while calling o1719.save.
   : java.nio.file.AccessDeniedException: s3://mybucket/myzone/location/.hoodie: getFileStatus on s3://mybucket/myzone/location/.hoodie: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: B1ZJ8JDPY2HX514F; S3 Extended Request ID: rzBqoLQxJb4PSKNW+uCbyVCbqYtpCB0aFHvX7JWTCDJ/PTfQdgESAkOzxWR6aPua8OhuEcajIM8=; Proxy: null), S3 Extended Request ID: rzBqoLQxJb4PSKNW+uCbyVCbqYtpCB0aFHvX7JUTVIJ/PTfQdgEHNkOzxWR6aPua8OhuEcajIM8=:403 Forbidden
   ```
   

