danfran opened a new issue, #8691: URL: https://github.com/apache/hudi/issues/8691
**Describe the problem you faced**

I am trying to run some tests in Docker using the image `amazon/aws-glue-libs:glue_libs_4.0.0_image_01` and `localstack` for the AWS environment. No matter what, every time the tests run, Hudi tries to connect to remote AWS instead of pointing to LocalStack. This is the configuration I am using at the moment:

```
packages = [
    '/home/glue_user/spark/jars/spark-avro_2.12-3.3.0-amzn-1.jar',
    '/home/glue_user/aws-glue-libs/datalake-connectors/hudi-0.12.1/hudi-spark3-bundle_2.12-0.12.1.jar',
    '/home/glue_user/aws-glue-libs/jars/aws-java-sdk-1.12.128.jar',
    '/home/glue_user/aws-glue-libs/jars/aws-java-sdk-glue-1.12.128.jar',
    '/home/glue_user/spark/jars/hadoop-aws-3.3.3-amzn-0.jar',
]

conf = SparkConf() \
    .set('spark.jars', ','.join(packages)) \
    .set('spark.serializer', 'org.apache.spark.serializer.KryoSerializer') \
    .set('spark.sql.catalog.spark_catalog', 'org.apache.spark.sql.hudi.catalog.HoodieCatalog') \
    .set('spark.sql.extensions', 'org.apache.spark.sql.hudi.HoodieSparkSessionExtension')

spark_context = SparkContext(conf=conf)
glue_context = GlueContext(spark_context)
spark_session = glue_context.spark_session

# HUDI S3 ACCESS
spark_session.conf.set('fs.defaultFS', 's3://mybucket')
spark_session.conf.set('fs.s3.awsAccessKeyId', 'test')
spark_session.conf.set('fs.s3.awsSecretAccessKey', 'test')
spark_session.conf.set('fs.s3a.awsAccessKeyId', 'test')
spark_session.conf.set('fs.s3a.awsSecretAccessKey', 'test')
spark_session.conf.set('fs.s3a.endpoint', 'http://localstack:4566')
spark_session.conf.set('fs.s3a.connection.ssl.enabled', 'false')
spark_session.conf.set('fs.s3a.path.style.access', 'true')
spark_session.conf.set('fs.s3a.signing-algorithm', 'S3SignerType')
spark_session.conf.set('spark.sql.legacy.setCommandRejectsSparkCoreConfs', 'false')

# SPARK CONF
spark_session.conf.set('spark.sql.shuffle.partitions', '2')
spark_session.conf.set('spark.sql.crossJoin.enabled', 'true')
```

**Expected behavior**

How can I make it point to my local environment (http://localstack:4566) instead of remote AWS?

**Environment Description**

* Hudi version : 0.12.1
* Spark version : 3.3.0
* Hive version :
* Hadoop version : 3.3.3
* Storage (HDFS/S3/GCS..) : S3
* Running on Docker? (yes/no) : yes

**Stacktrace**

```
An error occurred while calling o1719.save.
: java.nio.file.AccessDeniedException: s3://mybucket/myzone/location/.hoodie: getFileStatus on s3://mybucket/myzone/location/.hoodie: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: B1ZJ8JDPY2HX514F; S3 Extended Request ID: rzBqoLQxJb4PSKNW+uCbyVCbqYtpCB0aFHvX7JWTCDJ/PTfQdgESAkOzxWR6aPua8OhuEcajIM8=; Proxy: null), S3 Extended Request ID: rzBqoLQxJb4PSKNW+uCbyVCbqYtpCB0aFHvX7JUTVIJ/PTfQdgEHNkOzxWR6aPua8OhuEcajIM8=:403 Forbidden
```
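One possible cause worth noting for the configuration above: `fs.*` keys set through `spark_session.conf.set()` after the session exists generally do not reach the Hadoop `FileSystem` layer, and on Glue a plain `s3://` path resolves through EMRFS, which ignores `fs.s3a.*` settings. A minimal sketch of an alternative, assuming the same LocalStack endpoint and `test` credentials from the issue, is to push the S3A options into `SparkConf` with the `spark.hadoop.` prefix before the context starts (the `fs.s3.impl` mapping is an assumption for this setup, not something the issue confirms works):

```python
# Hedged sketch, not verified against this exact Glue image.
LOCALSTACK_S3A_CONF = {
    # S3A uses fs.s3a.access.key / fs.s3a.secret.key;
    # fs.s3a.awsAccessKeyId (used in the issue) is not an S3A property.
    'spark.hadoop.fs.s3a.access.key': 'test',
    'spark.hadoop.fs.s3a.secret.key': 'test',
    'spark.hadoop.fs.s3a.endpoint': 'http://localstack:4566',
    'spark.hadoop.fs.s3a.path.style.access': 'true',
    'spark.hadoop.fs.s3a.connection.ssl.enabled': 'false',
    # Assumption: route bare s3:// URIs through the S3A connector
    # instead of EMRFS so the endpoint override takes effect.
    'spark.hadoop.fs.s3.impl': 'org.apache.hadoop.fs.s3a.S3AFileSystem',
}

def apply_localstack_conf(conf):
    """Apply the LocalStack S3A settings to a SparkConf-like object
    (anything exposing .set(key, value) that returns the conf)."""
    for key, value in LOCALSTACK_S3A_CONF.items():
        conf = conf.set(key, value)
    return conf
```

With pyspark available, this would be applied as `conf = apply_localstack_conf(SparkConf().set('spark.jars', ','.join(packages)))` before constructing `SparkContext`, so the settings land in the Hadoop configuration at startup rather than after the fact.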