alamb commented on code in PR #16644:
URL: https://github.com/apache/datafusion/pull/16644#discussion_r2202509446
##########
datafusion-cli/tests/cli_integration.rs:
##########
@@ -35,6 +40,67 @@ fn make_settings() -> Settings {
     settings
 }

+async fn setup_minio_container() -> ContainerAsync<minio::MinIO> {
+    const MINIO_ROOT_USER: &str = "TEST-DataFusionLogin";

Review Comment:
   I think it is very nice that these environment variables get set up via the test harness now rather than having to be set up outside.



##########
datafusion-cli/CONTRIBUTING.md:
##########
@@ -29,47 +29,15 @@ cargo test

## Running Storage Integration Tests

-By default, storage integration tests are not run. To run them you will need to set `TEST_STORAGE_INTEGRATION=1` and
-then provide the necessary configuration for that object store.
-
-For some of the tests, [snapshots](https://datafusion.apache.org/contributor-guide/testing.html#snapshot-testing) are used.
-
-### AWS
-
-To test the S3 integration against [Minio](https://github.com/minio/minio)
-
-First start up a container with Minio and load test files.
-
-```shell
-docker run -d \
-  --name datafusion-test-minio \
-  -p 9000:9000 \
-  -e MINIO_ROOT_USER=TEST-DataFusionLogin \
-  -e MINIO_ROOT_PASSWORD=TEST-DataFusionPassword \
-  -v $(pwd)/../datafusion/core/tests/data:/source \
-  quay.io/minio/minio server /data
-
-docker exec datafusion-test-minio /bin/sh -c "\
-  mc ready local
-  mc alias set localminio http://localhost:9000 TEST-DataFusionLogin TEST-DataFusionPassword && \
-  mc mb localminio/data && \
-  mc cp -r /source/* localminio/data"
-```
-
-Setup environment
+By default, storage integration tests are not run. To run them you will need to set `TEST_STORAGE_INTEGRATION`:

```shell
-export TEST_STORAGE_INTEGRATION=1
-export AWS_ACCESS_KEY_ID=TEST-DataFusionLogin
-export AWS_SECRET_ACCESS_KEY=TEST-DataFusionPassword
-export AWS_ENDPOINT=http://127.0.0.1:9000
-export AWS_ALLOW_HTTP=true
+TEST_STORAGE_INTEGRATION=1 cargo test

Review Comment:
   When I first ran this command without docker running, several commands failed. Once I started docker, it worked great.

   ```
   TEST_STORAGE_INTEGRATION=1 cargo test

   ---- test_aws_options stdout ----
   thread 'test_aws_options' panicked at datafusion-cli/tests/cli_integration.rs:63:10:
   Failed to start MinIO container: Client(CreateContainer(HyperLegacyError { err: hyper_util::client::legacy::Error(Connect, Os { code: 61, kind: ConnectionRefused, message: "Connection refused" }) }))
   note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

   ---- test_cli stdout ----
   thread 'test_cli' panicked at datafusion-cli/tests/cli_integration.rs:63:10:
   Failed to start MinIO container: Client(CreateContainer(HyperLegacyError { err: hyper_util::client::legacy::Error(Connect, Os { code: 61, kind: ConnectionRefused, message: "Connection refused" }) }))
   ```



##########
datafusion-cli/CONTRIBUTING.md:
##########
@@ -29,47 +29,15 @@ cargo test

## Running Storage Integration Tests

-By default, storage integration tests are not run. To run them you will need to set `TEST_STORAGE_INTEGRATION=1` and
-then provide the necessary configuration for that object store.
-
-For some of the tests, [snapshots](https://datafusion.apache.org/contributor-guide/testing.html#snapshot-testing) are used.
-
-### AWS
-
-To test the S3 integration against [Minio](https://github.com/minio/minio)
-
-First start up a container with Minio and load test files.
-
-```shell
-docker run -d \
-  --name datafusion-test-minio \
-  -p 9000:9000 \
-  -e MINIO_ROOT_USER=TEST-DataFusionLogin \
-  -e MINIO_ROOT_PASSWORD=TEST-DataFusionPassword \
-  -v $(pwd)/../datafusion/core/tests/data:/source \
-  quay.io/minio/minio server /data
-
-docker exec datafusion-test-minio /bin/sh -c "\
-  mc ready local
-  mc alias set localminio http://localhost:9000 TEST-DataFusionLogin TEST-DataFusionPassword && \
-  mc mb localminio/data && \
-  mc cp -r /source/* localminio/data"
-```
-
-Setup environment
+By default, storage integration tests are not run. To run them you will need to set `TEST_STORAGE_INTEGRATION`:

```shell
-export TEST_STORAGE_INTEGRATION=1
-export AWS_ACCESS_KEY_ID=TEST-DataFusionLogin
-export AWS_SECRET_ACCESS_KEY=TEST-DataFusionPassword
-export AWS_ENDPOINT=http://127.0.0.1:9000
-export AWS_ALLOW_HTTP=true
+TEST_STORAGE_INTEGRATION=1 cargo test
```
-Note that `AWS_ENDPOINT` is set without slash at the end.
+For some of the tests, [snapshots](https://datafusion.apache.org/contributor-guide/testing.html#snapshot-testing) are used.

-Run tests
+### AWS

-```shell
-cargo test
-```
+S3 integration is tested against [Minio](https://github.com/minio/minio) with [TestContainers](https://github.com/testcontainers/testcontainers-rs)
+This requires Docker to be running on your machine.

Review Comment:
   I also found an issue when running on a remote GCP machine which might be good to document
   ```suggestion
   This requires Docker to be running on your machine and port 9000 to be free.

   If you see an error about "failed to load IMDS session token" such as

   > ---- object_storage::tests::s3_object_store_builder_resolves_region_when_none_provided stdout ----
   > Error: ObjectStore(Generic { store: "S3", source: "Error getting credentials from provider: an error occurred while loading credentials: failed to load IMDS session token" })

   you may need to disable trying to fetch S3 credentials from the environment using the `AWS_EC2_METADATA_DISABLED` environment variable, for example

   $ AWS_EC2_METADATA_DISABLED=true TEST_STORAGE_INTEGRATION=1 cargo test
   ```



##########
datafusion-cli/CONTRIBUTING.md:
##########
@@ -29,47 +29,15 @@ cargo test

## Running Storage Integration Tests

-By default, storage integration tests are not run. To run them you will need to set `TEST_STORAGE_INTEGRATION=1` and
-then provide the necessary configuration for that object store.
-
-For some of the tests, [snapshots](https://datafusion.apache.org/contributor-guide/testing.html#snapshot-testing) are used.
-
-### AWS
-
-To test the S3 integration against [Minio](https://github.com/minio/minio)
-
-First start up a container with Minio and load test files.
-
-```shell
-docker run -d \
-  --name datafusion-test-minio \
-  -p 9000:9000 \
-  -e MINIO_ROOT_USER=TEST-DataFusionLogin \
-  -e MINIO_ROOT_PASSWORD=TEST-DataFusionPassword \
-  -v $(pwd)/../datafusion/core/tests/data:/source \
-  quay.io/minio/minio server /data
-
-docker exec datafusion-test-minio /bin/sh -c "\
-  mc ready local
-  mc alias set localminio http://localhost:9000 TEST-DataFusionLogin TEST-DataFusionPassword && \
-  mc mb localminio/data && \
-  mc cp -r /source/* localminio/data"
-```
-
-Setup environment
+By default, storage integration tests are not run. To run them you will need to set `TEST_STORAGE_INTEGRATION`:

```shell
-export TEST_STORAGE_INTEGRATION=1
-export AWS_ACCESS_KEY_ID=TEST-DataFusionLogin
-export AWS_SECRET_ACCESS_KEY=TEST-DataFusionPassword
-export AWS_ENDPOINT=http://127.0.0.1:9000
-export AWS_ALLOW_HTTP=true
+TEST_STORAGE_INTEGRATION=1 cargo test

Review Comment:
   Note it didn't show any errors about starting the container in the logs / output.



##########
datafusion-cli/CONTRIBUTING.md:
##########
@@ -29,47 +29,15 @@ cargo test

## Running Storage Integration Tests

-By default, storage integration tests are not run. To run them you will need to set `TEST_STORAGE_INTEGRATION=1` and
-then provide the necessary configuration for that object store.
-
-For some of the tests, [snapshots](https://datafusion.apache.org/contributor-guide/testing.html#snapshot-testing) are used.
-
-### AWS
-
-To test the S3 integration against [Minio](https://github.com/minio/minio)
-
-First start up a container with Minio and load test files.
-
-```shell
-docker run -d \
-  --name datafusion-test-minio \
-  -p 9000:9000 \
-  -e MINIO_ROOT_USER=TEST-DataFusionLogin \
-  -e MINIO_ROOT_PASSWORD=TEST-DataFusionPassword \
-  -v $(pwd)/../datafusion/core/tests/data:/source \
-  quay.io/minio/minio server /data
-
-docker exec datafusion-test-minio /bin/sh -c "\
-  mc ready local
-  mc alias set localminio http://localhost:9000 TEST-DataFusionLogin TEST-DataFusionPassword && \
-  mc mb localminio/data && \
-  mc cp -r /source/* localminio/data"
-```
-
-Setup environment
+By default, storage integration tests are not run. To run them you will need to set `TEST_STORAGE_INTEGRATION`:

Review Comment:
   I think documenting a bit about what they are doing might help future users too
   ```suggestion
   By default, storage integration tests are not run. These tests use the `testcontainers` crate to start up a local MinIO server using docker on port 9000.
   To run them you will need to set `TEST_STORAGE_INTEGRATION`:
   ```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
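For readers following the thread, the opt-in gate and AWS environment variables it discusses can be sketched in plain Rust. This is a minimal illustration under assumptions, not code from the PR: `storage_integration_enabled` and `set_minio_env` are hypothetical helper names, while the test credentials, the endpoint, and the no-trailing-slash rule for `AWS_ENDPOINT` are the values quoted from CONTRIBUTING.md above.

```rust
use std::env;

// Hypothetical helper mirroring the TEST_STORAGE_INTEGRATION gate described in
// CONTRIBUTING.md: storage integration tests only run when the variable is "1".
fn storage_integration_enabled() -> bool {
    env::var("TEST_STORAGE_INTEGRATION")
        .map(|v| v == "1")
        .unwrap_or(false)
}

// Hypothetical helper exporting the MinIO test credentials and endpoint quoted
// in the thread so an S3 client could reach the local container. The endpoint
// is normalized to drop any trailing slash, as the old CONTRIBUTING.md noted
// that `AWS_ENDPOINT` must be set without a slash at the end.
fn set_minio_env(endpoint: &str) {
    env::set_var("AWS_ACCESS_KEY_ID", "TEST-DataFusionLogin");
    env::set_var("AWS_SECRET_ACCESS_KEY", "TEST-DataFusionPassword");
    env::set_var("AWS_ENDPOINT", endpoint.trim_end_matches('/'));
    env::set_var("AWS_ALLOW_HTTP", "true");
}

fn main() {
    // Opt in, as `TEST_STORAGE_INTEGRATION=1 cargo test` would.
    env::set_var("TEST_STORAGE_INTEGRATION", "1");
    assert!(storage_integration_enabled());

    // A trailing slash on the caller's side is stripped before export.
    set_minio_env("http://127.0.0.1:9000/");
    assert_eq!(env::var("AWS_ENDPOINT").unwrap(), "http://127.0.0.1:9000");
    println!("storage integration env configured");
}
```

The point of the PR, per the first review comment, is that the real test harness performs this wiring itself (via `testcontainers`) instead of asking contributors to export these variables by hand.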