pachadotdev commented on a change in pull request #10546: URL: https://github.com/apache/arrow/pull/10546#discussion_r677626827
##########
File path: r/vignettes/fs.Rmd
##########
@@ -128,3 +128,74 @@ s3://minioadmin:minioadmin@?scheme=http&endpoint_override=localhost%3A9000
 Among other applications, this can be useful for testing out code locally before running on a remote S3 bucket.
+
+## Non-AWS S3 cloud alternatives (DigitalOcean, IBM, Alibaba, and others)
+
+*This section adapts some elements from [Analyzing Room Temperature Data](https://www.jaredlander.com/2021/03/analyzing-room-temperature-data/#getting-the-data) by Jared Lander.*
+
+If you are using any Amazon S3 Compliant Storage Provider, such as AWS, Alibaba,
+Ceph, DigitalOcean, Dreamhost, IBM COS, Minio, or others, you can connect to it
+with `arrow` by using the `S3FileSystem` function as for the case of using
+MinIO locally. Please note that the use of DigitalOcean here is just an example, as
+it can be any other S3 compatible service.
+
+At the begininning of this vignette we used:
+
+```r
+june2019 <- SubTreeFileSystem$create("s3://ursa-labs-taxi-data/2019/06")
+```
+
+Which connects to AWS, and the same can be adapted for other providers, For
+instructional purposes, we provide [nyc-taxi.sfo3.digitaloceanspaces.com](https://nyc-taxi.sfo3.digitaloceanspaces.com),
+which is a public storage with the NYC taxi data used in
+[Working with Arrow Datasets and dplyr](dataset.html).
+
+To connect to this space, you only need to adapt the code from the previous
+section:
+
+```r
+space <- arrow::S3FileSystem$create(
+  anonymous = TRUE,
+  scheme = "https",
+  endpoint_override = "sfo3.digitaloceanspaces.com"
+)
+```
+
+The space that we are using space allows anonymous access, but if you were to
+connect to a private space (i.e. with sensitive data), you would need to
+provide a token, say:
+
+```r
+space <- arrow::S3FileSystem$create(
+  access_key = Sys.getenv('DO_ARROW_TAXI_TOKEN'),
+  secret_key = Sys.getenv('DO_ARROW_TAXI_SECRET'),
+  scheme = "https",
+  endpoint_override = "sfo3.digitaloceanspaces.com"
+)
+```

Review comment:
   how about a ... "just as an example..."?

Review comment:
   I'll send a counter proposal for this
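
For reference, the context line of the hunk writes the MinIO connection as a single URI with a percent-encoded `endpoint_override`. A minimal sketch of building such a URI for another S3-compatible endpoint follows. It is not taken from the PR: `build_s3_uri` is a hypothetical helper, and treating `nyc-taxi` as the bucket name is an assumption inferred from the public URL quoted in the diff. Credentials and anonymous access are not covered here.

```r
# Hypothetical helper (not part of arrow): build an S3 URI that points at an
# S3-compatible endpoint, in the same shape as the MinIO URI in the hunk context.
build_s3_uri <- function(bucket, endpoint, scheme = "https") {
  # reserved = TRUE percent-encodes reserved characters such as ":",
  # so "localhost:9000" becomes "localhost%3A9000".
  encoded_endpoint <- utils::URLencode(endpoint, reserved = TRUE)
  paste0(
    "s3://", bucket,
    "?scheme=", scheme,
    "&endpoint_override=", encoded_endpoint
  )
}

# Assumption: "nyc-taxi" is the bucket, "sfo3.digitaloceanspaces.com" the endpoint.
uri <- build_s3_uri("nyc-taxi", "sfo3.digitaloceanspaces.com")
print(uri)
```

The resulting string could presumably be passed wherever the vignette already accepts an `s3://` URI, for example `SubTreeFileSystem$create()`. Whether the remote endpoint accepts the request depends on the provider and on the access settings of the bucket.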