[ 
https://issues.apache.org/jira/browse/ARROW-17448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Albers updated ARROW-17448:
-------------------------------
    Component/s:     (was: Python)

> [R] Fix cloud storage paths in some documentation
> -------------------------------------------------
>
>                 Key: ARROW-17448
>                 URL: https://issues.apache.org/jira/browse/ARROW-17448
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: R
>    Affects Versions: 9.0.0
>            Reporter: Sam Albers
>            Priority: Minor
>
> A few cloud storage examples in the documentation use incorrect paths. 
> For example, in this vignette: 
> [https://arrow.apache.org/docs/r/articles/fs.html]
> This doesn't work:
> {code:java}
> df <- 
> read_parquet(bucket$path("nyc-taxi/year=2019/month=6/data.parquet")){code}
> Rather, it should be:
> {code:java}
> df <- 
> read_parquet(bucket$path("nyc-taxi/year=2019/month=6/part-0.parquet")){code}
> I think this makes sense, as part-0 is the default file naming convention 
> for write_dataset and therefore something users are likely to see. Indeed, 
> this is the way the file structure was written:
> {code:java}
> library(arrow)
> bucket <- s3_bucket("voltrondata-labs-datasets")
> bucket$ls(path = "nyc-taxi/year=2011", recursive = TRUE)
> #>  [1] "nyc-taxi/year=2011/month=1"                
> #>  [2] "nyc-taxi/year=2011/month=1/part-0.parquet" 
> #>  [3] "nyc-taxi/year=2011/month=10"               
> #>  [4] "nyc-taxi/year=2011/month=10/part-0.parquet"
> #>  [5] "nyc-taxi/year=2011/month=11"               
> #>  [6] "nyc-taxi/year=2011/month=11/part-0.parquet"
> #>  [7] "nyc-taxi/year=2011/month=12"               
> #>  [8] "nyc-taxi/year=2011/month=12/part-0.parquet"
> #>  [9] "nyc-taxi/year=2011/month=2"                
> #> [10] "nyc-taxi/year=2011/month=2/part-0.parquet" 
> #> [11] "nyc-taxi/year=2011/month=3"                
> #> [12] "nyc-taxi/year=2011/month=3/part-0.parquet" 
> #> [13] "nyc-taxi/year=2011/month=4"                
> #> [14] "nyc-taxi/year=2011/month=4/part-0.parquet" 
> #> [15] "nyc-taxi/year=2011/month=5"                
> #> [16] "nyc-taxi/year=2011/month=5/part-0.parquet" 
> #> [17] "nyc-taxi/year=2011/month=6"                
> #> [18] "nyc-taxi/year=2011/month=6/part-0.parquet" 
> #> [19] "nyc-taxi/year=2011/month=7"                
> #> [20] "nyc-taxi/year=2011/month=7/part-0.parquet" 
> #> [21] "nyc-taxi/year=2011/month=8"                
> #> [22] "nyc-taxi/year=2011/month=8/part-0.parquet" 
> #> [23] "nyc-taxi/year=2011/month=9"                
> #> [24] "nyc-taxi/year=2011/month=9/part-0.parquet"
> {code}
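>
> As a hedged illustration of the part-0 naming mentioned above, the sketch 
> below writes a small partitioned dataset to a temporary directory and lists 
> the resulting files. It is not taken from the vignette: the use of mtcars, 
> the cyl partition column, and the variable names tmp and files are 
> assumptions made only for this example, and the exact file numbering may 
> differ between arrow versions.
> {code:java}
> library(arrow)
>
> # Hypothetical scratch directory for the example output
> tmp <- tempfile("part-naming-")
> dir.create(tmp)
>
> # Write a built-in data frame, partitioned by one column, using the defaults
> write_dataset(mtcars, tmp, partitioning = "cyl")
>
> # List the files that were written, relative to the dataset root
> files <- list.files(tmp, recursive = TRUE)
> print(files)
> {code}
> With the default settings, each partition directory should contain a file 
> named like part-0.parquet rather than data.parquet.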



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
