egillax opened a new issue, #34519:
URL: https://github.com/apache/arrow/issues/34519

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   I was testing the latest development version of arrow, installed from git using [this](https://arrow.apache.org/docs/dev/r/articles/install_nightly.html#install-from-git-repository) method.
   
   Now it seems I cannot cast columns in a dataset: the cast results in `NA` values.
   
   I tried both Parquet and Arrow files. The cast does work with the latest version on CRAN (11.0.0.3), and with Arrow tables instead of datasets.
   
   Reprex:
   
   ``` r
   library(dplyr)
   #> 
   #> Attaching package: 'dplyr'
   #> The following objects are masked from 'package:stats':
   #> 
   #>     filter, lag
   #> The following objects are masked from 'package:base':
   #> 
   #>     intersect, setdiff, setequal, union
   library(arrow)
   #> Some features are not enabled in this build of Arrow. Run `arrow_info()` for more information.
   #> 
   #> Attaching package: 'arrow'
   #> The following object is masked from 'package:utils':
   #> 
   #>     timestamp
   
   mtcars %>% write_dataset('./mtcars/')
   ds <- open_dataset('./mtcars')
   
   ds %>% dplyr::collect()
   #>     mpg cyl  disp  hp drat    wt  qsec vs am gear carb
   #> 1  21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
   #> 2  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
   #> 3  22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
   #> 4  21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
   #> 5  18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
   #> 6  18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
   #> 7  14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
   #> 8  24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
   #> 9  22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
   #> 10 19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
   #> 11 17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
   #> 12 16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
   #> 13 17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
   #> 14 15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
   #> 15 10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
   #> 16 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
   #> 17 14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
   #> 18 32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
   #> 19 30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
   #> 20 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
   #> 21 21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
   #> 22 15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
   #> 23 15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
   #> 24 13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
   #> 25 19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
   #> 26 27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
   #> 27 26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
   #> 28 30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
   #> 29 15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
   #> 30 19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
   #> 31 15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
   #> 32 21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2
   
   ds %>% dplyr::mutate(mpg=as.numeric(mpg)) %>% dplyr::collect()
   #>    mpg cyl  disp  hp drat    wt  qsec vs am gear carb
   #> 1   NA   6 160.0 110 3.90 2.620 16.46  0  1    4    4
   #> 2   NA   6 160.0 110 3.90 2.875 17.02  0  1    4    4
   #> 3   NA   4 108.0  93 3.85 2.320 18.61  1  1    4    1
   #> 4   NA   6 258.0 110 3.08 3.215 19.44  1  0    3    1
   #> 5   NA   8 360.0 175 3.15 3.440 17.02  0  0    3    2
   #> 6   NA   6 225.0 105 2.76 3.460 20.22  1  0    3    1
   #> 7   NA   8 360.0 245 3.21 3.570 15.84  0  0    3    4
   #> 8   NA   4 146.7  62 3.69 3.190 20.00  1  0    4    2
   #> 9   NA   4 140.8  95 3.92 3.150 22.90  1  0    4    2
   #> 10  NA   6 167.6 123 3.92 3.440 18.30  1  0    4    4
   #> 11  NA   6 167.6 123 3.92 3.440 18.90  1  0    4    4
   #> 12  NA   8 275.8 180 3.07 4.070 17.40  0  0    3    3
   #> 13  NA   8 275.8 180 3.07 3.730 17.60  0  0    3    3
   #> 14  NA   8 275.8 180 3.07 3.780 18.00  0  0    3    3
   #> 15  NA   8 472.0 205 2.93 5.250 17.98  0  0    3    4
   #> 16  NA   8 460.0 215 3.00 5.424 17.82  0  0    3    4
   #> 17  NA   8 440.0 230 3.23 5.345 17.42  0  0    3    4
   #> 18  NA   4  78.7  66 4.08 2.200 19.47  1  1    4    1
   #> 19  NA   4  75.7  52 4.93 1.615 18.52  1  1    4    2
   #> 20  NA   4  71.1  65 4.22 1.835 19.90  1  1    4    1
   #> 21  NA   4 120.1  97 3.70 2.465 20.01  1  0    3    1
   #> 22  NA   8 318.0 150 2.76 3.520 16.87  0  0    3    2
   #> 23  NA   8 304.0 150 3.15 3.435 17.30  0  0    3    2
   #> 24  NA   8 350.0 245 3.73 3.840 15.41  0  0    3    4
   #> 25  NA   8 400.0 175 3.08 3.845 17.05  0  0    3    2
   #> 26  NA   4  79.0  66 4.08 1.935 18.90  1  1    4    1
   #> 27  NA   4 120.3  91 4.43 2.140 16.70  0  1    5    2
   #> 28  NA   4  95.1 113 3.77 1.513 16.90  1  1    5    2
   #> 29  NA   8 351.0 264 4.22 3.170 14.50  0  1    5    4
   #> 30  NA   6 145.0 175 3.62 2.770 15.50  0  1    5    6
   #> 31  NA   8 301.0 335 3.54 3.570 14.60  0  1    5    8
   #> 32  NA   4 121.0 109 4.11 2.780 18.60  1  1    4    2
   ```
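   
   The report says the same cast works on an in-memory Arrow table. The sketch below puts the two paths side by side so the difference shows up as a single number per path. `na_introduced()` is a hypothetical helper written for this illustration, not part of arrow, and the temporary directory is an arbitrary choice. No expected output is shown because the result depends on the installed build.
   
   ``` r
   library(dplyr)
   library(arrow)
   
   # Hypothetical helper (not an arrow function): how many more NA values
   # does `after` contain than `before`? It does not depend on row order.
   na_introduced <- function(before, after) {
     sum(is.na(after)) - sum(is.na(before))
   }
   
   expected <- mtcars$mpg
   
   # Path 1: in-memory Table, reported to work.
   via_table <- arrow_table(mtcars) %>%
     mutate(mpg = as.numeric(mpg)) %>%
     collect()
   
   # Path 2: Dataset on disk, reported to return NA on the development build.
   ds_path <- tempfile("mtcars_ds")
   write_dataset(mtcars, ds_path)
   via_dataset <- open_dataset(ds_path) %>%
     mutate(mpg = as.numeric(mpg)) %>%
     collect()
   
   # Zero means the cast kept every value; a positive count means values were lost.
   na_introduced(expected, via_table$mpg)
   na_introduced(expected, via_dataset$mpg)
   ```
   
   A nonzero count for the dataset path together with zero for the table path would match the behaviour described in this report.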
   
   <sup>Created on 2023-03-09 with [reprex v2.0.2](https://reprex.tidyverse.org)</sup>
   
   <details>
    <summary>Arrow Info</summary>
    Arrow package version: 11.0.0.9000
   
   Capabilities:
                  
   dataset    TRUE
   substrait FALSE
   parquet    TRUE
   json       TRUE
   s3        FALSE
   gcs       FALSE
   utf8proc   TRUE
   re2        TRUE
   snappy     TRUE
   gzip      FALSE
   brotli    FALSE
   zstd      FALSE
   lz4        TRUE
   lz4_frame  TRUE
   lzo       FALSE
   bz2       FALSE
   jemalloc  FALSE
   mimalloc   TRUE
   
   To reinstall with more optional capabilities enabled, see
      https://arrow.apache.org/docs/r/articles/install.html
   
   Memory:
                     
   Allocator mimalloc
   Current   13.31 Kb
   Max       46.31 Mb
   
   Runtime:
                           
   SIMD Level          avx2
   Detected SIMD Level avx2
   
   Build:
                                                                
   C++ Library Version                           12.0.0-SNAPSHOT
   C++ Compiler                                              GNU
   C++ Compiler Version                                   12.2.0
   Git ID               b679a96d426f4df1a2d15d452f312c968cdfc8f6
    </details>
   
   <details>
   <summary>sessionInfo</summary>
   R version 4.2.2 (2022-10-31)
   Platform: x86_64-pc-linux-gnu (64-bit)
   Running under: Ubuntu 22.10
   
   Matrix products: default
   BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.1
   LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.1
   
   locale:
    [1] LC_CTYPE=en_US.UTF-8          LC_NUMERIC=C                  LC_TIME=nl_NL.UTF-8           LC_COLLATE=en_US.UTF-8
    [5] LC_MONETARY=nl_NL.UTF-8       LC_MESSAGES=en_US.UTF-8       LC_PAPER=nl_NL.UTF-8          LC_NAME=nl_NL.UTF-8
    [9] LC_ADDRESS=nl_NL.UTF-8        LC_TELEPHONE=nl_NL.UTF-8      LC_MEASUREMENT=nl_NL.UTF-8    LC_IDENTIFICATION=nl_NL.UTF-8
   
   attached base packages:
   [1] stats     graphics  grDevices utils     datasets  methods   base     
   
   other attached packages:
   [1] arrow_11.0.0.9000                 dplyr_1.0.10                      PatientLevelPrediction_6.2.0.9000
   
   loaded via a namespace (and not attached):
    [1] pkgload_1.3.2           bit64_4.0.5             jsonlite_1.8.4          DatabaseConnector_6.0.0 R.utils_2.12.2
    [6] shiny_1.7.4             assertthat_0.2.1        highr_0.10              blob_1.2.3              remotes_2.4.2
   [11] yaml_2.3.6              sessioninfo_1.2.2       pillar_1.8.1            RSQLite_2.2.18          lattice_0.20-45
   [16] glue_1.6.2              reticulate_1.26         digest_0.6.31           promises_1.2.0.1        htmltools_0.5.4
   [21] httpuv_1.6.8            Matrix_1.5-1            R.oo_1.25.0             clipr_0.8.0             pkgconfig_2.0.3
   [26] devtools_2.4.5          purrr_1.0.1             xtable_1.8-4            processx_3.8.0          later_1.3.0
   [31] ParallelLogger_3.0.1    tibble_3.1.8            styler_1.9.0            generics_0.1.3          usethis_2.1.6
   [36] ellipsis_0.3.2          cachem_1.0.6            withr_2.5.0             cli_3.6.0               magrittr_2.0.3
   [41] crayon_1.5.2            mime_0.12               memoise_2.0.1           evaluate_0.20           ps_1.7.2
   [46] R.methodsS3_1.8.2       Andromeda_1.0.0         fs_1.5.2                fansi_1.0.3             R.cache_0.16.0
   [51] pkgbuild_1.4.0          SqlRender_1.12.0        profvis_0.3.7           tools_4.2.2             data.table_1.14.4
   [56] prettyunits_1.1.1       lifecycle_1.0.3         stringr_1.5.0           reprex_2.0.2            callr_3.7.3
   [61] compiler_4.2.2          rlang_1.0.6             grid_4.2.2              rstudioapi_0.14         htmlwidgets_1.6.1
   [66] miniUI_0.1.1.1          rmarkdown_2.19          DBI_1.1.3               R6_2.5.1                knitr_1.41
   [71] fastmap_1.1.0           bit_4.0.4               utf8_1.2.2              stringi_1.7.12          rJava_1.0-6
   [76] parallel_4.2.2          Rcpp_1.0.9              vctrs_0.5.1             png_0.1-7               urlchecker_1.0.1
   [81] tidyselect_1.2.0        FeatureExtraction_3.2.0 xfun_0.36
   </details>
   
   ### Component(s)
   
   R

