Etienne Racine created ARROW-7639:
-------------------------------------

             Summary: [R] Cannot convert Dictionary Array of type `dictionary<values=double, indices=int8, ordered=0>` to R
                 Key: ARROW-7639
                 URL: https://issues.apache.org/jira/browse/ARROW-7639
             Project: Apache Arrow
          Issue Type: Bug
          Components: R
    Affects Versions: 0.15.1
         Environment: Ubuntu 16.04.5 LTS
            Reporter: Etienne Racine


I got an error in R when using arrow::read_feather() to read a Feather file prepared in Python:
{code:r}
#' Error in Table__to_dataframe(x, use_threads = option_use_threads()) :
#' Cannot convert Dictionary Array of type `dictionary<values=double, indices=int8, ordered=0>` to R
{code}
I could reproduce the issue with a minimal example:

In Python:
{code:python}
import pandas as pd
import pyarrow as pa
df = pd.DataFrame({"float": [0.1, .2, 0.5, .001]})
df["category"] = df["float"].astype('category')
df.dtypes
#' float float64
#' A object
#' category category
#' dtype: object
df.to_feather("series.feather")
pa.__version__
#' '0.15.1'
{code}
From R:
{code:r}
arrow::read_feather("series.feather")
#' Error in Table__to_dataframe(x, use_threads = option_use_threads()) :
#' Cannot convert Dictionary Array of type `dictionary<values=double, indices=int8, ordered=0>` to R
#' Backtrace:
#' █
#' 1. └─arrow::read_feather("series.feather")
#' 2. ├─[ base::as.data.frame(...) ]
#' 3. └─arrow:::as.data.frame.Table(out)
#' 4. └─arrow:::Table__to_dataframe(x, use_threads = option_use_threads())
{code}
The Feather file is read back correctly in Python:
{code:python}
ft = pd.read_feather("series.feather")
ft.dtypes
#' float        float64
#' A             object
#' category    category
#' dtype: object
{code}

{code:r}
sessionInfo()
#' R version 3.5.1 (2018-07-02)
#' Platform: x86_64-conda_cos6-linux-gnu (64-bit)
#' Running under: Ubuntu 16.04.5 LTS
#' 
#' Matrix products: default
#' BLAS/LAPACK: /misc/DLshare/home/etbellem/miniconda3/lib/R/lib/libRblas.so
#' 
#' locale:
#' [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#' [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#' [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#' [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#' [9] LC_ADDRESS=C LC_TELEPHONE=C
#' [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#' 
#' attached base packages:
#' [1] stats graphics grDevices utils datasets methods base
#' 
#' loaded via a namespace (and not attached):
#' [1] Rcpp_1.0.3 arrow_0.15.1 crayon_1.3.4 assertthat_0.2.1
#' [5] R6_2.4.1 magrittr_1.5 rlang_0.4.2 rstudioapi_0.10
#' [9] bit64_0.9-7 glue_1.3.1 purrr_0.3.3 bit_1.1-15.1
#' [13] compiler_3.5.1 tidyselect_0.2.5
{code}
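
A possible stopgap on the Python side, until the R reader handles this case, is to give the categorical column string categories before writing the file. This is only a sketch. It assumes the R conversion accepts dictionaries whose values are strings, which the error message suggests but which was not verified for this report. `stringify_categories` is a hypothetical helper name, not part of pandas or pyarrow.

```python
import pandas as pd


def stringify_categories(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df whose categorical columns use string categories.

    Hypothetical helper for illustration only; not a pandas or pyarrow API.
    """
    out = df.copy()
    for name in out.columns:
        col = out[name]
        if isinstance(col.dtype, pd.CategoricalDtype):
            # Map every category label (here: floats) to its string form.
            out[name] = col.cat.rename_categories(lambda value: str(value))
    return out


# Same frame as in the report above.
df = pd.DataFrame({"float": [0.1, 0.2, 0.5, 0.001]})
df["category"] = df["float"].astype("category")

fixed = stringify_categories(df)
```

Writing `fixed` with `fixed.to_feather("series.feather")` should then store the column as a dictionary of strings rather than doubles. The numeric meaning of the categories is lost, so they would have to be parsed back into numbers on the R side if needed.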



