[ https://issues.apache.org/jira/browse/ARROW-8977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Neal Richardson updated ARROW-8977:
-----------------------------------
    Fix Version/s:     (was: 1.0.0)
                   2.0.0

> [R] Table$create with schema crashes with some dictionary index types
> ---------------------------------------------------------------------
>
> Key: ARROW-8977
> URL: https://issues.apache.org/jira/browse/ARROW-8977
> Project: Apache Arrow
> Issue Type: Bug
> Components: R
> Affects Versions: 0.17.1
> Environment: Using the latest nightly build of arrow and R 4.0.0 on OS X.
> R sessionInfo:
> R version 4.0.0 (2020-04-24)
> Platform: x86_64-apple-darwin17.0 (64-bit)
> Running under: macOS High Sierra 10.13.6
> Matrix products: default
> BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
> LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
> other attached packages:
> [1] arrow_0.17.1.20200527
> loaded via a namespace (and not attached):
> [1] tidyselect_1.1.0 bit_1.1-15.2 compiler_4.0.0 magrittr_1.5 assertthat_0.2.1 R6_2.4.1
> [7] tools_4.0.0 glue_1.4.1 Rcpp_1.0.4.6 bit64_0.9-7 vctrs_0.3.0 knitr_1.28
> [13] xfun_0.14 rlang_0.4.6 purrr_0.3.4
> Reporter: Ben Schmidt
> Assignee: Romain Francois
> Priority: Minor
> Fix For: 2.0.0
>
>
> On the latest nightly build in R, calling Table$create with a custom schema
> can crash R entirely (fatal error/bomb in RStudio) when the schema specifies
> a dictionary index_type other than the one Arrow infers for the factor
> column (int8 in this example).
> Example:
> {code:r}
> library(arrow)
> native = data.frame(a = c(1, 2, 3), b = as.factor(c("a", "b", "c")))
> # Works. 'a' is double; 'b' is a dictionary with string values and int8 indices
> Table$create(native)
> # Works, although 'a' is cast to uint32.
> s1 = schema(a = uint32(), b = dictionary(value_type = arrow::string(),
>                                          index_type = arrow::int8()))
> Table$create(native, schema = s1)
> # Crashes R on my system because index_type is int16(), not int8()
> s2 = schema(a = uint32(), b = dictionary(value_type = arrow::string(),
>                                          index_type = arrow::int16()))
> Table$create(native, schema = s2)
> {code}
>
> On restart, the following log appears in my RStudio session:
>
> {noformat}
> /private/var/folders/84/dvp0h0kn22qcx_0z_hn_b36w0000gn/T/hbtmp/apache-arrow-20200528-16757-1uok2ln/cpp/src/arrow/array.cc:1194: Check failed: (indices->type_id()) == (dict.index_type()->id())
> 0 arrow.so 0x000000010dd2b3cd _ZN5arrow4util7CerrLogD2Ev + 209
> 1 arrow.so 0x000000010dd2b2ee _ZN5arrow4util7CerrLogD0Ev + 14
> 2 arrow.so 0x000000010dd2b296 _ZN5arrow4util8ArrowLogD1Ev + 34
> 3 arrow.so 0x000000010daf2429 _ZN5arrow15DictionaryArray10FromArraysERKNSt3__110shared_ptrINS_8DataTypeEEERKNS2_INS_5ArrayEEESA_ + 619
> 4 arrow.so 0x000000010d8f1eff _ZN5arrow1r19MakeFactorArrayImplINS_8Int8TypeEEENSt3__110shared_ptrINS_5ArrayEEEN4Rcpp6VectorILi13ENS7_16NoProtectStorageEEERKNS4_INS_8DataTypeEEE + 1743
> 5 arrow.so 0x000000010d8f1684 _ZN5arrow1r15MakeFactorArrayEN4Rcpp6VectorILi13ENS1_16NoProtectStorageEEERKNSt3__110shared_ptrINS_8DataTypeEEE + 260
> 6 arrow.so 0x000000010d8f40f4 _ZN5arrow1r18Array__from_vectorEP7SEXPRECRKNSt3__110shared_ptrINS_8DataTypeEEEb + 292
> 7 arrow.so 0x000000010d98e418 _ZZ16Table__from_dotsP7SEXPRECS0_ENK3$_1clEiS0_ + 440
> 8 arrow.so 0x000000010d98d15b _Z16Table__from_dotsP7SEXPRECS0_ + 1611
> 9 arrow.so 0x000000010d95af3b _arrow_Table__from_dots + 91
> 10 libR.dylib 0x0000000105466b5d R_doDotCall + 1437
> 11 libR.dylib 0x00000001054b2a7a bcEval + 105226
> 12 libR.dylib 0x0000000105498831 Rf_eval + 385
> 13 libR.dylib 0x00000001054b8cf1 R_execClosure + 2193
> 14 libR.dylib 0x00000001054b7ac9 Rf_applyClosure + 473
> 15 libR.dylib 0x000000010549f9a8 bcEval + 27192
> 16 libR.dylib 0x0000000105498831 Rf_eval + 385
> 17 libR.dylib 0x00000001054b71cc forcePromise + 172
> 18 libR.dylib 0x00000001054c2b4a getvar + 778
> 19 libR.dylib 0x000000010549cb8c bcEval + 15388
> 20 libR.dylib 0x0000000105498831 Rf_eval + 385
> 21 libR.dylib 0x00000001054b71cc forcePromise + 172
> 22 libR.dylib 0x00000001054c2b4a getvar + 778
> 23 libR.dylib 0x000000010549cb8c bcEval + 15388
> 24 libR.dylib 0x0000000105498831 Rf_eval + 385
> 25 libR.dylib 0x00000001054b8cf1 R_execClosure + 2193
> 26 libR.dylib 0x00000001054b7ac9 Rf_applyClosure + 473
> 27 libR.dylib 0x000000010549f9a8 bcEval + 27192
> 28 libR.dylib 0x0000000105498831 Rf_eval + 385
> 29 libR.dylib 0x00000001054b8cf1 R_execClosure + 2193
> 30 libR.dylib 0x00000001054b7ac9 Rf_applyClosure + 473
> 31 libR.dylib 0x000000010549f9a8 bcEval + 27192
> 32 libR.dylib 0x0000000105498831 Rf_eval + 385
> 33 libR.dylib 0x00000001054b8cf1 R_execClosure + 2193
> 34 libR.dylib 0x00000001054b7ac9 Rf_applyClosure + 473
> 35 libR.dylib 0x0000000105498d06 Rf_eval + 1622
> 36 libR.dylib 0x00000001054edcda Rf_ReplIteration + 810
> 37 libR.dylib 0x00000001054ef1ff run_Rmainloop + 207
> 38 rsession 0x0000000104bef4d0 _ZN7rstudio1r7session12runEmbeddedRERKNS_4core8FilePathES5_bb7SA_TYPERKNS1_9CallbacksEPNS1_17InternalCallbacksE + 416
> {noformat}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)