[ 
https://issues.apache.org/jira/browse/ARROW-14677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17463782#comment-17463782
 ] 

Martin Morgan commented on ARROW-14677:
---------------------------------------

This seems to have fixed the issue, thanks. FWIW, this is what I see now:
{code:java}
> system2('otool', c('-L', system.file('libs/arrow.so', package='arrow')))
/Users/ma38727/Library/R/4.2/Bioc/3.15/library/arrow/libs/arrow.so:
    arrow.so (compatibility version 0.0.0, current version 0.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1281.100.1)
    /usr/lib/libcurl.4.dylib (compatibility version 7.0.0, current version 9.0.0)
    libR.dylib (compatibility version 4.2.0, current version 4.2.0)
    /usr/local/opt/gettext/lib/libintl.8.dylib (compatibility version 11.0.0, current version 11.0.0)
    /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1677.104.0)
    /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 902.1.0){code}

> [R][C++] macOS R package arrow segfault on `open_dataset()`
> -----------------------------------------------------------
>
>                 Key: ARROW-14677
>                 URL: https://issues.apache.org/jira/browse/ARROW-14677
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, R
>    Affects Versions: 6.0.0
>            Reporter: Martin Morgan
>            Priority: Major
>
> Following a slack post 
> (https://ropensci.slack.com/archives/C026GCWKA/p1636588933095400), accessing 
> a public bucket with the R client
> {code:java}
> df <- arrow::open_dataset("s3://gbif-open-data-af-south-1/occurrence/2021-11-01/occurrence.parquet/")
> {code}
> leads to a segfault
> {code:java}
>   *** caught segfault ***
> address 0x0, cause 'unknown'
> Traceback:
> 1: dataset__DatasetFactory_Finish1(self, unify_schemas)
> 2: factory$Finish(schema, isTRUE(unify_schemas))
> 3: doTryCatch(return(expr), name, parentenv, handler)
> 4: tryCatchOne(expr, names, parentenv, handlers[[1L]])
> 5: tryCatchList(expr, classes, parentenv, handlers)
> 6: tryCatch(factory$Finish(schema, isTRUE(unify_schemas)), error = function(e) { handle_parquet_io_error(e, format)})
> 7: arrow::open_dataset("s3://gbif-open-data-af-south-1/occurrence/2021-11-01/occurrence.parquet/")
>  
> {code}
> The arrow portion of the lldb traceback is
> {code:java}
> (lldb) thread backtrace
> thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
> frame #0: 0x000000012ab2029c libthrift-0.15.0.dylib`std::__1::shared_ptr<apache::thrift::async::TAsyncProcessor>::~shared_ptr() + 46
> frame #1: 0x0000000128bb6ac2 arrow.so`void parquet::DeserializeThriftUnencryptedMsg<parquet::format::FileMetaData>(unsigned char const*, unsigned int*, parquet::format::FileMetaData*) + 309
> frame #2: 0x0000000128bb5f49 arrow.so`parquet::FileMetaData::FileMetaDataImpl::FileMetaDataImpl(void const*, unsigned int*, std::__1::shared_ptr<parquet::InternalFileDecryptor>) + 517
> frame #3: 0x0000000128bace0d arrow.so`parquet::FileMetaData::FileMetaData(void const*, unsigned int*, std::__1::shared_ptr<parquet::InternalFileDecryptor>) + 85
> frame #4: 0x0000000128bacd1b arrow.so`parquet::FileMetaData::Make(void const*, unsigned int*, std::__1::shared_ptr<parquet::InternalFileDecryptor>) + 89
> frame #5: 0x0000000128b9cb4a arrow.so`parquet::SerializedFile::ParseUnencryptedFileMetadata(std::__1::shared_ptr<arrow::Buffer> const&, unsigned int) + 118
> frame #6: 0x0000000128b9df43 arrow.so`parquet::SerializedFile::ParseMetaData() + 607
> frame #7: 0x0000000128b9dc6c arrow.so`parquet::ParquetFileReader::Contents::Open(std::__1::shared_ptr<arrow::io::RandomAccessFile>, parquet::ReaderProperties const&, std::__1::shared_ptr<parquet::FileMetaData>) + 214
> frame #8: 0x0000000128b9eb72 arrow.so`parquet::ParquetFileReader::Open(std::__1::shared_ptr<arrow::io::RandomAccessFile>, parquet::ReaderProperties const&, std::__1::shared_ptr<parquet::FileMetaData>) + 58
> frame #9: 0x0000000128c8a988 arrow.so`arrow::dataset::ParquetFileFormat::GetReader(arrow::dataset::FileSource const&, arrow::dataset::ScanOptions*) const + 286
> frame #10: 0x0000000128c8a72e arrow.so`arrow::dataset::ParquetFileFormat::Inspect(arrow::dataset::FileSource const&) const + 44
> frame #11: 0x0000000128c0b994 arrow.so`arrow::dataset::FileSystemDatasetFactory::InspectSchemas(arrow::dataset::InspectOptions) + 336
> frame #12: 0x0000000128c09079 arrow.so`arrow::dataset::DatasetFactory::Inspect(arrow::dataset::InspectOptions) + 43
> frame #13: 0x0000000128c0c1cf arrow.so`arrow::dataset::FileSystemDatasetFactory::Finish(arrow::dataset::FinishOptions) + 541
> frame #14: 0x0000000128a66805 arrow.so`dataset__DatasetFactoryFinish1(std::__1::shared_ptr<arrow::dataset::DatasetFactory> const&, bool) + 69
> frame #15: 0x0000000128a105aa arrow.so`arrow_dataset_DatasetFactory_Finish1 + 154 {code}
> arrow was installed from source on
> {code:java}
> > sessionInfo()
> R Under development (unstable) (2021-10-28 r81109)
> Platform: x86_64-apple-darwin19.6.0 (64-bit)
> Running under: macOS Catalina 10.15.7
> Matrix products: default
> BLAS: /Users/ma38727/bin/R-devel/lib/libRblas.dylib
> LAPACK: /Users/ma38727/bin/R-devel/lib/libRlapack.dylib
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
> other attached packages:
> [1] arrow_6.0.0.2
> loaded via a namespace (and not attached):
> [1] tidyselect_1.1.1 bit_4.0.4 compiler_4.2.0
> [4] BiocManager_1.30.16 magrittr_2.0.1 assertthat_0.2.1
> [7] R6_2.5.1 glue_1.5.0 bit64_4.0.5
> [10] vctrs_0.3.8 rlang_0.4.12 purrr_0.3.4
> {code}
> During package installation, the one step that was 'new' to me was the use of 
> autobrew
> {code:java}
> *** Downloading apache-arrow
> Using autobrew bundle: apache-arrow-6.0.0-high_sierra.tar.xz{code}
> I'm not sure how to validate that this use is consistent with my brew 
> installation.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
