raulcd commented on PR #47627:
URL: https://github.com/apache/arrow/pull/47627#issuecomment-3334040625

   > I think there is an upstream change from IANA TZDB and might be fixed in a 
way similar to 
https://github.com/apache/orc/pull/2363/files#diff-6c9de59647d8013be967b7259ce33ba22923a7c340383bbf9a55ec1aa4d03c03R419-R427
   
   @wgtmac I've investigated a little further. The issue is with a couple of 
tests that intentionally remove the timezone database and expect ORC to raise 
an exception and Arrow to catch it rather than crash. The problem (in this 
specific case) is with ORC 2.1.0, which is the version on vcpkg and musllinux. 
See the backtrace:
   ```c++
   terminate called after throwing an instance of 'orc::TimezoneError'
     what():  Time zone file /tmp/non_existent/US/Pacific does not exist. 
Please install IANA time zone database and set TZDIR env.
   
   Thread 1 "python" received signal SIGABRT, Aborted.
   __restore_sigs (set=set@entry=0x7ffc7ba108e0) at 
./arch/x86_64/syscall_arch.h:40
   warning: 40  ./arch/x86_64/syscall_arch.h: No such file or directory
   (gdb) backtrace
   #0  __restore_sigs (set=set@entry=0x7ffc7ba108e0) at 
./arch/x86_64/syscall_arch.h:40
   #1  0x000076f54c58fe1b in raise (sig=sig@entry=6) at src/signal/raise.c:11
   #2  0x000076f54c5739ad in abort () at src/exit/abort.c:11
   #3  0x000076f5484f2a4a in ?? () from 
/usr/local/lib/python3.10/site-packages/pyarrow/../pyarrow.libs/libstdc++-0d31ccbe.so.6.0.32
   #4  0x000076f548504ef8 in __cxxabiv1::__terminate(void (*)()) () from 
/usr/local/lib/python3.10/site-packages/pyarrow/../pyarrow.libs/libstdc++-0d31ccbe.so.6.0.32
   #5  0x000076f548504f65 in std::terminate() () from 
/usr/local/lib/python3.10/site-packages/pyarrow/../pyarrow.libs/libstdc++-0d31ccbe.so.6.0.32
   #6  0x000076f5485051b8 in __cxa_throw () from 
/usr/local/lib/python3.10/site-packages/pyarrow/../pyarrow.libs/libstdc++-0d31ccbe.so.6.0.32
   #7  0x000076f54a04b1ee in orc::loadTZDB(std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const&) ()
      from /usr/local/lib/python3.10/site-packages/pyarrow/libarrow.so.2200
   #8  0x000076f54a04b9f3 in 
std::once_flag::_Prepare_execution::_Prepare_execution<std::call_once<orc::LazyTimezone::getImpl()
 const::{lambda()#1}>(std::once_flag&, orc::LazyTimezone::getImpl() 
const::{lambda()#1}&&)::{lambda()#1}>(orc::LazyTimezone::getImpl() 
const::{lambda()#1}&)::{lambda()#1}::_FUN() () from 
/usr/local/lib/python3.10/site-packages/pyarrow/libarrow.so.2200
   #9  0x000076f54c599656 in __pthread_once_full (control=0x76f546a4b1b0, 
init=0x76f548532c70 <__once_proxy>) at src/thread/pthread_once.c:22
   #10 __pthread_once_full (control=0x76f546a4b1b0, init=0x76f548532c70 
<__once_proxy>) at src/thread/pthread_once.c:11
   #11 0x000076f54c5996f3 in __pthread_once (control=<optimized out>, 
init=<optimized out>) at src/thread/pthread_once.c:47
   #12 0x000076f54a049659 in orc::LazyTimezone::getEpoch() const () from 
/usr/local/lib/python3.10/site-packages/pyarrow/libarrow.so.2200
   #13 0x000076f54a0804cf in 
orc::TimestampColumnReader::TimestampColumnReader(orc::Type const&, 
orc::StripeStreams&, bool) ()
      from /usr/local/lib/python3.10/site-packages/pyarrow/libarrow.so.2200
   #14 0x000076f54a08155d in orc::buildReader(orc::Type const&, 
orc::StripeStreams&, bool, bool, bool) () from 
/usr/local/lib/python3.10/site-packages/pyarrow/libarrow.so.2200
   #15 0x000076f54a081c36 in 
orc::StructColumnReader::StructColumnReader(orc::Type const&, 
orc::StripeStreams&, bool, bool) ()
      from /usr/local/lib/python3.10/site-packages/pyarrow/libarrow.so.2200
   #16 0x000076f54a0815e3 in orc::buildReader(orc::Type const&, 
orc::StripeStreams&, bool, bool, bool) () from 
/usr/local/lib/python3.10/site-packages/pyarrow/libarrow.so.2200
   #17 0x000076f54a02f60d in orc::RowReaderImpl::startNextStripe() () from 
/usr/local/lib/python3.10/site-packages/pyarrow/libarrow.so.2200
   #18 0x000076f54a030005 in orc::RowReaderImpl::next(orc::ColumnVectorBatch&) 
() from /usr/local/lib/python3.10/site-packages/pyarrow/libarrow.so.2200
   #19 0x000076f5495206a5 in 
arrow::adapters::orc::ORCFileReader::Impl::ReadBatch(orc::RowReaderOptions 
const&, std::shared_ptr<arrow::Schema> const&, long) ()
      from /usr/local/lib/python3.10/site-packages/pyarrow/libarrow.so.2200
   #20 0x000076f549520d6a in 
arrow::adapters::orc::ORCFileReader::Impl::ReadTable(orc::RowReaderOptions 
const&, std::shared_ptr<arrow::Schema> const&) ()
      from /usr/local/lib/python3.10/site-packages/pyarrow/libarrow.so.2200
   #21 0x000076f54951a3d7 in arrow::adapters::orc::ORCFileReader::Read() () 
from /usr/local/lib/python3.10/site-packages/pyarrow/libarrow.so.2200
   #22 0x000076f525a6c55c in 
__pyx_pw_7pyarrow_4_orc_9ORCReader_43read(_object*, _object* const*, long, 
_object*) ()
   ```
   @pitrou based on the stack trace and the report, this seems to be the exact 
same issue as the one fixed by the PR that introduced the failing tests. Any 
idea?
   - https://github.com/apache/arrow/pull/45051
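
For reference, the "raise an exception, don't crash" expectation described above can be checked from outside the process. The following is only a minimal sketch, not the actual test from the Arrow suite: `run_with_missing_tzdir` and `classify` are hypothetical helper names, and the `TZDIR` value is simply the path that appears in the error message in the backtrace. Running the snippet in a child interpreter is an assumption made here so that an abort cannot take down the test runner itself.

```python
import os
import subprocess
import sys

# Path taken from the error message in the backtrace; any directory that
# does not exist would serve the same purpose.
MISSING_TZDIR = "/tmp/non_existent"


def run_with_missing_tzdir(code):
    """Run `code` in a child interpreter whose TZDIR points nowhere."""
    env = dict(os.environ, TZDIR=MISSING_TZDIR)
    return subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
    )


def classify(result):
    """Hypothetical helper: map a child's exit status to an outcome."""
    if result.returncode == 0:
        return "ok"
    if result.returncode < 0:
        # On POSIX a negative return code means the child died from a
        # signal, e.g. SIGABRT raised via std::terminate.
        return "crash"
    # A positive code covers an uncaught Python exception (exit status 1).
    return "exception"
```

A snippet that reads an ORC file containing a timestamp column (for example through `pyarrow.orc.ORCFile(path).read()`) could be passed as `code`. Under the behaviour the tests expect, the outcome would be "exception" (or "ok" if the snippet catches the error itself), whereas the backtrace above corresponds to "crash".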
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
