[ https://issues.apache.org/jira/browse/ARROW-13803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17408881#comment-17408881 ]
David Li commented on ARROW-13803:
----------------------------------

Thanks, I'll give it a try with bundled dependencies. I'm testing using the entire dataset already as well.

> [C++] Segfault on filtering taxi dataset
> ----------------------------------------
>
>                 Key: ARROW-13803
>                 URL: https://issues.apache.org/jira/browse/ARROW-13803
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>         Environment: macOS 11.2.1, MacBook Pro (13-inch, M1, 2020)
>            Reporter: Neal Richardson
>            Priority: Major
>              Labels: query-engine
>             Fix For: 6.0.0
>
>
> Found this while testing ARROW-13740. Using the nyc-taxi dataset:
> {code}
> ds %>%
>   filter(total_amount > 0, passenger_count > 0) %>%
>   summarise(n = n()) %>%
>   collect()
> {code}
> {code}
>  *** caught segfault ***
> address 0x161784000, cause 'invalid permissions'
> Traceback:
>  1: .Call(`_arrow_ExecPlan_run`, plan, final_node, sort_options)
> ...
> {code}
> lldb shows
> {code}
> * thread #11, stop reason = EXC_BAD_ACCESS (code=1, address=0x1631a8000)
>     frame #0: 0x000000013a79d9cc libarrow.600.dylib`arrow::BitUtil::SetBitmap(unsigned char*, long long, long long) + 296
> libarrow.600.dylib`arrow::BitUtil::SetBitmap:
> ->  0x13a79d9cc <+296>: ldrb   w10, [x8]
>     0x13a79d9d0 <+300>: cmp    w9, #0x8                  ; =0x8
>     0x13a79d9d4 <+304>: cset   w11, lo
>     0x13a79d9d8 <+308>: and    w9, w9, #0x7
> Target 0: (R) stopped.
> (lldb)
> {code}
> Interestingly, I can evaluate those filter expressions just fine, and it only seems to crash if both are provided. And I can count over the data with both:
> {code}
> ds %>%
>   group_by(total_amount > 0, passenger_count > 0) %>%
>   summarize(n=n()) %>%
>   collect()
> # A tibble: 4 × 3
>   `total_amount > 0` `passenger_count > 0`          n
>   <lgl>              <lgl>                      <int>
> 1 FALSE              FALSE                        805
> 2 FALSE              TRUE                      368680
> 3 TRUE               FALSE                    5810556
> 4 TRUE               TRUE                  1541561340
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)