[jira] [Commented] (ARROW-13803) [C++] Segfault on filtering taxi dataset
[ https://issues.apache.org/jira/browse/ARROW-13803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17408992#comment-17408992 ]

David Li commented on ARROW-13803:
----------------------------------

There is an off-by-one error in [BitUtil::SetBitmap|https://github.com/apache/arrow/blob/8c70a5f5178c5b74cc181dc8bdd4b03ba14f36d9/cpp/src/arrow/util/bit_util.cc#L112-L115]. In this case, offset started as 0 and length started as 65536. By the time execution reaches the linked lines, offset is 65536 and length is 0. data is a pointer to an 8192-byte buffer. Since 65536 / 8 = 8192, the function indexes {{data[8192]}}, which is one byte past the end of the buffer. We then crash because the memory at that address is not mapped on this platform. (I'm surprised valgrind/ASan/etc. don't catch the access on x64.)

> [C++] Segfault on filtering taxi dataset
> ----------------------------------------
>
>                 Key: ARROW-13803
>                 URL: https://issues.apache.org/jira/browse/ARROW-13803
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>         Environment: macOS 11.2.1, MacBook Pro (13-inch, M1, 2020)
>            Reporter: Neal Richardson
>            Priority: Major
>              Labels: query-engine
>             Fix For: 6.0.0
>
>
> Found this while testing ARROW-13740. Using the nyc-taxi dataset:
> {code}
> ds %>%
>   filter(total_amount > 0, passenger_count > 0) %>%
>   summarise(n = n()) %>%
>   collect()
> {code}
> {code}
>  *** caught segfault ***
> address 0x161784000, cause 'invalid permissions'
> Traceback:
>  1: .Call(`_arrow_ExecPlan_run`, plan, final_node, sort_options)
> ...
> {code}
> lldb shows
> {code}
> * thread #11, stop reason = EXC_BAD_ACCESS (code=1, address=0x1631a8000)
>     frame #0: 0x00013a79d9cc libarrow.600.dylib`arrow::BitUtil::SetBitmap(unsigned char*, long long, long long) + 296
> libarrow.600.dylib`arrow::BitUtil::SetBitmap:
> ->  0x13a79d9cc <+296>: ldrb   w10, [x8]
>     0x13a79d9d0 <+300>: cmp    w9, #0x8                  ; =0x8
>     0x13a79d9d4 <+304>: cset   w11, lo
>     0x13a79d9d8 <+308>: and    w9, w9, #0x7
> Target 0: (R) stopped.
> (lldb)
> {code}
> Interestingly, I can evaluate those filter expressions just fine, and it only
> seems to crash if both are provided. And I can count over the data with both:
> {code}
> ds %>%
>   group_by(total_amount > 0, passenger_count > 0) %>%
>   summarize(n=n()) %>%
>   collect()
> # A tibble: 4 × 3
>   `total_amount > 0` `passenger_count > 0`          n
> 1 FALSE              FALSE                        805
> 2 FALSE              TRUE                      368680
> 3 TRUE               FALSE                    5810556
> 4 TRUE               TRUE                  1541561340
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
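To make the arithmetic in the comment above concrete, here is a minimal sketch of a prologue / whole-bytes / epilogue bit-setting routine. It is not Arrow's code: the names {{SetBitsGuarded}} and {{TrailingByteIndex}} are hypothetical, and the layout (least-significant-bit-first bitmap, bits only ever set to 1) is an assumption made for illustration. {{TrailingByteIndex}} reports which byte an unguarded trailing step would touch, and {{SetBitsGuarded}} shows one possible guard: skip the trailing byte when no bits remain.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not an Arrow API): the byte index that a trailing
// "partial byte" step would access once the leading bits and all whole
// bytes have been consumed.
inline std::int64_t TrailingByteIndex(std::int64_t offset, std::int64_t length) {
  const std::int64_t lead =
      std::min<std::int64_t>((8 - offset % 8) % 8, length);
  offset += lead;
  length -= lead;
  offset += (length / 8) * 8;  // whole bytes are filled in one go
  return offset / 8;
}

// Hypothetical sketch: set bits [offset, offset + length) to 1 in an
// LSB-first bitmap, touching the trailing byte only if bits remain.
inline void SetBitsGuarded(std::uint8_t* data, std::int64_t offset,
                           std::int64_t length) {
  if (length <= 0) return;

  // Leading partial byte, up to the next byte boundary.
  const std::int64_t lead = (8 - offset % 8) % 8;
  if (lead > 0) {
    const std::int64_t n = std::min(lead, length);
    const auto mask =
        static_cast<std::uint8_t>(((1u << n) - 1u) << (offset % 8));
    data[offset / 8] |= mask;
    offset += n;
    length -= n;
  }

  // Whole bytes.
  std::memset(data + offset / 8, 0xFF, static_cast<std::size_t>(length / 8));
  offset += (length / 8) * 8;
  length %= 8;

  // Trailing partial byte. Without this check, length == 0 would still
  // read and write data[offset / 8], which may lie past the buffer.
  if (length > 0) {
    data[offset / 8] |= static_cast<std::uint8_t>((1u << length) - 1u);
  }
}
```

With the values from the comment (offset 0, length 65536), {{TrailingByteIndex}} returns 8192, which is not a valid index into an 8192-byte buffer; the guarded version never touches that byte.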
[jira] [Commented] (ARROW-13803) [C++] Segfault on filtering taxi dataset
[ https://issues.apache.org/jira/browse/ARROW-13803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17408929#comment-17408929 ]

David Li commented on ARROW-13803:
----------------------------------

It still doesn't replicate on Linux/x64 or MacOS/x64, unfortunately, so it does seem ARM-specific.
[jira] [Commented] (ARROW-13803) [C++] Segfault on filtering taxi dataset
[ https://issues.apache.org/jira/browse/ARROW-13803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17408900#comment-17408900 ]

David Li commented on ARROW-13803:
----------------------------------

Ok! I can reproduce it, turns out a release build was very important (should've thought of that earlier…)
[jira] [Commented] (ARROW-13803) [C++] Segfault on filtering taxi dataset
[ https://issues.apache.org/jira/browse/ARROW-13803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17408881#comment-17408881 ]

David Li commented on ARROW-13803:
----------------------------------

Thanks, I'll give it a try with bundled dependencies. I'm testing using the entire dataset already as well.
[jira] [Commented] (ARROW-13803) [C++] Segfault on filtering taxi dataset
[ https://issues.apache.org/jira/browse/ARROW-13803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17408877#comment-17408877 ]

Neal Richardson commented on ARROW-13803:
-----------------------------------------

Which years did you test? It's possible there's a data issue in some file that's not being handled correctly; I know there are quirks. I am testing with the Ursa bucket data.

No conda here, dependency source AUTO and I haven't installed much on the system so it's basically bundling everything except lz4 and zlib AFAICT. My cmake invocation is

{code}
cmake \
  -GNinja \
  -DARROW_COMPUTE=ON \
  -DARROW_CSV=ON \
  -DARROW_DATASET=ON \
  -DARROW_FILESYSTEM=ON \
  -DARROW_JEMALLOC=ON \
  -DARROW_JSON=ON \
  -DARROW_PARQUET=ON \
  -DCMAKE_BUILD_TYPE=release \
  -DARROW_INSTALL_NAME_RPATH=OFF \
  -DARROW_S3=ON \
  -DARROW_MIMALLOC=OFF \
  -DARROW_WITH_BROTLI=ON \
  -DARROW_WITH_BZ2=ON \
  -DARROW_WITH_LZ4=ON \
  -DARROW_WITH_SNAPPY=ON \
  -DARROW_WITH_ZLIB=ON \
  -DARROW_WITH_ZSTD=ON \
  -DARROW_EXTRA_ERROR_CONTEXT=ON \
  -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
  -DARROW_BUILD_TESTS=OFF \
  -DARROW_WITH_UTF8PROC=ON \
  ..
{code}

No special compilation flags; cmake reports

{code}
-- CMAKE_C_FLAGS: -Qunused-arguments -O3 -DNDEBUG -Wall -Wno-unknown-warning-option -Wno-pass-failed -stdlib=libc++ -march=armv8-a
-- CMAKE_CXX_FLAGS: -Qunused-arguments -fcolor-diagnostics -O3 -DNDEBUG -Wall -Wno-unknown-warning-option -Wno-pass-failed -stdlib=libc++ -march=armv8-a
{code}
[jira] [Commented] (ARROW-13803) [C++] Segfault on filtering taxi dataset
[ https://issues.apache.org/jira/browse/ARROW-13803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17408844#comment-17408844 ]

David Li commented on ARROW-13803:
----------------------------------

Trying again on an M1 Mac, I still don't get the crash. Just to check a few things, then:
* Is the source of the NYC Taxi dataset you're using also the Parquet files in the Ursa bucket?
* What flags are you using to build Arrow and the R library?
* Are you using Conda or Homebrew or some other source for dependencies? (Though I couldn't get Conda to work on the M1 Mac.)
[jira] [Commented] (ARROW-13803) [C++] Segfault on filtering taxi dataset
[ https://issues.apache.org/jira/browse/ARROW-13803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17408350#comment-17408350 ]

David Li commented on ARROW-13803:
----------------------------------

I tried again on an x86_64 Mac and didn't get the error either, though I tested only a couple years of the NYC Taxi dataset.
[jira] [Commented] (ARROW-13803) [C++] Segfault on filtering taxi dataset
[ https://issues.apache.org/jira/browse/ARROW-13803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17407357#comment-17407357 ]

David Li commented on ARROW-13803:
----------------------------------

Hmm, I built the branch on a Linux/x64 machine and ran the first query using a copy of the NYC Taxi dataset from the Ursa Labs bucket, and it did not crash:

{noformat}
> ds <- open_dataset("/home/lidavidm/Documents/taxi", partitioning = c("year", "month"))
> ds %>% filter(total_amount > 0, passenger_count > 0) %>% summarise(n=n()) %>% collect()
# A tibble: 1 × 1
           n
1 1541561340
{noformat}

so this might be something OS-specific or otherwise not as easily reproducible :/
[jira] [Commented] (ARROW-13803) [C++] Segfault on filtering taxi dataset
[ https://issues.apache.org/jira/browse/ARROW-13803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17407321#comment-17407321 ]

Neal Richardson commented on ARROW-13803:
-----------------------------------------

Some more investigation: the {{kleene_and}} does seem to be involved. It seems to crash if you {{&}} together two filter expressions where at least one of them is on a float-typed column. I ran several queries with {{&}}-combined filters on integer and string columns and they did not crash.

{code}
> ds %>% filter(passenger_count > 0 & passenger_count < 4) %>% summarize(n = n()) %>% collect()
# A tibble: 1 × 1
           n
1 1373176060
> ds %>% filter(payment_type == "CAS" & payment_type != "CRD") %>% summarize(n=n()) %>% collect()
# A tibble: 1 × 1
         n
1 26876825
> ds %>% filter(total_amount > 0 & total_amount < 4) %>% summarize(n = n()) %>% collect()

 *** caught bus error ***
address 0x139448000, cause 'invalid alignment'
> ds %>% filter(trip_distance > 0 & trip_distance < 4) %>% summarize(n=n()) %>% collect()

 *** caught segfault ***
address 0x120e3c000, cause 'invalid permissions'
{code}
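For readers unfamiliar with the Kleene logic mentioned in the comment above, the following is a minimal sketch of three-valued AND, using {{std::optional<bool>}} to stand in for a nullable boolean. It only illustrates the semantics (false AND null is false, true AND null is null). The function name {{KleeneAndSketch}} is hypothetical, and this is not how Arrow implements the kernel, which operates on validity and value bitmaps rather than per-element optionals.

```cpp
#include <cassert>
#include <optional>

// Hypothetical illustration of Kleene (three-valued) AND.
// std::nullopt plays the role of a null (unknown) value.
inline std::optional<bool> KleeneAndSketch(std::optional<bool> a,
                                           std::optional<bool> b) {
  // A definite false on either side decides the result,
  // even when the other side is null.
  if ((a.has_value() && !*a) || (b.has_value() && !*b)) {
    return false;
  }
  // Neither side is false: the result is true only if both are known.
  if (a.has_value() && b.has_value()) {
    return true;
  }
  // At least one side is null and the other is true or null.
  return std::nullopt;
}
```

Under these semantics a row is dropped by the filter when either predicate is definitely false, while a null on one side leaves the outcome unknown unless the other side is false.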