thisisnic commented on issue #36807:
URL: https://github.com/apache/arrow/issues/36807#issuecomment-1645426872
I tried again, and the file read in fine without `head()`. When I ran it
with `head()` followed by `collect()`, the data was read successfully and
then R segfaulted immediately afterwards.
```
> open_dataset("/data/nyc-taxi/year=2016/month=11/part-0.parquet") %>% head() %>% collect()
# A tibble: 6 × 22
vendor_name pickup_datetime dropoff_datetime passenger_count
<chr> <dttm> <dttm> <int>
1 VTS 2016-11-10 20:14:06 2016-11-10 20:19:37 1
2 VTS 2016-11-10 20:14:06 2016-11-10 20:43:31 1
3 VTS 2016-11-10 20:14:06 2016-11-10 20:17:24 1
4 VTS 2016-11-10 20:14:06 2016-11-10 20:20:12 1
5 CMT 2016-11-10 20:14:07 2016-11-10 20:20:23 1
6 CMT 2016-11-10 20:14:07 2016-11-10 21:13:19 2
# ℹ 18 more variables: trip_distance <dbl>, pickup_longitude <dbl>,
# pickup_latitude <dbl>, rate_code <chr>, store_and_fwd <chr>,
# dropoff_longitude <dbl>, dropoff_latitude <dbl>, payment_type <chr>,
# fare_amount <dbl>, extra <dbl>, mta_tax <dbl>, tip_amount <dbl>,
# tolls_amount <dbl>, total_amount <dbl>, improvement_surcharge <dbl>,
# congestion_surcharge <dbl>, pickup_location_id <int>,
# dropoff_location_id <int>
>
Thread 11 "R" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffbffff640 (LWP 480578)]
0x0000000000000000 in ?? ()
```