blongworth commented on issue #39912:
URL: https://github.com/apache/arrow/issues/39912#issuecomment-2648335956
Hi @zanmato1984, the hardest part will probably be finding a minimal dataset
that reproduces the issue. Once we have one, reproducing it in C++ or Python
would help isolate the cause. LMK if I can help with setting up or testing
in R.
For my data, I've whittled it down to the summarize step that counts
elements in each group:
```
library(arrow)
library(dplyr)

# `ds` is the Arrow dataset opened earlier (not shown here)
dsd <- ds |>
  group_by(timestamp) |>
  summarize(n = n()) |>
  collect()
```
I'm not sure whether it's the summarizing or the counting. I still see the
problem in arrow 19.0.0.