blongworth commented on issue #39912:
URL: https://github.com/apache/arrow/issues/39912#issuecomment-2648335956

   Hi @zanmato1984, the hardest part will probably be finding a minimal 
dataset that reproduces the issue. Once we have one, reproducing it in C++ or 
Python would help isolate the problem. Let me know if I can help with setting 
up or testing in R. 
   
   For my data, I've whittled it down to the summarize step that counts 
elements in each group:
   
   ```r
   # Count the rows in each timestamp group, then pull the result into R
   dsd <- ds |>
     group_by(timestamp) |>
     summarize(n = n()) |>
     collect()
   ```
   
   I'm not sure whether the grouping/summarizing or the counting itself 
triggers it. I still see the problem in arrow 19.0.0.
