jonkeane commented on pull request #11783: URL: https://github.com/apache/arrow/pull/11783#issuecomment-981858152
This looks good to me and should work. We lose a bit of precision with floats, but this is a heuristic anyway, so that shouldn't be a big deal. The only thing that might be nice is a test (which would be long running if we were to test it directly, of course). I wonder if it would make sense to pull [this code](https://github.com/apache/arrow/blob/6c236ca72b0bc96c18fbeb57ba36047a1d55aa62/r/R/parquet.R#L205-L222) out into a separate function that takes the dims as input and returns `chunk_size`. Then we could test the logic on arbitrarily large numbers without needing to generate the data itself. This is very much scope creep, so feel free to punt that to a separate ticket. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
