nealrichardson commented on pull request #7415: URL: https://github.com/apache/arrow/pull/7415#issuecomment-643332789
Thanks y'all for looking into this. Given where we are right now, this PR is probably an improvement on the status quo. If we had benchmarks showing that the conversion to Date<dbl> had a big penalty, that would be different, but we don't.

But I have to ask: if `funs::if_else()` handles this situation correctly (which I read to mean doesn't error when mixing integer vs. numeric Dates, but maybe I misunderstand what "correctly" means), why make the change here?

Longer discussion: Since the R and Arrow data types aren't exactly the same, there's tension between different objectives of the `arrow` package here:

1. Efficiency and fidelity of moving data from/to Arrow memory, zero-copy where possible (which IIUC would point to using integer)
2. Usability in R and fidelity of R data round trip (which would point to using numeric)

We have a few related issues open currently:

* https://jira.apache.org/jira/browse/ARROW-9083 (int64)
* https://jira.apache.org/jira/browse/ARROW-7798 (related, discussion of R <--> Arrow conversion code)
* https://jira.apache.org/jira/browse/ARROW-7657 (handling Arrow dictionary arrays that don't have string type values)
* https://jira.apache.org/jira/browse/ARROW-6235 (BinaryArray conversion, which doesn't have an exact R analog)
* https://jira.apache.org/jira/browse/ARROW-3628 (Decimal128 conversion)
* https://jira.apache.org/jira/browse/ARROW-3446 (documenting how integer types are converted)

I wonder whether for things like this, there should be option(s) for choosing how to do the conversion, based on your use case and needs.
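
For anyone following along who hasn't run into the integer vs. numeric Date distinction, here is a minimal base-R sketch of it. It doesn't touch the `arrow` package at all, and the variable names (`d_dbl`, `d_int`) are just illustrative; it only shows that two Dates can represent the same day while being stored differently.

```r
# A Date created the usual way is stored as a double (days since 1970-01-01).
d_dbl <- as.Date("2020-06-12")

# The same day stored as an integer, built by hand for illustration.
# This mirrors what an integer-backed conversion could hand back to R.
d_int <- structure(as.integer(unclass(d_dbl)), class = "Date")

storage <- c(dbl = typeof(d_dbl), int = typeof(d_int))
print(storage)

# The two compare equal as dates ...
same_day <- d_dbl == d_int

# ... but they are not identical objects, because the storage type differs.
same_object <- identical(d_dbl, d_int)

print(c(same_day = same_day, same_object = same_object))
```

Code that only compares or prints dates won't notice the difference, but anything that checks the underlying type strictly may treat the two as incompatible, which is roughly the usability side of the trade-off.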