nealrichardson commented on PR #13985: URL: https://github.com/apache/arrow/pull/13985#issuecomment-1235746964
The benchmark regressions, at least the worst of them, are due to ARROW-17601. By keeping the computation on Decimal types instead of casting to double, we hit an expression that, by our current logic, would need to promote to a scale that can't fit in Decimal128. The evaluation errors somewhere, and because these queries are evaluated on an Arrow Table, we fall back to pulling all the data into an R data.frame and doing the work there, hence the regression.

I'll see what I can do to mitigate or work around this in this PR. The most extreme option would be to not cast scalars to decimal at all, i.e. restore the status quo, where most queries on decimal data would end up getting coerced to float. But hopefully we can do better than that.

We have very few tests for queries on decimal types, but they're all over the TPC-H data, which is why we only observed this in the benchmarks. That should probably get rectified too.
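For reference, here is a minimal sketch of the kind of query involved. The column names, values, precision/scale, and the literal `0.5` are illustrative (not the actual TPC-H expression), and whether it errors or falls back depends on the version and on the promotion logic described above:

```r
library(arrow)
library(dplyr)

# Two decimal columns wide enough that arithmetic on them pushes the
# promoted result type toward Decimal128's limit of 38 digits of precision.
x <- Array$create(c(123L, 456L))$cast(decimal128(precision = 38, scale = 10))
y <- Array$create(c(789L, 12L))$cast(decimal128(precision = 38, scale = 10))
tab <- arrow_table(x = x, y = y)

# If the literal 0.5 is kept as a decimal scalar rather than cast to double,
# the promoted output type for an expression like this may require more
# precision/scale than Decimal128 can represent; in that case the evaluation
# errors and, because we're querying an Arrow Table, the dplyr bindings pull
# the data into an R data.frame and evaluate there, which is the regression
# described above.
tab %>%
  mutate(z = x * y * 0.5) %>%
  collect()
```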