nealrichardson commented on PR #13985:
URL: https://github.com/apache/arrow/pull/13985#issuecomment-1235746964

   The benchmark regressions, or at least the worst of them, are due to 
ARROW-17601. By keeping the computation on Decimal types instead of casting to 
double, we hit an expression that, by our current logic, would need to promote 
to a scale that can't fit in Decimal128, so the evaluation errors somewhere. 
Because these queries are evaluated on an Arrow Table, the error makes them 
fall back to pulling all the data into an R data.frame and doing the work 
there, hence the regression.
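
   To illustrate how quickly decimal types can outgrow Decimal128, here is a 
rough sketch in plain Python (no Arrow dependency). The promotion rules are 
written from my recollection of the Arrow compute documentation and should be 
treated as an assumption, as should the decimal(15, 2) column types and the 
decimal(1, 0) type for the scalar `1`. The expression is a TPC-H-style 
`price * (1 - discount) * (1 + tax)`, chosen only for illustration. It is not 
necessarily the expression behind ARROW-17601, and it overflows on precision 
rather than on scale.

```python
# Decimal128 can hold at most 38 decimal digits of precision.
DECIMAL128_MAX_PRECISION = 38


def add_type(a, b):
    """Result (precision, scale) for a + b or a - b (assumed rule)."""
    (p1, s1), (p2, s2) = a, b
    scale = max(s1, s2)
    precision = max(p1 - s1, p2 - s2) + scale + 1
    return precision, scale


def multiply_type(a, b):
    """Result (precision, scale) for a * b (assumed rule)."""
    (p1, s1), (p2, s2) = a, b
    return p1 + p2 + 1, s1 + s2


def fits_decimal128(t):
    """True if the (precision, scale) pair fits in a Decimal128."""
    precision, _scale = t
    return precision <= DECIMAL128_MAX_PRECISION


# Hypothetical column and scalar types, for illustration only.
price = discount = tax = (15, 2)
one = (1, 0)

one_minus_discount = add_type(one, discount)       # 1 - discount
one_plus_tax = add_type(one, tax)                  # 1 + tax
step1 = multiply_type(price, one_minus_discount)   # price * (1 - discount)
step2 = multiply_type(step1, one_plus_tax)         # ... * (1 + tax)

for name, t in [("step1", step1), ("step2", step2)]:
    print(name, t, "fits" if fits_decimal128(t) else "exceeds Decimal128")
```

   Under these assumed rules the first product still fits, but the second 
needs more than 38 digits. That is the kind of result type that cannot be 
represented without either erroring or casting to a narrower type or to double.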
   
   I'll see what I can do to mitigate or work around this in this PR. The most 
extreme option would be to not cast scalars to decimal, i.e. restore the status 
quo, in which most queries on decimal data end up getting coerced to float. 
But hopefully we can do better than that.
   
   We have very few tests for queries on decimal types, but decimal columns 
are all over the TPC-H data, which is why we only observed this in the 
benchmarks. That gap in test coverage should probably get rectified too.

