cyb70289 commented on a change in pull request #9435:
URL: https://github.com/apache/arrow/pull/9435#discussion_r580764366
##########
File path: cpp/src/arrow/util/tdigest.h
##########
@@ -60,12 +61,27 @@ class ARROW_EXPORT TDigest {
input_.push_back(value);
}
+ // skip NAN on adding
+ // TODO(yibo): store NAN as is, partition to buffer end before merging
Review comment:
I don't have a solid benchmark for this.
The idea is to drop the `isnan` check in the `Add` function. When the input buffer is
full, partition the NaNs to the end of the buffer and ignore them during sorting and merging.
I'd guess the performance depends heavily on the input; it may even be worse in the common
case where NaNs are rare and the `isnan` branch is always predicted correctly.
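
A rough sketch of the partition step described above, assuming the input buffer is a plain `std::vector<double>`. `PartitionNaNsAndSort` is a hypothetical helper name used only for illustration; it is not existing code in `tdigest.h`, and the real merge logic is left out.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Arrow's TDigest): called when the input
// buffer is full. Values are appended to the buffer without any isnan check;
// NaNs are only dealt with here, once per flush.
inline std::size_t PartitionNaNsAndSort(std::vector<double>& buffer) {
  // Move all non-NaN values to the front; NaNs end up at the buffer end.
  auto nan_begin = std::partition(buffer.begin(), buffer.end(),
                                  [](double v) { return !std::isnan(v); });
  // Sort only the non-NaN prefix; the NaN tail is ignored.
  std::sort(buffer.begin(), nan_begin);
  // Number of values that should take part in the merge.
  return static_cast<std::size_t>(nan_begin - buffer.begin());
}
```

The merge would then consume only the first `n` elements returned by the helper. This trades a per-value branch in `Add` for one extra pass over the buffer per flush, so whether it wins likely depends on how often NaNs occur.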