gortiz commented on PR #12354: URL: https://github.com/apache/pinot/pull/12354#issuecomment-1961194752
That is the interesting part. Aggregation operators implemented before did not used RoaringBitmap that well and they were iterating over it. Instead here we are not doing that. We are looking for the next null value. In this benchmark that is specially useful because we have 127 not null values followed by a single null value and we repeat that pattern all the time, so we end up executing 127 rows in a loop. We can use the benchmark to play with different distributions, where a distribution where odd positions are null and even positions are not null (or the other way around) will probably be the worst distribution possible. > Micro-benchmark is usually not accurate when too much code is tested (especially multi-threaded). Can you set up a benchmark to only test the aggregation function itself? We may construct a BlockValSet and pass it into the function. We can think the other way around. Benchmarking this way we get results closer to the actual performance we would get. It shouldn't be actually problematic to spend twice much time with null handling if the operation is actually a 5% of the total execution time. On the other hand, we can see that some patterns are significantly slower, which means that the difference there is actually important. For example with `mode` we can see that version 2 is significantly slower than version 5. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org