gortiz commented on PR #12354:
URL: https://github.com/apache/pinot/pull/12354#issuecomment-1961194752

   That is the interesting part. Aggregation operators implemented before did 
not used RoaringBitmap that well and they were iterating over it. Instead here 
we are not doing that. We are looking for the next null value. In this 
benchmark that is specially useful because we have 127 not null values followed 
by a single null value and we repeat that pattern all the time, so we end up 
executing 127 rows in a loop.
   
   We can use the benchmark to play with different distributions, where a 
distribution where odd positions are null and even positions are not null (or 
the other way around) will probably be the worst distribution possible.
   
   > Micro-benchmark is usually not accurate when too much code is tested 
(especially multi-threaded). Can you set up a benchmark to only test the 
aggregation function itself? We may construct a BlockValSet and pass it into 
the function.
   
   We can think the other way around. Benchmarking this way we get results 
closer to the actual performance we would get. It shouldn't be actually 
problematic to spend twice much time with null handling if the operation is 
actually a 5% of the total execution time. On the other hand, we can see that 
some patterns are significantly slower, which means that the difference there 
is actually important. For example with `mode` we can see that version 2 is 
significantly slower than version 5.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to