clintropolis commented on issue #8822: optimize numeric column null value checking for low filter selectivity (more rows)
URL: https://github.com/apache/incubator-druid/pull/8822#issuecomment-549995118

> I'm curious what percentage of the entire cost for processing a row a null check would be so we can have a good idea of what % overhead we're talking about.

The last 3 animations show the estimated per-row cost in nanoseconds for each of the 3 strategies. To summarize:

* `get` - most of the numbers look to be in the 10-25ns per row range (higher at low selectivity, where it matters most)
* `IntIterator` - about half are under 10ns per row (at low selectivity). This is definitely the best, but at super high selectivities (0.1% of rows selected) with very dense bitmaps it climbs to over a couple of microseconds per row
* `PeekableIntIterator` - about half are between 10-15ns per row (at low selectivity) and most are below 25ns. It also has more overhead with dense bitmaps at very high selectivity, but only climbs to about 50-60ns per row in the worst case

To me it seems like a toss-up between the `PeekableIntIterator` and the plain `IntIterator`. It almost seems worth it to eat the slow per-row times at high selectivity in exchange for that 5ns per row at low selectivity. Both approaches fare better at low selectivity than using `get`, so I went more conservative and used the `PeekableIntIterator` for now.
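
For readers unfamiliar with the three strategies compared above, here is a minimal sketch of how each one might answer "is this selected row null?". It is not the code from the PR and does not use the RoaringBitmap API: `NullCheckSketch`, `PeekableCursor`, and the method names are hypothetical, a `java.util.BitSet` and a sorted `int[]` stand in for the null-value bitmap, and selected rows are assumed to arrive in ascending order. The cursor's `peekNext` and `advanceIfNeeded` are only loosely modeled on what a peekable bitmap iterator offers.

```java
import java.util.Arrays;
import java.util.BitSet;

// Illustrative only: BitSet and sorted int[] stand in for the null-value bitmap.
// None of these names come from Druid or RoaringBitmap.
final class NullCheckSketch {

    // Strategy 1: random-access membership test for every selected row (the `get` approach).
    static int countNullsWithGet(BitSet nullBitmap, int[] selectedRows) {
        int nulls = 0;
        for (int row : selectedRows) {
            if (nullBitmap.get(row)) {
                nulls++;
            }
        }
        return nulls;
    }

    // Strategy 2: plain forward iteration; null rows are stepped over one at a time.
    static int countNullsWithIterator(int[] sortedNullRows, int[] selectedRows) {
        int nulls = 0;
        int pos = 0;
        for (int row : selectedRows) {
            while (pos < sortedNullRows.length && sortedNullRows[pos] < row) {
                pos++; // every null row between two selected rows is visited
            }
            if (pos < sortedNullRows.length && sortedNullRows[pos] == row) {
                nulls++;
            }
        }
        return nulls;
    }

    // Hypothetical peekable cursor that can skip ahead instead of stepping.
    static final class PeekableCursor {
        private final int[] sorted;
        private int pos = 0;

        PeekableCursor(int[] sorted) {
            this.sorted = sorted;
        }

        boolean hasNext() {
            return pos < sorted.length;
        }

        int peekNext() {
            return sorted[pos];
        }

        // Move to the first value >= minVal using a binary search over the remainder.
        void advanceIfNeeded(int minVal) {
            if (pos < sorted.length && sorted[pos] < minVal) {
                int idx = Arrays.binarySearch(sorted, pos, sorted.length, minVal);
                pos = idx >= 0 ? idx : -(idx + 1);
            }
        }
    }

    // Strategy 3: peek at the next null row, skipping ahead to the current selected row first.
    static int countNullsWithPeekable(int[] sortedNullRows, int[] selectedRows) {
        PeekableCursor cursor = new PeekableCursor(sortedNullRows);
        int nulls = 0;
        for (int row : selectedRows) {
            cursor.advanceIfNeeded(row);
            if (cursor.hasNext() && cursor.peekNext() == row) {
                nulls++;
            }
        }
        return nulls;
    }

    public static void main(String[] args) {
        int[] nullRows = {1, 3, 4, 8};
        int[] selectedRows = {0, 3, 5, 8, 9};
        BitSet nullBitmap = new BitSet();
        for (int row : nullRows) {
            nullBitmap.set(row);
        }
        System.out.println(countNullsWithGet(nullBitmap, selectedRows));
        System.out.println(countNullsWithIterator(nullRows, selectedRows));
        System.out.println(countNullsWithPeekable(nullRows, selectedRows));
    }
}
```

One plausible reading of the numbers above, not stated in the comment itself, is that the plain iterator has to walk every null row lying between two selected rows, which gets expensive when the bitmap is dense and few rows are selected, while a skip-ahead cursor bounds that cost at the price of a little extra work per row.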