gianm commented on a change in pull request #6062: Fix a bug in GroupByQueryEngine URL: https://github.com/apache/incubator-druid/pull/6062#discussion_r206202261
########## File path: processing/src/main/java/io/druid/query/groupby/GroupByQueryEngine.java ########## @@ -178,37 +178,33 @@ public int getNumRows() return positions; } - private List<ByteBuffer> updateValues( - ByteBuffer key, - List<DimensionSelector> dims - ) + @Nullable + private List<ByteBuffer> updateValues(ByteBuffer key, List<DimensionSelector> dims) { if (dims.size() > 0) { - List<ByteBuffer> retVal = null; - List<ByteBuffer> unaggregatedBuffers = null; - final DimensionSelector dimSelector = dims.get(0); final IndexedInts row = dimSelector.getRow(); final int rowSize = row.size(); if (rowSize == 0) { ByteBuffer newKey = key.duplicate(); newKey.putInt(MISSING_VALUE); - unaggregatedBuffers = updateValues(newKey, dims.subList(1, dims.size())); + return updateValues(newKey, dims.subList(1, dims.size())); } else { + List<ByteBuffer> retVal = null; for (int i = 0; i < rowSize; i++) { ByteBuffer newKey = key.duplicate(); int dimValue = row.get(i); newKey.putInt(dimValue); - unaggregatedBuffers = updateValues(newKey, dims.subList(1, dims.size())); - } - } - if (unaggregatedBuffers != null) { - if (retVal == null) { - retVal = Lists.newArrayList(); + List<ByteBuffer> unaggregatedBuffers = updateValues(newKey, dims.subList(1, dims.size())); Review comment: > @jihoonson or @jon-wei is that correct? If yes, do you plan to remove groupBy v1? It is correct, groupBy v1 is legacy and should be removed in the future. There is one case I am aware of where it is 'better' than v2: if you have some growable aggregators (like some sketches), v1 will probably use less memory than v2. This is because v2 uses off-heap aggregators which allocate all the memory they need up-front, but v1 uses on-heap aggregators that can grow gradually. I don't have the issue number handy, but someone raised this in the past as a reason that they switched back from v2 to v1 (they were doing groupBy on a high-cardinality dimension with thetaSketch metrics). I think if off-heap aggregators could learn how to start small and then grow, then v1 will be really outpaced by v2 in all known cases and could be removed. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org