siddharthteotia commented on a change in pull request #4535: Implement DISTINCT clause URL: https://github.com/apache/incubator-pinot/pull/4535#discussion_r315381026
########## File path: pinot-core/src/main/java/org/apache/pinot/core/operator/blocks/IntermediateResultsBlock.java ########## @@ -231,41 +237,123 @@ private DataTable getSelectionResultDataTable() SelectionOperatorUtils.getDataTableFromRows(_selectionResult, _selectionDataSchema)); } + @Nonnull private DataTable getAggregationResultDataTable() throws Exception { // Extract each aggregation column name and type from aggregation function context. - int numAggregationFunctions = _aggregationFunctionContexts.length; - String[] columnNames = new String[numAggregationFunctions]; - DataSchema.ColumnDataType[] columnDataTypes = new DataSchema.ColumnDataType[numAggregationFunctions]; - for (int i = 0; i < numAggregationFunctions; i++) { - AggregationFunctionContext aggregationFunctionContext = _aggregationFunctionContexts[i]; - columnNames[i] = aggregationFunctionContext.getAggregationColumnName(); - columnDataTypes[i] = aggregationFunctionContext.getAggregationFunction().getIntermediateResultColumnType(); + if (AggregationFunctionUtils.isDistinct(_aggregationFunctionContexts)) { Review comment: Other than the two special casing occurrences you pointed out earlier in FunctionCallAstNode and SelectAstNode (which I have mentioned in my comments https://github.com/apache/incubator-pinot/pull/4535#discussion_r315355301 and https://github.com/apache/incubator-pinot/pull/4535#discussion_r315314108 as to why they are specifically needed), there are following places in the code where we do the check for DISTINCT (1) DefaultAggregationExecutor -- when operator initializes the executor Earlier for each function we had to build column expression tree (since each function can only have a single column). Now for DISTINCT function, we will have to do this for each column (2) IntermediateResultBlock -- when building the data table. Since now we are sending proper matrix for a DISTINCT query (similar to DataTable for Selection query). The above reason is also applicable here. (3) AggregationPlanNode -- to set the projected column types in function context to be used later during building the data table -- this is needed since for aggregation functions, the type was fixed at Long, Double. Seeing if this can be done in a better way ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org