Re: [PR] Set explain attributes after the query is prepared (druid)

via GitHub Wed, 05 Jul 2023 23:24:49 -0700


abhishekrb19 commented on code in PR #14490:
URL: https://github.com/apache/druid/pull/14490#discussion_r1253988955



##########
sql/src/main/java/org/apache/druid/sql/calcite/parser/DruidSqlParserUtils.java:
##########
@@ -327,84 +325,54 @@ public static SqlOrderBy 
convertClusterByToOrderBy(SqlNode query, SqlNodeList cl
    * </pre>
    *
    * <p>
-   * The function will return the following clusteredBy columns for the above 
SQL: ["__time", "page_alias", "FLOOR(\"cost\")", cityName"]
+   * The function will return the following clusteredBy columns for the above 
SQL: ["__time", "page_alias", "FLOOR(\"cost\")", cityName"].
    * Any ordinal and expression specified in the CLUSTERED BY clause will 
resolve to the final output column name.
    * </p>
+   * <p>
+   * This function must be called after the query is prepared when all the 
validations are complete, including {@link #validateClusteredByColumns},
+   * so we can safely access the arguments.
+   * </p>
    * @param clusteredByNodes  List of {@link SqlNode}s representing columns to 
be clustered by.
-   * @param sourceNode The select or order by source node.
+   * @param sourceFieldMappings The source field output mappings extracted 
from the validated root query rel node post prepare phase.
+   *
    */
   @Nullable
-  public static List<String> 
resolveClusteredByColumnsToOutputColumns(SqlNodeList clusteredByNodes, SqlNode 
sourceNode)
+  public static List<String> resolveClusteredByColumnsToOutputColumns(
+      final SqlNodeList clusteredByNodes,
+      final ImmutableList<Pair<Integer, String>> sourceFieldMappings
+  )
   {
     // CLUSTERED BY is an optional clause
     if (clusteredByNodes == null) {
       return null;
     }
 
-    Preconditions.checkArgument(
-        sourceNode instanceof SqlSelect || sourceNode instanceof SqlOrderBy,
-        "Source node must be either SqlSelect or SqlOrderBy, but found [%s]",
-        sourceNode == null ? null : sourceNode.getKind()
-    );
-
-    final SqlSelect selectNode = (sourceNode instanceof SqlSelect) ? 
(SqlSelect) sourceNode
-                                                                   : 
(SqlSelect) ((SqlOrderBy) sourceNode).query;
-    final List<SqlNode> selectList = selectNode.getSelectList().getList();
     final List<String> retClusteredByNames = new ArrayList<>();
 
     for (SqlNode clusteredByNode : clusteredByNodes) {
 
-      if (SqlUtil.isLiteral(clusteredByNode)) {
-        // The node is a literal number -- an ordinal is specified in the 
CLUSTERED BY clause. Validate and lookup the
-        // ordinal in the select list.
-        int ordinal = ((SqlNumericLiteral) 
clusteredByNode).getValueAs(Integer.class);
-        if (ordinal < 1 || ordinal > selectList.size()) {
-          throw InvalidSqlInput.exception(
-              "Ordinal[%d] specified in the CLUSTERED BY clause is invalid. It 
must be between 1 and %d.",
-              ordinal,
-              selectList.size()
-          );
-        }
-        SqlNode node = selectList.get(ordinal - 1);
-
-        if (node instanceof SqlBasicCall) {
-          retClusteredByNames.add(getColumnNameFromSqlCall(node));
-        } else {
-          Preconditions.checkArgument(
-              node instanceof SqlIdentifier,
-              "Node must be a SqlIdentifier, but found [%s]",
-              node.getKind()
-          );
-          SqlIdentifier n = ((SqlIdentifier) node);
-          retClusteredByNames.add(n.isSimple() ? n.getSimple() : 
n.names.get(1));
-        }
+      if (clusteredByNode instanceof SqlNumericLiteral) {
+        // The node is a literal number -- an ordinal in the CLUSTERED BY 
clause. Lookup the ordinal in field mappings.
+        final int ordinal = ((SqlNumericLiteral) 
clusteredByNode).getValueAs(Integer.class);
+        retClusteredByNames.add(sourceFieldMappings.get(ordinal - 1).right);
       } else if (clusteredByNode instanceof SqlBasicCall) {
         // The node is an expression/operator.
-        retClusteredByNames.add(getColumnNameFromSqlCall(clusteredByNode));
+        retClusteredByNames.add(getColumnNameFromSqlCall((SqlBasicCall) 
clusteredByNode));
       } else {

Review Comment:
   This block is a catch-all and doesn't assume any specific node type. I don't 
think we need to do any special handling to get the output columns other than 
ordinals and expressions, so this simple node `toString()` should suffice. 
Also, my rationale is that if we need to constrain the node type for the 
`CLUSTERED BY` grammar or add validation checks, that needs to be in the SQL 
planner code outside this explain plan logic.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] Set explain attributes after the query is prepared (druid)

Reply via email to