adarshsanjeev commented on code in PR #18235:
URL: https://github.com/apache/druid/pull/18235#discussion_r2213804278


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/DataServerQueryHandlerUtils.java:
##########
@@ -48,17 +56,65 @@ private DataServerQueryHandlerUtils()
    * Performs necessary transforms to a query destined for data servers. Does 
not update the list of segments; callers
    * should do this themselves using {@link 
Queries#withSpecificSegments(Query, List)}.
    *
-   * @param query      the query
-   * @param dataSource datasource name
+   * @param query          the query
+   * @param dataSourceName datasource name
    */
-  public static <R, T extends Query<R>> Query<R> prepareQuery(final T query, 
final String dataSource)
+  public static <R, T extends Query<R>> Query<R> prepareQuery(
+      final T query,
+      final int inputNumber,
+      final String dataSourceName
+  )
   {
     // MSQ changes the datasource to an inputNumber datasource. This needs to 
be changed back for data servers
     // to understand.
+    return query.withDataSource(transformDatasource(query.getDataSource(), 
inputNumber, dataSourceName));
+  }
 
-    // BUG: This transformation is incorrect; see 
https://github.com/apache/druid/issues/18198. It loses decorations
-    // such as join, unnest, etc.
-    return query.withDataSource(new TableDataSource(dataSource));
+  /**
+   * Transforms {@link InputNumberDataSource} and {@link 
RestrictedInputNumberDataSource}, which are only understood
+   * by MSQ tasks, back into {@link TableDataSource} and {@link 
RestrictedDataSource} recursivly.
+   */
+  static DataSource transformDatasource(
+      final DataSource dataSource,
+      final int inputNumber,
+      final String dataSourceName
+  )

Review Comment:
   I have gone through the flow, and I am not able to find much improvement in 
terms of validating earlier. 
   
   The only place to put this check seems to be around `WorkerInputs#create`. 
Checking if any of the slices has any realtime segments and checking if the 
stage inputs contain a `StageInputSpec`s should be a validation. This is right 
after we get the segment information to see if there are any realtime segments.
   
   Unfortunately, this seems to be run at the start of a stage, not the start 
of the query, so this is not much better in terms of validating earlier, since 
it would still run the earlier stages. It's bit messy to add a validation here 
as well.
   
   Since we would need to keep the worker validation anyway as a sanity check, 
I'm not sure if this change should be made. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to