Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/23900 )
Change subject: IMPALA-14488: Calcite planner: Support fallback to Original planner ...................................................................... Patch Set 5: (8 comments) http://gerrit.cloudera.org:8080/#/c/23900/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/23900/5//COMMIT_MSG@9 PS5, Line 9: RUNTIME_PLANNERS I'd rather not include "runtime" in the name. Maybe "planner_precedence" or "planner_preference"? I don't know a good name for this. http://gerrit.cloudera.org:8080/#/c/23900/5/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/23900/5/be/src/service/impala-server.cc@2030 PS5, Line 2030: if (FLAGS_use_calcite_planner) { Should we change the use_calcite_planner flag to something else? http://gerrit.cloudera.org:8080/#/c/23900/5/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/23900/5/be/src/service/query-options.cc@1396 PS5, Line 1396: case TImpalaQueryOptions::RUNTIME_PLANNERS: { : std::vector<TPlannerType::type> runtime_planners; : const string planners = std::regex_replace(value, std::regex("^\"|\"$"), ""); : vector<string> str_types; : split(str_types, planners, is_any_of(","), token_compress_on); : TrimAndRemoveEmptyString(str_types); : for (const auto& t : str_types) { : TPlannerType::type planner_type; : RETURN_IF_ERROR(GetThriftEnum(t, "planner type", : _TPlannerType_VALUES_TO_NAMES, &planner_type)); : runtime_planners.push_back(planner_type); : } : query_options->__set_runtime_planners(runtime_planners); Does this produce an error if runtime_planners is empty? It's possible to add a test in query-options-test.cc (e.g. there is a test there for enabled_runtime_filter_types). Also, we can verify that there are no duplicates (e.g. CALCITE,CALCITE,CALCITE). http://gerrit.cloudera.org:8080/#/c/23900/5/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/23900/5/common/thrift/ImpalaService.thrift@1033 PS5, Line 1033: RUNTIME_PLANNERS = 191 > Enum item USE_CALCITE_PLANNER=191 changed to RUNTIME_PLANNERS=191 in TImpal I think you can leave USE_CALCITE_PLANNER, but remove the comment and add a "// Removed" comment next to it. Then add a new number for this one. That should fix this complaint. http://gerrit.cloudera.org:8080/#/c/23900/5/common/thrift/Query.thrift File common/thrift/Query.thrift: http://gerrit.cloudera.org:8080/#/c/23900/5/common/thrift/Query.thrift@790 PS5, Line 790: 192: optional list<PlanNodes.TPlannerType> runtime_planners = [TPlannerType.ORIGINAL] > Renaming field 'use_calcite_planner' to 'runtime_planners' in TQueryOptions I think it is ok to remove use_calcite_planner and leave this particular numeric id unused. So, this new option can be a new numeric id, and the check shouldn't complain. http://gerrit.cloudera.org:8080/#/c/23900/5/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/23900/5/fe/src/main/java/org/apache/impala/service/Frontend.java@2406 PS5, Line 2406: private TExecRequest getTExecRequestWithFallback( : PlanCtx planCtx, EventSequence timeline) throws ImpalaException { : TExecRequest request = null; : LOG.info("Searching for planner to use..."); : TPlannerType plannerType = getPlannerType(planCtx, false); : CompilerFactory compilerFactory = getCompilerFactory(plannerType); : LOG.info("Attempting with " + compilerFactory.getPlannerString() + " Planner."); : Throwable firstAttemptPlannerException = null; : TPlannerType fallbackPlanner = null; I guess the question is whether this code should be explicit about having at most two options or whether it should try to be handling the list as a list. This is a gist of what the list one would look like: bool onlyCalcite = plannerList.size() == 1 && plannerList.get(0) == CALCITE; Throwable error = null; TQueryOptions queryOptions = ctx.getQueryContext().client_request.getQuery_options(); List<TPlannerType> plannerTypes = queryOptions.getRuntime_planners(); for (TPlannerType plannerType : plannerTypes) { error = null; CompilerFactory compilerFactory = getCompilerFactory(plannerType); try { TExecRequest request = getTExecRequest(compilerFactory, planCtx, timeline); addPlannerToProfile(compiler.getPlannerString()); return request; } catch (Throwable e) { error = e; // Add entry for compilerFactory.getPlannerString() + "FailureReason" with the exception info } } // None of the planners worked. Exceptional case: If only Calcite, fallback for unsupported SQLs if (error != null && onlyCalcite && isUnsupportedCalciteSQL(planCtx, error)) { // run original planner } throw error; http://gerrit.cloudera.org:8080/#/c/23900/5/fe/src/main/java/org/apache/impala/service/Frontend.java@2423 PS5, Line 2423: // For Calcite planner: Certain queries aren't supported by the planner, e.g. DDL. : // If the query isn't supported, we always fallback to the original planner, : // regardless of whether the RUNTIME_PLANNERS query option specifies a fallback : // planner. : fallbackPlanner = isUnsupportedCalciteSQL(plannerType, planCtx, e) : ? TPlannerType.ORIGINAL : : getPlannerType(planCtx, true); We only need to run this if this is Calcite and there isn't a fallback planner, so we can get the fallback planner and only do this isUnsupportedCalciteSQL check if fallbackPlanner is null. That would avoid an extra parse for CALCITE,ORIGINAL. http://gerrit.cloudera.org:8080/#/c/23900/5/fe/src/main/java/org/apache/impala/service/Frontend.java@3822 PS5, Line 3822: try { : CompilerFactory calciteFactory = (CompilerFactory) Class.forName( : "org.apache.impala.calcite.service." + : "CalciteCompilerFactory").newInstance(); : if (calciteFactory != null) { : LOG.info("Found Calcite Planner, using it."); : return calciteFactory; : } else { : LOG.info("Could not find Calcite planner."); : //XXX: what kind of exception should be thrown here? Or should we fallback : // even though explicit planner is mentioned? : throw new RuntimeException("Could not find Calcite Planner"); : } : } catch (Exception e) { : LOG.info("Could not find Calcite planner: " + e); : //XXX: what kind of exception should be thrown here? Or should we fallback even : // though explicit planner is mentioned? : throw new RuntimeException(e); : } This basically shouldn't happen and is a configuration issue. Users always have a way out: Don't use Calcite. So, I don't feel like this really needs to fall back or be handled. I also think we could fail startup if someone has Calcite enabled by default and we can't find Calcite. -- To view, visit http://gerrit.cloudera.org:8080/23900 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id1fdc5ef92fff84e89af0e19c4246cc15e2ea823 Gerrit-Change-Number: 23900 Gerrit-PatchSet: 5 Gerrit-Owner: Steve Carlin <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Steve Carlin <[email protected]> Gerrit-Comment-Date: Tue, 27 Jan 2026 23:21:34 +0000 Gerrit-HasComments: Yes
