[ https://issues.apache.org/jira/browse/SPARK-23171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Teng Peng updated SPARK-23171:
------------------------------
    Issue Type: Umbrella  (was: Improvement)

> Reduce the time costs of the rule runs that do not change the plans
> --------------------------------------------------------------------
>
>                 Key: SPARK-23171
>                 URL: https://issues.apache.org/jira/browse/SPARK-23171
>             Project: Spark
>          Issue Type: Umbrella
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Xiao Li
>            Priority: Major
>
> Below is the time stats of Analyzer/Optimizer rules. Try to improve the rules
> and reduce the time costs, especially for the runs that do not change the
> plans.
> {noformat}
> === Metrics of Analyzer/Optimizer Rules ===
> Total number of runs = 175827
> Total time: 20.699042877 seconds
>
> Rule  Total Time  Effective Time  Total Runs  Effective Runs
>
> org.apache.spark.sql.catalyst.optimizer.ColumnPruning  2340563794  1338268224  1875  761
> org.apache.spark.sql.catalyst.analysis.Analyzer$CTESubstitution  1632672623  1625071881  788  37
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAggregateFunctions  1395087131  347339931  1982  38
> org.apache.spark.sql.catalyst.optimizer.PruneFilters  1177711364  21344174  1590  3
> org.apache.spark.sql.catalyst.optimizer.Optimizer$OptimizeSubqueries  1145135465  1131417128  285  39
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences  1008347217  663112062  1982  616
> org.apache.spark.sql.catalyst.optimizer.ReorderJoin  767024424  693001699  1590  132
> org.apache.spark.sql.catalyst.analysis.Analyzer$FixNullability  598524650  40802876  742  12
> org.apache.spark.sql.catalyst.analysis.DecimalPrecision  595384169  436153128  1982  211
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveSubquery  548178270  459695885  1982  49
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$ImplicitTypeCasts  423002864  139869503  1982  86
> org.apache.spark.sql.catalyst.optimizer.BooleanSimplification  405544962  17250184  1590  7
> org.apache.spark.sql.catalyst.optimizer.PushPredicateThroughJoin  383837603  284174662  1590  708
> org.apache.spark.sql.catalyst.optimizer.RemoveRedundantAliases  372901885  3362332  1590  9
> org.apache.spark.sql.catalyst.optimizer.InferFiltersFromConstraints  364628214  343815519  285  192
> org.apache.spark.sql.execution.datasources.FindDataSourceTable  303293296  285344766  1982  233
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions  233195019  92648171  1982  294
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$FunctionArgumentConversion  220568919  73932736  1982  38
> org.apache.spark.sql.catalyst.optimizer.NullPropagation  207976072  9072305  1590  26
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$PromoteStrings  207027618  37834145  1982  40
> org.apache.spark.sql.catalyst.optimizer.PushDownPredicate  203382836  176482044  1590  783
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$InConversion  192152216  15738573  1982  1
> org.apache.spark.sql.catalyst.optimizer.ConstantFolding  191624610  58857553  1590  126
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$CaseWhenCoercion  183008262  78280172  1982  29
> org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractGenerator  176935299  0  1982  0
> org.apache.spark.sql.catalyst.analysis.ResolveTimeZone  170161002  74354990  1982  417
> org.apache.spark.sql.catalyst.optimizer.ReorderAssociativeOperator  166173174  0  1590  0
> org.apache.spark.sql.catalyst.optimizer.OptimizeIn  155410763  8197045  1590  16
> org.apache.spark.sql.catalyst.optimizer.SimplifyCaseConversionExpressions  153726565  0  1590  0
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$IfCoercion  153013269  0  1982  0
> org.apache.spark.sql.catalyst.optimizer.SimplifyCasts  146693495  13537077  1590  69
> org.apache.spark.sql.catalyst.optimizer.SimplifyBinaryComparison  144818581  0  1590  0
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$DateTimeOperations  143943308  6889302  1982  27
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$Division  142925142  12653147  1982  8
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$BooleanEquality  142775965  0  1982  0
> org.apache.spark.sql.catalyst.optimizer.SimplifyConditionals  141509150  0  1590  0
> org.apache.spark.sql.catalyst.optimizer.LikeSimplification  132387762  636851  1590  1
> org.apache.spark.sql.catalyst.optimizer.RemoveDispensableExpressions  127412361  0  1590  0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveWindowFrame  126772671  9317887  1982  21
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$ConcatCoercion  116484407  0  1982  0
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$EltCoercion  115402736  0  1982  0
> org.apache.spark.sql.catalyst.analysis.ResolveCreateNamedStruct  115071447  0  1982  0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveWindowOrder  113115366  4563584  1982  14
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$WindowFrameCoercion  107747140  0  1982  0
> org.apache.spark.sql.catalyst.optimizer.EliminateOuterJoin  105020607  13907906  1590  11
> org.apache.spark.sql.catalyst.analysis.TimeWindowing  101018029  0  1982  0
> org.apache.spark.sql.catalyst.optimizer.RewriteCorrelatedScalarSubquery  98043747  7044358  1590  7
> org.apache.spark.sql.catalyst.optimizer.ConstantPropagation  95173536  0  1590  0
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$StackCoercion  94134701  0  1982  0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveGroupingAnalytics  84419135  33892351  1982  11
> org.apache.spark.sql.execution.datasources.DataSourceAnalysis  83297816  77023484  742  24
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveDeserializer  77880196  36980636  1982  148
> org.apache.spark.sql.execution.datasources.PreprocessTableCreation  74091407  0  742  0
> org.apache.spark.sql.catalyst.analysis.CleanupAliases  73837147  37105855  1086  344
> org.apache.spark.sql.catalyst.optimizer.RemoveRedundantProject  73534618  31752937  1875  344
> org.apache.spark.sql.execution.datasources.v2.PushDownOperatorsToDataSource  70120541  0  285  0
> org.apache.spark.sql.catalyst.optimizer.FoldablePropagation  67941776  0  1590  0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractWindowExpressions  62917712  22092402  1982  23
> org.apache.spark.sql.catalyst.optimizer.CombineFilters  61116313  41021442  1590  449
> org.apache.spark.sql.catalyst.optimizer.CollapseProject  60872313  30994661  1875  279
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAliases  58453489  12511798  1982  47
> org.apache.spark.sql.catalyst.analysis.Analyzer$LookupFunctions  58154315  0  750  0
> org.apache.spark.sql.execution.datasources.PruneFileSourcePartitions  54678669  0  285  0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveMissingReferences  53518211  7209138  1982  8
> org.apache.spark.sql.catalyst.optimizer.PullupCorrelatedPredicates  45840637  29436271  285  23
> org.apache.spark.sql.catalyst.optimizer.CollapseRepartition  43321502  0  1590  0
> org.apache.spark.sql.catalyst.analysis.Analyzer$PullOutNondeterministic  42117785  0  742  0
> org.apache.spark.sql.catalyst.optimizer.ComputeCurrentTime  40843184  0  285  0
> org.apache.spark.sql.catalyst.optimizer.PushProjectionThroughUnion  39997563  5899863  1590  10
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations  39412748  22359409  1990  233
> org.apache.spark.sql.catalyst.optimizer.CombineUnions  38823264  1534424  1875  17
> org.apache.spark.sql.catalyst.analysis.TypeCoercion$WidenSetOperationTypes  38712372  7912192  1982  9
> org.apache.spark.sql.catalyst.analysis.Analyzer$HandleNullInputsForUDF  38281659  0  742  0
> org.apache.spark.sql.catalyst.optimizer.DecimalAggregates  38277381  17245272  385  100
> org.apache.spark.sql.execution.datasources.ResolveSQLOnFile  37342019  0  1982  0
> org.apache.spark.sql.catalyst.analysis.Analyzer$GlobalAggregates  36958378  1207331  1982  46
> org.apache.spark.sql.catalyst.optimizer.CombineLimits  36794793  0  1590  0
> org.apache.spark.sql.catalyst.optimizer.LimitPushDown  36378469  0  1590  0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveNewInstance  34611065  0  1982  0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveUpCast  33734785  0  1982  0
> org.apache.spark.sql.catalyst.optimizer.EliminateSorts  33731370  0  1590  0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveOrdinalInOrderByAndGroupBy  33251765  1395920  1982  4
> org.apache.spark.sql.catalyst.optimizer.EliminateSerialization  30890996  0  1590  0
> org.apache.spark.sql.catalyst.optimizer.CollapseWindow  29512740  0  1590  0
> org.apache.spark.sql.catalyst.optimizer.ReplaceIntersectWithSemiJoin  29396498  1492235  300  7
> org.apache.spark.sql.catalyst.optimizer.RewritePredicateSubquery  29301037  21706110  285  148
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAggAliasInGroupBy  23819074  0  1982  0
> org.apache.spark.sql.catalyst.analysis.SubstituteUnresolvedOrdinals  23136089  10062248  788  4
> org.apache.spark.sql.execution.datasources.PreprocessTableInsertion  20886216  0  742  0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolvePivot  20639329  0  1982  0
> org.apache.spark.sql.catalyst.analysis.ResolveTableValuedFunctions  20293829  0  1990  0
> org.apache.spark.sql.catalyst.analysis.ResolveInlineTables  20255898  0  1982  0
> org.apache.spark.sql.catalyst.analysis.ResolveHints$ResolveBroadcastHints  20250460  0  750  0
> org.apache.spark.sql.catalyst.expressions.codegen.package$ExpressionCanonicalizer$CleanExpressions  19990727  39271  8280  26
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveGenerate  19578333  0  1982  0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveSubqueryColumnAliases  19414993  0  1982  0
> org.apache.spark.sql.catalyst.optimizer.CheckCartesianProducts  19291402  0  285  0
> org.apache.spark.sql.catalyst.optimizer.ReplaceExpressions  18790135  0  285  0
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveNaturalAndUsingJoin  18535762  0  1982  0
> org.apache.spark.sql.catalyst.optimizer.EliminateMapObjects  17835919  0  285  0
> org.apache.spark.sql.catalyst.optimizer.PropagateEmptyRelation  15200130  1525030  288  3
> org.apache.spark.sql.catalyst.optimizer.GetCurrentDatabase  14490778  0  285  0
> org.apache.spark.sql.catalyst.analysis.EliminateSubqueryAliases  14021504  12790020  285  215
> org.apache.spark.sql.catalyst.optimizer.RewriteDistinctAggregates  13439887  0  285  0
> org.apache.spark.sql.catalyst.analysis.EliminateBarriers  12336513  0  1086  0
> org.apache.spark.sql.execution.OptimizeMetadataOnlyQuery  12082986  0  285  0
> org.apache.spark.sql.catalyst.analysis.UpdateOuterReferences  10792280  0  742  0
> org.apache.spark.sql.execution.python.ExtractPythonUDFFromAggregate  8978897  0  285  0
> org.apache.spark.sql.catalyst.analysis.EliminateUnions  8886439  0  788  0
> org.apache.spark.sql.catalyst.analysis.AliasViewChild  8317231  0  742  0
> org.apache.spark.sql.catalyst.optimizer.RemoveRepetitionFromGroupExpressions  7964788  184237  286  1
> org.apache.spark.sql.catalyst.analysis.Analyzer$WindowsSubstitution  7396593  0  788  0
> org.apache.spark.sql.catalyst.analysis.ResolveHints$RemoveAllHints  6986385  0  750  0
> org.apache.spark.sql.catalyst.analysis.EliminateView  6518436  0  285  0
> org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation  6452598  0  288  0
> org.apache.spark.sql.catalyst.optimizer.RemoveLiteralFromGroupExpressions  5510866  0  286  0
> org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithFilter  5393429  0  300  0
> org.apache.spark.sql.catalyst.optimizer.SimplifyCreateArrayOps  5296187  0  1590  0
> org.apache.spark.sql.catalyst.optimizer.SimplifyCreateStructOps  5261249  0  1590  0
> org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithAntiJoin  5152594  925260  300  1
> org.apache.spark.sql.catalyst.optimizer.CombineConcats  4916416  0  1590  0
> org.apache.spark.sql.catalyst.optimizer.CombineTypedFilters  4810314  0  285  0
> org.apache.spark.sql.catalyst.optimizer.SimplifyCreateMapOps  4674195  0  1590  0
> org.apache.spark.sql.catalyst.optimizer.ReplaceDistinctWithAggregate  4406136  727433  300  15
> org.apache.spark.sql.catalyst.optimizer.ReplaceDeduplicateWithAggregate  4252456  0  285  0
> org.apache.spark.sql.catalyst.optimizer.EliminateDistinct  1920392  0  285  0
> org.apache.spark.sql.catalyst.optimizer.CostBasedJoinReorder  1855658  0  285  0
>
> {noformat}
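
The issue asks for attention to runs that do not change the plan. One way to find the rules that matter most is to rank them by Total Time minus Effective Time. The following is a minimal Python sketch of that ranking, not part of the issue itself. It assumes the two time columns are in nanoseconds (which is consistent with the 20.7-second total) and that each rule occupies one whitespace-separated line. The names RuleStat, parse_rule_stats, rank_by_wasted_time and SAMPLE are made up for illustration; the four sample rows are copied from the metrics above.

```python
from typing import List, NamedTuple


class RuleStat(NamedTuple):
    rule: str
    total_ns: int        # "Total Time" column (assumed to be nanoseconds)
    effective_ns: int    # "Effective Time": time in runs that changed the plan
    total_runs: int
    effective_runs: int

    @property
    def wasted_ns(self) -> int:
        """Time spent in runs that left the plan unchanged."""
        return self.total_ns - self.effective_ns


def parse_rule_stats(text: str) -> List[RuleStat]:
    """Parse metric rows, skipping headers and anything that is not a rule row."""
    stats = []
    for line in text.splitlines():
        fields = line.lstrip("> ").split()  # drop the mail quoting prefix
        if len(fields) != 5 or not all(f.isdigit() for f in fields[1:]):
            continue
        stats.append(RuleStat(fields[0], *(int(f) for f in fields[1:])))
    return stats


def rank_by_wasted_time(stats: List[RuleStat]) -> List[RuleStat]:
    """Order rules so the ones wasting the most time on no-op runs come first."""
    return sorted(stats, key=lambda s: s.wasted_ns, reverse=True)


# Four rows copied from the metrics above.
SAMPLE = """\
> Rule  Total Time  Effective Time  Total Runs  Effective Runs
> org.apache.spark.sql.catalyst.optimizer.ColumnPruning  2340563794  1338268224  1875  761
> org.apache.spark.sql.catalyst.optimizer.PruneFilters  1177711364  21344174  1590  3
> org.apache.spark.sql.catalyst.optimizer.BooleanSimplification  405544962  17250184  1590  7
> org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractGenerator  176935299  0  1982  0
"""

if __name__ == "__main__":
    for stat in rank_by_wasted_time(parse_rule_stats(SAMPLE)):
        noop_runs = stat.total_runs - stat.effective_runs
        print(f"{stat.wasted_ns / 1e6:10.1f} ms in {noop_runs:5d} no-op runs  {stat.rule}")
```

On these four rows, PruneFilters ranks ahead of ColumnPruning even though ColumnPruning has the larger total, because almost all of the PruneFilters time comes from runs that changed nothing.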