[ https://issues.apache.org/jira/browse/IMPALA-12800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17852918#comment-17852918 ]
Michael Smith commented on IMPALA-12800:
----------------------------------------

Difference from null slots cache:
{code}
# With caching
Query Compilation: 4s678ms
   - Metadata of all 1 tables cached: 26.276ms (26.276ms)
   - Analysis finished: 3s466ms (3s440ms)
   - Authorization finished (noop): 3s467ms (130.395us)
   - Value transfer graph computed: 3s486ms (19.860ms)
   - Single node plan created: 4s402ms (915.149ms)
   - Runtime filters computed: 4s453ms (51.628ms)
   - Distributed plan created: 4s486ms (33.064ms)
   - Planning finished: 4s678ms (191.281ms)

# Without caching via 'set use_null_slots_cache=false'
Query Compilation: 14s845ms
   - Metadata of all 1 tables cached: 7.608ms (7.608ms)
   - Analysis finished: 3s207ms (3s199ms)
   - Authorization finished (noop): 3s207ms (120.606us)
   - Value transfer graph computed: 3s221ms (14.231ms)
   - Single node plan created: 14s610ms (11s389ms)
   - Runtime filters computed: 14s661ms (51.286ms)
   - Distributed plan created: 14s662ms (246.301us)
   - Planning finished: 14s845ms (183.164ms)
{code}
So the cache speeds up single node planning (11s389ms down to 915.149ms) but adds some overhead to distributed planning (246.301us up to 33.064ms). I'll look into disabling it for distributed planning.

> Queries with many nested inline views see performance issues with ExprSubstitutionMap
> -------------------------------------------------------------------------------------
>
>                 Key: IMPALA-12800
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12800
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 4.3.0
>            Reporter: Joe McDonnell
>            Assignee: Michael Smith
>            Priority: Critical
>         Attachments: impala12800repro.sql, impala12800schema.sql, long_query_jstacks.tar.gz
>
>
> A user running a query with many layers of inline views saw a large amount of
> time spent in analysis.
>
> {noformat}
>    - Authorization finished (ranger): 7s518ms (13.134ms)
>    - Value transfer graph computed: 7s760ms (241.953ms)
>    - Single node plan created: 2m47s (2m39s)
>    - Distributed plan created: 2m47s (7.430ms)
>    - Lineage info computed: 2m47s (39.017ms)
>    - Planning finished: 2m47s (672.518ms){noformat}
> In reproducing it locally, we found that most of the stacks end up in
> ExprSubstitutionMap.
>
> Here are the main stacks seen while running jstack every 3 seconds during a
> 75 second execution:
> Location 1: (ExprSubstitutionMap::compose -> contains -> indexOf -> Expr equals) (4 samples)
> {noformat}
>    java.lang.Thread.State: RUNNABLE
>         at org.apache.impala.analysis.Expr.equals(Expr.java:1008)
>         at java.util.ArrayList.indexOf(ArrayList.java:323)
>         at java.util.ArrayList.contains(ArrayList.java:306)
>         at org.apache.impala.analysis.ExprSubstitutionMap.compose(ExprSubstitutionMap.java:120){noformat}
> Location 2: (ExprSubstitutionMap::compose -> verify -> Expr equals) (9 samples)
> {noformat}
>    java.lang.Thread.State: RUNNABLE
>         at org.apache.impala.analysis.Expr.equals(Expr.java:1008)
>         at org.apache.impala.analysis.ExprSubstitutionMap.verify(ExprSubstitutionMap.java:173)
>         at org.apache.impala.analysis.ExprSubstitutionMap.compose(ExprSubstitutionMap.java:126){noformat}
> Location 3: (ExprSubstitutionMap::combine -> verify -> Expr equals) (5 samples)
> {noformat}
>    java.lang.Thread.State: RUNNABLE
>         at org.apache.impala.analysis.Expr.equals(Expr.java:1008)
>         at org.apache.impala.analysis.ExprSubstitutionMap.verify(ExprSubstitutionMap.java:173)
>         at org.apache.impala.analysis.ExprSubstitutionMap.combine(ExprSubstitutionMap.java:143){noformat}
> Location 4: (TupleIsNullPredicate.wrapExprs -> Analyzer.isTrueWithNullSlots -> FeSupport.EvalPredicate -> Thrift serialization) (4 samples)
> {noformat}
>    java.lang.Thread.State: RUNNABLE
>         at java.lang.StringCoding.encode(StringCoding.java:364)
>         at java.lang.String.getBytes(String.java:941)
>         at org.apache.thrift.protocol.TBinaryProtocol.writeString(TBinaryProtocol.java:227)
>         at org.apache.impala.thrift.TClientRequest$TClientRequestStandardScheme.write(TClientRequest.java:532)
>         at org.apache.impala.thrift.TClientRequest$TClientRequestStandardScheme.write(TClientRequest.java:467)
>         at org.apache.impala.thrift.TClientRequest.write(TClientRequest.java:394)
>         at org.apache.impala.thrift.TQueryCtx$TQueryCtxStandardScheme.write(TQueryCtx.java:3034)
>         at org.apache.impala.thrift.TQueryCtx$TQueryCtxStandardScheme.write(TQueryCtx.java:2709)
>         at org.apache.impala.thrift.TQueryCtx.write(TQueryCtx.java:2400)
>         at org.apache.thrift.TSerializer.serialize(TSerializer.java:84)
>         at org.apache.impala.service.FeSupport.EvalExprWithoutRowBounded(FeSupport.java:206)
>         at org.apache.impala.service.FeSupport.EvalExprWithoutRow(FeSupport.java:194)
>         at org.apache.impala.service.FeSupport.EvalPredicate(FeSupport.java:275)
>         at org.apache.impala.analysis.Analyzer.isTrueWithNullSlots(Analyzer.java:2888)
>         at org.apache.impala.analysis.TupleIsNullPredicate.requiresNullWrapping(TupleIsNullPredicate.java:181)
>         at org.apache.impala.analysis.TupleIsNullPredicate.wrapExpr(TupleIsNullPredicate.java:147)
>         at org.apache.impala.analysis.TupleIsNullPredicate.wrapExprs(TupleIsNullPredicate.java:136){noformat}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
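
For illustration only: Location 4 in the quoted description shows each Analyzer.isTrueWithNullSlots call paying for a Thrift round trip through FeSupport.EvalPredicate, which is the kind of cost the null slots cache discussed in the comment is meant to avoid. The Java sketch below shows the general memoization idea under stated assumptions and is not Impala's implementation. The class name NullSlotsCacheSketch, the string cache key, and the Predicate<String> stand-in for the backend evaluation are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

/**
 * Hypothetical sketch of memoizing an expensive predicate evaluation.
 * Not Impala code: the key type and the evaluator are assumptions.
 */
final class NullSlotsCacheSketch {
  // Maps a predicate's textual form to its previously computed result.
  private final Map<String, Boolean> cache = new HashMap<>();
  // Stand-in for the expensive backend call (assumption for this sketch).
  private final Predicate<String> expensiveEval;
  // Counts how often the expensive evaluator actually ran.
  private int evalCalls = 0;

  NullSlotsCacheSketch(Predicate<String> expensiveEval) {
    this.expensiveEval = expensiveEval;
  }

  /** Returns the cached result if present, otherwise evaluates and stores it. */
  boolean isTrueWithNullSlots(String predicateSql) {
    Boolean cached = cache.get(predicateSql);
    if (cached != null) {
      return cached;
    }
    evalCalls++;
    boolean result = expensiveEval.test(predicateSql);
    cache.put(predicateSql, result);
    return result;
  }

  int evalCalls() {
    return evalCalls;
  }

  public static void main(String[] args) {
    // Toy evaluator: treats any predicate containing "IS NULL" as true.
    NullSlotsCacheSketch sketch =
        new NullSlotsCacheSketch(sql -> sql.contains("IS NULL"));
    sketch.isTrueWithNullSlots("t.a IS NULL");
    sketch.isTrueWithNullSlots("t.a IS NULL"); // served from the cache
    sketch.isTrueWithNullSlots("t.a = 1");
    System.out.println("expensive evaluations: " + sketch.evalCalls());
  }
}
```

A cache like this only pays off when the same predicate is evaluated repeatedly, which is plausible for deeply nested inline views. The timings in the comment suggest the lookup itself can add overhead in phases with few repeated evaluations.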