[ 
https://issues.apache.org/jira/browse/IMPALA-12800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17852912#comment-17852912
 ] 

Michael Smith commented on IMPALA-12800:
----------------------------------------

I've posted two patches at https://gerrit.cloudera.org/c/21484/2 that improve 
query compilation time for the repro from
{code}
# 1st run
    Query Compilation: 1m15s
       - Metadata load started: 75.088ms (75.088ms)
       - Metadata load finished. loaded-tables=1/1 load-requests=1 catalog-updates=3 storage-load-time=46ms: 3s137ms (3s062ms)
       - Analysis finished: 7s504ms (4s367ms)
       - Authorization finished (noop): 7s505ms (946.982us)
       - Value transfer graph computed: 7s553ms (47.618ms)
       - Single node plan created: 1m14s (1m7s)
       - Runtime filters computed: 1m15s (874.659ms)
       - Distributed plan created: 1m15s (1.168ms)
       - Planning finished: 1m15s (284.717ms)
# 2nd run
    Query Compilation: 1m6s
       - Metadata of all 1 tables cached: 18.799ms (18.799ms)
       - Analysis finished: 3s299ms (3s280ms)
       - Authorization finished (noop): 3s299ms (118.618us)
       - Value transfer graph computed: 3s319ms (19.983ms)
       - Single node plan created: 1m5s (1m2s)
       - Runtime filters computed: 1m6s (808.587ms)
       - Distributed plan created: 1m6s (188.167us)
       - Planning finished: 1m6s (189.985ms)
{code}
to
{code}
# 1st run
    Query Compilation: 8s649ms
       - Metadata load started: 62.291ms (62.291ms)
       - Metadata load finished. loaded-tables=1/1 load-requests=1 catalog-updates=3 storage-load-time=46ms: 3s019ms (2s957ms)
       - Analysis finished: 7s021ms (4s002ms)
       - Authorization finished (noop): 7s021ms (569.098us)
       - Value transfer graph computed: 7s070ms (48.329ms)
       - Single node plan created: 8s194ms (1s124ms)
       - Runtime filters computed: 8s261ms (67.366ms)
       - Distributed plan created: 8s365ms (103.186ms)
       - Planning finished: 8s649ms (284.506ms)
# 2nd run
    Query Compilation: 4s621ms
       - Metadata of all 1 tables cached: 17.932ms (17.932ms)
       - Analysis finished: 3s391ms (3s373ms)
       - Authorization finished (noop): 3s391ms (133.671us)
       - Value transfer graph computed: 3s412ms (20.547ms)
       - Single node plan created: 4s347ms (935.582ms)
       - Runtime filters computed: 4s399ms (51.706ms)
       - Distributed plan created: 4s434ms (35.380ms)
       - Planning finished: 4s621ms (187.070ms)
{code}
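
For readers who have not opened the review: the stacks quoted below show compose() calling ArrayList.contains(), which walks the list and calls Expr.equals() on each entry, so every lookup is linear in the size of the map. The following is only a minimal sketch of one way to avoid that kind of scan, not the code in the patches. The class IndexedSubstitutionMap and its methods are hypothetical names made up for illustration, and the sketch assumes the element type implements hashCode() consistently with equals().

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical illustration, not Impala code: a substitution map that keeps
 * ordered lhs/rhs lists plus a hash index over the lhs entries.
 */
final class IndexedSubstitutionMap<E> {
  private final List<E> lhs = new ArrayList<>();
  private final List<E> rhs = new ArrayList<>();
  // Maps each lhs entry to its position in the lists. Assumes E implements
  // hashCode() consistently with equals().
  private final Map<E, Integer> index = new HashMap<>();

  /** Adds l -> r unless l is already mapped. Returns true if it was added. */
  boolean put(E l, E r) {
    if (index.containsKey(l)) return false; // hash lookup instead of a list scan
    index.put(l, lhs.size());
    lhs.add(l);
    rhs.add(r);
    return true;
  }

  /** Returns the rhs mapped to l, or null if l is not mapped. */
  E get(E l) {
    Integer i = index.get(l);
    return i == null ? null : rhs.get(i);
  }

  int size() {
    return lhs.size();
  }

  /**
   * Returns a new map holding every entry of f, followed by the entries of g
   * whose lhs is not already present. Each membership check is a hash lookup.
   */
  static <E> IndexedSubstitutionMap<E> mergeMissing(
      IndexedSubstitutionMap<E> f, IndexedSubstitutionMap<E> g) {
    IndexedSubstitutionMap<E> result = new IndexedSubstitutionMap<>();
    for (int i = 0; i < f.lhs.size(); i++) result.put(f.lhs.get(i), f.rhs.get(i));
    for (int i = 0; i < g.lhs.size(); i++) result.put(g.lhs.get(i), g.rhs.get(i));
    return result;
  }

  public static void main(String[] args) {
    IndexedSubstitutionMap<String> f = new IndexedSubstitutionMap<>();
    f.put("a", "x");
    f.put("b", "y");
    IndexedSubstitutionMap<String> g = new IndexedSubstitutionMap<>();
    g.put("b", "z"); // skipped during the merge: "b" is already mapped by f
    g.put("c", "w");
    IndexedSubstitutionMap<String> merged = mergeMissing(f, g);
    System.out.println(merged.size() + " " + merged.get("b") + " " + merged.get("c"));
  }
}
```

With n entries, the list-based membership check costs O(n) equals() calls per lookup, so a merge over n entries is O(n^2); the hash index makes each check expected O(1). Whether this applies directly to Expr depends on its hashCode() implementation, which this sketch does not examine.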

> Queries with many nested inline views see performance issues with 
> ExprSubstitutionMap
> -------------------------------------------------------------------------------------
>
>                 Key: IMPALA-12800
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12800
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 4.3.0
>            Reporter: Joe McDonnell
>            Assignee: Michael Smith
>            Priority: Critical
>         Attachments: impala12800repro.sql, impala12800schema.sql, 
> long_query_jstacks.tar.gz
>
>
> A user running a query with many layers of inline views saw a large amount of 
> time spent in analysis. 
>  
> {noformat}
> - Authorization finished (ranger): 7s518ms (13.134ms)
> - Value transfer graph computed: 7s760ms (241.953ms)
> - Single node plan created: 2m47s (2m39s)
> - Distributed plan created: 2m47s (7.430ms)
> - Lineage info computed: 2m47s (39.017ms)
> - Planning finished: 2m47s (672.518ms){noformat}
> In reproducing it locally, we found that most of the stacks end up in 
> ExprSubstitutionMap.
>  
> Here are the main stacks seen while running jstack every 3 seconds during a 
> 75 second execution:
> Location 1: (ExprSubstitutionMap::compose -> contains -> indexOf -> Expr 
> equals) (4 samples)
> {noformat}
>    java.lang.Thread.State: RUNNABLE
>     at org.apache.impala.analysis.Expr.equals(Expr.java:1008)
>     at java.util.ArrayList.indexOf(ArrayList.java:323)
>     at java.util.ArrayList.contains(ArrayList.java:306)
>     at org.apache.impala.analysis.ExprSubstitutionMap.compose(ExprSubstitutionMap.java:120){noformat}
> Location 2:  (ExprSubstitutionMap::compose -> verify -> Expr equals) (9 
> samples)
> {noformat}
>    java.lang.Thread.State: RUNNABLE
>     at org.apache.impala.analysis.Expr.equals(Expr.java:1008)
>     at org.apache.impala.analysis.ExprSubstitutionMap.verify(ExprSubstitutionMap.java:173)
>     at org.apache.impala.analysis.ExprSubstitutionMap.compose(ExprSubstitutionMap.java:126){noformat}
> Location 3: (ExprSubstitutionMap::combine -> verify -> Expr equals) (5 
> samples)
> {noformat}
>    java.lang.Thread.State: RUNNABLE
>     at org.apache.impala.analysis.Expr.equals(Expr.java:1008)
>     at org.apache.impala.analysis.ExprSubstitutionMap.verify(ExprSubstitutionMap.java:173)
>     at org.apache.impala.analysis.ExprSubstitutionMap.combine(ExprSubstitutionMap.java:143){noformat}
> Location 4:  (TupleIsNullPredicate.wrapExprs ->  Analyzer.isTrueWithNullSlots 
> -> FeSupport.EvalPredicate -> Thrift serialization) (4 samples)
> {noformat}
>    java.lang.Thread.State: RUNNABLE
>     at java.lang.StringCoding.encode(StringCoding.java:364)
>     at java.lang.String.getBytes(String.java:941)
>     at org.apache.thrift.protocol.TBinaryProtocol.writeString(TBinaryProtocol.java:227)
>     at org.apache.impala.thrift.TClientRequest$TClientRequestStandardScheme.write(TClientRequest.java:532)
>     at org.apache.impala.thrift.TClientRequest$TClientRequestStandardScheme.write(TClientRequest.java:467)
>     at org.apache.impala.thrift.TClientRequest.write(TClientRequest.java:394)
>     at org.apache.impala.thrift.TQueryCtx$TQueryCtxStandardScheme.write(TQueryCtx.java:3034)
>     at org.apache.impala.thrift.TQueryCtx$TQueryCtxStandardScheme.write(TQueryCtx.java:2709)
>     at org.apache.impala.thrift.TQueryCtx.write(TQueryCtx.java:2400)
>     at org.apache.thrift.TSerializer.serialize(TSerializer.java:84)
>     at org.apache.impala.service.FeSupport.EvalExprWithoutRowBounded(FeSupport.java:206)
>     at org.apache.impala.service.FeSupport.EvalExprWithoutRow(FeSupport.java:194)
>     at org.apache.impala.service.FeSupport.EvalPredicate(FeSupport.java:275)
>     at org.apache.impala.analysis.Analyzer.isTrueWithNullSlots(Analyzer.java:2888)
>     at org.apache.impala.analysis.TupleIsNullPredicate.requiresNullWrapping(TupleIsNullPredicate.java:181)
>     at org.apache.impala.analysis.TupleIsNullPredicate.wrapExpr(TupleIsNullPredicate.java:147)
>     at org.apache.impala.analysis.TupleIsNullPredicate.wrapExprs(TupleIsNullPredicate.java:136){noformat}


