Tim Armstrong created IMPALA-9381:
-------------------------------------

             Summary: Lazily convert and/or cache different representations of the query profile
                 Key: IMPALA-9381
                 URL: https://issues.apache.org/jira/browse/IMPALA-9381
             Project: IMPALA
          Issue Type: Sub-task
          Components: Backend
            Reporter: Tim Armstrong


There are some obvious inefficiencies with how the query state record works:

* We make an unnecessary copy of the archive string when adding it to the
query log:
https://github.com/apache/impala/blob/79aae231443a305ce8503dbc7b4335e8ae3f3946/be/src/service/impala-server.cc#L1812
* We eagerly convert the profile to text and JSON, when in many cases they
won't be needed:
https://github.com/apache/impala/blob/79aae231443a305ce8503dbc7b4335e8ae3f3946/be/src/service/impala-server.cc#L1839
I think it is generally rare for more than one profile format to be
downloaded from the web UI. I know of tools that scrape the thrift profile, but
the human-readable version would usually only be consumed by humans. We could
avoid this by only storing the thrift representation of the profile, then
reconstituting the other representations from thrift if requested.
* After ComputeExecSummary(), the profile shouldn't change, but we still
regenerate the thrift representation on every web request for the encoded
profile. This may waste a lot of CPU for tools scraping the profiles.
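
A rough sketch of the shape this could take, not a patch against the real
code: ProfileRecord, its Converter callbacks and GetOrCompute() are all
hypothetical names, and the converters stand in for whatever would rebuild
the text/JSON forms from thrift. The record takes ownership of the serialized
thrift string by move (no extra copy), keeps it as the only eagerly stored
form, and builds each of the other representations at most once, on first
request.

```cpp
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a completed query's record. This is NOT Impala's
// actual QueryStateRecord; it only illustrates lazy conversion plus caching.
class ProfileRecord {
 public:
  // Assumed converter shape: rebuilds another format from the thrift bytes.
  using Converter = std::function<std::string(const std::string&)>;

  // The thrift profile is taken by value and moved into place, so a caller
  // that passes an rvalue (or std::move()s its string) pays for no copy.
  ProfileRecord(std::string thrift_profile, Converter to_text, Converter to_json)
    : thrift_profile_(std::move(thrift_profile)),
      to_text_(std::move(to_text)),
      to_json_(std::move(to_json)) {}

  // Serialized once at construction; returned as-is on every request.
  const std::string& thrift() const { return thrift_profile_; }

  // Derived formats are produced on first use and cached afterwards.
  const std::string& text() const { return GetOrCompute(text_, to_text_); }
  const std::string& json() const { return GetOrCompute(json_, to_json_); }

 private:
  const std::string& GetOrCompute(
      std::unique_ptr<std::string>& slot, const Converter& convert) const {
    // One lock guards both slots; simple, but a slow conversion of one format
    // blocks requests for the other while it runs.
    std::lock_guard<std::mutex> l(lock_);
    if (slot == nullptr) slot.reset(new std::string(convert(thrift_profile_)));
    return *slot;
  }

  const std::string thrift_profile_;
  const Converter to_text_;
  const Converter to_json_;
  mutable std::mutex lock_;
  mutable std::unique_ptr<std::string> text_;  // null until first text() call
  mutable std::unique_ptr<std::string> json_;  // null until first json() call
};
```

With this shape, a scraper that only ever fetches the thrift form never
triggers a text or JSON conversion, and repeated requests for the same format
reuse the cached string. The trade-off is that the first request for a
derived format pays the conversion cost on the request path.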



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
