[ 
https://issues.apache.org/jira/browse/IMPALA-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar resolved IMPALA-8888.
----------------------------------
    Fix Version/s: Not Applicable
       Resolution: Fixed

> Profile fetch performance when result spooling is enabled
> ---------------------------------------------------------
>
>                 Key: IMPALA-8888
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8888
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>            Priority: Major
>             Fix For: Not Applicable
>
>
> Profile the performance of fetching rows when result spooling is enabled. 
> There are a few queries that can be used to benchmark the performance:
> {{time ./bin/impala-shell.sh -B -q "select l_orderkey from 
> tpch_parquet.lineitem" > /dev/null}}
> {{time ./bin/impala-shell.sh -B -q "select * from tpch_parquet.orders" > 
> /dev/null}}
> The first fetches one column and 6,001,215 rows; the second fetches 9 columns 
> and 1,500,000 rows, so together they cover a mix of rows fetched vs. columns 
> fetched.
> The baseline for the benchmark should be the commit prior to IMPALA-8780.
> The benchmark should check for both latency and CPU usage (to see if the copy 
> into {{BufferedTupleStream}} has a significant overhead).
> Various fetch sizes should be used in the benchmark as well, to see whether 
> increasing the fetch size for result spooling improves performance (ideally 
> it should). It would also be nice to run some fetches between machines, as 
> that will better reflect network round-trip latencies.
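
A minimal sketch of how the two benchmark commands above could be timed repeatedly, assuming a Unix host and a working directory where {{./bin/impala-shell.sh}} resolves. The helper names ({{bench}}, {{summarize}}, {{impala_shell_cmd}}) and the {{extra_args}} parameter are hypothetical and not part of Impala. The CPU figure covers only the client process tree (impala-shell), so impalad CPU usage, which is what the {{BufferedTupleStream}} copy would affect, would have to be collected separately. The issue does not say which option sets the fetch size, so the sketch only leaves room to pass extra arguments.

```python
import resource
import statistics
import subprocess
import time

# The two benchmark queries quoted in the issue description.
QUERIES = [
    "select l_orderkey from tpch_parquet.lineitem",
    "select * from tpch_parquet.orders",
]


def impala_shell_cmd(query, extra_args=()):
    """Build the impala-shell command line used in the issue.

    extra_args is a hypothetical hook for additional shell options
    (for example, whichever option controls the fetch size).
    """
    return ["./bin/impala-shell.sh", "-B", *extra_args, "-q", query]


def bench(cmd, runs=3):
    """Run cmd `runs` times; return a list of (wall_s, client_cpu_s)."""
    samples = []
    for _ in range(runs):
        before = resource.getrusage(resource.RUSAGE_CHILDREN)
        start = time.perf_counter()
        # Discard the result rows, as the original commands do with > /dev/null.
        subprocess.run(cmd, stdout=subprocess.DEVNULL, check=True)
        wall = time.perf_counter() - start
        after = resource.getrusage(resource.RUSAGE_CHILDREN)
        # User + system CPU consumed by the child process tree (client side only).
        cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
        samples.append((wall, cpu))
    return samples


def summarize(samples):
    """Reduce per-run samples to medians, which are less noise-sensitive."""
    return {
        "runs": len(samples),
        "wall_median_s": statistics.median(w for w, _ in samples),
        "cpu_median_s": statistics.median(c for _, c in samples),
    }


if __name__ == "__main__":
    for query in QUERIES:
        print(query, summarize(bench(impala_shell_cmd(query))))
```

Running the same script once on the baseline commit and once on a build with result spooling enabled gives directly comparable medians. Reporting medians over several runs, rather than a single {{time}} invocation, reduces the effect of cache warm-up on the first run.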



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
