[ https://issues.apache.org/jira/browse/IMPALA-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sahil Takiar resolved IMPALA-8888.
----------------------------------
    Fix Version/s: Not Applicable
       Resolution: Fixed

> Profile fetch performance when result spooling is enabled
> ---------------------------------------------------------
>
>                 Key: IMPALA-8888
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8888
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>            Priority: Major
>             Fix For: Not Applicable
>
>
> Profile the performance of fetching rows when result spooling is enabled.
> There are a few queries that can be used to benchmark the performance:
> {{time ./bin/impala-shell.sh -B -q "select l_orderkey from tpch_parquet.lineitem" > /dev/null}}
> {{time ./bin/impala-shell.sh -B -q "select * from tpch_parquet.orders" > /dev/null}}
> The first fetches one column and 6,001,215 rows; the second fetches 9 columns
> and 1,500,000 rows, giving a mix of row-heavy and column-heavy fetches.
> The baseline for the benchmark should be the commit prior to IMPALA-8780.
> The benchmark should measure both latency and CPU usage (to see whether the
> copy into {{BufferedTupleStream}} adds significant overhead).
> Various fetch sizes should be used as well, to see whether increasing the
> fetch size for result spooling improves performance (ideally it should).
> It would also be nice to run some fetches between machines, as that will
> better reflect network round-trip latencies.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
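
The benchmark described in the issue (wall-clock latency and CPU usage for each query across several fetch sizes) could be scripted roughly as in the sketch below. This is an illustration, not the harness actually used for IMPALA-8888. The helper names (`measure`, `shell_cmd`, `run_benchmark`) and the fetch-size values are hypothetical. The `--fetch_size` flag is an assumption about the impala-shell build under test, so verify it with `impala-shell --help` first. Note that `resource.getrusage` here only captures CPU time of the client process (impala-shell). The cost of the copy into {{BufferedTupleStream}} is paid on the coordinator and would have to be measured there separately, for example from the query profile.

```python
import resource
import subprocess
import time

# The two queries quoted in the issue description.
QUERIES = [
    "select l_orderkey from tpch_parquet.lineitem",
    "select * from tpch_parquet.orders",
]

# Hypothetical fetch sizes (rows per fetch); the issue does not name values.
FETCH_SIZES = [1024, 10240, 102400]


def measure(cmd):
    """Run cmd once and return (wall_seconds, client_cpu_seconds)."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    # Discard stdout, mirroring the "> /dev/null" in the issue's commands.
    subprocess.run(cmd, stdout=subprocess.DEVNULL, check=True)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # User + system CPU consumed by child processes that were waited for.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return wall, cpu


def shell_cmd(query, fetch_size=None):
    """Build the impala-shell command line for one query."""
    cmd = ["./bin/impala-shell.sh", "-B", "-q", query]
    if fetch_size is not None:
        # Assumption: this impala-shell version accepts --fetch_size.
        cmd.append("--fetch_size=%d" % fetch_size)
    return cmd


def run_benchmark(queries, fetch_sizes, runs=3, make_cmd=shell_cmd):
    """Return {(query, fetch_size): (wall, cpu)} for the fastest of `runs` runs."""
    results = {}
    for query in queries:
        for fetch_size in fetch_sizes:
            samples = [measure(make_cmd(query, fetch_size)) for _ in range(runs)]
            # Tuples compare by wall time first, so min() picks the fastest run.
            results[(query, fetch_size)] = min(samples)
    return results
```

Calling `run_benchmark(QUERIES, FETCH_SIZES)` from the Impala source root, once on the commit prior to IMPALA-8780 and once with result spooling enabled, would give two result sets to compare. Running the client on a different machine from the coordinator would also bring network round-trip latency into the measurement, as the issue suggests.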