Rajesh Balamohan created HIVE-23218:
---------------------------------------
Summary: LlapRecordReader queue limit computation is not optimal
Key: HIVE-23218
URL: https://issues.apache.org/jira/browse/HIVE-23218
Project: Hive
Issue Type: Improvement
Components: llap
Reporter: Rajesh Balamohan
After {{OrcEncodedDataConsumer::decodeBatch}} decodes a batch, the data is
enqueued into a queue in LlapRecordReader. The limit for this queue is also
determined in LlapRecordReader. If the limit is too small, the enqueue ends up
waiting for 100ms until the queue has capacity.
https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapRecordReader.java#L168
https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapRecordReader.java#L590
https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapRecordReader.java#L260
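To illustrate the waiting behaviour described above, here is a minimal sketch of a bounded queue whose producer waits up to 100ms per attempt for free capacity. It is not the LlapRecordReader code: the class {{BoundedEnqueueSketch}}, the method {{enqueueWithTimeout}} and its parameters are hypothetical names, and only standard {{java.util.concurrent}} calls are used.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not taken from LlapRecordReader.
final class BoundedEnqueueSketch {

  /**
   * Offers an item to a bounded queue, waiting up to timeoutMs per attempt.
   * Returns the number of timed-out attempts before the item was accepted,
   * or -1 if all attempts timed out or the thread was interrupted.
   */
  static <T> int enqueueWithTimeout(
      BlockingQueue<T> queue, T item, long timeoutMs, int maxAttempts) {
    for (int waits = 0; waits < maxAttempts; waits++) {
      try {
        // offer() blocks for at most timeoutMs when the queue is full.
        if (queue.offer(item, timeoutMs, TimeUnit.MILLISECONDS)) {
          return waits;
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt status
        return -1;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    // A limit of 1 stands in for a queue limit that is too small.
    BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
    // The queue is empty, so the first batch is accepted without waiting.
    System.out.println(enqueueWithTimeout(queue, "batch-1", 100, 3));
    // Nothing drains the queue, so every attempt waits the full 100ms.
    System.out.println(enqueueWithTimeout(queue, "batch-2", 100, 3));
  }
}
```

With a small limit, the producer spends most of its time in the timed wait rather than decoding, which is the delay this issue is about.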
{{determineQueueLimit}} takes all columns into consideration, even though only
a few columns are needed for the projection. Here is an example.
{noformat}
create table test_acid(a1 string, a2 string, a3 string, a4 string, a5 string,
a6 string, a7 string, a8 string, a9 string, a10 string,
a11 string, a22 string, a33 string, a44 string, a55 string,
a66 string, a77 string, a88 string, a99 string, a100 string,
a111 decimal(25,2), a222 decimal(25,2), a333 decimal(25,2), a444 decimal(25,2),
a555 decimal(25,2), a666 decimal(25,2), a777 decimal(25,2),
a888 decimal(25,2), a999 decimal(25,2), a1000 decimal(25,2)) stored as orc;
insert into table test_acid values
("a1","a2","a3","a4","a5","a6","a7","a8","a9","a10",
"a11","a22","a33","a44","a55","a66","a77","a88","a99","a100",
10.23,10.23,10.23,10.23,10.23,10.23,10.23,10.23,10.23,10.23
);
select a44, count(*) from test_acid where a44 like "a4%" group by a44 order by a44;
{noformat}
For this query, the predicted queue size is 138, because the computation takes
into account all 30 fields instead of just 2. This causes unwanted delays in
adding data to the queue.
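The effect can be sketched with a simplified model in which the limit is a base value divided by the summed weight of the columns considered, with a lower bound. This is an assumption for illustration, not the actual {{determineQueueLimit}} implementation: the weights, the base and minimum values, and all names in {{QueueLimitSketch}} are hypothetical, so the numbers it produces are not expected to match the 138 reported above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Simplified, hypothetical model; not Hive's determineQueueLimit.
final class QueueLimitSketch {
  // Assumed per-type weights, chosen only for illustration.
  static final int WEIGHT_STRING = 8;
  static final int WEIGHT_DECIMAL = 4;

  static int weightOf(String typeName) {
    return typeName.startsWith("decimal") ? WEIGHT_DECIMAL : WEIGHT_STRING;
  }

  /** Limit = max(min, base / total weight of the given columns). */
  static int queueLimit(int base, int min, List<String> columnTypes) {
    if (columnTypes.isEmpty()) {
      return base; // no columns: assume no weight
    }
    int totalWeight = 0;
    for (String type : columnTypes) {
      totalWeight += weightOf(type);
    }
    return Math.max(min, base / totalWeight);
  }

  /** Column types of test_acid: 20 strings followed by 10 decimals. */
  static List<String> allColumnsOfTestAcid() {
    List<String> types = new ArrayList<>(Collections.nCopies(20, "string"));
    types.addAll(Collections.nCopies(10, "decimal(25,2)"));
    return types;
  }

  public static void main(String[] args) {
    int base = 10000; // hypothetical base limit
    int min = 10;     // hypothetical minimum limit
    // Limit when every column of the table is counted.
    System.out.println(queueLimit(base, min, allColumnsOfTestAcid()));
    // Limit when only the projected string column a44 is counted.
    System.out.println(queueLimit(base, min, Collections.singletonList("string")));
  }
}
```

Under these assumptions, counting only the projected columns gives a much larger limit than counting all 30, so the producer would hit the capacity wait far less often.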