Holman Lan created HIVE-13770:
---------------------------------

             Summary: Improve Thrift result set streaming when serializing 
thrift ResultSets in tasks
                 Key: HIVE-13770
                 URL: https://issues.apache.org/jira/browse/HIVE-13770
             Project: Hive
          Issue Type: Improvement
            Reporter: Holman Lan


When the Thrift result set is serialized in the final task, i.e. when the 
hive.server2.thrift.resultset.serialize.in.tasks property is set to true, HS2 
does not start sending results until the entire result set has been written 
to HDFS.

This is inefficient. We should find a way for HS2 to start sending results as 
soon as a block of results becomes available. The advantage is twofold. One, 
the client can start consuming the results much sooner. Two, we can start 
reclaiming the HDFS storage space used by a result set block as soon as that 
block has been successfully sent to the client.
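
To illustrate the intended behaviour, here is a minimal single-threaded sketch 
in Python. It is not Hive code. All names (write_blocks, stream_results, send) 
are hypothetical, and a plain dict stands in for HDFS. It only shows the idea 
of sending each block as soon as it is written and deleting it right after a 
successful send, so that at most one block is held in storage at a time.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Row = Tuple[int, ...]
Block = List[Row]


def write_blocks(rows: List[Row], block_size: int,
                 storage: Dict[int, Block]) -> Iterator[int]:
    """Hypothetical final task: writes one block at a time into `storage`
    (a dict standing in for HDFS) and yields the block id once it is written."""
    block_id = 0
    for start in range(0, len(rows), block_size):
        storage[block_id] = rows[start:start + block_size]
        yield block_id
        block_id += 1


def stream_results(rows: List[Row], block_size: int,
                   send: Callable[[Block], None]) -> Tuple[int, int]:
    """Hypothetical server loop: sends each block as soon as it is available,
    then reclaims its storage. Returns (peak stored blocks, leftover blocks)."""
    storage: Dict[int, Block] = {}
    peak = 0
    # The generator is lazy, so the next block is only written after the
    # previous one has been sent and deleted.
    for block_id in write_blocks(rows, block_size, storage):
        peak = max(peak, len(storage))
        send(storage[block_id])   # client can start consuming immediately
        del storage[block_id]     # reclaim space once the send succeeded
    return peak, len(storage)
```

In a real deployment the tasks and HS2 run concurrently, so the peak would 
depend on how fast blocks are produced relative to how fast they are sent. The 
sketch assumes the send always succeeds and does not cover retries.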

It's worth checking whether this is also the case when the Thrift result set 
is not serialized in the final task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
