This document describes how Hive transactions work and what the on-disk layout is:
https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions. See the
"Basic design" section. HDFS files are immutable, so Hive writes a new delta
directory for every transaction and merges the deltas with the base data at read
time; an update is therefore never written to the same block.
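A minimal sketch of the merge-on-read idea described above (this is illustrative Python, not Hive's actual implementation; the row ids, values, and transaction ids are made up): each transaction's changes land in their own delta, and a reader applies deltas in transaction-id order so the latest version of each row wins.

```python
# Hypothetical model of Hive-style merge-on-read (not Hive code):
# base data is never rewritten; each transaction adds a delta, and
# readers merge base + deltas, with higher transaction ids winning.

# base data: row_id -> value, as written by the original load
base = {1: "alice", 2: "bob", 3: "carol"}

# deltas keyed by transaction id, e.g. delta_0000012_0000012
deltas = {
    12: {2: "bob-v2"},
    14: {2: "bob-v3", 3: "carol-v2"},
}

def merge_on_read(base, deltas):
    """Apply deltas in transaction-id order; later txns override earlier."""
    merged = dict(base)
    for txn_id in sorted(deltas):
        merged.update(deltas[txn_id])
    return merged

print(merge_on_read(base, deltas))
# {1: 'alice', 2: 'bob-v3', 3: 'carol-v2'}
```

This is also why each update produces a fresh delta directory on new HDFS blocks rather than touching the old ones; compaction later folds deltas back into a new base.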

Wantao






At 2014-12-02 16:58:26, "unmesha sreeveni" <unmeshab...@gmail.com> wrote:

Why is Hive "UPDATE" not reusing the blocks?
The update is not written to the same block; why is that?




On Tue, Dec 2, 2014 at 10:50 AM, unmesha sreeveni <unmeshab...@gmail.com> wrote:

I tried to update my record in a previous version of Hive, and also tried out
UPDATE in Hive 0.14.0, the newer version which supports it.


I created a table with 3 buckets, about 180 MB in total. In my warehouse the data
is stored in 3 different blocks:
 
delta_0000012_0000012 
--- Block ID: 1073751752
--- Block ID: 1073751750
--- Block ID: 1073751753


After doing an update, I am getting 2 directories:

delta_0000012_0000012 
--- Block ID: 1073751752
--- Block ID: 1073751750
--- Block ID: 1073751753
AND
delta_0000014_0000014
--- Block ID: 1073752044


I.e., the blocks are not reused.
Is my understanding correct?
Any pointers?
                             


--

Thanks & Regards


Unmesha Sreeveni U.B

Hadoop, Bigdata Developer
Centre for Cyber Security | Amrita Vishwa Vidyapeetham

http://www.unmeshasreeveni.blogspot.in/













