This document describes how transactions work and what the data layout is: https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions. See the "Basic design" section. HDFS files are immutable, so Hive writes a new delta directory for every transaction and merges the deltas with the base data at read time; an update is never written back to the same block.
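To make the merge-on-read idea concrete, here is a minimal Python sketch (not Hive code): the base data and the per-transaction deltas are modeled as plain dicts keyed by row id, and a reader folds the deltas over the base in transaction order. The directory names and row values are made up for illustration.

```python
# Conceptual sketch of Hive's merge-on-read (not actual Hive code).
# HDFS files are immutable, so each transaction writes a new
# delta_<txnid>_<txnid> directory; readers merge base + deltas.

base = {1: "alice", 2: "bob", 3: "carol"}           # original bucket contents

# Each UPDATE lands in its own delta directory, keyed by transaction id.
deltas = {
    "delta_0000012_0000012": {2: "bobby"},          # txn 12 updates row 2
    "delta_0000014_0000014": {3: "carole"},         # txn 14 updates row 3
}

def read_merged(base, deltas):
    """Apply deltas over the base in transaction order, as a reader would."""
    merged = dict(base)
    for name in sorted(deltas):                     # txn order from dir name
        merged.update(deltas[name])
    return merged

print(read_merged(base, deltas))
# {1: 'alice', 2: 'bobby', 3: 'carole'}
```

Note that `base` is never modified, which mirrors why the original blocks cannot be reused: a compaction later rewrites base plus deltas into a fresh base.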
Wantao

At 2014-12-02 16:58:26, "unmesha sreeveni" <unmeshab...@gmail.com> wrote:

Why is Hive "UPDATE" not reusing the blocks? The update is not written to the same block; why is that?

On Tue, Dec 2, 2014 at 10:50 AM, unmesha sreeveni <unmeshab...@gmail.com> wrote:

I tried updating a record in a previous Hive version and also in Hive 0.14.0, the newer version that supports UPDATE. I created a table with 3 buckets holding 180 MB. In my warehouse the data is stored in 3 different blocks:

delta_0000012_0000012
--- Block ID: 1073751752
--- Block ID: 1073751750
--- Block ID: 1073751753

After doing an update I get 2 directories:

delta_0000012_0000012
--- Block ID: 1073751752
--- Block ID: 1073751750
--- Block ID: 1073751753

AND

delta_0000014_0000014
--- Block ID: 1073752044

i.e., the blocks are not reused. Is my understanding correct? Any pointers?

--
Thanks & Regards
Unmesha Sreeveni U.B
Hadoop, Bigdata Developer
Centre for Cyber Security | Amrita Vishwa Vidyapeetham
http://www.unmeshasreeveni.blogspot.in/