Re: Controlling Number of small files while inserting into Hive table

2017-06-25 Thread saquib khan
Please remove me from the user list.

On Sun, Jun 25, 2017 at 5:10 PM Db-Blog wrote:
> Hi Arpan,
> Include the partition column in the distribute by clause of DML, it will
> generate only one file per day. Hope this will resolve the issue.
>
> "insert into 'target_table'

Re: Controlling Number of small files while inserting into Hive table

2017-06-25 Thread Db-Blog
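Hi Arpan,

Include the partition column in the DISTRIBUTE BY clause of the insert statement. All rows for a given day then go to the same reducer, so each run writes only one file per day. Hope this resolves the issue.

"insert into 'target_table' select a,b,c from x where ... distribute by (date)"

PS: Backdated processing will generate additional file(s), because each later insert into an existing day's partition appends at least one new file.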
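
For reference, the suggestion above might look roughly like the following in HiveQL. This is only a sketch under assumptions: target_table is taken to be partitioned by a column called event_date (a hypothetical name, chosen because `date` is a reserved word in recent Hive versions), the filter is a placeholder for whatever the real WHERE clause is, and the two SET lines are only needed when the partition value comes from the data (dynamic partitioning).

```sql
-- Hypothetical names: target_table, x, a, b, c and event_date are placeholders.
-- Allow the partition value to be taken from the SELECT output.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The partition column must come last in the SELECT list.
-- DISTRIBUTE BY sends every row with the same event_date to the same reducer,
-- so each partition is written by a single task instead of many.
INSERT INTO TABLE target_table PARTITION (event_date)
SELECT a, b, c, event_date
FROM x
WHERE event_date >= '2017-06-01'   -- placeholder filter, not from the thread
DISTRIBUTE BY event_date;
```

Because INSERT INTO appends, rerunning this for a day that already has data adds another file to that partition, which matches the PS above.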