Hi, we are using Hive to merge small files by setting hive.merge.smallfiles.avgsize to 120000000 and doing an insert as select into a table. The problem is that this takes two passes over the data: the first to insert the data and the second to merge it.
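For reference, a minimal sketch of the setup described above. The table names target_table and source_table are placeholders, and the two hive.merge.*files switches are assumptions about which merge triggers are enabled; only the avgsize value is taken from our configuration.

```sql
-- Assumed: merge outputs of both map-only and map-reduce jobs
SET hive.merge.mapfiles=true;
SET hive.merge.mapredfiles=true;

-- From our setup: if the average output file size of the job is below
-- this many bytes (~120 MB), Hive launches an extra job to merge the files
SET hive.merge.smallfiles.avgsize=120000000;

-- Placeholder table names; the insert is the first pass, the merge the second
INSERT OVERWRITE TABLE target_table
SELECT * FROM source_table;
```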
Is there a more efficient way to have Hive merge the small files without making two passes over the data? Thank you. Daniel
