>Does alter table <table> concatenate not work? I have a dynamically
>partitioned table (stored as ORC). I tried alter table ... concatenate,
>but it did not work. See my test result.

ORC fast concatenate does work on partitioned tables, but it doesn't work
on bucketed tables.

Bucketed tables cannot merge files, since the file count is capped by the
numBuckets parameter.
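
For a partitioned table that is not bucketed, the merge is requested one
partition at a time. A minimal sketch, assuming the table and partition
names from the listing quoted below (orc_merge5a, st=0.8) and assuming the
table is not bucketed:

```sql
-- Assumption: orc_merge5a is partitioned by st and is NOT bucketed.
-- Merges the small ORC files inside this one partition.
ALTER TABLE orc_merge5a PARTITION (st=0.8) CONCATENATE;
```

Re-running the dfs -ls on the partition directory afterwards should show
whether the files were actually merged.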

>hive> dfs -ls 
>${hiveconf:hive.metastore.warehouse.dir}/orc_merge5a/st=0.8/;
>Found 2 items
>-rw-r--r--   3 patcharee hdfs        534 2015-04-21 12:33
>/apps/hive/warehouse/orc_merge5a/st=0.8/000000_0
>-rw-r--r--   3 patcharee hdfs        533 2015-04-21 12:33
>/apps/hive/warehouse/orc_merge5a/st=0.8/000001_0

Is this a bucketed table?
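
One way to check, as a sketch: desc formatted prints the storage details
of the table, and the Num Buckets and Bucket Columns rows show whether it
is bucketed (a non-bucketed table normally reports Num Buckets as -1 and
an empty Bucket Columns list).

```sql
-- Table name taken from the dfs -ls listing above.
DESCRIBE FORMATTED orc_merge5a;
```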

From the point of view of split generation and cluster parallelism,
bucketing is an anti-pattern, since in most query schemas it significantly
slows down the slowest task.

Making the fastest task faster is rarely worth it if the overall query
time goes up.

Also, if you want to, you can send me the output of
yarn logs -applicationId <app-id> and of desc formatted for the table,
which will help me better understand what's happening.

Cheers,
Gopal

