[ https://issues.apache.org/jira/browse/HIVE-396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740431#action_12740431 ]
Yuntao Jia commented on HIVE-396:
---------------------------------

Can you check how many jobs there are in the Hive query? If there are two, the output from Hive is being merged into a single file by an additional map-reduce job. If that is the case, you can turn the merge off by changing the following property in hive-default.xml to false (it is true by default):

    <property>
      <name>hive.merge.mapfiles</name>
      <value>true</value>
      <description>Merge small files at the end of a map-only job</description>
    </property>

Other than that, I have no idea why Hive is so slow.

> Hive performance benchmarks
> ---------------------------
>
>                 Key: HIVE-396
>                 URL: https://issues.apache.org/jira/browse/HIVE-396
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: Zheng Shao
>            Assignee: Yuntao Jia
>         Attachments: hive_benchmark_2009-06-18.pdf,
>                      hive_benchmark_2009-06-18.tar.gz, hive_benchmark_2009-07-12.pdf,
>                      hive_benchmark_2009-07-21.tar.gz
>
>
> We need some performance benchmarks to measure and track the performance
> improvements of Hive.
> Some references:
> PIG performance benchmarks PIG-200
> PigMix: http://wiki.apache.org/pig/PigMix

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.