I have a cluster of 10 nodes or so and have mainly been using Pig on it, but I would like to try out Hive.
I am running the query below, which doesn't take too long in Pig but is taking quite a long time in Hive:

    hive -e "select count(1) as ct from my_table where v1='02' and v2 = 11112222;" > thecount

The job uses only 1 reducer, yet it spends most of its time in the reduce step. I tried manually setting more reducers, but I think a job without a GROUP BY is forced to use 1 reducer? Either way, I would love to know why this is dragging.

It's worth noting that my_table is not saved in the Hive format, but rather as a flat file. I realize this can influence performance, but shouldn't it at least perform on par with Pig?

Thanks for your help,
Jon