Hi,

Even the simplest Hive queries seem to be consuming 100% CPU. This is on a small 4-node cluster. The machines are pretty beefy (16 cores per machine, tons of RAM, 16 M+R maximum tasks configured, 1GB RAM for mapred.child.java.opts, etc.). An example is a simple query like "select count(1) from events", where the events table has daily partitions of log files in gzipped format.

While this is probably too generic a question and there is a bunch of investigation we still need to do, are there any specific areas I should look at? Has anyone seen anything like this before? Also, are there any tools or easy options for profiling Hive query execution?

Thanks in advance,
Vijay