Hi there! I have been running an interesting experiment building a Mac mini cluster (http://www.scribd.com/doc/76827185/Mac-Mini-Hadoop-Cluster). I keep getting "Exception in thread "Thread-69" java.lang.RuntimeException: Error while reading from task log url" errors when I run Hive queries over a large set of data.
My current query looks like:

INSERT OVERWRITE TABLE new_tbl PARTITION (ds, hour)
SELECT line, ds, hour FROM old_tbl WHERE ds LIKE '2012-02%';

old_tbl stores its data as gzipped text; new_tbl stores it as SEQUENCEFILE.

I also see failed tasks, and their logs all look the same: http://pastie.org/private/t6madgyxj0hppdzi4kaag (I checked the file they report with "hadoop fs -text" and it looks fine.)

Searching on the issue, I found that many people hit it because a TaskTracker went down and they needed to raise mapred.child.java.opts to -Xmx1024M: http://grokbase.com/t/hive/user/1157es4eaf/bizarro-hive-hadoop-error. I did that and it didn't make a difference.

My configuration:
- Hadoop version: Hadoop 1.0.0
- Platform: OS X 10.7.2 (Mac mini)
- Nodes: 3 DataNodes, 1 NameNode, 1 JobTracker, 3 TaskTrackers
- Hive version: 0.8.1
- ulimit -a on all nodes: http://pastie.org/private/ukxeuqcz31qckmn9hiqsba
- Memory per node (sysctl -n hw.memsize): 4.096G
- free_mem <https://gist.github.com/1690045#file_free_memory.rb>: 1.89G
- Output of allmemory: http://pastie.org/private/drscsrbxf6dg7t9pwoc1g
- fsck of the whole external table location: http://pastie.org/private/wymv8g4xxprxh44btmtwq

-v_abhi_v
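For reference, this is roughly how I applied the heap increase suggested in that thread, as a property in mapred-site.xml on each node (a sketch; the value is just the -Xmx1024M from the thread above, and whether it needs to go on the TaskTracker nodes, the client, or both is my assumption):

```
<!-- mapred-site.xml fragment: raise the child JVM heap for map/reduce tasks -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024M</value>
</property>
```

After editing, I restarted the TaskTrackers so the new child JVM options would take effect.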