Hi There!
  I have been doing an interesting experiment building a Mac mini cluster (
http://www.scribd.com/doc/76827185/Mac-Mini-Hadoop-Cluster).
  I keep getting "Exception in thread "Thread-69"
java.lang.RuntimeException: Error while reading from task log url" errors
when I run Hive queries on a large data set.

  My current query looks like: INSERT OVERWRITE TABLE new_tbl PARTITION
(ds, hour) SELECT line, ds, hour FROM old_tbl WHERE ds LIKE '2012-02%';
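For context, since the PARTITION (ds, hour) clause gives no static values, this is a dynamic-partition insert, which Hive only accepts with dynamic partitioning enabled. A minimal sketch of the session settings (table and column names taken from the query above):

```sql
-- Required for dynamic-partition inserts; off by default in Hive 0.8.x
SET hive.exec.dynamic.partition=true;
-- nonstrict mode allows every partition column to be dynamic
SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT OVERWRITE TABLE new_tbl PARTITION (ds, hour)
SELECT line, ds, hour FROM old_tbl WHERE ds LIKE '2012-02%';
```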


  old_tbl stores its data as gzipped text; new_tbl stores it as SEQUENCEFILE.
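For reference, a hypothetical pair of DDLs matching that description (the column name `line` comes from the query; the actual schemas may differ):

```sql
-- Source: partitioned text table; Hive reads .gz files transparently
CREATE TABLE old_tbl (line STRING)
PARTITIONED BY (ds STRING, hour STRING)
STORED AS TEXTFILE;

-- Destination: same layout, but stored as SequenceFile
CREATE TABLE new_tbl (line STRING)
PARTITIONED BY (ds STRING, hour STRING)
STORED AS SEQUENCEFILE;
```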

I also see some failed tasks, and their logs look the same:
http://pastie.org/private/t6madgyxj0hppdzi4kaag (I checked the file
reported there with "hadoop fs -text" and it looks fine)


Searching around, I found that many people hit this issue because a
TaskTracker went down, and that they fixed it by increasing
mapred.child.java.opts to -Xmx1024M:
http://grokbase.com/t/hive/user/1157es4eaf/bizarro-hive-hadoop-error
I did that, but it made no difference.
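For completeness, the change I applied was in mapred-site.xml on every node (followed by a TaskTracker restart):

```xml
<!-- mapred-site.xml: heap for the child JVMs that run map/reduce tasks -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024M</value>
</property>
```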



My Configuration:
Hadoop Version: Hadoop 1.0.0
Platform: OS X 10.7.2 (Mac mini )
Nodes: 3 DataNodes, 1 NameNode, 1 JobTracker, 3 TaskTrackers
Hive version: 0.8.1
ulimit -a of all nodes: http://pastie.org/private/ukxeuqcz31qckmn9hiqsba
memory per node (sysctl -n hw.memsize): 4.096 GB
free_mem <https://gist.github.com/1690045#file_free_memory.rb>: 1.89 GB
output of allmemory: http://pastie.org/private/drscsrbxf6dg7t9pwoc1g


FSCK of whole external table location :
http://pastie.org/private/wymv8g4xxprxh44btmtwq



-v_abhi_v
