Re: How to change logging from DRFA to RFA? Is it a good idea?
On 29/09/10 00:12, Alex Kozlov wrote:
Hi Leo, What distribution are you using? Sometimes the log4j.properties is packed inside a .jar file, which is picked up first, so you need to explicitly give a java option '-Dlog4j.configuration=path-to-your-log4j-file' in the corresponding daemon flags.

You can find the JAR which has it in Ant, using the whichresource task. Indeed, that was why we wrote it. Here's a snippet from one of my buildfiles; tests.run.classpath is the classpath used to run tests, set up elsewhere:

  <target name="find-log4j" depends="ready-to-test"
          description="find log4j property files in the classpath">
    <whichresource resource="/log4j.properties"
                   classpathref="tests.run.classpath"
                   property="log4j.test.url"/>
    <echo>Log4J on the test classpath: ${log4j.test.url}</echo>
  </target>
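For later readers, a quick alternative to the Ant task is to ask the JVM directly which classpath entry a resource will be loaded from. This is a sketch, not from the thread (the class name is made up); run it with the same classpath the daemon in question uses, and the printed URL shows which jar or directory wins:

  public class FindLog4j {
      public static void main(String[] args) {
          // Class.getResource with a leading slash resolves via the
          // classloader the same way log4j's default lookup does:
          // the first match on the classpath wins.
          java.net.URL url = FindLog4j.class.getResource("/log4j.properties");
          System.out.println("log4j.properties resolves to: " + url);
      }
  }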
Reducers stuck in copy phase
Hi, While trying to run a MapReduce job, the reducers get stuck in the copy phase indefinitely. Although all the mappers have finished, the reducers stall at 15-20% completion. The log available at the reducers is as follows:

2010-09-29 11:33:24,535 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201009291127_0001_r_00_0 Need another 7 map output(s) where 5 is already in progress
2010-09-29 11:33:24,535 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201009291127_0001_r_00_0 Scheduled 0 outputs (0 slow hosts and 2 dup hosts)
2010-09-29 11:34:24,536 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201009291127_0001_r_00_0 Need another 7 map output(s) where 5 is already in progress
2010-09-29 11:34:24,536 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201009291127_0001_r_00_0 Scheduled 0 outputs (0 slow hosts and 2 dup hosts)
2010-09-29 11:35:24,537 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201009291127_0001_r_00_0 Need another 7 map output(s) where 5 is already in progress
2010-09-29 11:35:24,537 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201009291127_0001_r_00_0 Scheduled 0 outputs (0 slow hosts and 2 dup hosts)

Could you please help me figure out the cause of this reducer stall? Thanks in advance. --PB
Re: How to change logging from DRFA to RFA? Is it a good idea?
For the benefit of the list archives: the log4j properties are being set inside the hadoop daemon shell script (here is the relevant line, as pointed out to me by Boris):

bin/hadoop-daemon.sh:export HADOOP_ROOT_LOGGER=INFO,DRFA

On Tue, Sep 28, 2010 at 4:12 PM, Alex Kozlov ale...@cloudera.com wrote:
Hi Leo, What distribution are you using? Sometimes the log4j.properties is packed inside a .jar file, which is picked up first, so you need to explicitly give a java option '-Dlog4j.configuration=path-to-your-log4j-file' in the corresponding daemon flags. Alex K

On Tue, Sep 28, 2010 at 2:13 PM, Leo Alekseyev dnqu...@gmail.com wrote:
I have all of the above in my log4j.properties; every line that mentions DRFA is commented out. And yet, I still get the following errors:

log4j:ERROR Could not find value for key log4j.appender.DRFA
log4j:ERROR Could not instantiate appender named "DRFA".

Is there another config file? Is DRFA hard-coded somewhere?

On Mon, Sep 27, 2010 at 5:28 PM, Boris Shkolnik bo...@yahoo-inc.com wrote:
log4j.appender.RFA=org.apache.log4j.RollingFileAppender
log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}
log4j.appender.RFA.MaxFileSize=1MB
log4j.appender.RFA.MaxBackupIndex=30
hadoop.root.logger=INFO,RFA

On 9/27/10 4:12 PM, Leo Alekseyev dnqu...@gmail.com wrote:
We are looking for ways to prevent Hadoop daemon logs from piling up (over time they can reach several tens of GB and become a nuisance). Unfortunately, the log4j DRFA (DailyRollingFileAppender) class doesn't seem to provide an easy way to limit the number of files it creates. I would like to try switching to RFA (RollingFileAppender) with MaxFileSize and MaxBackupIndex set, since it looks like that will solve the log accumulation problem, but I can't figure out how to change the default logging class for the daemons. Can anyone give me some hints on how to do it? Alternatively, please let me know if there's a better solution to control log accumulation.
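For anyone finding this thread later, a minimal sketch of the resulting fix, assuming the stock script quoted above (where the export is unconditional) and assuming Boris's RFA appender definitions are already in conf/log4j.properties:

  # bin/hadoop-daemon.sh -- point the root logger at the size-capped appender
  export HADOOP_ROOT_LOGGER=INFO,RFA

With MaxFileSize=1MB and MaxBackupIndex=30, each daemon's log is then bounded at roughly 31 MB. Other distributions may set the logger elsewhere, or ship a log4j.properties inside a jar as Alex notes, so check where HADOOP_ROOT_LOGGER is actually exported first.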
Multiple masters in hadoop
Hi, The master file in hadoop/conf is named "masters". Wondering if I can configure multiple masters for a single cluster. If yes, how can I use them? Thanks, Bhushan
Re: Multiple masters in hadoop
The master listed in the masters and slaves files is a machine name or IP address. If you have a single cluster and you specify multiple names in those files, it will cause errors because of connection failures.

Shi

On 2010-9-29 15:28, Bhushan Mahale wrote:
Hi, The master file in hadoop/conf is named "masters". Wondering if I can configure multiple masters for a single cluster. If yes, how can I use them? Thanks, Bhushan

--
Postdoctoral Scholar
Institute for Genomics and Systems Biology
Department of Medicine, the University of Chicago
Knapp Center for Biomedical Discovery
900 E. 57th St. Room 10148
Chicago, IL 60637, US
Tel: 773-702-6799
Re: Multiple masters in hadoop
Actually the /hadoop/conf/masters file is for configuring secondarynamenode(s). Check http://www.cloudera.com/blog/2009/02/multi-host-secondarynamenode-configuration/ for details.

cheers,
-jw

On Wed, Sep 29, 2010 at 1:36 PM, Shi Yu sh...@uchicago.edu wrote:
The master listed in the masters and slaves files is a machine name or IP address. If you have a single cluster and you specify multiple names in those files, it will cause errors because of connection failures.
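To make that concrete for the archives, a minimal sketch (the hostnames below are made up): with the stock start-dfs.sh, a SecondaryNameNode is started on every host listed in conf/masters, one hostname per line, while the NameNode itself runs on whichever machine you invoke the start scripts from:

  # conf/masters -- one hostname per line; start-dfs.sh launches a
  # SecondaryNameNode on each of these hosts (example names only)
  checkpoint1.example.com
  checkpoint2.example.com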
java.lang.RuntimeException: java.io.EOFException at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:103)
Hi All, I am getting this exception on a cluster (10 nodes) when I run a simple Hadoop map/reduce job. I don't get this exception when running it on my desktop in Hadoop's pseudo-distributed mode. Can somebody help? I would really appreciate it.

10/09/29 14:28:34 INFO mapred.JobClient: map 100% reduce 30%
10/09/29 14:28:36 INFO mapred.JobClient: Task Id : attempt_201009291306_0004_r_00_0, Status : FAILED
java.lang.RuntimeException: java.io.EOFException
    at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:103)
    at org.apache.hadoop.mapred.Merger$MergeQueue.lessThan(Merger.java:373)
    at org.apache.hadoop.util.PriorityQueue.upHeap(PriorityQueue.java:123)
    at org.apache.hadoop.util.PriorityQueue.put(PriorityQueue.java:50)
    at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:447)
    at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:381)
    at org.apache.hadoop.mapred.Merger.merge(Merger.java:107)
    at org.apache.hadoop.mapred.Merger.merge(Merger.java:93)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.createKVIterator(ReduceTask.java:2316)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$400(ReduceTask.java:576)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:375)
    at speeditup.MsRead.readFields(MsRead.java:84)
    at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:97)
    ... 11 more

Here is the class whose compare is involved. It holds only two strings, each at most 20 characters:

public class MsRead implements WritableComparable<MsRead> {

    private static final Log LOG = LogFactory.getLog(speeditup.CalculateMinEvalue.class);

    private String query_id;
    private String record;

    public String getRecord() { return record; }
    public void setRecord(String record) { this.record = record; }
    public String getQuery_id() { return query_id; }
    public void setQuery_id(String queryId) { query_id = queryId; }

    public MsRead() { }

    public MsRead(String a, String r) {
        setQuery_id(a);
        setRecord(r);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        LOG.debug("**myreadFields");
        LOG.warn("**myreadFields");
        LOG.info("**myreadFields");
        query_id = in.readUTF();
        record = in.readUTF();
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(query_id);
        out.writeUTF(record);
    }

    public static class FirstComparator extends WritableComparator {

        private static final Text.Comparator TEXT_COMPARATOR = new Text.Comparator();

        public FirstComparator() {
            super(MsRead.class);
        }

        @Override
        public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
            try {
                int firstL1 = WritableUtils.decodeVIntSize(b1[s1]) + readVInt(b1, s1);
                int firstL2 = WritableUtils.decodeVIntSize(b2[s2]) + readVInt(b2, s2);
                return TEXT_COMPARATOR.compare(b1, s1, firstL1, b2, s2, firstL2);
            } catch (IOException e) {
                throw new IllegalArgumentException(e);
            }
        }

        @Override
        public int compare(WritableComparable a, WritableComparable b) {
            if (a instanceof MsRead && b instanceof MsRead) {
                // System.err.println("COMPARE " + ((MsRead) a).getType() + "\t" + ((MsRead) b).getType() + "\t"
                //     + ((MsRead) a).toString().compareTo(((MsRead) b).toString()));
                return ((MsRead) a).toString().compareTo(((MsRead) b).toString());
            }
            return super.compare(a, b);
        }
    }

    @Override
    public int compareTo(MsRead o) {
        return this.toString().compareTo(o.toString());
    }

    @Override
    public boolean equals(Object right) {
        if (right instanceof MsRead) {
            return query_id.equals(((MsRead) right).query_id);
        }
        return false;
    }

    @Override
    public int hashCode() {
        return query_id.hashCode();
    }

    @Override
    public String toString() {
        return query_id;
    }

    public String toOutputString() {
        return record;
    }
}
RE: Multiple masters in hadoop
Thanks James. The link is helpful too.

Regards,
Bhushan

-----Original Message-----
From: james warren [mailto:ja...@rockyou.com]
Sent: Wednesday, September 29, 2010 1:50 PM
To: common-user@hadoop.apache.org
Subject: Re: Multiple masters in hadoop

Actually the /hadoop/conf/masters file is for configuring secondarynamenode(s). Check http://www.cloudera.com/blog/2009/02/multi-host-secondarynamenode-configuration/ for details.

cheers,
-jw
Re: java.lang.RuntimeException: java.io.EOFException at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:103)
Your MsRead.readFields() doesn't contain a readInt() call. Can you show us the lines around line 84 of MsRead.java?

On Wed, Sep 29, 2010 at 2:44 PM, Tali K ncherr...@hotmail.com wrote:
Hi All, I am getting this exception on a cluster (10 nodes) when I run a simple Hadoop map/reduce job. I don't get this exception when running it on my desktop in Hadoop's pseudo-distributed mode. ...
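For later readers, since the thread ends here: an EOFException thrown from inside WritableComparator.compare() generally means the bytes being deserialized don't match what readFields() expects -- either write() and readFields() disagree, or the class deployed on the cluster is older than the posted source (which would explain a readInt() at line 84 that the posted readFields() doesn't contain). Note also that the posted FirstComparator's raw-byte compare() assumes Text-style VInt length prefixes, while write() uses writeUTF(), which emits a 2-byte length; if that comparator is registered for the job, the two encodings disagree. Below is a minimal sketch (names are illustrative, not the poster's code; assumes the 0.20-era org.apache.hadoop.io API) of a two-string key whose write(), readFields(), and raw comparator all use one encoding, org.apache.hadoop.io.Text:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;
import org.apache.hadoop.io.WritableUtils;

public class PairKey implements WritableComparable<PairKey> {

    private final Text queryId = new Text();
    private final Text record = new Text();

    public void set(String q, String r) {
        queryId.set(q);
        record.set(r);
    }

    @Override
    public void write(DataOutput out) throws IOException {
        queryId.write(out);   // Text writes a VInt length + UTF-8 bytes
        record.write(out);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        queryId.readFields(in);  // mirrors write() exactly, field for field
        record.readFields(in);
    }

    @Override
    public int compareTo(PairKey o) {
        return queryId.compareTo(o.queryId);
    }

    @Override
    public boolean equals(Object o) {
        return (o instanceof PairKey) && queryId.equals(((PairKey) o).queryId);
    }

    @Override
    public int hashCode() {
        return queryId.hashCode();
    }

    // Raw comparator that parses the same VInt-prefixed format write() emits.
    public static class FirstComparator extends WritableComparator {
        private static final Text.Comparator TEXT_COMPARATOR = new Text.Comparator();

        public FirstComparator() {
            super(PairKey.class);
        }

        @Override
        public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
            try {
                // Length of the first Text field = VInt header + payload.
                int firstL1 = WritableUtils.decodeVIntSize(b1[s1]) + readVInt(b1, s1);
                int firstL2 = WritableUtils.decodeVIntSize(b2[s2]) + readVInt(b2, s2);
                return TEXT_COMPARATOR.compare(b1, s1, firstL1, b2, s2, firstL2);
            } catch (IOException e) {
                throw new IllegalArgumentException(e);
            }
        }
    }
}

If the key must stay on writeUTF(), the simpler alternative is to drop the raw compare() override entirely, so WritableComparator falls back to deserializing both keys via readFields() and comparing the objects -- slower, but indifferent to the wire format.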