Archives larger than 2^31 bytes in DistributedCache

2008-10-31 Thread Christian Kunz
In hadoop-0.17
we tried to use a 2.2GB archive and seemingly ran into
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6599383:

java.util.zip.ZipException: error in opening zip file
        at java.util.zip.ZipFile.open(Native Method)
        at java.util.zip.ZipFile.<init>(ZipFile.java:114)
        at java.util.zip.ZipFile.<init>(ZipFile.java:131)
        at org.apache.hadoop.fs.FileUtil.unZip(FileUtil.java:421)
        at org.apache.hadoop.filecache.DistributedCache.localizeCache(DistributedCache.java:338)
        at org.apache.hadoop.filecache.DistributedCache.getLocalCache(DistributedCache.java:161)
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:137)

Is any workaround known for hadoop-0.17 (other than splitting into multiple smaller archives)?

Is it correct to assume that in hadoop-0.18 this is no longer an issue when
using tar.gz?
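
For context, this is roughly how the archive gets registered (a minimal sketch; the HDFS path is a placeholder, and DistributedCache.addCacheArchive is the standard call of that era):

import java.net.URI;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.mapred.JobConf;

public class CacheSetup {
    public static void addArchive(JobConf conf) throws Exception {
        // The archive is unpacked on each task node before the task starts.
        // The path is a placeholder for the 2.2GB archive mentioned above.
        DistributedCache.addCacheArchive(
            new URI("hdfs:///user/christian/big-archive.zip"), conf);
    }
}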

Thanks,
Christian



Please help, don't know how to solve--java.io.IOException: WritableName can't load class

2008-10-31 Thread Mudong Lu
Hello, guys,

I am very new to Hadoop. I was trying to read Nutch data files using a
script I found at http://wiki.apache.org/nutch/Getting_Started . After two
days of trying, I still cannot get it to work. The error I now get is
"java.lang.RuntimeException: java.io.IOException: WritableName can't load
class".

Below is my script:



/*
 * To change this template, choose Tools | Templates
 * and open the template in the editor.
 */

package test;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.MapFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableComparable;


/**
 *
 * @author mudong
 */
public class Main {

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) {
        // TODO code application logic here
        try {
            Configuration conf = new Configuration();
            conf.addResource(new Path(
                "/home/mudong/programming/java/hadoop-0.17.2.1/conf/hadoop-default.xml"));
            // conf.addResource(new Path(
            //     "/home/mudong/programming/java/hadoop-0.18.1/conf/hadoop-default.xml"));
            FileSystem fs = FileSystem.get(conf);
            String seqFile =
                "/home/mudong/programming/java/nutch-0.9/crawl/segments/20081021075837/content/part-0";
            MapFile.Reader reader;
            reader = new MapFile.Reader(fs, seqFile, conf);
            Class keyC = reader.getKeyClass();
            Class valueC = reader.getValueClass();

            while (true) {
                WritableComparable key = null;
                Writable value = null;
                try {
                    key = (WritableComparable) keyC.newInstance();
                    value = (Writable) valueC.newInstance();
                } catch (Exception ex) {
                    ex.printStackTrace();
                    System.exit(-1);
                }

                try {
                    if (!reader.next(key, value)) {
                        break;
                    }

                    System.out.println(key);
                    System.out.println(value);
                } catch (Exception e) {
                    e.printStackTrace();
                    System.out.println("Exception occured. " + e);
                    break;
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
            System.out.println("Exception occured. " + e);
        }
    }
}


And when I run the script above, I get the error messages below.

java.lang.RuntimeException: java.io.IOException: WritableName can't load class
        at org.apache.hadoop.io.SequenceFile$Reader.getValueClass(SequenceFile.java:1612)
Exception occured. java.lang.RuntimeException: java.io.IOException: WritableName can't load class
        at org.apache.hadoop.io.MapFile$Reader.getValueClass(MapFile.java:248)
        at test.Main.main(Main.java:36)
Caused by: java.io.IOException: WritableName can't load class
        at org.apache.hadoop.io.WritableName.getClass(WritableName.java:74)
        at org.apache.hadoop.io.SequenceFile$Reader.getValueClass(SequenceFile.java:1610)
        ... 2 more
Caused by: java.lang.ClassNotFoundException: org.apache.nutch.protocol.Content
        at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
        at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:581)
        at org.apache.hadoop.io.WritableName.getClass(WritableName.java:72)
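
The bottom of the trace is a ClassNotFoundException for org.apache.nutch.protocol.Content, i.e. the Nutch classes are not visible to the JVM that reads the MapFile (WritableName resolves the value class through Configuration.getClassByName). A hedged sketch of one way to make them visible; the jar path is an assumption, and simply adding the Nutch jar to the launch classpath should work as well:

import java.net.URL;
import java.net.URLClassLoader;
import org.apache.hadoop.conf.Configuration;

public class NutchClasspath {
    // Sketch only: point the Configuration at a class loader that can see the Nutch classes.
    public static Configuration withNutchClasses(Configuration conf) throws Exception {
        // The jar location is an assumption; use wherever the Nutch jar actually lives.
        URL nutchJar = new URL("file:///home/mudong/programming/java/nutch-0.9/nutch-0.9.jar");
        URLClassLoader loader = new URLClassLoader(new URL[] { nutchJar }, conf.getClassLoader());
        conf.setClassLoader(loader);  // WritableName/Configuration resolve classes via this loader
        return conf;
    }
}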

I've tried a lot of things, but it's just not working. I use
hadoop-0.17.2.1. Thanks a lot, guys,

Rongdong


RE: "Merge of the inmemory files threw an exception" and diffs between 0.17.2 and 0.18.1

2008-10-31 Thread Deepika Khera
Wow, if the issue is fixed with version 0.20, then could we please have
a patch for version 0.18? 

Thanks,
Deepika

-Original Message-
From: Grant Ingersoll [mailto:[EMAIL PROTECTED] 
Sent: Thursday, October 30, 2008 12:19 PM
To: core-user@hadoop.apache.org
Subject: Re: "Merge of the inmemory files threw an exception" and diffs
between 0.17.2 and 0.18.1

So, Philippe reports that the problem goes away with 0.20-dev  
(trunk?): http://mahout.markmail.org/message/swmzreg6fnzf6icv   We  
aren't totally clear on the structure of SVN for Hadoop, but it seems  
like it is not fixed by this patch.



On Oct 29, 2008, at 10:28 AM, Grant Ingersoll wrote:

> We'll try it out...
>
> On Oct 28, 2008, at 3:00 PM, Arun C Murthy wrote:
>
>>
>> On Oct 27, 2008, at 7:05 PM, Grant Ingersoll wrote:
>>
>>> Hi,
>>>
>>> Over in Mahout (lucene.a.o/mahout), we are seeing an oddity with  
>>> some of our clustering code and Hadoop 0.18.1.  The thread in  
>>> context is at:  http://mahout.markmail.org/message/vcyvlz2met7fnthr
>>>
>>> The problem seems to occur when going from 0.17.2 to 0.18.1.  In  
>>> the user logs, we are seeing the following exception:
>>> 2008-10-27 21:18:37,014 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 5011 bytes
>>> 2008-10-27 21:18:37,033 WARN org.apache.hadoop.mapred.ReduceTask: attempt_200810272112_0011_r_00_0 Merge of the inmemory files threw an exception: java.io.IOException: Intermedate merge failed
>>>         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2147)
>>>         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2078)
>>> Caused by: java.lang.NumberFormatException: For input string: "["
>>
>> If you are sure that this isn't caused by your application-logic,  
>> you could try running with
http://issues.apache.org/jira/browse/HADOOP-4277 
>> .
>>
>> That bug caused many a ship to sail in large circles, hopelessly.
>>
>> Arun
>>
>>>
>>>         at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1224)
>>>         at java.lang.Double.parseDouble(Double.java:510)
>>>         at org.apache.mahout.matrix.DenseVector.decodeFormat(DenseVector.java:60)
>>>         at org.apache.mahout.matrix.AbstractVector.decodeVector(AbstractVector.java:256)
>>>         at org.apache.mahout.clustering.kmeans.KMeansCombiner.reduce(KMeansCombiner.java:38)
>>>         at org.apache.mahout.clustering.kmeans.KMeansCombiner.reduce(KMeansCombiner.java:31)
>>>         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.combineAndSpill(ReduceTask.java:2174)
>>>         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$3100(ReduceTask.java:341)
>>>         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2134)
>>>
>>> And in the main output log (from running bin/hadoop jar mahout/examples/build/apache-mahout-examples-0.1-dev.job org.apache.mahout.clustering.syntheticcontrol.kmeans.Job) we see:
>>> 08/10/27 21:18:41 INFO mapred.JobClient: Task Id : attempt_200810272112_0011_r_00_0, Status : FAILED
>>> java.io.IOException: attempt_200810272112_0011_r_00_0The reduce copier failed
>>>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:255)
>>>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
>>>
>>> If I run this exact same job on 0.17.2 it all runs fine.  I  
>>> suppose either a bug was introduced in 0.18.1 or a bug was fixed  
>>> that we were relying on.  Looking at the release notes between the  
>>> fixes, nothing in particular struck me as related.  If it helps, I  
>>> can provide the instructions for how to run the example in  
>>> question (they need to be written up anyway!)
>>>
>>>
>>> I see some related things at
>>> http://hadoop.markmail.org/search/?q=Merge+of+the+inmemory+files+threw+an+exception
>>> , but those are older, it seems, so not sure what to make of them.
>>>
>>> Thanks,
>>> Grant
>>
>
> --
> Grant Ingersoll
> Lucene Boot Camp Training Nov. 3-4, 2008, ApacheCon US New Orleans.
> http://www.lucenebootcamp.com
>
>
> Lucene Helpful Hints:
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/LuceneFAQ
>
>
>
>
>
>



Re: Status FUSE-Support of HDFS

2008-10-31 Thread Pete Wyckoff

It has come a long way since 0.18, and Facebook keeps our (0.17) DFS mounted via
fuse-dfs and uses that for some operations.

There have recently been some problems with fuse-dfs when used in a 
multithreaded environment, but those have been fixed in 0.18.2 and 0.19. (do 
not use 0.18 or 0.18.1)

The current (known) issues are:
  1. Wrong semantics when copying over an existing file - namely it does a 
delete and then re-creates the file, so ownership/permissions may end up wrong. 
There is a patch for this.
  2. When directories have 10s of thousands of files, performance can be very 
poor.
  3. POSIX truncate is supported only for truncating a file to size 0, since HDFS
doesn't support truncate.
  4. Appends are not supported - this is a libhdfs problem and there is a patch 
for it.

It is still a pre-1.0 product for sure, but it has been pretty stable for us.


-- pete


On 10/31/08 9:08 AM, "Robert Krüger" <[EMAIL PROTECTED]> wrote:



Hi,

could anyone tell me what the current Status of FUSE support for HDFS
is? Is this something that can be expected to be usable in a few
weeks/months in a production environment? We have been really
happy/successful with HDFS in our production system. However, some
software we use in our application simply requires an OS-Level file
system which currently requires us to do a lot of copying between HDFS
and a regular file system for processes which require that software and
FUSE support would really eliminate that one disadvantage we have with
HDFS. We wouldn't even require the performance of that to be outstanding
because just by eliminating the copy step, we would greatly increase
the throughput of those processes.

Thanks for sharing any thoughts on this.

Regards,

Robert




Re: Mapper settings...

2008-10-31 Thread Owen O'Malley


On Oct 31, 2008, at 3:15 PM, Bhupesh Bansal wrote:


Why do we need these setters in JobConf ??

jobConf.setMapOutputKeyClass(String.class);

jobConf.setMapOutputValueClass(LongWritable.class);


Just historical. The Mapper and Reducer interfaces didn't use to be  
generic. (Hadoop used to run on Java 1.4 too...)


It would be nice to remove the need to call them. There is an old bug
open to check for consistency (HADOOP-1683). It would be even better to
make the setting of both the map and reduce output types optional if
they are specified by the template parameters.
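
For illustration, a hedged sketch (the Mapper below is a made-up example) of how the generic parameters already carry the information that the two setters repeat:

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// The generic parameters already state the map output types (Text, LongWritable)...
public class WordLengthMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, LongWritable> {
    public void map(LongWritable key, Text value,
                    OutputCollector<Text, LongWritable> out, Reporter reporter)
            throws IOException {
        out.collect(value, new LongWritable(value.getLength()));
    }
}

// ...yet the JobConf still has to be told the same thing explicitly:
//   conf.setMapperClass(WordLengthMapper.class);
//   conf.setMapOutputKeyClass(Text.class);
//   conf.setMapOutputValueClass(LongWritable.class);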


-- Owen


Re: LeaseExpiredException and too many xceiver

2008-10-31 Thread Raghu Angadi


The config on most Y! clusters sets dfs.datanode.max.xcievers to a large
value, something like 1k to 2k. You could try that.
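
For reference, a hedged sketch of the corresponding hadoop-site.xml entry on the datanodes (the value is just an example in the range mentioned above):

<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>2048</value>
</property>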


Raghu.

Nathan Marz wrote:
Looks like the exception on the datanode got truncated a little bit. 
Here's the full exception:


2008-10-31 14:20:09,978 ERROR org.apache.hadoop.dfs.DataNode: DatanodeRegistration(10.100.11.115:50010, storageID=DS-2129547091-10.100.11.115-50010-1225485937590, infoPort=50075, ipcPort=50020):DataXceiver: java.io.IOException: xceiverCount 257 exceeds the limit of concurrent xcievers 256
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:1030)
        at java.lang.Thread.run(Thread.java:619)


On Oct 31, 2008, at 2:49 PM, Nathan Marz wrote:


Hello,

We are seeing some really bad errors on our hadoop cluster. After 
reformatting the whole cluster, the first job we run immediately fails 
with "Could not find block locations..." errors. In the namenode
logs, we see a ton of errors like:


2008-10-31 14:20:44,799 INFO org.apache.hadoop.ipc.Server: IPC Server 
handler 5 on 7276, call addBlock(/tmp/dustintmp/shredded_dataunits/_t$
org.apache.hadoop.dfs.LeaseExpiredException: No lease on 
/tmp/dustintmp/shredded_dataunits/_temporary/_attempt_200810311418_0002_m_23_0$ 

   at 
org.apache.hadoop.dfs.FSNamesystem.checkLease(FSNamesystem.java:1166)
   at 
org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1097) 


   at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
   at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
   at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 


   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)



In the datanode logs, we see a ton of errors like:

2008-10-31 14:20:09,978 ERROR org.apache.hadoop.dfs.DataNode: 
DatanodeRegistration(10.100.11.115:50010, 
storageID=DS-2129547091-10.100.11.1$

of concurrent xcievers 256
   at 
org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:1030)

   at java.lang.Thread.run(Thread.java:619)



Anyone have any ideas on what may be wrong?

Thanks,
Nathan Marz
Rapleaf






Mapper settings...

2008-10-31 Thread Bhupesh Bansal
Hey guys, 

Just curious, 

Why do we need these setters in JobConf ??

jobConf.setMapOutputKeyClass(String.class);

jobConf.setMapOutputValueClass(LongWritable.class);

We should be able to extract these from
OutputController of Mapper class ??

IMHO, they have to be consistent with OutputCollector class. so why have
extra point of failures ?


Best
Bhupesh



Re: To Compute or Not to Compute on Prod

2008-10-31 Thread shahab mehmandoust
Currently, I'm just researching so I'm just playing with the idea of
streaming log data into the HDFS.

I'm confused about: "...all you need is a Hadoop install.  Your production
node doesn't need to be a
datanode."  If my production node is *not* a datanode, then how can I do
"hadoop dfs -put"?

I was under the impression that when I install HDFS on a cluster, each node
in the cluster is a datanode.

Shahab

On Fri, Oct 31, 2008 at 1:46 PM, Norbert Burger <[EMAIL PROTECTED]>wrote:

> What are you using to "stream logs into the HDFS"?
>
> If the command-line tools (i.e., "hadoop dfs -put") work for you, then all
> you
> need is a Hadoop install.  Your production node doesn't need to be a
> datanode.
>
> On Fri, Oct 31, 2008 at 2:35 PM, shahab mehmandoust <[EMAIL PROTECTED]
> >wrote:
>
> > I want to stream data from logs into the HDFS in production but I do NOT
> > want my production machine to be apart of the computation cluster.  The
> > reason I want to do it in this way is to take advantage of HDFS without
> > putting computation load on my production machine.  Is this possible*?*
> > Furthermore, is this unnecessary because the computation would not put a
> > significant load on my production box (obviously depends on the
> map/reduce
> > implementation but I'm asking in general)*?*
> >
> > I should note that our prod machine hosts our core web application and
> > database (saving up for another box :-).
> >
> > Thanks,
> > Shahab
> >
>


Re: LeaseExpiredException and too many xceiver

2008-10-31 Thread Nathan Marz
Looks like the exception on the datanode got truncated a little bit.  
Here's the full exception:


2008-10-31 14:20:09,978 ERROR org.apache.hadoop.dfs.DataNode: DatanodeRegistration(10.100.11.115:50010, storageID=DS-2129547091-10.100.11.115-50010-1225485937590, infoPort=50075, ipcPort=50020):DataXceiver: java.io.IOException: xceiverCount 257 exceeds the limit of concurrent xcievers 256
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:1030)
        at java.lang.Thread.run(Thread.java:619)


On Oct 31, 2008, at 2:49 PM, Nathan Marz wrote:


Hello,

We are seeing some really bad errors on our hadoop cluster. After  
reformatting the whole cluster, the first job we run immediately  
fails with "Could not find block locations..." errors. In the
namenode logs, we see a ton of errors like:


2008-10-31 14:20:44,799 INFO org.apache.hadoop.ipc.Server: IPC  
Server handler 5 on 7276, call addBlock(/tmp/dustintmp/ 
shredded_dataunits/_t$
org.apache.hadoop.dfs.LeaseExpiredException: No lease on /tmp/ 
dustintmp/shredded_dataunits/_temporary/ 
_attempt_200810311418_0002_m_23_0$
   at  
org.apache.hadoop.dfs.FSNamesystem.checkLease(FSNamesystem.java:1166)
   at  
org 
.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java: 
1097)

   at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
   at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
   at  
sun 
.reflect 
.DelegatingMethodAccessorImpl 
.invoke(DelegatingMethodAccessorImpl.java:25)

   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)



In the datanode logs, we see a ton of errors like:

2008-10-31 14:20:09,978 ERROR org.apache.hadoop.dfs.DataNode:  
DatanodeRegistration(10.100.11.115:50010,  
storageID=DS-2129547091-10.100.11.1$

of concurrent xcievers 256
   at org.apache.hadoop.dfs.DataNode 
$DataXceiver.run(DataNode.java:1030)

   at java.lang.Thread.run(Thread.java:619)



Anyone have any ideas on what may be wrong?

Thanks,
Nathan Marz
Rapleaf




LeaseExpiredException and too many xceiver

2008-10-31 Thread Nathan Marz

Hello,

We are seeing some really bad errors on our hadoop cluster. After  
reformatting the whole cluster, the first job we run immediately fails  
with "Could not find block locations..." errors. In the namenode
logs, we see a ton of errors like:


2008-10-31 14:20:44,799 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 7276, call addBlock(/tmp/dustintmp/shredded_dataunits/_t$
org.apache.hadoop.dfs.LeaseExpiredException: No lease on /tmp/dustintmp/shredded_dataunits/_temporary/_attempt_200810311418_0002_m_23_0$
        at org.apache.hadoop.dfs.FSNamesystem.checkLease(FSNamesystem.java:1166)
        at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1097)
        at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
        at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)


In the datanode logs, we see a ton of errors like:

2008-10-31 14:20:09,978 ERROR org.apache.hadoop.dfs.DataNode: DatanodeRegistration(10.100.11.115:50010, storageID=DS-2129547091-10.100.11.1$

of concurrent xcievers 256
        at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:1030)
        at java.lang.Thread.run(Thread.java:619)



Anyone have any ideas on what may be wrong?

Thanks,
Nathan Marz
Rapleaf


RE: "Merge of the inmemory files threw an exception" and diffs between 0.17.2 and 0.18.1

2008-10-31 Thread Deepika Khera
Hi Devraj,

It was pretty consistent with my comparator class from my old email (the
one that uses UTF8). While trying to resolve the issue, I changed UTF8
to Text. That made it disappear for a while, but then it came back again.
My new comparator class (with Text) is:

public class IncrementalURLIndexKey implements WritableComparable {
  private Text url;
  private long userid;

  public IncrementalURLIndexKey() {
  }

  public IncrementalURLIndexKey(Text url, long userid) {
this.url = url;
this.userid = userid;
  }

  public Text getUrl() {
return url;
  }

  public long getUserid() {
return userid;
  }

  public void write(DataOutput out) throws IOException {
url.write(out);
out.writeLong(userid);
  }

  public void readFields(DataInput in) throws IOException {
url = new Text();
url.readFields(in);
userid = in.readLong();
  }

  public int compareTo(Object o) {
IncrementalURLIndexKey other = (IncrementalURLIndexKey) o;
int result = url.compareTo(other.getUrl());
if (result == 0) result = CUID.compare(userid, other.userid);
return result;
  }

  /**
   * A Comparator optimized for IncrementalURLIndexKey.
   */
  public static class GroupingComparator extends WritableComparator {
public GroupingComparator() {
  super(IncrementalURLIndexKey.class, true);
}

public int compare(WritableComparable a, WritableComparable b) {
  IncrementalURLIndexKey key1 = (IncrementalURLIndexKey) a;
  IncrementalURLIndexKey key2 = (IncrementalURLIndexKey) b;

  return key1.getUrl().compareTo(key2.getUrl());
}
  }

  static {
WritableComparator.define(IncrementalURLIndexKey.class, new
GroupingComparator());
  }
}

Thanks,
Deepika


-Original Message-
From: Devaraj Das [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, October 28, 2008 9:01 PM
To: core-user@hadoop.apache.org
Subject: Re: "Merge of the inmemory files threw an exception" and diffs
between 0.17.2 and 0.18.1

Quick question (I haven't looked at your comparator code yet) - is this
reproducible/consistent?


On 10/28/08 11:52 PM, "Deepika Khera" <[EMAIL PROTECTED]> wrote:

> I am getting a similar exception too with Hadoop 0.18.1 (see stacktrace
> below), though it's an EOFException. Does anyone have any idea about what
> it means and how it can be fixed?
> 
> 2008-10-27 16:53:07,407 WARN org.apache.hadoop.mapred.ReduceTask: attempt_200810241922_0844_r_06_0 Merge of the inmemory files threw an exception: java.io.IOException: Intermedate merge failed
>         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2147)
>         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2078)
> Caused by: java.lang.RuntimeException: java.io.EOFException
>         at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:103)
>         at org.apache.hadoop.mapred.Merger$MergeQueue.lessThan(Merger.java:269)
>         at org.apache.hadoop.util.PriorityQueue.upHeap(PriorityQueue.java:122)
>         at org.apache.hadoop.util.PriorityQueue.put(PriorityQueue.java:49)
>         at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:321)
>         at org.apache.hadoop.mapred.Merger.merge(Merger.java:72)
>         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2123)
>         ... 1 more
> Caused by: java.io.EOFException
>         at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:323)
>         at org.apache.hadoop.io.UTF8.readFields(UTF8.java:103)
>         at com.collarity.io.IOUtil.readUTF8(IOUtil.java:213)
>         at com.collarity.url.IncrementalURLIndexKey.readFields(IncrementalURLIndexKey.java:40)
>         at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:97)
>         ... 7 more
> 
> 2008-10-27 16:53:07,407 WARN org.apache.hadoop.mapred.ReduceTask: attempt_200810241922_0844_r_06_0 Merging of the local FS files threw an exception: java.io.IOException: java.lang.RuntimeException: java.io.EOFException
>         at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:103)
>         at org.apache.hadoop.mapred.Merger$MergeQueue.lessThan(Merger.java:269)
>         at org.apache.hadoop.util.PriorityQueue.downHeap(PriorityQueue.java:135)
>         at org.apache.hadoop.util.PriorityQueue.adjustTop(PriorityQueue.java:102)
>         at org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:226)
>         at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:242)
>         at org.apache.hadoop.mapred.Merger.writeFile(Merger.java:83)
>         at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$LocalFSMerger.run(ReduceTask.java:2035)
> Caused by: java.io.EOFException
>         at java.io.DataInputStream.readFully(DataInputStream.java:180)
>         at org.apache.hadoop.io.UTF8.readFields(UTF8.java:106)
>         at com.collarity.io.IOUtil.readUTF8(IOUtil.java:213)
>         at com.collarity.url.IncrementalURLIndexKey.readFields(IncrementalURLIndexKey.java:40)
>         at org.apache.hadoop.io.Writable

Re: To Compute or Not to Compute on Prod

2008-10-31 Thread Jerome Boulon
Hi,
We have deployed a new monitoring system, Chukwa
(http://wiki.apache.org/hadoop/Chukwa), that does exactly that.
This system also provides an easy way to post-process your log files and
extract useful information using M/R.

/Jerome.


On 10/31/08 1:46 PM, "Norbert Burger" <[EMAIL PROTECTED]> wrote:

> What are you using to "stream logs into the HDFS"?
> 
> If the command-line tools (i.e., "hadoop dfs -put") work for you, then all you
> need is a Hadoop install.  Your production node doesn't need to be a
> datanode.
> 
> On Fri, Oct 31, 2008 at 2:35 PM, shahab mehmandoust <[EMAIL PROTECTED]>wrote:
> 
>> I want to stream data from logs into the HDFS in production but I do NOT
>> want my production machine to be apart of the computation cluster.  The
>> reason I want to do it in this way is to take advantage of HDFS without
>> putting computation load on my production machine.  Is this possible*?*
>> Furthermore, is this unnecessary because the computation would not put a
>> significant load on my production box (obviously depends on the map/reduce
>> implementation but I'm asking in general)*?*
>> 
>> I should note that our prod machine hosts our core web application and
>> database (saving up for another box :-).
>> 
>> Thanks,
>> Shahab
>> 



Re: To Compute or Not to Compute on Prod

2008-10-31 Thread Norbert Burger
What are you using to "stream logs into the HDFS"?

If the command-line tools (i.e., "hadoop dfs -put") work for you, then all you
need is a Hadoop install.  Your production node doesn't need to be a
datanode.

On Fri, Oct 31, 2008 at 2:35 PM, shahab mehmandoust <[EMAIL PROTECTED]>wrote:

> I want to stream data from logs into the HDFS in production but I do NOT
> want my production machine to be apart of the computation cluster.  The
> reason I want to do it in this way is to take advantage of HDFS without
> putting computation load on my production machine.  Is this possible*?*
> Furthermore, is this unnecessary because the computation would not put a
> significant load on my production box (obviously depends on the map/reduce
> implementation but I'm asking in general)*?*
>
> I should note that our prod machine hosts our core web application and
> database (saving up for another box :-).
>
> Thanks,
> Shahab
>


Redirecting Libhdfs output

2008-10-31 Thread Brian Bockelman

Hey all,

libhdfs prints out useful information to stderr in the function  
errnoFromException; unfortunately, in the C application framework I  
use, the stderr is directed to /dev/null, making debugging miserably  
hard.


Does anyone have any suggestions to make the errnoFromException  
function write out the error to a different stream?


Brian


Re: To Compute or Not to Compute on Prod

2008-10-31 Thread shahab mehmandoust
Definitely speaking Java.  Do you think I'm being paranoid about the
possible load?

Shahab

On Fri, Oct 31, 2008 at 11:52 AM, Edward Capriolo <[EMAIL PROTECTED]>wrote:

> Shahab,
>
> This can be done.
> If your client speaks Java, you can connect to Hadoop and write as a stream.
>
> If your client does not have Java, the Thrift API will generate stubs
> in a variety of languages
>
> Thrift API: http://wiki.apache.org/hadoop/HDFS-APIs
>
> Shameless plug -- If you just want to stream data I created a simple
> socket server-
> http://www.jointhegrid.com/jtgweb/lhadoopserver/index.jsp
>
> So you do not have to be part of the cluster to write to it.
>


Re: To Compute or Not to Compute on Prod

2008-10-31 Thread Edward Capriolo
Shahab,

This can be done.
If your client speaks Java, you can connect to Hadoop and write as a stream.

If your client does not have Java, the Thrift API will generate stubs
in a variety of languages

Thrift API: http://wiki.apache.org/hadoop/HDFS-APIs

Shameless plug -- If you just want to stream data I created a simple
socket server-
http://www.jointhegrid.com/jtgweb/lhadoopserver/index.jsp

So you do not have to be part of the cluster to write to it.
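
For illustration, a hedged sketch of the "connect and write as a stream" approach from a box that is not part of the cluster (hostname, port, and paths are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LogShipper {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the client at the remote namenode; this machine runs no Hadoop daemons.
        conf.set("fs.default.name", "hdfs://namenode.example.com:8020");
        FileSystem fs = FileSystem.get(conf);

        FSDataOutputStream out = fs.create(new Path("/logs/webapp/access.2008-10-31.log"));
        out.writeBytes("one log line\n");  // in practice, copy bytes from the local log stream
        out.close();
        fs.close();
    }
}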


To Compute or Not to Compute on Prod

2008-10-31 Thread shahab mehmandoust
I want to stream data from logs into the HDFS in production but I do NOT
want my production machine to be apart of the computation cluster.  The
reason I want to do it in this way is to take advantage of HDFS without
putting computation load on my production machine.  Is this possible*?*
Furthermore, is this unnecessary because the computation would not put a
significant load on my production box (obviously depends on the map/reduce
implementation but I'm asking in general)*?*

I should note that our prod machine hosts our core web application and
database (saving up for another box :-).

Thanks,
Shahab


Re: SecondaryNameNode on separate machine

2008-10-31 Thread Konstantin Shvachko

True, dfs.http.address is the NN Web UI address.
This is where the NN http server runs. Besides the Web UI, there is also
a servlet running on that server which is used to transfer the image
and edits from the NN to the secondary using http get.
So the SNN uses both addresses, fs.default.name and dfs.http.address.

When SNN finishes the checkpoint the primary needs to transfer the
resulting image back. This is done via the http server running on SNN.

Answering Tomislav's question:
The difference between fs.default.name and dfs.http.address is that
fs.default.name is the name-node's RPC address, which clients and
data-nodes connect to, while dfs.http.address is the NN's http server
address, which our browsers connect to, but it is also used for
transferring the image and edits files.
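
For illustration, a hedged sketch of how the two addresses typically appear in hadoop-site.xml (the hostname is a placeholder; 8020 and 50070 are the usual defaults):

<property>
  <name>fs.default.name</name>
  <value>hdfs://namenode.example.com:8020</value>
</property>

<property>
  <name>dfs.http.address</name>
  <value>namenode.example.com:50070</value>
</property>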

--Konstantin

Otis Gospodnetic wrote:

Konstantin & Co, please correct me if I'm wrong, but looking at 
hadoop-default.xml makes me think that dfs.http.address is only the URL for the NN 
*Web UI*.  In other words, this is where people go to look at the NN.

The secondary NN must then be using only the Primary NN URL specified in fs.default.name. 
 This URL looks like hdfs://name-node-hostname-here/.  Something in Hadoop then knows the 
exact port for the Primary NN based on the URI schema (e.g. "hdfs://") in this 
URL.

Is this correct?


Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 

From: Tomislav Poljak <[EMAIL PROTECTED]>
To: core-user@hadoop.apache.org
Sent: Thursday, October 30, 2008 1:52:18 PM
Subject: Re: SecondaryNameNode on separate machine

Hi,
can you, please, explain the difference between fs.default.name and
dfs.http.address (like how and when is SecondaryNameNode using
fs.default.name and how/when dfs.http.address). I have set them both to
same (namenode's) hostname:port. Is this correct (or dfs.http.address
needs some other port)? 


Thanks,

Tomislav

On Wed, 2008-10-29 at 16:10 -0700, Konstantin Shvachko wrote:

SecondaryNameNode uses the http protocol to transfer the image and the edits
from the primary name-node and vice versa.
So the secondary does not access local files on the primary directly.
The primary NN should know the secondary's http address.
And the secondary NN needs to know both fs.default.name and dfs.http.address
of the primary.

In general we usually create one configuration file hadoop-site.xml
and copy it to all other machines. So you don't need to set up different
values for all servers.

Regards,
--Konstantin

Tomislav Poljak wrote:

Hi,
I'm not clear on how the SecondaryNameNode communicates with the NameNode
(if deployed on a separate machine). Does the SecondaryNameNode use a direct
connection (over some port and protocol), or is it enough for the
SecondaryNameNode to have access to the data which the NameNode writes locally
on disk?

Tomislav

On Wed, 2008-10-29 at 09:08 -0400, Jean-Daniel Cryans wrote:

I think a lot of the confusion comes from this thread :
http://www.nabble.com/NameNode-failover-procedure-td11711842.html

Particularly because the wiki was updated with wrong information, not
maliciously I'm sure. This information is now gone for good.

Otis, your solution is pretty much like the one given by Dhruba Borthakur
and augmented by Konstantin Shvachko later in the thread but I never did it
myself.

One thing should be clear though, the NN is and will remain a SPOF (just
like HBase's Master) as long as a distributed manager service (like
Zookeeper) is not plugged into Hadoop to help with failover.

J-D

On Wed, Oct 29, 2008 at 2:12 AM, Otis Gospodnetic <
[EMAIL PROTECTED]> wrote:


Hi,
So what is the "recipe" for avoiding NN SPOF using only what comes with
Hadoop?

From what I can tell, I think one has to do the following two things:

1) configure primary NN to save namespace and xa logs to multiple dirs, one
of which is actually on a remotely mounted disk, so that the data actually
lives on a separate disk on a separate box.  This saves namespace and xa
logs on multiple boxes in case of primary NN hardware failure.

2) configure secondary NN to periodically merge fsimage+edits and create
the fsimage checkpoint.  This really is a second NN process running on
another box.  It sounds like this secondary NN has to somehow have access to
fsimage & edits files from the primary NN server.  Should

http://hadoop.apache.org/core/docs/r0.18.1/hdfs_user_guide.html#Secondary+NameNode
does not describe the best practice around that - the recommended way to

give secondary NN access to primary NN's fsimage and edits files.  Should
one mount a disk from the primary NN box to the secondary NN box to get
access to those files?  Or is there a simpler way?
In any case, this checkpoint is just a merge of fsimage+edits files and
again is there in case the box with the primary NN dies.  That's what's
described on

http://hadoop.apache.org/core/docs/r0.18.1/hdfs_user_guide.html#Secondary+NameNode
more or less.
Is this sufficient, or are there other things one has to do to eliminate N

Re: ApacheCon US 2008

2008-10-31 Thread Grant Ingersoll
I will also be presenting on Mahout (machine learning) on Wednesday at  
3:30 (I think).  It will have some Hadoop flavor in it.


-Grant

On Oct 31, 2008, at 1:46 PM, Owen O'Malley wrote:

Just a reminder that ApacheCon US is next week in New Orleans. There  
will be a lot of Hadoop developers and talks. (I'm CC'ing core-user  
because it has the widest coverage. Please join the low traffic  
[EMAIL PROTECTED] list for cross sub-project announcements.)


   * Hadoop Camp with lots of talks about Hadoop
 o Introduction to Hadoop by Owen O'Malley
 o A Tour of Apache Hadoop by Tom White
 o Programming Hadoop Map/Reduce by Arun Murthy
 o Hadoop at Yahoo! by Eric Baldeschwieler
 o Hadoop Futures Panel
  o Using Hadoop for an Intranet Search Engine by Shivakumar
Vaithyanthan

 o Cloud Computing Testbed by Thomas Sandholm
 o Improving Virtualization and Performance Tracing of  
Hadoop with Open Solaris

 by George Porter
 o An Insight into Hadoop Usage at Facebook by Dhruba  
Borthakur

 o Pig by Alan Gates
 o Zookeeper, Coordinating the Distributed Application by  
Ben Reed

 o Querying JSON Data on Hadoop using Jaql by Kevin Beyer
 o HBase by Michael Stack
   * Hadoop training on Practical Problem Solving in Hadoop
   * Cloudera is providing a test Hadoop cluster and a Hadoop  
hacking contest.


There is also a new Hadoop tutorial available.

-- Owen




Re: ApacheCon US 2008

2008-10-31 Thread Lukáš Vlček
Hi,
Hope somebody will record at least a fraction of these talks and put them on
the web as soon as possible.

Lukas

On Fri, Oct 31, 2008 at 6:46 PM, Owen O'Malley <[EMAIL PROTECTED]> wrote:

> Just a reminder that ApacheCon US is next week in New Orleans. There will
> be a lot of Hadoop developers and talks. (I'm CC'ing core-user because it
> has the widest coverage. Please join the low traffic [EMAIL PROTECTED] list
> for cross sub-project announcements.)
>
>* Hadoop Camp with lots of talks about Hadoop
>  o Introduction to Hadoop by Owen O'Malley
>  o A Tour of Apache Hadoop by Tom White
>  o Programming Hadoop Map/Reduce by Arun Murthy
>  o Hadoop at Yahoo! by Eric Baldeschwieler
>  o Hadoop Futures Panel
>  o Using Hadoop for an Intranet Search Engine by Shivakumar
> Vaithyanthan
>  o Cloud Computing Testbed by Thomas Sandholm
>  o Improving Virtualization and Performance Tracing of Hadoop with
> Open Solaris
>  by George Porter
>  o An Insight into Hadoop Usage at Facebook by Dhruba Borthakur
>  o Pig by Alan Gates
>  o Zookeeper, Coordinating the Distributed Application by Ben Reed
>  o Querying JSON Data on Hadoop using Jaql by Kevin Beyer
>  o HBase by Michael Stack
>* Hadoop training on Practical Problem Solving in Hadoop
>* Cloudera is providing a test Hadoop cluster and a Hadoop hacking
> contest.
>
> There is also a new Hadoop tutorial available.
>
> -- Owen




-- 
http://blog.lukas-vlcek.com/


RE: ApacheCon US 2008

2008-10-31 Thread Ashish Thusoo
Owen,

Just wanted to mention that there is a talk on Hive as well on Friday 9:30AM...

Ashish

-Original Message-
From: Owen O'Malley [mailto:[EMAIL PROTECTED]
Sent: Friday, October 31, 2008 10:47 AM
To: [EMAIL PROTECTED]
Cc: core-user@hadoop.apache.org
Subject: ApacheCon US 2008

Just a reminder that ApacheCon US is next week in New Orleans. There will be a 
lot of Hadoop developers and talks. (I'm CC'ing core-user because it has the 
widest coverage. Please join the low traffic [EMAIL PROTECTED] list for cross 
sub-project announcements.)

 * Hadoop Camp with lots of talks about Hadoop
   o Introduction to Hadoop by Owen O'Malley
   o A Tour of Apache Hadoop by Tom White
   o Programming Hadoop Map/Reduce by Arun Murthy
   o Hadoop at Yahoo! by Eric Baldeschwieler
   o Hadoop Futures Panel
   o Using Hadoop for an Intranet Search Engine by Shivakumar
Vaithyanthan
   o Cloud Computing Testbed by Thomas Sandholm
   o Improving Virtualization and Performance Tracing of Hadoop with 
Open Solaris
   by George Porter
   o An Insight into Hadoop Usage at Facebook by Dhruba Borthakur
   o Pig by Alan Gates
   o Zookeeper, Coordinating the Distributed Application by Ben Reed
   o Querying JSON Data on Hadoop using Jaql by Kevin Beyer
   o HBase by Michael Stack
 * Hadoop training on Practical Problem Solving in Hadoop
 * Cloudera is providing a test Hadoop cluster and a Hadoop hacking contest.

There is also a new Hadoop tutorial available.

-- Owen


ApacheCon US 2008

2008-10-31 Thread Owen O'Malley
Just a reminder that ApacheCon US is next week in New Orleans. There  
will be a lot of Hadoop developers and talks. (I'm CC'ing core-user  
because it has the widest coverage. Please join the low traffic  
[EMAIL PROTECTED] list for cross sub-project announcements.)


* Hadoop Camp with lots of talks about Hadoop
  o Introduction to Hadoop by Owen O'Malley
  o A Tour of Apache Hadoop by Tom White
  o Programming Hadoop Map/Reduce by Arun Murthy
  o Hadoop at Yahoo! by Eric Baldeschwieler
  o Hadoop Futures Panel
  o Using Hadoop for an Intranet Search Engine by Shivakumar
Vaithyanthan

  o Cloud Computing Testbed by Thomas Sandholm
  o Improving Virtualization and Performance Tracing of  
Hadoop with Open Solaris

  by George Porter
  o An Insight into Hadoop Usage at Facebook by Dhruba  
Borthakur

  o Pig by Alan Gates
  o Zookeeper, Coordinating the Distributed Application by  
Ben Reed

  o Querying JSON Data on Hadoop using Jaql by Kevin Beyer
  o HBase by Michael Stack
* Hadoop training on Practical Problem Solving in Hadoop
* Cloudera is providing a test Hadoop cluster and a Hadoop  
hacking contest.


There is also a new Hadoop tutorial available.

-- Owen

Re: SecondaryNameNode on separate machine

2008-10-31 Thread Doug Cutting

Otis Gospodnetic wrote:

Konstantin & Co, please correct me if I'm wrong, but looking at 
hadoop-default.xml makes me think that dfs.http.address is only the URL for the NN 
*Web UI*.  In other words, this is where people go to look at the NN.

The secondary NN must then be using only the Primary NN URL specified in fs.default.name. 
 This URL looks like hdfs://name-node-hostname-here/.  Something in Hadoop then knows the 
exact port for the Primary NN based on the URI schema (e.g. "hdfs://") in this 
URL.

Is this correct?


Yes.  The default port for an HDFS URI is 8020 (NameNode.DEFAULT_PORT). 
 The value of fs.default.name is used by HDFS.  When starting the 
namenode or datanodes, this must be an HDFS URI.  If this names an 
explicit port, then that will be used, otherwise the default, 8020 will 
be used.


The default port for HTTP URIs is 80, but the namenode typically runs 
its web UI on 50070 (the default for dfs.http.address).


Doug


Status FUSE-Support of HDFS

2008-10-31 Thread Robert Krüger

Hi,

could anyone tell me what the current Status of FUSE support for HDFS
is? Is this something that can be expected to be usable in a few
weeks/months in a production environment? We have been really
happy/successful with HDFS in our production system. However, some
software we use in our application simply requires an OS-Level file
system which currently requires us to do a lot of copying between HDFS
and a regular file system for processes which require that software and
FUSE support would really eliminate that one disadvantage we have with
HDFS. We wouldn't even require the performance of that to be outstanding
because just by eliminating the copy step, we would greatly increase
the throughput of those processes.

Thanks for sharing any thoughts on this.

Regards,

Robert


Re: [core-user] Help deflating output files

2008-10-31 Thread Martin Davidsson

You can override this property by passing in -jobconf
mapred.output.compress=false to the hadoop binary, e.g.

hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-0.18.0-streaming.jar 
-input "/user/root/input" -mapper 'cat' -reducer 'wc -l' -output
"/user/root/output"   -jobconf mapred.job.name="Experiment" -jobconf
mapred.output.compress=false
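
For output that was already written with compression on, a minimal sketch (assuming the files carry the .deflate extension, as in the question below) of decoding them through Hadoop's codec factory:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;

public class DeflateCat {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path p = new Path(args[0]);  // e.g. /user/root/output/part-0.deflate
        FileSystem fs = p.getFileSystem(conf);
        // The factory picks a codec from the file extension (.deflate -> DefaultCodec).
        CompressionCodec codec = new CompressionCodecFactory(conf).getCodec(p);
        BufferedReader in = new BufferedReader(
                new InputStreamReader(codec.createInputStream(fs.open(p))));
        String line;
        while ((line = in.readLine()) != null) {
            System.out.println(line);
        }
        in.close();
    }
}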

-- Martin


Jim R. Wilson wrote:
> 
> Hi all,
> 
> I'm using hadoop-streaming to execute Python jobs in an EC2 cluster.
> The output directory in HDFS has part-0.deflate files - how can I
> deflate them back into regular text?
> 
> In my hadoop-site.xml, I unfortunately have:
> 
>   <property>
>     <name>mapred.output.compress</name>
>     <value>true</value>
>   </property>
> 
>   <property>
>     <name>mapred.output.compression.type</name>
>     <value>BLOCK</value>
>   </property>
> 
> Of course, I could re-build my AMI's without this option, but is there
> some way I can read my deflate files without going through that
> hassle?  I'm hoping there's a command-line program to read these files
> since none of my code is Java.
> 
> Thanks in advance for any help. :)
> 
> -- Jim R. Wilson (jimbojw)
> 
> 

-- 
View this message in context: 
http://www.nabble.com/-core-user--Help-deflating-output-files-tp17658751p20268639.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.



RE: Browse HDFS file in URL

2008-10-31 Thread Malcolm Matalka
The Hadoop file API allows you to open a file based on a URL:

Path file = new Path("hdfs://hadoop00:54313/user/hadoop/conflated.20081016/part-9");
JobConf job = new JobConf(new Configuration(), ReadFileHadoop.class);
job.setJobName("test");

FileSystem fs = file.getFileSystem(job);
FSDataInputStream fileIn = fs.open(file);



-Original Message-
From: Neal Lee (RDLV) [mailto:[EMAIL PROTECTED] 
Sent: Friday, October 31, 2008 4:00
To: core-user@hadoop.apache.org
Cc: Neal Lee (RDLV)
Subject: Browse HDFS file in URL

Hi All,

 

I'm wondering whether I can browse an HDFS file via a URL (e.g.
http://host/test.jpeg) so that I can show this file on my webapp
directly.

 

Thanks,

Neal



Re: hostname in logs

2008-10-31 Thread Steve Loughran

Alex Loddengaard wrote:

Thanks, Steve.  I'll look into this patch.  As a temporary solution I use a
log4j variable to manually set a "hostname" private field in the Appender.
This solution is rather annoying, but it'll work for now.
Thanks again.


What about having the task tracker pass down some JVM properties of
interest, like hostname/processname? I've done things in the past
(testing) that stored stuff by hostname, which works with 1 process per
host. Once you start running lots of processes, you want more detail.


Browse HDFS file in URL

2008-10-31 Thread Neal Lee (RDLV)
Hi All,

 

I'm wondering whether I can browse an HDFS file via a URL (e.g.
http://host/test.jpeg) so that I can show this file on my webapp
directly.

 

Thanks,

Neal



Re: hostname in logs

2008-10-31 Thread Alex Loddengaard
Thanks, Steve.  I'll look into this patch.  As a temporary solution I use a
log4j variable to manually set a "hostname" private field in the Appender.
This solution is rather annoying, but it'll work for now.
Thanks again.

Alex

On Fri, Oct 31, 2008 at 3:58 AM, Steve Loughran <[EMAIL PROTECTED]> wrote:

> Alex Loddengaard wrote:
>
>> I'd like my log messages to display the hostname of the node that they
>> were
>> outputted on.  Sure, this information can be grabbed from the log
>> filename,
>> but I would like each log message to also have the hostname.  I don't
>> think
>> log4j provides support to include the hostname in a log, so I've tried
>> programmatically inserting the hostname with the following three
>> approaches.
>>  These are all within a log4j Appender.
>> -Using exec to run "hostname" from the command line.  This returns null.
>> -Using InetAddress.getLocalHost().getHostName().  This returns null.
>> -Using InetAddress.getLocalHost().getHostAddress().  This returns null.
>>
>
> You sure your real/virtual hosts networking is set up right? I've seen
> problems in hadoop there
> https://issues.apache.org/jira/browse/HADOOP-3426
> Have a look/apply that patch and see what happens
>
>  Each of these approaches works in an isolated test, but they all return
>> null
>> when in Hadoop's context.  I believe I'd be able to get the hostname with
>> a
>> Java call to a Hadoop configuration method if I were in a Mapper or
>> Reducer,
>> but because I'm in a log4j Appender, I don't have access to any of
>> Hadoop's
>> configuration APIs.  How can I get the hostname?
>>
>
> Log4J appenders should have access to the hostname info, But you are going
> to risk time and trouble if you do that in every operation; every new
> process runs a risk of a 30s delay even if you cache it from then on. It is
> usually a lot faster/easier just to push out  the IP address, as that
> doesn't trigger a reverse DNS lookup or anything.
>


Re: hostname in logs

2008-10-31 Thread Steve Loughran

Alex Loddengaard wrote:

I'd like my log messages to display the hostname of the node that they were
outputted on.  Sure, this information can be grabbed from the log filename,
but I would like each log message to also have the hostname.  I don't think
log4j provides support to include the hostname in a log, so I've tried
programmatically inserting the hostname with the following three approaches.
 These are all within a log4j Appender.
-Using exec to run "hostname" from the command line.  This returns null.
-Using InetAddress.getLocalHost().getHostName().  This returns null.
-Using InetAddress.getLocalHost().getHostAddress().  This returns null.


You sure your real/virtual hosts networking is set up right? I've seen 
problems in hadoop there

https://issues.apache.org/jira/browse/HADOOP-3426
Have a look/apply that patch and see what happens


Each of these approaches works in an isolated test, but they all return null
when in Hadoop's context.  I believe I'd be able to get the hostname with a
Java call to a Hadoop configuration method if I were in a Mapper or Reducer,
but because I'm in a log4j Appender, I don't have access to any of Hadoop's
configuration APIs.  How can I get the hostname?


Log4J appenders should have access to the hostname info, But you are 
going to risk time and trouble if you do that in every operation; every 
new process runs a risk of a 30s delay even if you cache it from then 
on. It is usually a lot faster/easier just to push out  the IP address, 
as that doesn't trigger a reverse DNS lookup or anything.
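
A hedged sketch of that approach (resolve the address once per JVM, cache it, and reuse it from the appender; the class is a made-up example):

import java.net.InetAddress;
import java.net.UnknownHostException;
import org.apache.log4j.AppenderSkeleton;
import org.apache.log4j.spi.LoggingEvent;

public class HostTaggingAppender extends AppenderSkeleton {
    // Resolved once per JVM; falls back to a placeholder if resolution fails.
    private static final String HOST;
    static {
        String h;
        try {
            h = InetAddress.getLocalHost().getHostAddress();
        } catch (UnknownHostException e) {
            h = "unknown-host";
        }
        HOST = h;
    }

    protected void append(LoggingEvent event) {
        // Prefix every rendered message with the cached address.
        System.out.println(HOST + " " + this.layout.format(event));
    }

    public boolean requiresLayout() {
        return true;
    }

    public void close() {
    }
}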


Re: TaskTrackers disengaging from JobTracker

2008-10-31 Thread Aaron Kimball
To complete the picture: not only was our network swamped, I realized
tonight that the NameNode/JobTracker was running on a 99% full disk (it hit
100% full about thirty minutes ago). That poor JobTracker was fighting
against a lot of odds. As soon as we upgrade to a bigger disk and switch it
back on, I'll apply the supplied patch to the cluster.

Thank you for looking into this!
- Aaron

On Thu, Oct 30, 2008 at 3:42 PM, Raghu Angadi <[EMAIL PROTECTED]> wrote:

> Raghu Angadi wrote:
>
>> Devaraj fwded the stacks that Aaron sent. As he suspected there is a
>> deadlock in RPC server. I will file a blocker for 0.18 and above. This
>> deadlock is more likely on a busy network.
>>
>>
> Aaron,
>
> Could you try the patch attached to
> https://issues.apache.org/jira/browse/HADOOP-4552 ?
>
> Thanks,
> Raghu.
>