Skip setting output path for a sequential MR job

2009-03-30 Thread some speed
Hello everyone, Is it necessary to redirect the output of reduce to a file? When I am trying to run the same M-R job more than once, it throws an error that the output file already exists. I don't want to use command line args, so I hard-coded the file name into the program. So, is there a way I c…
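
A minimal sketch of one common workaround, using the 0.19-era mapred API (class and path names are illustrative assumptions): check for and delete the previous run's output directory before resubmitting, so the hard-coded name can be reused.

    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.*;

    public class RerunnableJob {
      public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(RerunnableJob.class);
        Path out = new Path("/user/me/job-output");  // hard-coded output, as in the question

        // MapReduce refuses to overwrite an existing output directory,
        // so remove the previous run's output before resubmitting.
        FileSystem fs = FileSystem.get(conf);
        if (fs.exists(out)) {
          fs.delete(out, true);                      // true = delete recursively
        }

        FileInputFormat.setInputPaths(conf, new Path("/user/me/input"));
        FileOutputFormat.setOutputPath(conf, out);
        JobClient.runJob(conf);
      }
    }

Deleting up front trades safety for convenience: the existence check is what normally protects a finished job's output from being clobbered.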

Problem: Some blocks remain under replicated

2009-03-30 Thread ilayaraja
Hello! I am trying to increase the replication factor of a directory in our hadoop dfs from 1 to 2. I observe that some of the blocks (12 out of 400) always remain under-replicated, throwing the following message when I do an 'fsck': Under replicated blk_908440823603162800…
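
One thing worth trying is simply re-applying the target factor, the equivalent of 'hadoop fs -setrep -R 2 <dir>'. A minimal one-level sketch with the FileSystem API (the directory path is an illustrative assumption):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class SetRep {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // Re-request replication factor 2 for every file in the directory;
        // the NameNode schedules the extra copies asynchronously.
        FileStatus[] files = fs.listStatus(new Path("/user/me/data"));
        if (files != null) {
          for (FileStatus stat : files) {
            if (!stat.isDir()) {
              fs.setReplication(stat.getPath(), (short) 2);
            }
          }
        }
      }
    }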

Swap hard drives between datanodes

2009-03-30 Thread Mike Andrews
I tried swapping two hot-swap SATA drives between two nodes in a cluster, but it didn't work: after restart, one of the datanodes shut down since the namenode said it reported a block belonging to another node, which I guess the namenode thinks is a fatal error. Is this caused by the hadoop/datanode/curren…
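
If that guess is right: in this era each dfs.data.dir keeps a current/VERSION file whose storageID was stamped by the node that first formatted the disk, so a swapped drive announces its blocks under the other node's storage identity. An illustrative VERSION file (every value below is made up):

    #Mon Mar 30 12:00:00 PDT 2009
    namespaceID=123456789
    storageID=DS-1234567890-10.0.0.1-50010-1238445678901
    cTime=0
    storageType=DATA_NODE
    layoutVersion=-18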

Re: Need more detail on Hadoop architecture

2009-03-30 Thread Lukáš Vlček
BTW: there are at least two books, Hadoop: The Definitive Guide and Hadoop in Action, both of which I can recommend. Anyway, simple web searching on the topic should give you a lot of information. Lukas 2009/3/30 Lukáš Vlček > So…

Re: Problem: Some blocks remain under replicated

2009-03-30 Thread Hairong Kuang
Which version of Hadoop are you running? Your cluster might have hit HADOOP-5465. Hairong On 3/29/09 10:24 PM, "ilayaraja" wrote: > Hello! > > I am trying to increase the replication factor of a directory in our hadoop > dfs from 1 to 2. > I observe that some of the blocks (12 out of 400) al…

Re: Need more detail on Hadoop architecture

2009-03-30 Thread Lukáš Vlček
Sorry ... :-) I was too quick and I didn't notice that you already pointed out this link. On Mon, Mar 30, 2009 at 11:57 PM, Lukáš Vlček wrote: > Hi, > This tutorial can be a good start: > http://hadoop.apache.org/core/docs/current/mapred_tutorial.html > > Regards, > Lukas > > > On Mon, Mar 30, 20…

Re: Need more detail on Hadoop architecture

2009-03-30 Thread Lukáš Vlček
Hi, This tutorial can be a good start: http://hadoop.apache.org/core/docs/current/mapred_tutorial.html Regards, Lukas On Mon, Mar 30, 2009 at 11:49 PM, I LOVE Hadoop :) <kusanagiyang.had...@gmail.com> wrote: > Hello, > > I want to know more details on using Hadoop framework to sort input data.…

Need more detail on Hadoop architecture

2009-03-30 Thread I LOVE Hadoop :)
Hello, I want to know more details on using Hadoop framework to sort input data. For example, sorting can be as simple as using identity map and reduce classes and just allowing the framework to do its basic work. From article http://hadoop.apache.org/core/docs/r0.19.1/mapred_tutorial.html#Overview,…
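
A minimal sketch of such a sort against the 0.19 mapred API (class names and paths are illustrative assumptions). The identity map and reduce do no work of their own; the framework's shuffle sorts the keys between them:

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.*;
    import org.apache.hadoop.mapred.lib.IdentityMapper;
    import org.apache.hadoop.mapred.lib.IdentityReducer;

    public class SimpleSort {
      public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(SimpleSort.class);
        conf.setJobName("simple-sort");

        // Key = text up to the first tab of each line, value = the rest.
        conf.setInputFormat(KeyValueTextInputFormat.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);

        // Identity map and reduce: records pass through unchanged,
        // so the only effect is the framework's sort by key.
        conf.setMapperClass(IdentityMapper.class);
        conf.setReducerClass(IdentityReducer.class);

        FileInputFormat.setInputPaths(conf, new Path("/user/me/unsorted"));
        FileOutputFormat.setOutputPath(conf, new Path("/user/me/sorted"));
        JobClient.runJob(conf);
      }
    }

With a single reducer this yields one globally sorted file; with several, each output file is sorted only within its own partition.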

Re: hadoop-a small doubt

2009-03-30 Thread Aaron Kimball
"Becoming a part of the cluster" implies that you're running the daemons on the node. You need the Hadoop JARs on the client machine so that you can use FileSystem.open(), etc. And the conf/hadoop-site.xml file should indicate the NameNode's address in its fs.default.name parameter -- that's how th

Re: a doubt regarding an appropriate file system

2009-03-30 Thread Aaron Kimball
The short version is that the in-memory structures used by the NameNode are "heavy" on a per-file basis, and light on a per-block basis. So petabytes of files that are only a few hundred KB will require the NameNode to have a huge amount of memory to hold the filesystem data structures. More than y…
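
A rough back-of-the-envelope makes the point. The figure of ~150 bytes per in-memory object (file, block, or directory entry) is an illustrative rule of thumb, not a measured number:

    1 PB in 200 KB files  ->  ~5 billion files, one block each
                              ~10 billion objects x ~150 bytes  =  ~1.5 TB of NameNode heap
    1 PB in 128 MB files  ->  ~8 million files
                              ~16 million objects x ~150 bytes  =  ~2.4 GB of NameNode heap

Same amount of data, three orders of magnitude less metadata, purely from file size.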

Re: Socket closed Exception

2009-03-30 Thread lohit
Thanks Raghu. Is the log level at DEBUG? I do not see any socket close exception at NameNode at WARN/INFO level. Lohit - Original Message From: Raghu Angadi To: core-user@hadoop.apache.org Sent: Monday, March 30, 2009 12:08:19 PM Subject: Re: Socket closed Exception If it is NameNo…

Re: Socket closed Exception

2009-03-30 Thread Raghu Angadi
If it is the NameNode, then there is probably a log about closing the socket around that time. Raghu. lohit wrote: Recently we are seeing a lot of Socket closed exceptions in our cluster. Many tasks' open/create/getFileInfo calls get back 'SocketException' with message 'Socket closed'. We seem to…

RE: Socket closed Exception

2009-03-30 Thread Koji Noguchi
Lohit, You're right. We saw "java.net.SocketTimeoutException: timed out waiting for rpc response" and not a Socket closed exception. If you're getting a "closed exception", then I don't remember seeing that problem on our clusters. Our users often report "Socket closed exception" as a problem, but…

Re: Socket closed Exception

2009-03-30 Thread lohit
Thanks Koji. If I look at the code, the NameNode (RPC Server) seems to tear down idle connections. Did you see a 'Socket closed' exception instead of 'timed out waiting for socket'? We seem to hit the 'Socket closed' exception where clients do not time out, but get back a socket closed exception when th…
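
For what it's worth, the idle-connection behaviour on both sides is tunable. A sketch reading the relevant IPC knobs (the names and the defaults shown are assumptions based on this era's hadoop-default.xml):

    import org.apache.hadoop.conf.Configuration;

    public class IpcIdleSettings {
      public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Client side: ms a connection may sit idle before the client closes it.
        System.out.println(conf.getInt("ipc.client.connection.maxidletime", 10000));
        // Server side: connection count above which the server scans for idle clients.
        System.out.println(conf.getInt("ipc.client.idlethreshold", 4000));
        // Server side: max idle clients disconnected in one sweep.
        System.out.println(conf.getInt("ipc.client.kill.max", 10));
      }
    }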

Re: Typical hardware configurations

2009-03-30 Thread Scott Carey
On 3/30/09 4:41 AM, "Steve Loughran" wrote: > Ryan Rawson wrote: > >> You should also be getting 64-bit systems and running a 64-bit distro on it >> and a JVM that has -d64 available. > > For the namenode yes. For the others, you will take a fairly big memory > hit (1.5X object size) due to the…

Re: hadoop-a small doubt

2009-03-30 Thread Brian Bockelman
On Mar 30, 2009, at 3:59 AM, W wrote: I already tried mountable HDFS, with both the WebDAV and FUSE approaches; it seems neither is production ready. Depends on what you define to be "production ready"; for a business serving HDFS to external customers directly, no. But then again, it's…

Re: hadoop-a small doubt

2009-03-30 Thread Brian Bockelman
On Mar 30, 2009, at 3:53 AM, deepya wrote: Do you mean to say the node from which we want to access hdfs should also have hadoop installed on it? If that is the case, then doesn't that node also become a part of the cluster? Yes. You need the Hadoop client installed to access HDFS. You…

Re: Typical hardware configurations

2009-03-30 Thread Steve Loughran
Ryan Rawson wrote: You should also be getting 64-bit systems and running a 64-bit distro on it and a JVM that has -d64 available. For the namenode yes. For the others, you will take a fairly big memory hit (1.5X object size) due to the longer pointers. JRockit has special compressed pointers…

Re: Using HDFS to serve www requests

2009-03-30 Thread Steve Loughran
Edward Capriolo wrote: It is a little more natural to connect to HDFS from Apache Tomcat. This will allow you to skip the FUSE mounts and just use the HDFS API. I have modified this code to run inside Tomcat. http://wiki.apache.org/hadoop/HadoopDfsReadWriteExample I will not testify to how well…
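
The heart of that wiki example is only a few lines. A minimal read-side sketch (the path is an illustrative assumption); the same code runs in a servlet as long as the Hadoop jar and config are on the webapp classpath:

    import java.io.InputStream;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    public class DfsRead {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        InputStream in = fs.open(new Path("/user/me/page-data.txt"));
        try {
          // Stream the file's contents to stdout; 'false' leaves the stream
          // open so we can close it ourselves in the finally block.
          IOUtils.copyBytes(in, System.out, conf, false);
        } finally {
          in.close();
        }
      }
    }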

Re: virtualization with hadoop

2009-03-30 Thread Steve Loughran
Oliver Fischer wrote: Hello Vishal, I did the same some weeks ago. The most important fact is that it works. But it is horribly slow if you do not have enough RAM and multiple disks, since all I/O operations go to the same disk. They may go to separate disks underneath, but performance is bad as…

Re: JNI and calling Hadoop jar files

2009-03-30 Thread Steve Loughran
jason hadoop wrote: The exception reference to *org.apache.hadoop.hdfs.DistributedFileSystem* strongly implies that a hadoop-default.xml file, or at least a job.xml file, is present. Since hadoop-default.xml is bundled into the hadoop-0.X.Y-core.jar, the assumption is that the core jar is availa…
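
A quick diagnostic in the same spirit (not from the original thread) is to print what the client JVM actually loaded from its classpath:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class ConfCheck {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();  // reads hadoop-default.xml, then hadoop-site.xml
        System.out.println("fs.default.name = " + conf.get("fs.default.name"));
        FileSystem fs = FileSystem.get(conf);
        System.out.println("FileSystem impl = " + fs.getClass().getName());
      }
    }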

Re: hadoop-a small doubt

2009-03-30 Thread nitesh bhatia
Hi, You can ssh to them from any PC in your domain. --nitesh On Sun, Mar 29, 2009 at 10:59 AM, deepya wrote: > > Hi, > I am SreeDeepya, doing an MTech at IIIT. I am working on a project named cost > effective and scalable storage server. I configured a small hadoop cluster > with only two nodes, one n…

Re: hadoop-a small doubt

2009-03-30 Thread W
I already tried mountable HDFS, with both the WebDAV and FUSE approaches; it seems neither is production ready. CMIIW. Best Regards, Wildan --- OpenThink Labs www.tobethink.com Aligning IT and Education >> 021-99325243 Y! : hawking_123 LinkedIn : http://www.linkedin.com/in/wildanmaulana On Su…

Re: hadoop-a small doubt

2009-03-30 Thread deepya
Do you mean to say the node from which we want to access hdfs should also have hadoop installed on it? If that is the case, then doesn't that node also become a part of the cluster? Can you please be a bit more clear? Sagar Naik-3 wrote: > > Yes, you can. > Java Client: > Copy the conf dir (same as o…

Re: a doubt regarding an appropriate file system

2009-03-30 Thread deepya
Hi, Thanks. Can you please specify in detail what kind of problems I will face if I use Hadoop for this project? SreeDeepya TimRobertson100 wrote: > > I believe Hadoop is not best suited to many small files like yours, but > is really geared to handling very large files that get split int…