Re: Map-Reduce Slow Down

2009-04-17 Thread Mithila Nagendra
Hey Jason, the problem's fixed! :) My network admin had messed something up! Now it works! Thanks for your help! Mithila On Thu, Apr 16, 2009 at 11:58 PM, Mithila Nagendra wrote: > Thanks Jason! This helps a lot. I'm planning to talk to my network admin > tomorrow. I'm hoping he'll be able to fix

Re: Map-Reduce Slow Down

2009-04-16 Thread Mithila Nagendra
Thanks Jason! This helps a lot. I'm planning to talk to my network admin tomorrow. I'm hoping he'll be able to fix this problem. Mithila On Fri, Apr 17, 2009 at 9:00 AM, jason hadoop wrote: > Assuming you are on a Linux box, on both machines > verify that the servers are listening on the ports you

Re: Map-Reduce Slow Down

2009-04-16 Thread jason hadoop
Assuming you are on a Linux box, on both machines verify that the servers are listening on the ports you expect via

netstat -a -n -t -p

-a show sockets accepting connections
-n do not translate IP addresses to host names
-t only list TCP sockets
-p list the pid/process name

on the machine 192.168.
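The netstat check confirms that a server is listening locally; a complementary check is whether another machine can actually open a TCP connection to that port, which is what a firewall would block. A minimal Python sketch of such a probe follows. The helper name `can_connect` is hypothetical, and the host and port in the usage comment (node18, 54310) are taken from the error output quoted elsewhere in this thread.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only tests reachability and
    says nothing about whether the service behind the port is healthy.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example use from a slave node (host and port as quoted in this thread):
#   can_connect("node18", 54310)
```

If this returns False from a slave while netstat on the master shows the port listening, something between the two machines (for example a firewall rule) is a likely suspect.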

Re: Map-Reduce Slow Down

2009-04-16 Thread Mithila Nagendra
Thanks! I'll see what I can find out. On Fri, Apr 17, 2009 at 4:55 AM, jason hadoop wrote: > The firewall was run at system startup; I think there was a > /etc/sysconfig/iptables file present which triggered the firewall. > I don't currently have access to any CentOS 5 machines so I can't easily

Re: Map-Reduce Slow Down

2009-04-16 Thread jason hadoop
The firewall was run at system startup; I think there was a /etc/sysconfig/iptables file present which triggered the firewall. I don't currently have access to any CentOS 5 machines so I can't easily check. On Thu, Apr 16, 2009 at 6:54 PM, jason hadoop wrote: > The kickstart script was somethin

Re: Map-Reduce Slow Down

2009-04-16 Thread jason hadoop
The kickstart script was something that the operations staff was using to initialize new machines. I never actually saw the script; I just figured out that there was a firewall in place. On Thu, Apr 16, 2009 at 1:28 PM, Mithila Nagendra wrote: > Jason: the kickstart script - was it something you

Re: Map-Reduce Slow Down

2009-04-16 Thread Mithila Nagendra
Jason: the kickstart script - was it something you wrote or is it run when the system turns on? Mithila On Thu, Apr 16, 2009 at 1:06 AM, Mithila Nagendra wrote: > Thanks Jason! Will check that out. > Mithila > > > On Thu, Apr 16, 2009 at 5:23 AM, jason hadoop wrote: > >> Double check that there

Re: Map-Reduce Slow Down

2009-04-16 Thread Mithila Nagendra
Thanks Jason! Will check that out. Mithila On Thu, Apr 16, 2009 at 5:23 AM, jason hadoop wrote: > Double check that there is no firewall in place. > At one point a bunch of new machines were kickstarted and placed in a > cluster and they all failed with something similar. > It turned out the kick

Re: Map-Reduce Slow Down

2009-04-15 Thread jason hadoop
Double check that there is no firewall in place. At one point a bunch of new machines were kickstarted and placed in a cluster and they all failed with something similar. It turned out the kickstart script enabled the firewall with a rule that blocked ports in the 50k range. It took us a whi

Re: Map-Reduce Slow Down

2009-04-15 Thread Mithila Nagendra
Hi Aaron, I will look into that, thanks! I spoke to the admin who oversees the cluster. He said that the gateway comes into the picture only when one of the nodes communicates with a node outside of the cluster. But in my case the communication is carried out between the nodes which all belong to

Re: Map-Reduce Slow Down

2009-04-15 Thread Aaron Kimball
Hi, I wrote a blog post a while back about connecting nodes via a gateway. See http://www.cloudera.com/blog/2008/12/03/securing-a-hadoop-cluster-through-a-gateway/ This assumes that the client is outside the gateway and all datanodes/namenode are inside, but the same principles apply. You'll just

Re: Map-Reduce Slow Down

2009-04-15 Thread Ravi Phulari
Looks like your NameNode is down. Verify that the Hadoop processes are running (jps should show you all running Java processes). If your Hadoop processes are running, try restarting them. I guess this problem is due to your fsimage not being correct. You might have to format your NameNode.
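The jps check above can be sketched in Python as a small parser that reports which expected daemons are absent. This assumes the usual jps output of one "pid name" pair per line. The helper name `missing_daemons`, the sample output, and the list of expected daemons are illustrative assumptions, not output from the cluster discussed in this thread.

```python
def missing_daemons(jps_output, expected):
    """Return the expected daemon names that do not appear in jps output.

    Assumes each jps line looks like "<pid> <name>"; lines without a
    name are ignored. Hypothetical helper for illustration only.
    """
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            running.add(parts[1])
    return sorted(set(expected) - running)


# Made-up sample output for a master node, for illustration only.
sample = "4101 NameNode\n4230 JobTracker\n4388 Jps\n"
print(missing_daemons(sample, ["NameNode", "SecondaryNameNode", "JobTracker"]))
```

On a real node the text would come from running jps itself; which daemons to expect depends on the node's role (for example DataNode and TaskTracker on a slave).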

Re: Map-Reduce Slow Down

2009-04-15 Thread Mithila Nagendra
The log file runs into thousands of lines with the same message being displayed every time. On Wed, Apr 15, 2009 at 8:10 PM, Mithila Nagendra wrote: > The log file : hadoop-mithila-datanode-node19.log.2009-04-14 has the > following in it: > > 2009-04-14 10:08:11,499 INFO org.apache.hadoop.dfs.Dat

Re: Map-Reduce Slow Down

2009-04-15 Thread Mithila Nagendra
The log file : hadoop-mithila-datanode-node19.log.2009-04-14 has the following in it: 2009-04-14 10:08:11,499 INFO org.apache.hadoop.dfs.DataNode: STARTUP_MSG: / STARTUP_MSG: Starting DataNode STARTUP_MSG: host = node19/127.0.0.1 STARTU

Re: Map-Reduce Slow Down

2009-04-14 Thread Mithila Nagendra
Also, would the way the port is accessed change if all these nodes are connected through a gateway? I mean in the hadoop-site.xml file? The Ubuntu systems we worked with earlier didn't have a gateway. Mithila On Tue, Apr 14, 2009 at 9:48 PM, Mithila Nagendra wrote: > Aaron: Which log file do I loo

Re: Map-Reduce Slow Down

2009-04-14 Thread Mithila Nagendra
Aaron: Which log file do I look into? There are a lot of them. Here's what the error looks like: [mith...@node19:~]$ cd hadoop [mith...@node19:~/hadoop]$ bin/hadoop dfs -ls 09/04/14 10:09:29 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 0 time(s). 09/04/14 1

Re: Map-Reduce Slow Down

2009-04-14 Thread Aaron Kimball
Are there any error messages in the log files on those nodes? - Aaron On Tue, Apr 14, 2009 at 9:03 AM, Mithila Nagendra wrote: > I've drawn a blank here! Can't figure out what's wrong with the ports. I > can > ssh between the nodes but can't access the DFS from the slaves - says "Bad > connection

Re: Map-Reduce Slow Down

2009-04-14 Thread Mithila Nagendra
I've drawn a blank here! Can't figure out what's wrong with the ports. I can ssh between the nodes but can't access the DFS from the slaves - says "Bad connection to DFS". Master seems to be fine. Mithila On Tue, Apr 14, 2009 at 4:28 AM, Mithila Nagendra wrote: > Yes I can.. > > > On Mon, Apr 13,

Re: Map-Reduce Slow Down

2009-04-13 Thread Mithila Nagendra
Yes I can.. On Mon, Apr 13, 2009 at 5:12 PM, Jim Twensky wrote: > Can you ssh between the nodes? > > -jim > > On Mon, Apr 13, 2009 at 6:49 PM, Mithila Nagendra > wrote: > > > Thanks Aaron. > > Jim: The three clusters I set up had Ubuntu running on them and the dfs > was > > accessed at port 5431

Re: Map-Reduce Slow Down

2009-04-13 Thread Jim Twensky
Can you ssh between the nodes? -jim On Mon, Apr 13, 2009 at 6:49 PM, Mithila Nagendra wrote: > Thanks Aaron. > Jim: The three clusters I set up had Ubuntu running on them and the dfs was > accessed at port 54310. The new cluster which I've set up has Red Hat Linux > release 7.2 (Enigma) running on

Re: Map-Reduce Slow Down

2009-04-13 Thread Mithila Nagendra
Thanks Aaron. Jim: The three clusters I set up had Ubuntu running on them and the dfs was accessed at port 54310. The new cluster which I've set up has Red Hat Linux release 7.2 (Enigma) running on it. Now when I try to access the dfs from one of the slaves I get the following response: dfs cannot be

Re: Map-Reduce Slow Down

2009-04-13 Thread Jim Twensky
Mithila, you said all the slaves were being utilized in the 3 node cluster. Which application did you run to test that and what was your input size? If you tried the word count application on a 516 MB input file on both cluster setups, then some of your nodes in the 15 node cluster may not be runn
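The point about input size can be made concrete with a rough estimate of how many map tasks a 516 MB file produces. The sketch below assumes roughly one map task per HDFS block and a 64 MB block size. Neither figure is stated in the thread, and the real split computation has extra details, so treat the result as an approximate upper estimate. The function name `estimated_map_tasks` is hypothetical.

```python
import math

MB = 1024 * 1024


def estimated_map_tasks(input_bytes, block_bytes=64 * MB):
    """Rough upper estimate: one map task per (possibly partial) block.

    The 64 MB default is an assumption for illustration, not a value
    confirmed for the clusters discussed in this thread.
    """
    return max(1, math.ceil(input_bytes / block_bytes))


# 516 MB / 64 MB = 8.0625, so about 9 map tasks under these assumptions.
print(estimated_map_tasks(516 * MB))
```

With only about nine map tasks to hand out, a 15 node cluster cannot keep every node busy, while a 3 node cluster can.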

Re: Map-Reduce Slow Down

2009-04-13 Thread Aaron Kimball
In hadoop-*-examples.jar, use "randomwriter" to generate the data and "sort" to sort it. - Aaron On Sun, Apr 12, 2009 at 9:33 PM, Pankil Doshi wrote: > Your data is too small, I guess, for 15 nodes, so it might be the overhead > of coordinating these nodes that makes your MR jobs take longer overall.

Re: Map-Reduce Slow Down

2009-04-12 Thread Pankil Doshi
Your data is too small, I guess, for 15 nodes, so it might be the overhead of coordinating these nodes that makes your MR jobs take longer overall. I guess you will have to try with a larger set of data. Pankil On Sun, Apr 12, 2009 at 6:54 PM, Mithila Nagendra wrote: > Aaron > > That could be the issu

Re: Map-Reduce Slow Down

2009-04-12 Thread Mithila Nagendra
Aaron, that could be the issue; my data is just 516 MB - wouldn't this see a bit of speed up? Could you guide me to the example? I'll run my cluster on it and see what I get. Also for my program I had a Java timer running to record the time taken to complete execution. Does Hadoop have an inbuilt ti

Re: Map-Reduce Slow Down

2009-04-12 Thread Aaron Kimball
Virtually none of the examples that ship with Hadoop are designed to showcase its speed. Hadoop's speedup comes from its ability to process very large volumes of data (starting around, say, tens of GB per job, and going up in orders of magnitude from there). So if you are timing the pi calculator (

Map-Reduce Slow Down

2009-04-12 Thread Mithila Nagendra
Hey all, I recently set up a three-node Hadoop cluster and ran an example on it. It was pretty fast, and all three nodes were being used (I checked the log files to make sure that the slaves were utilized). Now I've set up another cluster consisting of 15 nodes. I ran the same example, but instea