The kickstart script was something the operations staff used to initialize new machines. I never actually saw the script; I just worked out that there was a firewall in place.
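A firewall like the one Jason describes shows up as a timeout or refusal when probing the namenode's RPC port from a slave. A minimal sketch, assuming bash (for its `/dev/tcp` redirection) and coreutils `timeout`; node18:54310 is the namenode address from this thread:

```shell
# Probe a TCP port; a firewalled port times out, a closed one is refused.
# Assumes bash's /dev/tcp support and coreutils `timeout`.
check_port() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# node18:54310 is the namenode host/port from this thread.
if check_port node18 54310; then
  echo "namenode port reachable"
else
  echo "namenode port blocked or closed"
fi
```

Run from each slave; if the port is unreachable only from some nodes, a per-host firewall rule (as in the kickstart case) is a likely culprit.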
On Thu, Apr 16, 2009 at 1:28 PM, Mithila Nagendra <mnage...@asu.edu> wrote:
> Jason: the kickstart script - was it something you wrote, or is it run
> when the system turns on?
> Mithila

On Thu, Apr 16, 2009 at 1:06 AM, Mithila Nagendra <mnage...@asu.edu> wrote:
> Thanks Jason! Will check that out.
> Mithila

On Thu, Apr 16, 2009 at 5:23 AM, jason hadoop <jason.had...@gmail.com> wrote:
> Double-check that there is no firewall in place.
> At one point a bunch of new machines were kickstarted and placed in a
> cluster, and they all failed with something similar. It turned out the
> kickstart script had enabled the firewall with a rule that blocked ports
> in the 50k range. It took us a while to even think to check, since that
> was not a part of our normal machine configuration.

On Wed, Apr 15, 2009 at 11:04 AM, Mithila Nagendra <mnage...@asu.edu> wrote:
> Hi Aaron,
> I will look into that, thanks!
>
> I spoke to the admin who oversees the cluster. He said that the gateway
> comes into the picture only when one of the nodes communicates with a
> node outside of the cluster. But in my case the communication is carried
> out between nodes which all belong to the same cluster.
> Mithila

On Wed, Apr 15, 2009 at 8:59 PM, Aaron Kimball <aa...@cloudera.com> wrote:
> Hi,
> I wrote a blog post a while back about connecting nodes via a gateway:
> http://www.cloudera.com/blog/2008/12/03/securing-a-hadoop-cluster-through-a-gateway/
> This assumes that the client is outside the gateway and all
> datanodes/namenode are inside, but the same principles apply. You'll
> just need to set up ssh tunnels from every datanode to the namenode.
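The per-datanode tunnel Aaron describes might look as follows. This is a sketch only: the helper prints the ssh command so it can be reviewed before running on each slave, `gateway.example` is a hypothetical jump host, and node18:54310 is the namenode address from this thread; the exact layout depends on the setup in the blog post.

```shell
# Build (but do not run) the ssh port-forward command for one datanode:
# forward the namenode RPC port through a gateway host.
# -f: background after auth; -N: no remote command; -L: local forward.
tunnel_cmd() {
  local gateway=$1 namenode=$2 port=$3
  echo "ssh -f -N -L ${port}:${namenode}:${port} ${gateway}"
}

tunnel_cmd gateway.example node18 54310
# prints: ssh -f -N -L 54310:node18:54310 gateway.example
```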
> - Aaron

On Wed, Apr 15, 2009 at 10:19 AM, Ravi Phulari <rphul...@yahoo-inc.com> wrote:
> Looks like your NameNode is down.
> Verify that the Hadoop processes are running (jps should show you all
> running Java processes). If your Hadoop processes are running, try
> restarting them. I guess this problem is due to your fsimage not being
> correct; you might have to format your namenode.
> Hope this helps.
> Thanks,
> Ravi

On 4/15/09 10:15 AM, "Mithila Nagendra" <mnage...@asu.edu> wrote:
> The log file runs into thousands of lines, with the same message being
> displayed every time.

On Wed, Apr 15, 2009 at 8:10 PM, Mithila Nagendra <mnage...@asu.edu> wrote:
> The log file hadoop-mithila-datanode-node19.log.2009-04-14 has the
> following in it:
>
> 2009-04-14 10:08:11,499 INFO org.apache.hadoop.dfs.DataNode: STARTUP_MSG:
> /************************************************************
> STARTUP_MSG: Starting DataNode
> STARTUP_MSG:   host = node19/127.0.0.1
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 0.18.3
> STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.18 -r 736250; compiled by 'ndaley' on Thu Jan 22 23:12:08 UTC 2009
> ************************************************************/
> 2009-04-14 10:08:12,915 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 0 time(s).
> [... the same "Retrying connect" line repeats for tries 1 through 9 ...]
> 2009-04-14 10:08:22,008 INFO org.apache.hadoop.ipc.RPC: Server at node18/192.168.0.18:54310 not available yet, Zzzzz...
> [... the ten-retry cycle and the "not available yet" line then repeat ...]
>
> Hmmm, I still can't figure it out.
> Mithila

On Tue, Apr 14, 2009 at 10:22 PM, Mithila Nagendra <mnage...@asu.edu> wrote:
> Also, would the way the port is accessed change if all these nodes are
> connected through a gateway - I mean in the hadoop-site.xml file? The
> Ubuntu systems we worked with earlier didn't have a gateway.
> Mithila

On Tue, Apr 14, 2009 at 9:48 PM, Mithila Nagendra <mnage...@asu.edu> wrote:
> Aaron: Which log file do I look into - there are a lot of them. Here's
> what the error looks like:
>
> [mith...@node19:~]$ cd hadoop
> [mith...@node19:~/hadoop]$ bin/hadoop dfs -ls
> 09/04/14 10:09:29 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 0 time(s).
> [... the same line repeats for tries 1 through 9 ...]
> Bad connection to FS. command aborted.
>
> Node19 is a slave and Node18 is the master.
> Mithila

On Tue, Apr 14, 2009 at 8:53 PM, Aaron Kimball <aa...@cloudera.com> wrote:
> Are there any error messages in the log files on those nodes?
> - Aaron

On Tue, Apr 14, 2009 at 9:03 AM, Mithila Nagendra <mnage...@asu.edu> wrote:
> I've drawn a blank here! Can't figure out what's wrong with the ports. I
> can ssh between the nodes but can't access the DFS from the slaves - it
> says "Bad connection to DFS". The master seems to be fine.
> Mithila

On Tue, Apr 14, 2009 at 4:28 AM, Mithila Nagendra <mnage...@asu.edu> wrote:
> Yes I can.

On Mon, Apr 13, 2009 at 5:12 PM, Jim Twensky <jim.twen...@gmail.com> wrote:
> Can you ssh between the nodes?
> >> > >> >>>> > >> > >> > >> >>>> > >> -jim > >> > >> >>>> > >> > >> > >> >>>> > >> On Mon, Apr 13, 2009 at 6:49 PM, Mithila Nagendra < > >> > >> >>>> mnage...@asu.edu> > >> > >> >>>> > >> wrote: > >> > >> >>>> > >> > >> > >> >>>> > >> > Thanks Aaron. > >> > >> >>>> > >> > Jim: The three clusters I setup had ubuntu running on > >> them > >> > and > >> > >> >>>> the dfs > >> > >> >>>> > >> was > >> > >> >>>> > >> > accessed at port 54310. The new cluster which I ve > setup > >> has > >> > >> Red > >> > >> >>>> Hat > >> > >> >>>> > >> Linux > >> > >> >>>> > >> > release 7.2 (Enigma)running on it. Now when I try to > >> access > >> > >> the > >> > >> >>>> dfs > >> > >> >>>> > from > >> > >> >>>> > >> > one > >> > >> >>>> > >> > of the slaves i get the following response: dfs cannot > be > >> > >> >>>> accessed. > >> > >> >>>> > When > >> > >> >>>> > >> I > >> > >> >>>> > >> > access the DFS throught the master there s no problem. > So > >> I > >> > >> feel > >> > >> >>>> there > >> > >> >>>> > a > >> > >> >>>> > >> > problem with the port. Any ideas? I did check the list > of > >> > >> slaves, > >> > >> >>>> it > >> > >> >>>> > >> looks > >> > >> >>>> > >> > fine to me. > >> > >> >>>> > >> > > >> > >> >>>> > >> > Mithila > >> > >> >>>> > >> > > >> > >> >>>> > >> > > >> > >> >>>> > >> > > >> > >> >>>> > >> > > >> > >> >>>> > >> > On Mon, Apr 13, 2009 at 2:58 PM, Jim Twensky < > >> > >> >>>> jim.twen...@gmail.com> > >> > >> >>>> > >> > wrote: > >> > >> >>>> > >> > > >> > >> >>>> > >> > > Mithila, > >> > >> >>>> > >> > > > >> > >> >>>> > >> > > You said all the slaves were being utilized in the 3 > >> node > >> > >> >>>> cluster. > >> > >> >>>> > >> Which > >> > >> >>>> > >> > > application did you run to test that and what was > your > >> > input > >> > >> >>>> size? 
> >> > >> >>>> > If > >> > >> >>>> > >> you > >> > >> >>>> > >> > > tried the word count application on a 516 MB input > file > >> on > >> > >> both > >> > >> >>>> > >> cluster > >> > >> >>>> > >> > > setups, than some of your nodes in the 15 node > cluster > >> may > >> > >> not > >> > >> >>>> be > >> > >> >>>> > >> running > >> > >> >>>> > >> > > at > >> > >> >>>> > >> > > all. Generally, one map job is assigned to each input > >> > split > >> > >> and > >> > >> >>>> if > >> > >> >>>> > you > >> > >> >>>> > >> > are > >> > >> >>>> > >> > > running your cluster with the defaults, the splits > are > >> 64 > >> > MB > >> > >> >>>> each. I > >> > >> >>>> > >> got > >> > >> >>>> > >> > > confused when you said the Namenode seemed to do all > >> the > >> > >> work. > >> > >> >>>> Can > >> > >> >>>> > you > >> > >> >>>> > >> > > check > >> > >> >>>> > >> > > conf/slaves and make sure you put the names of all > task > >> > >> >>>> trackers > >> > >> >>>> > >> there? I > >> > >> >>>> > >> > > also suggest comparing both clusters with a larger > >> input > >> > >> size, > >> > >> >>>> say > >> > >> >>>> > at > >> > >> >>>> > >> > least > >> > >> >>>> > >> > > 5 GB, to really see a difference. > >> > >> >>>> > >> > > > >> > >> >>>> > >> > > Jim > >> > >> >>>> > >> > > > >> > >> >>>> > >> > > On Mon, Apr 13, 2009 at 4:17 PM, Aaron Kimball < > >> > >> >>>> aa...@cloudera.com> > >> > >> >>>> > >> > wrote: > >> > >> >>>> > >> > > > >> > >> >>>> > >> > > > in hadoop-*-examples.jar, use "randomwriter" to > >> generate > >> > >> the > >> > >> >>>> data > >> > >> >>>> > >> and > >> > >> >>>> > >> > > > "sort" > >> > >> >>>> > >> > > > to sort it. 
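Jim's split arithmetic can be made concrete: with one map task per input split and the default 64 MB split size, a 516 MB input yields only nine maps, so at most nine of the fifteen nodes get map work. A quick check:

```shell
# One map task per input split; 64 MB splits are the default here.
input_mb=516
split_mb=64

# Ceiling division: number of splits, hence number of map tasks.
maps=$(( (input_mb + split_mb - 1) / split_mb ))
echo "$maps map tasks"   # prints: 9 map tasks
```

With fewer splits than nodes, idle slaves are expected behavior, not a misconfiguration.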
> >> > >> >>>> > >> > > > - Aaron > >> > >> >>>> > >> > > > > >> > >> >>>> > >> > > > On Sun, Apr 12, 2009 at 9:33 PM, Pankil Doshi < > >> > >> >>>> > forpan...@gmail.com> > >> > >> >>>> > >> > > wrote: > >> > >> >>>> > >> > > > > >> > >> >>>> > >> > > > > Your data is too small I guess for 15 clusters > ..So > >> it > >> > >> >>>> might be > >> > >> >>>> > >> > > overhead > >> > >> >>>> > >> > > > > time of these clusters making your total MR jobs > >> more > >> > >> time > >> > >> >>>> > >> consuming. > >> > >> >>>> > >> > > > > I guess you will have to try with larger set of > >> data.. > >> > >> >>>> > >> > > > > > >> > >> >>>> > >> > > > > Pankil > >> > >> >>>> > >> > > > > On Sun, Apr 12, 2009 at 6:54 PM, Mithila Nagendra > < > >> > >> >>>> > >> mnage...@asu.edu> > >> > >> >>>> > >> > > > > wrote: > >> > >> >>>> > >> > > > > > >> > >> >>>> > >> > > > > > Aaron > >> > >> >>>> > >> > > > > > > >> > >> >>>> > >> > > > > > That could be the issue, my data is just 516MB > - > >> > >> wouldn't > >> > >> >>>> this > >> > >> >>>> > >> see > >> > >> >>>> > >> > a > >> > >> >>>> > >> > > > bit > >> > >> >>>> > >> > > > > of > >> > >> >>>> > >> > > > > > speed up? > >> > >> >>>> > >> > > > > > Could you guide me to the example? I ll run my > >> > cluster > >> > >> on > >> > >> >>>> it > >> > >> >>>> > and > >> > >> >>>> > >> > see > >> > >> >>>> > >> > > > what > >> > >> >>>> > >> > > > > I > >> > >> >>>> > >> > > > > > get. Also for my program I had a java timer > >> running > >> > to > >> > >> >>>> record > >> > >> >>>> > >> the > >> > >> >>>> > >> > > time > >> > >> >>>> > >> > > > > > taken > >> > >> >>>> > >> > > > > > to complete execution. Does Hadoop have an > >> inbuilt > >> > >> timer? 
> >> > >> >>>> > >> > > > > > > >> > >> >>>> > >> > > > > > Mithila > >> > >> >>>> > >> > > > > > > >> > >> >>>> > >> > > > > > On Mon, Apr 13, 2009 at 1:13 AM, Aaron Kimball > < > >> > >> >>>> > >> aa...@cloudera.com > >> > >> >>>> > >> > > > >> > >> >>>> > >> > > > > wrote: > >> > >> >>>> > >> > > > > > > >> > >> >>>> > >> > > > > > > Virtually none of the examples that ship with > >> > Hadoop > >> > >> >>>> are > >> > >> >>>> > >> designed > >> > >> >>>> > >> > > to > >> > >> >>>> > >> > > > > > > showcase its speed. Hadoop's speedup comes > from > >> > its > >> > >> >>>> ability > >> > >> >>>> > to > >> > >> >>>> > >> > > > process > >> > >> >>>> > >> > > > > > very > >> > >> >>>> > >> > > > > > > large volumes of data (starting around, say, > >> tens > >> > of > >> > >> GB > >> > >> >>>> per > >> > >> >>>> > >> job, > >> > >> >>>> > >> > > and > >> > >> >>>> > >> > > > > > going > >> > >> >>>> > >> > > > > > > up in orders of magnitude from there). So if > >> you > >> > are > >> > >> >>>> timing > >> > >> >>>> > >> the > >> > >> >>>> > >> > pi > >> > >> >>>> > >> > > > > > > calculator (or something like that), its > >> results > >> > >> won't > >> > >> >>>> > >> > necessarily > >> > >> >>>> > >> > > be > >> > >> >>>> > >> > > > > > very > >> > >> >>>> > >> > > > > > > consistent. If a job doesn't have enough > >> fragments > >> > >> of > >> > >> >>>> data > >> > >> >>>> > to > >> > >> >>>> > >> > > > allocate > >> > >> >>>> > >> > > > > > one > >> > >> >>>> > >> > > > > > > per each node, some of the nodes will also > just > >> go > >> > >> >>>> unused. > >> > >> >>>> > >> > > > > > > > >> > >> >>>> > >> > > > > > > The best example for you to run is to use > >> > >> randomwriter > >> > >> >>>> to > >> > >> >>>> > fill > >> > >> >>>> > >> up > >> > >> >>>> > >> > > > your > >> > >> >>>> > >> > > > > > > cluster with several GB of random data and > then > >> > run > >> > >> the > >> > >> >>>> sort > >> > >> >>>> > >> > > program. 
> >> > >> >>>> > >> > > > > If > >> > >> >>>> > >> > > > > > > that doesn't scale up performance from 3 > nodes > >> to > >> > >> 15, > >> > >> >>>> then > >> > >> >>>> > >> you've > >> > >> >>>> > >> > > > > > > definitely > >> > >> >>>> > >> > > > > > > got something strange going on. > >> > >> >>>> > >> > > > > > > > >> > >> >>>> > >> > > > > > > - Aaron > >> > >> >>>> > >> > > > > > > > >> > >> >>>> > >> > > > > > > > >> > >> >>>> > >> > > > > > > On Sun, Apr 12, 2009 at 8:39 AM, Mithila > >> Nagendra > >> > < > >> > >> >>>> > >> > > mnage...@asu.edu> > >> > >> >>>> > >> > > > > > > wrote: > >> > >> >>>> > >> > > > > > > > >> > >> >>>> > >> > > > > > > > Hey all > >> > >> >>>> > >> > > > > > > > I recently setup a three node hadoop > cluster > >> and > >> > >> ran > >> > >> >>>> an > >> > >> >>>> > >> > examples > >> > >> >>>> > >> > > on > >> > >> >>>> > >> > > > > it. > >> > >> >>>> > >> > > > > > > It > >> > >> >>>> > >> > > > > > > > was pretty fast, and all the three nodes > were > >> > >> being > >> > >> >>>> used > >> > >> >>>> > (I > >> > >> >>>> > >> > > checked > >> > >> >>>> > >> > > > > the > >> > >> >>>> > >> > > > > > > log > >> > >> >>>> > >> > > > > > > > files to make sure that the slaves are > >> > utilized). > >> > >> >>>> > >> > > > > > > > > >> > >> >>>> > >> > > > > > > > Now I ve setup another cluster consisting > of > >> 15 > >> > >> >>>> nodes. I > >> > >> >>>> > ran > >> > >> >>>> > >> > the > >> > >> >>>> > >> > > > same > >> > >> >>>> > >> > > > > > > > example, but instead of speeding up, the > >> > >> map-reduce > >> > >> >>>> task > >> > >> >>>> > >> seems > >> > >> >>>> > >> > to > >> > >> >>>> > >> > > > > take > >> > >> >>>> > >> > > > > > > > forever! The slaves are not being used for > >> some > >> > >> >>>> reason. 
> >> > >> >>>> > This > >> > >> >>>> > >> > > second > >> > >> >>>> > >> > > > > > > cluster > >> > >> >>>> > >> > > > > > > > has a lower, per node processing power, but > >> > should > >> > >> >>>> that > >> > >> >>>> > make > >> > >> >>>> > >> > any > >> > >> >>>> > >> > > > > > > > difference? > >> > >> >>>> > >> > > > > > > > How can I ensure that the data is being > >> mapped > >> > to > >> > >> all > >> > >> >>>> the > >> > >> >>>> > >> > nodes? > >> > >> >>>> > >> > > > > > > Presently, > >> > >> >>>> > >> > > > > > > > the only node that seems to be doing all > the > >> > work > >> > >> is > >> > >> >>>> the > >> > >> >>>> > >> Master > >> > >> >>>> > >> > > > node. > >> > >> >>>> > >> > > > > > > > > >> > >> >>>> > >> > > > > > > > Does 15 nodes in a cluster increase the > >> network > >> > >> cost? > >> > >> >>>> What > >> > >> >>>> > >> can > >> > >> >>>> > >> > I > >> > >> >>>> > >> > > do > >> > >> >>>> > >> > > > > to > >> > >> >>>> > >> > > > > > > > setup > >> > >> >>>> > >> > > > > > > > the cluster to function more efficiently? > >> > >> >>>> > >> > > > > > > > > >> > >> >>>> > >> > > > > > > > Thanks! > >> > >> >>>> > >> > > > > > > > Mithila Nagendra > >> > >> >>>> > >> > > > > > > > Arizona State University > >> > >> >>>> > >> > > > > > > > > >> > >> >>>> > >> > > > > > > > >> > >> >>>> > >> > > > > > > >> > >> >>>> > >> > > > > > >> > >> >>>> > >> > > > > >> > >> >>>> > >> > > > >> > >> >>>> > >> > > >> > >> >>>> > >> > >> > >> >>>> > > > >> > >> >>>> > > > >> > >> >>>> > > >> > >> >>>> > >> > >> >>> > >> > >> >>> > >> > >> >> > >> > >> > > >> > >> > >> > >> > >> > >> Ravi > >> > >> -- > >> > >> > >> > >> > >> > > > >> > > >> > >> > >> > >> -- > >> Alpha Chapters of my book on Hadoop are available > >> http://www.apress.com/book/view/9781430219422 > >> > > > > > -- Alpha Chapters of my book on Hadoop are available http://www.apress.com/book/view/9781430219422