only one reducer running in a hadoop cluster
Hi, I have a Hadoop cluster with 4 PCs, and I want to integrate Hadoop and Lucene together, so I copied some of the source code from Nutch's Indexer class. But when I run my job, I found that there is only 1 reducer running on 1 PC, so the performance is not as good as expected. -- http://daily.appspot.com/food/
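The usual cause is that the job runs with the default of a single reduce task (mapred.reduce.tasks defaults to 1), so all of the reduce work lands on one machine. Below is a minimal sketch of a driver that requests more reducers using the classic org.apache.hadoop.mapred API; the class and job names are illustrative and the mapper/reducer/path setup is omitted.

import java.io.IOException;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class IndexDriver {
    public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(IndexDriver.class);
        conf.setJobName("lucene-index");
        // ... setMapperClass/setReducerClass, input/output paths, key/value classes ...

        // Spread the reduce phase across the cluster, e.g. roughly two reduce
        // slots per node on a 4-node cluster. Equivalent to setting the
        // mapred.reduce.tasks property in the job configuration.
        conf.setNumReduceTasks(8);

        JobClient.runJob(conf);
    }
}

If the driver goes through ToolRunner, the same thing can be done from the command line with -D mapred.reduce.tasks=8 without recompiling.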
Re: Problem with Counters
Thank you all so much. That works. I made a stupid mistake with the naming of a local variable. so the error. :( On Thu, Feb 5, 2009 at 9:49 AM, Tom White wrote: > Try moving the enum to inside the top level class (as you already did) > and then use getCounter() passing the enum value: > > public class MyJob { > > static enum MyCounter{ct_key1}; > > // Mapper and Reducer defined here > > public static void main(String[] args) throws IOException { >// ... > RunningJob running =JobClient.runJob(conf); > Counters ct = running.getCounters(); > long res = ct.getCounter(MyCounter.ct_key1); > // ... > } > > } > > BTW org.apache.hadoop.mapred.Task$Counter is a built-in MapReduce > counter, so that won't help you retrieve your custom counter. > > Cheers, > > Tom > > On Thu, Feb 5, 2009 at 2:22 PM, Rasit OZDAS wrote: > > Sharath, > > > > You're using reporter.incrCounter(enumVal, intVal); to increment > counter, > > I think method to get should also be similar. > > > > Try to use findCounter(enumVal).getCounter() or getCounter(enumVal). > > > > Hope this helps, > > Rasit > > > > 2009/2/5 some speed : > >> In fact I put the enum in my Reduce method as the following link (from > >> Yahoo) says so: > >> > >> > http://public.yahoo.com/gogate/hadoop-tutorial/html/module5.html#metrics > >> --->Look at the section under Reporting Custom Metrics. > >> > >> 2009/2/5 some speed > >> > >>> Thanks Rasit. > >>> > >>> I did as you said. > >>> > >>> 1) Put the static enum MyCounter{ct_key1} just above main() > >>> > >>> 2) Changed result = > >>> ct.findCounter("org.apache.hadoop.mapred.Task$Counter", 1, > >>> "Reduce.MyCounter").getCounter(); > >>> > >>> Still is doesnt seem to help. It throws a null pointer exception.Its > not > >>> able to find the Counter. > >>> > >>> > >>> > >>> Thanks, > >>> > >>> Sharath > >>> > >>> > >>> > >>> > >>> On Thu, Feb 5, 2009 at 8:04 AM, Rasit OZDAS > wrote: > >>> > Forgot to say, value "0" means that the requested counter does not > exist. > > 2009/2/5 Rasit OZDAS : > > Sharath, > > I think the static enum definition should be out of Reduce class. > > Hadoop probably tries to find it elsewhere with "MyCounter", but > it's > > actually "Reduce.MyCounter" in your example. > > > > Hope this helps, > > Rasit > > > > 2009/2/5 some speed : > >> I Tried the following...It gets compiled but the value of result > seems > to be > >> 0 always. > >> > >>RunningJob running = JobClient.runJob(conf); > >> > >> Counters ct = new Counters(); > >> ct = running.getCounters(); > >> > >>long result = > >> ct.findCounter("org.apache.hadoop.mapred.Task$Counter", 0, > >> "*MyCounter*").getCounter(); > >> //even tried MyCounter.Key1 > >> > >> > >> > >> Does anyone know whay that is happening? > >> > >> Thanks, > >> > >> Sharath > >> > >> > >> > >> On Thu, Feb 5, 2009 at 5:59 AM, some speed > wrote: > >> > >>> Hi Tom, > >>> > >>> I get the error : > >>> > >>> Cannot find Symbol* "**MyCounter.ct_key1 " * > >>> > >>> > >>> > >>> > >>> > >>> > >>> On Thu, Feb 5, 2009 at 5:51 AM, Tom White > wrote: > >>> > Hi Sharath, > > The code you posted looks right to me. Counters#getCounter() will > return the counter's value. What error are you getting? > > Tom > > On Thu, Feb 5, 2009 at 10:09 AM, some speed < > speed.s...@gmail.com> > wrote: > > Hi, > > > > Can someone help me with the usage of counters please? I am > incrementing > a > > counter in Reduce method but I am unable to collect the counter > value > after > > the job is completed. 
> > > > Its something like this: > > > > public static class Reduce extends MapReduceBase implements > Reducer > FloatWritable, Text, FloatWritable> > >{ > >static enum MyCounter{ct_key1}; > > > > public void reduce(..) throws IOException > >{ > > > >reporter.incrCounter(MyCounter.ct_key1, 1); > > > >output.collect(..); > > > >} > > } > > > > -main method > > { > >RunningJob running = null; > >running=JobClient.runJob(conf); > > > >Counters ct = running.getCounters(); > > /* How do I Collect the ct_key1 value ***/ > >long res = ct.getCounter(MyCounter.ct_key1); > > >
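For reference, a compact end-to-end version of the pattern that finally worked in this thread (enum declared in the top-level job class, counter read back with getCounter(enum) after runJob returns), written against the 0.19-era org.apache.hadoop.mapred API; the class, key, and type choices are illustrative only.

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.FloatWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;

public class MyJob {

    // Declared in the top-level class, not inside the Reducer.
    static enum MyCounter { ct_key1 }

    public static class Reduce extends MapReduceBase
            implements Reducer<Text, FloatWritable, Text, FloatWritable> {
        public void reduce(Text key, Iterator<FloatWritable> values,
                           OutputCollector<Text, FloatWritable> output,
                           Reporter reporter) throws IOException {
            reporter.incrCounter(MyCounter.ct_key1, 1);  // bump the custom counter
            output.collect(key, values.next());          // illustrative output only
        }
    }

    public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(MyJob.class);
        // ... mapper, input/output paths, output key/value classes omitted ...
        conf.setReducerClass(Reduce.class);

        RunningJob running = JobClient.runJob(conf);

        // Read the counter back once the job has completed.
        Counters ct = running.getCounters();
        long res = ct.getCounter(MyCounter.ct_key1);
        System.out.println("ct_key1 = " + res);
    }
}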
Re: Re: Re: Re: Regarding "Hadoop multi cluster" set-up
I ran into this trouble again. This time, formatting the namenode didnt help. So, I changed the directories where the metadata and the data was being stored. That made it work. You might want to check this up at your end too. Amandeep PS: I dont have an explanation for how and why this made it work. Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Sat, Feb 7, 2009 at 9:06 AM, jason hadoop wrote: > On your master machine, use the netstat command to determine what ports and > addresses the namenode process is listening on. > > On the datanode machines, examine the log files,, to verify that the > datanode has attempted to connect to the namenode ip address on one of > those > ports, and was successfull. > > The common ports used for datanode -> namenode rondevu are 50010, 54320 and > 8020, depending on your hadoop version > > If the datanodes have been started, and the connection to the namenode > failed, there will be a log message with a socket error, indicating what > host and port the datanode used to attempt to communicate with the > namenode. > Verify that that ip address is correct for your namenode, and reachable > from > the datanode host (for multi homed machines this can be an issue), and that > the port listed is one of the tcp ports that the namenode process is > listing > on. > > For linux, you can use command > *netstat -a -t -n -p | grep java | grep LISTEN* > to determine the ip addresses and ports and pids of the java processes that > are listening for tcp socket connections > > and the jps command from the bin directory of your java installation to > determine the pid of the namenode. > > On Sat, Feb 7, 2009 at 6:27 AM, shefali pawar >wrote: > > > Hi, > > > > No, not yet. We are still struggling! If you find the solution please let > > me know. > > > > Shefali > > > > On Sat, 07 Feb 2009 02:56:15 +0530 wrote > > >I had to change the master on my running cluster and ended up with the > > same > > >problem. Were you able to fix it at your end? > > > > > >Amandeep > > > > > > > > >Amandeep Khurana > > >Computer Science Graduate Student > > >University of California, Santa Cruz > > > > > > > > >On Thu, Feb 5, 2009 at 8:46 AM, shefali pawar wrote: > > > > > >> Hi, > > >> > > >> I do not think that the firewall is blocking the port because it has > > been > > >> turned off on both the computers! And also since it is a random port > > number > > >> I do not think it should create a problem. > > >> > > >> I do not understand what is going wrong! > > >> > > >> Shefali > > >> > > >> On Wed, 04 Feb 2009 23:23:04 +0530 wrote > > >> >I'm not certain that the firewall is your problem but if that port is > > >> >blocked on your master you should open it to let communication > through. > > >> Here > > >> >is one website that might be relevant: > > >> > > > >> > > > >> > > > http://stackoverflow.com/questions/255077/open-ports-under-fedora-core-8-for-vmware-server > > >> > > > >> >but again, this may not be your problem. > > >> > > > >> >John > > >> > > > >> >On Wed, Feb 4, 2009 at 12:46 PM, shefali pawar wrote: > > >> > > > >> >> Hi, > > >> >> > > >> >> I will have to check. I can do that tomorrow in college. But if > that > > is > > >> the > > >> >> case what should i do? > > >> >> > > >> >> Should i change the port number and try again? > > >> >> > > >> >> Shefali > > >> >> > > >> >> On Wed, 04 Feb 2009 S D wrote : > > >> >> > > >> >> >Shefali, > > >> >> > > > >> >> >Is your firewall blocking port 54310 on the master? 
> > >> >> > > > >> >> >John > > >> >> > > > >> >> >On Wed, Feb 4, 2009 at 12:34 PM, shefali pawar > >wrote: > > >> >> > > > >> >> > > Hi, > > >> >> > > > > >> >> > > I am trying to set-up a two node cluster using Hadoop0.19.0, > with > > 1 > > >> >> > > master(which should also work as a slave) and 1 slave node. > > >> >> > > > > >> >> > > But while running bin/start-dfs.sh the datanode is not starting > > on > > >> the > > >> >> > > slave. I had read the previous mails on the list, but nothing > > seems > > >> to > > >> >> be > > >> >> > > working in this case. I am getting the following error in the > > >> >> > > hadoop-root-datanode-slave log file while running the command > > >> >> > > bin/start-dfs.sh => > > >> >> > > > > >> >> > > 2009-02-03 13:00:27,516 INFO > > >> >> > > org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG: > > >> >> > > / > > >> >> > > STARTUP_MSG: Starting DataNode > > >> >> > > STARTUP_MSG: host = slave/172.16.0.32 > > >> >> > > STARTUP_MSG: args = [] > > >> >> > > STARTUP_MSG: version = 0.19.0 > > >> >> > > STARTUP_MSG: build = > > >> >> > > > > https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19-r > > >> >> > > 713890; compiled by 'ndaley' on Fri Nov 14 03:12:29 UTC 2008 > > >> >> > > / > > >> >> > > 2009-02-03 13:00:28,725 INFO org.apache.hadoop.ipc.Client: > > Re
Re: Re: Re: Re: Regarding "Hadoop multi cluster" set-up
On your master machine, use the netstat command to determine what ports and addresses the namenode process is listening on. On the datanode machines, examine the log files to verify that the datanode has attempted to connect to the namenode IP address on one of those ports, and was successful. The common ports used for the datanode -> namenode rendezvous are 50010, 54320 and 8020, depending on your Hadoop version. If the datanodes have been started and the connection to the namenode failed, there will be a log message with a socket error, indicating what host and port the datanode used to attempt to communicate with the namenode. Verify that that IP address is correct for your namenode and reachable from the datanode host (for multi-homed machines this can be an issue), and that the port listed is one of the TCP ports that the namenode process is listening on. On Linux, you can use the command *netstat -a -t -n -p | grep java | grep LISTEN* to determine the IP addresses, ports, and PIDs of the Java processes that are listening for TCP socket connections, and the jps command from the bin directory of your Java installation to determine the PID of the namenode. On Sat, Feb 7, 2009 at 6:27 AM, shefali pawar wrote: > Hi, > > No, not yet. We are still struggling! If you find the solution please let > me know. > > Shefali > > On Sat, 07 Feb 2009 02:56:15 +0530 wrote > >I had to change the master on my running cluster and ended up with the > same > >problem. Were you able to fix it at your end? > > > >Amandeep > > > > > >Amandeep Khurana > >Computer Science Graduate Student > >University of California, Santa Cruz > > > > > >On Thu, Feb 5, 2009 at 8:46 AM, shefali pawar wrote: > > > >> Hi, > >> > >> I do not think that the firewall is blocking the port because it has > been > >> turned off on both the computers! And also since it is a random port > number > >> I do not think it should create a problem. > >> > >> I do not understand what is going wrong! > >> > >> Shefali > >> > >> On Wed, 04 Feb 2009 23:23:04 +0530 wrote > >> >I'm not certain that the firewall is your problem but if that port is > >> >blocked on your master you should open it to let communication through. > >> Here > >> >is one website that might be relevant: > >> > > >> > > >> > http://stackoverflow.com/questions/255077/open-ports-under-fedora-core-8-for-vmware-server > >> > > >> >but again, this may not be your problem. > >> > > >> >John > >> > > >> >On Wed, Feb 4, 2009 at 12:46 PM, shefali pawar wrote: > >> > > >> >> Hi, > >> >> > >> >> I will have to check. I can do that tomorrow in college. But if that > is > >> the > >> >> case what should i do? > >> >> > >> >> Should i change the port number and try again? > >> >> > >> >> Shefali > >> >> > >> >> On Wed, 04 Feb 2009 S D wrote : > >> >> > >> >> >Shefali, > >> >> > > >> >> >Is your firewall blocking port 54310 on the master? > >> >> > > >> >> >John > >> >> > > >> >> >On Wed, Feb 4, 2009 at 12:34 PM, shefali pawar > >wrote: > >> >> > > >> >> > > Hi, > >> >> > > > >> >> > > I am trying to set-up a two node cluster using Hadoop0.19.0, with > 1 > >> >> > > master(which should also work as a slave) and 1 slave node. > >> >> > > > >> >> > > But while running bin/start-dfs.sh the datanode is not starting > on > >> the > >> >> > > slave. I had read the previous mails on the list, but nothing > seems > >> to > >> >> be > >> >> > > working in this case. 
I am getting the following error in the > >> >> > > hadoop-root-datanode-slave log file while running the command > >> >> > > bin/start-dfs.sh => > >> >> > > > >> >> > > 2009-02-03 13:00:27,516 INFO > >> >> > > org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG: > >> >> > > / > >> >> > > STARTUP_MSG: Starting DataNode > >> >> > > STARTUP_MSG: host = slave/172.16.0.32 > >> >> > > STARTUP_MSG: args = [] > >> >> > > STARTUP_MSG: version = 0.19.0 > >> >> > > STARTUP_MSG: build = > >> >> > > > https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19-r > >> >> > > 713890; compiled by 'ndaley' on Fri Nov 14 03:12:29 UTC 2008 > >> >> > > / > >> >> > > 2009-02-03 13:00:28,725 INFO org.apache.hadoop.ipc.Client: > Retrying > >> >> connect > >> >> > > to server: master/172.16.0.46:54310. Already tried 0 time(s). > >> >> > > 2009-02-03 13:00:29,726 INFO org.apache.hadoop.ipc.Client: > Retrying > >> >> connect > >> >> > > to server: master/172.16.0.46:54310. Already tried 1 time(s). > >> >> > > 2009-02-03 13:00:30,727 INFO org.apache.hadoop.ipc.Client: > Retrying > >> >> connect > >> >> > > to server: master/172.16.0.46:54310. Already tried 2 time(s). > >> >> > > 2009-02-03 13:00:31,728 INFO org.apache.hadoop.ipc.Client: > Retrying > >> >> connect > >> >> > > to server: master/172.16.0.46:54310. Already tried 3 time(s). > >> >> > > 2009-02-03 13:00:32,729 INFO org.apache.hadoop.ipc.Client: > Retrying > >> >> connect > >> >> >
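When netstat on the master looks right but the datanode log still shows connection retries or "No route to host", a quick stand-alone check run from the slave can separate name-resolution problems from firewall/routing problems. The sketch below uses only the plain JDK; pass it the namenode host and port that fs.default.name points at (54310 in this thread).

import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;

public class NameNodeProbe {
    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "master";              // namenode host
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 54310;  // namenode RPC port

        // 1) What does name resolution on this slave say the master is?
        System.out.println(host + " resolves to "
                + InetAddress.getByName(host).getHostAddress());

        // 2) Can we actually open a TCP connection to the namenode RPC port?
        Socket s = new Socket();
        try {
            s.connect(new InetSocketAddress(host, port), 5000);
            System.out.println("Connected to " + host + ":" + port);
        } finally {
            s.close();
        }
    }
}

Run it on each datanode host: if the resolved address is not the namenode's real address, fix /etc/hosts or DNS; if resolution is fine but the connect times out or is refused, look at firewall rules and at which interface the namenode is actually bound to.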
Re: Cannot copy from local file system to DFS
Please examine the web console for the namenode. The url for this should be http://*namenodehost*:50070/ This will tell you what datanodes are successfully connected to the namenode. If the number is 0, then no datanodes are either running or were able to connect to the namenode at start, or were able to be started. The common reasons for this case are configuration errors, installation errors, or network connectivity issues due to firewalls blocking ports, or dns lookup errors (either failure or incorrect address returned) for the namenode hostname on the datanodes. At this point you will need to investigate the log files for the datanodes to make an assessment of what has happened. On Sat, Feb 7, 2009 at 6:17 AM, Rasit OZDAS wrote: > Hi, Mithila, > > "File /user/mithila/test/20417.txt could only be replicated to 0 > nodes, instead of 1" > > I think your datanode isn't working properly. > please take a look at log file of your datanode (logs/*datanode*.log). > > If there is no error in that log file, I've heard that hadoop can sometimes > mark > a datanode as "BAD" and refuses to send the block to that node, this > can be the cause. > (List, please correct me if I'm wrong!) > > Hope this helps, > Rasit > > 2009/2/6 Mithila Nagendra : > > Hey all > > I was trying to run the word count example on one of the hadoop systems I > > installed, but when i try to copy the text files from the local file > system > > to the DFS, it throws up the following exception: > > > > [mith...@node02 hadoop]$ jps > > 8711 JobTracker > > 8805 TaskTracker > > 8901 Jps > > 8419 NameNode > > 8642 SecondaryNameNode > > [mith...@node02 hadoop]$ cd .. > > [mith...@node02 mithila]$ ls > > hadoop hadoop-0.17.2.1.tar hadoop-datastore test > > [mith...@node02 mithila]$ hadoop/bin/hadoop dfs -copyFromLocal test test > > 09/02/06 11:26:26 INFO dfs.DFSClient: > org.apache.hadoop.ipc.RemoteException: > > java.io.IOException: File /user/mithila/test/20417.txt could only be > > replicated to 0 nodes, instead of 1 > >at > > > org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1145) > >at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:300) > >at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > >at > > > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > >at > > > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > >at java.lang.reflect.Method.invoke(Method.java:597) > >at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:446) > >at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896) > > > >at org.apache.hadoop.ipc.Client.call(Client.java:557) > >at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:212) > >at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source) > >at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > >at > > > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > >at > > > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > >at java.lang.reflect.Method.invoke(Method.java:597) > >at > > > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) > >at > > > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) > >at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source) > >at > > > org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2335) > >at > > > 
org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2220) > >at > > > org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1700(DFSClient.java:1702) > >at > > > org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1842) > > > > 09/02/06 11:26:26 WARN dfs.DFSClient: NotReplicatedYetException sleeping > > /user/mithila/test/20417.txt retries left 4 > > 09/02/06 11:26:27 INFO dfs.DFSClient: > org.apache.hadoop.ipc.RemoteException: > > java.io.IOException: File /user/mithila/test/20417.txt could only be > > replicated to 0 nodes, instead of 1 > >at > > > org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1145) > >at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:300) > >at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > >at > > > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > >at > > > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > >at java.lang.reflect.Method.invoke(Method.java:597) > >at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:446) > >at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896) > > > >at org.apache.hadoop.ipc.Client.call(Client.java:557) > >at
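The "could only be replicated to 0 nodes" error means the namenode currently has no live datanodes to write to, which is exactly what the web console check above verifies. The same count can also be pulled programmatically; the sketch below assumes getDataNodeStats() on DistributedFileSystem is available in your release and uses the org.apache.hadoop.dfs package from the 0.17 line shown in this thread (later releases moved these classes to org.apache.hadoop.hdfs). bin/hadoop dfsadmin -report prints roughly the same information from the command line.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.dfs.DatanodeInfo;
import org.apache.hadoop.dfs.DistributedFileSystem;
import org.apache.hadoop.fs.FileSystem;

public class LiveDatanodes {
    public static void main(String[] args) throws Exception {
        // Picks up fs.default.name from the hadoop-site.xml on the classpath.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        if (!(fs instanceof DistributedFileSystem)) {
            System.err.println("fs.default.name does not point at HDFS: "
                    + fs.getClass().getName());
            return;
        }

        DatanodeInfo[] nodes = ((DistributedFileSystem) fs).getDataNodeStats();
        System.out.println("Datanodes known to the namenode: " + nodes.length);
        for (DatanodeInfo node : nodes) {
            System.out.println("  " + node.getName());
        }
    }
}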
Re: Heap size error
The default task memory allocation is set in the hadoop-default.xml file for your configuration. The parameter is mapred.child.java.opts, and the value is generally -Xmx200m. You may alter this value in your JobConf object before you submit the job, and the individual tasks will use the altered value. If the variable that contains your JobConf object is named conf, * conf.set( "mapred.child.java.opts", "-Xmx512m");* will override any existing value from your configuration with the value "-Xmx512m" for the job you are about to launch. A way to do this that will, in general, preserve any existing values (with the Sun JDK) would be: * conf.set( "mapred.child.java.opts", conf.get("mapred.child.java.opts","") + " -Xmx512m");* The above line will append -Xmx512m to the current value of the mapred.child.java.opts parameter, and use the value of "" if there is no value set or the value is null. It may of course be that your application is using more memory than you expect due to an incorrect assumption or programming error, and the above will not be effective. The hadoop script, in the bin directory of your installation, provides a way to pass arguments to the On Sat, Feb 7, 2009 at 5:54 AM, Rasit OZDAS wrote: > Hi, Amandeep, > I've copied following lines from a site: > -- > Exception in thread "main" java.lang.OutOfMemoryError: Java heap space > > This can have two reasons: > >* Your Java application has a memory leak. There are tools like > YourKit Java Profiler that help you to identify such leaks. >* Your Java application really needs a lot of memory (more than > 128 MB by default!). In this case the Java heap size can be increased > using the following runtime parameters: > > java -Xms -Xmx > > Defaults are: > > java -Xms32m -Xmx128m > > You can set this either in the Java Control Panel or on the command > line, depending on the environment you run your application. > - > > Hope this helps, > Rasit > > 2009/2/7 Amandeep Khurana : > > I'm getting the following error while running my hadoop job: > > > > 09/02/06 15:33:03 INFO mapred.JobClient: Task Id : > > attempt_200902061333_0004_r_00_1, Status : FAILED > > java.lang.OutOfMemoryError: Java heap space > >at java.util.Arrays.copyOf(Unknown Source) > >at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source) > >at java.lang.AbstractStringBuilder.append(Unknown Source) > >at java.lang.StringBuffer.append(Unknown Source) > >at TableJoin$Reduce.reduce(TableJoin.java:61) > >at TableJoin$Reduce.reduce(TableJoin.java:1) > >at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:430) > >at org.apache.hadoop.mapred.Child.main(Child.java:155) > > > > Any inputs? > > > > Amandeep > > > > > > Amandeep Khurana > > Computer Science Graduate Student > > University of California, Santa Cruz > > > > > > -- > M. Raşit ÖZDAŞ >
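Putting the advice above into a complete driver, here is a sketch that appends a larger heap to whatever mapred.child.java.opts already holds; the class name echoes the TableJoin job from the stack trace but is purely illustrative.

import java.io.IOException;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class TableJoinDriver {
    public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(TableJoinDriver.class);
        // ... mapper, reducer, input/output paths omitted ...

        // Append -Xmx512m, preserving any child JVM options already configured;
        // the second argument to get() is the default used when the key is unset.
        String childOpts = conf.get("mapred.child.java.opts", "");
        conf.set("mapred.child.java.opts", childOpts + " -Xmx512m");

        JobClient.runJob(conf);
    }
}

With the Sun JVM the last -Xmx on the command line wins, which is why appending works. If the reduce tasks still exhaust a 512 MB heap, the StringBuffer growth visible in the stack trace (TableJoin$Reduce.reduce) is the next place to look.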
Re: Re: Re: Re: Regarding "Hadoop multi cluster" set-up
Hi, No, not yet. We are still struggling! If you find the solution please let me know. Shefali On Sat, 07 Feb 2009 02:56:15 +0530 wrote >I had to change the master on my running cluster and ended up with the same >problem. Were you able to fix it at your end? > >Amandeep > > >Amandeep Khurana >Computer Science Graduate Student >University of California, Santa Cruz > > >On Thu, Feb 5, 2009 at 8:46 AM, shefali pawar wrote: > >> Hi, >> >> I do not think that the firewall is blocking the port because it has been >> turned off on both the computers! And also since it is a random port number >> I do not think it should create a problem. >> >> I do not understand what is going wrong! >> >> Shefali >> >> On Wed, 04 Feb 2009 23:23:04 +0530 wrote >> >I'm not certain that the firewall is your problem but if that port is >> >blocked on your master you should open it to let communication through. >> Here >> >is one website that might be relevant: >> > >> > >> http://stackoverflow.com/questions/255077/open-ports-under-fedora-core-8-for-vmware-server >> > >> >but again, this may not be your problem. >> > >> >John >> > >> >On Wed, Feb 4, 2009 at 12:46 PM, shefali pawar wrote: >> > >> >> Hi, >> >> >> >> I will have to check. I can do that tomorrow in college. But if that is >> the >> >> case what should i do? >> >> >> >> Should i change the port number and try again? >> >> >> >> Shefali >> >> >> >> On Wed, 04 Feb 2009 S D wrote : >> >> >> >> >Shefali, >> >> > >> >> >Is your firewall blocking port 54310 on the master? >> >> > >> >> >John >> >> > >> >> >On Wed, Feb 4, 2009 at 12:34 PM, shefali pawar > >wrote: >> >> > >> >> > > Hi, >> >> > > >> >> > > I am trying to set-up a two node cluster using Hadoop0.19.0, with 1 >> >> > > master(which should also work as a slave) and 1 slave node. >> >> > > >> >> > > But while running bin/start-dfs.sh the datanode is not starting on >> the >> >> > > slave. I had read the previous mails on the list, but nothing seems >> to >> >> be >> >> > > working in this case. I am getting the following error in the >> >> > > hadoop-root-datanode-slave log file while running the command >> >> > > bin/start-dfs.sh => >> >> > > >> >> > > 2009-02-03 13:00:27,516 INFO >> >> > > org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG: >> >> > > / >> >> > > STARTUP_MSG: Starting DataNode >> >> > > STARTUP_MSG: host = slave/172.16.0.32 >> >> > > STARTUP_MSG: args = [] >> >> > > STARTUP_MSG: version = 0.19.0 >> >> > > STARTUP_MSG: build = >> >> > > https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19-r >> >> > > 713890; compiled by 'ndaley' on Fri Nov 14 03:12:29 UTC 2008 >> >> > > / >> >> > > 2009-02-03 13:00:28,725 INFO org.apache.hadoop.ipc.Client: Retrying >> >> connect >> >> > > to server: master/172.16.0.46:54310. Already tried 0 time(s). >> >> > > 2009-02-03 13:00:29,726 INFO org.apache.hadoop.ipc.Client: Retrying >> >> connect >> >> > > to server: master/172.16.0.46:54310. Already tried 1 time(s). >> >> > > 2009-02-03 13:00:30,727 INFO org.apache.hadoop.ipc.Client: Retrying >> >> connect >> >> > > to server: master/172.16.0.46:54310. Already tried 2 time(s). >> >> > > 2009-02-03 13:00:31,728 INFO org.apache.hadoop.ipc.Client: Retrying >> >> connect >> >> > > to server: master/172.16.0.46:54310. Already tried 3 time(s). >> >> > > 2009-02-03 13:00:32,729 INFO org.apache.hadoop.ipc.Client: Retrying >> >> connect >> >> > > to server: master/172.16.0.46:54310. Already tried 4 time(s). 
>> >> > > 2009-02-03 13:00:33,730 INFO org.apache.hadoop.ipc.Client: Retrying >> >> connect >> >> > > to server: master/172.16.0.46:54310. Already tried 5 time(s). >> >> > > 2009-02-03 13:00:34,731 INFO org.apache.hadoop.ipc.Client: Retrying >> >> connect >> >> > > to server: master/172.16.0.46:54310. Already tried 6 time(s). >> >> > > 2009-02-03 13:00:35,732 INFO org.apache.hadoop.ipc.Client: Retrying >> >> connect >> >> > > to server: master/172.16.0.46:54310. Already tried 7 time(s). >> >> > > 2009-02-03 13:00:36,733 INFO org.apache.hadoop.ipc.Client: Retrying >> >> connect >> >> > > to server: master/172.16.0.46:54310. Already tried 8 time(s). >> >> > > 2009-02-03 13:00:37,734 INFO org.apache.hadoop.ipc.Client: Retrying >> >> connect >> >> > > to server: master/172.16.0.46:54310. Already tried 9 time(s). >> >> > > 2009-02-03 13:00:37,738 ERROR >> >> > > org.apache.hadoop.hdfs.server.datanode.DataNode: >> java.io.IOException: >> >> Call >> >> > > to master/172.16.0.46:54310 failed on local exception: No route to >> >> host >> >> > > at org.apache.hadoop.ipc.Client.call(Client.java:699) >> >> > > at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216) >> >> > > at $Proxy4.getProtocolVersion(Unknown Source) >> >> > > at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319) >> >> > > at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:306) >> >> > >
Re: Cannot copy from local file system to DFS
Hi, Mithila, "File /user/mithila/test/20417.txt could only be replicated to 0 nodes, instead of 1" I think your datanode isn't working properly. please take a look at log file of your datanode (logs/*datanode*.log). If there is no error in that log file, I've heard that hadoop can sometimes mark a datanode as "BAD" and refuses to send the block to that node, this can be the cause. (List, please correct me if I'm wrong!) Hope this helps, Rasit 2009/2/6 Mithila Nagendra : > Hey all > I was trying to run the word count example on one of the hadoop systems I > installed, but when i try to copy the text files from the local file system > to the DFS, it throws up the following exception: > > [mith...@node02 hadoop]$ jps > 8711 JobTracker > 8805 TaskTracker > 8901 Jps > 8419 NameNode > 8642 SecondaryNameNode > [mith...@node02 hadoop]$ cd .. > [mith...@node02 mithila]$ ls > hadoop hadoop-0.17.2.1.tar hadoop-datastore test > [mith...@node02 mithila]$ hadoop/bin/hadoop dfs -copyFromLocal test test > 09/02/06 11:26:26 INFO dfs.DFSClient: org.apache.hadoop.ipc.RemoteException: > java.io.IOException: File /user/mithila/test/20417.txt could only be > replicated to 0 nodes, instead of 1 >at > org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1145) >at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:300) >at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) >at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) >at java.lang.reflect.Method.invoke(Method.java:597) >at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:446) >at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896) > >at org.apache.hadoop.ipc.Client.call(Client.java:557) >at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:212) >at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source) >at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) >at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) >at java.lang.reflect.Method.invoke(Method.java:597) >at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) >at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) >at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source) >at > org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2335) >at > org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2220) >at > org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1700(DFSClient.java:1702) >at > org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1842) > > 09/02/06 11:26:26 WARN dfs.DFSClient: NotReplicatedYetException sleeping > /user/mithila/test/20417.txt retries left 4 > 09/02/06 11:26:27 INFO dfs.DFSClient: org.apache.hadoop.ipc.RemoteException: > java.io.IOException: File /user/mithila/test/20417.txt could only be > replicated to 0 nodes, instead of 1 >at > org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1145) >at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:300) >at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) >at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) >at 
java.lang.reflect.Method.invoke(Method.java:597) >at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:446) >at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896) > >at org.apache.hadoop.ipc.Client.call(Client.java:557) >at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:212) >at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source) >at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) >at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) >at java.lang.reflect.Method.invoke(Method.java:597) >at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) >at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) >at org.apache.hadoop.dfs.$Proxy0.addBlock(Unknown Source) >at > org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2335) >at > org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2220) >at > org.apache.h
Re: Heap size error
Hi, Amandeep, I've copied following lines from a site: -- Exception in thread "main" java.lang.OutOfMemoryError: Java heap space This can have two reasons: * Your Java application has a memory leak. There are tools like YourKit Java Profiler that help you to identify such leaks. * Your Java application really needs a lot of memory (more than 128 MB by default!). In this case the Java heap size can be increased using the following runtime parameters: java -Xms -Xmx Defaults are: java -Xms32m -Xmx128m You can set this either in the Java Control Panel or on the command line, depending on the environment you run your application. - Hope this helps, Rasit 2009/2/7 Amandeep Khurana : > I'm getting the following error while running my hadoop job: > > 09/02/06 15:33:03 INFO mapred.JobClient: Task Id : > attempt_200902061333_0004_r_00_1, Status : FAILED > java.lang.OutOfMemoryError: Java heap space >at java.util.Arrays.copyOf(Unknown Source) >at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source) >at java.lang.AbstractStringBuilder.append(Unknown Source) >at java.lang.StringBuffer.append(Unknown Source) >at TableJoin$Reduce.reduce(TableJoin.java:61) >at TableJoin$Reduce.reduce(TableJoin.java:1) >at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:430) >at org.apache.hadoop.mapred.Child.main(Child.java:155) > > Any inputs? > > Amandeep > > > Amandeep Khurana > Computer Science Graduate Student > University of California, Santa Cruz > -- M. Raşit ÖZDAŞ
Re: Completed jobs not finishing, errors in jobtracker logs
On Feb 6, 2009, at 12:39 PM, Bryan Duxbury wrote: I'm seeing some strange behavior on my cluster. Jobs will be done (that is, all tasks completed), but the job will still be "running". This state seems to persist for minutes, and is really killing my throughput. I'm seeing errors (warnings) in the jobtracker log that look like this: Looks like a bug, can you please file a jira? thanks, Arun