Re: Not able to copy a file to HDFS after installing
where is the namenode running? localhost or some other host?

-Sagar

Rajshekar wrote:
> Hello, I am new to Hadoop and I just installed it on Ubuntu 8.04 LTS following the guidance of a web site. I tested it and found it working fine. When I tried to run the wordcount example it failed with repeated "Retrying connect to server: localhost/127.0.0.1:9000" messages and then java.lang.RuntimeException: java.net.ConnectException: Connection refused. Please help me out.
> [full command line and stack trace snipped -- see the original message below]
Hadoop IO performance, prefetch etc
Hi, most of our map jobs are IO bound. However, for the same node, the IO throughput during the map phase is only 20% of its real sequential IO capability (we measured the sequential IO throughput with iozone). I think the reason is that while each map issues a sequential IO request, many maps run concurrently on the same node, which causes quite expensive IO switching between their streams. Prefetching might be a good solution here, especially since a map task is supposed to scan through exactly one block, no more and no less. Any idea how to enable it? Thanks, -Songting
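As far as I know there is no explicit prefetch option in the 0.17/0.18 releases; the closest existing knobs are a larger client read buffer (each map issues fewer, larger reads) and a lower per-node map slot count (fewer streams competing for one set of disks). A sketch that just names those properties; the values are illustrative only, and the second one is a cluster-side tasktracker setting rather than a per-job one:

    // Fragment for the job setup code (e.g. inside run() of a Tool); sketch only.
    import org.apache.hadoop.mapred.JobConf;

    JobConf conf = new JobConf(ScanJob.class);        // ScanJob is a placeholder job class
    // Larger buffer for the task-side HDFS streams of this job (default is 4 KB):
    conf.setInt("io.file.buffer.size", 1024 * 1024);  // 1 MB, illustrative
    // Cluster-side: cap concurrent maps per node so the streams don't thrash one another.
    // This property is read by each tasktracker from its own hadoop-site.xml at startup,
    // so it cannot be set per job; it is listed here only by name:
    //   mapred.tasktracker.map.tasks.maximum = 2

Neither of these is true prefetching, but together they trade a little concurrency for much longer sequential runs per disk.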
Not able to copy a file to HDFS after installing
Hello, I am new to Hadoop and I just installed it on Ubuntu 8.04 LTS following the guidance of a web site. I tested it and found it working fine. But when I tried to run the wordcount example on a file, it failed with the errors below. Please help me out.

had...@excel-desktop:/usr/local/hadoop/hadoop-0.17.2.1$ bin/hadoop jar hadoop-0.17.2.1-examples.jar wordcount /home/hadoop/Download\ URLs.txt download-output
09/02/02 11:18:59 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 1 time(s).
09/02/02 11:19:00 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 2 time(s).
09/02/02 11:19:01 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 3 time(s).
09/02/02 11:19:02 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 4 time(s).
09/02/02 11:19:04 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 5 time(s).
09/02/02 11:19:05 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 6 time(s).
09/02/02 11:19:06 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 7 time(s).
09/02/02 11:19:07 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 8 time(s).
09/02/02 11:19:08 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 9 time(s).
09/02/02 11:19:09 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 10 time(s).
java.lang.RuntimeException: java.net.ConnectException: Connection refused
        at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:356)
        at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:331)
        at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:304)
        at org.apache.hadoop.examples.WordCount.run(WordCount.java:146)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.examples.WordCount.main(WordCount.java:155)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:6
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:53)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
        at org.apache.hadoop.mapred.JobShell.run(JobShell.java:194)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.mapred.JobShell.main(JobShell.java:220)
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
        at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:11
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:174)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:623)
        at org.apache.hadoop.ipc.Client.call(Client.java:546)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:212)
        at org.apache.hadoop.dfs.$Proxy0.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:313)
        at org.apache.hadoop.dfs.DFSClient.createRPCNamenode(DFSClient.java:102)
        at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:17
        at org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:6
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1280)
        at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:56)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1291)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:203)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:10
        at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:352)

--
View this message in context: http://www.nabble.com/Not-able-to-copy-a-file-to-HDFS-after-installing-tp21845768p21845768.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
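"Connection refused" on localhost:9000 with those retries means nothing is listening where fs.default.name points, i.e. the NameNode is not up (or is configured on a different host/port). A minimal stand-alone check, a sketch only -- the address is copied from the log above and the class name is made up:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Tiny sanity check: can this machine reach the NameNode the job is trying to use?
    public class CheckNameNode {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();              // reads hadoop-site.xml from the classpath
            conf.set("fs.default.name", "hdfs://localhost:9000");  // same address as in the log above
            FileSystem fs = FileSystem.get(conf);                  // throws ConnectException if nothing listens there
            System.out.println("NameNode reachable, / exists: " + fs.exists(new Path("/")));
        }
    }

If this throws the same exception, start the daemons (bin/start-all.sh), check that fs.default.name in conf/hadoop-site.xml matches what the NameNode was actually started with, and confirm that jps on the box lists a NameNode process.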
copying binary files to a SequenceFile
Hi all, I am copying regular binary files into a SequenceFile, using BytesWritable and giving it the entire byte[] content of each file. However, once it hits a file larger than my computer's memory, this may run into problems. Is there a better way? Thank you, Mark
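One possible way around the memory ceiling (a sketch only, not an established recipe): stream each input file in fixed-size chunks and write one SequenceFile record per chunk, keyed by file name and offset so the original bytes can be reassembled in order. The class name, key scheme, and 8 MB chunk size below are assumptions for illustration.

    import java.io.InputStream;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    public class ChunkedCopy {
        public static void copy(FileSystem fs, Path src, Path dst, Configuration conf) throws Exception {
            int chunk = 8 * 1024 * 1024;   // 8 MB per record -- illustrative, pick what fits your heap
            SequenceFile.Writer writer =
                SequenceFile.createWriter(fs, conf, dst, Text.class, BytesWritable.class);
            InputStream in = fs.open(src);
            byte[] buf = new byte[chunk];
            long offset = 0;
            int n;
            while ((n = in.read(buf)) > 0) {
                // key = "<file>@<offset>" so the chunks can be put back together in order
                writer.append(new Text(src.getName() + "@" + offset),
                              new BytesWritable(copyOf(buf, n)));
                offset += n;
            }
            in.close();
            writer.close();
        }
        // copy only the bytes actually read, so each record owns its own array
        private static byte[] copyOf(byte[] buf, int len) {
            byte[] out = new byte[len];
            System.arraycopy(buf, 0, out, 0, len);
            return out;
        }
    }

Per-record memory use is then bounded by the chunk size instead of the file size; the trade-off is that readers have to concatenate the chunks for a given file themselves.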
Re: Bad connection to FS.
As noted by others, the NameNode is not running. Before formatting anything (which is like deleting your data), try to see why the NameNode isn't running. Search for the value of HADOOP_LOG_DIR in ./conf/hadoop-env.sh; if you have not set it explicitly, it defaults to the logs/ directory under your Hadoop install, and the file to look at is *namenode*.log.

Lohit

----- Original Message -----
From: Amandeep Khurana
To: core-user@hadoop.apache.org
Sent: Wednesday, February 4, 2009 5:26:43 PM
Subject: Re: Bad connection to FS.

Here's what I had done:
1. Stop the whole system
2. Delete all the data in the directories where the data and the metadata are being stored.
3. Format the namenode
4. Start the system

This solved my problem. I'm not sure whether it is a good idea for you or not. I was pretty much installing from scratch, so I didn't mind deleting the files in those directories.

Amandeep
[...]
Re: Bad connection to FS.
Here's what I had done:
1. Stop the whole system
2. Delete all the data in the directories where the data and the metadata are being stored.
3. Format the namenode
4. Start the system

This solved my problem. I'm not sure whether it is a good idea for you or not. I was pretty much installing from scratch, so I didn't mind deleting the files in those directories.

Amandeep

Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz

On Wed, Feb 4, 2009 at 3:49 PM, TCK wrote:
> I believe the debug logs location is still specified in hadoop-env.sh (I just read the 0.19.0 doc). I think you have to shut down all nodes first (stop-all), then format the namenode, and then restart (start-all) and make sure that the NameNode comes up too. We are using a very old version, 0.12.3, and are upgrading.
> -TCK
> [...]
Re: Value-Only Reduce Output
My (0.18.2) reduce src looks like this:

    write(key);
    clientOut_.write('\t');
    write(val);
    clientOut_.write('\n');

which explains why the trailing tab is unavoidable there. Thanks for your help, though, Jason!

2009/2/4 jason hadoop
> For your reduce, the parameter is stream.reduce.input.field.separator, if you are supplying a reduce class, and I believe the output format is TextOutputFormat...
>
> It looks like you have tried the map parameter for the separator, not the reduce parameter.
>
> From 0.19.0 PipeReducer:
> configure:
>   reduceOutFieldSeparator = job_.get("stream.reduce.output.field.separator", "\t").getBytes("UTF-8");
>   reduceInputFieldSeparator = job_.get("stream.reduce.input.field.separator", "\t").getBytes("UTF-8");
>   this.numOfReduceOutputKeyFields = job_.getInt("stream.num.reduce.output.key.fields", 1);
>
> getInputSeparator:
>   byte[] getInputSeparator() {
>     return reduceInputFieldSeparator;
>   }
>
> reduce:
>       write(key);
>       clientOut_.write(getInputSeparator());
>       write(val);
>       clientOut_.write('\n');
>     } else {
>       // "identity reduce"
>       output.collect(key, val);
>     }
>
> On Wed, Feb 4, 2009 at 6:15 AM, Rasit OZDAS wrote:
> > I tried it myself, it doesn't work.
> > I've also tried the stream.map.output.field.separator and map.output.key.field.separator parameters for this purpose; they don't work either. When hadoop sees an empty string, it takes the default tab character instead.
> >
> > Rasit
> >
> > 2009/2/4 jason hadoop
> > > Ooops, you are using streaming, and I am not familiar.
> > > As a terrible hack, you could set mapred.textoutputformat.separator to the empty string in your configuration.
> > >
> > > On Tue, Feb 3, 2009 at 9:26 PM, jason hadoop wrote:
> > > > If you are using the standard TextOutputFormat, and the output collector is passed a null for the value, there will not be a trailing tab character added to the output line.
> > > >
> > > > output.collect( key, null );
> > > > will give you the behavior you are looking for if your configuration is as I expect.
> > > >
> > > > On Tue, Feb 3, 2009 at 7:49 PM, Jack Stahl wrote:
> > > > > Hello,
> > > > >
> > > > > I'm interested in a map-reduce flow where I output only values (no keys) in my reduce step. For example, imagine the canonical word-counting program where I'd like my output to be an unlabeled histogram of counts instead of (word, count) pairs.
> > > > >
> > > > > I'm using HadoopStreaming (specifically, I'm using the dumbo module to run my python scripts). When I simulate the map reduce using pipes and sort in bash, it works fine. However, in Hadoop, if I output a value with no tabs, Hadoop appends a trailing "\t", apparently interpreting my output as a (value, "") KV pair. I'd like to avoid outputting this trailing tab if possible.
> > > > >
> > > > > Is there a command line option that could be used to effect this? More generally, is there something wrong with outputting arbitrary strings, instead of key-value pairs, in your reduce step?
> >
> > --
> > M. Raşit ÖZDAŞ
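For a plain Java (non-streaming) job, the output.collect(key, null) suggestion quoted above does work: TextOutputFormat writes only the key and no separator when the value is null. A minimal word-count-style reducer sketch along those lines (class and type choices are illustrative, not from the thread):

    import java.io.IOException;
    import java.util.Iterator;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reducer;
    import org.apache.hadoop.mapred.Reporter;

    // Emits only the count (no word, no trailing tab): the value side is null,
    // so TextOutputFormat writes just the key text.
    public class CountOnlyReducer extends MapReduceBase
            implements Reducer<Text, IntWritable, Text, NullWritable> {
        public void reduce(Text word, Iterator<IntWritable> counts,
                           OutputCollector<Text, NullWritable> out, Reporter reporter)
                throws IOException {
            int sum = 0;
            while (counts.hasNext()) {
                sum += counts.next().get();
            }
            out.collect(new Text(Integer.toString(sum)), null);  // null value => no tab appended
        }
    }

Streaming takes a different path (PipeReducer writes the separator unconditionally in the 0.18.2 source quoted at the top), which is why the same trick does not carry over there.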
Re: Bad connection to FS.
I believe the debug logs location is still specified in hadoop-env.sh (I just read the 0.19.0 doc). I think you have to shut down all nodes first (stop-all), then format the namenode, and then restart (start-all) and make sure that the NameNode comes up too. We are using a very old version, 0.12.3, and are upgrading.

-TCK

--- On Wed, 2/4/09, Mithila Nagendra wrote:
From: Mithila Nagendra
Subject: Re: Bad connection to FS.
To: core-user@hadoop.apache.org, moonwatcher32...@yahoo.com
Date: Wednesday, February 4, 2009, 6:30 PM

@TCK: Which version of hadoop have you installed?
@Amandeep: I did try reformatting the namenode, but it hasn't helped me out in any way.

Mithila
[...]
Re: problem with completion notification from block movement
Karl Kleinpaste wrote:

> On Sun, 2009-02-01 at 17:58 -0800, jason hadoop wrote:
> > The datanodes use multiple threads with locking, and one of the assumptions is that the block report (once per hour by default) takes little time. The datanode will pause while the block report is running, and if it happens to take a while, weird things start to happen.
>
> Thank you for responding, this is very informative for us. Having looked through the source code with a co-worker regarding the periodic scan and then checking the logs once again, we find reports of this sort:
>
> BlockReport of 1158499 blocks got processed in 308860 msecs
> BlockReport of 1159840 blocks got processed in 237925 msecs
> BlockReport of 1161274 blocks got processed in 177853 msecs
> BlockReport of 1162408 blocks got processed in 285094 msecs
> BlockReport of 1164194 blocks got processed in 184478 msecs
> BlockReport of 1165673 blocks got processed in 226401 msecs
>
> The 3rd of these exactly straddles the particular example timeline I discussed in my original email about this question. I suspect I'll find more of the same as I look through other related errors.
>
> --karl

You could ask for a "complete fix" in https://issues.apache.org/jira/browse/HADOOP-4584 . I don't think the current patch there fixes your problem.

Raghu.
Re: Bad connection to FS.
I faced the same issue a few days back. Formatting the namenode made it work for me.

Amandeep

Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz

On Wed, Feb 4, 2009 at 3:06 PM, Mithila Nagendra wrote:
> Hey all
>
> When I try to copy a folder from the local file system in to the HDFS using the command hadoop dfs -copyFromLocal, the copy fails and it gives an error which says "Bad connection to FS". How do I get past this?
> [...]
Re: Bad connection to FS.
Mithila, how come there is no NameNode java process listed by your jps command? I would check the hadoop namenode logs to see if there was some startup problem (the location of those logs would be specified in hadoop-env.sh, at least in the version I'm using).

-TCK

--- On Wed, 2/4/09, Mithila Nagendra wrote:
From: Mithila Nagendra
Subject: Bad connection to FS.
To: "core-user@hadoop.apache.org", "core-user-subscr...@hadoop.apache.org"
Date: Wednesday, February 4, 2009, 6:06 PM

Hey all

When I try to copy a folder from the local file system in to the HDFS using the command hadoop dfs -copyFromLocal, the copy fails and it gives an error which says "Bad connection to FS". How do I get past this? The following is the output at the time of execution:
[...]
Bad connection to FS.
Hey all

When I try to copy a folder from the local file system into HDFS using the command hadoop dfs -copyFromLocal, the copy fails and it gives an error which says "Bad connection to FS". How do I get past this? The following is the output at the time of execution:

had...@renweiyu-desktop:/usr/local/hadoop$ jps
6873 Jps
6299 JobTracker
6029 DataNode
6430 TaskTracker
6189 SecondaryNameNode
had...@renweiyu-desktop:/usr/local/hadoop$ ls
bin          docs                        lib          README.txt
build.xml    hadoop-0.18.3-ant.jar       libhdfs      src
c++          hadoop-0.18.3-core.jar      librecordio  webapps
CHANGES.txt  hadoop-0.18.3-examples.jar  LICENSE.txt
conf         hadoop-0.18.3-test.jar      logs
contrib      hadoop-0.18.3-tools.jar     NOTICE.txt
had...@renweiyu-desktop:/usr/local/hadoop$ cd ..
had...@renweiyu-desktop:/usr/local$ ls
bin  etc  games  gutenberg  hadoop  hadoop-0.18.3.tar.gz  hadoop-datastore  include  lib  man  sbin  share  src
had...@renweiyu-desktop:/usr/local$ hadoop/bin/hadoop dfs -copyFromLocal gutenberg gutenberg
09/02/04 15:58:21 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 0 time(s).
09/02/04 15:58:22 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 1 time(s).
09/02/04 15:58:23 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 2 time(s).
09/02/04 15:58:24 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 3 time(s).
09/02/04 15:58:25 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 4 time(s).
09/02/04 15:58:26 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 5 time(s).
09/02/04 15:58:27 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 6 time(s).
09/02/04 15:58:28 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 7 time(s).
09/02/04 15:58:29 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 8 time(s).
09/02/04 15:58:30 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 9 time(s).
Bad connection to FS. command aborted.

The command jps shows that the hadoop system is up and running. So I have no idea what's wrong!

Thanks!
Mithila
Re: HADOOP-2536 supports Oracle too?
Ok. I created the same database in MySQL and ran the same hadoop job against it. It worked. So that means there is some Oracle-specific issue. It can't be an issue with the JDBC drivers, since I am using the same drivers in a simple JDBC client. What could it be?

Amandeep

Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz

On Wed, Feb 4, 2009 at 10:26 AM, Amandeep Khurana wrote:
> Ok. I'm not sure if I got it correct. Are you saying I should test the statement that hadoop creates directly against the database?
>
> Amandeep
>
> Amandeep Khurana
> Computer Science Graduate Student
> University of California, Santa Cruz
>
> On Wed, Feb 4, 2009 at 7:13 AM, Enis Soztutar wrote:
>> Hadoop-2536 connects to the db via JDBC, so in theory it should work with proper jdbc drivers.
>> It has been tested against MySQL, Hsqldb, and PostgreSQL, but not Oracle.
>>
>> To answer your earlier question, the actual SQL statements might not be recognized by Oracle, so I suggest the best way to test this is to insert print statements, and run the actual SQL statements against Oracle to see if the syntax is accepted.
>>
>> We would appreciate it if you publish your results.
>>
>> Enis
>>
>> Amandeep Khurana wrote:
>>> Does the patch HADOOP-2536 support connecting to Oracle databases as well? Or is it just limited to MySQL?
>>>
>>> Amandeep
>>>
>>> Amandeep Khurana
>>> Computer Science Graduate Student
>>> University of California, Santa Cruz
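Following Enis's suggestion, the quickest check is probably to capture the statement DBInputFormat builds and run it in sqlplus. A hedged sketch of the setup being discussed (every table, column, and class name below is a placeholder); if I read the 0.19 DBInputFormat correctly, the generated per-split query ends with a LIMIT ... OFFSET clause, which Oracle's dialect does not accept and which would match the ORA-00933 "SQL command not properly ended" symptom reported later in this digest:

    // Fragment of the kind of job setup HADOOP-2536 expects; all names are placeholders.
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.lib.db.DBConfiguration;
    import org.apache.hadoop.mapred.lib.db.DBInputFormat;

    JobConf job = new JobConf(OracleLoad.class);          // placeholder job class
    job.setInputFormat(DBInputFormat.class);
    DBConfiguration.configureDB(job,
        "oracle.jdbc.driver.OracleDriver",                // JDBC driver class
        "jdbc:oracle:thin:@dbhost:1521:SID",              // placeholder connection string
        "user", "pass");
    DBInputFormat.setInput(job, MyRecord.class,           // MyRecord implements Writable + DBWritable
        "MY_TABLE",                                       // table name (placeholder)
        null,                                             // optional WHERE conditions
        "ID",                                             // ORDER BY column
        "ID", "NAME");                                    // columns to select
    // Per split, DBInputFormat then builds roughly:
    //   SELECT ID, NAME FROM MY_TABLE ORDER BY ID LIMIT <length> OFFSET <start>
    // MySQL/HSQLDB/PostgreSQL accept this form; Oracle does not, and that alone would
    // make the same query work against MySQL but fail against Oracle.

Printing that statement from the job (or pulling it out of the DBInputFormat source) and pasting it into sqlplus should confirm or rule this out quickly.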
RE: Control over max map/reduce tasks per job
I have filed an issue for this: https://issues.apache.org/jira/browse/HADOOP-5170 JG > -Original Message- > From: Bryan Duxbury [mailto:br...@rapleaf.com] > Sent: Tuesday, February 03, 2009 10:59 PM > To: core-user@hadoop.apache.org > Subject: Re: Control over max map/reduce tasks per job > > This sounds good enough for a JIRA ticket to me. > -Bryan > > On Feb 3, 2009, at 11:44 AM, Jonathan Gray wrote: > > > Chris, > > > > For my specific use cases, it would be best to be able to set N > > mappers/reducers per job per node (so I can explicitly say, run at > > most 2 at > > a time of this CPU bound task on any given node). However, the > > other way > > would work as well (on 10 node system, would set job to max 20 > > tasks at a > > time globally), but opens up the possibility that a node could be > > assigned > > more than 2 of that task. > > > > I would work with whatever is easiest to implement as either would > > be a vast > > improvement for me (can run high numbers of network latency bound > > tasks > > without fear of cpu bound tasks killing the cluster). > > > > JG > > > > > > > >> -Original Message- > >> From: Chris K Wensel [mailto:ch...@wensel.net] > >> Sent: Tuesday, February 03, 2009 11:34 AM > >> To: core-user@hadoop.apache.org > >> Subject: Re: Control over max map/reduce tasks per job > >> > >> Hey Jonathan > >> > >> Are you looking to limit the total number of concurrent mapper/ > >> reducers a single job can consume cluster wide, or limit the number > >> per node? > >> > >> That is, you have X mappers/reducers, but only can allow N mappers/ > >> reducers to run at a time globally, for a given job. > >> > >> Or, you are cool with all X running concurrently globally, but > >> want to > >> guarantee that no node can run more than N tasks from that job? > >> > >> Or both? > >> > >> just reconciling the conversation we had last week with this thread. > >> > >> ckw > >> > >> On Feb 3, 2009, at 11:16 AM, Jonathan Gray wrote: > >> > >>> All, > >>> > >>> > >>> > >>> I have a few relatively small clusters (5-20 nodes) and am having > >>> trouble > >>> keeping them loaded with my MR jobs. > >>> > >>> > >>> > >>> The primary issue is that I have different jobs that have > >>> drastically > >>> different patterns. I have jobs that read/write to/from HBase or > >>> Hadoop > >>> with minimal logic (network throughput bound or io bound), others > >> that > >>> perform crawling (network latency bound), and one huge parsing > >>> streaming job > >>> (very CPU bound, each task eats a core). > >>> > >>> > >>> > >>> I'd like to launch very large numbers of tasks for network latency > >>> bound > >>> jobs, however the large CPU bound job means I have to keep the max > >>> maps > >>> allowed per node low enough as to not starve the Datanode and > >>> Regionserver. > >>> > >>> > >>> > >>> I'm an HBase dev but not familiar enough with Hadoop MR code to > even > >>> know > >>> what would be involved with implementing this. However, in talking > >>> with > >>> other users, it seems like this would be a well-received option. > >>> > >>> > >>> > >>> I wanted to ping the list before filing an issue because it seems > >> like > >>> someone may have thought about this in the past. > >>> > >>> > >>> > >>> Thanks. > >>> > >>> > >>> > >>> Jonathan Gray > >>> > >> > >> -- > >> Chris K Wensel > >> ch...@wensel.net > >> http://www.cascading.org/ > >> http://www.scaleunlimited.com/ > >
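For anyone landing on this thread later, a hedged sketch of what is and is not controllable with the 0.19-era API, which is the gap HADOOP-5170 is about; the class name and numbers are illustrative only:

    // Fragment, for reference; the two job-level setters exist today, the per-job-per-node cap does not.
    import org.apache.hadoop.mapred.JobConf;

    JobConf job = new JobConf(CpuBoundParseJob.class);  // placeholder class
    job.setNumMapTasks(40);      // only a hint -- the real number of maps follows the input splits
    job.setNumReduceTasks(10);   // honored exactly
    // The per-NODE limits that do exist are tasktracker settings in hadoop-site.xml; they apply
    // to every job on that node and need a tasktracker restart to change:
    //   mapred.tasktracker.map.tasks.maximum     (e.g. 2)
    //   mapred.tasktracker.reduce.tasks.maximum  (e.g. 2)
    // A per-job, per-node cap is exactly what HADOOP-5170 asks for.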
Re: Regarding "Hadoop multi cluster" set-up
I would love to see someplace a complete list of the ports that the various Hadoop daemons expect to have open. Does anyone have that?

Ian

On Feb 4, 2009, at 1:16 PM, shefali pawar wrote:

Hi,

I will have to check. I can do that tomorrow in college. But if that is the case, what should I do? Should I change the port number and try again?

Shefali

On Wed, 04 Feb 2009 S D wrote:

Shefali,

Is your firewall blocking port 54310 on the master?

John

On Wed, Feb 4, 2009 at 12:34 PM, shefali pawar wrote:

Hi,

I am trying to set up a two-node cluster using Hadoop 0.19.0, with 1 master (which should also work as a slave) and 1 slave node. But while running bin/start-dfs.sh the datanode is not starting on the slave. I had read the previous mails on the list, but nothing seems to be working in this case. I am getting the following error in the hadoop-root-datanode-slave log file while running the command bin/start-dfs.sh =>

2009-02-03 13:00:27,516 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = slave/172.16.0.32
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.19.0
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r 713890; compiled by 'ndaley' on Fri Nov 14 03:12:29 UTC 2008
/
2009-02-03 13:00:28,725 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 0 time(s).
2009-02-03 13:00:29,726 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 1 time(s).
2009-02-03 13:00:30,727 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 2 time(s).
2009-02-03 13:00:31,728 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 3 time(s).
2009-02-03 13:00:32,729 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 4 time(s).
2009-02-03 13:00:33,730 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 5 time(s).
2009-02-03 13:00:34,731 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 6 time(s).
2009-02-03 13:00:35,732 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 7 time(s).
2009-02-03 13:00:36,733 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 8 time(s).
2009-02-03 13:00:37,734 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 9 time(s).
2009-02-03 13:00:37,738 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call to master/172.16.0.46:54310 failed on local exception: No route to host
        at org.apache.hadoop.ipc.Client.call(Client.java:699)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
        at $Proxy4.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:306)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:343)
        at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:288)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:258)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:205)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1199)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1154)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1162)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1284)
Caused by: java.net.NoRouteToHostException: No route to host
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
        at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:299)
        at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:772)
        at org.apache.hadoop.ipc.Client.call(Client.java:685)
        ... 12 more
2009-02-03 13:00:37,739 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/
SHUTDOWN_MSG: Shutting down DataNode at slave/172.16.0.32
/

Also, the Pseudo distributed operation is working on both the machines. And I am able to ssh from 'master to master'
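One way to tell a firewall problem from a Hadoop configuration problem is to try the NameNode port directly from the slave; "No route to host" at the TCP level usually points at a firewall or routing rule rather than at Hadoop itself. A minimal probe, a sketch only, with the master address and port copied from the log above:

    import java.net.InetSocketAddress;
    import java.net.Socket;

    public class PortProbe {
        public static void main(String[] args) throws Exception {
            Socket s = new Socket();
            // master address and port taken from the datanode log above
            s.connect(new InetSocketAddress("172.16.0.46", 54310), 5000);  // 5 s timeout
            System.out.println("TCP connect to 172.16.0.46:54310 OK");
            s.close();
        }
    }

If the probe fails from the slave but the same port answers locally on the master, the firewall (or a NameNode bound only to 127.0.0.1) is the problem; whichever other ports the cluster's configuration assigns to the DataNodes, JobTracker, and web UIs would then need to be opened the same way.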
Re: Chukwa documentation
Howdy. You do not need torque. It's not even helpful, as far as I know. You don't need a database either, but if you don't have one, you'd probably need to do a bit more work to analyze the collected data in HDFS. If you were going to be using MapReduce for the analysis anyway, that's probably a non-issue for you. We're working on documentation, but it's sort of chasing a moving target, since the Chukwa codebase and configuration interfaces are still in flux.

--Ari

On Tue, Feb 3, 2009 at 5:00 PM, wrote:
> Hi Everybody,
>
> I don't know if there is a mailing list for Chukwa, so I apologize in advance if this is not the right place to ask my questions.
>
> I have the following questions and comments:
>
> The configuration of the collector and the agent was simple. However, there are other features that are not documented at all, for example:
> - torque (Do I have to install torque before? Yes? No? And why?)
> - database (Do I have to have a DB?)
> - what is queueinfo.properties, and what kind of information does it provide me?
> - and there is more stuff that I need to dig into the code to understand.
> Could somebody update the Chukwa documentation?

--
Ari Rabkin asrab...@gmail.com
UC Berkeley Computer Science Department
Re: How to use DBInputFormat?
Adding a semicolon gives me the error "ORA-00911: Invalid character"

Amandeep

Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz

On Wed, Feb 4, 2009 at 6:46 AM, Rasit OZDAS wrote:
> Amandeep,
> "SQL command not properly ended"
> I get this error whenever I forget the semicolon at the end.
> I know, it doesn't make sense, but I recommend giving it a try.
>
> Rasit
>
> 2009/2/4 Amandeep Khurana :
> > The same query is working if I write a simple JDBC client and query the database. So I'm probably doing something wrong in the connection settings. But the error looks to be on the query side more than the connection side.
> >
> > Amandeep
> >
> > Amandeep Khurana
> > Computer Science Graduate Student
> > University of California, Santa Cruz
> >
> > On Tue, Feb 3, 2009 at 7:25 PM, Amandeep Khurana wrote:
> >> Thanks Kevin
> >>
> >> I couldn't get it to work. Here's the error I get:
> >>
> >> bin/hadoop jar ~/dbload.jar LoadTable1
> >> 09/02/03 19:21:17 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
> >> 09/02/03 19:21:20 INFO mapred.JobClient: Running job: job_local_0001
> >> 09/02/03 19:21:21 INFO mapred.JobClient: map 0% reduce 0%
> >> 09/02/03 19:21:22 INFO mapred.MapTask: numReduceTasks: 0
> >> 09/02/03 19:21:24 WARN mapred.LocalJobRunner: job_local_0001
> >> java.io.IOException: ORA-00933: SQL command not properly ended
> >>
> >>     at org.apache.hadoop.mapred.lib.db.DBInputFormat.getRecordReader(DBInputFormat.java:289)
> >>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:321)
> >>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)
> >> java.io.IOException: Job failed!
> >>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1217)
> >>     at LoadTable1.run(LoadTable1.java:130)
> >>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >>     at LoadTable1.main(LoadTable1.java:107)
> >>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>     at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> >>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
> >>     at java.lang.reflect.Method.invoke(Unknown Source)
> >>     at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
> >>     at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
> >>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >>     at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
> >>
> >> Exception closing file /user/amkhuran/contract_table/_temporary/_attempt_local_0001_m_00_0/part-0
> >> java.io.IOException: Filesystem closed
> >>     at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:198)
> >>     at org.apache.hadoop.hdfs.DFSClient.access$600(DFSClient.java:65)
> >>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3084)
> >>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3053)
> >>     at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.close(DFSClient.java:942)
> >>     at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:210)
> >>     at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:243)
> >>     at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1413)
> >>     at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:236)
> >>     at org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:221)
> >>
> >> Here's my code:
> >>
> >> public class LoadTable1 extends Configured implements Tool {
> >>
> >>     // data destination on hdfs
> >>     private static final String CONTRACT_OUTPUT_PATH = "contract_table";
> >>
> >>     // The JDBC connection URL and driver implementation class
> >>     private static final String CONNECT_URL = "jdbc:oracle:thin:@dbhost:1521:PSEDEV";
> >>     private static final String DB_USER = "user";
> >>     private static final String DB_PWD = "pass";
> >>     private static final String DATABASE_DRIVER_CLASS = "oracle.jdbc.driver.OracleDriver";
> >>
> >>     private static final String CONTRACT_INPUT_TABLE = "OSE_EPR_CONTRACT";
> >>
> >>     private static final String [] CONTRACT_INPUT_TABLE_FIELDS = { "PORTFOLIO_NUMBER", "CONTRACT_NUMBER"};
> >>
> >>     private static final String ORDER_CONTRACT_BY_COL = "CONTRACT_NUMBER";
> >>
> >>     static class ose_epr_contract implements Writable, DBWritable {
> >>
> >>         String CONTRACT_NUMBER;
> >>
> >>         public void readFields(DataInput in) throws IOException {
> >>
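Since the class in the message above is cut off, here is a hedged sketch of the full shape such a record class needs for DBInputFormat: it has to implement both Writable (Hadoop serialization between tasks) and DBWritable (the JDBC side). The field names echo the two-column example above, but the column types and class name are guesses for illustration only.

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.mapred.lib.db.DBWritable;

    public class ContractRecord implements Writable, DBWritable {
        private long portfolioNumber;     // assumed numeric; adjust to the real column type
        private String contractNumber;

        // DBWritable: how a row is read from the JDBC ResultSet
        // (column order matches the fieldNames passed to DBInputFormat.setInput)
        public void readFields(ResultSet rs) throws SQLException {
            portfolioNumber = rs.getLong(1);
            contractNumber = rs.getString(2);
        }
        // DBWritable: only exercised when writing back to a database via DBOutputFormat
        public void write(PreparedStatement ps) throws SQLException {
            ps.setLong(1, portfolioNumber);
            ps.setString(2, contractNumber);
        }
        // Writable: how the record travels through the MapReduce framework
        public void readFields(DataInput in) throws IOException {
            portfolioNumber = in.readLong();
            contractNumber = in.readUTF();
        }
        public void write(DataOutput out) throws IOException {
            out.writeLong(portfolioNumber);
            out.writeUTF(contractNumber);
        }
        public String toString() {
            return portfolioNumber + "\t" + contractNumber;
        }
    }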
Re: Batch processing with Hadoop -- does HDFS scale for parallel reads?
Sounds overly complicated. Complicated usually leads to mistakes :) What about just having a single cluster and only running the tasktrackers on the fast CPUs? No messy cross-cluster transferring.

Brian

On Feb 4, 2009, at 12:46 PM, TCK wrote:

> Thanks, Brian. This sounds encouraging for us. What are the advantages/disadvantages of keeping a persistent storage (HD/K)FS cluster separate from a processing Hadoop+(HD/K)FS cluster?
> [...]
Re: Batch processing with Hadoop -- does HDFS scale for parallel reads?
Thanks, Brian. This sounds encouraging for us. What are the advantages/disadvantages of keeping a persistent storage (HD/K)FS cluster separate from a processing Hadoop+(HD/K)FS cluster ? The advantage I can think of is that a permanent storage cluster has different requirements from a map-reduce processing cluster -- the permanent storage cluster would need faster, bigger hard disks, and would need to grow as the total volume of all collected logs grows, whereas the processing cluster would need fast CPUs and would only need to grow with the rate of incoming data. So it seems to make sense to me to copy a piece of data from the permanent storage cluster to the processing cluster only when it needs to be processed. Is my line of thinking reasonable? How would this compare to running the map-reduce processing on same cluster as the data is stored in? Which approach is used by most people? Best Regards, TCK --- On Wed, 2/4/09, Brian Bockelman wrote: From: Brian Bockelman Subject: Re: Batch processing with Hadoop -- does HDFS scale for parallel reads? To: core-user@hadoop.apache.org Date: Wednesday, February 4, 2009, 1:06 PM Hey TCK, We use HDFS+FUSE solely as a storage solution for a application which doesn't understand MapReduce. We've scaled this solution to around 80Gbps. For 300 processes reading from the same file, we get about 20Gbps. Do consider your data retention policies -- I would say that Hadoop as a storage system is thus far about 99% reliable for storage and is not a backup solution. If you're scared of getting more than 1% of your logs lost, have a good backup solution. I would also add that when you are learning your operational staff's abilities, expect even more data loss. As you gain experience, data loss goes down. I don't believe we've lost a single block in the last month, but it took us 2-3 months of 1%-level losses to get here. Brian On Feb 4, 2009, at 11:51 AM, TCK wrote: > > Hey guys, > > We have been using Hadoop to do batch processing of logs. The logs get written and stored on a NAS. Our Hadoop cluster periodically copies a batch of new logs from the NAS, via NFS into Hadoop's HDFS, processes them, and copies the output back to the NAS. The HDFS is cleaned up at the end of each batch (ie, everything in it is deleted). > > The problem is that reads off the NAS via NFS don't scale even if we try to scale the copying process by adding more threads to read in parallel. > > If we instead stored the log files on an HDFS cluster (instead of NAS), it seems like the reads would scale since the data can be read from multiple data nodes at the same time without any contention (except network IO, which shouldn't be a problem). > > I would appreciate if anyone could share any similar experience they have had with doing parallel reads from a storage HDFS. > > Also is it a good idea to have a separate HDFS for storage vs for doing the batch processing ? > > Best Regards, > TCK > > > >
Re: HDFS Namenode Heap Size woes
Brian, Jason, Thanks again for your help. Just to close the thread, while following your suggestions I found I had an incredibly large number of files on my data nodes that were being marked for invalidation at startup. I believe they were left behind from an old mass-delete that was followed by a shutdown before the deletes were performed. I've cleaned out those files and we're humming along with <1GB heap size. Thanks, Sean On Sun, Feb 1, 2009 at 10:48 PM, jason hadoop wrote: > If you set up your namenode for remote debugging, you could attach with eclipse or the debugger of your choice. > > Look at the objects in org.apache.hadoop.hdfs.server.namenode.FSNamesystem > private UnderReplicatedBlocks neededReplications = new UnderReplicatedBlocks(); > private PendingReplicationBlocks pendingReplications; > > // > // Keeps a Collection for every named machine containing > // blocks that have recently been invalidated and are thought to live > // on the machine in question. > // Mapping: StorageID -> ArrayList<Block> > // > private Map<String, Collection<Block>> recentInvalidateSets = > new TreeMap<String, Collection<Block>>(); > > // > // Keeps a TreeSet for every named node. Each treeset contains > // a list of the blocks that are "extra" at that location. We'll > // eventually remove these extras. > // Mapping: StorageID -> TreeSet<Block> > // > Map<String, Collection<Block>> excessReplicateMap = > new TreeMap<String, Collection<Block>>(); > > Much of this is run out of a thread ReplicationMonitor. > > In our case we had datanodes with 2 million blocks dropping off and on again, and this was trashing these queues with the 2 million blocks on the datanodes, re-replicating the blocks and then invalidating them all when the datanode came back. > > > On Sun, Feb 1, 2009 at 7:03 PM, Brian Bockelman wrote: > > > Hey Sean, > > > > I use JMX monitoring -- which allows me to trigger GC via jconsole. > > There's decent documentation out there on making it work, but you'd have to restart the namenode to do it ... let the list know if you can't figure it out. > > > > Brian > > > > > > On Feb 1, 2009, at 8:59 PM, Sean Knapp wrote: > > > > Brian, > >> Thanks for jumping in as well. Is there a recommended way of manually triggering GC? > >> > >> Thanks, > >> Sean > >> > >> On Sun, Feb 1, 2009 at 6:06 PM, Brian Bockelman wrote: > >> > >> Hey Sean, > >>> > >>> Dumb question: how much memory is used after a garbage collection cycle? > >>> > >>> Look at the graph "jvm.metrics.memHeapUsedM": > >>> > >>> http://rcf.unl.edu/ganglia/?m=network_report&r=hour&s=descending&c=red&h=hadoop-name&sh=1&hc=4&z=small > >>> > >>> If you tell the JVM it has 16GB of memory to play with, it will often use a significant portion of that before it does a thorough GC. In our site, it actually only needs ~ 500MB, but sometimes it will hit 1GB before GC is triggered. One of the vagaries of Java, eh? > >>> > >>> Trigger a GC and see how much is actually used. > >>> > >>> Brian > >>> > >>> > >>> On Feb 1, 2009, at 6:11 PM, Sean Knapp wrote: > >>> > >>> Jason, > >>> > Thanks for the response. By falling out, do you mean a longer time since last contact (100s+), or fully timed out where it is dropped into dead nodes? The former happens fairly often, the latter only under serious load but not in the last day. Also, my namenode is now up to 10GB with less than 700k files after some additional archiving.
> Thanks, > Sean > > On Sun, Feb 1, 2009 at 4:00 PM, jason hadoop wrote: > > If your datanodes are pausing and falling out of the cluster you will get a large workload for the namenode of blocks to replicate and when the paused datanode comes back, a large workload of blocks to delete. These lists are stored in memory on the namenode. > > The startup messages lead me to wonder if your datanodes are periodically pausing or are otherwise dropping in and out of the cluster. > > > > On Sat, Jan 31, 2009 at 2:20 PM, Sean Knapp wrote: > >> I'm running 0.19.0 on a 10 node cluster (8 core, 16GB RAM, 4x1.5TB). The current status of my FS is approximately 1 million files and directories, 950k blocks, and heap size of 7GB (16GB reserved). Average block replication is 3.8. I'm concerned that the heap size is steadily climbing... a 7GB heap is substantially higher per file than I have on a similar 0.18.2 cluster, which has closer to a 1GB heap. > >> My typical usage model is 1) write a number of small files into HDFS (tens or hundreds of thousands at a time), 2) archive those files, 3) delete the originals. I've tried dropping the replication factor of the _index and _masterindex fi
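For anyone who wants the JMX setup Brian mentions above, a minimal sketch is to add something like the following to conf/hadoop-env.sh and restart the namenode (the port is arbitrary, and disabling authentication/SSL like this is only sensible on a trusted network):

export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.port=8004 \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.ssl=false $HADOOP_NAMENODE_OPTS"

After that, pointing jconsole at namenode-host:8004 gives a "Perform GC" button on the Memory tab, so you can see how much heap is actually live after a full collection rather than how much the JVM has simply let accumulate.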
Re: HADOOP-2536 supports Oracle too?
Ok. I'm not sure if I got it correct. Are you saying I should test the statement that Hadoop creates directly with the database? Amandeep Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz On Wed, Feb 4, 2009 at 7:13 AM, Enis Soztutar wrote: > Hadoop-2536 connects to the db via JDBC, so in theory it should work with proper JDBC drivers. > It has been tested against MySQL, Hsqldb, and PostgreSQL, but not Oracle. > > To answer your earlier question, the actual SQL statements might not be recognized by Oracle, so I suggest the best way to test this is to insert print statements, and run the actual SQL statements against Oracle to see if the syntax is accepted. > > We would appreciate it if you publish your results. > > Enis > > > Amandeep Khurana wrote: > >> Does the patch HADOOP-2536 support connecting to Oracle databases as well? >> Or is it just limited to MySQL? >> >> Amandeep >> >> >> Amandeep Khurana >> Computer Science Graduate Student >> University of California, Santa Cruz
Re: Re: Regarding "Hadoop multi cluster" set-up
Hi, I will have to check. I can do that tomorrow in college. But if that is the case what should i do? Should i change the port number and try again? Shefali On Wed, 04 Feb 2009 S D wrote : >Shefali, > >Is your firewall blocking port 54310 on the master? > >John > >On Wed, Feb 4, 2009 at 12:34 PM, shefali pawar wrote: > > > Hi, > > > > I am trying to set-up a two node cluster using Hadoop0.19.0, with 1 > > master(which should also work as a slave) and 1 slave node. > > > > But while running bin/start-dfs.sh the datanode is not starting on the > > slave. I had read the previous mails on the list, but nothing seems to be > > working in this case. I am getting the following error in the > > hadoop-root-datanode-slave log file while running the command > > bin/start-dfs.sh => > > > > 2009-02-03 13:00:27,516 INFO > > org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG: > > / > > STARTUP_MSG: Starting DataNode > > STARTUP_MSG: host = slave/172.16.0.32 > > STARTUP_MSG: args = [] > > STARTUP_MSG: version = 0.19.0 > > STARTUP_MSG: build = > > https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r > > 713890; compiled by 'ndaley' on Fri Nov 14 03:12:29 UTC 2008 > > / > > 2009-02-03 13:00:28,725 INFO org.apache.hadoop.ipc.Client: Retrying connect > > to server: master/172.16.0.46:54310. Already tried 0 time(s). > > 2009-02-03 13:00:29,726 INFO org.apache.hadoop.ipc.Client: Retrying connect > > to server: master/172.16.0.46:54310. Already tried 1 time(s). > > 2009-02-03 13:00:30,727 INFO org.apache.hadoop.ipc.Client: Retrying connect > > to server: master/172.16.0.46:54310. Already tried 2 time(s). > > 2009-02-03 13:00:31,728 INFO org.apache.hadoop.ipc.Client: Retrying connect > > to server: master/172.16.0.46:54310. Already tried 3 time(s). > > 2009-02-03 13:00:32,729 INFO org.apache.hadoop.ipc.Client: Retrying connect > > to server: master/172.16.0.46:54310. Already tried 4 time(s). > > 2009-02-03 13:00:33,730 INFO org.apache.hadoop.ipc.Client: Retrying connect > > to server: master/172.16.0.46:54310. Already tried 5 time(s). > > 2009-02-03 13:00:34,731 INFO org.apache.hadoop.ipc.Client: Retrying connect > > to server: master/172.16.0.46:54310. Already tried 6 time(s). > > 2009-02-03 13:00:35,732 INFO org.apache.hadoop.ipc.Client: Retrying connect > > to server: master/172.16.0.46:54310. Already tried 7 time(s). > > 2009-02-03 13:00:36,733 INFO org.apache.hadoop.ipc.Client: Retrying connect > > to server: master/172.16.0.46:54310. Already tried 8 time(s). > > 2009-02-03 13:00:37,734 INFO org.apache.hadoop.ipc.Client: Retrying connect > > to server: master/172.16.0.46:54310. Already tried 9 time(s). 
> > 2009-02-03 13:00:37,738 ERROR > > org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call > > to master/172.16.0.46:54310 failed on local exception: No route to host > >at org.apache.hadoop.ipc.Client.call(Client.java:699) > >at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216) > >at $Proxy4.getProtocolVersion(Unknown Source) > >at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319) > >at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:306) > >at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:343) > >at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:288) > >at > > org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:258) > >at > > org.apache.hadoop.hdfs.server.datanode.DataNode.(DataNode.java:205) > >at > > org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1199) > >at > > org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1154) > >at > > org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1162) > >at > > org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1284) > > Caused by: java.net.NoRouteToHostException: No route to host > >at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) > >at > > sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574) > >at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100) > >at > > org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:299) > >at > > org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176) > >at org.apache.hadoop.ipc.Client.getConnection(Client.java:772) > >at org.apache.hadoop.ipc.Client.call(Client.java:685) > >... 12 more > > > > 2009-02-03 13:00:37,739 INFO > > org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG: > > / > > SHUTDOWN_MSG: Shutting down DataNode at slave/172.16.0.32 > > / > > > > > > Also, the Pseudo distributed operation is
Re: Batch processing with Hadoop -- does HDFS scale for parallel reads?
Hey TCK, We use HDFS+FUSE solely as a storage solution for an application which doesn't understand MapReduce. We've scaled this solution to around 80Gbps. For 300 processes reading from the same file, we get about 20Gbps. Do consider your data retention policies -- I would say that Hadoop as a storage system is thus far about 99% reliable for storage and is not a backup solution. If you're scared of getting more than 1% of your logs lost, have a good backup solution. I would also add that when you are learning your operational staff's abilities, expect even more data loss. As you gain experience, data loss goes down. I don't believe we've lost a single block in the last month, but it took us 2-3 months of 1%-level losses to get here. Brian On Feb 4, 2009, at 11:51 AM, TCK wrote: Hey guys, We have been using Hadoop to do batch processing of logs. The logs get written and stored on a NAS. Our Hadoop cluster periodically copies a batch of new logs from the NAS, via NFS into Hadoop's HDFS, processes them, and copies the output back to the NAS. The HDFS is cleaned up at the end of each batch (ie, everything in it is deleted). The problem is that reads off the NAS via NFS don't scale even if we try to scale the copying process by adding more threads to read in parallel. If we instead stored the log files on an HDFS cluster (instead of NAS), it seems like the reads would scale since the data can be read from multiple data nodes at the same time without any contention (except network IO, which shouldn't be a problem). I would appreciate if anyone could share any similar experience they have had with doing parallel reads from a storage HDFS. Also is it a good idea to have a separate HDFS for storage vs for doing the batch processing ? Best Regards, TCK
Batch processing with Hadoop -- does HDFS scale for parallel reads?
Hey guys, We have been using Hadoop to do batch processing of logs. The logs get written and stored on a NAS. Our Hadoop cluster periodically copies a batch of new logs from the NAS, via NFS into Hadoop's HDFS, processes them, and copies the output back to the NAS. The HDFS is cleaned up at the end of each batch (ie, everything in it is deleted). The problem is that reads off the NAS via NFS don't scale even if we try to scale the copying process by adding more threads to read in parallel. If we instead stored the log files on an HDFS cluster (instead of NAS), it seems like the reads would scale since the data can be read from multiple data nodes at the same time without any contention (except network IO, which shouldn't be a problem). I would appreciate if anyone could share any similar experience they have had with doing parallel reads from a storage HDFS. Also is it a good idea to have a separate HDFS for storage vs for doing the batch processing ? Best Regards, TCK
Re: Regarding "Hadoop multi cluster" set-up
Shefali, Is your firewall blocking port 54310 on the master? John On Wed, Feb 4, 2009 at 12:34 PM, shefali pawar wrote: > Hi, > > I am trying to set-up a two node cluster using Hadoop0.19.0, with 1 > master(which should also work as a slave) and 1 slave node. > > But while running bin/start-dfs.sh the datanode is not starting on the > slave. I had read the previous mails on the list, but nothing seems to be > working in this case. I am getting the following error in the > hadoop-root-datanode-slave log file while running the command > bin/start-dfs.sh => > > 2009-02-03 13:00:27,516 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG: > / > STARTUP_MSG: Starting DataNode > STARTUP_MSG: host = slave/172.16.0.32 > STARTUP_MSG: args = [] > STARTUP_MSG: version = 0.19.0 > STARTUP_MSG: build = > https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r > 713890; compiled by 'ndaley' on Fri Nov 14 03:12:29 UTC 2008 > / > 2009-02-03 13:00:28,725 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: master/172.16.0.46:54310. Already tried 0 time(s). > 2009-02-03 13:00:29,726 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: master/172.16.0.46:54310. Already tried 1 time(s). > 2009-02-03 13:00:30,727 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: master/172.16.0.46:54310. Already tried 2 time(s). > 2009-02-03 13:00:31,728 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: master/172.16.0.46:54310. Already tried 3 time(s). > 2009-02-03 13:00:32,729 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: master/172.16.0.46:54310. Already tried 4 time(s). > 2009-02-03 13:00:33,730 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: master/172.16.0.46:54310. Already tried 5 time(s). > 2009-02-03 13:00:34,731 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: master/172.16.0.46:54310. Already tried 6 time(s). > 2009-02-03 13:00:35,732 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: master/172.16.0.46:54310. Already tried 7 time(s). > 2009-02-03 13:00:36,733 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: master/172.16.0.46:54310. Already tried 8 time(s). > 2009-02-03 13:00:37,734 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: master/172.16.0.46:54310. Already tried 9 time(s). 
> 2009-02-03 13:00:37,738 ERROR > org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call > to master/172.16.0.46:54310 failed on local exception: No route to host >at org.apache.hadoop.ipc.Client.call(Client.java:699) >at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216) >at $Proxy4.getProtocolVersion(Unknown Source) >at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319) >at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:306) >at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:343) >at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:288) >at > org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:258) >at > org.apache.hadoop.hdfs.server.datanode.DataNode.(DataNode.java:205) >at > org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1199) >at > org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1154) >at > org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1162) >at > org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1284) > Caused by: java.net.NoRouteToHostException: No route to host >at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) >at > sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574) >at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100) >at > org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:299) >at > org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176) >at org.apache.hadoop.ipc.Client.getConnection(Client.java:772) >at org.apache.hadoop.ipc.Client.call(Client.java:685) >... 12 more > > 2009-02-03 13:00:37,739 INFO > org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG: > / > SHUTDOWN_MSG: Shutting down DataNode at slave/172.16.0.32 > / > > > Also, the Pseudo distributed operation is working on both the machines. And > i am able to ssh from 'master to master' and 'master to slave' via a > password-less ssh login. I do not think there is any problem with the > network because cross pinging is working fine. > > I am working on Linux (Fedora 8) > > The following is the configuration which i am using > > On master and slave, /conf/masters looks like this:
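A quick way to check what John is asking about -- a sketch only, and the exact firewall commands vary by distribution (these are the usual ones on Fedora): confirm the namenode is really listening on 54310 on the master, test reachability from the slave, and look for a REJECT rule, which is the classic cause of "No route to host".

# on the master
netstat -tlnp | grep 54310
/sbin/iptables -L -n

# from the slave
telnet master 54310

# if iptables is the culprit, open port 54310 (and 54311 if you use the common
# tutorial setup for the jobtracker), or for a quick test only:
/etc/init.d/iptables stop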
Regarding "Hadoop multi cluster" set-up
Hi, I am trying to set-up a two node cluster using Hadoop0.19.0, with 1 master(which should also work as a slave) and 1 slave node. But while running bin/start-dfs.sh the datanode is not starting on the slave. I had read the previous mails on the list, but nothing seems to be working in this case. I am getting the following error in the hadoop-root-datanode-slave log file while running the command bin/start-dfs.sh => 2009-02-03 13:00:27,516 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG: / STARTUP_MSG: Starting DataNode STARTUP_MSG: host = slave/172.16.0.32 STARTUP_MSG: args = [] STARTUP_MSG: version = 0.19.0 STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r 713890; compiled by 'ndaley' on Fri Nov 14 03:12:29 UTC 2008 / 2009-02-03 13:00:28,725 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 0 time(s). 2009-02-03 13:00:29,726 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 1 time(s). 2009-02-03 13:00:30,727 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 2 time(s). 2009-02-03 13:00:31,728 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 3 time(s). 2009-02-03 13:00:32,729 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 4 time(s). 2009-02-03 13:00:33,730 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 5 time(s). 2009-02-03 13:00:34,731 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 6 time(s). 2009-02-03 13:00:35,732 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 7 time(s). 2009-02-03 13:00:36,733 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 8 time(s). 2009-02-03 13:00:37,734 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/172.16.0.46:54310. Already tried 9 time(s). 
2009-02-03 13:00:37,738 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call to master/172.16.0.46:54310 failed on local exception: No route to host at org.apache.hadoop.ipc.Client.call(Client.java:699) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216) at $Proxy4.getProtocolVersion(Unknown Source) at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319) at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:306) at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:343) at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:288) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:258) at org.apache.hadoop.hdfs.server.datanode.DataNode.(DataNode.java:205) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1199) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1154) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1162) at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1284) Caused by: java.net.NoRouteToHostException: No route to host at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574) at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:299) at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176) at org.apache.hadoop.ipc.Client.getConnection(Client.java:772) at org.apache.hadoop.ipc.Client.call(Client.java:685) ... 12 more 2009-02-03 13:00:37,739 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG: / SHUTDOWN_MSG: Shutting down DataNode at slave/172.16.0.32 / Also, the Pseudo distributed operation is working on both the machines. And i am able to ssh from 'master to master' and 'master to slave' via a password-less ssh login. I do not think there is any problem with the network because cross pinging is working fine. I am working on Linux (Fedora 8) The following is the configuration which i am using On master and slave, /conf/masters looks like this: master On master and slave, /conf/slaves looks like this: master slave On both the machines conf/hadoop-site.xml looks like this fs.default.name hdfs://master:54310 The name of the default file system. A URI whose scheme an
Control over max map/reduce tasks per job
All, I have a few relatively small clusters (5-20 nodes) and am having trouble keeping them loaded with my MR jobs. The primary issue is that I have different jobs that have drastically different patterns. I have jobs that read/write to/from HBase or Hadoop with minimal logic (network throughput bound or IO bound), others that perform crawling (network latency bound), and one huge parsing streaming job (very CPU bound, each task eats a core). I'd like to launch very large numbers of tasks for network latency bound jobs; however, the large CPU-bound job means I have to keep the max maps allowed per node low enough so as not to starve the Datanode and Regionserver. I'm an HBase dev but not familiar enough with Hadoop MR code to even know what would be involved with implementing this. However, in talking with other users, it seems like this would be a well-received option. I wanted to ping the list before filing an issue because it seems like someone may have thought about this in the past. Thanks. Jonathan Gray
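For reference, the knobs that exist in this era are (as far as I know) per-tasktracker rather than per-job -- they are set cluster-wide in hadoop-site.xml and only picked up when the tasktracker starts -- which is exactly why the worst-case CPU-bound job ends up dictating the limit for every other job. A sketch, with illustrative values:

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>4</value>
  <description>Maximum number of map tasks run simultaneously by one tasktracker.</description>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value>
  <description>Maximum number of reduce tasks run simultaneously by one tasktracker.</description>
</property>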
Re: HDD benchmark/checking tool
On Tue, Feb 3, 2009 at 8:53 PM, Dmitry Pushkarev wrote: > Recently I have had a number of drive failures that slowed down processes a lot until they were discovered. Is there any easy way or tool to check HDD performance and see if there are any IO errors? > > Currently I wrote a simple script that looks at /var/log/messages and greps everything abnormal for /dev/sdaX. But if you have a better solution I'd appreciate it if you share it. If you have any hardware RAIDs you'd like to monitor/manage, chances are good that you'd want to use Einarc to access them: http://www.inquisitor.ru/doc/einarc/ - in fact, it won't hurt even if you use just a bunch of HDDs or software RAIDs :) -- WBR, Mikhail Yakshin
Re: Hadoop 0.19, Cascading 1.0 and MultipleOutputs problem
On Wed, Feb 4, 2009 at 10:07 AM, Alejandro Abdelnur wrote: > Mikhail, > > You are right, please open a Jira on this. > > Alejandro Done: https://issues.apache.org/jira/browse/HADOOP-5167 -- WBR, Mikhail Yakshin
Re: decommissioned node showing up as dead node in web-based interface to namenode (dfshealth.jsp)
I have been looking into this some more by looking at the output of dfsadmin -report during the decommissioning process. After a node has been decommissioned, dfsadmin -report shows that the node is in the Decommissioned state. The web interface dfshealth.jsp shows it as a dead node. After I removed the decommissioned node from the exclude file and run the refreshNodes command, the web interface continues to show it as a dead node but dfsadmin -report shows the node to be in service. After I restart HDFS, dfsadmin -report shows the correct information again. If I restart HDFS leaving the decommissioned node in the exclude file, the web interface shows it as a dead node and dfsadmin -report shows it to be in service. But after I remove it from the exclude file and run the refreshNodes command, both the web interface and dfsadmin -report show the correct information. It looks to me like I should only remove the decommissioned node from the exclude file after restarting HDFS. I would still like to see the web interface report any decommissioned node as decommissioned rather than dead, as is the case with dfsadmin -report. I am willing to work on a patch for this. Before I start, does anyone know if this is already in the works? Bill On Mon, Feb 2, 2009 at 5:00 PM, Bill Au wrote: > It looks like the behavior is the same with 0.18.2 and 0.19.0. Even though I removed the decommissioned node from the exclude file and ran the refreshNode command, the decommissioned node still shows up as a dead node. What I did notice is that if I leave the decommissioned node in the exclude file and restart HDFS, the node will show up as a dead node after restart. But then if I remove it from the exclude file and run the refreshNode command, it will disappear from the status page (dfshealth.jsp). > > So it looks like I will have to stop and start the entire cluster in order to get what I want. > > Bill > > > On Thu, Jan 29, 2009 at 5:40 PM, Bill Au wrote: > >> Not sure why but this does not work for me. I am running 0.18.2. I ran hadoop dfsadmin -refreshNodes after removing the decommissioned node from the exclude file. It still shows up as a dead node. I also removed it from the slaves file and ran the refresh nodes command again. It still shows up as a dead node after that. >> >> I am going to upgrade to 0.19.0 to see if it makes any difference. >> >> Bill >> >> >> On Tue, Jan 27, 2009 at 7:01 PM, paul wrote: >>> >>> Once the nodes are listed as dead, if you still have the host names in your conf/exclude file, remove the entries and then run hadoop dfsadmin -refreshNodes. >>> >>> This works for us on our cluster. >>> >>> -paul >>> >>> On Tue, Jan 27, 2009 at 5:08 PM, Bill Au wrote: >>> >>> > I was able to decommission a datanode successfully without having to stop my cluster. But I noticed that after a node has been decommissioned, it shows up as a dead node in the web-based interface to the namenode (ie dfshealth.jsp). My cluster is relatively small and losing a datanode will have a performance impact. So I have a need to monitor the health of my cluster and take steps to revive any dead datanode in a timely fashion. So is there any way to altogether "get rid of" any decommissioned datanode from the web interface of the namenode? Or is there a better way to monitor the health of the cluster? >>> > >>> > Bill
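For readers following the thread, the decommission sequence being discussed is roughly the following. This is a sketch only; the exclude-file path and hostname are placeholders -- use whatever file dfs.hosts.exclude points at in your configuration.

# 1. add the datanode's hostname to the file named by dfs.hosts.exclude
echo node-to-retire >> /usr/local/hadoop/conf/exclude
# 2. tell the namenode to re-read the include/exclude lists
bin/hadoop dfsadmin -refreshNodes
# 3. wait until the node is reported as Decommissioned
bin/hadoop dfsadmin -report
# 4. only then take the entry back out of the exclude file and refresh again
#    (per Bill's observation above, doing this without a namenode restart can
#    still leave dfshealth.jsp showing the node as dead)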
Re: Hadoop FS Shell - command overwrite capability
Rasit, Thanks for this comment. I do need console-based control and will consider your suggestion of using a jar file. Thanks, John On Wed, Feb 4, 2009 at 10:17 AM, Rasit OZDAS wrote: > John, I also couldn't find a way from console, > Maybe you already know and don't prefer to use, but API solves this > problem. > FileSystem.copyFromLocalFile(boolean delSrc, boolean overwrite, Path > src, Path dst) > > If you have to use console, long solution, but you can create a jar > for this, and call it just like hadoop calls FileSystem class in > "hadoop" file in bin directory. > > I think File System API also needs some improvement. I wonder if it's > considered by head developers. > > Hope this helps, > Rasit > > 2009/2/4 S D : > > I'm using the Hadoop FS commands to move files from my local machine into > > the Hadoop dfs. I'd like a way to force a write to the dfs even if a file > of > > the same name exists. Ideally I'd like to use a "-force" switch or some > > such; e.g., > >hadoop dfs -copyFromLocal -force adirectory s3n://wholeinthebucket/ > > > > Is there a way to do this or does anyone know if this is in the future > > Hadoop plans? > > > > Thanks > > John SD > > > > > > -- > M. Raşit ÖZDAŞ >
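For anyone who wants the API route Rasit describes, a minimal sketch of such a jar follows; the class name and argument handling are invented for illustration, and the copyFromLocalFile overload is the one quoted above.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper: run as "hadoop jar forceput.jar ForcePut <localsrc> <dst>"
public class ForcePut {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();   // picks up hadoop-site.xml
    Path dst = new Path(args[1]);
    // Resolves the right filesystem for the URI (hdfs://, s3n://, ...),
    // given the corresponding configuration/credentials.
    FileSystem fs = dst.getFileSystem(conf);
    // delSrc = false (keep the local copy), overwrite = true (clobber existing dst)
    fs.copyFromLocalFile(false, true, new Path(args[0]), dst);
    fs.close();
  }
}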
Re: Value-Only Reduce Output
For your reduce, the parameter is stream.reduce.input.field.separator, if you are supplying a reduce class, and I believe the output format is TextOutputFormat... It looks like you have tried the map parameter for the separator, not the reduce parameter. From 0.19.0 PipeReducer: configure: reduceOutFieldSeparator = job_.get("stream.reduce.output.field.separator", "\t").getBytes("UTF-8"); reduceInputFieldSeparator = job_.get("stream.reduce.input.field.separator", "\t").getBytes("UTF-8"); this.numOfReduceOutputKeyFields = job_.getInt("stream.num.reduce.output.key.fields", 1); getInputSeparator: byte[] getInputSeparator() { return reduceInputFieldSeparator; } reduce: write(key); clientOut_.write(getInputSeparator()); write(val); clientOut_.write('\n'); } else { // "identity reduce" output.collect(key, val); } On Wed, Feb 4, 2009 at 6:15 AM, Rasit OZDAS wrote: > I tried it myself, it doesn't work. > I've also tried stream.map.output.field.separator and map.output.key.field.separator parameters for this purpose, they don't work either. When hadoop sees an empty string, it takes the default tab character instead. > > Rasit > > 2009/2/4 jason hadoop > > > Ooops, you are using streaming, and I am not familiar. > > As a terrible hack, you could set mapred.textoutputformat.separator to the empty string, in your configuration. > > > > On Tue, Feb 3, 2009 at 9:26 PM, jason hadoop wrote: > > > > > If you are using the standard TextOutputFormat, and the output collector is passed a null for the value, there will not be a trailing tab character added to the output line. > > > > > > output.collect( key, null ); > > > Will give you the behavior you are looking for if your configuration is as I expect. > > > > > > On Tue, Feb 3, 2009 at 7:49 PM, Jack Stahl wrote: > > > > > >> Hello, > > >> > > >> I'm interested in a map-reduce flow where I output only values (no keys) in my reduce step. For example, imagine the canonical word-counting program where I'd like my output to be an unlabeled histogram of counts instead of (word, count) pairs. > > >> > > >> I'm using HadoopStreaming (specifically, I'm using the dumbo module to run my python scripts). When I simulate the map reduce using pipes and sort in bash, it works fine. However, in Hadoop, if I output a value with no tabs, Hadoop appends a trailing "\t", apparently interpreting my output as a (value, "") KV pair. I'd like to avoid outputting this trailing tab if possible. > > >> > > >> Is there a command line option that could be used to effect this? More generally, is there something wrong with outputting arbitrary strings, instead of key-value pairs, in your reduce step? > > >> > > > > -- > M. Raşit ÖZDAŞ >
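To make jason's earlier non-streaming suggestion concrete: with the plain 0.19 Java API and TextOutputFormat, collecting a null value emits just the key text with no separator. A hedged sketch for the word-count histogram case (the class name is made up):

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

// Value-only word-count reducer: each output line is just the count.
public class CountOnlyReducer extends MapReduceBase
    implements Reducer<Text, IntWritable, Text, Text> {
  public void reduce(Text key, Iterator<IntWritable> values,
                     OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    int sum = 0;
    while (values.hasNext()) {
      sum += values.next().get();
    }
    // With TextOutputFormat, a null value means the line is just the "key"
    // text -- no trailing tab is appended.
    output.collect(new Text(Integer.toString(sum)), null);
  }
}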
Re: HADOOP-2536 supports Oracle too?
Hadoop-2536 connects to the db via JDBC, so in theory it should work with proper JDBC drivers. It has been tested against MySQL, Hsqldb, and PostgreSQL, but not Oracle. To answer your earlier question, the actual SQL statements might not be recognized by Oracle, so I suggest the best way to test this is to insert print statements, and run the actual SQL statements against Oracle to see if the syntax is accepted. We would appreciate it if you publish your results. Enis Amandeep Khurana wrote: Does the patch HADOOP-2536 support connecting to Oracle databases as well? Or is it just limited to MySQL? Amandeep Amandeep Khurana Computer Science Graduate Student University of California, Santa Cruz
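A hedged sketch of the kind of standalone check Enis is suggesting: take a SQL string printed from the Hadoop side and run it through plain JDBC against Oracle. The driver class matches the one used later in this thread; the connection URL, user, and password are placeholders.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class OracleSyntaxCheck {
  public static void main(String[] args) throws Exception {
    Class.forName("oracle.jdbc.driver.OracleDriver");
    Connection conn = DriverManager.getConnection(
        "jdbc:oracle:thin:@dbhost:1521:SID", "user", "pass");   // placeholders
    Statement st = conn.createStatement();
    // Pass the exact SELECT printed from the Hadoop side as the first argument.
    ResultSet rs = st.executeQuery(args[0]);
    while (rs.next()) {
      System.out.println(rs.getString(1));
    }
    rs.close();
    st.close();
    conn.close();
  }
}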
Re: Hadoop FS Shell - command overwrite capability
John, I also couldn't find a way from the console. Maybe you already know it and prefer not to use it, but the API solves this problem. FileSystem.copyFromLocalFile(boolean delSrc, boolean overwrite, Path src, Path dst) If you have to use the console, a longer solution is to create a jar for this and call it just like hadoop calls the FileSystem class in the "hadoop" file in the bin directory. I think the FileSystem API also needs some improvement. I wonder if it's being considered by the head developers. Hope this helps, Rasit 2009/2/4 S D : > I'm using the Hadoop FS commands to move files from my local machine into the Hadoop dfs. I'd like a way to force a write to the dfs even if a file of the same name exists. Ideally I'd like to use a "-force" switch or some such; e.g., > hadoop dfs -copyFromLocal -force adirectory s3n://wholeinthebucket/ > > Is there a way to do this or does anyone know if this is in the future Hadoop plans? > > Thanks > John SD > -- M. Raşit ÖZDAŞ
Re: How to use DBInputFormat?
Amandeep, "SQL command not properly ended" I get this error whenever I forget the semicolon at the end. I know, it doesn't make sense, but I recommend giving it a try Rasit 2009/2/4 Amandeep Khurana : > The same query is working if I write a simple JDBC client and query the > database. So, I'm probably doing something wrong in the connection settings. > But the error looks to be on the query side more than the connection side. > > Amandeep > > > Amandeep Khurana > Computer Science Graduate Student > University of California, Santa Cruz > > > On Tue, Feb 3, 2009 at 7:25 PM, Amandeep Khurana wrote: > >> Thanks Kevin >> >> I couldnt get it work. Here's the error I get: >> >> bin/hadoop jar ~/dbload.jar LoadTable1 >> 09/02/03 19:21:17 INFO jvm.JvmMetrics: Initializing JVM Metrics with >> processName=JobTracker, sessionId= >> 09/02/03 19:21:20 INFO mapred.JobClient: Running job: job_local_0001 >> 09/02/03 19:21:21 INFO mapred.JobClient: map 0% reduce 0% >> 09/02/03 19:21:22 INFO mapred.MapTask: numReduceTasks: 0 >> 09/02/03 19:21:24 WARN mapred.LocalJobRunner: job_local_0001 >> java.io.IOException: ORA-00933: SQL command not properly ended >> >> at >> org.apache.hadoop.mapred.lib.db.DBInputFormat.getRecordReader(DBInputFormat.java:289) >> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:321) >> at >> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138) >> java.io.IOException: Job failed! >> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1217) >> at LoadTable1.run(LoadTable1.java:130) >> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) >> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79) >> at LoadTable1.main(LoadTable1.java:107) >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) >> at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) >> at java.lang.reflect.Method.invoke(Unknown Source) >> at org.apache.hadoop.util.RunJar.main(RunJar.java:165) >> at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54) >> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) >> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79) >> at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68) >> >> Exception closing file >> /user/amkhuran/contract_table/_temporary/_attempt_local_0001_m_00_0/part-0 >> java.io.IOException: Filesystem closed >> at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:198) >> at org.apache.hadoop.hdfs.DFSClient.access$600(DFSClient.java:65) >> at >> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3084) >> at >> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3053) >> at >> org.apache.hadoop.hdfs.DFSClient$LeaseChecker.close(DFSClient.java:942) >> at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:210) >> at >> org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:243) >> at >> org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1413) >> at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:236) >> at >> org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:221) >> >> >> Here's my code: >> >> public class LoadTable1 extends Configured implements Tool { >> >> // data destination on hdfs >> private static final String CONTRACT_OUTPUT_PATH = "contract_table"; >> >> // The JDBC connection URL and driver implementation class >> >> private static final String CONNECT_URL = "jdbc:oracle:thin:@dbhost >> 
:1521:PSEDEV"; >> private static final String DB_USER = "user"; >> private static final String DB_PWD = "pass"; >> private static final String DATABASE_DRIVER_CLASS = >> "oracle.jdbc.driver.OracleDriver"; >> >> private static final String CONTRACT_INPUT_TABLE = >> "OSE_EPR_CONTRACT"; >> >> private static final String [] CONTRACT_INPUT_TABLE_FIELDS = { >> "PORTFOLIO_NUMBER", "CONTRACT_NUMBER"}; >> >> private static final String ORDER_CONTRACT_BY_COL = >> "CONTRACT_NUMBER"; >> >> >> static class ose_epr_contract implements Writable, DBWritable { >> >> >> String CONTRACT_NUMBER; >> >> >> public void readFields(DataInput in) throws IOException { >> >> this.CONTRACT_NUMBER = Text.readString(in); >> >> } >> >> public void write(DataOutput out) throws IOException { >> >> Text.writeString(out, this.CONTRACT_NUMBER); >> >> >> } >> >> public void readFields(ResultSet in_set) throws SQLException { >> >> this.CONTRACT_NUMBER = in_set.getString(1); >> >> } >> >> @Override >> public void write(PreparedStatement prep_st) throws SQLException { >>
Re: Value-Only Reduce Output
I tried it myself, it doesn't work. I've also tried stream.map.output.field.separator and map.output.key.field.separator parameters for this purpose, they don't work either. When hadoop sees an empty string, it takes the default tab character instead. Rasit 2009/2/4 jason hadoop > > Ooops, you are using streaming, and I am not familiar. > As a terrible hack, you could set mapred.textoutputformat.separator to the empty string, in your configuration. > > On Tue, Feb 3, 2009 at 9:26 PM, jason hadoop wrote: > > > If you are using the standard TextOutputFormat, and the output collector is passed a null for the value, there will not be a trailing tab character added to the output line. > > > > output.collect( key, null ); > > Will give you the behavior you are looking for if your configuration is as I expect. > > > > On Tue, Feb 3, 2009 at 7:49 PM, Jack Stahl wrote: > > > >> Hello, > >> > >> I'm interested in a map-reduce flow where I output only values (no keys) in my reduce step. For example, imagine the canonical word-counting program where I'd like my output to be an unlabeled histogram of counts instead of (word, count) pairs. > >> > >> I'm using HadoopStreaming (specifically, I'm using the dumbo module to run my python scripts). When I simulate the map reduce using pipes and sort in bash, it works fine. However, in Hadoop, if I output a value with no tabs, Hadoop appends a trailing "\t", apparently interpreting my output as a (value, "") KV pair. I'd like to avoid outputting this trailing tab if possible. > >> > >> Is there a command line option that could be used to effect this? More generally, is there something wrong with outputting arbitrary strings, instead of key-value pairs, in your reduce step? > >> > > > > -- M. Raşit ÖZDAŞ
Re: Task tracker archive contains too many files
Andrew wrote: I've noticed that the task tracker moves all unpacked jars into ${hadoop.tmp.dir}/mapred/local/taskTracker. We are using a lot of external libraries that are deployed via the "-libjars" option. The total number of files after unpacking is about 20 thousand. After running a number of jobs, tasks start to be killed with a timeout reason ("Task attempt_200901281518_0011_m_000173_2 failed to report status for 601 seconds. Killing!"). All killed tasks are in "initializing" state. I've watched the tasktracker logs and found such messages: Thread 20926 (Thread-10368): State: BLOCKED Blocked count: 3611 Waited count: 24 Blocked on java.lang.ref.reference$l...@e48ed6 Blocked by 20882 (Thread-10341) Stack: java.lang.StringCoding$StringEncoder.encode(StringCoding.java:232) java.lang.StringCoding.encode(StringCoding.java:272) java.lang.String.getBytes(String.java:947) java.io.UnixFileSystem.getBooleanAttributes0(Native Method) java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:228) java.io.File.isDirectory(File.java:754) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:427) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) This is exactly as in HADOOP-4780. As I understand it, the patch brings in code which stores a map of directories along with their DU values, thus reducing the number of calls to DU. This must help, but the process of deleting 20,000 files takes too long. I've manually deleted the archive after 10 jobs had run and it took over 30 minutes on XFS. Three times more than the default timeout for tasks! Is there a way to prohibit unpacking of jars? Or at least not to hold the archive? Or any other better way to solve this problem? Hadoop version: 0.19.0. Currently, there is no way to stop DistributedCache from unpacking jars. I think it should have an option (thru configuration) whether to unpack or not. Can you raise a jira for the same? Thanks Amareshwari
Task tracker archive contains too many files
I've noticed that the task tracker moves all unpacked jars into ${hadoop.tmp.dir}/mapred/local/taskTracker. We are using a lot of external libraries that are deployed via the "-libjars" option. The total number of files after unpacking is about 20 thousand. After running a number of jobs, tasks start to be killed with a timeout reason ("Task attempt_200901281518_0011_m_000173_2 failed to report status for 601 seconds. Killing!"). All killed tasks are in "initializing" state. I've watched the tasktracker logs and found such messages: Thread 20926 (Thread-10368): State: BLOCKED Blocked count: 3611 Waited count: 24 Blocked on java.lang.ref.reference$l...@e48ed6 Blocked by 20882 (Thread-10341) Stack: java.lang.StringCoding$StringEncoder.encode(StringCoding.java:232) java.lang.StringCoding.encode(StringCoding.java:272) java.lang.String.getBytes(String.java:947) java.io.UnixFileSystem.getBooleanAttributes0(Native Method) java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:228) java.io.File.isDirectory(File.java:754) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:427) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433) This is exactly as in HADOOP-4780. As I understand it, the patch brings in code which stores a map of directories along with their DU values, thus reducing the number of calls to DU. This must help, but the process of deleting 20,000 files takes too long. I've manually deleted the archive after 10 jobs had run and it took over 30 minutes on XFS. Three times more than the default timeout for tasks! Is there a way to prohibit unpacking of jars? Or at least not to hold the archive? Or any other better way to solve this problem? Hadoop version: 0.19.0. -- Andrew Gudkov PGP key id: CB9F07D8 (cryptonomicon.mit.edu) Jabber: gu...@jabber.ru
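The stack traces above all bottom out in FileUtil.getDU. The following is not the actual Hadoop source, just a simplified sketch of that kind of recursive disk-usage walk; it shows why roughly 20,000 unpacked jar entries hurt -- every entry costs a filesystem stat at every level, and the walk appears to be repeated while tasks are being localized, which is enough to trip the 600-second task timeout.

import java.io.File;

public class DuSketch {
  // Simplified recursive du: one filesystem hit per directory entry.
  static long du(File f) {
    if (!f.isDirectory()) {
      return f.length();
    }
    long total = 0;
    File[] children = f.listFiles();
    if (children != null) {
      for (File child : children) {
        total += du(child);   // recurses over all unpacked jar contents
      }
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println(du(new File(args[0])) + " bytes under " + args[0]);
  }
}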