Re: Not able to copy a file to HDFS after installing

2009-02-04 Thread Sagar Naik


Where is the namenode running? On localhost or some other host?

-Sagar
Rajshekar wrote:
Hello,
I am new to Hadoop and I just installed it on Ubuntu 8.04 LTS as per the
guidance of a web site. I tested it and found it working fine. I tried to copy
a file but it is giving some error; please help me out.

had...@excel-desktop:/usr/local/hadoop/hadoop-0.17.2.1$  bin/hadoop jar
hadoop-0.17.2.1-examples.jar wordcount /home/hadoop/Download\ URLs.txt
download-output
09/02/02 11:18:59 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 1 time(s).
09/02/02 11:19:00 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 2 time(s).
09/02/02 11:19:01 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 3 time(s).
09/02/02 11:19:02 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 4 time(s).
09/02/02 11:19:04 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 5 time(s).
09/02/02 11:19:05 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 6 time(s).
09/02/02 11:19:06 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 7 time(s).
09/02/02 11:19:07 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 8 time(s).
09/02/02 11:19:08 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 9 time(s).
09/02/02 11:19:09 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 10 time(s).
java.lang.RuntimeException: java.net.ConnectException: Connection refused
at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:356)
at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:331)
at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:304)
at org.apache.hadoop.examples.WordCount.run(WordCount.java:146)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:155)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:6
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:53)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
at org.apache.hadoop.mapred.JobShell.run(JobShell.java:194)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.mapred.JobShell.main(JobShell.java:220)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:11
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:174)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:623)
at org.apache.hadoop.ipc.Client.call(Client.java:546)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:212)
at org.apache.hadoop.dfs.$Proxy0.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:313)
at org.apache.hadoop.dfs.DFSClient.createRPCNamenode(DFSClient.java:102)
at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:17
at org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:6
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1280)
at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:56)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1291)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:203)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:10
at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:352)
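
To see which namenode the client is actually pointed at, a minimal probe along
these lines can help (a sketch, assuming the 0.17-era Java API; it prints
fs.default.name as read from hadoop-site.xml on the classpath and tries to list
the root directory, failing with the same "Connection refused" retries if the
namenode is not reachable):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Minimal connectivity probe: shows the configured namenode URI and lists "/".
public class FsProbe {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();   // reads hadoop-site.xml from the classpath
    System.out.println("fs.default.name = " + conf.get("fs.default.name"));
    FileSystem fs = FileSystem.get(conf);       // connects to the namenode
    for (FileStatus s : fs.listStatus(new Path("/"))) {
      System.out.println(s.getPath());
    }
  }
}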
  


Hadoop IO performance, prefetch etc

2009-02-04 Thread Songting Chen
Hi,
   Most of our map jobs are IO bound. However, for a given node, the IO
throughput during the map phase is only about 20% of its real sequential IO
capability (we measured the sequential IO throughput with iozone).
   I think the reason is that although each map issues sequential IO requests,
many maps run concurrently on the same node, which causes expensive IO
switching (seeks).
   Prefetching may be a good solution here, especially since a map task is
supposed to scan through exactly one block, no more and no less. Any idea how
to enable it?

Thanks,
-Songting
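
One related knob, offered only as a hedged sketch: I am not aware of an actual
prefetch switch in this Hadoop version, but io.file.buffer.size (default 4096
bytes) controls how much data is buffered per read/write, and raising it can
cut the number of small read calls when many maps stream data on one node. It
is not true block-level read-ahead.

import org.apache.hadoop.mapred.JobConf;

public class TuneIoBuffer {
  // Sketch only: enlarge the generic IO buffer for a job.
  public static JobConf configure(JobConf conf) {
    conf.setInt("io.file.buffer.size", 128 * 1024);
    return conf;
  }
}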


Not able to copy a file to HDFS after installing

2009-02-04 Thread Rajshekar

Hello,
I am new to Hadoop and I just installed it on Ubuntu 8.04 LTS as per the
guidance of a web site. I tested it and found it working fine. I tried to copy
a file but it is giving some error; please help me out.

had...@excel-desktop:/usr/local/hadoop/hadoop-0.17.2.1$  bin/hadoop jar
hadoop-0.17.2.1-examples.jar wordcount /home/hadoop/Download\ URLs.txt
download-output
09/02/02 11:18:59 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 1 time(s).
09/02/02 11:19:00 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 2 time(s).
09/02/02 11:19:01 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 3 time(s).
09/02/02 11:19:02 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 4 time(s).
09/02/02 11:19:04 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 5 time(s).
09/02/02 11:19:05 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 6 time(s).
09/02/02 11:19:06 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 7 time(s).
09/02/02 11:19:07 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 8 time(s).
09/02/02 11:19:08 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 9 time(s).
09/02/02 11:19:09 INFO ipc.Client: Retrying connect to server:
localhost/127.0.0.1:9000. Already tried 10 time(s).
java.lang.RuntimeException: java.net.ConnectException: Connection refused
at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:356)
at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:331)
at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:304)
at org.apache.hadoop.examples.WordCount.run(WordCount.java:146)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:155)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:6
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:53)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
at org.apache.hadoop.mapred.JobShell.run(JobShell.java:194)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.mapred.JobShell.main(JobShell.java:220)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:11
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:174)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:623)
at org.apache.hadoop.ipc.Client.call(Client.java:546)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:212)
at org.apache.hadoop.dfs.$Proxy0.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:313)
at org.apache.hadoop.dfs.DFSClient.createRPCNamenode(DFSClient.java:102)
at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:17
at org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:6
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1280)
at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:56)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1291)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:203)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:10
at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:352)
-- 
View this message in context: 
http://www.nabble.com/Not-able-to-copy-a-file-to-HDFS-after-installing-tp21845768p21845768.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.



copying binary files to a SequenceFile

2009-02-04 Thread Mark Kerzner
Hi all,

I am copying regular binary files to a SequenceFile, and I am using
BytesWritable, to which I am giving all the byte[] content of the file.
However, once it hits a file larger than my computer's memory, it may have
problems. Is there a better way?

Thank you,
Mark
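
One approach, offered as a sketch rather than a drop-in answer: instead of
loading the whole file into one BytesWritable, stream it and append one record
per chunk, so memory use stays bounded. Key scheme, paths and chunk size below
are illustrative.

import java.io.InputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

// Sketch: write a large binary file into a SequenceFile as many chunk records
// (key = "fileName:chunkIndex", value = up to chunkSize bytes), so the whole
// file never has to fit in memory at once.
public class ChunkedSequenceFileWriter {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path src = new Path(args[0]);        // source binary file (on this FileSystem)
    Path dst = new Path(args[1]);        // output SequenceFile
    int chunkSize = 8 * 1024 * 1024;     // 8 MB per record; illustrative

    SequenceFile.Writer writer =
        SequenceFile.createWriter(fs, conf, dst, Text.class, BytesWritable.class);
    InputStream in = fs.open(src);
    try {
      byte[] buf = new byte[chunkSize];
      int n, chunk = 0;
      while ((n = in.read(buf)) > 0) {
        BytesWritable value = new BytesWritable();
        value.set(buf, 0, n);            // only the bytes actually read
        writer.append(new Text(src.getName() + ":" + chunk++), value);
      }
    } finally {
      in.close();
      writer.close();
    }
  }
}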


Re: Bad connection to FS.

2009-02-04 Thread lohit
As noted by others, the NameNode is not running.
Before formatting anything (which amounts to deleting your data), try to see why
the NameNode isn't running.
Search for the value of HADOOP_LOG_DIR in ./conf/hadoop-env.sh; if you have not
set it explicitly, it defaults to HADOOP_HOME/logs/*namenode*.log.
Lohit



- Original Message 
From: Amandeep Khurana 
To: core-user@hadoop.apache.org
Sent: Wednesday, February 4, 2009 5:26:43 PM
Subject: Re: Bad connection to FS.

Here's what I had done..

1. Stop the whole system
2. Delete all the data in the directories where the data and the metadata is
being stored.
3. Format the namenode
4. Start the system

This solved my problem. I'm not sure if this is a good idea for you or not. I
was pretty much installing from scratch, so I didn't mind deleting the files in
those directories.

Amandeep


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Wed, Feb 4, 2009 at 3:49 PM, TCK  wrote:

>
> I believe the debug logs location is still specified in hadoop-env.sh (I
> just read the 0.19.0 doc). I think you have to shut down all nodes first
> (stop-all), then format the namenode, and then restart (start-all) and make
> sure that NameNode comes up too. We are using a very old version, 0.12.3,
> and are upgrading.
> -TCK
>
>
>
> --- On Wed, 2/4/09, Mithila Nagendra  wrote:
> From: Mithila Nagendra 
> Subject: Re: Bad connection to FS.
> To: core-user@hadoop.apache.org, moonwatcher32...@yahoo.com
> Date: Wednesday, February 4, 2009, 6:30 PM
>
> @TCK: Which version of hadoop have you installed?
> @Amandeep: I did try reformatting the namenode, but it hasn't helped me
> out in any way.
> Mithila
>
>
> On Wed, Feb 4, 2009 at 4:18 PM, TCK  wrote:
>
>
>
> Mithila, how come there is no NameNode java process listed by your jps
> command? I would check the hadoop namenode logs to see if there was some
> startup problem (the location of those logs would be specified in
> hadoop-env.sh, at least in the version I'm using).
>
>
> -TCK
>
>
>
>
>
>
>
> --- On Wed, 2/4/09, Mithila Nagendra  wrote:
>
> From: Mithila Nagendra 
>
> Subject: Bad connection to FS.
>
> To: "core-user@hadoop.apache.org" , "
> core-user-subscr...@hadoop.apache.org" <
> core-user-subscr...@hadoop.apache.org>
>
>
> Date: Wednesday, February 4, 2009, 6:06 PM
>
>
>
> Hey all
>
>
>
> When I try to copy a folder from the local file system in to the HDFS using
>
> the command hadoop dfs -copyFromLocal, the copy fails and it gives an error
>
> which says "Bad connection to FS". How do I get past this? The
>
> following is
>
> the output at the time of execution:
>
>
>
> had...@renweiyu-desktop:/usr/local/hadoop$ jps
>
> 6873 Jps
>
> 6299 JobTracker
>
> 6029 DataNode
>
> 6430 TaskTracker
>
> 6189 SecondaryNameNode
>
> had...@renweiyu-desktop:/usr/local/hadoop$ ls
>
> bin          docs                        lib          README.txt
>
> build.xml    hadoop-0.18.3-ant.jar       libhdfs      src
>
> c++          hadoop-0.18.3-core.jar      librecordio  webapps
>
> CHANGES.txt  hadoop-0.18.3-examples.jar  LICENSE.txt
>
> conf         hadoop-0.18.3-test.jar      logs
>
> contrib      hadoop-0.18.3-tools.jar     NOTICE.txt
>
> had...@renweiyu-desktop:/usr/local/hadoop$ cd ..
>
> had...@renweiyu-desktop:/usr/local$ ls
>
> bin  etc  games  gutenberg  hadoop  hadoop-0.18.3.tar.gz  hadoop-datastore
>
> include  lib  man  sbin  share  src
>
> had...@renweiyu-desktop:/usr/local$ hadoop/bin/hadoop dfs -copyFromLocal
>
> gutenberg gutenberg
>
> 09/02/04 15:58:21 INFO ipc.Client: Retrying connect to server: localhost/
>
> 127.0.0.1:54310. Already tried 0 time(s).
>
> 09/02/04 15:58:22 INFO ipc.Client: Retrying connect to server: localhost/
>
> 127.0.0.1:54310. Already tried 1 time(s).
>
> 09/02/04 15:58:23 INFO ipc.Client: Retrying connect to server: localhost/
>
> 127.0.0.1:54310. Already tried 2 time(s).
>
> 09/02/04 15:58:24 INFO ipc.Client: Retrying connect to server: localhost/
>
> 127.0.0.1:54310. Already tried 3 time(s).
>
> 09/02/04 15:58:25 INFO ipc.Client: Retrying connect to server: localhost/
>
> 127.0.0.1:54310. Already tried 4 time(s).
>
> 09/02/04 15:58:26 INFO ipc.Client: Retrying connect to server: localhost/
>
> 127.0.0.1:54310. Already tried 5 time(s).
>
> 09/02/04 15:58:27 INFO ipc.Client: Retrying connect to server: localhost/
>
> 127.0.0.1:54310. Already tried 6 time(s).
>
> 09/02/04 15:58:28 INFO ipc.Client: Retrying connect to server: localhost/
>
> 127.0.0.1:54310. Already tried 7 time(s).
>
> 09/02/04 15:58:29 INFO ipc.Client: Retrying connect to server: localhost/
>
> 127.0.0.1:54310. Already tried 8 time(s).
>
> 09/02/04 15:58:30 INFO ipc.Client: Retrying connect to server: localhost/
>
> 127.0.0.1:54310. Already tried 9 time(s).
>
> Bad connection to FS. command aborted.
>
>
>
> The command jps shows that the Hadoop system is up and running. So I have
>
> no idea what's wrong!
>
>
>
> Thanks!
>
> Mithila
>
>
>
>
>
>
>
>
>
>
>
>
>
>



Re: Bad connection to FS.

2009-02-04 Thread Amandeep Khurana
Here's what I had done..

1. Stop the whole system
2. Delete all the data in the directories where the data and the metadata is
being stored.
3. Format the namenode
4. Start the system

This solved my problem. I'm not sure if this is a good idea for you or not. I
was pretty much installing from scratch, so I didn't mind deleting the files in
those directories.

Amandeep


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Wed, Feb 4, 2009 at 3:49 PM, TCK  wrote:

>
> I believe the debug logs location is still specified in hadoop-env.sh (I
> just read the 0.19.0 doc). I think you have to shut down all nodes first
> (stop-all), then format the namenode, and then restart (start-all) and make
> sure that NameNode comes up too. We are using a very old version, 0.12.3,
> and are upgrading.
> -TCK
>
>
>
> --- On Wed, 2/4/09, Mithila Nagendra  wrote:
> From: Mithila Nagendra 
> Subject: Re: Bad connection to FS.
> To: core-user@hadoop.apache.org, moonwatcher32...@yahoo.com
> Date: Wednesday, February 4, 2009, 6:30 PM
>
> @TCK: Which version of hadoop have you installed?
> @Amandeep: I did try reformatting the namenode, but it hasn't helped me
> out in any way.
> Mithila
>
>
> On Wed, Feb 4, 2009 at 4:18 PM, TCK  wrote:
>
>
>
> Mithila, how come there is no NameNode java process listed by your jps
> command? I would check the hadoop namenode logs to see if there was some
> startup problem (the location of those logs would be specified in
> hadoop-env.sh, at least in the version I'm using).
>
>
> -TCK
>
>
>
>
>
>
>
> --- On Wed, 2/4/09, Mithila Nagendra  wrote:
>
> From: Mithila Nagendra 
>
> Subject: Bad connection to FS.
>
> To: "core-user@hadoop.apache.org" , "
> core-user-subscr...@hadoop.apache.org" <
> core-user-subscr...@hadoop.apache.org>
>
>
> Date: Wednesday, February 4, 2009, 6:06 PM
>
>
>
> Hey all
>
>
>
> When I try to copy a folder from the local file system in to the HDFS using
>
> the command hadoop dfs -copyFromLocal, the copy fails and it gives an error
>
> which says "Bad connection to FS". How do I get past this? The
>
> following is
>
> the output at the time of execution:
>
>
>
> had...@renweiyu-desktop:/usr/local/hadoop$ jps
>
> 6873 Jps
>
> 6299 JobTracker
>
> 6029 DataNode
>
> 6430 TaskTracker
>
> 6189 SecondaryNameNode
>
> had...@renweiyu-desktop:/usr/local/hadoop$ ls
>
> bin          docs                        lib          README.txt
>
> build.xml    hadoop-0.18.3-ant.jar       libhdfs      src
>
> c++          hadoop-0.18.3-core.jar      librecordio  webapps
>
> CHANGES.txt  hadoop-0.18.3-examples.jar  LICENSE.txt
>
> conf         hadoop-0.18.3-test.jar      logs
>
> contrib      hadoop-0.18.3-tools.jar     NOTICE.txt
>
> had...@renweiyu-desktop:/usr/local/hadoop$ cd ..
>
> had...@renweiyu-desktop:/usr/local$ ls
>
> bin  etc  games  gutenberg  hadoop  hadoop-0.18.3.tar.gz  hadoop-datastore
>
> include  lib  man  sbin  share  src
>
> had...@renweiyu-desktop:/usr/local$ hadoop/bin/hadoop dfs -copyFromLocal
>
> gutenberg gutenberg
>
> 09/02/04 15:58:21 INFO ipc.Client: Retrying connect to server: localhost/
>
> 127.0.0.1:54310. Already tried 0 time(s).
>
> 09/02/04 15:58:22 INFO ipc.Client: Retrying connect to server: localhost/
>
> 127.0.0.1:54310. Already tried 1 time(s).
>
> 09/02/04 15:58:23 INFO ipc.Client: Retrying connect to server: localhost/
>
> 127.0.0.1:54310. Already tried 2 time(s).
>
> 09/02/04 15:58:24 INFO ipc.Client: Retrying connect to server: localhost/
>
> 127.0.0.1:54310. Already tried 3 time(s).
>
> 09/02/04 15:58:25 INFO ipc.Client: Retrying connect to server: localhost/
>
> 127.0.0.1:54310. Already tried 4 time(s).
>
> 09/02/04 15:58:26 INFO ipc.Client: Retrying connect to server: localhost/
>
> 127.0.0.1:54310. Already tried 5 time(s).
>
> 09/02/04 15:58:27 INFO ipc.Client: Retrying connect to server: localhost/
>
> 127.0.0.1:54310. Already tried 6 time(s).
>
> 09/02/04 15:58:28 INFO ipc.Client: Retrying connect to server: localhost/
>
> 127.0.0.1:54310. Already tried 7 time(s).
>
> 09/02/04 15:58:29 INFO ipc.Client: Retrying connect to server: localhost/
>
> 127.0.0.1:54310. Already tried 8 time(s).
>
> 09/02/04 15:58:30 INFO ipc.Client: Retrying connect to server: localhost/
>
> 127.0.0.1:54310. Already tried 9 time(s).
>
> Bad connection to FS. command aborted.
>
>
>
> The command jps shows that the Hadoop system is up and running. So I have
>
> no idea what's wrong!
>
>
>
> Thanks!
>
> Mithila
>
>
>
>
>
>
>
>
>
>
>
>
>
>


Re: Value-Only Reduce Output

2009-02-04 Thread Jack Stahl
My (0.18.2) reduce src looks like this:

  write(key);
  clientOut_.write('\t');
  write(val);
  clientOut_.write('\n');

which explains why the trailing tab is unavoidable here.

Thanks for your help, though, Jason!

2009/2/4 jason hadoop 

> For your reduce, the parameter is stream.reduce.input.field.separator, if
> you are supplying a reduce class and I believe the output format is
> TextOutputFormat...
>
> It looks like you have tried the map parameter for the separator, not the
> reduce parameter.
>
> From 0.19.0 PipeReducer:
> configure:
>  reduceOutFieldSeparator =
> job_.get("stream.reduce.output.field.separator", "\t").getBytes("UTF-8");
>  reduceInputFieldSeparator =
> job_.get("stream.reduce.input.field.separator", "\t").getBytes("UTF-8");
>  this.numOfReduceOutputKeyFields =
> job_.getInt("stream.num.reduce.output.key.fields", 1);
>
> getInputSeparator:
>  byte[] getInputSeparator() {
>return reduceInputFieldSeparator;
>  }
>
> reduce:
>  write(key);
> *  clientOut_.write(getInputSeparator());*
>  write(val);
>  clientOut_.write('\n');
>} else {
>  // "identity reduce"
> *  output.collect(key, val);*
> }
>
>
> On Wed, Feb 4, 2009 at 6:15 AM, Rasit OZDAS  wrote:
>
> > I tried it myself, it doesn't work.
> > I've also tried   stream.map.output.field.separator   and
> > map.output.key.field.separator  parameters for this purpose, they
> > don't work either. When hadoop sees empty string, it takes default tab
> > character instead.
> >
> > Rasit
> >
> > 2009/2/4 jason hadoop 
> > >
> > > Ooops, you are using streaming., and I am not familar.
> > > As a terrible hack, you could set mapred.textoutputformat.separator to
> > the
> > > empty string, in your configuration.
> > >
> > > On Tue, Feb 3, 2009 at 9:26 PM, jason hadoop 
> > wrote:
> > >
> > > > If you are using the standard TextOutputFormat, and the output
> > collector is
> > > > passed a null for the value, there will not be a trailing tab
> character
> > > > added to the output line.
> > > >
> > > > output.collect( key, null );
> > > > Will give you the behavior you are looking for if your configuration
> is
> > as
> > > > I expect.
> > > >
> > > >
> > > > On Tue, Feb 3, 2009 at 7:49 PM, Jack Stahl  wrote:
> > > >
> > > >> Hello,
> > > >>
> > > >> I'm interested in a map-reduce flow where I output only values (no
> > keys)
> > > >> in
> > > >> my reduce step.  For example, imagine the canonical word-counting
> > program
> > > >> where I'd like my output to be an unlabeled histogram of counts
> > instead of
> > > >> (word, count) pairs.
> > > >>
> > > >> I'm using HadoopStreaming (specifically, I'm using the dumbo module
> to
> > run
> > > >> my python scripts).  When I simulate the map reduce using pipes and
> > sort
> > > >> in
> > > >> bash, it works fine.   However, in Hadoop, if I output a value with
> no
> > > >> tabs,
> > > >> Hadoop appends a trailing "\t", apparently interpreting my output as
> a
> > > >> (value, "") KV pair.  I'd like to avoid outputing this trailing tab
> if
> > > >> possible.
> > > >>
> > > >> Is there a command line option that could be use to effect this?
>  More
> > > >> generally, is there something wrong with outputing arbitrary
> strings,
> > > >> instead of key-value pairs, in your reduce step?
> > > >>
> > > >
> > > >
> >
> >
> >
> > --
> > M. Raşit ÖZDAŞ
> >
>
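
For the non-streaming Java API, the null-value approach jason describes looks
roughly like this (a sketch using the old mapred API; the word-count types are
illustrative, not taken from the thread):

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

// Sketch: emit only the count per word. With the standard TextOutputFormat,
// passing null as the value writes just the key text with no trailing tab
// (as described above).
public class CountOnlyReducer extends MapReduceBase
    implements Reducer<Text, IntWritable, Text, Text> {
  public void reduce(Text key, Iterator<IntWritable> values,
                     OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    int sum = 0;
    while (values.hasNext()) {
      sum += values.next().get();
    }
    output.collect(new Text(Integer.toString(sum)), null);  // value-only line
  }
}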


Re: Bad connection to FS.

2009-02-04 Thread TCK

I believe the debug logs location is still specified in hadoop-env.sh (I just 
read the 0.19.0 doc). I think you have to shut down all nodes first (stop-all), 
then format the namenode, and then restart (start-all) and make sure that 
NameNode comes up too. We are using a very old version, 0.12.3, and are 
upgrading.
-TCK



--- On Wed, 2/4/09, Mithila Nagendra  wrote:
From: Mithila Nagendra 
Subject: Re: Bad connection to FS.
To: core-user@hadoop.apache.org, moonwatcher32...@yahoo.com
Date: Wednesday, February 4, 2009, 6:30 PM

@TCK: Which version of hadoop have you installed?
@Amandeep: I did try reformatting the namenode, but it hasn't helped me out
in any way.
Mithila


On Wed, Feb 4, 2009 at 4:18 PM, TCK  wrote:



Mithila, how come there is no NameNode java process listed by your jps command? 
I would check the hadoop namenode logs to see if there was some startup problem 
(the location of those logs would be specified in hadoop-env.sh, at least in 
the version I'm using).


-TCK







--- On Wed, 2/4/09, Mithila Nagendra  wrote:

From: Mithila Nagendra 

Subject: Bad connection to FS.

To: "core-user@hadoop.apache.org" , 
"core-user-subscr...@hadoop.apache.org" 


Date: Wednesday, February 4, 2009, 6:06 PM



Hey all



When I try to copy a folder from the local file system in to the HDFS using

the command hadoop dfs -copyFromLocal, the copy fails and it gives an error

which says "Bad connection to FS". How do I get past this? The

following is

the output at the time of execution:



had...@renweiyu-desktop:/usr/local/hadoop$ jps

6873 Jps

6299 JobTracker

6029 DataNode

6430 TaskTracker

6189 SecondaryNameNode

had...@renweiyu-desktop:/usr/local/hadoop$ ls

bin          docs                        lib          README.txt

build.xml    hadoop-0.18.3-ant.jar       libhdfs      src

c++          hadoop-0.18.3-core.jar      librecordio  webapps

CHANGES.txt  hadoop-0.18.3-examples.jar  LICENSE.txt

conf         hadoop-0.18.3-test.jar      logs

contrib      hadoop-0.18.3-tools.jar     NOTICE.txt

had...@renweiyu-desktop:/usr/local/hadoop$ cd ..

had...@renweiyu-desktop:/usr/local$ ls

bin  etc  games  gutenberg  hadoop  hadoop-0.18.3.tar.gz  hadoop-datastore

include  lib  man  sbin  share  src

had...@renweiyu-desktop:/usr/local$ hadoop/bin/hadoop dfs -copyFromLocal

gutenberg gutenberg

09/02/04 15:58:21 INFO ipc.Client: Retrying connect to server: localhost/

127.0.0.1:54310. Already tried 0 time(s).

09/02/04 15:58:22 INFO ipc.Client: Retrying connect to server: localhost/

127.0.0.1:54310. Already tried 1 time(s).

09/02/04 15:58:23 INFO ipc.Client: Retrying connect to server: localhost/

127.0.0.1:54310. Already tried 2 time(s).

09/02/04 15:58:24 INFO ipc.Client: Retrying connect to server: localhost/

127.0.0.1:54310. Already tried 3 time(s).

09/02/04 15:58:25 INFO ipc.Client: Retrying connect to server: localhost/

127.0.0.1:54310. Already tried 4 time(s).

09/02/04 15:58:26 INFO ipc.Client: Retrying connect to server: localhost/

127.0.0.1:54310. Already tried 5 time(s).

09/02/04 15:58:27 INFO ipc.Client: Retrying connect to server: localhost/

127.0.0.1:54310. Already tried 6 time(s).

09/02/04 15:58:28 INFO ipc.Client: Retrying connect to server: localhost/

127.0.0.1:54310. Already tried 7 time(s).

09/02/04 15:58:29 INFO ipc.Client: Retrying connect to server: localhost/

127.0.0.1:54310. Already tried 8 time(s).

09/02/04 15:58:30 INFO ipc.Client: Retrying connect to server: localhost/

127.0.0.1:54310. Already tried 9 time(s).

Bad connection to FS. command aborted.



The command jps shows that the Hadoop system is up and running. So I have no

idea what's wrong!



Thanks!

Mithila







      




  

Re: problem with completion notification from block movement

2009-02-04 Thread Raghu Angadi

Karl Kleinpaste wrote:

On Sun, 2009-02-01 at 17:58 -0800, jason hadoop wrote:

The Datanodes use multiple threads with locking, and one of the
assumptions is that the block report (once per hour by default) takes
little time. The datanode will pause while the block report is running,
and if it happens to take a while, weird things start to happen.


Thank you for responding, this is very informative for us.

Having looked through the source code with a co-worker regarding
periodic scan and then checking the logs once again, we find that we are
finding reports of this sort:

BlockReport of 1158499 blocks got processed in 308860 msecs
BlockReport of 1159840 blocks got processed in 237925 msecs
BlockReport of 1161274 blocks got processed in 177853 msecs
BlockReport of 1162408 blocks got processed in 285094 msecs
BlockReport of 1164194 blocks got processed in 184478 msecs
BlockReport of 1165673 blocks got processed in 226401 msecs

The 3rd of these exactly straddles the particular example timeline I
discussed in my original email about this question.  I suspect I'll find
more of the same as I look through other related errors.


You could ask for a "complete fix" in
https://issues.apache.org/jira/browse/HADOOP-4584 . I don't think the
current patch there fixes your problem.


Raghu.


--karl





Re: Bad connection to FS.

2009-02-04 Thread Amandeep Khurana
I faced the same issue a few days back. Formatting the namenode made it work
for me.

Amandeep


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Wed, Feb 4, 2009 at 3:06 PM, Mithila Nagendra  wrote:

> Hey all
>
> When I try to copy a folder from the local file system in to the HDFS using
> the command hadoop dfs -copyFromLocal, the copy fails and it gives an error
> which says "Bad connection to FS". How do I get past this? The following is
> the output at the time of execution:
>
> had...@renweiyu-desktop:/usr/local/hadoop$ jps
> 6873 Jps
> 6299 JobTracker
> 6029 DataNode
> 6430 TaskTracker
> 6189 SecondaryNameNode
> had...@renweiyu-desktop:/usr/local/hadoop$ ls
> bin          docs                        lib          README.txt
> build.xml    hadoop-0.18.3-ant.jar       libhdfs      src
> c++          hadoop-0.18.3-core.jar      librecordio  webapps
> CHANGES.txt  hadoop-0.18.3-examples.jar  LICENSE.txt
> conf         hadoop-0.18.3-test.jar      logs
> contrib      hadoop-0.18.3-tools.jar     NOTICE.txt
> had...@renweiyu-desktop:/usr/local/hadoop$ cd ..
> had...@renweiyu-desktop:/usr/local$ ls
> bin  etc  games  gutenberg  hadoop  hadoop-0.18.3.tar.gz  hadoop-datastore
> include  lib  man  sbin  share  src
> had...@renweiyu-desktop:/usr/local$ hadoop/bin/hadoop dfs -copyFromLocal
> gutenberg gutenberg
> 09/02/04 15:58:21 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:54310. Already tried 0 time(s).
> 09/02/04 15:58:22 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:54310. Already tried 1 time(s).
> 09/02/04 15:58:23 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:54310. Already tried 2 time(s).
> 09/02/04 15:58:24 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:54310. Already tried 3 time(s).
> 09/02/04 15:58:25 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:54310. Already tried 4 time(s).
> 09/02/04 15:58:26 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:54310. Already tried 5 time(s).
> 09/02/04 15:58:27 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:54310. Already tried 6 time(s).
> 09/02/04 15:58:28 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:54310. Already tried 7 time(s).
> 09/02/04 15:58:29 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:54310. Already tried 8 time(s).
> 09/02/04 15:58:30 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:54310. Already tried 9 time(s).
> Bad connection to FS. command aborted.
>
> The command jps shows that the Hadoop system is up and running. So I have
> no idea what's wrong!
>
> Thanks!
> Mithila
>


Re: Bad connection to FS.

2009-02-04 Thread TCK

Mithila, how come there is no NameNode java process listed by your jps command? 
I would check the hadoop namenode logs to see if there was some startup problem 
(the location of those logs would be specified in hadoop-env.sh, at least in 
the version I'm using).
-TCK



--- On Wed, 2/4/09, Mithila Nagendra  wrote:
From: Mithila Nagendra 
Subject: Bad connection to FS.
To: "core-user@hadoop.apache.org" , 
"core-user-subscr...@hadoop.apache.org" 
Date: Wednesday, February 4, 2009, 6:06 PM

Hey all

When I try to copy a folder from the local file system in to the HDFS using
the command hadoop dfs -copyFromLocal, the copy fails and it gives an error
which says "Bad connection to FS". How do I get past this? The
following is
the output at the time of execution:

had...@renweiyu-desktop:/usr/local/hadoop$ jps
6873 Jps
6299 JobTracker
6029 DataNode
6430 TaskTracker
6189 SecondaryNameNode
had...@renweiyu-desktop:/usr/local/hadoop$ ls
bin          docs                        lib          README.txt
build.xml    hadoop-0.18.3-ant.jar       libhdfs      src
c++          hadoop-0.18.3-core.jar      librecordio  webapps
CHANGES.txt  hadoop-0.18.3-examples.jar  LICENSE.txt
conf         hadoop-0.18.3-test.jar      logs
contrib      hadoop-0.18.3-tools.jar     NOTICE.txt
had...@renweiyu-desktop:/usr/local/hadoop$ cd ..
had...@renweiyu-desktop:/usr/local$ ls
bin  etc  games  gutenberg  hadoop  hadoop-0.18.3.tar.gz  hadoop-datastore
include  lib  man  sbin  share  src
had...@renweiyu-desktop:/usr/local$ hadoop/bin/hadoop dfs -copyFromLocal
gutenberg gutenberg
09/02/04 15:58:21 INFO ipc.Client: Retrying connect to server: localhost/
127.0.0.1:54310. Already tried 0 time(s).
09/02/04 15:58:22 INFO ipc.Client: Retrying connect to server: localhost/
127.0.0.1:54310. Already tried 1 time(s).
09/02/04 15:58:23 INFO ipc.Client: Retrying connect to server: localhost/
127.0.0.1:54310. Already tried 2 time(s).
09/02/04 15:58:24 INFO ipc.Client: Retrying connect to server: localhost/
127.0.0.1:54310. Already tried 3 time(s).
09/02/04 15:58:25 INFO ipc.Client: Retrying connect to server: localhost/
127.0.0.1:54310. Already tried 4 time(s).
09/02/04 15:58:26 INFO ipc.Client: Retrying connect to server: localhost/
127.0.0.1:54310. Already tried 5 time(s).
09/02/04 15:58:27 INFO ipc.Client: Retrying connect to server: localhost/
127.0.0.1:54310. Already tried 6 time(s).
09/02/04 15:58:28 INFO ipc.Client: Retrying connect to server: localhost/
127.0.0.1:54310. Already tried 7 time(s).
09/02/04 15:58:29 INFO ipc.Client: Retrying connect to server: localhost/
127.0.0.1:54310. Already tried 8 time(s).
09/02/04 15:58:30 INFO ipc.Client: Retrying connect to server: localhost/
127.0.0.1:54310. Already tried 9 time(s).
Bad connection to FS. command aborted.

The command jps shows that the Hadoop system is up and running. So I have no
idea what's wrong!

Thanks!
Mithila



  

Bad connection to FS.

2009-02-04 Thread Mithila Nagendra
Hey all

When I try to copy a folder from the local file system in to the HDFS using
the command hadoop dfs -copyFromLocal, the copy fails and it gives an error
which says "Bad connection to FS". How do I get past this? The following is
the output at the time of execution:

had...@renweiyu-desktop:/usr/local/hadoop$ jps
6873 Jps
6299 JobTracker
6029 DataNode
6430 TaskTracker
6189 SecondaryNameNode
had...@renweiyu-desktop:/usr/local/hadoop$ ls
bin          docs                        lib          README.txt
build.xml    hadoop-0.18.3-ant.jar       libhdfs      src
c++          hadoop-0.18.3-core.jar      librecordio  webapps
CHANGES.txt  hadoop-0.18.3-examples.jar  LICENSE.txt
conf         hadoop-0.18.3-test.jar      logs
contrib      hadoop-0.18.3-tools.jar     NOTICE.txt
had...@renweiyu-desktop:/usr/local/hadoop$ cd ..
had...@renweiyu-desktop:/usr/local$ ls
bin  etc  games  gutenberg  hadoop  hadoop-0.18.3.tar.gz  hadoop-datastore
include  lib  man  sbin  share  src
had...@renweiyu-desktop:/usr/local$ hadoop/bin/hadoop dfs -copyFromLocal
gutenberg gutenberg
09/02/04 15:58:21 INFO ipc.Client: Retrying connect to server: localhost/
127.0.0.1:54310. Already tried 0 time(s).
09/02/04 15:58:22 INFO ipc.Client: Retrying connect to server: localhost/
127.0.0.1:54310. Already tried 1 time(s).
09/02/04 15:58:23 INFO ipc.Client: Retrying connect to server: localhost/
127.0.0.1:54310. Already tried 2 time(s).
09/02/04 15:58:24 INFO ipc.Client: Retrying connect to server: localhost/
127.0.0.1:54310. Already tried 3 time(s).
09/02/04 15:58:25 INFO ipc.Client: Retrying connect to server: localhost/
127.0.0.1:54310. Already tried 4 time(s).
09/02/04 15:58:26 INFO ipc.Client: Retrying connect to server: localhost/
127.0.0.1:54310. Already tried 5 time(s).
09/02/04 15:58:27 INFO ipc.Client: Retrying connect to server: localhost/
127.0.0.1:54310. Already tried 6 time(s).
09/02/04 15:58:28 INFO ipc.Client: Retrying connect to server: localhost/
127.0.0.1:54310. Already tried 7 time(s).
09/02/04 15:58:29 INFO ipc.Client: Retrying connect to server: localhost/
127.0.0.1:54310. Already tried 8 time(s).
09/02/04 15:58:30 INFO ipc.Client: Retrying connect to server: localhost/
127.0.0.1:54310. Already tried 9 time(s).
Bad connection to FS. command aborted.

The command jps shows that the Hadoop system is up and running. So I have no
idea what's wrong!

Thanks!
Mithila


Re: HADOOP-2536 supports Oracle too?

2009-02-04 Thread Amandeep Khurana
Ok. I created the same database in a MySQL database and ran the same hadoop
job against it. It worked. So, that means there is some Oracle specific
issue. It cant be an issue with the JDBC drivers since I am using the same
drivers in a simple JDBC client.

What could it be?

Amandeep


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Wed, Feb 4, 2009 at 10:26 AM, Amandeep Khurana  wrote:

> Ok. I'm not sure if I got it correct. Are you saying, I should test the
> statement that hadoop creates directly with the database?
>
> Amandeep
>
>
> Amandeep Khurana
> Computer Science Graduate Student
> University of California, Santa Cruz
>
>
> On Wed, Feb 4, 2009 at 7:13 AM, Enis Soztutar  wrote:
>
>> Hadoop-2536 connects to the db via JDBC, so in theory it should work with
>> proper jdbc drivers.
>> It has been tested against MySQL, Hsqldb, and PostreSQL, but not Oracle.
>>
>> To answer your earlier question, the actual SQL statements might not be
>> recognized by Oracle, so I suggest the best way to test this is to insert
>> print statements, and run the actual SQL statements against Oracle to see if
>> the syntax is accepted.
>>
>> We would appreciate if you publish your results.
>>
>> Enis
>>
>>
>> Amandeep Khurana wrote:
>>
>>> Does the patch HADOOP-2536 support connecting to Oracle databases as
>>> well?
>>> Or is it just limited to MySQL?
>>>
>>> Amandeep
>>>
>>>
>>> Amandeep Khurana
>>> Computer Science Graduate Student
>>> University of California, Santa Cruz
>>>
>>>
>>>
>>
>
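
A standalone check along the lines Enis suggests is just a plain JDBC client
that executes the generated SELECT and reports whether Oracle accepts the
syntax. A minimal sketch; the URL and credentials below are placeholders:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Sketch: run one SQL statement (passed as the first argument) against Oracle
// and print the first column of each row, to see whether the syntax is accepted.
public class OracleQueryCheck {
  public static void main(String[] args) throws Exception {
    Class.forName("oracle.jdbc.driver.OracleDriver");
    Connection conn = DriverManager.getConnection(
        "jdbc:oracle:thin:@dbhost:1521:PSEDEV", "user", "pass");
    Statement st = conn.createStatement();
    ResultSet rs = st.executeQuery(args[0]);   // the generated SELECT statement
    while (rs.next()) {
      System.out.println(rs.getString(1));
    }
    rs.close();
    st.close();
    conn.close();
  }
}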


RE: Control over max map/reduce tasks per job

2009-02-04 Thread Jonathan Gray
I have filed an issue for this:

https://issues.apache.org/jira/browse/HADOOP-5170

JG

> -Original Message-
> From: Bryan Duxbury [mailto:br...@rapleaf.com]
> Sent: Tuesday, February 03, 2009 10:59 PM
> To: core-user@hadoop.apache.org
> Subject: Re: Control over max map/reduce tasks per job
> 
> This sounds good enough for a JIRA ticket to me.
> -Bryan
> 
> On Feb 3, 2009, at 11:44 AM, Jonathan Gray wrote:
> 
> > Chris,
> >
> > For my specific use cases, it would be best to be able to set N
> > mappers/reducers per job per node (so I can explicitly say, run at
> > most 2 at
> > a time of this CPU bound task on any given node).  However, the
> > other way
> > would work as well (on 10 node system, would set job to max 20
> > tasks at a
> > time globally), but opens up the possibility that a node could be
> > assigned
> > more than 2 of that task.
> >
> > I would work with whatever is easiest to implement as either would
> > be a vast
> > improvement for me (can run high numbers of network latency bound
> > tasks
> > without fear of cpu bound tasks killing the cluster).
> >
> > JG
> >
> >
> >
> >> -Original Message-
> >> From: Chris K Wensel [mailto:ch...@wensel.net]
> >> Sent: Tuesday, February 03, 2009 11:34 AM
> >> To: core-user@hadoop.apache.org
> >> Subject: Re: Control over max map/reduce tasks per job
> >>
> >> Hey Jonathan
> >>
> >> Are you looking to limit the total number of concurrent mapper/
> >> reducers a single job can consume cluster wide, or limit the number
> >> per node?
> >>
> >> That is, you have X mappers/reducers, but only can allow N mappers/
> >> reducers to run at a time globally, for a given job.
> >>
> >> Or, you are cool with all X running concurrently globally, but
> >> want to
> >> guarantee that no node can run more than N tasks from that job?
> >>
> >> Or both?
> >>
> >> just reconciling the conversation we had last week with this thread.
> >>
> >> ckw
> >>
> >> On Feb 3, 2009, at 11:16 AM, Jonathan Gray wrote:
> >>
> >>> All,
> >>>
> >>>
> >>>
> >>> I have a few relatively small clusters (5-20 nodes) and am having
> >>> trouble
> >>> keeping them loaded with my MR jobs.
> >>>
> >>>
> >>>
> >>> The primary issue is that I have different jobs that have
> >>> drastically
> >>> different patterns.  I have jobs that read/write to/from HBase or
> >>> Hadoop
> >>> with minimal logic (network throughput bound or io bound), others
> >> that
> >>> perform crawling (network latency bound), and one huge parsing
> >>> streaming job
> >>> (very CPU bound, each task eats a core).
> >>>
> >>>
> >>>
> >>> I'd like to launch very large numbers of tasks for network latency
> >>> bound
> >>> jobs, however the large CPU bound job means I have to keep the max
> >>> maps
> >>> allowed per node low enough as to not starve the Datanode and
> >>> Regionserver.
> >>>
> >>>
> >>>
> >>> I'm an HBase dev but not familiar enough with Hadoop MR code to
> even
> >>> know
> >>> what would be involved with implementing this.  However, in talking
> >>> with
> >>> other users, it seems like this would be a well-received option.
> >>>
> >>>
> >>>
> >>> I wanted to ping the list before filing an issue because it seems
> >> like
> >>> someone may have thought about this in the past.
> >>>
> >>>
> >>>
> >>> Thanks.
> >>>
> >>>
> >>>
> >>> Jonathan Gray
> >>>
> >>
> >> --
> >> Chris K Wensel
> >> ch...@wensel.net
> >> http://www.cascading.org/
> >> http://www.scaleunlimited.com/
> >



Re: Regarding "Hadoop multi cluster" set-up

2009-02-04 Thread Ian Soboroff
I would love to see someplace a complete list of the ports that the  
various Hadoop daemons expect to have open.  Does anyone have that?


Ian
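
One way to see which addresses and ports a given installation will use is to
dump the relevant configuration keys. A sketch; the key list reflects the
0.19-era defaults I am aware of and is not guaranteed to be exhaustive:

import org.apache.hadoop.conf.Configuration;

// Sketch: print the address/port settings read from hadoop-default.xml and
// hadoop-site.xml on the classpath.
public class PrintHadoopPorts {
  public static void main(String[] args) {
    String[] keys = {
      "fs.default.name",                   // namenode RPC (e.g. hdfs://master:54310)
      "mapred.job.tracker",                // jobtracker RPC
      "dfs.datanode.address",              // datanode data transfer (default :50010)
      "dfs.datanode.ipc.address",          // datanode IPC (default :50020)
      "dfs.http.address",                  // namenode web UI (default :50070)
      "dfs.datanode.http.address",         // datanode web UI (default :50075)
      "dfs.secondary.http.address",        // secondary namenode web UI (default :50090)
      "mapred.job.tracker.http.address",   // jobtracker web UI (default :50030)
      "mapred.task.tracker.http.address"   // tasktracker web UI (default :50060)
    };
    Configuration conf = new Configuration();
    for (String k : keys) {
      System.out.println(k + " = " + conf.get(k));
    }
  }
}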

On Feb 4, 2009, at 1:16 PM, shefali pawar wrote:



Hi,

I will have to check. I can do that tomorrow in college. But if that
is the case, what should I do?


Should I change the port number and try again?

Shefali

On Wed, 04 Feb 2009 S D wrote :

Shefali,

Is your firewall blocking port 54310 on the master?

John

On Wed, Feb 4, 2009 at 12:34 PM, shefali pawar >wrote:



Hi,

I am trying to set up a two-node cluster using Hadoop 0.19.0, with 1
master (which should also work as a slave) and 1 slave node.

But while running bin/start-dfs.sh the datanode is not starting on  
the
slave. I had read the previous mails on the list, but nothing  
seems to be

working in this case. I am getting the following error in the
hadoop-root-datanode-slave log file while running the command
bin/start-dfs.sh =>

2009-02-03 13:00:27,516 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = slave/172.16.0.32
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.19.0
STARTUP_MSG:   build =
https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r
713890; compiled by 'ndaley' on Fri Nov 14 03:12:29 UTC 2008
/
2009-02-03 13:00:28,725 INFO org.apache.hadoop.ipc.Client:  
Retrying connect

to server: master/172.16.0.46:54310. Already tried 0 time(s).
2009-02-03 13:00:29,726 INFO org.apache.hadoop.ipc.Client:  
Retrying connect

to server: master/172.16.0.46:54310. Already tried 1 time(s).
2009-02-03 13:00:30,727 INFO org.apache.hadoop.ipc.Client:  
Retrying connect

to server: master/172.16.0.46:54310. Already tried 2 time(s).
2009-02-03 13:00:31,728 INFO org.apache.hadoop.ipc.Client:  
Retrying connect

to server: master/172.16.0.46:54310. Already tried 3 time(s).
2009-02-03 13:00:32,729 INFO org.apache.hadoop.ipc.Client:  
Retrying connect

to server: master/172.16.0.46:54310. Already tried 4 time(s).
2009-02-03 13:00:33,730 INFO org.apache.hadoop.ipc.Client:  
Retrying connect

to server: master/172.16.0.46:54310. Already tried 5 time(s).
2009-02-03 13:00:34,731 INFO org.apache.hadoop.ipc.Client:  
Retrying connect

to server: master/172.16.0.46:54310. Already tried 6 time(s).
2009-02-03 13:00:35,732 INFO org.apache.hadoop.ipc.Client:  
Retrying connect

to server: master/172.16.0.46:54310. Already tried 7 time(s).
2009-02-03 13:00:36,733 INFO org.apache.hadoop.ipc.Client:  
Retrying connect

to server: master/172.16.0.46:54310. Already tried 8 time(s).
2009-02-03 13:00:37,734 INFO org.apache.hadoop.ipc.Client:  
Retrying connect

to server: master/172.16.0.46:54310. Already tried 9 time(s).
2009-02-03 13:00:37,738 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call
to master/172.16.0.46:54310 failed on local exception: No route to host

  at org.apache.hadoop.ipc.Client.call(Client.java:699)
  at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
  at $Proxy4.getProtocolVersion(Unknown Source)
  at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
  at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:306)
  at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:343)
  at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:288)
  at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:258)
  at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:205)
  at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1199)
  at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1154)
  at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1162)
  at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1284)
Caused by: java.net.NoRouteToHostException: No route to host
  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
  at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
  at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)
  at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:299)
  at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176)
  at org.apache.hadoop.ipc.Client.getConnection(Client.java:772)
  at org.apache.hadoop.ipc.Client.call(Client.java:685)
  ... 12 more

2009-02-03 13:00:37,739 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/
SHUTDOWN_MSG: Shutting down DataNode at slave/172.16.0.32
/


Also, the pseudo-distributed operation is working on both machines. And
I am able to ssh from 'master to master' 

Re: Chukwa documentation

2009-02-04 Thread Ariel Rabkin
Howdy.

You do not need torque. It's not even helpful, as far as I know.
You don't need a database, but if you don't have one, you'd probably
need to do a bit more work to analyze the collected data in HDFS. If
you were going to be using MapReduce for analysis anyway, that's
probably a non-issue for you.

We're working on documentation, but it's sort of chasing a moving
target, since the Chukwa codebase and configuration interfaces are
still in flux.

--Ari


On Tue, Feb 3, 2009 at 5:00 PM,   wrote:
> Hi Everybody,
>
> I don't know if there is a mailing list for Chukwa, so I apologize in
> advance if this is not the right place to ask my questions.
>
> I have the following questions and comments:
>
> Configuring the collector and the agent was simple.
> However, there are other features that are not documented at all, like:
> - torque (Do I have to install torque first? Yes? No? And why?),
> - database (Do I have to have a DB?),
> - what is queueinfo.properties, and what kind of information does it provide?
> - and there is more that I need to dig into the code to understand.
> Could somebody update the Chukwa documentation?
>


-- 
Ari Rabkin asrab...@gmail.com
UC Berkeley Computer Science Department


Re: How to use DBInputFormat?

2009-02-04 Thread Amandeep Khurana
Adding a semicolon gives me the error "ORA-00911: Invalid character"

Amandeep


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Wed, Feb 4, 2009 at 6:46 AM, Rasit OZDAS  wrote:

> Amandeep,
> "SQL command not properly ended"
> I get this error whenever I forget the semicolon at the end.
> I know, it doesn't make sense, but I recommend giving it a try
>
> Rasit
>
> 2009/2/4 Amandeep Khurana :
> > The same query is working if I write a simple JDBC client and query the
> > database. So, I'm probably doing something wrong in the connection
> settings.
> > But the error looks to be on the query side more than the connection
> side.
> >
> > Amandeep
> >
> >
> > Amandeep Khurana
> > Computer Science Graduate Student
> > University of California, Santa Cruz
> >
> >
> > On Tue, Feb 3, 2009 at 7:25 PM, Amandeep Khurana 
> wrote:
> >
> >> Thanks Kevin
> >>
> >> I couldnt get it work. Here's the error I get:
> >>
> >> bin/hadoop jar ~/dbload.jar LoadTable1
> >> 09/02/03 19:21:17 INFO jvm.JvmMetrics: Initializing JVM Metrics with
> >> processName=JobTracker, sessionId=
> >> 09/02/03 19:21:20 INFO mapred.JobClient: Running job: job_local_0001
> >> 09/02/03 19:21:21 INFO mapred.JobClient:  map 0% reduce 0%
> >> 09/02/03 19:21:22 INFO mapred.MapTask: numReduceTasks: 0
> >> 09/02/03 19:21:24 WARN mapred.LocalJobRunner: job_local_0001
> >> java.io.IOException: ORA-00933: SQL command not properly ended
> >>
> >> at
> >>
> org.apache.hadoop.mapred.lib.db.DBInputFormat.getRecordReader(DBInputFormat.java:289)
> >> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:321)
> >> at
> >> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)
> >> java.io.IOException: Job failed!
> >> at
> org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1217)
> >> at LoadTable1.run(LoadTable1.java:130)
> >> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >> at LoadTable1.main(LoadTable1.java:107)
> >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> >> at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown
> Source)
> >> at java.lang.reflect.Method.invoke(Unknown Source)
> >> at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
> >> at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
> >> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> >> at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
> >>
> >> Exception closing file
> >>
> /user/amkhuran/contract_table/_temporary/_attempt_local_0001_m_00_0/part-0
> >> java.io.IOException: Filesystem closed
> >> at
> org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:198)
> >> at
> org.apache.hadoop.hdfs.DFSClient.access$600(DFSClient.java:65)
> >> at
> >>
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3084)
> >> at
> >>
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3053)
> >> at
> >> org.apache.hadoop.hdfs.DFSClient$LeaseChecker.close(DFSClient.java:942)
> >> at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:210)
> >> at
> >>
> org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:243)
> >> at
> >> org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1413)
> >> at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:236)
> >> at
> >> org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:221)
> >>
> >>
> >> Here's my code:
> >>
> >> public class LoadTable1 extends Configured implements Tool  {
> >>
> >>   // data destination on hdfs
> >>   private static final String CONTRACT_OUTPUT_PATH =
> "contract_table";
> >>
> >>   // The JDBC connection URL and driver implementation class
> >>
> >> private static final String CONNECT_URL =
> >> "jdbc:oracle:thin:@dbhost:1521:PSEDEV";
> >>   private static final String DB_USER = "user";
> >>   private static final String DB_PWD = "pass";
> >>   private static final String DATABASE_DRIVER_CLASS =
> >> "oracle.jdbc.driver.OracleDriver";
> >>
> >>   private static final String CONTRACT_INPUT_TABLE =
> >> "OSE_EPR_CONTRACT";
> >>
> >>   private static final String [] CONTRACT_INPUT_TABLE_FIELDS = {
> >> "PORTFOLIO_NUMBER", "CONTRACT_NUMBER"};
> >>
> >>   private static final String ORDER_CONTRACT_BY_COL =
> >> "CONTRACT_NUMBER";
> >>
> >>
> >> static class ose_epr_contract implements Writable, DBWritable {
> >>
> >>
> >> String CONTRACT_NUMBER;
> >>
> >>
> >> public void readFields(DataInput in) throws IOException {
> >>
> >>   

Re: Batch processing with Hadoop -- does HDFS scale for parallel reads?

2009-02-04 Thread Brian Bockelman

Sounds overly complicated.  Complicated usually leads to mistakes :)

What about just having a single cluster and only running the  
tasktrackers on the fast CPUs?  No messy cross-cluster transferring.


Brian

On Feb 4, 2009, at 12:46 PM, TCK wrote:




Thanks, Brian. This sounds encouraging for us.

What are the advantages/disadvantages of keeping a persistent  
storage (HD/K)FS cluster separate from a processing Hadoop+(HD/K)FS  
cluster ?
The advantage I can think of is that a permanent storage cluster has  
different requirements from a map-reduce processing cluster -- the  
permanent storage cluster would need faster, bigger hard disks, and  
would need to grow as the total volume of all collected logs grows,  
whereas the processing cluster would need fast CPUs and would only  
need to grow with the rate of incoming data. So it seems to make  
sense to me to copy a piece of data from the permanent storage  
cluster to the processing cluster only when it needs to be  
processed. Is my line of thinking reasonable? How would this compare
to running the map-reduce processing on the same cluster the data is
stored in? Which approach is used by most people?


Best Regards,
TCK



--- On Wed, 2/4/09, Brian Bockelman  wrote:
From: Brian Bockelman 
Subject: Re: Batch processing with Hadoop -- does HDFS scale for  
parallel reads?

To: core-user@hadoop.apache.org
Date: Wednesday, February 4, 2009, 1:06 PM

Hey TCK,

We use HDFS+FUSE solely as a storage solution for an application which
doesn't understand MapReduce.  We've scaled this solution to around
80Gbps.  For 300 processes reading from the same file, we get about  
20Gbps.


Do consider your data retention policies -- I would say that Hadoop  
as a
storage system is thus far about 99% reliable for storage and is not  
a backup
solution.  If you're scared of getting more than 1% of your logs  
lost, have
a good backup solution.  I would also add that when you are learning  
your
operational staff's abilities, expect even more data loss.  As you  
gain

experience, data loss goes down.

I don't believe we've lost a single block in the last month, but it
took us 2-3 months of 1%-level losses to get here.

Brian

On Feb 4, 2009, at 11:51 AM, TCK wrote:



Hey guys,

We have been using Hadoop to do batch processing of logs. The logs  
get
written and stored on a NAS. Our Hadoop cluster periodically copies  
a batch of

new logs from the NAS, via NFS into Hadoop's HDFS, processes them, and
copies the output back to the NAS. The HDFS is cleaned up at the end  
of each

batch (ie, everything in it is deleted).


The problem is that reads off the NAS via NFS don't scale even if we
try to scale the copying process by adding more threads to read in  
parallel.


If we instead stored the log files on an HDFS cluster (instead of  
NAS), it
seems like the reads would scale since the data can be read from  
multiple data
nodes at the same time without any contention (except network IO,  
which

shouldn't be a problem).


I would appreciate if anyone could share any similar experience  
they have

had with doing parallel reads from a storage HDFS.


Also is it a good idea to have a separate HDFS for storage vs for  
doing

the batch processing ?


Best Regards,
TCK













Re: Batch processing with Hadoop -- does HDFS scale for parallel reads?

2009-02-04 Thread TCK


Thanks, Brian. This sounds encouraging for us.

What are the advantages/disadvantages of keeping a persistent storage (HD/K)FS 
cluster separate from a processing Hadoop+(HD/K)FS cluster ?
The advantage I can think of is that a permanent storage cluster has different 
requirements from a map-reduce processing cluster -- the permanent storage 
cluster would need faster, bigger hard disks, and would need to grow as the 
total volume of all collected logs grows, whereas the processing cluster would 
need fast CPUs and would only need to grow with the rate of incoming data. So 
it seems to make sense to me to copy a piece of data from the permanent storage 
cluster to the processing cluster only when it needs to be processed. Is my 
line of thinking reasonable? How would this compare to running the map-reduce
processing on the same cluster the data is stored in? Which approach is used by
most people?

Best Regards,
TCK



--- On Wed, 2/4/09, Brian Bockelman  wrote:
From: Brian Bockelman 
Subject: Re: Batch processing with Hadoop -- does HDFS scale for parallel reads?
To: core-user@hadoop.apache.org
Date: Wednesday, February 4, 2009, 1:06 PM

Hey TCK,

We use HDFS+FUSE solely as a storage solution for an application which
doesn't understand MapReduce.  We've scaled this solution to around
80Gbps.  For 300 processes reading from the same file, we get about 20Gbps.

Do consider your data retention policies -- I would say that Hadoop as a
storage system is thus far about 99% reliable for storage and is not a backup
solution.  If you're scared of getting more than 1% of your logs lost, have
a good backup solution.  I would also add that when you are learning your
operational staff's abilities, expect even more data loss.  As you gain
experience, data loss goes down.

I don't believe we've lost a single block in the last month, but it
took us 2-3 months of 1%-level losses to get here.

Brian

On Feb 4, 2009, at 11:51 AM, TCK wrote:

> 
> Hey guys,
> 
> We have been using Hadoop to do batch processing of logs. The logs get
written and stored on a NAS. Our Hadoop cluster periodically copies a batch of
new logs from the NAS, via NFS into Hadoop's HDFS, processes them, and
copies the output back to the NAS. The HDFS is cleaned up at the end of each
batch (ie, everything in it is deleted).
> 
> The problem is that reads off the NAS via NFS don't scale even if we
try to scale the copying process by adding more threads to read in parallel.
> 
> If we instead stored the log files on an HDFS cluster (instead of NAS), it
seems like the reads would scale since the data can be read from multiple data
nodes at the same time without any contention (except network IO, which
shouldn't be a problem).
> 
> I would appreciate if anyone could share any similar experience they have
had with doing parallel reads from a storage HDFS.
> 
> Also is it a good idea to have a separate HDFS for storage vs for doing
the batch processing ?
> 
> Best Regards,
> TCK
> 
> 
> 
> 




  

Re: HDFS Namenode Heap Size woes

2009-02-04 Thread Sean Knapp
Brian, Jason,
Thanks again for your help. Just to close the thread: while following your
suggestions I found I had an incredibly large number of files on my data
nodes that were being marked for invalidation at startup. I believe they
were left behind from an old mass-delete that was followed by a shutdown
before the deletes were performed. I've cleaned out those files and we're
humming along with <1GB heap size.

Thanks,
Sean

On Sun, Feb 1, 2009 at 10:48 PM, jason hadoop wrote:

> If you set up your namenode for remote debugging, you could attach with
> eclipse or the debugger of your choice.
>
> Look at the objects in org.apache.hadoop.hdfs.server.namenode.FSNamesystem
>  private UnderReplicatedBlocks neededReplications = new
> UnderReplicatedBlocks();
>  private PendingReplicationBlocks pendingReplications;
>
>  //
>  // Keeps a Collection for every named machine containing
>  // blocks that have recently been invalidated and are thought to live
>  // on the machine in question.
>  // Mapping: StorageID -> ArrayList
>  //
>  private Map<String, Collection<Block>> recentInvalidateSets =
>new TreeMap<String, Collection<Block>>();
>
>  //
>  // Keeps a TreeSet for every named node.  Each treeset contains
>  // a list of the blocks that are "extra" at that location.  We'll
>  // eventually remove these extras.
>  // Mapping: StorageID -> TreeSet
>  //
>  Map<String, Collection<Block>> excessReplicateMap =
>new TreeMap<String, Collection<Block>>();
>
> Much of this is run out of a thread ReplicationMonitor.
>
> In our case we had datanodes with 2 million blocks dropping off and on
> again, and this was trashing these queues with the 2 million blocks on the
> datanodes, re-replicating the blocks and then invalidating them all when
> the datanode came back.
>
>
> On Sun, Feb 1, 2009 at 7:03 PM, Brian Bockelman  >wrote:
>
> > Hey Sean,
> >
> > I use JMX monitoring -- which allows me to trigger GC via jconsole.
> >  There's decent documentation out there to making it work, but you'd have
> to
> > restart the namenode to do it ... let the list know if you can't figure
> it
> > out.
> >
> > Brian
> >
> >
> > On Feb 1, 2009, at 8:59 PM, Sean Knapp wrote:
> >
> >  Brian,
> >> Thanks for jumping in as well. Is there a recommended way of manually
> >> triggering GC?
> >>
> >> Thanks,
> >> Sean
> >>
> >> On Sun, Feb 1, 2009 at 6:06 PM, Brian Bockelman  >> >wrote:
> >>
> >>  Hey Sean,
> >>>
> >>> Dumb question: how much memory is used after a garbage collection
> cycle?
> >>>
> >>> Look at the graph "jvm.metrics.memHeapUsedM":
> >>>
> >>>
> >>>
> >>>
> http://rcf.unl.edu/ganglia/?m=network_report&r=hour&s=descending&c=red&h=hadoop-name&sh=1&hc=4&z=small
> >>>
> >>> If you tell the JVM it has 16GB of memory to play with, it will often
> use
> >>> a
> >>> significant portion of that before it does a thorough GC.  In our site,
> >>> it
> >>> actually only needs ~ 500MB, but sometimes it will hit 1GB before GC is
> >>> triggered.  One of the vagaries of Java, eh?
> >>>
> >>> Trigger a GC and see how much is actually used.
> >>>
> >>> Brian
> >>>
> >>>
> >>> On Feb 1, 2009, at 6:11 PM, Sean Knapp wrote:
> >>>
> >>> Jason,
> >>>
>  Thanks for the response. By falling out, do you mean a longer time
> since
>  last contact (100s+), or fully timed out where it is dropped into dead
>  nodes? The former happens fairly often, the latter only under serious
>  load
>  but not in the last day. Also, my namenode is now up to 10GB with less
>  than
>  700k files after some additional archiving.
> 
>  Thanks,
>  Sean
> 
>  On Sun, Feb 1, 2009 at 4:00 PM, jason hadoop 
>  wrote:
> 
>  If your datanodes are pausing and falling out of the cluster you will
>  get
> 
> > a
> > large workload for the namenode of blocks to replicate and when the
> > paused
> > datanode comes back, a large workload of blocks to delete.
> > These lists are stored in memory on the namenode.
> > The startup messages lead me to wonder if your datanodes are
> > periodically
> > pausing or are otherwise dropping in and out of the cluster.
> >
> > On Sat, Jan 31, 2009 at 2:20 PM, Sean Knapp  wrote:
> >
> > I'm running 0.19.0 on a 10 node cluster (8 core, 16GB RAM, 4x1.5TB).
> > The
> >
> >> current status of my FS is approximately 1 million files and
> >> directories,
> >> 950k blocks, and heap size of 7GB (16GB reserved). Average block
> >> replication
> >> is 3.8. I'm concerned that the heap size is steadily climbing... a
> 7GB
> >>
> >>  heap
> >
> >  is substantially higher per file that I have on a similar 0.18.2
> >> cluster,
> >> which has closer to a 1GB heap.
> >> My typical usage model is 1) write a number of small files into HDFS
> >>
> >>  (tens
> >
> >  or hundreds of thousands at a time), 2) archive those files, 3)
> delete
> >>
> >>  the
> >
> >  originals. I've tried dropping the replication factor of the _index
> >> and
> >> _masterindex fi

Re: HADOOP-2536 supports Oracle too?

2009-02-04 Thread Amandeep Khurana
OK, I'm not sure I've got it right. Are you saying I should test the
statement that Hadoop creates directly against the database?

Amandeep


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Wed, Feb 4, 2009 at 7:13 AM, Enis Soztutar  wrote:

> Hadoop-2536 connects to the db via JDBC, so in theory it should work with
> proper jdbc drivers.
> It has been tested against MySQL, Hsqldb, and PostgreSQL, but not Oracle.
>
> To answer your earlier question, the actual SQL statements might not be
> recognized by Oracle, so I suggest the best way to test this is to insert
> print statements, and run the actual SQL statements against Oracle to see if
> the syntax is accepted.
>
> We would appreciate it if you publish your results.
>
> Enis
>
>
> Amandeep Khurana wrote:
>
>> Does the patch HADOOP-2536 support connecting to Oracle databases as well?
>> Or is it just limited to MySQL?
>>
>> Amandeep
>>
>>
>> Amandeep Khurana
>> Computer Science Graduate Student
>> University of California, Santa Cruz
>>
>>
>>
>


Re: Re: Regarding "Hadoop multi cluster" set-up

2009-02-04 Thread shefali pawar
  
Hi,

I will have to check. I can do that tomorrow in college. But if that is the
case, what should I do?

Should I change the port number and try again?

Shefali

On Wed, 04 Feb 2009 S D wrote :
>Shefali,
>
>Is your firewall blocking port 54310 on the master?
>
>John
>
>On Wed, Feb 4, 2009 at 12:34 PM, shefali pawar wrote:
>
> > Hi,
> >
> > I am trying to set up a two-node cluster using Hadoop 0.19.0, with 1
> > master (which should also work as a slave) and 1 slave node.
> >
> > But while running bin/start-dfs.sh the datanode is not starting on the
> > slave. I had read the previous mails on the list, but nothing seems to be
> > working in this case. I am getting the following error in the
> > hadoop-root-datanode-slave log file while running the command
> > bin/start-dfs.sh =>
> >
> > 2009-02-03 13:00:27,516 INFO
> > org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
> > /
> > STARTUP_MSG: Starting DataNode
> > STARTUP_MSG:   host = slave/172.16.0.32
> > STARTUP_MSG:   args = []
> > STARTUP_MSG:   version = 0.19.0
> > STARTUP_MSG:   build =
> > https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r
> > 713890; compiled by 'ndaley' on Fri Nov 14 03:12:29 UTC 2008
> > /
> > 2009-02-03 13:00:28,725 INFO org.apache.hadoop.ipc.Client: Retrying connect
> > to server: master/172.16.0.46:54310. Already tried 0 time(s).
> > 2009-02-03 13:00:29,726 INFO org.apache.hadoop.ipc.Client: Retrying connect
> > to server: master/172.16.0.46:54310. Already tried 1 time(s).
> > 2009-02-03 13:00:30,727 INFO org.apache.hadoop.ipc.Client: Retrying connect
> > to server: master/172.16.0.46:54310. Already tried 2 time(s).
> > 2009-02-03 13:00:31,728 INFO org.apache.hadoop.ipc.Client: Retrying connect
> > to server: master/172.16.0.46:54310. Already tried 3 time(s).
> > 2009-02-03 13:00:32,729 INFO org.apache.hadoop.ipc.Client: Retrying connect
> > to server: master/172.16.0.46:54310. Already tried 4 time(s).
> > 2009-02-03 13:00:33,730 INFO org.apache.hadoop.ipc.Client: Retrying connect
> > to server: master/172.16.0.46:54310. Already tried 5 time(s).
> > 2009-02-03 13:00:34,731 INFO org.apache.hadoop.ipc.Client: Retrying connect
> > to server: master/172.16.0.46:54310. Already tried 6 time(s).
> > 2009-02-03 13:00:35,732 INFO org.apache.hadoop.ipc.Client: Retrying connect
> > to server: master/172.16.0.46:54310. Already tried 7 time(s).
> > 2009-02-03 13:00:36,733 INFO org.apache.hadoop.ipc.Client: Retrying connect
> > to server: master/172.16.0.46:54310. Already tried 8 time(s).
> > 2009-02-03 13:00:37,734 INFO org.apache.hadoop.ipc.Client: Retrying connect
> > to server: master/172.16.0.46:54310. Already tried 9 time(s).
> > 2009-02-03 13:00:37,738 ERROR
> > org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call
> > to master/172.16.0.46:54310 failed on local exception: No route to host
> >at org.apache.hadoop.ipc.Client.call(Client.java:699)
> >at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
> >at $Proxy4.getProtocolVersion(Unknown Source)
> >at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
> >at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:306)
> >at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:343)
> >at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:288)
> >at
> > org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:258)
> >at
> > org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:205)
> >at
> > org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1199)
> >at
> > org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1154)
> >at
> > org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1162)
> >at
> > org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1284)
> > Caused by: java.net.NoRouteToHostException: No route to host
> >at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> >at
> > sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
> >at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)
> >at
> > org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:299)
> >at
> > org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176)
> >at org.apache.hadoop.ipc.Client.getConnection(Client.java:772)
> >at org.apache.hadoop.ipc.Client.call(Client.java:685)
> >... 12 more
> >
> > 2009-02-03 13:00:37,739 INFO
> > org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
> > /
> > SHUTDOWN_MSG: Shutting down DataNode at slave/172.16.0.32
> > /
> >
> >
> > Also, the Pseudo distributed operation is

Re: Batch processing with Hadoop -- does HDFS scale for parallel reads?

2009-02-04 Thread Brian Bockelman

Hey TCK,

We use HDFS+FUSE solely as a storage solution for a application which  
doesn't understand MapReduce.  We've scaled this solution to around  
80Gbps.  For 300 processes reading from the same file, we get about  
20Gbps.


Do consider your data retention policies -- I would say that Hadoop as  
a storage system is thus far about 99% reliable for storage and is not  
a backup solution.  If you're scared of getting more than 1% of your  
logs lost, have a good backup solution.  I would also add that when  
you are learning your operational staff's abilities, expect even more  
data loss.  As you gain experience, data loss goes down.


I don't believe we've lost a single block in the last month, but it  
took us 2-3 months of 1%-level losses to get here.


Brian

On Feb 4, 2009, at 11:51 AM, TCK wrote:



Hey guys,

We have been using Hadoop to do batch processing of logs. The logs  
get written and stored on a NAS. Our Hadoop cluster periodically  
copies a batch of new logs from the NAS, via NFS into Hadoop's HDFS,  
processes them, and copies the output back to the NAS. The HDFS is  
cleaned up at the end of each batch (ie, everything in it is deleted).


The problem is that reads off the NAS via NFS don't scale even if we  
try to scale the copying process by adding more threads to read in  
parallel.


If we instead stored the log files on an HDFS cluster (instead of  
NAS), it seems like the reads would scale since the data can be read  
from multiple data nodes at the same time without any contention  
(except network IO, which shouldn't be a problem).


I would appreciate if anyone could share any similar experience they  
have had with doing parallel reads from a storage HDFS.


Also is it a good idea to have a separate HDFS for storage vs for  
doing the batch processing ?


Best Regards,
TCK








Batch processing with Hadoop -- does HDFS scale for parallel reads?

2009-02-04 Thread TCK

Hey guys, 

We have been using Hadoop to do batch processing of logs. The logs get written 
and stored on a NAS. Our Hadoop cluster periodically copies a batch of new logs 
from the NAS, via NFS into Hadoop's HDFS, processes them, and copies the output 
back to the NAS. The HDFS is cleaned up at the end of each batch (ie, 
everything in it is deleted).

The problem is that reads off the NAS via NFS don't scale even if we try to 
scale the copying process by adding more threads to read in parallel.

If we instead stored the log files on an HDFS cluster (instead of NAS), it 
seems like the reads would scale since the data can be read from multiple data 
nodes at the same time without any contention (except network IO, which 
shouldn't be a problem).

I would appreciate if anyone could share any similar experience they have had 
with doing parallel reads from a storage HDFS.

Also is it a good idea to have a separate HDFS for storage vs for doing the 
batch processing ?

Best Regards,
TCK




  

Re: Regarding "Hadoop multi cluster" set-up

2009-02-04 Thread S D
Shefali,

Is your firewall blocking port 54310 on the master?
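
One quick way to check from the slave, independent of Hadoop (a throwaway
sketch; the host name and port are taken from your log):

import java.net.InetSocketAddress;
import java.net.Socket;

// Tries to open a TCP connection to the namenode's RPC port from the slave.
// "No route to host" or a timeout here points at firewall/routing, not Hadoop.
public class PortProbe {
  public static void main(String[] args) throws Exception {
    Socket s = new Socket();
    try {
      s.connect(new InetSocketAddress("master", 54310), 5000); // 5 second timeout
      System.out.println("connected - the port is reachable");
    } finally {
      s.close();
    }
  }
}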

John

On Wed, Feb 4, 2009 at 12:34 PM, shefali pawar wrote:

> Hi,
>
> I am trying to set up a two-node cluster using Hadoop 0.19.0, with 1
> master (which should also work as a slave) and 1 slave node.
>
> But while running bin/start-dfs.sh the datanode is not starting on the
> slave. I had read the previous mails on the list, but nothing seems to be
> working in this case. I am getting the following error in the
> hadoop-root-datanode-slave log file while running the command
> bin/start-dfs.sh =>
>
> 2009-02-03 13:00:27,516 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
> /
> STARTUP_MSG: Starting DataNode
> STARTUP_MSG:   host = slave/172.16.0.32
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 0.19.0
> STARTUP_MSG:   build =
> https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r
> 713890; compiled by 'ndaley' on Fri Nov 14 03:12:29 UTC 2008
> /
> 2009-02-03 13:00:28,725 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: master/172.16.0.46:54310. Already tried 0 time(s).
> 2009-02-03 13:00:29,726 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: master/172.16.0.46:54310. Already tried 1 time(s).
> 2009-02-03 13:00:30,727 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: master/172.16.0.46:54310. Already tried 2 time(s).
> 2009-02-03 13:00:31,728 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: master/172.16.0.46:54310. Already tried 3 time(s).
> 2009-02-03 13:00:32,729 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: master/172.16.0.46:54310. Already tried 4 time(s).
> 2009-02-03 13:00:33,730 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: master/172.16.0.46:54310. Already tried 5 time(s).
> 2009-02-03 13:00:34,731 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: master/172.16.0.46:54310. Already tried 6 time(s).
> 2009-02-03 13:00:35,732 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: master/172.16.0.46:54310. Already tried 7 time(s).
> 2009-02-03 13:00:36,733 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: master/172.16.0.46:54310. Already tried 8 time(s).
> 2009-02-03 13:00:37,734 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: master/172.16.0.46:54310. Already tried 9 time(s).
> 2009-02-03 13:00:37,738 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call
> to master/172.16.0.46:54310 failed on local exception: No route to host
>at org.apache.hadoop.ipc.Client.call(Client.java:699)
>at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>at $Proxy4.getProtocolVersion(Unknown Source)
>at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
>at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:306)
>at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:343)
>at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:288)
>at
> org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:258)
>at
> org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:205)
>at
> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1199)
>at
> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1154)
>at
> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1162)
>at
> org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1284)
> Caused by: java.net.NoRouteToHostException: No route to host
>at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)
>at
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:299)
>at
> org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176)
>at org.apache.hadoop.ipc.Client.getConnection(Client.java:772)
>at org.apache.hadoop.ipc.Client.call(Client.java:685)
>... 12 more
>
> 2009-02-03 13:00:37,739 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
> /
> SHUTDOWN_MSG: Shutting down DataNode at slave/172.16.0.32
> /
>
>
> Also, the pseudo-distributed operation is working on both machines. And
> I am able to ssh from 'master to master' and 'master to slave' via a
> password-less ssh login. I do not think there is any problem with the
> network because cross pinging is working fine.
>
> I am working on Linux (Fedora 8)
>
> The following is the configuration which i am using
>
> On master and slave, /conf/masters looks like this:

Regarding "Hadoop multi cluster" set-up

2009-02-04 Thread shefali pawar
Hi,

I am trying to set up a two-node cluster using Hadoop 0.19.0, with 1
master (which should also work as a slave) and 1 slave node.

But while running bin/start-dfs.sh the datanode is not starting on the slave. I 
had read the previous mails on the list, but nothing seems to be working in 
this case. I am getting the following error in the hadoop-root-datanode-slave 
log file while running the command bin/start-dfs.sh =>

2009-02-03 13:00:27,516 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
STARTUP_MSG: 
/
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = slave/172.16.0.32
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.19.0
STARTUP_MSG:   build = 
https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r 713890; 
compiled by 'ndaley' on Fri Nov 14 03:12:29 UTC 2008
/
2009-02-03 13:00:28,725 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: master/172.16.0.46:54310. Already tried 0 time(s).
2009-02-03 13:00:29,726 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: master/172.16.0.46:54310. Already tried 1 time(s).
2009-02-03 13:00:30,727 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: master/172.16.0.46:54310. Already tried 2 time(s).
2009-02-03 13:00:31,728 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: master/172.16.0.46:54310. Already tried 3 time(s).
2009-02-03 13:00:32,729 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: master/172.16.0.46:54310. Already tried 4 time(s).
2009-02-03 13:00:33,730 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: master/172.16.0.46:54310. Already tried 5 time(s).
2009-02-03 13:00:34,731 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: master/172.16.0.46:54310. Already tried 6 time(s).
2009-02-03 13:00:35,732 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: master/172.16.0.46:54310. Already tried 7 time(s).
2009-02-03 13:00:36,733 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: master/172.16.0.46:54310. Already tried 8 time(s).
2009-02-03 13:00:37,734 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: master/172.16.0.46:54310. Already tried 9 time(s).
2009-02-03 13:00:37,738 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: 
java.io.IOException: Call to master/172.16.0.46:54310 failed on local 
exception: No route to host
at org.apache.hadoop.ipc.Client.call(Client.java:699)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
at $Proxy4.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:306)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:343)
at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:288)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:258)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:205)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1199)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1154)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1162)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1284)
Caused by: java.net.NoRouteToHostException: No route to host
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:299)
at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:772)
at org.apache.hadoop.ipc.Client.call(Client.java:685)
... 12 more

2009-02-03 13:00:37,739 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
SHUTDOWN_MSG: 
/
SHUTDOWN_MSG: Shutting down DataNode at slave/172.16.0.32
/


Also, the pseudo-distributed operation is working on both machines. And I
am able to ssh from 'master to master' and 'master to slave' via a
password-less ssh login. I do not think there is any problem with the network
because cross pinging is working fine.

I am working on Linux (Fedora 8)

The following is the configuration which i am using

On master and slave, /conf/masters looks like this: 

 master

On master and slave, /conf/slaves looks like this: 

 master
 slave

On both the machines conf/hadoop-site.xml looks like this

 
   <property>
     <name>fs.default.name</name>
     <value>hdfs://master:54310</value>
     <description>The name of the default file system.  A URI whose
     scheme an

Control over max map/reduce tasks per job

2009-02-04 Thread Jonathan Gray
All,

 

I have a few relatively small clusters (5-20 nodes) and am having trouble
keeping them loaded with my MR jobs.

 

The primary issue is that I have different jobs that have drastically
different patterns.  I have jobs that read/write to/from HBase or Hadoop
with minimal logic (network throughput bound or io bound), others that
perform crawling (network latency bound), and one huge parsing streaming job
(very CPU bound, each task eats a core).

 

I'd like to launch very large numbers of tasks for the network-latency-bound
jobs; however, the large CPU-bound job means I have to keep the max maps
allowed per node low enough so as not to starve the DataNode and RegionServer.

 

I'm an HBase dev but not familiar enough with Hadoop MR code to even know
what would be involved with implementing this.  However, in talking with
other users, it seems like this would be a well-received option.

 

I wanted to ping the list before filing an issue because it seems like
someone may have thought about this in the past.

 

Thanks.

 

Jonathan Gray



Re: HDD benchmark/checking tool

2009-02-04 Thread Mikhail Yakshin
On Tue, Feb 3, 2009 at 8:53 PM, Dmitry Pushkarev wrote:

> Recently I have had a number of drive failures that slowed down processes a
> lot until they were discovered. Is there any easy way, or tool, to check
> HDD performance and see if there are any IO errors?
>
> Currently I wrote a simple script that looks at /var/log/messages and greps
> everything abnormal for /dev/sdaX. But if you have a better solution I'd
> appreciate it if you share it.

If you have any hardware RAIDs you'd like to monitor/manage, there's a good
chance you'd want to use Einarc to access them:
http://www.inquisitor.ru/doc/einarc/ - in fact, it won't hurt even if
you use just a bunch of HDDs or software RAIDs :)

-- 
WBR, Mikhail Yakshin


Re: Hadoop 0.19, Cascading 1.0 and MultipleOutputs problem

2009-02-04 Thread Mikhail Yakshin
On Wed, Feb 4, 2009 at 10:07 AM, Alejandro Abdelnur  wrote:
> Mikhail,
>
> You are right, please open a Jira on this.
>
> Alejandro

Done:
https://issues.apache.org/jira/browse/HADOOP-5167

-- 
WBR, Mikhail Yakshin


Re: decommissioned node showing up ad dead node in web based interface to namenode (dfshealth.jsp)

2009-02-04 Thread Bill Au
I have been looking into this some more by looking at the output of dfsadmin
-report during the decommissioning process.  After a node has been
decommissioned, dfsadmin -report shows that the node is in the
Decommissioned state.  The web interface dfshealth.jsp shows it as a dead
node.  After I removed the decommissioned node from the exclude file and run
the refreshNodes command, the web interface continues to show it as a dead
node but dfsadmin -report shows the node to be in service.  After I restart
HDFS dfsadmin -report shows the correct information again.

If I restart HDFS leaving the decommissioned node in the exclude file, the web
interface shows it as a dead node and dfsadmin -report shows it to be in
service.  But after I remove it from the exclude file and run the
refreshNodes command, both the web interface and dfsadmin -report show the
correct information.

It looks to me I should only remove the decommissioned node from the exclude
file after restarting HDFS.

I would still like to see the web interface report any decommissioned node
as decommissioned rather than dead, as is the case with dfsadmin -report.
I am willing to work on a patch for this.  Before I start, does anyone know
if this is already in the works?

Bill

On Mon, Feb 2, 2009 at 5:00 PM, Bill Au  wrote:

> It looks like the behavior is the same with 0.18.2 and 0.19.0.  Even though
> I removed the decommissioned node from the exclude file and run the
> refreshNode command, the decommissioned node still show up as a dead node.
> What I did notice is that if I leave the decommissioned node in the exclude
> file and restart HDFS, the node will show up as a dead node after restart.  But
> then if I remove it from the exclude file and run the refreshNode command,
> it will disappear from the status page (dfshealth.jsp).
>
> So it looks like I will have to stop and start the entire cluster in order
> to get what I want.
>
> Bill
>
>
> On Thu, Jan 29, 2009 at 5:40 PM, Bill Au  wrote:
>
>> Not sure why but this does not work for me.  I am running 0.18.2.  I ran
>> hadoop dfsadmin -refreshNodes after removing the decommissioned node from
>> the exclude file.  It still shows up as a dead node.  I also removed it from
>> the slaves file and ran the refresh nodes command again.  It still shows up
>> as a dead node after that.
>>
>> I am going to upgrade to 0.19.0 to see if it makes any difference.
>>
>> Bill
>>
>>
>> On Tue, Jan 27, 2009 at 7:01 PM, paul  wrote:
>>
>>> Once the nodes are listed as dead, if you still have the host names in
>>> your
>>> conf/exclude file, remove the entries and then run hadoop dfsadmin
>>> -refreshNodes.
>>>
>>>
>>> This works for us on our cluster.
>>>
>>>
>>>
>>> -paul
>>>
>>>
>>> On Tue, Jan 27, 2009 at 5:08 PM, Bill Au  wrote:
>>>
>>> > I was able to decommission a datanode successfully without having to
>>> stop
>>> > my
>>> > cluster.  But I noticed that after a node has been decommissioned, it
>>> shows
>>> > up as a dead node in the web base interface to the namenode (ie
>>> > dfshealth.jsp).  My cluster is relatively small and losing a datanode
>>> will
>>> > have performance impact.  So I have a need to monitor the health of my
>>> > cluster and take steps to revive any dead datanode in a timely fashion.
>>>  So
>>> > is there any way to altogether "get rid of" any decommissioned datanode
>>> > from
>>> > the web interace of the namenode?  Or is there a better way to monitor
>>> the
>>> > health of the cluster?
>>> >
>>> > Bill
>>> >
>>>
>>
>>
>


Re: Hadoop FS Shell - command overwrite capability

2009-02-04 Thread S D
Rasit,

Thanks for this comment. I do need console-based control and will consider
your suggestion of using a jar file.

Thanks,
John

On Wed, Feb 4, 2009 at 10:17 AM, Rasit OZDAS  wrote:

> John, I also couldn't find a way from the console.
> Maybe you already know it and prefer not to use it, but the API solves this
> problem:
> FileSystem.copyFromLocalFile(boolean delSrc, boolean overwrite, Path
> src, Path dst)
>
> If you have to use the console, it's a longer solution, but you can create a jar
> for this and call it just like the "hadoop" script in the bin directory calls
> the FileSystem class.
>
> I think the FileSystem API also needs some improvement here. I wonder if it's
> being considered by the core developers.
>
> Hope this helps,
> Rasit
>
> 2009/2/4 S D :
> > I'm using the Hadoop FS commands to move files from my local machine into
> > the Hadoop dfs. I'd like a way to force a write to the dfs even if a file
> of
> > the same name exists. Ideally I'd like to use a "-force" switch or some
> > such; e.g.,
> >hadoop dfs -copyFromLocal -force adirectory s3n://wholeinthebucket/
> >
> > Is there a way to do this or does anyone know if this is in the future
> > Hadoop plans?
> >
> > Thanks
> > John SD
> >
>
>
>
> --
> M. Raşit ÖZDAŞ
>


Re: Value-Only Reduce Output

2009-02-04 Thread jason hadoop
For your reduce, the parameter is stream.reduce.input.field.separator if
you are supplying a reduce class, and I believe the output format is
TextOutputFormat...

It looks like you have tried the map parameter for the separator, not the
reduce parameter.

From 0.19.0 PipeReducer:
configure:
  reduceOutFieldSeparator =
job_.get("stream.reduce.output.field.separator", "\t").getBytes("UTF-8");
  reduceInputFieldSeparator =
job_.get("stream.reduce.input.field.separator", "\t").getBytes("UTF-8");
  this.numOfReduceOutputKeyFields =
job_.getInt("stream.num.reduce.output.key.fields", 1);

getInputSeparator:
  byte[] getInputSeparator() {
return reduceInputFieldSeparator;
  }

reduce:
  write(key);
  clientOut_.write(getInputSeparator());
  write(val);
  clientOut_.write('\n');
} else {
  // "identity reduce"
  output.collect(key, val);
}
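
For comparison, in a plain (non-streaming) job the null-value trick looks
roughly like this (a minimal sketch, assuming Text values and the default
TextOutputFormat):

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

// Emits each value as the output key with a null value, so TextOutputFormat
// writes the value alone and appends no trailing separator.
public class ValueOnlyReducer extends MapReduceBase
    implements Reducer<Text, Text, Text, Text> {
  public void reduce(Text key, Iterator<Text> values,
                     OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    while (values.hasNext()) {
      output.collect(values.next(), null);
    }
  }
}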


On Wed, Feb 4, 2009 at 6:15 AM, Rasit OZDAS  wrote:

> I tried it myself, it doesn't work.
> I've also tried the stream.map.output.field.separator and
> map.output.key.field.separator parameters for this purpose; they
> don't work either. When Hadoop sees an empty string, it uses the default tab
> character instead.
>
> Rasit
>
> 2009/2/4 jason hadoop 
> >
> > Oops, you are using streaming, and I am not familiar with it.
> > As a terrible hack, you could set mapred.textoutputformat.separator to the
> > empty string in your configuration.
> >
> > On Tue, Feb 3, 2009 at 9:26 PM, jason hadoop 
> wrote:
> >
> > > If you are using the standard TextOutputFormat, and the output
> collector is
> > > passed a null for the value, there will not be a trailing tab character
> > > added to the output line.
> > >
> > > output.collect( key, null );
> > > Will give you the behavior you are looking for if your configuration is
> as
> > > I expect.
> > >
> > >
> > > On Tue, Feb 3, 2009 at 7:49 PM, Jack Stahl  wrote:
> > >
> > >> Hello,
> > >>
> > >> I'm interested in a map-reduce flow where I output only values (no
> keys)
> > >> in
> > >> my reduce step.  For example, imagine the canonical word-counting
> program
> > >> where I'd like my output to be an unlabeled histogram of counts
> instead of
> > >> (word, count) pairs.
> > >>
> > >> I'm using HadoopStreaming (specifically, I'm using the dumbo module to
> run
> > >> my python scripts).  When I simulate the map reduce using pipes and
> sort
> > >> in
> > >> bash, it works fine.   However, in Hadoop, if I output a value with no
> > >> tabs,
> > >> Hadoop appends a trailing "\t", apparently interpreting my output as a
> > >> (value, "") KV pair.  I'd like to avoid outputing this trailing tab if
> > >> possible.
> > >>
> > >> Is there a command line option that could be use to effect this?  More
> > >> generally, is there something wrong with outputing arbitrary strings,
> > >> instead of key-value pairs, in your reduce step?
> > >>
> > >
> > >
>
>
>
> --
> M. Raşit ÖZDAŞ
>


Re: HADOOP-2536 supports Oracle too?

2009-02-04 Thread Enis Soztutar
Hadoop-2536 connects to the db via JDBC, so in theory it should work 
with proper jdbc drivers.

It has been tested against MySQL, Hsqldb, and PostgreSQL, but not Oracle.

To answer your earlier question, the actual SQL statements might not be 
recognized by Oracle, so I suggest the best way to test this is to 
insert print statements, and run the actual SQL statements against 
Oracle to see if the syntax is accepted.
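
For example, a small throwaway harness along these lines (the connection URL
and credentials are placeholders) would show quickly whether Oracle accepts a
given statement:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Runs a candidate SELECT (e.g. one printed from the job) directly against
// Oracle and reports whether the syntax is accepted.
public class CheckQuery {
  public static void main(String[] args) throws Exception {
    Class.forName("oracle.jdbc.driver.OracleDriver");
    Connection conn = DriverManager.getConnection(
        "jdbc:oracle:thin:@dbhost:1521:SID", "user", "pass");
    Statement st = conn.createStatement();
    try {
      ResultSet rs = st.executeQuery(args[0]); // the generated SQL, no trailing ';'
      System.out.println("accepted; has rows: " + rs.next());
    } catch (SQLException e) {
      System.out.println("rejected: " + e.getMessage());
    } finally {
      st.close();
      conn.close();
    }
  }
}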


We would appreciate it if you publish your results.

Enis

Amandeep Khurana wrote:

Does the patch HADOOP-2536 support connecting to Oracle databases as well?
Or is it just limited to MySQL?

Amandeep


Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz

  


Re: Hadoop FS Shell - command overwrite capability

2009-02-04 Thread Rasit OZDAS
John, I also couldn't find a way from the console.
Maybe you already know it and prefer not to use it, but the API solves this problem:
FileSystem.copyFromLocalFile(boolean delSrc, boolean overwrite, Path
src, Path dst)

If you have to use the console, it's a longer solution, but you can create a jar
for this and call it just like the "hadoop" script in the bin directory calls
the FileSystem class.

I think the FileSystem API also needs some improvement here. I wonder if it's
being considered by the core developers.
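
A minimal sketch of what such a jar's entry point could look like (the class
name and argument handling are made up; copyFromLocalFile is the real call):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Usage: hadoop jar forcecopy.jar ForceCopy <localsrc> <dst>
// Overwrites the destination if it already exists.
public class ForceCopy {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path dst = new Path(args[1]);
    FileSystem fs = dst.getFileSystem(conf);
    // delSrc = false (keep the local file), overwrite = true
    fs.copyFromLocalFile(false, true, new Path(args[0]), dst);
  }
}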

Hope this helps,
Rasit

2009/2/4 S D :
> I'm using the Hadoop FS commands to move files from my local machine into
> the Hadoop dfs. I'd like a way to force a write to the dfs even if a file of
> the same name exists. Ideally I'd like to use a "-force" switch or some
> such; e.g.,
>hadoop dfs -copyFromLocal -force adirectory s3n://wholeinthebucket/
>
> Is there a way to do this or does anyone know if this is in the future
> Hadoop plans?
>
> Thanks
> John SD
>



-- 
M. Raşit ÖZDAŞ


Re: How to use DBInputFormat?

2009-02-04 Thread Rasit OZDAS
Amandeep,
"SQL command not properly ended"
I get this error whenever I forget the semicolon at the end.
I know it doesn't make sense, but I recommend giving it a try.

Rasit

2009/2/4 Amandeep Khurana :
> The same query is working if I write a simple JDBC client and query the
> database. So, I'm probably doing something wrong in the connection settings.
> But the error looks to be on the query side more than the connection side.
>
> Amandeep
>
>
> Amandeep Khurana
> Computer Science Graduate Student
> University of California, Santa Cruz
>
>
> On Tue, Feb 3, 2009 at 7:25 PM, Amandeep Khurana  wrote:
>
>> Thanks Kevin
>>
>> I couldnt get it work. Here's the error I get:
>>
>> bin/hadoop jar ~/dbload.jar LoadTable1
>> 09/02/03 19:21:17 INFO jvm.JvmMetrics: Initializing JVM Metrics with
>> processName=JobTracker, sessionId=
>> 09/02/03 19:21:20 INFO mapred.JobClient: Running job: job_local_0001
>> 09/02/03 19:21:21 INFO mapred.JobClient:  map 0% reduce 0%
>> 09/02/03 19:21:22 INFO mapred.MapTask: numReduceTasks: 0
>> 09/02/03 19:21:24 WARN mapred.LocalJobRunner: job_local_0001
>> java.io.IOException: ORA-00933: SQL command not properly ended
>>
>> at
>> org.apache.hadoop.mapred.lib.db.DBInputFormat.getRecordReader(DBInputFormat.java:289)
>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:321)
>> at
>> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)
>> java.io.IOException: Job failed!
>> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1217)
>> at LoadTable1.run(LoadTable1.java:130)
>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>> at LoadTable1.main(LoadTable1.java:107)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>> at java.lang.reflect.Method.invoke(Unknown Source)
>> at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
>> at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>> at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
>>
>> Exception closing file
>> /user/amkhuran/contract_table/_temporary/_attempt_local_0001_m_00_0/part-0
>> java.io.IOException: Filesystem closed
>> at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:198)
>> at org.apache.hadoop.hdfs.DFSClient.access$600(DFSClient.java:65)
>> at
>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3084)
>> at
>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3053)
>> at
>> org.apache.hadoop.hdfs.DFSClient$LeaseChecker.close(DFSClient.java:942)
>> at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:210)
>> at
>> org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:243)
>> at
>> org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1413)
>> at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:236)
>> at
>> org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:221)
>>
>>
>> Here's my code:
>>
>> public class LoadTable1 extends Configured implements Tool  {
>>
>>   // data destination on hdfs
>>   private static final String CONTRACT_OUTPUT_PATH = "contract_table";
>>
>>   // The JDBC connection URL and driver implementation class
>>
>> private static final String CONNECT_URL = "jdbc:oracle:thin:@dbhost
>> :1521:PSEDEV";
>>   private static final String DB_USER = "user";
>>   private static final String DB_PWD = "pass";
>>   private static final String DATABASE_DRIVER_CLASS =
>> "oracle.jdbc.driver.OracleDriver";
>>
>>   private static final String CONTRACT_INPUT_TABLE =
>> "OSE_EPR_CONTRACT";
>>
>>   private static final String [] CONTRACT_INPUT_TABLE_FIELDS = {
>> "PORTFOLIO_NUMBER", "CONTRACT_NUMBER"};
>>
>>   private static final String ORDER_CONTRACT_BY_COL =
>> "CONTRACT_NUMBER";
>>
>>
>> static class ose_epr_contract implements Writable, DBWritable {
>>
>>
>> String CONTRACT_NUMBER;
>>
>>
>> public void readFields(DataInput in) throws IOException {
>>
>> this.CONTRACT_NUMBER = Text.readString(in);
>>
>> }
>>
>> public void write(DataOutput out) throws IOException {
>>
>> Text.writeString(out, this.CONTRACT_NUMBER);
>>
>>
>> }
>>
>> public void readFields(ResultSet in_set) throws SQLException {
>>
>> this.CONTRACT_NUMBER = in_set.getString(1);
>>
>> }
>>
>> @Override
>> public void write(PreparedStatement prep_st) throws SQLException {
>>

Re: Value-Only Reduce Output

2009-02-04 Thread Rasit OZDAS
I tried it myself, it doesn't work.
I've also tried the stream.map.output.field.separator and
map.output.key.field.separator parameters for this purpose; they
don't work either. When Hadoop sees an empty string, it uses the default tab
character instead.

Rasit

2009/2/4 jason hadoop 
>
> Oops, you are using streaming, and I am not familiar with it.
> As a terrible hack, you could set mapred.textoutputformat.separator to the
> empty string in your configuration.
>
> On Tue, Feb 3, 2009 at 9:26 PM, jason hadoop  wrote:
>
> > If you are using the standard TextOutputFormat, and the output collector is
> > passed a null for the value, there will not be a trailing tab character
> > added to the output line.
> >
> > output.collect( key, null );
> > Will give you the behavior you are looking for if your configuration is as
> > I expect.
> >
> >
> > On Tue, Feb 3, 2009 at 7:49 PM, Jack Stahl  wrote:
> >
> >> Hello,
> >>
> >> I'm interested in a map-reduce flow where I output only values (no keys)
> >> in
> >> my reduce step.  For example, imagine the canonical word-counting program
> >> where I'd like my output to be an unlabeled histogram of counts instead of
> >> (word, count) pairs.
> >>
> >> I'm using HadoopStreaming (specifically, I'm using the dumbo module to run
> >> my python scripts).  When I simulate the map reduce using pipes and sort
> >> in
> >> bash, it works fine.   However, in Hadoop, if I output a value with no
> >> tabs,
> >> Hadoop appends a trailing "\t", apparently interpreting my output as a
> >> (value, "") KV pair.  I'd like to avoid outputing this trailing tab if
> >> possible.
> >>
> >> Is there a command line option that could be use to effect this?  More
> >> generally, is there something wrong with outputing arbitrary strings,
> >> instead of key-value pairs, in your reduce step?
> >>
> >
> >



--
M. Raşit ÖZDAŞ


Re: Task tracker archive contains too many files

2009-02-04 Thread Amareshwari Sriramadasu

Andrew wrote:
I've noticed that task tracker moves all unpacked jars into 
${hadoop.tmp.dir}/mapred/local/taskTracker.


We are using a lot of external libraries that are deployed via the "-libjars"
option. The total number of files after unpacking is about 20 thousand.


After running a number of jobs, tasks start to be killed with timeout reason 
("Task attempt_200901281518_0011_m_000173_2 failed to report status for 601 
seconds. Killing!"). All killed tasks are in "initializing" state. I've 
watched the tasktracker logs and found such messages:



Thread 20926 (Thread-10368):
  State: BLOCKED
  Blocked count: 3611
  Waited count: 24
  Blocked on java.lang.ref.reference$l...@e48ed6
  Blocked by 20882 (Thread-10341)
  Stack:
java.lang.StringCoding$StringEncoder.encode(StringCoding.java:232)
java.lang.StringCoding.encode(StringCoding.java:272)
java.lang.String.getBytes(String.java:947)
java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:228)
java.io.File.isDirectory(File.java:754)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:427)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)


This is exactly as in HADOOP-4780.
As I understand it, the patch adds code which stores a map of directories along
with their disk usage, thus reducing the number of DU calls. This must help, but
the process of deleting the unpacked files takes too long. I manually deleted the
archive after 10 jobs had run and it took over 30 minutes on XFS. That is three
times the default timeout for tasks!


Is there a way to prohibit the unpacking of jars? Or at least not to keep the
archive? Or any other better way to solve this problem?


Hadoop version: 0.19.0.


  
Right now, there is no way to stop DistributedCache from unpacking the
jars. I think it should have an option (through configuration) whether to
unpack or not.

Can you raise a jira for the same?

Thanks
Amareshwari


Task tracker archive contains too many files

2009-02-04 Thread Andrew
I've noticed that task tracker moves all unpacked jars into 
${hadoop.tmp.dir}/mapred/local/taskTracker.

We are using a lot of external libraries that are deployed via the "-libjars"
option. The total number of files after unpacking is about 20 thousand.

After running a number of jobs, tasks start to be killed with timeout reason 
("Task attempt_200901281518_0011_m_000173_2 failed to report status for 601 
seconds. Killing!"). All killed tasks are in "initializing" state. I've 
watched the tasktracker logs and found such messages:


Thread 20926 (Thread-10368):
  State: BLOCKED
  Blocked count: 3611
  Waited count: 24
  Blocked on java.lang.ref.reference$l...@e48ed6
  Blocked by 20882 (Thread-10341)
  Stack:
java.lang.StringCoding$StringEncoder.encode(StringCoding.java:232)
java.lang.StringCoding.encode(StringCoding.java:272)
java.lang.String.getBytes(String.java:947)
java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:228)
java.io.File.isDirectory(File.java:754)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:427)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:433)
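
For what it's worth, getDU seems to simply recurse over every entry (a rough
sketch of the behaviour suggested by the stack trace above, not the exact
source), which would explain why roughly 20 thousand unpacked files hurt so much:

import java.io.File;

// One stat per entry, recursively: a cache directory with ~20,000 unpacked
// files costs ~20,000 filesystem calls every time disk usage is checked.
public class DiskUsageSketch {
  public static long getDU(File dir) {
    if (!dir.exists()) return 0;
    if (!dir.isDirectory()) return dir.length();
    long size = dir.length();
    File[] children = dir.listFiles();
    if (children == null) return size;
    for (File child : children) {
      size += getDU(child);
    }
    return size;
  }

  public static void main(String[] args) {
    System.out.println(getDU(new File(args[0])) + " bytes under " + args[0]);
  }
}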


This is exactly as in HADOOP-4780.
As I understand it, the patch adds code which stores a map of directories along
with their disk usage, thus reducing the number of DU calls. This must help, but
the process of deleting the unpacked files takes too long. I manually deleted the
archive after 10 jobs had run and it took over 30 minutes on XFS. That is three
times the default timeout for tasks!

Is there a way to prohibit the unpacking of jars? Or at least not to keep the
archive? Or any other better way to solve this problem?

Hadoop version: 0.19.0.


-- 
Andrew Gudkov
PGP key id: CB9F07D8 (cryptonomicon.mit.edu)
Jabber: gu...@jabber.ru