Hadoop Training, May 15th: SF Bay Area with Online Participation Available

2009-04-27 Thread Christophe Bisciglia
OK, last announcement from me today :-)

We're hosting a training session in the SF bay area (at the Cloudera
office) on Friday, May 15th.

We're doing two things differently:
1) We've allocated a chunk of discounted "early bird" registrations -
first come, first served until May 1st, at which point only regular
registration is available.
2) We're enabling people from outside the bay area to attend through
some pretty impressive web based video remote presence software we've
been piloting - all you need is a browser with flash. If you have a
webcam and mic, all the better. We're working with a startup on this,
and we're really impressed with the technology. Since this is new for
us, we've discounted web based participation significantly for this
session.

registration: http://cloudera.eventbrite.com/

Cheers,
Christophe

-- 
get hadoop: cloudera.com/hadoop
online training: cloudera.com/hadoop-training
blog: cloudera.com/blog
twitter: twitter.com/cloudera


Debian support for Cloudera's Distribution

2009-04-27 Thread Christophe Bisciglia
Hey Hadoop fans, just wanted to drop a quick note to let you know that
we now have Debian packages for our distribution in addition to RPMs.
We will continue to support both platforms going forward.

Todd Lipcon put in many late nights for this, so next time you see
him, buy him a beer :-)

http://www.cloudera.com/hadoop-deb

Cheers,
Christophe

-- 
get hadoop: cloudera.com/hadoop
online training: cloudera.com/hadoop-training
blog: cloudera.com/blog
twitter: twitter.com/cloudera


Re: How to set System property for my job

2009-04-27 Thread mlimotte

I think what you want is the section "Task Execution & Environment" in
http://hadoop.apache.org/core/docs/current/mapred_tutorial.html . Here is a
sample from that document:


<property>
  <name>mapred.child.java.opts</name>
  <value>
     -Xmx512M -Djava.library.path=/home/mycompany/lib -verbose:gc
     -Xloggc:/tmp/@taskid@.gc
     -Dcom.sun.management.jmxremote.authenticate=false
     -Dcom.sun.management.jmxremote.ssl=false
  </value>
</property>


-Marc
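
A minimal sketch of the same idea from the job-setup side (the class name,
the property my.prop and the heap size below are placeholders, not taken
from the thread): append the -D option to mapred.child.java.opts so every
child JVM is started with the system property, then read it back with
System.getProperty() inside the task.

    import org.apache.hadoop.mapred.JobConf;

    public class SetChildOpts {
      public static void main(String[] args) {
        JobConf conf = new JobConf(SetChildOpts.class);
        // Every map/reduce child JVM is launched with these options,
        // so -Dmy.prop=myvalue becomes a JVM system property in the task.
        conf.set("mapred.child.java.opts", "-Xmx512M -Dmy.prop=myvalue");
        // ... set mapper, reducer, input/output paths, then submit ...
      }
    }

    // Inside the mapper or reducer running in the child JVM:
    //   String myProp = System.getProperty("my.prop");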


Tarandeep wrote:
> 
> Hi,
> 
> While submitting a job to Hadoop, how can I set system properties that are
> required by my code ?
> Passing -Dmy.prop=myvalue to the hadoop job command is not going to work,
> as the hadoop command will pass this to my program as a command-line
> argument.
> 
> Is there any way to achieve this ?
> 
> Thanks,
> Taran
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/How-to-set-System-property-for-my-job-tp18896188p23264520.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.



Re: Blocks replication in downtime even

2009-04-27 Thread Stas Oskin
Thanks.

2009/4/27 Koji Noguchi 

> http://hadoop.apache.org/core/docs/current/hdfs_design.html#Data+Disk+Failure%2C+Heartbeats+and+Re-Replication
>
> hope this helps.
>
> Koji
>
> -Original Message-
> From: Stas Oskin [mailto:stas.os...@gmail.com]
> Sent: Monday, April 27, 2009 4:11 AM
> To: core-user@hadoop.apache.org
> Subject: Blocks replication in downtime even
>
> Hi.
>
> I have a question:
>
> If I have N of DataNodes, and one or several of the nodes have become
> unavailable, would HDFS re-synchronize the blocks automatically,
> according
> to replication level set?
> And if yes, when? As soon as the offline node was detected, or only on
> file
> access?
>
> Regards.
>


Rescheduling of already completed map/reduce task

2009-04-27 Thread Sagar Naik

Hi,
The job froze after the filesystem hung on a machine which had 
successfully completed a map task.

Is there a flag to enable the rescheduling of such a task?


Jstack of job tracker

"SocketListener0-2" prio=10 tid=0x08916000 nid=0x4a4f runnable 
[0x4d05c000..0x4d05ce30]

  java.lang.Thread.State: RUNNABLE
   at java.net.SocketInputStream.socketRead0(Native Method)
   at java.net.SocketInputStream.read(SocketInputStream.java:129)
   at org.mortbay.util.LineInput.fill(LineInput.java:469)
   at org.mortbay.util.LineInput.fillLine(LineInput.java:547)
   at org.mortbay.util.LineInput.readLineBuffer(LineInput.java:293)
   at org.mortbay.util.LineInput.readLineBuffer(LineInput.java:277)
   at org.mortbay.http.HttpRequest.readHeader(HttpRequest.java:238)
   at org.mortbay.http.HttpConnection.readRequest(HttpConnection.java:861)
   at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:907)

   at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
   at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)

   at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
   at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)

  Locked ownable synchronizers:
   - None


"SocketListener0-1" prio=10 tid=0x4da8c800 nid=0xeeb runnable 
[0x4d266000..0x4d2670b0]

  java.lang.Thread.State: RUNNABLE
   at java.net.SocketInputStream.socketRead0(Native Method)
   at java.net.SocketInputStream.read(SocketInputStream.java:129)
   at org.mortbay.util.LineInput.fill(LineInput.java:469)
   at org.mortbay.util.LineInput.fillLine(LineInput.java:547)
   at org.mortbay.util.LineInput.readLineBuffer(LineInput.java:293)
   at org.mortbay.util.LineInput.readLineBuffer(LineInput.java:277)
   at org.mortbay.http.HttpRequest.readHeader(HttpRequest.java:238)
   at org.mortbay.http.HttpConnection.readRequest(HttpConnection.java:861)
   at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:907)

   at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
   at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)

   at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
   at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)

"IPC Server listener on 54311" daemon prio=10 tid=0x4df70400 nid=0xe86 
runnable [0x4d9fe000..0x4d9feeb0]

  java.lang.Thread.State: RUNNABLE
   at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
   at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
   at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
   at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
   - locked <0x54fb4320> (a sun.nio.ch.Util$1)
   - locked <0x54fb4310> (a java.util.Collections$UnmodifiableSet)
   - locked <0x54fb40b8> (a sun.nio.ch.EPollSelectorImpl)
   at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
   at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:84)
   at org.apache.hadoop.ipc.Server$Listener.run(Server.java:296)

  Locked ownable synchronizers:
   - None

"IPC Server Responder" daemon prio=10 tid=0x4da22800 nid=0xe85 runnable 
[0x4db75000..0x4db75e30]

  java.lang.Thread.State: RUNNABLE
   at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
   at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
   at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
   at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
   - locked <0x54f0> (a sun.nio.ch.Util$1)
   - locked <0x54fdce10> (a java.util.Collections$UnmodifiableSet)
   - locked <0x54fdcc18> (a sun.nio.ch.EPollSelectorImpl)
   at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
   at org.apache.hadoop.ipc.Server$Responder.run(Server.java:455)

  Locked ownable synchronizers:
   - None

"RMI TCP Accept-0" daemon prio=10 tid=0x4da13400 nid=0xe31 runnable 
[0x4de55000..0x4de56130]

  java.lang.Thread.State: RUNNABLE
   at java.net.PlainSocketImpl.socketAccept(Native Method)
   at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:384)
   - locked <0x54f6dae0> (a java.net.SocksSocketImpl)
   at java.net.ServerSocket.implAccept(ServerSocket.java:453)
   at java.net.ServerSocket.accept(ServerSocket.java:421)
   at sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(LocalRMIServerSocketFactory.java:34)
   at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:369)
   at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:341)

   at java.lang.Thread.run(Thread.java:619)

  Locked ownable synchronizers:
   - None

-Sagar


Re: .20.0, Partitioners?

2009-04-27 Thread Jothi Padmanabhan
I created 

https://issues.apache.org/jira/browse/HADOOP-5750

to follow this up. 

Thanks
Jothi


On 4/27/09 10:10 PM, "Jothi Padmanabhan"  wrote:

> Ryan,
> 
> I observed this behavior too -- Partitioner does not seem to work with the
> new API exactly for the reason you have mentioned. Till this gets fixed, you
> probably need to use the old API.
> 
> Jothi
> 
> 
> On 4/27/09 7:14 PM, "Ryan Farris"  wrote:
> 
>> Is there some magic to get a Partitioner working on .20.0?  Setting
>> the partitioner class on the Job object doesn't take, hadoop always
>> uses the HashPartitioner.  Looking through the source code, it looks
>> like the MapOutputBuffer in MapTask only ever fetches the
>> "mapred.partitioner.class", and doesn't check for new api's
>> "mapreduce.partitioner.class", but I'm not confident in my
>> understanding of how things work.
>> 
>> I was eventually able to get my test program working correctly by:
>>   1) Creating a partitioner that extends the deprecated
>> org.apache.hadoop.mapred.Partitioner class.
>>   2) Calling job.getConfiguration().set("mapred.partitioner.class",
>> DeprecatedTestPartitioner.class.getCanonicalName());
>>   3) Commenting out line 395 of org.apache.hadoop.mapreduce.Job.java,
>> where it asserts that "mapred.partitioner.class" is null
>> 
>> But I'm assuming editing the hadoop core sourcecode is not the
>> intended path.  Am I missing some simple switch or something?
>> 
>> rf
> 



Re: .20.0, Partitioners?

2009-04-27 Thread Jothi Padmanabhan
Ryan,

I observed this behavior too -- Partitioner does not seem to work with the
new API exactly for the reason you have mentioned. Till this gets fixed, you
probably need to use the old API.

Jothi


On 4/27/09 7:14 PM, "Ryan Farris"  wrote:

> Is there some magic to get a Partitioner working on .20.0?  Setting
> the partitioner class on the Job object doesn't take, hadoop always
> uses the HashPartitioner.  Looking through the source code, it looks
> like the MapOutputBuffer in MapTask only ever fetches the
> "mapred.partitioner.class", and doesn't check for new api's
> "mapreduce.partitioner.class", but I'm not confident in my
> understanding of how things work.
> 
> I was eventually able to get my test program working correctly by:
>   1) Creating a partitioner that extends the deprecated
> org.apache.hadoop.mapred.Partitioner class.
>   2) Calling job.getConfiguration().set("mapred.partitioner.class",
> DeprecatedTestPartitioner.class.getCanonicalName());
>   3) Commenting out line 395 of org.apache.hadoop.mapreduce.Job.java,
> where it asserts that "mapred.partitioner.class" is null
> 
> But I'm assuming editing the hadoop core sourcecode is not the
> intended path.  Am I missing some simple switch or something?
> 
> rf



Re: write a large file to HDFS?

2009-04-27 Thread jason hadoop
block by block.
open multiple connections and write multiple files if you are not saturating
your network connection.
Generally a single file writer writing large blocks rapidly will do a decent
job of saturating things.
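
For reference, a minimal sketch of a single writer (the path and sizes are
placeholders): the client writes through one output stream, and the DFS
client fills and ships one block at a time behind it.

    import java.io.OutputStream;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class WriteBigFile {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();      // picks up the cluster config
        FileSystem fs = FileSystem.get(conf);          // the configured HDFS
        OutputStream out = fs.create(new Path("/tmp/bigfile")); // placeholder path
        byte[] buf = new byte[64 * 1024];
        for (int i = 0; i < 16 * 1024; i++) {          // roughly 1 GB of zero bytes
          out.write(buf);                              // blocks are filled and written in order
        }
        out.close();
      }
    }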

On Mon, Apr 27, 2009 at 2:22 AM, Xie, Tao  wrote:

>
> hi,
> If I write a large file to HDFS, will it be split into blocks and
> multi-blocks are written to HDFS at the same time? Or HDFS can only write
> block by block?
> Thanks.
> --
> View this message in context:
> http://www.nabble.com/write-a-large-file-to-HDFS--tp23252754p23252754.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>
>


-- 
Alpha Chapters of my book on Hadoop are available
http://www.apress.com/book/view/9781430219422


Re: IO Exception in Map Tasks

2009-04-27 Thread jason hadoop
You will need to figure out why your task crashed.
Check the task logs; there may be some messages there that give you a hint
as to what is going on.

You can enable saving failed task logs and then run the task standalone in
the IsolationRunner.
Chapter 7 of my book (alpha available) provides details on this, hoping the
failure repeats in the controlled environment.

You could also remove the core dump size limit via hadoop-env.sh (*ulimit -c
unlimited*), but that will require that the failed task files be kept, as the
core will be in the task working directory.
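
For completeness, a hedged sketch of keeping the failed task files around so
the IsolationRunner can re-run them (assuming the 0.18-era JobConf API; the
job class name is a placeholder):

    import org.apache.hadoop.mapred.JobConf;

    public class KeepFailedFilesJob {
      public static void main(String[] args) {
        JobConf conf = new JobConf(KeepFailedFilesJob.class);
        // Keep the working directories of failed tasks on the tasktracker,
        // equivalent to setting keep.failed.task.files=true in the job config.
        conf.setKeepFailedTaskFiles(true);
        // ... rest of the job setup and submission ...
      }
    }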


On Mon, Apr 27, 2009 at 1:30 AM, Rakhi Khatwani wrote:

> Thanks Jason,
>  is there any way we can avoid this exception??
>
> Thanks,
> Raakhi
>
> On Mon, Apr 27, 2009 at 1:20 PM, jason hadoop wrote:
>
> > The jvm had a hard failure and crashed
> >
> >
> > On Sun, Apr 26, 2009 at 11:34 PM, Rakhi Khatwani
> > wrote:
> >
> > > Hi,
> > >
> > >  In one of the map tasks, i get the following exception:
> > >  java.io.IOException: Task process exit with nonzero status of 255.
> > > at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:424)
> > >
> > > java.io.IOException: Task process exit with nonzero status of 255.
> > > at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:424)
> > >
> > > what could be the reason?
> > >
> > > Thanks,
> > > Raakhi
> > >
> >
> >
> >
> > --
> > Alpha Chapters of my book on Hadoop are available
> > http://www.apress.com/book/view/9781430219422
> >
>



-- 
Alpha Chapters of my book on Hadoop are available
http://www.apress.com/book/view/9781430219422


Re: Datanode Setup

2009-04-27 Thread jpe30

bump*

Any suggestions?

-- 
View this message in context: 
http://www.nabble.com/Datanode-Setup-tp23064660p23259364.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.



RE: Blocks replication in downtime even

2009-04-27 Thread Koji Noguchi
http://hadoop.apache.org/core/docs/current/hdfs_design.html#Data+Disk+Failure%2C+Heartbeats+and+Re-Replication

hope this helps.

Koji
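
As a small aside, a hypothetical sketch for checking the replication factor
recorded for a file, which can be handy while watching re-replication after
a node drops out (the path is a placeholder):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ShowReplication {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        FileStatus status = fs.getFileStatus(new Path("/user/me/somefile"));
        // This is the target replication stored by the namenode; the namenode
        // re-replicates blocks whose live replica count falls below it.
        System.out.println("replication factor: " + status.getReplication());
      }
    }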

-Original Message-
From: Stas Oskin [mailto:stas.os...@gmail.com] 
Sent: Monday, April 27, 2009 4:11 AM
To: core-user@hadoop.apache.org
Subject: Blocks replication in downtime even

Hi.

I have a question:

If I have N of DataNodes, and one or several of the nodes have become
unavailable, would HDFS re-synchronize the blocks automatically,
according
to replication level set?
And if yes, when? As soon as the offline node was detected, or only on
file
access?

Regards.


Re: Can't start fully-distributed operation of Hadoop in Sun Grid Engine

2009-04-27 Thread Jasmine (Xuanjing) Huang
I have contacted the administrator of our cluster and he gave me access.
Now my program can work in fully distributed mode.


Thanks a lot.

Jasmine
- Original Message - 
From: "jason hadoop" 

To: 
Sent: Sunday, April 26, 2009 12:13 PM
Subject: Re: Can't start fully-distributed operation of Hadoop in Sun Grid 
Engine




It may be that the Sun grid is similar to EC2, and the machines have an
internal IP address/name that MUST be used for inter-machine communication
and an external IP address/name that is only for internet access.

The above overly complex sentence basically states there may be some
firewall rules/tools in the Sun grid that you need to be aware of and use.

On Sun, Apr 26, 2009 at 6:31 AM, Jasmine (Xuanjing) Huang <
xjhu...@cs.umass.edu> wrote:


Hi, Jason,

Thanks for your advice. After inserting the port into "hadoop-site.xml", I
can start the namenode and run jobs now.
But my system works only when I set localhost in the masters file and add
localhost (as well as some other nodes) to the slaves file. And all the tasks
are data-local map tasks. I wonder whether I have entered fully distributed
mode, or am still in pseudo mode.

As for the SGE, I am only a user and know little about it. This is the user
manual of our cluster:
http://www.cs.umass.edu/~swarm/index.php?n=Main.UserDoc

Best,
Jasmine

- Original Message - From: "jason hadoop" 


To: 
Sent: Sunday, April 26, 2009 12:06 AM
Subject: Re: Can't start fully-distributed operation of Hadoop in Sun Grid Engine



The parameter you specify for fs.default.name should be of the form
hdfs://host:port, and the parameter you specify for mapred.job.tracker
MUST be host:port. I haven't looked at 18.3, but it appears that the :port
is mandatory.

In your case, the piece of code parsing the fs.default.name variable is not
able to tokenize it into protocol, host and port correctly.

Recap:
fs.default.name     hdfs://namenodeHost:port
mapred.job.tracker  jobtrackerHost:port
Specify all the parts above and try again.

Can you please point me at information on using the Sun grid? I want to
include a paragraph or two about it in my book.
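
For reference, a hedged sketch of those two values set programmatically
(hostnames and ports are placeholders; the same pairs would normally go into
hadoop-site.xml):

    import org.apache.hadoop.conf.Configuration;

    public class MinimalClusterConf {
      public static void main(String[] args) {
        Configuration conf = new Configuration();
        // The filesystem URI needs protocol, host and port.
        conf.set("fs.default.name", "hdfs://namenodeHost:9000");
        // The JobTracker address is host:port, with no protocol prefix.
        conf.set("mapred.job.tracker", "jobtrackerHost:9001");
        System.out.println(conf.get("fs.default.name"));
      }
    }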

On Sat, Apr 25, 2009 at 4:28 PM, Jasmine (Xuanjing) Huang <
xjhu...@cs.umass.edu> wrote:

 Hi, there,


My hadoop system (version: 0.18.3) works well under standalone and
pseudo-distributed operation. But if I try to run hadoop in
fully-distributed mode in Sun Grid Engine, Hadoop always failed -- in
fact,
the JobTracker and TaskTracker can be started, but the namenode and
secondary namenode cannot be started. Could anyone help me with it?

My SGE script looks like:

#!/bin/bash
#$ -cwd
#$ -S /bin/bash
#$ -l long=TRUE
#$ -v JAVA_HOME=/usr/java/latest
#$ -v HADOOP_HOME=*
#$ -pe hadoop 6
PATH="$HADOOP_HOME/bin:$PATH"
hadoop fs -put 
hadoop jar *
hadoop fs -get *

Then the output looks like:
Exception in thread "main" java.lang.NumberFormatException: For input
string: ""
  at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
 at java.lang.Integer.parseInt(Integer.java:468)
 at java.lang.Integer.parseInt(Integer.java:497)
  at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:144)
  at org.apache.hadoop.dfs.NameNode.getAddress(NameNode.java:116)
  at org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:66)
  at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1339)
 at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:56)
 at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1351)
 at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:213)
 at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:118)
 at org.apache.hadoop.fs.FsShell.init(FsShell.java:88)
 at org.apache.hadoop.fs.FsShell.run(FsShell.java:1703)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
 at org.apache.hadoop.fs.FsShell.main(FsShell.java:1852)

And the log of NameNode looks like
2009-04-25 17:27:17,032 INFO org.apache.hadoop.dfs.NameNode: 
STARTUP_MSG:

/
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = 
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.18.3
/
2009-04-25 17:27:17,147 ERROR org.apache.hadoop.dfs.NameNode:
java.lang.NumberFormatException: For input string: ""
  at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
 at java.lang.Integer.parseInt(Integer.java:468)
 at java.lang.Integer.parseInt(Integer.java:497)
  at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:144)
 at org.apache.hadoop.dfs.NameNode.getAddress(NameNode.java:116)
 at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:136)

ANN: R and Hadoop = RHIPE 0.1

2009-04-27 Thread Saptarshi Guha
Hello,
I'd like to announce the release of the 0.1 version of RHIPE -R and
Hadoop Integrated Processing Environment. Using RHIPE, it is possible
to write map-reduce algorithms using the R language and start them
from within R.
RHIPE is built on Hadoop and so benefits from Hadoop's fault
tolerance, distributed file system and job scheduling features.
For the R user, there is rhlapply which runs an lapply across the cluster.
For the Hadoop user, there is rhmr which runs a general map-reduce program.

The tired example of counting words:

m <- function(key,val){
  words <- strsplit(val," +")[[1]]
  wc <- table(words)
  cln <- names(wc)
  return(sapply(1:length(wc),function(r)
list(key=cln[r],value=wc[[r]]),simplify=F))
}
r <- function(key,value){
  value <- do.call("rbind",value)
  return(list(list(key=key,value=sum(value))))
}
rhmr(mapper=m,reduce=r,input.folder="X",output.folder="Y")

URL: http://ml.stat.purdue.edu/rhipe

There are some downsides to RHIPE which are described at
http://ml.stat.purdue.edu/rhipe/install.html#sec-5

Regards
Saptarshi Guha


.20.0, Partitioners?

2009-04-27 Thread Ryan Farris
Is there some magic to get a Partitioner working on .20.0?  Setting
the partitioner class on the Job object doesn't take, hadoop always
uses the HashPartitioner.  Looking through the source code, it looks
like the MapOutputBuffer in MapTask only ever fetches the
"mapred.partitioner.class", and doesn't check for new api's
"mapreduce.partitioner.class", but I'm not confident in my
understanding of how things work.

I was eventually able to get my test program working correctly by:
  1) Creating a partitioner that extends the deprecated
org.apache.hadoop.mapred.Partitioner class.
  2) Calling job.getConfiguration().set("mapred.partitioner.class",
DeprecatedTestPartitioner.class.getCanonicalName());
  3) Commenting out line 395 of org.apache.hadoop.mapreduce.Job.java,
where it asserts that "mapred.partitioner.class" is null

But I'm assuming editing the hadoop core sourcecode is not the
intended path.  Am I missing some simple switch or something?

rf
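
For anyone following along, a hedged sketch of the two variants discussed
here (the key/value types and the hash logic are made up for illustration):
the new-API form that, per this thread, is not picked up in 0.20.0, and the
deprecated old-API form used as the workaround.

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;

    // New API: intended to be set via job.setPartitionerClass(NewApiPartitioner.class),
    // but per this thread MapTask only reads "mapred.partitioner.class".
    class NewApiPartitioner
        extends org.apache.hadoop.mapreduce.Partitioner<Text, IntWritable> {
      public int getPartition(Text key, IntWritable value, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
      }
    }

    // Old-API workaround: implement the deprecated interface and set
    // "mapred.partitioner.class" on the job configuration, as described above.
    class OldApiPartitioner
        implements org.apache.hadoop.mapred.Partitioner<Text, IntWritable> {
      public void configure(org.apache.hadoop.mapred.JobConf job) { }
      public int getPartition(Text key, IntWritable value, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
      }
    }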


Database access in 0.18.3

2009-04-27 Thread rajeev gupta

I need to write the output of reduce to a database. There is support for this
in 0.19, but I am using 0.18.3. Any suggestions?

I tried to process the output myself in the reduce() function by writing some
System.out.println calls, but the output ends up in the userlogs corresponding
to the map function (intermediate output).

-rajeev
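
Not from the thread, but one common pattern on 0.18 is to open a JDBC
connection inside the reducer itself and ignore the normal output path. A
hedged sketch (the JDBC URL, credentials, table and column names are
placeholders, and the JDBC driver must be on the task classpath):

    import java.io.IOException;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.Iterator;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reducer;
    import org.apache.hadoop.mapred.Reporter;

    public class DbWritingReducer extends MapReduceBase
        implements Reducer<Text, IntWritable, NullWritable, NullWritable> {

      private Connection conn;
      private PreparedStatement insert;

      public void configure(JobConf job) {
        try {
          // Placeholder connection details; in practice read them from the JobConf.
          conn = DriverManager.getConnection("jdbc:mysql://dbhost/mydb", "user", "pass");
          insert = conn.prepareStatement("INSERT INTO counts (word, total) VALUES (?, ?)");
        } catch (SQLException e) {
          throw new RuntimeException(e);
        }
      }

      public void reduce(Text key, Iterator<IntWritable> values,
                         OutputCollector<NullWritable, NullWritable> output,
                         Reporter reporter) throws IOException {
        int sum = 0;
        while (values.hasNext()) {
          sum += values.next().get();
        }
        try {
          insert.setString(1, key.toString());
          insert.setInt(2, sum);
          insert.executeUpdate();   // each reducer writes its own rows
        } catch (SQLException e) {
          throw new IOException(e.toString());
        }
      }

      public void close() throws IOException {
        try {
          if (conn != null) conn.close();
        } catch (SQLException e) {
          throw new IOException(e.toString());
        }
      }
    }

The job can then use NullOutputFormat, since the rows go straight to the
database; DBOutputFormat in 0.19 packages the same idea.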





Re: Balancing datanodes - Running hadoop 0.18.3

2009-04-27 Thread Usman Waheed

Hi Tamir,

Thanks for the info, makes sense now :).

Cheers,
Usman

Hi,

The balancer works with the average utilization of all the nodes in the
cluster - in your case it's about 13%. Only nodes whose utilization is more
than 10% off the average will be rebalanced. Node 4 isn't considered
under-utilized because 13-10=3%, which is below its 4% usage. You can use a
different threshold than the default 10% (hadoop balancer -threshold 5).
Read more here:
http://hadoop.apache.org/core/docs/current/hdfs_user_guide.html#Rebalancer

Tamir


On Mon, Apr 27, 2009 at 11:36 AM, Usman Waheed  wrote:

  

Hi,
I had sent out an email yesterday asking about how to balance the cluster
after setting the replication level to 2. I have 4 datanodes and one
namenode in my setup.
Using the -R switch with -setrep did the trick but one of my nodes became
under utilized. I then ran hadoop balancer and it did help but upto a
certain extent.

Datanode 4 noted below is now up to almost 5% but when i try to balance the
datanode again using the "hadoop balance" command it says that the cluster
is already balanced which isnt.
I wonder if there is an alternate way(s) or maybe overtime Datanode-4 will
pick up more blocks?

Any clues?

Thanks,
Usman

Name: 1
State  : In Service
Total raw bytes: 293778976768 (273.6 GB)
Remaining raw bytes: 35858599(206.97 GB)
Used raw bytes: 48140136448 (44.83 GB)
% used: 16.39%
Last contact: Mon Apr 27 08:34:46 UTC 2009


Name: 2
State  : In Service
Total raw bytes: 293778976768 (273.6 GB)
Remaining raw bytes: 231235100994(215.35 GB)
Used raw bytes: 40704245760 (37.91 GB)
% used: 13.86%
Last contact: Mon Apr 27 08:34:45 UTC 2009


Name: 3
State  : In Service
Total raw bytes: 293778976768 (273.6 GB)
Remaining raw bytes: 211936026161(197.38 GB)
Used raw bytes: 59591700480 (55.5 GB)
% used: 20.28%
Last contact: Mon Apr 27 08:34:45 UTC 2009


*Name: 4
*State  : In Service
Total raw bytes: 293778976768 (273.6 GB)
Remaining raw bytes: 258876991693(241.1 GB)
Used raw bytes: 12142653440 (11.31 GB)
% used: 4.13%
Last contact: Mon Apr 27 08:34:46 UTC 2009





  




Re: Balancing datanodes - Running hadoop 0.18.3

2009-04-27 Thread Tamir Kamara
Hi,

The balancer works with the average utilization of all the nodes in the
cluster - in your case it's about 13%. Only nodes whose utilization is more
than 10% off the average will be rebalanced. Node 4 isn't considered
under-utilized because 13-10=3%, which is below its 4% usage. You can use a
different threshold than the default 10% (hadoop balancer -threshold 5).
Read more here:
http://hadoop.apache.org/core/docs/current/hdfs_user_guide.html#Rebalancer

Tamir
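
If I'm reading the balancer's rule right, the numbers here work out roughly
as: (16.39 + 13.86 + 20.28 + 4.13) / 4 is about 13.7% average utilization, so
with the default 10% threshold only nodes below about 3.7% or above about
23.7% are moved. Node 4 at 4.13% is just inside that band, hence the "already
balanced" message, while -threshold 5 narrows the band to roughly 8.7%-18.7%
and node 4 would then receive blocks.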


On Mon, Apr 27, 2009 at 11:36 AM, Usman Waheed  wrote:

> Hi,
> I had sent out an email yesterday asking about how to balance the cluster
> after setting the replication level to 2. I have 4 datanodes and one
> namenode in my setup.
> Using the -R switch with -setrep did the trick but one of my nodes became
> under utilized. I then ran hadoop balancer and it did help but upto a
> certain extent.
>
> Datanode 4 noted below is now up to almost 5% but when i try to balance the
> datanode again using the "hadoop balance" command it says that the cluster
> is already balanced which isnt.
> I wonder if there is an alternate way(s) or maybe overtime Datanode-4 will
> pick up more blocks?
>
> Any clues?
>
> Thanks,
> Usman
>
> Name: 1
> State  : In Service
> Total raw bytes: 293778976768 (273.6 GB)
> Remaining raw bytes: 35858599(206.97 GB)
> Used raw bytes: 48140136448 (44.83 GB)
> % used: 16.39%
> Last contact: Mon Apr 27 08:34:46 UTC 2009
>
>
> Name: 2
> State  : In Service
> Total raw bytes: 293778976768 (273.6 GB)
> Remaining raw bytes: 231235100994(215.35 GB)
> Used raw bytes: 40704245760 (37.91 GB)
> % used: 13.86%
> Last contact: Mon Apr 27 08:34:45 UTC 2009
>
>
> Name: 3
> State  : In Service
> Total raw bytes: 293778976768 (273.6 GB)
> Remaining raw bytes: 211936026161(197.38 GB)
> Used raw bytes: 59591700480 (55.5 GB)
> % used: 20.28%
> Last contact: Mon Apr 27 08:34:45 UTC 2009
>
>
> *Name: 4
> *State  : In Service
> Total raw bytes: 293778976768 (273.6 GB)
> Remaining raw bytes: 258876991693(241.1 GB)
> Used raw bytes: 12142653440 (11.31 GB)
> % used: 4.13%
> Last contact: Mon Apr 27 08:34:46 UTC 2009
>
>


Blocks replication in downtime even

2009-04-27 Thread Stas Oskin
Hi.

I have a question:

If I have N DataNodes, and one or several of the nodes become unavailable,
will HDFS re-synchronize the blocks automatically, according to the
replication level set?
And if yes, when: as soon as the offline node is detected, or only on file
access?

Regards.


write a large file to HDFS?

2009-04-27 Thread Xie, Tao

hi, 
If I write a large file to HDFS, will it be split into blocks, with multiple
blocks written to HDFS at the same time, or can HDFS only write block by
block?
Thanks.
-- 
View this message in context: 
http://www.nabble.com/write-a-large-file-to-HDFS--tp23252754p23252754.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.



Balancing datanodes - Running hadoop 0.18.3

2009-04-27 Thread Usman Waheed

Hi,
I had sent out an email yesterday asking about how to balance the 
cluster after setting the replication level to 2. I have 4 datanodes and 
one namenode in my setup.
Using the -R switch with -setrep did the trick, but one of my nodes 
became under-utilized. I then ran hadoop balancer and it did help, but only 
up to a certain extent.


Datanode 4 noted below is now up to almost 5%, but when I try to balance 
the datanode again using the "hadoop balancer" command it says that the 
cluster is already balanced, which it isn't.
I wonder if there is an alternate way, or maybe over time Datanode 4 
will pick up more blocks?


Any clues?

Thanks,
Usman

Name: 1
State  : In Service
Total raw bytes: 293778976768 (273.6 GB)
Remaining raw bytes: 35858599(206.97 GB)
Used raw bytes: 48140136448 (44.83 GB)
% used: 16.39%
Last contact: Mon Apr 27 08:34:46 UTC 2009


Name: 2
State  : In Service
Total raw bytes: 293778976768 (273.6 GB)
Remaining raw bytes: 231235100994(215.35 GB)
Used raw bytes: 40704245760 (37.91 GB)
% used: 13.86%
Last contact: Mon Apr 27 08:34:45 UTC 2009


Name: 3
State  : In Service
Total raw bytes: 293778976768 (273.6 GB)
Remaining raw bytes: 211936026161(197.38 GB)
Used raw bytes: 59591700480 (55.5 GB)
% used: 20.28%
Last contact: Mon Apr 27 08:34:45 UTC 2009


*Name: 4
*State  : In Service
Total raw bytes: 293778976768 (273.6 GB)
Remaining raw bytes: 258876991693(241.1 GB)
Used raw bytes: 12142653440 (11.31 GB)
% used: 4.13%
Last contact: Mon Apr 27 08:34:46 UTC 2009



Re: IO Exception in Map Tasks

2009-04-27 Thread Rakhi Khatwani
Thanks Jason,
  is there any way we can avoid this exception??

Thanks,
Raakhi

On Mon, Apr 27, 2009 at 1:20 PM, jason hadoop wrote:

> The jvm had a hard failure and crashed
>
>
> On Sun, Apr 26, 2009 at 11:34 PM, Rakhi Khatwani
> wrote:
>
> > Hi,
> >
> >  In one of the map tasks, i get the following exception:
> >  java.io.IOException: Task process exit with nonzero status of 255.
> > at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:424)
> >
> > java.io.IOException: Task process exit with nonzero status of 255.
> > at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:424)
> >
> > what could be the reason?
> >
> > Thanks,
> > Raakhi
> >
>
>
>
> --
> Alpha Chapters of my book on Hadoop are available
> http://www.apress.com/book/view/9781430219422
>


Re: IO Exception in Map Tasks

2009-04-27 Thread jason hadoop
The jvm had a hard failure and crashed


On Sun, Apr 26, 2009 at 11:34 PM, Rakhi Khatwani
wrote:

> Hi,
>
>  In one of the map tasks, i get the following exception:
>  java.io.IOException: Task process exit with nonzero status of 255.
> at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:424)
>
> java.io.IOException: Task process exit with nonzero status of 255.
> at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:424)
>
> what could be the reason?
>
> Thanks,
> Raakhi
>



-- 
Alpha Chapters of my book on Hadoop are available
http://www.apress.com/book/view/9781430219422


Re: Storing data-node content to other machine

2009-04-27 Thread jason hadoop
There is no requirement that your HDFS and MapReduce clusters share an
installation directory; it is just done that way because it is simple and
most people have a datanode and tasktracker on each slave node.

Simply have two configuration directories on your cluster machines, use the
bin/start-dfs.sh script in one and the bin/start-mapred.sh script in the
other, and maintain different slaves files in the two directories.

You will lose the benefit of data locality for the tasktrackers that do
not reside on the datanode machines.

On Sun, Apr 26, 2009 at 10:06 PM, Vishal Ghawate <
vishal_ghaw...@persistent.co.in> wrote:

> Hi,
> I want to store the contents of all the client machines (datanodes) of the
> hadoop cluster on a centralized machine with high storage capacity, so that
> the tasktracker runs on the client machine but the contents are stored on
> the centralized machine.
> Can anybody help me with this please?
>



-- 
Alpha Chapters of my book on Hadoop are available
http://www.apress.com/book/view/9781430219422