Re: No locks available

2011-01-17 Thread Adarsh Sharma

Edward Capriolo wrote:

On Mon, Jan 17, 2011 at 8:13 AM, Adarsh Sharma  wrote:
  

Harsh J wrote:


Could you re-check your permissions on the $(dfs.data.dir)s for your
failing DataNode versus the user that runs it?

On Mon, Jan 17, 2011 at 6:33 PM, Adarsh Sharma 
wrote:

  

Can i know why it occurs.


  

Thanks Harsh, I know this issue and I have cross-checked the permissions
of all the dirs (dfs.name.dir, dfs.data.dir, mapred.local.dir) several times.

They are 755 and owned by the hadoop user and group.

I found that the failed datanode is unable to create the files in
dfs.data.dir that a successful datanode creates:

current
tmp
storage
in_use.lock

Does that help?

Thanks




No locks available can mean that you are trying to use hadoop on a
filesystem that does not support file level locking. Are you trying to
run your name node storage in NFS space?
  

I am sorry, but my Namenode is on a separate machine outside the Cloud.

The path is /home/hadoop/project/hadoop-0.20.2/name

It is running properly.

I find it puzzling because I followed the same steps on the other 2
VMs and they are running.

How could I debug this one exceptional case where it is failing?


Thanks & Regards

Adarsh Sharma


BZip2Codec memory usage for map output compression?

2011-01-17 Thread Attila Csordas
Hi,

How can memory usage be calculated when BZip2Codec is used for map output compression?

Cheers,
Attila


Re: Question about Hadoop Default FCFS Job Scheduler

2011-01-17 Thread Nan Zhu
OK, I got your point.

You mean: why don't we put the for loop inside obtainNewLocalMapTask()?

Yes, I think we could do that, but the result would be the same as with the
current code, and I don't think it would bring much performance benefit;
personally, I like the current style. :-)

Best,

Nan

On Tue, Jan 18, 2011 at 12:24 AM, He Chen  wrote:

> Hi Nan,
>
> Thank you for the reply. I understand what you mean. What I concern is
> inside the "obtainNewLocalMapTask(...)" method, it only assigns one tasks a
> time.
>
> Now I understand why it only assigns one task at a time. It is because the
> outside loop:
>
> for (i = 0; i < MapperCapacity; ++i){
>
> (..)
>
> }
>
> I mean why this loop exists here. Why does the scheduler use this type of
> loop. It imposes overhead to the task assigning process if only assign one
> task at a time. It is obviously that a node can be assigned all available
> local tasks it can in one "afford obtainNewLocalMapTask(..)" method
> call.
>
> Bests
>
> Chen
>
> On Mon, Jan 17, 2011 at 8:28 AM, Nan Zhu  wrote:
>
> > Hi, Chen
> >
> > How is it going recently?
> >
> > Actually I think you misundertand the code in assignTasks() in
> > JobQueueTaskScheduler.java, see the following structure of the
> interesting
> > codes:
> >
> > //I'm sorry, I hacked the code so much, the name of the variables may be
> > different from the original version
> >
> > for (i = 0; i < MapperCapacity; ++i){
> >   ...
> >   for (JobInProgress job:jobQueue){
> >   //try to shedule a node-local or rack-local map tasks
> >   //here is the interesting place
> >   t = job.obtainNewLocalMapTask(...);
> >   if (t != null){
> >  ...
> >  break;//the break statement here will make the control flow back
> > to "for (job:jobQueue)" which means that it will restart map tasks
> > selection
> > procedure from the first job, so , it is actually schedule all of the
> first
> > job's local mappers first until the map slots are full
> >   }
> >   }
> > }
> >
> > BTW, we can only schedule a reduce task in a single heartbeat
> >
> >
> >
> > Best,
> > Nan
> > On Sat, Jan 15, 2011 at 1:45 PM, He Chen  wrote:
> >
> > > Hey all
> > >
> > > Why does the FCFS scheduler only let a node chooses one task at a time
> in
> > > one job? In order to increase the data locality,
> > > it is reasonable to let a node to choose all its local tasks (if it
> can)
> > > from a job at a time.
> > >
> > > Any reply will be appreciated.
> > >
> > > Thanks
> > >
> > > Chen
> > >
> >
>


Re: Question about Hadoop Default FCFS Job Scheduler

2011-01-17 Thread Nan Zhu
Hi, Chen

Actually, it is not one task each time.

See this statement:

 assignedTasks.add(t);

assignedTasks is the return value of this method, and it is a collection of
the selected tasks; it will contain multiple tasks if there are candidates.

Best,

Nan

On Tue, Jan 18, 2011 at 12:24 AM, He Chen  wrote:

> Hi Nan,
>
> Thank you for the reply. I understand what you mean. What I concern is
> inside the "obtainNewLocalMapTask(...)" method, it only assigns one tasks a
> time.
>
> Now I understand why it only assigns one task at a time. It is because the
> outside loop:
>
> for (i = 0; i < MapperCapacity; ++i){
>
> (..)
>
> }
>
> I mean why this loop exists here. Why does the scheduler use this type of
> loop. It imposes overhead to the task assigning process if only assign one
> task at a time. It is obviously that a node can be assigned all available
> local tasks it can in one "afford obtainNewLocalMapTask(..)" method
> call.
>
> Bests
>
> Chen
>
> On Mon, Jan 17, 2011 at 8:28 AM, Nan Zhu  wrote:
>
> > Hi, Chen
> >
> > How is it going recently?
> >
> > Actually I think you misundertand the code in assignTasks() in
> > JobQueueTaskScheduler.java, see the following structure of the
> interesting
> > codes:
> >
> > //I'm sorry, I hacked the code so much, the name of the variables may be
> > different from the original version
> >
> > for (i = 0; i < MapperCapacity; ++i){
> >   ...
> >   for (JobInProgress job:jobQueue){
> >   //try to shedule a node-local or rack-local map tasks
> >   //here is the interesting place
> >   t = job.obtainNewLocalMapTask(...);
> >   if (t != null){
> >  ...
> >  break;//the break statement here will make the control flow back
> > to "for (job:jobQueue)" which means that it will restart map tasks
> > selection
> > procedure from the first job, so , it is actually schedule all of the
> first
> > job's local mappers first until the map slots are full
> >   }
> >   }
> > }
> >
> > BTW, we can only schedule a reduce task in a single heartbeat
> >
> >
> >
> > Best,
> > Nan
> > On Sat, Jan 15, 2011 at 1:45 PM, He Chen  wrote:
> >
> > > Hey all
> > >
> > > Why does the FCFS scheduler only let a node chooses one task at a time
> in
> > > one job? In order to increase the data locality,
> > > it is reasonable to let a node to choose all its local tasks (if it
> can)
> > > from a job at a time.
> > >
> > > Any reply will be appreciated.
> > >
> > > Thanks
> > >
> > > Chen
> > >
> >
>


Re: Question about Hadoop Default FCFS Job Scheduler

2011-01-17 Thread He Chen
Hi Nan,

Thank you for the reply. I understand what you mean. What concerns me is that
inside the "obtainNewLocalMapTask(...)" method, only one task is assigned at a
time.

Now I understand why it only assigns one task at a time. It is because of the
outer loop:

for (i = 0; i < MapperCapacity; ++i){

(..)

}

What I mean is: why does this loop exist here? Why does the scheduler use this
type of loop? It adds overhead to the task-assigning process if only one task
is assigned at a time. Obviously, a node could be assigned all the available
local tasks it can afford in one "obtainNewLocalMapTask(...)" method call.

Bests

Chen

On Mon, Jan 17, 2011 at 8:28 AM, Nan Zhu  wrote:

> Hi, Chen
>
> How is it going recently?
>
> Actually I think you misundertand the code in assignTasks() in
> JobQueueTaskScheduler.java, see the following structure of the interesting
> codes:
>
> //I'm sorry, I hacked the code so much, the name of the variables may be
> different from the original version
>
> for (i = 0; i < MapperCapacity; ++i){
>   ...
>   for (JobInProgress job:jobQueue){
>   //try to shedule a node-local or rack-local map tasks
>   //here is the interesting place
>   t = job.obtainNewLocalMapTask(...);
>   if (t != null){
>  ...
>  break;//the break statement here will make the control flow back
> to "for (job:jobQueue)" which means that it will restart map tasks
> selection
> procedure from the first job, so , it is actually schedule all of the first
> job's local mappers first until the map slots are full
>   }
>   }
> }
>
> BTW, we can only schedule a reduce task in a single heartbeat
>
>
>
> Best,
> Nan
> On Sat, Jan 15, 2011 at 1:45 PM, He Chen  wrote:
>
> > Hey all
> >
> > Why does the FCFS scheduler only let a node chooses one task at a time in
> > one job? In order to increase the data locality,
> > it is reasonable to let a node to choose all its local tasks (if it can)
> > from a job at a time.
> >
> > Any reply will be appreciated.
> >
> > Thanks
> >
> > Chen
> >
>


Re: question about Hadoop job conf

2011-01-17 Thread Harsh J
Set them to final if you don't want the default values being applied.
A <final>true</final> addition should solve your problem (although it
may generate some warnings when your job tries to override them with
their defaults).

(Default-value XML files are inside the Hadoop jars and are usually picked
up when a JobConf/Configuration is created, unless your cluster's
configuration is on the CLASSPATH to override them.)

Although I am wondering what you gain by changing fs.checkpoint.size
for a task: it isn't read by the MapReduce code base at all, and is
only read by HDFS's Checkpointer/Secondary NameNode at initialization.
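
For illustration, a minimal sketch of how such an entry in mapred-site.xml
could look (the property and value are taken from your mail; the
<final>true</final> element is what blocks job-side overrides):

<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx512 -Xincgc</value>
  <final>true</final>
</property>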

On Mon, Jan 17, 2011 at 8:11 PM, xiufeng liu  wrote:
> Hi,
>
>
> The following is the setting of mapred-site where I have set the *
> mapred.child.java.opts* to *-Xmx512 -Xincgc*, and *fs.checkpoint.size* to *
> 268435456*.  But in the runtime setting job.xml, I found that it is still
> using the default value *mapred.child.java.opts*= *-Xmx200, and the
> *fs.checkpoint.size=67108864,
> instead of the values in mapred-site.xml ?  Could anybody advise? Thanks!
>
> -afancy
>
> [xiliu@xiliu-fedora conf]$ cat mapred-site.xml
> 
> 
>
> 
>
> 
> 
>    mapred.job.tracker
>    xiliu-fedora:9001
> 
> 
>        mapred.local.dir
>        /data1/hadoop-0.20.2/mapred/
> 
> 
>        mapred.tasktracker.map.tasks.maximum
>        4
> 
> 
>        mapred.tasktracker.reduce.tasks.maximum
>        4
> 
> 
>    fs.checkpoint.size
>    268435456
> 
> 
>        mapred.child.java.opts
>        -Xmx51 -Xincgc
> 
> 
>
>
>
> **
>



-- 
Harsh J
www.harshj.com


Re: Why Hadoop is slow in Cloud

2011-01-17 Thread Edward Capriolo
On Mon, Jan 17, 2011 at 6:08 AM, Steve Loughran  wrote:
> On 17/01/11 04:11, Adarsh Sharma wrote:
>>
>> Dear all,
>>
>> Yesterday I performed a kind of testing between *Hadoop in Standalone
>> Servers* & *Hadoop in Cloud.
>>
>> *I establish a Hadoop cluster of 4 nodes ( Standalone Machines ) in
>> which one node act as Master ( Namenode , Jobtracker ) and the remaining
>> nodes act as slaves ( Datanodes, Tasktracker ).
>> On the other hand, for testing Hadoop in *Cloud* ( Euclayptus ), I made
>> one Standalone Machine as *Hadoop Master* and the slaves are configured
>> on the VM's in Cloud.
>>
>> I am confused about the stats obtained after the testing. What I
>> concluded that the VM are giving half peformance as compared with
>> Standalone Servers.
>
> Interesting stats, nothing that massively surprises me, especially as your
> benchmarks are very much streaming through datasets. If you were doing
> something more CPU intensive (graph work, for example), things wouldn't look
> so bad
>
> I've done stuff in this area.
> http://www.slideshare.net/steve_l/farming-hadoop-inthecloud
>
>
>
>>
>> I am expected some slow down but at this level I never expect. Would
>> this is genuine or there may be some configuration problem.
>>
>> I am using 1 GB (10-1000mb/s) LAN in VM machines and 100mb/s in
>> Standalone Servers.
>>
>> Please have a look on the results and if interested comment on it.
>>
>
>
> The big killer here is File IO, with today's HDD controllers and virtual
> filesystems, disk IO is way underpowered compared to physical disk IO.
> Networking is reduced (but improving), and CPU can be pretty good, but disk
> is bad.
>
>
> Why?
>
> 1.  Every access to a block in the VM is turned into virtual disk controller
> operations which are then interpreted by the VDC and turned into
> reads/writes in the virtual disk drive
>
> 2. which is turned into seeks, reads and writes in the physical hardware.
>
> Some workarounds
>
> -allocate physical disks for the HDFS filesystem, for the duration of the
> VMs.
>
> -have the local hosts serve up a bit of their filesystem on a fast protocol
> (like NFS), and have every VM mount the local physical NFS filestore as
> their hadoop data dirs.
>
>

Q: "Why is my Nintendo emulator slow on a 800 MHZ computer made 10
years after Nintendo?"
A: Emulation

Everything you emulate you cut X% performance right off the top.

Emulation is great when you want to run mac on windows or freebsd on
linux or nintendo on linux. However most people would do better with
technologies that use kernel level isolation such as Linux containers,
Solaris Zones, Linux VServer (my favorite) http://linux-vserver.org/,
User Mode Linux or similar technologies that ISOLATE rather then
EMULATE.

Sorry list I feel I rant about this bi-annually. I have just always
been so shocked about how many people get lured into cloud and
virtualized solutions for "better management" and "near native
performance"


question about Hadoop job conf

2011-01-17 Thread xiufeng liu
Hi,


The following is my mapred-site.xml, where I have set mapred.child.java.opts
to -Xmx512 -Xincgc and fs.checkpoint.size to 268435456. But in the runtime
job.xml I found that it is still using the default values
mapred.child.java.opts=-Xmx200 and fs.checkpoint.size=67108864 instead of the
values in mapred-site.xml. Could anybody advise? Thanks!

-afancy

[xiliu@xiliu-fedora conf]$ cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>xiliu-fedora:9001</value>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>/data1/hadoop-0.20.2/mapred/</value>
  </property>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>
  <property>
    <name>fs.checkpoint.size</name>
    <value>268435456</value>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx51 -Xincgc</value>
  </property>
</configuration>


Re: Question about Hadoop Default FCFS Job Scheduler

2011-01-17 Thread Nan Zhu
Hi, Chen

How is it going recently?

Actually, I think you misunderstand the code in assignTasks() in
JobQueueTaskScheduler.java; see the following structure of the interesting
code:

// I'm sorry, I have hacked the code so much that the variable names may
// differ from the original version

for (i = 0; i < MapperCapacity; ++i){
    ...
    for (JobInProgress job : jobQueue){
        // try to schedule a node-local or rack-local map task
        // here is the interesting place
        t = job.obtainNewLocalMapTask(...);
        if (t != null){
            ...
            // the break here sends control back to the outer loop, which then
            // re-enters "for (job : jobQueue)" from the first job; so it actually
            // schedules all of the first job's local mappers first, until the
            // map slots are full
            break;
        }
    }
}

BTW, we can only schedule a single reduce task per heartbeat.
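
To make that control flow concrete, here is a toy, self-contained sketch of the
same pattern (the Job and Task classes below are simplified placeholders I made
up, not the real Hadoop classes): the outer loop fills one map slot per
iteration, and the inner loop always re-scans the job queue from the head, so
the first job's local tasks are drained before any later job gets a slot.

import java.util.ArrayList;
import java.util.List;

public class FifoSchedulingSketch {
    static class Task {
        final String name;
        Task(String name) { this.name = name; }
    }

    static class Job {
        private final List<Task> localTasks = new ArrayList<Task>();
        Job(String id, int numTasks) {
            for (int i = 0; i < numTasks; i++) {
                localTasks.add(new Task(id + "-map-" + i));
            }
        }
        // Mirrors obtainNewLocalMapTask(): hands out at most one task per call.
        Task obtainNewLocalMapTask() {
            return localTasks.isEmpty() ? null : localTasks.remove(0);
        }
    }

    public static void main(String[] args) {
        List<Job> jobQueue = new ArrayList<Job>();
        jobQueue.add(new Job("job1", 3));
        jobQueue.add(new Job("job2", 3));

        int mapSlotsFree = 4;
        List<Task> assignedTasks = new ArrayList<Task>();

        for (int slot = 0; slot < mapSlotsFree; slot++) {
            Task picked = null;
            for (Job job : jobQueue) {                // always scan from the head of the queue
                picked = job.obtainNewLocalMapTask(); // one task per call
                if (picked != null) {
                    assignedTasks.add(picked);
                    break;                            // back to the outer loop
                }
            }
            if (picked == null) {
                break;                                // no job has anything left to offer
            }
        }

        // Prints job1-map-0, job1-map-1, job1-map-2, job2-map-0:
        // job1 is drained before job2 gets its first slot.
        for (Task t : assignedTasks) {
            System.out.println(t.name);
        }
    }
}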



Best,
Nan
On Sat, Jan 15, 2011 at 1:45 PM, He Chen  wrote:

> Hey all
>
> Why does the FCFS scheduler only let a node chooses one task at a time in
> one job? In order to increase the data locality,
> it is reasonable to let a node to choose all its local tasks (if it can)
> from a job at a time.
>
> Any reply will be appreciated.
>
> Thanks
>
> Chen
>


Re: HDFS and Input Splits

2011-01-17 Thread Harsh J
On Mon, Jan 17, 2011 at 6:11 PM, Marco Didonna  wrote:
> Hello everyone,
> I am pretty new to hadoop and I have started learning it thanks to Tom
> White's book. There is something I still do not understand though: it's
> about the splitting of the input data in order to distribute the work
> load to a cluster of machines. I would like to discuss two possible
> scenarios and ask some question.
>
> For both scenario let's suppose the input data is a single 5GB text file.
>
> ::Scenario n.1::
>
> The 5GB input file is put on HDFS. According to the default settings it
> will be blindly split into 64MB chunks and sent to the various datanodes
> in a redundant flavor (according to the replica factor). Let's suppose I
> need to perform a line oriented analysis so my unit of analysis is a
> single line. Let's suppose I use a TextInputFormat which allows to use
> the entire line as value and the file offset (ignored) as key.
>
> On machine N I have this chunk of text[1]:
>
> "[...] MIDWAY upon the journey of our life
> I found myself within a forest dark,
> For the straightforward pathway had been lost.
> Ah me! how hard a"
>
> And on machine K I have this other chunk:
>
> "thing it is to say
> What was this forest savage, rough, and stern,
> Which in the very thought renews the fear. [...]"
>
> How can the mapper running on machine N reconstruct the correct prase
> "Ah me! how hard a thing it is to say". Maybe it will ask the namenode
> the address of the datanode holding the next file block?

The DFSInputStream can read across blocks (see the implementation of the
read methods inside DFSInputStream, which is what an FS.open() call gives
you). LineReader, which is used to read lines off it, is only interested in
reading until its designated split length (in bytes) is reached; a split may
read past its end into the next block to finish its last line, and every
split other than the first skips the first (possibly partial) line it reads.
Also read the wiki on Hadoop's MapReduce,
http://wiki.apache.org/hadoop/HadoopMapReduce, which explains this
behavior.
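
As a small client-side illustration, here is a minimal sketch (the file path,
offset and block size are made-up values, and it assumes a Configuration that
points at your cluster) of opening a file through the FileSystem API and
reading straight across a block boundary; DFSInputStream hides the block
switch entirely:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CrossBlockRead {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();   // assumes the cluster config is on the classpath
        FileSystem fs = FileSystem.get(conf);

        long blockSize = 64L * 1024 * 1024;         // assumed 64MB block size
        Path file = new Path("/books/inferno.txt"); // hypothetical large text file

        FSDataInputStream in = fs.open(file);
        in.seek(blockSize - 16);                    // position just before the first block ends

        // readFully() loops over the underlying stream, so the first 16 bytes come
        // from block 1 and the remaining 48 from block 2 (possibly another datanode);
        // the caller never has to ask the namenode for the next block itself.
        byte[] buf = new byte[64];
        in.readFully(buf);
        System.out.println("Across the boundary: " + new String(buf, "UTF-8"));

        in.close();
        fs.close();
    }
}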

> ::Scenario n.2::
>
> The 5GB is stored on the machine (local filesystem) submitting the job
> to the hadoop cluster. I suppose this machine will split the file evenly
> (using line endings as possible split points) and send the chunks to the
> node in the cluster.
> Is that correct? How is replication performed in this scenario?

Replication is not performed immediately. It is done on the DataNode
'heartbeat': the NameNode identifies whether it needs to perform any
replications of a new (or changed) block and assigns the task to a DN
waiting for its heartbeat response (i.e., waiting for action).

Splits are also not made at line boundaries (which is good, because not
all files are text files), but at the specified block-size byte
boundaries. The reader logic, as mentioned above, takes care of proper
line reading across mappers' splits.

(Please correct me if I'm wrong anywhere.)

-- 
Harsh J
www.harshj.com


Re: No locks available

2011-01-17 Thread Adarsh Sharma

Edward Capriolo wrote:

On Mon, Jan 17, 2011 at 8:13 AM, Adarsh Sharma  wrote:
  

Harsh J wrote:


Could you re-check your permissions on the $(dfs.data.dir)s for your
failing DataNode versus the user that runs it?

On Mon, Jan 17, 2011 at 6:33 PM, Adarsh Sharma 
wrote:

  

Can i know why it occurs.


  

Thanks Harsh, I know this issue and I have cross-checked the permissions
of all the dirs (dfs.name.dir, dfs.data.dir, mapred.local.dir) several times.

They are 755 and owned by the hadoop user and group.

I found that the failed datanode is unable to create the files in
dfs.data.dir that a successful datanode creates:

current
tmp
storage
in_use.lock

Does that help?

Thanks




No locks available can mean that you are trying to use hadoop on a
filesystem that does not support file level locking. Are you trying to
run your name node storage in NFS space?
  


Yes Edward, you are absolutely right.

I mounted a hard disk path into the Datanode (VM) for dfs.data.dir.

But it causes no problem on the other nodes.

Thanks



Re: When applying a patch, which attachment should I use?

2011-01-17 Thread Adarsh Sharma

Thanks a lot, Edward.

This information is very helpful to me.



With Best Regards

Adarsh Sharma




edward choi wrote:

Dear Adarsh,

I have a single machine running Namenode/JobTracker/Hbase Master.
There are 17 machines running Datanode/TaskTracker
Among those 17 machines, 14 are running Hbase Regionservers.
The other 3 machines are running Zookeeper.

And about the Zookeeper,
Hbase comes with its own Zookeeper so you don't need to install a new
Zookeeper. (except for the special occasion, which I'll explain later)
I assigned 14 machines as regionservers using
"$HBASE_HOME/conf/regionservers".
I assigned 3 machines as Zookeepers using the "hbase.zookeeper.quorum" property
in "$HBASE_HOME/conf/hbase-site.xml".
Don't forget to set "export HBASE_MANAGES_ZK=true"
in "$HBASE_HOME/conf/hbase-env.sh". (This is where you announce that you
will be using Zookeeper that comes with HBase)
This way, when you execute "$HBASE_HOME/bin/start-hbase.sh", HBase will
automatically start Zookeeper first, then start HBase daemons.

Also, you can install your own Zookeeper and tell HBase to use it instead of
its own.
I read it on the internet that Zookeeper that comes with HBase does not work
properly on Windows 7 64bit. (
http://alans.se/blog/2010/hadoop-hbase-cygwin-windows-7-x64/)
So in that case you need to install your own Zookeeper, set it up properly,
and tell HBase to use it instead of its own.
All you need to do is configure zoo.cfg and add it to the HBase CLASSPATH.
And don't forget to set "export HBASE_MANAGES_ZK=false"
in "$HBASE_HOME/conf/hbase-env.sh".
This way, HBase will not start Zookeeper automatically.

About the separation of Zookeepers from regionservers,
Yes, it is recommended to separate Zookeepers from regionservers.
But that won't be necessary unless your clusters are very heavily loaded.
They also suggest that you give Zookeeper its own hard disk. But I haven't
done that myself yet. (Hard disks cost money you know)
So I'd say your cluster seems fine.
But when you want to expand your cluster, you'd need some changes. I suggest
you take a look at "Hadoop: The Definitive Guide".

Regards,
Edward

2011/1/13 Adarsh Sharma 

  

Thanks Edward,

Can you describe me the architecture used in your configuration.

For e.g., I have a cluster of 10 servers and

1 node acts as ( Namenode, Jobtracker, Hmaster ).
The remaining 9 nodes act as ( Slaves, Datanodes, Tasktrackers, Hregionservers
).
Among these 9 nodes I also set 3 nodes in zookeeper.quorum.property.

I want to know that is it necessary to configure zookeeper separately with
the zookeeper-3.2.2 package or just have some IP's listed in

zookeeper.quorum.property and Hbase take care of it.

Can we specify IP's of Hregionservers used before as zookeeper servers (
HQuorumPeer ) or we must need separate servers for it.

My problem arises in running zookeeper. My Hbase is up and running  in
fully distributed mode too.




With Best Regards

Adarsh Sharma








edward choi wrote:



Dear Adarsh,

My situation is somewhat different from yours as I am only running Hadoop
and Hbase (as opposed to Hadoop/Hive/Hbase).

But I hope my experience could be of help to you somehow.

I applied the "hdfs-630-0.20-append.patch" to every single Hadoop node.
(including master and slaves)
Then I followed exactly what they told me to do on

http://hbase.apache.org/docs/current/api/overview-summary.html#overview_description
.

I didn't get a single error message and successfully started HBase in a
fully distributed mode.

I am not using Hive so I can't tell what caused the
MasterNotRunningException, but the patch above is meant to  allow
DFSClients
pass NameNode lists of known dead Datanodes.
I doubt that the patch has anything to do with MasterNotRunningException.

Hope this helps.

Regards,
Ed

2011/1/13 Adarsh Sharma 



  

I am also facing some issues  and i think applying

hdfs-630-0.20-append.patch<

https://issues.apache.org/jira/secure/attachment/12446812/hdfs-630-0.20-append.patch
   would solve my problem.

I try to run Hadoop/Hive/Hbase integration in fully Distributed mode.

But I am facing master Not Running Exception mentioned in

http://wiki.apache.org/hadoop/Hive/HBaseIntegration.

My Hadoop Version= 0.20.2, Hive =0.6.0 , Hbase=0.20.6.

What you think Edward.


Thanks  Adarsh






edward choi wrote:





I am not familiar with this whole svn and patch stuff, so please
understand
my asking.

I was going to apply
hdfs-630-0.20-append.patch<

https://issues.apache.org/jira/secure/attachment/12446812/hdfs-630-0.20-append.patch
 only
because I wanted to install HBase and the installation guide told me to.
The append branch you mentioned, does that include
hdfs-630-0.20-append.patch<

https://issues.apache.org/jira/secure/attachment/12446812/hdfs-630-0.20-append.patch
 as
well?
Is it like the latest patch with all the good stuff packed in one?

Regards,
Ed

2011/1/12 Ted Dunning 





  

You may also be interested in the append branch:

http://

Re: No locks available

2011-01-17 Thread Edward Capriolo
On Mon, Jan 17, 2011 at 8:13 AM, Adarsh Sharma  wrote:
> Harsh J wrote:
>>
>> Could you re-check your permissions on the $(dfs.data.dir)s for your
>> failing DataNode versus the user that runs it?
>>
>> On Mon, Jan 17, 2011 at 6:33 PM, Adarsh Sharma 
>> wrote:
>>
>>>
>>> Can i know why it occurs.
>>>
>>
>>
>
> Thanx Harsh , I know this issue and I cross-check several times permissions
> of of all dirs ( dfs.name.dir, dfs.data.dir, mapred.local.dir ).
>
> It is 755 and is owned by hadoop user and group.
>
> I found that in failed datanode dir , it is unable to create 5 files in
> dfs.data.dir whereas on the other hand, it creates following files in
> successsful datanode :
>
> curent
> tmp
> storage
> in_use.lock
>
> Does it helps.
>
> Thanx
>

No locks available can mean that you are trying to use hadoop on a
filesystem that does not support file level locking. Are you trying to
run your name node storage in NFS space?
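
If you want to check that quickly, here is a minimal, hypothetical probe (it
is not part of Hadoop; point it at the dfs.data.dir of the failing node) that
attempts the same kind of FileChannel.tryLock() the DataNode does when it
creates in_use.lock:

import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

public class LockProbe {
    public static void main(String[] args) throws IOException {
        // args[0]: the directory to test, e.g. the failing node's dfs.data.dir
        File probe = new File(args[0], "probe.lock");
        RandomAccessFile raf = new RandomAccessFile(probe, "rws");
        FileChannel channel = raf.getChannel();
        try {
            FileLock lock = channel.tryLock();
            if (lock != null) {
                System.out.println("File locking works on this mount.");
                lock.release();
            } else {
                System.out.println("The lock is currently held by another process.");
            }
        } catch (IOException e) {
            // On an NFS mount without working lock services this typically fails
            // with "No locks available", just like the DataNode log above.
            System.out.println("Locking failed: " + e.getMessage());
        } finally {
            channel.close();
            raf.close();
            probe.delete();
        }
    }
}

If the probe fails only on that one mount, compare its mount options (for
example a nolock option) with the mounts on the nodes that work.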


Re: No locks available

2011-01-17 Thread Adarsh Sharma

Harsh J wrote:

Could you re-check your permissions on the $(dfs.data.dir)s for your
failing DataNode versus the user that runs it?

On Mon, Jan 17, 2011 at 6:33 PM, Adarsh Sharma  wrote:
  

Can i know why it occurs.



  
Thanks Harsh, I know this issue and I have cross-checked the permissions
of all the dirs (dfs.name.dir, dfs.data.dir, mapred.local.dir) several times.

They are 755 and owned by the hadoop user and group.

I found that the failed datanode is unable to create the files in
dfs.data.dir that a successful datanode creates:

current
tmp
storage
in_use.lock

Does that help?

Thanks


Re: No locks available

2011-01-17 Thread Harsh J
Could you re-check your permissions on the $(dfs.data.dir)s for your
failing DataNode versus the user that runs it?

On Mon, Jan 17, 2011 at 6:33 PM, Adarsh Sharma  wrote:
> Can i know why it occurs.

-- 
Harsh J
www.harshj.com


Re: No locks available

2011-01-17 Thread Adarsh Sharma

xiufeng liu wrote:

did you format the namenode before you start? try to format it and start:
1) go to HADOOP_HOME/bin

2) ./hadoop namenode -format
  


I formatted the namenode and then issued the command:

bin/start-all.sh

This results in 2 of my datanodes running properly, but causes the
exception below for one datanode.

Can I know why it occurs?

Thanks




On Mon, Jan 17, 2011 at 1:43 PM, Adarsh Sharma wrote:

  

Dear all,


I know this a silly mistake but not able to find the reason of the
exception that causes one datanode to fail to start.

I mount  /hdd2-1 of a phsical machine into this VM and start
datanode,tasktracker.

Datanode fails after few seconds.

Can someone tell  me  the root  cause.

Below is the exception :

2011-01-17 18:01:08,199 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = hadoop7/172.16.1.8
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build =
https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r
911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
/
2011-01-17 18:03:36,391 INFO org.apache.hadoop.hdfs.server.common.Storage:
java.io.IOException: No locks available
  at sun.nio.ch.FileChannelImpl.lock0(Native Method)
  at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:881)
  at java.nio.channels.FileChannel.tryLock(FileChannel.java:962)
  at
org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.tryLock(Storage.java:527)
  at
org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:505)
  at
org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:363)
  at
org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:112)
  at
org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:298)
  at
org.apache.hadoop.hdfs.server.datanode.DataNode.(DataNode.java:216)
  at
org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1283)
  at
org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1238)
  at
org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1246)
  at
org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1368)

2011-01-17 18:03:36,393 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: No
locks available
  at sun.nio.ch.FileChannelImpl.lock0(Native Method)
  at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:881)
  at java.nio.channels.FileChannel.tryLock(FileChannel.java:962)
  at
org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.tryLock(Storage.java:527)
  at
org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:505)
  at
org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:363)
  at
org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:112)
  at
org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:298)
  at
org.apache.hadoop.hdfs.server.datanode.DataNode.(DataNode.java:216)
  at
org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1283)
  at
org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1238)
"~/project/hadoop-0.20.2/logs/hadoop-hadoop-datanode-hadoop7.log" 42L,
3141C   1,1   Top




Thanks

Adarsh




  




Re: No locks available

2011-01-17 Thread xiufeng liu
Did you format the namenode before you started? Try to format it and then start:
1) go to HADOOP_HOME/bin

2) ./hadoop namenode -format

On Mon, Jan 17, 2011 at 1:43 PM, Adarsh Sharma wrote:

> Dear all,
>
>
> I know this a silly mistake but not able to find the reason of the
> exception that causes one datanode to fail to start.
>
> I mount  /hdd2-1 of a phsical machine into this VM and start
> datanode,tasktracker.
>
> Datanode fails after few seconds.
>
> Can someone tell  me  the root  cause.
>
> Below is the exception :
>
> 2011-01-17 18:01:08,199 INFO
> org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
> /
> STARTUP_MSG: Starting DataNode
> STARTUP_MSG:   host = hadoop7/172.16.1.8
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 0.20.2
> STARTUP_MSG:   build =
> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r
> 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
> /
> 2011-01-17 18:03:36,391 INFO org.apache.hadoop.hdfs.server.common.Storage:
> java.io.IOException: No locks available
>   at sun.nio.ch.FileChannelImpl.lock0(Native Method)
>   at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:881)
>   at java.nio.channels.FileChannel.tryLock(FileChannel.java:962)
>   at
> org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.tryLock(Storage.java:527)
>   at
> org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:505)
>   at
> org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:363)
>   at
> org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:112)
>   at
> org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:298)
>   at
> org.apache.hadoop.hdfs.server.datanode.DataNode.(DataNode.java:216)
>   at
> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1283)
>   at
> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1238)
>   at
> org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1246)
>   at
> org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1368)
>
> 2011-01-17 18:03:36,393 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: No
> locks available
>   at sun.nio.ch.FileChannelImpl.lock0(Native Method)
>   at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:881)
>   at java.nio.channels.FileChannel.tryLock(FileChannel.java:962)
>   at
> org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.tryLock(Storage.java:527)
>   at
> org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:505)
>   at
> org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:363)
>   at
> org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:112)
>   at
> org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:298)
>   at
> org.apache.hadoop.hdfs.server.datanode.DataNode.(DataNode.java:216)
>   at
> org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1283)
>   at
> org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1238)
> "~/project/hadoop-0.20.2/logs/hadoop-hadoop-datanode-hadoop7.log" 42L,
> 3141C   1,1   Top
>
>
>
>
> Thanks
>
> Adarsh
>


HDFS and Input Splits

2011-01-17 Thread Marco Didonna
Hello everyone,
I am pretty new to Hadoop and I have started learning it thanks to Tom
White's book. There is something I still do not understand, though: it is
about the splitting of the input data in order to distribute the
workload to a cluster of machines. I would like to discuss two possible
scenarios and ask some questions.

For both scenarios, let's suppose the input data is a single 5GB text file.

::Scenario n.1::

The 5GB input file is put on HDFS. According to the default settings it
will be blindly split into 64MB chunks and sent to the various datanodes
in a redundant flavor (according to the replication factor). Let's suppose I
need to perform a line-oriented analysis, so my unit of analysis is a
single line. Let's suppose I use a TextInputFormat, which uses
the entire line as the value and the file offset (ignored) as the key.

On machine N I have this chunk of text[1]:

"[...] MIDWAY upon the journey of our life
I found myself within a forest dark,
For the straightforward pathway had been lost.
Ah me! how hard a"

And on machine K I have this other chunk:

"thing it is to say
What was this forest savage, rough, and stern,
Which in the very thought renews the fear. [...]"

How can the mapper running on machine N reconstruct the correct phrase
"Ah me! how hard a thing it is to say"? Maybe it will ask the namenode for
the address of the datanode holding the next file block?

::Scenario n.2::

The 5GB file is stored on the machine (local filesystem) submitting the job
to the hadoop cluster. I suppose this machine will split the file evenly
(using line endings as possible split points) and send the chunks to the
nodes in the cluster.
Is that correct? How is replication performed in this scenario?

Thank you.

MD



No locks available

2011-01-17 Thread Adarsh Sharma

Dear all,


I know this is probably a silly mistake, but I am not able to find the reason
for the exception that causes one datanode to fail to start.

I mount /hdd2-1 of a physical machine into this VM and start the
datanode and tasktracker.

The datanode fails after a few seconds.

Can someone tell me the root cause?

Below is the exception :

2011-01-17 18:01:08,199 INFO 
org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:

/
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = hadoop7/172.16.1.8
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = 
https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 
911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010

/
2011-01-17 18:03:36,391 INFO 
org.apache.hadoop.hdfs.server.common.Storage: java.io.IOException: No 
locks available

   at sun.nio.ch.FileChannelImpl.lock0(Native Method)
   at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:881)
   at java.nio.channels.FileChannel.tryLock(FileChannel.java:962)
   at 
org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.tryLock(Storage.java:527)
   at 
org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:505)
   at 
org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:363)
   at 
org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:112)
   at 
org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:298)
   at 
org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:216)
   at 
org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1283)
   at 
org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1238)
   at 
org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1246)
   at 
org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1368)


2011-01-17 18:03:36,393 ERROR 
org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: No 
locks available

   at sun.nio.ch.FileChannelImpl.lock0(Native Method)
   at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:881)
   at java.nio.channels.FileChannel.tryLock(FileChannel.java:962)
   at 
org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.tryLock(Storage.java:527)
   at 
org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:505)
   at 
org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:363)
   at 
org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:112)
   at 
org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:298)
   at 
org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:216)
   at 
org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1283)
   at 
org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1238)
"~/project/hadoop-0.20.2/logs/hadoop-hadoop-datanode-hadoop7.log" 42L, 
3141C   1,1   Top





Thanks

Adarsh


Re: How to replace Jetty-6.1.14 with Jetty 7 in Hadoop?

2011-01-17 Thread Steve Loughran

On 16/01/11 09:41, xiufeng liu wrote:

Hi,

In my cluster, Hadoop somehow cannot work, and I found that it was due to
Jetty 6.1.14 not being able to start up. However, Jetty 7 can work in
my cluster. Does anybody know how to replace Jetty 6.1.14 with Jetty 7?

Thanks
afancy



The switch to Jetty 7 will not be easy, and I wouldn't encourage you to
do it unless you want to get into editing the Hadoop source and retesting
everything.

Try moving up to v6.1.25, which should be more straightforward: replace
the JAR, then QA the cluster with some terasorting.


Re: Why Hadoop is slow in Cloud

2011-01-17 Thread Steve Loughran

On 17/01/11 04:11, Adarsh Sharma wrote:

Dear all,

Yesterday I performed a kind of comparison test between Hadoop on standalone
servers and Hadoop in the cloud.

I set up a Hadoop cluster of 4 nodes (standalone machines) in
which one node acts as master (Namenode, Jobtracker) and the remaining
nodes act as slaves (Datanodes, Tasktrackers).
On the other hand, for testing Hadoop in the cloud (Eucalyptus), I made
one standalone machine the Hadoop master and the slaves are configured
on the VMs in the cloud.

I am confused about the stats obtained from the testing. What I
concluded is that the VMs are giving half the performance of the
standalone servers.


Interesting stats; nothing that massively surprises me, especially as
your benchmarks are very much streaming through datasets. If you were
doing something more CPU-intensive (graph work, for example), things
wouldn't look so bad.


I've done stuff in this area.
http://www.slideshare.net/steve_l/farming-hadoop-inthecloud





I expected some slowdown, but never at this level. Is this genuine, or
could there be some configuration problem?

I am using a 1 Gb (10-1000 Mb/s) LAN for the VM machines and 100 Mb/s for
the standalone servers.

Please have a look at the results and, if interested, comment on them.




The big killer here is file IO: with today's HDD controllers and virtual
filesystems, virtualized disk IO is way underpowered compared to physical
disk IO. Networking is reduced (but improving), and CPU can be pretty
good, but disk is bad.

Why?

1. Every access to a block in the VM is turned into virtual disk
controller operations, which are then interpreted by the VDC and turned
into reads/writes on the virtual disk drive,

2. which are in turn turned into seeks, reads and writes on the physical
hardware.

Some workarounds:

- allocate physical disks for the HDFS filesystem, for the duration of
the VMs.

- have the local hosts serve up a bit of their filesystem over a fast
protocol (like NFS), and have every VM mount the local physical NFS
filestore as its hadoop data dirs.