Backupnode web UI showing upgrade status..

2011-03-22 Thread Gokulakannan M
Hi all,

A newbie question regarding the BackupNode: I just started the NameNode and
the BackupNode, and the BackupNode web UI shows "Upgrade for version -24 has
been completed. Upgrade is not finalized." I did not run any upgrade. Can
anyone please clarify? This message is confusing.

 

 Thanks,

  Gokul

 



RE: Backupnode web UI showing upgrade status..

2011-03-22 Thread Gokulakannan M
Since I am not running any upgrade, shouldn't the message instead be "There are no
upgrades in progress", as in the case of the NN & others?

 

-Original Message-
From: James Seigel [mailto:ja...@tynt.com] 
Sent: Tuesday, March 22, 2011 7:04 PM
To: common-user@hadoop.apache.org
Subject: Re: Backupnode web UI showing upgrade status..

 

There is a step which flips a bit, finalizeUpgrade or something, that
needs to be run.

Should be straightforward.
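For reference, the step James is describing is "hadoop dfsadmin -finalizeUpgrade";
it can also be invoked programmatically. A minimal sketch, assuming the 0.20-era
DistributedFileSystem API (the class name below is made up for illustration):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.hdfs.DistributedFileSystem;

    public class FinalizeUpgradeSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();   // picks up core-site.xml / hdfs-site.xml
            FileSystem fs = FileSystem.get(conf);
            if (fs instanceof DistributedFileSystem) {
                // Marks the pending upgrade as finalized so the previous storage
                // state can be discarded; the CLI equivalent is
                // "hadoop dfsadmin -finalizeUpgrade".
                ((DistributedFileSystem) fs).finalizeUpgrade();
            }
        }
    }

Until that step is run, a message like "Upgrade is not finalized" is apparently expected.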

 

Cheers

James

 

Sent from my mobile. Please excuse the typos.

 

On 2011-03-22, at 7:32 AM, Gokulakannan M gok...@huawei.com wrote:

 

 Hi all,

 

 A newbie question regarding the BackupNode: I just started the NameNode and
 the BackupNode, and the BackupNode web UI shows "Upgrade for version -24 has
 been completed. Upgrade is not finalized." I did not run any upgrade. Can
 anyone please clarify? This message is confusing.

 

 

 

 Thanks,

 

  Gokul

 

 

 



RE: hadoop 0.20 append - some clarifications

2011-02-14 Thread Gokulakannan M
 

 I think that in general, the behavior of any program reading data from an
HDFS file before hsync or close is called is pretty much undefined.

 

In Unix, users can read a file in parallel while another user is writing to
it, and I suppose the sync feature design is based on that.

So at any point during the file write, parallel readers should be able
to read the file.

 

https://issues.apache.org/jira/browse/HDFS-142?focusedCommentId=12663958&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12663958

  _  

From: Ted Dunning [mailto:tdunn...@maprtech.com] 
Sent: Friday, February 11, 2011 2:14 PM
To: common-user@hadoop.apache.org; gok...@huawei.com
Cc: hdfs-u...@hadoop.apache.org; dhr...@gmail.com
Subject: Re: hadoop 0.20 append - some clarifications

 

I think that in general, the behavior of any program reading data from an
HDFS file before hsync or close is called is pretty much undefined.

 

If you don't wait until some point where part of the file is defined, you
can't expect any particular behavior.

On Fri, Feb 11, 2011 at 12:31 AM, Gokulakannan M gok...@huawei.com wrote:

I am not concerned about the sync behavior.

The thing is the reader reading non-flushed (non-synced) data from HDFS, as
you explained in the previous post (in the hadoop 0.20 append branch).

I identified one specific scenario where the above statement does not hold
true.

Here is how you can reproduce the problem:

1. Add a breakpoint at the createBlockOutputStream() method in DFSClient and
run your HDFS write client in debug mode.

2. Allow the client to write 1 block to HDFS.

3. For the 2nd block, the flow will reach the breakpoint mentioned in step 1
(do not execute the createBlockOutputStream() method). Hold here.

4. In parallel, try to read the file from another client.

Now you will get an error saying that the file cannot be read.



 _

From: Ted Dunning [mailto:tdunn...@maprtech.com]
Sent: Friday, February 11, 2011 11:04 AM
To: gok...@huawei.com
Cc: common-user@hadoop.apache.org; hdfs-u...@hadoop.apache.org;
c...@boudnik.org
Subject: Re: hadoop 0.20 append - some clarifications



It is a bit confusing.



SequenceFile.Writer#sync isn't really sync.



There is SequenceFile.Writer#syncFs which is more what you might expect to
be sync.



Then there is HADOOP-6313 which specifies hflush and hsync.  Generally, if
you want portable code, you have to reflect a bit to figure out what can be
done.
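A rough illustration of that reflection trick (the helper below is hypothetical; it
only assumes that hsync/hflush may or may not exist on the stream class at runtime,
with sync() as the fallback):

    import java.io.IOException;
    import java.lang.reflect.Method;
    import org.apache.hadoop.fs.FSDataOutputStream;

    public final class PortableSyncSketch {
        // Flush the stream as durably as the running Hadoop version allows:
        // prefer hsync/hflush (HADOOP-6313 era), fall back to sync().
        public static void bestEffortSync(FSDataOutputStream out) throws IOException {
            for (String name : new String[] {"hsync", "hflush", "sync"}) {
                try {
                    Method m = out.getClass().getMethod(name);
                    m.invoke(out);
                    return;                  // first method that exists and succeeds wins
                } catch (NoSuchMethodException e) {
                    // not available in this version; try the next candidate
                } catch (Exception e) {
                    throw new IOException("call to " + name + " failed: " + e);
                }
            }
            throw new IOException("no sync-like method available on " + out.getClass());
        }
    }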

On Thu, Feb 10, 2011 at 8:38 PM, Gokulakannan M gok...@huawei.com wrote:

Thanks Ted for clarifying.

So sync just flushes the current buffers to the datanode and persists the
block info in the namenode once per block, doesn't it?

Regarding the reader being able to see unflushed data, I faced an issue in
the following scenario:

1. A writer is writing a 10MB file (block size 2MB).

2. The writer has written the file up to 4MB (2 finalized blocks in the
current directory and nothing in the blocksBeingWritten directory on the DN).
So 2 blocks are written.

3. The client calls addBlock for the 3rd block on the namenode but has not
yet created an output stream to the DN (or written anything to the DN). At
this point, the namenode knows about the 3rd block but the datanode doesn't.

4. At point 3, a reader trying to read the file gets an exception and cannot
read the file, because the datanode's getBlockInfo returns null to the client
(of course, the DN doesn't know about the 3rd block yet).

In this situation the reader cannot see the file. But once the block write is
in progress, the read is successful.

Is this a bug that needs to be handled in the append branch?
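Not an answer to whether it is a bug, but one reader-side workaround for that
window is simply to retry when the open/read fails; a minimal sketch (class name,
retry policy and the use of the plain FileSystem API are all assumptions for
illustration):

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class RetryingReaderSketch {
        // Read whatever is currently visible of a file that may still be under
        // construction, retrying while the block is known to the NN but not yet
        // to any DN (the window described above).
        public static byte[] readWithRetries(Configuration conf, Path file,
                                             int maxAttempts) throws Exception {
            FileSystem fs = FileSystem.get(conf);
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    FSDataInputStream in = fs.open(file);
                    try {
                        long visibleLen = fs.getFileStatus(file).getLen();
                        byte[] buf = new byte[(int) visibleLen];
                        in.readFully(0, buf);      // read only the reported length
                        return buf;
                    } finally {
                        in.close();
                    }
                } catch (IOException e) {
                    Thread.sleep(1000L * attempt); // back off and try again
                }
            }
            throw new IOException("file not readable after " + maxAttempts
                                  + " attempts: " + file);
        }
    }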



 -Original Message-
 From: Konstantin Boudnik [mailto:c...@boudnik.org]
 Sent: Friday, February 11, 2011 4:09 AM
To: common-user@hadoop.apache.org
 Subject: Re: hadoop 0.20 append - some clarifications

 You might also want to check append design doc published at HDFS-265



I was asking about the hadoop 0.20 append branch. I suppose HDFS-265's
design doc won't apply to it.



 _

From: Ted Dunning [mailto:tdunn...@maprtech.com]
Sent: Thursday, February 10, 2011 9:29 PM
To: common-user@hadoop.apache.org; gok...@huawei.com
Cc: hdfs-u...@hadoop.apache.org
Subject: Re: hadoop 0.20 append - some clarifications



Correct is a strong word here.



There is actually an HDFS unit test that checks to see if partially written
and unflushed data is visible.  The basic rule of thumb is that you need to
synchronize readers and writers outside of HDFS.  There is no guarantee that
data is visible or invisible after writing, but there is a guarantee that it
will become visible after sync or close.
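In other words, the writer should sync (or close) and only then signal readers
through some channel outside HDFS. A minimal sketch of the writer side, assuming
the 0.20 append branch where FSDataOutputStream.sync() is that flush call (path
and payload are made up):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class SyncThenPublishSketch {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            Path file = new Path("/tmp/visible-after-sync.dat");   // hypothetical path

            FSDataOutputStream out = fs.create(file);
            out.write("record 1\n".getBytes("UTF-8"));
            out.sync();   // per the guarantee above, the bytes written so far
                          // become visible to new readers once this returns

            // Only now tell readers (ZooKeeper, a queue, a marker file, ...) that
            // data up to this offset is safe to read; HDFS itself promises nothing
            // about unsynced data.
            out.close();  // close also makes everything visible and finalizes the file
        }
    }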

On Thu, Feb 10, 2011 at 7:11 AM, Gokulakannan M gok...@huawei.com wrote:

Is this the correct behavior, or is my understanding wrong?






 



RE: hadoop 0.20 append - some clarifications

2011-02-14 Thread Gokulakannan M
I agree that HDFS doesn't strictly follow POSIX semantics. But it would be
better if this issue were fixed.

 

  _  

From: Ted Dunning [mailto:tdunn...@maprtech.com] 
Sent: Monday, February 14, 2011 10:18 PM
To: gok...@huawei.com
Cc: common-user@hadoop.apache.org; hdfs-u...@hadoop.apache.org;
dhr...@gmail.com
Subject: Re: hadoop 0.20 append - some clarifications

 

HDFS definitely doesn't follow anything like POSIX file semantics.

 

They may be a vague inspiration for what HDFS does, but generally the
behavior of HDFS is not tightly specified.  Even the unit tests have some
real surprising behavior.

On Mon, Feb 14, 2011 at 7:21 AM, Gokulakannan M gok...@huawei.com wrote:

 

 I think that in general, the behavior of any program reading data from an
HDFS file before hsync or close is called is pretty much undefined.

 

In Unix, users can read a file in parallel while another user is writing to
it, and I suppose the sync feature design is based on that.

So at any point during the file write, parallel readers should be able
to read the file.

 

https://issues.apache.org/jira/browse/HDFS-142?focusedCommentId=12663958&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12663958

  _  

From: Ted Dunning [mailto:tdunn...@maprtech.com] 
Sent: Friday, February 11, 2011 2:14 PM
To: common-user@hadoop.apache.org; gok...@huawei.com
Cc: hdfs-u...@hadoop.apache.org; dhr...@gmail.com
Subject: Re: hadoop 0.20 append - some clarifications

 

I think that in general, the behavior of any program reading data from an
HDFS file before hsync or close is called is pretty much undefined.

 

If you don't wait until some point where part of the file is defined, you
can't expect any particular behavior.

On Fri, Feb 11, 2011 at 12:31 AM, Gokulakannan M gok...@huawei.com wrote:

I am not concerned about the sync behavior.

The thing is the reader reading non-flushed (non-synced) data from HDFS, as
you explained in the previous post (in the hadoop 0.20 append branch).

I identified one specific scenario where the above statement does not hold
true.

Here is how you can reproduce the problem:

1. Add a breakpoint at the createBlockOutputStream() method in DFSClient and
run your HDFS write client in debug mode.

2. Allow the client to write 1 block to HDFS.

3. For the 2nd block, the flow will reach the breakpoint mentioned in step 1
(do not execute the createBlockOutputStream() method). Hold here.

4. In parallel, try to read the file from another client.

Now you will get an error saying that the file cannot be read.



 _

From: Ted Dunning [mailto:tdunn...@maprtech.com]
Sent: Friday, February 11, 2011 11:04 AM
To: gok...@huawei.com
Cc: common-user@hadoop.apache.org; hdfs-u...@hadoop.apache.org;
c...@boudnik.org
Subject: Re: hadoop 0.20 append - some clarifications



It is a bit confusing.



SequenceFile.Writer#sync isn't really sync.



There is SequenceFile.Writer#syncFs which is more what you might expect to
be sync.



Then there is HADOOP-6313 which specifies hflush and hsync.  Generally, if
you want portable code, you have to reflect a bit to figure out what can be
done.

On Thu, Feb 10, 2011 at 8:38 PM, Gokulakannan M gok...@huawei.com wrote:

Thanks Ted for clarifying.

So sync just flushes the current buffers to the datanode and persists the
block info in the namenode once per block, doesn't it?

Regarding the reader being able to see unflushed data, I faced an issue in
the following scenario:

1. A writer is writing a 10MB file (block size 2MB).

2. The writer has written the file up to 4MB (2 finalized blocks in the
current directory and nothing in the blocksBeingWritten directory on the DN).
So 2 blocks are written.

3. The client calls addBlock for the 3rd block on the namenode but has not
yet created an output stream to the DN (or written anything to the DN). At
this point, the namenode knows about the 3rd block but the datanode doesn't.

4. At point 3, a reader trying to read the file gets an exception and cannot
read the file, because the datanode's getBlockInfo returns null to the client
(of course, the DN doesn't know about the 3rd block yet).

In this situation the reader cannot see the file. But once the block write is
in progress, the read is successful.

Is this a bug that needs to be handled in the append branch?



 -Original Message-
 From: Konstantin Boudnik [mailto:c...@boudnik.org]
 Sent: Friday, February 11, 2011 4:09 AM
To: common-user@hadoop.apache.org
 Subject: Re: hadoop 0.20 append - some clarifications

 You might also want to check append design doc published at HDFS-265



I was asking about the hadoop 0.20 append branch. I suppose HDFS-265's
design doc won't apply to it.



 _

From: Ted Dunning [mailto:tdunn...@maprtech.com]
Sent: Thursday, February 10, 2011 9:29 PM
To: common-user@hadoop.apache.org; gok...@huawei.com
Cc: hdfs-u

RE: hadoop 0.20 append - some clarifications

2011-02-11 Thread Gokulakannan M
I am not concerned about the sync behavior.

The thing is the reader reading non-flushed (non-synced) data from HDFS, as
you explained in the previous post (in the hadoop 0.20 append branch).

I identified one specific scenario where the above statement does not hold
true.

Here is how you can reproduce the problem:

1. Add a breakpoint at the createBlockOutputStream() method in DFSClient and
run your HDFS write client in debug mode.

2. Allow the client to write 1 block to HDFS.

3. For the 2nd block, the flow will reach the breakpoint mentioned in step 1
(do not execute the createBlockOutputStream() method). Hold here.

4. In parallel, try to read the file from another client.

Now you will get an error saying that the file cannot be read.

 

  _  

From: Ted Dunning [mailto:tdunn...@maprtech.com] 
Sent: Friday, February 11, 2011 11:04 AM
To: gok...@huawei.com
Cc: common-user@hadoop.apache.org; hdfs-u...@hadoop.apache.org;
c...@boudnik.org
Subject: Re: hadoop 0.20 append - some clarifications

 

It is a bit confusing.

 

SequenceFile.Writer#sync isn't really sync.

 

There is SequenceFile.Writer#syncFs which is more what you might expect to
be sync.  

 

Then there is HADOOP-6313 which specifies hflush and hsync.  Generally, if
you want portable code, you have to reflect a bit to figure out what can be
done. 

On Thu, Feb 10, 2011 at 8:38 PM, Gokulakannan M gok...@huawei.com wrote:

Thanks Ted for clarifying.

So sync just flushes the current buffers to the datanode and persists the
block info in the namenode once per block, doesn't it?

Regarding the reader being able to see unflushed data, I faced an issue in
the following scenario:

1. A writer is writing a 10MB file (block size 2MB).

2. The writer has written the file up to 4MB (2 finalized blocks in the
current directory and nothing in the blocksBeingWritten directory on the DN).
So 2 blocks are written.

3. The client calls addBlock for the 3rd block on the namenode but has not
yet created an output stream to the DN (or written anything to the DN). At
this point, the namenode knows about the 3rd block but the datanode doesn't.

4. At point 3, a reader trying to read the file gets an exception and cannot
read the file, because the datanode's getBlockInfo returns null to the client
(of course, the DN doesn't know about the 3rd block yet).

In this situation the reader cannot see the file. But once the block write is
in progress, the read is successful.

Is this a bug that needs to be handled in the append branch?

 

 -Original Message-
 From: Konstantin Boudnik [mailto:c...@boudnik.org] 
 Sent: Friday, February 11, 2011 4:09 AM
To: common-user@hadoop.apache.org
 Subject: Re: hadoop 0.20 append - some clarifications

 You might also want to check append design doc published at HDFS-265

 

I was asking about the hadoop 0.20 append branch. I suppose HDFS-265's
design doc won't apply to it.

 

  _  

From: Ted Dunning [mailto:tdunn...@maprtech.com] 
Sent: Thursday, February 10, 2011 9:29 PM
To: common-user@hadoop.apache.org; gok...@huawei.com
Cc: hdfs-u...@hadoop.apache.org
Subject: Re: hadoop 0.20 append - some clarifications

 

Correct is a strong word here.

 

There is actually an HDFS unit test that checks to see if partially written
and unflushed data is visible.  The basic rule of thumb is that you need to
synchronize readers and writers outside of HDFS.  There is no guarantee that
data is visible or invisible after writing, but there is a guarantee that it
will become visible after sync or close.

On Thu, Feb 10, 2011 at 7:11 AM, Gokulakannan M gok...@huawei.com wrote:

Is this the correct behavior, or is my understanding wrong?

 

 



hadoop 0.20 append - some clarifications

2011-02-10 Thread Gokulakannan M
Hi All,

I have run the hadoop 0.20 append branch. Can someone please clarify the
following behavior?

A writer is writing a file but has not flushed the data and has not closed
the file. Could a parallel reader read this partial file?

For example,

1. A writer is writing a 10MB file (block size 2MB).

2. The writer has written the file up to 5MB (2 finalized blocks + 1 blockBeingWritten).
Note that the writer is not calling FSDataOutputStream.sync() at all.

3. Now a reader tries to read the above partially written file.

I can see that the reader is able to see the partially written 5MB of data,
but I feel the reader should be able to see the data only after the writer
calls the sync() API.

Is this the correct behavior, or is my understanding wrong?
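For reference, the experiment described above boils down to something like the
following sketch (paths, sizes and the single-process layout are just for
illustration; in the real test the reader is a separate client):

    import java.util.Arrays;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class UnsyncedReadExperiment {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            Path file = new Path("/tmp/append-test.dat");          // hypothetical path

            // Writer: push 5MB with a 2MB block size, never calling sync() or close().
            byte[] chunk = new byte[1024 * 1024];
            Arrays.fill(chunk, (byte) 'x');
            FSDataOutputStream out = fs.create(file, true, 4096, (short) 1,
                                               2L * 1024 * 1024);
            for (int i = 0; i < 5; i++) {
                out.write(chunk);
            }
            // deliberately no out.sync() and no out.close()

            // Reader: whether any of the 5MB is visible here is exactly the
            // behavior being asked about.
            FSDataInputStream in = fs.open(file);
            System.out.println("visible length: " + fs.getFileStatus(file).getLen());
            in.close();
        }
    }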

 

 Thanks,

  Gokul



Has anyone tried rolling upgrade in hadoop?

2010-11-10 Thread Gokulakannan M
Hi all,

 

It would be good if someone could share tips on performing a
rolling upgrade in Hadoop (upgrading to a newer version of Hadoop without
stopping the whole cluster), i.e., upgrading phase by phase without affecting
the running services. However, I found in Hadoop-related materials and on the
web that rolling upgrade will only be supported in 1+ versions of Hadoop.

 

 Thanks,

  Gokul

 

 



RE: Tasktracker volume failure...

2010-10-26 Thread Gokulakannan M

Yes, this is my scenario:

I have one TaskTracker. I configured 10 dirs (volumes) in mapred.local.dir;
each of them is a separately mounted volume, even physically separate disks.
If one of the volumes has problems (in my case one physical hard disk was
removed manually), the TaskTracker does not execute any further tasks.


I remember a similar scenario is handled in the DataNode: when one of the
volumes fails, it marks that volume as bad and proceeds. Ref: HDFS-457

Is a similar fault-tolerance feature available for the TaskTracker? Only one
of the n dirs has a problem, but it makes the TT keep retrying the failed one
and not execute any tasks.

-Original Message-
From: Steve Loughran [mailto:ste...@apache.org] 
Sent: Tuesday, October 26, 2010 3:52 PM
To: common-user@hadoop.apache.org
Subject: Re: Tasktracker volume failure...

On 26/10/10 04:10, Gokulakannan M wrote:


 Hi,

 I faced a problem when a volume configured in *mapred.local.dir* fails:
 the tasktracker continuously tries to create the directory in
 checkLocalDirs() and fails; even the main method throws an exception
 periodically due to the getFreeSpace() call on the failed volume.
 Eventually all the running jobs fail and new jobs cannot
 be executed.


I think you can provide a list of local dirs, in which case the TT would
only fail if there is no free local volume with enough space.
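For reference, the list Steve mentions is just a comma-separated value for
mapred.local.dir; a sketch of setting it programmatically (paths are hypothetical,
and normally this lives in mapred-site.xml rather than code):

    import org.apache.hadoop.conf.Configuration;

    public class LocalDirsSketch {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // Several independently mounted volumes; the intent is that the TT
            // only fails outright when no configured volume has enough free space.
            conf.set("mapred.local.dir",
                     "/disk1/mapred/local,/disk2/mapred/local,/disk3/mapred/local");
            System.out.println(conf.get("mapred.local.dir"));
        }
    }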



RE: Best practices - Large Hadoop Cluster

2010-08-10 Thread Gokulakannan M

Hi Raj,

As per my understanding, the problem is the SSH password prompt each time
you start/stop the cluster. You need passwordless startup/shutdown, right?

Here is my way of overcoming the SSH problem.

Write a shell script as follows:

1. Generate an SSH key on the namenode machine (where you will
start/stop the cluster).

2. Read each entry from the conf/slaves file and do the following:

        2.1 Add the key you generated in step 1 to the SSH
authorized_keys file of the datanode machine from step 2, with
something like the script below:
                cat $HOME/.ssh/public_key_file | ssh usern...@host 'cat >> $HOME/.ssh/authorized_keys'


3. Repeat step 2 for conf/masters also.

Note: The password must be entered for usern...@host the
first time, since the ssh command in step 2.1 requires it.

Now you can start/stop your Hadoop cluster without the SSH password
overhead.


 Thanks,
  Gokul
 
   
 

***

-Original Message-
From: Raj V [mailto:rajv...@yahoo.com] 
Sent: Tuesday, August 10, 2010 7:16 PM
To: common-user@hadoop.apache.org
Subject: Best practices - Large Hadoop Cluster

I need to start setting up a large Hadoop cluster of 512 nodes. My biggest
problem is the SSH keys. Is there a simpler way of generating and exchanging
SSH keys among the nodes? Any best practices? If there are none, I could
volunteer to do it.

Raj


logging difficulties in hadoop

2010-06-17 Thread Gokulakannan M
 Hi,

 

I wonder why the log level setting is not working when I set
hadoop.root.logger to ERROR or WARN in the hadoop/conf/log4j.properties file.
(Probably it is overridden in the hadoop/bin/hadoop and hadoop-daemon.sh files.)

But if I export the HADOOP_ROOT_LOGGER property in
hadoop-env.sh, the change is seen in the logs. However, a log*.out[num] file
is created each time I restart the cluster. I exported it with ERROR and DRFA.

 

Any idea why the settings in log4j.properties have no effect
currently, in spite of the fact that they are mentioned everywhere on the web
for Hadoop logging?

 

Help regarding this would be appreciated.

 

PS: Setting hadoop.root.logger in log4j.properties works
fine if I take out the HADOOP_ROOT_LOGGER entries in hadoop/bin/hadoop and
hadoop-daemon.sh.

 

 Thanks,

  Gokul

 



changing my hadoop log level is not getting reflected in logs

2010-06-14 Thread Gokulakannan M
Hi,

 

I changed the default log level of Hadoop from INFO to ERROR by
setting the property hadoop.root.logger to ERROR in
hadoop/conf/log4j.properties.

 

But when I start the namenode, INFO logs are still seen in the log
file. I dug around and found that HADOOP_ROOT_LOGGER is hard-coded to
INFO in the hadoop-daemon.sh and hadoop scripts in hadoop/bin. Does
this have anything to do with it, or are those entries there for some
other purpose?

 

PS: I am using hadoop 0.20.1

 

 Thanks,

  Gokul

 

  

 


 



Namenode UI - Browse File System not working in psedo-dist cluster..

2010-05-11 Thread Gokulakannan M
 

Hi,

 

 

The Browse File System link in the NameNode UI
(http://namenode:50070) is not working when I run the NameNode and 1 DataNode
on the same system (pseudo-distributed mode).

 

I thought it might be a Jetty issue. But if I run 1 NameNode and 1
DataNode on one system and a DataNode on another system (total 1 NN and 2
DN), the Browse File System link works fine and I can see the files in
HDFS.

 

Any idea why this issue occurs in pseudo-distributed mode?

 

 Thanks,

  Gokul

 

  

 

 



hadoop-append feature not in stable release?

2010-03-29 Thread Gokulakannan M
Hi,

I am new to Hadoop. The following questions popped up in my mind
and I couldn't find answers on the web.

 

I found that in hdfs-default.xml, the property
dfs.support.append is set to false by default, with the description:

"Does HDFS allow appends to files? This is currently set to
false because there are bugs in the append code and is not supported in
any production cluster."

 

So, is there a way to resolve this issue? Will any existing
patches (like HADOOP-1700,
http://issues.apache.org/jira/browse/HADOOP-1700?page=com.atlassian.jira.plugin.system.issuetabpanels%3Aall-tabpanel)
solve the problem and make hadoop-append stable?

 

From HADOOP-1700
(http://issues.apache.org/jira/browse/HADOOP-1700?page=com.atlassian.jira.plugin.system.issuetabpanels%3Aall-tabpanel),
I can see that this feature has been enabled and updated in trunk. But why
is it not enabled in the stable Hadoop release?
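For anyone who does enable it despite the warning, the switch and the client call
look roughly like this (a sketch only; dfs.support.append is the property quoted
above and append() is the public FileSystem method, but the path is made up):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class AppendSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.setBoolean("dfs.support.append", true);   // must also be enabled cluster-side
            FileSystem fs = FileSystem.get(conf);

            Path file = new Path("/tmp/append-demo.log");  // hypothetical path
            if (!fs.exists(file)) {
                fs.create(file).close();
            }
            // Reopen the existing file for append; this throws if the cluster
            // has append disabled.
            FSDataOutputStream out = fs.append(file);
            out.write("one more line\n".getBytes("UTF-8"));
            out.close();
        }
    }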

 

 Thanks,

 Gokul

 



hadoop conf for dynamically changing ips

2010-03-26 Thread Gokulakannan M
 

 Hi,

 

I have a LAN in which the IPs of the machines are changed
dynamically by the DHCP server.

 

So for the namenode, jobtracker, master and slave configurations we
cannot give the IP.

Can the machine names be given for those configurations? Will
that work?
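Hostnames are what the configuration usually takes anyway; a sketch with made-up
host names (fs.default.name and mapred.job.tracker are the standard 0.20 property
names, and the same names would go into core-site.xml / mapred-site.xml and the
masters/slaves files):

    import org.apache.hadoop.conf.Configuration;

    public class HostnameConfigSketch {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // DNS names instead of raw IPs, so a DHCP-assigned address change
            // does not invalidate the configuration (the names must still resolve).
            conf.set("fs.default.name", "hdfs://namenode-host:9000");
            conf.set("mapred.job.tracker", "jobtracker-host:9001");
            System.out.println(conf.get("fs.default.name"));
        }
    }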

 

 Thanks,

  Gokul

 

 



RE: hadoop conf for dynamically changing ips

2010-03-26 Thread Gokulakannan M

It's what I do. You just have to make sure that if the IPAddrs change, 

everything gets restarted.


Thanks, Steve, for the reply. Does "everything must be restarted" mean the
Hadoop cluster only, or all the systems?

   Thanks,
   Gokul
 
 
-Original Message-
From: Steve Loughran [mailto:ste...@apache.org] 
Sent: Friday, March 26, 2010 8:14 PM
To: common-user@hadoop.apache.org
Subject: Re: hadoop conf for dynamically changing ips

On 26/03/2010 14:31, Gokulakannan M wrote:


   Hi,



  I have a LAN in which the IPs of the machines are changed
 dynamically by the DHCP server.



  So for the namenode, jobtracker, master and slave configurations
we cannot give the IP.

  Can the machine names be given for those configurations?
Will that work?


It's what I do. You just have to make sure that if the IPAddrs change, 
everything gets restarted.