[jira] [Updated] (HDFS-2691) HA: Tests and fixes for pipeline targets and replica recovery

2012-01-27 Thread Todd Lipcon (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-2691:
--

Attachment: hdfs-2691.txt

Updated patch on the tip of HDFS-1623 branch, rather than on top of 2742. Also 
added comments to the proto file as Eli suggested.

> HA: Tests and fixes for pipeline targets and replica recovery
> -
>
> Key: HDFS-2691
> URL: https://issues.apache.org/jira/browse/HDFS-2691
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: hdfs-2691.txt, hdfs-2691.txt, hdfs-2691.txt, 
> hdfs-2691.txt
>
>
> Currently there are some TODOs around pipeline/recovery code in the HA 
> branch. For example, commitBlockSynchronization only gets sent to the active 
> NN which may have failed over by that point. So, we need to write some tests 
> here and figure out what the correct behavior is.
> Another related area is the treatment of targets in the pipeline. When a 
> pipeline is created, the active NN adds the "expected locations" to the 
> BlockInfoUnderConstruction, but the DN identifiers aren't logged with the 
> OP_ADD. So after a failover, the BlockInfoUnderConstruction will have no 
> targets and I imagine replica recovery would probably trigger some issues.
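
To make that failure mode concrete, here is a tiny standalone sketch (plain Java; all class and method names are invented for illustration, this is not the actual Hadoop code) of why the standby ends up with no targets: the active NN holds the expected locations only in memory, while the OP_ADD it logs carries just the block, so a replaying standby reconstructs the block with an empty target list.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (invented names, not Hadoop code) of the problem described
// above: OP_ADD persists the block but not its DN targets, so a standby
// that rebuilds state from the edit log has no recovery targets.
public class PipelineTargetsDemo {
    // In-memory state on a NN for an under-construction block.
    static class BlockUnderConstruction {
        final long blockId;
        final List<String> expectedTargets = new ArrayList<>();
        BlockUnderConstruction(long id) { blockId = id; }
    }

    // What OP_ADD carries to the shared edit log: block id only, no targets.
    static long logOpAdd(BlockUnderConstruction b) { return b.blockId; }

    // Standby replays the log and reconstructs the block without targets.
    static BlockUnderConstruction replayOnStandby(long loggedBlockId) {
        return new BlockUnderConstruction(loggedBlockId);
    }

    public static void main(String[] args) {
        BlockUnderConstruction active = new BlockUnderConstruction(1001L);
        active.expectedTargets.add("dn1:50010");
        active.expectedTargets.add("dn2:50010");

        BlockUnderConstruction standby = replayOnStandby(logOpAdd(active));
        // After a failover, replica recovery has no DNs to contact.
        System.out.println("active targets:  " + active.expectedTargets.size());
        System.out.println("standby targets: " + standby.expectedTargets.size());
    }
}
```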

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2742) HA: observed dataloss in replication stress test

2012-01-27 Thread Todd Lipcon (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-2742:
--

Attachment: hdfs-2742.txt

Fixed all the nits above except for the indentation - I didn't see any place 
with improper indentation.

{quote}
I think BM should distinguish between corrupt and out-of-date replicas. The 
new case in processFirstBlockReport in this patch, and where we mark reported 
RBW replicas for completed blocks as corrupt, are using "corrupt" as a proxy for 
"please delete". I wasn't able to come up with additional bugs with a 
similar cause, but it would be easier to reason about if only truly corrupt 
replicas were marked as such. Can punt to a separate jira, if you agree.
{quote}
I don't entirely follow what you're getting at here... so let's open a new JIRA 
:)

bq. In FSNamesystem#isSafeModeTrackingBlocks, shouldn't we assert that haEnabled 
is set if we're in SM and shouldIncrementallyTrackBlocks is true, instead of 
short-circuiting? We currently wouldn't know if we violate this condition 
because we'll just return false when haEnabled is off.

I did the check for haEnabled in FSNamesystem rather than SafeModeInfo, since 
when HA is not enabled it means we can avoid the volatile read of safeModeInfo. 
This is to avoid having any impact on the non-HA case. Is that what you're 
referring to? I'm not sure specifically what you're asking for in this change...

I changed {{setBlockTotal}} to only set {{shouldIncrementallyTrackBlocks}} to 
true when HA is enabled, and added {{assert haEnabled}} in 
{{adjustBlockTotals}}. Does that address your comment?


> HA: observed dataloss in replication stress test
> 
>
> Key: HDFS-2742
> URL: https://issues.apache.org/jira/browse/HDFS-2742
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node, ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Blocker
> Attachments: hdfs-2742.txt, hdfs-2742.txt, hdfs-2742.txt, 
> hdfs-2742.txt, hdfs-2742.txt, hdfs-2742.txt, log-colorized.txt
>
>
> The replication stress test case failed over the weekend since one of the 
> replicas went missing. Still diagnosing the issue, but it seems like the 
> chain of events was something like:
> - a block report was generated on one of the nodes while the block was being 
> written - thus the block report listed the block as RBW
> - when the standby replayed this queued message, it was replayed after the 
> file was marked complete. Thus it marked this replica as corrupt
> - it asked the DN holding the corrupt replica to delete it. And, I think, 
> removed it from the block map at this time.
> - That DN then did another block report before receiving the deletion. This 
> caused it to be re-added to the block map, since it was "FINALIZED" now.
> - Replication was lowered on the file, and it counted the above replica as 
> non-corrupt, and asked for the other replicas to be deleted.
> - All replicas were lost.
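
The chain of events above can be replayed as a tiny standalone model (plain Java; the names are invented, not the real BlockManager code). Under the stated assumptions, every replica ends up deleted:

```java
import java.util.HashSet;
import java.util.Set;

// Toy model (invented names, not Hadoop code) of the suspected chain of
// events above: a stale RBW report replayed after the file completes gets
// the replica marked corrupt and queued for deletion; a later FINALIZED
// report re-adds it; lowering replication then deletes the good replicas.
public class ReplicationStressDemo {
    static Set<String> simulate() {
        Set<String> liveReplicas = new HashSet<>();
        liveReplicas.add("dn1");
        liveReplicas.add("dn2");
        liveReplicas.add("dn3");

        // Standby replays dn1's queued RBW report after the file is
        // complete: the replica is treated as corrupt, dropped from the
        // block map, and a deletion is queued for dn1.
        Set<String> blockMap = new HashSet<>(liveReplicas);
        blockMap.remove("dn1");
        boolean deletionQueuedForDn1 = true;

        // dn1 sends a fresh report (now FINALIZED) before the deletion
        // arrives, so its replica is re-added to the block map.
        blockMap.add("dn1");

        // Replication on the file is lowered: the NN counts dn1 as a
        // valid replica and asks dn2/dn3 to delete theirs.
        liveReplicas.remove("dn2");
        liveReplicas.remove("dn3");

        // The queued deletion finally reaches dn1.
        if (deletionQueuedForDn1) {
            liveReplicas.remove("dn1");
        }
        return liveReplicas; // empty set => all replicas lost
    }

    public static void main(String[] args) {
        System.out.println("surviving replicas: " + simulate());
    }
}
```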





[jira] [Commented] (HDFS-2801) Provide a method in client-side translators to check for methods supported in the underlying protocol.

2012-01-27 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195396#comment-13195396
 ] 

Hadoop QA commented on HDFS-2801:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12512214/HDFS-2801.trunk.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 2 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

-1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) 
warning.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hdfs.TestFSInputChecker

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/1822//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/1822//artifact/trunk/hadoop-hdfs-project/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1822//console

This message is automatically generated.

> Provide a method in client-side translators to check for methods supported 
> in the underlying protocol.
> 
>
> Key: HDFS-2801
> URL: https://issues.apache.org/jira/browse/HDFS-2801
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS-2801.trunk.patch, HDFS-2801.trunk.patch, 
> HDFS-2801.trunk.patch, HDFS-2801.trunk.patch
>
>
> This jira corresponds to HADOOP-7965. The client-side translators should 
> have a method boolean isMethodSupported(String methodName) which returns true 
> if the given method is supported and available at the server.
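
The proposed contract can be sketched as follows (a hypothetical reflection-based stand-in, not the actual Hadoop implementation, which consults the server's RPC metadata; the interface and method names here are invented):

```java
import java.lang.reflect.Method;

// Sketch of the proposed isMethodSupported contract. This is NOT how the
// real translators answer the question (they query the server); it only
// illustrates the boolean-per-method-name shape of the API.
public class MethodSupportDemo {
    // Stand-in for a protocol interface a translator wraps.
    interface ClientProtocol {
        void renewLease(String clientName);
    }

    static boolean isMethodSupported(Class<?> protocol, String methodName) {
        for (Method m : protocol.getMethods()) {
            if (m.getName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isMethodSupported(ClientProtocol.class, "renewLease"));
        System.out.println(isMethodSupported(ClientProtocol.class, "noSuchMethod"));
    }
}
```

A caller would guard an optional RPC with `if (translator.isMethodSupported("renewLease")) { ... }` and fall back gracefully otherwise.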





[jira] [Resolved] (HDFS-224) I propose a tool for creating and manipulating a new abstraction, Hadoop Archives.

2012-01-27 Thread Owen O'Malley (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HDFS-224.


Resolution: Duplicate

We have a different version of harchives.

> I propose a tool for creating and manipulating a new abstraction, Hadoop 
> Archives.
> --
>
> Key: HDFS-224
> URL: https://issues.apache.org/jira/browse/HDFS-224
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Dick King
>
> -- Introduction
> In some hadoop map/reduce and dfs use cases, including a specific case that 
> arises in my own work, users would like to populate dfs with a family of 
> hundreds or thousands of directory trees, each of which consists of thousands 
> of files.  In our case, the trees each have perhaps 20 gigabytes; two or 
> three 3-10-gigabyte files, a thousand small ones, and a large number of files 
> of intermediate size.  I am writing this JIRA to encourage discussion of a 
> new facility I want to create and contribute to the dfs core.
> -- The problem
> You can't store such families of trees in dfs in the obvious manner.  The 
> problem is that the name nodes can't handle the millions or ten million files 
> that result from such a family, especially if there are a couple of families. 
>  I understand that dfs will not be able to accommodate tens of millions of 
> files in one instance for quite a while.
> -- Exposed API of my proposed solution
> I would therefore like to produce, and contribute to the dfs core, a new tool 
> that implements an abstraction called a Hadoop Archive [or harchive].  
> Conceptually, a harchive is a unit, but it manages a space that looks like a 
> directory tree.  The tool exposes an interface that allows a user to do the 
> following:
>  * directory-level operations
>** create a harchive [either empty, or initially populated from a 
> locally-stored directory tree].  The namespace for harchives is the same as 
> the space of possible dfs directory locators, and a harchive would in fact be 
> implemented as a dfs directory with specialized contents.
>** Add a directory tree to an existing harchive in a specific place within 
> the harchive
>** retrieve a directory tree or subtree at or beneath the root of the 
> harchive directory structure, into a local directory tree
>  * file-level operations
>** add a local file to a specific place in the harchive
>** modify a file image in a specific place in the harchive to match a 
> local file
>** delete a file image in the harchive.
>** move a file image within the harchive
>** open a file image in the harchive for reading or writing.
>  * stream operations
>** open a harchive file image for reading or writing as a stream, in a 
> manner similar to dfs files, and read or write it [i.e., hdfsRead(...)].  
> This would include random access operators for reading.
>  * management operations
>** commit a group of changes [which would be made atomically -- there 
> would be no way half of a change could be made to a harchive if a client 
> crashes].
>** clean up a harchive, if it's gotten less performant because of 
> extensive editing
>** delete a harchive
> We would also implement a command line interface.
> -- Brief sketch of internals
> A harchive would be represented as a small collection of files, called 
> segments, in a dfs directory at the harchive's location.  Each segment would 
> contain some of the files of the harchive's file images in a format to be 
> determined, plus a harchive index.  We may group files by size, or some other 
> criteria.  It is likely that harchives would contain only one segment in 
> common cases.
> Changes would be made by adding the text of the new files, either by 
> rewriting an existing segment that contains not much more data than the size 
> of the changes or by creating a new segment, complete with a new index.  When 
> dfs comes to be enhanced to allow appends to dfs files, as requested by 
> HADOOP-1700 , we would be able to take advantage of that.
> Often, when a harchive is initially populated, it could be a single segment, 
> and a file it contains could be accessed with two random accesses into the 
> segment.  The first access retrieves the index, and the second access 
> retrieves the beginning of the file.  We could choose to put smaller files 
> closer to the index to allow lower average amortized costs per byte.
> We might instead choose to represent a harchive as one file or a few files 
> for the large represented files, and smaller files for the represented 
> smaller files.  That lets us make modifications by copying at lower cost.
> The segment containing the index is found by a naming convention.  Atomicity 
> is obtained by creating indices and renaming the files conta
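
The two-random-access read path described above can be sketched with an invented, simplified segment format (plain Java; the layout details, a length-prefixed index plus an 8-byte footer holding the index offset, are assumptions for illustration, not the proposed on-disk format):

```java
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of a one-segment harchive: file images, then an index of
// (name, offset, length) entries, then an 8-byte footer pointing at the
// index. Reading a file takes two random accesses: footer+index, then data.
public class HarchiveSegmentDemo {
    static void writeSegment(File seg, Map<String, byte[]> files) throws IOException {
        Map<String, long[]> index = new LinkedHashMap<>();
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(seg))) {
            long offset = 0;
            for (Map.Entry<String, byte[]> e : files.entrySet()) {
                out.write(e.getValue());
                index.put(e.getKey(), new long[]{offset, e.getValue().length});
                offset += e.getValue().length;
            }
            long indexOffset = offset;
            out.writeInt(index.size());
            for (Map.Entry<String, long[]> e : index.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeLong(e.getValue()[0]);
                out.writeLong(e.getValue()[1]);
            }
            out.writeLong(indexOffset); // fixed-size footer
        }
    }

    static byte[] readFile(File seg, String name) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(seg, "r")) {
            // Access 1: read the footer, then the index it points at.
            raf.seek(raf.length() - 8);
            long indexOffset = raf.readLong();
            raf.seek(indexOffset);
            int n = raf.readInt();
            for (int i = 0; i < n; i++) {
                String entry = raf.readUTF();
                long off = raf.readLong();
                long len = raf.readLong();
                if (entry.equals(name)) {
                    // Access 2: read the file image itself.
                    byte[] data = new byte[(int) len];
                    raf.seek(off);
                    raf.readFully(data);
                    return data;
                }
            }
        }
        throw new IOException("not in harchive: " + name);
    }

    public static void main(String[] args) throws IOException {
        File seg = File.createTempFile("harchive", ".seg");
        seg.deleteOnExit();
        Map<String, byte[]> files = new LinkedHashMap<>();
        files.put("a.txt", "hello".getBytes(StandardCharsets.UTF_8));
        files.put("b.txt", "world!".getBytes(StandardCharsets.UTF_8));
        writeSegment(seg, files);
        System.out.println(new String(readFile(seg, "b.txt"), StandardCharsets.UTF_8));
    }
}
```

Putting small files near the index, as the proposal suggests, would let the second access start sooner on average.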

[jira] [Commented] (HDFS-2854) SecurityUtil.buildTokenService returns java.net.UnknownHostException when using paths like viewfs://default/some/path

2012-01-27 Thread Arpit Gupta (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195353#comment-13195353
 ] 

Arpit Gupta commented on HDFS-2854:
---

Here is the stack trace:
{code}
java.lang.IllegalArgumentException: java.net.UnknownHostException: default
at 
org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:428)
at 
org.apache.hadoop.security.SecurityUtil.buildDTServiceName(SecurityUtil.java:311)
at 
org.apache.hadoop.fs.FileSystem.getCanonicalServiceName(FileSystem.java:217)
at 
org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:109)
at 
org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:87)
at 
org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80)
at 
org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:137)
at 
org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:405)
at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:326)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1221)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1218)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1239)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:84)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:68)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:200)
Caused by: java.net.UnknownHostException: default
... 29 more
{code}

I had viewfs configured with a default mount table and was using 
viewfs://default/ paths.
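
For context, a mount table named "default" would typically be declared along these lines in core-site.xml (the link path and NameNode address below are made-up examples); the exception suggests the authority "default" is being resolved as a hostname rather than treated as a mount table name:

```xml
<configuration>
  <!-- Hypothetical example: a viewfs mount table named "default" -->
  <property>
    <name>fs.viewfs.mounttable.default.link./user</name>
    <value>hdfs://nn1.example.com:8020/user</value>
  </property>
</configuration>
```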

> SecurityUtil.buildTokenService returns java.net.UnknownHostException when 
> using paths like viewfs://default/some/path
> -
>
> Key: HDFS-2854
> URL: https://issues.apache.org/jira/browse/HDFS-2854
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.24.0, 0.23.1
>Reporter: Arpit Gupta
>






[jira] [Created] (HDFS-2854) SecurityUtil.buildTokenService returns java.net.UnknownHostException when using paths like viewfs://default/some/path

2012-01-27 Thread Arpit Gupta (Created) (JIRA)
SecurityUtil.buildTokenService returns java.net.UnknownHostException when using 
paths like viewfs://default/some/path
-

 Key: HDFS-2854
 URL: https://issues.apache.org/jira/browse/HDFS-2854
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.24.0, 0.23.1
Reporter: Arpit Gupta








[jira] [Updated] (HDFS-2851) HA: After Balancer runs, usedSpace is not balancing correctly.

2012-01-27 Thread Uma Maheswara Rao G (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-2851:
--

Summary: HA: After Balancer runs, usedSpace is not balancing correctly.  
(was: After Balancer runs, usedSpace is not balancing correctly.)

> HA: After Balancer runs, usedSpace is not balancing correctly.
> --
>
> Key: HDFS-2851
> URL: https://issues.apache.org/jira/browse/HDFS-2851
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer, data-node, ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-2851-HDFS-1623-Test.patch
>
>
> After Balancer runs, usedSpace is not balancing correctly.
> {code}
> java.util.concurrent.TimeoutException: Cluster failed to reached expected 
> values of totalSpace (current: 1500, expected: 1500), or usedSpace (current: 
> 390, expected: 300), in more than 2 msec.
>   at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancer.waitForHeartBeat(TestBalancer.java:233)
>   at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes.testBalancerWithHANameNodes(TestBalancerWithHANameNodes.java:99)
> {code}





[jira] [Commented] (HDFS-2791) If block report races with closing of file, replica is incorrectly marked corrupt

2012-01-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195340#comment-13195340
 ] 

Hudson commented on HDFS-2791:
--

Integrated in Hadoop-Mapreduce-0.23-Commit #454 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/454/])
HDFS-2791. If block report races with closing of file, replica is 
incorrectly marked corrupt. Contributed by Todd Lipcon.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1236944
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/AppendTestUtil.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeAdapter.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockReport.java


> If block report races with closing of file, replica is incorrectly marked 
> corrupt
> -
>
> Key: HDFS-2791
> URL: https://issues.apache.org/jira/browse/HDFS-2791
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node, name-node
>Affects Versions: 0.22.0, 0.23.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: 0.24.0, 0.23.1
>
> Attachments: hdfs-2791-test.txt, hdfs-2791.txt, hdfs-2791.txt, 
> hdfs-2791.txt, hdfs-2791.txt
>
>
> The following sequence of events results in a replica mistakenly marked 
> corrupt:
> 1. Pipeline is open with 2 replicas
> 2. DN1 generates a block report but is slow in sending to the NN (eg some 
> flaky network). It gets "stuck" right before the block report RPC.
> 3. Client closes the file.
> 4. DN2 is fast and sends blockReceived to the NN. NN marks the block as 
> COMPLETE
> 5. DN1's block report proceeds, and includes the block in an RBW state.
> 6. (x) NN incorrectly marks the replica as corrupt, since it is an RBW 
> replica on a COMPLETE block.
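
Steps 1-6 can be condensed into a small standalone model (plain Java; the enum and method names are invented, and the check shown is a simplification of the pre-fix NN behavior, not the actual BlockManager code):

```java
// Toy model (invented names, not Hadoop code) of the race above: a stale
// RBW entry in a delayed block report arrives after the block is COMPLETE,
// and a naive check flags the replica corrupt even though its data is fine.
public class BlockReportRaceDemo {
    enum ReplicaState { RBW, FINALIZED }
    enum BlockState { UNDER_CONSTRUCTION, COMPLETE }

    // Simplified pre-fix style check: any RBW replica reported against a
    // COMPLETE block is treated as corrupt, with no allowance for reports
    // generated before the file was closed.
    static boolean naiveIsCorrupt(BlockState nnView, ReplicaState reported) {
        return nnView == BlockState.COMPLETE && reported == ReplicaState.RBW;
    }

    public static void main(String[] args) {
        // Steps 1-2: DN1 snapshots its report while the pipeline is open.
        ReplicaState dn1Snapshot = ReplicaState.RBW;

        // Steps 3-4: the client closes the file and DN2's blockReceived
        // moves the block to COMPLETE before DN1's report lands.
        BlockState nnView = BlockState.COMPLETE;

        // Steps 5-6: DN1's delayed report is processed against the new state.
        System.out.println("marked corrupt: " + naiveIsCorrupt(nnView, dn1Snapshot));
    }
}
```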





[jira] [Commented] (HDFS-2840) TestHostnameFilter should work with localhost or localhost.localdomain

2012-01-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195339#comment-13195339
 ] 

Hudson commented on HDFS-2840:
--

Integrated in Hadoop-Mapreduce-0.23-Commit #454 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/454/])
Merge -r 1236939:1236940 from trunk to branch. FIXES: HDFS-2840

tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1236942
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/lib/servlet/TestHostnameFilter.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> TestHostnameFilter should work with localhost or localhost.localdomain 
> ---
>
> Key: HDFS-2840
> URL: https://issues.apache.org/jira/browse/HDFS-2840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.23.0
>Reporter: Eli Collins
>Assignee: Alejandro Abdelnur
> Fix For: 0.23.1
>
> Attachments: HDFS-2840.patch
>
>
> TestHostnameFilter may currently fail with the following:
> {noformat}
> Error Message
> null expected:<localhost> but was:<localhost.localdomain>
> Stacktrace
> junit.framework.ComparisonFailure: null expected:<localhost> 
> but was:<localhost.localdomain>
>   at junit.framework.Assert.assertEquals(Assert.java:81)
>   at junit.framework.Assert.assertEquals(Assert.java:87)
>   at 
> org.apache.hadoop.lib.servlet.TestHostnameFilter$1.doFilter(TestHostnameFilter.java:50)
>   at 
> org.apache.hadoop.lib.servlet.HostnameFilter.doFilter(HostnameFilter.java:68)
>   at 
> org.apache.hadoop.lib.servlet.TestHostnameFilter.hostname(TestHostnameFilter.java:58)
> {noformat}





[jira] [Commented] (HDFS-2853) HA: NN fails to start if the shared edits dir is marked required

2012-01-27 Thread Aaron T. Myers (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195331#comment-13195331
 ] 

Aaron T. Myers commented on HDFS-2853:
--

I think we should probably just remove the check. It's better implemented at a 
higher level anyway, since it can't give a specific error message where the 
error is currently detected.

> HA: NN fails to start if the shared edits dir is marked required
> 
>
> Key: HDFS-2853
> URL: https://issues.apache.org/jira/browse/HDFS-2853
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
>Priority: Critical
>
> Currently it will fail because of a bug with checking the valid configuration 
> of required resources.





[jira] [Created] (HDFS-2853) HA: NN fails to start if the shared edits dir is marked required

2012-01-27 Thread Aaron T. Myers (Created) (JIRA)
HA: NN fails to start if the shared edits dir is marked required


 Key: HDFS-2853
 URL: https://issues.apache.org/jira/browse/HDFS-2853
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha, name-node
Affects Versions: HA branch (HDFS-1623)
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
Priority: Critical


Currently it will fail because of a bug with checking the valid configuration 
of required resources.





[jira] [Updated] (HDFS-2851) After Balancer runs, usedSpace is not balancing correctly.

2012-01-27 Thread Uma Maheswara Rao G (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-2851:
--

Attachment: HDFS-2851-HDFS-1623-Test.patch

Attached a patch with a test assertion that reproduces this issue.

> After Balancer runs, usedSpace is not balancing correctly.
> --
>
> Key: HDFS-2851
> URL: https://issues.apache.org/jira/browse/HDFS-2851
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer, data-node, ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-2851-HDFS-1623-Test.patch
>
>
> After Balancer runs, usedSpace is not balancing correctly.
> {code}
> java.util.concurrent.TimeoutException: Cluster failed to reached expected 
> values of totalSpace (current: 1500, expected: 1500), or usedSpace (current: 
> 390, expected: 300), in more than 2 msec.
>   at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancer.waitForHeartBeat(TestBalancer.java:233)
>   at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes.testBalancerWithHANameNodes(TestBalancerWithHANameNodes.java:99)
> {code}





[jira] [Commented] (HDFS-2851) After Balancer runs, usedSpace is not balancing correctly.

2012-01-27 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195323#comment-13195323
 ] 

Uma Maheswara Rao G commented on HDFS-2851:
---

Hi Eli, this particular issue happens only in the HA branch; the same case 
works fine on trunk.

From an initial look: two DNs (DN1, DN2) are registered with the NN and have 
sent their initial block reports. After the NN transitions to active, all 
blocks are marked as stale until the next block report comes from these DNs. 
A new DN (DN3) is then added and registers with the active NN successfully. 
When we run the balancer, it needs to move some blocks around to balance the 
cluster. Some blocks land on the old DNs, which then also need to process 
over-replicated blocks. I think there is no block report immediately after the 
transition to active (I need to confirm whether we trigger one right after 
transitioning or not), so those blocks were still marked stale. Processing of 
over-replicated blocks gets postponed for this reason. Since these nodes have 
not processed the over-replicated blocks, the used space is a little higher 
than expected [usedSpace (current: 390, expected: 300)].

When I reduced the block report interval to a very small value (10s), this 
particular case passes.

Thanks
Uma
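
The workaround mentioned above, shortening the block report interval for the test, would correspond to a setting along these lines (the 10000 ms value mirrors the 10s in the comment; whether the test sets this via Configuration or hdfs-site.xml is an assumption):

```xml
<property>
  <name>dfs.blockreport.intervalMsec</name>
  <!-- 10 seconds, far below the default hours-long interval -->
  <value>10000</value>
</property>
```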

> After Balancer runs, usedSpace is not balancing correctly.
> --
>
> Key: HDFS-2851
> URL: https://issues.apache.org/jira/browse/HDFS-2851
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer, data-node, ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>
> After Balancer runs, usedSpace is not balancing correctly.
> {code}
> java.util.concurrent.TimeoutException: Cluster failed to reached expected 
> values of totalSpace (current: 1500, expected: 1500), or usedSpace (current: 
> 390, expected: 300), in more than 2 msec.
>   at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancer.waitForHeartBeat(TestBalancer.java:233)
>   at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes.testBalancerWithHANameNodes(TestBalancerWithHANameNodes.java:99)
> {code}





[jira] [Commented] (HDFS-2791) If block report races with closing of file, replica is incorrectly marked corrupt

2012-01-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195317#comment-13195317
 ] 

Hudson commented on HDFS-2791:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #1625 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1625/])
HDFS-2791. If block report races with closing of file, replica is 
incorrectly marked corrupt. Contributed by Todd Lipcon.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1236945
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/AppendTestUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeAdapter.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockReport.java


> If block report races with closing of file, replica is incorrectly marked 
> corrupt
> -
>
> Key: HDFS-2791
> URL: https://issues.apache.org/jira/browse/HDFS-2791
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node, name-node
>Affects Versions: 0.22.0, 0.23.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: 0.24.0, 0.23.1
>
> Attachments: hdfs-2791-test.txt, hdfs-2791.txt, hdfs-2791.txt, 
> hdfs-2791.txt, hdfs-2791.txt
>
>
> The following sequence of events results in a replica mistakenly marked 
> corrupt:
> 1. Pipeline is open with 2 replicas
> 2. DN1 generates a block report but is slow in sending to the NN (eg some 
> flaky network). It gets "stuck" right before the block report RPC.
> 3. Client closes the file.
> 4. DN2 is fast and sends blockReceived to the NN. NN marks the block as 
> COMPLETE
> 5. DN1's block report proceeds, and includes the block in an RBW state.
> 6. (x) NN incorrectly marks the replica as corrupt, since it is an RBW 
> replica on a COMPLETE block.
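The six steps above can be modeled as a simple state check. The sketch below is a hypothetical simplification for illustration only: the enum values mirror HDFS replica/block states, but the class and method names are invented and do not correspond to the actual BlockManager API.

```java
// Illustrative model of the HDFS-2791 race. The buggy check treats any
// RBW replica reported against a COMPLETE block as corrupt, even when the
// report was merely snapshotted before the file was closed.
public class BlockReportRaceSketch {
    enum BlockState { UNDER_CONSTRUCTION, COMPLETE }
    enum ReplicaState { RBW, FINALIZED }

    // Hypothetical stand-in for the NameNode's corruption check.
    static boolean isCorrupt(BlockState blockOnNN, ReplicaState reported) {
        return blockOnNN == BlockState.COMPLETE && reported == ReplicaState.RBW;
    }

    public static void main(String[] args) {
        // DN1 snapshots its block report while the pipeline is still open...
        ReplicaState dn1Report = ReplicaState.RBW;
        // ...the client closes the file and DN2's blockReceived completes the block...
        BlockState nnState = BlockState.COMPLETE;
        // ...then DN1's stale report finally arrives and trips the check.
        System.out.println(isCorrupt(nnState, dn1Report)); // prints "true"
    }
}
```

The fix has to distinguish a stale-but-consistent RBW report from a genuinely corrupt replica, rather than keying off the block/replica state pair alone.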

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2791) If block report races with closing of file, replica is incorrectly marked corrupt

2012-01-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195306#comment-13195306
 ] 

Hudson commented on HDFS-2791:
--

Integrated in Hadoop-Hdfs-trunk-Commit #1681 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1681/])
HDFS-2791. If block report races with closing of file, replica is 
incorrectly marked corrupt. Contributed by Todd Lipcon.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1236945




[jira] [Commented] (HDFS-2791) If block report races with closing of file, replica is incorrectly marked corrupt

2012-01-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195305#comment-13195305
 ] 

Hudson commented on HDFS-2791:
--

Integrated in Hadoop-Common-trunk-Commit #1609 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1609/])
HDFS-2791. If block report races with closing of file, replica is 
incorrectly marked corrupt. Contributed by Todd Lipcon.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1236945




[jira] [Commented] (HDFS-2791) If block report races with closing of file, replica is incorrectly marked corrupt

2012-01-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195301#comment-13195301
 ] 

Hudson commented on HDFS-2791:
--

Integrated in Hadoop-Hdfs-0.23-Commit #430 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/430/])
HDFS-2791. If block report races with closing of file, replica is 
incorrectly marked corrupt. Contributed by Todd Lipcon.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1236944
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/AppendTestUtil.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeAdapter.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockReport.java






[jira] [Commented] (HDFS-2791) If block report races with closing of file, replica is incorrectly marked corrupt

2012-01-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195300#comment-13195300
 ] 

Hudson commented on HDFS-2791:
--

Integrated in Hadoop-Common-0.23-Commit #439 (See 
[https://builds.apache.org/job/Hadoop-Common-0.23-Commit/439/])
HDFS-2791. If block report races with closing of file, replica is 
incorrectly marked corrupt. Contributed by Todd Lipcon.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1236944




[jira] [Updated] (HDFS-2791) If block report races with closing of file, replica is incorrectly marked corrupt

2012-01-27 Thread Todd Lipcon (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-2791:
--

   Resolution: Fixed
Fix Version/s: 0.23.1
   0.24.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to branch-23 and trunk.
The branch-23 patch required a few small changes since protobuf RPC isn't 
merged:
- instead of using a spy on the bpNamenode object, it uses a mock with 
DelegateAnswer as the default answer implementation
- changed types from the translator implementation to the straight 
DatanodeProtocol

Since the changes were simple and confined to test code, I made sure the new 
test passed and committed.





[jira] [Commented] (HDFS-2840) TestHostnameFilter should work with localhost or localhost.localdomain

2012-01-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195298#comment-13195298
 ] 

Hudson commented on HDFS-2840:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #1624 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1624/])
HDFS-2840. TestHostnameFilter should work with localhost or 
localhost.localdomain (tucu)

tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1236940
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/lib/servlet/TestHostnameFilter.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> TestHostnameFilter should work with localhost or localhost.localdomain 
> ---
>
> Key: HDFS-2840
> URL: https://issues.apache.org/jira/browse/HDFS-2840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.23.0
>Reporter: Eli Collins
>Assignee: Alejandro Abdelnur
> Fix For: 0.23.1
>
> Attachments: HDFS-2840.patch
>
>
> TestHostnameFilter may currently fail with the following:
> {noformat}
> Error Message
> null expected: but was:
> Stacktrace
> junit.framework.ComparisonFailure: null expected: 
> but was:
>   at junit.framework.Assert.assertEquals(Assert.java:81)
>   at junit.framework.Assert.assertEquals(Assert.java:87)
>   at 
> org.apache.hadoop.lib.servlet.TestHostnameFilter$1.doFilter(TestHostnameFilter.java:50)
>   at 
> org.apache.hadoop.lib.servlet.HostnameFilter.doFilter(HostnameFilter.java:68)
>   at 
> org.apache.hadoop.lib.servlet.TestHostnameFilter.hostname(TestHostnameFilter.java:58)
> {noformat}





[jira] [Commented] (HDFS-2840) TestHostnameFilter should work with localhost or localhost.localdomain

2012-01-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195295#comment-13195295
 ] 

Hudson commented on HDFS-2840:
--

Integrated in Hadoop-Hdfs-0.23-Commit #429 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/429/])
Merge -r 1236939:1236940 from trunk to branch. FIXES: HDFS-2840

tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1236942
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/lib/servlet/TestHostnameFilter.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt






[jira] [Commented] (HDFS-2840) TestHostnameFilter should work with localhost or localhost.localdomain

2012-01-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195292#comment-13195292
 ] 

Hudson commented on HDFS-2840:
--

Integrated in Hadoop-Common-trunk-Commit #1608 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1608/])
HDFS-2840. TestHostnameFilter should work with localhost or 
localhost.localdomain (tucu)

tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1236940




[jira] [Assigned] (HDFS-2824) HA: failover does not succeed if prior NN died just after creating an edit log segment

2012-01-27 Thread Aaron T. Myers (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers reassigned HDFS-2824:


Assignee: Aaron T. Myers  (was: Todd Lipcon)

> HA: failover does not succeed if prior NN died just after creating an edit 
> log segment
> --
>
> Key: HDFS-2824
> URL: https://issues.apache.org/jira/browse/HDFS-2824
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Aaron T. Myers
> Attachments: HDFS-2824-HDFS-1623.patch
>
>
> In stress testing failover, I had the following failure:
> - NN1 rolls edit logs and starts writing edits_inprogress_1000
> - NN1 crashes before writing the START_LOG_SEGMENT transaction
> - NN2 tries to become active, and calls {{recoverUnfinalizedSegment}}. Since 
> the log file contains no valid transactions, it is marked as corrupt and 
> renamed with the {{.corrupt}} suffix
> - The sanity check in {{openLogsForWrite}} will refuse to open a new 
> in-progress log at the same txid. Failover does not proceed.
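A minimal model of this failure sequence is sketched below. All names here are illustrative stand-ins: the real logic lives in the edit-log machinery ({{recoverUnfinalizedSegment}} and {{openLogsForWrite}} mentioned above), and this sketch only captures the txid sanity check that blocks the failover.

```java
// Hypothetical model of the HDFS-2824 failure mode. Renaming the empty
// segment aside as .corrupt does not un-consume its starting txid, so the
// sanity check refuses to start a new segment at the same txid.
public class FailoverSanitySketch {
    private long highestSeenTxid = -1;

    // NN1 rolls logs: edits_inprogress_1000 is created on shared storage
    // (raising the highest txid ever observed), then NN1 crashes before
    // writing the START_LOG_SEGMENT transaction.
    void createEmptySegment(long txid) {
        highestSeenTxid = Math.max(highestSeenTxid, txid);
    }

    // Stand-in for the openLogsForWrite sanity check: a new segment must
    // start past every txid already seen, so txid 1000 is refused even
    // though its only prior use was an empty, corrupt-renamed segment.
    boolean openLogForWrite(long txid) {
        return txid > highestSeenTxid;
    }

    public static void main(String[] args) {
        FailoverSanitySketch nn = new FailoverSanitySketch();
        nn.createEmptySegment(1000);                  // NN1 dies here
        System.out.println(nn.openLogForWrite(1000)); // prints "false"
    }
}
```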





[jira] [Updated] (HDFS-2824) HA: failover does not succeed if prior NN died just after creating an edit log segment

2012-01-27 Thread Aaron T. Myers (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-2824:
-

Attachment: HDFS-2824-HDFS-1623.patch

Here's a patch which addresses the issue. I ran the following tests to verify 
there were no regressions, and all passed:

TestOfflineEditsViewer,TestHDFSConcat,TestEditLogRace,TestNameEditsConfigs,TestSaveNamespace,TestEditLogFileOutputStream,TestFileJournalManager,TestEditLog,TestFSEditLogLoader,TestFsLimits,TestSecurityTokenEditLog,TestStorageRestore,TestEditLogJournalFailures,TestEditLogTailer,TestEditLogsDuringFailover,TestHASafeMode,TestStandbyCheckpoints,TestDNFencing,TestDNFencingWithReplication,TestStandbyIsHot,TestGenericJournalConf,TestCheckPointForSecurityTokens,TestNNStorageRetentionManager,TestPBHelper,TestNNLeaseRecovery,TestFiRename,TestHAStateTransitions





[jira] [Commented] (HDFS-2840) TestHostnameFilter should work with localhost or localhost.localdomain

2012-01-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195288#comment-13195288
 ] 

Hudson commented on HDFS-2840:
--

Integrated in Hadoop-Hdfs-trunk-Commit #1680 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1680/])
HDFS-2840. TestHostnameFilter should work with localhost or 
localhost.localdomain (tucu)

tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1236940




[jira] [Commented] (HDFS-2840) TestHostnameFilter should work with localhost or localhost.localdomain

2012-01-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195287#comment-13195287
 ] 

Hudson commented on HDFS-2840:
--

Integrated in Hadoop-Common-0.23-Commit #438 (See 
[https://builds.apache.org/job/Hadoop-Common-0.23-Commit/438/])
Merge -r 1236939:1236940 from trunk to branch. FIXES: HDFS-2840

tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1236942




[jira] [Updated] (HDFS-2840) TestHostnameFilter should work with localhost or localhost.localdomain

2012-01-27 Thread Alejandro Abdelnur (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated HDFS-2840:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed to trunk and branch-0.23





[jira] [Commented] (HDFS-2840) TestHostnameFilter should work with localhost or localhost.localdomain

2012-01-27 Thread Eli Collins (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195247#comment-13195247
 ] 

Eli Collins commented on HDFS-2840:
---

+1 lgtm





[jira] [Commented] (HDFS-2851) After Balancer runs, usedSpace is not balancing correctly.

2012-01-27 Thread Eli Collins (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195215#comment-13195215
 ] 

Eli Collins commented on HDFS-2851:
---

Wonder if this is fixed by HDFS-1105. Would be good to update that patch on 
trunk and get it in.

> After Balancer runs, usedSpace is not balancing correctly.
> --
>
> Key: HDFS-2851
> URL: https://issues.apache.org/jira/browse/HDFS-2851
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer, data-node, ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>
> After Balancer runs, usedSpace is not balancing correctly.
> {code}
> java.util.concurrent.TimeoutException: Cluster failed to reached expected 
> values of totalSpace (current: 1500, expected: 1500), or usedSpace (current: 
> 390, expected: 300), in more than 2 msec.
>   at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancer.waitForHeartBeat(TestBalancer.java:233)
>   at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes.testBalancerWithHANameNodes(TestBalancerWithHANameNodes.java:99)
> {code}





[jira] [Updated] (HDFS-2848) hdfs corruption appended to blocks is not detected by fs commands or fsck

2012-01-27 Thread Ravi Prakash (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-2848:
---

Description: 
Courtesy Pat White [~patwhitey2007]
{quote}
Appears that there is a regression in corrupt block detection by both fsck and 
fs cmds like 'cat'. Testcases for pre-block and block-overwrite corruption of 
all replicas are correctly reporting errors; however, post-block corruption is 
not detected: fsck on the filesystem reports it as Healthy and 'cat' returns 
without error. Looking at the DN blocks themselves, they clearly contain the 
injected corruption pattern.
{quote}

  was:
Courtesy Pat White
{quote}
Appears that there is a regression in corrupt block detection by both fsck and 
fs cmds like 'cat'. Testcases for
pre-block and block-overwrite corruption of all replicas is correctly reporting 
errors however post-block corruption is
not, fsck on the filesystem reports it's Healthy and 'cat' returns without 
error. Looking at the DN blocks themselves,
they clearly contain the injected corruption pattern.
{quote}


> hdfs corruption appended to blocks is not detected by fs commands or fsck
> -
>
> Key: HDFS-2848
> URL: https://issues.apache.org/jira/browse/HDFS-2848
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
>
> Courtesy Pat White [~patwhitey2007]
> {quote}
> Appears that there is a regression in corrupt block detection by both fsck 
> and fs cmds like 'cat'. Testcases for pre-block and block-overwrite 
> corruption of all replicas are correctly reporting errors; however, 
> post-block corruption is not detected: fsck on the filesystem reports it as 
> Healthy and 'cat' returns without error. Looking at the DN blocks 
> themselves, they clearly contain the injected corruption pattern.
> {quote}





[jira] [Commented] (HDFS-2843) Rename protobuf message StorageInfoProto to NodeInfoProto

2012-01-27 Thread Aaron T. Myers (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195110#comment-13195110
 ] 

Aaron T. Myers commented on HDFS-2843:
--

bq. Sorry I did not address it. My plan was to do this change as well for 
StorageInfo.

Cool. The description of the JIRA and the patch should be updated, then.

bq. I think, you have probably misunderstood my proposal.

I understood the proposal and looked at the patch. I was just unconvinced that 
the name change made sense. But, you've convinced me. The current class 
hierarchy would indeed suggest that this change is appropriate. Change away.

> Rename protobuf message StorageInfoProto to NodeInfoProto
> -
>
> Key: HDFS-2843
> URL: https://issues.apache.org/jira/browse/HDFS-2843
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node, name-node
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-2843.patch
>
>
> StorageInfoProto has cTime, layoutVersion, namespaceID and clusterID. This is 
> really information of a node that is part of the cluster, such as Namenode, 
> Standby/Secondary/Backup/Checkpointer and datanodes. To reflect this, I want 
> to rename it from StorageInfoProto to NodeInfoProto.





[jira] [Commented] (HDFS-2759) Pre-allocate HDFS edit log files after writing version number

2012-01-27 Thread Aaron T. Myers (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195093#comment-13195093
 ] 

Aaron T. Myers commented on HDFS-2759:
--

bq. So I think we're OK here.

Awesome. I agree with your analysis.

bq. I remember you ran a benchmark at some point to check for edit log 
throughput - if you have that around still would you mind re-running to make 
sure this doesn't cause any unforeseen regression?

I bet I can dig that up and will be happy to run it again with this patch.

> Pre-allocate HDFS edit log files after writing version number
> -
>
> Key: HDFS-2759
> URL: https://issues.apache.org/jira/browse/HDFS-2759
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, name-node
>Affects Versions: 0.24.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-2759.patch, HDFS-2759.patch
>
>
> In HDFS-2709 it was discovered that there's a potential race wherein edits 
> log files are pre-allocated before the version number is written into the 
> header of the file. This can cause the NameNode to read an invalid HDFS 
> layout version, and hence fail to read the edit log file. We should write the 
> header, then pre-allocate the rest of the file after this point.
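The proposed ordering can be sketched as follows. This is a minimal local-file sketch, not the actual EditLogFileOutputStream code; the layout version constant and preallocation length are illustrative:

```java
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class EditLogCreate {
    static final int LAYOUT_VERSION = -40;      // illustrative value
    static final int PREALLOC_LENGTH = 1 << 20; // illustrative 1 MB

    // Write the version header and sync it BEFORE preallocating, so a
    // crash-recovering reader finds a valid header rather than fill bytes.
    public static void create(String path) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(path, "rw");
             FileChannel fc = raf.getChannel()) {
            ByteBuffer header = ByteBuffer.allocate(4).putInt(LAYOUT_VERSION);
            header.flip();
            while (header.hasRemaining()) {
                fc.write(header);
            }
            fc.force(false);  // header is durable before preallocation
            ByteBuffer fill = ByteBuffer.allocate(PREALLOC_LENGTH - 4);
            while (fill.hasRemaining()) {
                fc.write(fill);
            }
            fc.force(false);
        }
    }
}
```

Writing and syncing the header first means a reader that opens the file after a crash sees a valid version number instead of preallocation fill where the version should be.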





[jira] [Created] (HDFS-2852) Jenkins pre-commit build does not pick up the correct attachment.

2012-01-27 Thread Kihwal Lee (Created) (JIRA)
Jenkins pre-commit build does not pick up the correct attachment.
-

 Key: HDFS-2852
 URL: https://issues.apache.org/jira/browse/HDFS-2852
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build
Affects Versions: 0.24.0
Reporter: Kihwal Lee


When two files are attached to a jira, the slaves run two builds, but both 
builds use only the latest attachment.

For example, the patch_tested.txt from PreCommit-Admin shows the correct 
attachment numbers for HDFS-2784. From 
https://builds.apache.org/job/PreCommit-Admin/56284/artifact/patch_tested.txt:
{noformat}
...
HBASE-5271,12511722
HDFS-2784,12511725
HDFS-2836,12511727
HDFS-2784,12511726
{noformat}

But the Jenkins build slaves had built #12511726 twice.





[jira] [Commented] (HDFS-2759) Pre-allocate HDFS edit log files after writing version number

2012-01-27 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195092#comment-13195092
 ] 

Todd Lipcon commented on HDFS-2759:
---

bq. The explanation, per the comment, is that syncing metadata is unnecessary 
because of pre-allocation. I don't think that's reasonable, though, since 
EditLogFileOutputStream#preallocate doesn't call sync itself, which means that 
the file length might never get updated upon returning from 
EditLogFileOutputStream#flushAndSync.

This isn't quite true - the difference between {{force(true)}} and 
{{force(false)}} is that {{force(false)}} calls {{fdatasync()}}. The man page 
says:
{quote}
   fdatasync() is similar to fsync(), but does not flush modified metadata
   *unless that metadata is needed in order to allow a subsequent data*
   *retrieval to be correctly handled*.  For example, changes to st_atime or
   st_mtime (respectively, time of last access and time of last modification;
   see stat(2)) do not require flushing because they are not necessary for a
   subsequent data read to be handled correctly.  *On the other hand, a change
   to the file size* (st_size, as made by say ftruncate(2)), *would require a
   metadata flush*.
{quote}
(emphasis mine)

So I think we're OK here.
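The pattern under discussion can be shown with a minimal Java FileChannel sketch (file names and sizes here are illustrative, not HDFS's actual values): force(false) corresponds to fdatasync(), which per the man page above still flushes the size change, because a subsequent read past the old EOF depends on it.

```java
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class PreallocateSync {
    // Grow the file by 1 MB of zeros, then sync. force(false) maps to
    // fdatasync(): it flushes the data and the st_size change, but not
    // metadata like mtime that a later read does not depend on.
    public static long preallocate(String path) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(path, "rw");
             FileChannel fc = raf.getChannel()) {
            ByteBuffer zeros = ByteBuffer.allocate(1024 * 1024);
            fc.position(fc.size());
            while (zeros.hasRemaining()) {
                fc.write(zeros);
            }
            fc.force(false);  // data + size metadata, cheaper than force(true)
            return fc.size();
        }
    }
}
```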

> Pre-allocate HDFS edit log files after writing version number
> -
>
> Key: HDFS-2759
> URL: https://issues.apache.org/jira/browse/HDFS-2759
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, name-node
>Affects Versions: 0.24.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-2759.patch, HDFS-2759.patch
>
>
> In HDFS-2709 it was discovered that there's a potential race wherein edits 
> log files are pre-allocated before the version number is written into the 
> header of the file. This can cause the NameNode to read an invalid HDFS 
> layout version, and hence fail to read the edit log file. We should write the 
> header, then pre-allocate the rest of the file after this point.





[jira] [Commented] (HDFS-2791) If block report races with closing of file, replica is incorrectly marked corrupt

2012-01-27 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195086#comment-13195086
 ] 

Todd Lipcon commented on HDFS-2791:
---

bq.  I am coming to the conclusion that when a NN asks a DN to delete a 
replica, in addition to the block ID and generation stamp, it should also 
include the state (RBW etc.) known to the NN. The block is deleted only if it 
is in that state.

Good idea - I like this safeguard. But given that there are +1s on this patch 
here, I don't think this patch and the above safeguard are mutually exclusive. 
So let's do both for extra safety.

Assuming this patch still applies, I'll commit it momentarily.
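The proposed safeguard could be sketched roughly like this; the types and names below are illustrative, not the actual HDFS classes:

```java
public class DeleteGuard {
    enum ReplicaState { FINALIZED, RBW, RWR, RUR, TEMPORARY }

    static class ReplicaRecord {
        final long blockId;
        final long genStamp;
        final ReplicaState state;
        ReplicaRecord(long blockId, long genStamp, ReplicaState state) {
            this.blockId = blockId;
            this.genStamp = genStamp;
            this.state = state;
        }
    }

    // Honor an NN-requested deletion only when the block ID, generation
    // stamp, AND the state the NN believes the replica is in all match the
    // local replica. A stale deletion request (e.g. issued against an RBW
    // replica that has since been finalized) is then ignored.
    static boolean shouldDelete(ReplicaRecord local, long id, long gs,
                                ReplicaState nnState) {
        return local.blockId == id
            && local.genStamp == gs
            && local.state == nnState;
    }
}
```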

> If block report races with closing of file, replica is incorrectly marked 
> corrupt
> -
>
> Key: HDFS-2791
> URL: https://issues.apache.org/jira/browse/HDFS-2791
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node, name-node
>Affects Versions: 0.22.0, 0.23.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-2791-test.txt, hdfs-2791.txt, hdfs-2791.txt, 
> hdfs-2791.txt, hdfs-2791.txt
>
>
> The following sequence of events results in a replica mistakenly marked 
> corrupt:
> 1. Pipeline is open with 2 replicas
> 2. DN1 generates a block report but is slow in sending to the NN (eg some 
> flaky network). It gets "stuck" right before the block report RPC.
> 3. Client closes the file.
> 4. DN2 is fast and sends blockReceived to the NN. NN marks the block as 
> COMPLETE
> 5. DN1's block report proceeds, and includes the block in an RBW state.
> 6. (x) NN incorrectly marks the replica as corrupt, since it is an RBW 
> replica on a COMPLETE block.





[jira] [Commented] (HDFS-2759) Pre-allocate HDFS edit log files after writing version number

2012-01-27 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195074#comment-13195074
 ] 

Todd Lipcon commented on HDFS-2759:
---

This seems reasonable. I remember you ran a benchmark at some point to check 
for edit log throughput - if you have that around still would you mind 
re-running to make sure this doesn't cause any unforeseen regression?

> Pre-allocate HDFS edit log files after writing version number
> -
>
> Key: HDFS-2759
> URL: https://issues.apache.org/jira/browse/HDFS-2759
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, name-node
>Affects Versions: 0.24.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-2759.patch, HDFS-2759.patch
>
>
> In HDFS-2709 it was discovered that there's a potential race wherein edits 
> log files are pre-allocated before the version number is written into the 
> header of the file. This can cause the NameNode to read an invalid HDFS 
> layout version, and hence fail to read the edit log file. We should write the 
> header, then pre-allocate the rest of the file after this point.





[jira] [Updated] (HDFS-2801) Provide a method in client side translators to check for a methods supported in underlying protocol.

2012-01-27 Thread Jitendra Nath Pandey (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-2801:
---

Status: Patch Available  (was: Open)

> Provide a method in client side translators to check for a methods supported 
> in underlying protocol.
> 
>
> Key: HDFS-2801
> URL: https://issues.apache.org/jira/browse/HDFS-2801
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS-2801.trunk.patch, HDFS-2801.trunk.patch, 
> HDFS-2801.trunk.patch, HDFS-2801.trunk.patch
>
>
> This jira corresponds to HADOOP-7965. The client-side translators should 
> have a method, boolean isMethodSupported(String methodName), which returns 
> true if the given method is supported and available at the server.
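A reflection-based sketch of such a check (this class is hypothetical; the real translators would query the server's RPC metadata rather than a local interface, since what matters is what the server actually supports):

```java
import java.lang.reflect.Method;

public class MethodSupportChecker {
    private final Class<?> protocolClass;

    public MethodSupportChecker(Class<?> protocolClass) {
        this.protocolClass = protocolClass;
    }

    // True if the protocol interface declares a method with this name.
    // The actual implementation would ask the remote server, so that a
    // client can degrade gracefully against an older server version.
    public boolean isMethodSupported(String methodName) {
        for (Method m : protocolClass.getDeclaredMethods()) {
            if (m.getName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }
}
```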





[jira] [Commented] (HDFS-2801) Provide a method in client side translators to check for a methods supported in underlying protocol.

2012-01-27 Thread Jitendra Nath Pandey (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195034#comment-13195034
 ] 

Jitendra Nath Pandey commented on HDFS-2801:


Re-uploaded same patch to trigger hudson.

> Provide a method in client side translators to check for a methods supported 
> in underlying protocol.
> 
>
> Key: HDFS-2801
> URL: https://issues.apache.org/jira/browse/HDFS-2801
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS-2801.trunk.patch, HDFS-2801.trunk.patch, 
> HDFS-2801.trunk.patch, HDFS-2801.trunk.patch
>
>
> This jira corresponds to HADOOP-7965. The client-side translators should 
> have a method, boolean isMethodSupported(String methodName), which returns 
> true if the given method is supported and available at the server.





[jira] [Updated] (HDFS-2801) Provide a method in client side translators to check for a methods supported in underlying protocol.

2012-01-27 Thread Jitendra Nath Pandey (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-2801:
---

Attachment: HDFS-2801.trunk.patch

> Provide a method in client side translators to check for a methods supported 
> in underlying protocol.
> 
>
> Key: HDFS-2801
> URL: https://issues.apache.org/jira/browse/HDFS-2801
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS-2801.trunk.patch, HDFS-2801.trunk.patch, 
> HDFS-2801.trunk.patch, HDFS-2801.trunk.patch
>
>
> This jira corresponds to HADOOP-7965. The client-side translators should 
> have a method, boolean isMethodSupported(String methodName), which returns 
> true if the given method is supported and available at the server.





[jira] [Commented] (HDFS-2843) Rename protobuf message StorageInfoProto to NodeInfoProto

2012-01-27 Thread Suresh Srinivas (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195024#comment-13195024
 ] 

Suresh Srinivas commented on HDFS-2843:
---

bq. Also, what about my other concern from earlier? I can probably be convinced 
that a name change for StorageInfo is in order, but not StorageInfoProto in 
isolation:
Sorry, I did not address it. My plan was to make this change for StorageInfo 
as well.

bq. I don't see how this fact makes it necessarily not storage. All those 
things have "storage" directories as well.
Maybe we are thinking about the abstractions differently. To me, an HDFS 
cluster is made of nodes. Each node has the following information:
* ClusterID - for membership purposes
* namespaceID - to bind namenodes, secondary/backup nodes, and datanodes to 
the namespace they serve
* layoutVersion - for compatibility checks and to support 
upgrade/snapshot/rollback
* cTime - namespace-related information used to trigger upgrades

To me, storage is more what the Datanode exposes to the namenode: storageID, 
storage utilization (available space, free space, etc.). This is exposed only 
by the datanode.

The current subclasses of StorageInfo are: 
CheckPointSignature, NamenodeRegistration, NamespaceInfo, and Storage -> { 
BlockPoolSliceStorage, DataStorage, NNStorage}

Of these, Storage is the one that has storage directories, the VERSION file, 
etc., not StorageInfo.

I think you have probably misunderstood my proposal. I am not proposing 
changes to the directory structure. Hence your upgrade questions (what happens 
when a partial upgrade has occurred, when layoutVersion differs, etc.) concern 
behavior that already exists, which this patch does not touch. I am changing 
the organization as follows (if you look at the patch, it should be clear):

StorageInfo becomes NodeInfo, which captures all the information about a node 
exchanged over the protocol.
CheckPointSignature, NamenodeRegistration, NamespaceInfo, and Storage use 
NodeInfo, without the need for inheritance.
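The proposed reorganization is essentially composition over inheritance; a schematic sketch with abbreviated, illustrative field sets (not the actual HDFS classes):

```java
// Schematic only: abbreviated fields, not the real HDFS types.
class NodeInfo {
    final String clusterID;     // cluster membership
    final int namespaceID;      // namespace binding
    final int layoutVersion;    // compatibility checks
    final long cTime;           // upgrade trigger

    NodeInfo(String clusterID, int namespaceID, int layoutVersion, long cTime) {
        this.clusterID = clusterID;
        this.namespaceID = namespaceID;
        this.layoutVersion = layoutVersion;
        this.cTime = cTime;
    }
}

// Instead of "NamespaceInfo extends StorageInfo", each consumer simply
// holds a NodeInfo -- no inheritance needed.
class NamespaceInfo {
    final NodeInfo node;
    final String buildVersion;

    NamespaceInfo(NodeInfo node, String buildVersion) {
        this.node = node;
        this.buildVersion = buildVersion;
    }
}
```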


> Rename protobuf message StorageInfoProto to NodeInfoProto
> -
>
> Key: HDFS-2843
> URL: https://issues.apache.org/jira/browse/HDFS-2843
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node, name-node
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-2843.patch
>
>
> StorageInfoProto has cTime, layoutVersion, namespaceID and clusterID. This is 
> really information of a node that is part of the cluster, such as Namenode, 
> Standby/Secondary/Backup/Checkpointer and datanodes. To reflect this, I want 
> to rename it from StorageInfoProto to NodeInfoProto.





[jira] [Commented] (HDFS-2833) Add GETMERGE operation to httpfs

2012-01-27 Thread Eli Collins (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195002#comment-13195002
 ] 

Eli Collins commented on HDFS-2833:
---

getmerge is an FsShell API, not a FileSystem/Context API. I think we should 
keep HttpFS as a FileSystem/Context proxy. Otherwise, e.g., HttpFS won't be 
compatible with WebHDFS, we'll end up putting in the kitchen sink, etc.

Sounds like what the user really wants is a rest API for FsShell.

> Add GETMERGE operation to httpfs
> 
>
> Key: HDFS-2833
> URL: https://issues.apache.org/jira/browse/HDFS-2833
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 0.23.1
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Fix For: 0.23.1
>
> Attachments: HDFS-2833.patch
>
>
> Add a convenience operation GETMERGE to httpfs. 
> This will make it simpler for external systems accessing HDFS over HTTP to 
> consume the output of an MR job as a single stream.
> It would have the same semantics as the 'hadoop fs -getmerge' command.
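For reference, 'hadoop fs -getmerge' concatenates the files under a directory, in name order, into a single output stream; a minimal local-filesystem sketch of those semantics (not the httpfs implementation):

```java
import java.io.File;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class GetMerge {
    // Concatenate the regular files directly under dir, sorted by name,
    // into out -- the same ordering 'fs -getmerge' uses for MR part files.
    public static void getMerge(Path dir, OutputStream out) throws IOException {
        File[] files = dir.toFile().listFiles(File::isFile);
        if (files == null) {
            return;  // not a directory, or I/O error listing it
        }
        Arrays.sort(files);  // lexicographic: part-00000, part-00001, ...
        for (File f : files) {
            Files.copy(f.toPath(), out);
        }
    }
}
```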





[jira] [Updated] (HDFS-2801) Provide a method in client side translators to check for a methods supported in underlying protocol.

2012-01-27 Thread Jitendra Nath Pandey (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-2801:
---

Status: Open  (was: Patch Available)

> Provide a method in client side translators to check for a methods supported 
> in underlying protocol.
> 
>
> Key: HDFS-2801
> URL: https://issues.apache.org/jira/browse/HDFS-2801
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: HDFS-2801.trunk.patch, HDFS-2801.trunk.patch, 
> HDFS-2801.trunk.patch
>
>
> This jira corresponds to HADOOP-7965. The client-side translators should 
> have a method, boolean isMethodSupported(String methodName), which returns 
> true if the given method is supported and available at the server.





[jira] [Commented] (HDFS-2843) Rename protobuf message StorageInfoProto to NodeInfoProto

2012-01-27 Thread Aaron T. Myers (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194997#comment-13194997
 ] 

Aaron T. Myers commented on HDFS-2843:
--

bq. This information is used by secondary namenode, checkpointer, backup node 
as well. Does that still make it "storage"?

I don't see how this fact makes it necessarily *not* storage. All those things 
have "storage" directories as well.

bq. Distributed upgrade was used only once long back and is not in use any 
more. It is the regular upgrade that considers layout version and CTime etc.

Sure, but the point still remains. What about during an upgrade process? What 
if this information (e.g. layoutVersion) differs between storage directories on 
a single node at some moment in time?

bq. I am not proposing any change in the VERSION file. All I am saying is, call 
this part NodeInfo instead of StorageInfo. All this information is still 
present in all the data structures.

I realize, but at least the layoutVersion field does seem 
storage-directory-specific, and hence should not be described as being at the 
node level.

bq. I plan to clean up the hierarchy of classes where StorageInfo is the super 
class of a bunch of other classes such as DataStorage, BlockPoolSliceStorage, 
Storage, etc.

Perhaps it would be clearer how this proposed name change makes sense if we 
could see how you envision the end state? At the moment, this incremental 
change doesn't make sense to me.

Also, what about my other concern from earlier? I can probably be convinced 
that a name change for StorageInfo is in order, but not StorageInfoProto in 
isolation:

{quote}
Regardless, if we do go ahead with a name change, it doesn't make sense to me 
that we would change StorageInfoProto but not StorageInfo itself, given that 
StorageInfoProto is really just a serialization class of StorageInfo. If the 
semantic of the name StorageInfoProto is wrong, then surely StorageInfo is as 
well, right?
{quote}

> Rename protobuf message StorageInfoProto to NodeInfoProto
> -
>
> Key: HDFS-2843
> URL: https://issues.apache.org/jira/browse/HDFS-2843
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node, name-node
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-2843.patch
>
>
> StorageInfoProto has cTime, layoutVersion, namespaceID and clusterID. This is 
> really information of a node that is part of the cluster, such as Namenode, 
> Standby/Secondary/Backup/Checkpointer and datanodes. To reflect this, I want 
> to rename it from StorageInfoProto to NodeInfoProto.





[jira] [Commented] (HDFS-2843) Rename protobuf message StorageInfoProto to NodeInfoProto

2012-01-27 Thread Suresh Srinivas (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194979#comment-13194979
 ] 

Suresh Srinivas commented on HDFS-2843:
---

bq. it's still information that's specific to "storage" in the general sense, 
although it's not specific to a single storage directory
This information is used by secondary namenode, checkpointer, backup node as 
well. Does that still make it "storage"?

bq. That said, what about during a distributed upgrade process which will 
change the layoutVersion?
Distributed upgrade was used only once, long ago, and is no longer in use. It 
is the regular upgrade that considers layout version, cTime, etc.

bq. Imagine a scenario where a DN dies during an upgrade while writing the 
VERSION files to its various storage directories, and hence only some subset of 
the directories get the updated VERSION file. Would we then need to know what 
layoutVersion is present in each storage directory separately? I haven't 
checked on how this case is handled - just thinking out loud.

I am not proposing any change to the VERSION file. All I am saying is: call 
this part NodeInfo instead of StorageInfo. All this information is still 
present in all the data structures. I plan to clean up the hierarchy of 
classes where StorageInfo is the super class of a bunch of other classes such 
as DataStorage, BlockPoolSliceStorage, Storage, etc.

> Rename protobuf message StorageInfoProto to NodeInfoProto
> -
>
> Key: HDFS-2843
> URL: https://issues.apache.org/jira/browse/HDFS-2843
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node, name-node
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: HDFS-2843.patch
>
>
> StorageInfoProto has cTime, layoutVersion, namespaceID and clusterID. This is 
> really information of a node that is part of the cluster, such as Namenode, 
> Standby/Secondary/Backup/Checkpointer and datanodes. To reflect this, I want 
> to rename it from StorageInfoProto to NodeInfoProto.





[jira] [Reopened] (HDFS-2802) Support for RW/RO snapshots in HDFS

2012-01-27 Thread Hari Mankude (Reopened) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Mankude reopened HDFS-2802:



I am (we are) well aware of HDFS-233. This jira was opened to provide a 
comprehensive snapshot solution (both RW and RO support) for HDFS. 

> Support for RW/RO snapshots in HDFS
> ---
>
> Key: HDFS-2802
> URL: https://issues.apache.org/jira/browse/HDFS-2802
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: name-node
>Reporter: Hari Mankude
>Assignee: Hari Mankude
> Fix For: 0.24.0
>
>
> Snapshots are point-in-time images of parts of the filesystem or the entire 
> filesystem. Snapshots can be a read-only or a read-write point-in-time copy 
> of the filesystem. There are several use cases for snapshots in HDFS. I will 
> post a detailed write-up soon with more information.





[jira] [Commented] (HDFS-2844) HA: TestSafeMode#testNoExtensionIfNoBlocks is failing

2012-01-27 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194955#comment-13194955
 ] 

Uma Maheswara Rao G commented on HDFS-2844:
---

No problem. Thanks, Aaron, for taking a look.
Resolving this issue makes sense to me. Thanks!

> HA: TestSafeMode#testNoExtensionIfNoBlocks is failing
> -
>
> Key: HDFS-2844
> URL: https://issues.apache.org/jira/browse/HDFS-2844
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Aaron T. Myers
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-2844-HDFS-1623.patch
>
>
> The test is timing out after 45 seconds. It's also failed in the last two 
> nightly builds.





[jira] [Commented] (HDFS-2827) Cannot save namespace after renaming a directory above a file with an open lease

2012-01-27 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194957#comment-13194957
 ] 

Uma Maheswara Rao G commented on HDFS-2827:
---

Aaron, could you please take a look?

> Cannot save namespace after renaming a directory above a file with an open 
> lease
> 
>
> Key: HDFS-2827
> URL: https://issues.apache.org/jira/browse/HDFS-2827
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.24.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-2827-test.patch, HDFS-2827.patch
>
>
> When I execute the following operations and wait for a checkpoint to complete:
> fs.mkdirs(new Path("/test1"));
> FSDataOutputStream create = fs.create(new Path("/test/abc.txt")); // don't close
> fs.rename(new Path("/test/"), new Path("/test1/"));
> checkpointing fails with the following exception:
> 2012-01-23 15:03:14,204 ERROR namenode.FSImage (FSImage.java:run(795)) - 
> Unable to save image for 
> E:\HDFS-1623\hadoop-hdfs-project\hadoop-hdfs\build\test\data\dfs\name3
> java.io.IOException: saveLeases found path /test1/est/abc.txt but no matching 
> entry in namespace.[/test1/est/abc.txt]
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveFilesUnderConstruction(FSNamesystem.java:4336)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormat$Saver.save(FSImageFormat.java:588)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.saveFSImage(FSImage.java:761)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage$FSImageSaver.run(FSImage.java:789)
>   at java.lang.Thread.run(Unknown Source)
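The failure mode is that the lease manager still keys the open file by its pre-rename path, while the namespace only knows the post-rename path, so saveLeases cannot resolve the leased path. A minimal, self-contained Java sketch of that inconsistency (all class and method names here are hypothetical, not the actual FSNamesystem/LeaseManager code):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model: a namespace of full paths and a lease table keyed by path string.
class LeaseRenameSketch {
    final Set<String> namespace = new HashSet<>();
    final Map<String, String> leases = new HashMap<>(); // path -> lease holder

    // Create a file and take a lease on it (file left open).
    void create(String path) {
        namespace.add(path);
        leases.put(path, "client-1");
    }

    // Rename every path under src to live under dst, but (as in the bug)
    // forget to update the lease table, leaving stale lease keys behind.
    void renameBuggy(String src, String dst) {
        Set<String> moved = new HashSet<>();
        for (String p : namespace) {
            if (p.startsWith(src)) moved.add(p);
        }
        for (String p : moved) {
            namespace.remove(p);
            namespace.add(dst + p.substring(src.length()));
        }
    }

    // saveLeases-style check: every leased path must exist in the namespace.
    boolean allLeasesResolvable() {
        for (String p : leases.keySet()) {
            if (!namespace.contains(p)) return false;
        }
        return true;
    }
}
```

Running create("/test/abc.txt") and then renameBuggy("/test", "/test1/test") leaves allLeasesResolvable() false, mirroring the "saveLeases found path ... but no matching entry in namespace" IOException above.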





[jira] [Resolved] (HDFS-2844) HA: TestSafeMode#testNoExtensionIfNoBlocks is failing

2012-01-27 Thread Aaron T. Myers (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers resolved HDFS-2844.
--

Resolution: Duplicate

Let's just resolve as a dupe. Fixing it here and in HDFS-2742 will just make 
the HDFS-2742 fix more difficult to merge.

Thanks, Uma, for looking into this. Sorry for the false alarm.






[jira] [Commented] (HDFS-2848) hdfs corruption appended to blocks is not detected by fs commands or fsck

2012-01-27 Thread Ravi Prakash (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194930#comment-13194930
 ] 

Ravi Prakash commented on HDFS-2848:


Nope. After the restart, the file size is still the correct, uncorrupted size :(

> hdfs corruption appended to blocks is not detected by fs commands or fsck
> -
>
> Key: HDFS-2848
> URL: https://issues.apache.org/jira/browse/HDFS-2848
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
>
> Courtesy Pat White
> {quote}
> Appears that there is a regression in corrupt block detection by both fsck 
> and fs cmds like 'cat'. Testcases for
> pre-block and block-overwrite corruption of all replicas is correctly 
> reporting errors however post-block corruption is
> not, fsck on the filesystem reports it's Healthy and 'cat' returns without 
> error. Looking at the DN blocks themselves,
> they clearly contain the injected corruption pattern.
> {quote}





[jira] [Commented] (HDFS-2848) hdfs corruption appended to blocks is not detected by fs commands or fsck

2012-01-27 Thread Daryn Sharp (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194923#comment-13194923
 ] 

Daryn Sharp commented on HDFS-2848:
---

After the restart, I suspect {{hadoop fs -ls}} on the file reports the new 
physical size of the block?






[jira] [Commented] (HDFS-2848) hdfs corruption appended to blocks is not detected by fs commands or fsck

2012-01-27 Thread Ravi Prakash (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194919#comment-13194919
 ] 

Ravi Prakash commented on HDFS-2848:


I was experimenting more and noticed that after restarting the cluster, if I 
try to cat the file, it finally notices the corruption. So we definitely have an 
inconsistency that we should fix.






[jira] [Commented] (HDFS-2844) HA: TestSafeMode#testNoExtensionIfNoBlocks is failing

2012-01-27 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194888#comment-13194888
 ] 

Uma Maheswara Rao G commented on HDFS-2844:
---

Thanks, Todd. I missed that comment in HDFS-2742.
Should we resolve this as a duplicate of HDFS-2742, or, since this issue is 
already filed, handle the fix separately in this JIRA?






[jira] [Commented] (HDFS-2848) hdfs corruption appended to blocks is not detected by fs commands or fsck

2012-01-27 Thread Daryn Sharp (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194881#comment-13194881
 ] 

Daryn Sharp commented on HDFS-2848:
---

Yes, the OP meant physically appending to the block.  I was referring to Hadoop 
appends because I think the design of append masks the OP's issue.






[jira] [Commented] (HDFS-2848) hdfs corruption appended to blocks is not detected by fs commands or fsck

2012-01-27 Thread Daryn Sharp (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194877#comment-13194877
 ] 

Daryn Sharp commented on HDFS-2848:
---

I'd suggest that the corrective action for an otherwise valid block should be a 
truncate rather than invalidate.






[jira] [Commented] (HDFS-2848) hdfs corruption appended to blocks is not detected by fs commands or fsck

2012-01-27 Thread Harsh J (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194875#comment-13194875
 ] 

Harsh J commented on HDFS-2848:
---

Hey Daryn,

Sorry to have misled you with the word 'append'; I loosely meant something 
like:

$ cat >> blk_XYZ_FOO
thisisbadlyappendeddataonexistingblock
^D

This is probably also what the OP is talking about.
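This kind of post-block append is exactly what a length-bounded checksum check cannot see. A self-contained Java sketch (hypothetical helper names, not the real DataNode or block scanner code) of why a verifier that stops at the recorded block length still passes, while a whole-file checksum would fail:

```java
import java.util.Arrays;
import java.util.zip.CRC32;

// Hypothetical illustration: a verifier that only checksums up to the
// recorded block length cannot notice bytes appended after the block's end.
class PostBlockCorruptionSketch {
    // CRC over the first len bytes of the block file.
    static long crcUpTo(byte[] file, int len) {
        CRC32 crc = new CRC32();
        crc.update(file, 0, len);
        return crc.getValue();
    }

    // Simulate `cat >> blk_XYZ`: garbage appended past the block's end.
    static byte[] appendGarbage(byte[] block, byte[] junk) {
        byte[] grown = Arrays.copyOf(block, block.length + junk.length);
        System.arraycopy(junk, 0, grown, block.length, junk.length);
        return grown;
    }
}
```

With the stored checksum computed at write time, crcUpTo(corrupted, originalLength) still matches it, which is consistent with fsck and cat reporting the file as healthy.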






[jira] [Commented] (HDFS-2844) HA: TestSafeMode#testNoExtensionIfNoBlocks is failing

2012-01-27 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194871#comment-13194871
 ] 

Todd Lipcon commented on HDFS-2844:
---

Sorry, I mentioned this in HDFS-2742: 
https://issues.apache.org/jira/browse/HDFS-2742?focusedCommentId=13182907&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13182907

The 2742 patch will also fix this issue once it's committed.






[jira] [Commented] (HDFS-2848) hdfs corruption appended to blocks is not detected by fs commands or fsck

2012-01-27 Thread Daryn Sharp (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194872#comment-13194872
 ] 

Daryn Sharp commented on HDFS-2848:
---

As best I can tell from a cursory read of the code, append will add to the 
block but not update the file size until the block is committed, i.e. the 
block fills or the stream is closed.  Client readers will only get the 
committed block size and data, which means the spurious bytes are "harmless" to 
a client.  I think an append will seek to the end of the committed data and 
then overwrite the spurious bytes.

I'm not a DN expert, but detecting the incorrectly sized blocks is probably 
something best left to fsck and/or the block scanner.  It might also be 
possible to have the NN issue a truncate in response to a block report that 
doesn't match the NN's view of the world.  Maybe Hadoop already does something 
like this.  A DN expert should weigh in.
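To make the committed-vs-physical distinction concrete, here is a self-contained Java sketch of the model described above (hypothetical names, not the real DataNode replica code): reads are bounded by the committed length, so garbage appended to the block file stays invisible to clients until something re-derives the length from the file itself, e.g. after a restart.

```java
import java.util.Arrays;

// Toy model of a replica: clients are served only up to the committed
// length, so bytes appended past it on disk are invisible to readers.
class ReplicaSketch {
    byte[] onDisk;        // physical bytes in the block file
    int committedLength;  // length the NameNode believes in

    ReplicaSketch(byte[] data) {
        onDisk = data;
        committedLength = data.length;
    }

    // Simulate external corruption: garbage appended to the block file
    // behind HDFS's back; the committed length does not change.
    void appendGarbage(byte[] junk) {
        byte[] grown = Arrays.copyOf(onDisk, onDisk.length + junk.length);
        System.arraycopy(junk, 0, grown, onDisk.length, junk.length);
        onDisk = grown;
    }

    // What a client read returns while the committed length is authoritative.
    byte[] clientRead() {
        return Arrays.copyOf(onDisk, committedLength);
    }

    // What a naive length re-derivation from the file size would report.
    int physicalLength() {
        return onDisk.length;
    }
}
```

In this model clientRead() never returns the appended junk, but physicalLength() disagrees with committedLength, which is the mismatch an fsck, block-scanner, or block-report check would have to flag.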






[jira] [Commented] (HDFS-2830) HA: Improvements for SBN web UI

2012-01-27 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194866#comment-13194866
 ] 

Todd Lipcon commented on HDFS-2830:
---

HDFS-2845 points out that we should remove the "browse filesystem" link and 
show a nicer error when hitting the browsedfs page.

> HA: Improvements for SBN web UI
> ---
>
> Key: HDFS-2830
> URL: https://issues.apache.org/jira/browse/HDFS-2830
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>
> A few improvements should be done to the NN web UI while it is in standby 
> mode:
> - Since the SBN doesn't compute replication queues, we shouldn't show 
> under-replicated/missing blocks or corrupt files
> - In an edits dir that is open for read only, we should probably mark it as 
> such
> - We should include the latest txid reflected in the namespace, as well as 
> expose some of the HA-related metrics (eg queued block reports)
> - We should include a link to the other NNs in the cluster, as well as the 
> NN's nameservice ID and namenode ID when available.





[jira] [Commented] (HDFS-2847) NamenodeProtocol#getBlocks() should use DatanodeID as an argument instead of DatanodeInfo

2012-01-27 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194863#comment-13194863
 ] 

Todd Lipcon commented on HDFS-2847:
---

+1 pending Jenkins results

> NamenodeProtocol#getBlocks() should use DatanodeID as an argument instead of 
> DatanodeInfo
> -
>
> Key: HDFS-2847
> URL: https://issues.apache.org/jira/browse/HDFS-2847
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Fix For: 0.24.0
>
> Attachments: HDFS-2847.txt
>
>
> DatanodeID is sufficient for identifying a Datanode. DatanodeInfo has a lot 
> of information that is not required.





[jira] [Commented] (HDFS-2825) Add test hook to turn off the writer preferring its local DN

2012-01-27 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194857#comment-13194857
 ] 

Todd Lipcon commented on HDFS-2825:
---

Yes, for create/append

> Add test hook to turn off the writer preferring its local DN
> 
>
> Key: HDFS-2825
> URL: https://issues.apache.org/jira/browse/HDFS-2825
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.23.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Fix For: 0.24.0, 0.23.1
>
> Attachments: hdfs-2825.txt, hdfs-2825.txt
>
>
> Currently, the default block placement policy always places the first replica 
> in the pipeline on the local node if there is a valid DN running there. In 
> some network designs, within-rack bandwidth is never constrained so this 
> doesn't give much of an advantage. It would also be really useful to disable 
> this for MiniDFSCluster tests, since currently if you start a multi-DN 
> cluster and write with replication level 1, all of the replicas go to the 
> same DN.
> _[per discussion below, this was changed to not add a config, but only to add 
> a hook for testing]_
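The switchable behavior can be sketched in a few lines of self-contained Java (all names here are hypothetical; the actual patch adds a test-only hook to the default BlockPlacementPolicy rather than a config):

```java
import java.util.List;
import java.util.Random;

// Hedged sketch of the placement decision for the first replica: the
// default policy prefers the writer's local node, and a test hook lets
// tests spread replicas across DNs instead.
class PlacementSketch {
    static boolean preferLocalNode = true; // test hook, default behavior on

    static String chooseFirstReplica(String writerNode, List<String> liveNodes) {
        if (preferLocalNode && liveNodes.contains(writerNode)) {
            return writerNode; // first replica lands on the local DN
        }
        // Otherwise fall back to a pseudo-random live node.
        return liveNodes.get(new Random().nextInt(liveNodes.size()));
    }
}
```

With the hook on, a MiniDFSCluster writing at replication 1 puts every replica on the writer's DN; turning it off lets the replicas spread, which is the test scenario described above.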





[jira] [Updated] (HDFS-2851) After Balancer runs, usedSpace is not balancing correctly.

2012-01-27 Thread Uma Maheswara Rao G (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-2851:
--

  Component/s: balancer, data-node, ha, name-node
Affects Version/s: HA branch (HDFS-1623)

> After Balancer runs, usedSpace is not balancing correctly.
> --
>
> Key: HDFS-2851
> URL: https://issues.apache.org/jira/browse/HDFS-2851
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer, data-node, ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>
> After Balancer runs, usedSpace is not balancing correctly.
> {code}
> java.util.concurrent.TimeoutException: Cluster failed to reached expected 
> values of totalSpace (current: 1500, expected: 1500), or usedSpace (current: 
> 390, expected: 300), in more than 2 msec.
>   at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancer.waitForHeartBeat(TestBalancer.java:233)
>   at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes.testBalancerWithHANameNodes(TestBalancerWithHANameNodes.java:99)
> {code}





[jira] [Created] (HDFS-2851) After Balancer runs, usedSpace is not balancing correctly.

2012-01-27 Thread Uma Maheswara Rao G (Created) (JIRA)
After Balancer runs, usedSpace is not balancing correctly.
--

 Key: HDFS-2851
 URL: https://issues.apache.org/jira/browse/HDFS-2851
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G


After Balancer runs, usedSpace is not balancing correctly.

{code}
java.util.concurrent.TimeoutException: Cluster failed to reached expected 
values of totalSpace (current: 1500, expected: 1500), or usedSpace (current: 
390, expected: 300), in more than 2 msec.
at 
org.apache.hadoop.hdfs.server.balancer.TestBalancer.waitForHeartBeat(TestBalancer.java:233)
at 
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes.testBalancerWithHANameNodes(TestBalancerWithHANameNodes.java:99)
{code}





[jira] [Commented] (HDFS-2837) mvn javadoc:javadoc not seeing LimitedPrivate class

2012-01-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194691#comment-13194691
 ] 

Hudson commented on HDFS-2837:
--

Integrated in Hadoop-Mapreduce-0.23-Build #173 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/173/])
Merge -r 1236337:1236338 from trunk to branch. FIXES: HDFS-2837

tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1236339
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/dev-support/test-patch.properties
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs-httpfs/pom.xml
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> mvn javadoc:javadoc not seeing LimitedPrivate class 
> 
>
> Key: HDFS-2837
> URL: https://issues.apache.org/jira/browse/HDFS-2837
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.24.0
>Reporter: Robert Joseph Evans
>Assignee: Robert Joseph Evans
> Fix For: 0.23.1
>
> Attachments: HDFS-2837.txt
>
>
> mvn javadoc:javadoc not seeing LimitedPrivate class 
> {noformat}
> [WARNING] 
> org/apache/hadoop/fs/FileSystem.class(org/apache/hadoop/fs:FileSystem.class): 
> warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate': class 
> file for org.apache.hadoop.classification.InterfaceAudience not found
> [WARNING] 
> org/apache/hadoop/fs/FileSystem.class(org/apache/hadoop/fs:FileSystem.class): 
> warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FileSystem.class(org/apache/hadoop/fs:FileSystem.class): 
> warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/security/Groups.class(org/apache/hadoop/security:Groups.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/security/UserGroupInformation.class(org/apache/hadoop/security:UserGroupInformation.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/security/UserGroupInformation.class(org/apache/hadoop/security:UserGroupInformation.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/security/UserGroupInformation.class(org/apache/hadoop/security:UserGroupInformation.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FSDataInputStream.class(org/apache/hadoop/fs:FSDataInputStream.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FSDataOutputStream.class(org/apache/hadoop/fs:FSDataOutputStream.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] org/apache/hadoop/fs/Path.class(org/apache/hadoop/fs:Path.class): 
> warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/UnresolvedLinkException.class(org/apache/hadoop/fs:UnresolvedLinkException.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/MD5MD5CRC32FileChecksum.class(org/apache/hadoop/fs:MD5MD5CRC32FileChecksum.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/LocalDirAllocator.class(org/apache/hadoop/fs:LocalDirAllocator.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FileContext.class(org/apache/hadoop/fs:FileContext.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FileContext.class(org/apache/hadoop/fs:FileContext.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FSOutputSu

[jira] [Commented] (HDFS-2836) HttpFSServer still has 2 javadoc warnings in trunk

2012-01-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194694#comment-13194694
 ] 

Hudson commented on HDFS-2836:
--

Integrated in Hadoop-Mapreduce-0.23-Build #173 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/173/])
Merge -r 1236327:1236328 from trunk to branch. FIXES: HDFS-2836

tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1236331
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSServer.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> HttpFSServer still has 2 javadoc warnings in trunk
> --
>
> Key: HDFS-2836
> URL: https://issues.apache.org/jira/browse/HDFS-2836
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.24.0
>Reporter: Robert Joseph Evans
>Assignee: Robert Joseph Evans
> Fix For: 0.23.1
>
> Attachments: HDFS-2836.txt
>
>
> {noformat}
> [WARNING] 
> hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSServer.java:241:
>  warning - @param argument "override," is not a parameter name.
> [WARNING] 
> hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSServer.java:450:
>  warning - @param argument "override," is not a parameter name.
> {noformat}
> These are causing other patches to get a -1 in automated testing.





[jira] [Commented] (HDFS-2837) mvn javadoc:javadoc not seeing LimitedPrivate class

2012-01-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194671#comment-13194671
 ] 

Hudson commented on HDFS-2837:
--

Integrated in Hadoop-Mapreduce-trunk #971 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/971/])
HDFS-2837. mvn javadoc:javadoc not seeing LimitedPrivate class (revans2 via 
tucu)

tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1236338
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/dev-support/test-patch.properties
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/pom.xml
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> mvn javadoc:javadoc not seeing LimitedPrivate class 
> 
>
> Key: HDFS-2837
> URL: https://issues.apache.org/jira/browse/HDFS-2837
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.24.0
>Reporter: Robert Joseph Evans
>Assignee: Robert Joseph Evans
> Fix For: 0.23.1
>
> Attachments: HDFS-2837.txt
>
>
> mvn javadoc:javadoc not seeing LimitedPrivate class 
> {noformat}
> [WARNING] 
> org/apache/hadoop/fs/FileSystem.class(org/apache/hadoop/fs:FileSystem.class): 
> warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate': class 
> file for org.apache.hadoop.classification.InterfaceAudience not found
> [WARNING] 
> org/apache/hadoop/fs/FileSystem.class(org/apache/hadoop/fs:FileSystem.class): 
> warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FileSystem.class(org/apache/hadoop/fs:FileSystem.class): 
> warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/security/Groups.class(org/apache/hadoop/security:Groups.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/security/UserGroupInformation.class(org/apache/hadoop/security:UserGroupInformation.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/security/UserGroupInformation.class(org/apache/hadoop/security:UserGroupInformation.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/security/UserGroupInformation.class(org/apache/hadoop/security:UserGroupInformation.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FSDataInputStream.class(org/apache/hadoop/fs:FSDataInputStream.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FSDataOutputStream.class(org/apache/hadoop/fs:FSDataOutputStream.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] org/apache/hadoop/fs/Path.class(org/apache/hadoop/fs:Path.class): 
> warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/UnresolvedLinkException.class(org/apache/hadoop/fs:UnresolvedLinkException.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/MD5MD5CRC32FileChecksum.class(org/apache/hadoop/fs:MD5MD5CRC32FileChecksum.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/LocalDirAllocator.class(org/apache/hadoop/fs:LocalDirAllocator.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FileContext.class(org/apache/hadoop/fs:FileContext.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FileContext.class(org/apache/hadoop/fs:FileContext.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FSOutputSummer.class(org/apache/hadoop/fs:FSOutpu

[jira] [Commented] (HDFS-2836) HttpFSServer still has 2 javadoc warnings in trunk

2012-01-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194672#comment-13194672
 ] 

Hudson commented on HDFS-2836:
--

Integrated in Hadoop-Mapreduce-trunk #971 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/971/])
HDFS-2836. HttpFSServer still has 2 javadoc warnings in trunk (revans2 via 
tucu)

tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1236328
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSServer.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> HttpFSServer still has 2 javadoc warnings in trunk
> --
>
> Key: HDFS-2836
> URL: https://issues.apache.org/jira/browse/HDFS-2836
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.24.0
>Reporter: Robert Joseph Evans
>Assignee: Robert Joseph Evans
> Fix For: 0.23.1
>
> Attachments: HDFS-2836.txt
>
>
> {noformat}
> [WARNING] 
> hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSServer.java:241:
>  warning - @param argument "override," is not a parameter name.
> [WARNING] 
> hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSServer.java:450:
>  warning - @param argument "override," is not a parameter name.
> {noformat}
> These are causing other patches to get a -1 in automated testing.





[jira] [Commented] (HDFS-2838) NPE in FSNamesystem when in safe mode

2012-01-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194665#comment-13194665
 ] 

Hudson commented on HDFS-2838:
--

Integrated in Hadoop-Hdfs-HAbranch-build #60 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-HAbranch-build/60/])
HDFS-2838. NPE in FSNamesystem when in safe mode. Contributed by Gregory 
Chanan

eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1236450
Files : 
* 
/hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-1623.txt
* 
/hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestMiniDFSCluster.java


> NPE in FSNamesystem when in safe mode
> -
>
> Key: HDFS-2838
> URL: https://issues.apache.org/jira/browse/HDFS-2838
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Gregory Chanan
>Assignee: Gregory Chanan
> Attachments: HDFS-2838-v2.patch, HDFS-2838.patch
>
>
> I'm seeing an NPE when running HBase 0.92 unit tests against the HA branch.  
> The test failure is: 
> org.apache.hadoop.hbase.regionserver.wal.TestHLog.testAppendClose.
> Here is the backtrace:
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.size(BlocksMap.java:179)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.getActiveBlockCount(BlockManager.java:2465)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.doConsistencyCheck(FSNamesystem.java:3591)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.isOn(FSNamesystem.java:3285)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.access$900(FSNamesystem.java:3196)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.isInSafeMode(FSNamesystem.java:3670)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.isInSafeMode(NameNode.java:609)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.isNameNodeUp(MiniDFSCluster.java:1476)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.isClusterUp(MiniDFSCluster.java:1487)
> Here is the relevant section of the test:
> {code}
>try {
>   DistributedFileSystem dfs = (DistributedFileSystem) 
> cluster.getFileSystem();
>   dfs.setSafeMode(FSConstants.SafeModeAction.SAFEMODE_ENTER);
>   cluster.shutdown();
>   try {
> // wal.writer.close() will throw an exception,
> // but still call this since it closes the LogSyncer thread first
> wal.close();
>   } catch (IOException e) {
> LOG.info(e);
>   }
>   fs.close(); // closing FS last so DFSOutputStream can't call close
>   LOG.info("STOPPED first instance of the cluster");
> } finally {
>   // Restart the cluster
>   while (cluster.isClusterUp()){
> LOG.error("Waiting for cluster to go down");
> Thread.sleep(1000);
>   }
> {code}
> Fix looks trivial, will include patch shortly.
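The trivial fix alluded to above is presumably a null guard at the point where BlocksMap.size() dereferences its internal map after shutdown. A minimal sketch of that idea (hypothetical class and field names, not the actual HDFS patch):

```java
// Hypothetical sketch of guarding BlocksMap.size() against a shut-down map.
// The class and field names here are illustrative, not the real HDFS code.
public class BlocksMapSketch {
    static class BlocksMap {
        private java.util.Map<Long, Object> blocks = new java.util.HashMap<>();

        // shutdown drops the map, as the backtrace suggests
        void close() {
            blocks = null;
        }

        // guard avoids the NPE seen in the backtrace above
        int size() {
            return blocks != null ? blocks.size() : 0;
        }
    }

    public static void main(String[] args) {
        BlocksMap m = new BlocksMap();
        m.close();                    // corresponds to cluster.shutdown()
        System.out.println(m.size()); // prints 0 instead of throwing NPE
    }
}
```

With a guard like this, isInSafeMode() can be polled safely while the MiniDFSCluster is shutting down, which is exactly what isClusterUp() does in the test loop above.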





[jira] [Commented] (HDFS-2805) HA: Add a test for a federated cluster with HA NNs

2012-01-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194666#comment-13194666
 ] 

Hudson commented on HDFS-2805:
--

Integrated in Hadoop-Hdfs-HAbranch-build #60 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-HAbranch-build/60/])
HDFS-2805. Add a test for a federated cluster with HA NNs. Contributed by 
Brandon Li.

jitendra : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1236471
Files : 
* 
/hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-1623.txt
* 
/hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java
* 
/hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSNNTopology.java
* 
/hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/HATestUtil.java
* 
/hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestHAStateTransitions.java


> HA: Add a test for a federated cluster with HA NNs
> --
>
> Key: HDFS-2805
> URL: https://issues.apache.org/jira/browse/HDFS-2805
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Aaron T. Myers
>Assignee: Brandon Li
> Attachments: HDFS-2805-HDFS-1623.patch, HDFS-2805.3.txt, 
> HDFS-2805.4.txt, HDFS-2805.second.txt, HDFS-2805.txt
>
>
> Add a test for configuring/interacting with a federated cluster wherein each 
> name service is itself HA.





[jira] [Commented] (HDFS-2836) HttpFSServer still has 2 javadoc warnings in trunk

2012-01-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194659#comment-13194659
 ] 

Hudson commented on HDFS-2836:
--

Integrated in Hadoop-Hdfs-0.23-Build #151 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/151/])
Merge -r 1236327:1236328 from trunk to branch. FIXES: HDFS-2836

tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1236331
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSServer.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> HttpFSServer still has 2 javadoc warnings in trunk
> --
>
> Key: HDFS-2836
> URL: https://issues.apache.org/jira/browse/HDFS-2836
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.24.0
>Reporter: Robert Joseph Evans
>Assignee: Robert Joseph Evans
> Fix For: 0.23.1
>
> Attachments: HDFS-2836.txt
>
>
> {noformat}
> [WARNING] 
> hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSServer.java:241:
>  warning - @param argument "override," is not a parameter name.
> [WARNING] 
> hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSServer.java:450:
>  warning - @param argument "override," is not a parameter name.
> {noformat}
> These are causing other patches to get a -1 in automated testing.





[jira] [Commented] (HDFS-2837) mvn javadoc:javadoc not seeing LimitedPrivate class

2012-01-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194656#comment-13194656
 ] 

Hudson commented on HDFS-2837:
--

Integrated in Hadoop-Hdfs-0.23-Build #151 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/151/])
Merge -r 1236337:1236338 from trunk to branch. FIXES: HDFS-2837

tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1236339
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/dev-support/test-patch.properties
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs-httpfs/pom.xml
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> mvn javadoc:javadoc not seeing LimitedPrivate class 
> 
>
> Key: HDFS-2837
> URL: https://issues.apache.org/jira/browse/HDFS-2837
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.24.0
>Reporter: Robert Joseph Evans
>Assignee: Robert Joseph Evans
> Fix For: 0.23.1
>
> Attachments: HDFS-2837.txt
>
>
> mvn javadoc:javadoc not seeing LimitedPrivate class 
> {noformat}
> [WARNING] 
> org/apache/hadoop/fs/FileSystem.class(org/apache/hadoop/fs:FileSystem.class): 
> warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate': class 
> file for org.apache.hadoop.classification.InterfaceAudience not found
> [WARNING] 
> org/apache/hadoop/fs/FileSystem.class(org/apache/hadoop/fs:FileSystem.class): 
> warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FileSystem.class(org/apache/hadoop/fs:FileSystem.class): 
> warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/security/Groups.class(org/apache/hadoop/security:Groups.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/security/UserGroupInformation.class(org/apache/hadoop/security:UserGroupInformation.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/security/UserGroupInformation.class(org/apache/hadoop/security:UserGroupInformation.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/security/UserGroupInformation.class(org/apache/hadoop/security:UserGroupInformation.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FSDataInputStream.class(org/apache/hadoop/fs:FSDataInputStream.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FSDataOutputStream.class(org/apache/hadoop/fs:FSDataOutputStream.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] org/apache/hadoop/fs/Path.class(org/apache/hadoop/fs:Path.class): 
> warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/UnresolvedLinkException.class(org/apache/hadoop/fs:UnresolvedLinkException.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/MD5MD5CRC32FileChecksum.class(org/apache/hadoop/fs:MD5MD5CRC32FileChecksum.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/LocalDirAllocator.class(org/apache/hadoop/fs:LocalDirAllocator.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FileContext.class(org/apache/hadoop/fs:FileContext.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FileContext.class(org/apache/hadoop/fs:FileContext.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FSOutputSummer.class

[jira] [Commented] (HDFS-2837) mvn javadoc:javadoc not seeing LimitedPrivate class

2012-01-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194637#comment-13194637
 ] 

Hudson commented on HDFS-2837:
--

Integrated in Hadoop-Hdfs-trunk #938 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/938/])
HDFS-2837. mvn javadoc:javadoc not seeing LimitedPrivate class (revans2 via 
tucu)

tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1236338
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/dev-support/test-patch.properties
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/pom.xml
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> mvn javadoc:javadoc not seeing LimitedPrivate class 
> 
>
> Key: HDFS-2837
> URL: https://issues.apache.org/jira/browse/HDFS-2837
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.24.0
>Reporter: Robert Joseph Evans
>Assignee: Robert Joseph Evans
> Fix For: 0.23.1
>
> Attachments: HDFS-2837.txt
>
>
> mvn javadoc:javadoc not seeing LimitedPrivate class 
> {noformat}
> [WARNING] 
> org/apache/hadoop/fs/FileSystem.class(org/apache/hadoop/fs:FileSystem.class): 
> warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate': class 
> file for org.apache.hadoop.classification.InterfaceAudience not found
> [WARNING] 
> org/apache/hadoop/fs/FileSystem.class(org/apache/hadoop/fs:FileSystem.class): 
> warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FileSystem.class(org/apache/hadoop/fs:FileSystem.class): 
> warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/security/Groups.class(org/apache/hadoop/security:Groups.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/security/UserGroupInformation.class(org/apache/hadoop/security:UserGroupInformation.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/security/UserGroupInformation.class(org/apache/hadoop/security:UserGroupInformation.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/security/UserGroupInformation.class(org/apache/hadoop/security:UserGroupInformation.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FSDataInputStream.class(org/apache/hadoop/fs:FSDataInputStream.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FSDataOutputStream.class(org/apache/hadoop/fs:FSDataOutputStream.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] org/apache/hadoop/fs/Path.class(org/apache/hadoop/fs:Path.class): 
> warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/UnresolvedLinkException.class(org/apache/hadoop/fs:UnresolvedLinkException.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/MD5MD5CRC32FileChecksum.class(org/apache/hadoop/fs:MD5MD5CRC32FileChecksum.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/LocalDirAllocator.class(org/apache/hadoop/fs:LocalDirAllocator.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FileContext.class(org/apache/hadoop/fs:FileContext.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FileContext.class(org/apache/hadoop/fs:FileContext.class):
>  warning: Cannot find annotation method 'value()' in type 
> 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate'
> [WARNING] 
> org/apache/hadoop/fs/FSOutputSummer.class(org/apache/hadoop/fs:FSOutputSummer.cl

[jira] [Commented] (HDFS-2836) HttpFSServer still has 2 javadoc warnings in trunk

2012-01-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194638#comment-13194638
 ] 

Hudson commented on HDFS-2836:
--

Integrated in Hadoop-Hdfs-trunk #938 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/938/])
HDFS-2836. HttpFSServer still has 2 javadoc warnings in trunk (revans2 via 
tucu)

tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1236328
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSServer.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> HttpFSServer still has 2 javadoc warnings in trunk
> --
>
> Key: HDFS-2836
> URL: https://issues.apache.org/jira/browse/HDFS-2836
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.24.0
>Reporter: Robert Joseph Evans
>Assignee: Robert Joseph Evans
> Fix For: 0.23.1
>
> Attachments: HDFS-2836.txt
>
>
> {noformat}
> [WARNING] 
> hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSServer.java:241:
>  warning - @param argument "override," is not a parameter name.
> [WARNING] 
> hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSServer.java:450:
>  warning - @param argument "override," is not a parameter name.
> {noformat}
> These are causing other patches to get a -1 in automated testing.





[jira] [Updated] (HDFS-2844) HA: TestSafeMode#testNoExtensionIfNoBlocks is failing

2012-01-27 Thread Uma Maheswara Rao G (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-2844:
--

Attachment: HDFS-2844-HDFS-1623.patch

Attached a patch reflecting the analysis; the test also passes with this 
change.

> HA: TestSafeMode#testNoExtensionIfNoBlocks is failing
> -
>
> Key: HDFS-2844
> URL: https://issues.apache.org/jira/browse/HDFS-2844
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Aaron T. Myers
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-2844-HDFS-1623.patch
>
>
> The test is timing out after 45 seconds. It's also failed in the last two 
> nightly builds.





[jira] [Commented] (HDFS-2844) HA: TestSafeMode#testNoExtensionIfNoBlocks is failing

2012-01-27 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194546#comment-13194546
 ] 

Uma Maheswara Rao G commented on HDFS-2844:
---

I just debugged this issue. It looks like, when we restart the NameNode, it 
checks for safemode during startup:

at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.checkMode(FSNamesystem.java:3440)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.setBlockTotal(FSNamesystem.java:3483)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.access$8(FSNamesystem.java:3478)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setBlockTotal(FSNamesystem.java:3759)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:94)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:685)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:634)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:245)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:441)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:380)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:351)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:385)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.&lt;init&gt;(NameNode.java:540)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.&lt;init&gt;(NameNode.java:526)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:836)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:1291)

Here, before the nnResourceChecker is actually initialized, the 
hasResouceAvailable flag is consulted. Since it has not been initialized 
yet, the flag is false and needEnter returns true in 
FSNamesystem$SafeModeInfo#checkMode. So the NameNode initially enters 
safemode and then has to wait out the safemode extension before leaving.
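The effect described above can be sketched as follows. This is a simplified stand-in for FSNamesystem$SafeModeInfo#checkMode, with an assumed flag name; the real method weighs more conditions:

```java
// Simplified sketch of why an uninitialized resource flag forces safemode.
// The flag and method here are illustrative stand-ins for the real
// FSNamesystem$SafeModeInfo#checkMode logic.
public class CheckModeSketch {
    // not yet set by the NN resource checker at this point in startup
    static boolean hasResourcesAvailable = false;

    static boolean needEnter(long blockSafe, long blockThreshold) {
        // enter safemode when below the block threshold OR resources
        // appear unavailable (which they do, spuriously, before init)
        return blockSafe < blockThreshold || !hasResourcesAvailable;
    }

    public static void main(String[] args) {
        // even with all blocks reported, the uninitialized flag forces entry
        System.out.println(needEnter(10, 10)); // true

        hasResourcesAvailable = true;          // after the checker initializes
        System.out.println(needEnter(10, 10)); // false
    }
}
```

This matches the observation: the restart path enters safemode spuriously, and the test then times out waiting for the safemode extension to elapse.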

Thanks
Uma

> HA: TestSafeMode#testNoExtensionIfNoBlocks is failing
> -
>
> Key: HDFS-2844
> URL: https://issues.apache.org/jira/browse/HDFS-2844
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Aaron T. Myers
>Assignee: Uma Maheswara Rao G
>
> The test is timing out after 45 seconds. It's also failed in the last two 
> nightly builds.





[jira] [Created] (HDFS-2850) Clients can hang in close if processDatanodeError throws Exception ( ex: OOME).

2012-01-27 Thread Uma Maheswara Rao G (Created) (JIRA)
Clients can hang in close if processDatanodeError throws Exception ( ex: OOME).
---

 Key: HDFS-2850
 URL: https://issues.apache.org/jira/browse/HDFS-2850
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs client
Affects Versions: 1.0.1
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G


I ran into a situation where DataStreamer#processDatanodeError throws an 
OOME while creating the ResponseProcessor thread. As a result, the 
DataStreamer thread died, and when the client closed the stream it waited 
forever.

This appears to be because, on close, the client enqueues one packet marked 
as the last packet and then waits for its ack. With the DataStreamer thread 
dead, nothing processes that packet from the dataQueue, so no ack ever 
arrives and close() blocks indefinitely.

I have seen this in the 20.2 version. On checking, the problem does not 
exist in trunk, since processDatanodeError is already guarded with 
try/catch. The problem can still exist in branch-1.
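The trunk-style guard mentioned above can be sketched like this. The method and field names are hypothetical, not the actual DFSClient code; the point is only that catching Throwable around the error handler keeps the failure visible instead of silently killing the streamer thread:

```java
// Sketch of guarding the streamer's error handler so a Throwable (e.g. an
// OOME from thread creation) marks the stream failed rather than killing
// the DataStreamer thread and leaving close() waiting for an ack forever.
// Names are illustrative, not the real DFSClient internals.
public class StreamerGuardSketch {
    static boolean streamFailed = false;

    static void processDatanodeError() {
        // simulate the failure mode from the report: OOME while
        // creating the ResponseProcessor thread
        throw new OutOfMemoryError("simulated: cannot create ResponseProcessor");
    }

    static void guardedStreamerStep() {
        try {
            processDatanodeError();
        } catch (Throwable t) {
            // surface the failure so close() sees an error instead of hanging
            streamFailed = true;
        }
    }

    public static void main(String[] args) {
        guardedStreamerStep();
        System.out.println(streamFailed); // true: failure surfaced, no hang
    }
}
```

In the unguarded (20.2/branch-1) version, the same Throwable would unwind the streamer thread's run loop, and the close path would block waiting for an ack that can never arrive.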
