[jira] [Commented] (HDFS-2450) Only complete hostname is supported to access data via hdfs://

2012-05-10 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272690#comment-13272690
 ] 

Daryn Sharp commented on HDFS-2450:
---

I'll investigate.

> Only complete hostname is supported to access data via hdfs://
> --
>
> Key: HDFS-2450
> URL: https://issues.apache.org/jira/browse/HDFS-2450
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.20.205.0
>Reporter: Rajit Saha
>Assignee: Daryn Sharp
> Fix For: 1.0.0
>
> Attachments: HDFS-2450-1.patch, HDFS-2450-2.patch, HDFS-2450-3.patch, 
> HDFS-2450-4.patch, HDFS-2450-5.patch, HDFS-2450.patch, IP vs. Hostname.pdf
>
>
> If my complete hostname is host1.abc.xyz.com, only the complete hostname can be 
> used to access data via hdfs://.
> I am running the following on a .20.205 client to get data from a .20.205 NN (host1):
> $hadoop dfs -copyFromLocal /etc/passwd  hdfs://host1/tmp
> copyFromLocal: Wrong FS: hdfs://host1/tmp, expected: hdfs://host1.abc.xyz.com
> Usage: java FsShell [-copyFromLocal <localsrc> ... <dst>]
> $hadoop dfs -copyFromLocal /etc/passwd  hdfs://host1.abc/tmp/
> copyFromLocal: Wrong FS: hdfs://host1.blue/tmp/1, expected: 
> hdfs://host1.abc.xyz.com
> Usage: java FsShell [-copyFromLocal <localsrc> ... <dst>]
> $hadoop dfs -copyFromLocal /etc/passwd  hftp://host1.abc.xyz/tmp/
> copyFromLocal: Wrong FS: hdfs://host1.blue/tmp/1, expected: 
> hdfs://host1.abc.xyz.com
> Usage: java FsShell [-copyFromLocal <localsrc> ... <dst>]
> Only the following is supported: 
> $hadoop dfs -copyFromLocal /etc/passwd  hdfs://host1.abc.xyz.com/tmp/
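A minimal sketch of why the shorter names are rejected (illustrative only, not the 
actual Hadoop source): the client-side path check compares the URI authority 
literally against the default filesystem's canonical authority, so any 
non-canonical spelling of the host fails.

{code}
import java.net.URI;

public class WrongFsSketch {
  // Illustrative check; the real FileSystem.checkPath() is more involved.
  static void checkPath(URI expected, URI given) {
    String want = expected.getAuthority();
    String got = given.getAuthority();
    if (got != null && !got.equalsIgnoreCase(want)) {
      throw new IllegalArgumentException(
          "Wrong FS: " + given + ", expected: " + expected);
    }
  }

  public static void main(String[] args) {
    URI nn = URI.create("hdfs://host1.abc.xyz.com");
    checkPath(nn, URI.create("hdfs://host1.abc.xyz.com/tmp")); // passes
    checkPath(nn, URI.create("hdfs://host1/tmp"));             // throws "Wrong FS"
  }
}
{code}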

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3369) change variable names referring to inode in blockmanagement to more appropriate

2012-05-10 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3369:
-

Target Version/s: 2.0.0, 3.0.0  (was: 3.0.0, 2.0.0)
Hadoop Flags: Reviewed

+1 patch looks good.  All changes are simple renaming or comment updates.

> change variable names referring to inode in blockmanagement to more 
> appropriate
> ---
>
> Key: HDFS-3369
> URL: https://issues.apache.org/jira/browse/HDFS-3369
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 2.0.0, 3.0.0
>Reporter: John George
>Assignee: John George
>Priority: Minor
> Attachments: HDFS-3369.patch
>
>
> We should rename BlocksMap.getINode(..) and, in addition, the local variable 
> names such as fileInode to match 'block collection'
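A hedged sketch of the kind of rename intended (the "after" names are assumptions, 
not necessarily the committed patch):

{code}
// Before: block management code reaches for an INode by name.
//   INodeFile fileInode = blocksMap.getINode(block);
// After (hypothetical): the same lookup, named in block-collection terms.
//   BlockCollection bc = blocksMap.getBlockCollection(block);
{code}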

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3372) offlineEditsViewer should be able to read a binary edits file with recovery mode

2012-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272676#comment-13272676
 ] 

Hadoop QA commented on HDFS-3372:
-

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12526388/HDFS-3372.002.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2407//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2407//console

This message is automatically generated.

> offlineEditsViewer should be able to read a binary edits file with recovery 
> mode
> 
>
> Key: HDFS-3372
> URL: https://issues.apache.org/jira/browse/HDFS-3372
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-3372.001.patch, HDFS-3372.002.patch
>
>
> It would be nice if oev (the offline edits viewer) had a switch that allowed 
> us to read a binary edits file using recovery mode.  oev can be very useful 
> when working with corrupt or messed up edit log files, and this would make it 
> even more so.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3369) change variable names referring to inode in blockmanagement to more appropriate

2012-05-10 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272675#comment-13272675
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-3369:
--

The patch conflicted with HDFS-3157, so the compilation failed.  HDFS-3157 
is now reverted and the patch compiles.  Let me start a new Jenkins 
build.

> change variable names referring to inode in blockmanagement to more 
> appropriate
> ---
>
> Key: HDFS-3369
> URL: https://issues.apache.org/jira/browse/HDFS-3369
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 2.0.0, 3.0.0
>Reporter: John George
>Assignee: John George
>Priority: Minor
> Attachments: HDFS-3369.patch
>
>
> We should rename BlocksMap.getINode(..) and, in addition, the local variable 
> names such as fileInode to match 'block collection'

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3335) check for edit log corruption at the end of the log

2012-05-10 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3335:
---

Attachment: HDFS-3335.007.patch

* EditLogFileInputStream: check for txid >= highest txid, not just equal

* FSEditLogOp: handle RuntimeExceptions thrown from readOp.
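A hedged sketch of the second bullet (structure assumed, not the patch itself): a 
RuntimeException escaping readOp is rewrapped so callers see ordinary edit-log 
corruption handling instead of a crash.

{code}
import java.io.IOException;

public class ReadOpSketch {
  interface OpReader { Object readOp() throws IOException; }

  // Rewrap malformed-byte failures so they surface as log corruption.
  static Object safeReadOp(OpReader reader) throws IOException {
    try {
      return reader.readOp();
    } catch (RuntimeException e) {
      throw new IOException("Failed to decode edit log operation", e);
    }
  }
}
{code}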

> check for edit log corruption at the end of the log
> ---
>
> Key: HDFS-3335
> URL: https://issues.apache.org/jira/browse/HDFS-3335
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-3335-b1.001.patch, HDFS-3335-b1.002.patch, 
> HDFS-3335-b1.003.patch, HDFS-3335.001.patch, HDFS-3335.002.patch, 
> HDFS-3335.003.patch, HDFS-3335.004.patch, HDFS-3335.005.patch, 
> HDFS-3335.006.patch, HDFS-3335.007.patch
>
>
> Even after encountering an OP_INVALID, we should check the end of the edit 
> log to make sure that it contains no more edits.
> This will catch things like rare race conditions or log corruptions that 
> would otherwise remain undetected.  They will go from being silent data loss 
> scenarios to being cases that we can detect and fix.
> Using recovery mode, we can choose to ignore the end of the log if necessary.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3391) Failing tests in branch-2

2012-05-10 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272641#comment-13272641
 ] 

Eli Collins commented on HDFS-3391:
---

Yea, TestPipelinesFailover fails for me when I loop it, e.g. on the 3rd or 4th 
iteration.

> Failing tests in branch-2
> -
>
> Key: HDFS-3391
> URL: https://issues.apache.org/jira/browse/HDFS-3391
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Arun C Murthy
>Priority: Critical
>
> Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec <<< 
> FAILURE!
> --
> Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
> Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec 
> <<< FAILURE!
> --

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3230) Cleanup DatanodeID creation in the tests

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272630#comment-13272630
 ] 

Hudson commented on HDFS-3230:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2239 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2239/])
HDFS-3230. Cleanup DatanodeID creation in the tests. Contributed by Eli 
Collins (Revision 1336815)

 Result = ABORTED
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336815
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/JsonUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReplaceDatanodeOnFailure.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/token/block/TestBlockToken.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHost2NodesMap.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestPendingDataNodeMessages.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/net/TestNetworkTopology.java


> Cleanup DatanodeID creation in the tests
> 
>
> Key: HDFS-3230
> URL: https://issues.apache.org/jira/browse/HDFS-3230
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Eli Collins
>Assignee: Eli Collins
>Priority: Minor
> Attachments: hdfs-3230.txt, hdfs-3230.txt
>
>
> A lot of tests create dummy DatanodeIDs for testing, often using bogus values 
> when creating the objects (e.g. a hostname in the IP field), which they can get 
> away with because the IDs aren't actually used. Let's add a test utility 
> method for creating a DatanodeID for testing and use it throughout.
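A hedged sketch of such a helper (the method name and constructor arguments are 
illustrative; the committed DFSTestUtil code may differ): it centralizes one 
well-formed dummy ID instead of scattering ad-hoc bogus values.

{code}
// Hypothetical test utility; the field order and ports are illustrative only.
public static DatanodeID getLocalDatanodeID() {
  // Put a well-formed loopback IP in the IP field, not a hostname.
  return new DatanodeID("127.0.0.1", "localhost", "fake-storage-id",
      50010, 50075, 50020);
}
{code}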

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3387) [Fsshell]It's better to provide hdfs instead of hadoop in GenericOptionsParser

2012-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272624#comment-13272624
 ] 

Hadoop QA commented on HDFS-3387:
-

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12526384/HDFS-3387_updated.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2408//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2408//console

This message is automatically generated.

> [Fsshell]It's better to provide hdfs instead of hadoop in GenericOptionsParser
> --
>
> Key: HDFS-3387
> URL: https://issues.apache.org/jira/browse/HDFS-3387
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 2.0.0
>Reporter: Brahma Reddy Battula
>Priority: Trivial
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HDFS-3387.patch, HDFS-3387_updated.patch
>
>
> Scenario:
> --
> Execute any fsshell command with invalid options,
> like ./hdfs haadmin -transitionToActive...
> Here it logs the following:
> bin/hadoop command [genericOptions] [commandOptions]...
> Expected: the help message misleads the user by saying bin/hadoop, which is 
> not actually what the user ran.
> It's better to log bin/hdfs; anyway, hadoop is deprecated.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3398) Client will not retry when primaryDN is down once it's just got pipeline

2012-05-10 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272598#comment-13272598
 ] 

Uma Maheswara Rao G commented on HDFS-3398:
---

Seems to be a good catch, Brahma.

@Todd, it looks like a problem to me. When writing to the socket, if the other 
peer goes down, the client may treat that as a client error and exit.
How about catching exceptions from the socket operations and setting errorIndex 
to 1 (treating the first node as bad)?

I did not see the below check in the 205 code:
{code}
if (errorIndex == -1) { // not a datanode error
  streamerClosed = true;
}
{code}

The 205 code on Throwable:
{code}
} catch (Throwable e) {
  LOG.warn("DataStreamer Exception: " +
           StringUtils.stringifyException(e));
  if (e instanceof IOException) {
    setLastException((IOException)e);
  }
  hasError = true;
}
{code}


In trunk:
{code}
} catch (Throwable e) {
  DFSClient.LOG.warn("DataStreamer Exception", e);
  if (e instanceof IOException) {
    setLastException((IOException)e);
  }
  hasError = true;
  if (errorIndex == -1) { // not a datanode error
    streamerClosed = true;
  }
}
{code}
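A minimal sketch of the suggestion above (an assumption, not a patch; it also 
assumes errorIndex is a 0-based index into the pipeline nodes): catch the 
socket-write failure and mark the primary datanode bad so pipeline recovery runs 
instead of the client exiting.

{code}
try {
  // write out data to remote datanode
  blockStream.write(buf.array(), buf.position(), buf.remaining());
  blockStream.flush();
} catch (IOException e) {
  setLastException(e);
  errorIndex = 0;   // treat the first (primary) datanode as bad
  hasError = true;  // let the streamer rebuild the pipeline and retry
}
{code}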


> Client will not retry when primaryDN is down once it's just got pipeline
> 
>
> Key: HDFS-3398
> URL: https://issues.apache.org/jira/browse/HDFS-3398
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 2.0.0
>Reporter: Brahma Reddy Battula
>Priority: Minor
>
> Scenario:
> =
> Start the NN and three DNs.
> Get the datanodes to which the block has to be replicated, from:
> {code}
> nodes = nextBlockOutputStream(src);
> {code}
> Before starting to write to the DN, kill the primary DN:
> {code}
> // write out data to remote datanode
> blockStream.write(buf.array(), buf.position(), buf.remaining());
> blockStream.flush();
> {code}
> Now the write will fail with the exception:
> {noformat}
> 2012-05-10 14:21:47,993 WARN  hdfs.DFSClient (DFSOutputStream.java:run(552)) 
> - DataStreamer Exception
> java.io.IOException: An established connection was aborted by the software in 
> your host machine
>   at sun.nio.ch.SocketDispatcher.write0(Native Method)
>   at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>   at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
>   at sun.nio.ch.IOUtil.write(Unknown Source)
>   at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>   at 
> org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:60)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
>   at 
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:151)
>   at 
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:112)
>   at java.io.BufferedOutputStream.write(Unknown Source)
>   at java.io.DataOutputStream.write(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:513)
> {noformat}
> .

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3230) Cleanup DatanodeID creation in the tests

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272594#comment-13272594
 ] 

Hudson commented on HDFS-3230:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2297 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2297/])
HDFS-3230. Cleanup DatanodeID creation in the tests. Contributed by Eli 
Collins (Revision 1336815)

 Result = SUCCESS
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336815
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/JsonUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReplaceDatanodeOnFailure.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/token/block/TestBlockToken.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHost2NodesMap.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestPendingDataNodeMessages.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/net/TestNetworkTopology.java


> Cleanup DatanodeID creation in the tests
> 
>
> Key: HDFS-3230
> URL: https://issues.apache.org/jira/browse/HDFS-3230
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Eli Collins
>Assignee: Eli Collins
>Priority: Minor
> Attachments: hdfs-3230.txt, hdfs-3230.txt
>
>
> A lot of tests create dummy DatanodeIDs for testing, often using bogus values 
> when creating the objects (e.g. a hostname in the IP field), which they can get 
> away with because the IDs aren't actually used. Let's add a test utility 
> method for creating a DatanodeID for testing and use it throughout.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3230) Cleanup DatanodeID creation in the tests

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272589#comment-13272589
 ] 

Hudson commented on HDFS-3230:
--

Integrated in Hadoop-Common-trunk-Commit # (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit//])
HDFS-3230. Cleanup DatanodeID creation in the tests. Contributed by Eli 
Collins (Revision 1336815)

 Result = SUCCESS
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336815
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/JsonUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReplaceDatanodeOnFailure.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/token/block/TestBlockToken.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHost2NodesMap.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestPendingDataNodeMessages.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/net/TestNetworkTopology.java


> Cleanup DatanodeID creation in the tests
> 
>
> Key: HDFS-3230
> URL: https://issues.apache.org/jira/browse/HDFS-3230
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Eli Collins
>Assignee: Eli Collins
>Priority: Minor
> Attachments: hdfs-3230.txt, hdfs-3230.txt
>
>
> A lot of tests create dummy DatanodeIDs for testing, often using bogus values 
> when creating the objects (e.g. a hostname in the IP field), which they can get 
> away with because the IDs aren't actually used. Let's add a test utility 
> method for creating a DatanodeID for testing and use it throughout.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-3400) DNs should be able to start with jsvc even if security is disabled

2012-05-10 Thread Aaron T. Myers (JIRA)
Aaron T. Myers created HDFS-3400:


 Summary: DNs should be able to start with jsvc even if security is 
disabled
 Key: HDFS-3400
 URL: https://issues.apache.org/jira/browse/HDFS-3400
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node, scripts
Affects Versions: 2.0.0
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers


Currently, if one tries to start a DN with security disabled (via 
hadoop.security.authentication = "simple" in the configs) but with JSVC 
correctly configured, the DN will refuse to start.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3372) offlineEditsViewer should be able to read a binary edits file with recovery mode

2012-05-10 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3372:
---

Attachment: HDFS-3372.002.patch

* rebase

* strip out a small cleanup patch that got mixed in.  This patch should only 
have oev stuff now.

> offlineEditsViewer should be able to read a binary edits file with recovery 
> mode
> 
>
> Key: HDFS-3372
> URL: https://issues.apache.org/jira/browse/HDFS-3372
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-3372.001.patch, HDFS-3372.002.patch
>
>
> It would be nice if oev (the offline edits viewer) had a switch that allowed 
> us to read a binary edits file using recovery mode.  oev can be very useful 
> when working with corrupt or messed up edit log files, and this would make it 
> even more so.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3372) offlineEditsViewer should be able to read a binary edits file with recovery mode

2012-05-10 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3372:
---

Attachment: (was: HDFS-3372.002.patch)

> offlineEditsViewer should be able to read a binary edits file with recovery 
> mode
> 
>
> Key: HDFS-3372
> URL: https://issues.apache.org/jira/browse/HDFS-3372
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-3372.001.patch
>
>
> It would be nice if oev (the offline edits viewer) had a switch that allowed 
> us to read a binary edits file using recovery mode.  oev can be very useful 
> when working with corrupt or messed up edit log files, and this would make it 
> even more so.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3372) offlineEditsViewer should be able to read a binary edits file with recovery mode

2012-05-10 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3372:
---

Attachment: HDFS-3372.002.patch

rebase on trunk

> offlineEditsViewer should be able to read a binary edits file with recovery 
> mode
> 
>
> Key: HDFS-3372
> URL: https://issues.apache.org/jira/browse/HDFS-3372
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-3372.001.patch, HDFS-3372.002.patch
>
>
> It would be nice if oev (the offline edits viewer) had a switch that allowed 
> us to read a binary edits file using recovery mode.  oev can be very useful 
> when working with corrupt or messed up edit log files, and this would make it 
> even more so.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3387) [Fsshell]It's better to provide hdfs instead of hadoop in GenericOptionsParser

2012-05-10 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272563#comment-13272563
 ] 

Brahma Reddy Battula commented on HDFS-3387:


@Daryn Sharp, thanks for taking a look.

hadoop fs is not deprecated; what you said is correct. It's better to 
separate out hdfs and hadoop (the general filesystem) when running fsshell 
commands. I'll fix the same. :)

> [Fsshell]It's better to provide hdfs instead of hadoop in GenericOptionsParser
> --
>
> Key: HDFS-3387
> URL: https://issues.apache.org/jira/browse/HDFS-3387
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 2.0.0
>Reporter: Brahma Reddy Battula
>Priority: Trivial
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HDFS-3387.patch, HDFS-3387_updated.patch
>
>
> Scenario:
> --
> Execute any fsshell command with invalid options,
> like ./hdfs haadmin -transitionToActive...
> Here it logs the following:
> bin/hadoop command [genericOptions] [commandOptions]...
> Expected: the help message misleads the user by saying bin/hadoop, which is 
> not actually what the user ran.
> It's better to log bin/hdfs; anyway, hadoop is deprecated.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3391) Failing tests in branch-2

2012-05-10 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272562#comment-13272562
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-3391:
--

HDFS-3157 is reverted.  So TestRBWBlockInvalidation is no longer a problem.

Not sure if TestPipelinesFailover is related to HDFS-3157.  Could any of you 
try to reproduce the failure?  I have actually never seen it fail on my machine.

> Failing tests in branch-2
> -
>
> Key: HDFS-3391
> URL: https://issues.apache.org/jira/browse/HDFS-3391
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Arun C Murthy
>Priority: Critical
>
> Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec <<< 
> FAILURE!
> --
> Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
> Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec 
> <<< FAILURE!
> --

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3265) PowerPc Build error.

2012-05-10 Thread Kumar Ravi (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272559#comment-13272559
 ] 

Kumar Ravi commented on HDFS-3265:
--

Matt, is there anything I need to do from my side to make sure this patch gets 
included in branch-2.0?

> PowerPc Build error.
> 
>
> Key: HDFS-3265
> URL: https://issues.apache.org/jira/browse/HDFS-3265
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: build
>Affects Versions: 1.0.2, 1.0.3, 2.0.0
> Environment: Linux RHEL 6.1 PowerPC + IBM JVM 6.0 SR10
>Reporter: Kumar Ravi
>Assignee: Kumar Ravi
>  Labels: patch
> Fix For: 0.24.0, 1.0.3
>
> Attachments: HADOOP-8271.patch, HADOOP-8271.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> When attempting to build branch-1, the following error is seen and ant exits.
> [exec] configure: error: Unsupported CPU architecture "powerpc64"
> The following command was used to build hadoop-common
> ant -Dlibhdfs=true -Dcompile.native=true -Dfusedfs=true -Dcompile.c++=true 
> -Dforrest.home=$FORREST_HOME compile-core-native compile-c++ 
> compile-c++-examples task-controller tar record-parser compile-hdfs-classes 
> package -Djava5.home=/opt/ibm/ibm-java2-ppc64-50/ 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3387) [Fsshell]It's better to provide hdfs instead of hadoop in GenericOptionsParser

2012-05-10 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-3387:
---

Attachment: HDFS-3387_updated.patch

@Aaron T. Myers, I have updated the patch.

> [Fsshell]It's better to provide hdfs instead of hadoop in GenericOptionsParser
> --
>
> Key: HDFS-3387
> URL: https://issues.apache.org/jira/browse/HDFS-3387
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 2.0.0
>Reporter: Brahma Reddy Battula
>Priority: Trivial
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HDFS-3387.patch, HDFS-3387_updated.patch
>
>
> Scenario:
> --
> Execute any fsshell command with invalid options,
> like ./hdfs haadmin -transitionToActive...
> Here it logs the following:
> bin/hadoop command [genericOptions] [commandOptions]...
> Expected: the help message misleads the user by saying bin/hadoop, which is 
> not actually what the user ran.
> It's better to log bin/hdfs; anyway, hadoop is deprecated.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3026) HA: Handle failure during HA state transition

2012-05-10 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272538#comment-13272538
 ] 

Eli Collins commented on HDFS-3026:
---

Looks good. Two suggestions:

doImmediateShutdown is HA-specific (it refers to the state transition and logs a 
specific return code). How about making it take a return code and a message, and 
converting the 5 or so other places in NameNode where we do
{code}
LOG.error("something is wrong");  // or System.err etc
System.exit();
{code}
to
{code}
doImmediateShutdown(, "something is wrong")
{code}
and using another wrapper for the HA state transition case?

Noticed we don't check FS_TRASH_INTERVAL for a bogus value; perhaps fail to start 
the emptier if it's bogus, and use a negative value in the test instead of 
introducing failToStartTrashEmptierForTests?
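A hedged sketch of the proposed helper (the signature and exit mechanics are 
assumptions, not the actual patch):

{code}
// Hypothetical refactoring target for the scattered LOG.error/System.exit pairs.
private void doImmediateShutdown(int returnCode, String message) {
  LOG.error(message);
  Runtime.getRuntime().exit(returnCode); // exit immediately with the given code
}
{code}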

> HA: Handle failure during HA state transition
> -
>
> Key: HDFS-3026
> URL: https://issues.apache.org/jira/browse/HDFS-3026
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, name-node
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-3026-HDFS-1623.patch, HDFS-3026.patch
>
>
> This JIRA is to address a TODO in NameNode about handling the possibility of 
> an incomplete HA state transition.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3389) Document the BKJM usage in Namenode HA.

2012-05-10 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-3389:
--

Issue Type: Sub-task  (was: Bug)
Parent: HDFS-3399

> Document the BKJM usage in Namenode HA.
> ---
>
> Key: HDFS-3389
> URL: https://issues.apache.org/jira/browse/HDFS-3389
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 2.0.0, 3.0.0
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>
> As per the discussion in HDFS-234, we need clear documentation for BKJM usage 
> in Namenode HA.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3230) Cleanup DatanodeID creation in the tests

2012-05-10 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272524#comment-13272524
 ] 

Aaron T. Myers commented on HDFS-3230:
--

+1, the patch looks good to me. Thanks a lot, Eli.

> Cleanup DatanodeID creation in the tests
> 
>
> Key: HDFS-3230
> URL: https://issues.apache.org/jira/browse/HDFS-3230
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Eli Collins
>Assignee: Eli Collins
>Priority: Minor
> Attachments: hdfs-3230.txt, hdfs-3230.txt
>
>
> A lot of tests create dummy DatanodeIDs for testing, often using bogus values 
> when creating the objects (e.g. a hostname in the IP field), which they can get 
> away with because the IDs aren't actually used. Let's add a test utility 
> method for creating a DatanodeID for testing and use it throughout.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3388) GetJournalEditServlet should catch more exceptions, not just IOException

2012-05-10 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-3388:
-

Attachment: HDFS-3388.HDFS-3092.patch

> GetJournalEditServlet should catch more exceptions, not just IOException
> 
>
> Key: HDFS-3388
> URL: https://issues.apache.org/jira/browse/HDFS-3388
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Reporter: Brandon Li
>Assignee: Brandon Li
> Attachments: HDFS-3388.HDFS-3092.patch, HDFS-3388.HDFS-3092.patch
>
>
> GetJournalEditServlet has the same problem as that of GetImageServlet 
> (HDFS-3330). It should be fixed in the same way. Also need to make 
> CheckpointFaultInjector visible for journal service tests.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3392) BookKeeper Journal Manager is not retrying to connect to BK when BookKeeper is not available for write.

2012-05-10 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-3392:
--

Issue Type: Sub-task  (was: Bug)
Parent: HDFS-3399

> BookKeeper Journal Manager is not retrying to connect to BK when BookKeeper 
> is not available for write.
> ---
>
> Key: HDFS-3392
> URL: https://issues.apache.org/jira/browse/HDFS-3392
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: surendra singh lilhore
>
> Scenario:
> 1. Start 3 BookKeepers and 3 ZooKeepers.
> 2. Start one NN as active and a second NN as standby.
> 3. Write some files.
> 4. Stop all BookKeepers.
> Issue:
> The BookKeeper Journal Manager does not retry connecting to BK when BookKeeper 
> is not available for writes, and the Active namenode shuts down.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3388) GetJournalEditServlet should catch more exceptions, not just IOException

2012-05-10 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272521#comment-13272521
 ] 

Brandon Li commented on HDFS-3388:
--

Todd, it seems to be doable if we make the fault injector class a nested 
static class.
In terms of the manageability of fault injector classes, people could argue that 
it might be better to keep all the fault injector classes in a different 
package. Let me upload the new patch and see what other folks think. 
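A hedged sketch of the nested-static-class option (all names here are 
hypothetical, not the uploaded patch):

{code}
public class JournalServletSketch {
  /** Nested static fault injector; tests swap in a throwing instance. */
  static class FaultInjector {
    static FaultInjector instance = new FaultInjector();
    void beforeResponse() throws java.io.IOException { /* no-op in production */ }
  }

  void serveEdits() throws java.io.IOException {
    FaultInjector.instance.beforeResponse(); // test hook
    // ... stream the requested edits ...
  }
}
{code}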

> GetJournalEditServlet should catch more exceptions, not just IOException
> 
>
> Key: HDFS-3388
> URL: https://issues.apache.org/jira/browse/HDFS-3388
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Reporter: Brandon Li
>Assignee: Brandon Li
> Attachments: HDFS-3388.HDFS-3092.patch
>
>
> GetJournalEditServlet has the same problem as that of GetImageServlet 
> (HDFS-3330). It should be fixed in the same way. Also need to make 
> CheckpointFaultInjector visible for journal service tests.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3386) BK JM : Namenode is not deleting his lock entry '/ledgers/lock/lock-0000X', when fails to acquire the lock

2012-05-10 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-3386:
--

Issue Type: Sub-task  (was: Bug)
Parent: HDFS-3399

> BK JM : Namenode is not deleting his lock entry '/ledgers/lock/lock-0000X', 
> when fails to acquire the lock
> --
>
> Key: HDFS-3386
> URL: https://issues.apache.org/jira/browse/HDFS-3386
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha
>Reporter: surendra singh lilhore
>Assignee: Ivan Kelly
>Priority: Minor
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HDFS-3386.diff
>
>
> When a Standby NN becomes Active, it will first create its sequential lock 
> entry lock-0000X in ZK and then try to acquire the lock as shown 
> below:
> {code}
> myznode = zkc.create(lockpath + "/lock-", new byte[] {'0'},
>                      Ids.OPEN_ACL_UNSAFE,
>                      CreateMode.EPHEMERAL_SEQUENTIAL);
> if ((lockpath + "/" + nodes.get(0)).equals(myznode)) {
>   if (LOG.isTraceEnabled()) {
>     LOG.trace("Lock acquired - " + myznode);
>   }
>   lockCount.set(1);
>   zkc.exists(myznode, this);
>   return;
> } else {
>   LOG.error("Failed to acquire lock with " + myznode
>       + ", " + nodes.get(0) + " already has it");
>   throw new IOException("Could not acquire lock");
> }
> {code}
> Say the transition fails to acquire the lock: it will throw the 
> exception and the NN gets shut down. The problem is that the lock entry 
> lock-0000X will exist in ZK until session expiry, and further start-ups 
> will not be able to acquire the lock.
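A hedged sketch of one possible fix (an assumption, not necessarily the attached 
HDFS-3386.diff): delete the just-created znode before throwing, so a restarted NN 
is not blocked until session expiry.

{code}
} else {
  LOG.error("Failed to acquire lock with " + myznode
      + ", " + nodes.get(0) + " already has it");
  zkc.delete(myznode, -1); // hypothetical: remove our stale lock entry first
  throw new IOException("Could not acquire lock");
}
{code}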

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-3399) BookKeeper option support for NN HA

2012-05-10 Thread Uma Maheswara Rao G (JIRA)
Uma Maheswara Rao G created HDFS-3399:
-

 Summary: BookKeeper option support for NN HA
 Key: HDFS-3399
 URL: https://issues.apache.org/jira/browse/HDFS-3399
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: ha
Affects Versions: 2.0.0, 3.0.0
Reporter: Uma Maheswara Rao G


Here is the JIRA to track BookKeeper support issues with NN HA. We can file all the 
BookKeeperJournalManager issues under this JIRA for easier tracking.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3026) HA: Handle failure during HA state transition

2012-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272480#comment-13272480
 ] 

Hadoop QA commented on HDFS-3026:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12526356/HDFS-3026.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

-1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2406//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2406//artifact/trunk/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2406//console

This message is automatically generated.

> HA: Handle failure during HA state transition
> -
>
> Key: HDFS-3026
> URL: https://issues.apache.org/jira/browse/HDFS-3026
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, name-node
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-3026-HDFS-1623.patch, HDFS-3026.patch
>
>
> This JIRA is to address a TODO in NameNode about handling the possibility of 
> an incomplete HA state transition.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3398) Client will not retry when primaryDN is down once it's just got pipeline

2012-05-10 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3398:
--

 Component/s: hdfs client
Target Version/s: 2.0.0
   Fix Version/s: (was: 3.0.0)

Does this affect branch-1 as well?

P.S. Please set "Target version" instead of "Fix version" for unfixed bugs.

> Client will not retry when primaryDN is down once it's just got pipeline
> 
>
> Key: HDFS-3398
> URL: https://issues.apache.org/jira/browse/HDFS-3398
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 2.0.0
>Reporter: Brahma Reddy Battula
>Priority: Minor
>
> Scenario:
> =
> Start the NN and three DNs.
> Get the datanodes to which the block has to be replicated, from:
> {code}
> nodes = nextBlockOutputStream(src);
> {code}
> Before starting to write to the DN, kill the primary DN:
> {code}
> // write out data to remote datanode
> blockStream.write(buf.array(), buf.position(), buf.remaining());
> blockStream.flush();
> {code}
> Now the write will fail with the exception:
> {noformat}
> 2012-05-10 14:21:47,993 WARN  hdfs.DFSClient (DFSOutputStream.java:run(552)) 
> - DataStreamer Exception
> java.io.IOException: An established connection was aborted by the software in 
> your host machine
>   at sun.nio.ch.SocketDispatcher.write0(Native Method)
>   at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>   at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
>   at sun.nio.ch.IOUtil.write(Unknown Source)
>   at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>   at 
> org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:60)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
>   at 
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:151)
>   at 
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:112)
>   at java.io.BufferedOutputStream.write(Unknown Source)
>   at java.io.DataOutputStream.write(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:513)
> {noformat}
> .

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2450) Only complete hostname is supported to access data via hdfs://

2012-05-10 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272441#comment-13272441
 ] 

Harsh J commented on HDFS-2450:
---

Hey Daryn,

I've not verified this myself, but does trunk remain unaffected by this issue?

> Only complete hostname is supported to access data via hdfs://
> --
>
> Key: HDFS-2450
> URL: https://issues.apache.org/jira/browse/HDFS-2450
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.20.205.0
>Reporter: Rajit Saha
>Assignee: Daryn Sharp
> Fix For: 1.0.0
>
> Attachments: HDFS-2450-1.patch, HDFS-2450-2.patch, HDFS-2450-3.patch, 
> HDFS-2450-4.patch, HDFS-2450-5.patch, HDFS-2450.patch, IP vs. Hostname.pdf
>
>
> If my complete hostname is host1.abc.xyz.com, only the complete hostname can be 
> used to access data via hdfs://.
> I am running the following on a .20.205 client to get data from a .20.205 NN (host1):
> $hadoop dfs -copyFromLocal /etc/passwd  hdfs://host1/tmp
> copyFromLocal: Wrong FS: hdfs://host1/tmp, expected: hdfs://host1.abc.xyz.com
> Usage: java FsShell [-copyFromLocal <localsrc> ... <dst>]
> $hadoop dfs -copyFromLocal /etc/passwd  hdfs://host1.abc/tmp/
> copyFromLocal: Wrong FS: hdfs://host1.blue/tmp/1, expected: 
> hdfs://host1.abc.xyz.com
> Usage: java FsShell [-copyFromLocal <localsrc> ... <dst>]
> $hadoop dfs -copyFromLocal /etc/passwd  hftp://host1.abc.xyz/tmp/
> copyFromLocal: Wrong FS: hdfs://host1.blue/tmp/1, expected: 
> hdfs://host1.abc.xyz.com
> Usage: java FsShell [-copyFromLocal <localsrc> ... <dst>]
> Only the following is supported: 
> $hadoop dfs -copyFromLocal /etc/passwd  hdfs://host1.abc.xyz.com/tmp/

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2391) Newly set BalancerBandwidth value is not displayed anywhere

2012-05-10 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272439#comment-13272439
 ] 

Harsh J commented on HDFS-2391:
---

bq.   -1 tests included.  The patch doesn't appear to include any new or 
modified tests.

This is a logging addition and does not require tests. I did, however, run HDFS 
manually to verify that the INFO log is printed upon setting a new balancer 
bandwidth value.

> Newly set BalancerBandwidth value is not displayed anywhere
> ---
>
> Key: HDFS-2391
> URL: https://issues.apache.org/jira/browse/HDFS-2391
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer
>Affects Versions: 0.20.205.0
>Reporter: Rajit Saha
>Assignee: Harsh J
>  Labels: newbie
> Attachments: HDFS-2391.patch
>
>
> With the current implementation, 
> $ hadoop dfsadmin -setBalancerBandwidth <bandwidth in bytes per second>
> only shows the following message in the DN log: 
>  INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand 
> action: DNA_BALANCERBANDWIDTHUPDATE
> But it would be nice to have the value of <bandwidth> 
> displayed in the DN log or any other 
> suitable place, so that we can keep track of it.
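A hedged sketch of the kind of log line being added (wording and placement 
assumed, not necessarily the attached patch):

{code}
case DatanodeProtocol.DNA_BALANCERBANDWIDTHUPDATE:
  long bandwidth = ((BalancerBandwidthCommand) cmd).getBalancerBandwidthValue();
  // Log the new value so the change is visible in the DN log.
  LOG.info("Updating balancer bandwidth to " + bandwidth + " bytes/s");
  break;
{code}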

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3026) HA: Handle failure during HA state transition

2012-05-10 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3026:
-

Attachment: HDFS-3026.patch

I looked into what it would take to make the RPC server support a semi-shutdown 
state, wherein it could return one final response to the client who initiated a 
shutdown from an RPC, but cancel all other RPCs and not accept any further 
incoming connections. To do so requires a fair bit of surgery to the 
o.a.h.ipc.Server shutdown code. Since clients initiating HA state transitions 
must already handle the case where an RPC to the NN times out, it doesn't seem 
worth it to do so.

In the patch attached, I've removed the delayed shutdown code and instead just 
shutdown the NN immediately upon failure to fully perform an HA state 
transition.

> HA: Handle failure during HA state transition
> -
>
> Key: HDFS-3026
> URL: https://issues.apache.org/jira/browse/HDFS-3026
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, name-node
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-3026-HDFS-1623.patch, HDFS-3026.patch
>
>
> This JIRA is to address a TODO in NameNode about handling the possibility of 
> an incomplete HA state transition.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3026) HA: Handle failure during HA state transition

2012-05-10 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3026:
-

Status: Patch Available  (was: Open)

> HA: Handle failure during HA state transition
> -
>
> Key: HDFS-3026
> URL: https://issues.apache.org/jira/browse/HDFS-3026
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, name-node
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-3026-HDFS-1623.patch, HDFS-3026.patch
>
>
> This JIRA is to address a TODO in NameNode about handling the possibility of 
> an incomplete HA state transition.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3386) BK JM : Namenode is not deleting his lock entry '/ledgers/lock/lock-0000X', when fails to acquire the lock

2012-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272417#comment-13272417
 ] 

Hadoop QA commented on HDFS-3386:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12526353/HDFS-3386.diff
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The patch appears to cause tar ant target to fail.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

-1 findbugs.  The patch appears to cause Findbugs (version 1.3.9) to fail.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2405//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2405//console

This message is automatically generated.

> BK JM : Namenode is not deleting his lock entry '/ledgers/lock/lock-X', 
> when fails to acquire the lock
> --
>
> Key: HDFS-3386
> URL: https://issues.apache.org/jira/browse/HDFS-3386
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Reporter: surendra singh lilhore
>Assignee: Ivan Kelly
>Priority: Minor
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HDFS-3386.diff
>
>
> When a standby NN becomes active, it first creates its sequential lock entry 
> lock-000X in ZK and then tries to acquire the lock, as shown below:
> {code}
> myznode = zkc.create(lockpath + "/lock-", new byte[] {'0'},
>     Ids.OPEN_ACL_UNSAFE,
>     CreateMode.EPHEMERAL_SEQUENTIAL);
> if ((lockpath + "/" + nodes.get(0)).equals(myznode)) {
>   if (LOG.isTraceEnabled()) {
>     LOG.trace("Lock acquired - " + myznode);
>   }
>   lockCount.set(1);
>   zkc.exists(myznode, this);
>   return;
> } else {
>   LOG.error("Failed to acquire lock with " + myznode
>       + ", " + nodes.get(0) + " already has it");
>   throw new IOException("Could not acquire lock");
> }
> {code}
> If the transition fails to acquire the lock, this exception is thrown and the 
> NN shuts down. The problem is that the lock entry lock-000X remains in ZK 
> until session expiry, so further start-ups will not be able to acquire the 
> lock.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3392) BookKeeper Journal Manager is not retrying to connect to BK when BookKeeper is not available for write.

2012-05-10 Thread Ivan Kelly (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272411#comment-13272411
 ] 

Ivan Kelly commented on HDFS-3392:
--

I don't understand the description here. If the BK cluster is down, then 
there's no way to connect to it. What do you think the behaviour should be?
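
If the intent of the report is that the NN should ride over a short BK outage 
rather than fail fast, the write path would need a bounded retry loop. One 
possible interpretation, as a sketch only (retry parameters, field names, and 
ledger settings are all assumptions, not the current behaviour):

{code}
int attempts = 0;
while (true) {
  try {
    return bkc.createLedger(ensembleSize, quorumSize,
        BookKeeper.DigestType.MAC, digestpw);
  } catch (BKException bke) {
    if (++attempts >= maxRetries) {
      // Give up only after a configured number of attempts.
      throw new IOException("BookKeeper still unavailable after "
          + attempts + " attempts", bke);
    }
    try {
      Thread.sleep(retryIntervalMs);  // back off before the next attempt
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
      throw new IOException("Interrupted while waiting to retry", ie);
    }
  }
}
{code}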

> BookKeeper Journal Manager is not retrying to connect to BK when BookKeeper 
> is not available for write.
> ---
>
> Key: HDFS-3392
> URL: https://issues.apache.org/jira/browse/HDFS-3392
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: surendra singh lilhore
>
> Scenario:
> 1. Start 3 BookKeepers and 3 ZooKeepers.
> 2. Start one NN as active and a second NN as standby.
> 3. Write some files.
> 4. Stop all BookKeepers.
> Issue:
> The BookKeeper Journal Manager does not retry connecting to BK when 
> BookKeeper is not available for writes, and the active namenode shuts down.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3386) BK JM : Namenode is not deleting his lock entry '/ledgers/lock/lock-0000X', when fails to acquire the lock

2012-05-10 Thread Ivan Kelly (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly updated HDFS-3386:
-

Fix Version/s: (was: 0.23.0)
   3.0.0
   2.0.0
 Assignee: Ivan Kelly
   Status: Patch Available  (was: Open)

> BK JM : Namenode is not deleting his lock entry '/ledgers/lock/lock-X', 
> when fails to acquire the lock
> --
>
> Key: HDFS-3386
> URL: https://issues.apache.org/jira/browse/HDFS-3386
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Reporter: surendra singh lilhore
>Assignee: Ivan Kelly
>Priority: Minor
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HDFS-3386.diff
>
>
> When a standby NN becomes active, it first creates its sequential lock entry 
> lock-000X in ZK and then tries to acquire the lock, as shown below:
> {code}
> myznode = zkc.create(lockpath + "/lock-", new byte[] {'0'},
>     Ids.OPEN_ACL_UNSAFE,
>     CreateMode.EPHEMERAL_SEQUENTIAL);
> if ((lockpath + "/" + nodes.get(0)).equals(myznode)) {
>   if (LOG.isTraceEnabled()) {
>     LOG.trace("Lock acquired - " + myznode);
>   }
>   lockCount.set(1);
>   zkc.exists(myznode, this);
>   return;
> } else {
>   LOG.error("Failed to acquire lock with " + myznode
>       + ", " + nodes.get(0) + " already has it");
>   throw new IOException("Could not acquire lock");
> }
> {code}
> If the transition fails to acquire the lock, this exception is thrown and the 
> NN shuts down. The problem is that the lock entry lock-000X remains in ZK 
> until session expiry, so further start-ups will not be able to acquire the 
> lock.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3386) BK JM : Namenode is not deleting his lock entry '/ledgers/lock/lock-0000X', when fails to acquire the lock

2012-05-10 Thread Ivan Kelly (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly updated HDFS-3386:
-

Attachment: HDFS-3386.diff

Patch applies on top of HDFS-3058
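
The gist of the fix is to clean up after a failed acquisition. A minimal 
sketch of the failure branch shown in the description below, assuming the 
patch deletes the znode before bailing out (exception handling elided):

{code}
} else {
  LOG.error("Failed to acquire lock with " + myznode + ", "
      + nodes.get(0) + " already has it");
  // Delete our own sequential znode before giving up, so the stale
  // entry does not block lock acquisition on the next start-up.
  zkc.delete(myznode, -1);
  throw new IOException("Could not acquire lock");
}
{code}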

> BK JM : Namenode is not deleting his lock entry '/ledgers/lock/lock-X', 
> when fails to acquire the lock
> --
>
> Key: HDFS-3386
> URL: https://issues.apache.org/jira/browse/HDFS-3386
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Reporter: surendra singh lilhore
>Priority: Minor
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HDFS-3386.diff
>
>
> When a standby NN becomes active, it first creates its sequential lock entry 
> lock-000X in ZK and then tries to acquire the lock, as shown below:
> {code}
> myznode = zkc.create(lockpath + "/lock-", new byte[] {'0'},
>     Ids.OPEN_ACL_UNSAFE,
>     CreateMode.EPHEMERAL_SEQUENTIAL);
> if ((lockpath + "/" + nodes.get(0)).equals(myznode)) {
>   if (LOG.isTraceEnabled()) {
>     LOG.trace("Lock acquired - " + myznode);
>   }
>   lockCount.set(1);
>   zkc.exists(myznode, this);
>   return;
> } else {
>   LOG.error("Failed to acquire lock with " + myznode
>       + ", " + nodes.get(0) + " already has it");
>   throw new IOException("Could not acquire lock");
> }
> {code}
> If the transition fails to acquire the lock, this exception is thrown and the 
> NN shuts down. The problem is that the lock entry lock-000X remains in ZK 
> until session expiry, so further start-ups will not be able to acquire the 
> lock.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3398) Client will not retry when primaryDN is down once it's just got pipeline

2012-05-10 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-3398:
---

Issue Type: Bug  (was: Task)

> Client will not retry when primaryDN is down once it's just got pipeline
> 
>
> Key: HDFS-3398
> URL: https://issues.apache.org/jira/browse/HDFS-3398
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Brahma Reddy Battula
>Priority: Minor
> Fix For: 3.0.0
>
>
> Scenario:
> =
> Start a NN and three DNs.
> Get the datanodes to which the block has to be replicated, from:
> {code}
> nodes = nextBlockOutputStream(src);
> {code}
> Before starting to write to the DN, kill the primary DN.
> {code}
> // write out data to remote datanode
>   blockStream.write(buf.array(), buf.position(), buf.remaining());
>   blockStream.flush();
> {code}
> Now the write will fail with the exception:
> {noformat}
> 2012-05-10 14:21:47,993 WARN  hdfs.DFSClient (DFSOutputStream.java:run(552)) 
> - DataStreamer Exception
> java.io.IOException: An established connection was aborted by the software in 
> your host machine
>   at sun.nio.ch.SocketDispatcher.write0(Native Method)
>   at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>   at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
>   at sun.nio.ch.IOUtil.write(Unknown Source)
>   at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>   at 
> org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:60)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
>   at 
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:151)
>   at 
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:112)
>   at java.io.BufferedOutputStream.write(Unknown Source)
>   at java.io.DataOutputStream.write(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:513)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-3398) Client will not retry when primaryDN is down once it's just got pipeline

2012-05-10 Thread Brahma Reddy Battula (JIRA)
Brahma Reddy Battula created HDFS-3398:
--

 Summary: Client will not retry when primaryDN is down once it's 
just got pipeline
 Key: HDFS-3398
 URL: https://issues.apache.org/jira/browse/HDFS-3398
 Project: Hadoop HDFS
  Issue Type: Task
Affects Versions: 2.0.0
Reporter: Brahma Reddy Battula
Priority: Minor
 Fix For: 3.0.0


Scenario:
=
Start a NN and three DNs.

Get the datanodes to which the block has to be replicated, from:
{code}
nodes = nextBlockOutputStream(src);
{code}
Before starting to write to the DN, kill the primary DN.
{code}
// write out data to remote datanode
  blockStream.write(buf.array(), buf.position(), buf.remaining());
  blockStream.flush();
{code}

Now the write will fail with the exception:

{noformat}
2012-05-10 14:21:47,993 WARN  hdfs.DFSClient (DFSOutputStream.java:run(552)) - 
DataStreamer Exception
java.io.IOException: An established connection was aborted by the software in 
your host machine
at sun.nio.ch.SocketDispatcher.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(Unknown Source)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
at sun.nio.ch.IOUtil.write(Unknown Source)
at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
at 
org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:60)
at 
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
at 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:151)
at 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:112)
at java.io.BufferedOutputStream.write(Unknown Source)
at java.io.DataOutputStream.write(Unknown Source)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:513)

{noformat}


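The expectation would be for the client to fall back to pipeline recovery 
rather than surfacing the raw socket error. A sketch of the shape of such 
handling around the first write (helper names are assumed; this is not the 
current DFSClient code):

{code}
try {
  // write out data to remote datanode
  blockStream.write(buf.array(), buf.position(), buf.remaining());
  blockStream.flush();
} catch (IOException ioe) {
  // The primary DN died after pipeline setup but before any byte was
  // acknowledged. Abandon the block and ask the NN for a fresh pipeline
  // instead of failing the whole write.
  LOG.warn("Error writing to " + nodes[0] + ", rebuilding pipeline", ioe);
  abandonBlock();
  nodes = nextBlockOutputStream(src);
}
{code}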



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3058) HA: Bring BookKeeperJournalManager up to date with HA changes

2012-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272392#comment-13272392
 ] 

Hadoop QA commented on HDFS-3058:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12526350/HDFS-3058.diff
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 7 new or modified test 
files.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2404//console

This message is automatically generated.

> HA: Bring BookKeeperJournalManager up to date with HA changes
> -
>
> Key: HDFS-3058
> URL: https://issues.apache.org/jira/browse/HDFS-3058
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 0.24.0
>
> Attachments: HDFS-3058.diff, HDFS-3058.diff
>
>
> There's a couple of TODO(HA) comments in the BookKeeperJournalManager code. 
> This JIRA is to address those.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3058) HA: Bring BookKeeperJournalManager up to date with HA changes

2012-05-10 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272383#comment-13272383
 ] 

jirapos...@reviews.apache.org commented on HDFS-3058:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4230/
---

(Updated 2012-05-10 14:44:08.706764)


Review request for hadoop-hdfs.


Summary
---

There's a couple of TODO(HA) comments in the BookKeeperJournalManager code. 
This JIRA is to address those.


This addresses bug HDFS-3058.
http://issues.apache.org/jira/browse/HDFS-3058


Diffs (updated)
-

  hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/pom.xml 380ef62 
  
hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperEditLogInputStream.java
 9d070d9 
  
hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperJournalManager.java
 7fa9026 
  
hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/contrib/bkjournal/BKJMUtil.java
 PRE-CREATION 
  
hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/contrib/bkjournal/TestBookKeeperAsHASharedDir.java
 PRE-CREATION 
  
hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/contrib/bkjournal/TestBookKeeperHACheckpoints.java
 PRE-CREATION 
  
hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/contrib/bkjournal/TestBookKeeperJournalManager.java
 b949bc2 
  
hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogTestUtil.java
 41f0292 
  
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
 c144906 
  
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java
 3810614 
  
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStandbyCheckpoints.java
 5440c38 

Diff: https://reviews.apache.org/r/4230/diff


Testing
---


Thanks,

Ivan



> HA: Bring BookKeeperJournalManager up to date with HA changes
> -
>
> Key: HDFS-3058
> URL: https://issues.apache.org/jira/browse/HDFS-3058
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 0.24.0
>
> Attachments: HDFS-3058.diff, HDFS-3058.diff
>
>
> There's a couple of TODO(HA) comments in the BookKeeperJournalManager code. 
> This JIRA is to address those.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3058) HA: Bring BookKeeperJournalManager up to date with HA changes

2012-05-10 Thread Ivan Kelly (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly updated HDFS-3058:
-

Attachment: HDFS-3058.diff

Rebased onto trunk

> HA: Bring BookKeeperJournalManager up to date with HA changes
> -
>
> Key: HDFS-3058
> URL: https://issues.apache.org/jira/browse/HDFS-3058
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 0.24.0
>
> Attachments: HDFS-3058.diff, HDFS-3058.diff
>
>
> There's a couple of TODO(HA) comments in the BookKeeperJournalManager code. 
> This JIRA is to address those.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3395) NN doesn't start with HA+security enabled and HTTP address set to 0.0.0.0

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272379#comment-13272379
 ] 

Hudson commented on HDFS-3395:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2238 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2238/])
HDFS-3395. NN doesn't start with HA+security enabled and HTTP address set 
to 0.0.0.0. Contributed by Aaron T. Myers. (Revision 1336690)

 Result = ABORTED
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336690
Files : 
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetUtils.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java


> NN doesn't start with HA+security enabled and HTTP address set to 0.0.0.0
> -
>
> Key: HDFS-3395
> URL: https://issues.apache.org/jira/browse/HDFS-3395
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 2.0.0
>
> Attachments: HDFS-3395.patch
>
>
> DFSUtil#substituteForWildcardAddress subs in a default hostname if the given 
> hostname is 0.0.0.0. However, this function throws an exception if the given 
> hostname is set to 0.0.0.0 and security is enabled, regardless of whether the 
> default hostname is also 0.0.0.0. This function shouldn't throw an exception 
> unless both addresses are set to 0.0.0.0.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2391) Newly set BalancerBandwidth value is not displayed anywhere

2012-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272357#comment-13272357
 ] 

Hadoop QA commented on HDFS-2391:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12526327/HDFS-2391.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2403//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2403//console

This message is automatically generated.

> Newly set BalancerBandwidth value is not displayed anywhere
> ---
>
> Key: HDFS-2391
> URL: https://issues.apache.org/jira/browse/HDFS-2391
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer
>Affects Versions: 0.20.205.0
>Reporter: Rajit Saha
>Assignee: Harsh J
>  Labels: newbie
> Attachments: HDFS-2391.patch
>
>
> With the current implementation,
> $ hadoop dfsadmin -setBalancerBandwidth <bandwidth>
> only shows the following message in the DN log:
>  INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand 
> action: DNA_BALANCERBANDWIDTHUPDATE
> It would be nice to have the value of <bandwidth> displayed in the DN log or 
> another suitable place, so that we can track it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3328) NPE in DataNode.getIpcPort

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272354#comment-13272354
 ] 

Hudson commented on HDFS-3328:
--

Integrated in Hadoop-Mapreduce-trunk #1075 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1075/])
HDFS-3328. NPE in DataNode.getIpcPort. Contributed by Eli Collins (Revision 
1336480)

 Result = SUCCESS
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336480
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java


> NPE in DataNode.getIpcPort
> --
>
> Key: HDFS-3328
> URL: https://issues.apache.org/jira/browse/HDFS-3328
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node
>Affects Versions: 2.0.0
>Reporter: Uma Maheswara Rao G
>Assignee: Eli Collins
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: hdfs-3328.txt
>
>
> While running the tests, I have seen this exceptions.Tests passed. 
> Not sure this is a problem.
> {quote}
> 2012-04-26 23:15:51,763 WARN  hdfs.DFSClient (DFSOutputStream.java:run(710)) 
> - DFSOutputStream ResponseProcessor exception  for block 
> BP-1372255573-49.249.124.17-1335462329685:blk_-843504080180201_1005
> java.io.EOFException: Premature EOF: no length prefix available
>   at 
> org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:162)
>   at 
> org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:95)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:657)
> Exception in thread "DataXceiver for client /127.0.0.1:52323 [Cleaning up]" 
> java.lang.NullPointerException
>   at org.apache.hadoop.ipc.Server$Listener.getAddress(Server.java:669)
>   at org.apache.hadoop.ipc.Server.getListenerAddress(Server.java:1988)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.getIpcPort(DataNode.java:882)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.getDisplayName(DataNode.java:863)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:171)
>   at java.lang.Thread.run(Unknown Source){quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3396) FUSE build fails on Ubuntu 12.04

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272347#comment-13272347
 ] 

Hudson commented on HDFS-3396:
--

Integrated in Hadoop-Mapreduce-trunk #1075 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1075/])
HDFS-3396. FUSE build fails on Ubuntu 12.04. Contributed by Colin Patrick 
McCabe (Revision 1336495)

 Result = SUCCESS
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336495
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/fuse-dfs/src/Makefile.am


> FUSE build fails on Ubuntu 12.04
> 
>
> Key: HDFS-3396
> URL: https://issues.apache.org/jira/browse/HDFS-3396
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HDFS-3396.001.patch
>
>
> The HDFS FUSE-dfs build fails on Ubuntu 12.04 (and probably other OSes) with 
> a message like this:
> {code}
> /home/petru/work/ubeeko/hadoo.apache.org/0.23/hadoop-common/hadoop-hdfs-project/hadoop-hdfs/src/contrib/fuse-dfs/src/fuse_dfs.c:27:
> undefined reference to `fuse_get_context'
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3341) Change minimum RPC versions to 2.0.0-SNAPSHOT instead of 2.0.0

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272346#comment-13272346
 ] 

Hudson commented on HDFS-3341:
--

Integrated in Hadoop-Mapreduce-trunk #1075 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1075/])
HDFS-3341, HADOOP-8340. SNAPSHOT build versions should compare as less than 
their eventual release. Contributed by Todd Lipcon. (Revision 1336459)

 Result = SUCCESS
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336459
Files : 
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/VersionUtil.java
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestVersionUtil.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java


> Change minimum RPC versions to 2.0.0-SNAPSHOT instead of 2.0.0
> --
>
> Key: HDFS-3341
> URL: https://issues.apache.org/jira/browse/HDFS-3341
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0, 3.0.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: hdfs-3341.txt
>
>
> After we commit HADOOP-8340, tests will fail because the minimum version is 
> configured to be 2.0.0/3.0.0 instead of the corresponding SNAPSHOT builds. 
> When we commit HADOOP-8340, we should update the minimums at the same time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272345#comment-13272345
 ] 

Hudson commented on HDFS-3157:
--

Integrated in Hadoop-Mapreduce-trunk #1075 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1075/])
Reverting (Need to re-do the patch. new BlockInfo does not set iNode ) 
HDFS-3157. Error in deleting block is keep on coming from DN even after the 
block report and directory scanning has happened. (Revision 1336572)

 Result = SUCCESS
umamahesh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336572
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestRBWBlockInvalidation.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java


> Error in deleting block is keep on coming from DN even after the block report 
> and directory scanning has happened
> -
>
> Key: HDFS-3157
> URL: https://issues.apache.org/jira/browse/HDFS-3157
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.23.0, 0.24.0
>Reporter: J.Andreina
>Assignee: Ashish Singhi
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HDFS-3157.patch, HDFS-3157.patch, HDFS-3157.patch
>
>
> Cluster setup:
> 1 NN, three DNs (DN1, DN2, DN3), replication factor 2, 
> "dfs.blockreport.intervalMsec" 300, "dfs.datanode.directoryscan.interval" 1
> step 1: write one file "a.txt" with sync (not closed)
> step 2: delete the blocks on one of the datanodes to which replication 
> happened, say DN1 (from rbw)
> step 3: close the file
> Since the replication factor is 2, the blocks are replicated to the other 
> datanode.
> Then, on the NN side, the following is seen for the DN from which the block 
> was deleted:
> {noformat}
> 2012-03-19 13:41:36,905 INFO org.apache.hadoop.hdfs.StateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: duplicate requested for 
> blk_2903555284838653156 to add as corrupt on XX.XX.XX.XX by /XX.XX.XX.XX 
> because reported RBW replica with genstamp 1002 does not match COMPLETE 
> block's genstamp in block map 1003
> 2012-03-19 13:41:39,588 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> Removing block blk_2903555284838653156_1003 from neededReplications as it has 
> enough replicas.
> {noformat}
> On the datanode side, where the block was deleted, the following exception 
> occurred:
> {noformat}
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Unexpected error trying to delete block blk_2903555284838653156_1003. 
> BlockInfo not found in volumeMap.
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Error processing datanode Command
> java.io.IOException: Error in deleting blocks.
>   at 
> org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:2061)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:581)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:545)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:690)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:522)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:662)
>   at java.lang.Thread.run(Thread.java:619)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3390) DFSAdmin should print full stack traces of errors when DEBUG logging is enabled

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272349#comment-13272349
 ] 

Hudson commented on HDFS-3390:
--

Integrated in Hadoop-Mapreduce-trunk #1075 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1075/])
HDFS-3390. DFSAdmin should print full stack traces of errors when DEBUG 
logging is enabled. Contributed by Aaron T. Myers. (Revision 1336324)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336324
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java


> DFSAdmin should print full stack traces of errors when DEBUG logging is 
> enabled
> ---
>
> Key: HDFS-3390
> URL: https://issues.apache.org/jira/browse/HDFS-3390
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HDFS-3390.patch, HDFS-3390.patch
>
>
> If an error is encountered when running an `hdfs dfsadmin ...' command, only 
> the exception's message is output. It would be handy for debugging if the 
> full stack trace of the exception were output when DEBUG logging is enabled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3395) NN doesn't start with HA+security enabled and HTTP address set to 0.0.0.0

2012-05-10 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3395:
-

   Resolution: Fixed
Fix Version/s: 2.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks a lot for the review, Eli. I've just committed this to trunk, branch-2, 
and branch-2.0.0-alpha.
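
For the record, the corrected check in DFSUtil#substituteForWildcardAddress is 
roughly the following sketch (variable names assumed):

{code}
if (UserGroupInformation.isSecurityEnabled()
    && defaultHost.equals("0.0.0.0")          // default is also a wildcard
    && configuredAddress.startsWith("0.0.0.0")) {
  // Only fail when there is no usable hostname at all, i.e. both the
  // configured address and the default host are wildcards.
  throw new IOException("Cannot substitute a real hostname for the "
      + "wildcard address with security enabled");
}
{code}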

> NN doesn't start with HA+security enabled and HTTP address set to 0.0.0.0
> -
>
> Key: HDFS-3395
> URL: https://issues.apache.org/jira/browse/HDFS-3395
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 2.0.0
>
> Attachments: HDFS-3395.patch
>
>
> DFSUtil#substituteForWildcardAddress subs in a default hostname if the given 
> hostname is 0.0.0.0. However, this function throws an exception if the given 
> hostname is set to 0.0.0.0 and security is enabled, regardless of whether the 
> default hostname is also 0.0.0.0. This function shouldn't throw an exception 
> unless both addresses are set to 0.0.0.0.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3395) NN doesn't start with HA+security enabled and HTTP address set to 0.0.0.0

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272340#comment-13272340
 ] 

Hudson commented on HDFS-3395:
--

Integrated in Hadoop-Common-trunk-Commit #2221 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2221/])
HDFS-3395. NN doesn't start with HA+security enabled and HTTP address set 
to 0.0.0.0. Contributed by Aaron T. Myers. (Revision 1336690)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336690
Files : 
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetUtils.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java


> NN doesn't start with HA+security enabled and HTTP address set to 0.0.0.0
> -
>
> Key: HDFS-3395
> URL: https://issues.apache.org/jira/browse/HDFS-3395
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-3395.patch
>
>
> DFSUtil#substituteForWildcardAddress subs in a default hostname if the given 
> hostname is 0.0.0.0. However, this function throws an exception if the given 
> hostname is set to 0.0.0.0 and security is enabled, regardless of whether the 
> default hostname is also 0.0.0.0. This function shouldn't throw an exception 
> unless both addresses are set to 0.0.0.0.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3395) NN doesn't start with HA+security enabled and HTTP address set to 0.0.0.0

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272339#comment-13272339
 ] 

Hudson commented on HDFS-3395:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2296 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2296/])
HDFS-3395. NN doesn't start with HA+security enabled and HTTP address set 
to 0.0.0.0. Contributed by Aaron T. Myers. (Revision 1336690)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336690
Files : 
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetUtils.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java


> NN doesn't start with HA+security enabled and HTTP address set to 0.0.0.0
> -
>
> Key: HDFS-3395
> URL: https://issues.apache.org/jira/browse/HDFS-3395
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-3395.patch
>
>
> DFSUtil#substituteForWildcardAddress subs in a default hostname if the given 
> hostname is 0.0.0.0. However, this function throws an exception if the given 
> hostname is set to 0.0.0.0 and security is enabled, regardless of whether the 
> default hostname is also 0.0.0.0. This function shouldn't throw an exception 
> unless both addresses are set to 0.0.0.0.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272326#comment-13272326
 ] 

Hudson commented on HDFS-3157:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2237 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2237/])
Reverting (Need to re-do the patch. new BlockInfo does not set iNode ) 
HDFS-3157. Error in deleting block is keep on coming from DN even after the 
block report and directory scanning has happened. (Revision 1336572)

 Result = ABORTED
umamahesh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336572
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestRBWBlockInvalidation.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java


> Error in deleting block is keep on coming from DN even after the block report 
> and directory scanning has happened
> -
>
> Key: HDFS-3157
> URL: https://issues.apache.org/jira/browse/HDFS-3157
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.23.0, 0.24.0
>Reporter: J.Andreina
>Assignee: Ashish Singhi
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HDFS-3157.patch, HDFS-3157.patch, HDFS-3157.patch
>
>
> Cluster setup:
> 1 NN, three DNs (DN1, DN2, DN3), replication factor 2, 
> "dfs.blockreport.intervalMsec" 300, "dfs.datanode.directoryscan.interval" 1
> step 1: write one file "a.txt" with sync (not closed)
> step 2: delete the blocks on one of the datanodes to which replication 
> happened, say DN1 (from rbw)
> step 3: close the file
> Since the replication factor is 2, the blocks are replicated to the other 
> datanode.
> Then, on the NN side, the following is seen for the DN from which the block 
> was deleted:
> {noformat}
> 2012-03-19 13:41:36,905 INFO org.apache.hadoop.hdfs.StateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: duplicate requested for 
> blk_2903555284838653156 to add as corrupt on XX.XX.XX.XX by /XX.XX.XX.XX 
> because reported RBW replica with genstamp 1002 does not match COMPLETE 
> block's genstamp in block map 1003
> 2012-03-19 13:41:39,588 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> Removing block blk_2903555284838653156_1003 from neededReplications as it has 
> enough replicas.
> {noformat}
> On the datanode side, where the block was deleted, the following exception 
> occurred:
> {noformat}
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Unexpected error trying to delete block blk_2903555284838653156_1003. 
> BlockInfo not found in volumeMap.
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Error processing datanode Command
> java.io.IOException: Error in deleting blocks.
>   at 
> org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:2061)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:581)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:545)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:690)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:522)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:662)
>   at java.lang.Thread.run(Thread.java:619)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272297#comment-13272297
 ] 

Hudson commented on HDFS-3157:
--

Integrated in Hadoop-Common-trunk-Commit #2220 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2220/])
Reverting (Need to re-do the patch. new BlockInfo does not set iNode ) 
HDFS-3157. Error in deleting block is keep on coming from DN even after the 
block report and directory scanning has happened. (Revision 1336572)

 Result = SUCCESS
umamahesh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336572
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestRBWBlockInvalidation.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java


> Error in deleting block is keep on coming from DN even after the block report 
> and directory scanning has happened
> -
>
> Key: HDFS-3157
> URL: https://issues.apache.org/jira/browse/HDFS-3157
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.23.0, 0.24.0
>Reporter: J.Andreina
>Assignee: Ashish Singhi
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HDFS-3157.patch, HDFS-3157.patch, HDFS-3157.patch
>
>
> Cluster setup:
> 1 NN, three DNs (DN1, DN2, DN3), replication factor 2, 
> "dfs.blockreport.intervalMsec" 300, "dfs.datanode.directoryscan.interval" 1
> step 1: write one file "a.txt" with sync (not closed)
> step 2: delete the blocks on one of the datanodes to which replication 
> happened, say DN1 (from rbw)
> step 3: close the file
> Since the replication factor is 2, the blocks are replicated to the other 
> datanode.
> Then, on the NN side, the following is seen for the DN from which the block 
> was deleted:
> {noformat}
> 2012-03-19 13:41:36,905 INFO org.apache.hadoop.hdfs.StateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: duplicate requested for 
> blk_2903555284838653156 to add as corrupt on XX.XX.XX.XX by /XX.XX.XX.XX 
> because reported RBW replica with genstamp 1002 does not match COMPLETE 
> block's genstamp in block map 1003
> 2012-03-19 13:41:39,588 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> Removing block blk_2903555284838653156_1003 from neededReplications as it has 
> enough replicas.
> {noformat}
> On the datanode side, where the block was deleted, the following exception 
> occurred:
> {noformat}
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Unexpected error trying to delete block blk_2903555284838653156_1003. 
> BlockInfo not found in volumeMap.
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Error processing datanode Command
> java.io.IOException: Error in deleting blocks.
>   at 
> org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:2061)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:581)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:545)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:690)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:522)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:662)
>   at java.lang.Thread.run(Thread.java:619)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272294#comment-13272294
 ] 

Hudson commented on HDFS-3157:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2295 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2295/])
Reverting (Need to re-do the patch. new BlockInfo does not set iNode ) 
HDFS-3157. Error in deleting block is keep on coming from DN even after the 
block report and directory scanning has happened. (Revision 1336572)

 Result = SUCCESS
umamahesh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336572
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestRBWBlockInvalidation.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java


> Error in deleting block is keep on coming from DN even after the block report 
> and directory scanning has happened
> -
>
> Key: HDFS-3157
> URL: https://issues.apache.org/jira/browse/HDFS-3157
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.23.0, 0.24.0
>Reporter: J.Andreina
>Assignee: Ashish Singhi
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HDFS-3157.patch, HDFS-3157.patch, HDFS-3157.patch
>
>
> Cluster setup:
> 1 NN, three DNs (DN1, DN2, DN3), replication factor 2, 
> "dfs.blockreport.intervalMsec" 300, "dfs.datanode.directoryscan.interval" 1
> step 1: write one file "a.txt" with sync (not closed)
> step 2: delete the blocks on one of the datanodes to which replication 
> happened, say DN1 (from rbw)
> step 3: close the file
> Since the replication factor is 2, the blocks are replicated to the other 
> datanode.
> Then, on the NN side, the following is seen for the DN from which the block 
> was deleted:
> {noformat}
> 2012-03-19 13:41:36,905 INFO org.apache.hadoop.hdfs.StateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: duplicate requested for 
> blk_2903555284838653156 to add as corrupt on XX.XX.XX.XX by /XX.XX.XX.XX 
> because reported RBW replica with genstamp 1002 does not match COMPLETE 
> block's genstamp in block map 1003
> 2012-03-19 13:41:39,588 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> Removing block blk_2903555284838653156_1003 from neededReplications as it has 
> enough replicas.
> {noformat}
> On the datanode side, where the block was deleted, the following exception 
> occurred:
> {noformat}
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Unexpected error trying to delete block blk_2903555284838653156_1003. 
> BlockInfo not found in volumeMap.
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Error processing datanode Command
> java.io.IOException: Error in deleting blocks.
>   at 
> org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:2061)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:581)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:545)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:690)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:522)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:662)
>   at java.lang.Thread.run(Thread.java:619)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened

2012-05-10 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272240#comment-13272240
 ] 

Uma Maheswara Rao G commented on HDFS-3157:
---

Yes, Nicholas, thanks a lot for checking this. It actually will not mark the 
block as corrupt, due to that inode check. We may have to rebuild the BlockInfo 
with just the reported block's genstamp, keeping the other state the same as 
storedBlock's. Let's fix this in the next patch; I have just reverted the 
changes. Ashish is working on it.
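
That is, when marking the replica corrupt, the descriptor would be derived 
from the stored block (preserving its inode/BlockCollection reference) but 
carry the DN-reported genstamp. A sketch, with names assumed:

{code}
// Copy the stored block so the inode reference and replica state carry
// over, then override only the generation stamp with the reported one.
BlockInfo corruptBlock = new BlockInfo(storedBlock);
corruptBlock.setGenerationStamp(reported.getGenerationStamp());
markBlockAsCorrupt(corruptBlock, dn);
{code}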

> Error in deleting block is keep on coming from DN even after the block report 
> and directory scanning has happened
> -
>
> Key: HDFS-3157
> URL: https://issues.apache.org/jira/browse/HDFS-3157
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.23.0, 0.24.0
>Reporter: J.Andreina
>Assignee: Ashish Singhi
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HDFS-3157.patch, HDFS-3157.patch, HDFS-3157.patch
>
>
> Cluster setup:
> 1 NN, three DNs (DN1, DN2, DN3), replication factor 2, 
> "dfs.blockreport.intervalMsec" 300, "dfs.datanode.directoryscan.interval" 1
> step 1: write one file "a.txt" with sync (not closed)
> step 2: delete the blocks on one of the datanodes to which replication 
> happened, say DN1 (from rbw)
> step 3: close the file
> Since the replication factor is 2, the blocks are replicated to the other 
> datanode.
> Then, on the NN side, the following is seen for the DN from which the block 
> was deleted:
> {noformat}
> 2012-03-19 13:41:36,905 INFO org.apache.hadoop.hdfs.StateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: duplicate requested for 
> blk_2903555284838653156 to add as corrupt on XX.XX.XX.XX by /XX.XX.XX.XX 
> because reported RBW replica with genstamp 1002 does not match COMPLETE 
> block's genstamp in block map 1003
> 2012-03-19 13:41:39,588 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> Removing block blk_2903555284838653156_1003 from neededReplications as it has 
> enough replicas.
> {noformat}
> On the datanode side, where the block was deleted, the following exception 
> occurred:
> {noformat}
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Unexpected error trying to delete block blk_2903555284838653156_1003. 
> BlockInfo not found in volumeMap.
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Error processing datanode Command
> java.io.IOException: Error in deleting blocks.
>   at 
> org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:2061)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:581)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:545)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:690)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:522)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:662)
>   at java.lang.Thread.run(Thread.java:619)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2391) Newly set BalancerBandwidth value is not displayed anywhere

2012-05-10 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HDFS-2391:
--

Target Version/s: 3.0.0
  Status: Patch Available  (was: Open)

> Newly set BalancerBandwidth value is not displayed anywhere
> ---
>
> Key: HDFS-2391
> URL: https://issues.apache.org/jira/browse/HDFS-2391
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer
>Affects Versions: 0.20.205.0
>Reporter: Rajit Saha
>Assignee: Harsh J
>  Labels: newbie
> Attachments: HDFS-2391.patch
>
>
> With the current implementation,
> $ hadoop dfsadmin -setBalancerBandwidth <bandwidth in bytes per second>
> only shows the following message in the DN log:
>  INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand 
> action: DNA_BALANCERBANDWIDTHUPDATE
> But it would be nice to have the new <bandwidth> value displayed in the DN
> log, or in another suitable place, so that we can track it.
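
A minimal sketch of what the requested logging could look like in the DN's 
command handler (the surrounding switch, cmd, LOG, and the dxcs throttler 
fields are assumed context here; this is not the attached patch):

{code}
case DatanodeProtocol.DNA_BALANCERBANDWIDTHUPDATE:
  // Sketch only: log the received value so operators can track it.
  long bandwidth =
      ((BalancerBandwidthCommand) cmd).getBalancerBandwidthValue();
  if (bandwidth > 0) {
    LOG.info("Updating balance throttler bandwidth from "
        + dxcs.balanceThrottler.getBandwidth() + " bytes/s to "
        + bandwidth + " bytes/s");
    dxcs.balanceThrottler.setBandwidth(bandwidth);
  }
  break;
{code}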

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HDFS-2391) Newly set BalancerBandwidth value is not displayed anywhere

2012-05-10 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J reassigned HDFS-2391:
-

Assignee: Harsh J

> Newly set BalancerBandwidth value is not displayed anywhere
> ---
>
> Key: HDFS-2391
> URL: https://issues.apache.org/jira/browse/HDFS-2391
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer
>Affects Versions: 0.20.205.0
>Reporter: Rajit Saha
>Assignee: Harsh J
>  Labels: newbie
> Attachments: HDFS-2391.patch
>
>
> With the current implementation,
> $ hadoop dfsadmin -setBalancerBandwidth <bandwidth in bytes per second>
> only shows the following message in the DN log:
>  INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand 
> action: DNA_BALANCERBANDWIDTHUPDATE
> But it would be nice to have the new <bandwidth> value displayed in the DN
> log, or in another suitable place, so that we can track it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-2391) Newly set BalancerBandwidth value is not displayed anywhere

2012-05-10 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HDFS-2391:
--

Attachment: HDFS-2391.patch

> Newly set BalancerBandwidth value is not displayed anywhere
> ---
>
> Key: HDFS-2391
> URL: https://issues.apache.org/jira/browse/HDFS-2391
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer
>Affects Versions: 0.20.205.0
>Reporter: Rajit Saha
>  Labels: newbie
> Attachments: HDFS-2391.patch
>
>
> With the current implementation,
> $ hadoop dfsadmin -setBalancerBandwidth <bandwidth in bytes per second>
> only shows the following message in the DN log:
>  INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand 
> action: DNA_BALANCERBANDWIDTHUPDATE
> But it would be nice to have the new <bandwidth> value displayed in the DN
> log, or in another suitable place, so that we can track it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3031) HA: Error (failed to close file) when uploading large file + kill active NN + manual failover

2012-05-10 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3031:
--

Target Version/s: 2.0.0  (was: 0.24.0)
  Status: Patch Available  (was: Open)

> HA: Error (failed to close file) when uploading large file + kill active NN + 
> manual failover
> -
>
> Key: HDFS-3031
> URL: https://issues.apache.org/jira/browse/HDFS-3031
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Affects Versions: 0.24.0
>Reporter: Stephen Chu
>Assignee: Todd Lipcon
> Attachments: hdfs-3031.txt, styx01_killNNfailover, 
> styx01_uploadLargeFile
>
>
> I executed section 3.4 of Todd's HA test plan. 
> https://issues.apache.org/jira/browse/HDFS-1623
> 1. A large file upload is started.
> 2. While the file is being uploaded, the administrator kills the first NN and 
> performs a failover.
> 3. After the file finishes being uploaded, it is verified for correct length 
> and contents.
> For the test, I have a vm_template styx01:/home/schu/centos64-2-5.5.qcow2. 
> styx01 hosted the active NN and styx02 hosted the standby NN.
> In the log files I attached, you can see that on styx01 I began the file upload:
> hadoop fs -put centos64-2.5.5.qcow2
> After waiting several seconds, I kill -9'd the active NN on styx01 and
> manually failed over to the NN on styx02. I ran into the exception below (the
> rest of the stack trace is in the attached file styx01_uploadLargeFile).
> 12/02/29 14:12:52 WARN retry.RetryInvocationHandler: A failover has occurred 
> since the start of this method invocation attempt.
> put: Failed on local exception: java.io.EOFException; Host Details : local 
> host is: "styx01.sf.cloudera.com/172.29.5.192"; destination host is: 
> ""styx01.sf.cloudera.com"\
> :12020;
> 12/02/29 14:12:52 ERROR hdfs.DFSClient: Failed to close file 
> /user/schu/centos64-2-5.5.qcow2._COPYING_
> java.io.IOException: Failed on local exception: java.io.EOFException; Host 
> Details : local host is: "styx01.sf.cloudera.com/172.29.5.192"; destination 
> host is: ""styx01.\
> sf.cloudera.com":12020;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
> at org.apache.hadoop.ipc.Client.call(Client.java:1145)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:188)
> at $Proxy9.addBlock(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:302)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
> at $Proxy10.addBlock(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1097)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:973)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:455)
> Caused by: java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:375)
> at 
> org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:830)
> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:762)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3031) HA: Error (failed to close file) when uploading large file + kill active NN + manual failover

2012-05-10 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272188#comment-13272188
 ] 

Todd Lipcon commented on HDFS-3031:
---

(also need to make close() idempotent, I think, before we mark this JIRA 
resolved -- just wanted to post the partial patch to see how it fares against 
the whole suite)
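
For context, an idempotent close() would roughly mean treating a retried 
completeFile() as success when an earlier attempt already closed the file. A 
minimal sketch of the check, assuming hypothetical names (isFileFullyComplete 
is an illustrative helper, not an existing method):

{code}
// Sketch only: inside completeFile() on the NN, detect a retry of a
// close() that already succeeded before the failover.
if (!inode.isUnderConstruction() && isFileFullyComplete(inode)) {
  // The previous attempt closed the file; report success to the
  // retrying client instead of throwing "file is not open".
  return true;
}
{code}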

> HA: Error (failed to close file) when uploading large file + kill active NN + 
> manual failover
> -
>
> Key: HDFS-3031
> URL: https://issues.apache.org/jira/browse/HDFS-3031
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Affects Versions: 0.24.0
>Reporter: Stephen Chu
>Assignee: Todd Lipcon
> Attachments: hdfs-3031.txt, styx01_killNNfailover, 
> styx01_uploadLargeFile
>
>
> I executed section 3.4 of Todd's HA test plan. 
> https://issues.apache.org/jira/browse/HDFS-1623
> 1. A large file upload is started.
> 2. While the file is being uploaded, the administrator kills the first NN and 
> performs a failover.
> 3. After the file finishes being uploaded, it is verified for correct length 
> and contents.
> For the test, I have a vm_template styx01:/home/schu/centos64-2-5.5.qcow2. 
> styx01 hosted the active NN and styx02 hosted the standby NN.
> In the log files I attached, you can see that on styx01 I began the file upload:
> hadoop fs -put centos64-2.5.5.qcow2
> After waiting several seconds, I kill -9'd the active NN on styx01 and
> manually failed over to the NN on styx02. I ran into the exception below (the
> rest of the stack trace is in the attached file styx01_uploadLargeFile).
> 12/02/29 14:12:52 WARN retry.RetryInvocationHandler: A failover has occurred 
> since the start of this method invocation attempt.
> put: Failed on local exception: java.io.EOFException; Host Details : local 
> host is: "styx01.sf.cloudera.com/172.29.5.192"; destination host is: 
> ""styx01.sf.cloudera.com"\
> :12020;
> 12/02/29 14:12:52 ERROR hdfs.DFSClient: Failed to close file 
> /user/schu/centos64-2-5.5.qcow2._COPYING_
> java.io.IOException: Failed on local exception: java.io.EOFException; Host 
> Details : local host is: "styx01.sf.cloudera.com/172.29.5.192"; destination 
> host is: ""styx01.\
> sf.cloudera.com":12020;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
> at org.apache.hadoop.ipc.Client.call(Client.java:1145)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:188)
> at $Proxy9.addBlock(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:302)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
> at $Proxy10.addBlock(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1097)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:973)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:455)
> Caused by: java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:375)
> at 
> org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:830)
> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:762)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3031) HA: Error (failed to close file) when uploading large file + kill active NN + manual failover

2012-05-10 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3031:
--

Attachment: hdfs-3031.txt

This patch makes getAdditionalBlock() idempotent. I think it might still have 
some issues with append - we'll see which tests fail on Jenkins.
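
Roughly, the idempotency amounts to recognizing a retried addBlock(): if the 
block the client reports as 'previous' is now the file's penultimate block and 
a fresh last block was already allocated, hand back that same block instead of 
allocating another. A hedged sketch (isPenultimateBlock and makeLocatedBlock 
are illustrative names, not necessarily those in the patch):

{code}
// Sketch only: inside getAdditionalBlock(), before allocating a new block.
BlockInfo lastBlock = pendingFile.getLastBlock();
if (previous != null && lastBlock != null
    && isPenultimateBlock(pendingFile, previous)
    && lastBlock.getNumBytes() == 0) {
  // A retry of an addBlock() that already succeeded on the NN before
  // the failover: reuse the block allocated by the first attempt.
  return makeLocatedBlock(lastBlock, pendingFile.getFileLength());
}
{code}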

> HA: Error (failed to close file) when uploading large file + kill active NN + 
> manual failover
> -
>
> Key: HDFS-3031
> URL: https://issues.apache.org/jira/browse/HDFS-3031
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Affects Versions: 0.24.0
>Reporter: Stephen Chu
>Assignee: Todd Lipcon
> Attachments: hdfs-3031.txt, styx01_killNNfailover, 
> styx01_uploadLargeFile
>
>
> I executed section 3.4 of Todd's HA test plan. 
> https://issues.apache.org/jira/browse/HDFS-1623
> 1. A large file upload is started.
> 2. While the file is being uploaded, the administrator kills the first NN and 
> performs a failover.
> 3. After the file finishes being uploaded, it is verified for correct length 
> and contents.
> For the test, I have a vm_template styx01:/home/schu/centos64-2-5.5.qcow2. 
> styx01 hosted the active NN and styx02 hosted the standby NN.
> In the log files I attached, you can see that on styx01 I began the file upload:
> hadoop fs -put centos64-2.5.5.qcow2
> After waiting several seconds, I kill -9'd the active NN on styx01 and
> manually failed over to the NN on styx02. I ran into the exception below (the
> rest of the stack trace is in the attached file styx01_uploadLargeFile).
> 12/02/29 14:12:52 WARN retry.RetryInvocationHandler: A failover has occurred 
> since the start of this method invocation attempt.
> put: Failed on local exception: java.io.EOFException; Host Details : local 
> host is: "styx01.sf.cloudera.com/172.29.5.192"; destination host is: 
> ""styx01.sf.cloudera.com"\
> :12020;
> 12/02/29 14:12:52 ERROR hdfs.DFSClient: Failed to close file 
> /user/schu/centos64-2-5.5.qcow2._COPYING_
> java.io.IOException: Failed on local exception: java.io.EOFException; Host 
> Details : local host is: "styx01.sf.cloudera.com/172.29.5.192"; destination 
> host is: ""styx01.\
> sf.cloudera.com":12020;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
> at org.apache.hadoop.ipc.Client.call(Client.java:1145)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:188)
> at $Proxy9.addBlock(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:302)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
> at $Proxy10.addBlock(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1097)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:973)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:455)
> Caused by: java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:375)
> at 
> org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:830)
> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:762)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3396) FUSE build fails on Ubuntu 12.04

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272175#comment-13272175
 ] 

Hudson commented on HDFS-3396:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2236 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2236/])
HDFS-3396. FUSE build fails on Ubuntu 12.04. Contributed by Colin Patrick 
McCabe (Revision 1336495)

 Result = ABORTED
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336495
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/fuse-dfs/src/Makefile.am


> FUSE build fails on Ubuntu 12.04
> 
>
> Key: HDFS-3396
> URL: https://issues.apache.org/jira/browse/HDFS-3396
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HDFS-3396.001.patch
>
>
> The HDFS FUSE-dfs build fails on Ubuntu 12.04 (and probably other OSes) with 
> a message like this:
> {code}
> /home/petru/work/ubeeko/hadoo.apache.org/0.23/hadoop-common/hadoop-hdfs-project/hadoop-hdfs/src/contrib/fuse-dfs/src/fuse_dfs.c:27:
> undefined reference to `fuse_get_context'
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-1121) Allow HDFS client to measure distribution of blocks across devices for a specific DataNode

2012-05-10 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272149#comment-13272149
 ] 

Harsh J commented on HDFS-1121:
---

Isn't monitoring 
http://DNHOST:50075/jmx?qry=hadoop:service=DataNode,name=DataNodeInfo 
on each DataNode sufficient for getting volume-info results to measure 
disk-level balance for your clusters?

I don't think we should increase the DN heartbeat payload when this is already 
exposed at the DN level.
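
For anyone wanting to poll that, a small sketch of fetching the JMX servlet 
from Java (DNHOST, the port, and the JSON layout of the DataNodeInfo bean, 
including its VolumeInfo attribute, are assumptions; this is untested example 
code, not part of any patch):

{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class DnVolumeInfo {
  public static void main(String[] args) throws Exception {
    // Fetch the DataNodeInfo MBean as JSON from the DN's web port.
    URL url = new URL("http://DNHOST:50075/jmx"
        + "?qry=hadoop:service=DataNode,name=DataNodeInfo");
    BufferedReader in =
        new BufferedReader(new InputStreamReader(url.openStream()));
    try {
      String line;
      while ((line = in.readLine()) != null) {
        // The response is JSON; the VolumeInfo attribute carries
        // per-volume usage, from which disk-level balance follows.
        System.out.println(line);
      }
    } finally {
      in.close();
    }
  }
}
{code}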

> Allow HDFS client to measure distribution of blocks across devices for a 
> specific DataNode
> --
>
> Key: HDFS-1121
> URL: https://issues.apache.org/jira/browse/HDFS-1121
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client
>Reporter: Jeff Hammerbacher
> Attachments: HDFS-1121.0.patch
>
>
> As discussed on the mailing list, it would be useful if the DfsClient could 
> measure the distribution of blocks across devices for an individual DataNode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HDFS-3031) HA: Error (failed to close file) when uploading large file + kill active NN + manual failover

2012-05-10 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned HDFS-3031:
-

Assignee: Todd Lipcon

> HA: Error (failed to close file) when uploading large file + kill active NN + 
> manual failover
> -
>
> Key: HDFS-3031
> URL: https://issues.apache.org/jira/browse/HDFS-3031
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Affects Versions: 0.24.0
>Reporter: Stephen Chu
>Assignee: Todd Lipcon
> Attachments: styx01_killNNfailover, styx01_uploadLargeFile
>
>
> I executed section 3.4 of Todd's HA test plan. 
> https://issues.apache.org/jira/browse/HDFS-1623
> 1. A large file upload is started.
> 2. While the file is being uploaded, the administrator kills the first NN and 
> performs a failover.
> 3. After the file finishes being uploaded, it is verified for correct length 
> and contents.
> For the test, I have a vm_template styx01:/home/schu/centos64-2-5.5.qcow2. 
> styx01 hosted the active NN and styx02 hosted the standby NN.
> In the log files I attached, you can see that on styx01 I began the file upload:
> hadoop fs -put centos64-2.5.5.qcow2
> After waiting several seconds, I kill -9'd the active NN on styx01 and
> manually failed over to the NN on styx02. I ran into the exception below (the
> rest of the stack trace is in the attached file styx01_uploadLargeFile).
> 12/02/29 14:12:52 WARN retry.RetryInvocationHandler: A failover has occurred 
> since the start of this method invocation attempt.
> put: Failed on local exception: java.io.EOFException; Host Details : local 
> host is: "styx01.sf.cloudera.com/172.29.5.192"; destination host is: 
> ""styx01.sf.cloudera.com"\
> :12020;
> 12/02/29 14:12:52 ERROR hdfs.DFSClient: Failed to close file 
> /user/schu/centos64-2-5.5.qcow2._COPYING_
> java.io.IOException: Failed on local exception: java.io.EOFException; Host 
> Details : local host is: "styx01.sf.cloudera.com/172.29.5.192"; destination 
> host is: ""styx01.\
> sf.cloudera.com":12020;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
> at org.apache.hadoop.ipc.Client.call(Client.java:1145)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:188)
> at $Proxy9.addBlock(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:302)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
> at $Proxy10.addBlock(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1097)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:973)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:455)
> Caused by: java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:375)
> at 
> org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:830)
> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:762)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



