[jira] [Commented] (HDFS-3561) ZKFC retries 45 times to connect to the other NN during fencing when the network between NNs is broken, and the standby NN will not take over as active

2012-08-13 Thread Vinay (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432946#comment-13432946
 ] 

Vinay commented on HDFS-3561:
-

Hi [~atm], do you have any more comments on this?

 ZKFC retries 45 times to connect to the other NN during fencing when the 
 network between NNs is broken, and the standby NN will not take over as active 
 

 Key: HDFS-3561
 URL: https://issues.apache.org/jira/browse/HDFS-3561
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: auto-failover, ha
Affects Versions: 2.1.0-alpha, 3.0.0
Reporter: suja s
Assignee: Vinay
 Attachments: HDFS-3561-2.patch, HDFS-3561.patch


 Scenario:
 Active NN on machine1
 Standby NN on machine2
 Machine1 is isolated from the network (machine1's network cable unplugged).
 After the ZK session timeout, the ZKFC on machine2 gets a notification that 
 NN1 is gone.
 The ZKFC tries to fail over and make NN2 active.
 As part of this, during fencing it tries to connect to machine1 and kill NN1 
 (the sshfence technique is configured).
 This connection retry happens 45 times (as it takes 
 ipc.client.connect.max.socket.retries).
 Even after that, the standby NN is not able to take over as active (because 
 of the fencing failure).
 Suggestion: If the ZKFC is not able to reach the other NN for a specified 
 time/number of retries, it can consider that NN dead and instruct the other 
 NN to take over as active, since there is no chance of the other NN (NN1) 
 retaining its active state after the ZK session timeout while it is isolated 
 from the network.
 From ZKFC log:
 {noformat}
 2012-06-21 17:46:14,378 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 22 time(s).
 2012-06-21 17:46:35,378 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 23 time(s).
 2012-06-21 17:46:56,378 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 24 time(s).
 2012-06-21 17:47:17,378 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 25 time(s).
 2012-06-21 17:47:38,382 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 26 time(s).
 2012-06-21 17:47:59,382 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 27 time(s).
 2012-06-21 17:48:20,386 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 28 time(s).
 2012-06-21 17:48:41,386 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 29 time(s).
 2012-06-21 17:49:02,386 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 30 time(s).
 2012-06-21 17:49:23,386 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 31 time(s).
 {noformat}
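The 45-retry budget seen in the log comes from the IPC client connection settings named in the description. As an illustrative sketch only (the property name is taken from this report, and the value is an assumption, not a verified fix), the fencing retry budget could be capped in core-site.xml:

```xml
<!-- Illustrative sketch only: cap IPC connect retries so a fencing
     attempt against an unreachable host gives up quickly instead of
     retrying 45 times. The value 3 is chosen for illustration. -->
<property>
  <name>ipc.client.connect.max.socket.retries</name>
  <value>3</value>
</property>
```

This only shortens the wait; the underlying question of declaring the fenced NN dead after the retries are exhausted is what the suggestion above proposes.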
  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header

2012-08-13 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13432952#comment-13432952
 ] 

Eli Collins commented on HDFS-3788:
---

Correct, this is a different issue from HDFS-3671.

 distcp can't copy large files using webhdfs due to missing Content-Length 
 header
 

 Key: HDFS-3788
 URL: https://issues.apache.org/jira/browse/HDFS-3788
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 2.0.0-alpha
Reporter: Eli Collins
Priority: Critical
 Attachments: distcp-webhdfs-errors.txt


 The following command fails when data1 contains a 3GB file. It passes when 
 using hftp or when the directory just contains smaller (<2GB) files, so it 
 looks like a webhdfs issue with large files.
 {{hadoop distcp webhdfs://eli-thinkpad:50070/user/eli/data1 
 hdfs://localhost:8020/user/eli/data2}}





[jira] [Created] (HDFS-3789) JournalManager#format() should be able to throw IOException

2012-08-13 Thread Ivan Kelly (JIRA)
Ivan Kelly created HDFS-3789:


 Summary: JournalManager#format() should be able to throw 
IOException
 Key: HDFS-3789
 URL: https://issues.apache.org/jira/browse/HDFS-3789
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha, name-node
Affects Versions: 3.0.0
Reporter: Ivan Kelly
Assignee: Ivan Kelly


Currently JournalManager#format cannot throw any exception. As format can fail, 
we should be able to propagate this failure upwards. Otherwise, format will 
fail silently, and the admin will start using the cluster with a 
failed/unusable journal manager.
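A minimal sketch of the proposed change (the parameter type and a failing implementation are stubbed here for illustration; this is not the actual HDFS code): format() declares IOException, so a failure propagates to the caller instead of being swallowed.

```java
// Hedged sketch: format() gains a throws clause so callers can react.
import java.io.IOException;

interface JournalManager {
  // Before this change the method had no throws clause, so a failing
  // format had no way to report the error.
  void format(NamespaceInfo nsInfo) throws IOException;
}

class NamespaceInfo { }  // stand-in for the real HDFS class

class FlakyJournalManager implements JournalManager {
  @Override
  public void format(NamespaceInfo nsInfo) throws IOException {
    // e.g. the underlying journal storage is unreachable
    throw new IOException("format failed: journal storage unavailable");
  }
}

public class FormatDemo {
  public static void main(String[] args) {
    try {
      new FlakyJournalManager().format(new NamespaceInfo());
      System.out.println("formatted");
    } catch (IOException e) {
      // The admin now sees the failure instead of silently starting the
      // cluster with an unusable journal manager.
      System.out.println("caught: " + e.getMessage());
    }
  }
}
```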





[jira] [Updated] (HDFS-3789) JournalManager#format() should be able to throw IOException

2012-08-13 Thread Ivan Kelly (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly updated HDFS-3789:
-

Status: Patch Available  (was: Open)

 JournalManager#format() should be able to throw IOException
 ---

 Key: HDFS-3789
 URL: https://issues.apache.org/jira/browse/HDFS-3789
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha, name-node
Affects Versions: 3.0.0
Reporter: Ivan Kelly
Assignee: Ivan Kelly
 Attachments: HDFS-3789.diff


 Currently JournalManager#format cannot throw any exception. As format can 
 fail, we should be able to propagate this failure upwards. Otherwise, format 
 will fail silently, and the admin will start using the cluster with a 
 failed/unusable journal manager.





[jira] [Updated] (HDFS-3789) JournalManager#format() should be able to throw IOException

2012-08-13 Thread Ivan Kelly (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly updated HDFS-3789:
-

Attachment: HDFS-3789.diff

 JournalManager#format() should be able to throw IOException
 ---

 Key: HDFS-3789
 URL: https://issues.apache.org/jira/browse/HDFS-3789
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha, name-node
Affects Versions: 3.0.0
Reporter: Ivan Kelly
Assignee: Ivan Kelly
 Attachments: HDFS-3789.diff


 Currently JournalManager#format cannot throw any exception. As format can 
 fail, we should be able to propagate this failure upwards. Otherwise, format 
 will fail silently, and the admin will start using the cluster with a 
 failed/unusable journal manager.





[jira] [Commented] (HDFS-3672) Expose disk-location information for blocks to enable better scheduling

2012-08-13 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433124#comment-13433124
 ] 

Arun C Murthy commented on HDFS-3672:
-

I'd really encourage you to put this into the DataNode and throw an 
UnsupportedOperationException, rather than merely doing this via a client-side 
config.
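A hedged sketch of this suggestion (class and method names are hypothetical, not actual HDFS code): the DataNode side owns the switch and rejects the call explicitly, so a client talking to a DataNode with the feature disabled gets a clear error rather than silently degraded behavior behind a client-side config.

```java
// Hypothetical server-side handler: the DataNode, not the client config,
// decides whether disk-location queries are supported.
public class DiskIdService {
  private final boolean enabled;

  public DiskIdService(boolean enabled) { this.enabled = enabled; }

  /** Hypothetical RPC handler returning a disk id for a block id. */
  public int getDiskId(long blockId) {
    if (!enabled) {
      // The caller gets an explicit rejection rather than bogus data.
      throw new UnsupportedOperationException(
          "disk-location queries are disabled on this DataNode");
    }
    return (int) (blockId % 4);  // placeholder mapping for the sketch
  }

  public static void main(String[] args) {
    DiskIdService off = new DiskIdService(false);
    try {
      off.getDiskId(42L);
    } catch (UnsupportedOperationException e) {
      System.out.println("server rejected: " + e.getMessage());
    }
  }
}
```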


 Expose disk-location information for blocks to enable better scheduling
 ---

 Key: HDFS-3672
 URL: https://issues.apache.org/jira/browse/HDFS-3672
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: design-doc-v1.pdf, design-doc-v2.pdf, hdfs-3672-1.patch, 
 hdfs-3672-2.patch, hdfs-3672-3.patch, hdfs-3672-4.patch, hdfs-3672-5.patch, 
 hdfs-3672-6.patch, hdfs-3672-7.patch, hdfs-3672-8.patch


 Currently, HDFS exposes on which datanodes a block resides, which allows 
 clients to make scheduling decisions for locality and load balancing. 
 Extending this to also expose on which disk on a datanode a block resides 
 would enable even better scheduling, on a per-disk rather than coarse 
 per-datanode basis.
 This API would likely look similar to Filesystem#getFileBlockLocations, but 
 also involve a series of RPCs to the responsible datanodes to determine disk 
 ids.





[jira] [Commented] (HDFS-3789) JournalManager#format() should be able to throw IOException

2012-08-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433126#comment-13433126
 ] 

Hadoop QA commented on HDFS-3789:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12540641/HDFS-3789.diff
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

-1 findbugs.  The patch appears to introduce 2 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs 
hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal:

  org.apache.hadoop.hdfs.TestDFSClientRetries
  org.apache.hadoop.hdfs.TestFileAppend4

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2989//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2989//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2989//console

This message is automatically generated.

 JournalManager#format() should be able to throw IOException
 ---

 Key: HDFS-3789
 URL: https://issues.apache.org/jira/browse/HDFS-3789
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha, name-node
Affects Versions: 3.0.0
Reporter: Ivan Kelly
Assignee: Ivan Kelly
 Attachments: HDFS-3789.diff


 Currently JournalManager#format cannot throw any exception. As format can 
 fail, we should be able to propagate this failure upwards. Otherwise, format 
 will fail silently, and the admin will start using the cluster with a 
 failed/unusable journal manager.





[jira] [Updated] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header

2012-08-13 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated HDFS-3788:
-

Affects Version/s: 0.23.3

This affects 0.23 as well.

 distcp can't copy large files using webhdfs due to missing Content-Length 
 header
 

 Key: HDFS-3788
 URL: https://issues.apache.org/jira/browse/HDFS-3788
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 0.23.3, 2.0.0-alpha
Reporter: Eli Collins
Priority: Critical
 Attachments: distcp-webhdfs-errors.txt


 The following command fails when data1 contains a 3GB file. It passes when 
 using hftp or when the directory just contains smaller (<2GB) files, so it 
 looks like a webhdfs issue with large files.
 {{hadoop distcp webhdfs://eli-thinkpad:50070/user/eli/data1 
 hdfs://localhost:8020/user/eli/data2}}





[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname

2012-08-13 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433209#comment-13433209
 ] 

Daryn Sharp commented on HDFS-3150:
---

Question: should we consider tying this and the use_ip config together? I 
think that if you need host names for multihoming, you probably need host 
names for everything. Does this even work if use_ip is true (the default)?

 Add option for clients to contact DNs via hostname
 --

 Key: HDFS-3150
 URL: https://issues.apache.org/jira/browse/HDFS-3150
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: data-node, hdfs client
Affects Versions: 1.0.0, 2.0.0-alpha
Reporter: Eli Collins
Assignee: Eli Collins
 Fix For: 1.1.0

 Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt, hdfs-3150.txt, 
 hdfs-3150.txt


 The DN listens on multiple IP addresses (the default {{dfs.datanode.address}} 
 is the wildcard); however, per HADOOP-6867 only the source address (IP) of the 
 registration is given to clients. HADOOP-985 made clients access datanodes by 
 IP primarily to avoid the latency of a DNS lookup; this had the side effect 
 of breaking DN multihoming (the client cannot route the IP exposed by the NN 
 if the DN registers with an interface that has a cluster-private IP). To fix 
 this, let's add back the option for Datanodes to be accessed by hostname.
 This can be done by:
 # Modifying the primary field of the Datanode descriptor to be the hostname, 
 or 
 # Modifying Client->Datanode and Datanode->Datanode access to use the 
 hostname field instead of the IP
 Approach #2 does not require an incompatible client protocol change, and is 
 much less invasive. It minimizes the scope of modification to just the places 
 where clients and Datanodes connect, vs. changing all uses of Datanode 
 identifiers.
 New client and Datanode configuration options are introduced:
 - {{dfs.client.use.datanode.hostname}} indicates all client to datanode 
 connections should use the datanode hostname (as clients outside cluster may 
 not be able to route the IP)
 - {{dfs.datanode.use.datanode.hostname}} indicates whether Datanodes should 
 use hostnames when connecting to other Datanodes for data transfer
 If the configuration options are not used, there is no change in the current 
 behavior.
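As a usage sketch, enabling the hostname-based behavior described above would look like this in hdfs-site.xml (the property names are the ones introduced by this issue; enabling both at once is shown only for illustration):

```xml
<!-- Clients connect to datanodes by hostname (useful when clients
     cannot route the cluster-private IP the NN hands out). -->
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>
<!-- Datanodes use hostnames for DN-to-DN data transfer. -->
<property>
  <name>dfs.datanode.use.datanode.hostname</name>
  <value>true</value>
</property>
```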





[jira] [Commented] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header

2012-08-13 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433211#comment-13433211
 ] 

Jason Lowe commented on HDFS-3788:
--

A -get of a large file also fails, but it works on smaller files:

{noformat}
$ hadoop fs -ls bigfile
Found 1 items
-rw-r--r--   3 someuser hdfs 3246391296 2012-08-13 15:04 bigfile
$ hadoop fs -get webhdfs://clusternn:50070/user/someuser/bigfile
get: Content-Length header is missing
{noformat}


 distcp can't copy large files using webhdfs due to missing Content-Length 
 header
 

 Key: HDFS-3788
 URL: https://issues.apache.org/jira/browse/HDFS-3788
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 0.23.3, 2.0.0-alpha
Reporter: Eli Collins
Priority: Critical
 Attachments: distcp-webhdfs-errors.txt


 The following command fails when data1 contains a 3GB file. It passes when 
 using hftp or when the directory just contains smaller (<2GB) files, so it 
 looks like a webhdfs issue with large files.
 {{hadoop distcp webhdfs://eli-thinkpad:50070/user/eli/data1 
 hdfs://localhost:8020/user/eli/data2}}





[jira] [Commented] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header

2012-08-13 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433254#comment-13433254
 ] 

Daryn Sharp commented on HDFS-3788:
---

The problem is complex because we have to support multiple grid versions:
* you need either the content-length or chunked encoding to reliably know 
when the file has been fully read
* if the response isn't chunked and there's no content-length, the client 
needs to obtain the content-length by other means, such as a file stat

Based on a quick glance, it looks like the current streaming servlet is 
explicitly setting the content-length to 0. (That seems wrong, because it's 
not an empty file.) The puzzling part is that I don't know how it works at 
all for either <2GB or >2GB! Java must be implicitly setting the 
content-length when the stream is <2GB.
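The two bullet points above can be condensed into a small sketch (a hypothetical helper, not webhdfs code): a reader may trust end-of-stream only when the response is chunked or carries a positive Content-Length; a Content-Length of 0 on a non-empty file, as reported here, fails the check.

```java
// Hedged sketch of the invariant described in the comment above.
public class ResponseLengthCheck {

  /**
   * @param chunked       whether Transfer-Encoding: chunked was used
   * @param contentLength value of the Content-Length header, or -1 if absent
   */
  static boolean hasReliableLength(boolean chunked, long contentLength) {
    // Content-Length: 0 on a non-empty file is exactly the broken case
    // reported here, so only a positive length is trusted.
    return chunked || contentLength > 0;
  }

  public static void main(String[] args) {
    // chunked response: fine even without Content-Length
    System.out.println(hasReliableLength(true, -1));   // true
    // the reported bug: servlet sets Content-Length to 0
    System.out.println(hasReliableLength(false, 0));   // false
  }
}
```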

 distcp can't copy large files using webhdfs due to missing Content-Length 
 header
 

 Key: HDFS-3788
 URL: https://issues.apache.org/jira/browse/HDFS-3788
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 0.23.3, 2.0.0-alpha
Reporter: Eli Collins
Priority: Critical
 Attachments: distcp-webhdfs-errors.txt


 The following command fails when data1 contains a 3gb file. It passes when 
 using hftp or when the directory just contains smaller (2gb) files, so looks 
 like a webhdfs issue with large files.
 {{hadoop distcp webhdfs://eli-thinkpad:50070/user/eli/data1 
 hdfs://localhost:8020/user/eli/data2}}





[jira] [Created] (HDFS-3790) test_fuse_dfs.c doesn't compile on centos 5

2012-08-13 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-3790:
--

 Summary: test_fuse_dfs.c doesn't compile on centos 5
 Key: HDFS-3790
 URL: https://issues.apache.org/jira/browse/HDFS-3790
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: fuse-dfs
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor


test_fuse_dfs.c uses execvpe, which doesn't exist in the version of glibc 
shipped on CentOS 5.





[jira] [Updated] (HDFS-3790) test_fuse_dfs.c doesn't compile on centos 5

2012-08-13 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3790:
---

Attachment: HDFS-3790.001.patch

 test_fuse_dfs.c doesn't compile on centos 5
 ---

 Key: HDFS-3790
 URL: https://issues.apache.org/jira/browse/HDFS-3790
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: fuse-dfs
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-3790.001.patch


 test_fuse_dfs.c uses execvpe, which doesn't exist in the version of glibc 
 shipped on CentOS 5.





[jira] [Updated] (HDFS-3790) test_fuse_dfs.c doesn't compile on centos 5

2012-08-13 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3790:
---

Status: Patch Available  (was: Open)

 test_fuse_dfs.c doesn't compile on centos 5
 ---

 Key: HDFS-3790
 URL: https://issues.apache.org/jira/browse/HDFS-3790
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: fuse-dfs
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-3790.001.patch


 test_fuse_dfs.c uses execvpe, which doesn't exist in the version of glibc 
 shipped on CentOS 5.





[jira] [Commented] (HDFS-3790) test_fuse_dfs.c doesn't compile on centos 5

2012-08-13 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433361#comment-13433361
 ] 

Colin Patrick McCabe commented on HDFS-3790:


I tested this on CentOS 5.8. It works and the test passes.

 test_fuse_dfs.c doesn't compile on centos 5
 ---

 Key: HDFS-3790
 URL: https://issues.apache.org/jira/browse/HDFS-3790
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: fuse-dfs
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-3790.001.patch


 test_fuse_dfs.c uses execvpe, which doesn't exist in the version of glibc 
 shipped on CentOS 5.





[jira] [Commented] (HDFS-3790) test_fuse_dfs.c doesn't compile on centos 5

2012-08-13 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433374#comment-13433374
 ] 

Aaron T. Myers commented on HDFS-3790:
--

+1 pending Jenkins.

Colin, could you please set the affects/target versions appropriately? Thanks.

 test_fuse_dfs.c doesn't compile on centos 5
 ---

 Key: HDFS-3790
 URL: https://issues.apache.org/jira/browse/HDFS-3790
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: fuse-dfs
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-3790.001.patch


 test_fuse_dfs.c uses execvpe, which doesn't exist in the version of glibc 
 shipped on CentOS 5.





[jira] [Commented] (HDFS-3790) test_fuse_dfs.c doesn't compile on centos 5

2012-08-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433376#comment-13433376
 ] 

Hadoop QA commented on HDFS-3790:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12540713/HDFS-3790.001.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

-1 javac.  The patch appears to cause the build to fail.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2990//console

This message is automatically generated.

 test_fuse_dfs.c doesn't compile on centos 5
 ---

 Key: HDFS-3790
 URL: https://issues.apache.org/jira/browse/HDFS-3790
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: fuse-dfs
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-3790.001.patch


 test_fuse_dfs.c uses execvpe, which doesn't exist in the version of glibc 
 shipped on CentOS 5.





[jira] [Commented] (HDFS-3789) JournalManager#format() should be able to throw IOException

2012-08-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433388#comment-13433388
 ] 

Todd Lipcon commented on HDFS-3789:
---

+1. I had this same patch pending but couldn't post it due to the JIRA outage 
the past few days. I will commit this later today.

 JournalManager#format() should be able to throw IOException
 ---

 Key: HDFS-3789
 URL: https://issues.apache.org/jira/browse/HDFS-3789
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha, name-node
Affects Versions: 3.0.0
Reporter: Ivan Kelly
Assignee: Ivan Kelly
 Attachments: HDFS-3789.diff


 Currently JournalManager#format cannot throw any exception. As format can 
 fail, we should be able to propagate this failure upwards. Otherwise, format 
 will fail silently, and the admin will start using the cluster with a 
 failed/unusable journal manager.





[jira] [Created] (HDFS-3791) Backport HDFS-173 Recursively deleting a directory with millions of files makes NameNode unresponsive for other commands until the deletion completes

2012-08-13 Thread Uma Maheswara Rao G (JIRA)
Uma Maheswara Rao G created HDFS-3791:
-

 Summary: Backport HDFS-173 Recursively deleting a directory with 
millions of files makes NameNode unresponsive for other commands until the 
deletion completes
 Key: HDFS-3791
 URL: https://issues.apache.org/jira/browse/HDFS-3791
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 1.0.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G


Backport HDFS-173. 
see the 
[comment|https://issues.apache.org/jira/browse/HDFS-2815?focusedCommentId=13422007&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13422007]
 for more details





[jira] [Commented] (HDFS-3719) Re-enable append-related tests in TestFileConcurrentReader

2012-08-13 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433400#comment-13433400
 ] 

Aaron T. Myers commented on HDFS-3719:
--

OK, since it appears there's more to these failing tests than a simple fix, I'm 
going to go ahead and revert this change to re-enable the tests.

 Re-enable append-related tests in TestFileConcurrentReader
 --

 Key: HDFS-3719
 URL: https://issues.apache.org/jira/browse/HDFS-3719
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang
 Fix For: 2.2.0-alpha

 Attachments: hdfs-3719-1.patch


 Both of these tests are disabled. We should figure out what append 
 functionality we need to make the tests work again, and re-enable them.
 {code}
 // fails due to issue w/append, disable
 @Ignore
 @Test
 public void _testUnfinishedBlockCRCErrorTransferToAppend()
     throws IOException {
   runTestUnfinishedBlockCRCError(true, SyncType.APPEND, DEFAULT_WRITE_SIZE);
 }

 // fails due to issue w/append, disable
 @Ignore
 @Test
 public void _testUnfinishedBlockCRCErrorNormalTransferAppend()
     throws IOException {
   runTestUnfinishedBlockCRCError(false, SyncType.APPEND, DEFAULT_WRITE_SIZE);
 }
 {code}





[jira] [Reopened] (HDFS-3719) Re-enable append-related tests in TestFileConcurrentReader

2012-08-13 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers reopened HDFS-3719:
--


 Re-enable append-related tests in TestFileConcurrentReader
 --

 Key: HDFS-3719
 URL: https://issues.apache.org/jira/browse/HDFS-3719
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang
 Fix For: 2.2.0-alpha

 Attachments: hdfs-3719-1.patch


 Both of these tests are disabled. We should figure out what append 
 functionality we need to make the tests work again, and re-enable them.
 {code}
 // fails due to issue w/append, disable
 @Ignore
 @Test
 public void _testUnfinishedBlockCRCErrorTransferToAppend()
     throws IOException {
   runTestUnfinishedBlockCRCError(true, SyncType.APPEND, DEFAULT_WRITE_SIZE);
 }

 // fails due to issue w/append, disable
 @Ignore
 @Test
 public void _testUnfinishedBlockCRCErrorNormalTransferAppend()
     throws IOException {
   runTestUnfinishedBlockCRCError(false, SyncType.APPEND, DEFAULT_WRITE_SIZE);
 }
 {code}





[jira] [Updated] (HDFS-3719) Re-enable append-related tests in TestFileConcurrentReader

2012-08-13 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3719:
-

Fix Version/s: (was: 2.2.0-alpha)

I've just reverted this.

 Re-enable append-related tests in TestFileConcurrentReader
 --

 Key: HDFS-3719
 URL: https://issues.apache.org/jira/browse/HDFS-3719
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hdfs-3719-1.patch


 Both of these tests are disabled. We should figure out what append 
 functionality we need to make the tests work again, and re-enable them.
 {code}
   // fails due to issue w/append, disable
   @Ignore
   @Test
   public void _testUnfinishedBlockCRCErrorTransferToAppend()
       throws IOException {
     runTestUnfinishedBlockCRCError(true, SyncType.APPEND, DEFAULT_WRITE_SIZE);
   }

   // fails due to issue w/append, disable
   @Ignore
   @Test
   public void _testUnfinishedBlockCRCErrorNormalTransferAppend()
       throws IOException {
     runTestUnfinishedBlockCRCError(false, SyncType.APPEND, DEFAULT_WRITE_SIZE);
   }
 {code}





[jira] [Commented] (HDFS-3719) Re-enable append-related tests in TestFileConcurrentReader

2012-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433415#comment-13433415
 ] 

Hudson commented on HDFS-3719:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2637 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2637/])
Revert HDFS-3719. See discussion there and HDFS-3770 for more info. 
(Revision 1372544)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372544
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileConcurrentReader.java


 Re-enable append-related tests in TestFileConcurrentReader
 --

 Key: HDFS-3719
 URL: https://issues.apache.org/jira/browse/HDFS-3719
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hdfs-3719-1.patch


 Both of these tests are disabled. We should figure out what append 
 functionality we need to make the tests work again, and re-enable them.
 {code}
   // fails due to issue w/append, disable
   @Ignore
   @Test
   public void _testUnfinishedBlockCRCErrorTransferToAppend()
       throws IOException {
     runTestUnfinishedBlockCRCError(true, SyncType.APPEND, DEFAULT_WRITE_SIZE);
   }

   // fails due to issue w/append, disable
   @Ignore
   @Test
   public void _testUnfinishedBlockCRCErrorNormalTransferAppend()
       throws IOException {
     runTestUnfinishedBlockCRCError(false, SyncType.APPEND, DEFAULT_WRITE_SIZE);
   }
 {code}





[jira] [Commented] (HDFS-3770) TestFileConcurrentReader#testUnfinishedBlockCRCErrorTransferToAppend failed

2012-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433416#comment-13433416
 ] 

Hudson commented on HDFS-3770:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2637 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2637/])
Revert HDFS-3719. See discussion there and HDFS-3770 for more info. 
(Revision 1372544)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372544
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileConcurrentReader.java


 TestFileConcurrentReader#testUnfinishedBlockCRCErrorTransferToAppend failed
 ---

 Key: HDFS-3770
 URL: https://issues.apache.org/jira/browse/HDFS-3770
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
Reporter: Eli Collins

 TestFileConcurrentReader#testUnfinishedBlockCRCErrorTransferToAppend failed 
 on [a recent job|https://builds.apache.org/job/PreCommit-HDFS-Build/2959]. 
 Looks like a race in the test. The failure is a ChecksumException, but that 
 is likely due to the DFSOutputStream getting interrupted on close. Looking at 
 the relevant code, waitForAckedSeqno is getting an InterruptedException while 
 waiting on dataQueue; it looks like there are uses of interrupt where we're 
 not first notifying dataQueue, or waiting for the notifications to be 
 delivered.
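The interaction described above can be sketched in miniature. This is a hypothetical model only (class and member names are illustrative, not the actual DFSOutputStream code): the closer sets a flag and notifies waiters under the queue's monitor, so a thread blocked in a wait loop wakes up and returns normally instead of dying mid-wait on an InterruptedException.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class DataQueueDemo {
    private final Queue<Long> dataQueue = new ArrayDeque<>();
    private boolean closed = false; // guarded by dataQueue's monitor

    // Blocks until every queued packet is acked or the stream is closed.
    void waitForAcks() throws InterruptedException {
        synchronized (dataQueue) {
            while (!dataQueue.isEmpty() && !closed) {
                dataQueue.wait();
            }
        }
    }

    // Close by flipping the flag and notifying under the lock, rather than
    // interrupting the waiter outright: the waiter wakes, re-checks the
    // loop condition, and returns normally.
    void close() {
        synchronized (dataQueue) {
            closed = true;
            dataQueue.notifyAll();
        }
    }

    public static void main(String[] args) throws Exception {
        DataQueueDemo d = new DataQueueDemo();
        synchronized (d.dataQueue) {
            d.dataQueue.add(1L); // one unacked packet keeps the waiter blocked
        }
        Thread waiter = new Thread(() -> {
            try {
                d.waitForAcks();
                System.out.println("waiter exited cleanly");
            } catch (InterruptedException e) {
                System.out.println("waiter interrupted");
            }
        });
        waiter.start();
        Thread.sleep(100); // give the waiter time to block on dataQueue
        d.close();
        waiter.join();
    }
}
```

Run as-is this prints "waiter exited cleanly"; interrupting the waiter without first notifying dataQueue is the failure mode the report describes.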





[jira] [Commented] (HDFS-3719) Re-enable append-related tests in TestFileConcurrentReader

2012-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433419#comment-13433419
 ] 

Hudson commented on HDFS-3719:
--

Integrated in Hadoop-Common-trunk-Commit #2572 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2572/])
Revert HDFS-3719. See discussion there and HDFS-3770 for more info. 
(Revision 1372544)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372544
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileConcurrentReader.java


 Re-enable append-related tests in TestFileConcurrentReader
 --

 Key: HDFS-3719
 URL: https://issues.apache.org/jira/browse/HDFS-3719
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hdfs-3719-1.patch


 Both of these tests are disabled. We should figure out what append 
 functionality we need to make the tests work again, and re-enable them.
 {code}
   // fails due to issue w/append, disable
   @Ignore
   @Test
   public void _testUnfinishedBlockCRCErrorTransferToAppend()
       throws IOException {
     runTestUnfinishedBlockCRCError(true, SyncType.APPEND, DEFAULT_WRITE_SIZE);
   }

   // fails due to issue w/append, disable
   @Ignore
   @Test
   public void _testUnfinishedBlockCRCErrorNormalTransferAppend()
       throws IOException {
     runTestUnfinishedBlockCRCError(false, SyncType.APPEND, DEFAULT_WRITE_SIZE);
   }
 {code}





[jira] [Commented] (HDFS-3770) TestFileConcurrentReader#testUnfinishedBlockCRCErrorTransferToAppend failed

2012-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433420#comment-13433420
 ] 

Hudson commented on HDFS-3770:
--

Integrated in Hadoop-Common-trunk-Commit #2572 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2572/])
Revert HDFS-3719. See discussion there and HDFS-3770 for more info. 
(Revision 1372544)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372544
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileConcurrentReader.java


 TestFileConcurrentReader#testUnfinishedBlockCRCErrorTransferToAppend failed
 ---

 Key: HDFS-3770
 URL: https://issues.apache.org/jira/browse/HDFS-3770
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
Reporter: Eli Collins

 TestFileConcurrentReader#testUnfinishedBlockCRCErrorTransferToAppend failed 
 on [a recent job|https://builds.apache.org/job/PreCommit-HDFS-Build/2959]. 
 Looks like a race in the test. The failure is a ChecksumException, but that 
 is likely due to the DFSOutputStream getting interrupted on close. Looking at 
 the relevant code, waitForAckedSeqno is getting an InterruptedException while 
 waiting on dataQueue; it looks like there are uses of interrupt where we're 
 not first notifying dataQueue, or waiting for the notifications to be 
 delivered.





[jira] [Commented] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header

2012-08-13 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433437#comment-13433437
 ] 

Daryn Sharp commented on HDFS-3788:
---

I'll add that if you just remove the content-length check and the response is 
not chunked, the HTTP timeouts will abort the download.

 distcp can't copy large files using webhdfs due to missing Content-Length 
 header
 

 Key: HDFS-3788
 URL: https://issues.apache.org/jira/browse/HDFS-3788
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 0.23.3, 2.0.0-alpha
Reporter: Eli Collins
Priority: Critical
 Attachments: distcp-webhdfs-errors.txt


 The following command fails when data1 contains a 3gb file. It passes when 
 using hftp or when the directory just contains smaller (<2gb) files, so looks 
 like a webhdfs issue with large files.
 {{hadoop distcp webhdfs://eli-thinkpad:50070/user/eli/data1 
 hdfs://localhost:8020/user/eli/data2}}
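The framing concern raised in the comment above can be illustrated with a small, hypothetical check (the helper name is illustrative, not the actual WebHDFS client code): a response body can only be streamed safely to completion if the server either declared a Content-Length or used chunked transfer encoding, which carries its own end-of-body marker; otherwise a multi-gigabyte transfer eventually trips the read timeout mid-stream.

```java
public class BoundedBodyCheck {
    // Returns true if the response body has a well-defined end: either an
    // explicit Content-Length header or chunked transfer encoding. Without
    // one of the two, the client can only detect end-of-body by the
    // connection closing, so a large transfer will abort on the HTTP
    // read timeout instead of finishing.
    static boolean hasBoundedBody(String contentLength, String transferEncoding) {
        if (contentLength != null) {
            return true;
        }
        return "chunked".equalsIgnoreCase(transferEncoding);
    }

    public static void main(String[] args) {
        System.out.println(hasBoundedBody("3221225472", null)); // length declared
        System.out.println(hasBoundedBody(null, "chunked"));    // chunked framing
        System.out.println(hasBoundedBody(null, null));         // unbounded body
    }
}
```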





[jira] [Updated] (HDFS-3790) test_fuse_dfs.c doesn't compile on centos 5

2012-08-13 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3790:
---

 Target Version/s: 2.2.0-alpha
Affects Version/s: 2.2.0-alpha

 test_fuse_dfs.c doesn't compile on centos 5
 ---

 Key: HDFS-3790
 URL: https://issues.apache.org/jira/browse/HDFS-3790
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: fuse-dfs
Affects Versions: 2.2.0-alpha
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-3790.001.patch


 test_fuse_dfs.c uses execvpe, which doesn't exist in the version of glibc 
 shipped on CentOS 5.





[jira] [Updated] (HDFS-3789) JournalManager#format() should be able to throw IOException

2012-08-13 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3789:
--

   Resolution: Fixed
Fix Version/s: 3.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Ivan.

 JournalManager#format() should be able to throw IOException
 ---

 Key: HDFS-3789
 URL: https://issues.apache.org/jira/browse/HDFS-3789
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha, name-node
Affects Versions: 3.0.0
Reporter: Ivan Kelly
Assignee: Ivan Kelly
 Fix For: 3.0.0

 Attachments: HDFS-3789.diff


 Currently JournalManager#format cannot throw any exception. As format can 
 fail, we should be able to propagate this failure upwards. Otherwise, format 
 will fail silently, and the admin will start using the cluster with a 
 failed/unusable journal manager.





[jira] [Commented] (HDFS-3770) TestFileConcurrentReader#testUnfinishedBlockCRCErrorTransferToAppend failed

2012-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433458#comment-13433458
 ] 

Hudson commented on HDFS-3770:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2593 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2593/])
Revert HDFS-3719. See discussion there and HDFS-3770 for more info. 
(Revision 1372544)

 Result = FAILURE
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372544
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileConcurrentReader.java


 TestFileConcurrentReader#testUnfinishedBlockCRCErrorTransferToAppend failed
 ---

 Key: HDFS-3770
 URL: https://issues.apache.org/jira/browse/HDFS-3770
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 3.0.0
Reporter: Eli Collins

 TestFileConcurrentReader#testUnfinishedBlockCRCErrorTransferToAppend failed 
 on [a recent job|https://builds.apache.org/job/PreCommit-HDFS-Build/2959]. 
 Looks like a race in the test. The failure is a ChecksumException, but that 
 is likely due to the DFSOutputStream getting interrupted on close. Looking at 
 the relevant code, waitForAckedSeqno is getting an InterruptedException while 
 waiting on dataQueue; it looks like there are uses of interrupt where we're 
 not first notifying dataQueue, or waiting for the notifications to be 
 delivered.





[jira] [Commented] (HDFS-3719) Re-enable append-related tests in TestFileConcurrentReader

2012-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433457#comment-13433457
 ] 

Hudson commented on HDFS-3719:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2593 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2593/])
Revert HDFS-3719. See discussion there and HDFS-3770 for more info. 
(Revision 1372544)

 Result = FAILURE
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372544
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileConcurrentReader.java


 Re-enable append-related tests in TestFileConcurrentReader
 --

 Key: HDFS-3719
 URL: https://issues.apache.org/jira/browse/HDFS-3719
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: hdfs-3719-1.patch


 Both of these tests are disabled. We should figure out what append 
 functionality we need to make the tests work again, and re-enable them.
 {code}
   // fails due to issue w/append, disable
   @Ignore
   @Test
   public void _testUnfinishedBlockCRCErrorTransferToAppend()
       throws IOException {
     runTestUnfinishedBlockCRCError(true, SyncType.APPEND, DEFAULT_WRITE_SIZE);
   }

   // fails due to issue w/append, disable
   @Ignore
   @Test
   public void _testUnfinishedBlockCRCErrorNormalTransferAppend()
       throws IOException {
     runTestUnfinishedBlockCRCError(false, SyncType.APPEND, DEFAULT_WRITE_SIZE);
   }
 {code}





[jira] [Commented] (HDFS-3789) JournalManager#format() should be able to throw IOException

2012-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433462#comment-13433462
 ] 

Hudson commented on HDFS-3789:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2638 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2638/])
HDFS-3789. JournalManager#format() should be able to throw IOException. 
Contributed by Ivan Kelly. (Revision 1372566)

 Result = SUCCESS
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372566
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperJournalManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/JournalManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/JournalSet.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGenericJournalConf.java


 JournalManager#format() should be able to throw IOException
 ---

 Key: HDFS-3789
 URL: https://issues.apache.org/jira/browse/HDFS-3789
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha, name-node
Affects Versions: 3.0.0
Reporter: Ivan Kelly
Assignee: Ivan Kelly
 Fix For: 3.0.0

 Attachments: HDFS-3789.diff


 Currently JournalManager#format cannot throw any exception. As format can 
 fail, we should be able to propagate this failure upwards. Otherwise, format 
 will fail silently, and the admin will start using the cluster with a 
 failed/unusable journal manager.





[jira] [Commented] (HDFS-3771) Namenode can't restart due to corrupt edit logs, timing issue with shutdown and edit log rolling

2012-08-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433466#comment-13433466
 ] 

Todd Lipcon commented on HDFS-3771:
---

Hey Patrick. I think this behavior might have been fixed in 2.0.0 already -- 
the empty file should get properly ignored and the NN should start up.

Perhaps you can instigate this failure again by adding System.exit(0) right 
before where {{START_LOG_SEGMENT}} is logged in 
{{startLogSegmentAndWriteHeaderTxn}}. That would allow you to see what the 
right recovery steps are.

The issue seems to be described in HDFS-2093... I think the following comment 
may be relevant:
{quote}
Thus in the situation above, where the only log we have is this corrupted one, 
it will refuse to let the NN start, with a nice message explaining that the 
logs starting at this txid are corrupt with no txns. The operator can then 
double-check whether a different storage drive which possibly went missing 
might have better logs, etc, before starting NN.
{quote}

Looking at your logs, it seems like you have only one edits directory. So the 
above probably applies, and you could successfully start by removing that last 
(empty) log segment.

bq. The larger concern should be for data loss. Based on what happened in this 
case it appears that any pending txids would be lost, unless the edit logs 
could be manually repaired. The filesystem would be intact, only minus the 
changes from the outstanding edit events, does that sound correct?

Only in-flight transactions could be lost -- i.e., those that were never ACKed 
to a client. Anything that has been ACKed would have been fsynced to the log, 
and thus not lost. So, after inspecting the segment to make sure there are 
truly no transactions, you should be able to remove it and start with no data 
loss or corruption whatsoever.


 Namenode can't restart due to corrupt edit logs, timing issue with shutdown 
 and edit log rolling
 

 Key: HDFS-3771
 URL: https://issues.apache.org/jira/browse/HDFS-3771
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.23.3, 2.0.0-alpha
 Environment: QE, 20 node Federated cluster with 3 NNs and 15 DNs, 
 using Kerberos based security
Reporter: patrick white
Priority: Critical

 Our 0.23.3 nightly HDFS regression suite encountered a particularly nasty 
 issue recently, which resulted in the cluster's default Namenode being unable 
 to restart; this was on a 20-node Federated cluster with security. The cause 
 appears to be that the NN was just starting to roll its edit log when a 
 shutdown occurred; the shutdown was intentional, to restart the cluster as 
 part of an automated test.
 The tests that were running do not appear to be the issue in themselves; the 
 cluster was just wrapping up an adminReport subset, and this failure case has 
 not reproduced so far, nor was it failing previously. It looks like a chance 
 occurrence of sending the shutdown just as the edit log roll began.
 From the NN log, the following sequence is noted:
 1. an InvalidateBlocks operation had completed
 2. FSNamesystem: Roll Edit Log from [Secondary Namenode IPaddr]
 3. FSEditLog: Ending log segment 23963
 4. FSEditLog: Starting log segment at 23967
 5. NameNode: SHUTDOWN_MSG
 = the NN shuts down and then is restarted...
 6. FSImageTransactionalStorageInspector: Logs beginning at txid 23967 are 
 all in-progress
 7. FSImageTransactionalStorageInspector: Marking log at 
 /grid/[PATH]/edits_inprogress_0023967 as corrupt since it has no 
 transactions in it.
 8. NameNode: Exception in namenode join 
 [main]java.lang.IllegalStateException: No non-corrupt logs for txid 23967
 = NN start attempts continue to cycle trying to restart but can't, failing 
 on the same exception due to lack of non-corrupt edit logs
 If these observations are correct and the issue is from a shutdown happening 
 as the edit logs are rolling, does the NN have an equivalent of the 
 conventional fs 'sync' blocking action that should be called, or is there 
 perhaps a timing hole?





[jira] [Commented] (HDFS-3789) JournalManager#format() should be able to throw IOException

2012-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433497#comment-13433497
 ] 

Hudson commented on HDFS-3789:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2594 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2594/])
HDFS-3789. JournalManager#format() should be able to throw IOException. 
Contributed by Ivan Kelly. (Revision 1372566)

 Result = FAILURE
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372566
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/main/java/org/apache/hadoop/contrib/bkjournal/BookKeeperJournalManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/JournalManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/JournalSet.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestGenericJournalConf.java


 JournalManager#format() should be able to throw IOException
 ---

 Key: HDFS-3789
 URL: https://issues.apache.org/jira/browse/HDFS-3789
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha, name-node
Affects Versions: 3.0.0
Reporter: Ivan Kelly
Assignee: Ivan Kelly
 Fix For: 3.0.0

 Attachments: HDFS-3789.diff


 Currently JournalManager#format cannot throw any exception. As format can 
 fail, we should be able to propagate this failure upwards. Otherwise, format 
 will fail silently, and the admin will start using the cluster with a 
 failed/unusable journal manager.





[jira] [Commented] (HDFS-3672) Expose disk-location information for blocks to enable better scheduling

2012-08-13 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433506#comment-13433506
 ] 

Aaron T. Myers commented on HDFS-3672:
--

bq. I'd really encourage you to put this into the DataNode and throw an 
UnsupportedOperationException rather than merely do this via a client-side 
config.

That's fine by me. I don't feel super strongly about this, so if this is your 
preference Arun, let's go with that.

 Expose disk-location information for blocks to enable better scheduling
 ---

 Key: HDFS-3672
 URL: https://issues.apache.org/jira/browse/HDFS-3672
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: design-doc-v1.pdf, design-doc-v2.pdf, hdfs-3672-1.patch, 
 hdfs-3672-2.patch, hdfs-3672-3.patch, hdfs-3672-4.patch, hdfs-3672-5.patch, 
 hdfs-3672-6.patch, hdfs-3672-7.patch, hdfs-3672-8.patch


 Currently, HDFS exposes on which datanodes a block resides, which allows 
 clients to make scheduling decisions for locality and load balancing. 
 Extending this to also expose on which disk on a datanode a block resides 
 would enable even better scheduling, on a per-disk rather than coarse 
 per-datanode basis.
 This API would likely look similar to FileSystem#getFileBlockLocations, but 
 also involve a series of RPCs to the responsible datanodes to determine disk 
 ids.
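As a rough illustration of the idea described above (a hypothetical sketch only; the type and field names are placeholders, not the patch's actual interface), the augmented location data might pair each replica's host with an opaque disk id:

```java
import java.util.Arrays;
import java.util.List;

public class DiskLocationSketch {
    // Hypothetical type: pairs each replica's datanode host with an opaque
    // disk id, so a scheduler can balance work per disk rather than only
    // per datanode.
    static class BlockDiskLocation {
        final String block;
        final List<String> hosts;
        final List<Integer> diskIds; // parallel to hosts

        BlockDiskLocation(String block, List<String> hosts, List<Integer> diskIds) {
            this.block = block;
            this.hosts = hosts;
            this.diskIds = diskIds;
        }
    }

    public static void main(String[] args) {
        BlockDiskLocation loc = new BlockDiskLocation(
                "blk_1073741825",
                Arrays.asList("dn1.example.com", "dn2.example.com"),
                Arrays.asList(3, 0)); // e.g. disk 3 on dn1, disk 0 on dn2
        // A scheduler could now prefer the replica on the least-loaded disk.
        System.out.println(loc.hosts.get(0) + ":disk" + loc.diskIds.get(0));
    }
}
```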





[jira] [Updated] (HDFS-3765) Namenode INITIALIZESHAREDEDITS should be able to initialize all shared storages

2012-08-13 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3765:
--

Attachment: hdfs-3765.txt

Trying patch upload again... this applies clean on trunk for me.

 Namenode INITIALIZESHAREDEDITS should be able to initialize all shared 
 storages
 ---

 Key: HDFS-3765
 URL: https://issues.apache.org/jira/browse/HDFS-3765
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha
Affects Versions: 2.1.0-alpha, 3.0.0
Reporter: Vinay
Assignee: Vinay
 Attachments: HDFS-3765.patch, HDFS-3765.patch, HDFS-3765.patch, 
 hdfs-3765.txt, hdfs-3765.txt, hdfs-3765.txt


 Currently, NameNode INITIALIZESHAREDEDITS provides the ability to copy the 
 edits files to file-scheme-based shared storages when moving a cluster from a 
 non-HA environment to an HA-enabled environment.
 This Jira focuses on the following:
 * Generalizing the logic of copying the edits to the new shared storage so 
 that any scheme-based shared storage can be initialized for an HA cluster.





[jira] [Updated] (HDFS-2330) In NNStorage.java, IOExceptions of stream closures can mask root exceptions.

2012-08-13 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-2330:
--

Fix Version/s: 2.2.0-alpha

Backported this small fix to branch-2 to avoid some merge conflicts in further 
backports.

 In NNStorage.java, IOExceptions of stream closures  can mask root exceptions.
 -

 Key: HDFS-2330
 URL: https://issues.apache.org/jira/browse/HDFS-2330
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Affects Versions: 0.24.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
 Fix For: 3.0.0, 2.2.0-alpha

 Attachments: HDFS-2330.patch, HDFS-2330.patch


 In NNStorage.java:
   There are many stream closures in finally blocks, and there is a chance 
   that they can mask the root exceptions.
 So, it is better to follow a pattern like the one below:
 {code}
   try {
     // ... write to the stream ...
     stream.close();
     stream = null;
   } finally {
     IOUtils.cleanup(LOG, stream);
   }
 {code}





[jira] [Updated] (HDFS-3190) Simple refactors in existing NN code to assist QuorumJournalManager extension

2012-08-13 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3190:
--

Fix Version/s: 2.2.0-alpha

Backported this to branch-2, since it was causing some conflicts in other 
backports, and it's a straight refactor.

 Simple refactors in existing NN code to assist QuorumJournalManager extension
 -

 Key: HDFS-3190
 URL: https://issues.apache.org/jira/browse/HDFS-3190
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Affects Versions: 2.0.0-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 3.0.0, 2.2.0-alpha

 Attachments: hdfs-3190.txt, hdfs-3190.txt, hdfs-3190.txt, 
 hdfs-3190.txt, hdfs-3190.txt


 This JIRA is for some simple refactors in the NN:
 - refactor the code which writes the seen_txid file in NNStorage into a new 
 LongContainingFile utility class. This is useful for the JournalNode to 
 atomically/durably record its last promised epoch
 - refactor the interface from FileJournalManager back to StorageDirectory to 
 use a StorageErrorReport interface. This allows FileJournalManager to be used 
 in isolation of a full StorageDirectory.





[jira] [Commented] (HDFS-3276) initializeSharedEdits should have a -nonInteractive flag

2012-08-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433527#comment-13433527
 ] 

Todd Lipcon commented on HDFS-3276:
---

Hudson built this here: https://builds.apache.org/job/PreCommit-HDFS-Build/2983/

but the comment was swallowed during JIRA downtime:

{quote}
-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12540156/hdfs-3276.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

-1 findbugs.  The patch appears to introduce 2 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

+1 contrib tests.  The patch passed contrib unit tests.
{quote}

Looking into the new findbugs warnings.

 initializeSharedEdits should have a -nonInteractive flag
 

 Key: HDFS-3276
 URL: https://issues.apache.org/jira/browse/HDFS-3276
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha, name-node
Affects Versions: 2.0.0-alpha
Reporter: Vinithra Varadharajan
Assignee: Todd Lipcon
Priority: Minor
 Attachments: hdfs-3276.txt


 Similar to format and bootstrapStandby, would be nice to have -nonInteractive 
 as an option on initializeSharedEdits





[jira] [Commented] (HDFS-3787) BlockManager#close races with ReplicationMonitor#run

2012-08-13 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433530#comment-13433530
 ] 

Eli Collins commented on HDFS-3787:
---

I kicked the pre-commit build manually.

 BlockManager#close races with ReplicationMonitor#run
 

 Key: HDFS-3787
 URL: https://issues.apache.org/jira/browse/HDFS-3787
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 2.0.0-alpha
Reporter: Andy Isaacson
Assignee: Andy Isaacson
Priority: Minor
 Attachments: hdfs-3787-2.txt, hdfs-3787-2.txt, hdfs-3787.txt


 We saw {{TestDirectoryScanner}} fail during shutdown:
 {code}
 2012-08-09 12:17:19,844 WARN  datanode.DataNode 
 (BPServiceActor.java:run(683)) - Ending block pool service for: Block pool 
 BP-610123021-172.29.121.238-1344539835759 (storage id 
 DS-1581877160-172.29.121.238-43609-1344539837880) service to 
 localhost/127.0.0.1:40012
 ...
 2012-08-09 12:17:19,876 FATAL blockmanagement.BlockManager 
 (BlockManager.java:run(3039)) - ReplicationMonitor thread received Runtime 
 exception. 
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.getBlockCollection(BlocksMap.java:101)
   at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1141)
   at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1116)
   at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3070)
   at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3032)
   at java.lang.Thread.run(Thread.java:662)
 {code}
 Inspecting the code, it appears that {{BlockManager#close - 
 BlocksMap#close}} can set {{blocks}} to {{null}} while 
 {{computeDatanodeWork}} is running.
 The fix seems simple -- have {{close}} just set an exit flag, and have 
 {{ReplicationMonitor#run}} call {{BlocksMap#close}}.
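The exit-flag fix proposed above can be sketched as follows. Class and method names are illustrative stand-ins, not the actual BlockManager code: `close()` only requests exit, and the monitor thread tears down the blocks map itself after its loop ends, so the map is never nulled out mid-iteration:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ReplicationMonitorSketch implements Runnable {
    private final AtomicBoolean shouldRun = new AtomicBoolean(true);
    private volatile boolean blocksMapClosed = false;

    /** The analogue of BlockManager#close: request exit, touch nothing else. */
    public void close() {
        shouldRun.set(false);
    }

    public boolean isBlocksMapClosed() {
        return blocksMapClosed;
    }

    @Override
    public void run() {
        while (shouldRun.get()) {
            computeDatanodeWork(); // safe: the map stays intact while this loop runs
        }
        closeBlocksMap(); // teardown happens on the monitor thread, after the loop
    }

    private void computeDatanodeWork() {
        // Would scan the blocks map for replication work.
    }

    private void closeBlocksMap() {
        blocksMapClosed = true; // the real code would null out the internal map here
    }
}
```

Because only the monitor thread ever touches the map after `close()` is called, the NullPointerException race disappears.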





[jira] [Updated] (HDFS-3792) Fix two findbugs introduced by HDFS-3695

2012-08-13 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3792:
--

Attachment: hdfs-3792.txt

Trivial fix: forgot to add synchronized to these two methods and missed it in 
the QA report on HDFS-3695.

 Fix two findbugs introduced by HDFS-3695
 

 Key: HDFS-3792
 URL: https://issues.apache.org/jira/browse/HDFS-3792
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build, name-node
Affects Versions: 3.0.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Trivial
 Attachments: hdfs-3792.txt


 Accidentally introduced two trivial findbugs warnings in HDFS-3695. This JIRA 
 is to fix them.





[jira] [Updated] (HDFS-3792) Fix two findbugs introduced by HDFS-3695

2012-08-13 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3792:
--

Status: Patch Available  (was: Open)

 Fix two findbugs introduced by HDFS-3695
 

 Key: HDFS-3792
 URL: https://issues.apache.org/jira/browse/HDFS-3792
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build, name-node
Affects Versions: 3.0.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Trivial
 Attachments: hdfs-3792.txt


 Accidentally introduced two trivial findbugs warnings in HDFS-3695. This JIRA 
 is to fix them.





[jira] [Commented] (HDFS-3672) Expose disk-location information for blocks to enable better scheduling

2012-08-13 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433538#comment-13433538
 ] 

Suresh Srinivas commented on HDFS-3672:
---

bq. Perhaps Storage(BlockLocation|Id)? Volume(BlockLocation|Id)? I'm not 
entirely sure of the end-user terminology here.
DiskBlockLocation could be BlockStorageLocation or just StorageLocation.
DiskId - StorageId seems appropriate here. However, it is used for other things 
in HDFS. As you suggested, perhaps VolumeId may be okay.

bq.  Should I just bump the default (say, to 10)? I haven't done any 
performance testing, so I don't know if it's a problem.
With this feature there will be more RPC calls to datanodes, which may need 
more handlers. A handler is just a thread, so increasing it to 10 should be 
fine.

@aaron - we need a server-side config as well; that is the only way an admin 
could control access to the feature. On the client side, instead of a config, 
one could check for an exception from (or support for) the required method to 
figure out whether the server supports the functionality.

Please address my previous comment:
bq. Is there a timeline where someone will work on HBase or MapReduce 
enhancements to use this capability?


 Expose disk-location information for blocks to enable better scheduling
 ---

 Key: HDFS-3672
 URL: https://issues.apache.org/jira/browse/HDFS-3672
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: design-doc-v1.pdf, design-doc-v2.pdf, hdfs-3672-1.patch, 
 hdfs-3672-2.patch, hdfs-3672-3.patch, hdfs-3672-4.patch, hdfs-3672-5.patch, 
 hdfs-3672-6.patch, hdfs-3672-7.patch, hdfs-3672-8.patch


 Currently, HDFS exposes on which datanodes a block resides, which allows 
 clients to make scheduling decisions for locality and load balancing. 
 Extending this to also expose on which disk on a datanode a block resides 
 would enable even better scheduling, on a per-disk rather than coarse 
 per-datanode basis.
 This API would likely look similar to Filesystem#getFileBlockLocations, but 
 also involve a series of RPCs to the responsible datanodes to determine disk 
 ids.





[jira] [Commented] (HDFS-3276) initializeSharedEdits should have a -nonInteractive flag

2012-08-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433539#comment-13433539
 ] 

Todd Lipcon commented on HDFS-3276:
---

The two new findbugs warnings are from HDFS-3792 - not caused by this patch.

bq. -1 tests included. The patch doesn't appear to include any new or modified 
tests.

There are no new tests since this is just hooking up existing code (which is 
tested in TestInitializeSharedEdits) to command line flags. I manually tested 
the command line flags and verified they perform as expected.

 initializeSharedEdits should have a -nonInteractive flag
 

 Key: HDFS-3276
 URL: https://issues.apache.org/jira/browse/HDFS-3276
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha, name-node
Affects Versions: 2.0.0-alpha
Reporter: Vinithra Varadharajan
Assignee: Todd Lipcon
Priority: Minor
 Attachments: hdfs-3276.txt


 Similar to format and bootstrapStandby, would be nice to have -nonInteractive 
 as an option on initializeSharedEdits





[jira] [Commented] (HDFS-3792) Fix two findbugs introduced by HDFS-3695

2012-08-13 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433544#comment-13433544
 ] 

Aaron T. Myers commented on HDFS-3792:
--

+1 pending Jenkins.

 Fix two findbugs introduced by HDFS-3695
 

 Key: HDFS-3792
 URL: https://issues.apache.org/jira/browse/HDFS-3792
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build, name-node
Affects Versions: 3.0.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Trivial
 Attachments: hdfs-3792.txt


 Accidentally introduced two trivial findbugs warnings in HDFS-3695. This JIRA 
 is to fix them.





[jira] [Commented] (HDFS-3276) initializeSharedEdits should have a -nonInteractive flag

2012-08-13 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433550#comment-13433550
 ] 

Aaron T. Myers commented on HDFS-3276:
--

+1, the patch looks good to me.

 initializeSharedEdits should have a -nonInteractive flag
 

 Key: HDFS-3276
 URL: https://issues.apache.org/jira/browse/HDFS-3276
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha, name-node
Affects Versions: 2.0.0-alpha
Reporter: Vinithra Varadharajan
Assignee: Todd Lipcon
Priority: Minor
 Attachments: hdfs-3276.txt


 Similar to format and bootstrapStandby, would be nice to have -nonInteractive 
 as an option on initializeSharedEdits





[jira] [Updated] (HDFS-3276) initializeSharedEdits should have a -nonInteractive flag

2012-08-13 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3276:
--

   Resolution: Fixed
Fix Version/s: 2.2.0-alpha
   3.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to branch-2 and trunk. Thanks.

 initializeSharedEdits should have a -nonInteractive flag
 

 Key: HDFS-3276
 URL: https://issues.apache.org/jira/browse/HDFS-3276
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha, name-node
Affects Versions: 2.0.0-alpha
Reporter: Vinithra Varadharajan
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 3.0.0, 2.2.0-alpha

 Attachments: hdfs-3276.txt


 Similar to format and bootstrapStandby, would be nice to have -nonInteractive 
 as an option on initializeSharedEdits





[jira] [Created] (HDFS-3793) Implement genericized format() in QJM

2012-08-13 Thread Todd Lipcon (JIRA)
Todd Lipcon created HDFS-3793:
-

 Summary: Implement genericized format() in QJM
 Key: HDFS-3793
 URL: https://issues.apache.org/jira/browse/HDFS-3793
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Affects Versions: QuorumJournalManager (HDFS-3077)
Reporter: Todd Lipcon
Assignee: Todd Lipcon


HDFS-3695 added the ability for non-File journal managers to tie into calls 
like NameNode -format. This JIRA is to implement format() for 
QuorumJournalManager.





[jira] [Commented] (HDFS-3672) Expose disk-location information for blocks to enable better scheduling

2012-08-13 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433582#comment-13433582
 ] 

Andrew Purtell commented on HDFS-3672:
--

bq. Is there a timeline where someone will work on HBase or MapReduce 
enhancements to use this capability?

I put up some ramblings on HBASE-6572. The scope there is much larger and 
there's no timeline; it's a brainstorming issue. However, if you'd like, this 
issue can be linked to it.

 Expose disk-location information for blocks to enable better scheduling
 ---

 Key: HDFS-3672
 URL: https://issues.apache.org/jira/browse/HDFS-3672
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: design-doc-v1.pdf, design-doc-v2.pdf, hdfs-3672-1.patch, 
 hdfs-3672-2.patch, hdfs-3672-3.patch, hdfs-3672-4.patch, hdfs-3672-5.patch, 
 hdfs-3672-6.patch, hdfs-3672-7.patch, hdfs-3672-8.patch


 Currently, HDFS exposes on which datanodes a block resides, which allows 
 clients to make scheduling decisions for locality and load balancing. 
 Extending this to also expose on which disk on a datanode a block resides 
 would enable even better scheduling, on a per-disk rather than coarse 
 per-datanode basis.
 This API would likely look similar to Filesystem#getFileBlockLocations, but 
 also involve a series of RPCs to the responsible datanodes to determine disk 
 ids.





[jira] [Created] (HDFS-3794) WebHDFS Open used with Offset returns the original (and incorrect) Content Length in the HTTP Header.

2012-08-13 Thread Ravi Prakash (JIRA)
Ravi Prakash created HDFS-3794:
--

 Summary: WebHDFS Open used with Offset returns the original (and 
incorrect) Content Length in the HTTP Header.
 Key: HDFS-3794
 URL: https://issues.apache.org/jira/browse/HDFS-3794
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 2.0.0-alpha, 0.23.3, 2.1.0-alpha
Reporter: Ravi Prakash
Assignee: Ravi Prakash


When an offset is specified, the HTTP Content-Length header still contains the 
original file size. e.g. if the original file is 100 bytes and the offset 
specified is 10, then the HTTP Content-Length ought to be 90. Currently it is 
still returned as 100.
This causes curl to give error 18, and Java clients to throw a 
ConnectionClosedException.
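The expected header value is simple arithmetic: the remaining bytes after the offset. A tiny illustrative helper (not the actual WebHDFS servlet code; the method name is hypothetical) makes the off-by-offset bug concrete:

```java
public class ContentLengthExample {
    /**
     * Content-Length for an offset read: the bytes remaining after the
     * offset, not the full file size.
     */
    static long contentLengthFor(long fileSize, long offset) {
        if (offset > fileSize) {
            // WebHDFS reports reads past EOF as a RemoteException; here we
            // just signal the invalid argument.
            throw new IllegalArgumentException("offset beyond end of file");
        }
        return fileSize - offset;
    }
}
```

For the example in the description, `contentLengthFor(100, 10)` yields 90, whereas the buggy behavior reports 100 and leaves the client waiting for 10 bytes that never arrive (curl error 18).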





[jira] [Updated] (HDFS-3793) Implement genericized format() in QJM

2012-08-13 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3793:
--

Attachment: hdfs-3793.txt

Attached patch implements the formatting behavior.

In addition to changing the tests to use this new API to format at startup, I 
also tested this manually on a cluster using both namenode -format and 
namenode -initializeSharedEdits. Both the confirmation behavior and the 
formatting behavior worked correctly.

 Implement genericized format() in QJM
 -

 Key: HDFS-3793
 URL: https://issues.apache.org/jira/browse/HDFS-3793
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Affects Versions: QuorumJournalManager (HDFS-3077)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: hdfs-3793.txt


 HDFS-3695 added the ability for non-File journal managers to tie into calls 
 like NameNode -format. This JIRA is to implement format() for 
 QuorumJournalManager.





[jira] [Updated] (HDFS-3723) All commands should support meaningful --help

2012-08-13 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-3723:


Attachment: HDFS-3723.001.patch

Suresh, thanks for the comments. I have addressed them and added a help 
function in DFSUtil. I used the function to parse and check the help argument 
for the DataNode, NameNode, ZKFC, FSCK, Balancer, GetConf, and GetGroups 
commands. Other commands such as JmxGet have their own mechanisms to handle 
the help argument, so I did not change them.

 All commands should support meaningful --help
 -

 Key: HDFS-3723
 URL: https://issues.apache.org/jira/browse/HDFS-3723
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: scripts, tools
Affects Versions: 2.0.0-alpha
Reporter: E. Sammer
Assignee: Jing Zhao
 Attachments: HDFS-3723.001.patch, HDFS-3723.patch, HDFS-3723.patch


 Some (sub)commands support -help or -h options for detailed help while others 
 do not. Ideally, all commands should support meaningful help that works 
 regardless of current state or configuration.
 For example, hdfs zkfc --help (or -h or -help) is not very useful. Option 
 checking should occur before state / configuration checking.
 {code}
 [esammer@hadoop-fed01 ~]# hdfs zkfc --help
 Exception in thread main org.apache.hadoop.HadoopIllegalArgumentException: 
 HA is not enabled for this namenode.
 at 
 org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
 at 
 org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168)
 {code}
 This would go a long way toward better usability for ops staff.
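The "option checking before state / configuration checking" idea can be sketched as a small helper that runs before any HA-configuration validation, so `--help` works even on a node where HA is not enabled. This is an illustrative sketch, not the actual DFSZKFailoverController or DFSUtil code, and the usage string is hypothetical:

```java
public class HelpFirst {
    static final String USAGE = "Usage: hdfs zkfc [-formatZK [-force]] [-h|--help]";

    /** True if any argument is a recognized help flag. */
    static boolean isHelpRequested(String[] args) {
        for (String a : args) {
            if (a.equals("-h") || a.equals("-help") || a.equals("--help")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        if (isHelpRequested(args)) {
            System.out.println(USAGE);
            return; // exits before any configuration or HA-state check runs
        }
        // ... configuration and state validation would follow here ...
    }
}
```

The key design point is ordering: the help check touches only the argument array, so it cannot fail with a HadoopIllegalArgumentException the way the current setConf-first flow does.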





[jira] [Commented] (HDFS-3794) WebHDFS Open used with Offset returns the original (and incorrect) Content Length in the HTTP Header.

2012-08-13 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433603#comment-13433603
 ] 

Ravi Prakash commented on HDFS-3794:


{noformat}
e.g. $ curl -L "http://HOST:PORT/webhdfs/v1/somePath/someFile?op=OPEN&offset=10"
curl: (18) transfer closed with 10 bytes remaining to read
{noformat}


 WebHDFS Open used with Offset returns the original (and incorrect) Content 
 Length in the HTTP Header.
 -

 Key: HDFS-3794
 URL: https://issues.apache.org/jira/browse/HDFS-3794
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-alpha
Reporter: Ravi Prakash
Assignee: Ravi Prakash

 When an offset is specified, the HTTP Content-Length header still contains 
 the original file size. e.g. if the original file is 100 bytes and the 
 offset specified is 10, then the HTTP Content-Length ought to be 90. 
 Currently it is still returned as 100.
 This causes curl to give error 18, and Java clients to throw a 
 ConnectionClosedException.





[jira] [Created] (HDFS-3795) QJM: validate journal dir at startup

2012-08-13 Thread Todd Lipcon (JIRA)
Todd Lipcon created HDFS-3795:
-

 Summary: QJM: validate journal dir at startup
 Key: HDFS-3795
 URL: https://issues.apache.org/jira/browse/HDFS-3795
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Affects Versions: QuorumJournalManager (HDFS-3077)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: hdfs-3795.txt

Currently, the JN does not validate the configured journal directory until it 
tries to write into it. This is counter-intuitive for users, since they would 
expect to find out about a misconfiguration at startup time, rather than on 
first access. Additionally, two testers accidentally configured the journal dir 
to be a URI, which the code silently interpreted as a relative path 
({{CWD/file:/foo/bar}}).

We should validate the config at startup to be an accessible absolute path.
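A minimal sketch of that startup check follows. It is illustrative only (not the actual JournalNode code; the method name is hypothetical): it rejects both relative paths and URI-looking values before any write is attempted:

```java
import java.io.File;

public class JournalDirCheck {
    /**
     * Validate the configured journal directory at startup: it must be a
     * plain absolute filesystem path, not a URI and not a relative path.
     */
    static File validateJournalDir(String configured) {
        File dir = new File(configured);
        // A value like "file:/foo/bar" would otherwise be treated as the
        // relative path CWD/file:/foo/bar.
        if (configured.contains(":/") || !dir.isAbsolute()) {
            throw new IllegalArgumentException(
                "Journal dir '" + configured + "' should be an absolute path");
        }
        return dir;
    }
}
```

Failing fast here turns a confusing first-write error into an immediate, clearly worded startup failure.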





[jira] [Updated] (HDFS-3795) QJM: validate journal dir at startup

2012-08-13 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3795:
--

Attachment: hdfs-3795.txt

Simple patch attached.

 QJM: validate journal dir at startup
 

 Key: HDFS-3795
 URL: https://issues.apache.org/jira/browse/HDFS-3795
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Affects Versions: QuorumJournalManager (HDFS-3077)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: hdfs-3795.txt


 Currently, the JN does not validate the configured journal directory until it 
 tries to write into it. This is counter-intuitive for users, since they would 
 expect to find out about a misconfiguration at startup time, rather than on 
 first access. Additionally, two testers accidentally configured the journal 
 dir to be a URI, which the code silently interpreted as a relative path 
 ({{CWD/file:/foo/bar}}).
 We should validate the config at startup to be an accessible absolute path.





[jira] [Updated] (HDFS-3794) WebHDFS Open used with Offset returns the original (and incorrect) Content Length in the HTTP Header.

2012-08-13 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-3794:
---

Attachment: HDFS-3794.patch

Attaching a patch that fixes the issue. It's too trivial to write a unit test 
for (which would have to be pretty complicated :'( ... I tried briefly).
Here's the testing I did:
1. Small file with offset. Worked.
2. Big file (multiple blocks) with offset. Worked.
3. Big file with offset greater than the file size. Correctly threw a 
RemoteException.

 WebHDFS Open used with Offset returns the original (and incorrect) Content 
 Length in the HTTP Header.
 -

 Key: HDFS-3794
 URL: https://issues.apache.org/jira/browse/HDFS-3794
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-alpha
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Attachments: HDFS-3794.patch


 When an offset is specified, the HTTP Content-Length header still contains 
 the original file size. e.g. if the original file is 100 bytes and the 
 offset specified is 10, then the HTTP Content-Length ought to be 90. 
 Currently it is still returned as 100.
 This causes curl to give error 18, and Java clients to throw a 
 ConnectionClosedException.





[jira] [Commented] (HDFS-3794) WebHDFS Open used with Offset returns the original (and incorrect) Content Length in the HTTP Header.

2012-08-13 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433609#comment-13433609
 ] 

Ravi Prakash commented on HDFS-3794:


The same patch applies to branch-0.23, branch-2, and trunk.


 WebHDFS Open used with Offset returns the original (and incorrect) Content 
 Length in the HTTP Header.
 -

 Key: HDFS-3794
 URL: https://issues.apache.org/jira/browse/HDFS-3794
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-alpha
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Attachments: HDFS-3794.patch


 When an offset is specified, the HTTP Content-Length header still contains 
 the original file size. e.g. if the original file is 100 bytes and the 
 offset specified is 10, then the HTTP Content-Length ought to be 90. 
 Currently it is still returned as 100.
 This causes curl to give error 18, and Java clients to throw a 
 ConnectionClosedException.





[jira] [Updated] (HDFS-3794) WebHDFS Open used with Offset returns the original (and incorrect) Content Length in the HTTP Header.

2012-08-13 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-3794:
---

Status: Patch Available  (was: Open)

 WebHDFS Open used with Offset returns the original (and incorrect) Content 
 Length in the HTTP Header.
 -

 Key: HDFS-3794
 URL: https://issues.apache.org/jira/browse/HDFS-3794
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 2.0.0-alpha, 0.23.3, 2.1.0-alpha
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Attachments: HDFS-3794.patch


 When an offset is specified, the HTTP Content-Length header still contains 
 the original file size. e.g. if the original file is 100 bytes and the 
 offset specified is 10, then the HTTP Content-Length ought to be 90. 
 Currently it is still returned as 100.
 This causes curl to give error 18, and Java clients to throw a 
 ConnectionClosedException.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2330) In NNStorage.java, IOExceptions of stream closures can mask root exceptions.

2012-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433613#comment-13433613
 ] 

Hudson commented on HDFS-2330:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2639 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2639/])
Move HDFS-2330 and HDFS-3190 to branch-2 section, since they have been 
backported from trunk. (Revision 1372605)

 Result = SUCCESS
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372605
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 In NNStorage.java, IOExceptions of stream closures  can mask root exceptions.
 -

 Key: HDFS-2330
 URL: https://issues.apache.org/jira/browse/HDFS-2330
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Affects Versions: 0.24.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
 Fix For: 3.0.0, 2.2.0-alpha

 Attachments: HDFS-2330.patch, HDFS-2330.patch


 In NNStorage.java:
   There are many stream closures in finally block. 
   There is a chance that they can mask the root exceptions.
 So, better to follow the pattern like below:
 {code}
 try {
   // ... write to the stream ...
   stream.close();
   stream = null;
 } finally {
   // IOUtils.cleanup closes quietly, so if close() above never ran
   // (because a write failed), closing here cannot mask the root exception
   IOUtils.cleanup(LOG, stream);
 }
 {code}
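 To see why the pattern matters, here is a minimal, self-contained sketch of 
 the masking problem; the names are illustrative (closeQuietly stands in for 
 IOUtils.cleanup), and this is not the NNStorage code:

```java
import java.io.Closeable;
import java.io.IOException;

// Sketch of the exception-masking problem: if close() is called directly in
// the finally block and it throws, the original write failure is lost.
// Closing quietly instead preserves the root exception.
public class MaskingSketch {
    static void closeQuietly(Closeable c) {
        if (c != null) {
            try {
                c.close();
            } catch (IOException e) {
                // log and swallow; never replace the in-flight exception
            }
        }
    }

    /** Always propagates the write failure, never a secondary close failure. */
    public static void writeSafely(Closeable stream) throws IOException {
        try {
            // simulate a write failure; on success we would instead do
            // stream.close(); stream = null;
            throw new IOException("root cause: write failed");
        } finally {
            closeQuietly(stream); // a close() failure here cannot mask the root cause
        }
    }
}
```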





[jira] [Commented] (HDFS-3276) initializeSharedEdits should have a -nonInteractive flag

2012-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433614#comment-13433614
 ] 

Hudson commented on HDFS-3276:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2639 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2639/])
HDFS-3276. initializeSharedEdits should have a -nonInteractive flag. 
Contributed by Todd Lipcon. (Revision 1372628)

 Result = SUCCESS
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372628
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java


 initializeSharedEdits should have a -nonInteractive flag
 

 Key: HDFS-3276
 URL: https://issues.apache.org/jira/browse/HDFS-3276
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha, name-node
Affects Versions: 2.0.0-alpha
Reporter: Vinithra Varadharajan
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 3.0.0, 2.2.0-alpha

 Attachments: hdfs-3276.txt


 Similar to format and bootstrapStandby, would be nice to have -nonInteractive 
 as an option on initializeSharedEdits





[jira] [Commented] (HDFS-3190) Simple refactors in existing NN code to assist QuorumJournalManager extension

2012-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433612#comment-13433612
 ] 

Hudson commented on HDFS-3190:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2639 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2639/])
Move HDFS-2330 and HDFS-3190 to branch-2 section, since they have been 
backported from trunk. (Revision 1372605)

 Result = SUCCESS
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372605
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Simple refactors in existing NN code to assist QuorumJournalManager extension
 -

 Key: HDFS-3190
 URL: https://issues.apache.org/jira/browse/HDFS-3190
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Affects Versions: 2.0.0-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 3.0.0, 2.2.0-alpha

 Attachments: hdfs-3190.txt, hdfs-3190.txt, hdfs-3190.txt, 
 hdfs-3190.txt, hdfs-3190.txt


 This JIRA is for some simple refactors in the NN:
 - refactor the code which writes the seen_txid file in NNStorage into a new 
 LongContainingFile utility class. This is useful for the JournalNode to 
 atomically/durably record its last promised epoch
 - refactor the interface from FileJournalManager back to StorageDirectory to 
 use a StorageErrorReport interface. This allows FileJournalManager to be used 
 in isolation of a full StorageDirectory.





[jira] [Commented] (HDFS-2330) In NNStorage.java, IOExceptions of stream closures can mask root exceptions.

2012-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433624#comment-13433624
 ] 

Hudson commented on HDFS-2330:
--

Integrated in Hadoop-Common-trunk-Commit #2574 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2574/])
Move HDFS-2330 and HDFS-3190 to branch-2 section, since they have been 
backported from trunk. (Revision 1372605)

 Result = SUCCESS
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372605
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 In NNStorage.java, IOExceptions of stream closures  can mask root exceptions.
 -

 Key: HDFS-2330
 URL: https://issues.apache.org/jira/browse/HDFS-2330
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Affects Versions: 0.24.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
 Fix For: 3.0.0, 2.2.0-alpha

 Attachments: HDFS-2330.patch, HDFS-2330.patch


 In NNStorage.java:
   There are many stream closures in finally block. 
   There is a chance that they can mask the root exceptions.
 So, better to follow the pattern like below:
 {code}
 try {
   // ... write to the stream ...
   stream.close();
   stream = null;
 } finally {
   // IOUtils.cleanup closes quietly, so if close() above never ran
   // (because a write failed), closing here cannot mask the root exception
   IOUtils.cleanup(LOG, stream);
 }
 {code}





[jira] [Commented] (HDFS-3190) Simple refactors in existing NN code to assist QuorumJournalManager extension

2012-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433623#comment-13433623
 ] 

Hudson commented on HDFS-3190:
--

Integrated in Hadoop-Common-trunk-Commit #2574 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2574/])
Move HDFS-2330 and HDFS-3190 to branch-2 section, since they have been 
backported from trunk. (Revision 1372605)

 Result = SUCCESS
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372605
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Simple refactors in existing NN code to assist QuorumJournalManager extension
 -

 Key: HDFS-3190
 URL: https://issues.apache.org/jira/browse/HDFS-3190
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Affects Versions: 2.0.0-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 3.0.0, 2.2.0-alpha

 Attachments: hdfs-3190.txt, hdfs-3190.txt, hdfs-3190.txt, 
 hdfs-3190.txt, hdfs-3190.txt


 This JIRA is for some simple refactors in the NN:
 - refactor the code which writes the seen_txid file in NNStorage into a new 
 LongContainingFile utility class. This is useful for the JournalNode to 
 atomically/durably record its last promised epoch
 - refactor the interface from FileJournalManager back to StorageDirectory to 
 use a StorageErrorReport interface. This allows FileJournalManager to be used 
 in isolation of a full StorageDirectory.





[jira] [Commented] (HDFS-3276) initializeSharedEdits should have a -nonInteractive flag

2012-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433625#comment-13433625
 ] 

Hudson commented on HDFS-3276:
--

Integrated in Hadoop-Common-trunk-Commit #2574 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2574/])
HDFS-3276. initializeSharedEdits should have a -nonInteractive flag. 
Contributed by Todd Lipcon. (Revision 1372628)

 Result = SUCCESS
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372628
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java


 initializeSharedEdits should have a -nonInteractive flag
 

 Key: HDFS-3276
 URL: https://issues.apache.org/jira/browse/HDFS-3276
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha, name-node
Affects Versions: 2.0.0-alpha
Reporter: Vinithra Varadharajan
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 3.0.0, 2.2.0-alpha

 Attachments: hdfs-3276.txt


 Similar to format and bootstrapStandby, would be nice to have -nonInteractive 
 as an option on initializeSharedEdits





[jira] [Commented] (HDFS-3190) Simple refactors in existing NN code to assist QuorumJournalManager extension

2012-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433628#comment-13433628
 ] 

Hudson commented on HDFS-3190:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2596 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2596/])
Move HDFS-2330 and HDFS-3190 to branch-2 section, since they have been 
backported from trunk. (Revision 1372605)

 Result = FAILURE
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372605
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Simple refactors in existing NN code to assist QuorumJournalManager extension
 -

 Key: HDFS-3190
 URL: https://issues.apache.org/jira/browse/HDFS-3190
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Affects Versions: 2.0.0-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 3.0.0, 2.2.0-alpha

 Attachments: hdfs-3190.txt, hdfs-3190.txt, hdfs-3190.txt, 
 hdfs-3190.txt, hdfs-3190.txt


 This JIRA is for some simple refactors in the NN:
 - refactor the code which writes the seen_txid file in NNStorage into a new 
 LongContainingFile utility class. This is useful for the JournalNode to 
 atomically/durably record its last promised epoch
 - refactor the interface from FileJournalManager back to StorageDirectory to 
 use a StorageErrorReport interface. This allows FileJournalManager to be used 
 in isolation of a full StorageDirectory.





[jira] [Commented] (HDFS-2330) In NNStorage.java, IOExceptions of stream closures can mask root exceptions.

2012-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433629#comment-13433629
 ] 

Hudson commented on HDFS-2330:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2596 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2596/])
Move HDFS-2330 and HDFS-3190 to branch-2 section, since they have been 
backported from trunk. (Revision 1372605)

 Result = FAILURE
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372605
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 In NNStorage.java, IOExceptions of stream closures  can mask root exceptions.
 -

 Key: HDFS-2330
 URL: https://issues.apache.org/jira/browse/HDFS-2330
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Affects Versions: 0.24.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
 Fix For: 3.0.0, 2.2.0-alpha

 Attachments: HDFS-2330.patch, HDFS-2330.patch


 In NNStorage.java:
   There are many stream closures in finally block. 
   There is a chance that they can mask the root exceptions.
 So, better to follow the pattern like below:
 {code}
 try {
   // ... write to the stream ...
   stream.close();
   stream = null;
 } finally {
   // IOUtils.cleanup closes quietly, so if close() above never ran
   // (because a write failed), closing here cannot mask the root exception
   IOUtils.cleanup(LOG, stream);
 }
 {code}





[jira] [Commented] (HDFS-3723) All commands should support meaningful --help

2012-08-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433630#comment-13433630
 ] 

Hadoop QA commented on HDFS-3723:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12540767/HDFS-3723.001.patch
  against trunk revision .

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2993//console

This message is automatically generated.

 All commands should support meaningful --help
 -

 Key: HDFS-3723
 URL: https://issues.apache.org/jira/browse/HDFS-3723
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: scripts, tools
Affects Versions: 2.0.0-alpha
Reporter: E. Sammer
Assignee: Jing Zhao
 Attachments: HDFS-3723.001.patch, HDFS-3723.patch, HDFS-3723.patch


 Some (sub)commands support -help or -h options for detailed help while others 
 do not. Ideally, all commands should support meaningful help that works 
 regardless of current state or configuration.
 For example, hdfs zkfc --help (or -h or -help) is not very useful. Option 
 checking should occur before state / configuration checking.
 {code}
 [esammer@hadoop-fed01 ~]# hdfs zkfc --help
 Exception in thread "main" org.apache.hadoop.HadoopIllegalArgumentException: 
 HA is not enabled for this namenode.
 at 
 org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
 at 
 org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168)
 {code}
 This would go a long way toward better usability for ops staff.
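 The suggested ordering can be sketched as follows, with hypothetical names 
 (this is not the DFSZKFailoverController code): recognize help flags before 
 any state or configuration is consulted.

```java
// Illustrative only: scan for help flags before any state/configuration
// checks, so "--help" works even when HA is not configured.
public class HelpFirstSketch {
    public static boolean isHelpRequested(String[] args) {
        for (String a : args) {
            if (a.equals("-h") || a.equals("-help") || a.equals("--help")) {
                return true;
            }
        }
        return false;
    }
}
```

 A main() built on this would print usage and exit before ever calling 
 setConf(), avoiding the HadoopIllegalArgumentException above.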





[jira] [Updated] (HDFS-3150) Add option for clients to contact DNs via hostname

2012-08-13 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-3150:
--

Attachment: hdfs-3150.txt

Patch attached again since it looks like jira lost the old version.

 Add option for clients to contact DNs via hostname
 --

 Key: HDFS-3150
 URL: https://issues.apache.org/jira/browse/HDFS-3150
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: data-node, hdfs client
Affects Versions: 1.0.0, 2.0.0-alpha
Reporter: Eli Collins
Assignee: Eli Collins
 Fix For: 1.1.0

 Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt, hdfs-3150.txt, 
 hdfs-3150.txt, hdfs-3150.txt


 The DN listens on multiple IP addresses (the default {{dfs.datanode.address}} 
 is the wildcard) however per HADOOP-6867 only the source address (IP) of the 
 registration is given to clients. HADOOP-985 made clients access datanodes by 
 IP primarily to avoid the latency of a DNS lookup; this had the side effect 
 of breaking DN multihoming (the client cannot route the IP exposed by the NN 
 if the DN registers with an interface that has a cluster-private IP). To fix 
 this, let's add back the option for Datanodes to be accessed by hostname.
 This can be done by:
 # Modifying the primary field of the Datanode descriptor to be the hostname, 
 or 
 # Modifying Client/Datanode -> Datanode access to use the hostname field 
 instead of the IP
 Approach #2 does not require an incompatible client protocol change, and is 
 much less invasive. It minimizes the scope of modification to just places 
 where clients and Datanodes connect, vs changing all uses of Datanode 
 identifiers.
 New client and Datanode configuration options are introduced:
 - {{dfs.client.use.datanode.hostname}} indicates all client to datanode 
 connections should use the datanode hostname (as clients outside cluster may 
 not be able to route the IP)
 - {{dfs.datanode.use.datanode.hostname}} indicates whether Datanodes should 
 use hostnames when connecting to other Datanodes for data transfer
 If the configuration options are not used, there is no change in the current 
 behavior.





[jira] [Created] (HDFS-3796) Speed up edit log tests by avoiding fsync()

2012-08-13 Thread Todd Lipcon (JIRA)
Todd Lipcon created HDFS-3796:
-

 Summary: Speed up edit log tests by avoiding fsync()
 Key: HDFS-3796
 URL: https://issues.apache.org/jira/browse/HDFS-3796
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: test
Affects Versions: 3.0.0, 2.2.0-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor


Our edit log tests are very slow because they incur a lot of fsyncs as they 
write out transactions. Since fsync() has no effect except in the case of power 
outages or system crashes, and we don't care about power outages in the context 
of tests, we can safely skip the fsync without any loss in coverage.

In my tests, this sped up TestEditLog by about 5x. The testFuzzSequences test 
case improved from ~83 seconds with fsync to about 5 seconds without. These 
results are from my SSD laptop - they are probably even more drastic on 
spinning media.
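One way to realize this (a sketch under assumed names, not the actual HDFS-3796 
patch) is a test-only switch that bypasses the force() call:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;

// Sketch only: a durability switch that tests can flip so edit-log writes
// skip the expensive fsync. Production code leaves the flag at its default.
public class SyncSketch {
    // When true (tests only), FileChannel.force() is skipped entirely.
    public static volatile boolean skipFsyncForTesting = false;

    public static void flushAndSync(FileChannel channel) throws IOException {
        if (!skipFsyncForTesting) {
            channel.force(true); // push data and metadata to stable storage
        }
    }
}
```

This trades durability for speed only where durability is irrelevant, which is 
exactly the test scenario described above.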





[jira] [Updated] (HDFS-3796) Speed up edit log tests by avoiding fsync()

2012-08-13 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3796:
--

Status: Patch Available  (was: Open)

 Speed up edit log tests by avoiding fsync()
 ---

 Key: HDFS-3796
 URL: https://issues.apache.org/jira/browse/HDFS-3796
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: test
Affects Versions: 3.0.0, 2.2.0-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: hdfs-3796.txt


 Our edit log tests are very slow because they incur a lot of fsyncs as they 
 write out transactions. Since fsync() has no effect except in the case of 
 power outages or system crashes, and we don't care about power outages in the 
 context of tests, we can safely skip the fsync without any loss in coverage.
 In my tests, this sped up TestEditLog by about 5x. The testFuzzSequences test 
 case improved from ~83 seconds with fsync to about 5 seconds without. These 
 results are from my SSD laptop - they are probably even more drastic on 
 spinning media.





[jira] [Updated] (HDFS-3796) Speed up edit log tests by avoiding fsync()

2012-08-13 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3796:
--

Attachment: hdfs-3796.txt

 Speed up edit log tests by avoiding fsync()
 ---

 Key: HDFS-3796
 URL: https://issues.apache.org/jira/browse/HDFS-3796
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: test
Affects Versions: 3.0.0, 2.2.0-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: hdfs-3796.txt


 Our edit log tests are very slow because they incur a lot of fsyncs as they 
 write out transactions. Since fsync() has no effect except in the case of 
 power outages or system crashes, and we don't care about power outages in the 
 context of tests, we can safely skip the fsync without any loss in coverage.
 In my tests, this sped up TestEditLog by about 5x. The testFuzzSequences test 
 case improved from ~83 seconds with fsync to about 5 seconds without. These 
 results are from my SSD laptop - they are probably even more drastic on 
 spinning media.





[jira] [Created] (HDFS-3797) QJM: add segment txid as a parameter to journal() RPC

2012-08-13 Thread Todd Lipcon (JIRA)
Todd Lipcon created HDFS-3797:
-

 Summary: QJM: add segment txid as a parameter to journal() RPC
 Key: HDFS-3797
 URL: https://issues.apache.org/jira/browse/HDFS-3797
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Affects Versions: QuorumJournalManager (HDFS-3077)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor


During fault testing of QJM, I saw the following issue:

1) NN sends txn 5 to JN
2) NN gets partitioned from JN while JN remains up. The following two RPCs are 
missed during the partition:
2a) finalizeSegment(1-5)
2b) startSegment(6)
3) NN sends txn 6 to JN

This caused one of the JNs to end up with a segment 1-10 while the others had 
two segments: 1-5 and 6-10. This broke some invariants of the QJM protocol and 
prevented the recovery protocol from running properly.

This can be addressed on the client side by HDFS-3726, which would cause the NN 
to not send the RPC in #3. But it makes sense to also add an extra safety check 
here on the server side: with every journal() call, we can send the segment's 
txid. Then if the JN and the client get out of sync, the JN can reject the 
RPCs.
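The proposed server-side guard can be sketched as follows (illustrative names, 
not the actual JournalNode code): every journal() call carries its segment's 
start txid, and the node rejects calls that disagree with the segment it 
believes is open.

```java
// Sketch of the proposed safety check: a JournalNode-side guard that rejects
// journal() calls whose segment txid does not match the currently open segment.
public class SegmentGuardSketch {
    private final long currentSegmentTxId;

    public SegmentGuardSketch(long currentSegmentTxId) {
        this.currentSegmentTxId = currentSegmentTxId;
    }

    /** Throws if the writer and this node disagree about the open segment. */
    public void checkJournalRequest(long segmentTxId) {
        if (segmentTxId != currentSegmentTxId) {
            throw new IllegalStateException("Expected segment starting at txid "
                + currentSegmentTxId + " but request was for segment starting at txid "
                + segmentTxId);
        }
    }
}
```

In the failure above, the JN that missed startSegment(6) would still consider 
segment 1 open and would therefore reject the txn-6 write instead of silently 
extending the wrong segment.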





[jira] [Commented] (HDFS-3796) Speed up edit log tests by avoiding fsync()

2012-08-13 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433692#comment-13433692
 ] 

Suresh Srinivas commented on HDFS-3796:
---

Todd, do multiple JUnit tests reuse the JVM? If so, you are better off adding 
this to @BeforeClass and @AfterClass.

 Speed up edit log tests by avoiding fsync()
 ---

 Key: HDFS-3796
 URL: https://issues.apache.org/jira/browse/HDFS-3796
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: test
Affects Versions: 3.0.0, 2.2.0-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: hdfs-3796.txt


 Our edit log tests are very slow because they incur a lot of fsyncs as they 
 write out transactions. Since fsync() has no effect except in the case of 
 power outages or system crashes, and we don't care about power outages in the 
 context of tests, we can safely skip the fsync without any loss in coverage.
 In my tests, this sped up TestEditLog by about 5x. The testFuzzSequences test 
 case improved from ~83 seconds with fsync to about 5 seconds without. These 
 results are from my SSD laptop - they are probably even more drastic on 
 spinning media.





[jira] [Created] (HDFS-3798) Avoid throwing NPE when finalizeSegment() is called on invalid segment

2012-08-13 Thread Todd Lipcon (JIRA)
Todd Lipcon created HDFS-3798:
-

 Summary: Avoid throwing NPE when finalizeSegment() is called on 
invalid segment
 Key: HDFS-3798
 URL: https://issues.apache.org/jira/browse/HDFS-3798
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Affects Versions: QuorumJournalManager (HDFS-3077)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Trivial


Currently, if the client calls finalizeLogSegment() on a segment which doesn't 
exist on the JournalNode side, it throws an NPE. Instead it should throw a more 
intelligible exception.
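A minimal sketch of the suggested fix, with hypothetical names: look the 
segment up and fail with a descriptive message instead of letting a null flow 
onward into an NPE.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch (not the JournalNode code): a lookup that fails with a
// descriptive exception when asked to finalize a segment that doesn't exist.
public class FinalizeSketch {
    private final Map<Long, String> segmentsByStartTxId = new HashMap<>();

    public void addSegment(long startTxId, String name) {
        segmentsByStartTxId.put(startTxId, name);
    }

    public String finalizeLogSegment(long startTxId) {
        String segment = segmentsByStartTxId.get(startTxId);
        if (segment == null) {
            // previously the null would flow onward and surface as an NPE
            throw new IllegalStateException(
                "No log segment starting at txid " + startTxId + " to finalize");
        }
        return segment;
    }
}
```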





[jira] [Updated] (HDFS-3798) Avoid throwing NPE when finalizeSegment() is called on invalid segment

2012-08-13 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3798:
--

Attachment: hdfs-3798.txt

 Avoid throwing NPE when finalizeSegment() is called on invalid segment
 --

 Key: HDFS-3798
 URL: https://issues.apache.org/jira/browse/HDFS-3798
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Affects Versions: QuorumJournalManager (HDFS-3077)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Trivial
 Attachments: hdfs-3798.txt


 Currently, if the client calls finalizeLogSegment() on a segment which 
 doesn't exist on the JournalNode side, it throws an NPE. Instead it should 
 throw a more intelligible exception.





[jira] [Commented] (HDFS-3276) initializeSharedEdits should have a -nonInteractive flag

2012-08-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433696#comment-13433696
 ] 

Hudson commented on HDFS-3276:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2597 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2597/])
HDFS-3276. initializeSharedEdits should have a -nonInteractive flag. 
Contributed by Todd Lipcon. (Revision 1372628)

 Result = FAILURE
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1372628
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java


 initializeSharedEdits should have a -nonInteractive flag
 

 Key: HDFS-3276
 URL: https://issues.apache.org/jira/browse/HDFS-3276
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha, name-node
Affects Versions: 2.0.0-alpha
Reporter: Vinithra Varadharajan
Assignee: Todd Lipcon
Priority: Minor
 Fix For: 3.0.0, 2.2.0-alpha

 Attachments: hdfs-3276.txt


 Similar to format and bootstrapStandby, would be nice to have -nonInteractive 
 as an option on initializeSharedEdits





[jira] [Commented] (HDFS-3796) Speed up edit log tests by avoiding fsync()

2012-08-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433699#comment-13433699
 ] 

Todd Lipcon commented on HDFS-3796:
---

Hey Suresh. Nope, each JUnit test class runs in its own JVM. We make use of the 
static {} pattern for setting log levels as well, so I think this should be 
considered equivalent.

 Speed up edit log tests by avoiding fsync()
 ---

 Key: HDFS-3796
 URL: https://issues.apache.org/jira/browse/HDFS-3796
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: test
Affects Versions: 3.0.0, 2.2.0-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: hdfs-3796.txt


 Our edit log tests are very slow because they incur a lot of fsyncs as they 
 write out transactions. Since fsync() has no effect except in the case of 
 power outages or system crashes, and we don't care about power outages in the 
 context of tests, we can safely skip the fsync without any loss in coverage.
 In my tests, this sped up TestEditLog by about 5x. The testFuzzSequences test 
 case improved from ~83 seconds with fsync to about 5 seconds without. These 
 results are from my SSD laptop - they are probably even more drastic on 
 spinning media.





[jira] [Commented] (HDFS-3796) Speed up edit log tests by avoiding fsync()

2012-08-13 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433704#comment-13433704
 ] 

Suresh Srinivas commented on HDFS-3796:
---

Well I thought we use a specific LOG to do that.

+1 for the patch.

 Speed up edit log tests by avoiding fsync()
 ---

 Key: HDFS-3796
 URL: https://issues.apache.org/jira/browse/HDFS-3796
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: test
Affects Versions: 3.0.0, 2.2.0-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: hdfs-3796.txt


 Our edit log tests are very slow because they incur a lot of fsyncs as they 
 write out transactions. Since fsync() has no effect except in the case of 
 power outages or system crashes, and we don't care about power outages in the 
 context of tests, we can safely skip the fsync without any loss in coverage.
 In my tests, this sped up TestEditLog by about 5x. The testFuzzSequences test 
 case improved from ~83 seconds with fsync to about 5 seconds without. These 
 results are from my SSD laptop - they are probably even more drastic on 
 spinning media.





[jira] [Commented] (HDFS-3794) WebHDFS Open used with Offset returns the original (and incorrect) Content Length in the HTTP Header.

2012-08-13 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433710#comment-13433710
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-3794:
--

+1 on the patch.  Good catch!

 WebHDFS Open used with Offset returns the original (and incorrect) Content 
 Length in the HTTP Header.
 -

 Key: HDFS-3794
 URL: https://issues.apache.org/jira/browse/HDFS-3794
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-alpha
Reporter: Ravi Prakash
Assignee: Ravi Prakash
 Attachments: HDFS-3794.patch


 When an offset is specified, the HTTP Content-Length header still contains 
 the original file size. E.g., if the original file is 100 bytes and the 
 offset specified is 10, then the Content-Length ought to be 90. Currently it 
 is still returned as 100.
 This causes curl to give error 18, and Java to throw a ConnectionClosedException.
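
[Editor's note] The expected header value is simply the remaining byte count. A minimal sketch of the computation (the method name is illustrative, not a WebHDFS API):

```java
class WebHdfsLengthSketch {
    /**
     * Content-Length for a ranged open: the bytes remaining after the
     * offset, never the full file size.
     */
    static long contentLengthForOpen(long fileLength, long offset) {
        return Math.max(0, fileLength - offset);
    }
}
```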





[jira] [Commented] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header

2012-08-13 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433716#comment-13433716
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-3788:
--

How about first checking the Transfer-Encoding: if it is chunked, then skip the 
Content-Length check?

 distcp can't copy large files using webhdfs due to missing Content-Length 
 header
 

 Key: HDFS-3788
 URL: https://issues.apache.org/jira/browse/HDFS-3788
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 0.23.3, 2.0.0-alpha
Reporter: Eli Collins
Priority: Critical
 Attachments: distcp-webhdfs-errors.txt


 The following command fails when data1 contains a 3gb file. It passes when 
 using hftp or when the directory just contains smaller (2gb) files, so looks 
 like a webhdfs issue with large files.
 {{hadoop distcp webhdfs://eli-thinkpad:50070/user/eli/data1 
 hdfs://localhost:8020/user/eli/data2}}





[jira] [Updated] (HDFS-3766) Release stream and storage directory for removed streams, and fix TestStorageRestore on Windows

2012-08-13 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-3766:
-

Attachment: HDFS-3766.patch

 Release stream and storage directory for removed streams, and fix 
 TestStorageRestore on Windows
 ---

 Key: HDFS-3766
 URL: https://issues.apache.org/jira/browse/HDFS-3766
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 1-win
Reporter: Brandon Li
Assignee: Brandon Li
 Attachments: HDFS-3766.patch


 When a storage directory is removed, the namenode doesn't close the stream, 
 and the storage directory remains locked. This can later make restoring the 
 storage directory fail, because the namenode will not be able to format the 
 original directory. Unlike Linux, Windows doesn't allow deleting a file or 
 directory that is opened without share/delete permission by a different 
 process.
 A similar problem caused TestStorageRestore to fail, because it can't 
 delete the directories/files being used by the test itself. 
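
[Editor's note] The fix direction can be sketched as "close and unlock before forgetting the directory". `StorageDir` and its methods are hypothetical here, not the real NNStorage API:

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class StorageDirsSketch {
    interface StorageDir extends Closeable {
        void unlock() throws IOException;
    }

    private final List<StorageDir> dirs = new ArrayList<>();

    void add(StorageDir sd) {
        dirs.add(sd);
    }

    /** Release the stream and the in-use lock before dropping the dir. */
    void remove(StorageDir sd) throws IOException {
        sd.close();   // release the open edit-log file handle
        sd.unlock();  // release the lock file
        dirs.remove(sd); // Windows can now delete or reformat the directory
    }

    int size() {
        return dirs.size();
    }
}
```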





[jira] [Commented] (HDFS-3766) Release stream and storage directory for removed streams, and fix TestStorageRestore on Windows

2012-08-13 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433723#comment-13433723
 ] 

Brandon Li commented on HDFS-3766:
--

Patch uploaded for branch-1-win

 Release stream and storage directory for removed streams, and fix 
 TestStorageRestore on Windows
 ---

 Key: HDFS-3766
 URL: https://issues.apache.org/jira/browse/HDFS-3766
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 1-win
Reporter: Brandon Li
Assignee: Brandon Li
 Attachments: HDFS-3766.patch


 When a storage directory is removed, the namenode doesn't close the stream, 
 and the storage directory remains locked. This can later make restoring the 
 storage directory fail, because the namenode will not be able to format the 
 original directory. Unlike Linux, Windows doesn't allow deleting a file or 
 directory that is opened without share/delete permission by a different 
 process.
 A similar problem caused TestStorageRestore to fail, because it can't 
 delete the directories/files being used by the test itself. 





[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname

2012-08-13 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433724#comment-13433724
 ] 

Aaron T. Myers commented on HDFS-3150:
--

The trunk patch looks pretty good to me. One little comment:

bq. @param useHostname if name should use a hostname or IP

This comment reads a little funny. Maybe: "true to use the hostname of the DN, 
false to use the IP address".

+1 once this is addressed.

 Add option for clients to contact DNs via hostname
 --

 Key: HDFS-3150
 URL: https://issues.apache.org/jira/browse/HDFS-3150
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: data-node, hdfs client
Affects Versions: 1.0.0, 2.0.0-alpha
Reporter: Eli Collins
Assignee: Eli Collins
 Fix For: 1.1.0

 Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt, hdfs-3150.txt, 
 hdfs-3150.txt, hdfs-3150.txt


 The DN listens on multiple IP addresses (the default {{dfs.datanode.address}} 
 is the wildcard) however per HADOOP-6867 only the source address (IP) of the 
 registration is given to clients. HADOOP-985 made clients access datanodes by 
 IP, primarily to avoid the latency of a DNS lookup; this had the side effect 
 of breaking DN multihoming (the client can not route the IP exposed by the NN 
 if the DN registers with an interface that has a cluster-private IP). To fix 
 this let's add back the option for Datanodes to be accessed by hostname.
 This can be done by:
 # Modifying the primary field of the Datanode descriptor to be the hostname, 
 or 
 # Modifying Client-to-Datanode and Datanode-to-Datanode access to use the 
 hostname field instead of the IP
 Approach #2 does not require an incompatible client protocol change, and is 
 much less invasive. It minimizes the scope of modification to just places 
 where clients and Datanodes connect, vs changing all uses of Datanode 
 identifiers.
 New client and Datanode configuration options are introduced:
 - {{dfs.client.use.datanode.hostname}} indicates all client to datanode 
 connections should use the datanode hostname (as clients outside cluster may 
 not be able to route the IP)
 - {{dfs.datanode.use.datanode.hostname}} indicates whether Datanodes should 
 use hostnames when connecting to other Datanodes for data transfer
 If the configuration options are not used, there is no change in the current 
 behavior.
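
[Editor's note] With the two options quoted above, enabling hostname-based access might look like the following hdfs-site.xml fragment. The property names are the ones introduced by this JIRA; the values shown are illustrative:

```xml
<!-- Clients connect to datanodes by hostname (for clients that
     cannot route the cluster-private IPs exposed by the NN). -->
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>

<!-- Datanodes connect to other datanodes by hostname for data transfer. -->
<property>
  <name>dfs.datanode.use.datanode.hostname</name>
  <value>true</value>
</property>
```

Leaving both properties unset keeps the existing IP-based behavior.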





[jira] [Commented] (HDFS-3790) test_fuse_dfs.c doesn't compile on centos 5

2012-08-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433736#comment-13433736
 ] 

Hadoop QA commented on HDFS-3790:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12540713/HDFS-3790.001.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

-1 findbugs.  The patch appears to introduce 2 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2991//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2991//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2991//console

This message is automatically generated.

 test_fuse_dfs.c doesn't compile on centos 5
 ---

 Key: HDFS-3790
 URL: https://issues.apache.org/jira/browse/HDFS-3790
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: fuse-dfs
Affects Versions: 2.2.0-alpha
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-3790.001.patch


 test_fuse_dfs.c uses execvpe, which doesn't exist in the version of glibc 
 shipped on CentOS 5.





[jira] [Updated] (HDFS-3790) test_fuse_dfs.c doesn't compile on centos 5

2012-08-13 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3790:
-

   Resolution: Fixed
Fix Version/s: 2.2.0-alpha
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

I've just committed this to trunk and branch-2.

Thanks a lot for fixing this, Colin.

 test_fuse_dfs.c doesn't compile on centos 5
 ---

 Key: HDFS-3790
 URL: https://issues.apache.org/jira/browse/HDFS-3790
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: fuse-dfs
Affects Versions: 2.2.0-alpha
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Fix For: 2.2.0-alpha

 Attachments: HDFS-3790.001.patch


 test_fuse_dfs.c uses execvpe, which doesn't exist in the version of glibc 
 shipped on CentOS 5.





[jira] [Commented] (HDFS-3795) QJM: validate journal dir at startup

2012-08-13 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433773#comment-13433773
 ] 

Aaron T. Myers commented on HDFS-3795:
--

# Instead of !dir.getPath().startsWith("/"), how about !dir.isAbsolute()?
# If the path is not a directory, this will fail with a misleading error 
message:
{code}
+    if (!dir.isDirectory() && !dir.mkdirs()) {
+      throw new IOException("Could not create journal dir '" +
+          dir + "'");
+    }
{code}

Patch looks good otherwise.

 QJM: validate journal dir at startup
 

 Key: HDFS-3795
 URL: https://issues.apache.org/jira/browse/HDFS-3795
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Affects Versions: QuorumJournalManager (HDFS-3077)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: hdfs-3795.txt


 Currently, the JN does not validate the configured journal directory until it 
 tries to write into it. This is counter-intuitive for users, since they would 
 expect to find out about a misconfiguration at startup time, rather than on 
 first access. Additionally, two testers accidentally configured the journal 
 dir to be a URI, which the code mistakenly interpreted as a relative path 
 ({{CWD/file:/foo/bar}}).
 We should validate the config at startup to be an accessible absolute path.
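
[Editor's note] A minimal sketch of that startup-time validation, incorporating the review feedback above (the class and method names are illustrative, not the committed patch):

```java
import java.io.File;
import java.io.IOException;

class JournalDirCheckSketch {
    /** Fail fast at startup if the configured dir is not an absolute path. */
    static File validateAndCreate(String configured) throws IOException {
        File dir = new File(configured);
        if (!dir.isAbsolute()) {
            // Catches both relative paths and URIs like file:/foo/bar,
            // which java.io.File treats as relative paths.
            throw new IOException("Journal dir '" + configured
                + "' should be an absolute path");
        }
        if (!dir.isDirectory() && !dir.mkdirs()) {
            throw new IOException("Could not create journal dir '" + dir + "'");
        }
        return dir;
    }
}
```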





[jira] [Updated] (HDFS-3672) Expose disk-location information for blocks to enable better scheduling

2012-08-13 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-3672:
--

Attachment: hdfs-3672-9.patch

 Expose disk-location information for blocks to enable better scheduling
 ---

 Key: HDFS-3672
 URL: https://issues.apache.org/jira/browse/HDFS-3672
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: design-doc-v1.pdf, design-doc-v2.pdf, hdfs-3672-1.patch, 
 hdfs-3672-2.patch, hdfs-3672-3.patch, hdfs-3672-4.patch, hdfs-3672-5.patch, 
 hdfs-3672-6.patch, hdfs-3672-7.patch, hdfs-3672-8.patch, hdfs-3672-9.patch


 Currently, HDFS exposes on which datanodes a block resides, which allows 
 clients to make scheduling decisions for locality and load balancing. 
 Extending this to also expose on which disk on a datanode a block resides 
 would enable even better scheduling, on a per-disk rather than coarse 
 per-datanode basis.
 This API would likely look similar to FileSystem#getFileBlockLocations, but 
 also involve a series of RPCs to the responsible datanodes to determine disk 
 ids.





[jira] [Commented] (HDFS-3792) Fix two findbugs introduced by HDFS-3695

2012-08-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433787#comment-13433787
 ] 

Hadoop QA commented on HDFS-3792:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12540757/hdfs-3792.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2994//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2994//console

This message is automatically generated.

 Fix two findbugs introduced by HDFS-3695
 

 Key: HDFS-3792
 URL: https://issues.apache.org/jira/browse/HDFS-3792
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build, name-node
Affects Versions: 3.0.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Trivial
 Attachments: hdfs-3792.txt


 Accidentally introduced two trivial findbugs warnings in HDFS-3695. This JIRA 
 is to fix them.





[jira] [Commented] (HDFS-3672) Expose disk-location information for blocks to enable better scheduling

2012-08-13 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433788#comment-13433788
 ] 

Andrew Wang commented on HDFS-3672:
---

Thanks everyone for all your input! Here's another spin of the patch. Big 
things:

* I renamed the Disk* classes to BlockStorageLocation and VolumeId, and tried 
to update all the javadoc/comments.
* I split out most of the DFSClient code into a new BlockStorageLocationUtil 
class, which is ~300 lines of static methods. I pulled apart one of the long 
methods. Doing this for the other long method would arguably be messier, so I 
left it.
* Added the DN-side config option. If any of the DNs throws an 
UnsupportedOperationException, it's bubbled up to the client (thus failing the 
entire call). The client-side code also checks for the same DN config option, 
so you need to enable it in both the client and DN for this to do anything.
* Bumped the DN handler count to 10.

I think Suresh's other more minor comments are also addressed.
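
[Editor's note] The renamed classes might take a shape along these lines. The class names BlockStorageLocation and VolumeId come from the comment above; the fields and methods are illustrative, not the actual patch:

```java
/** Opaque identifier for a volume (disk) on a datanode. */
class VolumeId {
    private final byte[] id;

    VolumeId(byte[] id) {
        this.id = id;
    }

    byte[] getBytes() {
        return id;
    }
}

/** A block location augmented with per-datanode volume ids. */
class BlockStorageLocation {
    private final String[] hosts;       // datanodes holding the block
    private final VolumeId[] volumeIds; // one volume id per datanode

    BlockStorageLocation(String[] hosts, VolumeId[] volumeIds) {
        this.hosts = hosts;
        this.volumeIds = volumeIds;
    }

    String[] getHosts() {
        return hosts;
    }

    VolumeId[] getVolumeIds() {
        return volumeIds;
    }
}
```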

 Expose disk-location information for blocks to enable better scheduling
 ---

 Key: HDFS-3672
 URL: https://issues.apache.org/jira/browse/HDFS-3672
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang
 Attachments: design-doc-v1.pdf, design-doc-v2.pdf, hdfs-3672-1.patch, 
 hdfs-3672-2.patch, hdfs-3672-3.patch, hdfs-3672-4.patch, hdfs-3672-5.patch, 
 hdfs-3672-6.patch, hdfs-3672-7.patch, hdfs-3672-8.patch, hdfs-3672-9.patch


 Currently, HDFS exposes on which datanodes a block resides, which allows 
 clients to make scheduling decisions for locality and load balancing. 
 Extending this to also expose on which disk on a datanode a block resides 
 would enable even better scheduling, on a per-disk rather than coarse 
 per-datanode basis.
 This API would likely look similar to FileSystem#getFileBlockLocations, but 
 also involve a series of RPCs to the responsible datanodes to determine disk 
 ids.





[jira] [Updated] (HDFS-3795) QJM: validate journal dir at startup

2012-08-13 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3795:
--

Attachment: hdfs-3795.txt

Updated patch addresses ATM's feedback.

 QJM: validate journal dir at startup
 

 Key: HDFS-3795
 URL: https://issues.apache.org/jira/browse/HDFS-3795
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Affects Versions: QuorumJournalManager (HDFS-3077)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
 Attachments: hdfs-3795.txt, hdfs-3795.txt


 Currently, the JN does not validate the configured journal directory until it 
 tries to write into it. This is counter-intuitive for users, since they would 
 expect to find out about a misconfiguration at startup time, rather than on 
 first access. Additionally, two testers accidentally configured the journal 
 dir to be a URI, which the code mistakenly interpreted as a relative path 
 ({{CWD/file:/foo/bar}}).
 We should validate the config at startup to be an accessible absolute path.





[jira] [Updated] (HDFS-3792) Fix two findbugs introduced by HDFS-3695

2012-08-13 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3792:
--

   Resolution: Fixed
Fix Version/s: 3.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk, thx for review, sorry for missing this.

 Fix two findbugs introduced by HDFS-3695
 

 Key: HDFS-3792
 URL: https://issues.apache.org/jira/browse/HDFS-3792
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build, name-node
Affects Versions: 3.0.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Trivial
 Fix For: 3.0.0

 Attachments: hdfs-3792.txt


 Accidentally introduced two trivial findbugs warnings in HDFS-3695. This JIRA 
 is to fix them.





[jira] [Commented] (HDFS-3765) Namenode INITIALIZESHAREDEDITS should be able to initialize all shared storages

2012-08-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433797#comment-13433797
 ] 

Hadoop QA commented on HDFS-3765:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12540746/hdfs-3765.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

-1 findbugs.  The patch appears to introduce 2 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs 
hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2995//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2995//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2995//console

This message is automatically generated.

 Namenode INITIALIZESHAREDEDITS should be able to initialize all shared 
 storages
 ---

 Key: HDFS-3765
 URL: https://issues.apache.org/jira/browse/HDFS-3765
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha
Affects Versions: 2.1.0-alpha, 3.0.0
Reporter: Vinay
Assignee: Vinay
 Attachments: HDFS-3765.patch, HDFS-3765.patch, HDFS-3765.patch, 
 hdfs-3765.txt, hdfs-3765.txt, hdfs-3765.txt


 Currently, NameNode INITIALIZESHAREDEDITS provides the ability to copy the 
 edit files to file-scheme-based shared storages when moving a cluster from a 
 non-HA environment to an HA-enabled environment.
 This Jira focuses on the following:
 * Generalizing the logic of copying the edits to the new shared storage so 
 that any scheme-based shared storage can be initialized for an HA cluster.





[jira] [Commented] (HDFS-3765) Namenode INITIALIZESHAREDEDITS should be able to initialize all shared storages

2012-08-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433800#comment-13433800
 ] 

Todd Lipcon commented on HDFS-3765:
---

The findbugs warnings are from HDFS-3695. They have already been fixed in 
HDFS-3792 (committed just before this QA report)

 Namenode INITIALIZESHAREDEDITS should be able to initialize all shared 
 storages
 ---

 Key: HDFS-3765
 URL: https://issues.apache.org/jira/browse/HDFS-3765
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha
Affects Versions: 2.1.0-alpha, 3.0.0
Reporter: Vinay
Assignee: Vinay
 Attachments: HDFS-3765.patch, HDFS-3765.patch, HDFS-3765.patch, 
 hdfs-3765.txt, hdfs-3765.txt, hdfs-3765.txt


 Currently, NameNode INITIALIZESHAREDEDITS provides the ability to copy the 
 edit files to file-scheme-based shared storages when moving a cluster from a 
 non-HA environment to an HA-enabled environment.
 This Jira focuses on the following:
 * Generalizing the logic of copying the edits to the new shared storage so 
 that any scheme-based shared storage can be initialized for an HA cluster.





[jira] [Created] (HDFS-3799) QJM: handle empty log segments during recovery

2012-08-13 Thread Todd Lipcon (JIRA)
Todd Lipcon created HDFS-3799:
-

 Summary: QJM: handle empty log segments during recovery
 Key: HDFS-3799
 URL: https://issues.apache.org/jira/browse/HDFS-3799
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Affects Versions: QuorumJournalManager (HDFS-3077)
Reporter: Todd Lipcon
Assignee: Todd Lipcon


One of the cases not yet handled in the QJM branch is the one where either the 
writer or the journal node crashes after startLogSegment() but before it has 
written its first transaction to the log. We currently have TODO assertions in 
the code which fire in these cases.

This JIRA is to deal with these cases.





[jira] [Updated] (HDFS-3799) QJM: handle empty log segments during recovery

2012-08-13 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3799:
--

Attachment: hdfs-3799.txt

The solution is as follows:
- during recovery, when we validate a log, if the log has no transactions, then 
we remove the file (same as if the log segment was never started)
- when coordinating recovery, if none of the loggers have any non-empty logs, 
then we don't have to take any action. We can simply treat the recovery as a 
no-op.
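
[Editor's note] The second rule above reduces to a simple check over the loggers' transaction counts. A sketch under that reading (names are illustrative):

```java
import java.util.List;

class RecoverySketch {
    /**
     * An empty segment is treated as if it was never started, so if no
     * logger holds any transactions, the recovery round is a no-op.
     */
    static boolean recoveryIsNoOp(List<Integer> txCountPerLogger) {
        for (int txCount : txCountPerLogger) {
            if (txCount > 0) {
                return false; // at least one logger has real transactions
            }
        }
        return true;
    }
}
```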

 QJM: handle empty log segments during recovery
 --

 Key: HDFS-3799
 URL: https://issues.apache.org/jira/browse/HDFS-3799
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Affects Versions: QuorumJournalManager (HDFS-3077)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: hdfs-3799.txt


 One of the cases not yet handled in the QJM branch is the one where either 
 the writer or the journal node crashes after startLogSegment() but before it 
 has written its first transaction to the log. We currently have TODO 
 assertions in the code which fire in these cases.
 This JIRA is to deal with these cases.





[jira] [Updated] (HDFS-3723) All commands should support meaningful --help

2012-08-13 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-3723:


Attachment: HDFS-3723.001.patch

 All commands should support meaningful --help
 -

 Key: HDFS-3723
 URL: https://issues.apache.org/jira/browse/HDFS-3723
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: scripts, tools
Affects Versions: 2.0.0-alpha
Reporter: E. Sammer
Assignee: Jing Zhao
 Attachments: HDFS-3723.001.patch, HDFS-3723.001.patch, 
 HDFS-3723.patch, HDFS-3723.patch


 Some (sub)commands support -help or -h options for detailed help while others 
 do not. Ideally, all commands should support meaningful help that works 
 regardless of current state or configuration.
 For example, hdfs zkfc --help (or -h or -help) is not very useful. Option 
 checking should occur before state / configuration checking.
 {code}
 [esammer@hadoop-fed01 ~]# hdfs zkfc --help
 Exception in thread "main" org.apache.hadoop.HadoopIllegalArgumentException: 
 HA is not enabled for this namenode.
 at 
 org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
 at 
 org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168)
 {code}
 This would go a long way toward better usability for ops staff.
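
[Editor's note] The suggested ordering (option checking before state/configuration checking) can be sketched as a help-flag scan run before any configuration is touched. The method name is illustrative:

```java
class HelpFlagSketch {
    /**
     * Scan for help flags before parsing configuration, so --help works
     * regardless of cluster state (e.g. HA not enabled).
     */
    static boolean isHelpRequested(String[] args) {
        for (String arg : args) {
            if (arg.equals("-h") || arg.equals("-help") || arg.equals("--help")) {
                return true;
            }
        }
        return false;
    }
}
```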





[jira] [Created] (HDFS-3800) QJM: improvements to QJM fault testing

2012-08-13 Thread Todd Lipcon (JIRA)
Todd Lipcon created HDFS-3800:
-

 Summary: QJM: improvements to QJM fault testing
 Key: HDFS-3800
 URL: https://issues.apache.org/jira/browse/HDFS-3800
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: QuorumJournalManager (HDFS-3077)
Reporter: Todd Lipcon
Assignee: Todd Lipcon


This JIRA improves TestQJMWithFaults as follows:
- the current implementation didn't properly unwrap exceptions thrown by the 
reflection-based injection method. This caused some issues in the code where 
the injecting proxy didn't act quite like the original object.
- the current implementation incorrectly assumed that the recovery process 
would recover to _exactly_ the last acked sequence number. In fact, it may 
recover to that transaction _or any greater transaction_.

It also adds a new randomized test which uncovered a number of other bugs. I 
will defer to the included javadoc for a description of this test.
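The first bullet can be illustrated with a small sketch (hypothetical, not the TestQJMWithFaults code itself): a fault-injecting dynamic proxy must rethrow the *cause* of the InvocationTargetException, otherwise callers see a reflection wrapper instead of the exception the original object would have thrown.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

// Hypothetical sketch of the unwrapping issue: without the catch block below,
// the proxy leaks InvocationTargetException and does not act like the
// original object.
public class UnwrapProxy {
    @SuppressWarnings("unchecked")
    static <T> T faultyProxy(Class<T> iface, T delegate) {
        InvocationHandler h = (proxy, method, args) -> {
            try {
                // (a real fault injector would decide here whether to fail)
                return method.invoke(delegate, args);
            } catch (InvocationTargetException ite) {
                throw ite.getCause();  // unwrap, don't leak the reflection wrapper
            }
        };
        return (T) Proxy.newProxyInstance(
            iface.getClassLoader(), new Class<?>[]{iface}, h);
    }
}
```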






[jira] [Updated] (HDFS-3800) QJM: improvements to QJM fault testing

2012-08-13 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3800:
--

Attachment: hdfs-3800.txt

This patch applies on top of the following other staged patches:

pick fac7f15 HDFS-3796. Allow EditLogFileOutputStream to skip fsync() in tests
pick 5a18397 HDFS-3765. initializeSharedEdits 
pick 85978e7 HDFS-3793. Implement format() for QJM
pick 4fc442e HDFS-3795. Validate journal dir at startup
pick f2da880 HDFS-3798. Avoid throwing NPE if finalizeLogSegment() is called on 
an invalid segment
pick b0d1a3d HDFS-3799. deal with empty files in recovery path
pick d792847 HDFS-3797. Make journal() call take segmentTxId as parameter

The new randomized test requires a few more patches on top before it passes 
reliably. However, I'd like to check it in and then get it passing reliably in 
follow-up JIRAs.

 QJM: improvements to QJM fault testing
 --

 Key: HDFS-3800
 URL: https://issues.apache.org/jira/browse/HDFS-3800
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: QuorumJournalManager (HDFS-3077)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Attachments: hdfs-3800.txt


 This JIRA improves TestQJMWithFaults as follows:
 - the current implementation didn't properly unwrap exceptions thrown by the 
 reflection-based injection method. This caused some issues in the code where 
 the injecting proxy didn't act quite like the original object.
 - the current implementation incorrectly assumed that the recovery process 
 would recover to _exactly_ the last acked sequence number. In fact, it may 
 recover to that transaction _or any greater transaction_.
 It also adds a new randomized test which uncovered a number of other bugs. I 
 will defer to the included javadoc for a description of this test.
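The second bullet amounts to weakening a test assertion (hypothetical sketch, not the actual test code): recovery may land on the last acked transaction *or any greater one*, so the check should be {{>=}} rather than strict equality.

```java
// Hypothetical sketch mirroring the corrected assumption: the recovery
// process is only guaranteed to reach at least the last acked txid.
public class RecoveryCheck {
    static boolean isAcceptableRecovery(long recoveredTxId, long lastAckedTxId) {
        return recoveredTxId >= lastAckedTxId;
    }
}
```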





[jira] [Commented] (HDFS-3586) Blocks are not getting replicated even when DNs are available.

2012-08-13 Thread Han Xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433835#comment-13433835
 ] 

Han Xiao commented on HDFS-3586:


Uma, you are right. In my comments, there is also a condition (not in the 
expression): "as long as a good one exists".
Your description coincides exactly with my ideas. Hope it will be done.

 Blocks are not getting replicated even when DNs are available.
 

 Key: HDFS-3586
 URL: https://issues.apache.org/jira/browse/HDFS-3586
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node, name-node
Affects Versions: 2.0.0-alpha, 2.1.0-alpha, 3.0.0
Reporter: Brahma Reddy Battula
Assignee: amith
 Attachments: HDFS-3586-analysis.txt


 Scenario:
 =
 Started four DNs (say DN1, DN2, DN3 and DN4).
 Writing files with RF=3.
 Formed a pipeline with DN1-DN2-DN3.
 Since DN3's network is very slow, it's not able to send acks.
 The pipeline is then re-formed as DN1-DN2-DN4.
 Here DN4's network is also slow.
 So finally commitblocksync happened to DN1 and DN2 successfully.
 The block is present in all four DNs (finalized state in two DNs and rbw 
 state in the other DNs).
 Here NN asks DN3 and DN4 to replicate, but it fails since replicas are 
 already present in the RBW dir.





[jira] [Updated] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header

2012-08-13 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3788:
-

Attachment: h3788_20120813.patch

h3788_20120813.patch: check content-length only for non-chunked transfer 
encoding.
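The patch's idea can be sketched as follows (hypothetical code, not the actual webhdfs client change): only require a Content-Length header when the response is *not* using chunked transfer encoding, since chunked responses legitimately omit it.

```java
import java.util.Map;

// Hypothetical sketch: a Content-Length header is mandatory only for
// non-chunked responses.
public class ContentLengthCheck {
    static void validate(Map<String, String> headers) {
        String te = headers.get("Transfer-Encoding");
        boolean chunked = te != null && te.equalsIgnoreCase("chunked");
        if (!chunked && !headers.containsKey("Content-Length")) {
            throw new IllegalStateException("Missing Content-Length header");
        }
    }
}
```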

 distcp can't copy large files using webhdfs due to missing Content-Length 
 header
 

 Key: HDFS-3788
 URL: https://issues.apache.org/jira/browse/HDFS-3788
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 0.23.3, 2.0.0-alpha
Reporter: Eli Collins
Priority: Critical
 Attachments: distcp-webhdfs-errors.txt, h3788_20120813.patch


 The following command fails when data1 contains a 3 GB file. It passes when 
 using hftp or when the directory contains only smaller (2 GB) files, so it 
 looks like a webhdfs issue with large files.
 {{hadoop distcp webhdfs://eli-thinkpad:50070/user/eli/data1 
 hdfs://localhost:8020/user/eli/data2}}




