[jira] [Commented] (HDFS-3672) Expose disk-location information for blocks to enable better scheduling
[ https://issues.apache.org/jira/browse/HDFS-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433967#comment-13433967 ]

Hadoop QA commented on HDFS-3672:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12540815/hdfs-3672-9.patch
against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 2 new or modified test files.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 eclipse:eclipse. The patch built with eclipse:eclipse.
    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
        org.apache.hadoop.hdfs.server.namenode.TestFsck
    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3001//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3001//console

This message is automatically generated.

Expose disk-location information for blocks to enable better scheduling
------------------------------------------------------------------------
Key: HDFS-3672
URL: https://issues.apache.org/jira/browse/HDFS-3672
Project: Hadoop HDFS
Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang
Attachments: design-doc-v1.pdf, design-doc-v2.pdf, hdfs-3672-1.patch, hdfs-3672-2.patch, hdfs-3672-3.patch, hdfs-3672-4.patch, hdfs-3672-5.patch, hdfs-3672-6.patch, hdfs-3672-7.patch, hdfs-3672-8.patch, hdfs-3672-9.patch

Currently, HDFS exposes on which datanodes a block resides, which allows clients to make scheduling decisions for locality and load balancing. Extending this to also expose on which disk on a datanode a block resides would enable even better scheduling, on a per-disk rather than coarse per-datanode basis. This API would likely look similar to FileSystem#getFileBlockLocations, but also involve a series of RPCs to the responsible datanodes to determine disk ids.
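For context, a minimal sketch of how a client obtains block locations today with the existing FileSystem#getFileBlockLocations API; the comments marking where a per-disk variant would fit are assumptions about the shape of such an API, not the committed HDFS-3672 interface:

{code}
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DiskLocationSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path p = new Path(args[0]);
    FileStatus stat = fs.getFileStatus(p);

    // Today: per-datanode granularity only.
    BlockLocation[] locs = fs.getFileBlockLocations(stat, 0, stat.getLen());
    for (BlockLocation loc : locs) {
      System.out.println("offset=" + loc.getOffset()
          + " hosts=" + Arrays.toString(loc.getHosts()));
      // A per-disk variant (hypothetical) would additionally carry an opaque
      // disk id per replica, filled in by follow-up RPCs to the datanodes
      // hosting the block, so a scheduler could balance work across disks
      // rather than only across datanodes.
    }
  }
}
{code}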
[jira] [Commented] (HDFS-3723) All commands should support meaningful --help
[ https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433977#comment-13433977 ]

Hadoop QA commented on HDFS-3723:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12540826/HDFS-3723.001.patch
against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 2 new or modified test files.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 eclipse:eclipse. The patch built with eclipse:eclipse.
    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:
        org.apache.hadoop.ha.TestHAAdmin
        org.apache.hadoop.ha.TestZKFailoverController
        org.apache.hadoop.hdfs.tools.TestDFSHAAdmin
        org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics
    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3000//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3000//console

This message is automatically generated.

All commands should support meaningful --help
---------------------------------------------
Key: HDFS-3723
URL: https://issues.apache.org/jira/browse/HDFS-3723
Project: Hadoop HDFS
Issue Type: Improvement
Components: scripts, tools
Affects Versions: 2.0.0-alpha
Reporter: E. Sammer
Assignee: Jing Zhao
Attachments: HDFS-3723.001.patch, HDFS-3723.001.patch, HDFS-3723.patch, HDFS-3723.patch

Some (sub)commands support -help or -h options for detailed help while others do not. Ideally, all commands should support meaningful help that works regardless of current state or configuration. For example, hdfs zkfc --help (or -h or -help) is not very useful. Option checking should occur before state / configuration checking.

{code}
[esammer@hadoop-fed01 ~]# hdfs zkfc --help
Exception in thread "main" org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode.
        at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168)
{code}

This would go a long way toward better usability for ops staff.
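An illustrative sketch only (not the committed HDFS-3723 patch) of the "option checking before configuration checking" idea: scan the arguments for a help flag before anything that requires HA configuration, so "hdfs zkfc --help" prints usage instead of failing in setConf():

{code}
public class ZkfcHelpFirst {
  private static final String USAGE =
      "Usage: hdfs zkfc [-formatZK [-force] [-nonInteractive]]";

  public static void main(String[] args) {
    // Help must work regardless of current state or configuration.
    for (String arg : args) {
      if ("-h".equals(arg) || "-help".equals(arg) || "--help".equals(arg)) {
        System.out.println(USAGE);
        return;
      }
    }
    // Only after the help check would the real tool be invoked, e.g.
    // System.exit(ToolRunner.run(new DFSZKFailoverController(), args));
    // which is where setConf() throws if HA is not enabled.
  }
}
{code}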
[jira] [Updated] (HDFS-3723) All commands should support meaningful --help
[ https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao updated HDFS-3723:
----------------------------
Attachment: HDFS-3723.002.patch

Updated TestDFSHAAdmin and TestHAAdmin: the expected help information may now also appear in the normal output (originally it was only checked in the error output).

All commands should support meaningful --help
---------------------------------------------
Key: HDFS-3723
URL: https://issues.apache.org/jira/browse/HDFS-3723
Project: Hadoop HDFS
Issue Type: Improvement
Components: scripts, tools
Affects Versions: 2.0.0-alpha
Reporter: E. Sammer
Assignee: Jing Zhao
Attachments: HDFS-3723.001.patch, HDFS-3723.001.patch, HDFS-3723.002.patch, HDFS-3723.patch, HDFS-3723.patch

Some (sub)commands support -help or -h options for detailed help while others do not. Ideally, all commands should support meaningful help that works regardless of current state or configuration. For example, hdfs zkfc --help (or -h or -help) is not very useful. Option checking should occur before state / configuration checking.

{code}
[esammer@hadoop-fed01 ~]# hdfs zkfc --help
Exception in thread "main" org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode.
        at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168)
{code}

This would go a long way toward better usability for ops staff.
[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname
[ https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434046#comment-13434046 ]

Hadoop QA commented on HDFS-3150:
---------------------------------

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12540836/hdfs-3150.txt
against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 10 new or modified test files.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 eclipse:eclipse. The patch built with eclipse:eclipse.
    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3002//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3002//console

This message is automatically generated.

Add option for clients to contact DNs via hostname
---------------------------------------------------
Key: HDFS-3150
URL: https://issues.apache.org/jira/browse/HDFS-3150
Project: Hadoop HDFS
Issue Type: New Feature
Components: data-node, hdfs client
Affects Versions: 1.0.0, 2.0.0-alpha
Reporter: Eli Collins
Assignee: Eli Collins
Fix For: 1.1.0
Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt

The DN listens on multiple IP addresses (the default {{dfs.datanode.address}} is the wildcard); however, per HADOOP-6867, only the source address (IP) of the registration is given to clients. HADOOP-985 made clients access datanodes by IP primarily to avoid the latency of a DNS lookup; this had the side effect of breaking DN multihoming (the client cannot route the IP exposed by the NN if the DN registers with an interface that has a cluster-private IP). To fix this, let's add back the option for Datanodes to be accessed by hostname. This can be done by:
# Modifying the primary field of the Datanode descriptor to be the hostname, or
# Modifying client/Datanode-to-Datanode access to use the hostname field instead of the IP

Approach #2 does not require an incompatible client protocol change and is much less invasive. It minimizes the scope of modification to just the places where clients and Datanodes connect, vs changing all uses of Datanode identifiers. New client and Datanode configuration options are introduced:
- {{dfs.client.use.datanode.hostname}} indicates all client-to-datanode connections should use the datanode hostname (as clients outside the cluster may not be able to route the IP)
- {{dfs.datanode.use.datanode.hostname}} indicates whether Datanodes should use hostnames when connecting to other Datanodes for data transfer

If the configuration options are not used, there is no change in the current behavior.
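A small usage sketch of the two configuration keys named in the issue, assuming the defaults remain false so existing behavior is unchanged; setting them programmatically here is just one way to apply them (they can equally go in hdfs-site.xml):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HostnameAccessExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Client -> datanode connections resolve and connect by the datanode's
    // hostname instead of the (possibly unroutable) IP reported by the NN.
    conf.setBoolean("dfs.client.use.datanode.hostname", true);
    // Datanode -> datanode transfers likewise connect by hostname.
    conf.setBoolean("dfs.datanode.use.datanode.hostname", true);

    FileSystem fs = FileSystem.get(conf);
    System.out.println(fs.exists(new Path("/")));
  }
}
{code}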
[jira] [Commented] (HDFS-3801) Provide a way to disable browsing of files from the web UI
[ https://issues.apache.org/jira/browse/HDFS-3801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434064#comment-13434064 ]

Harsh J commented on HDFS-3801:
-------------------------------

Suresh - the general need seems to be to prevent users external to the group that uses HDFS from reading/browsing files. This can be done today by enabling the kerberos hadoop.http.authentication.type, but not many users need the web file-browsing facility itself, so it would be beneficial if it could be toggled off to prevent anyone (in or out of the group) from browsing. This would also help as a toggle on non-secure installations.

Provide a way to disable browsing of files from the web UI
-----------------------------------------------------------
Key: HDFS-3801
URL: https://issues.apache.org/jira/browse/HDFS-3801
Project: Hadoop HDFS
Issue Type: Improvement
Components: name-node
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Priority: Minor

A few times we've had requests from users who wish to disable browsing of the filesystem in the web UI completely, while keeping other servlet functionality enabled (such as fsck, etc.). Right now, the cheap way to do this is by blocking out the DN web port (50075) from access by clients, but that also hampers HFTP transfers. We should instead provide a toggle config for the JSPs to use and disallow browsing if the toggle is enabled. The config can default to true, so as not to change the current behavior.
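A minimal sketch of the proposed toggle; the key name "dfs.namenode.web.browse.enabled" is purely illustrative (the issue does not name a key), and the default keeps browsing enabled as the description requires:

{code}
import org.apache.hadoop.conf.Configuration;

public class BrowseToggle {
  // Hypothetical key name; only the "enabled by default" behavior is from the issue.
  static final String BROWSE_KEY = "dfs.namenode.web.browse.enabled";

  static boolean isBrowsingEnabled(Configuration conf) {
    return conf.getBoolean(BROWSE_KEY, true);
  }

  // The browse JSPs (via the JspHelper path) would call isBrowsingEnabled()
  // before rendering a directory listing and return 403 Forbidden when it is
  // false, leaving fsck and the other servlets untouched.
}
{code}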
[jira] [Commented] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434098#comment-13434098 ]

Daryn Sharp commented on HDFS-3788:
-----------------------------------

bq. Nicholas: How about first check the transfer-encoding, if it is chunked, then no content-length check?

Exactly. However, you need to update the patch to check both the Transfer-Encoding and TE headers, and the headers may contain multiple comma-separated values. I haven't tested, but I would expect Java's input stream for chunked responses to throw an EOF exception if the connection is broken, so you might want to add a test for that.

bq. Eli: Note that a get of a 3gb file works but not distcp, what path is different?

The code paths should be identical, since it's the creation of the input stream that does the content-length check. I can't see how distcp could possibly work unless distcp is not using the filesystem class...

distcp can't copy large files using webhdfs due to missing Content-Length header
---------------------------------------------------------------------------------
Key: HDFS-3788
URL: https://issues.apache.org/jira/browse/HDFS-3788
Project: Hadoop HDFS
Issue Type: Bug
Components: webhdfs
Affects Versions: 0.23.3, 2.0.0-alpha
Reporter: Eli Collins
Assignee: Tsz Wo (Nicholas), SZE
Priority: Critical
Attachments: distcp-webhdfs-errors.txt, h3788_20120813.patch

The following command fails when data1 contains a 3gb file. It passes when using hftp or when the directory just contains smaller (2gb) files, so it looks like a webhdfs issue with large files.

{{hadoop distcp webhdfs://eli-thinkpad:50070/user/eli/data1 hdfs://localhost:8020/user/eli/data2}}
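A sketch of the header handling Daryn describes (illustrative only, not the committed patch): skip the Content-Length check when either the Transfer-Encoding or TE header, possibly comma-separated, contains "chunked":

{code}
import java.net.HttpURLConnection;

public class ChunkedCheck {
  /** True if a (possibly comma-separated) header value contains "chunked". */
  static boolean containsChunked(String headerValue) {
    if (headerValue == null) {
      return false;
    }
    for (String v : headerValue.split(",")) {
      if ("chunked".equalsIgnoreCase(v.trim())) {
        return true;
      }
    }
    return false;
  }

  static boolean isChunkedResponse(HttpURLConnection conn) {
    return containsChunked(conn.getHeaderField("Transfer-Encoding"))
        || containsChunked(conn.getHeaderField("TE"));
  }

  // When isChunkedResponse(conn) is true there is no Content-Length to verify;
  // otherwise a missing Content-Length should still be treated as an error.
}
{code}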
[jira] [Commented] (HDFS-3794) WebHDFS Open used with Offset returns the original (and incorrect) Content Length in the HTTP Header.
[ https://issues.apache.org/jira/browse/HDFS-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434102#comment-13434102 ]

Daryn Sharp commented on HDFS-3794:
-----------------------------------

Shouldn't it check whether length - offset goes negative? Or is that checked elsewhere?

WebHDFS Open used with Offset returns the original (and incorrect) Content Length in the HTTP Header.
------------------------------------------------------------------------------------------------------
Key: HDFS-3794
URL: https://issues.apache.org/jira/browse/HDFS-3794
Project: Hadoop HDFS
Issue Type: Bug
Components: webhdfs
Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-alpha
Reporter: Ravi Prakash
Assignee: Ravi Prakash
Attachments: HDFS-3794.patch

When an offset is specified, the HTTP header Content-Length still contains the original file size. E.g. if the original file is 100 bytes and the offset specified is 10, then the HTTP Content-Length ought to be 90. Currently it is still returned as 100. This causes curl to give error 18 and Java to throw a ConnectionClosedException.
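A minimal illustration of the behavior being discussed (not the actual patch): the Content-Length for an OPEN with an offset should be the number of bytes remaining, with out-of-range offsets rejected rather than producing a negative length:

{code}
public class ContentLengthForOffset {
  /** Content-Length for an OPEN at the given offset. */
  static long remainingLength(long fileLength, long offset) {
    if (offset < 0 || offset >= fileLength) {
      throw new IllegalArgumentException(
          "Offset=" + offset + " out of the range [0, " + fileLength + ")");
    }
    return fileLength - offset;   // e.g. 100-byte file, offset 10 -> 90
  }
}
{code}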
[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname
[ https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434112#comment-13434112 ]

Daryn Sharp commented on HDFS-3150:
-----------------------------------

Are you intending to address the token issues too? Or did I overlook that in my skim of the new patch? I'm still wondering if we should deprecate use_ip and unify it with the two new keys. Thoughts?

Add option for clients to contact DNs via hostname
---------------------------------------------------
Key: HDFS-3150
URL: https://issues.apache.org/jira/browse/HDFS-3150
Project: Hadoop HDFS
Issue Type: New Feature
Components: data-node, hdfs client
Affects Versions: 1.0.0, 2.0.0-alpha
Reporter: Eli Collins
Assignee: Eli Collins
Fix For: 1.1.0
Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt

The DN listens on multiple IP addresses (the default {{dfs.datanode.address}} is the wildcard); however, per HADOOP-6867, only the source address (IP) of the registration is given to clients. HADOOP-985 made clients access datanodes by IP primarily to avoid the latency of a DNS lookup; this had the side effect of breaking DN multihoming (the client cannot route the IP exposed by the NN if the DN registers with an interface that has a cluster-private IP). To fix this, let's add back the option for Datanodes to be accessed by hostname. This can be done by:
# Modifying the primary field of the Datanode descriptor to be the hostname, or
# Modifying client/Datanode-to-Datanode access to use the hostname field instead of the IP

Approach #2 does not require an incompatible client protocol change and is much less invasive. It minimizes the scope of modification to just the places where clients and Datanodes connect, vs changing all uses of Datanode identifiers. New client and Datanode configuration options are introduced:
- {{dfs.client.use.datanode.hostname}} indicates all client-to-datanode connections should use the datanode hostname (as clients outside the cluster may not be able to route the IP)
- {{dfs.datanode.use.datanode.hostname}} indicates whether Datanodes should use hostnames when connecting to other Datanodes for data transfer

If the configuration options are not used, there is no change in the current behavior.
[jira] [Updated] (HDFS-3597) SNN can fail to start on upgrade
[ https://issues.apache.org/jira/browse/HDFS-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated HDFS-3597:
------------------------------
Fix Version/s: 0.23.3

I've committed this to 23.

SNN can fail to start on upgrade
---------------------------------
Key: HDFS-3597
URL: https://issues.apache.org/jira/browse/HDFS-3597
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.0.0-alpha
Reporter: Andy Isaacson
Assignee: Andy Isaacson
Priority: Minor
Fix For: 0.23.3, 3.0.0, 2.2.0-alpha
Attachments: hdfs-3597-2.txt, hdfs-3597-3.txt, hdfs-3597-4.txt, hdfs-3597.txt

When upgrading from 1.x to 2.0.0, the SecondaryNameNode can fail to start up:

{code}
2012-06-16 09:52:33,812 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint
java.io.IOException: Inconsistent checkpoint fields.
LV = -40 namespaceID = 64415959 cTime = 1339813974990 ; clusterId = CID-07a82b97-8d04-4fdd-b3a1-f40650163245 ; blockpoolId = BP-1792677198-172.29.121.67-1339813967723.
Expecting respectively: -19; 64415959; 0; ; .
        at org.apache.hadoop.hdfs.server.namenode.CheckpointSignature.validateStorageInfo(CheckpointSignature.java:120)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:454)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:334)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$2.run(SecondaryNameNode.java:301)
        at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:438)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:297)
        at java.lang.Thread.run(Thread.java:662)
{code}

The error check we're hitting came from HDFS-1073, and it's intended to verify that we're connecting to the correct NN. But the check is too strict and considers a different metadata version to be the same as a different clusterID. I believe the check in {{doCheckpoint}} simply needs to explicitly check for and handle the update case.
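A heavily simplified illustration of the kind of relaxation described above; this is an assumption about the shape of the fix, not the actual HDFS-3597 patch, which may handle the upgrade case differently:

{code}
public class CheckpointFieldsCheck {
  static void validate(String snnClusterId, String nnClusterId) {
    // During an upgrade the SNN side has no clusterID/blockpoolID yet; treat
    // empty fields as "not yet initialized" rather than as an inconsistency.
    boolean notYetInitialized = snnClusterId == null || snnClusterId.isEmpty();
    if (!notYetInitialized && !snnClusterId.equals(nnClusterId)) {
      throw new IllegalStateException(
          "Inconsistent checkpoint fields: clusterId " + snnClusterId
              + " does not match " + nnClusterId);
    }
    // A differing layout version alone (-19 vs -40 after a 1.x -> 2.x upgrade)
    // is not by itself evidence of talking to the wrong namenode.
  }
}
{code}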
[jira] [Commented] (HDFS-3731) 2.0 release upgrade must handle blocks being written from 1.0
[ https://issues.apache.org/jira/browse/HDFS-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434190#comment-13434190 ]

Robert Joseph Evans commented on HDFS-3731:
-------------------------------------------

I am not an HDFS expert but the patch looks good to me. +1 non-binding.

2.0 release upgrade must handle blocks being written from 1.0
---------------------------------------------------------------
Key: HDFS-3731
URL: https://issues.apache.org/jira/browse/HDFS-3731
Project: Hadoop HDFS
Issue Type: Bug
Components: data-node
Affects Versions: 2.0.0-alpha
Reporter: Suresh Srinivas
Assignee: Colin Patrick McCabe
Priority: Blocker
Attachments: HDFS-3731.002.patch, HDFS-3731.003.patch

Release 2.0 upgrades must handle blocks being written to (bbw) files from the 1.0 release. Problem reported by Brahma Reddy.
[jira] [Updated] (HDFS-3794) WebHDFS Open used with Offset returns the original (and incorrect) Content Length in the HTTP Header.
[ https://issues.apache.org/jira/browse/HDFS-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Prakash updated HDFS-3794:
-------------------------------
Attachment: HDFS-3794.patch

Thanks a lot Nicholas! I'm afraid I don't know enough about the code. I'll defer to you on this! I'm attaching the modified patch with the change you suggested.

Thanks Daryn. It discovers an out-of-range offset and throws an exception before reaching this method.

{noformat}
$ curl -L "http://HOST:PORT/webhdfs/v1/somePath/someFile?op=OPEN&offset=457236547"
{"RemoteException":{"exception":"IOException","javaClassName":"java.io.IOException","message":"Offset=457236547 out of the range [0, 457236477); OPEN, path=/somePath/someFile"}}
{noformat}

WebHDFS Open used with Offset returns the original (and incorrect) Content Length in the HTTP Header.
------------------------------------------------------------------------------------------------------
Key: HDFS-3794
URL: https://issues.apache.org/jira/browse/HDFS-3794
Project: Hadoop HDFS
Issue Type: Bug
Components: webhdfs
Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-alpha
Reporter: Ravi Prakash
Assignee: Ravi Prakash
Attachments: HDFS-3794.patch, HDFS-3794.patch

When an offset is specified, the HTTP header Content-Length still contains the original file size. E.g. if the original file is 100 bytes and the offset specified is 10, then the HTTP Content-Length ought to be 90. Currently it is still returned as 100. This causes curl to give error 18 and Java to throw a ConnectionClosedException.
[jira] [Commented] (HDFS-3794) WebHDFS Open used with Offset returns the original (and incorrect) Content Length in the HTTP Header.
[ https://issues.apache.org/jira/browse/HDFS-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434193#comment-13434193 ]

Ravi Prakash commented on HDFS-3794:
------------------------------------

I tested the modified patch and it too worked in all three previous cases.

WebHDFS Open used with Offset returns the original (and incorrect) Content Length in the HTTP Header.
------------------------------------------------------------------------------------------------------
Key: HDFS-3794
URL: https://issues.apache.org/jira/browse/HDFS-3794
Project: Hadoop HDFS
Issue Type: Bug
Components: webhdfs
Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-alpha
Reporter: Ravi Prakash
Assignee: Ravi Prakash
Attachments: HDFS-3794.patch, HDFS-3794.patch

When an offset is specified, the HTTP header Content-Length still contains the original file size. E.g. if the original file is 100 bytes and the offset specified is 10, then the HTTP Content-Length ought to be 90. Currently it is still returned as 100. This causes curl to give error 18 and Java to throw a ConnectionClosedException.
[jira] [Commented] (HDFS-3796) Speed up edit log tests by avoiding fsync()
[ https://issues.apache.org/jira/browse/HDFS-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434204#comment-13434204 ]

Colin Patrick McCabe commented on HDFS-3796:
---------------------------------------------

Great idea. I think we should also do this in {{TestNameNodeRecovery}}, {{TestFileJournalManager}}, {{TestSecurityTokenEditLog}}, {{TestEditLogsDuringFailover}}, and {{TestEditLogFileOutputStream}}.

Speed up edit log tests by avoiding fsync()
--------------------------------------------
Key: HDFS-3796
URL: https://issues.apache.org/jira/browse/HDFS-3796
Project: Hadoop HDFS
Issue Type: Improvement
Components: test
Affects Versions: 3.0.0, 2.2.0-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
Attachments: hdfs-3796.txt

Our edit log tests are very slow because they incur a lot of fsyncs as they write out transactions. Since fsync() has no effect except in the case of power outages or system crashes, and we don't care about power outages in the context of tests, we can safely skip the fsync without any loss in coverage. In my tests, this sped up TestEditLog by about 5x. The testFuzzSequences test case improved from ~83 seconds with fsync to about 5 seconds without. These results are from my SSD laptop; they are probably even more drastic on spinning media.
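An illustrative test-only hook for skipping fsync; the class and flag names here are assumptions for the sketch, not the names used in the attached patch:

{code}
import java.io.FileDescriptor;
import java.io.IOException;

public class EditLogSyncHook {
  // Visible for testing: edit log unit tests set this to true.
  public static volatile boolean skipFsyncForTesting = false;

  public static void sync(FileDescriptor fd) throws IOException {
    if (skipFsyncForTesting) {
      return;            // durability only matters for crashes/power loss
    }
    fd.sync();           // production path still forces data to the device
  }
}
{code}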
[jira] [Updated] (HDFS-3791) Backport HDFS-173 to Branch-1 : Recursively deleting a directory with millions of files makes NameNode unresponsive for other commands until the deletion completes
[ https://issues.apache.org/jira/browse/HDFS-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uma Maheswara Rao G updated HDFS-3791:
--------------------------------------
Summary: Backport HDFS-173 to Branch-1 : Recursively deleting a directory with millions of files makes NameNode unresponsive for other commands until the deletion completes (was: Backport HDFS-173 Recursively deleting a directory with millions of files makes NameNode unresponsive for other commands until the deletion completes)

Backport HDFS-173 to Branch-1 : Recursively deleting a directory with millions of files makes NameNode unresponsive for other commands until the deletion completes
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
Key: HDFS-3791
URL: https://issues.apache.org/jira/browse/HDFS-3791
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 1.0.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G

Backport HDFS-173. See the [comment|https://issues.apache.org/jira/browse/HDFS-2815?focusedCommentId=13422007&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13422007] for more details.
[jira] [Updated] (HDFS-3791) Backport HDFS-173 to Branch-1 : Recursively deleting a directory with millions of files makes NameNode unresponsive for other commands until the deletion completes
[ https://issues.apache.org/jira/browse/HDFS-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uma Maheswara Rao G updated HDFS-3791:
--------------------------------------
Attachment: HDFS-3791.patch

Backport HDFS-173 to Branch-1 : Recursively deleting a directory with millions of files makes NameNode unresponsive for other commands until the deletion completes
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
Key: HDFS-3791
URL: https://issues.apache.org/jira/browse/HDFS-3791
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 1.0.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
Attachments: HDFS-3791.patch

Backport HDFS-173. See the [comment|https://issues.apache.org/jira/browse/HDFS-2815?focusedCommentId=13422007&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13422007] for more details.
[jira] [Commented] (HDFS-3791) Backport HDFS-173 to Branch-1 : Recursively deleting a directory with millions of files makes NameNode unresponsive for other commands until the deletion completes
[ https://issues.apache.org/jira/browse/HDFS-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434209#comment-13434209 ]

Uma Maheswara Rao G commented on HDFS-3791:
-------------------------------------------

Hi Suresh, I have just attached a back-port patch here. Could you please take a look?

Backport HDFS-173 to Branch-1 : Recursively deleting a directory with millions of files makes NameNode unresponsive for other commands until the deletion completes
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
Key: HDFS-3791
URL: https://issues.apache.org/jira/browse/HDFS-3791
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 1.0.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
Attachments: HDFS-3791.patch

Backport HDFS-173. See the [comment|https://issues.apache.org/jira/browse/HDFS-2815?focusedCommentId=13422007&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13422007] for more details.
[jira] [Commented] (HDFS-3048) Small race in BlockManager#close
[ https://issues.apache.org/jira/browse/HDFS-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434212#comment-13434212 ]

Karthik Kambatla commented on HDFS-3048:
-----------------------------------------

+1

Small race in BlockManager#close
---------------------------------
Key: HDFS-3048
URL: https://issues.apache.org/jira/browse/HDFS-3048
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 2.0.0-alpha
Reporter: Eli Collins
Assignee: Andy Isaacson
Attachments: hdfs-3048.txt, hdfs-3787-2.txt

There's a small race in BlockManager#close: we close the BlocksMap before the replication monitor, which means the replication monitor can NPE if it tries to access the blocks map. We need to swap the order (close the blocks map after shutting down the replication monitor).
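A simplified sketch of the ordering fix the issue describes; this is not the actual BlockManager code, just an illustration of stopping the consumer thread before tearing down the structure it reads:

{code}
import java.util.HashMap;
import java.util.Map;

public class OrderedClose {
  private Thread replicationMonitor;
  private Map<Long, Object> blocksMap = new HashMap<Long, Object>();

  OrderedClose(Runnable monitor) {
    replicationMonitor = new Thread(monitor, "ReplicationMonitor");
    replicationMonitor.start();
  }

  public void close() throws InterruptedException {
    // 1. Stop the replication monitor first ...
    replicationMonitor.interrupt();
    replicationMonitor.join(3000);
    // 2. ... and only then tear down the blocks map it reads from,
    //    so the monitor can never dereference a cleared map.
    blocksMap.clear();
    blocksMap = null;
  }
}
{code}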
[jira] [Commented] (HDFS-3658) TestDFSClientRetries#testNamenodeRestart failed
[ https://issues.apache.org/jira/browse/HDFS-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434213#comment-13434213 ]

Aaron T. Myers commented on HDFS-3658:
---------------------------------------

Hi Nicholas, did you commit this to branch-2.1.0-alpha as well as branch-2? If not, I believe the fix version should be set to 2.2.0-alpha. Do you agree?

TestDFSClientRetries#testNamenodeRestart failed
------------------------------------------------
Key: HDFS-3658
URL: https://issues.apache.org/jira/browse/HDFS-3658
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.0.0-alpha
Reporter: Eli Collins
Assignee: Tsz Wo (Nicholas), SZE
Fix For: 1.2.0, 2.1.0-alpha
Attachments: h3658_20120808_b-1.patch, h3658_20120808.patch, test-log.txt

Saw the following fail on a jenkins run:

{noformat}
Error Message

expected:MD5-of-0MD5-of-512CRC32:f397fb3d9133d0a8f55854ea2bb268b0 but was:MD5-of-0MD5-of-0CRC32:70bc8f4b72a86921468bf8e8441dce51

Stacktrace

junit.framework.AssertionFailedError: expected:MD5-of-0MD5-of-512CRC32:f397fb3d9133d0a8f55854ea2bb268b0 but was:MD5-of-0MD5-of-0CRC32:70bc8f4b72a86921468bf8e8441dce51
        at junit.framework.Assert.fail(Assert.java:47)
        at junit.framework.Assert.failNotEquals(Assert.java:283)
        at junit.framework.Assert.assertEquals(Assert.java:64)
        at junit.framework.Assert.assertEquals(Assert.java:71)
        at org.apache.hadoop.hdfs.TestDFSClientRetries.testNamenodeRestart(TestDFSClientRetries.java:886)
{noformat}
[jira] [Updated] (HDFS-3258) Test for HADOOP-8144 (pseudoSortByDistance in NetworkTopology for first rack local node)
[ https://issues.apache.org/jira/browse/HDFS-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated HDFS-3258:
------------------------------
Fix Version/s: 0.23.3

Merged to branch 23.

Test for HADOOP-8144 (pseudoSortByDistance in NetworkTopology for first rack local node)
------------------------------------------------------------------------------------------
Key: HDFS-3258
URL: https://issues.apache.org/jira/browse/HDFS-3258
Project: Hadoop HDFS
Issue Type: Test
Components: test
Affects Versions: 0.23.0, 1.0.0
Reporter: Eli Collins
Assignee: Junping Du
Fix For: 0.23.3, 2.0.0-alpha
Attachments: HDFS-3258.patch, hdfs-3258.txt

For updating TestNetworkTopology to cover HADOOP-8144.
[jira] [Updated] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HDFS-3788:
------------------------------------------
Attachment: h3788_20120814.patch

Thanks Daryn for taking a look. Here is a new patch: h3788_20120814.patch

I have run the 3GB file test included in the previous patch. The test will not be committed since it takes 10 minutes.

distcp can't copy large files using webhdfs due to missing Content-Length header
---------------------------------------------------------------------------------
Key: HDFS-3788
URL: https://issues.apache.org/jira/browse/HDFS-3788
Project: Hadoop HDFS
Issue Type: Bug
Components: webhdfs
Affects Versions: 0.23.3, 2.0.0-alpha
Reporter: Eli Collins
Assignee: Tsz Wo (Nicholas), SZE
Priority: Critical
Attachments: distcp-webhdfs-errors.txt, h3788_20120813.patch, h3788_20120814.patch

The following command fails when data1 contains a 3gb file. It passes when using hftp or when the directory just contains smaller (2gb) files, so it looks like a webhdfs issue with large files.

{{hadoop distcp webhdfs://eli-thinkpad:50070/user/eli/data1 hdfs://localhost:8020/user/eli/data2}}
[jira] [Updated] (HDFS-3658) TestDFSClientRetries#testNamenodeRestart failed
[ https://issues.apache.org/jira/browse/HDFS-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HDFS-3658:
------------------------------------------
Environment: Aaron, you are right. It should be 2.2.0-alpha.
Fix Version/s: (was: 2.1.0-alpha) 2.2.0-alpha

TestDFSClientRetries#testNamenodeRestart failed
------------------------------------------------
Key: HDFS-3658
URL: https://issues.apache.org/jira/browse/HDFS-3658
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.0.0-alpha
Environment: Aaron, you are right. It should be 2.2.0-alpha.
Reporter: Eli Collins
Assignee: Tsz Wo (Nicholas), SZE
Fix For: 1.2.0, 2.2.0-alpha
Attachments: h3658_20120808_b-1.patch, h3658_20120808.patch, test-log.txt

Saw the following fail on a jenkins run:

{noformat}
Error Message

expected:MD5-of-0MD5-of-512CRC32:f397fb3d9133d0a8f55854ea2bb268b0 but was:MD5-of-0MD5-of-0CRC32:70bc8f4b72a86921468bf8e8441dce51

Stacktrace

junit.framework.AssertionFailedError: expected:MD5-of-0MD5-of-512CRC32:f397fb3d9133d0a8f55854ea2bb268b0 but was:MD5-of-0MD5-of-0CRC32:70bc8f4b72a86921468bf8e8441dce51
        at junit.framework.Assert.fail(Assert.java:47)
        at junit.framework.Assert.failNotEquals(Assert.java:283)
        at junit.framework.Assert.assertEquals(Assert.java:64)
        at junit.framework.Assert.assertEquals(Assert.java:71)
        at org.apache.hadoop.hdfs.TestDFSClientRetries.testNamenodeRestart(TestDFSClientRetries.java:886)
{noformat}
[jira] [Commented] (HDFS-3801) Provide a way to disable browsing of files from the web UI
[ https://issues.apache.org/jira/browse/HDFS-3801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434268#comment-13434268 ]

Harsh J commented on HDFS-3801:
-------------------------------

Thanks Steve. That should be possible to do, given that we mostly seem to call the JspHelper class methods.

Provide a way to disable browsing of files from the web UI
-----------------------------------------------------------
Key: HDFS-3801
URL: https://issues.apache.org/jira/browse/HDFS-3801
Project: Hadoop HDFS
Issue Type: Improvement
Components: name-node
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Priority: Minor

A few times we've had requests from users who wish to disable browsing of the filesystem in the web UI completely, while keeping other servlet functionality enabled (such as fsck, etc.). Right now, the cheap way to do this is by blocking out the DN web port (50075) from access by clients, but that also hampers HFTP transfers. We should instead provide a toggle config for the JSPs to use and disallow browsing if the toggle is enabled. The config can default to true, so as not to change the current behavior.
[jira] [Commented] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434276#comment-13434276 ]

Eli Collins commented on HDFS-3788:
------------------------------------

bq. Do you mean fs -get? No way, they should have the same code path. Are you sure that both server and client were running trunk?

Yes, hadoop fs -get of a 3gb file works but distcp of the directory containing that file fails. And yes, I am using a trunk build for everything, just running this via a pseudo-distributed tarball install on my laptop.

Can you explain what the bug is and the relevant fix? I don't see why we were not setting the Content-Length header, as we do that unconditionally on the server side.

distcp can't copy large files using webhdfs due to missing Content-Length header
---------------------------------------------------------------------------------
Key: HDFS-3788
URL: https://issues.apache.org/jira/browse/HDFS-3788
Project: Hadoop HDFS
Issue Type: Bug
Components: webhdfs
Affects Versions: 0.23.3, 2.0.0-alpha
Reporter: Eli Collins
Assignee: Tsz Wo (Nicholas), SZE
Priority: Critical
Attachments: distcp-webhdfs-errors.txt, h3788_20120813.patch, h3788_20120814.patch

The following command fails when data1 contains a 3gb file. It passes when using hftp or when the directory just contains smaller (2gb) files, so it looks like a webhdfs issue with large files.

{{hadoop distcp webhdfs://eli-thinkpad:50070/user/eli/data1 hdfs://localhost:8020/user/eli/data2}}
[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname
[ https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434277#comment-13434277 ]

Eli Collins commented on HDFS-3150:
------------------------------------

These new options are distinct from hadoop.security.token.service.use_ip; it would be reasonable to have them set to true and use_ip set to false, and vice versa. Note that these only affect client-to-DN and DN-to-DN connections, whereas use_ip affects client-to-NN connections. There's really not much overlap.

Add option for clients to contact DNs via hostname
---------------------------------------------------
Key: HDFS-3150
URL: https://issues.apache.org/jira/browse/HDFS-3150
Project: Hadoop HDFS
Issue Type: New Feature
Components: data-node, hdfs client
Affects Versions: 1.0.0, 2.0.0-alpha
Reporter: Eli Collins
Assignee: Eli Collins
Fix For: 1.1.0
Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt

The DN listens on multiple IP addresses (the default {{dfs.datanode.address}} is the wildcard); however, per HADOOP-6867, only the source address (IP) of the registration is given to clients. HADOOP-985 made clients access datanodes by IP primarily to avoid the latency of a DNS lookup; this had the side effect of breaking DN multihoming (the client cannot route the IP exposed by the NN if the DN registers with an interface that has a cluster-private IP). To fix this, let's add back the option for Datanodes to be accessed by hostname. This can be done by:
# Modifying the primary field of the Datanode descriptor to be the hostname, or
# Modifying client/Datanode-to-Datanode access to use the hostname field instead of the IP

Approach #2 does not require an incompatible client protocol change and is much less invasive. It minimizes the scope of modification to just the places where clients and Datanodes connect, vs changing all uses of Datanode identifiers. New client and Datanode configuration options are introduced:
- {{dfs.client.use.datanode.hostname}} indicates all client-to-datanode connections should use the datanode hostname (as clients outside the cluster may not be able to route the IP)
- {{dfs.datanode.use.datanode.hostname}} indicates whether Datanodes should use hostnames when connecting to other Datanodes for data transfer

If the configuration options are not used, there is no change in the current behavior.
[jira] [Commented] (HDFS-3731) 2.0 release upgrade must handle blocks being written from 1.0
[ https://issues.apache.org/jira/browse/HDFS-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434284#comment-13434284 ]

Suresh Srinivas commented on HDFS-3731:
----------------------------------------

Colin, can you please add a description of the final approach you are taking to solve this problem?

2.0 release upgrade must handle blocks being written from 1.0
---------------------------------------------------------------
Key: HDFS-3731
URL: https://issues.apache.org/jira/browse/HDFS-3731
Project: Hadoop HDFS
Issue Type: Bug
Components: data-node
Affects Versions: 2.0.0-alpha
Reporter: Suresh Srinivas
Assignee: Colin Patrick McCabe
Priority: Blocker
Attachments: HDFS-3731.002.patch, HDFS-3731.003.patch

Release 2.0 upgrades must handle blocks being written to (bbw) files from the 1.0 release. Problem reported by Brahma Reddy.
[jira] [Commented] (HDFS-3772) HDFS NN will hang in safe mode and never come out if we change the dfs.namenode.replication.min bigger.
[ https://issues.apache.org/jira/browse/HDFS-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434293#comment-13434293 ]

Konstantin Shvachko commented on HDFS-3772:
--------------------------------------------

I was writing this when Jira went down. Will try to reproduce that comment.

bq. files created with the old replication count will be expected to bump up to the new minimum upon restart automatically

This is not an expected behavior. {{dfs.namenode.replication.min}} has two purposes:
# It counts the blocks satisfying the new minimum replication during startup.
# It controls the minimal number of replicas that must be created during the write pipeline in order to call the data transfer successful.

Setting {{replication.min}} to a higher value does not mean the NN replicates blocks to that minimum. It means the NN will wait for that many replicas to be reported during startup before exiting SafeMode. If you set it too high, this is one of the ways to never let the NN go out of SafeMode automatically. SafeMode prohibits replication or deletion of blocks and modification of the namespace, so block replication will not happen until the NN leaves SafeMode.

If you are trying to increase block replication for all files in your file system, you should use {{setReplication()}} on the root recursively. But replication will start only after SafeMode is OFF.

bq. I think we can change the semantics of this parameter to the percentage of blocks that satisfy the real replication of each file.

Not a good idea. In general, changing the semantics of existing parameters is confusing. And in particular, this would make the NN stay in SafeMode forever if some DataNodes don't come up. I think the question here is: what are you trying to achieve with this?

HDFS NN will hang in safe mode and never come out if we change the dfs.namenode.replication.min bigger.
--------------------------------------------------------------------------------------------------------
Key: HDFS-3772
URL: https://issues.apache.org/jira/browse/HDFS-3772
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 2.0.0-alpha
Reporter: Yanbo Liang

If the NN restarts with a new minimum replication (dfs.namenode.replication.min), any files created with the old replication count will be expected to bump up to the new minimum upon restart automatically. However, the real behavior is that if the NN restarts with a new minimum replication that is bigger than the old one, the NN will hang in safe mode and never come out. The corresponding test case passes only because we are missing some test coverage; this was discussed in HDFS-3734.

If the NN receives reports for enough blocks satisfying the new minimum replication, it exits safe mode. However, if we configure a bigger minimum replication, there will not be enough blocks satisfying that limit. Look at the code segment in FSNamesystem.java:

{code}
private synchronized void incrementSafeBlockCount(short replication) {
  if (replication == safeReplication) {
    this.blockSafe++;
    checkMode();
  }
}
{code}

The DNs report blocks to the NN, and if the replication equals safeReplication (which is assigned from the new minimum replication), blockSafe is incremented. But if we configure a bigger minimum replication, all the blocks whose replication is lower than it can never satisfy this equality, even though the NN has actually received complete block information. As a result, blockSafe does not increment as usual, never reaches the amount needed to exit safe mode, and the NN hangs.
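As an illustration of the approach Konstantin suggests (raising the replication of existing files with setReplication() rather than by bumping dfs.namenode.replication.min), a simple recursive walk; this is only a sketch, and the shell equivalent is roughly "hadoop fs -setrep -R <replication> <path>":

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RaiseReplication {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    setRepRecursively(fs, new Path(args[0]), Short.parseShort(args[1]));
  }

  static void setRepRecursively(FileSystem fs, Path dir, short rep)
      throws Exception {
    for (FileStatus st : fs.listStatus(dir)) {
      if (st.isDirectory()) {
        setRepRecursively(fs, st.getPath(), rep);
      } else {
        // Takes effect only after the NN has left safe mode.
        fs.setReplication(st.getPath(), rep);
      }
    }
  }
}
{code}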
[jira] [Resolved] (HDFS-3649) Port HDFS-385 to branch-1-win
[ https://issues.apache.org/jira/browse/HDFS-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sumadhur Reddy Bolli resolved HDFS-3649.
-----------------------------------------
Resolution: Fixed
Release Note: Nicholas submitted the patches posted on HDFS-385 to branch-1 and branch-1-win

Port HDFS-385 to branch-1-win
------------------------------
Key: HDFS-3649
URL: https://issues.apache.org/jira/browse/HDFS-3649
Project: Hadoop HDFS
Issue Type: Improvement
Affects Versions: 1-win
Reporter: Sumadhur Reddy Bolli
Assignee: Sumadhur Reddy Bolli

Added a patch to HDFS-385 to port the existing pluggable placement policy to branch-1-win.
[jira] [Commented] (HDFS-3765) Namenode INITIALIZESHAREDEDITS should be able to initialize all shared storages
[ https://issues.apache.org/jira/browse/HDFS-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434321#comment-13434321 ]

Aaron T. Myers commented on HDFS-3765:
---------------------------------------

+1, the latest patch looks good to me.

Namenode INITIALIZESHAREDEDITS should be able to initialize all shared storages
---------------------------------------------------------------------------------
Key: HDFS-3765
URL: https://issues.apache.org/jira/browse/HDFS-3765
Project: Hadoop HDFS
Issue Type: Improvement
Components: ha
Affects Versions: 2.1.0-alpha, 3.0.0
Reporter: Vinay
Assignee: Vinay
Attachments: HDFS-3765.patch, HDFS-3765.patch, HDFS-3765.patch, hdfs-3765.txt, hdfs-3765.txt, hdfs-3765.txt

Currently, NameNode INITIALIZESHAREDEDITS provides the ability to copy the edits files to file-scheme-based shared storage when moving a cluster from a non-HA environment to an HA-enabled environment. This Jira focuses on the following:
* Generalizing the logic of copying the edits to the new shared storage so that any scheme-based shared storage can be initialized for an HA cluster.
[jira] [Updated] (HDFS-3723) All commands should support meaningful --help
[ https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao updated HDFS-3723:
----------------------------
Attachment: HDFS-3723.003.patch

All commands should support meaningful --help
---------------------------------------------
Key: HDFS-3723
URL: https://issues.apache.org/jira/browse/HDFS-3723
Project: Hadoop HDFS
Issue Type: Improvement
Components: scripts, tools
Affects Versions: 2.0.0-alpha
Reporter: E. Sammer
Assignee: Jing Zhao
Attachments: HDFS-3723.001.patch, HDFS-3723.001.patch, HDFS-3723.002.patch, HDFS-3723.003.patch, HDFS-3723.patch, HDFS-3723.patch

Some (sub)commands support -help or -h options for detailed help while others do not. Ideally, all commands should support meaningful help that works regardless of current state or configuration. For example, hdfs zkfc --help (or -h or -help) is not very useful. Option checking should occur before state / configuration checking.

{code}
[esammer@hadoop-fed01 ~]# hdfs zkfc --help
Exception in thread "main" org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode.
        at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168)
{code}

This would go a long way toward better usability for ops staff.
[jira] [Commented] (HDFS-3795) QJM: validate journal dir at startup
[ https://issues.apache.org/jira/browse/HDFS-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434326#comment-13434326 ]

Aaron T. Myers commented on HDFS-3795:
---------------------------------------

+1, the updated patch looks good to me.

QJM: validate journal dir at startup
-------------------------------------
Key: HDFS-3795
URL: https://issues.apache.org/jira/browse/HDFS-3795
Project: Hadoop HDFS
Issue Type: Sub-task
Components: ha
Affects Versions: QuorumJournalManager (HDFS-3077)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
Attachments: hdfs-3795.txt, hdfs-3795.txt

Currently, the JN does not validate the configured journal directory until it tries to write into it. This is counter-intuitive for users, since they would expect to find out about a misconfiguration at startup time rather than on first access. Additionally, two testers accidentally configured the journal dir to be a URI, which the code silently understood as a relative path ({{CWD/file:/foo/bar}}). We should validate the config at startup to be an accessible absolute path.
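A sketch of the startup validation described in the issue (illustrative, not the committed patch): reject journal directories that are not absolute local paths, which also catches a URI like file:/foo/bar being silently treated as a relative path:

{code}
import java.io.File;

public class JournalDirCheck {
  static File validateJournalDir(String configured) {
    if (configured == null || configured.isEmpty()) {
      throw new IllegalArgumentException("Journal directory is not configured");
    }
    File dir = new File(configured);
    // Rejects relative paths, including a URI such as "file:/foo/bar" that
    // would otherwise resolve to CWD/file:/foo/bar.
    if (!dir.isAbsolute()) {
      throw new IllegalArgumentException(
          "Journal directory '" + configured + "' must be an absolute path");
    }
    return dir;
  }
}
{code}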
[jira] [Updated] (HDFS-3757) libhdfs: improve native stack traces
[ https://issues.apache.org/jira/browse/HDFS-3757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Patrick McCabe updated HDFS-3757:
----------------------------------------
Status: Open (was: Patch Available)

libhdfs: improve native stack traces
-------------------------------------
Key: HDFS-3757
URL: https://issues.apache.org/jira/browse/HDFS-3757
Project: Hadoop HDFS
Issue Type: Improvement
Components: libhdfs
Affects Versions: 2.2.0-alpha
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
Attachments: HDFS-3757.001.patch

When libhdfs crashes, we often don't get very good stack traces. It would be nice to get a better stack trace for the thread that crashed.
[jira] [Assigned] (HDFS-2882) DN continues to start up, even if block pool fails to initialize
[ https://issues.apache.org/jira/browse/HDFS-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eli Collins reassigned HDFS-2882:
----------------------------------
Assignee: (was: Todd Lipcon)

DN continues to start up, even if block pool fails to initialize
------------------------------------------------------------------
Key: HDFS-2882
URL: https://issues.apache.org/jira/browse/HDFS-2882
Project: Hadoop HDFS
Issue Type: Bug
Components: data-node
Affects Versions: 0.24.0
Reporter: Todd Lipcon
Attachments: hdfs-2882.txt

I started a DN on a machine that was completely out of space on one of its drives. I saw the following:

{code}
2012-02-02 09:56:50,499 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-448349972-172.29.5.192-1323816762969 (storage id DS-507718931-172.29.5.194-11072-1297842002148) service to styx01.sf.cloudera.com/172.29.5.192:8021
java.io.IOException: Mkdirs failed to create /data/1/scratch/todd/styx-datadir/current/BP-448349972-172.29.5.192-1323816762969/tmp
        at org.apache.hadoop.hdfs.server.datanode.FSDataset$BlockPoolSlice.init(FSDataset.java:335)
{code}

but the DN continued to run, spewing NPEs when it tried to do block reports, etc. This was on the HDFS-1623 branch but may affect trunk as well.
[jira] [Commented] (HDFS-3771) Namenode can't restart due to corrupt edit logs, timing issue with shutdown and edit log rolling
[ https://issues.apache.org/jira/browse/HDFS-3771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434392#comment-13434392 ] patrick white commented on HDFS-3771: - Thanks very much Todd, appreciate the feedback and references, and the suggestion on using exit to try to reproduce this. Namenode can't restart due to corrupt edit logs, timing issue with shutdown and edit log rolling Key: HDFS-3771 URL: https://issues.apache.org/jira/browse/HDFS-3771 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.3, 2.0.0-alpha Environment: QE, 20 node Federated cluster with 3 NNs and 15 DNs, using Kerberos based security Reporter: patrick white Priority: Critical Our 0.23.3 nightly HDFS regression suite encountered a particularly nasty issue recently, which resulted in the cluster's default Namenode being unable to restart. This was on a 20-node Federated cluster with security. The cause appears to be that the NN was just starting to roll its edit log when a shutdown occurred; the shutdown was intentional, to restart the cluster as part of an automated test. The tests that were running do not appear to be the issue in themselves; the cluster was just wrapping up an adminReport subset, this failure case has not reproduced so far, and it was not failing previously. It looks like a chance occurrence of sending the shutdown just as the edit log roll was begun. From the NN log, the following sequence is noted: 1. an InvalidateBlocks operation had completed 2. FSNamesystem: Roll Edit Log from [Secondary Namenode IPaddr] 3. FSEditLog: Ending log segment 23963 4. FSEditLog: Starting log segment at 23967 5. NameNode: SHUTDOWN_MSG = the NN shuts down and then is restarted... 6. FSImageTransactionalStorageInspector: Logs beginning at txid 23967 were are all in-progress 7. FSImageTransactionalStorageInspector: Marking log at /grid/[PATH]/edits_inprogress_0023967 as corrupt since it has no transactions in it. 8. NameNode: Exception in namenode join [main]java.lang.IllegalStateException: No non-corrupt logs for txid 23967 = NN start attempts continue to cycle trying to restart but can't, failing on the same exception due to lack of non-corrupt edit logs If these observations are correct and the issue is from the shutdown happening as edit logs are rolling, does the NN have an equivalent to the conventional fs 'sync' blocking action that should be called, or is there perhaps a timing hole? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3792) Fix two findbugs introduced by HDFS-3695
[ https://issues.apache.org/jira/browse/HDFS-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434412#comment-13434412 ] Hudson commented on HDFS-3792: -- Integrated in Hadoop-Common-trunk-Commit #2576 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2576/]) HDFS-3792. Fix two findbugs introduced by HDFS-3695. Contributed by Todd Lipcon. (Revision 1372690) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1372690 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java Fix two findbugs introduced by HDFS-3695 Key: HDFS-3792 URL: https://issues.apache.org/jira/browse/HDFS-3792 Project: Hadoop HDFS Issue Type: Bug Components: build, name-node Affects Versions: 3.0.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Trivial Fix For: 3.0.0 Attachments: hdfs-3792.txt Accidentally introduced two trivial findbugs warnings in HDFS-3695. This JIRA is to fix them. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3790) test_fuse_dfs.c doesn't compile on centos 5
[ https://issues.apache.org/jira/browse/HDFS-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434411#comment-13434411 ] Hudson commented on HDFS-3790: -- Integrated in Hadoop-Common-trunk-Commit #2576 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2576/]) HDFS-3790. test_fuse_dfs.c doesn't compile on centos 5. Contributed by Colin Patrick McCabe. (Revision 1372676) Result = FAILURE atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1372676 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/test/test_fuse_dfs.c test_fuse_dfs.c doesn't compile on centos 5 --- Key: HDFS-3790 URL: https://issues.apache.org/jira/browse/HDFS-3790 Project: Hadoop HDFS Issue Type: Bug Components: fuse-dfs Affects Versions: 2.2.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.2.0-alpha Attachments: HDFS-3790.001.patch test_fuse_dfs.c uses execvpe, which doesn't exist in the version of glibc shipped on CentOS 5. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3658) TestDFSClientRetries#testNamenodeRestart failed
[ https://issues.apache.org/jira/browse/HDFS-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434414#comment-13434414 ] Hudson commented on HDFS-3658: -- Integrated in Hadoop-Common-trunk-Commit #2576 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2576/]) HDFS-3658. Fix bugs in TestDFSClientRetries and add more tests. (Revision 1372707) Result = FAILURE szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1372707 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java TestDFSClientRetries#testNamenodeRestart failed --- Key: HDFS-3658 URL: https://issues.apache.org/jira/browse/HDFS-3658 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Eli Collins Assignee: Tsz Wo (Nicholas), SZE Fix For: 1.2.0, 2.2.0-alpha Attachments: h3658_20120808_b-1.patch, h3658_20120808.patch, test-log.txt Saw the following fail on a jenkins run: {noformat} Error Message expected:MD5-of-0MD5-of-512CRC32:f397fb3d9133d0a8f55854ea2bb268b0 but was:MD5-of-0MD5-of-0CRC32:70bc8f4b72a86921468bf8e8441dce51 Stacktrace junit.framework.AssertionFailedError: expected:MD5-of-0MD5-of-512CRC32:f397fb3d9133d0a8f55854ea2bb268b0 but was:MD5-of-0MD5-of-0CRC32:70bc8f4b72a86921468bf8e8441dce51 at junit.framework.Assert.fail(Assert.java:47) at junit.framework.Assert.failNotEquals(Assert.java:283) at junit.framework.Assert.assertEquals(Assert.java:64) at junit.framework.Assert.assertEquals(Assert.java:71) at org.apache.hadoop.hdfs.TestDFSClientRetries.testNamenodeRestart(TestDFSClientRetries.java:886) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3695) Genericize format() to non-file JournalManagers
[ https://issues.apache.org/jira/browse/HDFS-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434413#comment-13434413 ] Hudson commented on HDFS-3695: -- Integrated in Hadoop-Common-trunk-Commit #2576 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2576/]) HDFS-3792. Fix two findbugs introduced by HDFS-3695. Contributed by Todd Lipcon. (Revision 1372690) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1372690 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java Genericize format() to non-file JournalManagers --- Key: HDFS-3695 URL: https://issues.apache.org/jira/browse/HDFS-3695 Project: Hadoop HDFS Issue Type: Sub-task Components: ha, name-node Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 3.0.0 Attachments: hdfs-3695.txt, hdfs-3695.txt, hdfs-3695.txt Currently, the namenode -format and namenode -initializeSharedEdits commands do not understand how to do anything with non-file-based shared storage. This affects both BookKeeperJournalManager and QuorumJournalManager. This JIRA is to plumb through the formatting of edits directories using pluggable journal manager implementations so that no separate step needs to be taken to format them -- the same commands will work for NFS-based storage or one of the alternate implementations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
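The gist of the HDFS-3695 change, as a hedged sketch (the interface name below is made up and the real format() call also carries namespace metadata; this only illustrates the shape of the plumbing):
{code}
import java.io.IOException;

/**
 * Simplified view of the pluggable journal manager contract after this
 * change: formatting is part of the interface itself, so "namenode -format"
 * and "-initializeSharedEdits" can drive file-based, BookKeeper-based, or
 * quorum-based edits storage through the same code path.
 */
interface FormattableJournal {
  /** Wipe and re-initialize whatever storage backs this journal. */
  void format() throws IOException;

  /** True if the storage already holds data that a format would destroy. */
  boolean hasSomeData() throws IOException;
}
{code}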
[jira] [Commented] (HDFS-3790) test_fuse_dfs.c doesn't compile on centos 5
[ https://issues.apache.org/jira/browse/HDFS-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434416#comment-13434416 ] Hudson commented on HDFS-3790: -- Integrated in Hadoop-Hdfs-trunk-Commit #2641 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2641/]) HDFS-3790. test_fuse_dfs.c doesn't compile on centos 5. Contributed by Colin Patrick McCabe. (Revision 1372676) Result = FAILURE atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1372676 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/test/test_fuse_dfs.c test_fuse_dfs.c doesn't compile on centos 5 --- Key: HDFS-3790 URL: https://issues.apache.org/jira/browse/HDFS-3790 Project: Hadoop HDFS Issue Type: Bug Components: fuse-dfs Affects Versions: 2.2.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.2.0-alpha Attachments: HDFS-3790.001.patch test_fuse_dfs.c uses execvpe, which doesn't exist in the version of glibc shipped on CentOS 5. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3695) Genericize format() to non-file JournalManagers
[ https://issues.apache.org/jira/browse/HDFS-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434418#comment-13434418 ] Hudson commented on HDFS-3695: -- Integrated in Hadoop-Hdfs-trunk-Commit #2641 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2641/]) HDFS-3792. Fix two findbugs introduced by HDFS-3695. Contributed by Todd Lipcon. (Revision 1372690) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1372690 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java Genericize format() to non-file JournalManagers --- Key: HDFS-3695 URL: https://issues.apache.org/jira/browse/HDFS-3695 Project: Hadoop HDFS Issue Type: Sub-task Components: ha, name-node Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 3.0.0 Attachments: hdfs-3695.txt, hdfs-3695.txt, hdfs-3695.txt Currently, the namenode -format and namenode -initializeSharedEdits commands do not understand how to do anything with non-file-based shared storage. This affects both BookKeeperJournalManager and QuorumJournalManager. This JIRA is to plumb through the formatting of edits directories using pluggable journal manager implementations so that no separate step needs to be taken to format them -- the same commands will work for NFS-based storage or one of the alternate implementations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3792) Fix two findbugs introduced by HDFS-3695
[ https://issues.apache.org/jira/browse/HDFS-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434417#comment-13434417 ] Hudson commented on HDFS-3792: -- Integrated in Hadoop-Hdfs-trunk-Commit #2641 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2641/]) HDFS-3792. Fix two findbugs introduced by HDFS-3695. Contributed by Todd Lipcon. (Revision 1372690) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1372690 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java Fix two findbugs introduced by HDFS-3695 Key: HDFS-3792 URL: https://issues.apache.org/jira/browse/HDFS-3792 Project: Hadoop HDFS Issue Type: Bug Components: build, name-node Affects Versions: 3.0.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Trivial Fix For: 3.0.0 Attachments: hdfs-3792.txt Accidentally introduced two trivial findbugs warnings in HDFS-3695. This JIRA is to fix them. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3658) TestDFSClientRetries#testNamenodeRestart failed
[ https://issues.apache.org/jira/browse/HDFS-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434419#comment-13434419 ] Hudson commented on HDFS-3658: -- Integrated in Hadoop-Hdfs-trunk-Commit #2641 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2641/]) HDFS-3658. Fix bugs in TestDFSClientRetries and add more tests. (Revision 1372707) Result = FAILURE szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1372707 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java TestDFSClientRetries#testNamenodeRestart failed --- Key: HDFS-3658 URL: https://issues.apache.org/jira/browse/HDFS-3658 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Eli Collins Assignee: Tsz Wo (Nicholas), SZE Fix For: 1.2.0, 2.2.0-alpha Attachments: h3658_20120808_b-1.patch, h3658_20120808.patch, test-log.txt Saw the following fail on a jenkins run: {noformat} Error Message expected:MD5-of-0MD5-of-512CRC32:f397fb3d9133d0a8f55854ea2bb268b0 but was:MD5-of-0MD5-of-0CRC32:70bc8f4b72a86921468bf8e8441dce51 Stacktrace junit.framework.AssertionFailedError: expected:MD5-of-0MD5-of-512CRC32:f397fb3d9133d0a8f55854ea2bb268b0 but was:MD5-of-0MD5-of-0CRC32:70bc8f4b72a86921468bf8e8441dce51 at junit.framework.Assert.fail(Assert.java:47) at junit.framework.Assert.failNotEquals(Assert.java:283) at junit.framework.Assert.assertEquals(Assert.java:64) at junit.framework.Assert.assertEquals(Assert.java:71) at org.apache.hadoop.hdfs.TestDFSClientRetries.testNamenodeRestart(TestDFSClientRetries.java:886) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3649) Port HDFS-385 to branch-1-win
[ https://issues.apache.org/jira/browse/HDFS-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sumadhur Reddy Bolli updated HDFS-3649: --- Release Note: block placement policy is now ported to branch-1 and branch-1-win (was: Nicholas submitted the patches posted on HDFS-385 to branch-1 and branch-1-win) Nicholas committed the patches posted on HDFS-385 to branch-1 and branch-1-win Port HDFS-385 to branch-1-win - Key: HDFS-3649 URL: https://issues.apache.org/jira/browse/HDFS-3649 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 1-win Reporter: Sumadhur Reddy Bolli Assignee: Sumadhur Reddy Bolli Added patch to HDFS-385 to port the existing pluggable placement policy to branch-1-win -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3731) 2.0 release upgrade must handle blocks being written from 1.0
[ https://issues.apache.org/jira/browse/HDFS-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-3731: --- Description: Release 2.0 upgrades must handle blocks being written to (bbw) files from 1.0 release. Problem reported by Brahma Reddy. The {{DataNode}} will only have one block pool after upgrading from a 1.x release. (This is because in the 1.x releases, there were no block pools-- or equivalently, everything was in the same block pool). During the upgrade, we should hardlink the block files from the {{blocksBeingWritten}} directory into the {{rbw}} directory of this block pool. Similarly, on {{-finalize}}, we should delete the {{blocksBeingWritten}} directory. was: Release 2.0 upgrades must handle blocks being written to (bbw) files from 1.0 release. Problem reported by Brahma Reddy. 2.0 release upgrade must handle blocks being written from 1.0 - Key: HDFS-3731 URL: https://issues.apache.org/jira/browse/HDFS-3731 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.0-alpha Reporter: Suresh Srinivas Assignee: Colin Patrick McCabe Priority: Blocker Attachments: HDFS-3731.002.patch, HDFS-3731.003.patch Release 2.0 upgrades must handle blocks being written to (bbw) files from 1.0 release. Problem reported by Brahma Reddy. The {{DataNode}} will only have one block pool after upgrading from a 1.x release. (This is because in the 1.x releases, there were no block pools-- or equivalently, everything was in the same block pool). During the upgrade, we should hardlink the block files from the {{blocksBeingWritten}} directory into the {{rbw}} directory of this block pool. Similarly, on {{-finalize}}, we should delete the {{blocksBeingWritten}} directory. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
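A rough sketch of the hardlinking step described above, assuming a flat blocksBeingWritten directory and using java.nio hard links (the actual upgrade code has to handle the real directory layout, generation stamps, and rollback snapshots):
{code}
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

class BbwUpgradeSketch {
  /**
   * Hard-link every file from the 1.x blocksBeingWritten directory into the
   * single block pool's rbw directory. Linking (rather than moving) leaves
   * the originals in place so a rollback can still find them; -finalize can
   * later delete the blocksBeingWritten directory.
   */
  static void linkBbwIntoRbw(Path blocksBeingWritten, Path rbw) throws IOException {
    Files.createDirectories(rbw);
    try (DirectoryStream<Path> blocks = Files.newDirectoryStream(blocksBeingWritten)) {
      for (Path block : blocks) {
        Path target = rbw.resolve(block.getFileName());
        if (!Files.exists(target)) {
          Files.createLink(target, block);   // hard link, not a copy
        }
      }
    }
  }
}
{code}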
[jira] [Commented] (HDFS-3731) 2.0 release upgrade must handle blocks being written from 1.0
[ https://issues.apache.org/jira/browse/HDFS-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434432#comment-13434432 ] Colin Patrick McCabe commented on HDFS-3731: I added a description of the approach to the Description field. 2.0 release upgrade must handle blocks being written from 1.0 - Key: HDFS-3731 URL: https://issues.apache.org/jira/browse/HDFS-3731 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.0-alpha Reporter: Suresh Srinivas Assignee: Colin Patrick McCabe Priority: Blocker Attachments: HDFS-3731.002.patch, HDFS-3731.003.patch Release 2.0 upgrades must handle blocks being written to (bbw) files from 1.0 release. Problem reported by Brahma Reddy. The {{DataNode}} will only have one block pool after upgrading from a 1.x release. (This is because in the 1.x releases, there were no block pools-- or equivalently, everything was in the same block pool). During the upgrade, we should hardlink the block files from the {{blocksBeingWritten}} directory into the {{rbw}} directory of this block pool. Similarly, on {{-finalize}}, we should delete the {{blocksBeingWritten}} directory. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3790) test_fuse_dfs.c doesn't compile on centos 5
[ https://issues.apache.org/jira/browse/HDFS-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434450#comment-13434450 ] Hudson commented on HDFS-3790: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2599 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2599/]) HDFS-3790. test_fuse_dfs.c doesn't compile on centos 5. Contributed by Colin Patrick McCabe. (Revision 1372676) Result = FAILURE atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1372676 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/test/test_fuse_dfs.c test_fuse_dfs.c doesn't compile on centos 5 --- Key: HDFS-3790 URL: https://issues.apache.org/jira/browse/HDFS-3790 Project: Hadoop HDFS Issue Type: Bug Components: fuse-dfs Affects Versions: 2.2.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.2.0-alpha Attachments: HDFS-3790.001.patch test_fuse_dfs.c uses execvpe, which doesn't exist in the version of glibc shipped on CentOS 5. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3792) Fix two findbugs introduced by HDFS-3695
[ https://issues.apache.org/jira/browse/HDFS-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434451#comment-13434451 ] Hudson commented on HDFS-3792: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2599 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2599/]) HDFS-3792. Fix two findbugs introduced by HDFS-3695. Contributed by Todd Lipcon. (Revision 1372690) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1372690 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java Fix two findbugs introduced by HDFS-3695 Key: HDFS-3792 URL: https://issues.apache.org/jira/browse/HDFS-3792 Project: Hadoop HDFS Issue Type: Bug Components: build, name-node Affects Versions: 3.0.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Trivial Fix For: 3.0.0 Attachments: hdfs-3792.txt Accidentally introduced two trivial findbugs warnings in HDFS-3695. This JIRA is to fix them. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3658) TestDFSClientRetries#testNamenodeRestart failed
[ https://issues.apache.org/jira/browse/HDFS-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434453#comment-13434453 ] Hudson commented on HDFS-3658: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2599 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2599/]) HDFS-3658. Fix bugs in TestDFSClientRetries and add more tests. (Revision 1372707) Result = FAILURE szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1372707 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java TestDFSClientRetries#testNamenodeRestart failed --- Key: HDFS-3658 URL: https://issues.apache.org/jira/browse/HDFS-3658 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Eli Collins Assignee: Tsz Wo (Nicholas), SZE Fix For: 1.2.0, 2.2.0-alpha Attachments: h3658_20120808_b-1.patch, h3658_20120808.patch, test-log.txt Saw the following fail on a jenkins run: {noformat} Error Message expected:MD5-of-0MD5-of-512CRC32:f397fb3d9133d0a8f55854ea2bb268b0 but was:MD5-of-0MD5-of-0CRC32:70bc8f4b72a86921468bf8e8441dce51 Stacktrace junit.framework.AssertionFailedError: expected:MD5-of-0MD5-of-512CRC32:f397fb3d9133d0a8f55854ea2bb268b0 but was:MD5-of-0MD5-of-0CRC32:70bc8f4b72a86921468bf8e8441dce51 at junit.framework.Assert.fail(Assert.java:47) at junit.framework.Assert.failNotEquals(Assert.java:283) at junit.framework.Assert.assertEquals(Assert.java:64) at junit.framework.Assert.assertEquals(Assert.java:71) at org.apache.hadoop.hdfs.TestDFSClientRetries.testNamenodeRestart(TestDFSClientRetries.java:886) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3765) Namenode INITIALIZESHAREDEDITS should be able to initialize all shared storages
[ https://issues.apache.org/jira/browse/HDFS-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-3765: -- Attachment: hdfs-3765-branch-2.txt Attaching branch-2 patch. It's the same except for some resolved conflicts on the imports. Will commit both patches momentarily. Thanks, Vinay. Namenode INITIALIZESHAREDEDITS should be able to initialize all shared storages --- Key: HDFS-3765 URL: https://issues.apache.org/jira/browse/HDFS-3765 Project: Hadoop HDFS Issue Type: Improvement Components: ha Affects Versions: 2.1.0-alpha, 3.0.0 Reporter: Vinay Assignee: Vinay Attachments: hdfs-3765-branch-2.txt, HDFS-3765.patch, HDFS-3765.patch, HDFS-3765.patch, hdfs-3765.txt, hdfs-3765.txt, hdfs-3765.txt Currently, NameNode INITIALIZESHAREDEDITS provides the ability to copy the edits files to file schema based shared storages when moving a cluster from a non-HA environment to an HA-enabled environment. This Jira focuses on the following * Generalizing the logic of copying the edits to new shared storage so that any schema based shared storage can be initialized for an HA cluster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434463#comment-13434463 ] Jason Lowe commented on HDFS-3788: -- I tested out the patch on trunk and am unable to reproduce Eli's issue. Without the patch both -get and distcp via webhdfs fail, but after the patch I can successfully -get and distcp large files. This is on a pseudo-distributed tarball without security, distcp is {{hadoop distcp webhdfs://localhost:50070/user/someuser/distcpsrc hdfs://localhost:8020/user/someuser/distcpdest}} where distcpsrc/ contains a 3GB file. distcp can't copy large files using webhdfs due to missing Content-Length header Key: HDFS-3788 URL: https://issues.apache.org/jira/browse/HDFS-3788 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 0.23.3, 2.0.0-alpha Reporter: Eli Collins Assignee: Tsz Wo (Nicholas), SZE Priority: Critical Attachments: distcp-webhdfs-errors.txt, h3788_20120813.patch, h3788_20120814.patch The following command fails when data1 contains a 3gb file. It passes when using hftp or when the directory just contains smaller (2gb) files, so looks like a webhdfs issue with large files. {{hadoop distcp webhdfs://eli-thinkpad:50070/user/eli/data1 hdfs://localhost:8020/user/eli/data2}} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3796) Speed up edit log tests by avoiding fsync()
[ https://issues.apache.org/jira/browse/HDFS-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-3796: -- Attachment: hdfs-3796.txt Attached patch adds the same hook to the other test cases that Colin suggested. Speed up edit log tests by avoiding fsync() --- Key: HDFS-3796 URL: https://issues.apache.org/jira/browse/HDFS-3796 Project: Hadoop HDFS Issue Type: Improvement Components: test Affects Versions: 3.0.0, 2.2.0-alpha Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Attachments: hdfs-3796.txt, hdfs-3796.txt Our edit log tests are very slow because they incur a lot of fsyncs as they write out transactions. Since fsync() has no effect except in the case of power outages or system crashes, and we don't care about power outages in the context of tests, we can safely skip the fsync without any loss in coverage. In my tests, this sped up TestEditLog by about 5x. The testFuzzSequences test case improved from ~83 seconds with fsync to about 5 seconds without. These results are from my SSD laptop - they are probably even more drastic on spinning media. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
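A minimal illustration of the kind of hook this describes (the class and flag names below are made up, not the actual edit log output stream change): a test-only switch that turns the fsync into a no-op while leaving the normal write path untouched.
{code}
import java.io.FileDescriptor;
import java.io.IOException;

class SyncableOutputSketch {
  /**
   * Test-only switch: when true, flushAndSync() skips the fsync. Durability
   * only matters across power loss or OS crashes, which unit tests do not
   * model, so skipping it loses no coverage while avoiding the dominant cost.
   */
  static volatile boolean skipFsyncForTesting = false;

  private final FileDescriptor fd;

  SyncableOutputSketch(FileDescriptor fd) {
    this.fd = fd;
  }

  void flushAndSync() throws IOException {
    // Buffered data is assumed to have been written to the OS already;
    // only the expensive force-to-disk step is made optional.
    if (!skipFsyncForTesting) {
      fd.sync();
    }
  }
}
{code}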
[jira] [Commented] (HDFS-3731) 2.0 release upgrade must handle blocks being written from 1.0
[ https://issues.apache.org/jira/browse/HDFS-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434467#comment-13434467 ] Suresh Srinivas commented on HDFS-3731: --- Colin, is the recovery mechanism for bbw blocks in 1.x and rbw in 2.x compatible? 2.0 release upgrade must handle blocks being written from 1.0 - Key: HDFS-3731 URL: https://issues.apache.org/jira/browse/HDFS-3731 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.0-alpha Reporter: Suresh Srinivas Assignee: Colin Patrick McCabe Priority: Blocker Attachments: HDFS-3731.002.patch, HDFS-3731.003.patch Release 2.0 upgrades must handle blocks being written to (bbw) files from 1.0 release. Problem reported by Brahma Reddy. The {{DataNode}} will only have one block pool after upgrading from a 1.x release. (This is because in the 1.x releases, there were no block pools-- or equivalently, everything was in the same block pool). During the upgrade, we should hardlink the block files from the {{blocksBeingWritten}} directory into the {{rbw}} directory of this block pool. Similarly, on {{-finalize}}, we should delete the {{blocksBeingWritten}} directory. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3798) Avoid throwing NPE when finalizeSegment() is called on invalid segment
[ https://issues.apache.org/jira/browse/HDFS-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-3798: -- Attachment: hdfs-3798.txt addresses the test problem above. I'll commit this to the branch later today. Avoid throwing NPE when finalizeSegment() is called on invalid segment -- Key: HDFS-3798 URL: https://issues.apache.org/jira/browse/HDFS-3798 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Trivial Attachments: hdfs-3798.txt, hdfs-3798.txt Currently, if the client calls finalizeLogSegment() on a segment which doesn't exist on the JournalNode side, it throws an NPE. Instead it should throw a more intelligible exception. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3797) QJM: add segment txid as a parameter to journal() RPC
[ https://issues.apache.org/jira/browse/HDFS-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-3797: -- Attachment: hdfs-3797.txt Attached patch applies on top of HDFS-3798, HDFS-3799 QJM: add segment txid as a parameter to journal() RPC - Key: HDFS-3797 URL: https://issues.apache.org/jira/browse/HDFS-3797 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Attachments: hdfs-3797.txt During fault testing of QJM, I saw the following issue: 1) NN sends txn 5 to JN 2) NN gets partitioned from JN while JN remains up. The next two RPCs are missed while the partition has happened: 2a) finalizeSegment(1-5) 2b) startSegment(6) 3) NN sends txn 6 to JN This caused one of the JNs to end up with a segment 1-10 while the others had two segments; 1-5 and 6-10. This broke some invariants of the QJM protocol and prevented the recovery protocol from running properly. This can be addressed on the client side by HDFS-3726, which would cause the NN to not send the RPC in #3. But it makes sense to also add an extra safety check here on the server side: with every journal() call, we can send the segment's txid. Then if the JN and the client get out of sync, the JN can reject the RPCs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
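A sketch of the server-side safety check being proposed (the method shape is illustrative, not the actual JournalNode code): the writer passes the txid that began the current segment on every journal() call, and the JournalNode rejects calls that disagree with the segment it has open.
{code}
import java.io.IOException;

class JournalSegmentCheck {
  /** Txid that began the segment currently open on this JournalNode, or -1. */
  private long curSegmentTxId = -1;

  void startLogSegment(long txid) {
    curSegmentTxId = txid;
  }

  /**
   * Called at the top of journal(): if the client believes it is writing a
   * different segment than the one open here (e.g. a finalize/start pair was
   * lost during a network partition), refuse the edits rather than silently
   * appending them to the wrong segment.
   */
  void checkSegment(long segmentTxId) throws IOException {
    if (segmentTxId != curSegmentTxId) {
      throw new IOException("Client is writing the segment starting at txid "
          + segmentTxId + " but this JournalNode has the segment starting at txid "
          + curSegmentTxId + " open; rejecting out-of-sync journal() call");
    }
  }
}
{code}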
[jira] [Commented] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434478#comment-13434478 ] Daryn Sharp commented on HDFS-3788: --- I believe it's legitimate to send a content-length (if known) with a chunked response. You may want to check for chunking only if there's not a content-length. It's up to you. I think a test case would be invaluable since the file size issue has reared itself a few times. Could you add a test that uses a mock? distcp can't copy large files using webhdfs due to missing Content-Length header Key: HDFS-3788 URL: https://issues.apache.org/jira/browse/HDFS-3788 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 0.23.3, 2.0.0-alpha Reporter: Eli Collins Assignee: Tsz Wo (Nicholas), SZE Priority: Critical Attachments: distcp-webhdfs-errors.txt, h3788_20120813.patch, h3788_20120814.patch The following command fails when data1 contains a 3gb file. It passes when using hftp or when the directory just contains smaller (2gb) files, so looks like a webhdfs issue with large files. {{hadoop distcp webhdfs://eli-thinkpad:50070/user/eli/data1 hdfs://localhost:8020/user/eli/data2}} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
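As a rough illustration of the point above (a sketch against java.net.HttpURLConnection, not the actual WebHDFS client code): prefer an explicit Content-Length when the server sends one, and only treat the response as unbounded when the header is genuinely absent.
{code}
import java.io.IOException;
import java.net.HttpURLConnection;

class ResponseLengthSketch {
  /**
   * Returns the declared content length, or -1 when the response is chunked
   * and carries no Content-Length. Callers should only skip length checking
   * in the -1 case; a present header is authoritative even on a chunked reply.
   */
  static long declaredLength(HttpURLConnection conn) throws IOException {
    String contentLength = conn.getHeaderField("Content-Length");
    if (contentLength != null) {
      // Parse the header ourselves: getContentLength() returns an int and
      // overflows for files larger than 2GB.
      return Long.parseLong(contentLength);
    }
    if ("chunked".equalsIgnoreCase(conn.getHeaderField("Transfer-Encoding"))) {
      return -1;
    }
    throw new IOException(
        "Response has neither a Content-Length header nor chunked encoding");
  }
}
{code}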
[jira] [Created] (HDFS-3802) StartupOption.name in HdfsServerConstants should be final
Jing Zhao created HDFS-3802: --- Summary: StartupOption.name in HdfsServerConstants should be final Key: HDFS-3802 URL: https://issues.apache.org/jira/browse/HDFS-3802 Project: Hadoop HDFS Issue Type: Improvement Reporter: Jing Zhao Priority: Trivial In HdfsServerConstants, it may be better to define StartupOption.name as final since it will not and should not be modified after initialization. For example, in NameNode.java, the printUsage function prints out multiple startup options' names. The modification/change of StartupOption.name may cause an invalid usage message. Although right now there are no methods to change/set the value of StartupOption.name, it is better to add the final keyword to make sure. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
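For illustration, the change amounts to declaring the enum's name field final so it can only be assigned in the constructor (heavily trimmed; the real StartupOption enum has many more constants):
{code}
enum StartupOption {
  FORMAT("-format"),
  UPGRADE("-upgrade"),
  ROLLBACK("-rollback");

  // final: assigned exactly once in the constructor and never modified,
  // so usage messages built from getName() cannot be silently corrupted.
  private final String name;

  StartupOption(String name) {
    this.name = name;
  }

  String getName() {
    return name;
  }
}
{code}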
[jira] [Updated] (HDFS-3802) StartupOption.name in HdfsServerConstants should be final
[ https://issues.apache.org/jira/browse/HDFS-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-3802: Attachment: HDFS-3802.patch StartupOption.name in HdfsServerConstants should be final - Key: HDFS-3802 URL: https://issues.apache.org/jira/browse/HDFS-3802 Project: Hadoop HDFS Issue Type: Improvement Reporter: Jing Zhao Priority: Trivial Attachments: HDFS-3802.patch In HdfsServerConstants, it may be better to define StartupOption.name as final since it will not and should not be modified after initialization. For example, in NameNode.java, the printUsage function prints out multiple startup options' names. The modification/change of StartupOption.name may cause an invalid usage message. Although right now there are no methods to change/set the value of StartupOption.name, it is better to add the final keyword to make sure. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3802) StartupOption.name in HdfsServerConstants should be final
[ https://issues.apache.org/jira/browse/HDFS-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-3802: Affects Version/s: 3.0.0 Fix Version/s: 3.0.0 StartupOption.name in HdfsServerConstants should be final - Key: HDFS-3802 URL: https://issues.apache.org/jira/browse/HDFS-3802 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Jing Zhao Priority: Trivial Fix For: 3.0.0 Attachments: HDFS-3802.patch In HdfsServerConstants, it may be better to define StartupOption.name as final since it will not and should not be modified after initialization. For example, in NameNode.java, the printUsage function prints out multiple startup options' names. The modification/change of StartupOption.name may cause an invalid usage message. Although right now there are no methods to change/set the value of StartupOption.name, it is better to add the final keyword to make sure. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3718) Datanode won't shutdown because of runaway DataBlockScanner thread
[ https://issues.apache.org/jira/browse/HDFS-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-3718: -- Fix Version/s: 2.1.0-alpha 0.23.3 I've committed to trunk, branches for 2.x, and 23. Thanks Kihwal! Datanode won't shutdown because of runaway DataBlockScanner thread -- Key: HDFS-3718 URL: https://issues.apache.org/jira/browse/HDFS-3718 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.1-alpha Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Fix For: 0.23.3, 2.1.0-alpha, 3.0.0, 2.2.0-alpha Attachments: hdfs-3718.patch.txt Datanode sometimes does not shutdown because the block pool scanner thread keeps running. It prints out Starting a new period every five seconds, even after {{shutdown()}} is called. Somehow the interrupt is missed. {{DataBlockScanner}} will also terminate if {{datanode.shouldRun}} is false, but in {{DataNode#shutdown}}, {{DataBlockScanner#shutdown()}} is invoked before it is being set to false. Is there any reason why {{datanode.shouldRun}} is set to false later? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
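A simplified picture of the ordering problem (illustrative code, not the DataNode itself): if the scanner loop keys off a shouldRun flag, that flag must be cleared before the scanner is told to stop, so that a swallowed interrupt still lets the loop exit.
{code}
class ScannerShutdownSketch {
  private volatile boolean shouldRun = true;
  private Thread scannerThread;

  void start() {
    scannerThread = new Thread(() -> {
      while (shouldRun) {
        try {
          Thread.sleep(5000);           // stand-in for one scan period
        } catch (InterruptedException ie) {
          // The interrupt can be swallowed here (as in the bug); the loop
          // still terminates because shutdown() clears shouldRun first.
        }
      }
    }, "BlockScannerSketch");
    scannerThread.start();
  }

  void shutdown() throws InterruptedException {
    shouldRun = false;        // clear the flag *before* interrupting
    scannerThread.interrupt();
    scannerThread.join();
  }
}
{code}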
[jira] [Updated] (HDFS-3718) Datanode won't shutdown because of runaway DataBlockScanner thread
[ https://issues.apache.org/jira/browse/HDFS-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-3718: -- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Datanode won't shutdown because of runaway DataBlockScanner thread -- Key: HDFS-3718 URL: https://issues.apache.org/jira/browse/HDFS-3718 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.1-alpha Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Fix For: 0.23.3, 2.1.0-alpha, 3.0.0, 2.2.0-alpha Attachments: hdfs-3718.patch.txt Datanode sometimes does not shutdown because the block pool scanner thread keeps running. It prints out Starting a new period every five seconds, even after {{shutdown()}} is called. Somehow the interrupt is missed. {{DataBlockScanner}} will also terminate if {{datanode.shouldRun}} is false, but in {{DataNode#shutdown}}, {{DataBlockScanner#shutdown()}} is invoked before it is being set to false. Is there any reason why {{datanode.shouldRun}} is set to false later? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3793) Implement genericized format() in QJM
[ https://issues.apache.org/jira/browse/HDFS-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434546#comment-13434546 ] Andrew Purtell commented on HDFS-3793: -- +1 Applied this patch after the generic support and confirmed with manual testing. Implement genericized format() in QJM - Key: HDFS-3793 URL: https://issues.apache.org/jira/browse/HDFS-3793 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-3793.txt HDFS-3695 added the ability for non-File journal managers to tie into calls like NameNode -format. This JIRA is to implement format() for QuorumJournalManager. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3723) All commands should support meaningful --help
[ https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434549#comment-13434549 ] Hadoop QA commented on HDFS-3723: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12540903/HDFS-3723.003.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.fs.TestS3_LocalFileContextURI org.apache.hadoop.fs.s3native.TestInMemoryNativeS3FileSystemContract org.apache.hadoop.fs.TestLocal_S3FileContextURI org.apache.hadoop.fs.s3.TestInMemoryS3FileSystemContract org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3003//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3003//console This message is automatically generated. All commands should support meaningful --help - Key: HDFS-3723 URL: https://issues.apache.org/jira/browse/HDFS-3723 Project: Hadoop HDFS Issue Type: Improvement Components: scripts, tools Affects Versions: 2.0.0-alpha Reporter: E. Sammer Assignee: Jing Zhao Attachments: HDFS-3723.001.patch, HDFS-3723.001.patch, HDFS-3723.002.patch, HDFS-3723.003.patch, HDFS-3723.patch, HDFS-3723.patch Some (sub)commands support -help or -h options for detailed help while others do not. Ideally, all commands should support meaningful help that works regardless of current state or configuration. For example, hdfs zkfc --help (or -h or -help) is not very useful. Option checking should occur before state / configuration checking. {code} [esammer@hadoop-fed01 ~]# hdfs zkfc --help Exception in thread main org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode. at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168) {code} This would go a long way toward better usability for ops staff. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3793) Implement genericized format() in QJM
[ https://issues.apache.org/jira/browse/HDFS-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434551#comment-13434551 ] Aaron T. Myers commented on HDFS-3793: -- Patch looks pretty good to me. Two small comments: # Seems like you should make the QuorumJournalManager#format and QuorumJournalManager#hasSomeData timeouts configurable, or at least use constants and add a comment or two justifying how you chose those values. # I think I see the reasoning behind the need for the call to unlockAll in JNStorage#format, but you might want to add a comment explaining why it's necessary. Also, if this happens, when will the storage be locked again? Might want to add a comment explaining that as well. +1 once these are addressed. Implement genericized format() in QJM - Key: HDFS-3793 URL: https://issues.apache.org/jira/browse/HDFS-3793 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-3793.txt HDFS-3695 added the ability for non-File journal managers to tie into calls like NameNode -format. This JIRA is to implement format() for QuorumJournalManager. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname
[ https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434557#comment-13434557 ] Hudson commented on HDFS-3150: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2601 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2601/]) HDFS-3150. Add option for clients to contact DNs via hostname. Contributed by Eli Collins (Revision 1373094) Result = FAILURE eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1373094 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeID.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DNConf.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileChecksumServlets.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCreation.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestHftpFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestMiniDFSCluster.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestShortCircuitLocalRead.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/token/block/TestBlockToken.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestInterDatanodeProtocol.java Add option for clients to contact DNs via hostname -- Key: HDFS-3150 URL: https://issues.apache.org/jira/browse/HDFS-3150 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, hdfs client 
Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Fix For: 1.1.0 Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt The DN listens on multiple IP addresses (the default {{dfs.datanode.address}} is the wildcard) however per HADOOP-6867 only the source address (IP) of the registration is given to clients. HADOOP-985 made clients access datanodes by IP primarily to avoid the latency of a DNS lookup, this had the side effect of breaking DN multihoming (the client can not route the IP exposed by the NN if the DN registers with an interface that has a cluster-private IP). To fix this let's add back the option for Datanodes to be accessed by hostname. This can be done by: # Modifying the primary field of the Datanode descriptor to be the hostname, or # Modifying Client/Datanode - Datanode access use the hostname field instead of the IP Approach #2 does not require an incompatible client protocol change, and is much less invasive. It minimizes the scope of modification to just places where clients and Datanodes connect, vs changing all uses of Datanode
[jira] [Commented] (HDFS-3731) 2.0 release upgrade must handle blocks being written from 1.0
[ https://issues.apache.org/jira/browse/HDFS-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434576#comment-13434576 ] Colin Patrick McCabe commented on HDFS-3731: The design doc for HDFS-265 says: bq. RWR (Replica Waiting to be Recovered): If a DataNode dies and restarts, all its rbw replicas change to be in the rwr state. Rwr replicas will not be in any pipeline and therefore will not receive any new bytes. They will either become out of date or will participate in a lease recovery if the client also dies. It seems to me that by putting the blocks into the rbw directory, what will happen when the 2.x DataNode is started is that the blocks will participate in lease recovery after a few minutes have gone past. There is a unit test in this patch which short-cuts this process by manually invoking lease recovery on the files and then verifying that they can be read. Is there any more documentation about the lease recovery process? As far as I can tell, it seems to work fine on the files in this patch. It might be useful to test waiting for automatic lease recovery to be triggered rather than invoking it manually. 2.0 release upgrade must handle blocks being written from 1.0 - Key: HDFS-3731 URL: https://issues.apache.org/jira/browse/HDFS-3731 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.0-alpha Reporter: Suresh Srinivas Assignee: Colin Patrick McCabe Priority: Blocker Attachments: HDFS-3731.002.patch, HDFS-3731.003.patch Release 2.0 upgrades must handle blocks being written to (bbw) files from 1.0 release. Problem reported by Brahma Reddy. The {{DataNode}} will only have one block pool after upgrading from a 1.x release. (This is because in the 1.x releases, there were no block pools-- or equivalently, everything was in the same block pool). During the upgrade, we should hardlink the block files from the {{blocksBeingWritten}} directory into the {{rbw}} directory of this block pool. Similarly, on {{-finalize}}, we should delete the {{blocksBeingWritten}} directory. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3150) Add option for clients to contact DNs via hostname
[ https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3150: -- Resolution: Fixed Fix Version/s: 2.2.0-alpha Target Version/s: (was: 2.2.0-alpha) Status: Resolved (was: Patch Available) I've committed this and merged to branch-2. Add option for clients to contact DNs via hostname -- Key: HDFS-3150 URL: https://issues.apache.org/jira/browse/HDFS-3150 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, hdfs client Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Fix For: 1.1.0, 2.2.0-alpha Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt The DN listens on multiple IP addresses (the default {{dfs.datanode.address}} is the wildcard) however per HADOOP-6867 only the source address (IP) of the registration is given to clients. HADOOP-985 made clients access datanodes by IP primarily to avoid the latency of a DNS lookup, this had the side effect of breaking DN multihoming (the client can not route the IP exposed by the NN if the DN registers with an interface that has a cluster-private IP). To fix this let's add back the option for Datanodes to be accessed by hostname. This can be done by: # Modifying the primary field of the Datanode descriptor to be the hostname, or # Modifying Client/Datanode - Datanode access use the hostname field instead of the IP Approach #2 does not require an incompatible client protocol change, and is much less invasive. It minimizes the scope of modification to just places where clients and Datanodes connect, vs changing all uses of Datanode identifiers. New client and Datanode configuration options are introduced: - {{dfs.client.use.datanode.hostname}} indicates all client to datanode connections should use the datanode hostname (as clients outside cluster may not be able to route the IP) - {{dfs.datanode.use.datanode.hostname}} indicates whether Datanodes should use hostnames when connecting to other Datanodes for data transfer If the configuration options are not used, there is no change in the current behavior. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
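Since the new keys are plain boolean configuration options, a client outside the cluster can opt in either through hdfs-site.xml or programmatically. A minimal sketch, assuming hadoop-common/hadoop-hdfs on the classpath and a default filesystem configured in core-site.xml; with both keys left at their default of false the old IP-based behavior is unchanged, as the description notes.
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HostnameClientExample {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    // Client side: connect to DataNodes by hostname instead of the NN-reported IP,
    // e.g. when the client cannot route the cluster-private addresses.
    conf.setBoolean("dfs.client.use.datanode.hostname", true);
    // Server side (set in each DataNode's configuration, shown here only for reference):
    // conf.setBoolean("dfs.datanode.use.datanode.hostname", true);
    try (FileSystem fs = FileSystem.get(conf)) {
      fs.getFileStatus(new Path("/"));
    }
  }
}
{code}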
[jira] [Created] (HDFS-3803) BlockPoolSliceScanner new work period notice is very chatty at INFO level
Andrew Purtell created HDFS-3803: Summary: BlockPoolSliceScanner new work period notice is very chatty at INFO level Key: HDFS-3803 URL: https://issues.apache.org/jira/browse/HDFS-3803 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.1.0-alpha, 2.0.1-alpha Environment: Hadoop 2.0.1-alpha-SNAPSHOT Reporter: Andrew Purtell Priority: Minor One line of ~140 chars logged every 5 seconds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434593#comment-13434593 ] Hadoop QA commented on HDFS-3788: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12540898/h3788_20120814.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.web.TestWebHDFS org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes org.apache.hadoop.hdfs.TestHftpFileSystem org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks org.apache.hadoop.hdfs.TestByteRangeInputStream org.apache.hadoop.hdfs.TestDFSShell org.apache.hadoop.hdfs.web.TestFSMainOperationsWebHdfs org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3005//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3005//console This message is automatically generated. distcp can't copy large files using webhdfs due to missing Content-Length header Key: HDFS-3788 URL: https://issues.apache.org/jira/browse/HDFS-3788 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 0.23.3, 2.0.0-alpha Reporter: Eli Collins Assignee: Tsz Wo (Nicholas), SZE Priority: Critical Attachments: distcp-webhdfs-errors.txt, h3788_20120813.patch, h3788_20120814.patch The following command fails when data1 contains a 3gb file. It passes when using hftp or when the directory just contains smaller (2gb) files, so looks like a webhdfs issue with large files. {{hadoop distcp webhdfs://eli-thinkpad:50070/user/eli/data1 hdfs://localhost:8020/user/eli/data2}} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3803) BlockPoolSliceScanner new work period notice is very chatty at INFO level
[ https://issues.apache.org/jira/browse/HDFS-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HDFS-3803: - Attachment: HDFS-3803.patch Trivial patch applies to both trunk and branch-2. BlockPoolSliceScanner new work period notice is very chatty at INFO level - Key: HDFS-3803 URL: https://issues.apache.org/jira/browse/HDFS-3803 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.1.0-alpha, 2.0.1-alpha Environment: Hadoop 2.0.1-alpha-SNAPSHOT Reporter: Andrew Purtell Priority: Minor Attachments: HDFS-3803.patch One line of ~140 chars logged every 5 seconds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
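The attached patch is not reproduced here, but the general technique for this kind of fix is to demote the once-per-period status line to DEBUG (or guard it), so it no longer appears at the default INFO level. A hedged sketch of that idea; the class, method, and message text below are illustrative and are not the actual BlockPoolSliceScanner code or the contents of HDFS-3803.patch.
{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

/** Illustrative only -- not the actual HDFS-3803 patch. */
public class QuietScannerLogSketch {
  private static final Log LOG = LogFactory.getLog(QuietScannerLogSketch.class);

  void startNewPeriod(double workLeftPct) {
    // Log the per-period status line at DEBUG so a healthy DataNode does not
    // emit it every few seconds at the default log level.
    if (LOG.isDebugEnabled()) {
      LOG.debug("Starting a new period, work left in previous period: "
          + String.format("%.2f%%", workLeftPct));
    }
  }
}
{code}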
[jira] [Commented] (HDFS-3799) QJM: handle empty log segments during recovery
[ https://issues.apache.org/jira/browse/HDFS-3799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434606#comment-13434606 ] Aaron T. Myers commented on HDFS-3799: -- Patch looks really good. The tests in particular are very solid. Two nits: # sp: Synchronziing # Recommend replacing the three testOutOfSyncAtBeginningOfSegmentX methods with a loop from 0-2. Feel free to punt if you think this is clearer. +1 once these are addressed. QJM: handle empty log segments during recovery -- Key: HDFS-3799 URL: https://issues.apache.org/jira/browse/HDFS-3799 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-3799.txt One of the cases not yet handled in the QJM branch is the one where either the writer or the journal node crashes after startLogSegment() but before it has written its first transaction to the log. We currently have TODO assertions in the code which fire in these cases. This JIRA is to deal with these cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
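The second nit amounts to folding three near-identical test methods into a single loop over the index of the out-of-sync journal node. A sketch of the suggested shape; the method names are paraphrased from the review and the body is a placeholder, not the patch's actual test logic.
{code}
import org.junit.Test;

public class OutOfSyncRecoverySketch {
  @Test
  public void testOutOfSyncAtBeginningOfSegment() throws Exception {
    // Exercise the recovery path with journal node 0, 1, and 2 lagging in turn.
    for (int laggingNode = 0; laggingNode < 3; laggingNode++) {
      doTestOutOfSyncAtBeginningOfSegment(laggingNode);
    }
  }

  private void doTestOutOfSyncAtBeginningOfSegment(int laggingNode) throws Exception {
    // Placeholder: start a segment, drop the first transactions on the given
    // node, then assert that recovery still succeeds.
  }
}
{code}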
[jira] [Commented] (HDFS-3797) QJM: add segment txid as a parameter to journal() RPC
[ https://issues.apache.org/jira/browse/HDFS-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434626#comment-13434626 ] Aaron T. Myers commented on HDFS-3797: -- Patch looks pretty good to me. One question: have you considered adding a test case that ensures that a JN which experiences this scenario will return to participating in the quorum after the next finalize/new segment? Nit: looks like the method comment for testMissFinalizeAndNextStart got messed up a little bit: + **/ QJM: add segment txid as a parameter to journal() RPC - Key: HDFS-3797 URL: https://issues.apache.org/jira/browse/HDFS-3797 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Attachments: hdfs-3797.txt During fault testing of QJM, I saw the following issue: 1) NN sends txn 5 to JN 2) NN gets partitioned from JN while JN remains up. The next two RPCs are missed while the partition has happened: 2a) finalizeSegment(1-5) 2b) startSegment(6) 3) NN sends txn 6 to JN This caused one of the JNs to end up with a segment 1-10 while the others had two segments; 1-5 and 6-10. This broke some invariants of the QJM protocol and prevented the recovery protocol from running properly. This can be addressed on the client side by HDFS-3726, which would cause the NN to not send the RPC in #3. But it makes sense to also add an extra safety check here on the server side: with every journal() call, we can send the segment's txid. Then if the JN and the client get out of sync, the JN can reject the RPCs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
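The server-side safety check proposed above reduces to carrying the segment's starting txid on every journal() call and rejecting the write when it does not match the segment the JournalNode currently has open. A minimal sketch of that invariant, with simplified field and exception choices that are illustrative rather than the committed code:
{code}
import java.io.IOException;

public class JournalSegmentCheckSketch {
  /** Starting txid of the segment this node currently has open, or -1 if none. */
  private long curSegmentTxId = -1;

  public synchronized void startLogSegment(long txid) {
    curSegmentTxId = txid;
  }

  public synchronized void journal(long segmentTxId, long firstTxnId, byte[] records)
      throws IOException {
    if (segmentTxId != curSegmentTxId) {
      // The writer and this JournalNode have diverged (e.g. finalizeSegment and
      // startSegment were missed during a partition); refusing the write keeps
      // the segments consistent instead of fusing 1-5 and 6-10 into 1-10.
      throw new IOException("Writer is on segment starting at txid " + segmentTxId
          + " but this node's open segment starts at txid " + curSegmentTxId);
    }
    // ... append records starting at firstTxnId to the current segment ...
  }
}
{code}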
[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname
[ https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434632#comment-13434632 ] Hadoop QA commented on HDFS-3150: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12540836/hdfs-3150.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 10 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics org.apache.hadoop.hdfs.TestPersistBlocks +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3004//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3004//console This message is automatically generated. Add option for clients to contact DNs via hostname -- Key: HDFS-3150 URL: https://issues.apache.org/jira/browse/HDFS-3150 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, hdfs client Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Fix For: 1.1.0, 2.2.0-alpha Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt The DN listens on multiple IP addresses (the default {{dfs.datanode.address}} is the wildcard) however per HADOOP-6867 only the source address (IP) of the registration is given to clients. HADOOP-985 made clients access datanodes by IP primarily to avoid the latency of a DNS lookup, this had the side effect of breaking DN multihoming (the client can not route the IP exposed by the NN if the DN registers with an interface that has a cluster-private IP). To fix this let's add back the option for Datanodes to be accessed by hostname. This can be done by: # Modifying the primary field of the Datanode descriptor to be the hostname, or # Modifying Client/Datanode - Datanode access use the hostname field instead of the IP Approach #2 does not require an incompatible client protocol change, and is much less invasive. It minimizes the scope of modification to just places where clients and Datanodes connect, vs changing all uses of Datanode identifiers. New client and Datanode configuration options are introduced: - {{dfs.client.use.datanode.hostname}} indicates all client to datanode connections should use the datanode hostname (as clients outside cluster may not be able to route the IP) - {{dfs.datanode.use.datanode.hostname}} indicates whether Datanodes should use hostnames when connecting to other Datanodes for data transfer If the configuration options are not used, there is no change in the current behavior. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3800) QJM: improvements to QJM fault testing
[ https://issues.apache.org/jira/browse/HDFS-3800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434633#comment-13434633 ] Aaron T. Myers commented on HDFS-3800: -- Test looks great, and I agree we should go ahead and check it in to the branch as-is. One tiny nit: looks like you left in a System.out.println when we should probably have used a LOG.info. +1 otherwise. QJM: improvements to QJM fault testing -- Key: HDFS-3800 URL: https://issues.apache.org/jira/browse/HDFS-3800 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-3800.txt This JIRA improves TestQJMWithFaults as follows: - the current implementation didn't properly unwrap exceptions thrown by the reflection-based injection method. This caused some issues in the code where the injecting proxy didn't act quite like the original object. - the current implementation incorrectly assumed that the recovery process would recover to _exactly_ the last acked sequence number. In fact, it may recover to that transaction _or any greater transaction_. It also adds a new randomized test which uncovered a number of other bugs. I will defer to the included javadoc for a description of this test. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434634#comment-13434634 ] Tsz Wo (Nicholas), SZE commented on HDFS-3788: -- bq. I believe it's legitimate to send a content-length (if known) with a chunked response. ... I believe it is not. See below from http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4 bq. The Content-Length header field MUST NOT be sent if these two lengths are different (i.e., if a Transfer-Encoding header field is present) bq. I think a test case would be invaluable since the file size issue has reared itself a few times. Could you add a test that uses a mock? I did have a mock test but it requires changing DatanodeWebHdfsMethods. I don't see an easy way to have mock tests without changing the main code. Do you have any idea? If yes, could you add the tests? distcp can't copy large files using webhdfs due to missing Content-Length header Key: HDFS-3788 URL: https://issues.apache.org/jira/browse/HDFS-3788 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 0.23.3, 2.0.0-alpha Reporter: Eli Collins Assignee: Tsz Wo (Nicholas), SZE Priority: Critical Attachments: distcp-webhdfs-errors.txt, h3788_20120813.patch, h3788_20120814.patch The following command fails when data1 contains a 3gb file. It passes when using hftp or when the directory just contains smaller (2gb) files, so looks like a webhdfs issue with large files. {{hadoop distcp webhdfs://eli-thinkpad:50070/user/eli/data1 hdfs://localhost:8020/user/eli/data2}} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
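A practical consequence of the rule quoted above is that a chunked response carries no Content-Length at all, so a client that wants to verify it received a complete file must read to EOF and compare against a length obtained elsewhere (for webhdfs, for example, from a file-status call). The following is a small generic sketch of that pattern with plain java.net, independent of the webhdfs code under discussion:
{code}
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

/** Sketch: read a possibly-chunked HTTP response without relying on Content-Length. */
public class ChunkedReadSketch {
  public static long readFully(String url) throws IOException {
    HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
    // For a chunked response this returns -1, because the header is absent.
    long advertised = conn.getContentLengthLong();
    long actual = 0;
    try (InputStream in = conn.getInputStream()) {
      byte[] buf = new byte[8192];
      for (int n; (n = in.read(buf)) != -1; ) {
        actual += n;
      }
    }
    System.out.println("advertised=" + advertised + ", bytes read=" + actual);
    return actual;
  }
}
{code}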
[jira] [Created] (HDFS-3804) TestHftpFileSystem fails intermittently with JDK7
Trevor Robinson created HDFS-3804: - Summary: TestHftpFileSystem fails intermittently with JDK7 Key: HDFS-3804 URL: https://issues.apache.org/jira/browse/HDFS-3804 Project: Hadoop HDFS Issue Type: Bug Components: test Environment: Apache Maven 3.0.4 Maven home: /usr/share/maven Java version: 1.7.0_04, vendor: Oracle Corporation Java home: /usr/lib/jvm/jdk1.7.0_04/jre Default locale: en_US, platform encoding: ISO-8859-1 OS name: linux, version: 3.2.0-25-generic, arch: amd64, family: unix Reporter: Trevor Robinson For example: testFileNameEncoding(org.apache.hadoop.hdfs.TestHftpFileSystem): Filesystem closed testDataNodeRedirect(org.apache.hadoop.hdfs.TestHftpFileSystem): Filesystem closed This test case sets up a filesystem that is used by the first half of the test methods (in declaration order), but the second half of the tests start by calling {{FileSystem.closeAll}}. With JDK7, test methods are run in an arbitrary order, so if any first half methods run after any second half methods, they fail. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3048) Small race in BlockManager#close
[ https://issues.apache.org/jira/browse/HDFS-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434636#comment-13434636 ] Hadoop QA commented on HDFS-3048: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12540834/hdfs-3787-2.txt against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3006//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3006//console This message is automatically generated. Small race in BlockManager#close Key: HDFS-3048 URL: https://issues.apache.org/jira/browse/HDFS-3048 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 2.0.0-alpha Reporter: Eli Collins Assignee: Andy Isaacson Attachments: hdfs-3048.txt, hdfs-3787-2.txt There's a small race in BlockManager#close, we close the BlocksMap before the replication monitor, which means the replication monitor can NPE if it tries to access the blocks map. We need to swap the order (close the blocks map after shutting down the repl monitor). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
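For context on the fix being tested above, the race in the description is purely an ordering problem: the replication monitor thread has to be stopped and joined before the structure it reads is torn down. A schematic, self-contained sketch of that ordering; it is not the actual BlockManager code.
{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class OrderedCloseSketch {
  private volatile ConcurrentMap<Long, String> blocksMap = new ConcurrentHashMap<>();

  private final Thread replicationMonitor = new Thread(() -> {
    while (!Thread.currentThread().isInterrupted()) {
      ConcurrentMap<Long, String> map = blocksMap;
      if (map == null) {
        return; // defensive: never reached once close() uses the corrected order
      }
      // ... scan map for under-replicated blocks ...
      try {
        Thread.sleep(100);
      } catch (InterruptedException e) {
        return;
      }
    }
  }, "ReplicationMonitor");

  public OrderedCloseSketch() {
    replicationMonitor.start();
  }

  public void close() throws InterruptedException {
    // Stop and join the monitor first...
    replicationMonitor.interrupt();
    replicationMonitor.join();
    // ...and only then drop the structure it reads. Doing these two steps in the
    // opposite order is the small race described in the issue.
    blocksMap = null;
  }
}
{code}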
[jira] [Updated] (HDFS-3804) TestHftpFileSystem fails intermittently with JDK7
[ https://issues.apache.org/jira/browse/HDFS-3804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Trevor Robinson updated HDFS-3804: -- Attachment: HDFS-3804.patch The attached patch splits the test case in two: the tests requiring setup and teardown remain in TestHftpFileSystem.java and the tests that reset the filesystem cache are in TestHftpFileSystemReset.java. TestHftpFileSystem fails intermittently with JDK7 - Key: HDFS-3804 URL: https://issues.apache.org/jira/browse/HDFS-3804 Project: Hadoop HDFS Issue Type: Bug Components: test Environment: Apache Maven 3.0.4 Maven home: /usr/share/maven Java version: 1.7.0_04, vendor: Oracle Corporation Java home: /usr/lib/jvm/jdk1.7.0_04/jre Default locale: en_US, platform encoding: ISO-8859-1 OS name: linux, version: 3.2.0-25-generic, arch: amd64, family: unix Reporter: Trevor Robinson Attachments: HDFS-3804.patch For example: testFileNameEncoding(org.apache.hadoop.hdfs.TestHftpFileSystem): Filesystem closed testDataNodeRedirect(org.apache.hadoop.hdfs.TestHftpFileSystem): Filesystem closed This test case sets up a filesystem that is used by the first half of the test methods (in declaration order), but the second half of the tests start by calling {{FileSystem.closeAll}}. With JDK7, test methods are run in an arbitrary order, so if any first half methods run after any second half methods, they fail. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
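A rough sketch of the shape of such a split, with class and method names paraphrased rather than copied from the attached patch: the tests that reset the FileSystem cache get their own class and re-create the filesystem per test, so JDK7's arbitrary method ordering can no longer let them invalidate a fixture shared with unrelated tests.
{code}
import static org.junit.Assert.assertNotNull;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

public class FileSystemResetSketchTest {
  private FileSystem fs;

  @Before
  public void setUp() throws Exception {
    // Each test starts from a clean cache and its own FileSystem instance.
    FileSystem.closeAll();
    fs = FileSystem.get(new Configuration());
  }

  @After
  public void tearDown() throws Exception {
    if (fs != null) {
      fs.close();
    }
  }

  @Test
  public void testUsableAfterCacheReset() throws Exception {
    assertNotNull(fs.getUri());
  }
}
{code}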
[jira] [Updated] (HDFS-3804) TestHftpFileSystem fails intermittently with JDK7
[ https://issues.apache.org/jira/browse/HDFS-3804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Trevor Robinson updated HDFS-3804: -- Assignee: Trevor Robinson Status: Patch Available (was: Open) TestHftpFileSystem fails intermittently with JDK7 - Key: HDFS-3804 URL: https://issues.apache.org/jira/browse/HDFS-3804 Project: Hadoop HDFS Issue Type: Bug Components: test Environment: Apache Maven 3.0.4 Maven home: /usr/share/maven Java version: 1.7.0_04, vendor: Oracle Corporation Java home: /usr/lib/jvm/jdk1.7.0_04/jre Default locale: en_US, platform encoding: ISO-8859-1 OS name: linux, version: 3.2.0-25-generic, arch: amd64, family: unix Reporter: Trevor Robinson Assignee: Trevor Robinson Attachments: HDFS-3804.patch For example: testFileNameEncoding(org.apache.hadoop.hdfs.TestHftpFileSystem): Filesystem closed testDataNodeRedirect(org.apache.hadoop.hdfs.TestHftpFileSystem): Filesystem closed This test case sets up a filesystem that is used by the first half of the test methods (in declaration order), but the second half of the tests start by calling {{FileSystem.closeAll}}. With JDK7, test methods are run in an arbitrary order, so if any first half methods run after any second half methods, they fail. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3723) All commands should support meaningful --help
[ https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434656#comment-13434656 ] Suresh Srinivas commented on HDFS-3723: --- Test failures are unrelated to the patch. All commands should support meaningful --help - Key: HDFS-3723 URL: https://issues.apache.org/jira/browse/HDFS-3723 Project: Hadoop HDFS Issue Type: Improvement Components: scripts, tools Affects Versions: 2.0.0-alpha Reporter: E. Sammer Assignee: Jing Zhao Attachments: HDFS-3723.001.patch, HDFS-3723.001.patch, HDFS-3723.002.patch, HDFS-3723.003.patch, HDFS-3723.patch, HDFS-3723.patch Some (sub)commands support -help or -h options for detailed help while others do not. Ideally, all commands should support meaningful help that works regardless of current state or configuration. For example, hdfs zkfc --help (or -h or -help) is not very useful. Option checking should occur before state / configuration checking. {code} [esammer@hadoop-fed01 ~]# hdfs zkfc --help Exception in thread main org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode. at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168) {code} This would go a long way toward better usability for ops staff. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2966) TestNameNodeMetrics tests can fail under load
[ https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434660#comment-13434660 ] Trevor Robinson commented on HDFS-2966: --- I hit this on my last 2 builds of trunk. I don't see an open issue on it, so should I create a new issue or reopen this one (or HDFS-540)? {noformat} testCorruptBlock(org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics) Time elapsed: 7.082 sec FAILURE! java.lang.AssertionError: Bad value for metric PendingReplicationBlocks expected:0 but was:1 at org.junit.Assert.fail(Assert.java:91) at org.junit.Assert.failNotEquals(Assert.java:645) at org.junit.Assert.assertEquals(Assert.java:126) at org.junit.Assert.assertEquals(Assert.java:470) at org.apache.hadoop.test.MetricsAsserts.assertGauge(MetricsAsserts.java:191) at org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics.testCorruptBlock(TestNameNodeMetrics.java:186) {noformat} TestNameNodeMetrics tests can fail under load - Key: HDFS-2966 URL: https://issues.apache.org/jira/browse/HDFS-2966 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.0.0-alpha Environment: OS/X running intellij IDEA, firefox, winxp in a virtualbox. Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Fix For: 2.2.0-alpha Attachments: HDFS-2966.patch, HDFS-2966.patch, HDFS-2966.patch, HDFS-2966.patch I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of running the HDFS tests on a desktop with out enough memory for all the programs trying to run. Things got swapped out and the tests failed as the DN heartbeats didn't come in on time. the tests both rely on {{waitForDeletion()}} to block the tests until the delete operation has completed, but all it does is sleep for the same number of seconds as there are datanodes. This is too brittle -it may work on a lightly-loaded system, but not on a system under heavy load where it is taking longer to replicate than expect. Immediate fix: double, triple, the sleep time? Better fix: have the thread block until all the DN heartbeats have finished. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
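The "better fix" suggested at the end of the description is the usual replace-a-fixed-sleep-with-a-poll pattern: wait on the condition itself (here, the pending-replication metric or the deletion completing) with a timeout, so a loaded machine simply takes more iterations rather than failing. A generic hedged sketch of such a helper; it is not Hadoop's own test utility class.
{code}
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

public class WaitForSketch {
  public static void waitFor(BooleanSupplier condition, long intervalMs, long timeoutMs)
      throws InterruptedException, TimeoutException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (!condition.getAsBoolean()) {
      if (System.currentTimeMillis() > deadline) {
        throw new TimeoutException("Condition not met within " + timeoutMs + " ms");
      }
      // Poll rather than sleeping "one second per datanode": under heavy load the
      // loop just runs longer instead of giving up at an arbitrary point.
      Thread.sleep(intervalMs);
    }
  }
}
{code}
A test would then call something like waitFor(() -> getPendingReplicationBlocks() == 0, 100, 30000) in place of a fixed Thread.sleep, where getPendingReplicationBlocks() stands in for whatever metric read the test already performs.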
[jira] [Updated] (HDFS-3723) All commands should support meaningful --help
[ https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-3723: -- Resolution: Fixed Fix Version/s: 3.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I committed the patch. I had to merge some conflicts related to imports in NameNode.java. Will post the updated file. Thank you Jing for contributing this. All commands should support meaningful --help - Key: HDFS-3723 URL: https://issues.apache.org/jira/browse/HDFS-3723 Project: Hadoop HDFS Issue Type: Improvement Components: scripts, tools Affects Versions: 2.0.0-alpha Reporter: E. Sammer Assignee: Jing Zhao Fix For: 3.0.0 Attachments: HDFS-3723.001.patch, HDFS-3723.001.patch, HDFS-3723.002.patch, HDFS-3723.003.patch, HDFS-3723.patch, HDFS-3723.patch, HDFS-3723.patch Some (sub)commands support -help or -h options for detailed help while others do not. Ideally, all commands should support meaningful help that works regardless of current state or configuration. For example, hdfs zkfc --help (or -h or -help) is not very useful. Option checking should occur before state / configuration checking. {code} [esammer@hadoop-fed01 ~]# hdfs zkfc --help Exception in thread main org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode. at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168) {code} This would go a long way toward better usability for ops staff. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
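The ordering asked for in the description (option checking before state / configuration checking) comes down to scanning the arguments for a help flag before any configuration is parsed or validated. A hedged sketch of that ordering in a generic tool entry point; it is not the committed DFSZKFailoverController change, and the usage string is a placeholder.
{code}
public class HelpFirstToolSketch {
  private static final String USAGE = "Usage: <command> [options]";

  public static void main(String[] args) {
    for (String arg : args) {
      if (arg.equals("-h") || arg.equals("-help") || arg.equals("--help")) {
        // Print usage and exit before touching configuration, so --help works
        // even when HA is not enabled or the configuration is otherwise invalid.
        System.out.println(USAGE);
        return;
      }
    }
    // Only now load and validate configuration, then run the real tool...
  }
}
{code}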
[jira] [Updated] (HDFS-3723) All commands should support meaningful --help
[ https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-3723: -- Attachment: HDFS-3723.patch Updated patch post merging with the trunk. All commands should support meaningful --help - Key: HDFS-3723 URL: https://issues.apache.org/jira/browse/HDFS-3723 Project: Hadoop HDFS Issue Type: Improvement Components: scripts, tools Affects Versions: 2.0.0-alpha Reporter: E. Sammer Assignee: Jing Zhao Fix For: 3.0.0 Attachments: HDFS-3723.001.patch, HDFS-3723.001.patch, HDFS-3723.002.patch, HDFS-3723.003.patch, HDFS-3723.patch, HDFS-3723.patch, HDFS-3723.patch Some (sub)commands support -help or -h options for detailed help while others do not. Ideally, all commands should support meaningful help that works regardless of current state or configuration. For example, hdfs zkfc --help (or -h or -help) is not very useful. Option checking should occur before state / configuration checking. {code} [esammer@hadoop-fed01 ~]# hdfs zkfc --help Exception in thread main org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode. at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168) {code} This would go a long way toward better usability for ops staff. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname
[ https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434673#comment-13434673 ] Hudson commented on HDFS-3150: -- Integrated in Hadoop-Common-trunk-Commit #2577 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2577/]) HDFS-3150. Add option for clients to contact DNs via hostname. Contributed by Eli Collins (Revision 1373094) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1373094 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeID.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DNConf.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileChecksumServlets.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCreation.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestHftpFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestMiniDFSCluster.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestShortCircuitLocalRead.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/token/block/TestBlockToken.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestInterDatanodeProtocol.java Add option for clients to contact DNs via hostname -- Key: HDFS-3150 URL: https://issues.apache.org/jira/browse/HDFS-3150 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, hdfs client 
Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Fix For: 1.1.0, 2.2.0-alpha Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt The DN listens on multiple IP addresses (the default {{dfs.datanode.address}} is the wildcard) however per HADOOP-6867 only the source address (IP) of the registration is given to clients. HADOOP-985 made clients access datanodes by IP primarily to avoid the latency of a DNS lookup, this had the side effect of breaking DN multihoming (the client can not route the IP exposed by the NN if the DN registers with an interface that has a cluster-private IP). To fix this let's add back the option for Datanodes to be accessed by hostname. This can be done by: # Modifying the primary field of the Datanode descriptor to be the hostname, or # Modifying Client/Datanode - Datanode access use the hostname field instead of the IP Approach #2 does not require an incompatible client protocol change, and is much less invasive. It minimizes the scope of modification to just places where clients and Datanodes connect, vs changing all uses of
[jira] [Commented] (HDFS-3765) Namenode INITIALIZESHAREDEDITS should be able to initialize all shared storages
[ https://issues.apache.org/jira/browse/HDFS-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434674#comment-13434674 ] Hudson commented on HDFS-3765: -- Integrated in Hadoop-Common-trunk-Commit #2577 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2577/]) HDFS-3765. namenode -initializeSharedEdits should be able to initialize all shared storages. Contributed by Vinay and Todd Lipcon. (Revision 1373061) Result = SUCCESS todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1373061 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ExitUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/contrib/bkjournal/TestBookKeeperAsHASharedDir.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/HATestUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestInitializeSharedEdits.java Namenode INITIALIZESHAREDEDITS should be able to initialize all shared storages --- Key: HDFS-3765 URL: https://issues.apache.org/jira/browse/HDFS-3765 Project: Hadoop HDFS Issue Type: Improvement Components: ha Affects Versions: 2.1.0-alpha, 3.0.0 Reporter: Vinay Assignee: Vinay Fix For: 3.0.0, 2.2.0-alpha Attachments: hdfs-3765-branch-2.txt, HDFS-3765.patch, HDFS-3765.patch, HDFS-3765.patch, hdfs-3765.txt, hdfs-3765.txt, hdfs-3765.txt Currently, NameNode INITIALIZESHAREDEDITS provides ability to copy the edits files to file schema based shared storages when moving cluster from Non-HA environment to HA enabled environment. This Jira focuses on the following * Generalizing the logic of copying the edits to new shared storage so that any schema based shared storage can initialized for HA cluster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3718) Datanode won't shutdown because of runaway DataBlockScanner thread
[ https://issues.apache.org/jira/browse/HDFS-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434675#comment-13434675 ] Hudson commented on HDFS-3718: -- Integrated in Hadoop-Common-trunk-Commit #2577 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2577/]) HDFS-3718. Datanode won't shutdown because of runaway DataBlockScanner thread (Kihwal Lee via daryn) (Revision 1373090) Result = SUCCESS daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1373090 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java Datanode won't shutdown because of runaway DataBlockScanner thread -- Key: HDFS-3718 URL: https://issues.apache.org/jira/browse/HDFS-3718 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.1-alpha Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Fix For: 0.23.3, 2.1.0-alpha, 3.0.0, 2.2.0-alpha Attachments: hdfs-3718.patch.txt Datanode sometimes does not shutdown because the block pool scanner thread keeps running. It prints out Starting a new period every five seconds, even after {{shutdown()}} is called. Somehow the interrupt is missed. {{DataBlockScanner}} will also terminate if {{datanode.shouldRun}} is false, but in {{DataNode#shutdown}}, {{DataBlockScanner#shutdown()}} is invoked before it is being set to false. Is there any reason why {{datanode.shouldRun}} is set to false later? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
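On the question at the end of the description: the conventional ordering is to clear the run flag before interrupting and joining the worker, so that even a swallowed interrupt still lets the loop exit on its next check of the flag. A schematic sketch of that pattern; it is not the committed DataNode change.
{code}
public class ScannerShutdownSketch {
  private volatile boolean shouldRun = true;

  private final Thread scanner = new Thread(() -> {
    while (shouldRun && !Thread.currentThread().isInterrupted()) {
      // ... scan one block pool slice ...
      try {
        Thread.sleep(5000);
      } catch (InterruptedException e) {
        return;
      }
    }
  }, "BlockScanner");

  public void start() {
    scanner.start();
  }

  public void shutdown() throws InterruptedException {
    // Clear the flag *before* interrupting: if the interrupt is missed, the loop
    // condition still stops the thread on its next pass instead of running away.
    shouldRun = false;
    scanner.interrupt();
    scanner.join();
  }
}
{code}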
[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname
[ https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434677#comment-13434677 ] Hudson commented on HDFS-3150: -- Integrated in Hadoop-Hdfs-trunk-Commit #2642 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2642/]) HDFS-3150. Add option for clients to contact DNs via hostname. Contributed by Eli Collins (Revision 1373094) Result = FAILURE eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1373094 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeID.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DNConf.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileChecksumServlets.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCreation.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestHftpFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestMiniDFSCluster.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestShortCircuitLocalRead.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/token/block/TestBlockToken.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestInterDatanodeProtocol.java Add option for clients to contact DNs via hostname -- Key: HDFS-3150 URL: https://issues.apache.org/jira/browse/HDFS-3150 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, hdfs client Affects 
Versions: 1.0.0, 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Fix For: 1.1.0, 2.2.0-alpha Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt The DN listens on multiple IP addresses (the default {{dfs.datanode.address}} is the wildcard) however per HADOOP-6867 only the source address (IP) of the registration is given to clients. HADOOP-985 made clients access datanodes by IP primarily to avoid the latency of a DNS lookup, this had the side effect of breaking DN multihoming (the client can not route the IP exposed by the NN if the DN registers with an interface that has a cluster-private IP). To fix this let's add back the option for Datanodes to be accessed by hostname. This can be done by: # Modifying the primary field of the Datanode descriptor to be the hostname, or # Modifying Client/Datanode - Datanode access use the hostname field instead of the IP Approach #2 does not require an incompatible client protocol change, and is much less invasive. It minimizes the scope of modification to just places where clients and Datanodes connect, vs changing all uses of Datanode
[jira] [Commented] (HDFS-3723) All commands should support meaningful --help
[ https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434678#comment-13434678 ] Hudson commented on HDFS-3723: -- Integrated in Hadoop-Hdfs-trunk-Commit #2642 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2642/]) HDFS-3723. Add support -h, -help to all the commands. Contributed by Jing Zhao. (Revision 1373170) Result = FAILURE suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1373170 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSHAAdmin.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSZKFailoverController.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSck.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DelegationTokenFetcher.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/GetConf.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/GetGroups.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDFSHAAdmin.java All commands should support meaningful --help - Key: HDFS-3723 URL: https://issues.apache.org/jira/browse/HDFS-3723 Project: Hadoop HDFS Issue Type: Improvement Components: scripts, tools Affects Versions: 2.0.0-alpha Reporter: E. Sammer Assignee: Jing Zhao Fix For: 3.0.0 Attachments: HDFS-3723.001.patch, HDFS-3723.001.patch, HDFS-3723.002.patch, HDFS-3723.003.patch, HDFS-3723.patch, HDFS-3723.patch, HDFS-3723.patch Some (sub)commands support -help or -h options for detailed help while others do not. Ideally, all commands should support meaningful help that works regardless of current state or configuration. For example, hdfs zkfc --help (or -h or -help) is not very useful. Option checking should occur before state / configuration checking. {code} [esammer@hadoop-fed01 ~]# hdfs zkfc --help Exception in thread main org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode. at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168) {code} This would go a long way toward better usability for ops staff. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3765) Namenode INITIALIZESHAREDEDITS should be able to initialize all shared storages
[ https://issues.apache.org/jira/browse/HDFS-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434679#comment-13434679 ] Hudson commented on HDFS-3765: -- Integrated in Hadoop-Hdfs-trunk-Commit #2642 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2642/]) HDFS-3765. namenode -initializeSharedEdits should be able to initialize all shared storages. Contributed by Vinay and Todd Lipcon. (Revision 1373061) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1373061 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ExitUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/contrib/bkjournal/TestBookKeeperAsHASharedDir.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/HATestUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestInitializeSharedEdits.java Namenode INITIALIZESHAREDEDITS should be able to initialize all shared storages --- Key: HDFS-3765 URL: https://issues.apache.org/jira/browse/HDFS-3765 Project: Hadoop HDFS Issue Type: Improvement Components: ha Affects Versions: 2.1.0-alpha, 3.0.0 Reporter: Vinay Assignee: Vinay Fix For: 3.0.0, 2.2.0-alpha Attachments: hdfs-3765-branch-2.txt, HDFS-3765.patch, HDFS-3765.patch, HDFS-3765.patch, hdfs-3765.txt, hdfs-3765.txt, hdfs-3765.txt Currently, NameNode INITIALIZESHAREDEDITS provides ability to copy the edits files to file schema based shared storages when moving cluster from Non-HA environment to HA enabled environment. This Jira focuses on the following * Generalizing the logic of copying the edits to new shared storage so that any schema based shared storage can initialized for HA cluster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3718) Datanode won't shutdown because of runaway DataBlockScanner thread
[ https://issues.apache.org/jira/browse/HDFS-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434680#comment-13434680 ] Hudson commented on HDFS-3718: -- Integrated in Hadoop-Hdfs-trunk-Commit #2642 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2642/]) HDFS-3718. Datanode won't shutdown because of runaway DataBlockScanner thread (Kihwal Lee via daryn) (Revision 1373090) Result = FAILURE daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1373090 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java Datanode won't shutdown because of runaway DataBlockScanner thread -- Key: HDFS-3718 URL: https://issues.apache.org/jira/browse/HDFS-3718 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.1-alpha Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Fix For: 0.23.3, 2.1.0-alpha, 3.0.0, 2.2.0-alpha Attachments: hdfs-3718.patch.txt Datanode sometimes does not shutdown because the block pool scanner thread keeps running. It prints out Starting a new period every five seconds, even after {{shutdown()}} is called. Somehow the interrupt is missed. {{DataBlockScanner}} will also terminate if {{datanode.shouldRun}} is false, but in {{DataNode#shutdown}}, {{DataBlockScanner#shutdown()}} is invoked before it is being set to false. Is there any reason why {{datanode.shouldRun}} is set to false later? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3803) BlockPoolSliceScanner new work period notice is very chatty at INFO level
[ https://issues.apache.org/jira/browse/HDFS-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434690#comment-13434690 ] Suresh Srinivas commented on HDFS-3803: --- +1 for the patch. I will commit it soon. BlockPoolSliceScanner new work period notice is very chatty at INFO level - Key: HDFS-3803 URL: https://issues.apache.org/jira/browse/HDFS-3803 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.1.0-alpha, 2.0.1-alpha Environment: Hadoop 2.0.1-alpha-SNAPSHOT Reporter: Andrew Purtell Priority: Minor Attachments: HDFS-3803.patch One line of ~140 chars logged every 5 seconds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3802) StartupOption.name in HdfsServerConstants should be final
[ https://issues.apache.org/jira/browse/HDFS-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-3802: Assignee: Jing Zhao Status: Patch Available (was: Open) StartupOption.name in HdfsServerConstants should be final - Key: HDFS-3802 URL: https://issues.apache.org/jira/browse/HDFS-3802 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Jing Zhao Priority: Trivial Fix For: 3.0.0 Attachments: HDFS-3802.patch In HdfsServerConstants, it may be better to define StartupOption.name as final since it will not and should not be modified after initialization. For example, in NameNode.java, the printUsage function prints out multiple startup options' names; a modification of StartupOption.name could therefore produce an invalid usage message. Although there are currently no methods that change or set the value of StartupOption.name, it is better to add the final keyword to make sure. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
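For reference, making an enum's constructor-initialized field immutable is a one-keyword change; a generic sketch of the pattern (constant and field names here are illustrative, not the HdfsServerConstants source):
{code}
public enum StartupOptionSketch {
  FORMAT("-format"),
  REGULAR("-regular"),
  UPGRADE("-upgrade");

  // 'final' guarantees the flag string printed in usage messages can never be
  // reassigned after the constant is constructed.
  private final String name;

  StartupOptionSketch(String name) {
    this.name = name;
  }

  public String getName() {
    return name;
  }
}
{code}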
[jira] [Updated] (HDFS-3723) All commands should support meaningful --help
[ https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-3723: -- Attachment: HDFS-3723.patch Attaching the complete patch. All commands should support meaningful --help - Key: HDFS-3723 URL: https://issues.apache.org/jira/browse/HDFS-3723 Project: Hadoop HDFS Issue Type: Improvement Components: scripts, tools Affects Versions: 2.0.0-alpha Reporter: E. Sammer Assignee: Jing Zhao Fix For: 3.0.0 Attachments: HDFS-3723.001.patch, HDFS-3723.001.patch, HDFS-3723.002.patch, HDFS-3723.003.patch, HDFS-3723.patch, HDFS-3723.patch, HDFS-3723.patch Some (sub)commands support -help or -h options for detailed help while others do not. Ideally, all commands should support meaningful help that works regardless of current state or configuration. For example, hdfs zkfc --help (or -h or -help) is not very useful. Option checking should occur before state / configuration checking. {code} [esammer@hadoop-fed01 ~]# hdfs zkfc --help Exception in thread main org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode. at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168) {code} This would go a long way toward better usability for ops staff. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
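The fix being asked for is essentially an ordering change: recognize a help flag before any configuration or HA-state checks run. A hedged sketch of that ordering follows; the class name, usage string, and flag spellings are placeholders, not the contents of the attached patch.

{code}
// Illustrative only -- not the actual HDFS-3723 patch.
public class ZkfcToolSketch {
  private static final String USAGE =
      "Usage: hdfs zkfc [-formatZK [-force | -nonInteractive]] [-h|--help]";

  public static void main(String[] args) {
    // Handle help requests *before* loading configuration or verifying HA
    // state, so --help works regardless of how the node is configured.
    for (String arg : args) {
      if ("-h".equals(arg) || "-help".equals(arg) || "--help".equals(arg)) {
        System.out.println(USAGE);
        return;
      }
    }
    // ... only now parse the configuration, check that HA is enabled,
    // and hand off to the failover controller.
  }
}
{code}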
[jira] [Updated] (HDFS-3723) All commands should support meaningful --help
[ https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-3723: -- Attachment: (was: HDFS-3723.patch) All commands should support meaningful --help - Key: HDFS-3723 URL: https://issues.apache.org/jira/browse/HDFS-3723 Project: Hadoop HDFS Issue Type: Improvement Components: scripts, tools Affects Versions: 2.0.0-alpha Reporter: E. Sammer Assignee: Jing Zhao Fix For: 3.0.0 Attachments: HDFS-3723.001.patch, HDFS-3723.001.patch, HDFS-3723.002.patch, HDFS-3723.003.patch, HDFS-3723.patch, HDFS-3723.patch, HDFS-3723.patch Some (sub)commands support -help or -h options for detailed help while others do not. Ideally, all commands should support meaningful help that works regardless of current state or configuration. For example, hdfs zkfc --help (or -h or -help) is not very useful. Option checking should occur before state / configuration checking. {code} [esammer@hadoop-fed01 ~]# hdfs zkfc --help Exception in thread main org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode. at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168) {code} This would go a long way toward better usability for ops staff. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-3803) BlockPoolSliceScanner new work period notice is very chatty at INFO level
[ https://issues.apache.org/jira/browse/HDFS-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas resolved HDFS-3803. --- Resolution: Fixed Fix Version/s: 3.0.0 Hadoop Flags: Reviewed I committed the patch. Thank you Andrew. BlockPoolSliceScanner new work period notice is very chatty at INFO level - Key: HDFS-3803 URL: https://issues.apache.org/jira/browse/HDFS-3803 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.1.0-alpha, 2.0.1-alpha Environment: Hadoop 2.0.1-alpha-SNAPSHOT Reporter: Andrew Purtell Priority: Minor Fix For: 3.0.0 Attachments: HDFS-3803.patch One line of ~140 chars logged every 5 seconds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3672) Expose disk-location information for blocks to enable better scheduling
[ https://issues.apache.org/jira/browse/HDFS-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434699#comment-13434699 ] Andrew Wang commented on HDFS-3672: --- Ran TestFsck locally and it passed; I think the test failures are unrelated. Expose disk-location information for blocks to enable better scheduling --- Key: HDFS-3672 URL: https://issues.apache.org/jira/browse/HDFS-3672 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0-alpha Reporter: Andrew Wang Assignee: Andrew Wang Attachments: design-doc-v1.pdf, design-doc-v2.pdf, hdfs-3672-1.patch, hdfs-3672-2.patch, hdfs-3672-3.patch, hdfs-3672-4.patch, hdfs-3672-5.patch, hdfs-3672-6.patch, hdfs-3672-7.patch, hdfs-3672-8.patch, hdfs-3672-9.patch Currently, HDFS exposes on which datanodes a block resides, which allows clients to make scheduling decisions for locality and load balancing. Extending this to also expose on which disk on a datanode a block resides would enable even better scheduling, on a per-disk rather than coarse per-datanode basis. This API would likely look similar to FileSystem#getFileBlockLocations, but also involve a series of RPCs to the responsible datanodes to determine disk ids. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
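The description sketches an API shaped like FileSystem#getFileBlockLocations, augmented with an opaque per-disk identifier that has to be resolved via extra RPCs to the datanodes. A hypothetical shape for such an interface is below; the type and method names are illustrative and are not claimed to match the attached patches.

{code}
// Hypothetical sketch of a disk-aware block location API; names are
// illustrative and may not match the committed patch.
public interface DiskAwareBlockLocations {
  /**
   * Like FileSystem#getFileBlockLocations, but each returned location also
   * carries an opaque identifier for the volume (disk) holding the replica
   * on each datanode. Filling in the identifiers requires additional RPCs
   * to the datanodes hosting the replicas.
   */
  BlockDiskLocation[] getFileBlockDiskLocations(
      org.apache.hadoop.fs.Path path, long start, long len)
      throws java.io.IOException;

  /** A block location plus one opaque volume id per replica host. */
  interface BlockDiskLocation {
    String[] getHosts() throws java.io.IOException;
    byte[][] getVolumeIds();
  }
}
{code}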
[jira] [Commented] (HDFS-3793) Implement genericized format() in QJM
[ https://issues.apache.org/jira/browse/HDFS-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434702#comment-13434702 ] Todd Lipcon commented on HDFS-3793: --- Fixed the two nits and committed to branch. Thanks. bq. Seems like you should make the QuorumJournalManager#format and QuorumJournalManager#hasSomeData timeouts configurable, or at least use constants and add a comment or two justifying how you chose those values. I added constants and set them both to 60 seconds. Also added a comment explaining that, since they are only used in format and not normal operation, we can use a fairly long timeout and don't really need to configure them (if a user sees a timeout they can manually investigate why it's taking 60+ seconds and do something about it). bq. I think I see the reasoning behind the need for the call to unlockAll in JNStorage#format, but you might want to add a comment explaining why it's necessary. Also, if this happens, when will the storage be locked again? Might want to add a comment explaining that as well. Added a comment: {code} // Unlock the directory before formatting, because we will // re-analyze it after format(). The analyzeStorage() call // below is responsible for re-locking it. This is a no-op // if the storage is not currently locked. unlockAll(); {code} Implement genericized format() in QJM - Key: HDFS-3793 URL: https://issues.apache.org/jira/browse/HDFS-3793 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-3793.txt HDFS-3695 added the ability for non-File journal managers to tie into calls like NameNode -format. This JIRA is to implement format() for QuorumJournalManager. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
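For reference, a minimal sketch of the first point above -- fixed 60-second constants for the two format-path calls rather than new configuration keys; the constant and class names here are assumptions, not necessarily the identifiers used in the branch.

{code}
// Illustrative sketch; names are assumptions, not the branch's identifiers.
public class QuorumJournalManagerTimeoutsSketch {
  // Used only by format() and hasSomeData(), never during normal operation,
  // so a fixed, fairly long timeout is acceptable in place of a config key.
  public static final int FORMAT_TIMEOUT_MS = 60000;
  public static final int HAS_SOME_DATA_TIMEOUT_MS = 60000;
}
{code}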
[jira] [Commented] (HDFS-3803) BlockPoolSliceScanner new work period notice is very chatty at INFO level
[ https://issues.apache.org/jira/browse/HDFS-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434703#comment-13434703 ] Andy Isaacson commented on HDFS-3803: - The BlockPoolSliceScanner is supposed to start a new period every three weeks, not every 5 seconds. See HDFS-3194. I think this -LOG.info +LOG.debug change should be reverted and https://issues.apache.org/jira/browse/HDFS-3194?focusedCommentId=13399085page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13399085 should be merged instead. BlockPoolSliceScanner new work period notice is very chatty at INFO level - Key: HDFS-3803 URL: https://issues.apache.org/jira/browse/HDFS-3803 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.1.0-alpha, 2.0.1-alpha Environment: Hadoop 2.0.1-alpha-SNAPSHOT Reporter: Andrew Purtell Priority: Minor Fix For: 3.0.0 Attachments: HDFS-3803.patch One line of ~140 chars logged every 5 seconds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
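For context, the attached patch amounts to demoting the per-period notice from INFO to DEBUG; a minimal sketch of that kind of change follows (the class, logger, and message text are illustrative, not the literal diff), while the comment above argues the real fix is the period length addressed in HDFS-3194.

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

// Illustrative sketch of an INFO -> DEBUG demotion for a chatty message;
// not the literal HDFS-3803 diff.
public class ScannerLoggingSketch {
  private static final Log LOG = LogFactory.getLog(ScannerLoggingSketch.class);

  void logNewPeriod(double workLeftPct) {
    // Previously logged at INFO every five seconds -- very chatty.
    if (LOG.isDebugEnabled()) {
      LOG.debug(String.format(
          "Starting a new period : work left in prev period : %.2f%%", workLeftPct));
    }
  }
}
{code}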
[jira] [Resolved] (HDFS-3793) Implement genericized format() in QJM
[ https://issues.apache.org/jira/browse/HDFS-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved HDFS-3793. --- Resolution: Fixed Fix Version/s: QuorumJournalManager (HDFS-3077) Hadoop Flags: Reviewed Implement genericized format() in QJM - Key: HDFS-3793 URL: https://issues.apache.org/jira/browse/HDFS-3793 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: QuorumJournalManager (HDFS-3077) Attachments: hdfs-3793.txt HDFS-3695 added the ability for non-File journal managers to tie into calls like NameNode -format. This JIRA is to implement format() for QuorumJournalManager. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-3798) Avoid throwing NPE when finalizeSegment() is called on invalid segment
[ https://issues.apache.org/jira/browse/HDFS-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved HDFS-3798. --- Resolution: Fixed Fix Version/s: QuorumJournalManager (HDFS-3077) Hadoop Flags: Reviewed Avoid throwing NPE when finalizeSegment() is called on invalid segment -- Key: HDFS-3798 URL: https://issues.apache.org/jira/browse/HDFS-3798 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Trivial Fix For: QuorumJournalManager (HDFS-3077) Attachments: hdfs-3798.txt, hdfs-3798.txt Currently, if the client calls finalizeLogSegment() on a segment which doesn't exist on the JournalNode side, it throws an NPE. Instead it should throw a more intelligible exception. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
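The guard the issue calls for is straightforward: validate that the requested segment exists and throw a descriptive exception rather than letting a null dereference surface as an NPE. The sketch below uses a stand-in segment map; it is not the committed JournalNode change.

{code}
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Illustrative guard only -- not the actual JournalNode change.
public class FinalizeGuardSketch {
  /** Stand-in for the journal's view of an in-progress log segment. */
  static class Segment { long startTxId; long endTxId; }

  private final Map<Long, Segment> segments = new HashMap<Long, Segment>();

  public void finalizeLogSegment(long startTxId, long endTxId) throws IOException {
    Segment seg = segments.get(startTxId);
    if (seg == null) {
      // Fail with a descriptive message instead of dereferencing null.
      throw new IOException("No log segment starting at txid " + startTxId
          + " to finalize (requested end txid " + endTxId + ")");
    }
    seg.endTxId = endTxId;
    // ... mark the segment finalized on disk ...
  }
}
{code}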
[jira] [Commented] (HDFS-3799) QJM: handle empty log segments during recovery
[ https://issues.apache.org/jira/browse/HDFS-3799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434709#comment-13434709 ] Todd Lipcon commented on HDFS-3799: --- Fixed the spelling typo. Going to punt on the other thing - the different loop iterations fail separately enough that it's easier to diagnose them as separate test cases. Will commit momentarily with the nit addressed. QJM: handle empty log segments during recovery -- Key: HDFS-3799 URL: https://issues.apache.org/jira/browse/HDFS-3799 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: QuorumJournalManager (HDFS-3077) Attachments: hdfs-3799.txt One of the cases not yet handled in the QJM branch is the one where either the writer or the journal node crashes after startLogSegment() but before it has written its first transaction to the log. We currently have TODO assertions in the code which fire in these cases. This JIRA is to deal with these cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3803) BlockPoolSliceScanner new work period notice is very chatty at INFO level
[ https://issues.apache.org/jira/browse/HDFS-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-3803: -- Fix Version/s: 2.1.0-alpha Committed to 2.1 BlockPoolSliceScanner new work period notice is very chatty at INFO level - Key: HDFS-3803 URL: https://issues.apache.org/jira/browse/HDFS-3803 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.1.0-alpha, 2.0.1-alpha Environment: Hadoop 2.0.1-alpha-SNAPSHOT Reporter: Andrew Purtell Priority: Minor Fix For: 2.1.0-alpha, 3.0.0 Attachments: HDFS-3803.patch One line of ~140 chars logged every 5 seconds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira