[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip

2013-12-02 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837414#comment-13837414
 ] 

Binglin Chang commented on HDFS-5574:
-

The failed tests are unrelated. 
TestBalancerWithNodeGroup is tracked in HDFS-5580.
TestFsDatasetCache: I looked into the failure log; the timeout is caused by a 
race condition. Fortunately, after HDFS-5556 was committed the race condition 
no longer applies, so there should be no problem now.


> Remove buffer copy in BlockReader.skip
> --
>
> Key: HDFS-5574
> URL: https://issues.apache.org/jira/browse/HDFS-5574
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Binglin Chang
>Assignee: Binglin Chang
>Priority: Trivial
> Attachments: HDFS-5574.v1.patch, HDFS-5574.v2.patch
>
>
> BlockReaderLocal.skip and RemoteBlockReader.skip read the skipped data into a 
> temporary buffer; this copy is not necessary.
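A minimal sketch of the idea, with illustrative field names rather than the 
actual BlockReader internals: a position-based skip only advances the reader's 
offset, so no bytes need to be copied into a scratch buffer.
{code}
// Hedged sketch -- not the actual BlockReaderLocal/RemoteBlockReader code.
class PositionTrackingReader {
  private long posInBlock;            // current offset within the block
  private final long blockLength;     // total bytes in the block

  PositionTrackingReader(long blockLength) {
    this.blockLength = blockLength;
  }

  public synchronized long skip(long n) {
    long remaining = blockLength - posInBlock;  // bytes left to skip over
    long skipped = Math.min(n, remaining);
    posInBlock += skipped;                      // no data is read or copied
    return skipped;
  }
}
{code}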



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5587) add debug information when NFS fails to start with duplicate user or group names

2013-12-02 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837413#comment-13837413
 ] 

Jing Zhao commented on HDFS-5587:
-

The patch looks good overall. Some comments:
# When we have duplicate user/group names, instead of terminating the whole 
process, I think it would be better to rethrow the IOException and let the 
upper-level application handle the exception.
{code}
+  LOG.fatal("Can't update maps:" + e);
+  terminate(1, e);
{code}
# We can check the detailed exception message after catching the exception in 
the following code (a possible check is sketched after the snippet):
{code}
+try {
+  IdUserGroup.updateMapInternal(uMap, "user", GET_ALL_USERS_CMD, ":");
+  fail("didn't detect the duplicate name");
+} catch (IOException e) {
+}
{code}
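For example (a hedged sketch using plain JUnit assertions; the expected 
substring is illustrative and should match whatever message updateMapInternal 
attaches to the duplicate-name error):
{code}
+try {
+  IdUserGroup.updateMapInternal(uMap, "user", GET_ALL_USERS_CMD, ":");
+  fail("didn't detect the duplicate name");
+} catch (IOException e) {
+  // verify we failed for the expected reason instead of swallowing any IOException
+  assertTrue("unexpected message: " + e.getMessage(),
+      e.getMessage().contains("duplicate name"));
+}
{code}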

> add debug information when NFS fails to start with duplicate user or group 
> names
> 
>
> Key: HDFS-5587
> URL: https://issues.apache.org/jira/browse/HDFS-5587
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: nfs
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Minor
> Attachments: HDFS-5587.001.patch
>
>
> When the host provides duplicate user or group names, NFS will not start and 
> print errors like the following:
> {noformat}
> ... ... 
> 13/11/25 18:11:52 INFO nfs3.Nfs3Base: registered UNIX signal handlers for 
> [TERM, HUP, INT]
> Exception in thread "main" java.lang.IllegalArgumentException: value already 
> present: s-iss
> at com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
> at 
> com.google.common.collect.AbstractBiMap.putInBothMaps(AbstractBiMap.java:112)
> at com.google.common.collect.AbstractBiMap.put(AbstractBiMap.java:96)
> at com.google.common.collect.HashBiMap.put(HashBiMap.java:85)
> at 
> org.apache.hadoop.nfs.nfs3.IdUserGroup.updateMapInternal(IdUserGroup.java:85)
> at org.apache.hadoop.nfs.nfs3.IdUserGroup.updateMaps(IdUserGroup.java:110)
> at org.apache.hadoop.nfs.nfs3.IdUserGroup.(IdUserGroup.java:54)
> at 
> org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.(RpcProgramNfs3.java:172)
> at 
> org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.(RpcProgramNfs3.java:164)
> at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.(Nfs3.java:41)
> at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:52)
> 13/11/25 18:11:54 INFO nfs3.Nfs3Base: SHUTDOWN_MSG:
> ... ...
> {noformat}
> The reason NFS should not start is that HDFS (in a non-Kerberos cluster) uses 
> the name as the only way to identify a user. On some Linux boxes there can be 
> two users with the same name but different user IDs. Linux might be able to 
> work fine with that most of the time. However, when the NFS gateway talks to 
> HDFS, HDFS accepts only the user name. That is, from HDFS's point of view, 
> these two different users are the same user even though they are different on 
> the Linux box.
> The duplicate names on Linux systems are sometimes caused by legacy system 
> configurations or combined name services.
> Regardless, the NFS gateway should print some help information so the user can 
> understand the error and remove the duplicate names before restarting NFS.
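The stack trace above comes from Guava's HashBiMap, which enforces a bijection 
and rejects duplicate values. A tiny standalone example (independent of the 
actual IdUserGroup code) reproduces the same failure:
{code}
import com.google.common.collect.HashBiMap;

public class BiMapDuplicateDemo {
  public static void main(String[] args) {
    HashBiMap<Integer, String> uidToName = HashBiMap.create();
    uidToName.put(1001, "s-iss");
    // Mapping a second UID to the same name breaks the bijection and throws
    // java.lang.IllegalArgumentException: value already present: s-iss
    uidToName.put(1002, "s-iss");
  }
}
{code}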



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5536) Implement HTTP policy for Namenode and DataNode

2013-12-02 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-5536:
-

Attachment: HDFS-5536.008.patch

> Implement HTTP policy for Namenode and DataNode
> ---
>
> Key: HDFS-5536
> URL: https://issues.apache.org/jira/browse/HDFS-5536
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-5536.000.patch, HDFS-5536.001.patch, 
> HDFS-5536.002.patch, HDFS-5536.003.patch, HDFS-5536.004.patch, 
> HDFS-5536.005.patch, HDFS-5536.006.patch, HDFS-5536.007.patch, 
> HDFS-5536.008.patch
>
>
> This jira implements the HTTP and HTTPS policy in the namenode and the 
> datanode.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5369) Support negative caching of user-group mapping

2013-12-02 Thread Vinay (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837385#comment-13837385
 ] 

Vinay commented on HDFS-5369:
-

Hi Andrew, thanks for posting this jira. We definitely need to find a way to 
reduce this logging. The following log message is generated for a single 
request from the UI; the amount of log output is really huge.
{noformat}2013-12-03 11:34:56,589 WARN 
org.apache.hadoop.security.ShellBasedUnixGroupsMapping: got exception trying to 
get groups for user dr.who
org.apache.hadoop.util.Shell$ExitCodeException: id: dr.who: No such user

at org.apache.hadoop.util.Shell.runCommand(Shell.java:504)
at org.apache.hadoop.util.Shell.run(Shell.java:417)
at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:636)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:725)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:708)
at 
org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getUnixGroups(ShellBasedUnixGroupsMapping.java:83)
at 
org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getGroups(ShellBasedUnixGroupsMapping.java:52)
at 
org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback.getGroups(JniBasedUnixGroupsMappingWithFallback.java:50)
at org.apache.hadoop.security.Groups.getGroups(Groups.java:95)
at 
org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1376)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.(FSPermissionChecker.java:63)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getPermissionChecker(FSNamesystem.java:3228)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListingInt(FSNamesystem.java:4063)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4052)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:748)
at 
org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.getDirectoryListing(NamenodeWebHdfsMethods.java:715)
at 
org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.getListingStream(NamenodeWebHdfsMethods.java:727)
at 
org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.get(NamenodeWebHdfsMethods.java:675)
at 
org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.access$400(NamenodeWebHdfsMethods.java:114)
at 
org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods$3.run(NamenodeWebHdfsMethods.java:623)
at 
org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods$3.run(NamenodeWebHdfsMethods.java:618)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1515)
at 
org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.get(NamenodeWebHdfsMethods.java:618)
at 
org.apache.hadoop.hdfs.server.namenode.web.resources.NamenodeWebHdfsMethods.getRoot(NamenodeWebHdfsMethods.java:586)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
at 
com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
at 
com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
at 
com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
at 
com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
at 
com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at 
com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
at 
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)
at 
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)
at 
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
at 
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)
at 
com.sun.jersey.spi.container.servlet.WebComponent.service

[jira] [Commented] (HDFS-5536) Implement HTTP policy for Namenode and DataNode

2013-12-02 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837374#comment-13837374
 ] 

Haohui Mai commented on HDFS-5536:
--

A new patch to address Jing's comments.

For (9), the suggested change would accidentally set findPort to false when the 
user specifies a concrete port for either http or https, so I left it the same 
as in the old patch.

> Implement HTTP policy for Namenode and DataNode
> ---
>
> Key: HDFS-5536
> URL: https://issues.apache.org/jira/browse/HDFS-5536
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-5536.000.patch, HDFS-5536.001.patch, 
> HDFS-5536.002.patch, HDFS-5536.003.patch, HDFS-5536.004.patch, 
> HDFS-5536.005.patch, HDFS-5536.006.patch, HDFS-5536.007.patch
>
>
> This jira implements the HTTP and HTTPS policy in the namenode and the 
> datanode.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5536) Implement HTTP policy for Namenode and DataNode

2013-12-02 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-5536:
-

Attachment: HDFS-5536.007.patch

> Implement HTTP policy for Namenode and DataNode
> ---
>
> Key: HDFS-5536
> URL: https://issues.apache.org/jira/browse/HDFS-5536
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-5536.000.patch, HDFS-5536.001.patch, 
> HDFS-5536.002.patch, HDFS-5536.003.patch, HDFS-5536.004.patch, 
> HDFS-5536.005.patch, HDFS-5536.006.patch, HDFS-5536.007.patch
>
>
> This jira implements the HTTP and HTTPS policy in the namenode and the 
> datanode.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5569) WebHDFS should support a deny/allow list for data access

2013-12-02 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837315#comment-13837315
 ] 

Alejandro Abdelnur commented on HDFS-5569:
--

Adam,

Regarding Kerberos, I meant that you could restrict which subnets the KDC is 
available to, so only clients on those subnets could get a TGT. Granted, not 
much fine-grained control.

On the reverse DNS lookup, with WebHDFS this will have to be done on ALL nodes 
of the cluster as well (all DNs), and for every HTTP request. Yes, caching may 
help, but configuration-wise it is not a couple of machines but all nodes.

A twist on Haohui's suggestion of a proxy: instead of using HDFS WebHDFS you 
could use HttpFS. HttpFS implements the WebHDFS HTTP API, but it runs fully 
contained in a Tomcat server. You don't need to expose the whole cluster to the 
clients, just the box where Tomcat runs. And, AFAIK, Tomcat supports allow/deny 
via configuration. HttpFS is part of Hadoop 2, so it is already there.


> WebHDFS should support a deny/allow list for data access
> 
>
> Key: HDFS-5569
> URL: https://issues.apache.org/jira/browse/HDFS-5569
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Reporter: Adam Faris
>  Labels: features
>
> Currently we can't restrict what networks are allowed to transfer data using 
> WebHDFS.  Obviously we can use firewalls to block ports, but this can be 
> complicated and problematic to maintain.  Additionally, because all the jetty 
> servlets run inside the same container, blocking access to jetty to prevent 
> WebHDFS transfers also blocks the other servlets running inside that same 
> jetty container.
> I am requesting a deny/allow feature be added to WebHDFS.  This is already 
> done with the Apache HTTPD server, and is what I'd like to see the deny/allow 
> list modeled after.   Thanks.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5560) Trash configuration log statements prints incorrect units

2013-12-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837298#comment-13837298
 ] 

Hudson commented on HDFS-5560:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4818 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4818/])
HDFS-5560. Trash configuration log statements prints incorrect units. 
Contributed by Josh Elser. (wang: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1547266)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/TrashPolicyDefault.java


> Trash configuration log statements prints incorrect units
> -
>
> Key: HDFS-5560
> URL: https://issues.apache.org/jira/browse/HDFS-5560
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Josh Elser
>Assignee: Josh Elser
> Fix For: 2.3.0
>
> Attachments: HDFS-5560.patch
>
>
> I ran `hdfs dfs -expunge` on a 2.2.0 system, and noticed the following 
> message printed out on the console:
> {noformat}
> $ hdfs dfs -expunge
> 13/11/23 22:12:17 INFO fs.TrashPolicyDefault: Namenode trash configuration: 
> Deletion interval = 180 minutes, Emptier interval = 0 minutes.
> {noformat}
> The configuration values for both the deletion interval and the emptier 
> interval are given in minutes, converted to milliseconds, and then logged as 
> milliseconds but with a label of minutes. It looks like this was introduced 
> in HDFS-4903.
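A hedged sketch of the labeling fix (the configuration key names are the 
standard fs.trash keys; the surrounding TrashPolicyDefault code is paraphrased, 
not the committed diff): log the minute values rather than the millisecond 
values produced by the conversion.
{code}
// Both values are configured in minutes; TrashPolicyDefault converts them to
// milliseconds internally, so the log statement should print the minute
// values, not the converted ones.
long deletionIntervalMin = conf.getLong("fs.trash.interval", 0);            // minutes
long emptierIntervalMin  = conf.getLong("fs.trash.checkpoint.interval", 0); // minutes
long deletionIntervalMs  = deletionIntervalMin * 60L * 1000L;               // used internally
LOG.info("Namenode trash configuration: Deletion interval = "
    + deletionIntervalMin + " minutes, Emptier interval = "
    + emptierIntervalMin + " minutes.");
{code}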



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5546) race condition crashes "hadoop ls -R" when directories are moved/removed

2013-12-02 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-5546:
--

Assignee: Kousuke Saruta

> race condition crashes "hadoop ls -R" when directories are moved/removed
> 
>
> Key: HDFS-5546
> URL: https://issues.apache.org/jira/browse/HDFS-5546
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Colin Patrick McCabe
>Assignee: Kousuke Saruta
>Priority: Minor
> Attachments: HDFS-5546.1.patch
>
>
> This seems to be a rare race condition where we have a sequence of events 
> like this:
> 1. org.apache.hadoop.shell.Ls calls DFS#getFileStatus on directory D.
> 2. someone deletes or moves directory D
> 3. org.apache.hadoop.shell.Ls calls PathData#getDirectoryContents(D), which 
> calls DFS#listStatus(D). This throws FileNotFoundException.
> 4. ls command terminates with FNF
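One way to make the recursive listing tolerate the race (a sketch written 
against the generic FileSystem API, not the actual shell code) is to catch the 
FileNotFoundException for a directory that disappeared and keep going:
{code}
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TolerantLister {
  // Hedged sketch: if the directory vanishes between getFileStatus and
  // listStatus, report it and continue instead of letting "ls -R" terminate.
  public static void listRecursively(FileSystem fs, Path dir) throws IOException {
    FileStatus[] children;
    try {
      children = fs.listStatus(dir);
    } catch (FileNotFoundException e) {
      System.err.println("ls: " + dir + ": No such file or directory");
      return;
    }
    for (FileStatus stat : children) {
      System.out.println(stat.getPath());
      if (stat.isDirectory()) {
        listRecursively(fs, stat.getPath());
      }
    }
  }
}
{code}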



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5560) Trash configuration log statements prints incorrect units

2013-12-02 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837291#comment-13837291
 ] 

Josh Elser commented on HDFS-5560:
--

Great! Thanks, Andrew and Vinay.

> Trash configuration log statements prints incorrect units
> -
>
> Key: HDFS-5560
> URL: https://issues.apache.org/jira/browse/HDFS-5560
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Josh Elser
>Assignee: Josh Elser
> Fix For: 2.3.0
>
> Attachments: HDFS-5560.patch
>
>
> I ran `hdfs dfs -expunge` on a 2.2.0 system, and noticed the following 
> message printed out on the console:
> {noformat}
> $ hdfs dfs -expunge
> 13/11/23 22:12:17 INFO fs.TrashPolicyDefault: Namenode trash configuration: 
> Deletion interval = 180 minutes, Emptier interval = 0 minutes.
> {noformat}
> The configuration values for both the deletion interval and the emptier 
> interval are given in minutes, converted to milliseconds, and then logged as 
> milliseconds but with a label of minutes. It looks like this was introduced 
> in HDFS-4903.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5560) Trash configuration log statements prints incorrect units

2013-12-02 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-5560:
--

   Resolution: Fixed
Fix Version/s: 2.3.0
   Status: Resolved  (was: Patch Available)

Thanks Josh, +1. Committed to trunk, branch-2, branch-2.3.

I also added you as a contributor, so you'll be able to assign JIRAs to 
yourself and mark things as patch available.

> Trash configuration log statements prints incorrect units
> -
>
> Key: HDFS-5560
> URL: https://issues.apache.org/jira/browse/HDFS-5560
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Josh Elser
>Assignee: Josh Elser
> Fix For: 2.3.0
>
> Attachments: HDFS-5560.patch
>
>
> I ran `hdfs dfs -expunge` on a 2.2.0 system, and noticed the following 
> message printed out on the console:
> {noformat}
> $ hdfs dfs -expunge
> 13/11/23 22:12:17 INFO fs.TrashPolicyDefault: Namenode trash configuration: 
> Deletion interval = 180 minutes, Emptier interval = 0 minutes.
> {noformat}
> The configuration values for both the deletion interval and the emptier 
> interval are given in minutes, converted to milliseconds, and then logged as 
> milliseconds but with a label of minutes. It looks like this was introduced 
> in HDFS-4903.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5560) Trash configuration log statements prints incorrect units

2013-12-02 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-5560:
--

Assignee: Josh Elser

> Trash configuration log statements prints incorrect units
> -
>
> Key: HDFS-5560
> URL: https://issues.apache.org/jira/browse/HDFS-5560
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Josh Elser
>Assignee: Josh Elser
> Attachments: HDFS-5560.patch
>
>
> I ran `hdfs dfs -expunge` on a 2.2.0 system, and noticed the following 
> message printed out on the console:
> {noformat}
> $ hdfs dfs -expunge
> 13/11/23 22:12:17 INFO fs.TrashPolicyDefault: Namenode trash configuration: 
> Deletion interval = 180 minutes, Emptier interval = 0 minutes.
> {noformat}
> The configuration values for both the deletion interval and the emptier 
> interval are given in minutes, converted to milliseconds, and then logged as 
> milliseconds but with a label of minutes. It looks like this was introduced 
> in HDFS-4903.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5569) WebHDFS should support a deny/allow list for data access

2013-12-02 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837274#comment-13837274
 ] 

Haohui Mai commented on HDFS-5569:
--

bq. Currently WebHDFS only supports Kerberos authentication and does not 
support authorization.

I'm not following what exactly you meant by authorization. Do you mean 
controlling who can access the file using WebHDFS? Applications are supposed to 
specify their policy of authorization (i.e., access controls) by setting HDFS 
users / groups / permissions properly. The access control happens in the 
namenode. WebHDFS is only a gateway to the NN.

> WebHDFS should support a deny/allow list for data access
> 
>
> Key: HDFS-5569
> URL: https://issues.apache.org/jira/browse/HDFS-5569
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Reporter: Adam Faris
>  Labels: features
>
> Currently we can't restrict what networks are allowed to transfer data using 
> WebHDFS.  Obviously we can use firewalls to block ports, but this can be 
> complicated and problematic to maintain.  Additionally, because all the jetty 
> servlets run inside the same container, blocking access to jetty to prevent 
> WebHDFS transfers also blocks the other servlets running inside that same 
> jetty container.
> I am requesting a deny/allow feature be added to WebHDFS.  This is already 
> done with the Apache HTTPD server, and is what I'd like to see the deny/allow 
> list modeled after.   Thanks.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5536) Implement HTTP policy for Namenode and DataNode

2013-12-02 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837267#comment-13837267
 ] 

Jing Zhao commented on HDFS-5536:
-

The patch looks great to me. Some comments (most of them are minor):
# Shall we deprecate the configuration "hadoop.ssl.enabled"?
# Could you explain why here we need to create an HdfsConfiguration instead of
  the original Configuration (in both DN and NN)?
{code}
+  Configuration sslConf = new HdfsConfiguration(false);
   sslConf.addResource(conf.get(
   DFSConfigKeys.DFS_SERVER_HTTPS_KEYSTORE_RESOURCE_KEY,
   DFSConfigKeys.DFS_SERVER_HTTPS_KEYSTORE_RESOURCE_DEFAULT));
{code}
# It will be better to state the same meaning in the javadoc of 
SecureDataNodeStarter:
{code}
+// Obtain a listener for web server. The code intends to bind HTTP server to
+// privileged port only, as the client can authenticate the server using
+// certificates if they are communicating through SSL.
{code}
# It will be better to have some unit tests to test the mapping between
  different policies and http server connectors in NN/DN.
# Both NameNode.java and DataNode.java have a lot of changes in the import
  section. Let's still use * in the import to avoid these changes.
# HttpServer.java only has a change with an empty line.
# Is the following document correct?
{code}
+*-+-++
+| <<>> |  or 
{code}

> Implement HTTP policy for Namenode and DataNode
> ---
>
> Key: HDFS-5536
> URL: https://issues.apache.org/jira/browse/HDFS-5536
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-5536.000.patch, HDFS-5536.001.patch, 
> HDFS-5536.002.patch, HDFS-5536.003.patch, HDFS-5536.004.patch, 
> HDFS-5536.005.patch, HDFS-5536.006.patch
>
>
> This jira implements the HTTP and HTTPS policy in the namenode and the 
> datanode.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5569) WebHDFS should support a deny/allow list for data access

2013-12-02 Thread Travis Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837260#comment-13837260
 ] 

Travis Thompson commented on HDFS-5569:
---

[~cmccabe] I'm not sure how adding a new IP address would bypass this. As the 
client I can add whatever IP address I want, but if it's not routable it won't 
work... and the filtering would be done on the server side, so it would be 
based on the source IP... so having an IP/host based filter would be useful.

Also on the Kerberos note, there is NO authorization in Kerberos, only 
authentication. Kerberos only tells you who you are, not what you can do; it's 
up to the application layer to decide what to do with that information.

> WebHDFS should support a deny/allow list for data access
> 
>
> Key: HDFS-5569
> URL: https://issues.apache.org/jira/browse/HDFS-5569
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Reporter: Adam Faris
>  Labels: features
>
> Currently we can't restrict what networks are allowed to transfer data using 
> WebHDFS.  Obviously we can use firewalls to block ports, but this can be 
> complicated and problematic to maintain.  Additionally, because all the jetty 
> servlets run inside the same container, blocking access to jetty to prevent 
> WebHDFS transfers also blocks the other servlets running inside that same 
> jetty container.
> I am requesting a deny/allow feature be added to WebHDFS.  This is already 
> done with the Apache HTTPD server, and is what I'd like to see the deny/allow 
> list modeled after.   Thanks.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5569) WebHDFS should support a deny/allow list for data access

2013-12-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837242#comment-13837242
 ] 

Colin Patrick McCabe commented on HDFS-5569:


It's not difficult to masquerade as a different IP address.  You can do it by 
typing this command from your command-line:

{code}
sudo /sbin/ifconfig eth0 
{code}

Doing so does not bypass existing controls because Kerberos can't be tricked 
just by changing your IP.

bq. Currently WebHDFS only supports Kerberos authentication and does not 
support authorization.

That sounds like something we should fix within webhdfs.  Maybe someone more 
familiar with webhdfs can comment?

> WebHDFS should support a deny/allow list for data access
> 
>
> Key: HDFS-5569
> URL: https://issues.apache.org/jira/browse/HDFS-5569
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Reporter: Adam Faris
>  Labels: features
>
> Currently we can't restrict what networks are allowed to transfer data using 
> WebHDFS.  Obviously we can use firewalls to block ports, but this can be 
> complicated and problematic to maintain.  Additionally, because all the jetty 
> servlets run inside the same container, blocking access to jetty to prevent 
> WebHDFS transfers also blocks the other servlets running inside that same 
> jetty container.
> I am requesting a deny/allow feature be added to WebHDFS.  This is already 
> done with the Apache HTTPD server, and is what I'd like to see the deny/allow 
> list modeled after.   Thanks.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5569) WebHDFS should support a deny/allow list for data access

2013-12-02 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837216#comment-13837216
 ] 

Haohui Mai commented on HDFS-5569:
--

[~farisa], what about putting an http proxy (e.g., nginx) over the namenode / 
datanode http server? You can deploy path-based filtering pretty easily.

It seems to me that the Knox project is trying to solve the same problem, so it 
might be worthwhile to check it out.

> WebHDFS should support a deny/allow list for data access
> 
>
> Key: HDFS-5569
> URL: https://issues.apache.org/jira/browse/HDFS-5569
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Reporter: Adam Faris
>  Labels: features
>
> Currently we can't restrict what networks are allowed to transfer data using 
> WebHDFS.  Obviously we can use firewalls to block ports, but this can be 
> complicated and problematic to maintain.  Additionally, because all the jetty 
> servlets run inside the same container, blocking access to jetty to prevent 
> WebHDFS transfers also blocks the other servlets running inside that same 
> jetty container.
> I am requesting a deny/allow feature be added to WebHDFS.  This is already 
> done with the Apache HTTPD server, and is what I'd like to see the deny/allow 
> list modeled after.   Thanks.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5581) NameNodeFsck should use only one instance of BlockPlacementPolicy

2013-12-02 Thread Vinay (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837205#comment-13837205
 ] 

Vinay commented on HDFS-5581:
-

Thanks Colin.

> NameNodeFsck should use only one instance of BlockPlacementPolicy
> -
>
> Key: HDFS-5581
> URL: https://issues.apache.org/jira/browse/HDFS-5581
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Vinay
>Assignee: Vinay
> Fix For: 2.4.0
>
> Attachments: HDFS-5581.patch, HDFS-5581.patch
>
>
> While going through NameNodeFsck I found that the following code creates a 
> new instance of BlockPlacementPolicy for every block.
> {code}  // verify block placement policy
>   BlockPlacementStatus blockPlacementStatus = 
>   BlockPlacementPolicy.getInstance(conf, null, networktopology).
>   verifyBlockPlacement(path, lBlk, targetFileReplication);{code}
> It would be better to use the namenode's BPP itself instead of creating a new 
> one.
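A hedged sketch of the suggested change, mirroring the snippet above (field 
names are illustrative, and this is not the committed patch): obtain one 
BlockPlacementPolicy up front and reuse it for every block, instead of calling 
getInstance inside the per-block loop.
{code}
// Created once, e.g. in NamenodeFsck's constructor (or taken from the
// namenode's BlockManager):
private final BlockPlacementPolicy bpPolicy =
    BlockPlacementPolicy.getInstance(conf, null, networktopology);

// Reused when each block is verified:
BlockPlacementStatus blockPlacementStatus =
    bpPolicy.verifyBlockPlacement(path, lBlk, targetFileReplication);
{code}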



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5590) Block ID and generation stamp may be reused when persistBlocks is set to false

2013-12-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837193#comment-13837193
 ] 

Hadoop QA commented on HDFS-5590:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12616626/HDFS-5590.001.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5616//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5616//console

This message is automatically generated.

> Block ID and generation stamp may be reused when persistBlocks is set to false
> --
>
> Key: HDFS-5590
> URL: https://issues.apache.org/jira/browse/HDFS-5590
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-5590.000.patch, HDFS-5590.001.patch
>
>
> In a cluster with a non-HA setup and dfs.persist.blocks set to false, we may 
> have data loss in the following case:
> # client creates file1, requests a block from the NN, and gets blk_id1_gs1
> # client writes blk_id1_gs1 to a DN
> # NN is restarted and, because persistBlocks is false, blk_id1_gs1 may not be 
> persisted on disk
> # another client creates file2 and the NN will allocate a new block using the 
> same block id blk_id1_gs1, since the block ID and generation stamp are both 
> increased sequentially.
> Now we may have two versions (file1 and file2) of blk_id1_gs1 (same id, 
> same gs) in the system. It will cause data loss.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5555) CacheAdmin commands fail when first listed NameNode is in Standby

2013-12-02 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837177#comment-13837177
 ] 

Jing Zhao commented on HDFS-:
-

bq. The other fix is to make sure the iterator supports failover as well.

Agree. Currently the cache pool iterator is defined within 
ClientNamenodeProtocolTranslatorPB and it is always associated with the 
corresponding rpcProxy. Thus it cannot support failover. We may want to define 
the iterator inside DFSClient instead, where DFSClient#namenode supports 
failover in HA setup.

> CacheAdmin commands fail when first listed NameNode is in Standby
> -
>
> Key: HDFS-
> URL: https://issues.apache.org/jira/browse/HDFS-
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: caching
>Affects Versions: 3.0.0
>Reporter: Stephen Chu
>Assignee: Jimmy Xiang
>
> I am on a HA-enabled cluster. The NameNodes are on host-1 and host-2.
> In the configurations, we specify the host-1 NN first and the host-2 NN 
> afterwards in the _dfs.ha.namenodes.ns1_ property (where _ns1_ is the name of 
> the nameservice).
> If the host-1 NN is Standby and the host-2 NN is Active, some CacheAdmins 
> will fail complaining about operation not supported in standby state.
> e.g.
> {code}
> bash-4.1$ hdfs cacheadmin -removeDirectives -path /user/hdfs2
> Exception in thread "main" 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1501)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1082)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.listCacheDirectives(FSNamesystem.java:6892)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer$ServerSideCacheEntriesIterator.makeRequest(NameNodeRpcServer.java:1263)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer$ServerSideCacheEntriesIterator.makeRequest(NameNodeRpcServer.java:1249)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.makeRequest(BatchedRemoteIterator.java:77)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.makeRequestIfNeeded(BatchedRemoteIterator.java:85)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.hasNext(BatchedRemoteIterator.java:99)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.listCacheDirectives(ClientNamenodeProtocolServerSideTranslatorPB.java:1087)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1499)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1348)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1301)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   at com.sun.proxy.$Proxy9.listCacheDirectives(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB$CacheEntriesIterator.makeRequest(ClientNamenodeProtocolTranslatorPB.java:1079)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB$CacheEntriesIterator.makeRequest(ClientNamenodeProtocolTranslatorPB.java:1064)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.makeRequest(BatchedRemoteIterator.java:77)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.makeRequestIfNeeded(BatchedRemoteIterator.java:85)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.hasNext(BatchedRemoteIterator.java:99)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$32.hasNext(DistributedFileSystem.java:1704)
>   at 
> org.apache.hadoop.hdfs.tools.CacheAdmin$RemoveCacheDirectiveInfosCommand.run(CacheAdmin.java:372)
>   at org.apache.hadoop.hdfs.tools.CacheAdmin.run(CacheAdmin.java:84)
>   at org.apache.hadoop.hdfs.tools.CacheAdmin.main(CacheAdmin.java:89)
> {code}
> After manual

[jira] [Commented] (HDFS-5554) Add Snapshot Feature to INodeFile

2013-12-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837175#comment-13837175
 ] 

Hadoop QA commented on HDFS-5554:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12616624/HDFS-5554.002.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5615//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5615//console

This message is automatically generated.

> Add Snapshot Feature to INodeFile
> -
>
> Key: HDFS-5554
> URL: https://issues.apache.org/jira/browse/HDFS-5554
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-5554.001.patch, HDFS-5554.002.patch
>
>
> Similar with HDFS-5285, we can add a FileWithSnapshot feature to INodeFile 
> and use it to replace the current INodeFileWithSnapshot.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5590) Block ID and generation stamp may be reused when persistBlocks is set to false

2013-12-02 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837170#comment-13837170
 ] 

Arpit Agarwal commented on HDFS-5590:
-

+1 for the new patch. Both Jenkins warnings look bogus. Thanks for reporting 
and fixing this!

> Block ID and generation stamp may be reused when persistBlocks is set to false
> --
>
> Key: HDFS-5590
> URL: https://issues.apache.org/jira/browse/HDFS-5590
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-5590.000.patch, HDFS-5590.001.patch
>
>
> In a cluster with a non-HA setup and dfs.persist.blocks set to false, we may 
> have data loss in the following case:
> # client creates file1, requests a block from the NN, and gets blk_id1_gs1
> # client writes blk_id1_gs1 to a DN
> # NN is restarted and, because persistBlocks is false, blk_id1_gs1 may not be 
> persisted on disk
> # another client creates file2 and the NN will allocate a new block using the 
> same block id blk_id1_gs1, since the block ID and generation stamp are both 
> increased sequentially.
> Now we may have two versions (file1 and file2) of blk_id1_gs1 (same id, 
> same gs) in the system. It will cause data loss.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS

2013-12-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837164#comment-13837164
 ] 

Hadoop QA commented on HDFS-2832:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12616642/20131202-HeterogeneousStorage-TestPlan.pdf
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5618//console

This message is automatically generated.

> Enable support for heterogeneous storages in HDFS
> -
>
> Key: HDFS-2832
> URL: https://issues.apache.org/jira/browse/HDFS-2832
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: 20130813-HeterogeneousStorage.pdf, 
> 20131125-HeterogeneousStorage-TestPlan.pdf, 
> 20131125-HeterogeneousStorage.pdf, 
> 20131202-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, 
> editsStored, h2832_20131023.patch, h2832_20131023b.patch, 
> h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, 
> h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, 
> h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, 
> h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, 
> h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, 
> h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, 
> h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, 
> h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch
>
>
> HDFS currently supports a configuration where storages are a list of 
> directories. Typically each of these directories corresponds to a volume with 
> its own file system. All these directories are homogeneous and therefore 
> identified as a single storage at the namenode. I propose changing the current 
> model, where a Datanode *is a* storage, to one where a Datanode *is a 
> collection* of storages. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5569) WebHDFS should support a deny/allow list for data access

2013-12-02 Thread Adam Faris (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837161#comment-13837161
 ] 

Adam Faris commented on HDFS-5569:
--

The two alternatives both have problems, and I'll start with iptables.  As 
already stated in the description, iptables will either block or expose the 
entire jetty container.  In short, iptables is not smart enough to allow access 
to "/browseDirectory.jsp" while blocking "/webhdfs/v1".

Regarding Kerberos, Kerberos is only authentication and does not handle 
authorization.  When supported by an application, Kerberos will verify who the 
user accessing the application is, but it's up to the application to decide if 
access should be allowed.   Currently WebHDFS only supports Kerberos 
authentication and does not support authorization.

For the concerns about faked IPs, the allow/deny list in httpd works against 
both IP ranges and hostname matches.  If someone were able to masquerade as an 
IP within the allowed range, they would also be able to bypass iptables.  As 
it's difficult to masquerade as a different IP, and doing so would bypass the 
existing controls anyway, faked IPs should not be considered a blocker for this 
request.  

Reverse hostname lookups shouldn't be a performance concern.  A locally running 
caching name server would help with lookup response times, and one could always 
configure the JVM security manager to cache hostname/IP mappings indefinitely 
(a sketch follows below).  Both the namenode and jobtracker do reverse lookups 
and this isn't a problem for either.
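For reference, a hedged example of pinning the JVM's InetAddress cache (these 
are standard Java security properties; they can also be set in the 
java.security file rather than in code):
{code}
// -1 means "cache successful name lookups forever"; negative (failed) lookups
// are controlled separately.
java.security.Security.setProperty("networkaddress.cache.ttl", "-1");
java.security.Security.setProperty("networkaddress.cache.negative.ttl", "10");
{code}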

Thanks for taking the time to investigate this request and not closing it as 
'wontfix'.  

> WebHDFS should support a deny/allow list for data access
> 
>
> Key: HDFS-5569
> URL: https://issues.apache.org/jira/browse/HDFS-5569
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Reporter: Adam Faris
>  Labels: features
>
> Currently we can't restrict what networks are allowed to transfer data using 
> WebHDFS.  Obviously we can use firewalls to block ports, but this can be 
> complicated and problematic to maintain.  Additionally, because all the jetty 
> servlets run inside the same container, blocking access to jetty to prevent 
> WebHDFS transfers also blocks the other servlets running inside that same 
> jetty container.
> I am requesting a deny/allow feature be added to WebHDFS.  This is already 
> done with the Apache HTTPD server, and is what I'd like to see the deny/allow 
> list modeled after.   Thanks.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-4983) Numeric usernames do not work with WebHDFS FS

2013-12-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837160#comment-13837160
 ] 

Hadoop QA commented on HDFS-4983:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12616621/HDFS-4983.001.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5613//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5613//console

This message is automatically generated.

> Numeric usernames do not work with WebHDFS FS
> -
>
> Key: HDFS-4983
> URL: https://issues.apache.org/jira/browse/HDFS-4983
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Yongjun Zhang
>  Labels: patch
> Attachments: HDFS-4983.001.patch
>
>
> Per the file 
> hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/UserParam.java,
>  the DOMAIN pattern is set to: {{^[A-Za-z_][A-Za-z0-9._-]*[$]?$}}.
> Given this, using a username such as "123" seems to fail for some reason 
> (tried on insecure setup):
> {code}
> [123@host-1 ~]$ whoami
> 123
> [123@host-1 ~]$ hadoop fs -fs webhdfs://host-2.domain.com -ls /
> -ls: Invalid value: "123" does not belong to the domain 
> ^[A-Za-z_][A-Za-z0-9._-]*[$]?$
> Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [<path> ...]
> {code}
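A small standalone check of the pattern (the relaxed variant below is only an 
illustration of what a configurable alternative might allow, not the actual 
fix in the attached patch):
{code}
import java.util.regex.Pattern;

public class UserParamPatternDemo {
  public static void main(String[] args) {
    Pattern current = Pattern.compile("^[A-Za-z_][A-Za-z0-9._-]*[$]?$");
    Pattern relaxed = Pattern.compile("^[A-Za-z0-9_][A-Za-z0-9._-]*[$]?$");
    System.out.println(current.matcher("123").matches()); // false -> "123" is rejected
    System.out.println(relaxed.matcher("123").matches()); // true  -> numeric names accepted
  }
}
{code}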



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5590) Block ID and generation stamp may be reused when persistBlocks is set to false

2013-12-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837154#comment-13837154
 ] 

Hadoop QA commented on HDFS-5590:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12616620/HDFS-5590.000.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:red}-1 eclipse:eclipse{color}.  The patch failed to build with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 1 
release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5614//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5614//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5614//console

This message is automatically generated.

> Block ID and generation stamp may be reused when persistBlocks is set to false
> --
>
> Key: HDFS-5590
> URL: https://issues.apache.org/jira/browse/HDFS-5590
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-5590.000.patch, HDFS-5590.001.patch
>
>
> In a cluster with a non-HA setup and dfs.persist.blocks set to false, we may 
> have data loss in the following case:
> # client creates file1, requests a block from the NN, and gets blk_id1_gs1
> # client writes blk_id1_gs1 to a DN
> # NN is restarted and, because persistBlocks is false, blk_id1_gs1 may not be 
> persisted on disk
> # another client creates file2 and the NN will allocate a new block using the 
> same block id blk_id1_gs1, since the block ID and generation stamp are both 
> increased sequentially.
> Now we may have two versions (file1 and file2) of blk_id1_gs1 (same id, 
> same gs) in the system. It will cause data loss.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-2832) Enable support for heterogeneous storages in HDFS

2013-12-02 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-2832:


Attachment: 20131202-HeterogeneousStorage-TestPlan.pdf

> Enable support for heterogeneous storages in HDFS
> -
>
> Key: HDFS-2832
> URL: https://issues.apache.org/jira/browse/HDFS-2832
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: 20130813-HeterogeneousStorage.pdf, 
> 20131125-HeterogeneousStorage-TestPlan.pdf, 
> 20131125-HeterogeneousStorage.pdf, 
> 20131202-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, 
> editsStored, h2832_20131023.patch, h2832_20131023b.patch, 
> h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, 
> h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, 
> h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, 
> h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, 
> h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, 
> h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, 
> h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, 
> h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch
>
>
> HDFS currently supports a configuration where storages are a list of 
> directories. Typically each of these directories corresponds to a volume with 
> its own file system. All these directories are homogeneous and therefore 
> identified as a single storage at the namenode. I propose changing the current 
> model, where a Datanode *is a* storage, to one where a Datanode *is a 
> collection* of storages. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5555) CacheAdmin commands fail when first listed NameNode is in Standby

2013-12-02 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837092#comment-13837092
 ] 

Jimmy Xiang commented on HDFS-:
---

DFSClient uses RetryInvocationHandler in this case, while the iterator uses 
ProtobufRpcEngine#Invoker, which doesn't handle failover. One fix is to throw 
StandbyException at the beginning instead of returning an iterator. The other 
fix is to make sure the iterator supports failover as well.  If we only throw 
StandbyException at the beginning, the issue will come up again when iterating 
the iterator while the NN fails over in the middle.

> CacheAdmin commands fail when first listed NameNode is in Standby
> -
>
> Key: HDFS-
> URL: https://issues.apache.org/jira/browse/HDFS-
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: caching
>Affects Versions: 3.0.0
>Reporter: Stephen Chu
>Assignee: Jimmy Xiang
>
> I am on a HA-enabled cluster. The NameNodes are on host-1 and host-2.
> In the configurations, we specify the host-1 NN first and the host-2 NN 
> afterwards in the _dfs.ha.namenodes.ns1_ property (where _ns1_ is the name of 
> the nameservice).
> If the host-1 NN is Standby and the host-2 NN is Active, some CacheAdmins 
> will fail complaining about operation not supported in standby state.
> e.g.
> {code}
> bash-4.1$ hdfs cacheadmin -removeDirectives -path /user/hdfs2
> Exception in thread "main" 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1501)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1082)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.listCacheDirectives(FSNamesystem.java:6892)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer$ServerSideCacheEntriesIterator.makeRequest(NameNodeRpcServer.java:1263)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer$ServerSideCacheEntriesIterator.makeRequest(NameNodeRpcServer.java:1249)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.makeRequest(BatchedRemoteIterator.java:77)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.makeRequestIfNeeded(BatchedRemoteIterator.java:85)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.hasNext(BatchedRemoteIterator.java:99)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.listCacheDirectives(ClientNamenodeProtocolServerSideTranslatorPB.java:1087)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1499)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1348)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1301)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   at com.sun.proxy.$Proxy9.listCacheDirectives(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB$CacheEntriesIterator.makeRequest(ClientNamenodeProtocolTranslatorPB.java:1079)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB$CacheEntriesIterator.makeRequest(ClientNamenodeProtocolTranslatorPB.java:1064)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.makeRequest(BatchedRemoteIterator.java:77)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.makeRequestIfNeeded(BatchedRemoteIterator.java:85)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.hasNext(BatchedRemoteIterator.java:99)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$32.hasNext(DistributedFileSystem.java:1704)
>   at 
> org.apache.hadoop.hdfs.tools.CacheAdmin$RemoveCacheDirectiveInfosCommand.run(CacheAdmin.java:372)
>   at org.apache.hadoop.hdfs.tools.CacheAdmin.run(CacheAdmin.java:84)
>   at org.apache.hadoop.hdfs.tools.Ca

[jira] [Updated] (HDFS-5570) Deprecate hftp / hsftp and replace them with webhdfs / swebhdfs

2013-12-02 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-5570:
-

Status: Patch Available  (was: Open)

> Deprecate hftp / hsftp and replace them with webhdfs / swebhdfs
> ---
>
> Key: HDFS-5570
> URL: https://issues.apache.org/jira/browse/HDFS-5570
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-5570.000.patch
>
>
> Currently hftp / hsftp only provide a strict subset of functionality that 
> webhdfs / swebhdfs offer. Notably, hftp / hsftp do not support writes or HA 
> namenodes. Maintaining two pieces of code with similar functionality introduces 
> unnecessary work.
> Webhdfs has been around since Hadoop 1.0, so moving forward with 
> webhdfs does not seem to cause any significant migration issues.
> This jira proposes to deprecate hftp / hsftp in branch-2 and remove them in 
> trunk.
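
As a purely illustrative sketch of the migration (not from the patch): reading 
a file over webhdfs instead of hftp only changes the URI scheme; the host, 
port and path below are placeholders, and swebhdfs:// would be used where 
hsftp was used before.
{code}
import java.io.InputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WebHdfsReadExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Previously: hftp://namenode:50070
    FileSystem fs = FileSystem.get(URI.create("webhdfs://namenode:50070"), conf);
    try (InputStream in = fs.open(new Path("/user/example/file.txt"))) {
      System.out.println("first byte: " + in.read());
    }
  }
}
{code}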



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5570) Deprecate hftp / hsftp and replace them with webhdfs / swebhdfs

2013-12-02 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-5570:
-

Attachment: HDFS-5570.000.patch

> Deprecate hftp / hsftp and replace them with webhdfs / swebhdfs
> ---
>
> Key: HDFS-5570
> URL: https://issues.apache.org/jira/browse/HDFS-5570
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Attachments: HDFS-5570.000.patch
>
>
> Currently hftp / hsftp only provide a strict subset of functionality that 
> webhdfs / swebhdfs offer. Notably, hftp / hsftp do not support writes or HA 
> namenodes. Maintaining two pieces of code with similar functionality introduces 
> unnecessary work.
> Webhdfs has been around since Hadoop 1.0, so moving forward with 
> webhdfs does not seem to cause any significant migration issues.
> This jira proposes to deprecate hftp / hsftp in branch-2 and remove them in 
> trunk.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5587) add debug information when NFS fails to start with duplicate user or group names

2013-12-02 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837056#comment-13837056
 ] 

Brandon Li commented on HDFS-5587:
--

With a duplicated name or id, the NFS gateway will print the following error 
message explaining the cause and the steps to find the duplicates:

{noformat}
NFS gateway can't start with duplicate name or id on the host system.
This is because HDFS (non-kerberos cluster) uses name as the only way to 
identify a user or group.
The host system with duplicated user/group name or id might work fine most of 
the time by itself.
However when NFS gateway talks to HDFS, HDFS accepts only user and group name.
Therefore, same name means the same user or same group. To find the duplicated 
names/ids, one can do:
 and  on Linux 
systems,
 and  on 
MacOS.
{noformat}
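
To make the failure mode concrete, here is a minimal illustrative sketch (not 
the attached patch) of how a duplicate can be rejected with a readable 
message, assuming the id<->name mapping is kept in a Guava BiMap as 
IdUserGroup does; the DuplicateNameCheck class and message text are invented 
for this sketch.
{code}
import java.io.IOException;

import com.google.common.collect.BiMap;
import com.google.common.collect.HashBiMap;

public class DuplicateNameCheck {
  static void addEntry(BiMap<Integer, String> map, int id, String name,
      String kind) throws IOException {
    if (map.containsKey(id) || map.containsValue(name)) {
      // Fail with an explanatory message instead of Guava's bare
      // "value already present" IllegalArgumentException.
      throw new IOException("Duplicated " + kind + " entry: id=" + id
          + ", name=" + name + ". The NFS gateway cannot start until the "
          + "duplicate is removed from the host system.");
    }
    map.put(id, name);
  }

  public static void main(String[] args) throws IOException {
    BiMap<Integer, String> users = HashBiMap.create();
    addEntry(users, 1001, "alice", "user");
    addEntry(users, 1002, "alice", "user"); // throws IOException with the hint
  }
}
{code}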

> add debug information when NFS fails to start with duplicate user or group 
> names
> 
>
> Key: HDFS-5587
> URL: https://issues.apache.org/jira/browse/HDFS-5587
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: nfs
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Minor
> Attachments: HDFS-5587.001.patch
>
>
> When the host provides duplicate user or group names, NFS will not start and 
> print errors like the following:
> {noformat}
> ... ... 
> 13/11/25 18:11:52 INFO nfs3.Nfs3Base: registered UNIX signal handlers for 
> [TERM, HUP, INT]
> Exception in thread "main" java.lang.IllegalArgumentException: value already 
> present: s-iss
> at com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
> at 
> com.google.common.collect.AbstractBiMap.putInBothMaps(AbstractBiMap.java:112)
> at com.google.common.collect.AbstractBiMap.put(AbstractBiMap.java:96)
> at com.google.common.collect.HashBiMap.put(HashBiMap.java:85)
> at 
> org.apache.hadoop.nfs.nfs3.IdUserGroup.updateMapInternal(IdUserGroup.java:85)
> at org.apache.hadoop.nfs.nfs3.IdUserGroup.updateMaps(IdUserGroup.java:110)
> at org.apache.hadoop.nfs.nfs3.IdUserGroup.(IdUserGroup.java:54)
> at 
> org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.(RpcProgramNfs3.java:172)
> at 
> org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.(RpcProgramNfs3.java:164)
> at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.(Nfs3.java:41)
> at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:52)
> 13/11/25 18:11:54 INFO nfs3.Nfs3Base: SHUTDOWN_MSG:
> ... ...
> {noformat}
> The reason NFS should not start is that HDFS (in a non-kerberos cluster) uses 
> the name as the only way to identify a user. Some Linux boxes can have two 
> users with the same name but different user IDs. Linux might be able to work 
> fine with that most of the time. However, when the NFS gateway talks to HDFS, 
> HDFS accepts only the user name. That is, from HDFS' point of view, these two 
> different users are the same user even though they are different on the Linux 
> box.
> The duplicate names on Linux systems sometimes exist because of legacy 
> system configurations or combined name services.
> Regardless, the NFS gateway should print some help information so the user can 
> understand the error and remove the duplicated names before restarting NFS.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5555) CacheAdmin commands fail when first listed NameNode is in Standby

2013-12-02 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837055#comment-13837055
 ] 

Jimmy Xiang commented on HDFS-5555:
---

It looks to me that, at first, a RemoteIterator is returned from the standby NN. 
When trying to iterate the remote iterator, we get the StandbyException.

> CacheAdmin commands fail when first listed NameNode is in Standby
> -
>
> Key: HDFS-5555
> URL: https://issues.apache.org/jira/browse/HDFS-5555
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: caching
>Affects Versions: 3.0.0
>Reporter: Stephen Chu
>Assignee: Jimmy Xiang
>
> I am on an HA-enabled cluster. The NameNodes are on host-1 and host-2.
> In the configurations, we specify the host-1 NN first and the host-2 NN 
> afterwards in the _dfs.ha.namenodes.ns1_ property (where _ns1_ is the name of 
> the nameservice).
> If the host-1 NN is Standby and the host-2 NN is Active, some CacheAdmin 
> commands will fail, complaining that the operation is not supported in 
> standby state.
> e.g.
> {code}
> bash-4.1$ hdfs cacheadmin -removeDirectives -path /user/hdfs2
> Exception in thread "main" 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1501)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1082)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.listCacheDirectives(FSNamesystem.java:6892)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer$ServerSideCacheEntriesIterator.makeRequest(NameNodeRpcServer.java:1263)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer$ServerSideCacheEntriesIterator.makeRequest(NameNodeRpcServer.java:1249)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.makeRequest(BatchedRemoteIterator.java:77)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.makeRequestIfNeeded(BatchedRemoteIterator.java:85)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.hasNext(BatchedRemoteIterator.java:99)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.listCacheDirectives(ClientNamenodeProtocolServerSideTranslatorPB.java:1087)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1499)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1348)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1301)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   at com.sun.proxy.$Proxy9.listCacheDirectives(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB$CacheEntriesIterator.makeRequest(ClientNamenodeProtocolTranslatorPB.java:1079)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB$CacheEntriesIterator.makeRequest(ClientNamenodeProtocolTranslatorPB.java:1064)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.makeRequest(BatchedRemoteIterator.java:77)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.makeRequestIfNeeded(BatchedRemoteIterator.java:85)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.hasNext(BatchedRemoteIterator.java:99)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$32.hasNext(DistributedFileSystem.java:1704)
>   at 
> org.apache.hadoop.hdfs.tools.CacheAdmin$RemoveCacheDirectiveInfosCommand.run(CacheAdmin.java:372)
>   at org.apache.hadoop.hdfs.tools.CacheAdmin.run(CacheAdmin.java:84)
>   at org.apache.hadoop.hdfs.tools.CacheAdmin.main(CacheAdmin.java:89)
> {code}
> After manually failing over from host-2 to host-1, the CacheAdmin commands 
> succeed.
> The affected commands are:
> -listPools
> -listDirectives
> -removeDirectives



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5591) Checkpointing should use monotonic time when calculating period

2013-12-02 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-5591:
--

Labels: newbie  (was: )

> Checkpointing should use monotonic time when calculating period
> ---
>
> Key: HDFS-5591
> URL: https://issues.apache.org/jira/browse/HDFS-5591
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.2.0
>Reporter: Andrew Wang
>Priority: Minor
>  Labels: newbie
>
> Both StandbyCheckpointer and SecondaryNameNode use {{Time.now}} rather than 
> {{Time.monotonicNow}} to calculate how long it's been since the last 
> checkpoint. This can lead to issues when the system time is changed.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Assigned] (HDFS-5555) CacheAdmin commands fail when first listed NameNode is in Standby

2013-12-02 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang reassigned HDFS-5555:
-

Assignee: Jimmy Xiang

> CacheAdmin commands fail when first listed NameNode is in Standby
> -
>
> Key: HDFS-5555
> URL: https://issues.apache.org/jira/browse/HDFS-5555
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: caching
>Affects Versions: 3.0.0
>Reporter: Stephen Chu
>Assignee: Jimmy Xiang
>
> I am on an HA-enabled cluster. The NameNodes are on host-1 and host-2.
> In the configurations, we specify the host-1 NN first and the host-2 NN 
> afterwards in the _dfs.ha.namenodes.ns1_ property (where _ns1_ is the name of 
> the nameservice).
> If the host-1 NN is Standby and the host-2 NN is Active, some CacheAdmin 
> commands will fail, complaining that the operation is not supported in 
> standby state.
> e.g.
> {code}
> bash-4.1$ hdfs cacheadmin -removeDirectives -path /user/hdfs2
> Exception in thread "main" 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1501)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1082)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.listCacheDirectives(FSNamesystem.java:6892)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer$ServerSideCacheEntriesIterator.makeRequest(NameNodeRpcServer.java:1263)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer$ServerSideCacheEntriesIterator.makeRequest(NameNodeRpcServer.java:1249)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.makeRequest(BatchedRemoteIterator.java:77)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.makeRequestIfNeeded(BatchedRemoteIterator.java:85)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.hasNext(BatchedRemoteIterator.java:99)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.listCacheDirectives(ClientNamenodeProtocolServerSideTranslatorPB.java:1087)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1499)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1348)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1301)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>   at com.sun.proxy.$Proxy9.listCacheDirectives(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB$CacheEntriesIterator.makeRequest(ClientNamenodeProtocolTranslatorPB.java:1079)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB$CacheEntriesIterator.makeRequest(ClientNamenodeProtocolTranslatorPB.java:1064)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.makeRequest(BatchedRemoteIterator.java:77)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.makeRequestIfNeeded(BatchedRemoteIterator.java:85)
>   at 
> org.apache.hadoop.fs.BatchedRemoteIterator.hasNext(BatchedRemoteIterator.java:99)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$32.hasNext(DistributedFileSystem.java:1704)
>   at 
> org.apache.hadoop.hdfs.tools.CacheAdmin$RemoveCacheDirectiveInfosCommand.run(CacheAdmin.java:372)
>   at org.apache.hadoop.hdfs.tools.CacheAdmin.run(CacheAdmin.java:84)
>   at org.apache.hadoop.hdfs.tools.CacheAdmin.main(CacheAdmin.java:89)
> {code}
> After manually failing over from host-2 to host-1, the CacheAdmin commands 
> succeed.
> The affected commands are:
> -listPools
> -listDirectives
> -removeDirectives



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HDFS-5591) Checkpointing should use monotonic time when calculating period

2013-12-02 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-5591:
-

 Summary: Checkpointing should use monotonic time when calculating 
period
 Key: HDFS-5591
 URL: https://issues.apache.org/jira/browse/HDFS-5591
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.2.0
Reporter: Andrew Wang
Priority: Minor


Both StandbyCheckpointer and SecondaryNameNode use {{Time.now}} rather than 
{{Time.monotonicNow}} to calculate how long it's been since the last 
checkpoint. This can lead to issues when the system time is changed.
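
A minimal sketch of the difference, with invented names (CheckpointTimer is 
not the real class); the point is only that an elapsed-time check based on a 
monotonic clock is immune to system clock changes, whereas one based on 
wall-clock time is not.
{code}
public class CheckpointTimer {
  private long lastCheckpointMs = monotonicNowMs();

  // Monotonic milliseconds, like Time.monotonicNow(): unaffected by clock jumps.
  static long monotonicNowMs() {
    return System.nanoTime() / 1_000_000L;
  }

  boolean shouldCheckpoint(long periodMs) {
    // With System.currentTimeMillis() here, setting the system clock back
    // would delay the next checkpoint arbitrarily; a monotonic clock cannot.
    return monotonicNowMs() - lastCheckpointMs >= periodMs;
  }

  void checkpointDone() {
    lastCheckpointMs = monotonicNowMs();
  }
}
{code}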



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5590) Block ID and generation stamp may be reused when persistBlocks is set to false

2013-12-02 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-5590:


Attachment: HDFS-5590.001.patch

Uploaded a patch that always persists blocks and removes the dfs.persist.blocks 
property.

> Block ID and generation stamp may be reused when persistBlocks is set to false
> --
>
> Key: HDFS-5590
> URL: https://issues.apache.org/jira/browse/HDFS-5590
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-5590.000.patch, HDFS-5590.001.patch
>
>
> In a cluster with non-HA setup and dfs.persist.blocks set to false, we may 
> have data loss in the following case:
> # client creates file1 and requests a block from NN and gets blk_id1_gs1
> # client writes blk_id1_gs1 to DN
> # NN is restarted and because persistBlocks is false, blk_id1_gs1 may not be 
> persisted on disk
> # another client creates file2 and NN will allocate a new block using the 
> same block id blk_id1_gs1 since block ID and generation stamp are both 
> increased sequentially.
> Now we may have two versions (file1 and file2) of the blk_id1_gs1 (same id, 
> same gs) in the system. It will cause data loss.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5590) Block ID and generation stamp may be reused when persistBlocks is set to false

2013-12-02 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837033#comment-13837033
 ] 

Jing Zhao commented on HDFS-5590:
-

Yeah, I think deprecating the property is also fine. Let me upload a patch that 
deprecates the property as well.

> Block ID and generation stamp may be reused when persistBlocks is set to false
> --
>
> Key: HDFS-5590
> URL: https://issues.apache.org/jira/browse/HDFS-5590
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-5590.000.patch
>
>
> In a cluster with non-HA setup and dfs.persist.blocks set to false, we may 
> have data loss in the following case:
> # client creates file1 and requests a block from NN and gets blk_id1_gs1
> # client writes blk_id1_gs1 to DN
> # NN is restarted and because persistBlocks is false, blk_id1_gs1 may not be 
> persisted on disk
> # another client creates file2 and NN will allocate a new block using the 
> same block id blk_id1_gs1 since block ID and generation stamp are both 
> increased sequentially.
> Now we may have two versions (file1 and file2) of the blk_id1_gs1 (same id, 
> same gs) in the system. It will cause data loss.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5554) Add Snapshot Feature to INodeFile

2013-12-02 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-5554:


Attachment: HDFS-5554.002.patch

Fix the failed test.

> Add Snapshot Feature to INodeFile
> -
>
> Key: HDFS-5554
> URL: https://issues.apache.org/jira/browse/HDFS-5554
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-5554.001.patch, HDFS-5554.002.patch
>
>
> Similar to HDFS-5285, we can add a FileWithSnapshot feature to INodeFile 
> and use it to replace the current INodeFileWithSnapshot.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS

2013-12-02 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837024#comment-13837024
 ] 

Arpit Agarwal commented on HDFS-2832:
-

The {{TestOfflineEditsViewer}} failure is expected and can be fixed with the 
attached editsStored binary file.

> Enable support for heterogeneous storages in HDFS
> -
>
> Key: HDFS-2832
> URL: https://issues.apache.org/jira/browse/HDFS-2832
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: 20130813-HeterogeneousStorage.pdf, 
> 20131125-HeterogeneousStorage-TestPlan.pdf, 
> 20131125-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, 
> h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, 
> h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, 
> h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, 
> h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, 
> h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, 
> h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, 
> h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, 
> h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, 
> h2832_20131124.patch, h2832_20131202.patch
>
>
> HDFS currently supports configuration where storages are a list of 
> directories. Typically each of these directories corresponds to a volume with 
> its own file system. All these directories are homogeneous and therefore 
> identified as a single storage at the namenode. I propose changing the 
> current model, where a Datanode *is a* storage, to one where a Datanode *is a 
> collection of* storages. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5558) LeaseManager monitor thread can crash if the last block is complete but another block is not.

2013-12-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837022#comment-13837022
 ] 

Hadoop QA commented on HDFS-5558:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12616609/HDFS-5558.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5612//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5612//console

This message is automatically generated.

> LeaseManager monitor thread can crash if the last block is complete but 
> another block is not.
> -
>
> Key: HDFS-5558
> URL: https://issues.apache.org/jira/browse/HDFS-5558
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.23.9, 2.4.0
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Attachments: HDFS-5558.branch-023.patch, HDFS-5558.branch-023.patch, 
> HDFS-5558.patch, HDFS-5558.patch
>
>
> As mentioned in HDFS-5557, if a file has its last and penultimate block not 
> completed and the file is being closed, the last block may be completed but 
> the penultimate one might not. If this condition lasts long and the file is 
> abandoned, LeaseManager will try to recover the lease and the block. But 
> {{internalReleaseLease()}} will fail with an invalid cast exception for this 
> kind of file.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-4685) Implementation of ACLs in HDFS

2013-12-02 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837009#comment-13837009
 ] 

Chris Nauroth commented on HDFS-4685:
-

I've created a new HDFS-4685 branch in Subversion:

http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-4685/

Later today, I'll start entering sub-tasks for the development activity 
intended for this branch.


> Implementation of ACLs in HDFS
> --
>
> Key: HDFS-4685
> URL: https://issues.apache.org/jira/browse/HDFS-4685
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs-client, namenode, security
>Affects Versions: 1.1.2
>Reporter: Sachin Jose
>Assignee: Chris Nauroth
> Attachments: HDFS-ACLs-Design-1.pdf
>
>
> Currently HDFS doesn't support extended file ACLs. In Unix, extended ACLs can 
> be managed using the getfacl and setfacl utilities. Is there anybody working 
> on this feature?



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5590) Block ID and generation stamp may be reused when persistBlocks is set to false

2013-12-02 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837010#comment-13837010
 ] 

Arpit Agarwal commented on HDFS-5590:
-

Jing - what do you think of deprecating the setting?

> Block ID and generation stamp may be reused when persistBlocks is set to false
> --
>
> Key: HDFS-5590
> URL: https://issues.apache.org/jira/browse/HDFS-5590
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-5590.000.patch
>
>
> In a cluster with non-HA setup and dfs.persist.blocks set to false, we may 
> have data loss in the following case:
> # client creates file1 and requests a block from NN and gets blk_id1_gs1
> # client writes blk_id1_gs1 to DN
> # NN is restarted and because persistBlocks is false, blk_id1_gs1 may not be 
> persisted on disk
> # another client creates file2 and NN will allocate a new block using the 
> same block id blk_id1_gs1 since block ID and generation stamp are both 
> increased sequentially.
> Now we may have two versions (file1 and file2) of the blk_id1_gs1 (same id, 
> same gs) in the system. It will cause data loss.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-4983) Numeric usernames do not work with WebHDFS FS

2013-12-02 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-4983:


Labels: patch  (was: )
Status: Patch Available  (was: Open)

Hello Folks,

I'm new to the apache community and thanks in advance for reviewing the 
attached patch.

Best regards.

--Yongjun



> Numeric usernames do not work with WebHDFS FS
> -
>
> Key: HDFS-4983
> URL: https://issues.apache.org/jira/browse/HDFS-4983
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Yongjun Zhang
>  Labels: patch
> Attachments: HDFS-4983.001.patch
>
>
> Per the file 
> hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/UserParam.java,
>  the DOMAIN pattern is set to: {{^[A-Za-z_][A-Za-z0-9._-]*[$]?$}}.
> Given this, using a username such as "123" seems to fail for some reason 
> (tried on insecure setup):
> {code}
> [123@host-1 ~]$ whoami
> 123
> [123@host-1 ~]$ hadoop fs -fs webhdfs://host-2.domain.com -ls /
> -ls: Invalid value: "123" does not belong to the domain 
> ^[A-Za-z_][A-Za-z0-9._-]*[$]?$
> Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [ ...]
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-4983) Numeric usernames do not work with WebHDFS FS

2013-12-02 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-4983:


Attachment: HDFS-4983.001.patch

This patch implements support for using the key 
"webhdfs.user.provider.user.pattern" to specify the user name pattern, so that 
numeric usernames can be supported.
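
A minimal, self-contained sketch of the idea (not the patch itself), assuming 
the pattern is read from a configuration key such as 
"webhdfs.user.provider.user.pattern"; the validator class name is invented, 
and the default below is the DOMAIN pattern quoted in the description.
{code}
import java.util.regex.Pattern;

public class UserPatternValidator {
  static final String DEFAULT_PATTERN = "^[A-Za-z_][A-Za-z0-9._-]*[$]?$";

  private final Pattern pattern;

  // configuredPattern would come from the configuration key above; null
  // falls back to the current hard-coded default.
  public UserPatternValidator(String configuredPattern) {
    this.pattern = Pattern.compile(
        configuredPattern != null ? configuredPattern : DEFAULT_PATTERN);
  }

  public boolean isValid(String userName) {
    return pattern.matcher(userName).matches();
  }

  public static void main(String[] args) {
    // A site that needs numeric usernames could configure a wider pattern:
    UserPatternValidator relaxed =
        new UserPatternValidator("^[A-Za-z0-9_][A-Za-z0-9._-]*[$]?$");
    System.out.println(relaxed.isValid("123"));                        // true
    System.out.println(new UserPatternValidator(null).isValid("123")); // false
  }
}
{code}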

> Numeric usernames do not work with WebHDFS FS
> -
>
> Key: HDFS-4983
> URL: https://issues.apache.org/jira/browse/HDFS-4983
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Yongjun Zhang
> Attachments: HDFS-4983.001.patch
>
>
> Per the file 
> hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/UserParam.java,
>  the DOMAIN pattern is set to: {{^[A-Za-z_][A-Za-z0-9._-]*[$]?$}}.
> Given this, using a username such as "123" seems to fail for some reason 
> (tried on insecure setup):
> {code}
> [123@host-1 ~]$ whoami
> 123
> [123@host-1 ~]$ hadoop fs -fs webhdfs://host-2.domain.com -ls /
> -ls: Invalid value: "123" does not belong to the domain 
> ^[A-Za-z_][A-Za-z0-9._-]*[$]?$
> Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [ ...]
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5590) Block ID and generation stamp may be reused when persistBlocks is set to false

2013-12-02 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-5590:


Status: Patch Available  (was: Open)

> Block ID and generation stamp may be reused when persistBlocks is set to false
> --
>
> Key: HDFS-5590
> URL: https://issues.apache.org/jira/browse/HDFS-5590
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-5590.000.patch
>
>
> In a cluster with non-HA setup and dfs.persist.blocks set to false, we may 
> have data loss in the following case:
> # client creates file1 and requests a block from NN and gets blk_id1_gs1
> # client writes blk_id1_gs1 to DN
> # NN is restarted and because persistBlocks is false, blk_id1_gs1 may not be 
> persisted on disk
> # another client creates file2 and NN will allocate a new block using the 
> same block id blk_id1_gs1 since block ID and generation stamp are both 
> increased sequentially.
> Now we may have two versions (file1 and file2) of the blk_id1_gs1 (same id, 
> same gs) in the system. It will cause data loss.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5590) Block ID and generation stamp may be reused when persistBlocks is set to false

2013-12-02 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-5590:


Attachment: HDFS-5590.000.patch

Thanks for the comments, Suresh and Arpit! In the current patch I just change 
the default value for block persistence to true. Since this configuration 
property is currently not exposed in our documentation, I just update its 
javadoc in DFSConfigKeys.

> Block ID and generation stamp may be reused when persistBlocks is set to false
> --
>
> Key: HDFS-5590
> URL: https://issues.apache.org/jira/browse/HDFS-5590
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-5590.000.patch
>
>
> In a cluster with non-HA setup and dfs.persist.blocks set to false, we may 
> have data loss in the following case:
> # client creates file1 and requests a block from NN and gets blk_id1_gs1
> # client writes blk_id1_gs1 to DN
> # NN is restarted and because persistBlocks is false, blk_id1_gs1 may not be 
> persisted on disk
> # another client creates file2 and NN will allocate a new block using the 
> same block id blk_id1_gs1 since block ID and generation stamp are both 
> increased sequentially.
> Now we may have two versions (file1 and file2) of the blk_id1_gs1 (same id, 
> same gs) in the system. It will cause data loss.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5590) Block ID and generation stamp may be reused when persistBlocks is set to false

2013-12-02 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-5590:


Description: 
In a cluster with non-HA setup and dfs.persist.blocks set to false, we may have 
data loss in the following case:

# client creates file1 and requests a block from NN and gets blk_id1_gs1
# client writes blk_id1_gs1 to DN
# NN is restarted and because persistBlocks is false, blk_id1_gs1 may not be 
persisted on disk
# another client creates file2 and NN will allocate a new block using the same 
block id blk_id1_gs1 since block ID and generation stamp are both increased 
sequentially.

Now we may have two versions (file1 and file2) of the blk_id1_gs1 (same id, 
same gs) in the system. It will cause data loss.

  was:
In a cluster with non-HA setup and dfs.persist.blocks set to false, the current 
sequential block ID mechanism may cause data loss in the following case:

# client creates file1 and requests a block from NN and gets blk_id1_gs1
# client writes blk_id1_gs1 to DN
# NN is restarted and because persistBlocks is false, blk_id1_gs1 may not be 
persisted on disk
# another client creates file2 and NN will allocate a new block using the same 
block id blk_id1_gs1 since block ID and generation stamp are both increased 
sequentially.

Now we may have two versions (file1 and file2) of the blk_id1_gs1 (same id, 
same gs) in the system. It will cause data loss.


> Block ID and generation stamp may be reused when persistBlocks is set to false
> --
>
> Key: HDFS-5590
> URL: https://issues.apache.org/jira/browse/HDFS-5590
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
>
> In a cluster with non-HA setup and dfs.persist.blocks set to false, we may 
> have data loss in the following case:
> # client creates file1 and requests a block from NN and gets blk_id1_gs1
> # client writes blk_id1_gs1 to DN
> # NN is restarted and because persistBlocks is false, blk_id1_gs1 may not be 
> persisted on disk
> # another client creates file2 and NN will allocate a new block using the 
> same block id blk_id1_gs1 since block ID and generation stamp are both 
> increased sequentially.
> Now we may have two versions (file1 and file2) of the blk_id1_gs1 (same id, 
> same gs) in the system. It will cause data loss.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5590) Block ID and generation stamp may be reused when persistBlocks is set to false

2013-12-02 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-5590:


Summary: Block ID and generation stamp may be reused when persistBlocks is 
set to false  (was: Sequential block ID may cause data loss when persistBlocks 
is set to false)

> Block ID and generation stamp may be reused when persistBlocks is set to false
> --
>
> Key: HDFS-5590
> URL: https://issues.apache.org/jira/browse/HDFS-5590
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
>
> In a cluster with non-HA setup and dfs.persist.blocks set to false, the 
> current sequential block ID mechanism may cause data loss in the following 
> case:
> # client creates file1 and requests a block from NN and gets blk_id1_gs1
> # client writes blk_id1_gs1 to DN
> # NN is restarted and because persistBlocks is false, blk_id1_gs1 may not be 
> persisted on disk
> # another client creates file2 and NN will allocate a new block using the 
> same block id blk_id1_gs1 since block ID and generation stamp are both 
> increased sequentially.
> Now we may have two versions (file1 and file2) of the blk_id1_gs1 (same id, 
> same gs) in the system. It will cause data loss.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-4685) Implementation of ACLs in HDFS

2013-12-02 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-4685:


Attachment: HDFS-ACLs-Design-1.pdf

I'm posting the design doc.  Feedback is welcome.  Big thanks to the 
contributors who already provided valuable feedback on this version of the doc: 
[~darumugam], [~brandonli], [~wheat9], [~kevin.minder], [~sanjay.radia], 
[~sureshms], [~szetszwo] and [~jingzhao].

> Implementation of ACLs in HDFS
> --
>
> Key: HDFS-4685
> URL: https://issues.apache.org/jira/browse/HDFS-4685
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs-client, namenode, security
>Affects Versions: 1.1.2
>Reporter: Sachin Jose
>Assignee: Chris Nauroth
> Attachments: HDFS-ACLs-Design-1.pdf
>
>
> Currently HDFS doesn't support extended file ACLs. In Unix, extended ACLs can 
> be managed using the getfacl and setfacl utilities. Is there anybody working 
> on this feature?



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5590) Sequential block ID may cause data loss when persistBlocks is set to false

2013-12-02 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836939#comment-13836939
 ] 

Suresh Srinivas commented on HDFS-5590:
---

BTW I think this is not related to sequential block IDs. This can happen even 
with the older scheme, albeit with lower probability. Please update the jira summary 
and description accordingly.

> Sequential block ID may cause data loss when persistBlocks is set to false
> --
>
> Key: HDFS-5590
> URL: https://issues.apache.org/jira/browse/HDFS-5590
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
>
> In a cluster with non-HA setup and dfs.persist.blocks set to false, the 
> current sequential block ID mechanism may cause data loss in the following 
> case:
> # client creates file1 and requests a block from NN and gets blk_id1_gs1
> # client writes blk_id1_gs1 to DN
> # NN is restarted and because persistBlocks is false, blk_id1_gs1 may not be 
> persisted on disk
> # another client creates file2 and NN will allocate a new block using the 
> same block id blk_id1_gs1 since block ID and generation stamp are both 
> increased sequentially.
> Now we may have two versions (file1 and file2) of the blk_id1_gs1 (same id, 
> same gs) in the system. It will cause data loss.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5554) Add Snapshot Feature to INodeFile

2013-12-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836929#comment-13836929
 ] 

Hadoop QA commented on HDFS-5554:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12616591/HDFS-5554.001.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.namenode.TestSnapshotPathINodes

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5609//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5609//console

This message is automatically generated.

> Add Snapshot Feature to INodeFile
> -
>
> Key: HDFS-5554
> URL: https://issues.apache.org/jira/browse/HDFS-5554
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-5554.001.patch
>
>
> Similar to HDFS-5285, we can add a FileWithSnapshot feature to INodeFile 
> and use it to replace the current INodeFileWithSnapshot.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5590) Sequential block ID may cause data loss when persistBlocks is set to false

2013-12-02 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836920#comment-13836920
 ] 

Arpit Agarwal commented on HDFS-5590:
-

Jing, thanks for reporting this regression with sequential block IDs. I would 
go one step further and ask if we can deprecate the setting altogether (and 
always persist block allocations)?

> Sequential block ID may cause data loss when persistBlocks is set to false
> --
>
> Key: HDFS-5590
> URL: https://issues.apache.org/jira/browse/HDFS-5590
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
>
> In a cluster with non-HA setup and dfs.persist.blocks set to false, the 
> current sequential block ID mechanism may cause data loss in the following 
> case:
> # client creates file1 and requests a block from NN and gets blk_id1_gs1
> # client writes blk_id1_gs1 to DN
> # NN is restarted and because persistBlocks is false, blk_id1_gs1 may not be 
> persisted on disk
> # another client creates file2 and NN will allocate a new block using the 
> same block id blk_id1_gs1 since block ID and generation stamp are both 
> increased sequentially.
> Now we may have two versions (file1 and file2) of the blk_id1_gs1 (same id, 
> same gs) in the system. It will cause data loss.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5526) Datanode cannot roll back to previous layout version

2013-12-02 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-5526:
-

Hadoop Flags: Reviewed

> Datanode cannot roll back to previous layout version
> 
>
> Key: HDFS-5526
> URL: https://issues.apache.org/jira/browse/HDFS-5526
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Kihwal Lee
>Priority: Blocker
> Fix For: 3.0.0, 2.4.0, 0.23.10
>
> Attachments: HDFS-5526.patch, HDFS-5526.patch
>
>
> Current trunk layout version is -48.
> Hadoop v2.2.0 layout version is -47.
> If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes 
> cannot start with -rollback.  It will fail with IncorrectVersionException.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5526) Datanode cannot roll back to previous layout version

2013-12-02 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-5526:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Datanode cannot roll back to previous layout version
> 
>
> Key: HDFS-5526
> URL: https://issues.apache.org/jira/browse/HDFS-5526
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Kihwal Lee
>Priority: Blocker
> Fix For: 3.0.0, 2.4.0, 0.23.10
>
> Attachments: HDFS-5526.patch, HDFS-5526.patch
>
>
> Current trunk layout version is -48.
> Hadoop v2.2.0 layout version is -47.
> If a cluster is upgraded from v2.2.0 (-47) to trunk (-48), the datanodes 
> cannot start with -rollback.  It will fail with IncorrectVersionException.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Assigned] (HDFS-4983) Numeric usernames do not work with WebHDFS FS

2013-12-02 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang reassigned HDFS-4983:
---

Assignee: Yongjun Zhang

> Numeric usernames do not work with WebHDFS FS
> -
>
> Key: HDFS-4983
> URL: https://issues.apache.org/jira/browse/HDFS-4983
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Assignee: Yongjun Zhang
>
> Per the file 
> hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/UserParam.java,
>  the DOMAIN pattern is set to: {{^[A-Za-z_][A-Za-z0-9._-]*[$]?$}}.
> Given this, using a username such as "123" seems to fail for some reason 
> (tried on insecure setup):
> {code}
> [123@host-1 ~]$ whoami
> 123
> [123@host-1 ~]$ hadoop fs -fs webhdfs://host-2.domain.com -ls /
> -ls: Invalid value: "123" does not belong to the domain 
> ^[A-Za-z_][A-Za-z0-9._-]*[$]?$
> Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [ ...]
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5558) LeaseManager monitor thread can crash if the last block is complete but another block is not.

2013-12-02 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836891#comment-13836891
 ] 

Daryn Sharp commented on HDFS-5558:
---

+1  It sounds like you addressed Colin's concern.  And yes, I helped debug a 
case a while back where the user accidentally set the block size to N-KB instead 
of N-MB for their jobs.  The NN was overwhelmed and the dfs client's request 
for another block was being processed before the prior block's updates.

> LeaseManager monitor thread can crash if the last block is complete but 
> another block is not.
> -
>
> Key: HDFS-5558
> URL: https://issues.apache.org/jira/browse/HDFS-5558
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.23.9, 2.4.0
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Attachments: HDFS-5558.branch-023.patch, HDFS-5558.branch-023.patch, 
> HDFS-5558.patch, HDFS-5558.patch
>
>
> As mentioned in HDFS-5557, if a file has its last and penultimate block not 
> completed and the file is being closed, the last block may be completed but 
> the penultimate one might not. If this condition lasts long and the file is 
> abandoned, LeaseManager will try to recover the lease and the block. But 
> {{internalReleaseLease()}} will fail with an invalid cast exception for this 
> kind of file.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5590) Sequential block ID may cause data loss when persistBlocks is set to false

2013-12-02 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836889#comment-13836889
 ] 

Suresh Srinivas commented on HDFS-5590:
---

How about always persisting to the editlog when a block is allocated? This is 
what is done in HA mode. While doing the same in non-HA mode can be construed as 
affecting performance, clearly here there are correctness and data loss issues. 
My recommendation would be to always persist blocks and hence change the default 
value for block persistence from false to true. We also need to document this 
and ensure people who set this to false do it with an understanding of the 
risks.

> Sequential block ID may cause data loss when persistBlocks is set to false
> --
>
> Key: HDFS-5590
> URL: https://issues.apache.org/jira/browse/HDFS-5590
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
>
> In a cluster with non-HA setup and dfs.persist.blocks set to false, the 
> current sequential block ID mechanism may cause data loss in the following 
> case:
> # client creates file1 and requests a block from NN and gets blk_id1_gs1
> # client writes blk_id1_gs1 to DN
> # NN is restarted and because persistBlocks is false, blk_id1_gs1 may not be 
> persisted on disk
> # another client creates file2 and NN will allocate a new block using the 
> same block id blk_id1_gs1 since block ID and generation stamp are both 
> increased sequentially.
> Now we may have two versions (file1 and file2) of the blk_id1_gs1 (same id, 
> same gs) in the system. It will cause data loss.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5557) Write pipeline recovery for the last packet in the block may cause rejection of valid replicas

2013-12-02 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-5557:
-

   Resolution: Fixed
Fix Version/s: 0.23.10
   2.4.0
   3.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

> Write pipeline recovery for the last packet in the block may cause rejection 
> of valid replicas
> --
>
> Key: HDFS-5557
> URL: https://issues.apache.org/jira/browse/HDFS-5557
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.23.9, 2.4.0
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Fix For: 3.0.0, 2.4.0, 0.23.10
>
> Attachments: HDFS-5557.patch, HDFS-5557.patch, HDFS-5557.patch, 
> HDFS-5557.patch
>
>
> When a block is reported from a data node while the block is under 
> construction (i.e. not committed or completed), BlockManager calls 
> BlockInfoUnderConstruction.addReplicaIfNotPresent() to update the reported 
> replica state. But BlockManager is calling it with the stored block, not the 
> reported block.  This causes the recorded replicas' gen stamp to be that of 
> BlockInfoUnderConstruction itself, not the one from the reported replica.
> When a pipeline recovery is done for the last packet of a block, the 
> incremental block reports with the new gen stamp may come before the client 
> calling updatePipeline(). If this happens, these replicas will be incorrectly 
> recorded with the old gen stamp and get removed later.  The result is close 
> or addAdditionalBlock failure.
> If the last block is completed, but the penultimate block is not because of 
> this issue, the file won't be closed. If this file is not cleared, but the 
> client goes away, the lease manager will try to recover the lease/block, at 
> which point it will crash. I will file a separate jira for this shortly.
> The worst case is rejecting all good ones and accepting a bad one. In this 
> case, the block will get completed, but the data cannot be read until the 
> next full block report containing one of the valid replicas is received.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5557) Write pipeline recovery for the last packet in the block may cause rejection of valid replicas

2013-12-02 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836882#comment-13836882
 ] 

Kihwal Lee commented on HDFS-5557:
--

Thanks for the review, Vinay and Daryn. I've committed this to branch-0.23, 
branch-2 and trunk. For branch-2 and 0.23, the update in TestReplicationPolicy 
was skipped as it is not applicable. 

> Write pipeline recovery for the last packet in the block may cause rejection 
> of valid replicas
> --
>
> Key: HDFS-5557
> URL: https://issues.apache.org/jira/browse/HDFS-5557
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.23.9, 2.4.0
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Attachments: HDFS-5557.patch, HDFS-5557.patch, HDFS-5557.patch, 
> HDFS-5557.patch
>
>
> When a block is reported from a data node while the block is under 
> construction (i.e. not committed or completed), BlockManager calls 
> BlockInfoUnderConstruction.addReplicaIfNotPresent() to update the reported 
> replica state. But BlockManager is calling it with the stored block, not the 
> reported block.  This causes the recorded replicas' gen stamp to be that of 
> BlockInfoUnderConstruction itself, not the one from the reported replica.
> When a pipeline recovery is done for the last packet of a block, the 
> incremental block reports with the new gen stamp may come before the client 
> calling updatePipeline(). If this happens, these replicas will be incorrectly 
> recorded with the old gen stamp and get removed later.  The result is close 
> or addAdditionalBlock failure.
> If the last block is completed, but the penultimate block is not because of 
> this issue, the file won't be closed. If this file is not cleared, but the 
> client goes away, the lease manager will try to recover the lease/block, at 
> which point it will crash. I will file a separate jira for this shortly.
> The worst case is rejecting all good ones and accepting a bad one. In this 
> case, the block will get completed, but the data cannot be read until the 
> next full block report containing one of the valid replicas is received.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5590) Sequential block ID may cause data loss when persistBlocks is set to false

2013-12-02 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836880#comment-13836880
 ] 

Jing Zhao commented on HDFS-5590:
-

Currently, when dfs.persist.blocks is false (its default value) and HA is not 
enabled, the getAdditionalBlock call will not call logSync. Even without the 
sequential block ID mechanism, failing to persist the new block can still cause 
data loss. Thus a quick fix here is to always call logSync in 
getAdditionalBlock, but this may affect performance.

Another possible fix is to make sure the next block id and generation stamp are 
always larger than the max block id and gs in the system. Thus in 
BlockManager#processFirstBlockReport, we can change the following code
{code}
  // If block does not belong to any file, we are done.
  if (storedBlock == null) continue;
{code}
to 
{code}
 if (storedBlock == null) {
   // TODO: check the block id and generation stamp id of the reported 
block, and increase the local latest block id and generation stamp if necessary 
to make sure they are larger than the reported values
 }
{code}
This can make sure we do not have overlapping block ids and generation stamps, 
but data loss is still possible.
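
To make the second approach concrete, here is a minimal sketch of what that 
branch could look like, assuming the reported block is available as {{iblk}}; 
the namesystem accessor names are illustrative placeholders, not the actual 
FSNamesystem/BlockManager API:
{code}
if (storedBlock == null) {
  // Sketch only: bump the local allocators so future allocations cannot
  // collide with a replica that already exists on a datanode. The accessor
  // names below stand in for whatever the block id / generation stamp
  // generators actually expose.
  long reportedId = iblk.getBlockId();
  long reportedGS = iblk.getGenerationStamp();
  if (reportedId > namesystem.getLastAllocatedBlockId()) {
    namesystem.setLastAllocatedBlockId(reportedId);
  }
  if (reportedGS > namesystem.getGenerationStamp()) {
    namesystem.setGenerationStamp(reportedGS);
  }
  continue; // the block still does not belong to any file
}
{code}
Even with this, the replica itself stays orphaned, so as noted above the data 
loss is only narrowed, not eliminated.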

> Sequential block ID may cause data loss when persistBlocks is set to false
> --
>
> Key: HDFS-5590
> URL: https://issues.apache.org/jira/browse/HDFS-5590
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
>
> In a cluster with non-HA setup and dfs.persist.blocks set to false, the 
> current sequential block ID mechanism may cause data loss in the following 
> case:
> # client creates file1 and requests a block from NN and gets blk_id1_gs1
> # client writes blk_id1_gs1 to DN
> # NN is restarted and because persistBlocks is false, blk_id1_gs1 may not be 
> persisted on disk
> # another client creates file2 and NN will allocate a new block using the 
> same block id blk_id1_gs1 since block ID and generation stamp are both 
> increased sequentially.
> Now we may have two versions (file1 and file2) of the blk_id1_gs1 (same id, 
> same gs) in the system. It will cause data loss.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5557) Write pipeline recovery for the last packet in the block may cause rejection of valid replicas

2013-12-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836873#comment-13836873
 ] 

Hudson commented on HDFS-5557:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4816 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4816/])
HDFS-5557. Write pipeline recovery for the last packet in the block may cause 
rejection of valid replicas. Contributed by Kihwal Lee. (kihwal: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1547173)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoUnderConstruction.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestClientProtocolForPipelineRecovery.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java


> Write pipeline recovery for the last packet in the block may cause rejection 
> of valid replicas
> --
>
> Key: HDFS-5557
> URL: https://issues.apache.org/jira/browse/HDFS-5557
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.23.9, 2.4.0
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Attachments: HDFS-5557.patch, HDFS-5557.patch, HDFS-5557.patch, 
> HDFS-5557.patch
>
>
> When a block is reported from a data node while the block is under 
> construction (i.e. not committed or completed), BlockManager calls 
> BlockInfoUnderConstruction.addReplicaIfNotPresent() to update the reported 
> replica state. But BlockManager is calling it with the stored block, not the 
> reported block. This causes the recorded replicas' gen stamp to be that of 
> BlockInfoUnderConstruction itself, not the one from the reported replica.
> When a pipeline recovery is done for the last packet of a block, the 
> incremental block reports with the new gen stamp may come before the client 
> calling updatePipeline(). If this happens, these replicas will be incorrectly 
> recorded with the old gen stamp and get removed later.  The result is close 
> or addAdditionalBlock failure.
> If the last block is completed, but the penultimate block is not because of 
> this issue, the file won't be closed. If this file is not cleared, but the 
> client goes away, the lease manager will try to recover the lease/block, at 
> which point it will crash. I will file a separate jira for this shortly.
> The worst case is rejecting all the good ones and accepting a bad one. In this 
> case, the block will get completed, but the data cannot be read until the 
> next full block report containing one of the valid replicas is received.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5558) LeaseManager monitor thread can crash if the last block is complete but another block is not.

2013-12-02 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-5558:
-

Attachment: HDFS-5558.patch
HDFS-5558.branch-023.patch

The patch has been updated according to the review comment.  

> LeaseManager monitor thread can crash if the last block is complete but 
> another block is not.
> -
>
> Key: HDFS-5558
> URL: https://issues.apache.org/jira/browse/HDFS-5558
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.23.9, 2.4.0
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Attachments: HDFS-5558.branch-023.patch, HDFS-5558.branch-023.patch, 
> HDFS-5558.patch, HDFS-5558.patch
>
>
> As mentioned in HDFS-5557, if a file has its last and penultimate block not 
> completed and the file is being closed, the last block may be completed but 
> the penultimate one might not. If this condition lasts long and the file is 
> abandoned, LeaseManager will try to recover the lease and the block. But 
> {{internalReleaseLease()}} will fail with an invalid cast exception for this 
> kind of file.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5590) Sequential block ID may cause data loss when persistBlocks is set to false

2013-12-02 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-5590:


Description: 
In a cluster with non-HA setup and dfs.persist.blocks set to false, the current 
sequential block ID mechanism may cause data loss in the following case:

# client creates file1 and requests a block from NN and gets blk_id1_gs1
# client writes blk_id1_gs1 to DN
# NN is restarted and because persistBlocks is false, blk_id1_gs1 may not be 
persisted on disk
# another client creates file2 and NN will allocate a new block using the same 
block id blk_id1_gs1 since block ID and generation stamp are both increased 
sequentially.

Now we may have two versions (file1 and file2) of the blk_id1_gs1 (same id, 
same gs) in the system. It will cause data loss.

  was:
In a cluster with non-HA setup and dfs.persist.blocks set to false, the current 
sequential block ID mechanism may cause data loss in the following case:

# client creates file1 and requests a block from NN and gets blk_id1_gs1
# client writes blk_id1_gs1 to DN
# NN is restarted and because persistBlocks is false, blk_id1_gs1 is not 
persisted in editlog
# another client creates file2 and NN will allocate a new block using the same 
block id blk_id1_gs1 since block ID and generation stamp are both increased 
sequentially.

Now we may have two versions (file1 and file2) of the blk_id1_gs1 (same id, 
same gs) in the system. It will cause data loss.


> Sequential block ID may cause data loss when persistBlocks is set to false
> --
>
> Key: HDFS-5590
> URL: https://issues.apache.org/jira/browse/HDFS-5590
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
>
> In a cluster with non-HA setup and dfs.persist.blocks set to false, the 
> current sequential block ID mechanism may cause data loss in the following 
> case:
> # client creates file1 and requests a block from NN and gets blk_id1_gs1
> # client writes blk_id1_gs1 to DN
> # NN is restarted and because persistBlocks is false, blk_id1_gs1 may not be 
> persisted on disk
> # another client creates file2 and NN will allocate a new block using the 
> same block id blk_id1_gs1 since block ID and generation stamp are both 
> increased sequentially.
> Now we may have two versions (file1 and file2) of the blk_id1_gs1 (same id, 
> same gs) in the system. It will cause data loss.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS

2013-12-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836852#comment-13836852
 ] 

Hadoop QA commented on HDFS-2832:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12616582/h2832_20131202.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 48 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5608//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5608//console

This message is automatically generated.

> Enable support for heterogeneous storages in HDFS
> -
>
> Key: HDFS-2832
> URL: https://issues.apache.org/jira/browse/HDFS-2832
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: 20130813-HeterogeneousStorage.pdf, 
> 20131125-HeterogeneousStorage-TestPlan.pdf, 
> 20131125-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, 
> h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, 
> h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, 
> h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, 
> h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, 
> h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, 
> h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, 
> h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, 
> h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, 
> h2832_20131124.patch, h2832_20131202.patch
>
>
> HDFS currently supports a configuration where storages are a list of 
> directories. Typically each of these directories corresponds to a volume with 
> its own file system. All these directories are homogeneous and therefore 
> identified as a single storage at the namenode. I propose a change from the 
> current model where a Datanode *is a* storage, to a Datanode *is a collection 
> of* storages. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5590) Sequential block ID may cause data loss when persistBlocks is set to false

2013-12-02 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-5590:


Description: 
In a cluster with non-HA setup and dfs.persist.blocks set to false, the current 
sequential block ID mechanism may cause data loss in the following case:

# client creates file1 and requests a block from NN and gets blk_id1_gs1
# client writes blk_id1_gs1 to DN
# NN is restarted and because persistBlocks is false, blk_id1_gs1 is not 
persisted in editlog
# another client creates file2 and NN will allocate a new block using the same 
block id blk_id1_gs1 since block ID and generation stamp are both increased 
sequentially.

Now we may have two versions (file1 and file2) of the blk_id1_gs1 (same id, 
same gs) in the system. It will cause data loss.

  was:
In a cluster with non-HA setup and dfs.persist.blocks set to false, the current 
sequential block ID may cause data loss in the following case:

# client creates file1 and requests a block from NN and gets blk_id1_gs1
# client writes blk_id1_gs1 to DN
# NN is restarted and because persistBlocks is false, blk_id1_gs1 is not 
persisted in editlog
# another client creates file2 and NN will allocate a new block using the same 
block id blk_id1_gs1 since block ID and generation stamp are both increased 
sequentially.

Now we may have two versions (file1 and file2) of the blk_id1_gs1 (same id, 
same gs) in the system. It will cause data loss.


> Sequential block ID may cause data loss when persistBlocks is set to false
> --
>
> Key: HDFS-5590
> URL: https://issues.apache.org/jira/browse/HDFS-5590
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
>
> In a cluster with non-HA setup and dfs.persist.blocks set to false, the 
> current sequential block ID mechanism may cause data loss in the following 
> case:
> # client creates file1 and requests a block from NN and gets blk_id1_gs1
> # client writes blk_id1_gs1 to DN
> # NN is restarted and because persistBlocks is false, blk_id1_gs1 is not 
> persisted in editlog
> # another client creates file2 and NN will allocate a new block using the 
> same block id blk_id1_gs1 since block ID and generation stamp are both 
> increased sequentially.
> Now we may have two versions (file1 and file2) of the blk_id1_gs1 (same id, 
> same gs) in the system. It will cause data loss.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HDFS-5590) Sequential block ID may cause data loss when persistBlocks is set to false

2013-12-02 Thread Jing Zhao (JIRA)
Jing Zhao created HDFS-5590:
---

 Summary: Sequential block ID may cause data loss when 
persistBlocks is set to false
 Key: HDFS-5590
 URL: https://issues.apache.org/jira/browse/HDFS-5590
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Jing Zhao
Assignee: Jing Zhao


In a cluster with non-HA setup and dfs.persist.blocks set to false, the current 
sequential block ID may cause data loss in the following case:

# client creates file1 and requests a block from NN and gets blk_id1_gs1
# client writes blk_id1_gs1 to DN
# NN is restarted and because persistBlocks is false, blk_id1_gs1 is not 
persisted in editlog
# another client creates file2 and NN will allocate a new block using the same 
block id blk_id1_gs1 since block ID and generation stamp are both increased 
sequentially.

Now we may have two versions (file1 and file2) of the blk_id1_gs1 (same id, 
same gs) in the system. It will cause data loss.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5557) Write pipeline recovery for the last packet in the block may cause rejection of valid replicas

2013-12-02 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836849#comment-13836849
 ] 

Daryn Sharp commented on HDFS-5557:
---

+1  Nice work.

> Write pipeline recovery for the last packet in the block may cause rejection 
> of valid replicas
> --
>
> Key: HDFS-5557
> URL: https://issues.apache.org/jira/browse/HDFS-5557
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.23.9, 2.4.0
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Attachments: HDFS-5557.patch, HDFS-5557.patch, HDFS-5557.patch, 
> HDFS-5557.patch
>
>
> When a block is reported from a data node while the block is under 
> construction (i.e. not committed or completed), BlockManager calls 
> BlockInfoUnderConstruction.addReplicaIfNotPresent() to update the reported 
> replica state. But BlockManager is calling it with the stored block, not the 
> reported block. This causes the recorded replicas' gen stamp to be that of 
> BlockInfoUnderConstruction itself, not the one from the reported replica.
> When a pipeline recovery is done for the last packet of a block, the 
> incremental block reports with the new gen stamp may come before the client 
> calling updatePipeline(). If this happens, these replicas will be incorrectly 
> recorded with the old gen stamp and get removed later.  The result is close 
> or addAdditionalBlock failure.
> If the last block is completed, but the penultimate block is not because of 
> this issue, the file won't be closed. If this file is not cleared, but the 
> client goes away, the lease manager will try to recover the lease/block, at 
> which point it will crash. I will file a separate jira for this shortly.
> The worst case is rejecting all the good ones and accepting a bad one. In this 
> case, the block will get completed, but the data cannot be read until the 
> next full block report containing one of the valid replicas is received.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5587) add debug information when NFS fails to start with duplicate user or group names

2013-12-02 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-5587:
-

Attachment: HDFS-5587.001.patch

Uploaded the patch with additional unit test.

> add debug information when NFS fails to start with duplicate user or group 
> names
> 
>
> Key: HDFS-5587
> URL: https://issues.apache.org/jira/browse/HDFS-5587
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: nfs
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Minor
> Attachments: HDFS-5587.001.patch
>
>
> When the host provides duplicate user or group names, NFS will not start and 
> print errors like the following:
> {noformat}
> ... ... 
> 13/11/25 18:11:52 INFO nfs3.Nfs3Base: registered UNIX signal handlers for 
> [TERM, HUP, INT]
> Exception in thread "main" java.lang.IllegalArgumentException: value already 
> present: s-iss
> at com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
> at 
> com.google.common.collect.AbstractBiMap.putInBothMaps(AbstractBiMap.java:112)
> at com.google.common.collect.AbstractBiMap.put(AbstractBiMap.java:96)
> at com.google.common.collect.HashBiMap.put(HashBiMap.java:85)
> at 
> org.apache.hadoop.nfs.nfs3.IdUserGroup.updateMapInternal(IdUserGroup.java:85)
> at org.apache.hadoop.nfs.nfs3.IdUserGroup.updateMaps(IdUserGroup.java:110)
> at org.apache.hadoop.nfs.nfs3.IdUserGroup.(IdUserGroup.java:54)
> at 
> org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.(RpcProgramNfs3.java:172)
> at 
> org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.(RpcProgramNfs3.java:164)
> at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.(Nfs3.java:41)
> at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:52)
> 13/11/25 18:11:54 INFO nfs3.Nfs3Base: SHUTDOWN_MSG:
> ... ...
> {noformat}
> The reason NFS should not start is that HDFS (in a non-Kerberos cluster) uses 
> the name as the only way to identify a user. Some Linux boxes can have two 
> users with the same name but different user IDs. Linux might be able to work 
> fine with that most of the time. However, when the NFS gateway talks to HDFS, 
> HDFS accepts only the user name. That is, from HDFS' point of view, these two 
> different users are the same user even though they are different on the Linux 
> box.
> The duplicate names on Linux systems are sometimes caused by legacy system 
> configurations or combined name services.
> Regardless, the NFS gateway should print some help information so the user 
> can understand the error and remove the duplicated names before restarting NFS.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5587) add debug information when NFS fails to start with duplicate user or group names

2013-12-02 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-5587:
-

Attachment: (was: HDFS-5587.001.patch)

> add debug information when NFS fails to start with duplicate user or group 
> names
> 
>
> Key: HDFS-5587
> URL: https://issues.apache.org/jira/browse/HDFS-5587
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: nfs
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Minor
>
> When the host provides duplicate user or group names, NFS will not start and 
> print errors like the following:
> {noformat}
> ... ... 
> 13/11/25 18:11:52 INFO nfs3.Nfs3Base: registered UNIX signal handlers for 
> [TERM, HUP, INT]
> Exception in thread "main" java.lang.IllegalArgumentException: value already 
> present: s-iss
> at com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
> at 
> com.google.common.collect.AbstractBiMap.putInBothMaps(AbstractBiMap.java:112)
> at com.google.common.collect.AbstractBiMap.put(AbstractBiMap.java:96)
> at com.google.common.collect.HashBiMap.put(HashBiMap.java:85)
> at 
> org.apache.hadoop.nfs.nfs3.IdUserGroup.updateMapInternal(IdUserGroup.java:85)
> at org.apache.hadoop.nfs.nfs3.IdUserGroup.updateMaps(IdUserGroup.java:110)
> at org.apache.hadoop.nfs.nfs3.IdUserGroup.(IdUserGroup.java:54)
> at 
> org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.(RpcProgramNfs3.java:172)
> at 
> org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.(RpcProgramNfs3.java:164)
> at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.(Nfs3.java:41)
> at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:52)
> 13/11/25 18:11:54 INFO nfs3.Nfs3Base: SHUTDOWN_MSG:
> ... ...
> {noformat}
> The reason NFS should not start is that HDFS (in a non-Kerberos cluster) uses 
> the name as the only way to identify a user. Some Linux boxes can have two 
> users with the same name but different user IDs. Linux might be able to work 
> fine with that most of the time. However, when the NFS gateway talks to HDFS, 
> HDFS accepts only the user name. That is, from HDFS' point of view, these two 
> different users are the same user even though they are different on the Linux 
> box.
> The duplicate names on Linux systems are sometimes caused by legacy system 
> configurations or combined name services.
> Regardless, the NFS gateway should print some help information so the user 
> can understand the error and remove the duplicated names before restarting NFS.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5588) Incorrect values of trash configuration and trash emptier state in namenode log

2013-12-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836846#comment-13836846
 ] 

Hadoop QA commented on HDFS-5588:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12616598/HDFS-5588-1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5611//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5611//console

This message is automatically generated.

> Incorrect values of trash configuration and trash emptier state in namenode 
> log
> ---
>
> Key: HDFS-5588
> URL: https://issues.apache.org/jira/browse/HDFS-5588
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, namenode
>Affects Versions: 2.2.0
>Reporter: Adam Kawa
> Attachments: HDFS-5588-1.patch
>
>
> Values of trash configuration and trash emptier state in namenode log are 
> displayed in milliseconds (but it should be seconds). This is very confusing 
> because it makes you feel that Trash does not remove data or the meaning of 
> configuration settings changed.
> {code}
> // org.apache.hadoop.fs.TrashPolicyDefault
>   @Override
>   public void initialize(Configuration conf, FileSystem fs, Path home) {
> ...
> this.deletionInterval = (long)(conf.getFloat(
> FS_TRASH_INTERVAL_KEY, FS_TRASH_INTERVAL_DEFAULT)
> * MSECS_PER_MINUTE);
> this.emptierInterval = (long)(conf.getFloat(
> FS_TRASH_CHECKPOINT_INTERVAL_KEY, 
> FS_TRASH_CHECKPOINT_INTERVAL_DEFAULT)
> * MSECS_PER_MINUTE);
> LOG.info("Namenode trash configuration: Deletion interval = " +
>  this.deletionInterval + " minutes, Emptier interval = " +
>  this.emptierInterval + " minutes.");
>}
> {code}
> this.deletionInterval and this.emptierInterval are in milliseconds, but the 
> LOG.info message says that they are in minutes.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5587) add debug information when NFS fails to start with duplicate user or group names

2013-12-02 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-5587:
-

Attachment: HDFS-5587.001.patch

> add debug information when NFS fails to start with duplicate user or group 
> names
> 
>
> Key: HDFS-5587
> URL: https://issues.apache.org/jira/browse/HDFS-5587
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: nfs
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Minor
> Attachments: HDFS-5587.001.patch
>
>
> When the host provides duplicate user or group names, NFS will not start and 
> print errors like the following:
> {noformat}
> ... ... 
> 13/11/25 18:11:52 INFO nfs3.Nfs3Base: registered UNIX signal handlers for 
> [TERM, HUP, INT]
> Exception in thread "main" java.lang.IllegalArgumentException: value already 
> present: s-iss
> at com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
> at 
> com.google.common.collect.AbstractBiMap.putInBothMaps(AbstractBiMap.java:112)
> at com.google.common.collect.AbstractBiMap.put(AbstractBiMap.java:96)
> at com.google.common.collect.HashBiMap.put(HashBiMap.java:85)
> at 
> org.apache.hadoop.nfs.nfs3.IdUserGroup.updateMapInternal(IdUserGroup.java:85)
> at org.apache.hadoop.nfs.nfs3.IdUserGroup.updateMaps(IdUserGroup.java:110)
> at org.apache.hadoop.nfs.nfs3.IdUserGroup.(IdUserGroup.java:54)
> at 
> org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.(RpcProgramNfs3.java:172)
> at 
> org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.(RpcProgramNfs3.java:164)
> at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.(Nfs3.java:41)
> at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:52)
> 13/11/25 18:11:54 INFO nfs3.Nfs3Base: SHUTDOWN_MSG:
> ... ...
> {noformat}
> The reason NFS should not start is that HDFS (in a non-Kerberos cluster) uses 
> the name as the only way to identify a user. Some Linux boxes can have two 
> users with the same name but different user IDs. Linux might be able to work 
> fine with that most of the time. However, when the NFS gateway talks to HDFS, 
> HDFS accepts only the user name. That is, from HDFS' point of view, these two 
> different users are the same user even though they are different on the Linux 
> box.
> The duplicate names on Linux systems are sometimes caused by legacy system 
> configurations or combined name services.
> Regardless, the NFS gateway should print some help information so the user 
> can understand the error and remove the duplicated names before restarting NFS.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5564) Refactor tests in TestCacheDirectives

2013-12-02 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836828#comment-13836828
 ] 

Andrew Wang commented on HDFS-5564:
---

If we want to just turn this into a more general cleanup patch, we should also 
consider turning down the log level of cache reports to DEBUG. It's pretty 
spammy right now.
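
For reference, a minimal sketch of the kind of change meant, moving the 
per-report logging from INFO to DEBUG; the logger and variable names here are 
placeholders rather than the actual CacheManager fields:
{code}
// Sketch only: emit each incoming cache report at DEBUG instead of INFO,
// and guard it so the message string is not built when DEBUG is disabled.
if (LOG.isDebugEnabled()) {
  LOG.debug("Processing cache report from " + datanodeId
      + ", containing " + blockIds.size() + " cached block(s)");
}
{code}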

> Refactor tests in TestCacheDirectives
> -
>
> Key: HDFS-5564
> URL: https://issues.apache.org/jira/browse/HDFS-5564
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 3.0.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Trivial
>
> Some of the tests in TestCacheDirectives start their own MiniDFSCluster to 
> get a new config, even though we already start a cluster in the @Before 
> function. This contributes to longer test runs and code duplication.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5589) Namenode loops caching and uncaching when data should be uncached

2013-12-02 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-5589:
--

Assignee: (was: Colin Patrick McCabe)

> Namenode loops caching and uncaching when data should be uncached
> -
>
> Key: HDFS-5589
> URL: https://issues.apache.org/jira/browse/HDFS-5589
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: caching, namenode
>Affects Versions: 3.0.0
>Reporter: Andrew Wang
>
> This was reported by [~cnauroth] and [~brandonli], and [~schu] repro'd it too.
> If you add a new caching directive then remove it, the Namenode will 
> sometimes get stuck in a loop where it sends DNA_CACHE and then DNA_UNCACHE 
> repeatedly to the datanodes where the data was previously cached.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5564) Refactor tests in TestCacheDirectives

2013-12-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836820#comment-13836820
 ] 

Colin Patrick McCabe commented on HDFS-5564:


Another miscellaneous cleanup to do here: 
{{DistributedFileSystem#addCacheDirective}} should fail if an ID is set. It's 
a trivial fix (validate that it's not set in {{CacheManager#addDirective}}, a 
function the edit log loader code does not use). We should also add a unit 
test for this.
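
A minimal sketch of the validation being suggested; the helper name, the 
directive type shown, and the message text are illustrative, not the eventual 
patch:
{code}
// Sketch only: a new directive coming from DistributedFileSystem must not
// carry an ID, since IDs are assigned by the NameNode. The edit log loader
// path, which replays directives that already have IDs, would not go
// through this check.
private static void checkNoIdSet(CacheDirectiveInfo directive)
    throws IOException {
  if (directive.getId() != null) {
    throw new IOException("addDirective: the directive ID must not be "
        + "set by the caller; it is assigned by the NameNode");
  }
}
{code}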

> Refactor tests in TestCacheDirectives
> -
>
> Key: HDFS-5564
> URL: https://issues.apache.org/jira/browse/HDFS-5564
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 3.0.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
>Priority: Trivial
>
> Some of the tests in TestCacheDirectives start their own MiniDFSCluster to 
> get a new config, even though we already start a cluster in the @Before 
> function. This contributes to longer test runs and code duplication.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HDFS-5589) Namenode loops caching and uncaching when data should be uncached

2013-12-02 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-5589:
-

 Summary: Namenode loops caching and uncaching when data should be 
uncached
 Key: HDFS-5589
 URL: https://issues.apache.org/jira/browse/HDFS-5589
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: caching, namenode
Affects Versions: 3.0.0
Reporter: Andrew Wang
Assignee: Colin Patrick McCabe


This was reported by [~cnauroth] and [~brandonli], and [~schu] repro'd it too.

If you add a new caching directive then remove it, the Namenode will sometimes 
get stuck in a loop where it sends DNA_CACHE and then DNA_UNCACHE repeatedly to 
the datanodes where the data was previously cached.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5588) Incorrect values of trash configuration and trash emptier state in namenode log

2013-12-02 Thread Adam Kawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Kawa updated HDFS-5588:


Attachment: HDFS-5588-1.patch

> Incorrect values of trash configuration and trash emptier state in namenode 
> log
> ---
>
> Key: HDFS-5588
> URL: https://issues.apache.org/jira/browse/HDFS-5588
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, namenode
>Affects Versions: 2.2.0
>Reporter: Adam Kawa
> Attachments: HDFS-5588-1.patch
>
>
> Values of trash configuration and trash emptier state in namenode log are 
> displayed in milliseconds (but it should be seconds). This is very confusing 
> because it makes you feel that Trash does not remove data or the meaning of 
> configuration settings changed.
> {code}
> // org.apache.hadoop.fs.TrashPolicyDefault
>   @Override
>   public void initialize(Configuration conf, FileSystem fs, Path home) {
> ...
> this.deletionInterval = (long)(conf.getFloat(
> FS_TRASH_INTERVAL_KEY, FS_TRASH_INTERVAL_DEFAULT)
> * MSECS_PER_MINUTE);
> this.emptierInterval = (long)(conf.getFloat(
> FS_TRASH_CHECKPOINT_INTERVAL_KEY, 
> FS_TRASH_CHECKPOINT_INTERVAL_DEFAULT)
> * MSECS_PER_MINUTE);
> LOG.info("Namenode trash configuration: Deletion interval = " +
>  this.deletionInterval + " minutes, Emptier interval = " +
>  this.emptierInterval + " minutes.");
>}
> {code}
> this.deletionInterval and this.emptierInterval are in milliseconds, but the 
> LOG.info message says that they are in minutes.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5588) Incorrect values of trash configuration and trash emptier state in namenode log

2013-12-02 Thread Adam Kawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Kawa updated HDFS-5588:


Status: Patch Available  (was: Open)

> Incorrect values of trash configuration and trash emptier state in namenode 
> log
> ---
>
> Key: HDFS-5588
> URL: https://issues.apache.org/jira/browse/HDFS-5588
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, namenode
>Affects Versions: 2.2.0
>Reporter: Adam Kawa
> Attachments: HDFS-5588-1.patch
>
>
> Values of trash configuration and trash emptier state in namenode log are 
> displayed in milliseconds (but it should be seconds). This is very confusing 
> because it makes you feel that Trash does not remove data or the meaning of 
> configuration settings changed.
> {code}
> // org.apache.hadoop.fs.TrashPolicyDefault
>   @Override
>   public void initialize(Configuration conf, FileSystem fs, Path home) {
> ...
> this.deletionInterval = (long)(conf.getFloat(
> FS_TRASH_INTERVAL_KEY, FS_TRASH_INTERVAL_DEFAULT)
> * MSECS_PER_MINUTE);
> this.emptierInterval = (long)(conf.getFloat(
> FS_TRASH_CHECKPOINT_INTERVAL_KEY, 
> FS_TRASH_CHECKPOINT_INTERVAL_DEFAULT)
> * MSECS_PER_MINUTE);
> LOG.info("Namenode trash configuration: Deletion interval = " +
>  this.deletionInterval + " minutes, Emptier interval = " +
>  this.emptierInterval + " minutes.");
>}
> {code}
> this.deletionInterval and this.emptierInterval are in milliseconds, but the 
> LOG.info message says that they are in minutes.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HDFS-5588) Incorrect values of trash configuration and trash emptier state in namenode log

2013-12-02 Thread Adam Kawa (JIRA)
Adam Kawa created HDFS-5588:
---

 Summary: Incorrect values of trash configuration and trash emptier 
state in namenode log
 Key: HDFS-5588
 URL: https://issues.apache.org/jira/browse/HDFS-5588
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client, namenode
Affects Versions: 2.2.0
Reporter: Adam Kawa


Values of trash configuration and trash emptier state in namenode log are 
displayed in milliseconds (but it should be seconds). This is very confusing 
because it makes you feel that Trash does not remove data or the meaning of 
configuration settings changed.

{code}
// org.apache.hadoop.fs.TrashPolicyDefault
  @Override
  public void initialize(Configuration conf, FileSystem fs, Path home) {
...
this.deletionInterval = (long)(conf.getFloat(
FS_TRASH_INTERVAL_KEY, FS_TRASH_INTERVAL_DEFAULT)
* MSECS_PER_MINUTE);
this.emptierInterval = (long)(conf.getFloat(
FS_TRASH_CHECKPOINT_INTERVAL_KEY, FS_TRASH_CHECKPOINT_INTERVAL_DEFAULT)
* MSECS_PER_MINUTE);
LOG.info("Namenode trash configuration: Deletion interval = " +
 this.deletionInterval + " minutes, Emptier interval = " +
 this.emptierInterval + " minutes.");
   }
{code}

this.deletionInterval and this.emptierInterval are in milliseconds, but the 
LOG.info message says that they are in minutes.
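
One minimal way to make the message consistent, sketched here under the 
assumption that the intervals keep being stored in milliseconds and only the 
log statement changes:
{code}
// Sketch only: convert the millisecond fields back to minutes when logging,
// so the printed numbers match the "minutes" unit the message claims.
LOG.info("Namenode trash configuration: Deletion interval = " +
    (this.deletionInterval / MSECS_PER_MINUTE) + " minutes, Emptier interval = " +
    (this.emptierInterval / MSECS_PER_MINUTE) + " minutes.");
{code}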



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-4881) fine tune "Access token verification failed" error msg in datanode log

2013-12-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836787#comment-13836787
 ] 

Hadoop QA commented on HDFS-4881:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12586284/HDFS-4881-branch-1.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5610//console

This message is automatically generated.

> fine tune "Access token verification failed" error msg in datanode log
> --
>
> Key: HDFS-4881
> URL: https://issues.apache.org/jira/browse/HDFS-4881
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 1.0.0
> Environment: CentOS-5.3, java-version-1.6.0_26
>Reporter: takeshi.miao
>Priority: Trivial
> Fix For: 1.0.0
>
> Attachments: HDFS-4881-branch-1.0-v1.patch, 
> HDFS-4881-branch-1.0.patch, HDFS-4881-branch-1.patch
>
>
> I'd like to file this ticket because we suffered a datanode access token 
> verification failure recently. The client is HBase, which is accessing the 
> local datanode via DFSClient. The detailed log snippets are as follows...
> *regionserver log*
> {code}
> ...
> [2013-05-24 08:33:37,553][regionserver8120-compactions-1369288874174][INFO 
> ][org.apache.hadoop.hbase.regionserver.Store]: Started compaction of 1 
> file(s) in cf=ho, hasReferences=true, into 
> hdfs://sjdc-s-hdd-001.sjdc.ispn.trendmicro.com:8020/user/SPN-hbase/spn.guidcensus.ho/f99c6fb26f488034bf0e6ddd7a647ba4/.tmp,
>  seqid=3, totalSize=4.2g
> [2013-05-24 08:33:37,554][regionserver8120-compactions-1369288874174][INFO 
> ][org.apache.hadoop.hdfs.DFSClient]: Access token was invalid when connecting 
> to /10.31.6.49:1004 : 
> org.apache.hadoop.hdfs.security.token.block.InvalidBlockTokenException: Got 
> access token error for OP_READ_BLOCK, self=/10.31.6.49:36530, 
> remote=/10.31.6.49:1004, for file 
> /user/SPN-hbase/spn.guidcensus.ho/a565dd142933e3abf9bec33d59210d1b/ho/c5b37b9dd8801275c8fb160c0fb32ce5c48b56f4,
>  for block 4549293737579979499_205814042
> ...
> {code}
> *datanode log*
> {code}
> ...
> [2013-05-24 08:33:37,554][DataXceiver for client /10.31.6.49:36530 [Waiting 
> for operation #1]][ERROR][org.apache.hadoop.hdfs.server.datanode.DataNode]: 
> DatanodeRegistration(10.31.6.49:1004, 
> storageID=DS-1953102179-10.31.6.49-1004-   1342490559943, infoPort=1006, 
> ipcPort=50020):DataXceiver
> java.io.IOException: Access token verification failed, for client 
> /10.31.6.49:36530 for OP_READ_BLOCK for block 
> blk_4549293737579979499_205814042
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:252)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:175)
> ...
> {code}
> After tracing o.a.h.hdfs.security.token.block.BlockTokenSecretManager.java, I 
> found that a more detailed error description is written in the code.
> *o.a.h.hdfs.security.token.block.BlockTokenSecretManager.java*
> {code}
> public void checkAccess(BlockTokenIdentifier id, String userId, Block block,
>   AccessMode mode) throws InvalidToken {
> if (LOG.isDebugEnabled()) {
>   LOG.debug("Checking access for user=" + userId + ", block=" + block
>   + ", access mode=" + mode + " using " + id.toString());
> }
> if (userId != null && !userId.equals(id.getUserId())) {
>   throw new InvalidToken("Block token with " + id.toString()
>   + " doesn't belong to user " + userId);
> }
> if (id.getBlockId() != block.getBlockId()) {
>   throw new InvalidToken("Block token with " + id.toString()
>   + " doesn't apply to block " + block);
> }
> if (isExpired(id.getExpiryDate())) {
>   throw new InvalidToken("Block token with " + id.toString()
>   + " is expired.");
> }
> if (!id.getAccessModes().contains(mode)) {
>   throw new InvalidToken("Block token with " + id.toString()
>   + " doesn't have " + mode + " permission");
> }
>   }
> {code}
> But actually, this InvalidTokenException is not handled further (only caught), 
> so I cannot trace which kind of access block token verification failure this 
> is...
> *o.a.h.hdfs.server.datanode.DataXceiver.java*
> {code}
> ...
> if (datanode.isBlockTokenEnabled) {
>   try {
> datanode.blockTokenSecretManager.checkAccess(accessToken, null, block,
> BlockTokenSecretManager.AccessMode.READ);
>   } catch (InvalidToken e) {
> // the e object not handled further...
> try {
>   out.writeShort(DataTransferProtocol.OP_STATUS_ERROR_ACCESS_TO

[jira] [Updated] (HDFS-5554) Add Snapshot Feature to INodeFile

2013-12-02 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-5554:


Status: Patch Available  (was: Open)

> Add Snapshot Feature to INodeFile
> -
>
> Key: HDFS-5554
> URL: https://issues.apache.org/jira/browse/HDFS-5554
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-5554.001.patch
>
>
> Similar to HDFS-5285, we can add a FileWithSnapshot feature to INodeFile 
> and use it to replace the current INodeFileWithSnapshot.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5554) Add Snapshot Feature to INodeFile

2013-12-02 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-5554:


Attachment: HDFS-5554.001.patch

Initial patch for review.

> Add Snapshot Feature to INodeFile
> -
>
> Key: HDFS-5554
> URL: https://issues.apache.org/jira/browse/HDFS-5554
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-5554.001.patch
>
>
> Similar to HDFS-5285, we can add a FileWithSnapshot feature to INodeFile 
> and use it to replace the current INodeFileWithSnapshot.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-4881) fine tune "Access token verification failed" error msg in datanode log

2013-12-02 Thread takeshi.miao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836774#comment-13836774
 ] 

takeshi.miao commented on HDFS-4881:


[~daryn] Sorry for the late response. We have since migrated our cluster to 
hadoop-2.0.0 and no longer suffer from this kind of issue. Since the code is 
very different from 1.0.0 (due to the BlockPool addition), I plan not to trace 
it further.

I will close it as 'not an issue' or something similar, and may re-open it if 
we hit this kind of issue in the future.

> fine tune "Access token verification failed" error msg in datanode log
> --
>
> Key: HDFS-4881
> URL: https://issues.apache.org/jira/browse/HDFS-4881
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 1.0.0
> Environment: CentOS-5.3, java-version-1.6.0_26
>Reporter: takeshi.miao
>Priority: Trivial
> Fix For: 1.0.0
>
> Attachments: HDFS-4881-branch-1.0-v1.patch, 
> HDFS-4881-branch-1.0.patch, HDFS-4881-branch-1.patch
>
>
> I'd like to file this ticket because we suffered a datanode access token 
> verification failure recently. The client is HBase, which is accessing the 
> local datanode via DFSClient. The detailed log snippets are as follows...
> *regionserver log*
> {code}
> ...
> [2013-05-24 08:33:37,553][regionserver8120-compactions-1369288874174][INFO 
> ][org.apache.hadoop.hbase.regionserver.Store]: Started compaction of 1 
> file(s) in cf=ho, hasReferences=true, into 
> hdfs://sjdc-s-hdd-001.sjdc.ispn.trendmicro.com:8020/user/SPN-hbase/spn.guidcensus.ho/f99c6fb26f488034bf0e6ddd7a647ba4/.tmp,
>  seqid=3, totalSize=4.2g
> [2013-05-24 08:33:37,554][regionserver8120-compactions-1369288874174][INFO 
> ][org.apache.hadoop.hdfs.DFSClient]: Access token was invalid when connecting 
> to /10.31.6.49:1004 : 
> org.apache.hadoop.hdfs.security.token.block.InvalidBlockTokenException: Got 
> access token error for OP_READ_BLOCK, self=/10.31.6.49:36530, 
> remote=/10.31.6.49:1004, for file 
> /user/SPN-hbase/spn.guidcensus.ho/a565dd142933e3abf9bec33d59210d1b/ho/c5b37b9dd8801275c8fb160c0fb32ce5c48b56f4,
>  for block 4549293737579979499_205814042
> ...
> {code}
> *datanode log*
> {code}
> ...
> [2013-05-24 08:33:37,554][DataXceiver for client /10.31.6.49:36530 [Waiting 
> for operation #1]][ERROR][org.apache.hadoop.hdfs.server.datanode.DataNode]: 
> DatanodeRegistration(10.31.6.49:1004, 
> storageID=DS-1953102179-10.31.6.49-1004-   1342490559943, infoPort=1006, 
> ipcPort=50020):DataXceiver
> java.io.IOException: Access token verification failed, for client 
> /10.31.6.49:36530 for OP_READ_BLOCK for block 
> blk_4549293737579979499_205814042
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:252)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:175)
> ...
> {code}
> After tracing o.a.h.hdfs.security.token.block.BlockTokenSecretManager.java, I 
> found that a more detailed error description is written in the code.
> *o.a.h.hdfs.security.token.block.BlockTokenSecretManager.java*
> {code}
> public void checkAccess(BlockTokenIdentifier id, String userId, Block block,
>   AccessMode mode) throws InvalidToken {
> if (LOG.isDebugEnabled()) {
>   LOG.debug("Checking access for user=" + userId + ", block=" + block
>   + ", access mode=" + mode + " using " + id.toString());
> }
> if (userId != null && !userId.equals(id.getUserId())) {
>   throw new InvalidToken("Block token with " + id.toString()
>   + " doesn't belong to user " + userId);
> }
> if (id.getBlockId() != block.getBlockId()) {
>   throw new InvalidToken("Block token with " + id.toString()
>   + " doesn't apply to block " + block);
> }
> if (isExpired(id.getExpiryDate())) {
>   throw new InvalidToken("Block token with " + id.toString()
>   + " is expired.");
> }
> if (!id.getAccessModes().contains(mode)) {
>   throw new InvalidToken("Block token with " + id.toString()
>   + " doesn't have " + mode + " permission");
> }
>   }
> {code}
> But actually, this InvalidTokenException is not handled further (only caught), 
> so I cannot trace which kind of access block token verification failure this 
> is...
> *o.a.h.hdfs.server.datanode.DataXceiver.java*
> {code}
> ...
> if (datanode.isBlockTokenEnabled) {
>   try {
> datanode.blockTokenSecretManager.checkAccess(accessToken, null, block,
> BlockTokenSecretManager.AccessMode.READ);
>   } catch (InvalidToken e) {
> // the e object not handled further...
> try {
>   out.writeShort(DataTransferProtocol.OP_STATUS_ERROR_ACCESS_TOKEN);
>   out.flush();

[jira] [Commented] (HDFS-4881) fine tune "Access token verification failed" error msg in datanode log

2013-12-02 Thread takeshi.miao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836777#comment-13836777
 ] 

takeshi.miao commented on HDFS-4881:


I will close it if there is no objection within a couple of days.

> fine tune "Access token verification failed" error msg in datanode log
> --
>
> Key: HDFS-4881
> URL: https://issues.apache.org/jira/browse/HDFS-4881
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 1.0.0
> Environment: CentOS-5.3, java-version-1.6.0_26
>Reporter: takeshi.miao
>Priority: Trivial
> Fix For: 1.0.0
>
> Attachments: HDFS-4881-branch-1.0-v1.patch, 
> HDFS-4881-branch-1.0.patch, HDFS-4881-branch-1.patch
>
>
> I'd like to file this ticket because we suffered a datanode access token 
> verification failure recently. The client is HBase, which is accessing the 
> local datanode via DFSClient. The detailed log snippets are as follows...
> *regionserver log*
> {code}
> ...
> [2013-05-24 08:33:37,553][regionserver8120-compactions-1369288874174][INFO 
> ][org.apache.hadoop.hbase.regionserver.Store]: Started compaction of 1 
> file(s) in cf=ho, hasReferences=true, into 
> hdfs://sjdc-s-hdd-001.sjdc.ispn.trendmicro.com:8020/user/SPN-hbase/spn.guidcensus.ho/f99c6fb26f488034bf0e6ddd7a647ba4/.tmp,
>  seqid=3, totalSize=4.2g
> [2013-05-24 08:33:37,554][regionserver8120-compactions-1369288874174][INFO 
> ][org.apache.hadoop.hdfs.DFSClient]: Access token was invalid when connecting 
> to /10.31.6.49:1004 : 
> org.apache.hadoop.hdfs.security.token.block.InvalidBlockTokenException: Got 
> access token error for OP_READ_BLOCK, self=/10.31.6.49:36530, 
> remote=/10.31.6.49:1004, for file 
> /user/SPN-hbase/spn.guidcensus.ho/a565dd142933e3abf9bec33d59210d1b/ho/c5b37b9dd8801275c8fb160c0fb32ce5c48b56f4,
>  for block 4549293737579979499_205814042
> ...
> {code}
> *datanode log*
> {code}
> ...
> [2013-05-24 08:33:37,554][DataXceiver for client /10.31.6.49:36530 [Waiting 
> for operation #1]][ERROR][org.apache.hadoop.hdfs.server.datanode.DataNode]: 
> DatanodeRegistration(10.31.6.49:1004, 
> storageID=DS-1953102179-10.31.6.49-1004-   1342490559943, infoPort=1006, 
> ipcPort=50020):DataXceiver
> java.io.IOException: Access token verification failed, for client 
> /10.31.6.49:36530 for OP_READ_BLOCK for block 
> blk_4549293737579979499_205814042
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:252)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:175)
> ...
> {code}
> After tracing o.a.h.hdfs.security.token.block.BlockTokenSecretManager.java, I 
> found that a more detailed error description is written in the code.
> *o.a.h.hdfs.security.token.block.BlockTokenSecretManager.java*
> {code}
> public void checkAccess(BlockTokenIdentifier id, String userId, Block block,
>   AccessMode mode) throws InvalidToken {
> if (LOG.isDebugEnabled()) {
>   LOG.debug("Checking access for user=" + userId + ", block=" + block
>   + ", access mode=" + mode + " using " + id.toString());
> }
> if (userId != null && !userId.equals(id.getUserId())) {
>   throw new InvalidToken("Block token with " + id.toString()
>   + " doesn't belong to user " + userId);
> }
> if (id.getBlockId() != block.getBlockId()) {
>   throw new InvalidToken("Block token with " + id.toString()
>   + " doesn't apply to block " + block);
> }
> if (isExpired(id.getExpiryDate())) {
>   throw new InvalidToken("Block token with " + id.toString()
>   + " is expired.");
> }
> if (!id.getAccessModes().contains(mode)) {
>   throw new InvalidToken("Block token with " + id.toString()
>   + " doesn't have " + mode + " permission");
> }
>   }
> {code}
> But actually, this InvalidTokenException is not handled further (only caught), 
> so I cannot trace which kind of access block token verification failure this 
> is...
> *o.a.h.hdfs.server.datanode.DataXceiver.java*
> {code}
> ...
> if (datanode.isBlockTokenEnabled) {
>   try {
> datanode.blockTokenSecretManager.checkAccess(accessToken, null, block,
> BlockTokenSecretManager.AccessMode.READ);
>   } catch (InvalidToken e) {
> // the e object not handled further...
> try {
>   out.writeShort(DataTransferProtocol.OP_STATUS_ERROR_ACCESS_TOKEN);
>   out.flush();
>   throw new IOException("Access token verification failed, for client 
> "
>   + remoteAddress + " for OP_READ_BLOCK for block " + block); 
> } finally {
>   IOUtils.closeStream(out);
> }   
>   }   
> }
> ...
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)

[jira] [Updated] (HDFS-5587) add debug information when NFS fails to start with duplicate user or group names

2013-12-02 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-5587:
-

Issue Type: Improvement  (was: Bug)

> add debug information when NFS fails to start with duplicate user or group 
> names
> 
>
> Key: HDFS-5587
> URL: https://issues.apache.org/jira/browse/HDFS-5587
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: nfs
>Reporter: Brandon Li
>Assignee: Brandon Li
>
> When the host provides duplicate user or group names, NFS will not start and 
> print errors like the following:
> {noformat}
> ... ... 
> 13/11/25 18:11:52 INFO nfs3.Nfs3Base: registered UNIX signal handlers for 
> [TERM, HUP, INT]
> Exception in thread "main" java.lang.IllegalArgumentException: value already 
> present: s-iss
> at com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
> at 
> com.google.common.collect.AbstractBiMap.putInBothMaps(AbstractBiMap.java:112)
> at com.google.common.collect.AbstractBiMap.put(AbstractBiMap.java:96)
> at com.google.common.collect.HashBiMap.put(HashBiMap.java:85)
> at 
> org.apache.hadoop.nfs.nfs3.IdUserGroup.updateMapInternal(IdUserGroup.java:85)
> at org.apache.hadoop.nfs.nfs3.IdUserGroup.updateMaps(IdUserGroup.java:110)
> at org.apache.hadoop.nfs.nfs3.IdUserGroup.(IdUserGroup.java:54)
> at 
> org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.(RpcProgramNfs3.java:172)
> at 
> org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.(RpcProgramNfs3.java:164)
> at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.(Nfs3.java:41)
> at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:52)
> 13/11/25 18:11:54 INFO nfs3.Nfs3Base: SHUTDOWN_MSG:
> ... ...
> {noformat}
> The reason NFS should not start is that HDFS (in a non-Kerberos cluster) uses 
> the name as the only way to identify a user. Some Linux boxes can have two 
> users with the same name but different user IDs. Linux might be able to work 
> fine with that most of the time. However, when the NFS gateway talks to HDFS, 
> HDFS accepts only the user name. That is, from HDFS' point of view, these two 
> different users are the same user even though they are different on the Linux 
> box.
> The duplicate names on Linux systems are sometimes caused by legacy system 
> configurations or combined name services.
> Regardless, the NFS gateway should print some help information so the user 
> can understand the error and remove the duplicated names before restarting NFS.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5587) add debug information when NFS fails to start with duplicate user or group names

2013-12-02 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-5587:
-

Priority: Minor  (was: Major)

> add debug information when NFS fails to start with duplicate user or group 
> names
> 
>
> Key: HDFS-5587
> URL: https://issues.apache.org/jira/browse/HDFS-5587
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: nfs
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Minor
>
> When the host provides duplicate user or group names, NFS will not start and 
> print errors like the following:
> {noformat}
> ... ... 
> 13/11/25 18:11:52 INFO nfs3.Nfs3Base: registered UNIX signal handlers for 
> [TERM, HUP, INT]
> Exception in thread "main" java.lang.IllegalArgumentException: value already 
> present: s-iss
> at com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
> at 
> com.google.common.collect.AbstractBiMap.putInBothMaps(AbstractBiMap.java:112)
> at com.google.common.collect.AbstractBiMap.put(AbstractBiMap.java:96)
> at com.google.common.collect.HashBiMap.put(HashBiMap.java:85)
> at 
> org.apache.hadoop.nfs.nfs3.IdUserGroup.updateMapInternal(IdUserGroup.java:85)
> at org.apache.hadoop.nfs.nfs3.IdUserGroup.updateMaps(IdUserGroup.java:110)
> at org.apache.hadoop.nfs.nfs3.IdUserGroup.(IdUserGroup.java:54)
> at 
> org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.(RpcProgramNfs3.java:172)
> at 
> org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.(RpcProgramNfs3.java:164)
> at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.(Nfs3.java:41)
> at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:52)
> 13/11/25 18:11:54 INFO nfs3.Nfs3Base: SHUTDOWN_MSG:
> ... ...
> {noformat}
> The reason NFS should not start is that HDFS (in a non-Kerberos cluster) uses 
> the name as the only way to identify a user. On some Linux boxes there can be 
> two users with the same name but different user IDs. Linux may work fine with 
> that most of the time. However, when the NFS gateway talks to HDFS, HDFS 
> accepts only the user name. That is, from HDFS' point of view, these two 
> different users are the same user even though they are different on the Linux 
> box.
> Duplicate names on Linux systems are sometimes caused by legacy system 
> configurations or combined name services.
> Regardless, the NFS gateway should print some helpful information so the user 
> can understand the error and remove the duplicated names before restarting NFS.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HDFS-5587) add debug information when NFS fails to start with duplicate user or group names

2013-12-02 Thread Brandon Li (JIRA)
Brandon Li created HDFS-5587:


 Summary: add debug information when NFS fails to start with 
duplicate user or group names
 Key: HDFS-5587
 URL: https://issues.apache.org/jira/browse/HDFS-5587
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: nfs
Reporter: Brandon Li
Assignee: Brandon Li


When the host provides duplicate user or group names, NFS will not start and 
print errors like the following:
{noformat}
... ... 
13/11/25 18:11:52 INFO nfs3.Nfs3Base: registered UNIX signal handlers for 
[TERM, HUP, INT]
Exception in thread "main" java.lang.IllegalArgumentException: value already 
present: s-iss
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
at com.google.common.collect.AbstractBiMap.putInBothMaps(AbstractBiMap.java:112)
at com.google.common.collect.AbstractBiMap.put(AbstractBiMap.java:96)
at com.google.common.collect.HashBiMap.put(HashBiMap.java:85)
at org.apache.hadoop.nfs.nfs3.IdUserGroup.updateMapInternal(IdUserGroup.java:85)
at org.apache.hadoop.nfs.nfs3.IdUserGroup.updateMaps(IdUserGroup.java:110)
at org.apache.hadoop.nfs.nfs3.IdUserGroup.(IdUserGroup.java:54)
at 
org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.(RpcProgramNfs3.java:172)
at 
org.apache.hadoop.hdfs.nfs.nfs3.RpcProgramNfs3.(RpcProgramNfs3.java:164)
at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.(Nfs3.java:41)
at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:52)
13/11/25 18:11:54 INFO nfs3.Nfs3Base: SHUTDOWN_MSG:
... ...
{noformat}

The reason NFS should not start is that HDFS (in a non-Kerberos cluster) uses the 
name as the only way to identify a user. On some Linux boxes there can be two 
users with the same name but different user IDs. Linux may work fine with that 
most of the time. However, when the NFS gateway talks to HDFS, HDFS accepts only 
the user name. That is, from HDFS' point of view, these two different users are 
the same user even though they are different on the Linux box.

Duplicate names on Linux systems are sometimes caused by legacy system 
configurations or combined name services.

Regardless, the NFS gateway should print some helpful information so the user can 
understand the error and remove the duplicated names before restarting NFS.
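
A minimal, self-contained sketch of the kind of check and message this asks for, 
using hypothetical names (the real check lives in IdUserGroup.updateMapInternal):
{code}
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Toy version of building the account-name -> id map from passwd-style entries,
// failing with an actionable message when a name appears twice.
public class DuplicateNameCheckSketch {
  static Map<String, Integer> buildNameToIdMap(String[][] entries)
      throws IOException {
    Map<String, Integer> nameToId = new HashMap<>();
    for (String[] entry : entries) {
      String name = entry[0];
      int id = Integer.parseInt(entry[1]);
      Integer existing = nameToId.put(name, id);
      if (existing != null && existing != id) {
        throw new IOException("Duplicate name '" + name + "' maps to ids "
            + existing + " and " + id + "; remove the duplicated account "
            + "before restarting the NFS gateway");
      }
    }
    return nameToId;
  }

  public static void main(String[] args) throws IOException {
    // "s-iss" appears twice with different ids, as in the reported failure.
    buildNameToIdMap(new String[][] {
        {"s-iss", "500"}, {"hdfs", "501"}, {"s-iss", "502"}});
  }
}
{code}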



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5557) Write pipeline recovery for the last packet in the block may cause rejection of valid replicas

2013-12-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836741#comment-13836741
 ] 

Hadoop QA commented on HDFS-5557:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12616563/HDFS-5557.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5607//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5607//console

This message is automatically generated.

> Write pipeline recovery for the last packet in the block may cause rejection 
> of valid replicas
> --
>
> Key: HDFS-5557
> URL: https://issues.apache.org/jira/browse/HDFS-5557
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.23.9, 2.3.0
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Attachments: HDFS-5557.patch, HDFS-5557.patch, HDFS-5557.patch, 
> HDFS-5557.patch
>
>
> When a block is reported from a data node while the block is under 
> construction (i.e. not committed or completed), BlockManager calls 
> BlockInfoUnderConstruction.addReplicaIfNotPresent() to update the reported 
> replica state. But BlockManager calls it with the stored block, not the 
> reported block. This causes the recorded replica's gen stamp to be that of 
> BlockInfoUnderConstruction itself, not the one from the reported replica.
> When a pipeline recovery is done for the last packet of a block, the 
> incremental block reports with the new gen stamp may arrive before the client 
> calls updatePipeline(). If this happens, these replicas will be incorrectly 
> recorded with the old gen stamp and removed later. The result is a close 
> or addAdditionalBlock failure.
> If the last block is completed, but the penultimate block is not because of 
> this issue, the file won't be closed. If this file is not cleared, but the 
> client goes away, the lease manager will try to recover the lease/block, at 
> which point it will crash. I will file a separate jira for this shortly.
> The worst case is rejecting all the good replicas and accepting a bad one. In this 
> case, the block will get completed, but the data cannot be read until the 
> next full block report containing one of the valid replicas is received.
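> A toy, self-contained model of the mix-up (hypothetical names, not the HDFS 
> classes): if the replica is recorded with the stored block's old gen stamp 
> instead of the reported one, it looks stale once updatePipeline() bumps the 
> expected gen stamp.
> {code}
> // Toy model only; the real question is which block object BlockManager passes
> // to addReplicaIfNotPresent(), and therefore which gen stamp gets recorded.
> class GenStampSketch {
>   static long recordReplica(long storedGs, long reportedGs, boolean useReported) {
>     // Reported behaviour: record with the stored block's gen stamp.
>     // Fix direction: record with the reported replica's gen stamp.
>     return useReported ? reportedGs : storedGs;
>   }
> 
>   public static void main(String[] args) {
>     long storedGs = 1001L;    // gen stamp before pipeline recovery
>     long reportedGs = 1002L;  // gen stamp after pipeline recovery
>     long expectedAfterUpdatePipeline = 1002L;
> 
>     long buggy = recordReplica(storedGs, reportedGs, false);
>     long fixed = recordReplica(storedGs, reportedGs, true);
> 
>     // A replica recorded with the old gen stamp is treated as stale and removed.
>     System.out.println("buggy replica stale? " + (buggy != expectedAfterUpdatePipeline));
>     System.out.println("fixed replica stale? " + (fixed != expectedAfterUpdatePipeline));
>   }
> }
> {code}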



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-2832) Enable support for heterogeneous storages in HDFS

2013-12-02 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-2832:


Attachment: (was: editsStored)

> Enable support for heterogeneous storages in HDFS
> -
>
> Key: HDFS-2832
> URL: https://issues.apache.org/jira/browse/HDFS-2832
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: 20130813-HeterogeneousStorage.pdf, 
> 20131125-HeterogeneousStorage-TestPlan.pdf, 
> 20131125-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, 
> h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, 
> h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, 
> h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, 
> h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, 
> h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, 
> h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, 
> h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, 
> h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, 
> h2832_20131124.patch, h2832_20131202.patch
>
>
> HDFS currently supports a configuration where storages are a list of 
> directories. Typically each of these directories corresponds to a volume with 
> its own file system. All these directories are homogeneous and therefore 
> identified as a single storage at the namenode. I propose changing the 
> current model, where a Datanode *is a* storage, so that a Datanode *is a 
> collection of* storages. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-2832) Enable support for heterogeneous storages in HDFS

2013-12-02 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-2832:


Attachment: h2832_20131202.patch
editsStored

Updated patch and editsStored after merging latest changes from trunk.

> Enable support for heterogeneous storages in HDFS
> -
>
> Key: HDFS-2832
> URL: https://issues.apache.org/jira/browse/HDFS-2832
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Attachments: 20130813-HeterogeneousStorage.pdf, 
> 20131125-HeterogeneousStorage-TestPlan.pdf, 
> 20131125-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, 
> editsStored, h2832_20131023.patch, h2832_20131023b.patch, 
> h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, 
> h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, 
> h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, 
> h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, 
> h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, 
> h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, 
> h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, 
> h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch
>
>
> HDFS currently supports a configuration where storages are a list of 
> directories. Typically each of these directories corresponds to a volume with 
> its own file system. All these directories are homogeneous and therefore 
> identified as a single storage at the namenode. I propose changing the 
> current model, where a Datanode *is a* storage, so that a Datanode *is a 
> collection of* storages. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5569) WebHDFS should support a deny/allow list for data access

2013-12-02 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836715#comment-13836715
 ] 

Alejandro Abdelnur commented on HDFS-5569:
--

I agree with Colin, I don't think going this route is a good idea:

* CONFIGURATION: This would require disseminating the list of valid 
hosts/IPs/subnets to all DNs in the cluster. 
* PERFORMANCE IMPACT: doing allow/deny using hostnames will force the webserver 
code to do a reverse DNS lookup on every request (see the sketch after this list).
* EASY TO FAKE: 
http://stackoverflow.com/questions/9326138/is-it-possible-to-accurately-determine-the-ip-address-of-a-client-in-java-servle
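
A minimal, self-contained sketch of that per-request cost (hypothetical helper, 
not WebHDFS code): mapping a client IP back to a hostname goes through a reverse 
DNS lookup, so an allow/deny check keyed on hostnames pays that lookup on every 
request.
{code}
import java.net.InetAddress;
import java.net.UnknownHostException;

public class ReverseLookupSketch {
  // Hypothetical allow check keyed on a hostname suffix.
  static boolean isAllowed(String clientIp, String allowedSuffix)
      throws UnknownHostException {
    // getCanonicalHostName() performs the reverse lookup; if DNS is slow or
    // unreachable, every request pays that latency before the check runs.
    String host = InetAddress.getByName(clientIp).getCanonicalHostName();
    return host.endsWith(allowedSuffix);
  }

  public static void main(String[] args) throws UnknownHostException {
    System.out.println(isAllowed("127.0.0.1", ".example.com"));
  }
}
{code}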

IMO, the right way of doing this is that the authentication service (Kerberos 
or whatever custom thing being used) performs this check when granting the 
credentials.


> WebHDFS should support a deny/allow list for data access
> 
>
> Key: HDFS-5569
> URL: https://issues.apache.org/jira/browse/HDFS-5569
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Reporter: Adam Faris
>  Labels: features
>
> Currently we can't restrict what networks are allowed to transfer data using 
> WebHDFS.  Obviously we can use firewalls to block ports, but this can be 
> complicated and problematic to maintain.  Additionally, because all the jetty 
> servlets run inside the same container, blocking access to jetty to prevent 
> WebHDFS transfers also blocks the other servlets running inside that same 
> jetty container.
> I am requesting a deny/allow feature be added to WebHDFS.  This is already 
> done with the Apache HTTPD server, and is what I'd like to see the deny/allow 
> list modeled after.   Thanks.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5558) LeaseManager monitor thread can crash if the last block is complete but another block is not.

2013-12-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836710#comment-13836710
 ] 

Colin Patrick McCabe commented on HDFS-5558:


Thanks for the explanation.  Using WARN here is fine with me.

> LeaseManager monitor thread can crash if the last block is complete but 
> another block is not.
> -
>
> Key: HDFS-5558
> URL: https://issues.apache.org/jira/browse/HDFS-5558
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.23.9, 2.3.0
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Attachments: HDFS-5558.branch-023.patch, HDFS-5558.patch
>
>
> As mentioned in HDFS-5557, if a file has its last and penultimate block not 
> completed and the file is being closed, the last block may be completed but 
> the penultimate one might not. If this condition lasts long and the file is 
> abandoned, LeaseManager will try to recover the lease and the block. But 
> {{internalReleaseLease()}} will fail with invalid cast exception with this 
> kind of file.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5586) Add quick-restart option for datanode

2013-12-02 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-5586:
-

Description: 
This feature, combined with the graceful shutdown feature, will enable data 
nodes to come back up and start serving quickly.  This is likely a command line 
option for data node, which triggers it to look for saved state information in 
its local storage.  If the information is present and reasonably up-to-date, 
data node would skip some of the startup steps.

Ideally it should be able to do quick registration without requiring removal of 
all blocks from the datanode descriptor on the name node and reconstructing it 
with the initial full block report. This implies that all RBW blocks are 
recorded during shutdown and on start-up they are not turned into RWR. Other 
than the quick registration, the name node should treat the restart as if a few 
heartbeats were lost from the node. There should be no unexpected replica state 
changes.

  was:
This feature, combined with the graceful shutdown feature, will enable data 
nodes to come back up and start serving quickly.  This is likely a command line 
option for data node, which triggers it to look for saved state information in 
its local storage.  If the information is present and reasonably up-to-date, 
data node would skip some of the startup steps.

Ideally it should be able to do quick registration without requiring removal of 
all blocks from the datanode descriptor on the name node and reconstructing it 
with the initial full block report. 


> Add quick-restart option for datanode
> -
>
> Key: HDFS-5586
> URL: https://issues.apache.org/jira/browse/HDFS-5586
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, ha, hdfs-client, namenode
>Reporter: Kihwal Lee
>
> This feature, combined with the graceful shutdown feature, will enable data 
> nodes to come back up and start serving quickly.  This is likely a command 
> line option for data node, which triggers it to look for saved state 
> information in its local storage.  If the information is present and 
> reasonably up-to-date, data node would skip some of the startup steps.
> Ideally it should be able to do quick registration without requiring removal 
> of all blocks from the datanode descriptor on the name node and 
> reconstructing it with the initial full block report. This implies that all 
> RBW blocks are recorded during shutdown and on start-up they are not turned 
> into RWR. Other than the quick registration, the name node should treat the 
> restart as if a few heartbeats were lost from the node. There should be no 
> unexpected replica state changes.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (HDFS-5586) Add quick-restart option for datanode

2013-12-02 Thread Kihwal Lee (JIRA)
Kihwal Lee created HDFS-5586:


 Summary: Add quick-restart option for datanode
 Key: HDFS-5586
 URL: https://issues.apache.org/jira/browse/HDFS-5586
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kihwal Lee


This feature, combined with the graceful shutdown feature, will enable data 
nodes to come back up and start serving quickly.  This is likely a command line 
option for data node, which triggers it to look for saved state information in 
its local storage.  If the information is present and reasonably up-to-date, 
data node would skip some of the startup steps.

Ideally it should be able to do quick registration without requiring removal of 
all blocks from the datanode descriptor on the name node and reconstructing it 
with the initial full block report. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5569) WebHDFS should support a deny/allow list for data access

2013-12-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836705#comment-13836705
 ] 

Colin Patrick McCabe commented on HDFS-5569:


Hi Adam,

In order for this to work, you need to distribute an appropriate configuration 
XML file to every DataNode and NameNode.  This seems to be no more or less 
complex than setting an appropriate {{iptables}} configuration file on every 
node.

I have mixed feelings about adding something to HDFS that can easily and simply 
be done at a lower level.  It seems like complexity we don't really need.

It's also worth noting that hostname-based security is not very secure.  
For example, what happens if a bad guy configures his PC to have the IP address 
of a good guy?  At that point, he has an allowed hostname and can access all 
the goodies.  If we add explicit support for this kind of security in Hadoop, 
people will think that we are endorsing it as secure, which it is manifestly 
not.

My recommendation, for what it's worth, is to set up Kerberos and appropriate 
firewalls, to achieve real security.

> WebHDFS should support a deny/allow list for data access
> 
>
> Key: HDFS-5569
> URL: https://issues.apache.org/jira/browse/HDFS-5569
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: webhdfs
>Reporter: Adam Faris
>  Labels: features
>
> Currently we can't restrict what networks are allowed to transfer data using 
> WebHDFS.  Obviously we can use firewalls to block ports, but this can be 
> complicated and problematic to maintain.  Additionally, because all the jetty 
> servlets run inside the same container, blocking access to jetty to prevent 
> WebHDFS transfers also blocks the other servlets running inside that same 
> jetty container.
> I am requesting a deny/allow feature be added to WebHDFS.  This is already 
> done with the Apache HTTPD server, and is what I'd like to see the deny/allow 
> list modeled after.   Thanks.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-2882) DN continues to start up, even if block pool fails to initialize

2013-12-02 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836650#comment-13836650
 ] 

Colin Patrick McCabe commented on HDFS-2882:


Sorry that I haven't had time to review this in the last week.  I've been busy. 
 If someone else wants to review it, I am fine with that.

> DN continues to start up, even if block pool fails to initialize
> 
>
> Key: HDFS-2882
> URL: https://issues.apache.org/jira/browse/HDFS-2882
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.0.2-alpha
>Reporter: Todd Lipcon
>Assignee: Vinay
> Attachments: HDFS-2882.patch, HDFS-2882.patch, HDFS-2882.patch, 
> HDFS-2882.patch, HDFS-2882.patch, HDFS-2882.patch, hdfs-2882.txt
>
>
> I started a DN on a machine that was completely out of space on one of its 
> drives. I saw the following:
> 2012-02-02 09:56:50,499 FATAL 
> org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for 
> block pool Block pool BP-448349972-172.29.5.192-1323816762969 (storage id 
> DS-507718931-172.29.5.194-11072-12978
> 42002148) service to styx01.sf.cloudera.com/172.29.5.192:8021
> java.io.IOException: Mkdirs failed to create 
> /data/1/scratch/todd/styx-datadir/current/BP-448349972-172.29.5.192-1323816762969/tmp
> at 
> org.apache.hadoop.hdfs.server.datanode.FSDataset$BlockPoolSlice.(FSDataset.java:335)
> but the DN continued to run, spewing NPEs when it tried to do block reports, 
> etc. This was on the HDFS-1623 branch but may affect trunk as well.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5583) Add OOB upgrade response and client-side logic for writes

2013-12-02 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-5583:
-

Issue Type: Sub-task  (was: Improvement)
Parent: HDFS-5535

> Add OOB upgrade response and client-side logic for writes
> -
>
> Key: HDFS-5583
> URL: https://issues.apache.org/jira/browse/HDFS-5583
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kihwal Lee
>
> Add an ability for data nodes to send an OOB response in order to indicate an 
> upcoming upgrade-restart. The client should ignore the pipeline error from the 
> node for a configured amount of time and try to reconstruct the pipeline without 
> excluding the restarted node.  If the node does not come back in time, 
> regular pipeline recovery should happen.
> This feature is useful for applications that need to keep blocks local. 
> If the upgrade-restart is fast, the wait is preferable to losing locality.  
> It could also be used in general instead of the draining-writer strategy.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5581) NameNodeFsck should use only one instance of BlockPlacementPolicy

2013-12-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836682#comment-13836682
 ] 

Hudson commented on HDFS-5581:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4811 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4811/])
move HDFS-5581 to 2.3 (cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1547094)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> NameNodeFsck should use only one instance of BlockPlacementPolicy
> -
>
> Key: HDFS-5581
> URL: https://issues.apache.org/jira/browse/HDFS-5581
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Vinay
>Assignee: Vinay
> Fix For: 2.3.0
>
> Attachments: HDFS-5581.patch, HDFS-5581.patch
>
>
> While going through NameNodeFsck I found that the following code creates a new 
> instance of BlockPlacementPolicy for every block.
> {code}  // verify block placement policy
>   BlockPlacementStatus blockPlacementStatus = 
>   BlockPlacementPolicy.getInstance(conf, null, networktopology).
>   verifyBlockPlacement(path, lBlk, targetFileReplication);{code}
> It would be better to use the namenode's BPP itself instead of creating a new 
> one.
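> A self-contained sketch of the fix direction (toy types, not the HDFS classes): 
> construct the placement policy once and reuse it for every block that fsck 
> verifies.
> {code}
> // Toy types only; in NamenodeFsck the equivalent is reusing one
> // BlockPlacementPolicy instance rather than calling getInstance() per block.
> class PlacementPolicySketch {
>   PlacementPolicySketch() {
>     System.out.println("constructed policy");
>   }
> 
>   boolean verify(String path, int blockIndex) {
>     return true;
>   }
> }
> 
> class FsckSketch {
>   // Created once for the whole fsck run.
>   private final PlacementPolicySketch policy = new PlacementPolicySketch();
> 
>   void checkBlocks(String path, int blockCount) {
>     for (int i = 0; i < blockCount; i++) {
>       // Before the change, the equivalent of "new PlacementPolicySketch()"
>       // ran here for every block.
>       policy.verify(path, i);
>     }
>   }
> 
>   public static void main(String[] args) {
>     new FsckSketch().checkBlocks("/some/file", 3); // prints "constructed policy" once
>   }
> }
> {code}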



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-4949) Centralized cache management in HDFS

2013-12-02 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836687#comment-13836687
 ] 

Chris Nauroth commented on HDFS-4949:
-

Hi, [~azuryy].  The merge vote passed and HDFS-4949 was merged to trunk about a 
month ago.  For example, here you can see the {{CacheManager}} class on trunk:

http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CacheManager.java


> Centralized cache management in HDFS
> 
>
> Key: HDFS-4949
> URL: https://issues.apache.org/jira/browse/HDFS-4949
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: 3.0.0, 2.3.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: HDFS-4949-consolidated.patch, 
> caching-design-doc-2013-07-02.pdf, caching-design-doc-2013-08-09.pdf, 
> caching-design-doc-2013-10-24.pdf, caching-testplan.pdf
>
>
> HDFS currently has no support for managing or exposing in-memory caches at 
> datanodes. This makes it harder for higher level application frameworks like 
> Hive, Pig, and Impala to effectively use cluster memory, because they cannot 
> explicitly cache important datasets or place their tasks for memory locality.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5581) NameNodeFsck should use only one instance of BlockPlacementPolicy

2013-12-02 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-5581:
---

Target Version/s: 2.3.0  (was: 2.2.1)
   Fix Version/s: (was: 2.2.1)
  2.3.0

> NameNodeFsck should use only one instance of BlockPlacementPolicy
> -
>
> Key: HDFS-5581
> URL: https://issues.apache.org/jira/browse/HDFS-5581
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Vinay
>Assignee: Vinay
> Fix For: 2.3.0
>
> Attachments: HDFS-5581.patch, HDFS-5581.patch
>
>
> While going through NameNodeFsck I found that the following code creates a new 
> instance of BlockPlacementPolicy for every block.
> {code}  // verify block placement policy
>   BlockPlacementStatus blockPlacementStatus = 
>   BlockPlacementPolicy.getInstance(conf, null, networktopology).
>   verifyBlockPlacement(path, lBlk, targetFileReplication);{code}
> It would be better to use the namenode's BPP itself instead of creating a new 
> one.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5581) NameNodeFsck should use only one instance of BlockPlacementPolicy

2013-12-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836668#comment-13836668
 ] 

Hudson commented on HDFS-5581:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4810 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4810/])
HDFS-5581. NameNodeFsck should use only one instance of BlockPlacementPolicy 
(vinay via cmccabe) (cmccabe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1547088)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java


> NameNodeFsck should use only one instance of BlockPlacementPolicy
> -
>
> Key: HDFS-5581
> URL: https://issues.apache.org/jira/browse/HDFS-5581
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Vinay
>Assignee: Vinay
> Fix For: 2.2.1
>
> Attachments: HDFS-5581.patch, HDFS-5581.patch
>
>
> While going through NameNodeFsck I found that the following code creates a new 
> instance of BlockPlacementPolicy for every block.
> {code}  // verify block placement policy
>   BlockPlacementStatus blockPlacementStatus = 
>   BlockPlacementPolicy.getInstance(conf, null, networktopology).
>   verifyBlockPlacement(path, lBlk, targetFileReplication);{code}
> It would be better to use the namenode's BPP itself instead of creating a new 
> one.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

