[jira] [Updated] (HDFS-6789) TestDFSClientFailover.testFileContextDoesntDnsResolveLogicalURI and TestDFSClientFailover.testDoesntDnsResolveLogicalURI failing on jdk7

2014-09-15 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HDFS-6789:

Target Version/s: 2.6.0  (was: 2.5.0)

> TestDFSClientFailover.testFileContextDoesntDnsResolveLogicalURI and 
> TestDFSClientFailover.testDoesntDnsResolveLogicalURI failing on jdk7
> 
>
> Key: HDFS-6789
> URL: https://issues.apache.org/jira/browse/HDFS-6789
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.5.0
> Environment: jdk7
>Reporter: Rushabh S Shah
>Assignee: Akira AJISAKA
> Attachments: HDFS-6789.patch
>
>
> The following two tests are failing on jdk7.
> org.apache.hadoop.hdfs.TestDFSClientFailover.testFileContextDoesntDnsResolveLogicalURI
> org.apache.hadoop.hdfs.TestDFSClientFailover.testDoesntDnsResolveLogicalURI
> On jdk6 the tests are just skipped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6826) Plugin interface to enable delegation of HDFS authorization assertions

2014-09-15 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134940#comment-14134940
 ] 

Jitendra Nath Pandey commented on HDFS-6826:


bq. The latest patch doesn't directly splice into the inodes, but for reference 
the reason it's bad is because replaying edits, whether at startup or HA 
standby, should not ever go through the plugin.
Replaying shouldn't be a problem as long as plugins implement the setters in an 
idempotent manner. I think we should add this to the javadoc for the interface.

bq. About the only way to ensure compatibility is for the permission checker to 
have a hook to call out to the plugin. The plugin cannot override the core 
behavior.
  The purpose of the plugin is to allow new permission models. Therefore, it 
must be able to override the permission checking as well. 

I think v7.6 provides more features and is more in line with the purpose of this 
JIRA.
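
To illustrate the idempotency point, here is a minimal sketch, not the actual 
HDFS-6826 interface: the class name, method signature and backing map below are 
hypothetical. The idea is that a setter which simply overwrites state can be 
replayed from the edit log any number of times.

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.hadoop.fs.permission.FsPermission;

// Hypothetical plugin class; names are illustrative only.
public class IdempotentAuthzPluginSketch {
  private final Map<String, FsPermission> authzInfo =
      new ConcurrentHashMap<String, FsPermission>();

  /**
   * Idempotent setter: replaying the same edit (at startup or on the HA
   * standby) overwrites the entry with the same value, so applying it once
   * or N times leaves the plugin in the same state.
   */
  public void setPermission(String path, FsPermission permission) {
    authzInfo.put(path, permission);
  }
}
{code}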
  


> Plugin interface to enable delegation of HDFS authorization assertions
> --
>
> Key: HDFS-6826
> URL: https://issues.apache.org/jira/browse/HDFS-6826
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.4.1
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFS-6826-idea.patch, HDFS-6826-idea2.patch, 
> HDFS-6826-permchecker.patch, HDFS-6826v3.patch, HDFS-6826v4.patch, 
> HDFS-6826v5.patch, HDFS-6826v6.patch, HDFS-6826v7.1.patch, 
> HDFS-6826v7.2.patch, HDFS-6826v7.3.patch, HDFS-6826v7.4.patch, 
> HDFS-6826v7.5.patch, HDFS-6826v7.6.patch, HDFS-6826v7.patch, 
> HDFS-6826v8.patch, HDFS-6826v9.patch, 
> HDFSPluggableAuthorizationProposal-v2.pdf, 
> HDFSPluggableAuthorizationProposal.pdf
>
>
> When HBase data, HiveMetaStore data, or Search data is accessed via services 
> (HBase region servers, HiveServer2, Impala, Solr), the services can enforce 
> permissions on the corresponding entities (databases, tables, views, columns, 
> search collections, documents). It is desirable, when the data is accessed 
> directly by users reading the underlying data files (e.g. from a MapReduce 
> job), that the permissions of the data files map to the permissions of the 
> corresponding data entity (e.g. table, column family or search collection).
> To enable this we need to have the necessary hooks in place in the NameNode 
> to delegate authorization to an external system that can map HDFS 
> files/directories to data entities and resolve their permissions based on the 
> data entities' permissions.
> I’ll be posting a design proposal in the next few days.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7064) Fix test failures

2014-09-15 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-7064:
-
Attachment: HDFS-7064.0.patch

Fix the following test issues:

org.apache.hadoop.hdfs.TestDFSShell.testGet
org.apache.hadoop.hdfs.TestDFSShell.testCopyCommandsWithForceOption
org.apache.hadoop.hdfs.TestDFSShell.testCopyToLocal
org.apache.hadoop.hdfs.server.datanode.TestDataDirs.testDataDirParsing
org.apache.hadoop.hdfs.server.datanode.TestDeleteBlockPool.testDeleteBlockPool

> Fix test failures
> -
>
> Key: HDFS-7064
> URL: https://issues.apache.org/jira/browse/HDFS-7064
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: HDFS-6581
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
> Attachments: HDFS-7064.0.patch
>
>
> Fix test failures in the HDFS-6581 feature branch.
> Jenkins flagged the following failures.
> https://builds.apache.org/job/PreCommit-HDFS-Build/8025//testReport/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6826) Plugin interface to enable delegation of HDFS authorization assertions

2014-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134862#comment-14134862
 ] 

Hadoop QA commented on HDFS-6826:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668911/HDFS-6826v9.patch
  against trunk revision 0ac760a.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.security.TestPermission
  org.apache.hadoop.hdfs.TestDFSShell
  org.apache.hadoop.hdfs.web.TestWebHDFSXAttr
  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestAclWithSnapshot
  org.apache.hadoop.security.TestPermissionSymlinks
  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotDeletion
  org.apache.hadoop.hdfs.server.namenode.TestFileContextXAttr
  org.apache.hadoop.hdfs.server.namenode.TestNameNodeXAttr
  org.apache.hadoop.hdfs.TestEncryptionZones
  
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
  org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshot

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8036//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8036//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8036//console

This message is automatically generated.

> Plugin interface to enable delegation of HDFS authorization assertions
> --
>
> Key: HDFS-6826
> URL: https://issues.apache.org/jira/browse/HDFS-6826
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.4.1
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFS-6826-idea.patch, HDFS-6826-idea2.patch, 
> HDFS-6826-permchecker.patch, HDFS-6826v3.patch, HDFS-6826v4.patch, 
> HDFS-6826v5.patch, HDFS-6826v6.patch, HDFS-6826v7.1.patch, 
> HDFS-6826v7.2.patch, HDFS-6826v7.3.patch, HDFS-6826v7.4.patch, 
> HDFS-6826v7.5.patch, HDFS-6826v7.6.patch, HDFS-6826v7.patch, 
> HDFS-6826v8.patch, HDFS-6826v9.patch, 
> HDFSPluggableAuthorizationProposal-v2.pdf, 
> HDFSPluggableAuthorizationProposal.pdf
>
>
> When HBase data, HiveMetaStore data, or Search data is accessed via services 
> (HBase region servers, HiveServer2, Impala, Solr), the services can enforce 
> permissions on the corresponding entities (databases, tables, views, columns, 
> search collections, documents). It is desirable, when the data is accessed 
> directly by users reading the underlying data files (e.g. from a MapReduce 
> job), that the permissions of the data files map to the permissions of the 
> corresponding data entity (e.g. table, column family or search collection).
> To enable this we need to have the necessary hooks in place in the NameNode 
> to delegate authorization to an external system that can map HDFS 
> files/directories to data entities and resolve their permissions based on the 
> data entities' permissions.
> I’ll be posting a design proposal in the next few days.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7001) Tests in TestTracing depends on the order of execution

2014-09-15 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134857#comment-14134857
 ] 

Akira AJISAKA commented on HDFS-7001:
-

Looks good to me, +1 (non-binding).

> Tests in TestTracing depends on the order of execution
> --
>
> Key: HDFS-7001
> URL: https://issues.apache.org/jira/browse/HDFS-7001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Minor
> Attachments: HDFS-7001-0.patch, HDFS-7001-1.patch
>
>
> o.a.h.tracing.TestTracing#testSpanReceiverHost is assumed to be executed 
> first. It should be done in BeforeClass.
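
For reference, a minimal sketch of the suggested restructuring: the shared 
setup moves into a {{@BeforeClass}} method so that no individual test has to 
run first. Class and method names here are illustrative, not the actual 
TestTracing code.

{code}
import org.junit.BeforeClass;
import org.junit.Test;

public class TestTracingSketch {
  @BeforeClass
  public static void setUpSpanReceiverHost() {
    // Start the span receiver host once for the whole class instead of
    // relying on one particular @Test method running first.
  }

  @Test
  public void testTracing() {
    // Each test can now assume the receiver host is already up.
  }
}
{code}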



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7001) Tests in TestTracing depends on the order of execution

2014-09-15 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HDFS-7001:

Target Version/s: 2.6.0

> Tests in TestTracing depends on the order of execution
> --
>
> Key: HDFS-7001
> URL: https://issues.apache.org/jira/browse/HDFS-7001
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Minor
> Attachments: HDFS-7001-0.patch, HDFS-7001-1.patch
>
>
> o.a.h.tracing.TestTracing#testSpanReceiverHost is assumed to be executed 
> first. It should be done in BeforeClass.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7067) ClassCastException while using a key created by keytool to create encryption zone.

2014-09-15 Thread Yi Yao (JIRA)
Yi Yao created HDFS-7067:


 Summary: ClassCastException while using a key created by keytool 
to create encryption zone. 
 Key: HDFS-7067
 URL: https://issues.apache.org/jira/browse/HDFS-7067
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: encryption
Affects Versions: 2.6.0
Reporter: Yi Yao
Priority: Minor
 Fix For: 2.6.0


I'm using transparent encryption. If I create a key in the KMS keystore via 
keytool and then use that key to create an encryption zone, I get a 
ClassCastException rather than an exception with a decent error message. I know 
we should use 'hadoop key create' to create a key, but it would be better to 
provide a decent error message reminding the user of the right way to create a 
KMS key.

[LOG]
ERROR[user=hdfs] Method:'GET' Exception:'java.lang.ClassCastException: 
javax.crypto.spec.SecretKeySpec cannot be cast to 
org.apache.hadoop.crypto.key.JavaKeyStoreProvider$KeyMetadata'



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6904) YARN unable to renew delegation token fetched via webhdfs due to incorrect service port

2014-09-15 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-6904:
---
Assignee: Haohui Mai

> YARN unable to renew delegation token fetched via webhdfs due to incorrect 
> service port
> ---
>
> Key: HDFS-6904
> URL: https://issues.apache.org/jira/browse/HDFS-6904
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: Varun Vasudev
>Assignee: Haohui Mai
>Priority: Critical
>
> YARN is unable to renew delegation tokens obtained via the WebHDFS REST API. 
> The scenario is as follows -
> 1. User creates a delegation token using the WebHDFS REST API
> 2. User passes this token to YARN as part of app submission (via the YARN REST 
> API)
> 3. When YARN tries to renew this delegation token, it fails because the token 
> service is pointing to the RPC port but the token kind is WebHDFS.
> The exception is
> {noformat}
> 2014-08-19 03:12:54,733 WARN  security.DelegationTokenRenewer 
> (DelegationTokenRenewer.java:handleDTRenewerAppSubmitEvent(661)) - Unable to 
> add the application to the delegation token renewer.
> java.io.IOException: Failed to renew token: Kind: WEBHDFS delegation, 
> Service: NameNodeIP:8020, Ident: (WEBHDFS delegation token  for hrt_qa)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:394)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$5(DelegationTokenRenewer.java:357)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:657)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:638)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Unexpected HTTP response: code=-1 != 200, 
> op=RENEWDELEGATIONTOKEN, message=null
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:331)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:90)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:598)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:448)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:477)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:473)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.renewDelegationToken(WebHdfsFileSystem.java:1318)
> at 
> org.apache.hadoop.hdfs.web.TokenAspect$TokenManager.renew(TokenAspect.java:73)
> at org.apache.hadoop.security.token.Token.renew(Token.java:377)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:477)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:1)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.renewToken(DelegationTokenRenewer.java:473)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:392)
> ... 6 more
> Caused by: java.io.IOException: The error stream is null.
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.jsonParse(WebHdfsFileSystem.java:304)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:329)
> ... 24 more
> 2014-08-19 03:12:54,735 DEBUG event.AsyncDispatcher 
> (AsyncDispatcher.java:dispatch(164)) - Dispatching the event 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppRejectedEvent.EventType:
>  APP_REJECTED
> {

[jira] [Updated] (HDFS-6904) YARN unable to renew delegation token fetched via webhdfs due to incorrect service port

2014-09-15 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-6904:
--
Priority: Critical  (was: Major)
Target Version/s: 2.6.0

This is important for 2.6, given that we are trying to get the YARN web services 
stable for production usage.
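
For reference, a minimal sketch of how the mismatch described in this issue 
surfaces when a delegation token is fetched over WebHDFS. This assumes a 
Kerberized cluster; the hostnames, ports and renewer name are illustrative only.

{code}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.token.Token;

public class WebHdfsTokenSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(URI.create("webhdfs://namenode:50070"), conf);

    // With the bug, the kind says WebHDFS but the service points at the
    // NameNode RPC port (e.g. namenode:8020), so the RM-side renewer ends up
    // speaking the wrong protocol to the wrong port.
    Token<?> token = fs.getDelegationToken("yarn");
    System.out.println("kind=" + token.getKind() + ", service=" + token.getService());
  }
}
{code}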

> YARN unable to renew delegation token fetched via webhdfs due to incorrect 
> service port
> ---
>
> Key: HDFS-6904
> URL: https://issues.apache.org/jira/browse/HDFS-6904
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Reporter: Varun Vasudev
>Priority: Critical
>
> YARN is unable to renew delegation tokens obtained via the WebHDFS REST API. 
> The scenario is as follows -
> 1. User creates a delegation token using the WebHDFS REST API
> 2. User passes this token to YARN as part of app submission (via the YARN REST 
> API)
> 3. When YARN tries to renew this delegation token, it fails because the token 
> service is pointing to the RPC port but the token kind is WebHDFS.
> The exception is
> {noformat}
> 2014-08-19 03:12:54,733 WARN  security.DelegationTokenRenewer 
> (DelegationTokenRenewer.java:handleDTRenewerAppSubmitEvent(661)) - Unable to 
> add the application to the delegation token renewer.
> java.io.IOException: Failed to renew token: Kind: WEBHDFS delegation, 
> Service: NameNodeIP:8020, Ident: (WEBHDFS delegation token  for hrt_qa)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:394)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$5(DelegationTokenRenewer.java:357)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:657)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:638)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Unexpected HTTP response: code=-1 != 200, 
> op=RENEWDELEGATIONTOKEN, message=null
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:331)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:90)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:598)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:448)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:477)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:473)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.renewDelegationToken(WebHdfsFileSystem.java:1318)
> at 
> org.apache.hadoop.hdfs.web.TokenAspect$TokenManager.renew(TokenAspect.java:73)
> at org.apache.hadoop.security.token.Token.renew(Token.java:377)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:477)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:1)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.renewToken(DelegationTokenRenewer.java:473)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:392)
> ... 6 more
> Caused by: java.io.IOException: The error stream is null.
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.jsonParse(WebHdfsFileSystem.java:304)
> at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:329)
> ... 24 more
> 2014-08-19 03:12:54,735 DEBUG event.AsyncDispatcher 
> (AsyncDispatcher.java:dispatch(164)

[jira] [Commented] (HDFS-6971) Bounded staleness of EDEK caches on the NN

2014-09-15 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134735#comment-14134735
 ] 

Zhe Zhang commented on HDFS-6971:
-

{{currentKeyCache}} in {{CachingKeyProvider}} is already created with 
{{expireAfterAccess}}. Is this JIRA about another cache? Regardless, a flush 
function is still needed.
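
As a reference for the time-bound-plus-explicit-flush combination being 
discussed, a minimal sketch using Guava. The key/value types and the 10-minute 
bound are illustrative, not the actual CachingKeyProvider configuration.

{code}
import java.util.concurrent.TimeUnit;

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

public final class BoundedEdekCacheSketch {
  private final Cache<String, byte[]> edekCache = CacheBuilder.newBuilder()
      .expireAfterAccess(10, TimeUnit.MINUTES)  // entries expire after 10 minutes without access
      .build();

  /** An explicit "flush" simply drops every cached entry. */
  public void flush() {
    edekCache.invalidateAll();
  }
}
{code}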

> Bounded staleness of EDEK caches on the NN
> --
>
> Key: HDFS-6971
> URL: https://issues.apache.org/jira/browse/HDFS-6971
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: encryption
>Affects Versions: 2.5.0
>Reporter: Andrew Wang
>Assignee: Zhe Zhang
>
> The EDEK cache on the NN can hold onto keys after the admin has rolled the 
> key. It'd be good to time-bound the caches, perhaps also providing an 
> explicit "flush" command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6826) Plugin interface to enable delegation of HDFS authorization assertions

2014-09-15 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated HDFS-6826:
-
Attachment: HDFS-6826v9.patch

Attached (v9) is a functioning patch based on [~daryn]'s idea. It requires a 
little refining.

A few comments on the approach:

* Instead of being invasive on the {{INode}} classes, it is invasive in the 
{{PermissionChecker}} and in the {{FSDirectory}}. 
* It does not allow replacing the permission-check logic (v7.x does); this could 
be done in a follow-up JIRA.
* It does not allow replacing the authz-info setter logic (v7.x does).
* It seems to be more efficient in how often it gets called and in how the 
full path can be inferred by the plugin.
* Talking with Daryn, he suggested doing some changes in the {{FSDirectory}} 
for creating file status by passing. I'm deferring this to the refactoring he 
wants to do, as it is not that trivial.

Refining required:

* How to handle setting of authz info, meaning fully ignoring setter calls for 
a path managed by a plugin; otherwise the setters take effect. The plugin should 
be able to prevent those setter calls from happening.
* The subAccess check is a TODO; the stack logic there needs to carry more info 
for the plugin (but we don't want to do that if there is no plugin). 

Both the v7.6 and the v9 plugins provide the functionality needed for 
HMS/Sentry. I'm OK either way.

Could others weigh in on the preferred approach? I would like to get closure on 
this ASAP so that Sentry can start building on top of it.
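
To make the trade-off concrete, a minimal sketch of the "hook in the permission 
checker" shape being discussed. AuthorizationPlugin, isManaged() and 
checkAccess() are hypothetical names, not the interface in any of the attached 
patches.

{code}
import org.apache.hadoop.fs.permission.FsAction;
import org.apache.hadoop.security.AccessControlException;

interface AuthorizationPlugin {
  /** Does the plugin own this path/subtree? */
  boolean isManaged(String path);

  void checkAccess(String path, String user, FsAction access)
      throws AccessControlException;
}

class PermissionCheckerHookSketch {
  private final AuthorizationPlugin plugin;  // null when no plugin is configured

  PermissionCheckerHookSketch(AuthorizationPlugin plugin) {
    this.plugin = plugin;
  }

  void check(String path, String user, FsAction access)
      throws AccessControlException {
    if (plugin != null && plugin.isManaged(path)) {
      // Delegate: the plugin applies its own permission model for this path.
      plugin.checkAccess(path, user, access);
      return;
    }
    // Otherwise fall through to the default HDFS permission check (elided).
  }
}
{code}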




> Plugin interface to enable delegation of HDFS authorization assertions
> --
>
> Key: HDFS-6826
> URL: https://issues.apache.org/jira/browse/HDFS-6826
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.4.1
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFS-6826-idea.patch, HDFS-6826-idea2.patch, 
> HDFS-6826-permchecker.patch, HDFS-6826v3.patch, HDFS-6826v4.patch, 
> HDFS-6826v5.patch, HDFS-6826v6.patch, HDFS-6826v7.1.patch, 
> HDFS-6826v7.2.patch, HDFS-6826v7.3.patch, HDFS-6826v7.4.patch, 
> HDFS-6826v7.5.patch, HDFS-6826v7.6.patch, HDFS-6826v7.patch, 
> HDFS-6826v8.patch, HDFS-6826v9.patch, 
> HDFSPluggableAuthorizationProposal-v2.pdf, 
> HDFSPluggableAuthorizationProposal.pdf
>
>
> When HBase data, HiveMetaStore data, or Search data is accessed via services 
> (HBase region servers, HiveServer2, Impala, Solr), the services can enforce 
> permissions on the corresponding entities (databases, tables, views, columns, 
> search collections, documents). It is desirable, when the data is accessed 
> directly by users reading the underlying data files (e.g. from a MapReduce 
> job), that the permissions of the data files map to the permissions of the 
> corresponding data entity (e.g. table, column family or search collection).
> To enable this we need to have the necessary hooks in place in the NameNode 
> to delegate authorization to an external system that can map HDFS 
> files/directories to data entities and resolve their permissions based on the 
> data entities' permissions.
> I’ll be posting a design proposal in the next few days.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6843) Create FileStatus isEncrypted() method

2014-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134719#comment-14134719
 ] 

Hadoop QA commented on HDFS-6843:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668862/HDFS-6843.008.patch
  against trunk revision 88e329f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.fs.contract.localfs.TestLocalFSContractOpen
  
org.apache.hadoop.fs.contract.rawlocal.TestRawlocalContractOpen
  
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
  org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8035//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8035//console

This message is automatically generated.

> Create FileStatus isEncrypted() method
> --
>
> Key: HDFS-6843
> URL: https://issues.apache.org/jira/browse/HDFS-6843
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode, security
>Affects Versions: 3.0.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Attachments: HDFS-6843.001.patch, HDFS-6843.002.patch, 
> HDFS-6843.003.patch, HDFS-6843.004.patch, HDFS-6843.005.patch, 
> HDFS-6843.005.patch, HDFS-6843.006.patch, HDFS-6843.007.patch, 
> HDFS-6843.008.patch
>
>
> FileStatus should have a 'boolean isEncrypted()' method. (It was in the 
> context of discussing with Andrew about FileStatus being a Writable.)
> Having this method would allow the MR JobSubmitter to do the following:
> -
> BOOLEAN intermediateEncryption = false
> IF jobconf.contains("mr.intermidate.encryption") THEN
>   intermediateEncryption = jobConf.getBoolean("mr.intermidate.encryption")
> ELSE
>   IF (I/O)Format INSTANCEOF File(I/O)Format THEN
> intermediateEncryption = ANY File(I/O)Format HAS a Path with status 
> isEncrypted()==TRUE
>   FI
>   jobConf.setBoolean("mr.intermidate.encryption", intermediateEncryption)
> FI
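
A rough Java rendering of the pseudocode above. The config key 
"mr.intermidate.encryption" is copied verbatim from the description, and the 
input/output paths are assumed to be passed in by the caller; the real 
JobSubmitter wiring (and the File(I/O)Format check) is not shown.

{code}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;

public final class IntermediateEncryptionCheck {
  public static boolean resolve(Configuration jobConf, Path... ioPaths)
      throws IOException {
    if (jobConf.get("mr.intermidate.encryption") != null) {
      return jobConf.getBoolean("mr.intermidate.encryption", false);
    }
    boolean encrypted = false;
    for (Path p : ioPaths) {
      FileStatus status = p.getFileSystem(jobConf).getFileStatus(p);
      if (status.isEncrypted()) {  // the method this JIRA adds
        encrypted = true;
        break;
      }
    }
    jobConf.setBoolean("mr.intermidate.encryption", encrypted);
    return encrypted;
  }
}
{code}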



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6851) Flush EncryptionZoneWithId and add an id field to EncryptionZone

2014-09-15 Thread Charles Lamb (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134689#comment-14134689
 ] 

Charles Lamb commented on HDFS-6851:


TestWebHdfsFileSystemContract fails on my local machine both with and without 
the patch applied. TestPipelinesFailover passes on my local machine both with 
and without the patch applied.



> Flush EncryptionZoneWithId and add an id field to EncryptionZone
> 
>
> Key: HDFS-6851
> URL: https://issues.apache.org/jira/browse/HDFS-6851
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode, security
>Affects Versions: 2.6.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Attachments: HDFS-6851-prelim.001.patch, HDFS-6851.000.patch, 
> HDFS-6851.002.patch
>
>
> EncryptionZoneWithId can be flushed by moving the id field up to 
> EncryptionZone.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6912) SharedFileDescriptorFactory should not allocate sparse files

2014-09-15 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134671#comment-14134671
 ] 

Gopal V commented on HDFS-6912:
---

Thanks [~cmccabe] & [~andrew.wang], will pull this into my branches.

> SharedFileDescriptorFactory should not allocate sparse files
> 
>
> Key: HDFS-6912
> URL: https://issues.apache.org/jira/browse/HDFS-6912
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: caching
>Affects Versions: 2.5.0
> Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm
>Reporter: Gopal V
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Fix For: 2.6.0
>
> Attachments: HDFS-6912.001.patch, HDFS-6912.002.patch, 
> HDFS-6912.003.patch
>
>
> SharedFileDescriptorFactory should not allocate sparse files.  Sparse files 
> can lead to a SIGBUS later in the short-circuit reader when we try to read 
> from the sparse file and memory is not available.
> Note that if swap is enabled, we can still get a SIGBUS even with a 
> non-sparse file, since the JVM uses MAP_NORESERVE in mmap.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7065) Pipeline close recovery race can cause block corruption

2014-09-15 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134664#comment-14134664
 ] 

Kihwal Lee commented on HDFS-7065:
--

The pre-commit build is not working as expected.  There is no 
{{TestStorageMover}} in the source tree!  It is only in the HDFS-6584 branch.  
It looks like a leftover from a previous build attempt. 

> Pipeline close recovery race can cause block corruption
> ---
>
> Key: HDFS-7065
> URL: https://issues.apache.org/jira/browse/HDFS-7065
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.5.0
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Attachments: HDFS-7065.patch
>
>
> If multiple pipeline close recoveries are performed against the same block, 
> the replica may go corrupt.  Here is one case I have observed:
> The client tried to close a block, but the ACK timed out.  It excluded the 
> first DN and tried pipeline recovery (recoverClose). It too failed and 
> another recovery was attempted with only one DN.  This took longer than usual, 
> but the client eventually got an ACK and the file was closed successfully.  
> Later on, the one and only replica was found to be corrupt.
> It turned out the DN was having a transient slow disk I/O issue at that time. 
> The first recovery was stuck until the second recovery was attempted 30 
> seconds later.  After a few seconds, both threads started running. The 
> second recovery finished first and then the first recovery, with an older gen 
> stamp, finished, turning the gen stamp backward.
> There is a sanity check in {{recoverCheck()}}, but since check and modify are 
> not synchronized, {{recoverClose()}} is not multi-thread safe.
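
A generic illustration of that check-then-act race (not the actual 
FsDatasetImpl code; the class and fields below are illustrative):

{code}
class ReplicaRecordSketch {
  private long genStamp;

  // Unsafe: the check and the update are separate steps, so two concurrent
  // recoveries can both pass the check and the one carrying the older
  // generation stamp can write last, moving the genstamp backward.
  void recoverCloseUnsafe(long newGenStamp) {
    if (newGenStamp > genStamp) {   // check
      genStamp = newGenStamp;       // modify
    }
  }

  // Safe: the check-then-act sequence is atomic under the object lock.
  synchronized void recoverCloseSafe(long newGenStamp) {
    if (newGenStamp > genStamp) {
      genStamp = newGenStamp;
    }
  }
}
{code}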



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6851) Flush EncryptionZoneWithId and add an id field to EncryptionZone

2014-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134614#comment-14134614
 ] 

Hadoop QA commented on HDFS-6851:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668837/HDFS-6851.002.patch
  against trunk revision 9d4ec97.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract
  
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8033//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8033//console

This message is automatically generated.

> Flush EncryptionZoneWithId and add an id field to EncryptionZone
> 
>
> Key: HDFS-6851
> URL: https://issues.apache.org/jira/browse/HDFS-6851
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode, security
>Affects Versions: 2.6.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Attachments: HDFS-6851-prelim.001.patch, HDFS-6851.000.patch, 
> HDFS-6851.002.patch
>
>
> EncryptionZoneWithId can be flushed by moving the id field up to 
> EncryptionZone.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7066) LazyWriter#evictBlocks misses a null check for replicaState

2014-09-15 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-7066:

Issue Type: Sub-task  (was: Bug)
Parent: HDFS-6581

> LazyWriter#evictBlocks misses a null check for replicaState
> ---
>
> Key: HDFS-7066
> URL: https://issues.apache.org/jira/browse/HDFS-7066
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: HDFS-6581
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Minor
> Fix For: HDFS-6581
>
> Attachments: HDFS-7066.0.patch
>
>
> LazyWriter#evictBlocks (added for HDFS-6581) misses a null check for 
> replicaState. As a result, there are many NPEs in the debug log under certain 
> conditions. 
> {code}
> 2014-09-15 14:27:10,820 DEBUG impl.FsDatasetImpl 
> (FsDatasetImpl.java:evictBlocks(2335)) - Evicting block null
> 2014-09-15 14:27:10,821 WARN  impl.FsDatasetImpl 
> (FsDatasetImpl.java:run(2409)) - Ignoring exception in LazyWriter:
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl$LazyWriter.evictBlocks(FsDatasetImpl.java:2343)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl$LazyWriter.run(FsDatasetImpl.java:2396)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> The proposed fix is to break if there is no candidate available to evict.
> {code}
>   while (iterations++ < MAX_BLOCK_EVICTIONS_PER_ITERATION &&
>       transientFreeSpaceBelowThreshold()) {
>     LazyWriteReplicaTracker.ReplicaState replicaState =
>         lazyWriteReplicaTracker.getNextCandidateForEviction();
>
>     if (replicaState == null) {
>       break;
>     }
>
>     if (LOG.isDebugEnabled()) {
>       LOG.debug("Evicting block " + replicaState);
>     }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-7066) LazyWriter#evictBlocks misses a null check for replicaState

2014-09-15 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal resolved HDFS-7066.
-
   Resolution: Fixed
Fix Version/s: HDFS-6581
 Hadoop Flags: Reviewed

+1 I committed this to the feature branch.

Thanks for fixing this Xiaoyu!

> LazyWriter#evictBlocks misses a null check for replicaState
> ---
>
> Key: HDFS-7066
> URL: https://issues.apache.org/jira/browse/HDFS-7066
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: HDFS-6581
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Minor
> Fix For: HDFS-6581
>
> Attachments: HDFS-7066.0.patch
>
>
> LazyWriter#evictBlocks (added for HDFS-6581) misses a null check for 
> replicaState. As a result, there are many NPEs in the debug log under certain 
> conditions. 
> {code}
> 2014-09-15 14:27:10,820 DEBUG impl.FsDatasetImpl 
> (FsDatasetImpl.java:evictBlocks(2335)) - Evicting block null
> 2014-09-15 14:27:10,821 WARN  impl.FsDatasetImpl 
> (FsDatasetImpl.java:run(2409)) - Ignoring exception in LazyWriter:
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl$LazyWriter.evictBlocks(FsDatasetImpl.java:2343)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl$LazyWriter.run(FsDatasetImpl.java:2396)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> The proposed fix is to break if there is no candidate available to evict.
> {code}
>   while (iterations++ < MAX_BLOCK_EVICTIONS_PER_ITERATION &&
>       transientFreeSpaceBelowThreshold()) {
>     LazyWriteReplicaTracker.ReplicaState replicaState =
>         lazyWriteReplicaTracker.getNextCandidateForEviction();
>
>     if (replicaState == null) {
>       break;
>     }
>
>     if (LOG.isDebugEnabled()) {
>       LOG.debug("Evicting block " + replicaState);
>     }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7066) LazyWriter#evictBlocks misses a null check for replicaState

2014-09-15 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-7066:

Affects Version/s: (was: 2.5.1)
   HDFS-6581

> LazyWriter#evictBlocks misses a null check for replicaState
> ---
>
> Key: HDFS-7066
> URL: https://issues.apache.org/jira/browse/HDFS-7066
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: HDFS-6581
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Minor
> Fix For: HDFS-6581
>
> Attachments: HDFS-7066.0.patch
>
>
> LazyWriter#evictBlocks (added for HDFS-6581) misses a null check for 
> replicaState. As a result, there are many NPEs in the debug log under certain 
> conditions. 
> {code}
> 2014-09-15 14:27:10,820 DEBUG impl.FsDatasetImpl 
> (FsDatasetImpl.java:evictBlocks(2335)) - Evicting block null
> 2014-09-15 14:27:10,821 WARN  impl.FsDatasetImpl 
> (FsDatasetImpl.java:run(2409)) - Ignoring exception in LazyWriter:
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl$LazyWriter.evictBlocks(FsDatasetImpl.java:2343)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl$LazyWriter.run(FsDatasetImpl.java:2396)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> The proposed fix is to break if there is no candidate available to evict.
> {code}
>   while (iterations++ < MAX_BLOCK_EVICTIONS_PER_ITERATION &&
>       transientFreeSpaceBelowThreshold()) {
>     LazyWriteReplicaTracker.ReplicaState replicaState =
>         lazyWriteReplicaTracker.getNextCandidateForEviction();
>
>     if (replicaState == null) {
>       break;
>     }
>
>     if (LOG.isDebugEnabled()) {
>       LOG.debug("Evicting block " + replicaState);
>     }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7065) Pipeline close recovery race can cause block corruption

2014-09-15 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134578#comment-14134578
 ] 

Kihwal Lee commented on HDFS-7065:
--

Precommit did not report which test cases failed. Here is the list from the 
build log.
{panel}
Failed tests: 
  TestOfflineEditsViewer.testStored:167 Edits 
/home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/hadoop-hdfs-project/hadoop-hdfs/target/test-classes/editsStored
 should have all op codes
  TestBlockManager.testUseDelHint:616 null
  TestStorageMover.testNoSpaceArchive:706 null
  TestStorageMover.testNoSpaceDisk:625 null

Tests in error: 
  TestPipelinesFailover.testPipelineRecoveryStress:485 ? Runtime Deferred
{panel}

> Pipeline close recovery race can cause block corruption
> ---
>
> Key: HDFS-7065
> URL: https://issues.apache.org/jira/browse/HDFS-7065
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.5.0
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Attachments: HDFS-7065.patch
>
>
> If multiple pipeline close recoveries are performed against the same block, 
> the replica may go corrupt.  Here is one case I have observed:
> The client tried to close a block, but the ACK timed out.  It excluded the 
> first DN and tried pipeline recovery (recoverClose). It too failed and 
> another recovery was attempted with only one DN.  This took longer than usual, 
> but the client eventually got an ACK and the file was closed successfully.  
> Later on, the one and only replica was found to be corrupt.
> It turned out the DN was having a transient slow disk I/O issue at that time. 
> The first recovery was stuck until the second recovery was attempted 30 
> seconds later.  After a few seconds, both threads started running. The 
> second recovery finished first and then the first recovery, with an older gen 
> stamp, finished, turning the gen stamp backward.
> There is a sanity check in {{recoverCheck()}}, but since check and modify are 
> not synchronized, {{recoverClose()}} is not multi-thread safe.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-7065) Pipeline close recovery race can cause block corruption

2014-09-15 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee reassigned HDFS-7065:


Assignee: Kihwal Lee

> Pipeline close recovery race can cause block corruption
> ---
>
> Key: HDFS-7065
> URL: https://issues.apache.org/jira/browse/HDFS-7065
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.5.0
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Critical
> Attachments: HDFS-7065.patch
>
>
> If multiple pipeline close recoveries are performed against the same block, 
> the replica may go corrupt.  Here is one case I have observed:
> The client tried to close a block, but the ACK timed out.  It excluded the 
> first DN and tried pipeline recovery (recoverClose). It too failed and 
> another recovery was attempted with only one DN.  This took longer than usual, 
> but the client eventually got an ACK and the file was closed successfully.  
> Later on, the one and only replica was found to be corrupt.
> It turned out the DN was having a transient slow disk I/O issue at that time. 
> The first recovery was stuck until the second recovery was attempted 30 
> seconds later.  After a few seconds, both threads started running. The 
> second recovery finished first and then the first recovery, with an older gen 
> stamp, finished, turning the gen stamp backward.
> There is a sanity check in {{recoverCheck()}}, but since check and modify are 
> not synchronized, {{recoverClose()}} is not multi-thread safe.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7066) LazyWriter#evictBlocks misses a null check for replicaState

2014-09-15 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-7066:
-
Description: 
LazyWriter#evictBlocks (added for HDFS-6581) misses a null check for 
replicaState. As a result, there are many NPEs in the debug log under certain 
conditions. 

{code}
2014-09-15 14:27:10,820 DEBUG impl.FsDatasetImpl 
(FsDatasetImpl.java:evictBlocks(2335)) - Evicting block null
2014-09-15 14:27:10,821 WARN  impl.FsDatasetImpl (FsDatasetImpl.java:run(2409)) 
- Ignoring exception in LazyWriter:
java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl$LazyWriter.evictBlocks(FsDatasetImpl.java:2343)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl$LazyWriter.run(FsDatasetImpl.java:2396)
at java.lang.Thread.run(Thread.java:745)
{code}

The proposed fix is to break if there is no candidate available to evict.

{code}

while (iterations++ < MAX_BLOCK_EVICTIONS_PER_ITERATION &&
    transientFreeSpaceBelowThreshold()) {
  LazyWriteReplicaTracker.ReplicaState replicaState =
      lazyWriteReplicaTracker.getNextCandidateForEviction();

  if (replicaState == null) {
    break;
  }

  if (LOG.isDebugEnabled()) {
    LOG.debug("Evicting block " + replicaState);
  }

{code}

  was:
LazyWriter#evictBlocks (added for HDFS-6581) misses a null check for 
replicaState. As a result, there are many NPEs in the debug log under certain 
conditions. 

2014-09-15 14:27:10,820 DEBUG impl.FsDatasetImpl 
(FsDatasetImpl.java:evictBlocks(2335)) - Evicting block null
2014-09-15 14:27:10,821 WARN  impl.FsDatasetImpl (FsDatasetImpl.java:run(2409)) 
- Ignoring exception in LazyWriter:
java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl$LazyWriter.evictBlocks(FsDatasetImpl.java:2343)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl$LazyWriter.run(FsDatasetImpl.java:2396)
at java.lang.Thread.run(Thread.java:745)

The proposed fix is to break if there is no candidate available to evict.

{code}

while (iterations++ < MAX_BLOCK_EVICTIONS_PER_ITERATION &&
    transientFreeSpaceBelowThreshold()) {
  LazyWriteReplicaTracker.ReplicaState replicaState =
      lazyWriteReplicaTracker.getNextCandidateForEviction();

  if (replicaState == null) {
    break;
  }

  if (LOG.isDebugEnabled()) {
    LOG.debug("Evicting block " + replicaState);
  }

{code}


> LazyWriter#evictBlocks misses a null check for replicaState
> ---
>
> Key: HDFS-7066
> URL: https://issues.apache.org/jira/browse/HDFS-7066
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.5.1
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Minor
> Attachments: HDFS-7066.0.patch
>
>
> LazyWriter#evictBlocks (added for HDFS-6581) misses a null check for 
> replicaState. As a result, there are many NPEs in the debug log under certain 
> conditions. 
> {code}
> 2014-09-15 14:27:10,820 DEBUG impl.FsDatasetImpl 
> (FsDatasetImpl.java:evictBlocks(2335)) - Evicting block null
> 2014-09-15 14:27:10,821 WARN  impl.FsDatasetImpl 
> (FsDatasetImpl.java:run(2409)) - Ignoring exception in LazyWriter:
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl$LazyWriter.evictBlocks(FsDatasetImpl.java:2343)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl$LazyWriter.run(FsDatasetImpl.java:2396)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> The proposed fix is to break if there is no candidate available to evict.
> {code}
>   while (iterations++ < MAX_BLOCK_EVICTIONS_PER_ITERATION &&
>       transientFreeSpaceBelowThreshold()) {
>     LazyWriteReplicaTracker.ReplicaState replicaState =
>         lazyWriteReplicaTracker.getNextCandidateForEviction();
>
>     if (replicaState == null) {
>       break;
>     }
>
>     if (LOG.isDebugEnabled()) {
>       LOG.debug("Evicting block " + replicaState);
>     }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7066) LazyWriter#evictBlocks misses a null check for replicaState

2014-09-15 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-7066:
-
Attachment: HDFS-7066.0.patch

> LazyWriter#evictBlocks misses a null check for replicaState
> ---
>
> Key: HDFS-7066
> URL: https://issues.apache.org/jira/browse/HDFS-7066
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.5.1
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Minor
> Attachments: HDFS-7066.0.patch
>
>
> LazyWriter#evictBlocks (added for HDFS-6581) misses a null check for 
> replicaState. As a result, there are many NPEs in the debug log under certain 
> conditions. 
> {code}
> 2014-09-15 14:27:10,820 DEBUG impl.FsDatasetImpl 
> (FsDatasetImpl.java:evictBlocks(2335)) - Evicting block null
> 2014-09-15 14:27:10,821 WARN  impl.FsDatasetImpl 
> (FsDatasetImpl.java:run(2409)) - Ignoring exception in LazyWriter:
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl$LazyWriter.evictBlocks(FsDatasetImpl.java:2343)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl$LazyWriter.run(FsDatasetImpl.java:2396)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> The proposed fix is to break if there is no candidate available to evict.
> {code}
>   while (iterations++ < MAX_BLOCK_EVICTIONS_PER_ITERATION &&
>       transientFreeSpaceBelowThreshold()) {
>     LazyWriteReplicaTracker.ReplicaState replicaState =
>         lazyWriteReplicaTracker.getNextCandidateForEviction();
>
>     if (replicaState == null) {
>       break;
>     }
>
>     if (LOG.isDebugEnabled()) {
>       LOG.debug("Evicting block " + replicaState);
>     }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7066) LazyWriter#evictBlocks misses a null check for replicaState

2014-09-15 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-7066:
-
Description: 
LazyWriter#evictBlocks (added for HDFS-6581) misses a null check for 
replicaState. As a result, there are many NPEs in the debug log under certain 
conditions. 

2014-09-15 14:27:10,820 DEBUG impl.FsDatasetImpl 
(FsDatasetImpl.java:evictBlocks(2335)) - Evicting block null
2014-09-15 14:27:10,821 WARN  impl.FsDatasetImpl (FsDatasetImpl.java:run(2409)) 
- Ignoring exception in LazyWriter:
java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl$LazyWriter.evictBlocks(FsDatasetImpl.java:2343)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl$LazyWriter.run(FsDatasetImpl.java:2396)
at java.lang.Thread.run(Thread.java:745)

The proposed fix is to break if there is no candidate available to evict.

{code}

while (iterations++ < MAX_BLOCK_EVICTIONS_PER_ITERATION &&
    transientFreeSpaceBelowThreshold()) {
  LazyWriteReplicaTracker.ReplicaState replicaState =
      lazyWriteReplicaTracker.getNextCandidateForEviction();

  if (replicaState == null) {
    break;
  }

  if (LOG.isDebugEnabled()) {
    LOG.debug("Evicting block " + replicaState);
  }

{code}

  was:
LazyWriter#evictBlocks (added for HDFS-6581) misses a null check for 
replicaState. As a result, there are many NPEs in the debug log. 

2014-09-15 14:27:10,820 DEBUG impl.FsDatasetImpl 
(FsDatasetImpl.java:evictBlocks(2335)) - Evicting block null
2014-09-15 14:27:10,821 WARN  impl.FsDatasetImpl (FsDatasetImpl.java:run(2409)) 
- Ignoring exception in LazyWriter:
java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl$LazyWriter.evictBlocks(FsDatasetImpl.java:2343)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl$LazyWriter.run(FsDatasetImpl.java:2396)
at java.lang.Thread.run(Thread.java:745)

The proposed fix is to break if there is no candidate available to evict.

{code}

while (iterations++ < MAX_BLOCK_EVICTIONS_PER_ITERATION &&
    transientFreeSpaceBelowThreshold()) {
  LazyWriteReplicaTracker.ReplicaState replicaState =
      lazyWriteReplicaTracker.getNextCandidateForEviction();

  if (replicaState == null) {
    break;
  }

  if (LOG.isDebugEnabled()) {
    LOG.debug("Evicting block " + replicaState);
  }

{code}


> LazyWriter#evictBlocks misses a null check for replicaState
> ---
>
> Key: HDFS-7066
> URL: https://issues.apache.org/jira/browse/HDFS-7066
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.5.1
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Minor
>
> LazyWriter#evictBlocks (added for HDFS-6581) misses a null check for 
> replicaState. As a result, there are many NPEs in the debug log under certain 
> conditions. 
> 2014-09-15 14:27:10,820 DEBUG impl.FsDatasetImpl 
> (FsDatasetImpl.java:evictBlocks(2335)) - Evicting block null
> 2014-09-15 14:27:10,821 WARN  impl.FsDatasetImpl 
> (FsDatasetImpl.java:run(2409)) - Ignoring exception in LazyWriter:
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl$LazyWriter.evictBlocks(FsDatasetImpl.java:2343)
>   at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl$LazyWriter.run(FsDatasetImpl.java:2396)
>   at java.lang.Thread.run(Thread.java:745)
> The proposed fix is to break if there is no candidate available to evict.
> {code}
>   while (iterations++ < MAX_BLOCK_EVICTIONS_PER_ITERATION &&
>       transientFreeSpaceBelowThreshold()) {
>     LazyWriteReplicaTracker.ReplicaState replicaState =
>         lazyWriteReplicaTracker.getNextCandidateForEviction();
>
>     if (replicaState == null) {
>       break;
>     }
>
>     if (LOG.isDebugEnabled()) {
>       LOG.debug("Evicting block " + replicaState);
>     }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7066) LazyWriter#evictBlocks misses a null check for replicaState

2014-09-15 Thread Xiaoyu Yao (JIRA)
Xiaoyu Yao created HDFS-7066:


 Summary: LazyWriter#evictBlocks misses a null check for 
replicaState
 Key: HDFS-7066
 URL: https://issues.apache.org/jira/browse/HDFS-7066
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.5.1
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
Priority: Minor


LazyWriter#evictBlocks (added for HDFS-6581) misses a null check for 
replicaState. As a result, there are many NPEs in the debug log. 

2014-09-15 14:27:10,820 DEBUG impl.FsDatasetImpl 
(FsDatasetImpl.java:evictBlocks(2335)) - Evicting block null
2014-09-15 14:27:10,821 WARN  impl.FsDatasetImpl (FsDatasetImpl.java:run(2409)) 
- Ignoring exception in LazyWriter:
java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl$LazyWriter.evictBlocks(FsDatasetImpl.java:2343)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl$LazyWriter.run(FsDatasetImpl.java:2396)
at java.lang.Thread.run(Thread.java:745)

The proposed fix is to break if there is no candidate available to evict.

{code}

while (iterations++ < MAX_BLOCK_EVICTIONS_PER_ITERATION &&
    transientFreeSpaceBelowThreshold()) {
  LazyWriteReplicaTracker.ReplicaState replicaState =
      lazyWriteReplicaTracker.getNextCandidateForEviction();

  if (replicaState == null) {
    break;
  }

  if (LOG.isDebugEnabled()) {
    LOG.debug("Evicting block " + replicaState);
  }

{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5546) race condition crashes "hadoop ls -R" when directories are moved/removed

2014-09-15 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134560#comment-14134560
 ] 

Colin Patrick McCabe commented on HDFS-5546:


Which patch is current?  HDFS-5546.2.004.patch?  I'm not completely happy with 
that patch since the return code is still 0 even after errors are encountered.

> race condition crashes "hadoop ls -R" when directories are moved/removed
> 
>
> Key: HDFS-5546
> URL: https://issues.apache.org/jira/browse/HDFS-5546
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Colin Patrick McCabe
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Attachments: HDFS-5546.1.patch, HDFS-5546.2.000.patch, 
> HDFS-5546.2.001.patch, HDFS-5546.2.002.patch, HDFS-5546.2.003.patch, 
> HDFS-5546.2.004.patch
>
>
> This seems to be a rare race condition where we have a sequence of events 
> like this:
> 1. org.apache.hadoop.shell.Ls calls DFS#getFileStatus on directory D.
> 2. someone deletes or moves directory D
> 3. org.apache.hadoop.shell.Ls calls PathData#getDirectoryContents(D), which 
> calls DFS#listStatus(D). This throws FileNotFoundException.
> 4. ls command terminates with FNF
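For illustration only (this is not the FsShell Ls implementation), a minimal sketch of 
the tolerant recursive listing discussed above: a directory that vanishes between 
getFileStatus and listStatus is reported and skipped, and the walk remembers the error 
so the command can return a non-zero exit code, which is the concern raised in the 
comment.

{code}
import java.io.FileNotFoundException;
import java.io.IOException;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TolerantLsSketch {
  private boolean sawError = false;

  void listRecursive(FileSystem fs, Path dir) throws IOException {
    FileStatus[] children;
    try {
      children = fs.listStatus(dir);      // step 3 of the race above
    } catch (FileNotFoundException e) {
      // The directory was moved or removed after the parent listed it.
      System.err.println("ls: " + dir + ": No such file or directory");
      sawError = true;                    // remember the error, keep walking
      return;
    }
    for (FileStatus child : children) {
      System.out.println(child.getPath());
      if (child.isDirectory()) {
        listRecursive(fs, child.getPath());
      }
    }
  }

  int exitCode() {
    return sawError ? 1 : 0;              // non-zero if anything was skipped
  }
}
{code}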



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6912) SharedFileDescriptorFactory should not allocate sparse files

2014-09-15 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-6912:
---
  Resolution: Fixed
   Fix Version/s: 2.6.0
Target Version/s: 2.6.0
  Status: Resolved  (was: Patch Available)

> SharedFileDescriptorFactory should not allocate sparse files
> 
>
> Key: HDFS-6912
> URL: https://issues.apache.org/jira/browse/HDFS-6912
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: caching
>Affects Versions: 2.5.0
> Environment: HDFS Data node, with 8 gb tmpfs in /dev/shm
>Reporter: Gopal V
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Fix For: 2.6.0
>
> Attachments: HDFS-6912.001.patch, HDFS-6912.002.patch, 
> HDFS-6912.003.patch
>
>
> SharedFileDescriptorFactory should not allocate sparse files.  Sparse files 
> can lead to a SIGBUS later in the short-circuit reader when we try to read 
> from the sparse file and memory is not available.
> Note that if swap is enabled, we can still get a SIGBUS even with a 
> non-sparse file, since the JVM uses MAP_NORESERVE in mmap.
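As a rough illustration of the sparse vs. non-sparse distinction described above (a 
sketch only, not the SharedFileDescriptorFactory code; the path and length are made up):

{code}
import java.io.IOException;
import java.io.RandomAccessFile;

public class ShmAllocationSketch {
  public static void main(String[] args) throws IOException {
    final int length = 8192;
    try (RandomAccessFile raf = new RandomAccessFile("/dev/shm/sketch", "rw")) {
      // Sparse: setLength() only records the size; no tmpfs pages are backed
      // yet, so an mmap-ed reader can SIGBUS later if memory runs out.
      //   raf.setLength(length);

      // Non-sparse: writing the zeros forces the pages to be allocated now,
      // so the shared-memory segment cannot disappear under the reader.
      raf.write(new byte[length]);
    }
  }
}
{code}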



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7047) Expose FileStatus#isEncrypted in libhdfs

2014-09-15 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-7047:
---
Attachment: HDFS-7047.001.patch

> Expose FileStatus#isEncrypted in libhdfs
> 
>
> Key: HDFS-7047
> URL: https://issues.apache.org/jira/browse/HDFS-7047
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: encryption
>Affects Versions: 2.6.0
>Reporter: Andrew Wang
>Assignee: Andrew Wang
> Attachments: HDFS-7047.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7059) HAadmin transtionToActive with forceActive option can show confusing message.

2014-09-15 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-7059:
-
  Priority: Minor  (was: Major)
Issue Type: Improvement  (was: Bug)

> HAadmin transtionToActive with forceActive option can show confusing message.
> -
>
> Key: HDFS-7059
> URL: https://issues.apache.org/jira/browse/HDFS-7059
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
>Priority: Minor
> Fix For: 2.6.0
>
> Attachments: HDFS-7059.patch
>
>
> Ran into this confusing message on our local HA setup.
> One of the namenodes was down and the other was in standby mode.
> The namenode was not able to come out of safe mode, so we did 
> transitionToActive with the forceActive switch enabled.
> Due to a change in HDFS-2949, it will try connecting to all the namenodes to 
> see whether they are active or not.
> But since the other namenode is down, it will try to connect to that namenode 
> for 'ipc.client.connect.max.retries' number of times.
> Every time it is not able to connect, it will log a message:
> INFO ipc.Client: Retrying connect to server: . Already tried 0 
> time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, 
> sleepTime=1000 MILLISECONDS)
> Since in our configuration, the number of retries is 50, it will show this 
> message 50 times.
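For context, a small sketch of how the retry policy quoted in that log line is built 
from the IPC client configuration (illustrative only; 50 retries and 1000 ms are the 
values from the reporter's cluster, not defaults):

{code}
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.retry.RetryPolicies;
import org.apache.hadoop.io.retry.RetryPolicy;

public class ConnectRetrySketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Each failed connect to the downed namenode is retried this many times,
    // and every attempt prints one "Retrying connect to server" INFO line.
    int maxRetries = conf.getInt("ipc.client.connect.max.retries", 10);
    RetryPolicy policy = RetryPolicies.retryUpToMaximumCountWithFixedSleep(
        maxRetries, 1000, TimeUnit.MILLISECONDS);
    System.out.println("Connect retry policy: " + policy);
  }
}
{code}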



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7059) HAadmin transtionToActive with forceActive option can show confusing message.

2014-09-15 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-7059:
-
   Resolution: Fixed
Fix Version/s: 2.6.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

> HAadmin transtionToActive with forceActive option can show confusing message.
> -
>
> Key: HDFS-7059
> URL: https://issues.apache.org/jira/browse/HDFS-7059
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Fix For: 2.6.0
>
> Attachments: HDFS-7059.patch
>
>
> Ran into this confusing message on our local HA setup.
> One of the namenodes was down and the other was in standby mode.
> The namenode was not able to come out of safe mode, so we did 
> transitionToActive with the forceActive switch enabled.
> Due to a change in HDFS-2949, it will try connecting to all the namenodes to 
> see whether they are active or not.
> But since the other namenode is down, it will try to connect to that namenode 
> for 'ipc.client.connect.max.retries' number of times.
> Every time it is not able to connect, it will log a message:
> INFO ipc.Client: Retrying connect to server: . Already tried 0 
> time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, 
> sleepTime=1000 MILLISECONDS)
> Since in our configuration, the number of retries is 50, it will show this 
> message 50 times.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7059) HAadmin transtionToActive with forceActive option can show confusing message.

2014-09-15 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134525#comment-14134525
 ] 

Kihwal Lee commented on HDFS-7059:
--

Committed the patch to trunk and cherry-picked to branch-2. Thanks for working 
on the fix, Rushabh.

> HAadmin transtionToActive with forceActive option can show confusing message.
> -
>
> Key: HDFS-7059
> URL: https://issues.apache.org/jira/browse/HDFS-7059
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Fix For: 2.6.0
>
> Attachments: HDFS-7059.patch
>
>
> Ran into this confusing message on our local HA setup.
> One of the namenodes was down and the other was in standby mode.
> The namenode was not able to come out of safe mode, so we did 
> transitionToActive with the forceActive switch enabled.
> Due to a change in HDFS-2949, it will try connecting to all the namenodes to 
> see whether they are active or not.
> But since the other namenode is down, it will try to connect to that namenode 
> for 'ipc.client.connect.max.retries' number of times.
> Every time it is not able to connect, it will log a message:
> INFO ipc.Client: Retrying connect to server: . Already tried 0 
> time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, 
> sleepTime=1000 MILLISECONDS)
> Since in our configuration, the number of retries is 50, it will show this 
> message 50 times.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7059) HAadmin transtionToActive with forceActive option can show confusing message.

2014-09-15 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134497#comment-14134497
 ] 

Kihwal Lee commented on HDFS-7059:
--

+1 The change looks good.

> HAadmin transtionToActive with forceActive option can show confusing message.
> -
>
> Key: HDFS-7059
> URL: https://issues.apache.org/jira/browse/HDFS-7059
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-7059.patch
>
>
> Ran into this confusing message on our local HA setup.
> One of the namenodes was down and the other was in standby mode.
> The namenode was not able to come out of safe mode, so we did 
> transitionToActive with the forceActive switch enabled.
> Due to a change in HDFS-2949, it will try connecting to all the namenodes to 
> see whether they are active or not.
> But since the other namenode is down, it will try to connect to that namenode 
> for 'ipc.client.connect.max.retries' number of times.
> Every time it is not able to connect, it will log a message:
> INFO ipc.Client: Retrying connect to server: . Already tried 0 
> time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, 
> sleepTime=1000 MILLISECONDS)
> Since in our configuration, the number of retries is 50, it will show this 
> message 50 times.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7059) HAadmin transtionToActive with forceActive option can show confusing message.

2014-09-15 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134494#comment-14134494
 ] 

Kihwal Lee commented on HDFS-7059:
--

The test failure was reported in HADOOP-10668.

> HAadmin transtionToActive with forceActive option can show confusing message.
> -
>
> Key: HDFS-7059
> URL: https://issues.apache.org/jira/browse/HDFS-7059
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-7059.patch
>
>
> Ran into this confusing message on our local HA setup.
> One of the namenodes was down and the other was in standby mode.
> The namenode was not able to come out of safe mode, so we did 
> transitionToActive with the forceActive switch enabled.
> Due to a change in HDFS-2949, it will try connecting to all the namenodes to 
> see whether they are active or not.
> But since the other namenode is down, it will try to connect to that namenode 
> for 'ipc.client.connect.max.retries' number of times.
> Every time it is not able to connect, it will log a message:
> INFO ipc.Client: Retrying connect to server: . Already tried 0 
> time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, 
> sleepTime=1000 MILLISECONDS)
> Since in our configuration, the number of retries is 50, it will show this 
> message 50 times.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6584) Support Archival Storage

2014-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134483#comment-14134483
 ] 

Hadoop QA commented on HDFS-6584:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668803/h6584_20140915.patch
  against trunk revision 43b0303.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 24 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.mover.TestStorageMover
  org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
  org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer
  org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManager

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8031//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8031//console

This message is automatically generated.

> Support Archival Storage
> 
>
> Key: HDFS-6584
> URL: https://issues.apache.org/jira/browse/HDFS-6584
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: balancer, namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Attachments: HDFS-6584.000.patch, 
> HDFSArchivalStorageDesign20140623.pdf, HDFSArchivalStorageDesign20140715.pdf, 
> archival-storage-testplan.pdf, h6584_20140907.patch, h6584_20140908.patch, 
> h6584_20140908b.patch, h6584_20140911.patch, h6584_20140911b.patch, 
> h6584_20140915.patch
>
>
> In most of the Hadoop clusters, as more and more data is stored for longer 
> time, the demand for storage is outstripping the compute. Hadoop needs a cost 
> effective and easy to manage solution to meet this demand for storage. 
> Current solution is:
> - Delete the old unused data. This comes at operational cost of identifying 
> unnecessary data and deleting them manually.
> - Add more nodes to the clusters. Along with storage capacity, this adds 
> unnecessary compute capacity to the cluster.
> Hadoop needs a solution to decouple growing storage capacity from compute 
> capacity. Nodes with higher density and less expensive storage with low 
> compute power are becoming available and can be used as cold storage in the 
> clusters. Based on policy the data from hot storage can be moved to cold 
> storage. Adding more nodes to the cold storage can grow the storage 
> independent of the compute capacity in the cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7065) Pipeline close recovery race can cause block corruption

2014-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134484#comment-14134484
 ] 

Hadoop QA commented on HDFS-7065:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668840/HDFS-7065.patch
  against trunk revision 9d4ec97.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The test build failed in 
hadoop-hdfs-project/hadoop-hdfs 

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8034//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8034//console

This message is automatically generated.

> Pipeline close recovery race can cause block corruption
> ---
>
> Key: HDFS-7065
> URL: https://issues.apache.org/jira/browse/HDFS-7065
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.5.0
>Reporter: Kihwal Lee
>Priority: Critical
> Attachments: HDFS-7065.patch
>
>
> If multiple pipeline close recoveries are performed against the same block, 
> the replica may go corrupt.  Here is one case I have observed:
> The client tried to close a block, but the ACK timed out.  It excluded the 
> first DN and tried pipeline recovery (recoverClose). It too failed, and 
> another recovery was attempted with only one DN.  This took longer than usual, 
> but the client eventually got an ACK and the file was closed successfully.  
> Later on, the one and only replica was found to be corrupt.
> It turned out the DN was having a transient slow disk I/O issue at that time. 
> The first recovery was stuck until the second recovery was attempted 30 
> seconds later.  After a few seconds, both threads started running. The 
> second recovery finished first, and then the first recovery with an older gen 
> stamp finished, turning the gen stamp backward.
> There is a sanity check in {{recoverCheck()}}, but since check and modify are 
> not synchronized, {{recoverClose()}} is not multi-thread safe.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6191) Disable quota checks when replaying edit log.

2014-09-15 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134477#comment-14134477
 ] 

Suresh Srinivas commented on HDFS-6191:
---

bq. Thanks, Jing for the review. I will take a look at other verification 
functions and file a separate jira if necessary.
[~jingzhao] and [~kihwal], do we need a follow up jira?

> Disable quota checks when replaying edit log.
> -
>
> Key: HDFS-6191
> URL: https://issues.apache.org/jira/browse/HDFS-6191
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
> Fix For: 0.23.11, 2.5.0
>
> Attachments: HDFS-6191.branch-0.23.patch, HDFS-6191.patch, 
> HDFS-6191.patch, HDFS-6191.patch
>
>
> Since the serving NN does quota checks before logging edits, performing quota 
> checks is unnecessary while replaying edits. I propose disabling quota checks 
> for 2NN and SBN.
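A purely illustrative sketch of the idea (the names below are hypothetical, not the 
actual FSNamesystem/FSDirectory API): the 2NN and SBN flip a switch while replaying 
edits so quota verification is skipped, since the active NN already checked quotas 
before logging each edit.

{code}
// Hypothetical names, for illustration only.
class QuotaCheckSwitchSketch {
  private volatile boolean quotaChecksEnabled = true;

  void disableQuotaChecks() { quotaChecksEnabled = false; }  // while replaying edits
  void enableQuotaChecks()  { quotaChecksEnabled = true;  }

  void verifyQuota(long nsDelta, long dsDelta) {
    if (!quotaChecksEnabled) {
      return;  // the serving NN already enforced quotas before logging the edit
    }
    // ... actual namespace/diskspace quota accounting would run here ...
  }
}
{code}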



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6843) Create FileStatus isEncrypted() method

2014-09-15 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-6843:
---
Attachment: HDFS-6843.008.patch

.008 fixes the Test*ContractOpen failures. I ran the other failing tests and the 
failures appear to be spurious or unrelated.


> Create FileStatus isEncrypted() method
> --
>
> Key: HDFS-6843
> URL: https://issues.apache.org/jira/browse/HDFS-6843
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode, security
>Affects Versions: 3.0.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Attachments: HDFS-6843.001.patch, HDFS-6843.002.patch, 
> HDFS-6843.003.patch, HDFS-6843.004.patch, HDFS-6843.005.patch, 
> HDFS-6843.005.patch, HDFS-6843.006.patch, HDFS-6843.007.patch, 
> HDFS-6843.008.patch
>
>
> FileStatus should have a 'boolean isEncrypted()' method. (It came up in the 
> context of discussing with Andrew about FileStatus being a Writable.)
> Having this method would allow MR JobSubmitter do the following:
> -
> BOOLEAN intermediateEncryption = false
> IF jobconf.contains("mr.intermidate.encryption") THEN
>   intermediateEncryption = jobConf.getBoolean("mr.intermidate.encryption")
> ELSE
>   IF (I/O)Format INSTANCEOF File(I/O)Format THEN
> intermediateEncryption = ANY File(I/O)Format HAS a Path with status 
> isEncrypted()==TRUE
>   FI
>   jobConf.setBoolean("mr.intermidate.encryption", intermediateEncryption)
> FI
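A rough Java rendering of the pseudocode above, assuming the proposed 
FileStatus#isEncrypted() method and using the configuration key exactly as spelled in 
the description (the class and method names here are illustrative):

{code}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;

public class IntermediateEncryptionSketch {
  static void decideIntermediateEncryption(Configuration jobConf, Path[] inputPaths)
      throws IOException {
    if (jobConf.get("mr.intermidate.encryption") != null) {
      return;  // an explicit user setting wins
    }
    boolean intermediateEncryption = false;
    for (Path p : inputPaths) {
      FileStatus status = p.getFileSystem(jobConf).getFileStatus(p);
      if (status.isEncrypted()) {  // the method proposed in this JIRA
        intermediateEncryption = true;
        break;
      }
    }
    jobConf.setBoolean("mr.intermidate.encryption", intermediateEncryption);
  }
}
{code}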



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7035) Refactor DataStorage and BlockSlicePoolStorage

2014-09-15 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134467#comment-14134467
 ] 

Lei (Eddy) Xu commented on HDFS-7035:
-

Hi, all

I've run TestHDFSTrash, TestDistributedFileSystem, TestFileCreation, TestDataStorage, 
TestDataNodeRollingUpgrade, TestDatanodeStorageBase, TestBlockPoolSliceStorage, 
TestBPOfferService, TestFileAppend4, and TestDataTransferKeepalive locally without 
failures.

I think that the above failures are not related to this patch. 

> Refactor DataStorage and BlockSlicePoolStorage 
> ---
>
> Key: HDFS-7035
> URL: https://issues.apache.org/jira/browse/HDFS-7035
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Affects Versions: 2.5.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-7035.000.combo.patch, HDFS-7035.000.patch, 
> HDFS-7035.001.combo.patch, HDFS-7035.001.patch, HDFS-7035.002.patch, 
> HDFS-7035.003.patch, HDFS-7035.003.patch
>
>
> {{DataStorage}} and {{BlockPoolSliceStorage}} share many similar code paths. 
> This jira extracts the common parts of these two classes to simplify the logic 
> for both.
> This is the ground work for handling partial failures during hot swapping 
> volumes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7065) Pipeline close recovery race can cause block corruption

2014-09-15 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-7065:
-
Status: Patch Available  (was: Open)

> Pipeline close recovery race can cause block corruption
> ---
>
> Key: HDFS-7065
> URL: https://issues.apache.org/jira/browse/HDFS-7065
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.5.0
>Reporter: Kihwal Lee
>Priority: Critical
> Attachments: HDFS-7065.patch
>
>
> If multiple pipeline close recoveries are performed against the same block, 
> the replica may go corrupt.  Here is one case I have observed:
> The client tried to close a block, but the ACK timed out.  It excluded the 
> first DN and tried pipeline recovery (recoverClose). It too failed, and 
> another recovery was attempted with only one DN.  This took longer than usual, 
> but the client eventually got an ACK and the file was closed successfully.  
> Later on, the one and only replica was found to be corrupt.
> It turned out the DN was having a transient slow disk I/O issue at that time. 
> The first recovery was stuck until the second recovery was attempted 30 
> seconds later.  After a few seconds, both threads started running. The 
> second recovery finished first, and then the first recovery with an older gen 
> stamp finished, turning the gen stamp backward.
> There is a sanity check in {{recoverCheck()}}, but since check and modify are 
> not synchronized, {{recoverClose()}} is not multi-thread safe.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7065) Pipeline close recovery race can cause block corruption

2014-09-15 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-7065:
-
Attachment: HDFS-7065.patch

> Pipeline close recovery race can cause block corruption
> ---
>
> Key: HDFS-7065
> URL: https://issues.apache.org/jira/browse/HDFS-7065
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.5.0
>Reporter: Kihwal Lee
>Priority: Critical
> Attachments: HDFS-7065.patch
>
>
> If multiple pipeline close recoveries are performed against the same block, 
> the replica may go corrupt.  Here is one case I have observed:
> The client tried to close a block, but the ACK timed out.  It excluded the 
> first DN and tried pipeline recovery (recoverClose). It too failed, and 
> another recovery was attempted with only one DN.  This took longer than usual, 
> but the client eventually got an ACK and the file was closed successfully.  
> Later on, the one and only replica was found to be corrupt.
> It turned out the DN was having a transient slow disk I/O issue at that time. 
> The first recovery was stuck until the second recovery was attempted 30 
> seconds later.  After a few seconds, both threads started running. The 
> second recovery finished first, and then the first recovery with an older gen 
> stamp finished, turning the gen stamp backward.
> There is a sanity check in {{recoverCheck()}}, but since check and modify are 
> not synchronized, {{recoverClose()}} is not multi-thread safe.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7065) Pipeline close recovery race can cause block corruption

2014-09-15 Thread Kihwal Lee (JIRA)
Kihwal Lee created HDFS-7065:


 Summary: Pipeline close recovery race can cause block corruption
 Key: HDFS-7065
 URL: https://issues.apache.org/jira/browse/HDFS-7065
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Kihwal Lee
Priority: Critical


If multiple pipeline close recoveries are performed against the same block, the 
replica may go corrupt.  Here is one case I have observed:

The client tried to close a block, but the ACK timed out.  It excluded the 
first DN and tried pipeline recovery (recoverClose). It too failed, and another 
recovery was attempted with only one DN.  This took longer than usual, but the 
client eventually got an ACK and the file was closed successfully.  Later on, 
the one and only replica was found to be corrupt.

It turned out the DN was having a transient slow disk I/O issue at that time. The 
first recovery was stuck until the second recovery was attempted 30 seconds 
later.  After a few seconds, both threads started running. The second 
recovery finished first, and then the first recovery with an older gen stamp 
finished, turning the gen stamp backward.

There is a sanity check in {{recoverCheck()}}, but since check and modify are 
not synchronized, {{recoverClose()}} is not multi-thread safe.
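Not the FsDatasetImpl code, just a sketch of the check-then-act hazard described above, 
with the check and the generation-stamp update done under one lock so a stale recovery 
can no longer move the stamp backward:

{code}
class RecoverCloseSketch {
  private final Object lock = new Object();
  private long genStamp;

  void recoverClose(long newGenStamp) {
    synchronized (lock) {  // check and modify atomically
      if (newGenStamp < genStamp) {
        // A later recovery already finished; reject the stale one.
        throw new IllegalStateException(
            "Stale recovery: " + newGenStamp + " < " + genStamp);
      }
      genStamp = newGenStamp;  // the stamp never moves backward now
    }
  }
}
{code}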





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6851) Flush EncryptionZoneWithId and add an id field to EncryptionZone

2014-09-15 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-6851:
---
Attachment: HDFS-6851.002.patch

[~andrew.wang],

Yes, this is ready to be reviewed (the .002 version). I ran a sampling of the 
failed tests and they all passed, so I am guessing they were some sort of 
spurious Jenkins failure. I also ran TestEncryptionZones to verify that this 
change doesn't break anything. There are no new unit tests because this is just 
a refactoring.


> Flush EncryptionZoneWithId and add an id field to EncryptionZone
> 
>
> Key: HDFS-6851
> URL: https://issues.apache.org/jira/browse/HDFS-6851
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode, security
>Affects Versions: 2.6.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Attachments: HDFS-6851-prelim.001.patch, HDFS-6851.000.patch, 
> HDFS-6851.002.patch
>
>
> EncryptionZoneWithId can be flushed by moving the id field up to 
> EncryptionZone.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6843) Create FileStatus isEncrypted() method

2014-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134360#comment-14134360
 ] 

Hadoop QA commented on HDFS-6843:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668762/HDFS-6843.007.patch
  against trunk revision 24d920b.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.fs.contract.localfs.TestLocalFSContractOpen
  org.apache.hadoop.fs.contract.rawlocal.TestRawlocalContractOpen
  org.apache.hadoop.ha.TestZKFailoverControllerStress
  org.apache.hadoop.hdfs.server.namenode.TestAuditLogger
  org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract
  org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
  org.apache.hadoop.fs.contract.hdfs.TestHDFSContractOpen
  org.apache.hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8030//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8030//console

This message is automatically generated.

> Create FileStatus isEncrypted() method
> --
>
> Key: HDFS-6843
> URL: https://issues.apache.org/jira/browse/HDFS-6843
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode, security
>Affects Versions: 3.0.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Attachments: HDFS-6843.001.patch, HDFS-6843.002.patch, 
> HDFS-6843.003.patch, HDFS-6843.004.patch, HDFS-6843.005.patch, 
> HDFS-6843.005.patch, HDFS-6843.006.patch, HDFS-6843.007.patch
>
>
> FileStatus should have a 'boolean isEncrypted()' method. (It came up in the 
> context of discussing with Andrew about FileStatus being a Writable.)
> Having this method would allow MR JobSubmitter do the following:
> -
> BOOLEAN intermediateEncryption = false
> IF jobconf.contains("mr.intermidate.encryption") THEN
>   intermediateEncryption = jobConf.getBoolean("mr.intermidate.encryption")
> ELSE
>   IF (I/O)Format INSTANCEOF File(I/O)Format THEN
> intermediateEncryption = ANY File(I/O)Format HAS a Path with status 
> isEncrypted()==TRUE
>   FI
>   jobConf.setBoolean("mr.intermidate.encryption", intermediateEncryption)
> FI



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7064) Fix test failures

2014-09-15 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDFS-7064:
---

 Summary: Fix test failures
 Key: HDFS-7064
 URL: https://issues.apache.org/jira/browse/HDFS-7064
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Affects Versions: HDFS-6581
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal


Fix test failures in the HDFS-6581 feature branch.

Jenkins flagged the following failures.
https://builds.apache.org/job/PreCommit-HDFS-Build/8025//testReport/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6584) Support Archival Storage

2014-09-15 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134276#comment-14134276
 ] 

Jing Zhao commented on HDFS-6584:
-

Thanks [~andrew.wang]!

bq. Since the Mover is based on the Balancer, is there any concern about it 
being too slow to move data from fast storage to archival? If all data migrates 
off to archival, the mover needs to keep up with the aggregate write rate of 
the cluster. The balancer, putting it mildly, is not the fastest tool in this 
regard.

Here are some of my thoughts. Please let me know if I miss something, 
[~szetszwo].
1) Currently the migration tool still depends on the admin to mark files/dirs as 
COLD/WARM, and it should be rare that users are still actively writing new data 
into a directory after marking it COLD. Thus, for now, this may not be a critical 
concern.
2) Tools/services may later be developed to actively/automatically scan the 
namespace and mark COLD files based on different rules such as 
access/modification time. In some cases, if the rule is very aggressive and the 
migration is very slow, we may have the issue you mentioned. The current Mover 
is utilizing the Dispatcher, or more generally, the 
{{DataTransferProtocol#replaceBlock}} protocol. I guess with more aggressive 
settings (e.g., the max number of blocks scheduled on each DataNode for 
migration), the migration speed should not be very slow, and it should be easy 
for us to replace the Dispatcher with a faster migration framework.

bq. We exposed cachedHosts in BlockLocation, so application schedulers can 
choose to place their tasks for cache locality. We need a similar thing for 
storage type, so schedulers can prefer "hotter" replicas.
This is a very good suggestion, we can add this information later. Thanks!

BTW, HDFS-7062 has been committed to fix the open file issue. A doc patch has 
been uploaded in HDFS-6864. Thanks again for the great comments, [~andrew.wang]!

> Support Archival Storage
> 
>
> Key: HDFS-6584
> URL: https://issues.apache.org/jira/browse/HDFS-6584
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: balancer, namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Attachments: HDFS-6584.000.patch, 
> HDFSArchivalStorageDesign20140623.pdf, HDFSArchivalStorageDesign20140715.pdf, 
> archival-storage-testplan.pdf, h6584_20140907.patch, h6584_20140908.patch, 
> h6584_20140908b.patch, h6584_20140911.patch, h6584_20140911b.patch, 
> h6584_20140915.patch
>
>
> In most of the Hadoop clusters, as more and more data is stored for longer 
> time, the demand for storage is outstripping the compute. Hadoop needs a cost 
> effective and easy to manage solution to meet this demand for storage. 
> Current solution is:
> - Delete the old unused data. This comes at operational cost of identifying 
> unnecessary data and deleting them manually.
> - Add more nodes to the clusters. Along with storage capacity, this adds 
> unnecessary compute capacity to the cluster.
> Hadoop needs a solution to decouple growing storage capacity from compute 
> capacity. Nodes with higher density and less expensive storage with low 
> compute power are becoming available and can be used as cold storage in the 
> clusters. Based on policy the data from hot storage can be moved to cold 
> storage. Adding more nodes to the cold storage can grow the storage 
> independent of the compute capacity in the cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6826) Plugin interface to enable delegation of HDFS authorization assertions

2014-09-15 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134271#comment-14134271
 ] 

Jitendra Nath Pandey commented on HDFS-6826:


bq. Allowing complete override of the permission checking is bad too. Plugins 
should not be in control of the traversal or iteration of inodes. The plugin 
should be just that - a hook for the core permission checking. Otherwise 
plugins will be fragile when changes are made to path resolution. And/or 
plugins will have to implement duplicated logic. 

The path resolution happens before generating the list of inodes for a given 
path; therefore, in the proposed permission check API, the path resolution will 
not be controlled by the plugin. The iteration over inodes is actually part of 
the core permission checking, because we check permission bits (execute mode or 
read mode) for intermediate directories, which I think was chosen to look more 
like POSIX. However, a plugin should be able to override those semantics and 
provide a different model.





> Plugin interface to enable delegation of HDFS authorization assertions
> --
>
> Key: HDFS-6826
> URL: https://issues.apache.org/jira/browse/HDFS-6826
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.4.1
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFS-6826-idea.patch, HDFS-6826-idea2.patch, 
> HDFS-6826-permchecker.patch, HDFS-6826v3.patch, HDFS-6826v4.patch, 
> HDFS-6826v5.patch, HDFS-6826v6.patch, HDFS-6826v7.1.patch, 
> HDFS-6826v7.2.patch, HDFS-6826v7.3.patch, HDFS-6826v7.4.patch, 
> HDFS-6826v7.5.patch, HDFS-6826v7.6.patch, HDFS-6826v7.patch, 
> HDFS-6826v8.patch, HDFSPluggableAuthorizationProposal-v2.pdf, 
> HDFSPluggableAuthorizationProposal.pdf
>
>
> When Hbase data, HiveMetaStore data or Search data is accessed via services 
> (Hbase region servers, HiveServer2, Impala, Solr) the services can enforce 
> permissions on corresponding entities (databases, tables, views, columns, 
> search collections, documents). It is desirable, when the data is accessed 
> directly by users accessing the underlying data files (i.e. from a MapReduce 
> job), that the permission of the data files map to the permissions of the 
> corresponding data entity (i.e. table, column family or search collection).
> To enable this we need to have the necessary hooks in place in the NameNode 
> to delegate authorization to an external system that can map HDFS 
> files/directories to data entities and resolve their permissions based on the 
> data entities permissions.
> I’ll be posting a design proposal in the next few days.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7010) boot up libhdfs3 project

2014-09-15 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134270#comment-14134270
 ] 

Colin Patrick McCabe commented on HDFS-7010:


bq. The source code in the gtest and gmock directories is under the Google license. Do 
you think it is better to keep them unchanged and put the Google license file 
in those directories? I missed the license header in the CMake files, thank you for 
pointing that out.

We can't change the license for the gtest and gmock libraries -- they are 
released by Google.  So their headers should remain unchanged.

bq. GNU GCC recognises all of the following as C++ files, and will use C++ 
compilation regardless of whether you invoke it through gcc or g++: .C, .cc, .cpp, 
.CPP, .c++, .cp, or .cxx. It is mainly a matter of taste, I think. Do you think 
we can get much benefit from changing it? Changing the filename extension will 
break the fault injection and many other tests in our team, because our test 
framework records the filename in the database and even in test cases. I do not 
think it is worth it just for aligning with other code in HDFS.

We need to have a consistent coding style in the project, and part of that is 
having a consistent naming convention.  Sorry, but please change it to .cc like 
the other code.

bq. Boost is needed if the GCC version is older than 4.6. After fixing 
HDFS-7022, we can support older GCC versions without Boost. Maybe we can write 
more description about it. libxml2 will be changed to libexpat in HDFS-7023.

I am fine with fixing these in follow-up JIRAs if it's more convenient.

> boot up libhdfs3 project
> 
>
> Key: HDFS-7010
> URL: https://issues.apache.org/jira/browse/HDFS-7010
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Zhanwei Wang
> Attachments: HDFS-7010.patch
>
>
> boot up libhdfs3 project with CMake, Readme and license file.
> Integrate google mock and google test



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7049) TestByteRangeInputStream.testPropagatedClose fails and throw NPE on branch-2

2014-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134245#comment-14134245
 ] 

Hadoop QA commented on HDFS-7049:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12668811/HDFS-7049-branch-2.patch
  against trunk revision 43b0303.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8032//console

This message is automatically generated.

> TestByteRangeInputStream.testPropagatedClose fails and throw NPE on branch-2
> 
>
> Key: HDFS-7049
> URL: https://issues.apache.org/jira/browse/HDFS-7049
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Juan Yu
>Assignee: Juan Yu
>Priority: Minor
> Attachments: HDFS-7049-branch-2.patch
>
>
> On branch-2, TestByteRangeInputStream.testPropagatedClose throw NPE when 
> HftpFileSystem$RangeHeaderUrlOpener.connect
> This is due to fix of HDFS-6143 "WebHdfsFileSystem open should throw 
> FileNotFoundException for non-existing paths"
> public ByteRangeInputStream(URLOpener o, URLOpener r) throws IOException {
>   this.originalURL = o;
>   this.resolvedURL = r;
>   getInputStream();
> }
> getInputStream() is now called in the constructor to verify that the file 
> exists.
> Since we just try to test if ByteRangeInputStream#close is called at proper 
> time, we could mock(ByteRangeInputStream.class, CALLS_REAL_METHODS) for 
> testing to avoid the NPE issue.
> I believe the trunk version already does this, we just need to merge the test 
> from trunk.
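A minimal sketch of the suggested mocking approach, assuming Mockito is on the test 
classpath (this mirrors the call named in the description rather than the full trunk 
test):

{code}
import static org.mockito.Mockito.CALLS_REAL_METHODS;
import static org.mockito.Mockito.mock;

import org.apache.hadoop.hdfs.web.ByteRangeInputStream;
import org.junit.Test;

public class PropagatedCloseSketch {
  @Test
  public void sketch() throws Exception {
    // Partial mock: real method bodies run, but the constructor (and its
    // eager getInputStream() call) is skipped, so nothing tries to connect.
    ByteRangeInputStream brs = mock(ByteRangeInputStream.class, CALLS_REAL_METHODS);
    brs.close();  // exercise close() without a real connection
  }
}
{code}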



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7049) TestByteRangeInputStream.testPropagatedClose fails and throw NPE on branch-2

2014-09-15 Thread Juan Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Juan Yu updated HDFS-7049:
--
Assignee: Juan Yu
  Status: Patch Available  (was: Open)

> TestByteRangeInputStream.testPropagatedClose fails and throw NPE on branch-2
> 
>
> Key: HDFS-7049
> URL: https://issues.apache.org/jira/browse/HDFS-7049
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Juan Yu
>Assignee: Juan Yu
>Priority: Minor
> Attachments: HDFS-7049-branch-2.patch
>
>
> On branch-2, TestByteRangeInputStream.testPropagatedClose throw NPE when 
> HftpFileSystem$RangeHeaderUrlOpener.connect
> This is due to fix of HDFS-6143 "WebHdfsFileSystem open should throw 
> FileNotFoundException for non-existing paths"
> public ByteRangeInputStream(URLOpener o, URLOpener r) throws IOException {
>   this.originalURL = o;
>   this.resolvedURL = r;
>   getInputStream();
> }
> getInputStream() is now called in the constructor to verify that the file 
> exists.
> Since we just try to test if ByteRangeInputStream#close is called at proper 
> time, we could mock(ByteRangeInputStream.class, CALLS_REAL_METHODS) for 
> testing to avoid the NPE issue.
> I believe the trunk version already does this, we just need to merge the test 
> from trunk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7049) TestByteRangeInputStream.testPropagatedClose fails and throw NPE on branch-2

2014-09-15 Thread Juan Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Juan Yu updated HDFS-7049:
--
Attachment: HDFS-7049-branch-2.patch

New TestByteRangeInputStream for branch-2 to fix test failure. It's the same 
test as the trunk version.

> TestByteRangeInputStream.testPropagatedClose fails and throw NPE on branch-2
> 
>
> Key: HDFS-7049
> URL: https://issues.apache.org/jira/browse/HDFS-7049
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Juan Yu
>Priority: Minor
> Attachments: HDFS-7049-branch-2.patch
>
>
> On branch-2, TestByteRangeInputStream.testPropagatedClose throw NPE when 
> HftpFileSystem$RangeHeaderUrlOpener.connect
> This is due to fix of HDFS-6143 "WebHdfsFileSystem open should throw 
> FileNotFoundException for non-existing paths"
> public ByteRangeInputStream(URLOpener o, URLOpener r) throws IOException {
>   this.originalURL = o;
>   this.resolvedURL = r;
>   getInputStream();
> }
> getInputStream() is now called in the constructor to verify that the file 
> exists.
> Since we just try to test if ByteRangeInputStream#close is called at proper 
> time, we could mock(ByteRangeInputStream.class, CALLS_REAL_METHODS) for 
> testing to avoid the NPE issue.
> I believe the trunk version already does this, we just need to merge the test 
> from trunk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-3107) HDFS truncate

2014-09-15 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134206#comment-14134206
 ] 

Colin Patrick McCabe commented on HDFS-3107:


[~shv], [~zero45], do you have a design doc for this?  It's a pretty big change 
to HDFS semantics (previously, file data was immutable once written.)

At minimum, we need to decide:
* What the consistency semantics are (if some client reads a file at position X 
after it has been truncated to X-1, what might the client see?)  If data is 
appended after a truncate, it seems like we could get divergent histories in 
many situations, where one client sees one timeline and another client sees 
another.  Is this acceptable?  If so, we'll need to describe what the behavior 
actually is.  Are we going to guarantee that files never appear to shrink while 
clients have them open?  How does this interact with hflush and hsync?
* How this interacts with snapshots.  When truncating a block and then 
re-appending, a divergent block will be created, invalidating previous 
snapshots.  It seems that someone will have to copy this block, so the question 
is when.  I see some discussion of that here in the comments, but I don't see 
any conclusions.
* What are the use-cases.  Right now we have discussed only one use-case: 
allowing users to remove data they accidentally appended. I would add one 
more use-case: improving support for fuse-dfs and NFS, which both require 
truncate to be implemented.  Are there any additional use-cases?
* And, of course... how it's going to be implemented :)

I think this should be a branch since as the HDFS append experience showed, it 
may take some time to get everything right.

> HDFS truncate
> -
>
> Key: HDFS-3107
> URL: https://issues.apache.org/jira/browse/HDFS-3107
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Reporter: Lei Chang
>Assignee: Plamen Jeliazkov
> Attachments: HDFS-3107.patch, HDFS_truncate_semantics_Mar15.pdf, 
> HDFS_truncate_semantics_Mar21.pdf
>
>   Original Estimate: 1,344h
>  Remaining Estimate: 1,344h
>
> Systems with transaction support often need to undo changes made to the 
> underlying storage when a transaction is aborted. Currently HDFS does not 
> support truncate (a standard Posix operation) which is a reverse operation of 
> append, which makes upper layer applications use ugly workarounds (such as 
> keeping track of the discarded byte range per file in a separate metadata 
> store, and periodically running a vacuum process to rewrite compacted files) 
> to overcome this limitation of HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6584) Support Archival Storage

2014-09-15 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-6584:

Attachment: h6584_20140915.patch

> Support Archival Storage
> 
>
> Key: HDFS-6584
> URL: https://issues.apache.org/jira/browse/HDFS-6584
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: balancer, namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Attachments: HDFS-6584.000.patch, 
> HDFSArchivalStorageDesign20140623.pdf, HDFSArchivalStorageDesign20140715.pdf, 
> archival-storage-testplan.pdf, h6584_20140907.patch, h6584_20140908.patch, 
> h6584_20140908b.patch, h6584_20140911.patch, h6584_20140911b.patch, 
> h6584_20140915.patch
>
>
> In most of the Hadoop clusters, as more and more data is stored for longer 
> time, the demand for storage is outstripping the compute. Hadoop needs a cost 
> effective and easy to manage solution to meet this demand for storage. 
> Current solution is:
> - Delete the old unused data. This comes at operational cost of identifying 
> unnecessary data and deleting them manually.
> - Add more nodes to the clusters. Along with storage capacity, this adds 
> unnecessary compute capacity to the cluster.
> Hadoop needs a solution to decouple growing storage capacity from compute 
> capacity. Nodes with higher density and less expensive storage with low 
> compute power are becoming available and can be used as cold storage in the 
> clusters. Based on policy the data from hot storage can be moved to cold 
> storage. Adding more nodes to the cold storage can grow the storage 
> independent of the compute capacity in the cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7052) Archival Storage: Add Mover into hdfs script

2014-09-15 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-7052:

Attachment: HDFS-7052.001.patch

Update the patch:
# Still keep the name Mover until we find a better name
# Sort the mover option in the hdfs script into the correct order

> Archival Storage: Add Mover into hdfs script
> 
>
> Key: HDFS-7052
> URL: https://issues.apache.org/jira/browse/HDFS-7052
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer, namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
>Priority: Minor
> Attachments: HDFS-7052.000.patch, HDFS-7052.001.patch
>
>
> Similar with Balancer, we should add Mover into the hdfs script.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7032) Add WebHDFS support for reading and writing to encryption zones

2014-09-15 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-7032:
--
   Resolution: Fixed
Fix Version/s: 2.6.0
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2, thanks Charles.

> Add WebHDFS support for reading and writing to encryption zones
> ---
>
> Key: HDFS-7032
> URL: https://issues.apache.org/jira/browse/HDFS-7032
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: encryption, webhdfs
>Affects Versions: 3.0.0, 2.6.0
>Reporter: Stephen Chu
>Assignee: Charles Lamb
> Fix For: 2.6.0
>
> Attachments: HDFS-7032.001.patch, HDFS-7032.002.patch
>
>
> Currently, decrypting files within encryption zones does not work through 
> WebHDFS. Users will get the raw data returned instead.
> For example:
> {code}
> bash-4.1$ hdfs crypto -listZones
> /enc2 key128 
> /jenkins  key128 
> bash-4.1$ hdfs dfs -cat /enc2/hello
> hello and goodbye
> bash-4.1$ hadoop fs -cat 
> webhdfs://hdfs-cdh5-vanilla-1.host.com:20101/enc2/hello14/09/08 15:55:26 WARN 
> ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' 
> has not been set, no TrustStore will be loaded
> 忡?~?A
> ?`?y???Wbash-4.1$ 
> bash-4.1$ curl -i -L 
> "http://hdfs-cdh5-vanilla-1.host.com:20101/webhdfs/v1/enc2/hello?user.name=hdfs&op=OPEN";
> HTTP/1.1 307 TEMPORARY_REDIRECT
> Cache-Control: no-cache
> Expires: Mon, 08 Sep 2014 22:56:08 GMT
> Date: Mon, 08 Sep 2014 22:56:08 GMT
> Pragma: no-cache
> Expires: Mon, 08 Sep 2014 22:56:08 GMT
> Date: Mon, 08 Sep 2014 22:56:08 GMT
> Pragma: no-cache
> Content-Type: application/octet-stream
> Set-Cookie: 
> hadoop.auth=u=hdfs&p=hdfs&t=simple&e=1410252968270&s=QzpylAy1ltts1F6hHpsVFGC0TfA=;
>  Version=1; Path=/; Expires=Tue, 09-Sep-2014 08:56:08 GMT; HttpOnly
> Location: 
> http://hdfs-cdh5-vanilla-1.host.com:20003/webhdfs/v1/enc2/hello?op=OPEN&user.name=hdfs&namenoderpcaddress=hdfs-cdh5-vanilla-1.host.com:8020&offset=0
> Content-Length: 0
> Server: Jetty(6.1.26)
> HTTP/1.1 200 OK
> Cache-Control: no-cache
> Expires: Mon, 08 Sep 2014 22:56:08 GMT
> Date: Mon, 08 Sep 2014 22:56:08 GMT
> Pragma: no-cache
> Expires: Mon, 08 Sep 2014 22:56:08 GMT
> Date: Mon, 08 Sep 2014 22:56:08 GMT
> Pragma: no-cache
> Content-Type: application/octet-stream
> Content-Length: 18
> Access-Control-Allow-Methods: GET
> Access-Control-Allow-Origin: *
> Server: Jetty(6.1.26)
> 忡?~?A
> ?`?y???Wbash-4.1$ 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6843) Create FileStatus isEncrypted() method

2014-09-15 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134159#comment-14134159
 ] 

Andrew Wang commented on HDFS-6843:
---

I just kicked Jenkins to get another run. It seems to have run, but encountered 
some weird errors, and apparently it didn't post a JIRA comment.

> Create FileStatus isEncrypted() method
> --
>
> Key: HDFS-6843
> URL: https://issues.apache.org/jira/browse/HDFS-6843
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode, security
>Affects Versions: 3.0.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Attachments: HDFS-6843.001.patch, HDFS-6843.002.patch, 
> HDFS-6843.003.patch, HDFS-6843.004.patch, HDFS-6843.005.patch, 
> HDFS-6843.005.patch, HDFS-6843.006.patch, HDFS-6843.007.patch
>
>
> FileStatus should have a 'boolean isEncrypted()' method. (It came up in the 
> context of discussing with Andrew about FileStatus being a Writable.)
> Having this method would allow MR JobSubmitter do the following:
> -
> BOOLEAN intermediateEncryption = false
> IF jobconf.contains("mr.intermidate.encryption") THEN
>   intermediateEncryption = jobConf.getBoolean("mr.intermidate.encryption")
> ELSE
>   IF (I/O)Format INSTANCEOF File(I/O)Format THEN
> intermediateEncryption = ANY File(I/O)Format HAS a Path with status 
> isEncrypted()==TRUE
>   FI
>   jobConf.setBoolean("mr.intermidate.encryption", intermediateEncryption)
> FI



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-7062) Archival Storage: skip under construction block for migration

2014-09-15 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao resolved HDFS-7062.
-
  Resolution: Fixed
Hadoop Flags: Reviewed

Thanks Nicholas for the review. I've committed this.

> Archival Storage: skip under construction block for migration
> -
>
> Key: HDFS-7062
> URL: https://issues.apache.org/jira/browse/HDFS-7062
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer, namenode
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-7062.000.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7046) HA NN can NPE upon transition to active

2014-09-15 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134106#comment-14134106
 ] 

Daryn Sharp commented on HDFS-7046:
---

Modifying the startup of the secret manager feels hacky.  Changing active state 
in the very middle of processing an edit op seems pretty dangerous and wrong, 
even if it works today.  It's not something I would have ever expected the NN 
to do.

If the NN is allowed to exit safemode before finishing the edits replay, then 
effectively it's exiting based on consistency "at some point in the past" 
instead of being consistent "right now". 

> HA NN can NPE upon transition to active
> ---
>
> Key: HDFS-7046
> URL: https://issues.apache.org/jira/browse/HDFS-7046
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0, 2.5.0
>Reporter: Daryn Sharp
>Assignee: Kihwal Lee
>Priority: Critical
> Attachments: HDFS-7046.patch, HDFS-7046_test_reproduce.patch
>
>
> While processing edits, the NN may decide after adjusting block totals to 
> leave safe mode - in the middle of the edit.  Going active starts the secret 
> manager which generates a new secret key, which in turn generates an edit, 
> which NPEs because the edit log is not open.
> # Transitions should _not_ occur in the middle of an edit.
> # The edit log appears to claim it's open for write when the stream isn't 
> even open



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6436) WebHdfsFileSystem execute get, renew and cancel delegationtoken operation should use spnego to authenticate

2014-09-15 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134060#comment-14134060
 ] 

Daryn Sharp commented on HDFS-6436:
---

Since jira seems to be rebuilding really old patches, can you please retest and 
close if necessary?  If this is still an issue, then it's an httpfs server 
issue.  The JDK will do SPNEGO without isSpnego set to true.  We've been using 
webhdfs under production workloads for a long time.  Perhaps the httpfs server 
is configured to only initiate SPNEGO upon an OPTIONS request, which would be a 
bug in the httpfs server.
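
As a side note, a quick manual way to check whether the server side negotiates 
SPNEGO for the token operation is curl (a sketch only; the host, port and 
renewer are placeholders, and it assumes a kinit'ed ticket cache plus a 
GSS-enabled curl build):

{code}
# GETDELEGATIONTOKEN should authenticate via SPNEGO, not via a delegation token.
curl --negotiate -u : \
  "http://NAMENODE_HOST:50070/webhdfs/v1/?op=GETDELEGATIONTOKEN&renewer=hdfs"
{code}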

> WebHdfsFileSystem execute get, renew and cancel delegationtoken operation 
> should use spnego to authenticate
> ---
>
> Key: HDFS-6436
> URL: https://issues.apache.org/jira/browse/HDFS-6436
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.0.0, 2.4.0
> Environment: Kerberos
>Reporter: Bangtao Zhou
> Attachments: HDFS-6436.patch
>
>
> While in Kerberos secure mode, using WebHdfsFileSystem to access HDFS always 
> gets an 
> *org.apache.hadoop.security.authentication.client.AuthenticationException: 
> Unauthorized*. For example, calling WebHdfsFileSystem.listStatus will 
> execute a LISTSTATUS Op, and this Op should authenticate via a *delegation 
> token*, so it will execute a GETDELEGATIONTOKEN Op to get a delegation 
> token (GETDELEGATIONTOKEN actually authenticates via *SPNEGO*), but it still 
> uses a delegation token to authenticate, so it always gets an Unauthorized 
> Exception.
> Exception is like this:
> {code:java}
> 19:05:11.758 [main] DEBUG o.a.h.hdfs.web.URLConnectionFactory - open URL 
> connection
> java.io.IOException: 
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> Unauthorized
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:287)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:82)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:538)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:406)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:434)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:430)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:1058)
> 19:05:11.766 [main] DEBUG o.a.h.security.UserGroupInformation - 
> PrivilegedActionException as:bang...@cyhadoop.com (auth:KERBEROS) 
> cause:java.io.IOException: 
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> Unauthorized
>   at 
> org.apache.hadoop.hdfs.web.TokenAspect.ensureTokenInitialized(TokenAspect.java:134)
> 19:05:11.767 [main] DEBUG o.a.h.security.UserGroupInformation - 
> PrivilegedActionException as:bang...@cyhadoop.com (auth:KERBEROS) 
> cause:java.io.IOException: 
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> Unauthorized
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getDelegationToken(WebHdfsFileSystem.java:213)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getAuthParameters(WebHdfsFileSystem.java:371)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toUrl(WebHdfsFileSystem.java:392)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractFsPathRunner.getUrl(WebHdfsFileSystem.java:602)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:533)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:406)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:434)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:430)
>   at 
> org.apache.hadoop.hdfs.web.WebHdfsFileSystem.listStatus(WebHdfsFileSystem.java:1037)
>   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1483)
>   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1523)
>   at org.apache.ha

[jira] [Commented] (HDFS-5546) race condition crashes "hadoop ls -R" when directories are moved/removed

2014-09-15 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134044#comment-14134044
 ] 

Daryn Sharp commented on HDFS-5546:
---

Consistency is always good.  However, the issue with the curly set expansions 
is that they arguably shouldn't have been part of the globber.  The shell 
pre-expands curly sets before attempting any glob expansion.  Currently that's 
all done by the globber, so I'm not sure there's much we can do, at least in 
this jira.

I'm +1 on the current patch assuming others are too.
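
For anyone following along, the ordering described above is easy to see in a 
plain shell, which rewrites the curly set textually before any glob matching 
happens (illustrative shell only, not Hadoop code):

{code}
# Brace expansion runs first and is purely textual:
#   /data/{2013,2014}/*.log  ->  /data/2013/*.log /data/2014/*.log
# Only afterwards is glob (pathname) expansion applied to each resulting word.
echo /data/{2013,2014}/*.log
{code}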



> race condition crashes "hadoop ls -R" when directories are moved/removed
> 
>
> Key: HDFS-5546
> URL: https://issues.apache.org/jira/browse/HDFS-5546
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Colin Patrick McCabe
>Assignee: Lei (Eddy) Xu
>Priority: Minor
> Attachments: HDFS-5546.1.patch, HDFS-5546.2.000.patch, 
> HDFS-5546.2.001.patch, HDFS-5546.2.002.patch, HDFS-5546.2.003.patch, 
> HDFS-5546.2.004.patch
>
>
> This seems to be a rare race condition where we have a sequence of events 
> like this:
> 1. org.apache.hadoop.shell.Ls calls DFS#getFileStatus on directory D.
> 2. someone deletes or moves directory D
> 3. org.apache.hadoop.shell.Ls calls PathData#getDirectoryContents(D), which 
> calls DFS#listStatus(D). This throws FileNotFoundException.
> 4. ls command terminates with FNF



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6826) Plugin interface to enable delegation of HDFS authorization assertions

2014-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134026#comment-14134026
 ] 

Hadoop QA commented on HDFS-6826:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12668780/HDFS-6826-permchecker.patch
  against trunk revision 24d920b.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8029//console

This message is automatically generated.

> Plugin interface to enable delegation of HDFS authorization assertions
> --
>
> Key: HDFS-6826
> URL: https://issues.apache.org/jira/browse/HDFS-6826
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.4.1
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFS-6826-idea.patch, HDFS-6826-idea2.patch, 
> HDFS-6826-permchecker.patch, HDFS-6826v3.patch, HDFS-6826v4.patch, 
> HDFS-6826v5.patch, HDFS-6826v6.patch, HDFS-6826v7.1.patch, 
> HDFS-6826v7.2.patch, HDFS-6826v7.3.patch, HDFS-6826v7.4.patch, 
> HDFS-6826v7.5.patch, HDFS-6826v7.6.patch, HDFS-6826v7.patch, 
> HDFS-6826v8.patch, HDFSPluggableAuthorizationProposal-v2.pdf, 
> HDFSPluggableAuthorizationProposal.pdf
>
>
> When Hbase data, HiveMetaStore data or Search data is accessed via services 
> (Hbase region servers, HiveServer2, Impala, Solr) the services can enforce 
> permissions on corresponding entities (databases, tables, views, columns, 
> search collections, documents). It is desirable, when the data is accessed 
> directly by users accessing the underlying data files (i.e. from a MapReduce 
> job), that the permission of the data files map to the permissions of the 
> corresponding data entity (i.e. table, column family or search collection).
> To enable this we need to have the necessary hooks in place in the NameNode 
> to delegate authorization to an external system that can map HDFS 
> files/directories to data entities and resolve their permissions based on the 
> data entities permissions.
> I’ll be posting a design proposal in the next few days.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6826) Plugin interface to enable delegation of HDFS authorization assertions

2014-09-15 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-6826:
--
Attachment: HDFS-6826-permchecker.patch

This intentionally won't compile, since the plugin class is not defined.

> Plugin interface to enable delegation of HDFS authorization assertions
> --
>
> Key: HDFS-6826
> URL: https://issues.apache.org/jira/browse/HDFS-6826
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.4.1
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFS-6826-idea.patch, HDFS-6826-idea2.patch, 
> HDFS-6826-permchecker.patch, HDFS-6826v3.patch, HDFS-6826v4.patch, 
> HDFS-6826v5.patch, HDFS-6826v6.patch, HDFS-6826v7.1.patch, 
> HDFS-6826v7.2.patch, HDFS-6826v7.3.patch, HDFS-6826v7.4.patch, 
> HDFS-6826v7.5.patch, HDFS-6826v7.6.patch, HDFS-6826v7.patch, 
> HDFS-6826v8.patch, HDFSPluggableAuthorizationProposal-v2.pdf, 
> HDFSPluggableAuthorizationProposal.pdf
>
>
> When Hbase data, HiveMetaStore data or Search data is accessed via services 
> (Hbase region servers, HiveServer2, Impala, Solr) the services can enforce 
> permissions on corresponding entities (databases, tables, views, columns, 
> search collections, documents). It is desirable, when the data is accessed 
> directly by users accessing the underlying data files (i.e. from a MapReduce 
> job), that the permission of the data files map to the permissions of the 
> corresponding data entity (i.e. table, column family or search collection).
> To enable this we need to have the necessary hooks in place in the NameNode 
> to delegate authorization to an external system that can map HDFS 
> files/directories to data entities and resolve their permissions based on the 
> data entities permissions.
> I’ll be posting a design proposal in the next few days.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6826) Plugin interface to enable delegation of HDFS authorization assertions

2014-09-15 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134006#comment-14134006
 ] 

Daryn Sharp commented on HDFS-6826:
---

Apologies for my delay, again.  Awhile back, Alejandro and I spoke offline 
about how hooking in at the inode level and/or allowing a plugin to completely 
override the permission checking logic is detrimental.  The latest patch 
doesn't  directly splice into the inodes, but for reference the reason it's bad 
is because replaying edits, whether at startup or HA standby, should not ever 
go through the plugin.

Allowing complete override of the permission checking is bad too.  Plugins 
should not be in control of the traversal or iteration of inodes.  The plugin 
should be just that - a hook for the core permission checking.  Otherwise 
plugins will be fragile when changes are made to path resolution.  And/or 
plugins will have to implement duplicated logic.

With backlogged feature work I'm doing for optional fine grain locks, path 
resolution, the locking, and permission checking will all be folded together.  
In order for locking to work, inodes will be "resolved as you go".  About the 
only way to ensure compatibility is for the permission checker to have a hook 
to call out to the plugin.  The plugin cannot override the core behavior.

I'll attach an incomplete example patch for how an external plugin can 
substitute different inode attrs during a resolution.  It can be further 
optimized (I scaled it back), but it demonstrates the basic principle.  It 
doesn't address that Argus needs another hook to provide further permission 
checks, but it meets Alejandro's needs.  An enhancement for Argus can be another 
jira.
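
To make the shape of such a hook concrete, a rough sketch of what "substituting 
different inode attrs during a resolution" could look like follows; the 
interface and all names in it are invented for illustration and are not the 
attached patch.

{code:java}
import org.apache.hadoop.fs.permission.FsPermission;

/**
 * Hypothetical hook: the core permission checker keeps control of path
 * resolution and, for each inode it resolves, asks the plugin whether the
 * attributes used for the check should be substituted.
 */
public interface InodeAttributeOverridePlugin {

  /** The attributes the core permission checker actually evaluates. */
  final class Attributes {
    public final String owner;
    public final String group;
    public final FsPermission permission;

    public Attributes(String owner, String group, FsPermission permission) {
      this.owner = owner;
      this.group = group;
      this.permission = permission;
    }
  }

  /**
   * Called for each path component as it is resolved.  Returning the defaults
   * unchanged preserves core HDFS behavior; the plugin never overrides
   * traversal or the permission-check logic itself.
   */
  Attributes getAttributes(String[] pathComponents, Attributes defaults);
}
{code}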

> Plugin interface to enable delegation of HDFS authorization assertions
> --
>
> Key: HDFS-6826
> URL: https://issues.apache.org/jira/browse/HDFS-6826
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: security
>Affects Versions: 2.4.1
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: HDFS-6826-idea.patch, HDFS-6826-idea2.patch, 
> HDFS-6826v3.patch, HDFS-6826v4.patch, HDFS-6826v5.patch, HDFS-6826v6.patch, 
> HDFS-6826v7.1.patch, HDFS-6826v7.2.patch, HDFS-6826v7.3.patch, 
> HDFS-6826v7.4.patch, HDFS-6826v7.5.patch, HDFS-6826v7.6.patch, 
> HDFS-6826v7.patch, HDFS-6826v8.patch, 
> HDFSPluggableAuthorizationProposal-v2.pdf, 
> HDFSPluggableAuthorizationProposal.pdf
>
>
> When Hbase data, HiveMetaStore data or Search data is accessed via services 
> (Hbase region servers, HiveServer2, Impala, Solr) the services can enforce 
> permissions on corresponding entities (databases, tables, views, columns, 
> search collections, documents). It is desirable, when the data is accessed 
> directly by users accessing the underlying data files (i.e. from a MapReduce 
> job), that the permission of the data files map to the permissions of the 
> corresponding data entity (i.e. table, column family or search collection).
> To enable this we need to have the necessary hooks in place in the NameNode 
> to delegate authorization to an external system that can map HDFS 
> files/directories to data entities and resolve their permissions based on the 
> data entities permissions.
> I’ll be posting a design proposal in the next few days.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7032) Add WebHDFS support for reading and writing to encryption zones

2014-09-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133957#comment-14133957
 ] 

Hadoop QA commented on HDFS-7032:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12668752/HDFS-7032.002.patch
  against trunk revision fc741b5.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract
  
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8026//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8026//console

This message is automatically generated.

> Add WebHDFS support for reading and writing to encryption zones
> ---
>
> Key: HDFS-7032
> URL: https://issues.apache.org/jira/browse/HDFS-7032
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: encryption, webhdfs
>Affects Versions: 3.0.0, 2.6.0
>Reporter: Stephen Chu
>Assignee: Charles Lamb
> Attachments: HDFS-7032.001.patch, HDFS-7032.002.patch
>
>
> Currently, decrypting files within encryption zones does not work through 
> WebHDFS. Users get the raw data returned instead.
> For example:
> {code}
> bash-4.1$ hdfs crypto -listZones
> /enc2 key128 
> /jenkins  key128 
> bash-4.1$ hdfs dfs -cat /enc2/hello
> hello and goodbye
> bash-4.1$ hadoop fs -cat 
> webhdfs://hdfs-cdh5-vanilla-1.host.com:20101/enc2/hello14/09/08 15:55:26 WARN 
> ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' 
> has not been set, no TrustStore will be loaded
> 忡?~?A
> ?`?y???Wbash-4.1$ 
> bash-4.1$ curl -i -L 
> "http://hdfs-cdh5-vanilla-1.host.com:20101/webhdfs/v1/enc2/hello?user.name=hdfs&op=OPEN";
> HTTP/1.1 307 TEMPORARY_REDIRECT
> Cache-Control: no-cache
> Expires: Mon, 08 Sep 2014 22:56:08 GMT
> Date: Mon, 08 Sep 2014 22:56:08 GMT
> Pragma: no-cache
> Expires: Mon, 08 Sep 2014 22:56:08 GMT
> Date: Mon, 08 Sep 2014 22:56:08 GMT
> Pragma: no-cache
> Content-Type: application/octet-stream
> Set-Cookie: 
> hadoop.auth=u=hdfs&p=hdfs&t=simple&e=1410252968270&s=QzpylAy1ltts1F6hHpsVFGC0TfA=;
>  Version=1; Path=/; Expires=Tue, 09-Sep-2014 08:56:08 GMT; HttpOnly
> Location: 
> http://hdfs-cdh5-vanilla-1.host.com:20003/webhdfs/v1/enc2/hello?op=OPEN&user.name=hdfs&namenoderpcaddress=hdfs-cdh5-vanilla-1.host.com:8020&offset=0
> Content-Length: 0
> Server: Jetty(6.1.26)
> HTTP/1.1 200 OK
> Cache-Control: no-cache
> Expires: Mon, 08 Sep 2014 22:56:08 GMT
> Date: Mon, 08 Sep 2014 22:56:08 GMT
> Pragma: no-cache
> Expires: Mon, 08 Sep 2014 22:56:08 GMT
> Date: Mon, 08 Sep 2014 22:56:08 GMT
> Pragma: no-cache
> Content-Type: application/octet-stream
> Content-Length: 18
> Access-Control-Allow-Methods: GET
> Access-Control-Allow-Origin: *
> Server: Jetty(6.1.26)
> 忡?~?A
> ?`?y???Wbash-4.1$ 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7061) Add test to verify encryption zone creation after NameNode restart without saving namespace

2014-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133916#comment-14133916
 ] 

Hudson commented on HDFS-7061:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1872 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1872/])
HDFS-7061. Add test to verify encryption zone creation after NameNode restart 
without saving namespace. Contributed by Stephen Chu. (wang: rev 
fc741b5d78e7e006355e17b1b5839f502e37261b)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestEncryptionZones.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Add test to verify encryption zone creation after NameNode restart without 
> saving namespace
> ---
>
> Key: HDFS-7061
> URL: https://issues.apache.org/jira/browse/HDFS-7061
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: encryption, test
>Affects Versions: 3.0.0, 2.6.0
>Reporter: Stephen Chu
>Assignee: Stephen Chu
>Priority: Minor
> Fix For: 2.6.0
>
> Attachments: HDFS-7061.1.patch
>
>
> Right now we verify that encryption zones are expected after saving the 
> namespace and restarting the NameNode.
> We should also verify that encryption zone modifications are expected after 
> restarting the NameNode without saving the namespace.
> This is similar to TestFSImageWithXAttr and TestFSImageWithAcl where we 
> toggle NN restarts with saving namespace and not saving namespace.
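
A rough sketch of the no-saveNamespace half of such a test, assuming the usual 
TestEncryptionZones fixtures (cluster, fs, dfsAdmin, TEST_KEY) and the 
MiniDFSCluster/HdfsAdmin methods as I recall them; an approximation, not the 
committed test:

{code:java}
// Create a zone, then restart the NN *without* saving the namespace,
// so the zone has to be reconstructed from the edit log alone.
final Path zone = new Path("/zone-from-edits");
fs.mkdirs(zone);
dfsAdmin.createEncryptionZone(zone, TEST_KEY);

cluster.restartNameNode(true);   // note: no fs.saveNamespace() beforehand

assertNotNull("encryption zone should survive an edits-only restart",
    dfsAdmin.getEncryptionZoneForPath(zone));
{code}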



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6642) [ FsShell - help message ] Need to refactor usage info for shell commands

2014-09-15 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133903#comment-14133903
 ] 

Allen Wittenauer commented on HDFS-6642:


Please don't set the fix version.  It will be set when a patch has been applied 
to the source base.

> [ FsShell - help message ] Need to refactor usage info for shell commands
> -
>
> Key: HDFS-6642
> URL: https://issues.apache.org/jira/browse/HDFS-6642
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Brahma Reddy Battula
>
> The following two can be refactored: 
> 1 ) private final String usagePrefix =
> "Usage: hadoop fs [generic options]";
> This will be printed for hadoop fs, hdfs dfs, ... I feel it can be separated 
> out.
> 2 ) The following generic options will also be printed for dfs and haadmin, 
> which is not required.
> ToolRunner.printGenericCommandUsage(out);
>   public static void printGenericCommandUsage(PrintStream out) {
> 
> out.println("Generic options supported are");
> out.println("-conf <configuration file>     specify an application 
> configuration file");
> out.println("-D <property=value>            use value for given 
> property");
> out.println("-fs <local|namenode:port>      specify a namenode");
> out.println("-jt <local|jobtracker:port>    specify a job tracker");
> out.println("-files <comma separated list of files>    " + 
>   "specify comma separated files to be copied to the map reduce cluster");
> out.println("-libjars <comma separated list of jars>    " +
>   "specify comma separated jar files to include in the classpath.");
> out.println("-archives <comma separated list of archives>    " +
> "specify comma separated archives to be unarchived" +
> " on the compute machines.\n");
> out.println("The general command line syntax is");
> out.println("bin/hadoop command [genericOptions] [commandOptions]\n");
>   }
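
For the first point, a trivial sketch of one possible direction (the class and 
constructor below are invented for illustration): let each entry point pass in 
its own command name instead of hard-coding "hadoop fs".

{code:java}
// Sketch only: the usage prefix becomes data supplied by the caller, so
// "hadoop fs", "hdfs dfs" and "hdfs haadmin" can each print their own.
public class CommandUsage {
  private final String usagePrefix;

  public CommandUsage(String commandName) {
    this.usagePrefix = "Usage: " + commandName + " [generic options]";
  }

  public String prefix() {
    return usagePrefix;
  }
}
{code}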



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6642) [ FsShell - help message ] Need to refactor usage info for shell commands

2014-09-15 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-6642:
---
Target Version/s: 3.0.0

> [ FsShell - help message ] Need to refactor usage info for shell commands
> -
>
> Key: HDFS-6642
> URL: https://issues.apache.org/jira/browse/HDFS-6642
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Brahma Reddy Battula
>
> The following two can be refactored: 
> 1 ) private final String usagePrefix =
> "Usage: hadoop fs [generic options]";
> This will be printed for hadoop fs, hdfs dfs, ... I feel it can be separated 
> out.
> 2 ) The following generic options will also be printed for dfs and haadmin, 
> which is not required.
> ToolRunner.printGenericCommandUsage(out);
>   public static void printGenericCommandUsage(PrintStream out) {
> 
> out.println("Generic options supported are");
> out.println("-conf <configuration file>     specify an application 
> configuration file");
> out.println("-D <property=value>            use value for given 
> property");
> out.println("-fs <local|namenode:port>      specify a namenode");
> out.println("-jt <local|jobtracker:port>    specify a job tracker");
> out.println("-files <comma separated list of files>    " + 
>   "specify comma separated files to be copied to the map reduce cluster");
> out.println("-libjars <comma separated list of jars>    " +
>   "specify comma separated jar files to include in the classpath.");
> out.println("-archives <comma separated list of archives>    " +
> "specify comma separated archives to be unarchived" +
> " on the compute machines.\n");
> out.println("The general command line syntax is");
> out.println("bin/hadoop command [genericOptions] [commandOptions]\n");
>   }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6642) [ FsShell - help message ] Need to refactor usage info for shell commands

2014-09-15 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-6642:
---
Fix Version/s: (was: 3.0.0)

> [ FsShell - help message ] Need to refactor usage info for shell commands
> -
>
> Key: HDFS-6642
> URL: https://issues.apache.org/jira/browse/HDFS-6642
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Brahma Reddy Battula
>
> The following two can be refactored: 
> 1 ) private final String usagePrefix =
> "Usage: hadoop fs [generic options]";
> This will be printed for hadoop fs, hdfs dfs, ... I feel it can be separated 
> out.
> 2 ) The following generic options will also be printed for dfs and haadmin, 
> which is not required.
> ToolRunner.printGenericCommandUsage(out);
>   public static void printGenericCommandUsage(PrintStream out) {
> 
> out.println("Generic options supported are");
> out.println("-conf <configuration file>     specify an application 
> configuration file");
> out.println("-D <property=value>            use value for given 
> property");
> out.println("-fs <local|namenode:port>      specify a namenode");
> out.println("-jt <local|jobtracker:port>    specify a job tracker");
> out.println("-files <comma separated list of files>    " + 
>   "specify comma separated files to be copied to the map reduce cluster");
> out.println("-libjars <comma separated list of jars>    " +
>   "specify comma separated jar files to include in the classpath.");
> out.println("-archives <comma separated list of archives>    " +
> "specify comma separated archives to be unarchived" +
> " on the compute machines.\n");
> out.println("The general command line syntax is");
> out.println("bin/hadoop command [genericOptions] [commandOptions]\n");
>   }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7061) Add test to verify encryption zone creation after NameNode restart without saving namespace

2014-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133898#comment-14133898
 ] 

Hudson commented on HDFS-7061:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1897 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1897/])
HDFS-7061. Add test to verify encryption zone creation after NameNode restart 
without saving namespace. Contributed by Stephen Chu. (wang: rev 
fc741b5d78e7e006355e17b1b5839f502e37261b)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestEncryptionZones.java


> Add test to verify encryption zone creation after NameNode restart without 
> saving namespace
> ---
>
> Key: HDFS-7061
> URL: https://issues.apache.org/jira/browse/HDFS-7061
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: encryption, test
>Affects Versions: 3.0.0, 2.6.0
>Reporter: Stephen Chu
>Assignee: Stephen Chu
>Priority: Minor
> Fix For: 2.6.0
>
> Attachments: HDFS-7061.1.patch
>
>
> Right now we verify that encryption zones are expected after saving the 
> namespace and restarting the NameNode.
> We should also verify that encryption zone modifications are expected after 
> restarting the NameNode without saving the namespace.
> This is similar to TestFSImageWithXAttr and TestFSImageWithAcl where we 
> toggle NN restarts with saving namespace and not saving namespace.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6843) Create FileStatus isEncrypted() method

2014-09-15 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-6843:
---
Attachment: HDFS-6843.007.patch

[~andrew.wang],

The .007 patch includes FsPermissionExtension and fixes the test.

Thanks!



> Create FileStatus isEncrypted() method
> --
>
> Key: HDFS-6843
> URL: https://issues.apache.org/jira/browse/HDFS-6843
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode, security
>Affects Versions: 3.0.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Attachments: HDFS-6843.001.patch, HDFS-6843.002.patch, 
> HDFS-6843.003.patch, HDFS-6843.004.patch, HDFS-6843.005.patch, 
> HDFS-6843.005.patch, HDFS-6843.006.patch, HDFS-6843.007.patch
>
>
> FileStatus should have a 'boolean isEncrypted()' method. (it was in the 
> context of discussing with Andrew about FileStatus being a Writable).
> Having this method would allow MR JobSubmitter do the following:
> -
> BOOLEAN intermediateEncryption = false
> IF jobconf.contains("mr.intermidate.encryption") THEN
>   intermediateEncryption = jobConf.getBoolean("mr.intermidate.encryption")
> ELSE
>   IF (I/O)Format INSTANCEOF File(I/O)Format THEN
> intermediateEncryption = ANY File(I/O)Format HAS a Path with status 
> isEncrypted()==TRUE
>   FI
>   jobConf.setBoolean("mr.intermidate.encryption", intermediateEncryption)
> FI



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6843) Create FileStatus isEncrypted() method

2014-09-15 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-6843:
---
Status: Patch Available  (was: In Progress)

> Create FileStatus isEncrypted() method
> --
>
> Key: HDFS-6843
> URL: https://issues.apache.org/jira/browse/HDFS-6843
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode, security
>Affects Versions: 3.0.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Attachments: HDFS-6843.001.patch, HDFS-6843.002.patch, 
> HDFS-6843.003.patch, HDFS-6843.004.patch, HDFS-6843.005.patch, 
> HDFS-6843.005.patch, HDFS-6843.006.patch, HDFS-6843.007.patch
>
>
> FileStatus should have a 'boolean isEncrypted()' method. (it was in the 
> context of discussing with Andrew about FileStatus being a Writable).
> Having this method would allow MR JobSubmitter do the following:
> -
> BOOLEAN intermediateEncryption = false
> IF jobconf.contains("mr.intermidate.encryption") THEN
>   intermediateEncryption = jobConf.getBoolean("mr.intermidate.encryption")
> ELSE
>   IF (I/O)Format INSTANCEOF File(I/O)Format THEN
> intermediateEncryption = ANY File(I/O)Format HAS a Path with status 
> isEncrypted()==TRUE
>   FI
>   jobConf.setBoolean("mr.intermidate.encryption", intermediateEncryption)
> FI



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6642) [ FsShell - help message ] Need to refactor usage info for shell commands

2014-09-15 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-6642:
---
Fix Version/s: 3.0.0

> [ FsShell - help message ] Need to refactor usage info for shell commands
> -
>
> Key: HDFS-6642
> URL: https://issues.apache.org/jira/browse/HDFS-6642
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Brahma Reddy Battula
> Fix For: 3.0.0
>
>
> The following two can be refactored: 
> 1 ) private final String usagePrefix =
> "Usage: hadoop fs [generic options]";
> This will be printed for hadoop fs, hdfs dfs, ... I feel it can be separated 
> out.
> 2 ) The following generic options will also be printed for dfs and haadmin, 
> which is not required.
> ToolRunner.printGenericCommandUsage(out);
>   public static void printGenericCommandUsage(PrintStream out) {
> 
> out.println("Generic options supported are");
> out.println("-conf <configuration file>     specify an application 
> configuration file");
> out.println("-D <property=value>            use value for given 
> property");
> out.println("-fs <local|namenode:port>      specify a namenode");
> out.println("-jt <local|jobtracker:port>    specify a job tracker");
> out.println("-files <comma separated list of files>    " + 
>   "specify comma separated files to be copied to the map reduce cluster");
> out.println("-libjars <comma separated list of jars>    " +
>   "specify comma separated jar files to include in the classpath.");
> out.println("-archives <comma separated list of archives>    " +
> "specify comma separated archives to be unarchived" +
> " on the compute machines.\n");
> out.println("The general command line syntax is");
> out.println("bin/hadoop command [genericOptions] [commandOptions]\n");
>   }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6705) Create an XAttr that disallows the HDFS admin from accessing a file

2014-09-15 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-6705:
---
Attachment: HDFS-6705.007.patch

bq. I see getXAttrs and unprotectedGetXAttrs in this patch; is this change 
supposed to be here?

Yes. This is in response to your previous comment "We're still doing another 
path resolution to do checkUnreadableBySuperuser. Can we try to reuse the inode 
from the IIP just below? This would also let us avoid throwing IOException in 
the check method."

bq. Would still prefer if in the test, special1 was renamed to something else 
like security1

Oh sorry. When you said "Mention of "special" xattr is non-specific, could we 
say "unreadable by superuser" or "UBS" or something instead?" I thought you 
were only referring to the comments. Anyway, I've changed it from special1 to 
security1.


> Create an XAttr that disallows the HDFS admin from accessing a file
> ---
>
> Key: HDFS-6705
> URL: https://issues.apache.org/jira/browse/HDFS-6705
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode, security
>Affects Versions: 3.0.0
>Reporter: Charles Lamb
>Assignee: Charles Lamb
> Attachments: HDFS-6705.001.patch, HDFS-6705.002.patch, 
> HDFS-6705.003.patch, HDFS-6705.004.patch, HDFS-6705.005.patch, 
> HDFS-6705.006.patch, HDFS-6705.007.patch
>
>
> There needs to be an xattr that specifies that the HDFS admin can not access 
> a file. This is needed for m/r delegation tokens and data at rest encryption.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7032) Add WebHDFS support for reading and writing to encryption zones

2014-09-15 Thread Charles Lamb (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Lamb updated HDFS-7032:
---
Attachment: HDFS-7032.002.patch

Rebased.

> Add WebHDFS support for reading and writing to encryption zones
> ---
>
> Key: HDFS-7032
> URL: https://issues.apache.org/jira/browse/HDFS-7032
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: encryption, webhdfs
>Affects Versions: 3.0.0, 2.6.0
>Reporter: Stephen Chu
>Assignee: Charles Lamb
> Attachments: HDFS-7032.001.patch, HDFS-7032.002.patch
>
>
> Currently, decrypting files within encryption zones does not work through 
> WebHDFS. Users get the raw data returned instead.
> For example:
> {code}
> bash-4.1$ hdfs crypto -listZones
> /enc2 key128 
> /jenkins  key128 
> bash-4.1$ hdfs dfs -cat /enc2/hello
> hello and goodbye
> bash-4.1$ hadoop fs -cat 
> webhdfs://hdfs-cdh5-vanilla-1.host.com:20101/enc2/hello14/09/08 15:55:26 WARN 
> ssl.FileBasedKeyStoresFactory: The property 'ssl.client.truststore.location' 
> has not been set, no TrustStore will be loaded
> 忡?~?A
> ?`?y???Wbash-4.1$ 
> bash-4.1$ curl -i -L 
> "http://hdfs-cdh5-vanilla-1.host.com:20101/webhdfs/v1/enc2/hello?user.name=hdfs&op=OPEN";
> HTTP/1.1 307 TEMPORARY_REDIRECT
> Cache-Control: no-cache
> Expires: Mon, 08 Sep 2014 22:56:08 GMT
> Date: Mon, 08 Sep 2014 22:56:08 GMT
> Pragma: no-cache
> Expires: Mon, 08 Sep 2014 22:56:08 GMT
> Date: Mon, 08 Sep 2014 22:56:08 GMT
> Pragma: no-cache
> Content-Type: application/octet-stream
> Set-Cookie: 
> hadoop.auth=u=hdfs&p=hdfs&t=simple&e=1410252968270&s=QzpylAy1ltts1F6hHpsVFGC0TfA=;
>  Version=1; Path=/; Expires=Tue, 09-Sep-2014 08:56:08 GMT; HttpOnly
> Location: 
> http://hdfs-cdh5-vanilla-1.host.com:20003/webhdfs/v1/enc2/hello?op=OPEN&user.name=hdfs&namenoderpcaddress=hdfs-cdh5-vanilla-1.host.com:8020&offset=0
> Content-Length: 0
> Server: Jetty(6.1.26)
> HTTP/1.1 200 OK
> Cache-Control: no-cache
> Expires: Mon, 08 Sep 2014 22:56:08 GMT
> Date: Mon, 08 Sep 2014 22:56:08 GMT
> Pragma: no-cache
> Expires: Mon, 08 Sep 2014 22:56:08 GMT
> Date: Mon, 08 Sep 2014 22:56:08 GMT
> Pragma: no-cache
> Content-Type: application/octet-stream
> Content-Length: 18
> Access-Control-Allow-Methods: GET
> Access-Control-Allow-Origin: *
> Server: Jetty(6.1.26)
> 忡?~?A
> ?`?y???Wbash-4.1$ 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7061) Add test to verify encryption zone creation after NameNode restart without saving namespace

2014-09-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133805#comment-14133805
 ] 

Hudson commented on HDFS-7061:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #681 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/681/])
HDFS-7061. Add test to verify encryption zone creation after NameNode restart 
without saving namespace. Contributed by Stephen Chu. (wang: rev 
fc741b5d78e7e006355e17b1b5839f502e37261b)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestEncryptionZones.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Add test to verify encryption zone creation after NameNode restart without 
> saving namespace
> ---
>
> Key: HDFS-7061
> URL: https://issues.apache.org/jira/browse/HDFS-7061
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: encryption, test
>Affects Versions: 3.0.0, 2.6.0
>Reporter: Stephen Chu
>Assignee: Stephen Chu
>Priority: Minor
> Fix For: 2.6.0
>
> Attachments: HDFS-7061.1.patch
>
>
> Right now we verify that encryption zones are expected after saving the 
> namespace and restarting the NameNode.
> We should also verify that encryption zone modifications are expected after 
> restarting the NameNode without saving the namespace.
> This is similar to TestFSImageWithXAttr and TestFSImageWithAcl where we 
> toggle NN restarts with saving namespace and not saving namespace.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-2932) Under replicated block after the pipeline recovery.

2014-09-15 Thread Srikanth Upputuri (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133800#comment-14133800
 ] 

Srikanth Upputuri commented on HDFS-2932:
-

[~vinayrpet], though I really don't see a reason why we should not delete a 
mis-stamped replica (during block report processing) after the block is 
committed, I agree with you that this improvement in early detection may be 
unnecessary (or even slightly risky?), particularly when the benefit is so 
small.

Can I mark it as a duplicate of HDFS-3493?

> Under replicated block after the pipeline recovery.
> ---
>
> Key: HDFS-2932
> URL: https://issues.apache.org/jira/browse/HDFS-2932
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 0.24.0
>Reporter: J.Andreina
> Fix For: 0.24.0
>
>
> Started 1 NN, DN1, DN2, DN3 on the same machine.
> Wrote a huge file of size 2 GB.
> While the write for block-id-1005 was in progress, brought down DN3.
> After the pipeline recovery happened, the block stamp changed to block_id_1006 
> on DN1, DN2.
> After the write was over, DN3 was brought up and the fsck command was issued.
> The following message is displayed:
> "block-id_1006 is under replicated. Target replicas is 3 but found 2 replicas".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-3586) Blocks are not getting replicate even DN's are availble.

2014-09-15 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B resolved HDFS-3586.
-
Resolution: Duplicate

Resolving as duplicate

> Blocks are not getting replicate even DN's are availble.
> 
>
> Key: HDFS-3586
> URL: https://issues.apache.org/jira/browse/HDFS-3586
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-3586-analysis.txt
>
>
> Scenario:
> =
> Started four DNs (say DN1, DN2, DN3 and DN4).
> Writing files with RF=3.
> Formed pipeline with DN1->DN2->DN3.
> Since DN3's network is very slow, it's not able to send acks.
> The pipeline is formed again with DN1->DN2->DN4.
> Here DN4's network is also slow.
> So finally commitblocksync happened to DN1 and DN2 successfully.
> The block is present on all four DNs (finalized state on two DNs and rbw state 
> on the other DNs).
> Here the NN is asking DN3 and DN4 to replicate, but it's failing since replicas 
> are already present in the RBW dir.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HDFS-3586) Blocks are not getting replicate even DN's are availble.

2014-09-15 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B reopened HDFS-3586:
-

> Blocks are not getting replicate even DN's are availble.
> 
>
> Key: HDFS-3586
> URL: https://issues.apache.org/jira/browse/HDFS-3586
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-3586-analysis.txt
>
>
> Scenario:
> =
> Started four DNs (say DN1, DN2, DN3 and DN4).
> Writing files with RF=3.
> Formed pipeline with DN1->DN2->DN3.
> Since DN3's network is very slow, it's not able to send acks.
> The pipeline is formed again with DN1->DN2->DN4.
> Here DN4's network is also slow.
> So finally commitblocksync happened to DN1 and DN2 successfully.
> The block is present on all four DNs (finalized state on two DNs and rbw state 
> on the other DNs).
> Here the NN is asking DN3 and DN4 to replicate, but it's failing since replicas 
> are already present in the RBW dir.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-3586) Blocks are not getting replicate even DN's are availble.

2014-09-15 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula resolved HDFS-3586.

Resolution: Fixed

Closing, since it is addressed as part of HDFS-3493.

> Blocks are not getting replicate even DN's are availble.
> 
>
> Key: HDFS-3586
> URL: https://issues.apache.org/jira/browse/HDFS-3586
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-3586-analysis.txt
>
>
> Scenario:
> =
> Started four DNs (say DN1, DN2, DN3 and DN4).
> Writing files with RF=3.
> Formed pipeline with DN1->DN2->DN3.
> Since DN3's network is very slow, it's not able to send acks.
> The pipeline is formed again with DN1->DN2->DN4.
> Here DN4's network is also slow.
> So finally commitblocksync happened to DN1 and DN2 successfully.
> The block is present on all four DNs (finalized state on two DNs and rbw state 
> on the other DNs).
> Here the NN is asking DN3 and DN4 to replicate, but it's failing since replicas 
> are already present in the RBW dir.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-3586) Blocks are not getting replicate even DN's are availble.

2014-09-15 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula reassigned HDFS-3586:
--

Assignee: Brahma Reddy Battula  (was: amith)

> Blocks are not getting replicate even DN's are availble.
> 
>
> Key: HDFS-3586
> URL: https://issues.apache.org/jira/browse/HDFS-3586
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
> Attachments: HDFS-3586-analysis.txt
>
>
> Scenario:
> =
> Started four DNs (say DN1, DN2, DN3 and DN4).
> Writing files with RF=3.
> Formed pipeline with DN1->DN2->DN3.
> Since DN3's network is very slow, it's not able to send acks.
> The pipeline is formed again with DN1->DN2->DN4.
> Here DN4's network is also slow.
> So finally commitblocksync happened to DN1 and DN2 successfully.
> The block is present on all four DNs (finalized state on two DNs and rbw state 
> on the other DNs).
> Here the NN is asking DN3 and DN4 to replicate, but it's failing since replicas 
> are already present in the RBW dir.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-2932) Under replicated block after the pipeline recovery.

2014-09-15 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133717#comment-14133717
 ] 

Vinayakumar B edited comment on HDFS-2932 at 9/15/14 8:56 AM:
--

Hi [~usrikanth],
Thanks for looking into this issue.

1. Case 1 is solved in HDFS-3493.

2. Case 2: I understand your point.
I agree that it would be better to detect the failed replicas as early as 
possible and delete them, but we cannot do that on the fly while writing 
itself. It may delete the working copy itself if the report from the datanode 
is a little delayed, and this is possible in the case of a huge cluster. It's 
better to keep the corrupt replica for some time instead of losing the valid 
replica. Similar cases have been observed and code has been added to ignore 
such variations; see the comment below in BlockManager.java.
{code}
  // If it's a RBW report for a COMPLETE block, it may just be that
  // the block report got a little bit delayed after the pipeline
  // closed. So, ignore this report, assuming we will get a
  // FINALIZED replica later. See HDFS-2791
{code}
{quote}
Solution is to be able to detect and capture a write-pipeline-failed replica as 
early as possible. First fix may be to change the check from 'isCompleted' to 
'isCommitted'. This will capture write-pipeline-failed replicas reported just 
after commit and before 'complete' and mark them as corrupt.
{quote}
The time gap between 'isCommitted' and 'isCompleted' is not so huge, so ideally 
this will not change much.

{quote}
Then to capture write-pipeline-failed replicas reported before commit, I am 
investigating if this can be solved by marking them as corrupt as part of 
commit. There already exists a check to find any mis-stamped replicas during 
commit but we only remove them from the blocksMap. In addition can we not mark 
such replicas as corrupt?
{quote}
"setGenerationStampAndVerifyReplicas" is just updating the in-memory states of 
the replicas being written in the Namenode; it's not changing the blocksMap. I 
don't think this is the right place to decide about the corrupt replicas.

I think it's always better to handle the block validations when they are 
reported from the datanode, yes it takes time :(


was (Author: vinayrpet):
Hi [~usrikanth],
Thanks for looking into this issue.

1. Case1  is a solved in HDFS-3493

2. Case 2: I understand your point,
I agree that it will be better to detect the failed replicas as early as 
possible and delete it. but we cannot do that on the fly while writing itself. 
It may delete the working copy itself if the report from datanode is little 
delayed. and this is possible in case of huge cluster. Its better to keep the 
corrupt replica for sometime instead of loosing the valid replica. Similar 
cases observed and code has been added to ignore such variations. see below 
comment in BlockManager.java.
{code}  // If it's a RBW report for a COMPLETE block, it may just be 
that
  // the block report got a little bit delayed after the pipeline
  // closed. So, ignore this report, assuming we will get a
  // FINALIZED replica later. See HDFS-2791{code}
   {quote}
Solution is to be able to detect and capture a write-pipeline-failed replica as 
early as possible. First fix may be to change the check from 'isCompleted' to 
'isCommitted'. This will capture write-pipeline-failed replicas reported just 
after commit and before 'complete' and mark them as corrupt.
{quote}
 the timegap between 'isCommitted' and 'isCompleted' is not so huge, so ideally 
this will not change much.

{quote}
Then to capture write-pipeline-failed replicas reported before commit, I am 
investigating if this can be solved by marking them as corrupt as part of 
commit. There already exists a check to find any mis-stamped replicas during 
commit but we only remove them from the blocksMap. In addition can we not mark 
such replicas as corrupt?
{quote}
"setGenerationStampAndVerifyReplicas" is just updating the inmemory states of 
the replicas being written in Namenode. its not changing the blocksMap. I dont 
think this is the right place to decide about the corrupt replicas.

I think its always better to handle the block validations when reported from 
datanode, yes it takes time :(

> Under replicated block after the pipeline recovery.
> ---
>
> Key: HDFS-2932
> URL: https://issues.apache.org/jira/browse/HDFS-2932
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 0.24.0
>Reporter: J.Andreina
> Fix For: 0.24.0
>
>
> Started 1 NN, DN1, DN2, DN3 on the same machine.
> Wrote a huge file of size 2 GB.
> While the write for block-id-1005 was in progress, brought down DN3.
> After the pipeline recovery happened, the block stamp change

[jira] [Commented] (HDFS-2932) Under replicated block after the pipeline recovery.

2014-09-15 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133717#comment-14133717
 ] 

Vinayakumar B commented on HDFS-2932:
-

Hi [~usrikanth],
Thanks for looking into this issue.

1. Case 1 is solved in HDFS-3493.

2. Case 2: I understand your point.
I agree that it would be better to detect the failed replicas as early as 
possible and delete them, but we cannot do that on the fly while writing 
itself. It may delete the working copy itself if the report from the datanode 
is a little delayed, and this is possible in the case of a huge cluster. It's 
better to keep the corrupt replica for some time instead of losing the valid 
replica. Similar cases have been observed and code has been added to ignore 
such variations; see the comment below in BlockManager.java.
{code}
  // If it's a RBW report for a COMPLETE block, it may just be that
  // the block report got a little bit delayed after the pipeline
  // closed. So, ignore this report, assuming we will get a
  // FINALIZED replica later. See HDFS-2791
{code}
{quote}
Solution is to be able to detect and capture a write-pipeline-failed replica as 
early as possible. First fix may be to change the check from 'isCompleted' to 
'isCommitted'. This will capture write-pipeline-failed replicas reported just 
after commit and before 'complete' and mark them as corrupt.
{quote}
The time gap between 'isCommitted' and 'isCompleted' is not so huge, so ideally 
this will not change much.

{quote}
Then to capture write-pipeline-failed replicas reported before commit, I am 
investigating if this can be solved by marking them as corrupt as part of 
commit. There already exists a check to find any mis-stamped replicas during 
commit but we only remove them from the blocksMap. In addition can we not mark 
such replicas as corrupt?
{quote}
"setGenerationStampAndVerifyReplicas" is just updating the in-memory states of 
the replicas being written in the Namenode; it's not changing the blocksMap. I 
don't think this is the right place to decide about the corrupt replicas.

I think it's always better to handle the block validations when they are 
reported from the datanode, yes it takes time :(
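
To make the trade-off being discussed concrete, the proposal amounts to roughly 
the following classification of a reported replica (a conceptual sketch with 
invented names, not actual BlockManager code):

{code:java}
// Conceptual sketch of the check under discussion.  A replica left over from a
// failed write pipeline carries the old generation stamp; the question is how
// early it is safe to mark it corrupt rather than wait for a later report.
class ReportedReplicaSketch {
  enum Action { ADD_AS_LIVE, MARK_CORRUPT, IGNORE_FOR_NOW }

  static Action classify(long reportedGenStamp, long blockGenStamp,
      boolean blockCommitted, boolean reportedRbw) {
    if (reportedGenStamp != blockGenStamp) {
      // Mis-stamped replica from a failed pipeline.  Acting before the block
      // is committed risks deleting a working copy whose report is merely
      // delayed, hence the suggestion to wait for 'isCommitted'.
      return blockCommitted ? Action.MARK_CORRUPT : Action.IGNORE_FOR_NOW;
    }
    if (reportedRbw && blockCommitted) {
      // RBW report for a committed/complete block: likely just a delayed
      // report (see HDFS-2791), so wait for the FINALIZED replica.
      return Action.IGNORE_FOR_NOW;
    }
    return Action.ADD_AS_LIVE;
  }
}
{code}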

> Under replicated block after the pipeline recovery.
> ---
>
> Key: HDFS-2932
> URL: https://issues.apache.org/jira/browse/HDFS-2932
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 0.24.0
>Reporter: J.Andreina
> Fix For: 0.24.0
>
>
> Started 1 NN, DN1, DN2, DN3 on the same machine.
> Wrote a huge file of size 2 GB.
> While the write for block-id-1005 was in progress, brought down DN3.
> After the pipeline recovery happened, the block stamp changed to block_id_1006 
> on DN1, DN2.
> After the write was over, DN3 was brought up and the fsck command was issued.
> The following message is displayed:
> "block-id_1006 is under replicated. Target replicas is 3 but found 2 replicas".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-3586) Blocks are not getting replicate even DN's are availble.

2014-09-15 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133677#comment-14133677
 ] 

Vinayakumar B commented on HDFS-3586:
-

Hi [~usrikanth],

Yes, you are right. HDFS-3493 solves the above issue; this jira can be resolved 
as a duplicate.

As mentioned in the HDFS-3493 changes, the corrupt replica will be deleted in 
the following cases.

{code}
// case 1: have enough number of live replicas
// case 2: corrupted replicas + live replicas > Replication factor
// case 3: Block is marked corrupt due to failure while writing. In this
// case genstamp will be different than that of valid block.
// In all these cases we can delete the replica.
// In case of 3, rbw block will be deleted and valid block can be replicated
{code}

> Blocks are not getting replicate even DN's are availble.
> 
>
> Key: HDFS-3586
> URL: https://issues.apache.org/jira/browse/HDFS-3586
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, namenode
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Brahma Reddy Battula
>Assignee: amith
> Attachments: HDFS-3586-analysis.txt
>
>
> Scenario:
> =
> Started four DNs (say DN1, DN2, DN3 and DN4).
> Writing files with RF=3.
> Formed pipeline with DN1->DN2->DN3.
> Since DN3's network is very slow, it's not able to send acks.
> The pipeline is formed again with DN1->DN2->DN4.
> Here DN4's network is also slow.
> So finally commitblocksync happened to DN1 and DN2 successfully.
> The block is present on all four DNs (finalized state on two DNs and rbw state 
> on the other DNs).
> Here the NN is asking DN3 and DN4 to replicate, but it's failing since replicas 
> are already present in the RBW dir.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)