[jira] [Commented] (HDFS-5074) Allow starting up from an fsimage checkpoint in the middle of a segment

2013-10-03 Thread Giri (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784884#comment-13784884
 ] 

Giri commented on HDFS-5074:


The namenode log is:

2013-09-26 10:44:41,914 FATAL 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unknown error 
encountered while tailing edits. Shutting down standby NN.
java.io.IOException: There appears to be a gap in the edit log.  We expected 
txid 26, but got txid 5400.
at 
org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:183)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:111)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:733)
at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:227)
at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:321)
at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:279)
at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:296)
at 
org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:456)
at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:292)
2013-09-26 10:44:41,917 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
status 1

> Allow starting up from an fsimage checkpoint in the middle of a segment
> ---
>
> Key: HDFS-5074
> URL: https://issues.apache.org/jira/browse/HDFS-5074
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, namenode
>Affects Versions: 3.0.0, 2.1.0-beta
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-5074.txt
>
>
> We've seen the following behavior a couple of times:
> - SBN is running and somehow encounters an error in the middle of replaying 
> an edit log in the tailer (e.g. the JN it's reading from crashes)
> - SBN has successfully processed half of the edits in the segment it was 
> reading.
> - SBN saves a checkpoint, which now falls in the middle of a segment, and 
> then restarts
> Upon restart, the SBN will load this checkpoint which falls in the middle of 
> a segment. {{selectInputStreams}} then fails when the SBN requests a 
> mid-segment txid.
> We should handle this case by downloading the right segment and 
> fast-forwarding to the correct txid.
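The recovery described above (download the right segment, then fast-forward to the checkpoint's txid) can be sketched roughly as follows. Note that {{Edit}}, {{EditReplayer}}, and {{replayFrom}} are illustrative stand-ins, not the actual HDFS classes:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for a single edit-log operation.
class Edit {
    final long txid;
    Edit(long txid) { this.txid = txid; }
}

class EditReplayer {
    // Fast-forward: edits with txid below the checkpoint's expected txid are
    // already reflected in the saved image, so skip them and apply the rest.
    static long replayFrom(List<Edit> segment, long expectedTxid) {
        long applied = 0;
        for (Edit e : segment) {
            if (e.txid < expectedTxid) {
                continue; // already captured by the checkpoint
            }
            applied++; // stand-in for applying the edit to the namespace
        }
        return applied;
    }
}
```

With this, a checkpoint that falls mid-segment simply causes the leading edits of that segment to be skipped rather than triggering a "gap in the edit log" failure.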



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HDFS-5190) move cache pool manipulation commands to dfsadmin, add to TestHDFSCLI

2013-10-03 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-5190:
--

Attachment: hdfs-5190-1.patch

Patch attached. I moved the cache pool commands to CacheAdmin, and also moved 
the tests over to a new TestCacheAdminCLI test. I took the liberty of extending 
{{TableListing}} so we can use it for help text in CacheAdmin, and also 
sprucing up some of the help text. I also folded in a fix for HDFS-5269, which 
was causing an NPE in CacheAdmin.

No new tests were added for the existing addPath/removePath/etc commands; I'll 
add those in a v2. Putting this up now in case anyone's got time to review in the 
AM.

> move cache pool manipulation commands to dfsadmin, add to TestHDFSCLI
> -
>
> Key: HDFS-5190
> URL: https://issues.apache.org/jira/browse/HDFS-5190
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Colin Patrick McCabe
>Assignee: Andrew Wang
> Attachments: hdfs-5190-1.patch
>
>
> As per the discussion in HDFS-5158, we should move the cache pool add, 
> remove, list commands into cacheadmin.  We also should write a unit test in 
> TestHDFSCLI for these commands.





[jira] [Commented] (HDFS-5269) Attempting to remove a cache directive fails with NullPointerException.

2013-10-03 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784920#comment-13784920
 ] 

Andrew Wang commented on HDFS-5269:
---

Hey Chris, thanks for the report. I folded in a fix for this in HDFS-5190 since 
I was already poking around in {{CacheAdmin}}. I went with 
{{DistributedFileSystem}} taking a {{PathBasedCacheDescriptor}}, which it then 
unpacks into a {{long}} to pass to the {{DFSClient}} API. {{CacheAdmin}} 
then calls the {{DFSClient}} API directly. Let me know if this works for you (a 
full review on HDFS-5190 would also be appreciated!).

> Attempting to remove a cache directive fails with NullPointerException.
> ---
>
> Key: HDFS-5269
> URL: https://issues.apache.org/jira/browse/HDFS-5269
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, tools
>Affects Versions: HDFS-4949
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
>
> Any attempt to remove a cache directive via the "hdfs cacheadmin -removePath" 
> command fails with {{NullPointerException}}.





[jira] [Work started] (HDFS-5190) move cache pool manipulation commands to dfsadmin, add to TestHDFSCLI

2013-10-03 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-5190 started by Andrew Wang.

> move cache pool manipulation commands to dfsadmin, add to TestHDFSCLI
> -
>
> Key: HDFS-5190
> URL: https://issues.apache.org/jira/browse/HDFS-5190
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Colin Patrick McCabe
>Assignee: Andrew Wang
> Attachments: hdfs-5190-1.patch
>
>
> As per the discussion in HDFS-5158, we should move the cache pool add, 
> remove, list commands into cacheadmin.  We also should write a unit test in 
> TestHDFSCLI for these commands.





[jira] [Assigned] (HDFS-4517) Cover class RemoteBlockReader with unit tests

2013-10-03 Thread Ivan A. Veselovsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan A. Veselovsky reassigned HDFS-4517:


Assignee: Ivan A. Veselovsky  (was: Dennis Y)

> Cover class RemoteBlockReader with unit tests
> -
>
> Key: HDFS-4517
> URL: https://issues.apache.org/jira/browse/HDFS-4517
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6
>Reporter: Vadim Bondarev
>Assignee: Ivan A. Veselovsky
> Fix For: 3.0.0, 2.3.0
>
> Attachments: HADOOP-4517-branch-0.23-a.patch, 
> HADOOP-4517-branch-2-a.patch, HADOOP-4517-branch-2-b.patch, 
> HADOOP-4517-branch-2c.patch, HADOOP-4517-trunk-a.patch, 
> HADOOP-4517-trunk-b.patch, HADOOP-4517-trunk-c.patch, 
> HDFS-4517-branch-2--N2.patch, HDFS-4517-branch-2--N3.patch, 
> HDFS-4517-branch-2--N4.patch
>
>






[jira] [Commented] (HDFS-4517) Cover class RemoteBlockReader with unit tests

2013-10-03 Thread Ivan A. Veselovsky (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784959#comment-13784959
 ] 

Ivan A. Veselovsky commented on HDFS-4517:
--

Hi Kihwal, is it possible to commit this into branch-0.23 as well?

> Cover class RemoteBlockReader with unit tests
> -
>
> Key: HDFS-4517
> URL: https://issues.apache.org/jira/browse/HDFS-4517
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6
>Reporter: Vadim Bondarev
>Assignee: Ivan A. Veselovsky
> Fix For: 3.0.0, 2.3.0
>
> Attachments: HADOOP-4517-branch-0.23-a.patch, 
> HADOOP-4517-branch-2-a.patch, HADOOP-4517-branch-2-b.patch, 
> HADOOP-4517-branch-2c.patch, HADOOP-4517-trunk-a.patch, 
> HADOOP-4517-trunk-b.patch, HADOOP-4517-trunk-c.patch, 
> HDFS-4517-branch-2--N2.patch, HDFS-4517-branch-2--N3.patch, 
> HDFS-4517-branch-2--N4.patch
>
>






[jira] [Commented] (HDFS-4512) Cover package org.apache.hadoop.hdfs.server.common with tests

2013-10-03 Thread Ivan A. Veselovsky (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784962#comment-13784962
 ] 

Ivan A. Veselovsky commented on HDFS-4512:
--

Hi Kihwal, is it possible to commit this into branch-0.23 as well?

> Cover package org.apache.hadoop.hdfs.server.common with tests
> -
>
> Key: HDFS-4512
> URL: https://issues.apache.org/jira/browse/HDFS-4512
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6
>Reporter: Vadim Bondarev
>Assignee: Vadim Bondarev
> Fix For: 3.0.0, 2.3.0
>
> Attachments: HADOOP-4512-branch-0.23-a.patch, 
> HADOOP-4512-branch-2-a.patch, HADOOP-4512-trunk-a.patch
>
>






[jira] [Commented] (HDFS-5289) Race condition in TestRetryCacheWithHA#testCreateSymlink causes spurious test failure

2013-10-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784993#comment-13784993
 ] 

Hudson commented on HDFS-5289:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #351 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/351/])
HDFS-5289. Race condition in TestRetryCacheWithHA#testCreateSymlink causes 
spurious test failure. Contributed by Aaron T. Myers. (atm: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1528693)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestRetryCacheWithHA.java


> Race condition in TestRetryCacheWithHA#testCreateSymlink causes spurious test 
> failure
> -
>
> Key: HDFS-5289
> URL: https://issues.apache.org/jira/browse/HDFS-5289
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.1.1-beta
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 2.1.2-beta
>
> Attachments: HDFS-5289.patch
>
>
> The code to check if the operation has been completed on the active NN can 
> potentially execute before the thread actually doing the operation has run. 
> In this case the checking code will retry the check if the result of the 
> check is null. However, the test operation does not in fact return null, 
> instead throwing an exception if the file doesn't exist yet. We need to catch 
> the exception and retry.
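The fix described above can be sketched as a retry loop that treats a thrown not-found exception the same as a null result. {{RetryCheck}} and {{checkWithRetry}} are hypothetical names, not the actual test code:

```java
import java.io.FileNotFoundException;
import java.util.concurrent.Callable;

class RetryCheck {
    // Retry the check both when it returns null (the original behavior) and
    // when it throws because the worker thread has not run yet (the fix).
    static <T> T checkWithRetry(Callable<T> check, int maxAttempts)
            throws Exception {
        for (int i = 0; i < maxAttempts; i++) {
            try {
                T result = check.call();
                if (result != null) {
                    return result;
                }
            } catch (FileNotFoundException e) {
                // File not created yet; fall through and retry.
            }
            Thread.sleep(10); // small back-off between attempts
        }
        throw new AssertionError("operation never completed");
    }
}
```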





[jira] [Commented] (HDFS-5289) Race condition in TestRetryCacheWithHA#testCreateSymlink causes spurious test failure

2013-10-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785090#comment-13785090
 ] 

Hudson commented on HDFS-5289:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1541 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1541/])
HDFS-5289. Race condition in TestRetryCacheWithHA#testCreateSymlink causes 
spurious test failure. Contributed by Aaron T. Myers. (atm: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1528693)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestRetryCacheWithHA.java


> Race condition in TestRetryCacheWithHA#testCreateSymlink causes spurious test 
> failure
> -
>
> Key: HDFS-5289
> URL: https://issues.apache.org/jira/browse/HDFS-5289
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.1.1-beta
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 2.1.2-beta
>
> Attachments: HDFS-5289.patch
>
>
> The code to check if the operation has been completed on the active NN can 
> potentially execute before the thread actually doing the operation has run. 
> In this case the checking code will retry the check if the result of the 
> check is null. However, the test operation does not in fact return null, 
> instead throwing an exception if the file doesn't exist yet. We need to catch 
> the exception and retry.





[jira] [Commented] (HDFS-5289) Race condition in TestRetryCacheWithHA#testCreateSymlink causes spurious test failure

2013-10-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785213#comment-13785213
 ] 

Hudson commented on HDFS-5289:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1567 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1567/])
HDFS-5289. Race condition in TestRetryCacheWithHA#testCreateSymlink causes 
spurious test failure. Contributed by Aaron T. Myers. (atm: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1528693)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestRetryCacheWithHA.java


> Race condition in TestRetryCacheWithHA#testCreateSymlink causes spurious test 
> failure
> -
>
> Key: HDFS-5289
> URL: https://issues.apache.org/jira/browse/HDFS-5289
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.1.1-beta
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 2.1.2-beta
>
> Attachments: HDFS-5289.patch
>
>
> The code to check if the operation has been completed on the active NN can 
> potentially execute before the thread actually doing the operation has run. 
> In this case the checking code will retry the check if the result of the 
> check is null. However, the test operation does not in fact return null, 
> instead throwing an exception if the file doesn't exist yet. We need to catch 
> the exception and retry.





[jira] [Commented] (HDFS-4512) Cover package org.apache.hadoop.hdfs.server.common with tests

2013-10-03 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785217#comment-13785217
 ] 

Kihwal Lee commented on HDFS-4512:
--

Our development focus has been moving from branch-0.23 to branch-2, so we are 
trying to refrain from committing non-critical changes to 0.23.

> Cover package org.apache.hadoop.hdfs.server.common with tests
> -
>
> Key: HDFS-4512
> URL: https://issues.apache.org/jira/browse/HDFS-4512
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6
>Reporter: Vadim Bondarev
>Assignee: Vadim Bondarev
> Fix For: 3.0.0, 2.3.0
>
> Attachments: HADOOP-4512-branch-0.23-a.patch, 
> HADOOP-4512-branch-2-a.patch, HADOOP-4512-trunk-a.patch
>
>






[jira] [Commented] (HDFS-4517) Cover class RemoteBlockReader with unit tests

2013-10-03 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785223#comment-13785223
 ] 

Kihwal Lee commented on HDFS-4517:
--

As I said in the other jira, we are only making critical changes to branch-0.23 
at this point. Future effort will be on branch-2, which is the best place for 
this patch.

> Cover class RemoteBlockReader with unit tests
> -
>
> Key: HDFS-4517
> URL: https://issues.apache.org/jira/browse/HDFS-4517
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6
>Reporter: Vadim Bondarev
>Assignee: Ivan A. Veselovsky
> Fix For: 3.0.0, 2.3.0
>
> Attachments: HADOOP-4517-branch-0.23-a.patch, 
> HADOOP-4517-branch-2-a.patch, HADOOP-4517-branch-2-b.patch, 
> HADOOP-4517-branch-2c.patch, HADOOP-4517-trunk-a.patch, 
> HADOOP-4517-trunk-b.patch, HADOOP-4517-trunk-c.patch, 
> HDFS-4517-branch-2--N2.patch, HDFS-4517-branch-2--N3.patch, 
> HDFS-4517-branch-2--N4.patch
>
>






[jira] [Created] (HDFS-5293) Symlink resolution requires unnecessary RPCs

2013-10-03 Thread Daryn Sharp (JIRA)
Daryn Sharp created HDFS-5293:
-

 Summary: Symlink resolution requires unnecessary RPCs
 Key: HDFS-5293
 URL: https://issues.apache.org/jira/browse/HDFS-5293
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp
Priority: Critical


When the NN encounters a symlink, it throws an {{UnresolvedLinkException}}.  
This exception contains only the path that is a symlink.  The client issues 
another RPC to obtain the link target, followed by another RPC with the link 
target + remainder of the original path.

{{UnresolvedLinkException}} should be returning both the link and the target to 
avoid a costly and unnecessary intermediate RPC to obtain the link target.
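A rough sketch of the proposal, with hypothetical class names (the real {{UnresolvedLinkException}} and client resolution logic differ): once the exception carries the target as well as the link path, the client can splice the target into the remaining path locally instead of issuing an extra RPC:

```java
// Hypothetical unresolved-link exception carrying both pieces of data.
class UnresolvedLink extends Exception {
    final String link;
    final String target; // proposed addition: the link's target

    UnresolvedLink(String link, String target) {
        this.link = link;
        this.target = target;
    }
}

class PathResolver {
    // Rewrite "/a/link/rest" as target + "/rest" using only the exception's
    // data; no intermediate getLinkTarget RPC is needed.
    static String resolve(String path, UnresolvedLink e) {
        String remainder = path.substring(e.link.length());
        return e.target + remainder;
    }
}
```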





[jira] [Updated] (HDFS-5285) Flatten INodeFile hierarchy

2013-10-03 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-5285:
-

Attachment: h5285_20131002.patch

h5285_20131002.patch: fixes compilation errors.

> Flatten INodeFile hierarchy
> ---
>
> Key: HDFS-5285
> URL: https://issues.apache.org/jira/browse/HDFS-5285
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h5285_20131001.patch, h5285_20131002.patch
>
>
> For files, there are INodeFile, INodeFileUnderConstruction, 
> INodeFileWithSnapshot and INodeFileUnderConstructionWithSnapshot for 
> representing whether a file is under construction or whether it is in some 
> snapshot.  The following are two major problems of the current approach:
> - Java classes do not support multiple inheritance, so 
> INodeFileUnderConstructionWithSnapshot cannot extend both 
> INodeFileUnderConstruction and INodeFileWithSnapshot.
> - The number of classes is exponential in the number of features.  Currently, 
> there are only two features, UnderConstruction and WithSnapshot.  The number 
> of classes is 2^2 = 4.  It is hard to add one more feature since the number 
> of classes would become 2^3 = 8.
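The usual way out of this 2^n explosion, and roughly the direction a "flattened" hierarchy takes, is to attach orthogonal features to a single concrete class instead of subclassing. The sketch below uses hypothetical names, not the actual patch:

```java
import java.util.ArrayList;
import java.util.List;

// Marker interface for an optional per-file feature.
interface Feature {}

class UnderConstructionFeature implements Feature {}
class SnapshotFeature implements Feature {}

// One concrete file class; features are attached rather than subclassed,
// so n features need n feature classes instead of 2^n file classes.
class FlatINodeFile {
    private final List<Feature> features = new ArrayList<>();

    void addFeature(Feature f) { features.add(f); }

    boolean hasFeature(Class<? extends Feature> type) {
        for (Feature f : features) {
            if (type.isInstance(f)) return true;
        }
        return false;
    }
}
```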





[jira] [Updated] (HDFS-5285) Flatten INodeFile hierarchy

2013-10-03 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-5285:
-

Status: Patch Available  (was: Open)

> Flatten INodeFile hierarchy
> ---
>
> Key: HDFS-5285
> URL: https://issues.apache.org/jira/browse/HDFS-5285
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h5285_20131001.patch, h5285_20131002.patch
>
>
> For files, there are INodeFile, INodeFileUnderConstruction, 
> INodeFileWithSnapshot and INodeFileUnderConstructionWithSnapshot for 
> representing whether a file is under construction or whether it is in some 
> snapshot.  The following are two major problems of the current approach:
> - Java classes do not support multiple inheritance, so 
> INodeFileUnderConstructionWithSnapshot cannot extend both 
> INodeFileUnderConstruction and INodeFileWithSnapshot.
> - The number of classes is exponential in the number of features.  Currently, 
> there are only two features, UnderConstruction and WithSnapshot.  The number 
> of classes is 2^2 = 4.  It is hard to add one more feature since the number 
> of classes would become 2^3 = 8.





[jira] [Created] (HDFS-5294) DistributedFileSystem getLinkStatus should not fully qualifies the link target

2013-10-03 Thread Daryn Sharp (JIRA)
Daryn Sharp created HDFS-5294:
-

 Summary: DistributedFileSystem getLinkStatus should not fully 
qualifies the link target
 Key: HDFS-5294
 URL: https://issues.apache.org/jira/browse/HDFS-5294
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: Daryn Sharp


The NN returns a {{FileStatus}} containing the exact link target as specified 
by the user at creation.  However, {{DistributedFileSystem#getFileLinkStatus}} 
explicitly overwrites the target with the fully scheme-qualified path from a 
lookup.  This causes multiple issues, such as:
# Prevents clients from discerning if the target is relative or absolute
# Mangles a target that is not intended to be a path
# Causes incorrect resolution with multi-layered filesystems - i.e. the link 
should be resolved relative to a higher-level fs (e.g. viewfs, chroot, 
filtered, etc.)
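A toy illustration of the difference, with hypothetical names (not the actual {{DistributedFileSystem}} code): the raw target as stored at link creation versus the scheme-qualified form the client currently receives:

```java
import java.net.URI;

class LinkStatus {
    // Proposed behavior: return the target exactly as the user created it,
    // preserving whether it is relative or absolute.
    static String rawTarget(String target) {
        return target;
    }

    // Roughly what getFileLinkStatus effectively does today: qualify the
    // target against the filesystem URI and working directory, losing the
    // relative/absolute distinction.
    static String qualified(String target, URI fsUri, String cwd) {
        if (target.startsWith("/")) {
            return fsUri + target;
        }
        return fsUri + cwd + "/" + target;
    }
}
```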





[jira] [Commented] (HDFS-5285) Flatten INodeFile hierarchy

2013-10-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785350#comment-13785350
 ] 

Hadoop QA commented on HDFS-5285:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12606594/h5285_20131002.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength
  org.apache.hadoop.hdfs.TestClose
  org.apache.hadoop.hdfs.TestShortCircuitLocalRead
  org.apache.hadoop.hdfs.web.TestFSMainOperationsWebHdfs
  org.apache.hadoop.hdfs.TestFSInputChecker
  org.apache.hadoop.hdfs.server.namenode.TestBackupNode
  org.apache.hadoop.hdfs.TestDataTransferProtocol
  org.apache.hadoop.hdfs.server.namenode.TestHDFSConcat
  
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes
  
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestInterDatanodeProtocol
  
org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
  org.apache.hadoop.hdfs.TestDFSClientFailover
  org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyIsHot
  
org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS
  org.apache.hadoop.tools.TestJMXGet
  org.apache.hadoop.hdfs.TestDFSShell
  org.apache.hadoop.hdfs.security.TestDelegationToken
  
org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication
  org.apache.hadoop.fs.permission.TestStickyBit
  org.apache.hadoop.hdfs.TestFileConcurrentReader
  org.apache.hadoop.hdfs.server.datanode.TestCachingStrategy
  
org.apache.hadoop.hdfs.server.namenode.TestListCorruptFileBlocks
  org.apache.hadoop.fs.TestFcHdfsCreateMkdir
  org.apache.hadoop.hdfs.TestCrcCorruption
  org.apache.hadoop.hdfs.TestAppendDifferentChecksum
  
org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration
  org.apache.hadoop.fs.viewfs.TestViewFileSystemHdfs
  org.apache.hadoop.hdfs.server.namenode.TestParallelImageWrite
  
org.apache.hadoop.hdfs.security.TestDelegationTokenForProxyUser
  org.apache.hadoop.hdfs.server.namenode.TestSequentialBlockId
  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotBlocksMap
  org.apache.hadoop.hdfs.TestDFSPermission
  org.apache.hadoop.hdfs.TestDFSUpgradeFromImage
  org.apache.hadoop.hdfs.TestListFilesInFileContext
  
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes
  org.apache.hadoop.fs.viewfs.TestViewFsDefaultValue
  org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics
  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestSetQuotaWithSnapshot
  org.apache.hadoop.hdfs.TestParallelShortCircuitReadNoChecksum
  org.apache.hadoop.hdfs.TestDFSRemove
  org.apache.hadoop.hdfs.TestRestartDFS
  org.apache.hadoop.hdfs.TestFSOutputSummer
  
org.apache.hadoop.hdfs.server.namenode.TestProcessCorruptBlocks
  org.apache.hadoop.hdfs.TestHDFSTrash
  org.apache.hadoop.hdfs.TestDFSRollback
  org.apache.hadoop.hdfs.server.namenode.TestFSDirectory
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  
org.apache.hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks
  org.apache.hadoop.hdfs.TestClientReportBadBlock
  org.apache.hadoop.hdfs.TestDecommission
  org.apache.hadoop.hdfs.TestSmallBlock
  org.apache.hadoop.hdfs.server.da

[jira] [Resolved] (HDFS-5269) Attempting to remove a cache directive fails with NullPointerException.

2013-10-03 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth resolved HDFS-5269.
-

Resolution: Duplicate

Thanks, [~andrew.wang].  I took a quick look at the HDFS-5190 patch, and I 
agree that it makes sense to fold the fix in there.  I'm resolving HDFS-5269 as 
duplicate of HDFS-5190.

> Attempting to remove a cache directive fails with NullPointerException.
> ---
>
> Key: HDFS-5269
> URL: https://issues.apache.org/jira/browse/HDFS-5269
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, tools
>Affects Versions: HDFS-4949
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
>
> Any attempt to remove a cache directive via the "hdfs cacheadmin -removePath" 
> command fails with {{NullPointerException}}.





[jira] [Commented] (HDFS-5190) move cache pool manipulation commands to dfsadmin, add to TestHDFSCLI

2013-10-03 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785410#comment-13785410
 ] 

Colin Patrick McCabe commented on HDFS-5190:


Thanks, Andrew.  This looks good.  I see that you are using TableListing to 
display usage in some cases.  That's a use I hadn't thought of, but it makes 
sense.  It's also great to see a CLI test for this.

Just one question: can we shorten some of the pool-related command names?
{{hadoop cacheadmin -addCachePool}} could probably be shortened to {{hadoop 
cacheadmin -addPool}}.

It's pretty clear that in cacheadmin, the pools we're talking about are cache 
pools, I think.

> move cache pool manipulation commands to dfsadmin, add to TestHDFSCLI
> -
>
> Key: HDFS-5190
> URL: https://issues.apache.org/jira/browse/HDFS-5190
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Colin Patrick McCabe
>Assignee: Andrew Wang
> Attachments: hdfs-5190-1.patch
>
>
> As per the discussion in HDFS-5158, we should move the cache pool add, 
> remove, list commands into cacheadmin.  We also should write a unit test in 
> TestHDFSCLI for these commands.





[jira] [Updated] (HDFS-5294) DistributedFileSystem#getFileLinkStatus should not fully qualify the link target

2013-10-03 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-5294:
--

Summary: DistributedFileSystem#getFileLinkStatus should not fully qualify 
the link target  (was: DistributedFileSystem getLinkStatus should not fully 
qualifies the link target)

> DistributedFileSystem#getFileLinkStatus should not fully qualify the link 
> target
> 
>
> Key: HDFS-5294
> URL: https://issues.apache.org/jira/browse/HDFS-5294
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>
> The NN returns a {{FileStatus}} containing the exact link target as specified 
> by the user at creation.  However, 
> {{DistributedFileSystem#getFileLinkStatus}} explicitly overwrites the target 
> with the fully scheme-qualified path from a lookup.  This causes multiple 
> issues, such as:
> # Prevents clients from discerning if the target is relative or absolute
> # Mangles a target that is not intended to be a path
> # Causes incorrect resolution with multi-layered filesystems - i.e. the link 
> should be resolved relative to a higher-level fs (e.g. viewfs, chroot, 
> filtered, etc.)





[jira] [Commented] (HDFS-5294) DistributedFileSystem#getFileLinkStatus should not fully qualify the link target

2013-10-03 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785431#comment-13785431
 ] 

Andrew Wang commented on HDFS-5294:
---

Same criticism also applies to {{#getLinkTarget}}. We qualify FileStatus paths 
everywhere right now, which I agree makes it painful for wrapper-FSes like 
{{ViewFileSystem}}.

> DistributedFileSystem#getFileLinkStatus should not fully qualify the link 
> target
> 
>
> Key: HDFS-5294
> URL: https://issues.apache.org/jira/browse/HDFS-5294
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>
> The NN returns a {{FileStatus}} containing the exact link target as specified 
> by the user at creation.  However, 
> {{DistributedFileSystem#getFileLinkStatus}} explicitly overwrites the target 
> with the fully scheme-qualified path from a lookup.  This causes multiple 
> issues, such as:
> # Prevents clients from discerning if the target is relative or absolute
> # Mangles a target that is not intended to be a path
> # Causes incorrect resolution with multi-layered filesystems - i.e. the link 
> should be resolved relative to a higher-level fs (e.g. viewfs, chroot, 
> filtered, etc.)





[jira] [Created] (HDFS-5295) hsftp throws an exception in the end on secure cluster with https enabled

2013-10-03 Thread Yesha Vora (JIRA)
Yesha Vora created HDFS-5295:


 Summary: hsftp throws an exception in the end on secure cluster 
with https enabled
 Key: HDFS-5295
 URL: https://issues.apache.org/jira/browse/HDFS-5295
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.1.1-beta
 Environment: Secure
Reporter: Yesha Vora


The hsftp command throws a "java.net.SocketException: Unexpected end of file 
from server" exception at the end when run in a secure environment with https 
enabled.

Using https port defined by dfs.https.port=50701

/usr/bin/hdfs dfs -fs hdfs://hostname:8020 -ls -R 
"hsftp://hostname:50701/user/abc/1380829410/.abc.crc"
-rw-r--r--   3 abc abc12 2013-10-03 19:44 
hsftp://hostname:50701/user/abc/1380829410/.abc.crc
13/10/03 19:50:48 INFO tools.DelegationTokenFetcher: error in cancel over HTTP
java.net.SocketException: Unexpected end of file from server
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:718)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:579)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:715)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:579)
at 
sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1322)
at 
java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
at 
org.apache.hadoop.security.authentication.client.KerberosAuthenticator.isNegotiate(KerberosAuthenticator.java:224)
at 
org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:194)
at 
org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:232)
at 
org.apache.hadoop.security.SecurityUtil.openSecureHttpConnection(SecurityUtil.java:512)
at 
org.apache.hadoop.hdfs.tools.DelegationTokenFetcher.cancelDelegationToken(DelegationTokenFetcher.java:354)
at 
org.apache.hadoop.hdfs.HftpFileSystem$TokenManager.cancel(HftpFileSystem.java:730)
at org.apache.hadoop.security.token.Token.cancel(Token.java:384)
at 
org.apache.hadoop.fs.DelegationTokenRenewer$RenewAction.cancel(DelegationTokenRenewer.java:152)
at 
org.apache.hadoop.fs.DelegationTokenRenewer$RenewAction.access$200(DelegationTokenRenewer.java:58)
at 
org.apache.hadoop.fs.DelegationTokenRenewer.removeRenewAction(DelegationTokenRenewer.java:241)
at org.apache.hadoop.hdfs.HftpFileSystem.close(HftpFileSystem.java:417)
at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2524)
at 
org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2541)
at 
org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
13/10/03 19:50:48 INFO fs.FileSystem: FileSystem.Cache.closeAll() threw an 
exception:
java.net.SocketException: Unexpected end of file from server






[jira] [Assigned] (HDFS-5295) hsftp throws an exception in the end on secure cluster with https enabled

2013-10-03 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reassigned HDFS-5295:
---

Assignee: Arpit Agarwal

> hsftp throws an exception in the end on secure cluster with https enabled
> -
>
> Key: HDFS-5295
> URL: https://issues.apache.org/jira/browse/HDFS-5295
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.1.1-beta
> Environment: Secure
>Reporter: Yesha Vora
>Assignee: Arpit Agarwal
>
> The hsftp command throws "java.net.SocketException: Unexpected end of file 
> from server" exception in the end in secure environment and with https 
> enabled.
> Using https port defined by dfs.https.port=50701
> /usr/bin/hdfs dfs -fs hdfs://hostname:8020 -ls -R 
> "hsftp://hostname:50701/user/abc/1380829410/.abc.crc";
> -rw-r--r--   3 abc abc12 2013-10-03 19:44 
> hsftp://hostname:50701/user/abc/1380829410/.abc.crc
> 13/10/03 19:50:48 INFO tools.DelegationTokenFetcher: error in cancel over HTTP





[jira] [Commented] (HDFS-5295) hsftp throws an exception in the end on secure cluster with https enabled

2013-10-03 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785488#comment-13785488
 ] 

Daryn Sharp commented on HDFS-5295:
---

What does the NN log show?

> hsftp throws an exception in the end on secure cluster with https enabled
> -
>
> Key: HDFS-5295
> URL: https://issues.apache.org/jira/browse/HDFS-5295
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.1.1-beta
> Environment: Secure
>Reporter: Yesha Vora
>Assignee: Arpit Agarwal
>
> The hsftp command throws "java.net.SocketException: Unexpected end of file 
> from server" exception in the end in secure environment and with https 
> enabled.
> Using https port defined by dfs.https.port=50701
> /usr/bin/hdfs dfs -fs hdfs://hostname:8020 -ls -R 
> "hsftp://hostname:50701/user/abc/1380829410/.abc.crc";
> -rw-r--r--   3 abc abc12 2013-10-03 19:44 
> hsftp://hostname:50701/user/abc/1380829410/.abc.crc
> 13/10/03 19:50:48 INFO tools.DelegationTokenFetcher: error in cancel over HTTP





[jira] [Updated] (HDFS-5190) move cache pool manipulation commands to dfsadmin, add to TestHDFSCLI

2013-10-03 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-5190:
--

Attachment: hdfs-5190-2.patch

Shinier new patch attached.

* I took your advice about renaming the pool commands. I also decided to rename 
the {{addPath}} etc commands to {{addDirective}} etc, since I think it makes 
more sense (lots of the help and error text talks about directives). 
{{removeDirective}} should also really be {{removeDescriptor}}, but maybe 
that's okay.
* Added more tests, including coverage for directive commands
* Fixed a removePool bug in CacheManager
* Prettified output messages and exception handling in parts of the CacheAdmin

> move cache pool manipulation commands to dfsadmin, add to TestHDFSCLI
> -
>
> Key: HDFS-5190
> URL: https://issues.apache.org/jira/browse/HDFS-5190
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Colin Patrick McCabe
>Assignee: Andrew Wang
> Attachments: hdfs-5190-1.patch, hdfs-5190-2.patch
>
>
> As per the discussion in HDFS-5158, we should move the cache pool add, 
> remove, list commands into cacheadmin.  We also should write a unit test in 
> TestHDFSCLI for these commands.





[jira] [Commented] (HDFS-5119) Persist CacheManager state in the edit log

2013-10-03 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785558#comment-13785558
 ] 

Chris Nauroth commented on HDFS-5119:
-

Hi, Andrew.  This looks good.  I ran the patch in a test cluster and did a 
successful layout version upgrade.  I ran all caching-related operations 
multiple times, forced a checkpoint, and restarted the namenode to confirm that 
it maintained the correct state.  Nice job!  Here are a few minor comments:

# {{Text#readString(DataInput)}} is now equivalent to 
{{Text#readString(DataInput, int)}} passing {{Integer#MAX_VALUE}} for the 
second argument.  Do you want to call the 2-arg method from the 1-arg method to 
cut some duplication?
# Is it time to remove the following TODO?
{code}
  CacheManager(FSNamesystem namesystem, FSDirectory dir, Configuration conf) {
// TODO: support loading and storing of the CacheManager state
{code}
# Please add JavaDocs for the following {{CacheManager}} methods: 
{{unprotectedAddEntry}}, {{addDirective}}, {{unprotectedAddDirective}}, 
{{unprotectedRemoveDescriptor}}, {{unprotectedAddCachePool}}.
# I noticed that {{FSNamesystem#setCacheReplicationInt}} logs an audit event 
for "setReplication" instead of "setCacheReplication".  Do you want to change 
that string now, or would you prefer that I file a new bug report?
# Regarding the following code, do we need to intercept setReplication calls on 
existing files and also update cache replication to keep them in sync?  Also, I 
notice this is skipping directories.  Will this code change when we add support 
for caching a directory in HDFS-5096?
{code}
  INode node = dir.getINode(entry.getPath());
  if (node != null && node.isFile()) {
INodeFile file = node.asFile();
// TODO: adjustable cache replication factor
namesystem.setCacheReplicationInt(entry.getPath(),
file.getBlockReplication());
  } else {
LOG.warn("Path " + entry.getPath() + " is not a file");
  }
{code}
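Regarding item 1, the suggested deduplication could be sketched roughly as below. 
This is a hypothetical simplification, not the actual {{Text}} implementation (which 
uses vint-encoded lengths); it only illustrates the 1-arg method delegating to the 
2-arg method with {{Integer#MAX_VALUE}}:

```java
import java.io.*;

// Illustrative sketch only: shows the 1-arg readString delegating to the
// 2-arg variant instead of duplicating the read logic. The wire format here
// (a plain 4-byte length prefix) is a stand-in, not Hadoop's real encoding.
public class TextSketch {
    // 2-arg variant: reads a length-prefixed UTF-8 string, enforcing maxLength.
    static String readString(DataInput in, int maxLength) throws IOException {
        int len = in.readInt();
        if (len < 0 || len > maxLength) {
            throw new IOException("string length " + len + " exceeds limit " + maxLength);
        }
        byte[] bytes = new byte[len];
        in.readFully(bytes);
        return new String(bytes, "UTF-8");
    }

    // 1-arg variant: delegate with an effectively unlimited length.
    static String readString(DataInput in) throws IOException {
        return readString(in, Integer.MAX_VALUE);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        byte[] payload = "hello".getBytes("UTF-8");
        out.writeInt(payload.length);
        out.write(payload);
        DataInput in = new DataInputStream(new ByteArrayInputStream(bos.toByteArray()));
        System.out.println(readString(in)); // prints "hello"
    }
}
```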


> Persist CacheManager state in the edit log
> --
>
> Key: HDFS-5119
> URL: https://issues.apache.org/jira/browse/HDFS-5119
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-4949
>Reporter: Colin Patrick McCabe
>Assignee: Andrew Wang
> Attachments: hdfs-5119-1.patch, hdfs-5119-2.patch
>
>
> CacheManager state should be persisted in the edit log.  At the moment, this 
> state consists of information about cache pools and cache directives.  It's 
> not necessary to persist any information about what is cached on the 
> DataNodes at any particular moment, since this changes all the time.





[jira] [Updated] (HDFS-5296) DFS usage gets doubled in the WebUI of federated namenode

2013-10-03 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated HDFS-5296:
--

Attachment: BBF12817-B83E-4CC5-B0B8-4FA322E87FB7.png

> DFS usage gets doubled in the WebUI of federated namenode
> -
>
> Key: HDFS-5296
> URL: https://issues.apache.org/jira/browse/HDFS-5296
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.5-alpha
>Reporter: Siqi Li
>Priority: Minor
> Attachments: BBF12817-B83E-4CC5-B0B8-4FA322E87FB7.png
>
>






[jira] [Created] (HDFS-5296) DFS usage gets doubled in the WebUI of federated namenode

2013-10-03 Thread Siqi Li (JIRA)
Siqi Li created HDFS-5296:
-

 Summary: DFS usage gets doubled in the WebUI of federated namenode
 Key: HDFS-5296
 URL: https://issues.apache.org/jira/browse/HDFS-5296
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.0.5-alpha
Reporter: Siqi Li
Priority: Minor








[jira] [Assigned] (HDFS-5224) Refactor PathBasedCache* methods to use a Path rather than a String

2013-10-03 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth reassigned HDFS-5224:
---

Assignee: Chris Nauroth  (was: Colin Patrick McCabe)

> Refactor PathBasedCache* methods to use a Path rather than a String
> ---
>
> Key: HDFS-5224
> URL: https://issues.apache.org/jira/browse/HDFS-5224
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: HDFS-4949
>Reporter: Andrew Wang
>Assignee: Chris Nauroth
>
> As discussed in HDFS-5213, we should refactor PathBasedCacheDirective and 
> related methods in DistributedFileSystem to use a Path to represent paths to 
> cache, rather than a String.





[jira] [Updated] (HDFS-5296) DFS usage gets doubled in the WebUI of federated namenode

2013-10-03 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated HDFS-5296:
--

Attachment: HDFS-5296-v1.patch

> DFS usage gets doubled in the WebUI of federated namenode
> -
>
> Key: HDFS-5296
> URL: https://issues.apache.org/jira/browse/HDFS-5296
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.5-alpha
>Reporter: Siqi Li
>Priority: Minor
> Attachments: BBF12817-B83E-4CC5-B0B8-4FA322E87FB7.png, 
> HDFS-5296-v1.patch
>
>






[jira] [Updated] (HDFS-5296) DFS usage gets doubled in the WebUI of federated namenode

2013-10-03 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated HDFS-5296:
--

Status: Patch Available  (was: Open)

> DFS usage gets doubled in the WebUI of federated namenode
> -
>
> Key: HDFS-5296
> URL: https://issues.apache.org/jira/browse/HDFS-5296
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.5-alpha
>Reporter: Siqi Li
>Priority: Minor
> Attachments: BBF12817-B83E-4CC5-B0B8-4FA322E87FB7.png, 
> HDFS-5296-v1.patch
>
>






[jira] [Created] (HDFS-5297) Fix broken hyperlinks in HDFS document

2013-10-03 Thread Akira AJISAKA (JIRA)
Akira AJISAKA created HDFS-5297:
---

 Summary: Fix broken hyperlinks in HDFS document
 Key: HDFS-5297
 URL: https://issues.apache.org/jira/browse/HDFS-5297
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.1.0-beta, 3.0.0
Reporter: Akira AJISAKA
Priority: Minor
 Fix For: 3.0.0, 2.1.2-beta


I found a lot of broken hyperlinks in the HDFS documentation that need to be fixed.
Ex.)
In HdfsUserGuide.apt.vm, there is a broken hyperlink, shown below:
{noformat}
   For command usage, see {{{dfsadmin}}}.
{noformat}
It should be fixed to 
{noformat}
   For command usage, see 
{{{../hadoop-common/CommandsManual.html#dfsadmin}dfsadmin}}.
{noformat}







[jira] [Commented] (HDFS-5293) Symlink resolution requires unnecessary RPCs

2013-10-03 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785706#comment-13785706
 ] 

Colin Patrick McCabe commented on HDFS-5293:


If we're going to do this, why not just make all NN operations resolve symlinks 
as far as they can?  That would remove all the performance concerns about 
returning unresolved paths, at least in the context of non-cross-FS symlinks.

We already have many filesystems that do symlink resolution internally, such as 
LocalFileSystem, Ceph, etc.  If HDFS's namenode did symlink resolution for 
all RPCs, we could return unresolved paths everywhere and be happy.

> Symlink resolution requires unnecessary RPCs
> 
>
> Key: HDFS-5293
> URL: https://issues.apache.org/jira/browse/HDFS-5293
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Priority: Critical
>
> When the NN encounters a symlink, it throws an {{UnresolvedLinkException}}.  
> This exception contains only the path that is a symlink.  The client issues 
> another RPC to obtain the link target, followed by another RPC with the link 
> target + remainder of the original path.
> {{UnresolvedLinkException}} should be returning both the link and the target 
> to avoid a costly and unnecessary intermediate RPC to obtain the link target.





[jira] [Updated] (HDFS-5119) Persist CacheManager state in the edit log

2013-10-03 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-5119:
--

Attachment: hdfs-5119-3.patch

Thanks for the review Chris, it's great to get validation on the metadata 
upgrade and your manual testing. New patch addresses all of your comments 
except hooking into {{setReplication}} to sync up with changes in file 
replication; I believe it'll be addressed in the other JIRA as you said.

> Persist CacheManager state in the edit log
> --
>
> Key: HDFS-5119
> URL: https://issues.apache.org/jira/browse/HDFS-5119
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: HDFS-4949
>Reporter: Colin Patrick McCabe
>Assignee: Andrew Wang
> Attachments: hdfs-5119-1.patch, hdfs-5119-2.patch, hdfs-5119-3.patch
>
>
> CacheManager state should be persisted in the edit log.  At the moment, this 
> state consists of information about cache pools and cache directives.  It's 
> not necessary to persist any information about what is cached on the 
> DataNodes at any particular moment, since this changes all the time.





[jira] [Updated] (HDFS-4510) Cover classes ClusterJspHelper/NamenodeJspHelper with unit tests

2013-10-03 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated HDFS-4510:
--

Attachment: HDFS-4510--n5.patch

Looks like those hardcoded port numbers are not really needed. Updated the 
patch for trunk. The same patch is applicable to branch-2.

> Cover classes ClusterJspHelper/NamenodeJspHelper with unit tests
> 
>
> Key: HDFS-4510
> URL: https://issues.apache.org/jira/browse/HDFS-4510
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6
>Reporter: Vadim Bondarev
> Attachments: HADOOP-4510-branch-0.23-a.patch, 
> HADOOP-4510-branch-0.23-b.patch, HADOOP-4510-branch-0.23-c.patch, 
> HADOOP-4510-branch-2-a.patch, HADOOP-4510-branch-2-b.patch, 
> HADOOP-4510-branch-2-c.patch, HADOOP-4510-trunk-a.patch, 
> HADOOP-4510-trunk-b.patch, HADOOP-4510-trunk-c.patch, HDFS-4510--n5.patch, 
> HDFS-4510-trunk--N27.patch
>
>






[jira] [Commented] (HDFS-5259) Support client which combines appended data with old data before sends it to NFS server

2013-10-03 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785726#comment-13785726
 ] 

Jing Zhao commented on HDFS-5259:
-

The patch looks good to me. My only concern:
{code}
+  byte[] fullData = request.getData().array();
+  byte[] appendedData = new byte[(int) smallerCount];
+  System.arraycopy(fullData, (int) (cachedOffset - offset), appendedData,
+  0, (int) smallerCount);
{code}

Here I think we may avoid this data copy by passing the ByteBuffer from the 
WriteRequest to the WriteCtx, instead of passing in a byte array. And if we 
make this change, we may also want to add some unit tests.
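The zero-copy alternative could look roughly like the sketch below. The class and 
method names are illustrative stand-ins, not the actual NFS gateway code; the 
parameter names ({{offset}}, {{cachedOffset}}, {{smallerCount}}) follow the quoted 
patch:

```java
import java.nio.ByteBuffer;

// Illustrative sketch: expose the appended region of the request buffer as a
// sliced ByteBuffer view instead of System.arraycopy-ing it into a new array.
public class SliceSketch {
    static ByteBuffer appendedView(ByteBuffer fullData, long offset,
                                   long cachedOffset, long smallerCount) {
        ByteBuffer view = fullData.duplicate();          // shares the backing bytes, no copy
        view.position((int) (cachedOffset - offset));    // start of the appended data
        view.limit(view.position() + (int) smallerCount);
        return view.slice();                             // independent zero-based view
    }

    public static void main(String[] args) {
        // "old" (already cached) followed by "NEW" (the appended bytes).
        ByteBuffer req = ByteBuffer.wrap("oldNEW".getBytes());
        ByteBuffer appended = appendedView(req, 0, 3, 3);
        byte[] out = new byte[appended.remaining()];
        appended.get(out);
        System.out.println(new String(out)); // prints "NEW"
    }
}
```

Because {{duplicate()}} and {{slice()}} share the backing array, the original 
request buffer's position and limit are left untouched.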

> Support client which combines appended data with old data before sends it to 
> NFS server
> ---
>
> Key: HDFS-5259
> URL: https://issues.apache.org/jira/browse/HDFS-5259
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: nfs
>Reporter: Yesha Vora
>Assignee: Brandon Li
> Attachments: HDFS-5259.000.patch, HDFS-5259.001.patch
>
>
> The append does not work with some Linux client. The Client gets 
> "Input/output Error" when it tries to append. And NFS server considers it as 
> random write and fails the request.





[jira] [Commented] (HDFS-5294) DistributedFileSystem#getFileLinkStatus should not fully qualify the link target

2013-10-03 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785734#comment-13785734
 ] 

Colin Patrick McCabe commented on HDFS-5294:


There are a ton of FileSystem operations that return paths.  If we're going to 
switch them all to return unresolved paths, that will also affect 
getFileStatus, listStatus, getLocatedFileStatus, resolvePath, 
listCorruptFileBlocks, globStatus, createSnapshot, etc.

Also, as we've discussed elsewhere, doing this would be a large performance 
regression, huge in some cases.

I really don't think we should do this unless we also do symlink resolution 
server-side to avoid doing N symlink resolution RPCs every time we use a path 
with N symlinks in it.

> DistributedFileSystem#getFileLinkStatus should not fully qualify the link 
> target
> 
>
> Key: HDFS-5294
> URL: https://issues.apache.org/jira/browse/HDFS-5294
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>
> The NN returns a {{FileStatus}} containing the exact link target as specified 
> by the user at creation.  However, 
> {{DistributedFileSystem#getFileLinkStatus}} explicitly overwrites the target 
> with the fully scheme-qualified path lookup.  This causes multiple issues 
> such as:
> # Prevents clients from discerning if the target is relative or absolute
> # Mangles a target that is not intended to be a path
> # Causes incorrect resolution with multi-layered filesystems - i.e. the link 
> should be resolved relative to a higher-level fs (e.g. viewfs, chroot, 
> filtered, etc.)





[jira] [Commented] (HDFS-4510) Cover classes ClusterJspHelper/NamenodeJspHelper with unit tests

2013-10-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785743#comment-13785743
 ] 

Hadoop QA commented on HDFS-4510:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12606696/HDFS-4510--n5.patch
  against trunk revision .

{color:red}-1 patch{color}.  Trunk compilation may be broken.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5092//console

This message is automatically generated.

> Cover classes ClusterJspHelper/NamenodeJspHelper with unit tests
> 
>
> Key: HDFS-4510
> URL: https://issues.apache.org/jira/browse/HDFS-4510
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6
>Reporter: Vadim Bondarev
> Attachments: HADOOP-4510-branch-0.23-a.patch, 
> HADOOP-4510-branch-0.23-b.patch, HADOOP-4510-branch-0.23-c.patch, 
> HADOOP-4510-branch-2-a.patch, HADOOP-4510-branch-2-b.patch, 
> HADOOP-4510-branch-2-c.patch, HADOOP-4510-trunk-a.patch, 
> HADOOP-4510-trunk-b.patch, HADOOP-4510-trunk-c.patch, HDFS-4510--n5.patch, 
> HDFS-4510-trunk--N27.patch
>
>






[jira] [Commented] (HDFS-4510) Cover classes ClusterJspHelper/NamenodeJspHelper with unit tests

2013-10-03 Thread Andrey Klochkov (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785774#comment-13785774
 ] 

Andrey Klochkov commented on HDFS-4510:
---

The build bot failed to compile the unpatched trunk, i.e. the failure is not 
related to the patch.

> Cover classes ClusterJspHelper/NamenodeJspHelper with unit tests
> 
>
> Key: HDFS-4510
> URL: https://issues.apache.org/jira/browse/HDFS-4510
> Project: Hadoop HDFS
>  Issue Type: Test
>Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6
>Reporter: Vadim Bondarev
> Attachments: HADOOP-4510-branch-0.23-a.patch, 
> HADOOP-4510-branch-0.23-b.patch, HADOOP-4510-branch-0.23-c.patch, 
> HADOOP-4510-branch-2-a.patch, HADOOP-4510-branch-2-b.patch, 
> HADOOP-4510-branch-2-c.patch, HADOOP-4510-trunk-a.patch, 
> HADOOP-4510-trunk-b.patch, HADOOP-4510-trunk-c.patch, HDFS-4510--n5.patch, 
> HDFS-4510-trunk--N27.patch
>
>






[jira] [Commented] (HDFS-5283) NN not coming out of startup safemode due to under construction blocks only inside snapshots also counted in safemode threshhold

2013-10-03 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785809#comment-13785809
 ] 

Jing Zhao commented on HDFS-5283:
-

bq. We need to have some reference which tells that the BlockCollection resides 
inside the snapshot if we are not able to find outside. I think In directory 
delete case with snapshot, changing the Inode types recursively is necessary. 
This keeps the behaviour of both cases ( file deletion and directory deletion ) 
in consistent.

So in our current solution, for each BlockCollection (which is an INodeUC) in 
the blocksMap, we first check whether it is in the current fsdir tree. Our claim 
is: if the inode is not in the current tree (i.e., we cannot identify the 
node's absolute full path, or the node at that absolute full path in the 
current fsdir tree is not the node stored in the blocksMap), then this 
inode must be a file that exists only in a snapshot, regardless of whether the 
node is an instance of INodeUCWithSnapshot. If this claim holds, converting an 
INodeUC to an INodeUCWithSnapshot during deletion becomes unnecessary.

bq. storedBlock.addNode(node);

Without this, the new test fails when the DN count is set to a value greater 
than 1. Currently, when the NN receives the first block report from a DN, it 
checks each reported block's total number of available replicas; if that 
number is EQUAL to the minimum required replica count, it increments the 
blockSafe value by 1 in the SafeModeInfo. Thus, when we call 
"namesystem.incrementSafeBlockCount(numOfReplicas)", numOfReplicas must be 
the current number of available replicas. Otherwise we miss the EQUAL case 
and fail to increment blockSafe.
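The EQUAL-threshold behavior described above can be illustrated with a toy model. 
This is not NameNode code (the real SafeModeInfo tracks much more state); it only 
shows why passing a stale replica count misses the single increment per block:

```java
// Toy model of the safe-block counting invariant: blockSafe is incremented
// exactly once per block, at the moment its replica count reaches the
// minimum required replication.
public class SafeModeSketch {
    int blockSafe = 0;
    final int minReplication;

    SafeModeSketch(int minReplication) { this.minReplication = minReplication; }

    // Called as each replica of a block is reported; currentReplicas must be
    // the block's true current replica count.
    void incrementSafeBlockCount(int currentReplicas) {
        // EQUAL, not >=, so each block is counted exactly once.
        if (currentReplicas == minReplication) {
            blockSafe++;
        }
    }

    public static void main(String[] args) {
        SafeModeSketch sm = new SafeModeSketch(1);
        // Two DNs report replicas of the same block: only the report that
        // brings the count up to the minimum triggers the increment.
        sm.incrementSafeBlockCount(1);
        sm.incrementSafeBlockCount(2);
        System.out.println(sm.blockSafe); // prints 1
    }
}
```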

> NN not coming out of startup safemode due to under construction blocks only 
> inside snapshots also counted in safemode threshhold
> 
>
> Key: HDFS-5283
> URL: https://issues.apache.org/jira/browse/HDFS-5283
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: snapshots
>Affects Versions: 3.0.0, 2.1.1-beta
>Reporter: Vinay
>Assignee: Vinay
>Priority: Blocker
> Attachments: HDFS-5283.000.patch, HDFS-5283.patch, HDFS-5283.patch
>
>
> This is observed in one of our env:
> 1. A MR Job was running which has created some temporary files and was 
> writing to them.
> 2. Snapshot was taken
> 3. And Job was killed and temporary files were deleted.
> 4. Namenode restarted.
> 5. After restart Namenode was in safemode waiting for blocks
> Analysis
> -
> 1. The snapshot taken also includes the temporary files which were open, and 
> the original files were later deleted.
> 2. The under-construction block count was taken from leases, which did not 
> consider UC blocks existing only inside snapshots.
> 3. So the safemode threshold count was too high and the NN did not come out 
> of safemode.





[jira] [Created] (HDFS-5298) make symlinks production-ready

2013-10-03 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-5298:
--

 Summary: make symlinks production-ready
 Key: HDFS-5298
 URL: https://issues.apache.org/jira/browse/HDFS-5298
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Colin Patrick McCabe


This is an umbrella JIRA for all the things we have to do to make symlinks 
production-ready for Hadoop 2.3.

Note that some of these subtasks are scheduled for 2.1.2 / 2.2, but the overall 
effort is for 2.3.





[jira] [Commented] (HDFS-5294) DistributedFileSystem#getFileLinkStatus should not fully qualify the link target

2013-10-03 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785812#comment-13785812
 ] 

Colin Patrick McCabe commented on HDFS-5294:


This is a duplicate of the earlier HADOOP-9780.  I propose that we close this 
issue as a duplicate and move the discussion there.  If we do end up making 
this change, it will affect more than just HDFS, so it doesn't make sense for 
this to be an HDFS JIRA (as opposed to a Common JIRA) anyway.

> DistributedFileSystem#getFileLinkStatus should not fully qualify the link 
> target
> 
>
> Key: HDFS-5294
> URL: https://issues.apache.org/jira/browse/HDFS-5294
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>
> The NN returns a {{FileStatus}} containing the exact link target as specified 
> by the user at creation.  However, 
> {{DistributedFileSystem#getFileLinkStatus}} explicitly overwrites the target 
> with the fully scheme-qualified path from the lookup.  This causes multiple 
> issues, such as:
> # Prevents clients from discerning whether the target is relative or absolute
> # Mangles a target that is not intended to be a path
> # Causes incorrect resolution with multi-layered filesystems, i.e. when the 
> link should be resolved relative to a higher-level fs (viewfs, chroot, 
> filtered, etc.)
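The mangling can be illustrated with plain {{java.net.URI}} as a stand-in for Hadoop's path qualification (a hypothetical example, not the actual DistributedFileSystem code; the authority and paths are made up): once the raw target is resolved against the link's parent directory and scheme, a client can no longer tell that it was relative.

```java
import java.net.URI;

// Illustration only: java.net.URI stands in for Hadoop path
// qualification; "hdfs://nn1/user/alice/" is a hypothetical link parent.
class LinkTargetSketch {
    // What the user stored at link-creation time: a relative target.
    static String rawTarget() {
        return "data";
    }

    // What the client sees after full qualification: scheme, authority,
    // and absolute path are baked in, and relativity is lost.
    static String qualifiedTarget(String target) {
        return URI.create("hdfs://nn1/user/alice/").resolve(target).toString();
    }
}
```

The raw target `data` becomes `hdfs://nn1/user/alice/data`, which also shows point 2 above: a target never intended to be a path gets rewritten as one.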





[jira] [Commented] (HDFS-5283) NN not coming out of startup safemode due to under construction blocks only inside snapshots also counted in safemode threshhold

2013-10-03 Thread Vinay (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785892#comment-13785892
 ] 

Vinay commented on HDFS-5283:
-

bq. So in our current solution, for each BlockCollection (which is an INodeUC) 
in the blocksMap, we first check if it's in the current fsdir tree. Here our 
claim is, if the inode is not in the current tree (i.e., we cannot identify the 
node's absolute full path or the node with the absolute full path in the 
current fsdir tree is actually not the node stored in the blocksMap), this 
inode should be a file only existing in snapshot, no matter this node is 
instance of INodeUCWithSnapshot or not. If this claim stands, to convert an 
INodeUC to an INodeUCWithSnapshot during deletion will be unnecessary.
I agree that the current solution mentioned here will work for this issue. My 
only doubt is why file delete and directory delete behave differently. Was not 
changing the inodes recursively intentional, or is it an issue?

bq. Without this the new test will fail when setting DN number to a >1 value.
I got it. I was confused because my earlier patch used 
{{((BlockInfoUnderConstruction) storedBlock).getNumExpectedLocations()}} 
instead of {{countLiveNodes(storedBlock)}}, so the test was passing. 
But wouldn't it be better to use {{((BlockInfoUnderConstruction) 
storedBlock).getNumExpectedLocations()}} instead of 
{{storedBlock.addNode(node)}} and {{countLiveNodes(storedBlock)}}, as in my 
patch? Do you see any problems with that?

> NN not coming out of startup safemode due to under construction blocks only 
> inside snapshots also counted in safemode threshhold
> 
>
> Key: HDFS-5283
> URL: https://issues.apache.org/jira/browse/HDFS-5283
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: snapshots
>Affects Versions: 3.0.0, 2.1.1-beta
>Reporter: Vinay
>Assignee: Vinay
>Priority: Blocker
> Attachments: HDFS-5283.000.patch, HDFS-5283.patch, HDFS-5283.patch
>
>
> This is observed in one of our env:
> 1. A MR Job was running which has created some temporary files and was 
> writing to them.
> 2. Snapshot was taken
> 3. And Job was killed and temporary files were deleted.
> 4. Namenode restarted.
> 5. After restart Namenode was in safemode waiting for blocks
> Analysis
> -
> 1. The snapshot that was taken also included the temporary files that were 
> open for write, and the original files were later deleted.
> 2. The under-construction block count was taken from leases; UC blocks that 
> exist only inside snapshots were not considered.
> 3. So the safemode threshold count was too high and the NN did not come out 
> of safemode.
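The threshold arithmetic behind the hang can be sketched numerically (the method name and numbers are illustrative, not the NameNode's actual code): blocks that exist only inside a snapshot inflate the expected total, but no datanode ever reports them, so the reported fraction can stay below the threshold forever.

```java
// Numeric sketch of the safemode mismatch described above; illustrative
// only, not NameNode code.
class SafemodeThresholdSketch {
    /** True once enough blocks have been reported to leave safemode. */
    static boolean canLeaveSafemode(long expectedBlocks, long reportedBlocks,
                                    double threshold) {
        // Safemode exits when reported blocks reach threshold * expected.
        return reportedBlocks >= (long) (expectedBlocks * threshold);
    }
}
```

With, say, 1000 expected blocks of which some live only in snapshots and are never reported, a 0.999 threshold can never be met, matching the observed permanent safemode.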





[jira] [Created] (HDFS-5299) DFS client hangs in updatePipeline RPC when failover happened

2013-10-03 Thread Vinay (JIRA)
Vinay created HDFS-5299:
---

 Summary: DFS client hangs in updatePipeline RPC when failover 
happened
 Key: HDFS-5299
 URL: https://issues.apache.org/jira/browse/HDFS-5299
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.1.0-beta, 3.0.0
Reporter: Vinay
Assignee: Vinay
Priority: Blocker


The DFSClient hung in an updatePipeline call to the NameNode when a failover 
happened at exactly the same time.

When we dug in, the issue was found to be in the handling of the RetryCache in 
updatePipeline.

Here are the steps:
1. The client was writing slowly.
2. One of the datanodes went down, and updatePipeline was called on the active 
NN (ANN).
3. The call reached the ANN, which shut down while processing it.
4. The client retried (since the API is marked AtMostOnce) against the other 
NameNode, which was still in STANDBY, and got a StandbyException.
5. One more client failover happened.
6. The standby NN became active.
7. The client called the current ANN again for updatePipeline.

Now the client call hung in the NN, waiting for the cached call with the same 
callId to complete. But that cached call had already completed the previous 
time, with a StandbyException.

Conclusion:
Whenever a new entry is added to the cache, we need to update the result of 
the call before returning from the call or throwing an exception.
I can see a similar issue in multiple RPCs in FSNameSystem.
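The pattern behind the conclusion can be sketched as follows. This is a minimal, hypothetical model of an at-most-once retry cache (the names are illustrative, not the actual Hadoop RetryCache API): the fix is to complete the cache entry even when the first invocation throws, so a retry with the same callId never waits forever.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;

// Minimal sketch of the at-most-once retry-cache pattern described above.
// Names are hypothetical, not the actual Hadoop RetryCache API.
class RetryCacheSketch {
    static final class Entry {
        boolean done;   // has the first invocation finished (success OR failure)?
        Object result;  // result to replay to retries
    }

    private final Map<Long, Entry> cache = new HashMap<>();

    // Replays the cached result for a retried callId, or runs the call.
    synchronized Object invoke(long callId, Callable<Object> call) throws Exception {
        Entry e = cache.get(callId);
        if (e != null) {
            while (!e.done) {
                wait();          // a retry waits for the first invocation
            }
            return e.result;     // replay without re-executing
        }
        e = new Entry();
        cache.put(callId, e);
        try {
            e.result = call.call();
            return e.result;
        } finally {
            // The crucial part: mark the entry complete even when call()
            // throws (e.g. a StandbyException); otherwise a later retry
            // with the same callId waits on this entry forever.
            e.done = true;
            notifyAll();
        }
    }
}
```

Running the whole call under the cache lock is a simplification for the sketch; the point is only that the finally block completes the entry on both the success and the exception path.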





[jira] [Updated] (HDFS-5299) DFS client hangs in updatePipeline RPC when failover happened

2013-10-03 Thread Vinay (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay updated HDFS-5299:


Description: 
The DFSClient hung in an updatePipeline call to the NameNode when a failover 
happened at exactly the same time.

When we dug in, the issue was found to be in the handling of the RetryCache in 
updatePipeline.

Here are the steps:
1. The client was writing slowly.
2. One of the datanodes went down, and updatePipeline was called on the active 
NN (ANN).
3. The call reached the ANN, which shut down while processing it.
4. The client retried (since the API is marked AtMostOnce) against the other 
NameNode, which was still in STANDBY, and got a StandbyException.
5. One more client failover happened.
6. The standby NN became active.
7. The client called the current ANN again for updatePipeline.

Now the client call hung in the NN, waiting for the cached call with the same 
callId to complete. But that cached call had already completed the previous 
time, with a StandbyException.

Conclusion:
Whenever a new entry is added to the cache, we need to update the result of 
the call before returning from the call or throwing an exception.
I can see a similar issue in multiple RPCs in FSNameSystem.

  was:
DFSClient got hanged in updatedPipeline call to namenode when the failover 
happened at exactly sametime.


When we digged down, issue found to be with handling the RetryCache in 
updatePipeline.

Here are the steps :
1. Client was writing slowly.
2. One of the datanode was down and updatePipeline was called to ANN.
3. Call reached the ANN, while processing updatePipeline call it got shutdown.
3. Now Client retried (Since the api marked as AtMostOnce) to another NameNode. 
at that time still NN was in STANDBY. and got StandbyException.
4. Now one more time client failover happened. 
5. Now SNN became Active.
6. Client called to current ANN again for updatePipeline, 

Now client call got hanged in NN, waiting for the cached call with same callid 
to be over. But this cached call is already got over last time with 
StandbyException.

Conclusion :
Always whenever the new entry is added to cache we need to update the result of 
the call before returning the call or throwing exception.
I can see similar issue multiple RPCs in FSNameSystem.


> DFS client hangs in updatePipeline RPC when failover happened
> -
>
> Key: HDFS-5299
> URL: https://issues.apache.org/jira/browse/HDFS-5299
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0, 2.1.0-beta
>Reporter: Vinay
>Assignee: Vinay
>Priority: Blocker
>
> The DFSClient hung in an updatePipeline call to the NameNode when a failover 
> happened at exactly the same time.
> When we dug in, the issue was found to be in the handling of the RetryCache 
> in updatePipeline.
> Here are the steps:
> 1. The client was writing slowly.
> 2. One of the datanodes went down, and updatePipeline was called on the 
> active NN (ANN).
> 3. The call reached the ANN, which shut down while processing it.
> 4. The client retried (since the API is marked AtMostOnce) against the other 
> NameNode, which was still in STANDBY, and got a StandbyException.
> 5. One more client failover happened.
> 6. The standby NN became active.
> 7. The client called the current ANN again for updatePipeline.
> Now the client call hung in the NN, waiting for the cached call with the 
> same callId to complete. But that cached call had already completed the 
> previous time, with a StandbyException.
> Conclusion:
> Whenever a new entry is added to the cache, we need to update the result of 
> the call before returning from the call or throwing an exception.
> I can see a similar issue in multiple RPCs in FSNameSystem.





[jira] [Updated] (HDFS-5259) Support client which combines appended data with old data before sends it to NFS server

2013-10-03 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-5259:
-

Attachment: HDFS-5259.003.patch

[~jingzhao], I played with the idea of avoiding the copy, and the 
special-altered-write logic leaked into other parts of the write data path. 
The special altered write is expected to happen rarely, so I am not sure it is 
worth making the code harder to read.  I still uploaded the patch so you can 
take a look.  



> Support client which combines appended data with old data before sends it to 
> NFS server
> ---
>
> Key: HDFS-5259
> URL: https://issues.apache.org/jira/browse/HDFS-5259
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: nfs
>Reporter: Yesha Vora
>Assignee: Brandon Li
> Attachments: HDFS-5259.000.patch, HDFS-5259.001.patch, 
> HDFS-5259.003.patch
>
>
> The append does not work with some Linux clients. The client gets an 
> "Input/output error" when it tries to append, because the NFS server treats 
> the request as a random write and fails it.
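The overlapping-write case discussed above can be sketched as a pure offset calculation (a hypothetical helper, not the actual HDFS-5259 patch): when a client resends old bytes together with the appended ones, a write that starts at or before EOF but extends past it can be treated as an append of only the trailing, new portion.

```java
// Hypothetical helper sketching the special altered write discussed
// above; not the actual HDFS-5259 patch.
class OverlapWriteSketch {
    /**
     * Number of trailing bytes that are genuinely new, or -1 if the
     * write does not extend the file (a true random write).
     */
    static long newTailLength(long fileSize, long writeOffset, long writeLen) {
        long writeEnd = writeOffset + writeLen;
        if (writeOffset <= fileSize && writeEnd > fileSize) {
            return writeEnd - fileSize;  // only the bytes beyond EOF are new
        }
        return -1;
    }
}
```

The copy mentioned in the comment would then be avoided by writing only the tail slice; the trade-off raised above is that this special case leaks into the rest of the write path.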





[jira] [Commented] (HDFS-5259) Support client which combines appended data with old data before sends it to NFS server

2013-10-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785924#comment-13785924
 ] 

Hadoop QA commented on HDFS-5259:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12606733/HDFS-5259.003.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-nfs hadoop-hdfs-project/hadoop-hdfs-nfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5093//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5093//console

This message is automatically generated.

> Support client which combines appended data with old data before sends it to 
> NFS server
> ---
>
> Key: HDFS-5259
> URL: https://issues.apache.org/jira/browse/HDFS-5259
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: nfs
>Reporter: Yesha Vora
>Assignee: Brandon Li
> Attachments: HDFS-5259.000.patch, HDFS-5259.001.patch, 
> HDFS-5259.003.patch
>
>
> The append does not work with some Linux clients. The client gets an 
> "Input/output error" when it tries to append, because the NFS server treats 
> the request as a random write and fails it.





[jira] [Updated] (HDFS-5299) DFS client hangs in updatePipeline RPC when failover happened

2013-10-03 Thread Vinay (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay updated HDFS-5299:


Status: Patch Available  (was: Open)

> DFS client hangs in updatePipeline RPC when failover happened
> -
>
> Key: HDFS-5299
> URL: https://issues.apache.org/jira/browse/HDFS-5299
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.1.0-beta, 3.0.0
>Reporter: Vinay
>Assignee: Vinay
>Priority: Blocker
> Attachments: HDFS-5299.patch
>
>
> The DFSClient hung in an updatePipeline call to the NameNode when a failover 
> happened at exactly the same time.
> When we dug in, the issue was found to be in the handling of the RetryCache 
> in updatePipeline.
> Here are the steps:
> 1. The client was writing slowly.
> 2. One of the datanodes went down, and updatePipeline was called on the 
> active NN (ANN).
> 3. The call reached the ANN, which shut down while processing it.
> 4. The client retried (since the API is marked AtMostOnce) against the other 
> NameNode, which was still in STANDBY, and got a StandbyException.
> 5. One more client failover happened.
> 6. The standby NN became active.
> 7. The client called the current ANN again for updatePipeline.
> Now the client call hung in the NN, waiting for the cached call with the 
> same callId to complete. But that cached call had already completed the 
> previous time, with a StandbyException.
> Conclusion:
> Whenever a new entry is added to the cache, we need to update the result of 
> the call before returning from the call or throwing an exception.
> I can see a similar issue in multiple RPCs in FSNameSystem.





[jira] [Updated] (HDFS-5299) DFS client hangs in updatePipeline RPC when failover happened

2013-10-03 Thread Vinay (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay updated HDFS-5299:


Attachment: HDFS-5299.patch

Attaching the patch. Please review.

> DFS client hangs in updatePipeline RPC when failover happened
> -
>
> Key: HDFS-5299
> URL: https://issues.apache.org/jira/browse/HDFS-5299
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0, 2.1.0-beta
>Reporter: Vinay
>Assignee: Vinay
>Priority: Blocker
> Attachments: HDFS-5299.patch
>
>
> The DFSClient hung in an updatePipeline call to the NameNode when a failover 
> happened at exactly the same time.
> When we dug in, the issue was found to be in the handling of the RetryCache 
> in updatePipeline.
> Here are the steps:
> 1. The client was writing slowly.
> 2. One of the datanodes went down, and updatePipeline was called on the 
> active NN (ANN).
> 3. The call reached the ANN, which shut down while processing it.
> 4. The client retried (since the API is marked AtMostOnce) against the other 
> NameNode, which was still in STANDBY, and got a StandbyException.
> 5. One more client failover happened.
> 6. The standby NN became active.
> 7. The client called the current ANN again for updatePipeline.
> Now the client call hung in the NN, waiting for the cached call with the 
> same callId to complete. But that cached call had already completed the 
> previous time, with a StandbyException.
> Conclusion:
> Whenever a new entry is added to the cache, we need to update the result of 
> the call before returning from the call or throwing an exception.
> I can see a similar issue in multiple RPCs in FSNameSystem.





[jira] [Created] (HDFS-5300) FSNameSystem#deleteSnapshot() should not check owner in case of permissions disabled

2013-10-03 Thread Vinay (JIRA)
Vinay created HDFS-5300:
---

 Summary: FSNameSystem#deleteSnapshot() should not check owner in 
case of permissions disabled
 Key: HDFS-5300
 URL: https://issues.apache.org/jira/browse/HDFS-5300
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Vinay
Assignee: Vinay


FSNameSystem#deleteSnapshot() checks the owner even when permissions are 
disabled:

{code:java}
  checkOperation(OperationCategory.WRITE);
  if (isInSafeMode()) {
    throw new SafeModeException(
        "Cannot delete snapshot for " + snapshotRoot, safeMode);
  }
  FSPermissionChecker pc = getPermissionChecker();
  checkOwner(pc, snapshotRoot);

  BlocksMapUpdateInfo collectedBlocks = new BlocksMapUpdateInfo();
  List<INode> removedINodes = new ChunkedArrayList<INode>();
  dir.writeLock();{code}

It should check the owner only when permissions are enabled, as is done for 
all other operations.
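A minimal sketch of the proposed guard (names modeled on FSNamesystem, but this is an illustration under assumptions, not the actual patch): the owner check runs only when the permissions flag is enabled, matching what other operations do.

```java
// Illustrative sketch only -- hypothetical names, not the actual
// FSNamesystem patch. The proposal: run the owner check only when
// dfs.permissions.enabled is true, as other operations already do.
class DeleteSnapshotGuardSketch {
    private final boolean isPermissionEnabled;

    DeleteSnapshotGuardSketch(boolean isPermissionEnabled) {
        this.isPermissionEnabled = isPermissionEnabled;
    }

    /** Returns true if deleteSnapshot() may proceed for this caller. */
    boolean canDeleteSnapshot(String snapshotRootOwner, String caller) {
        if (isPermissionEnabled) {
            // Only with permissions enabled does the owner check apply.
            return snapshotRootOwner.equals(caller);
        }
        return true;  // permissions disabled: skip the owner check entirely
    }
}
```

With this shape, a non-owner caller is rejected only when permissions are enabled, which is the behavior the issue asks for.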





[jira] [Updated] (HDFS-5300) FSNameSystem#deleteSnapshot() should not check owner in case of permissions disabled

2013-10-03 Thread Vinay (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay updated HDFS-5300:


Attachment: HDFS-5300.patch

Attached the patch. Please review.

> FSNameSystem#deleteSnapshot() should not check owner in case of permissions 
> disabled
> 
>
> Key: HDFS-5300
> URL: https://issues.apache.org/jira/browse/HDFS-5300
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Vinay
>Assignee: Vinay
> Attachments: HDFS-5300.patch
>
>
> FSNameSystem#deleteSnapshot() checks the owner even when permissions are 
> disabled:
> {code:java}
>   checkOperation(OperationCategory.WRITE);
>   if (isInSafeMode()) {
>     throw new SafeModeException(
>         "Cannot delete snapshot for " + snapshotRoot, safeMode);
>   }
>   FSPermissionChecker pc = getPermissionChecker();
>   checkOwner(pc, snapshotRoot);
>   BlocksMapUpdateInfo collectedBlocks = new BlocksMapUpdateInfo();
>   List<INode> removedINodes = new ChunkedArrayList<INode>();
>   dir.writeLock();{code}
> It should check the owner only when permissions are enabled, as is done for 
> all other operations.





[jira] [Updated] (HDFS-5300) FSNameSystem#deleteSnapshot() should not check owner in case of permissions disabled

2013-10-03 Thread Vinay (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay updated HDFS-5300:


Affects Version/s: 3.0.0
   2.1.0-beta
   Status: Patch Available  (was: Open)

> FSNameSystem#deleteSnapshot() should not check owner in case of permissions 
> disabled
> 
>
> Key: HDFS-5300
> URL: https://issues.apache.org/jira/browse/HDFS-5300
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.1.0-beta, 3.0.0
>Reporter: Vinay
>Assignee: Vinay
> Attachments: HDFS-5300.patch
>
>
> FSNameSystem#deleteSnapshot() checks the owner even when permissions are 
> disabled:
> {code:java}
>   checkOperation(OperationCategory.WRITE);
>   if (isInSafeMode()) {
>     throw new SafeModeException(
>         "Cannot delete snapshot for " + snapshotRoot, safeMode);
>   }
>   FSPermissionChecker pc = getPermissionChecker();
>   checkOwner(pc, snapshotRoot);
>   BlocksMapUpdateInfo collectedBlocks = new BlocksMapUpdateInfo();
>   List<INode> removedINodes = new ChunkedArrayList<INode>();
>   dir.writeLock();{code}
> It should check the owner only when permissions are enabled, as is done for 
> all other operations.


