[jira] [Updated] (HDFS-9129) Move the safemode block count into BlockManager

2015-10-23 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-9129:

Attachment: HDFS-9129.010.patch

Thank you [~wheat9] for your review. The v10 patch addresses the comments and 
rebases on the {{trunk}} branch. As there are conflicts because of 
[HDFS-4015], I also invited [~arpitagarwal] and [~anu] to review the patch.

See my responses to [~wheat9]'s comments inline.
{quote}
1. It does not give much information compared to figuring out the issues in the 
code directly. What does "thresholds met" / "extensions reached" mean? It 
causes more confusion than explanation.
{quote}
The motivation is that the diagram is helpful at a glance, provided we define 
"thresholds met" / "extension reached". The v10 patch adds more explanation in 
the comments.

{quote}
{code}
LOG.error("Non-recognized block manager safe mode status: {}", status);
{code}
2. Should be an assert.
{quote}
True. I'll simply use {{assert false : "some comment"}}.
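A minimal sketch of that change (the enclosing switch over the safe mode 
{{status}} is assumed):
{code}
default:
  // Should be unreachable by design; fail fast instead of logging.
  assert false : "Non-recognized block manager safe mode status: " + status;
{code}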

{quote}
{code}
private volatile boolean isInManualSafeMode = false;
private volatile boolean isInResourceLowSafeMode = false;

...
isInManualSafeMode = !resourcesLow;
isInResourceLowSafeMode = resourcesLow;
{code}
3. How do these two variables synchronize? Is the system in a consistent state 
in the middle of the execution?
{quote}
Good question. Actually, it's not in a consistent state in the middle of the 
execution. If {{resourcesLow}} is true and the name node was previously in 
manual safe mode, then in the middle of the execution {{isInSafeMode}} will 
return false, which means the safe mode is OFF. The main reason is that writing 
to the two variables (i.e., entering/leaving safemode) is guarded by the 
FSNamesystem write lock, while reading is not.
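A hedged illustration of that window (not code from the patch; assume 
{{resourcesLow}} is true and the name node was in manual safe mode):
{code}
// Both writes happen under the FSNamesystem write lock:
isInManualSafeMode = false;       // step 1: manual flag cleared
// An unguarded concurrent isInSafeMode() here sees both flags false,
// i.e. safe mode appears OFF for a moment.
isInResourceLowSafeMode = true;   // step 2: resource-low flag set
{code}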

The enum-type state was replaced with two boolean flags in the v7 patch because 
the two-layer state machine was cumbersome, per offline discussion with 
[~wheat9] and [~jingzhao]. Guarding all the reads looks expensive, and bitwise 
operations on a single flag variable seem tricky.

The new design goes back to the {{trunk}} logic, which keeps the block manager 
in safe mode in the middle of the execution:
{code}
  if (resourcesLow) {
    isInResourceLowSafeMode = true;
  } else {
    isInManualSafeMode = true;
  }
{code}
In case both {{isInManualSafeMode}} and {{isInResourceLowSafeMode}} are true, 
{{isInResourceLowSafeMode}} takes precedence over {{isInManualSafeMode}} in our 
current logic, e.g. in {{getTurnOffTip()}}.
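A hedged sketch of that precedence (the message strings are illustrative):
{code}
if (isInResourceLowSafeMode) {   // checked first, so it wins when both are true
  return "Resources are low on NN. Please add or free up more resources.";
} else if (isInManualSafeMode) {
  return "It was turned on manually.";
}
// ... block threshold / extension cases follow ...
{code}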

{quote}
{code}
+// INITIALIZED -> THRESHOLD
+bmSafeMode.setBlockTotal(BLOCK_TOTAL);
+assertEquals(BMSafeModeStatus.THRESHOLD, getSafeModeStatus());
+assertTrue(bmSafeMode.isInSafeMode());
{code}
4. It makes sense to put it in a test instead of in the @Before method.
{quote}
That makes sense to me. I'll add a new test called {{testSetBlockTotal}}.
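A hypothetical outline of the new test, reusing the setup quoted above:
{code}
@Test
public void testSetBlockTotal() {
  // INITIALIZED -> THRESHOLD
  bmSafeMode.setBlockTotal(BLOCK_TOTAL);
  assertEquals(BMSafeModeStatus.THRESHOLD, getSafeModeStatus());
  assertTrue(bmSafeMode.isInSafeMode());
}
{code}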

{quote}
{code}
+// EXTENSION -> OFF
+Whitebox.setInternalState(bmSafeMode, "status", BMSafeModeStatus.EXTENSION);
+reachBlockThreshold();
+reachExtension();
+bmSafeMode.checkSafeMode();
+assertEquals(BMSafeModeStatus.OFF, getSafeModeStatus());
{code}
5. Please refactor the code – you can reuse the getSafeModeStatus() that is 
defined by the class.
{quote}
Actually, the first statement in each state transition test is to *set* the 
internal state.
I'll make the fields package-accessible so that Whitebox is not needed. See the 
end of this comment.
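With package-private fields, the {{Whitebox}} call can become a direct 
assignment, assuming the test class is in the same package (the field name 
{{status}} is taken from the quoted snippet):
{code}
// instead of Whitebox.setInternalState(bmSafeMode, "status", ...):
bmSafeMode.status = BMSafeModeStatus.EXTENSION;
{code}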

{quote}
{code}
assertEquals(getBlockSafe(), i);
{code}
6. The expected value should go first according to the function signature.
{quote}
Nice catch. I will revise the whole test to fix all similar ones.
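For example, per JUnit's {{assertEquals(expected, actual)}} signature:
{code}
assertEquals(i, getBlockSafe());  // expected value goes first
{code}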

{quote}
7. A higher-level question: why does it need so many getInternalState() 
statements? It looks to me that many of these behaviors can be observed from 
outside without whitebox testing.
{quote}
The reason was that *all* non-static fields in {{BlockManagerSafeMode}} are 
private. By design, I made them _private_ because we don't need access to its 
internal state from {{BlockManager}}.

In the new design, the v10 patch simply makes the fields package-private so the 
test is more straightforward.

> Move the safemode block count into BlockManager
> ---
>
> Key: HDFS-9129
> URL: https://issues.apache.org/jira/browse/HDFS-9129
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Haohui Mai
>Assignee: Mingliang Liu
> Attachments: HDFS-9129.000.patch, HDFS-9129.001.patch, 
> HDFS-9129.002.patch, HDFS-9129.003.patch, HDFS-9129.004.patch, 
> HDFS-9129.005.patch, HDFS-9129.006.patch, HDFS-9129.007.patch, 
> HDFS-9129.008.patch, HDFS-9129.009.patch, HDFS-9129.010.patch
>
>
> The {{SafeMode}} needs to track whether there are enough blocks so that the 
> NN can get out of the safemode. These fields can be moved to the 
> {

[jira] [Commented] (HDFS-9297) Update TestBlockMissingException to use corruptBlockOnDataNodesByDeletingBlockFile()

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972439#comment-14972439
 ] 

Hudson commented on HDFS-9297:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2468 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2468/])
HDFS-9297. Update TestBlockMissingException to use (lei: rev 
5679e46b7f867f8f7f8195c86c37e3db7b23d7d7)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockMissingException.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Update TestBlockMissingException to use 
> corruptBlockOnDataNodesByDeletingBlockFile()
> 
>
> Key: HDFS-9297
> URL: https://issues.apache.org/jira/browse/HDFS-9297
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: HDFS, test
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Trivial
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9297.001.patch
>
>
> TestBlockMissingException uses its own function to corrupt a block by 
> deleting all its block files. HDFS-7235 introduced a helper function 
> {{corruptBlockOnDataNodesByDeletingBlockFile()}} that does exactly the same 
> thing. We can update this test to use the helper function.





[jira] [Commented] (HDFS-9301) HDFS clients can't construct HdfsConfiguration instances

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972438#comment-14972438
 ] 

Hudson commented on HDFS-9301:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2468 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2468/])
HDFS-9301. HDFS clients can't construct HdfsConfiguration instances. (wheat9: 
rev 15eb84b37e6c0195d59d3a29fbc5b7417bf022ff)
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/HdfsConfigurationLoader.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java


> HDFS clients can't construct HdfsConfiguration instances
> 
>
> Key: HDFS-9301
> URL: https://issues.apache.org/jira/browse/HDFS-9301
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Steve Loughran
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9241.000.patch, HDFS-9241.001.patch, 
> HDFS-9241.002.patch, HDFS-9241.003.patch, HDFS-9241.004.patch, 
> HDFS-9241.005.patch
>
>
> The changes for the hdfs client classpath make instantiating 
> {{HdfsConfiguration}} from the client impossible; it only lives server-side. 
> This breaks any app which creates one.
> I know people will look at the {{@Private}} tag and say "don't do that then", 
> but it's worth considering precisely why I, at least, do this: it's the only 
> way to guarantee that the hdfs-default and hdfs-site resources get on the 
> classpath, including all the security settings. It's precisely the use case 
> which {{HdfsConfigurationLoader.init();}} offers internally to the hdfs code.
> What am I meant to do now? 





[jira] [Commented] (HDFS-9290) DFSClient#callAppend() is not backward compatible for slightly older NameNodes

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972436#comment-14972436
 ] 

Hudson commented on HDFS-9290:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2468 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2468/])
HDFS-9290. DFSClient#callAppend() is not backward compatible for (kihwal: rev 
b9e0417bdf2b9655dc4256bdb43683eca1ab46be)
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> DFSClient#callAppend() is not backward compatible for slightly older NameNodes
> --
>
> Key: HDFS-9290
> URL: https://issues.apache.org/jira/browse/HDFS-9290
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Blocker
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-9290.001.patch, HDFS-9290.002.patch
>
>
> HDFS-7210 combined 2 RPC calls used at file append into a single one; 
> specifically, {{getFileInfo()}} is combined with {{append()}}. While backward 
> compatibility for older clients is handled by the new NameNode (protobuf), a 
> newer client's {{append()}} call does not work with older NameNodes. One will 
> run into an exception like the following:
> {code:java}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.isLazyPersist(DFSOutputStream.java:1741)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.getChecksum4Compute(DFSOutputStream.java:1550)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1560)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1670)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForAppend(DFSOutputStream.java:1717)
> at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1861)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1922)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1892)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:340)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:336)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:336)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:318)
> at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1164)
> {code}
> The cause is that the new client code is expecting both the last block and 
> file info in the same RPC but the old NameNode only replied with the first. 
> The exception itself does not reflect this and one will have to look at the 
> HDFS source code to really understand what happened.
> We can have the client detect that it's talking to an old NameNode and send an 
> extra {{getFileInfo()}} RPC, or we should improve the exception being thrown 
> to accurately reflect the cause of failure.
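> A hedged sketch of the first option; {{LastBlockWithStatus}} and 
> {{ClientProtocol#getFileInfo}} exist, but the surrounding {{callAppend()}} 
> body and the variable names here are assumptions:
> {code:java}
> HdfsFileStatus status = lastBlockWithStatus.getFileStatus();
> if (status == null) {
>   // Older NameNode: the combined append RPC did not return the file
>   // status, so issue the extra getFileInfo() RPC for compatibility.
>   status = namenode.getFileInfo(src);
> }
> {code}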





[jira] [Commented] (HDFS-4015) Safemode should count and report orphaned blocks

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972437#comment-14972437
 ] 

Hudson commented on HDFS-4015:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2468 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2468/])
HDFS-4015. Safemode should count and report orphaned blocks. (arp: rev 
86c92227fc56b6e06d879d250728e8dc8cbe98fe)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMetadataConsistency.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSCommands.md
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelperClient.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsConstants.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/proto/ClientNamenodeProtocol.proto
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeStatusMXBean.java


> Safemode should count and report orphaned blocks
> 
>
> Key: HDFS-4015
> URL: https://issues.apache.org/jira/browse/HDFS-4015
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Todd Lipcon
>Assignee: Anu Engineer
> Fix For: 2.8.0
>
> Attachments: HDFS-4015.001.patch, HDFS-4015.002.patch, 
> HDFS-4015.003.patch, HDFS-4015.004.patch, HDFS-4015.005.patch, 
> HDFS-4015.006.patch, HDFS-4015.007.patch
>
>
> The safemode status currently reports the number of unique reported blocks 
> compared to the total number of blocks referenced by the namespace. However, 
> it does not report the inverse: blocks which are reported by datanodes but 
> not referenced by the namespace.
> In the case that an admin accidentally starts up from an old image, this can 
> be confusing: safemode and fsck will show "corrupt files", which are the 
> files which actually have been deleted but got resurrected by restarting from 
> the old image. This will convince them that they can safely force leave 
> safemode and remove these files -- after all, they know that those files 
> should really have been deleted. However, they're not aware that leaving 
> safemode will also unrecoverably delete a bunch of other block files which 
> have been orphaned due to the namespace rollback.
> I'd like to consider reporting something like: "90 of expected 100 
> blocks have been reported. Additionally, 1 blocks have been reported 
> which do not correspond to any file in the namespace. Forcing exit of 
> safemode will unrecoverably remove those data blocks"
> Whether this statistic is also used for some kind of "inverse safe mode" is 
> the logical next step, but just reporting it as a warning seems easy enough 
> to accomplish and worth doing.





[jira] [Commented] (HDFS-9228) Retry failed NN requests

2015-10-23 Thread Rohan Rajeevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972430#comment-14972430
 ] 

Rohan Rajeevan commented on HDFS-9228:
--

Agree with [~arpitagarwal] here. 

I think the default value for {{dfs.client.retry.policy.enabled}} is set to 
false. See https://issues.apache.org/jira/browse/HDFS-8708 for further 
discussion of the issue.
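A hedged sketch of overriding that default programmatically (whether retries 
should be enabled depends on the deployment):
{code}
Configuration conf = new HdfsConfiguration();
conf.setBoolean("dfs.client.retry.policy.enabled", true);  // default: false
{code}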

> Retry failed NN requests
> 
>
> Key: HDFS-9228
> URL: https://issues.apache.org/jira/browse/HDFS-9228
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>
> Handle the use case of temporary network or NN hiccups and have a 
> configurable number of retries for NN operations.





[jira] [Updated] (HDFS-8831) Trash Support for deletion in HDFS encryption zone

2015-10-23 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-8831:
-
Attachment: HDFS-8831.02.patch

> Trash Support for deletion in HDFS encryption zone
> --
>
> Key: HDFS-8831
> URL: https://issues.apache.org/jira/browse/HDFS-8831
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: encryption
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
> Attachments: HDFS-8831-10152015.pdf, HDFS-8831.00.patch, 
> HDFS-8831.01.patch, HDFS-8831.02.patch
>
>
> Currently, "Soft Delete" is only supported if the whole encryption zone is 
> deleted. If you delete files within the zone with the trash feature enabled, 
> you will get an error similar to the following:
> {code}
> rm: Failed to move to trash: hdfs://HW11217.local:9000/z1_1/startnn.sh: 
> /z1_1/startnn.sh can't be moved from an encryption zone.
> {code}
> With HDFS-8830, we can support "Soft Delete" by placing the .Trash folder 
> for the file being deleted inside the same encryption zone. 
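> A hedged illustration of the intended layout (the paths and trash directory 
> structure are examples only):
> {code}
> /z1_1/startnn.sh  ->  /z1_1/.Trash/<user>/Current/z1_1/startnn.sh
> {code}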





[jira] [Commented] (HDFS-7284) Add more debug info to BlockInfoUnderConstruction#setGenerationStampAndVerifyReplicas

2015-10-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972426#comment-14972426
 ] 

Hadoop QA commented on HDFS-7284:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  20m 17s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   8m 17s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 25s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m 35s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 40s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 33s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 13s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  51m 17s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 36s | Tests passed in 
hadoop-hdfs-client. |
| | | 103m 55s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.util.TestByteArrayManager |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768378/HDFS-7284.005.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 7781fe1 |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13178/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13178/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13178/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13178/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13178/console |


This message was automatically generated.

> Add more debug info to 
> BlockInfoUnderConstruction#setGenerationStampAndVerifyReplicas
> -
>
> Key: HDFS-7284
> URL: https://issues.apache.org/jira/browse/HDFS-7284
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.5.1
>Reporter: Hu Liu,
>Assignee: Wei-Chiu Chuang
>  Labels: supportability
> Attachments: HDFS-7284.001.patch, HDFS-7284.002.patch, 
> HDFS-7284.003.patch, HDFS-7284.004.patch, HDFS-7284.005.patch
>
>
> When I was looking at a replica loss issue, I got the following info from 
> the log:
> {code}
> 2014-10-13 01:54:53,104 INFO BlockStateChange: BLOCK* Removing stale replica 
> from location x.x.x.x
> {code}
> I could only tell that a replica was removed, but not which block or its 
> timestamp. I need to know the id and timestamp of the block from the log 
> file.
> So it's better to add more info, including block id and timestamp, to the code 
> snippet:
> {code}
> for (ReplicaUnderConstruction r : replicas) {
>   if (genStamp != r.getGenerationStamp()) {
> r.getExpectedLocation().removeBlock(this);
> NameNode.blockStateChangeLog.info("BLOCK* Removing stale replica "
> + "from location: " + r.getExpectedLocation());
>   }
> }
> {code}
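> A hedged sketch of the enriched message, assuming {{getBlockId()}} and 
> {{genStamp}} are in scope as in the snippet above:
> {code}
> NameNode.blockStateChangeLog.info("BLOCK* Removing stale replica from "
>     + "location " + r.getExpectedLocation() + " for block " + getBlockId()
>     + " with generation stamp " + genStamp);
> {code}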





[jira] [Commented] (HDFS-9297) Update TestBlockMissingException to use corruptBlockOnDataNodesByDeletingBlockFile()

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972386#comment-14972386
 ] 

Hudson commented on HDFS-9297:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #578 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/578/])
HDFS-9297. Update TestBlockMissingException to use (lei: rev 
5679e46b7f867f8f7f8195c86c37e3db7b23d7d7)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockMissingException.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt




[jira] [Commented] (HDFS-4015) Safemode should count and report orphaned blocks

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972385#comment-14972385
 ] 

Hudson commented on HDFS-4015:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #578 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/578/])
HDFS-4015. Safemode should count and report orphaned blocks. (arp: rev 
86c92227fc56b6e06d879d250728e8dc8cbe98fe)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMetadataConsistency.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeStatusMXBean.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsConstants.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSCommands.md
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelperClient.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/proto/ClientNamenodeProtocol.proto




[jira] [Updated] (HDFS-4015) Safemode should count and report orphaned blocks

2015-10-23 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-4015:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

Committed to branch-2 for 2.8.0.

Thanks for contributing this improvement [~anu], and thanks for the reviews 
[~liuml07] and [~jnp].



[jira] [Commented] (HDFS-9297) Update TestBlockMissingException to use corruptBlockOnDataNodesByDeletingBlockFile()

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972372#comment-14972372
 ] 

Hudson commented on HDFS-9297:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #591 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/591/])
HDFS-9297. Update TestBlockMissingException to use (lei: rev 
5679e46b7f867f8f7f8195c86c37e3db7b23d7d7)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockMissingException.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt




[jira] [Commented] (HDFS-4015) Safemode should count and report orphaned blocks

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972370#comment-14972370
 ] 

Hudson commented on HDFS-4015:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #591 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/591/])
HDFS-4015. Safemode should count and report orphaned blocks. (arp: rev 
86c92227fc56b6e06d879d250728e8dc8cbe98fe)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSCommands.md
* hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMetadataConsistency.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelperClient.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsConstants.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeStatusMXBean.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/proto/ClientNamenodeProtocol.proto




[jira] [Commented] (HDFS-4015) Safemode should count and report orphaned blocks

2015-10-23 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972367#comment-14972367
 ] 

Arpit Agarwal commented on HDFS-4015:
-

Committed to trunk. Keeping Jira open for the branch-2 commit.



[jira] [Commented] (HDFS-4015) Safemode should count and report orphaned blocks

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972363#comment-14972363
 ] 

Hudson commented on HDFS-4015:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2523 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2523/])
HDFS-4015. Safemode should count and report orphaned blocks. (arp: rev 
86c92227fc56b6e06d879d250728e8dc8cbe98fe)
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSCommands.md
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeStatusMXBean.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/proto/ClientNamenodeProtocol.proto
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsConstants.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelperClient.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMetadataConsistency.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java




[jira] [Commented] (HDFS-9297) Update TestBlockMissingException to use corruptBlockOnDataNodesByDeletingBlockFile()

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972365#comment-14972365
 ] 

Hudson commented on HDFS-9297:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2523 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2523/])
HDFS-9297. Update TestBlockMissingException to use (lei: rev 
5679e46b7f867f8f7f8195c86c37e3db7b23d7d7)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockMissingException.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt




[jira] [Commented] (HDFS-4015) Safemode should count and report orphaned blocks

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972348#comment-14972348
 ] 

Hudson commented on HDFS-4015:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1313 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1313/])
HDFS-4015. Safemode should count and report orphaned blocks. (arp: rev 
86c92227fc56b6e06d879d250728e8dc8cbe98fe)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMetadataConsistency.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeStatusMXBean.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/proto/ClientNamenodeProtocol.proto
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSCommands.md
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsConstants.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelperClient.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt




[jira] [Commented] (HDFS-9297) Update TestBlockMissingException to use corruptBlockOnDataNodesByDeletingBlockFile()

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972349#comment-14972349
 ] 

Hudson commented on HDFS-9297:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1313 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1313/])
HDFS-9297. Update TestBlockMissingException to use (lei: rev 
5679e46b7f867f8f7f8195c86c37e3db7b23d7d7)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockMissingException.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt




[jira] [Commented] (HDFS-9301) HDFS clients can't construct HdfsConfiguration instances

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972343#comment-14972343
 ] 

Hudson commented on HDFS-9301:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #532 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/532/])
HDFS-9301. HDFS clients can't construct HdfsConfiguration instances. (wheat9: 
rev 15eb84b37e6c0195d59d3a29fbc5b7417bf022ff)
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/HdfsConfigurationLoader.java




[jira] [Commented] (HDFS-9297) Update TestBlockMissingException to use corruptBlockOnDataNodesByDeletingBlockFile()

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972344#comment-14972344
 ] 

Hudson commented on HDFS-9297:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #532 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/532/])
HDFS-9297. Update TestBlockMissingException to use (lei: rev 
5679e46b7f867f8f7f8195c86c37e3db7b23d7d7)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockMissingException.java




[jira] [Commented] (HDFS-4015) Safemode should count and report orphaned blocks

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972342#comment-14972342
 ] 

Hudson commented on HDFS-4015:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #532 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/532/])
HDFS-4015. Safemode should count and report orphaned blocks. (arp: rev 
86c92227fc56b6e06d879d250728e8dc8cbe98fe)
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelperClient.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsConstants.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMetadataConsistency.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeStatusMXBean.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSCommands.md
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/proto/ClientNamenodeProtocol.proto
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java




[jira] [Commented] (HDFS-9290) DFSClient#callAppend() is not backward compatible for slightly older NameNodes

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972341#comment-14972341
 ] 

Hudson commented on HDFS-9290:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #532 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/532/])
HDFS-9290. DFSClient#callAppend() is not backward compatible for (kihwal: rev 
b9e0417bdf2b9655dc4256bdb43683eca1ab46be)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java


> DFSClient#callAppend() is not backward compatible for slightly older NameNodes
> --
>
> Key: HDFS-9290
> URL: https://issues.apache.org/jira/browse/HDFS-9290
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Blocker
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-9290.001.patch, HDFS-9290.002.patch
>
>
> HDFS-7210 combined the 2 RPC calls used at file append into a single one. 
> Specifically, {{getFileInfo()}} is combined with {{append()}}. While backward 
> compatibility for older clients is handled by the new NameNode (via 
> protobuf), a newer client's {{append()}} call does not work with older 
> NameNodes. One will run into an exception like the following:
> {code:java}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.isLazyPersist(DFSOutputStream.java:1741)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.getChecksum4Compute(DFSOutputStream.java:1550)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.(DFSOutputStream.java:1560)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.(DFSOutputStream.java:1670)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForAppend(DFSOutputStream.java:1717)
> at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1861)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1922)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1892)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:340)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:336)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:336)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:318)
> at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1164)
> {code}
> The cause is that the new client code expects both the last block and the 
> file info in the same RPC, but the old NameNode replies with only the first. 
> The exception itself does not reflect this, and one will have to look at the 
> HDFS source code to really understand what happened.
> We can have the client detect that it's talking to an old NameNode and send 
> an extra {{getFileInfo()}} RPC. Or we should improve the exception being 
> thrown to accurately reflect the cause of failure.
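
A minimal sketch of the detect-and-fall-back option suggested above (the helper method and its wiring are hypothetical; the real change would live in {{DFSClient#callAppend()}}):

{code:java}
import java.io.IOException;
import org.apache.hadoop.hdfs.protocol.ClientProtocol;
import org.apache.hadoop.hdfs.protocol.HdfsFileStatus;
import org.apache.hadoop.hdfs.protocol.LastBlockWithStatus;

// Hypothetical sketch: if the combined append RPC came back from an older
// NameNode without the file status, fall back to the extra getFileInfo()
// RPC that pre-HDFS-7210 clients used to issue.
class AppendFallbackSketch {
  static HdfsFileStatus fileStatusForAppend(ClientProtocol namenode,
      LastBlockWithStatus blkWithStatus, String src) throws IOException {
    HdfsFileStatus status = blkWithStatus.getFileStatus();
    if (status == null) {
      // Older NameNode: only the last block was returned in the append RPC.
      status = namenode.getFileInfo(src);
    }
    return status;
  }
}
{code}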



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7984) webhdfs:// needs to support provided delegation tokens

2015-10-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972329#comment-14972329
 ] 

Hadoop QA commented on HDFS-7984:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 41s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 53s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 31s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m  6s | The applied patch generated  1 
new checkstyle issues (total was 108, now 109). |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 29s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 53s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |   6m 52s | Tests passed in 
hadoop-common. |
| | |  48m 28s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768446/HDFS-7984.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 86c9222 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/13177/artifact/patchprocess/diffcheckstylehadoop-common.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13177/artifact/patchprocess/whitespace.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13177/artifact/patchprocess/testrun_hadoop-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13177/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13177/console |


This message was automatically generated.

> webhdfs:// needs to support provided delegation tokens
> --
>
> Key: HDFS-7984
> URL: https://issues.apache.org/jira/browse/HDFS-7984
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>Assignee: HeeSoo Kim
>Priority: Blocker
> Attachments: HDFS-7984.patch
>
>
> When using the webhdfs:// filesystem (especially from distcp), we need the 
> ability to inject a delegation token rather than have webhdfs initialize its 
> own. This would allow for cross-authentication-zone file system accesses.
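
A sketch of the desired usage, under the assumption that webhdfs would honor tokens already present in the caller's UGI; the token file path is illustrative (e.g. written earlier with {{hdfs fetchdt}}):

{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.UserGroupInformation;

// Hypothetical sketch: load an externally obtained delegation token into
// the current UGI so that webhdfs uses it instead of initializing its own.
public class ProvidedTokenSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Credentials creds = Credentials.readTokenStorageFile(
        new Path("file:///tmp/remote.token"), conf); // illustrative path
    UserGroupInformation.getCurrentUser().addCredentials(creds);
    FileSystem fs = FileSystem.get(
        URI.create("webhdfs://remote-nn:50070/"), conf);
    System.out.println(fs.getFileStatus(new Path("/")));
  }
}
{code}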



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9279) Decommissioned capacity should not be considered for configured/used capacity

2015-10-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972309#comment-14972309
 ] 

Hadoop QA commented on HDFS-9279:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  16m 29s | Findbugs (version ) appears to 
be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m  4s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 35s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 25s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 31s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 40s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   2m 33s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 15s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  64m 26s | Tests failed in hadoop-hdfs. |
| | | 108m 34s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestInterDatanodeProtocol |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyWriter |
|   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
|   | hadoop.hdfs.TestReplaceDatanodeOnFailure |
|   | hadoop.hdfs.server.namenode.TestNameNodeMXBean |
|   | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
|   | hadoop.hdfs.TestDecommission |
|   | hadoop.hdfs.server.namenode.TestCacheDirectives |
|   | hadoop.hdfs.TestLeaseRecovery2 |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768433/HDFS-9279-v2.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 5679e46 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13176/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13176/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13176/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13176/console |


This message was automatically generated.

> Decommissioned capacity should not be considered for configured/used capacity
> 
>
> Key: HDFS-9279
> URL: https://issues.apache.org/jira/browse/HDFS-9279
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.1
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: HDFS-9279-v1.patch, HDFS-9279-v2.patch
>
>
> Capacity of a decommissioned node is still counted in the configured and 
> used capacity metrics. This gives an incorrect perception of cluster usage.
> Once a node is decommissioned, its capacity should be treated like that of a 
> dead node.
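
A toy sketch of the accounting change this asks for (the {{Node}} type is a stand-in for the NameNode's datanode descriptor):

{code:java}
import java.util.Arrays;
import java.util.List;

// Toy sketch: skip decommissioned nodes when aggregating the configured and
// used capacity metrics, so they are treated like dead nodes.
public class CapacitySketch {
  static class Node {
    final long capacity, dfsUsed;
    final boolean decommissioned;
    Node(long capacity, long dfsUsed, boolean decommissioned) {
      this.capacity = capacity;
      this.dfsUsed = dfsUsed;
      this.decommissioned = decommissioned;
    }
  }

  public static void main(String[] args) {
    List<Node> nodes = Arrays.asList(
        new Node(100, 40, false),
        new Node(100, 70, true)); // decommissioned: should not be counted
    long configured = 0, used = 0;
    for (Node n : nodes) {
      if (n.decommissioned) {
        continue; // treat like a dead node
      }
      configured += n.capacity;
      used += n.dfsUsed;
    }
    System.out.println("configured=" + configured + ", used=" + used);
  }
}
{code}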



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9301) HDFS clients can't construct HdfsConfiguration instances

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972301#comment-14972301
 ] 

Hudson commented on HDFS-9301:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2522 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2522/])
HDFS-9301. HDFS clients can't construct HdfsConfiguration instances. (wheat9: 
rev 15eb84b37e6c0195d59d3a29fbc5b7417bf022ff)
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/HdfsConfigurationLoader.java


> HDFS clients can't construct HdfsConfiguration instances
> 
>
> Key: HDFS-9301
> URL: https://issues.apache.org/jira/browse/HDFS-9301
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Steve Loughran
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9241.000.patch, HDFS-9241.001.patch, 
> HDFS-9241.002.patch, HDFS-9241.003.patch, HDFS-9241.004.patch, 
> HDFS-9241.005.patch
>
>
> The changes for the hdfs client classpath make instantiating 
> {{HdfsConfiguration}} from the client impossible; it only lives server side. 
> This breaks any app which creates one.
> I know people will look at the {{@Private}} tag and say "don't do that then", 
> but it's worth considering precisely why I, at least, do this: it's the only 
> way to guarantee that the hdfs-default and hdfs-site resources get on the 
> classpath, including all the security settings. It's precisely the use case 
> which {{HdfsConfigurationLoader.init();}} offers internally to the hdfs code.
> What am I meant to do now? 
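
For context, this is the client-side pattern the classpath split breaks; {{HdfsConfiguration}} is constructed purely for its side effect of registering hdfs-default.xml and hdfs-site.xml as default resources:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class ClientConfSketch {
  public static void main(String[] args) {
    // Loads hdfs-default.xml and hdfs-site.xml (with the security settings)
    // onto the resource list, so later lookups see HDFS configuration.
    Configuration conf = new HdfsConfiguration();
    System.out.println(conf.get("fs.defaultFS"));
  }
}
{code}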



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9290) DFSClient#callAppend() is not backward compatible for slightly older NameNodes

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972300#comment-14972300
 ] 

Hudson commented on HDFS-9290:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2522 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2522/])
HDFS-9290. DFSClient#callAppend() is not backward compatible for (kihwal: rev 
b9e0417bdf2b9655dc4256bdb43683eca1ab46be)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java


> DFSClient#callAppend() is not backward compatible for slightly older NameNodes
> --
>
> Key: HDFS-9290
> URL: https://issues.apache.org/jira/browse/HDFS-9290
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Blocker
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-9290.001.patch, HDFS-9290.002.patch
>
>
> HDFS-7210 combined the 2 RPC calls used at file append into a single one. 
> Specifically, {{getFileInfo()}} is combined with {{append()}}. While backward 
> compatibility for older clients is handled by the new NameNode (via 
> protobuf), a newer client's {{append()}} call does not work with older 
> NameNodes. One will run into an exception like the following:
> {code:java}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.isLazyPersist(DFSOutputStream.java:1741)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.getChecksum4Compute(DFSOutputStream.java:1550)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.(DFSOutputStream.java:1560)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.(DFSOutputStream.java:1670)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForAppend(DFSOutputStream.java:1717)
> at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1861)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1922)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1892)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:340)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:336)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:336)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:318)
> at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1164)
> {code}
> The cause is that the new client code expects both the last block and the 
> file info in the same RPC, but the old NameNode replies with only the first. 
> The exception itself does not reflect this, and one will have to look at the 
> HDFS source code to really understand what happened.
> We can have the client detect that it's talking to an old NameNode and send 
> an extra {{getFileInfo()}} RPC. Or we should improve the exception being 
> thrown to accurately reflect the cause of failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-4015) Safemode should count and report orphaned blocks

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972284#comment-14972284
 ] 

Hudson commented on HDFS-4015:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8699 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8699/])
HDFS-4015. Safemode should count and report orphaned blocks. (arp: rev 
86c92227fc56b6e06d879d250728e8dc8cbe98fe)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/proto/ClientNamenodeProtocol.proto
* hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsConstants.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSCommands.md
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMetadataConsistency.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeStatusMXBean.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelperClient.java


> Safemode should count and report orphaned blocks
> 
>
> Key: HDFS-4015
> URL: https://issues.apache.org/jira/browse/HDFS-4015
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Todd Lipcon
>Assignee: Anu Engineer
> Attachments: HDFS-4015.001.patch, HDFS-4015.002.patch, 
> HDFS-4015.003.patch, HDFS-4015.004.patch, HDFS-4015.005.patch, 
> HDFS-4015.006.patch, HDFS-4015.007.patch
>
>
> The safemode status currently reports the number of unique reported blocks 
> compared to the total number of blocks referenced by the namespace. However, 
> it does not report the inverse: blocks which are reported by datanodes but 
> not referenced by the namespace.
> In the case that an admin accidentally starts up from an old image, this can 
> be confusing: safemode and fsck will show "corrupt files", which are the 
> files that have actually been deleted but got resurrected by restarting from 
> the old image. This will convince them that they can safely force-leave 
> safemode and remove these files -- after all, they know that those files 
> should really have been deleted. However, they're not aware that leaving 
> safemode will also unrecoverably delete a bunch of other block files which 
> have been orphaned due to the namespace rollback.
> I'd like to consider reporting something like: "90 of expected 100 
> blocks have been reported. Additionally, 1 block has been reported 
> which does not correspond to any file in the namespace. Forcing exit of 
> safemode will unrecoverably remove those data blocks."
> Whether this statistic is also used for some kind of "inverse safe mode" is 
> the logical next step, but just reporting it as a warning seems easy enough 
> to accomplish and worth doing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9297) Update TestBlockMissingException to use corruptBlockOnDataNodesByDeletingBlockFile()

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972285#comment-14972285
 ] 

Hudson commented on HDFS-9297:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8699 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8699/])
HDFS-9297. Update TestBlockMissingException to use (lei: rev 
5679e46b7f867f8f7f8195c86c37e3db7b23d7d7)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockMissingException.java


> Update TestBlockMissingException to use 
> corruptBlockOnDataNodesByDeletingBlockFile()
> 
>
> Key: HDFS-9297
> URL: https://issues.apache.org/jira/browse/HDFS-9297
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: HDFS, test
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Trivial
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9297.001.patch
>
>
> TestBlockMissingException uses its own function to corrupt a block by 
> deleting all its block files. HDFS-7235 introduced a helper function 
> {{corruptBlockOnDataNodesByDeletingBlockFile()}} that does exactly the same 
> thing. We can update this test to use the helper function.
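
A sketch of the intended cleanup, assuming the usual {{MiniDFSCluster}}/{{DFSTestUtil}} test scaffolding (the file name and variables are illustrative):

{code:java}
@Test
public void testBlockMissingException() throws Exception {
  // Assumes `cluster` (MiniDFSCluster) and `fs` (FileSystem) are set up in
  // @Before, as the existing test already does.
  Path file = new Path("/testfile");
  DFSTestUtil.createFile(fs, file, 1024, (short) 3, 0L);
  ExtendedBlock blk = DFSTestUtil.getFirstBlock(fs, file);
  // HDFS-7235 helper: deletes every replica's block file on the datanodes,
  // replacing the test's hand-rolled corruption loop.
  int corrupted = cluster.corruptBlockOnDataNodesByDeletingBlockFile(blk);
  assertTrue("expected at least one replica deleted", corrupted > 0);
}
{code}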



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972281#comment-14972281
 ] 

Hadoop QA commented on HDFS-9260:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  18m 12s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | @author |   0m  0s | The patch appears to contain 1 
@author tags which the Hadoop  community has agreed to not allow in code 
contributions. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 19 new or modified test files. |
| {color:green}+1{color} | javac |   7m 55s | There were no new javac warning 
messages. |
| {color:red}-1{color} | javadoc |  10m 20s | The applied patch generated  3  
additional warning messages. |
| {color:green}+1{color} | release audit |   0m 26s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 25s | The applied patch generated  
77 new checkstyle issues (total was 883, now 952). |
| {color:red}-1{color} | whitespace |   0m 33s | The patch has 3  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 29s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 35s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   2m 33s | The patch appears to introduce 3 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m  9s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  54m 34s | Tests failed in hadoop-hdfs. |
| | | 101m 17s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.server.balancer.TestBalancer |
| Timed out tests | 
org.apache.hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768437/HDFS-7435.006.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 15eb84b |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13175/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| javadoc | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13175/artifact/patchprocess/diffJavadocWarnings.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/13175/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13175/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13175/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13175/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13175/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13175/console |


This message was automatically generated.

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch
>
>
> This patch changes the data structures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC-friendly handling of full 
> block reports.
> I would like to hear people's feedback on this change, and would also 
> appreciate some help investigating/understanding a few outstanding issues if 
> we are interested in moving forward with this.
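
A toy sketch of why sorted structures help here: if both the stored replica list and an incoming full block report are sorted by block id, one merge pass classifies every block without per-block hash lookups or temporary objects (the arrays below are illustrative):

{code:java}
// Toy merge of a sorted stored-replica list against a sorted full block
// report; a single linear pass finds additions and removals.
public class SortedReportSketch {
  public static void main(String[] args) {
    long[] stored = {2, 5, 9}; // block ids currently recorded for a datanode
    long[] report = {2, 7, 9}; // sorted ids from a full block report
    int i = 0, j = 0;
    while (i < stored.length || j < report.length) {
      if (j == report.length || (i < stored.length && stored[i] < report[j])) {
        System.out.println("removed: " + stored[i++]);
      } else if (i == stored.length || report[j] < stored[i]) {
        System.out.println("added: " + report[j++]);
      } else {
        i++; j++; // present in both: unchanged
      }
    }
  }
}
{code}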



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9098) Erasure coding: emulate race conditions among striped streamers in write pipeline

2015-10-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972280#comment-14972280
 ] 

Hadoop QA commented on HDFS-9098:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  20m 28s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m  9s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 29s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 40s | The applied patch generated  7 
new checkstyle issues (total was 116, now 122). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   4m 39s | The patch appears to introduce 3 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m  8s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  48m 40s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 34s | Tests passed in 
hadoop-hdfs-client. |
| | | 101m 19s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-client |
| Failed unit tests | 
hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
|   | hadoop.hdfs.TestStripedDataStreamers |
|   | hadoop.hdfs.TestCrcCorruption |
|   | hadoop.hdfs.server.namenode.TestEditLog |
| Timed out tests | 
org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | org.apache.hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
|   | org.apache.hadoop.hdfs.server.blockmanagement.TestReplicationPolicy |
|   | org.apache.hadoop.hdfs.server.namenode.TestLargeDirectoryDelete |
|   | org.apache.hadoop.hdfs.TestDatanodeDeath |
|   | org.apache.hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks 
|
|   | org.apache.hadoop.hdfs.TestDecommission |
|   | org.apache.hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
|   | 
org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks |
|   | org.apache.hadoop.hdfs.TestDFSRemove |
|   | org.apache.hadoop.hdfs.server.namenode.TestListCorruptFileBlocks |
|   | org.apache.hadoop.hdfs.server.namenode.TestRecoverStripedBlocks |
|   | org.apache.hadoop.hdfs.TestDFSUpgrade |
|   | 
org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration |
|   | org.apache.hadoop.hdfs.server.namenode.TestGenericJournalConf |
|   | org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache |
|   | org.apache.hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | org.apache.hadoop.hdfs.server.namenode.TestDeleteRace |
|   | org.apache.hadoop.hdfs.TestParallelShortCircuitReadUnCached |
|   | org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768447/HDFS-9098.00.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 15eb84b |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13174/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/13174/artifact/patchprocess/diffcheckstylehadoop-hdfs-client.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13174/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs-client.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13174/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13174/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13174/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13174/console |


This message was automatically generated.

> Erasure coding: emulate race conditions among striped streamers in write 
> pipeline
> -
>
> Key: HDFS-9098

[jira] [Commented] (HDFS-9301) HDFS clients can't construct HdfsConfiguration instances

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972272#comment-14972272
 ] 

Hudson commented on HDFS-9301:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #590 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/590/])
HDFS-9301. HDFS clients can't construct HdfsConfiguration instances. (wheat9: 
rev 15eb84b37e6c0195d59d3a29fbc5b7417bf022ff)
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/HdfsConfigurationLoader.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java


> HDFS clients can't construct HdfsConfiguration instances
> 
>
> Key: HDFS-9301
> URL: https://issues.apache.org/jira/browse/HDFS-9301
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Steve Loughran
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9241.000.patch, HDFS-9241.001.patch, 
> HDFS-9241.002.patch, HDFS-9241.003.patch, HDFS-9241.004.patch, 
> HDFS-9241.005.patch
>
>
> The changes for the hdfs client classpath make instantiating 
> {{HdfsConfiguration}} from the client impossible; it only lives server side. 
> This breaks any app which creates one.
> I know people will look at the {{@Private}} tag and say "don't do that then", 
> but it's worth considering precisely why I, at least, do this: it's the only 
> way to guarantee that the hdfs-default and hdfs-site resources get on the 
> classpath, including all the security settings. It's precisely the use case 
> which {{HdfsConfigurationLoader.init();}} offers internally to the hdfs code.
> What am I meant to do now? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9290) DFSClient#callAppend() is not backward compatible for slightly older NameNodes

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972271#comment-14972271
 ] 

Hudson commented on HDFS-9290:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #590 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/590/])
HDFS-9290. DFSClient#callAppend() is not backward compatible for (kihwal: rev 
b9e0417bdf2b9655dc4256bdb43683eca1ab46be)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java


> DFSClient#callAppend() is not backward compatible for slightly older NameNodes
> --
>
> Key: HDFS-9290
> URL: https://issues.apache.org/jira/browse/HDFS-9290
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Blocker
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-9290.001.patch, HDFS-9290.002.patch
>
>
> HDFS-7210 combined the 2 RPC calls used at file append into a single one. 
> Specifically, {{getFileInfo()}} is combined with {{append()}}. While backward 
> compatibility for older clients is handled by the new NameNode (via 
> protobuf), a newer client's {{append()}} call does not work with older 
> NameNodes. One will run into an exception like the following:
> {code:java}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.isLazyPersist(DFSOutputStream.java:1741)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.getChecksum4Compute(DFSOutputStream.java:1550)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.(DFSOutputStream.java:1560)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.(DFSOutputStream.java:1670)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForAppend(DFSOutputStream.java:1717)
> at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1861)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1922)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1892)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:340)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:336)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:336)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:318)
> at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1164)
> {code}
> The cause is that the new client code expects both the last block and the 
> file info in the same RPC, but the old NameNode replies with only the first. 
> The exception itself does not reflect this, and one will have to look at the 
> HDFS source code to really understand what happened.
> We can have the client detect that it's talking to an old NameNode and send 
> an extra {{getFileInfo()}} RPC. Or we should improve the exception being 
> thrown to accurately reflect the cause of failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8831) Trash Support for deletion in HDFS encryption zone

2015-10-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972268#comment-14972268
 ] 

Hadoop QA commented on HDFS-8831:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  22m 16s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 3 new or modified test files. |
| {color:red}-1{color} | javac |   8m 58s | The applied patch generated  2  
additional warning messages. |
| {color:green}+1{color} | javadoc |  12m 36s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 28s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 46s | The applied patch generated  5 
new checkstyle issues (total was 199, now 201). |
| {color:red}-1{color} | whitespace |   0m  4s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 52s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 42s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   9m 39s | The patch appears to introduce 3 
new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests |   7m 33s | Tests failed in 
hadoop-common. |
| {color:red}-1{color} | hdfs tests |  54m 33s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 32s | Tests passed in 
hadoop-hdfs-client. |
| | | 123m  1s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-common |
| FindBugs | module:hadoop-hdfs-client |
| Failed unit tests | hadoop.ipc.TestDecayRpcScheduler |
|   | hadoop.net.TestDNS |
|   | hadoop.hdfs.server.datanode.TestFsDatasetCache |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.util.TestByteArrayManager |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768402/HDFS-8831.01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 15eb84b |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13173/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| javac | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13173/artifact/patchprocess/diffJavacWarnings.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/13173/artifact/patchprocess/diffcheckstylehadoop-common.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13173/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13173/artifact/patchprocess/newPatchFindbugsWarningshadoop-common.html
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13173/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs-client.html
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13173/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13173/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13173/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13173/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13173/console |


This message was automatically generated.

> Trash Support for deletion in HDFS encryption zone
> --
>
> Key: HDFS-8831
> URL: https://issues.apache.org/jira/browse/HDFS-8831
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: encryption
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
> Attachments: HDFS-8831-10152015.pdf, HDFS-8831.00.patch, 
> HDFS-8831.01.patch
>
>
> Currently, "Soft Delete" is only supported if the whole encryption zone is 
> deleted. If you delete files whinin the zone with trash feature enabled, you 
> will get error similar to the following 
> {code}
> rm: Failed to move to trash: hdfs://HW11217.local:9000/z1_1/startnn.sh: 
> /z1_1/startnn.sh can't be moved from an encryption zone.
> {code}
> With HDFS-8830, we can support "Soft Delete" by adding the .Trash folder of 
> the file being deleted appropriately to the same encryption zone.
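
A sketch of the behavior the fix aims for, using the existing {{Trash}} API; resolving the trash root to a directory inside the zone is exactly what HDFS-8830/HDFS-8831 provide, and the path below is illustrative:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.Trash;

// Hypothetical sketch: with a per-encryption-zone .Trash directory, a
// client-side delete can move the file to a trash root inside the same
// zone instead of failing with "can't be moved from an encryption zone".
public class EzTrashSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path inZone = new Path("/z1_1/startnn.sh"); // file inside the zone
    boolean moved = Trash.moveToAppropriateTrash(fs, inZone, conf);
    System.out.println("moved to trash: " + moved);
  }
}
{code}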

[jira] [Commented] (HDFS-9231) fsck doesn't explicitly list when Bad Replicas/Blocks are in a snapshot

2015-10-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972259#comment-14972259
 ] 

Hadoop QA commented on HDFS-9231:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  23m 51s | Pre-patch trunk has 1 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |  10m 34s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  13m 58s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 31s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 58s | The applied patch generated  5 
new checkstyle issues (total was 427, now 430). |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   2m  1s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 46s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 30s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   4m 19s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  63m 51s | Tests failed in hadoop-hdfs. |
| | | 125m 24s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.TestDFSUpgradeFromImage |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768427/HDFS-9231.007.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 15eb84b |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13172/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/13172/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13172/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13172/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13172/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13172/console |


This message was automatically generated.

> fsck doesn't explicitly list when Bad Replicas/Blocks are in a snapshot
> ---
>
> Key: HDFS-9231
> URL: https://issues.apache.org/jira/browse/HDFS-9231
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: snapshots
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-9231.001.patch, HDFS-9231.002.patch, 
> HDFS-9231.003.patch, HDFS-9231.004.patch, HDFS-9231.005.patch, 
> HDFS-9231.006.patch, HDFS-9231.007.patch
>
>
> Currently for snapshot files, {{fsck -list-corruptfileblocks}} shows corrupt 
> blocks with the original file dir instead of the snapshot dir, and {{fsck 
> -list-corruptfileblocks -includeSnapshots}} behaves the same.
> This can be confusing because even when the original file is deleted, fsck 
> will still show that deleted file as corrupted, although what's actually 
> corrupted is the snapshot. 
> As a side note, {{fsck -files -includeSnapshots}} shows the snapshot dirs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9264) Minor cleanup of operations on FsVolumeList#volumes

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972243#comment-14972243
 ] 

Hudson commented on HDFS-9264:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2467 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2467/])
HDFS-9264. Minor cleanup of operations on FsVolumeList#volumes.  (Walter (lei: 
rev 533a2be5ac7c7f0473fdd24d6201582d08964e21)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java


> Minor cleanup of operations on FsVolumeList#volumes
> ---
>
> Key: HDFS-9264
> URL: https://issues.apache.org/jira/browse/HDFS-9264
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Walter Su
>Assignee: Walter Su
>Priority: Minor
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9264.01.patch, HDFS-9264.02.patch
>
>
> We can use {{CopyOnWriteArrayList}} to simplify the operations on 
> {{FsVolumeList#volumes}}.
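
A minimal sketch of the simplification ({{Volume}} stands in for {{FsVolumeImpl}}): {{CopyOnWriteArrayList}} gives atomic add/remove plus lock-free snapshot iteration, replacing a hand-rolled copy-and-swap of an internal array.

{code:java}
import java.util.concurrent.CopyOnWriteArrayList;

public class VolumeListSketch {
  static class Volume {
    final String dir;
    Volume(String dir) { this.dir = dir; }
  }

  private final CopyOnWriteArrayList<Volume> volumes =
      new CopyOnWriteArrayList<>();

  void addVolume(Volume v)    { volumes.add(v); }
  void removeVolume(Volume v) { volumes.remove(v); }

  int numberOfVolumes() {
    // Iteration sees a consistent snapshot, even during concurrent add/remove.
    int n = 0;
    for (Volume ignored : volumes) {
      n++;
    }
    return n;
  }
}
{code}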



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972244#comment-14972244
 ] 

Hudson commented on HDFS-9184:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2467 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2467/])
HDFS-9184. Logging HDFS operation's caller context into audit logs. (jitendra: 
rev 600ad7bf4104bcaeec00a4089d59bb1fdf423299)
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAuditLogAtDebug.java
* hadoop-common-project/hadoop-common/src/main/proto/RpcHeader.proto
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ProtoUtil.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAuditLogger.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/HdfsAuditLogger.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/CallerContext.java


> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9184.000.patch, HDFS-9184.001.patch, 
> HDFS-9184.002.patch, HDFS-9184.003.patch, HDFS-9184.004.patch, 
> HDFS-9184.005.patch, HDFS-9184.006.patch, HDFS-9184.007.patch, 
> HDFS-9184.008.patch, HDFS-9184.009.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper-level job issued it. The upper-level callers may be specific 
> Oozie tasks, MR jobs, and Hive queries. One scenario is that the namenode 
> (NN) is abused/spammed; the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which is 
> obviously not enough. It's common that the same user issues multiple jobs 
> at the same time. Even for a single top-level task, tracking back to a 
> specific caller in a chain of operations of the whole workflow (e.g. Oozie -> 
> Hive -> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. The spans are created in many places, interconnected 
> like a tree structure, which relies on offline analysis across RPC 
> boundaries. For this use case, {{htrace}} has to be enabled at a 100% 
> sampling rate, which introduces significant overhead. Moreover, passing 
> additional information (via annotations) other than the span id from the 
> root of the tree to a leaf is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there 
> is some related discussion on this topic. The final patch implemented the 
> tracking id as a part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, Kerberos 
> authenticated connections or insecure connections don't have tokens. 
> [HADOOP-8779] proposes to use tokens in all the scenarios, but that might 
> mean changes to several upstream projects and is a major change in their 
> security implementation.
> We propose another approach to address this problem. We also treat the HDFS 
> audit log as a good place for after-the-fact root cause analysis. We propose 
> to put the caller id (e.g. Hive query id) in threadlocals. Specifically, on 
> the client side the threadlocal object is passed to the NN as an (optional) 
> part of the RPC header, while on the server side the NN retrieves it from 
> the header and puts it into the {{Handler}}'s threadlocals. Finally, in 
> {{FSNamesystem}}, the HDFS audit logger will record the caller context for 
> each operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client chooses whether to provide its signature along with the caller 
> id. The operator may need to validate the signature at the time of offline 
> analysis. The NN is not responsible for validating the signature online.
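
A sketch of the client-side usage this design enables; {{CallerContext}} is the class this patch adds under {{org.apache.hadoop.ipc}}, so the exact builder/accessor names below are assumptions:

{code:java}
import org.apache.hadoop.ipc.CallerContext;

// Hypothetical sketch: an upper-level job tags the current thread with a
// caller context before issuing HDFS operations; the RPC layer forwards it
// in the (optional) header and the NN audit logger records it.
public class CallerContextSketch {
  public static void main(String[] args) {
    CallerContext context = new CallerContext.Builder("hive_query_id:q_42")
        .setSignature("sig-bytes".getBytes()) // the optional signature field
        .build();
    CallerContext.setCurrent(context); // threadlocal, per this proposal
    // ... FileSystem calls made on this thread now carry the context ...
    System.out.println(CallerContext.getCurrent());
  }
}
{code}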



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972241#comment-14972241
 ] 

Hudson commented on HDFS-8808:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2467 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2467/])
HDFS-8808. dfs.image.transfer.bandwidthPerSec should not apply to (zhz: rev 
ab3c4cff4af338caaa23be0ec383fc1fe473714f)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCheckpoint.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/BootstrapStandby.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/TransferFsImage.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/Checkpointer.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ImageServlet.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestBootstrapStandby.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
> 
>
> Key: HDFS-8808
> URL: https://issues.apache.org/jira/browse/HDFS-8808
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Gautam Gopalakrishnan
>Assignee: Zhe Zhang
> Fix For: 2.8.0
>
> Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch, 
> HDFS-8808-02.patch, HDFS-8808-03.patch, HDFS-8808.04.patch
>
>
> The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the 
> speed with which the fsimage is copied between the namenodes during regular 
> use. However, as a side effect, this also limits transfers when the 
> {{-bootstrapStandby}} option is used. This option is often used during 
> upgrades and could potentially slow down the entire workflow. The request 
> here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth 
> setting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9290) DFSClient#callAppend() is not backward compatible for slightly older NameNodes

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1497#comment-1497
 ] 

Hudson commented on HDFS-9290:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #577 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/577/])
HDFS-9290. DFSClient#callAppend() is not backward compatible for (kihwal: rev 
b9e0417bdf2b9655dc4256bdb43683eca1ab46be)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java


> DFSClient#callAppend() is not backward compatible for slightly older NameNodes
> --
>
> Key: HDFS-9290
> URL: https://issues.apache.org/jira/browse/HDFS-9290
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Blocker
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-9290.001.patch, HDFS-9290.002.patch
>
>
> HDFS-7210 combined the 2 RPC calls used at file append into a single one. 
> Specifically, {{getFileInfo()}} is combined with {{append()}}. While backward 
> compatibility for older clients is handled by the new NameNode (via 
> protobuf), a newer client's {{append()}} call does not work with older 
> NameNodes. One will run into an exception like the following:
> {code:java}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.isLazyPersist(DFSOutputStream.java:1741)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.getChecksum4Compute(DFSOutputStream.java:1550)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.(DFSOutputStream.java:1560)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.(DFSOutputStream.java:1670)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForAppend(DFSOutputStream.java:1717)
> at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1861)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1922)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1892)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:340)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:336)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:336)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:318)
> at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1164)
> {code}
> The cause is that the new client code is expecting both the last block and 
> file info in the same RPC but the old NameNode only replied with the first. 
> The exception itself does not reflect this and one will have to look at the 
> HDFS source code to really understand what happened.
> We can have the client detect that it's talking to an old NameNode and send 
> an extra {{getFileInfo()}} RPC. Or we should improve the exception being 
> thrown to accurately reflect the cause of failure.
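> A minimal sketch of the client-side fallback (illustrative only; the names 
> mirror {{DFSClient#callAppend()}} but the exact surrounding code is assumed):
> {code:java}
> // If an older NameNode returned no file status together with the last
> // block, fall back to the pre-HDFS-7210 behavior of a separate RPC.
> LastBlockWithStatus blkWithStatus = namenode.append(src, clientName, flag);
> HdfsFileStatus status = blkWithStatus.getFileStatus();
> if (status == null) {
>   // Talking to a pre-HDFS-7210 NameNode: fetch the status explicitly.
>   status = namenode.getFileInfo(src);
> }
> {code}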



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9301) HDFS clients can't construct HdfsConfiguration instances

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972223#comment-14972223
 ] 

Hudson commented on HDFS-9301:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #577 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/577/])
HDFS-9301. HDFS clients can't construct HdfsConfiguration instances. (wheat9: 
rev 15eb84b37e6c0195d59d3a29fbc5b7417bf022ff)
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/HdfsConfigurationLoader.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java


> HDFS clients can't construct HdfsConfiguration instances
> 
>
> Key: HDFS-9301
> URL: https://issues.apache.org/jira/browse/HDFS-9301
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Steve Loughran
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9241.000.patch, HDFS-9241.001.patch, 
> HDFS-9241.002.patch, HDFS-9241.003.patch, HDFS-9241.004.patch, 
> HDFS-9241.005.patch
>
>
> The changes for the hdfs client classpath make instantiating 
> {{HdfsConfiguration}} from the client impossible; it now only lives server 
> side. This breaks any app which creates one.
> I know people will look at the {{@Private}} tag and say "don't do that then", 
> but it's worth considering precisely why I, at least, do this: it's the only 
> way to guarantee that the hdfs-default and hdfs-site resources get on the 
> classpath, including all the security settings. It's precisely the use case 
> which {{HdfsConfigurationLoader.init();}} offers internally to the hdfs code.
> What am I meant to do now? 
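> For context, the pattern being broken is simply this (a sketch of the use 
> case described above):
> {code:java}
> // Constructing an HdfsConfiguration registers hdfs-default.xml and
> // hdfs-site.xml as default resources, so all HDFS and security settings
> // are picked up before any keys are read.
> Configuration conf = new HdfsConfiguration();
> FileSystem fs = FileSystem.get(conf);
> {code}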



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9297) Update TestBlockMissingException to use corruptBlockOnDataNodesByDeletingBlockFile()

2015-10-23 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-9297:

   Resolution: Fixed
Fix Version/s: 2.8.0
   3.0.0
   Status: Resolved  (was: Patch Available)

+1. Committed.

Thanks, [~twu]

> Update TestBlockMissingException to use 
> corruptBlockOnDataNodesByDeletingBlockFile()
> 
>
> Key: HDFS-9297
> URL: https://issues.apache.org/jira/browse/HDFS-9297
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: HDFS, test
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Trivial
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9297.001.patch
>
>
> TestBlockMissingException uses its own function to corrupt a block by 
> deleting all its block files. HDFS-7235 introduced a helper function 
> {{corruptBlockOnDataNodesByDeletingBlockFile()}} that does exactly the same 
> thing. We can update this test to use the helper function.
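> A sketch of the simplified test body, assuming the HDFS-7235 helper on 
> {{MiniDFSCluster}}:
> {code:java}
> // Reuse the shared helper that deletes the block files on all DataNodes
> // holding the block, instead of a test-local corruption routine.
> ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, filePath);
> cluster.corruptBlockOnDataNodesByDeletingBlockFile(block);
> {code}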



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9289) check genStamp when complete file

2015-10-23 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972209#comment-14972209
 ] 

Zhe Zhang commented on HDFS-9289:
-

[~lichangleo] Thanks for reporting the issue. 

bq. but the file complete with the old block genStamp.
How did that happen? So the client somehow had an old GS? IIUC, the 
{{updatePipeline}} protocol is as below (using {{client_GS}}, {{DN_GS}}, and 
{{NN_GS}} to denote the 3 copies of the GS):
# Client asks for new GS from NN through {{updateBlockForPipeline}}. After 
this, {{client_GS}} is new, both {{DN_GS}} and {{NN_GS}} are old
# Client calls {{createBlockOutputStream}} to update DN's GS. After this, both 
{{client_GS}} and {{DN_GS}} are new, {{NN_GS}} is old
# Client calls {{updatePipeline}}. After this, all 3 GSes should be new

Maybe step 3 failed, and then the client tried to complete the file? It'd be 
ideal if you could extend the unit test to reproduce the error without the fix 
(or paste the error log). Thanks!

> check genStamp when complete file
> -
>
> Key: HDFS-9289
> URL: https://issues.apache.org/jira/browse/HDFS-9289
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
>Priority: Critical
> Attachments: HDFS-9289.1.patch, HDFS-9289.2.patch
>
>
> We have seen a case of a corrupt block caused by a file complete after a 
> pipelineUpdate, where the file completed with the old block genStamp. This 
> caused the replicas of two datanodes in the updated pipeline to be viewed as 
> corrupt. Propose to check the genStamp when committing the block.
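> A sketch of the proposed check at commit time (illustrative names; the exact 
> placement in the commit path is assumed):
> {code:java}
> // Reject a commit whose generation stamp does not match the stamp the
> // NameNode recorded during the last pipeline update.
> if (commitBlock.getGenerationStamp() != storedBlock.getGenerationStamp()) {
>   throw new IOException("Commit block with mismatching GS. NN has "
>       + storedBlock + ", client submits " + commitBlock);
> }
> {code}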



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972199#comment-14972199
 ] 

Hudson commented on HDFS-8808:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #531 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/531/])
HDFS-8808. dfs.image.transfer.bandwidthPerSec should not apply to (zhz: rev 
ab3c4cff4af338caaa23be0ec383fc1fe473714f)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/Checkpointer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/BootstrapStandby.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestBootstrapStandby.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCheckpoint.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ImageServlet.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/TransferFsImage.java


> dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
> 
>
> Key: HDFS-8808
> URL: https://issues.apache.org/jira/browse/HDFS-8808
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Gautam Gopalakrishnan
>Assignee: Zhe Zhang
> Fix For: 2.8.0
>
> Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch, 
> HDFS-8808-02.patch, HDFS-8808-03.patch, HDFS-8808.04.patch
>
>
> The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the 
> speed with which the fsimage is copied between the namenodes during regular 
> use. However, as a side effect, this also limits transfers when the 
> {{-bootstrapStandby}} option is used. This option is often used during 
> upgrades and could potentially slow down the entire workflow. The request 
> here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth 
> setting



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9264) Minor cleanup of operations on FsVolumeList#volumes

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972201#comment-14972201
 ] 

Hudson commented on HDFS-9264:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #531 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/531/])
HDFS-9264. Minor cleanup of operations on FsVolumeList#volumes.  (Walter (lei: 
rev 533a2be5ac7c7f0473fdd24d6201582d08964e21)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java


> Minor cleanup of operations on FsVolumeList#volumes
> ---
>
> Key: HDFS-9264
> URL: https://issues.apache.org/jira/browse/HDFS-9264
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Walter Su
>Assignee: Walter Su
>Priority: Minor
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9264.01.patch, HDFS-9264.02.patch
>
>
> We can use {{CopyOnWriteArrayList}} to simplify the operations on 
> {{FsVolumeList#volumes}}
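> A sketch of the idea (field and method names are illustrative):
> {code:java}
> private final CopyOnWriteArrayList<FsVolumeImpl> volumes =
>     new CopyOnWriteArrayList<>();
>
> void addVolume(FsVolumeImpl volume) {
>   volumes.add(volume);  // mutation swaps the backing array atomically
> }
>
> void visitVolumes() {
>   // Iteration works on a snapshot: no explicit lock and no
>   // ConcurrentModificationException, even with concurrent add/remove.
>   for (FsVolumeImpl v : volumes) {
>     // ... inspect v ...
>   }
> }
> {code}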



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972202#comment-14972202
 ] 

Hudson commented on HDFS-9184:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #531 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/531/])
HDFS-9184. Logging HDFS operation's caller context into audit logs. (jitendra: 
rev 600ad7bf4104bcaeec00a4089d59bb1fdf423299)
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ProtoUtil.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* hadoop-common-project/hadoop-common/src/main/proto/RpcHeader.proto
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/CallerContext.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAuditLogger.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/HdfsAuditLogger.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAuditLogAtDebug.java


> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9184.000.patch, HDFS-9184.001.patch, 
> HDFS-9184.002.patch, HDFS-9184.003.patch, HDFS-9184.004.patch, 
> HDFS-9184.005.patch, HDFS-9184.006.patch, HDFS-9184.007.patch, 
> HDFS-9184.008.patch, HDFS-9184.009.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper level job issues it. The upper level callers may be specific 
> Oozie tasks, MR jobs, and Hive queries. One scenario is that the namenode 
> (NN) is abused/spammed; the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which is 
> obviously not enough. It's common that the same user issues multiple jobs at 
> the same time. Even for a single top level task, tracking back to a specific 
> caller in a chain of operations of the whole workflow (e.g. Oozie -> Hive -> 
> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. The span is created in many places, interconnected 
> like a tree structure, which relies on offline analysis across the RPC 
> boundary. For this use case, {{htrace}} has to be enabled at a 100% sampling 
> rate, which introduces significant overhead. Moreover, passing additional 
> information (via annotations) other than the span id from the root of the 
> tree to a leaf is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there 
> is some related discussion on this topic. The final patch implemented the 
> tracking id as a part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, kerberos 
> authenticated connections or insecure connections don't have tokens. 
> [HADOOP-8779] proposes to use tokens in all the scenarios, but that might 
> mean changes to several upstream projects and is a major change in their 
> security implementation.
> We propose another approach to address this problem. We also treat the HDFS 
> audit log as a good place for after-the-fact root cause analysis. We propose 
> to put the caller id (e.g. Hive query id) in threadlocals. Specifically, on 
> the client side the threadlocal object is passed to the NN as a part of the 
> RPC header (optional), while on the server side the NN retrieves it from the 
> header and puts it into the {{Handler}}'s threadlocals. Finally, in 
> {{FSNamesystem}}, the HDFS audit logger will record the caller context for 
> each operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client may choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis. 
> The NN is not responsible for validating the signature online.
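> A sketch of the intended client-side usage (the exact API shape in the 
> committed patch may differ slightly):
> {code:java}
> // The upper-level job tags the current thread before issuing HDFS calls;
> // the context rides in the RPC header and lands in the NN audit log.
> CallerContext context = new CallerContext.Builder("hive_query_id:q_42")
>     .setSignature("sig".getBytes(StandardCharsets.UTF_8))  // optional
>     .build();
> CallerContext.setCurrent(context);
> // ... any FileSystem operation on this thread is now attributed to q_42 ...
> {code}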



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-4015) Safemode should count and report orphaned blocks

2015-10-23 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972192#comment-14972192
 ] 

Arpit Agarwal commented on HDFS-4015:
-

The build failure looks unrelated to the patch. Since the only Jenkins failure 
flagged by the v6 patch was fixed by a trivial whitespace change, I will commit 
the v7 patch shortly.

> Safemode should count and report orphaned blocks
> 
>
> Key: HDFS-4015
> URL: https://issues.apache.org/jira/browse/HDFS-4015
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Todd Lipcon
>Assignee: Anu Engineer
> Attachments: HDFS-4015.001.patch, HDFS-4015.002.patch, 
> HDFS-4015.003.patch, HDFS-4015.004.patch, HDFS-4015.005.patch, 
> HDFS-4015.006.patch, HDFS-4015.007.patch
>
>
> The safemode status currently reports the number of unique reported blocks 
> compared to the total number of blocks referenced by the namespace. However, 
> it does not report the inverse: blocks which are reported by datanodes but 
> not referenced by the namespace.
> In the case that an admin accidentally starts up from an old image, this can 
> be confusing: safemode and fsck will show "corrupt files", which are the 
> files which actually have been deleted but got resurrected by restarting from 
> the old image. This will convince them that they can safely force leave 
> safemode and remove these files -- after all, they know that those files 
> should really have been deleted. However, they're not aware that leaving 
> safemode will also unrecoverably delete a bunch of other block files which 
> have been orphaned due to the namespace rollback.
> I'd like to consider reporting something like: "90 of expected 100 
> blocks have been reported. Additionally, 1 blocks have been reported 
> which do not correspond to any file in the namespace. Forcing exit of 
> safemode will unrecoverably remove those data blocks"
> Whether this statistic is also used for some kind of "inverse safe mode" is 
> the logical next step, but just reporting it as a warning seems easy enough 
> to accomplish and worth doing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9282) Make data directory count and storage raw capacity related tests FsDataset-agnostic

2015-10-23 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972179#comment-14972179
 ] 

Lei (Eddy) Xu commented on HDFS-9282:
-

Thanks a lot for the patch, [~twu]

* Could you fix the whitespace warning? It is related to this patch.

{code}
   /**
 * Get the default number of data directories for underlying storage per
 * DataNode. Used by MiniDFSCluster to initialize cluster.
{code}

We can remove {{Used by MiniDFSCluster...}}.

* {{FsDatasetTestUtils#getNumOfDataDirs()}} should also be renamed to 
{{getDefaultNumOfDataDirs()}} to keep it consistent with the Factory.

* Let's write the following code on separate lines. 
{code}
public int getNumOfDataDirs() { return this.defaultNumOfDataDirs; }
{code}

* {{public static final int defaultNumOfDataDirs = 2;}} should be 
{{DEFAULT_NUM_OF_DATA_DIRS}}, and could you add some comments for it?

* The {{FsVolumeReferences}} from the following code should be closed to 
release reference counts after usage.
{code}
DF df = new DF(new File(
dataset.getFsVolumeReferences().get(0).getBasePath()),
dataset.datanode.getConf());
{code}

The same applies to {{assert (dataset.getFsVolumeReferences().size() != 0);}}. 
Btw, can we use {{Preconditions.checkState}} to do the checks here?
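For example, a sketch of the suggested shape (assuming {{FsVolumeReferences}} 
is {{Closeable}}):
{code}
try (FsDatasetSpi.FsVolumeReferences refs =
    dataset.getFsVolumeReferences()) {
  // Preconditions replaces the bare assert, and the volume references
  // are released when the block exits.
  Preconditions.checkState(refs.size() != 0);
  DF df = new DF(new File(refs.get(0).getBasePath()),
      dataset.datanode.getConf());
  // ... use df ...
}
{code}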

> Make data directory count and storage raw capacity related tests 
> FsDataset-agnostic
> ---
>
> Key: HDFS-9282
> URL: https://issues.apache.org/jira/browse/HDFS-9282
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: HDFS, test
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Minor
> Attachments: HDFS-9282.001.patch
>
>
> DFSMiniCluster and several tests have a hard-coded assumption that the 
> underlying storage has 2 data directories (volumes). As HDFS-9188 pointed 
> out, with new FsDataset implementations, these hard-coded assumptions about 
> the number of data directories and raw capacities of storage may change as 
> well.
> We need to extend FsDatasetTestUtils to provide:
> * Number of data directories of underlying storage per DataNode
> * Raw storage capacity of underlying storage per DataNode.
> * Have MiniDFSCluster automatically pick up the correct values.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-4015) Safemode should count and report orphaned blocks

2015-10-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972163#comment-14972163
 ] 

Hadoop QA commented on HDFS-4015:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  24m 37s | Findbugs (version 3.0.0) 
appears to be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   8m 57s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  11m 43s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 26s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | site |   3m 51s | Site still builds. |
| {color:red}-1{color} | checkstyle |   2m 42s | The applied patch generated  3 
new checkstyle issues (total was 138, now 138). |
| {color:green}+1{color} | whitespace |   0m  3s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   2m  5s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 42s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   5m 38s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 56s | Pre-build of native portion |
| {color:green}+1{color} | hdfs tests |  57m 32s | Tests passed in hadoop-hdfs. 
|
| {color:red}-1{color} | hdfs tests |   0m 36s | Tests failed in 
hadoop-hdfs-client. |
| | | 122m 52s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed build | hadoop-hdfs-client |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768388/HDFS-4015.007.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle site |
| git revision | trunk / 15eb84b |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/13168/artifact/patchprocess/diffcheckstylehadoop-hdfs-client.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13168/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13168/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13168/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13168/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13168/console |


This message was automatically generated.

> Safemode should count and report orphaned blocks
> 
>
> Key: HDFS-4015
> URL: https://issues.apache.org/jira/browse/HDFS-4015
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Todd Lipcon
>Assignee: Anu Engineer
> Attachments: HDFS-4015.001.patch, HDFS-4015.002.patch, 
> HDFS-4015.003.patch, HDFS-4015.004.patch, HDFS-4015.005.patch, 
> HDFS-4015.006.patch, HDFS-4015.007.patch
>
>
> The safemode status currently reports the number of unique reported blocks 
> compared to the total number of blocks referenced by the namespace. However, 
> it does not report the inverse: blocks which are reported by datanodes but 
> not referenced by the namespace.
> In the case that an admin accidentally starts up from an old image, this can 
> be confusing: safemode and fsck will show "corrupt files", which are the 
> files which actually have been deleted but got resurrected by restarting from 
> the old image. This will convince them that they can safely force leave 
> safemode and remove these files -- after all, they know that those files 
> should really have been deleted. However, they're not aware that leaving 
> safemode will also unrecoverably delete a bunch of other block files which 
> have been orphaned due to the namespace rollback.
> I'd like to consider reporting something like: "90 of expected 100 
> blocks have been reported. Additionally, 1 blocks have been reported 
> which do not correspond to any file in the namespace. Forcing exit of 
> safemode will unrecoverably remove those data blocks"
> Whether this statistic is also used for some kind of "inverse safe mode" is 
> the logical next step, but just reporting it as a warning seems easy enough 
> to accomplish and worth doing.

[jira] [Updated] (HDFS-7984) webhdfs:// needs to support provided delegation tokens

2015-10-23 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-7984:
---
Status: Patch Available  (was: Open)

> webhdfs:// needs to support provided delegation tokens
> --
>
> Key: HDFS-7984
> URL: https://issues.apache.org/jira/browse/HDFS-7984
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>Assignee: HeeSoo Kim
>Priority: Blocker
> Attachments: HDFS-7984.patch
>
>
> When using the webhdfs:// filesystem (especially from distcp), we need the 
> ability to inject a delegation token rather than webhdfs initialize its own.  
> This would allow for cross-authentication-zone file system accesses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-9302) WebHDFS throws NullPointerException if newLength is not provided

2015-10-23 Thread Karthik Palaniappan (JIRA)
Karthik Palaniappan created HDFS-9302:
-

 Summary: WebHDFS throws NullPointerException if newLength is not 
provided
 Key: HDFS-9302
 URL: https://issues.apache.org/jira/browse/HDFS-9302
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Affects Versions: 2.7.1
 Environment: Centos6
Reporter: Karthik Palaniappan
Priority: Minor


$ curl -X POST "http://namenode:50070/webhdfs/v1/foo?op=truncate"
{"RemoteException":{"exception":"NullPointerException","javaClassName":"java.lang.NullPointerException","message":null}}

We should change newLength to be a required parameter in the webhdfs 
documentation 
(https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#New_Length),
 and throw an IllegalArgumentException if it isn't provided.
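A sketch of the proposed validation (its placement in the WebHDFS POST handler 
is assumed):
{code}
// Fail fast with a meaningful error instead of letting a missing
// newLength surface as a NullPointerException.
if (newLength == null) {
  throw new IllegalArgumentException(
      "newLength is a required parameter for op=TRUNCATE");
}
fs.truncate(path, newLength.longValue());
{code}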



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8831) Trash Support for deletion in HDFS encryption zone

2015-10-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972157#comment-14972157
 ] 

Hadoop QA commented on HDFS-8831:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  20m 15s | Findbugs (version 3.0.0) 
appears to be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 3 new or modified test files. |
| {color:red}-1{color} | javac |   8m  6s | The applied patch generated  2  
additional warning messages. |
| {color:green}+1{color} | javadoc |  10m 31s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 26s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 10s | The applied patch generated  5 
new checkstyle issues (total was 199, now 201). |
| {color:red}-1{color} | whitespace |   0m  3s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   7m  4s | The patch appears to introduce 4 
new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | common tests |   6m 41s | Tests failed in 
hadoop-common. |
| {color:red}-1{color} | hdfs tests |  54m 58s | Tests failed in hadoop-hdfs. |
| {color:red}-1{color} | hdfs tests |   0m 40s | Tests failed in 
hadoop-hdfs-client. |
| | | 113m 44s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-common |
| FindBugs | module:hadoop-hdfs |
| FindBugs | module:hadoop-hdfs-client |
| Failed unit tests | hadoop.metrics2.sink.TestFileSink |
|   | hadoop.hdfs.server.blockmanagement.TestNodeCount |
|   | hadoop.hdfs.TestDFSUpgradeFromImage |
|   | hadoop.hdfs.TestEncryptionZonesWithKMS |
|   | hadoop.hdfs.TestEncryptionZones |
| Failed build | hadoop-hdfs-client |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768402/HDFS-8831.01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 15eb84b |
| javac | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13170/artifact/patchprocess/diffJavacWarnings.txt
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/13170/artifact/patchprocess/diffcheckstylehadoop-common.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13170/artifact/patchprocess/whitespace.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13170/artifact/patchprocess/newPatchFindbugsWarningshadoop-common.html
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13170/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13170/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs-client.html
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13170/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13170/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13170/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13170/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13170/console |


This message was automatically generated.

> Trash Support for deletion in HDFS encryption zone
> --
>
> Key: HDFS-8831
> URL: https://issues.apache.org/jira/browse/HDFS-8831
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: encryption
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
> Attachments: HDFS-8831-10152015.pdf, HDFS-8831.00.patch, 
> HDFS-8831.01.patch
>
>
> Currently, "Soft Delete" is only supported if the whole encryption zone is 
> deleted. If you delete files whinin the zone with trash feature enabled, you 
> will get error similar to the following 
> {code}
> rm: Failed to move to trash: hdfs://HW11217.local:9000/z1_1/startnn.sh: 
> /z1_1/startnn.sh can't be moved from an encryption zone.
> {code}
> With HDFS-8830, we can support "Soft Delete" by adding the .Trash folder of 
> the encryption zone.

[jira] [Commented] (HDFS-9299) Give ReplicationMonitor a readable thread name

2015-10-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972137#comment-14972137
 ] 

Hadoop QA commented on HDFS-9299:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  18m 41s | Findbugs (version ) appears to 
be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   9m  2s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  11m 47s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 26s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 37s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 53s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 38s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   2m 58s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 38s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  56m  4s | Tests failed in hadoop-hdfs. |
| | | 105m 48s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.qjournal.client.TestQJMWithFaults |
|   | hadoop.hdfs.TestReadStripedFileWithDecoding |
|   | hadoop.hdfs.TestFileAppend4 |
|   | hadoop.hdfs.TestDFSClientFailover |
|   | hadoop.hdfs.server.blockmanagement.TestNodeCount |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768380/HDFS-9299.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 15eb84b |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13166/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13166/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13166/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13166/console |


This message was automatically generated.

> Give ReplicationMonitor a readable thread name
> --
>
> Key: HDFS-9299
> URL: https://issues.apache.org/jira/browse/HDFS-9299
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
>Priority: Trivial
> Attachments: HDFS-9299.001.patch
>
>
> Currently the log output from the Replication Monitor shows the class name; 
> by setting the name on the thread, the output will be easier to read.
> Current
> 2015-10-23 11:07:53,344 
> [org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor@2fbdc5dd]
>  INFO  blockmanagement.BlockManager (BlockManager.java:run(4125)) - Stopping 
> ReplicationMonitor.
> After
> 2015-10-23 11:07:53,344 [ReplicationMonitor] INFO  
> blockmanagement.BlockManager (BlockManager.java:run(4125)) - Stopping 
> ReplicationMonitor.
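> A minimal sketch of the change (assuming the monitor runs in a {{Daemon}} 
> thread):
> {code}
> Daemon replicationThread = new Daemon(new ReplicationMonitor());
> // Name the thread so log lines show "ReplicationMonitor" rather than
> // the anonymous class@hashcode string.
> replicationThread.setName("ReplicationMonitor");
> replicationThread.start();
> {code}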



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9298) remove replica and not add replica with wrong genStamp

2015-10-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972134#comment-14972134
 ] 

Hadoop QA commented on HDFS-9298:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  16m 37s | Findbugs (version ) appears to 
be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   8m  6s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 30s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 32s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 40s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   2m 38s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 17s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  50m 31s | Tests failed in hadoop-hdfs. |
| | |  94m 53s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.TestDatanodeDeath |
|   | hadoop.hdfs.server.namenode.TestAddStripedBlocks |
|   | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
|   | hadoop.hdfs.server.namenode.ha.TestFailoverWithBlockTokensEnabled |
|   | hadoop.hdfs.TestClientProtocolForPipelineRecovery |
| Timed out tests | org.apache.hadoop.hdfs.server.namenode.ha.TestHAAppend |
|   | org.apache.hadoop.hdfs.server.namenode.ha.TestQuotasWithHA |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768366/HDFS-9298.1.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 15eb84b |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13171/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13171/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13171/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13171/console |


This message was automatically generated.

> remove replica and not add replica with wrong genStamp
> --
>
> Key: HDFS-9298
> URL: https://issues.apache.org/jira/browse/HDFS-9298
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: HDFS-9298.1.patch
>
>
> Currently, in setGenerationStampAndVerifyReplicas, a replica with a wrong 
> genStamp is not really removed; only the StorageLocation of that replica is 
> removed. Moreover, we should check the genStamp before addReplicaIfNotPresent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9098) Erasure coding: emulate race conditions among striped streamers in write pipeline

2015-10-23 Thread Zhe Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-9098:

Attachment: HDFS-9098.00.patch

Thanks Jing for the helpful review! Updating the patch to address the comment 
on linking fault injection and sync points.

The basic challenge here is to handle non-linear event flow in a thread (e.g. 
with if statements). The approach in the patch is to have 2 types of sync 
points:
* {{syncBarrier}}: blocks until the prior event happens
* {{syncCondition}}: returns false until the prior event happens

The biggest remaining work is to add more detail to each sync event (either 
barrier or condition). For example, a certain {{WRITE_CHUNK}} barrier should 
only apply to a specified file offset.

This is still an experimental patch, so more comments / questions will be very 
helpful.
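As a rough sketch (hypothetical names, not the patch itself), the two flavors 
can be built on a single latch:
{code}
class SyncPoint {
  private final CountDownLatch priorEvent = new CountDownLatch(1);

  void fire() {                   // called from the injected fault hook
    priorEvent.countDown();
  }

  void syncBarrier() throws InterruptedException {
    priorEvent.await();           // blocks until the prior event happens
  }

  boolean syncCondition() {       // non-blocking probe for branchy paths
    return priorEvent.getCount() == 0;
  }
}
{code}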

> Erasure coding: emulate race conditions among striped streamers in write 
> pipeline
> -
>
> Key: HDFS-9098
> URL: https://issues.apache.org/jira/browse/HDFS-9098
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-9098.00.patch, HDFS-9098.wip.patch
>
>
> Apparently the interleaving of events among {{StripedDataStreamer}}'s is very 
> tricky to handle. [~walter.k.su] and [~jingzhao] have discussed several race 
> conditions under HDFS-9040.
> Let's use FaultInjector to emulate different combinations of interleaved 
> events.
> In particular, we should consider inject delays in the following places:
> # {{Streamer#endBlock}}
> # {{Streamer#locateFollowingBlock}}
> # {{Streamer#updateBlockForPipeline}}
> # {{Streamer#updatePipeline}}
> # {{OutputStream#writeChunk}}
> # {{OutputStream#close}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972119#comment-14972119
 ] 

Hudson commented on HDFS-9184:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2521 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2521/])
HDFS-9184. Logging HDFS operation's caller context into audit logs. (jitendra: 
rev 600ad7bf4104bcaeec00a4089d59bb1fdf423299)
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* hadoop-common-project/hadoop-common/src/main/proto/RpcHeader.proto
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/CallerContext.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/HdfsAuditLogger.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ProtoUtil.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAuditLogger.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAuditLogAtDebug.java


> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9184.000.patch, HDFS-9184.001.patch, 
> HDFS-9184.002.patch, HDFS-9184.003.patch, HDFS-9184.004.patch, 
> HDFS-9184.005.patch, HDFS-9184.006.patch, HDFS-9184.007.patch, 
> HDFS-9184.008.patch, HDFS-9184.009.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper level job issues it. The upper level callers may be specific 
> Oozie tasks, MR jobs, and Hive queries. One scenario is that the namenode 
> (NN) is abused/spammed; the operator may want to know immediately which MR 
> job should be blamed so that she can kill it. To this end, the caller context 
> contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which is 
> obviously not enough. It's common that the same user issues multiple jobs at 
> the same time. Even for a single top level task, tracking back to a specific 
> caller in a chain of operations of the whole workflow (e.g. Oozie -> Hive -> 
> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. The span is created in many places, interconnected 
> like a tree structure, which relies on offline analysis across the RPC 
> boundary. For this use case, {{htrace}} has to be enabled at a 100% sampling 
> rate, which introduces significant overhead. Moreover, passing additional 
> information (via annotations) other than the span id from the root of the 
> tree to a leaf is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there 
> is some related discussion on this topic. The final patch implemented the 
> tracking id as a part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, kerberos 
> authenticated connections or insecure connections don't have tokens. 
> [HADOOP-8779] proposes to use tokens in all the scenarios, but that might 
> mean changes to several upstream projects and is a major change in their 
> security implementation.
> We propose another approach to address this problem. We also treat the HDFS 
> audit log as a good place for after-the-fact root cause analysis. We propose 
> to put the caller id (e.g. Hive query id) in threadlocals. Specifically, on 
> the client side the threadlocal object is passed to the NN as a part of the 
> RPC header (optional), while on the server side the NN retrieves it from the 
> header and puts it into the {{Handler}}'s threadlocals. Finally, in 
> {{FSNamesystem}}, the HDFS audit logger will record the caller context for 
> each operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client may choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis. 
> The NN is not responsible for validating the signature online.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972116#comment-14972116
 ] 

Hudson commented on HDFS-8808:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2521 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2521/])
HDFS-8808. dfs.image.transfer.bandwidthPerSec should not apply to (zhz: rev 
ab3c4cff4af338caaa23be0ec383fc1fe473714f)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestBootstrapStandby.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/TransferFsImage.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ImageServlet.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/Checkpointer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCheckpoint.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/BootstrapStandby.java


> dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
> 
>
> Key: HDFS-8808
> URL: https://issues.apache.org/jira/browse/HDFS-8808
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Gautam Gopalakrishnan
>Assignee: Zhe Zhang
> Fix For: 2.8.0
>
> Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch, 
> HDFS-8808-02.patch, HDFS-8808-03.patch, HDFS-8808.04.patch
>
>
> The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the 
> speed with which the fsimage is copied between the namenodes during regular 
> use. However, as a side effect, this also limits transfers when the 
> {{-bootstrapStandby}} option is used. This option is often used during 
> upgrades and could potentially slow down the entire workflow. The request 
> here is to ensure {{-bootstrapStandby}} is unaffected by this bandwidth 
> setting



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9264) Minor cleanup of operations on FsVolumeList#volumes

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972118#comment-14972118
 ] 

Hudson commented on HDFS-9264:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2521 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2521/])
HDFS-9264. Minor cleanup of operations on FsVolumeList#volumes.  (Walter (lei: 
rev 533a2be5ac7c7f0473fdd24d6201582d08964e21)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Minor cleanup of operations on FsVolumeList#volumes
> ---
>
> Key: HDFS-9264
> URL: https://issues.apache.org/jira/browse/HDFS-9264
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Walter Su
>Assignee: Walter Su
>Priority: Minor
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9264.01.patch, HDFS-9264.02.patch
>
>
> We can use {{CopyOnWriteArrayList}} to simplify the operations on 
> {{FsVolumeList#volumes}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9300) DirectoryScanner.testThrottle() is still a little flakey

2015-10-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972115#comment-14972115
 ] 

Hadoop QA commented on HDFS-9300:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |   9m 14s | Findbugs (version ) appears to 
be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |  10m 54s | There were no new javac warning 
messages. |
| {color:green}+1{color} | release audit |   0m 28s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 46s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   2m 15s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 45s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   3m 31s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   1m 29s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  61m 20s | Tests failed in hadoop-hdfs. |
| | |  90m 50s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.server.datanode.TestDataNodeMetrics |
|   | hadoop.hdfs.server.namenode.TestLargeDirectoryDelete |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitCache |
|   | hadoop.hdfs.server.namenode.TestCheckpoint |
|   | hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.server.namenode.TestCacheDirectives |
| Timed out tests | org.apache.hadoop.hdfs.server.namenode.TestStartup |
|   | org.apache.hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks 
|
|   | org.apache.hadoop.hdfs.server.namenode.TestEditLogJournalFailures |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768391/HDFS-9300.001.patch |
| Optional Tests | javac unit findbugs checkstyle |
| git revision | trunk / 15eb84b |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13164/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13164/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13164/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13164/console |


This message was automatically generated.

> DirectoryScanner.testThrottle() is still a little flakey
> 
>
> Key: HDFS-9300
> URL: https://issues.apache.org/jira/browse/HDFS-9300
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: balancer & mover, test
>Affects Versions: 2.7.1
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
> Attachments: HDFS-9300.001.patch
>
>
> It failed in:
> https://builds.apache.org/job/PreCommit-HDFS-Build/13160/testReport/org.apache.hadoop.hdfs.server.datanode/TestDirectoryScanner/testThrottling/
> by narrowly missing the performance boundaries.  The only solution I have is 
> to relax the boundaries a little.  The throttle's just a hard thing to test 
> in an unpredictable environment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7984) webhdfs:// needs to support provided delegation tokens

2015-10-23 Thread HeeSoo Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972112#comment-14972112
 ] 

HeeSoo Kim commented on HDFS-7984:
--

Anthony is right. {noformat}HADOOP_TOKEN_FILE_LOCATION{noformat} supports 
reading a delegation token for WebHDFSFileSystem.
However, {noformat}HADOOP_TOKEN_FILE_LOCATION{noformat} only supports one 
file, while the ugi can have multiple tokens in its credentials. This patch 
supports multiple delegation token files as a parameter:
{noformat}-Dhadoop.token.files=filename1,filename2{noformat}
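A sketch of how the files could be folded into the UGI credentials 
(illustrative; the property name is the one above):
{code}
for (String name : conf.getTrimmedStrings("hadoop.token.files")) {
  Credentials cred = Credentials.readTokenStorageFile(new File(name), conf);
  UserGroupInformation.getCurrentUser().addCredentials(cred);
}
{code}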

> webhdfs:// needs to support provided delegation tokens
> --
>
> Key: HDFS-7984
> URL: https://issues.apache.org/jira/browse/HDFS-7984
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>Assignee: HeeSoo Kim
>Priority: Blocker
> Attachments: HDFS-7984.patch
>
>
> When using the webhdfs:// filesystem (especially from distcp), we need the 
> ability to inject a delegation token rather than webhdfs initialize its own.  
> This would allow for cross-authentication-zone file system accesses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972106#comment-14972106
 ] 

Hudson commented on HDFS-9184:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1312 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1312/])
HDFS-9184. Logging HDFS operation's caller context into audit logs. (jitendra: 
rev 600ad7bf4104bcaeec00a4089d59bb1fdf423299)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/HdfsAuditLogger.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ProtoUtil.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAuditLogger.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAuditLogAtDebug.java
* hadoop-common-project/hadoop-common/src/main/proto/RpcHeader.proto
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/CallerContext.java


> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9184.000.patch, HDFS-9184.001.patch, 
> HDFS-9184.002.patch, HDFS-9184.003.patch, HDFS-9184.004.patch, 
> HDFS-9184.005.patch, HDFS-9184.006.patch, HDFS-9184.007.patch, 
> HDFS-9184.008.patch, HDFS-9184.009.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper-level job issues it. The upper-level callers may be specific 
> Oozie tasks, MR jobs, and Hive queries. One scenario is that when the 
> namenode (NN) is abused/spammed, the operator may want to know immediately 
> which MR job should be blamed so that she can kill it. To this end, the 
> caller context contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which is 
> obviously not enough. It's common that the same user issues multiple jobs at 
> the same time. Even for a single top-level task, tracking back to a specific 
> caller in a chain of operations of the whole workflow (e.g. Oozie -> Hive -> 
> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. Spans are created in many places, interconnected 
> like a tree structure, which relies on offline analysis across RPC 
> boundaries. For this use case, {{htrace}} has to be enabled at a 100% 
> sampling rate, which introduces significant overhead. Moreover, passing 
> additional information (via annotations) other than the span id from the 
> root of the tree to a leaf is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there 
> is some related discussion on this topic. The final patch implemented the 
> tracking id as a part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, kerberos 
> authenticated connections and insecure connections don't have tokens. 
> [HADOOP-8779] proposes to use tokens in all scenarios, but that might mean 
> changes to several upstream projects and is a major change in their security 
> implementation.
> We propose another approach to address this problem. We also treat the HDFS 
> audit log as a good place for after-the-fact root cause analysis. We propose 
> to put the caller id (e.g. Hive query id) in threadlocals. Specifically, on 
> the client side the threadlocal object is passed to the NN as an optional 
> part of the RPC header, while on the server side the NN retrieves it from 
> the header and puts it into the {{Handler}}'s threadlocals. Finally, in 
> {{FSNamesystem}}, the HDFS audit logger will record the caller context for 
> each operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client may choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis; 
> the NN is not responsible for validating the signature online.
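> As an illustrative sketch (class and method names are taken from the 
> attached patch's {{CallerContext}}, so treat exact signatures as 
> assumptions), an upstream application could tag its operations like this 
> before issuing HDFS calls:
> {code:java}
> import org.apache.hadoop.ipc.CallerContext;
> // Every subsequent HDFS call from this thread carries the tracking id,
> // which the NN writes into the audit log.
> CallerContext.setCurrent(
>     new CallerContext.Builder("hive_query_id:query_123").build());
> {code}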



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9290) DFSClient#callAppend() is not backward compatible for slightly older NameNodes

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972102#comment-14972102
 ] 

Hudson commented on HDFS-9290:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1312 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1312/])
HDFS-9290. DFSClient#callAppend() is not backward compatible for (kihwal: rev 
b9e0417bdf2b9655dc4256bdb43683eca1ab46be)
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> DFSClient#callAppend() is not backward compatible for slightly older NameNodes
> --
>
> Key: HDFS-9290
> URL: https://issues.apache.org/jira/browse/HDFS-9290
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Blocker
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-9290.001.patch, HDFS-9290.002.patch
>
>
> HDFS-7210 combined 2 RPC calls used at file append into a single one. 
> Specifically, {{getFileInfo()}} is combined with {{append()}}. While backward 
> compatibility for older clients is handled by the new NameNode (protobuf), a 
> newer client's {{append()}} call does not work with older NameNodes. One will 
> run into an exception like the following:
> {code:java}
> java.lang.NullPointerException
> at org.apache.hadoop.hdfs.DFSOutputStream.isLazyPersist(DFSOutputStream.java:1741)
> at org.apache.hadoop.hdfs.DFSOutputStream.getChecksum4Compute(DFSOutputStream.java:1550)
> at org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1560)
> at org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1670)
> at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForAppend(DFSOutputStream.java:1717)
> at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1861)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1922)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1892)
> at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:340)
> at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:336)
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:336)
> at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:318)
> at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1164)
> {code}
> The cause is that the new client code expects both the last block and the 
> file info in the same RPC, but the old NameNode only replies with the first. 
> The exception itself does not reflect this, and one has to look at the HDFS 
> source code to really understand what happened.
> We can have the client detect that it's talking to an old NameNode and send 
> an extra {{getFileInfo()}} RPC. Or we should improve the exception being 
> thrown to accurately reflect the cause of the failure.
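> A sketch of the client-side fallback (assuming the old NameNode leaves the 
> file status null inside {{LastBlockWithStatus}}; this is not necessarily the 
> committed fix):
> {code:java}
> LastBlockWithStatus blkWithStatus = namenode.append(src, clientName, flag);
> HdfsFileStatus status = blkWithStatus.getFileStatus();
> if (status == null) {
>   // Older NameNode: the file status was not piggybacked on append(),
>   // so fetch it with a separate RPC before building the output stream.
>   status = getFileInfo(src);
> }
> {code}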



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972101#comment-14972101
 ] 

Hudson commented on HDFS-8808:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1312 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1312/])
HDFS-8808. dfs.image.transfer.bandwidthPerSec should not apply to (zhz: rev 
ab3c4cff4af338caaa23be0ec383fc1fe473714f)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestBootstrapStandby.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/BootstrapStandby.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ImageServlet.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/TransferFsImage.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/Checkpointer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCheckpoint.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java


> dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
> 
>
> Key: HDFS-8808
> URL: https://issues.apache.org/jira/browse/HDFS-8808
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Gautam Gopalakrishnan
>Assignee: Zhe Zhang
> Fix For: 2.8.0
>
> Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch, 
> HDFS-8808-02.patch, HDFS-8808-03.patch, HDFS-8808.04.patch
>
>
> The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the 
> speed with which the fsimage is copied between the namenodes during regular 
> use. However, as a side effect, this also limits transfers when the 
> {{-bootstrapStandby}} option is used. This option is often used during 
> upgrades, where the throttling could potentially slow down the entire 
> workflow. The request here is to ensure {{-bootstrapStandby}} is unaffected 
> by this bandwidth setting.
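> A sketch of the intended behavior (the {{isBootstrapStandby}} flag and its 
> wiring are illustrative assumptions):
> {code:java}
> // Throttle only regular checkpoint transfers; a bootstrap transfer
> // should run at full speed.
> DataTransferThrottler throttler = isBootstrapStandby
>     ? null // unthrottled
>     : ImageServlet.getThrottler(conf); // dfs.image.transfer.bandwidthPerSec
> {code}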



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9264) Minor cleanup of operations on FsVolumeList#volumes

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972105#comment-14972105
 ] 

Hudson commented on HDFS-9264:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1312 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1312/])
HDFS-9264. Minor cleanup of operations on FsVolumeList#volumes.  (Walter (lei: 
rev 533a2be5ac7c7f0473fdd24d6201582d08964e21)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java


> Minor cleanup of operations on FsVolumeList#volumes
> ---
>
> Key: HDFS-9264
> URL: https://issues.apache.org/jira/browse/HDFS-9264
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Walter Su
>Assignee: Walter Su
>Priority: Minor
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9264.01.patch, HDFS-9264.02.patch
>
>
> We can use {{CopyOnWriteArrayList}} to simplify the operations on 
> {{FsVolumeList#volumes}}.
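> A sketch of the idea (simplified; not the actual {{FsVolumeList}} code):
> {code:java}
> import java.util.concurrent.CopyOnWriteArrayList;
> // Mutations copy the backing array atomically, so iterating readers never
> // need a lock and never observe a partially updated list.
> private final CopyOnWriteArrayList<FsVolumeImpl> volumes =
>     new CopyOnWriteArrayList<>();
> void addVolume(FsVolumeImpl v) {
>   volumes.add(v); // atomic copy-on-write
> }
> void removeVolume(FsVolumeImpl v) {
>   volumes.remove(v); // no explicit synchronization required
> }
> {code}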



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7984) webhdfs:// needs to support provided delegation tokens

2015-10-23 Thread HeeSoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeeSoo Kim updated HDFS-7984:
-
Attachment: HDFS-7984.patch

> webhdfs:// needs to support provided delegation tokens
> --
>
> Key: HDFS-7984
> URL: https://issues.apache.org/jira/browse/HDFS-7984
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>Assignee: HeeSoo Kim
>Priority: Blocker
> Attachments: HDFS-7984.patch
>
>
> When using the webhdfs:// filesystem (especially from distcp), we need the 
> ability to inject a delegation token rather than have webhdfs initialize its 
> own. This would allow for cross-authentication-zone file system accesses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9301) HDFS clients can't construct HdfsConfiguration instances

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972104#comment-14972104
 ] 

Hudson commented on HDFS-9301:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1312 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1312/])
HDFS-9301. HDFS clients can't construct HdfsConfiguration instances. (wheat9: 
rev 15eb84b37e6c0195d59d3a29fbc5b7417bf022ff)
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/HdfsConfigurationLoader.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java


> HDFS clients can't construct HdfsConfiguration instances
> 
>
> Key: HDFS-9301
> URL: https://issues.apache.org/jira/browse/HDFS-9301
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Steve Loughran
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9241.000.patch, HDFS-9241.001.patch, 
> HDFS-9241.002.patch, HDFS-9241.003.patch, HDFS-9241.004.patch, 
> HDFS-9241.005.patch
>
>
> The changes for the hdfs client classpath make instantiating 
> {{HdfsConfiguration}} from the client impossible; it now lives only on the 
> server side. This breaks any app which creates one.
> I know people will look at the {{@Private}} tag and say "don't do that then", 
> but it's worth considering precisely why I, at least, do this: it's the only 
> way to guarantee that the hdfs-default and hdfs-site resources get on the 
> classpath, including all the security settings. It's precisely the use case 
> which {{HdfsConfigurationLoader.init();}} offers internally to the hdfs code.
> What am I meant to do now?
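> As a workaround sketch using only the stable {{Configuration}} API (this is 
> essentially what {{HdfsConfiguration}} does internally):
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> // Force the HDFS resources onto the default resource list, then use a
> // plain Configuration instead of HdfsConfiguration.
> Configuration.addDefaultResource("hdfs-default.xml");
> Configuration.addDefaultResource("hdfs-site.xml");
> Configuration conf = new Configuration();
> {code}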



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-7984) webhdfs:// needs to support provided delegation tokens

2015-10-23 Thread HeeSoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

HeeSoo Kim reassigned HDFS-7984:


Assignee: HeeSoo Kim

> webhdfs:// needs to support provided delegation tokens
> --
>
> Key: HDFS-7984
> URL: https://issues.apache.org/jira/browse/HDFS-7984
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.0.0
>Reporter: Allen Wittenauer
>Assignee: HeeSoo Kim
>Priority: Blocker
>
> When using the webhdfs:// filesystem (especially from distcp), we need the 
> ability to inject a delegation token rather than have webhdfs initialize its 
> own. This would allow for cross-authentication-zone file system accesses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers

2015-10-23 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972097#comment-14972097
 ] 

Zhe Zhang commented on HDFS-9079:
-

Thanks [~szetszwo], good question. I think it's compatible. For old clients, 
the "preallocated" GSes are invisible -- it is equivalent to the scenario where 
another client allocates a few new blocks and uses those GSes.

The only exception would be closely controlled unit tests that rely on the 
assumption that GSes are allocated contiguously without any gap; from the 
Jenkins results there aren't any such tests in trunk.

> Erasure coding: preallocate multiple generation stamps and serialize updates 
> from data streamers
> 
>
> Key: HDFS-9079
> URL: https://issues.apache.org/jira/browse/HDFS-9079
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: HDFS-7285
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-9079-HDFS-7285.00.patch, HDFS-9079.01.patch, 
> HDFS-9079.02.patch, HDFS-9079.03.patch, HDFS-9079.04.patch, HDFS-9079.05.patch
>
>
> A non-striped DataStreamer goes through the following steps in error handling:
> {code}
> 1) Finds error => 2) Asks NN for new GS => 3) Gets new GS from NN => 4) 
> Applies new GS to DN (createBlockOutputStream) => 5) Ack from DN => 6) 
> Updates block on NN
> {code}
> To simplify the above, we can preallocate GSes when the NN creates a new 
> striped block group ({{FSN#createNewBlock}}). For each new striped block 
> group we can reserve {{NUM_PARITY_BLOCKS}} GSes. Then steps 1~3 in the above 
> sequence can be saved. If more than {{NUM_PARITY_BLOCKS}} errors have 
> happened, we shouldn't try to recover further anyway.
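> A hypothetical sketch of the reservation (the {{blockIdManager}} calls are 
> assumptions, not the patch's actual code):
> {code:java}
> // When allocating a striped block group, skip the global GS counter ahead
> // so the streamers own a private range for local recovery bumps.
> long firstGS = blockIdManager.nextGenerationStamp();
> for (int i = 0; i < NUM_PARITY_BLOCKS; i++) {
>   blockIdManager.nextGenerationStamp(); // reserved for future failures
> }
> {code}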



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9298) remove replica and not add replica with wrong genStamp

2015-10-23 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated HDFS-9298:
---
Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

> remove replica and not add replica with wrong genStamp
> --
>
> Key: HDFS-9298
> URL: https://issues.apache.org/jira/browse/HDFS-9298
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: HDFS-9298.1.patch
>
>
> Currently, in setGenerationStampAndVerifyReplicas, a replica with a wrong gen 
> stamp is not really removed; only the StorageLocation of that replica is 
> removed. Moreover, we should check the genStamp before calling 
> addReplicaIfNotPresent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9298) remove replica and not add replica with wrong genStamp

2015-10-23 Thread Chang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972092#comment-14972092
 ] 

Chang Li commented on HDFS-9298:


Thanks [~jingzhao] for explaining! I will close this as Won't Fix.

> remove replica and not add replica with wrong genStamp
> --
>
> Key: HDFS-9298
> URL: https://issues.apache.org/jira/browse/HDFS-9298
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: HDFS-9298.1.patch
>
>
> Currently, in setGenerationStampAndVerifyReplicas, a replica with a wrong gen 
> stamp is not really removed; only the StorageLocation of that replica is 
> removed. Moreover, we should check the genStamp before calling 
> addReplicaIfNotPresent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9079) Erasure coding: preallocate multiple generation stamps and serialize updates from data streamers

2015-10-23 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972089#comment-14972089
 ] 

Tsz Wo Nicholas Sze commented on HDFS-9079:
---

Thanks for the explanation.  Is this an incompatible change?

> Erasure coding: preallocate multiple generation stamps and serialize updates 
> from data streamers
> 
>
> Key: HDFS-9079
> URL: https://issues.apache.org/jira/browse/HDFS-9079
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Affects Versions: HDFS-7285
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-9079-HDFS-7285.00.patch, HDFS-9079.01.patch, 
> HDFS-9079.02.patch, HDFS-9079.03.patch, HDFS-9079.04.patch, HDFS-9079.05.patch
>
>
> A non-striped DataStreamer goes through the following steps in error handling:
> {code}
> 1) Finds error => 2) Asks NN for new GS => 3) Gets new GS from NN => 4) 
> Applies new GS to DN (createBlockOutputStream) => 5) Ack from DN => 6) 
> Updates block on NN
> {code}
> To simplify the above, we can preallocate GSes when the NN creates a new 
> striped block group ({{FSN#createNewBlock}}). For each new striped block 
> group we can reserve {{NUM_PARITY_BLOCKS}} GSes. Then steps 1~3 in the above 
> sequence can be saved. If more than {{NUM_PARITY_BLOCKS}} errors have 
> happened, we shouldn't try to recover further anyway.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7298) HDFS may honor socket timeout configuration

2015-10-23 Thread bimal tandel (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972084#comment-14972084
 ] 

bimal tandel commented on HDFS-7298:


[~yzhangal], I will take a look at this Jira.

> HDFS may honor socket timeout configuration
> ---
>
> Key: HDFS-7298
> URL: https://issues.apache.org/jira/browse/HDFS-7298
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Guo Ruijing
>Assignee: bimal tandel
>
> DFS_CLIENT_SOCKET_TIMEOUT_KEY: HDFS socket read timeout
> DFS_DATANODE_SOCKET_WRITE_TIMEOUT_KEY: HDFS socket write timeout 
> HDFS may honor socket timeout configuration:
> 1. DataXceiver.java:
> 1) existing code (not as expected):
>   int timeoutValue = dnConf.socketTimeout
>       + (HdfsServerConstants.READ_TIMEOUT_EXTENSION * targets.length);
>   int writeTimeout = dnConf.socketWriteTimeout
>       + (HdfsServerConstants.WRITE_TIMEOUT_EXTENSION * targets.length);
> 2) proposed code (a timeout of 0 means "no timeout", so a zero setting must 
> not be extended):
>   int timeoutValue = dnConf.socketTimeout > 0 ? (dnConf.socketTimeout
>       + HdfsServerConstants.READ_TIMEOUT_EXTENSION * targets.length) : 0;
>   int writeTimeout = dnConf.socketWriteTimeout > 0 ? (dnConf.socketWriteTimeout
>       + HdfsServerConstants.WRITE_TIMEOUT_EXTENSION * targets.length) : 0;
> 2. DFSClient.java
> existing code is as expected:
>   int getDatanodeWriteTimeout(int numNodes) {
>     return (dfsClientConf.confTime > 0) ?
>         (dfsClientConf.confTime
>          + HdfsServerConstants.WRITE_TIMEOUT_EXTENSION * numNodes) : 0;
>   }
>   int getDatanodeReadTimeout(int numNodes) {
>     return dfsClientConf.socketTimeout > 0 ?
>         (HdfsServerConstants.READ_TIMEOUT_EXTENSION * numNodes
>          + dfsClientConf.socketTimeout) : 0;
>   }
> 3. DataNode.java:
> existing code is not as expected:
>   long writeTimeout = dnConf.socketWriteTimeout +
>       HdfsServerConstants.WRITE_TIMEOUT_EXTENSION * (targets.length - 1);
> proposed code:
>   long writeTimeout = dnConf.socketWriteTimeout > 0 ?
>       (dnConf.socketWriteTimeout
>        + HdfsServerConstants.WRITE_TIMEOUT_EXTENSION * (targets.length - 1)) : 0;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-7298) HDFS may honor socket timeout configuration

2015-10-23 Thread bimal tandel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bimal tandel reassigned HDFS-7298:
--

Assignee: bimal tandel

> HDFS may honor socket timeout configuration
> ---
>
> Key: HDFS-7298
> URL: https://issues.apache.org/jira/browse/HDFS-7298
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Guo Ruijing
>Assignee: bimal tandel
>
> DFS_CLIENT_SOCKET_TIMEOUT_KEY: HDFS socket read timeout
> DFS_DATANODE_SOCKET_WRITE_TIMEOUT_KEY: HDFS socket write timeout 
> HDFS may honor socket timeout configuration:
> 1. DataXceiver.java:
> 1) existing code (not as expected):
>   int timeoutValue = dnConf.socketTimeout
>       + (HdfsServerConstants.READ_TIMEOUT_EXTENSION * targets.length);
>   int writeTimeout = dnConf.socketWriteTimeout
>       + (HdfsServerConstants.WRITE_TIMEOUT_EXTENSION * targets.length);
> 2) proposed code (a timeout of 0 means "no timeout", so a zero setting must 
> not be extended):
>   int timeoutValue = dnConf.socketTimeout > 0 ? (dnConf.socketTimeout
>       + HdfsServerConstants.READ_TIMEOUT_EXTENSION * targets.length) : 0;
>   int writeTimeout = dnConf.socketWriteTimeout > 0 ? (dnConf.socketWriteTimeout
>       + HdfsServerConstants.WRITE_TIMEOUT_EXTENSION * targets.length) : 0;
> 2. DFSClient.java
> existing code is as expected:
>   int getDatanodeWriteTimeout(int numNodes) {
>     return (dfsClientConf.confTime > 0) ?
>         (dfsClientConf.confTime
>          + HdfsServerConstants.WRITE_TIMEOUT_EXTENSION * numNodes) : 0;
>   }
>   int getDatanodeReadTimeout(int numNodes) {
>     return dfsClientConf.socketTimeout > 0 ?
>         (HdfsServerConstants.READ_TIMEOUT_EXTENSION * numNodes
>          + dfsClientConf.socketTimeout) : 0;
>   }
> 3. DataNode.java:
> existing code is not as expected:
>   long writeTimeout = dnConf.socketWriteTimeout +
>       HdfsServerConstants.WRITE_TIMEOUT_EXTENSION * (targets.length - 1);
> proposed code:
>   long writeTimeout = dnConf.socketWriteTimeout > 0 ?
>       (dnConf.socketWriteTimeout
>        + HdfsServerConstants.WRITE_TIMEOUT_EXTENSION * (targets.length - 1)) : 0;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9301) HDFS clients can't construct HdfsConfiguration instances

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972060#comment-14972060
 ] 

Hudson commented on HDFS-9301:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8698 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8698/])
HDFS-9301. HDFS clients can't construct HdfsConfiguration instances. (wheat9: 
rev 15eb84b37e6c0195d59d3a29fbc5b7417bf022ff)
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/HdfsConfigurationLoader.java
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HdfsConfiguration.java


> HDFS clients can't construct HdfsConfiguration instances
> 
>
> Key: HDFS-9301
> URL: https://issues.apache.org/jira/browse/HDFS-9301
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Steve Loughran
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9241.000.patch, HDFS-9241.001.patch, 
> HDFS-9241.002.patch, HDFS-9241.003.patch, HDFS-9241.004.patch, 
> HDFS-9241.005.patch
>
>
> The changes for the hdfs client classpath make instantiating 
> {{HdfsConfiguration}} from the client impossible; it now lives only on the 
> server side. This breaks any app which creates one.
> I know people will look at the {{@Private}} tag and say "don't do that then", 
> but it's worth considering precisely why I, at least, do this: it's the only 
> way to guarantee that the hdfs-default and hdfs-site resources get on the 
> classpath, including all the security settings. It's precisely the use case 
> which {{HdfsConfigurationLoader.init();}} offers internally to the hdfs code.
> What am I meant to do now?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9290) DFSClient#callAppend() is not backward compatible for slightly older NameNodes

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972059#comment-14972059
 ] 

Hudson commented on HDFS-9290:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8698 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8698/])
HDFS-9290. DFSClient#callAppend() is not backward compatible for (kihwal: rev 
b9e0417bdf2b9655dc4256bdb43683eca1ab46be)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSClient.java


> DFSClient#callAppend() is not backward compatible for slightly older NameNodes
> --
>
> Key: HDFS-9290
> URL: https://issues.apache.org/jira/browse/HDFS-9290
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Blocker
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-9290.001.patch, HDFS-9290.002.patch
>
>
> HDFS-7210 combined 2 RPC calls used at file append into a single one. 
> Specifically, {{getFileInfo()}} is combined with {{append()}}. While backward 
> compatibility for older clients is handled by the new NameNode (protobuf), a 
> newer client's {{append()}} call does not work with older NameNodes. One will 
> run into an exception like the following:
> {code:java}
> java.lang.NullPointerException
> at org.apache.hadoop.hdfs.DFSOutputStream.isLazyPersist(DFSOutputStream.java:1741)
> at org.apache.hadoop.hdfs.DFSOutputStream.getChecksum4Compute(DFSOutputStream.java:1550)
> at org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1560)
> at org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1670)
> at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForAppend(DFSOutputStream.java:1717)
> at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1861)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1922)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1892)
> at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:340)
> at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:336)
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:336)
> at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:318)
> at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1164)
> {code}
> The cause is that the new client code expects both the last block and the 
> file info in the same RPC, but the old NameNode only replies with the first. 
> The exception itself does not reflect this, and one has to look at the HDFS 
> source code to really understand what happened.
> We can have the client detect that it's talking to an old NameNode and send 
> an extra {{getFileInfo()}} RPC. Or we should improve the exception being 
> thrown to accurately reflect the cause of the failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-23 Thread Staffan Friberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-9260:
--
Attachment: HDFS-7435.006.patch

Merged and diffed again due to a conflict.

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch
>
>
> This patch changes the data structures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC-friendly handling of full 
> block reports.
> We would like to hear people's feedback on this change, and would also 
> appreciate some help investigating/understanding a few outstanding issues if 
> we are interested in moving forward with this.
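> As an illustration of why sorting helps (a generic merge sketch, not the 
> patch's code):
> {code:java}
> // reported and stored are arrays of Block, both sorted by block id.
> // A full block report then becomes a linear merge instead of per-block
> // hash lookups, avoiding hash-table churn and the associated GC pressure.
> int i = 0, j = 0;
> while (i < reported.length && j < stored.length) {
>   int cmp = Long.compare(reported[i].getBlockId(), stored[j].getBlockId());
>   if (cmp == 0) { /* known replica: update state */ i++; j++; }
>   else if (cmp < 0) { /* reported but not stored: add */ i++; }
>   else { /* stored but not reported: mark missing */ j++; }
> }
> {code}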



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9298) remove replica and not add replica with wrong genStamp

2015-10-23 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972046#comment-14972046
 ] 

Jing Zhao commented on HDFS-9298:
-

For a GS smaller than the current one, there may be some race conditions 
between write recovery and block reports. For example, we can have this 
scenario: the GS recorded on the NN gets bumped during the recovery, while a 
block report that still contains the old GS gets delayed on the network and 
thus is received afterwards. So it may be better for the NN not to make any 
decision while the block is still under construction (UC).

> remove replica and not add replica with wrong genStamp
> --
>
> Key: HDFS-9298
> URL: https://issues.apache.org/jira/browse/HDFS-9298
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chang Li
>Assignee: Chang Li
> Attachments: HDFS-9298.1.patch
>
>
> Currently, in setGenerationStampAndVerifyReplicas, a replica with a wrong gen 
> stamp is not really removed; only the StorageLocation of that replica is 
> removed. Moreover, we should check the genStamp before calling 
> addReplicaIfNotPresent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7284) Add more debug info to BlockInfoUnderConstruction#setGenerationStampAndVerifyReplicas

2015-10-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972044#comment-14972044
 ] 

Hadoop QA commented on HDFS-7284:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  18m 23s | Findbugs (version 3.0.0) 
appears to be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   8m  3s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 46s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 25s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m  9s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 40s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   4m 38s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 12s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |   0m 24s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 29s | Tests passed in 
hadoop-hdfs-client. |
| | |  50m 46s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed build | hadoop-hdfs |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768378/HDFS-7284.005.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 15eb84b |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13165/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13165/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13165/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13165/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13165/console |


This message was automatically generated.

> Add more debug info to 
> BlockInfoUnderConstruction#setGenerationStampAndVerifyReplicas
> -
>
> Key: HDFS-7284
> URL: https://issues.apache.org/jira/browse/HDFS-7284
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.5.1
>Reporter: Hu Liu,
>Assignee: Wei-Chiu Chuang
>  Labels: supportability
> Attachments: HDFS-7284.001.patch, HDFS-7284.002.patch, 
> HDFS-7284.003.patch, HDFS-7284.004.patch, HDFS-7284.005.patch
>
>
> When I was looking at a replica loss issue, I got the following info from 
> the log:
> {code}
> 2014-10-13 01:54:53,104 INFO BlockStateChange: BLOCK* Removing stale replica 
> from location x.x.x.x
> {code}
> From this I can only tell that a replica was removed, but not which block it 
> belonged to or what its timestamp was. I need to be able to get the id and 
> timestamp of the block from the log file.
> So it's better to add more info, including the block id and timestamp, to 
> this code snippet:
> {code}
> for (ReplicaUnderConstruction r : replicas) {
>   if (genStamp != r.getGenerationStamp()) {
> r.getExpectedLocation().removeBlock(this);
> NameNode.blockStateChangeLog.info("BLOCK* Removing stale replica "
> + "from location: " + r.getExpectedLocation());
>   }
> }
> {code}
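> A sketch of what the enriched message might look like (field accessors are 
> assumptions, not the committed patch):
> {code:java}
> for (ReplicaUnderConstruction r : replicas) {
>   if (genStamp != r.getGenerationStamp()) {
>     r.getExpectedLocation().removeBlock(this);
>     NameNode.blockStateChangeLog.info("BLOCK* Removing stale replica of block "
>         + getBlockId() + " (replica genstamp: " + r.getGenerationStamp()
>         + ", expected genstamp: " + genStamp + ") from location: "
>         + r.getExpectedLocation());
>   }
> }
> {code}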



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9279) Decomissioned capacity should not be considered for configured/used capacity

2015-10-23 Thread Kuhu Shukla (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kuhu Shukla updated HDFS-9279:
--
Attachment: HDFS-9279-v2.patch

Correcting the patch apply error.

> Decomissioned capacity should not be considered for configured/used capacity
> 
>
> Key: HDFS-9279
> URL: https://issues.apache.org/jira/browse/HDFS-9279
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.1
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: HDFS-9279-v1.patch, HDFS-9279-v2.patch
>
>
> The capacity of a decommissioned node is being accounted for in the 
> configured and used capacity metrics. This gives an incorrect perception of 
> cluster usage.
> Once a node is decommissioned, its capacity should be treated like that of a 
> dead node.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9268) fuse_dfs chown crashes when uid is passed as -1

2015-10-23 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972003#comment-14972003
 ] 

Zhe Zhang commented on HDFS-9268:
-

Thanks [~cmccabe] for the clarification. Please see if you want to update the 
patch to make {{fuseConnect}} private; otherwise +1 on the latest patch.

> fuse_dfs chown crashes when uid is passed as -1
> ---
>
> Key: HDFS-9268
> URL: https://issues.apache.org/jira/browse/HDFS-9268
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-9268.001.patch, HDFS-9268.002.patch
>
>
> The JVM crashes when users attempt to use vi to update a file on a fuse file 
> system with insufficient permissions. (I use CDH's hadoop-fuse-dfs wrapper 
> script to generate the bug, but the same bug is reproducible in trunk.)
> The root cause is a segfault in a dfs-fuse method.
> To reproduce it, do as follows:
> mkdir /mnt/fuse
> chmod 777 /mnt/fuse
> ulimit -c unlimited   # to enable coredump
> hadoop-fuse-dfs -odebug hdfs://localhost:9000/fuse /mnt/fuse
> touch /mnt/fuse/y
> chmod 600 /mnt/fuse/y
> vim /mnt/fuse/y
> (in vim, :w to save the file)
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x003b82f27ad6, pid=26606, tid=140079005689600
> #
> # JRE version: Java(TM) SE Runtime Environment (7.0_79-b15) (build 
> 1.7.0_79-b15)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.79-b02 mixed mode 
> linux-amd64 compressed oops)
> # Problematic frame:
> # C  [libc.so.6+0x127ad6]  __tls_get_addr@@GLIBC_2.3+0x127ad6
> #
> # Core dump written. Default location: /home/weichiu/core or core.26606
> #
> # An error report file with more information is saved as:
> # /home/weichiu/hs_err_pid26606.log
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.java.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #
> /usr/bin/hadoop-fuse-dfs: line 29: 26606 Aborted (core 
> dumped) env CLASSPATH="${CLASSPATH}" ${HADOOP_HOME}/bin/fuse_dfs $@
> ===
> The coredump shows the segfault comes from:
> (gdb) bt
> #0  0x003b82e328e5 in raise () from /lib64/libc.so.6
> #1  0x003b82e340c5 in abort () from /lib64/libc.so.6
> #2  0x7f66fc924d75 in os::abort(bool) () from 
> /etc/alternatives/jre/jre/lib/amd64/server/libjvm.so
> #3  0x7f66fcaa76d7 in VMError::report_and_die() () from 
> /etc/alternatives/jre/jre/lib/amd64/server/libjvm.so
> #4  0x7f66fc929c8f in JVM_handle_linux_signal () from 
> /etc/alternatives/jre/jre/lib/amd64/server/libjvm.so
> #5  
> #6  0x003b82f27ad6 in __strcmp_sse42 () from /lib64/libc.so.6
> #7  0x004039a0 in hdfsConnTree_RB_FIND ()
> #8  0x00403e8f in fuseConnect ()
> #9  0x004046db in dfs_chown ()
> #10 0x7f66fcf8f6d2 in ?? () from /lib64/libfuse.so.2
> #11 0x7f66fcf940d1 in ?? () from /lib64/libfuse.so.2
> #12 0x7f66fcf910ef in ?? () from /lib64/libfuse.so.2
> #13 0x003b83207851 in start_thread () from /lib64/libpthread.so.0
> #14 0x003b82ee894d in clone () from /lib64/libc.so.6



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9268) fuse_dfs chown crashes when uid is passed as -1

2015-10-23 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972001#comment-14972001
 ] 

Colin Patrick McCabe commented on HDFS-9268:


I don't think {{fuseConnect}} is used anywhere but in fuse_connect.c at this 
point, so it could be made private to that file.

> fuse_dfs chown crashes when uid is passed as -1
> ---
>
> Key: HDFS-9268
> URL: https://issues.apache.org/jira/browse/HDFS-9268
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-9268.001.patch, HDFS-9268.002.patch
>
>
> The JVM crashes when users attempt to use vi to update a file on a fuse file 
> system with insufficient permissions. (I use CDH's hadoop-fuse-dfs wrapper 
> script to generate the bug, but the same bug is reproducible in trunk.)
> The root cause is a segfault in a dfs-fuse method.
> To reproduce it, do as follows:
> mkdir /mnt/fuse
> chmod 777 /mnt/fuse
> ulimit -c unlimited   # to enable coredump
> hadoop-fuse-dfs -odebug hdfs://localhost:9000/fuse /mnt/fuse
> touch /mnt/fuse/y
> chmod 600 /mnt/fuse/y
> vim /mnt/fuse/y
> (in vim, :w to save the file)
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x003b82f27ad6, pid=26606, tid=140079005689600
> #
> # JRE version: Java(TM) SE Runtime Environment (7.0_79-b15) (build 
> 1.7.0_79-b15)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.79-b02 mixed mode 
> linux-amd64 compressed oops)
> # Problematic frame:
> # C  [libc.so.6+0x127ad6]  __tls_get_addr@@GLIBC_2.3+0x127ad6
> #
> # Core dump written. Default location: /home/weichiu/core or core.26606
> #
> # An error report file with more information is saved as:
> # /home/weichiu/hs_err_pid26606.log
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.java.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #
> /usr/bin/hadoop-fuse-dfs: line 29: 26606 Aborted (core 
> dumped) env CLASSPATH="${CLASSPATH}" ${HADOOP_HOME}/bin/fuse_dfs $@
> ===
> The coredump shows the segfault comes from:
> (gdb) bt
> #0  0x003b82e328e5 in raise () from /lib64/libc.so.6
> #1  0x003b82e340c5 in abort () from /lib64/libc.so.6
> #2  0x7f66fc924d75 in os::abort(bool) () from 
> /etc/alternatives/jre/jre/lib/amd64/server/libjvm.so
> #3  0x7f66fcaa76d7 in VMError::report_and_die() () from 
> /etc/alternatives/jre/jre/lib/amd64/server/libjvm.so
> #4  0x7f66fc929c8f in JVM_handle_linux_signal () from 
> /etc/alternatives/jre/jre/lib/amd64/server/libjvm.so
> #5  
> #6  0x003b82f27ad6 in __strcmp_sse42 () from /lib64/libc.so.6
> #7  0x004039a0 in hdfsConnTree_RB_FIND ()
> #8  0x00403e8f in fuseConnect ()
> #9  0x004046db in dfs_chown ()
> #10 0x7f66fcf8f6d2 in ?? () from /lib64/libfuse.so.2
> #11 0x7f66fcf940d1 in ?? () from /lib64/libfuse.so.2
> #12 0x7f66fcf910ef in ?? () from /lib64/libfuse.so.2
> #13 0x003b83207851 in start_thread () from /lib64/libpthread.so.0
> #14 0x003b82ee894d in clone () from /lib64/libc.so.6



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971997#comment-14971997
 ] 

Hudson commented on HDFS-8808:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #589 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/589/])
HDFS-8808. dfs.image.transfer.bandwidthPerSec should not apply to (zhz: rev 
ab3c4cff4af338caaa23be0ec383fc1fe473714f)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCheckpoint.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/TransferFsImage.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ImageServlet.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestBootstrapStandby.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/BootstrapStandby.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/Checkpointer.java


> dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
> 
>
> Key: HDFS-8808
> URL: https://issues.apache.org/jira/browse/HDFS-8808
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Gautam Gopalakrishnan
>Assignee: Zhe Zhang
> Fix For: 2.8.0
>
> Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch, 
> HDFS-8808-02.patch, HDFS-8808-03.patch, HDFS-8808.04.patch
>
>
> The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the 
> speed with which the fsimage is copied between the namenodes during regular 
> use. However, as a side effect, this also limits transfers when the 
> {{-bootstrapStandby}} option is used. This option is often used during 
> upgrades, where the throttling could potentially slow down the entire 
> workflow. The request here is to ensure {{-bootstrapStandby}} is unaffected 
> by this bandwidth setting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9264) Minor cleanup of operations on FsVolumeList#volumes

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971999#comment-14971999
 ] 

Hudson commented on HDFS-9264:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #589 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/589/])
HDFS-9264. Minor cleanup of operations on FsVolumeList#volumes.  (Walter (lei: 
rev 533a2be5ac7c7f0473fdd24d6201582d08964e21)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Minor cleanup of operations on FsVolumeList#volumes
> ---
>
> Key: HDFS-9264
> URL: https://issues.apache.org/jira/browse/HDFS-9264
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Walter Su
>Assignee: Walter Su
>Priority: Minor
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9264.01.patch, HDFS-9264.02.patch
>
>
> We can use {{CopyOnWriteArrayList}} to simplify the operations on 
> {{FsVolumeList#volumes}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972000#comment-14972000
 ] 

Hudson commented on HDFS-9184:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #589 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/589/])
HDFS-9184. Logging HDFS operation's caller context into audit logs. (jitendra: 
rev 600ad7bf4104bcaeec00a4089d59bb1fdf423299)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/HdfsAuditLogger.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAuditLogger.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ProtoUtil.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAuditLogAtDebug.java
* hadoop-common-project/hadoop-common/src/main/proto/RpcHeader.proto
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/CallerContext.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9184.000.patch, HDFS-9184.001.patch, 
> HDFS-9184.002.patch, HDFS-9184.003.patch, HDFS-9184.004.patch, 
> HDFS-9184.005.patch, HDFS-9184.006.patch, HDFS-9184.007.patch, 
> HDFS-9184.008.patch, HDFS-9184.009.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper-level job issues it. The upper-level callers may be specific 
> Oozie tasks, MR jobs, and Hive queries. One scenario is that when the 
> namenode (NN) is abused/spammed, the operator may want to know immediately 
> which MR job should be blamed so that she can kill it. To this end, the 
> caller context contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which is 
> obviously not enough. It's common that the same user issues multiple jobs at 
> the same time. Even for a single top-level task, tracking back to a specific 
> caller in a chain of operations of the whole workflow (e.g. Oozie -> Hive -> 
> Yarn) is hard, if not impossible.
> 2. HDFS integrated {{htrace}} support for providing tracing information 
> across multiple layers. Spans are created in many places, interconnected 
> like a tree structure, which relies on offline analysis across RPC 
> boundaries. For this use case, {{htrace}} has to be enabled at a 100% 
> sampling rate, which introduces significant overhead. Moreover, passing 
> additional information (via annotations) other than the span id from the 
> root of the tree to a leaf is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there 
> is some related discussion on this topic. The final patch implemented the 
> tracking id as a part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, kerberos 
> authenticated connections and insecure connections don't have tokens. 
> [HADOOP-8779] proposes to use tokens in all scenarios, but that might mean 
> changes to several upstream projects and is a major change in their security 
> implementation.
> We propose another approach to address this problem. We also treat the HDFS 
> audit log as a good place for after-the-fact root cause analysis. We propose 
> to put the caller id (e.g. Hive query id) in threadlocals. Specifically, on 
> the client side the threadlocal object is passed to the NN as an optional 
> part of the RPC header, while on the server side the NN retrieves it from 
> the header and puts it into the {{Handler}}'s threadlocals. Finally, in 
> {{FSNamesystem}}, the HDFS audit logger will record the caller context for 
> each operation. In this way, the existing code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller context. 
> The client may choose to provide its signature along with the caller id. The 
> operator may need to validate the signature at the time of offline analysis; 
> the NN is not responsible for validating the signature online.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9279) Decomissioned capacity should not be considered for configured/used capacity

2015-10-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971986#comment-14971986
 ] 

Hadoop QA commented on HDFS-9279:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768373/HDFS-9279-v1.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 15eb84b |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13169/console |


This message was automatically generated.

> Decomissioned capacity should not be considered for configured/used capacity
> 
>
> Key: HDFS-9279
> URL: https://issues.apache.org/jira/browse/HDFS-9279
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.1
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: HDFS-9279-v1.patch
>
>
> The capacity of a decommissioned node is being accounted for in the 
> configured and used capacity metrics. This gives an incorrect perception of 
> cluster usage.
> Once a node is decommissioned, its capacity should be treated like that of a 
> dead node.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9252) Change TestFileTruncate to use FsDatasetTestUtils to get block file size and genstamp.

2015-10-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971977#comment-14971977
 ] 

Hadoop QA commented on HDFS-9252:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  28m  4s | Findbugs (version ) appears to 
be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 3 new or modified test files. |
| {color:green}+1{color} | javac |  14m  7s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  19m 12s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   1m  5s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 56s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   2m 43s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   1m  1s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   4m 22s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   5m 40s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  66m 20s | Tests failed in hadoop-hdfs. |
| | | 143m 35s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | 
hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication |
|   | hadoop.hdfs.server.namenode.TestFSNamesystemMBean |
|   | hadoop.hdfs.server.namenode.TestParallelImageWrite |
|   | hadoop.hdfs.TestRecoverStripedFile |
|   | hadoop.hdfs.server.namenode.TestFSImageWithAcl |
|   | hadoop.hdfs.TestReplaceDatanodeOnFailure |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.TestEncryptionZones |
| Timed out tests | org.apache.hadoop.hdfs.server.namenode.TestCreateEditsLog |
|   | org.apache.hadoop.hdfs.server.namenode.TestProcessCorruptBlocks |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12767632/HDFS-9252.02.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 600ad7b |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13163/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13163/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13163/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13163/console |


This message was automatically generated.

> Change TestFileTruncate to use FsDatasetTestUtils to get block file size and 
> genstamp.
> --
>
> Key: HDFS-9252
> URL: https://issues.apache.org/jira/browse/HDFS-9252
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-9252.00.patch, HDFS-9252.01.patch, 
> HDFS-9252.02.patch
>
>
> {{TestFileTruncate}} verifies block size and genstamp by directly accessing 
> the local filesystem, e.g.:
> {code}
> assertTrue(cluster.getBlockMetadataFile(dn0,
>     newBlock.getBlock()).getName().endsWith(
>     newBlock.getBlock().getGenerationStamp() + ".meta"));
> {code}
> Let's abstract the fsdataset-specific logic behind {{FsDatasetTestUtils}}.
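
A possible shape of that abstraction, as a hedged sketch (the interface and 
method names below are assumptions for illustration, not the committed 
{{FsDatasetTestUtils}} API):
{code}
import java.io.IOException;
import org.apache.hadoop.hdfs.protocol.ExtendedBlock;

// Hypothetical sketch: tests query the dataset for block metadata
// instead of poking at files on the local filesystem directly.
interface BlockMetadataAccessor {
  /** Length of the on-disk data for the given block. */
  long getStoredDataLength(ExtendedBlock block) throws IOException;
  /** Generation stamp recorded in the on-disk metadata. */
  long getStoredGenerationStamp(ExtendedBlock block) throws IOException;
}

// A test could then assert on dataset state without filesystem details:
//   assertEquals(expectedGenStamp,
//       accessor.getStoredGenerationStamp(newBlock.getBlock()));
{code}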



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

2015-10-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971979#comment-14971979
 ] 

Hadoop QA commented on HDFS-9260:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768368/HDFS-7435.005.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 15eb84b |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13167/console |


This message was automatically generated.

> Improve performance and GC friendliness of startup and FBRs
> ---
>
> Key: HDFS-9260
> URL: https://issues.apache.org/jira/browse/HDFS-9260
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode, performance
>Affects Versions: 2.7.1
>Reporter: Staffan Friberg
>Assignee: Staffan Friberg
> Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch
>
>
> This patch changes the data structures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC-friendly handling of full 
> block reports.
> I would like to hear people's feedback on this change, and would also 
> appreciate some help investigating/understanding a few outstanding issues 
> if we are interested in moving forward with this.
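
To make the idea concrete, here is a hedged sketch (not the patch itself) of 
why sorted replica lists help: a full block report sorted by block ID can be 
reconciled against the stored list in a single linear merge pass, with no 
per-block hash lookups.
{code}
// Illustrative merge-style reconciliation over two sorted block-ID arrays.
public class ReportMergeDemo {
  static void reconcile(long[] storedIds, long[] reportedIds) {
    int i = 0, j = 0;
    while (i < storedIds.length && j < reportedIds.length) {
      if (storedIds[i] == reportedIds[j]) {
        i++; j++;                                          // replica confirmed
      } else if (storedIds[i] < reportedIds[j]) {
        System.out.println("missing: " + storedIds[i++]);  // stored, not reported
      } else {
        System.out.println("added: " + reportedIds[j++]);  // reported, not stored
      }
    }
    while (i < storedIds.length) {
      System.out.println("missing: " + storedIds[i++]);
    }
    while (j < reportedIds.length) {
      System.out.println("added: " + reportedIds[j++]);
    }
  }
}
{code}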



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9264) Minor cleanup of operations on FsVolumeList#volumes

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971967#comment-14971967
 ] 

Hudson commented on HDFS-9264:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #576 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/576/])
HDFS-9264. Minor cleanup of operations on FsVolumeList#volumes.  (Walter (lei: 
rev 533a2be5ac7c7f0473fdd24d6201582d08964e21)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Minor cleanup of operations on FsVolumeList#volumes
> ---
>
> Key: HDFS-9264
> URL: https://issues.apache.org/jira/browse/HDFS-9264
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Walter Su
>Assignee: Walter Su
>Priority: Minor
> Fix For: 3.0.0, 2.8.0
>
> Attachments: HDFS-9264.01.patch, HDFS-9264.02.patch
>
>
> We can use {{CopyOnWriteArrayList}} to simplify the operations on 
> {{FsVolumeList#volumes}}.
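
For readers unfamiliar with {{CopyOnWriteArrayList}}, a small self-contained 
illustration of the simplification (not the actual {{FsVolumeList}} code): 
iteration sees an immutable snapshot, so no explicit locking or manual array 
copying is needed around readers.
{code}
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class VolumeListDemo {
  private final List<String> volumes = new CopyOnWriteArrayList<>();

  void addVolume(String v)    { volumes.add(v); }     // copies the backing array
  void removeVolume(String v) { volumes.remove(v); }  // copies the backing array

  int countVolumes() {
    int n = 0;
    for (String v : volumes) {  // iterates a consistent snapshot,
      n++;                      // safe against concurrent add/remove
    }
    return n;
  }
}
{code}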



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9184) Logging HDFS operation's caller context into audit logs

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971968#comment-14971968
 ] 

Hudson commented on HDFS-9184:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #576 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/576/])
HDFS-9184. Logging HDFS operation's caller context into audit logs. (jitendra: 
rev 600ad7bf4104bcaeec00a4089d59bb1fdf423299)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java
* hadoop-common-project/hadoop-common/src/main/proto/RpcHeader.proto
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAuditLogAtDebug.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestAuditLogger.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/CallerContext.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/HdfsAuditLogger.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ProtoUtil.java


> Logging HDFS operation's caller context into audit logs
> ---
>
> Key: HDFS-9184
> URL: https://issues.apache.org/jira/browse/HDFS-9184
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9184.000.patch, HDFS-9184.001.patch, 
> HDFS-9184.002.patch, HDFS-9184.003.patch, HDFS-9184.004.patch, 
> HDFS-9184.005.patch, HDFS-9184.006.patch, HDFS-9184.007.patch, 
> HDFS-9184.008.patch, HDFS-9184.009.patch
>
>
> For a given HDFS operation (e.g. delete file), it's very helpful to track 
> which upper-level job issued it. The upper-level callers may be specific 
> Oozie tasks, MR jobs, or Hive queries. One scenario is that when the 
> namenode (NN) is abused/spammed, the operator may want to know immediately 
> which MR job should be blamed so that she can kill it. To this end, the 
> caller context contains at least the application-dependent "tracking id".
> There are several existing techniques that may be related to this problem.
> 1. Currently the HDFS audit log tracks the user of the operation, which is 
> obviously not enough. It's common for the same user to issue multiple jobs 
> at the same time. Even for a single top-level task, tracking back to a 
> specific caller in a chain of operations of the whole workflow (e.g. Oozie 
> -> Hive -> Yarn) is hard, if not impossible.
> 2. HDFS integrates {{htrace}} support for providing tracing information 
> across multiple layers. Spans are created in many places, interconnected 
> like a tree structure, and rely on offline analysis across RPC boundaries. 
> For this use case, {{htrace}} has to be enabled at a 100% sampling rate, 
> which introduces significant overhead. Moreover, passing additional 
> information (via annotations) other than the span id from the root of the 
> tree to a leaf is significant additional work.
> 3. In [HDFS-4680 | https://issues.apache.org/jira/browse/HDFS-4680], there 
> is some related discussion on this topic. The final patch implemented the 
> tracking id as part of the delegation token. This protects the tracking 
> information from being changed or impersonated. However, 
> Kerberos-authenticated connections or insecure connections don't have 
> tokens. [HADOOP-8779] proposes to use tokens in all the scenarios, but 
> that might mean changes to several upstream projects and is a major change 
> in their security implementation.
> We propose another approach to address this problem. We also treat the 
> HDFS audit log as a good place for after-the-fact root cause analysis. We 
> propose to put the caller id (e.g. Hive query id) in threadlocals. 
> Specifically, on the client side the threadlocal object is passed to the 
> NN as part of the RPC header (optional), while on the server side the NN 
> retrieves it from the header and puts it into the {{Handler}}'s 
> threadlocals. Finally, in {{FSNamesystem}}, the HDFS audit logger will 
> record the caller context for each operation. In this way, the existing 
> code is not affected.
> It is still challenging to keep a "lying" client from abusing the caller 
> context. Our proposal is to add a {{signature}} field to the caller 
> context. The client may choose to provide its signature along with the 
> caller id. The operator may need to validate the signature at the time of 
> offline analysis. The NN is not responsible for validating the signature 
> online.
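
A minimal sketch of the thread-local pattern described above (the class and 
method names are illustrative, not the committed {{CallerContext}} API):
{code}
// Illustrative only -- not the committed org.apache.hadoop.ipc.CallerContext.
public final class CallerCtx {
  private static final ThreadLocal<String> CTX = new ThreadLocal<>();

  public static void set(String ctx) { CTX.set(ctx); }     // client, or RPC handler
  public static String get()         { return CTX.get(); } // audit logger
  public static void clear()         { CTX.remove(); }     // when the request ends
}

// Client side: CallerCtx.set("hive_query_id=abc123") before issuing the RPC;
// the value rides along in an optional RPC header field. Server side: the
// Handler restores it into its own thread-local, and the audit logger
// appends CallerCtx.get() to each audit record.
{code}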



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8808) dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby

2015-10-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971965#comment-14971965
 ] 

Hudson commented on HDFS-8808:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #576 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/576/])
HDFS-8808. dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby (zhz: rev 
ab3c4cff4af338caaa23be0ec383fc1fe473714f)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/TransferFsImage.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ImageServlet.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/Checkpointer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCheckpoint.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestBootstrapStandby.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/BootstrapStandby.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> dfs.image.transfer.bandwidthPerSec should not apply to -bootstrapStandby
> 
>
> Key: HDFS-8808
> URL: https://issues.apache.org/jira/browse/HDFS-8808
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Gautam Gopalakrishnan
>Assignee: Zhe Zhang
> Fix For: 2.8.0
>
> Attachments: HDFS-8808-00.patch, HDFS-8808-01.patch, 
> HDFS-8808-02.patch, HDFS-8808-03.patch, HDFS-8808.04.patch
>
>
> The parameter {{dfs.image.transfer.bandwidthPerSec}} can be used to limit the 
> speed with which the fsimage is copied between the namenodes during regular 
> use. However, as a side effect, this also limits transfers when the 
> {{-bootstrapStandby}} option is used. This option is often used during 
> upgrades, where the throttling could potentially slow down the entire 
> workflow. The request here is to ensure {{-bootstrapStandby}} is unaffected 
> by this bandwidth setting.
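
One way such a change could look, as a hedged sketch (it reuses the existing 
{{DataTransferThrottler}} helper; the method and flag names are assumptions, 
not the committed patch): bootstrap transfers get a null, i.e. unthrottled, 
throttler while regular checkpoint transfers keep the configured cap.
{code}
import org.apache.hadoop.hdfs.util.DataTransferThrottler;

public class ImageTransferDemo {
  // Illustrative: null means "no throttling" on the transfer path.
  static DataTransferThrottler imageThrottler(boolean forBootstrapStandby,
                                              long bandwidthPerSec) {
    if (forBootstrapStandby || bandwidthPerSec <= 0) {
      return null;  // bootstrapStandby transfers run at full speed
    }
    return new DataTransferThrottler(bandwidthPerSec);
  }
}
{code}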



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-9290) DFSClient#callAppend() is not backward compatible for slightly older NameNodes

2015-10-23 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971897#comment-14971897
 ] 

Kihwal Lee edited comment on HDFS-9290 at 10/23/15 9:56 PM:


The patch looks good +1.


was (Author: kihwal):
The patch looks goof +1.

> DFSClient#callAppend() is not backward compatible for slightly older NameNodes
> --
>
> Key: HDFS-9290
> URL: https://issues.apache.org/jira/browse/HDFS-9290
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Blocker
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-9290.001.patch, HDFS-9290.002.patch
>
>
> HDFS-7210 combined 2 RPC calls used at file append into a single one. 
> Specifically, {{getFileInfo()}} is combined with {{append()}}. While backward 
> compatibility for older clients is handled by the new NameNode (via 
> protobuf), a newer client's {{append()}} call does not work with older 
> NameNodes. One will run into an exception like the following:
> {code:java}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.isLazyPersist(DFSOutputStream.java:1741)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.getChecksum4Compute(DFSOutputStream.java:1550)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1560)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1670)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForAppend(DFSOutputStream.java:1717)
> at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1861)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1922)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1892)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:340)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:336)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:336)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:318)
> at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1164)
> {code}
> The cause is that the new client code is expecting both the last block and 
> file info in the same RPC but the old NameNode only replied with the first. 
> The exception itself does not reflect this and one will have to look at the 
> HDFS source code to really understand what happened.
> We can have the client detect it's talking to an old NameNode and send an 
> extra {{getFileInfo()}} RPC. Or we should improve the exception being thrown 
> to accurately reflect the cause of failure.
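
A hedged sketch of the first option (illustrative, not the exact patch; the 
type and variable names are assumptions based on the description): if the 
combined append response comes back without the file status, the client is 
talking to a pre-HDFS-7210 NameNode and falls back to the extra RPC.
{code}
// Illustrative client-side fallback; names are assumptions.
static HdfsFileStatus resolveStatus(LastBlockWithStatus appendResponse,
                                    ClientProtocol namenode,
                                    String src) throws IOException {
  HdfsFileStatus status = appendResponse.getFileStatus();
  if (status == null) {
    // Older NameNode: it returned only the last block, so fetch the file
    // info with the extra RPC the combined call was meant to avoid.
    status = namenode.getFileInfo(src);
  }
  return status;
}
{code}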



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7284) Add more debug info to BlockInfoUnderConstruction#setGenerationStampAndVerifyReplicas

2015-10-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971946#comment-14971946
 ] 

Hadoop QA commented on HDFS-7284:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  25m  7s | Findbugs (version 3.0.0) 
appears to be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |  12m 57s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  14m 22s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 38s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   3m  4s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   2m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 46s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   6m 32s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   4m 16s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests |  56m 48s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests |   0m 33s | Tests passed in 
hadoop-hdfs-client. |
| | | 127m 44s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
| Timed out tests | 
org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager |
|   | 
org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
|   | org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |
|   | org.apache.hadoop.hdfs.server.mover.TestStorageMover |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12768378/HDFS-7284.005.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 600ad7b |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13162/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13162/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| hadoop-hdfs-client test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13162/artifact/patchprocess/testrun_hadoop-hdfs-client.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13162/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/13162/console |


This message was automatically generated.

> Add more debug info to 
> BlockInfoUnderConstruction#setGenerationStampAndVerifyReplicas
> -
>
> Key: HDFS-7284
> URL: https://issues.apache.org/jira/browse/HDFS-7284
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.5.1
>Reporter: Hu Liu,
>Assignee: Wei-Chiu Chuang
>  Labels: supportability
> Attachments: HDFS-7284.001.patch, HDFS-7284.002.patch, 
> HDFS-7284.003.patch, HDFS-7284.004.patch, HDFS-7284.005.patch
>
>
> When I was looking at a replica loss issue, I got the following info from 
> the log:
> {code}
> 2014-10-13 01:54:53,104 INFO BlockStateChange: BLOCK* Removing stale replica 
> from location x.x.x.x
> {code}
> I could only tell that a replica was removed, but not which block or its 
> generation timestamp. I need to be able to get the id and generation stamp 
> of the block from the log file.
> So it's better to add more info, including the block id and generation 
> stamp, to the code snippet:
> {code}
> for (ReplicaUnderConstruction r : replicas) {
>   if (genStamp != r.getGenerationStamp()) {
>     r.getExpectedLocation().removeBlock(this);
>     NameNode.blockStateChangeLog.info("BLOCK* Removing stale replica "
>         + "from location: " + r.getExpectedLocation());
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9231) fsck doesn't explicitly list when Bad Replicas/Blocks are in a snapshot

2015-10-23 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971947#comment-14971947
 ] 

Xiao Chen commented on HDFS-9231:
-

Per the offline talk, we can make {{FSDirSnapshotOp#getSnapshotFiles}} more 
elegant by adding an interface on {{Snapshot}} to get the full snapshot root 
path, hence eliminating the need to pass in the snapshottableDir. Attached 
patch 07 to do this. Thanks Yongjun.
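
A hedged sketch of the accessor's shape (the method and field names are 
assumptions, not the committed change): {{Snapshot}} builds the full snapshot 
root path itself, e.g. {{/dir/.snapshot/s1}}.
{code}
// Illustrative only; assumes the snapshot knows its snapshottable dir.
// Path.SEPARATOR is "/" and HdfsConstants.DOT_SNAPSHOT_DIR is ".snapshot".
public String getFullSnapshotRootPath() {
  return snapshottableDirPath + Path.SEPARATOR
      + HdfsConstants.DOT_SNAPSHOT_DIR + Path.SEPARATOR + getName();
}
{code}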

> fsck doesn't explicitly list when Bad Replicas/Blocks are in a snapshot
> ---
>
> Key: HDFS-9231
> URL: https://issues.apache.org/jira/browse/HDFS-9231
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: snapshots
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-9231.001.patch, HDFS-9231.002.patch, 
> HDFS-9231.003.patch, HDFS-9231.004.patch, HDFS-9231.005.patch, 
> HDFS-9231.006.patch, HDFS-9231.007.patch
>
>
> Currently for snapshot files, {{fsck -list-corruptfileblocks}} shows corrupt 
> blocks with the original file dir instead of the snapshot dir, and {{fsck 
> -list-corruptfileblocks -includeSnapshots}} behaves the same.
> This can be confusing because even when the original file is deleted, fsck 
> will still show that deleted file as corrupted, although what's actually 
> corrupted is the snapshot. 
> As a side note, {{fsck -files -includeSnapshots}} shows the snapshot dirs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9231) fsck doesn't explicitly list when Bad Replicas/Blocks are in a snapshot

2015-10-23 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-9231:

Status: Patch Available  (was: Open)

> fsck doesn't explicitly list when Bad Replicas/Blocks are in a snapshot
> ---
>
> Key: HDFS-9231
> URL: https://issues.apache.org/jira/browse/HDFS-9231
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: snapshots
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-9231.001.patch, HDFS-9231.002.patch, 
> HDFS-9231.003.patch, HDFS-9231.004.patch, HDFS-9231.005.patch, 
> HDFS-9231.006.patch, HDFS-9231.007.patch
>
>
> Currently for snapshot files, {{fsck -list-corruptfileblocks}} shows corrupt 
> blocks with the original file dir instead of the snapshot dir, and {{fsck 
> -list-corruptfileblocks -includeSnapshots}} behaves the same.
> This can be confusing because even when the original file is deleted, fsck 
> will still show that deleted file as corrupted, although what's actually 
> corrupted is the snapshot. 
> As a side note, {{fsck -files -includeSnapshots}} shows the snapshot dirs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9301) HDFS clients can't construct HdfsConfiguration instances

2015-10-23 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-9301:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

I've committed the patch to trunk and branch-2. Thanks [~liuml07] for the 
contribution and Steve for the reviews.

> HDFS clients can't construct HdfsConfiguration instances
> 
>
> Key: HDFS-9301
> URL: https://issues.apache.org/jira/browse/HDFS-9301
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Steve Loughran
>Assignee: Mingliang Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-9241.000.patch, HDFS-9241.001.patch, 
> HDFS-9241.002.patch, HDFS-9241.003.patch, HDFS-9241.004.patch, 
> HDFS-9241.005.patch
>
>
> The changes for the hdfs client classpath make instantiating 
> {{HdfsConfiguration}} from the client impossible; it only lives server-side. 
> This breaks any app which creates one.
> I know people will look at the {{@Private}} tag and say "don't do that then", 
> but it's worth considering precisely why I, at least, do this: it's the only 
> way to guarantee that the hdfs-default and hdfs-site resources get on the 
> classpath, including all the security settings. It's precisely the use case 
> which {{HdfsConfigurationLoader.init();}} offers internally to the hdfs code.
> What am I meant to do now? 
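
For illustration, the guarantee being relied on can be approximated 
client-side with the public {{Configuration.addDefaultResource}} API; this is 
a workaround sketch under the stated assumptions, not the committed fix:
{code}
import org.apache.hadoop.conf.Configuration;

// Sketch of a client-side stand-in: register the hdfs resources as
// defaults so every Configuration created afterwards picks up
// hdfs-default.xml / hdfs-site.xml, security settings included.
public final class HdfsConfLoad {
  static {
    Configuration.addDefaultResource("hdfs-default.xml");
    Configuration.addDefaultResource("hdfs-site.xml");
  }

  public static Configuration create() {
    return new Configuration();  // includes the hdfs resources
  }
}
{code}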



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9231) fsck doesn't explicitly list when Bad Replicas/Blocks are in a snapshot

2015-10-23 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-9231:

Attachment: HDFS-9231.007.patch

> fsck doesn't explicitly list when Bad Replicas/Blocks are in a snapshot
> ---
>
> Key: HDFS-9231
> URL: https://issues.apache.org/jira/browse/HDFS-9231
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: snapshots
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-9231.001.patch, HDFS-9231.002.patch, 
> HDFS-9231.003.patch, HDFS-9231.004.patch, HDFS-9231.005.patch, 
> HDFS-9231.006.patch, HDFS-9231.007.patch
>
>
> Currently for snapshot files, {{fsck -list-corruptfileblocks}} shows corrupt 
> blocks with the original file dir instead of the snapshot dir, and {{fsck 
> -list-corruptfileblocks -includeSnapshots}} behaves the same.
> This can be confusing because even when the original file is deleted, fsck 
> will still show that deleted file as corrupted, although what's actually 
> corrupted is the snapshot. 
> As a side note, {{fsck -files -includeSnapshots}} shows the snapshot dirs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7210) Avoid two separate RPC's namenode.append() and namenode.getFileInfo() for an append call from DFSClient

2015-10-23 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-7210:
-
Hadoop Flags: Incompatible change,Reviewed  (was: Reviewed)

> Avoid two separate RPC's namenode.append() and namenode.getFileInfo() for an 
> append call from DFSClient
> ---
>
> Key: HDFS-7210
> URL: https://issues.apache.org/jira/browse/HDFS-7210
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, namenode
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
> Fix For: 2.7.0
>
> Attachments: HDFS-7210-001.patch, HDFS-7210-002.patch, 
> HDFS-7210-003.patch, HDFS-7210-004.patch, HDFS-7210-005.patch, 
> HDFS-7210-006.patch
>
>
> Currently DFSClient does 2 RPCs to the namenode for an append operation: 
> {{append()}} for re-opening the file and getting the last block, and 
> {{getFileInfo()}} for getting the {{HdfsFileStatus}}.
> If we can combine the results of these 2 calls into one RPC, then it can 
> reduce load on the NameNode.
> For backward compatibility we need to keep the existing {{append()}} call 
> as is.
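
A hedged sketch of the combined-response idea (the type and field names are 
assumptions; the committed change may differ): one RPC returns both what 
{{append()}} and {{getFileInfo()}} used to return, so the client makes a 
single round trip to the NameNode.
{code}
import org.apache.hadoop.hdfs.protocol.HdfsFileStatus;
import org.apache.hadoop.hdfs.protocol.LocatedBlock;

// Illustrative holder for the combined append response.
public class AppendResult {
  private final LocatedBlock lastBlock;     // previously from append()
  private final HdfsFileStatus fileStatus;  // previously from getFileInfo()

  public AppendResult(LocatedBlock lastBlock, HdfsFileStatus fileStatus) {
    this.lastBlock = lastBlock;
    this.fileStatus = fileStatus;
  }
  public LocatedBlock getLastBlock()    { return lastBlock; }
  public HdfsFileStatus getFileStatus() { return fileStatus; }
}
{code}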



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Moved] (HDFS-9301) HDFS clients can't construct HdfsConfiguration instances

2015-10-23 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai moved HADOOP-12506 to HDFS-9301:
---

Key: HDFS-9301  (was: HADOOP-12506)
Project: Hadoop HDFS  (was: Hadoop Common)

> HDFS clients can't construct HdfsConfiguration instances
> 
>
> Key: HDFS-9301
> URL: https://issues.apache.org/jira/browse/HDFS-9301
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Steve Loughran
>Assignee: Mingliang Liu
> Attachments: HDFS-9241.000.patch, HDFS-9241.001.patch, 
> HDFS-9241.002.patch, HDFS-9241.003.patch, HDFS-9241.004.patch, 
> HDFS-9241.005.patch
>
>
> The changes for the hdfs client classpath make instantiating 
> {{HdfsConfiguration}} from the client impossible; it only lives server-side. 
> This breaks any app which creates one.
> I know people will look at the {{@Private}} tag and say "don't do that then", 
> but it's worth considering precisely why I, at least, do this: it's the only 
> way to guarantee that the hdfs-default and hdfs-site resources get on the 
> classpath, including all the security settings. It's precisely the use case 
> which {{HdfsConfigurationLoader.init();}} offers internally to the hdfs code.
> What am I meant to do now? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9290) DFSClient#callAppend() is not backward compatible for slightly older NameNodes

2015-10-23 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-9290:
-
   Resolution: Fixed
Fix Version/s: 2.7.2
   3.0.0
   Status: Resolved  (was: Patch Available)

[~twu], thanks for reporting this and providing a fix. I've committed this to 
trunk, branch-2 and branch-2.7.

> DFSClient#callAppend() is not backward compatible for slightly older NameNodes
> --
>
> Key: HDFS-9290
> URL: https://issues.apache.org/jira/browse/HDFS-9290
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Blocker
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-9290.001.patch, HDFS-9290.002.patch
>
>
> HDFS-7210 combined 2 RPC calls used at file append into a single one. 
> Specifically, {{getFileInfo()}} is combined with {{append()}}. While backward 
> compatibility for older clients is handled by the new NameNode (via 
> protobuf), a newer client's {{append()}} call does not work with older 
> NameNodes. One will run into an exception like the following:
> {code:java}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.isLazyPersist(DFSOutputStream.java:1741)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.getChecksum4Compute(DFSOutputStream.java:1550)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1560)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1670)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForAppend(DFSOutputStream.java:1717)
> at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1861)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1922)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1892)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:340)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:336)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:336)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:318)
> at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1164)
> {code}
> The cause is that the new client code is expecting both the last block and 
> file info in the same RPC but the old NameNode only replied with the first. 
> The exception itself does not reflect this and one will have to look at the 
> HDFS source code to really understand what happened.
> We can have the client detect it's talking to an old NameNode and send an 
> extra {{getFileInfo()}} RPC. Or we should improve the exception being thrown 
> to accurately reflect the cause of failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9290) DFSClient#callAppend() is not backward compatible for slightly older NameNodes

2015-10-23 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-9290:
-
Hadoop Flags: Reviewed

> DFSClient#callAppend() is not backward compatible for slightly older NameNodes
> --
>
> Key: HDFS-9290
> URL: https://issues.apache.org/jira/browse/HDFS-9290
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Blocker
> Fix For: 3.0.0, 2.7.2
>
> Attachments: HDFS-9290.001.patch, HDFS-9290.002.patch
>
>
> HDFS-7210 combined 2 RPC calls used at file append into a single one. 
> Specifically, {{getFileInfo()}} is combined with {{append()}}. While backward 
> compatibility for older clients is handled by the new NameNode (via 
> protobuf), a newer client's {{append()}} call does not work with older 
> NameNodes. One will run into an exception like the following:
> {code:java}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.isLazyPersist(DFSOutputStream.java:1741)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.getChecksum4Compute(DFSOutputStream.java:1550)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1560)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1670)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForAppend(DFSOutputStream.java:1717)
> at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1861)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1922)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1892)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:340)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:336)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:336)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:318)
> at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1164)
> {code}
> The cause is that the new client code is expecting both the last block and 
> file info in the same RPC but the old NameNode only replied with the first. 
> The exception itself does not reflect this and one will have to look at the 
> HDFS source code to really understand what happened.
> We can have the client detect it's talking to an old NameNode and send an 
> extra {{getFileInfo()}} RPC. Or we should improve the exception being thrown 
> to accurately reflect the cause of failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9231) fsck doesn't explicitly list when Bad Replicas/Blocks are in a snapshot

2015-10-23 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-9231:

Status: Open  (was: Patch Available)

> fsck doesn't explicitly list when Bad Replicas/Blocks are in a snapshot
> ---
>
> Key: HDFS-9231
> URL: https://issues.apache.org/jira/browse/HDFS-9231
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: snapshots
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-9231.001.patch, HDFS-9231.002.patch, 
> HDFS-9231.003.patch, HDFS-9231.004.patch, HDFS-9231.005.patch, 
> HDFS-9231.006.patch
>
>
> Currently for snapshot files, {{fsck -list-corruptfileblocks}} shows corrupt 
> blocks with the original file dir instead of the snapshot dir, and {{fsck 
> -list-corruptfileblocks -includeSnapshots}} behaves the same.
> This can be confusing because even when the original file is deleted, fsck 
> will still show that deleted file as corrupted, although what's actually 
> corrupted is the snapshot. 
> As a side note, {{fsck -files -includeSnapshots}} shows the snapshot dirs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9290) DFSClient#callAppend() is not backward compatible for slightly older NameNodes

2015-10-23 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14971897#comment-14971897
 ] 

Kihwal Lee commented on HDFS-9290:
--

The patch looks goof +1.

> DFSClient#callAppend() is not backward compatible for slightly older NameNodes
> --
>
> Key: HDFS-9290
> URL: https://issues.apache.org/jira/browse/HDFS-9290
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Tony Wu
>Assignee: Tony Wu
>Priority: Blocker
> Attachments: HDFS-9290.001.patch, HDFS-9290.002.patch
>
>
> HDFS-7210 combined 2 RPC calls used at file append into a single one. 
> Specifically, {{getFileInfo()}} is combined with {{append()}}. While backward 
> compatibility for older clients is handled by the new NameNode (via 
> protobuf), a newer client's {{append()}} call does not work with older 
> NameNodes. One will run into an exception like the following:
> {code:java}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.isLazyPersist(DFSOutputStream.java:1741)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.getChecksum4Compute(DFSOutputStream.java:1550)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1560)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1670)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForAppend(DFSOutputStream.java:1717)
> at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1861)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1922)
> at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1892)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:340)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:336)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:336)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:318)
> at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1164)
> {code}
> The cause is that the new client code is expecting both the last block and 
> file info in the same RPC but the old NameNode only replied with the first. 
> The exception itself does not reflect this and one will have to look at the 
> HDFS source code to really understand what happened.
> We can have the client detect it's talking to an old NameNode and send an 
> extra {{getFileInfo()}} RPC. Or we should improve the exception being thrown 
> to accurately reflect the cause of failure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

