date:20160825

[jira] [Commented] (HDFS-8901) Use ByteBuffer in striping positional read

2016-08-25 Thread SammiChen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-8901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438489#comment-15438489
 ] 

SammiChen commented on HDFS-8901:
-

Hi Zhe,

Thanks for your great effort to review the patch. Very good suggestions. Here 
is my thoughts. 

1. the piece of code will be moved to BlockReaderUtil, create a new readAll 
function for it. 
2. will be handled.
3. Yes. we are planning to add a ByteBuffer version read API
4. good question, I'm double checking the code. And will share my finding later.
5. will be handled.
6. Every StripingChunk will have either a ChunkByteBuffer or a ByteBuffer to 
track different buffer source. ChunkByteBuffer will be used when reusing user 
input buffer as the striping read buffer during positional read. And ByteBuffer 
will be used when read using stateful strip reader.
7. the memory copy is avoid in case chunk.useChunkBuffer is not true, that's an 
improvement.


> Use ByteBuffer in striping positional read
> --
>
> Key: HDFS-8901
> URL: https://issues.apache.org/jira/browse/HDFS-8901
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: SammiChen
> Attachments: HDFS-8901-v10.patch, HDFS-8901-v2.patch, 
> HDFS-8901-v3.patch, HDFS-8901-v4.patch, HDFS-8901-v5.patch, 
> HDFS-8901-v6.patch, HDFS-8901-v7.patch, HDFS-8901-v8.patch, 
> HDFS-8901-v9.patch, HDFS-8901.v11.patch, HDFS-8901.v12.patch, 
> HDFS-8901.v13.patch, HDFS-8901.v14.patch, initial-poc.patch
>
>
> Native erasure coder prefers to direct ByteBuffer for performance 
> consideration. To prepare for it, this change uses ByteBuffer through the 
> codes in implementing striping position read. It will also fix avoiding 
> unnecessary data copying between striping read chunk buffers and decode input 
> buffers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10803) TestBalancerWithMultipleNameNodes#testBalancing2OutOf3Blockpools fails intermittently due to no free space available

2016-08-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438482#comment-15438482
 ] 

Hadoop QA commented on HDFS-10803:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 89m 25s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}109m 37s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeUUID |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
| Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10803 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12825596/HDFS-10803.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 688ba967f82a 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 27c3b86 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16546/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16546/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16546/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> TestBalancerWithMultipleNameNodes#testBalancing2OutOf3Blockpools fails 
> intermittently due to no free space available
> 
>
> Key: HDFS-10803
>

[jira] [Updated] (HDFS-10803) TestBalancerWithMultipleNameNodes#testBalancing2OutOf3Blockpools fails intermittently due to no free space available

2016-08-25 Thread Yiqun Lin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-10803:
-
Attachment: HDFS-10803.001.patch

> TestBalancerWithMultipleNameNodes#testBalancing2OutOf3Blockpools fails 
> intermittently due to no free space available
> 
>
> Key: HDFS-10803
> URL: https://issues.apache.org/jira/browse/HDFS-10803
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
> Attachments: HDFS-10803.001.patch
>
>
> The test {{TestBalancerWithMultipleNameNodes#testBalancing2OutOf3Blockpools}} 
> fails intermittently. The stack 
> infos(https://builds.apache.org/job/PreCommit-HDFS-Build/16534/testReport/org.apache.hadoop.hdfs.server.balancer/TestBalancerWithMultipleNameNodes/testBalancing2OutOf3Blockpools/):
> {code}
> java.io.IOException: Creating block, no free space available
>   at 
> org.apache.hadoop.hdfs.server.datanode.SimulatedFSDataset$BInfo.(SimulatedFSDataset.java:151)
>   at 
> org.apache.hadoop.hdfs.server.datanode.SimulatedFSDataset.injectBlocks(SimulatedFSDataset.java:580)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.injectBlocks(MiniDFSCluster.java:2679)
>   at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.unevenDistribution(TestBalancerWithMultipleNameNodes.java:405)
>   at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.testBalancing2OutOf3Blockpools(TestBalancerWithMultipleNameNodes.java:516)
> {code}
> The error message means that the datanode's capacity has used up and there is 
> no other space to create a new file block. 
> I looked into the code, I found the main reason seemed that the 
> {{capacities}}  for cluster is not correctly constructed in the second 
> cluster startup before preparing to redistribute blocks in test.
> The related code:
> {code}
>   // Here we do redistribute blocks nNameNodes times for each node,
>   // we need to adjust the capacities. Otherwise it will cause the no 
>   // free space errors sometimes.
>   final MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
>   .nnTopology(MiniDFSNNTopology.simpleFederatedTopology(nNameNodes))
>   .numDataNodes(nDataNodes)
>   .racks(racks)
>   .simulatedCapacities(newCapacities)
>   .format(false)
>   .build();
>   LOG.info("UNEVEN 11");
> ...
> for(int n = 0; n < nNameNodes; n++) {
>   // redistribute blocks
>   final Block[][] blocksDN = TestBalancer.distributeBlocks(
>   blocks[n], s.replication, distributionPerNN);
> 
>   for(int d = 0; d < blocksDN.length; d++)
> cluster.injectBlocks(n, d, Arrays.asList(blocksDN[d]));
>   LOG.info("UNEVEN 13: n=" + n);
> }
> {code}
> And that means the totalUsed value has been increased as 
> {{nNameNodes*usedSpacePerNN}} rather than {{usedSpacePerNN}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-10803) TestBalancerWithMultipleNameNodes#testBalancing2OutOf3Blockpools fails intermittently due to no free space available

2016-08-25 Thread Yiqun Lin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-10803:
-
Status: Patch Available  (was: Open)

Attach a patch for fixing this.

> TestBalancerWithMultipleNameNodes#testBalancing2OutOf3Blockpools fails 
> intermittently due to no free space available
> 
>
> Key: HDFS-10803
> URL: https://issues.apache.org/jira/browse/HDFS-10803
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>
> The test {{TestBalancerWithMultipleNameNodes#testBalancing2OutOf3Blockpools}} 
> fails intermittently. The stack 
> infos(https://builds.apache.org/job/PreCommit-HDFS-Build/16534/testReport/org.apache.hadoop.hdfs.server.balancer/TestBalancerWithMultipleNameNodes/testBalancing2OutOf3Blockpools/):
> {code}
> java.io.IOException: Creating block, no free space available
>   at 
> org.apache.hadoop.hdfs.server.datanode.SimulatedFSDataset$BInfo.(SimulatedFSDataset.java:151)
>   at 
> org.apache.hadoop.hdfs.server.datanode.SimulatedFSDataset.injectBlocks(SimulatedFSDataset.java:580)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.injectBlocks(MiniDFSCluster.java:2679)
>   at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.unevenDistribution(TestBalancerWithMultipleNameNodes.java:405)
>   at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.testBalancing2OutOf3Blockpools(TestBalancerWithMultipleNameNodes.java:516)
> {code}
> The error message means that the datanode's capacity has used up and there is 
> no other space to create a new file block. 
> I looked into the code, I found the main reason seemed that the 
> {{capacities}}  for cluster is not correctly constructed in the second 
> cluster startup before preparing to redistribute blocks in test.
> The related code:
> {code}
>   // Here we do redistribute blocks nNameNodes times for each node,
>   // we need to adjust the capacities. Otherwise it will cause the no 
>   // free space errors sometimes.
>   final MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
>   .nnTopology(MiniDFSNNTopology.simpleFederatedTopology(nNameNodes))
>   .numDataNodes(nDataNodes)
>   .racks(racks)
>   .simulatedCapacities(newCapacities)
>   .format(false)
>   .build();
>   LOG.info("UNEVEN 11");
> ...
> for(int n = 0; n < nNameNodes; n++) {
>   // redistribute blocks
>   final Block[][] blocksDN = TestBalancer.distributeBlocks(
>   blocks[n], s.replication, distributionPerNN);
> 
>   for(int d = 0; d < blocksDN.length; d++)
> cluster.injectBlocks(n, d, Arrays.asList(blocksDN[d]));
>   LOG.info("UNEVEN 13: n=" + n);
> }
> {code}
> And that means the totalUsed value has been increased as 
> {{nNameNodes*usedSpacePerNN}} rather than {{usedSpacePerNN}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Created] (HDFS-10803) TestBalancerWithMultipleNameNodes#testBalancing2OutOf3Blockpools fails intermittently due to no free space available

2016-08-25 Thread Yiqun Lin (JIRA)

Yiqun Lin created HDFS-10803:


 Summary: 
TestBalancerWithMultipleNameNodes#testBalancing2OutOf3Blockpools fails 
intermittently due to no free space available
 Key: HDFS-10803
 URL: https://issues.apache.org/jira/browse/HDFS-10803
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Yiqun Lin
Assignee: Yiqun Lin


The test {{TestBalancerWithMultipleNameNodes#testBalancing2OutOf3Blockpools}} 
fails intermittently. The stack 
infos(https://builds.apache.org/job/PreCommit-HDFS-Build/16534/testReport/org.apache.hadoop.hdfs.server.balancer/TestBalancerWithMultipleNameNodes/testBalancing2OutOf3Blockpools/):
{code}
java.io.IOException: Creating block, no free space available
at 
org.apache.hadoop.hdfs.server.datanode.SimulatedFSDataset$BInfo.(SimulatedFSDataset.java:151)
at 
org.apache.hadoop.hdfs.server.datanode.SimulatedFSDataset.injectBlocks(SimulatedFSDataset.java:580)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.injectBlocks(MiniDFSCluster.java:2679)
at 
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.unevenDistribution(TestBalancerWithMultipleNameNodes.java:405)
at 
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes.testBalancing2OutOf3Blockpools(TestBalancerWithMultipleNameNodes.java:516)
{code}

The error message means that the datanode's capacity has used up and there is 
no other space to create a new file block. 

I looked into the code, I found the main reason seemed that the {{capacities}}  
for cluster is not correctly constructed in the second cluster startup before 
preparing to redistribute blocks in test.
The related code:
{code}
  // Here we do redistribute blocks nNameNodes times for each node,
  // we need to adjust the capacities. Otherwise it will cause the no 
  // free space errors sometimes.
  final MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
  .nnTopology(MiniDFSNNTopology.simpleFederatedTopology(nNameNodes))
  .numDataNodes(nDataNodes)
  .racks(racks)
  .simulatedCapacities(newCapacities)
  .format(false)
  .build();
  LOG.info("UNEVEN 11");
...
for(int n = 0; n < nNameNodes; n++) {
  // redistribute blocks
  final Block[][] blocksDN = TestBalancer.distributeBlocks(
  blocks[n], s.replication, distributionPerNN);

  for(int d = 0; d < blocksDN.length; d++)
cluster.injectBlocks(n, d, Arrays.asList(blocksDN[d]));

  LOG.info("UNEVEN 13: n=" + n);
}
{code}
And that means the totalUsed value has been increased as 
{{nNameNodes*usedSpacePerNN}} rather than {{usedSpacePerNN}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10795) Fix an error in ReaderStrategy#ByteBufferStrategy

2016-08-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438397#comment-15438397
 ] 

Hudson commented on HDFS-10795:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10350 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10350/])
HDFS-10795. Fix an error in ReaderStrategy#ByteBufferStrategy. (kai.zheng: rev 
f4a21d3abaa7c5a9f0a0d8417e81f7eaf3d1b29a)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/ReaderStrategy.java


> Fix an error in ReaderStrategy#ByteBufferStrategy
> -
>
> Key: HDFS-10795
> URL: https://issues.apache.org/jira/browse/HDFS-10795
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: SammiChen
>Assignee: SammiChen
> Fix For: 3.0.0-alpha1
>
> Attachments: HDFS-10795-v1.patch
>
>
> ReaderStrategy#ByteBufferStrategy's function {{readFromBlock}} allocate a 
> temp ByteBuffer, but not used. Refactor to make it more reasonable. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-10795) Fix an error in ReaderStrategy#ByteBufferStrategy

2016-08-25 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-10795:
-
   Resolution: Fixed
Fix Version/s: 3.0.0-alpha1
   Status: Resolved  (was: Patch Available)

Committed to 3.0.0-alpha1 and trunk branches. Thanks [~Sammi] for the 
contribution.

> Fix an error in ReaderStrategy#ByteBufferStrategy
> -
>
> Key: HDFS-10795
> URL: https://issues.apache.org/jira/browse/HDFS-10795
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: SammiChen
>Assignee: SammiChen
> Fix For: 3.0.0-alpha1
>
> Attachments: HDFS-10795-v1.patch
>
>
> ReaderStrategy#ByteBufferStrategy's function {{readFromBlock}} allocate a 
> temp ByteBuffer, but not used. Refactor to make it more reasonable. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-10795) Fix an error in ReaderStrategy#ByteBufferStrategy

2016-08-25 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-10795:
-
Hadoop Flags: Reviewed

> Fix an error in ReaderStrategy#ByteBufferStrategy
> -
>
> Key: HDFS-10795
> URL: https://issues.apache.org/jira/browse/HDFS-10795
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: SammiChen
>Assignee: SammiChen
> Fix For: 3.0.0-alpha1
>
> Attachments: HDFS-10795-v1.patch
>
>
> ReaderStrategy#ByteBufferStrategy's function {{readFromBlock}} allocate a 
> temp ByteBuffer, but not used. Refactor to make it more reasonable. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10795) Fix an error in ReaderStrategy#ByteBufferStrategy

2016-08-25 Thread Kai Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438368#comment-15438368
 ] 

Kai Zheng commented on HDFS-10795:
--

The patch LGTM and +1. 

> Fix an error in ReaderStrategy#ByteBufferStrategy
> -
>
> Key: HDFS-10795
> URL: https://issues.apache.org/jira/browse/HDFS-10795
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: SammiChen
>Assignee: SammiChen
> Attachments: HDFS-10795-v1.patch
>
>
> ReaderStrategy#ByteBufferStrategy's function {{readFromBlock}} allocate a 
> temp ByteBuffer, but not used. Refactor to make it more reasonable. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-10795) Fix an error in ReaderStrategy#ByteBufferStrategy

2016-08-25 Thread Kai Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-10795:
-
Summary: Fix an error in ReaderStrategy#ByteBufferStrategy  (was: Refactor 
ReaderStrategy#ByteBufferStrategy)

> Fix an error in ReaderStrategy#ByteBufferStrategy
> -
>
> Key: HDFS-10795
> URL: https://issues.apache.org/jira/browse/HDFS-10795
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: SammiChen
>Assignee: SammiChen
> Attachments: HDFS-10795-v1.patch
>
>
> ReaderStrategy#ByteBufferStrategy's function {{readFromBlock}} allocate a 
> temp ByteBuffer, but not used. Refactor to make it more reasonable. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10798) Make the threshold of reporting FSNamesystem lock contention configurable

2016-08-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438345#comment-15438345
 ] 

Hadoop QA commented on HDFS-10798:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 32s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 4 new + 581 unchanged - 1 fixed = 585 total (was 582) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 77m 32s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 97m 50s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations |
|   | hadoop.hdfs.qjournal.client.TestQuorumJournalManager |
|   | hadoop.hdfs.TestEncryptionZones |
|   | hadoop.hdfs.server.namenode.TestCacheDirectives |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10798 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12825572/HDFS-10789.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux 68ea9baea93d 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 81485db |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16545/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16545/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16545/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U:

[jira] [Commented] (HDFS-10754) libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, hdfs_chown, hdfs_chmod and hdfs_find.

2016-08-25 Thread James Clampffer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438333#comment-15438333
 ] 

James Clampffer commented on HDFS-10754:


Looks good to me, +1.  Thanks for taking a couple more passes.  Having these 
tools is great and as a bonus they give a good set of stuff to run in a 
regression test suite.

{code}
//SetOwner DOES NOT guarantee that the handler will only be called once 
at a time, so we DO need locking in handlerSetOwner.
{code}
Thanks for making note of this.  The default of 1 worker thread has made it 
very easy for race conditions to slip in.


There's some really nitpicky c++isms that aren't worth holding this up but 
might be worth using in the future.
{code}
std::string port = (uri.get_port()) ? std::to_string(uri.get_port().value()) : 
"";
{code}
Is a good candidate for optional::value_or to simplify things.  
uri.get_port.value() is already returning a string temporary, so you don't need 
to construct another rvalue, if you're doing it for clarity about the return 
type then I can't complain.  I haven't disassembled this specific example but 
IIRC an optimized build will get rid of the second constructor anyway (not like 
it's in the critical path either).
{code}
std::string port = uri.get_port().value_or("");
{code}

libhdfspp/tools/tools_common.cpp should really be 
libhdfspp/tools/tools_common.cc.  That'll get fixed soon enough by any other 
patches that deal with tools.

{code}
 SetOwnerState(const std::string & username_, const std::string & groupname_,
  const std::function & handler_,
  uint64_t request_counter_, bool find_is_done_)
  : username(username_),
// cutting some stuff out
status(),
lock() {
{code}
You don't need status and lock in the initializer list; the default constructor 
for member variables is called implicitly.  Again, if that's more a preference 
thing I'm fine with it (or maybe it's needed and I'm missing something).


> libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, 
> hdfs_chown, hdfs_chmod and hdfs_find.
> ---
>
> Key: HDFS-10754
> URL: https://issues.apache.org/jira/browse/HDFS-10754
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10754.HDFS-8707.000.patch, 
> HDFS-10754.HDFS-8707.001.patch, HDFS-10754.HDFS-8707.002.patch, 
> HDFS-10754.HDFS-8707.003.patch, HDFS-10754.HDFS-8707.004.patch, 
> HDFS-10754.HDFS-8707.005.patch, HDFS-10754.HDFS-8707.006.patch, 
> HDFS-10754.HDFS-8707.007.patch, HDFS-10754.HDFS-8707.008.patch, 
> HDFS-10754.HDFS-8707.009.patch, HDFS-10754.HDFS-8707.010.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-10798) Make the threshold of reporting FSNamesystem lock contention configurable

2016-08-25 Thread Zhe Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-10798:
-
Description: Currently {{FSNamesystem#WRITELOCK_REPORTING_THRESHOLD}} is 
set at 1 second. In a busy system a lower overhead might be desired. In other 
scenarios, more aggressive reporting might be desired. We should make the 
threshold configurable.  (was: Currently 
{{FSNamesystem#WRITELOCK_REPORTING_THRESHOLD}} is set at 1 second. In a busy 
system this might add too much overhead. We should make the threshold 
configurable.)

> Make the threshold of reporting FSNamesystem lock contention configurable
> -
>
> Key: HDFS-10798
> URL: https://issues.apache.org/jira/browse/HDFS-10798
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: logging, namenode
>Reporter: Zhe Zhang
>Assignee: Erik Krogen
>  Labels: newbie
> Attachments: HDFS-10789.001.patch
>
>
> Currently {{FSNamesystem#WRITELOCK_REPORTING_THRESHOLD}} is set at 1 second. 
> In a busy system a lower overhead might be desired. In other scenarios, more 
> aggressive reporting might be desired. We should make the threshold 
> configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10652) Add a unit test for HDFS-4660

2016-08-25 Thread Vinayakumar B (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438289#comment-15438289
 ] 

Vinayakumar B commented on HDFS-10652:
--

Thanks [~yzhangal] for finding the problem in test. 

Updated the test looks good to me. 
I think it could be better to change {{System.out.println(..)}} change to 
{{LOG.info()}} in test.
Rest all looks fine.
+1 once addressed.

Sorry I am currently in travel and not able to update the patch myself. Thanks 
for taking care of this.

> Add a unit test for HDFS-4660
> -
>
> Key: HDFS-10652
> URL: https://issues.apache.org/jira/browse/HDFS-10652
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs
>Reporter: Yongjun Zhang
>Assignee: Vinayakumar B
> Attachments: HDFS-10652-002.patch, HDFS-10652.001.patch, 
> HDFS-10652.003.patch, HDFS-10652.004.patch, HDFS-10652.005.patch, 
> HDFS-10652.006.patch, HDFS-10652.007.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10652) Add a unit test for HDFS-4660

2016-08-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438271#comment-15438271
 ] 

Hadoop QA commented on HDFS-10652:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 57m  2s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 76m 28s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSShell |
|   | hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10652 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12825562/HDFS-10652.007.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 58bdef09db21 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 
21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 81485db |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16544/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16544/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16544/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Add a unit test for HDFS-4660
> -
>
> Key: HDFS-10652
> URL: https://issues.apache.org/jira/browse/HDFS-10652
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs
>Reporter: Yongjun Zhang
>Assignee: Vinayakumar B
> Attachments:

[jira] [Commented] (HDFS-10748) TestFileTruncate#testTruncateWithDataNodesRestart runs sometimes timeout

2016-08-25 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438241#comment-15438241
 ] 

Yiqun Lin commented on HDFS-10748:
--

Thanks [~xyao] for the review and commit!

> TestFileTruncate#testTruncateWithDataNodesRestart runs sometimes timeout
> 
>
> Key: HDFS-10748
> URL: https://issues.apache.org/jira/browse/HDFS-10748
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Xiaoyu Yao
>Assignee: Yiqun Lin
> Fix For: 2.8.0
>
> Attachments: HDFS-10748.001.patch, HDFS-10748.002.patch
>
>
> This was fixed by HDFS-7886. But some recent [Jenkins 
> Results|https://builds.apache.org/job/PreCommit-HDFS-Build/16390/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt]
>  started seeing this again: 
> {code}
> Tests run: 18, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 172.025 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.server.namenode.TestFileTruncate
> testTruncateWithDataNodesRestart(org.apache.hadoop.hdfs.server.namenode.TestFileTruncate)
>   Time elapsed: 43.861 sec  <<< ERROR!
> java.util.concurrent.TimeoutException: Timed out waiting for 
> /test/testTruncateWithDataNodesRestart to reach 3 replicas
>   at 
> org.apache.hadoop.hdfs.DFSTestUtil.waitReplication(DFSTestUtil.java:751)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestFileTruncate.testTruncateWithDataNodesRestart(TestFileTruncate.java:704)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-10798) Make the threshold of reporting FSNamesystem lock contention configurable

2016-08-25 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-10798:
---
Attachment: HDFS-10789.001.patch

> Make the threshold of reporting FSNamesystem lock contention configurable
> -
>
> Key: HDFS-10798
> URL: https://issues.apache.org/jira/browse/HDFS-10798
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: logging, namenode
>Reporter: Zhe Zhang
>Assignee: Erik Krogen
>  Labels: newbie
> Attachments: HDFS-10789.001.patch
>
>
> Currently {{FSNamesystem#WRITELOCK_REPORTING_THRESHOLD}} is set at 1 second. 
> In a busy system this might add too much overhead. We should make the 
> threshold configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-10798) Make the threshold of reporting FSNamesystem lock contention configurable

2016-08-25 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-10798:
---
Status: Patch Available  (was: In Progress)

Add configuration 'dfs.namenode.write-lock.reporting.threshold.ms' to configure 
this value. 

> Make the threshold of reporting FSNamesystem lock contention configurable
> -
>
> Key: HDFS-10798
> URL: https://issues.apache.org/jira/browse/HDFS-10798
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: logging, namenode
>Reporter: Zhe Zhang
>Assignee: Erik Krogen
>  Labels: newbie
>
> Currently {{FSNamesystem#WRITELOCK_REPORTING_THRESHOLD}} is set at 1 second. 
> In a busy system this might add too much overhead. We should make the 
> threshold configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10793) Fix HdfsAuditLogger binary incompatibility introduced by HDFS-9184

2016-08-25 Thread Manoj Govindassamy (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438210#comment-15438210
 ] 

Manoj Govindassamy commented on HDFS-10793:
---

-- TestNameNodeMetadataConsistency failure is not related to this patch
-- Manually verified the patch as mentioned in comment 1 and 2
-- All check styling issues are related to number of arguments in methods 
exceeding recommended count of 7

> Fix HdfsAuditLogger binary incompatibility introduced by HDFS-9184
> --
>
> Key: HDFS-10793
> URL: https://issues.apache.org/jira/browse/HDFS-10793
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Andrew Wang
>Assignee: Manoj Govindassamy
>Priority: Blocker
> Attachments: HDFS-10793.001.patch, HDFS-10793.002.patch
>
>
> HDFS-9184 added a new parameter to an existing method signature in 
> HdfsAuditLogger, which is a Public/Evolving class. This breaks binary 
> compatibility with implementing subclasses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10793) Fix HdfsAuditLogger binary incompatibility introduced by HDFS-9184

2016-08-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438204#comment-15438204
 ] 

Hadoop QA commented on HDFS-10793:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 26s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 3 new + 182 unchanged - 2 fixed = 185 total (was 184) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 77m 50s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 98m 27s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10793 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12825549/HDFS-10793.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 433ded302311 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 81485db |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16543/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16543/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16543/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16543/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Fix HdfsAuditLogger

[jira] [Commented] (HDFS-10652) Add a unit test for HDFS-4660

2016-08-25 Thread Yongjun Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438191#comment-15438191
 ] 

Yongjun Zhang commented on HDFS-10652:
--

Added comment instead to address the second comment from Wei-Chiu, and uploaded 
rev 007.

Thanks for taking further look [~jojochuang] and [~vinayrpet].


> Add a unit test for HDFS-4660
> -
>
> Key: HDFS-10652
> URL: https://issues.apache.org/jira/browse/HDFS-10652
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs
>Reporter: Yongjun Zhang
>Assignee: Vinayakumar B
> Attachments: HDFS-10652-002.patch, HDFS-10652.001.patch, 
> HDFS-10652.003.patch, HDFS-10652.004.patch, HDFS-10652.005.patch, 
> HDFS-10652.006.patch, HDFS-10652.007.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-10652) Add a unit test for HDFS-4660

2016-08-25 Thread Yongjun Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-10652:
-
Attachment: HDFS-10652.007.patch

> Add a unit test for HDFS-4660
> -
>
> Key: HDFS-10652
> URL: https://issues.apache.org/jira/browse/HDFS-10652
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs
>Reporter: Yongjun Zhang
>Assignee: Vinayakumar B
> Attachments: HDFS-10652-002.patch, HDFS-10652.001.patch, 
> HDFS-10652.003.patch, HDFS-10652.004.patch, HDFS-10652.005.patch, 
> HDFS-10652.006.patch, HDFS-10652.007.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-10794) [SPS]: Provide storage policy satisfy worker at DN for co-ordinating the block storage movement work

2016-08-25 Thread Uma Maheswara Rao G (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10794:
---
Summary: [SPS]: Provide storage policy satisfy worker at DN for 
co-ordinating the block storage movement work  (was: Provide storage policy 
satisfy worker at DN for co-ordinating the block storage movement work)

> [SPS]: Provide storage policy satisfy worker at DN for co-ordinating the 
> block storage movement work
> 
>
> Key: HDFS-10794
> URL: https://issues.apache.org/jira/browse/HDFS-10794
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-10794-00.patch
>
>
> The idea of this jira is to implement a mechanism to move the blocks to the 
> given target in order to satisfy the block storage policy. Datanode receives 
> {{blocktomove}} details via heart beat response from NN. More specifically, 
> its a datanode side extension to handle the block storage movement commands.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Created] (HDFS-10802) [SPS]: Add satisfyStoragePolicy API in HdfsAdmin

2016-08-25 Thread Uma Maheswara Rao G (JIRA)

Uma Maheswara Rao G created HDFS-10802:
--

 Summary: [SPS]: Add satisfyStoragePolicy API in HdfsAdmin
 Key: HDFS-10802
 URL: https://issues.apache.org/jira/browse/HDFS-10802
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs-client
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G


This JIRA is to track the work for adding user/admin API for calling to 
satisfyStoragePolicy



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Created] (HDFS-10801) [SPS]: Protocol buffer changes for sending storage movement commands from NN to DN

2016-08-25 Thread Uma Maheswara Rao G (JIRA)

Uma Maheswara Rao G created HDFS-10801:
--

 Summary: [SPS]: Protocol buffer changes for sending storage 
movement commands from NN to DN 
 Key: HDFS-10801
 URL: https://issues.apache.org/jira/browse/HDFS-10801
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Reporter: Uma Maheswara Rao G
Assignee: Rakesh R


This JIRA is for tracking the work of protocol buffer changes for sending the 
storage movement commands from NN to DN



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-08-25 Thread Sean Mackrory (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438128#comment-15438128
 ] 

Sean Mackrory commented on HDFS-10797:
--

To reproduce the discrepancy you can follow the following procedure. I put a 
100 MB file into HDFS and snapshot it (hadoop fs -du -s reports 100 MB * 
replication after both operations), and then append another 100 MB onto it 
(hadoop fs -du -s will report 200 MB * replication factor at that point). If I 
move the file to trash or simply rename it, hadoop fs -du -s starts reporting 
300 MB * replication factor in the second column. I believe at this point it is 
counting some of the overlap in block between the snapshot and the regular file 
twice, because it views the move operation the same as a delete, but since the 
file wasn't actually deleted it gets counted again.
{quote}
dd if=/dev/zero of=100MB.zero bs=1 count=1
bin/hadoop fs -mkdir -p /user/sean
bin/hadoop fs -chown sean /user/sean
bin/hadoop fs -put 100MB.zero /user/sean/HDFS-10797

bin/hdfs dfsadmin -allowSnapshot /user/sean
bin/hdfs dfs -createSnapshot /user/sean s1

bin/hadoop fs -appendToFile 100MB.zero /user/sean/HDFS-10797

bin/hadoop fs -du -s /user/sean

bin/hadoop fs -rm /user/sean/HDFS-10797 # or simply rename with mv
bin/hadoop fs -du -s /user/sean
{quote}

> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Sean Mackrory
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how 
> much disk usage is used by a snapshot by tallying up the files in the 
> snapshot that have since been deleted (that way it won't overlap with regular 
> files whose disk usage is computed separately). However that is determined 
> from a diff that shows moved (to Trash or otherwise) or renamed files as a 
> deletion and a creation operation that may overlap with the list of blocks. 
> Only the deletion operation is taken into consideration, and this causes 
> those blocks to get represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Created] (HDFS-10800) [SPS]: Storage Policy Satisfier daemon thread in Namenode to find the blocks which were placed in wrong storages than what NN is expecting.

2016-08-25 Thread Uma Maheswara Rao G (JIRA)

Uma Maheswara Rao G created HDFS-10800:
--

 Summary: [SPS]: Storage Policy Satisfier daemon thread in Namenode 
to find the blocks which were placed in wrong storages than what NN is 
expecting.
 Key: HDFS-10800
 URL: https://issues.apache.org/jira/browse/HDFS-10800
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G


This JIRA is for implementing a daemon thread called StoragePolicySatisfier in 
nematode, which should scan the asked files blocks which were placed in wrong 
storages in DNs. 

 The idea is:
  # When user called on some files/dirs for satisfyStorage policy, They 
should have tracked in NN and then StoragePolicyDaemon thread will pick one by 
one file and then check the blocks which might have placed in wrong storage in 
DN than what NN is expecting it to.
  # After checking all, it should also construct the data structures for 
the required information to move a block from one storage to another.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-9507) LeaseRenewer Logging Under-Reporting

2016-08-25 Thread Sameer Abhyankar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438051#comment-15438051
 ] 

Sameer Abhyankar commented on HDFS-9507:


Hello - I was looking through the code for this and it seems that 
DFSClient#renewLease() will only return a false when either the client is not 
running or there are no files being written. Otherwise DFSClient#renewLease 
will either renew the lease successfully and return "true" or it will throw an 
Exception (which will bubble up). The behavior of LeaseRenewer#renew to either 
log a warning or bubble up the Exception would be valid, correct?

As for the LOG level, I agree it should be logged as a warn instead of a debug.

> LeaseRenewer Logging Under-Reporting
> 
>
> Key: HDFS-9507
> URL: https://issues.apache.org/jira/browse/HDFS-9507
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: BELUGA BEHR
>Priority: Minor
>
> Why is it that in LeaseRenewer#run() failures to renew a lease on a file are 
> reported with "warn" level logging, but in LeaseRenewer#renew() it is 
> reported with a "debug" level warn?
> In LeaseRenewer#renew(), if the method renewLease() returns 'false' then the 
> problem is silently discarded (continue, no Exception is thrown) and the next 
> client in the list tries to renew.
> {code:title=LeaseRenewer.java|borderStyle=solid}
> private void run(final int id) throws InterruptedException {
>   ...
>   try {
> renew();
> lastRenewed = Time.monotonicNow();
>   } catch (SocketTimeoutException ie) {
> LOG.warn("Failed to renew lease for " + clientsString() + " for "
> + (elapsed/1000) + " seconds.  Aborting ...", ie);
> synchronized (this) {
>   while (!dfsclients.isEmpty()) {
> DFSClient dfsClient = dfsclients.get(0);
> dfsClient.closeAllFilesBeingWritten(true);
> closeClient(dfsClient);
>   }
>   //Expire the current LeaseRenewer thread.
>   emptyTime = 0;
> }
> break;
>   } catch (IOException ie) {
> LOG.warn("Failed to renew lease for " + clientsString() + " for "
>   + (elapsed/1000) + " seconds.  Will retry shortly ...", ie);
>   }
> }
> ...
> }
> private void renew() throws IOException {
> {
>...
> if (!c.renewLease()) {
>   LOG.debug("Did not renew lease for client {}", c);
>   continue;
> }
>...
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-10793) Fix HdfsAuditLogger binary incompatibility introduced by HDFS-9184

2016-08-25 Thread Manoj Govindassamy (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-10793:
--
Attachment: HDFS-10793.002.patch

Attaching v002 patch -- took care of styling issues.

> Fix HdfsAuditLogger binary incompatibility introduced by HDFS-9184
> --
>
> Key: HDFS-10793
> URL: https://issues.apache.org/jira/browse/HDFS-10793
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Andrew Wang
>Assignee: Manoj Govindassamy
>Priority: Blocker
> Attachments: HDFS-10793.001.patch, HDFS-10793.002.patch
>
>
> HDFS-9184 added a new parameter to an existing method signature in 
> HdfsAuditLogger, which is a Public/Evolving class. This breaks binary 
> compatibility with implementing subclasses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-10798) Make the threshold of reporting FSNamesystem lock contention configurable

2016-08-25 Thread Zhe Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-10798:
-
Component/s: logging

> Make the threshold of reporting FSNamesystem lock contention configurable
> -
>
> Key: HDFS-10798
> URL: https://issues.apache.org/jira/browse/HDFS-10798
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: logging, namenode
>Reporter: Zhe Zhang
>Assignee: Erik Krogen
>  Labels: newbie
>
> Currently {{FSNamesystem#WRITELOCK_REPORTING_THRESHOLD}} is set at 1 second. 
> In a busy system this might add too much overhead. We should make the 
> threshold configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Work started] (HDFS-10798) Make the threshold of reporting FSNamesystem lock contention configurable

2016-08-25 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-10798 started by Erik Krogen.
--
> Make the threshold of reporting FSNamesystem lock contention configurable
> -
>
> Key: HDFS-10798
> URL: https://issues.apache.org/jira/browse/HDFS-10798
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Erik Krogen
>  Labels: newbie
>
> Currently {{FSNamesystem#WRITELOCK_REPORTING_THRESHOLD}} is set at 1 second. 
> In a busy system this might add too much overhead. We should make the 
> threshold configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-9145) Tracking methods that hold FSNamesytemLock for too long

2016-08-25 Thread Kihwal Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437863#comment-15437863
 ] 

Kihwal Lee commented on HDFS-9145:
--

I can reproduce it consistently. But when I tried the approach in the latest 
patch in HDFS-8915, it passes. If I undo the patch, the test failure is 
reproduced 100% times. Let's get HDFS-8915 moving.

> Tracking methods that hold FSNamesytemLock for too long
> ---
>
> Key: HDFS-9145
> URL: https://issues.apache.org/jira/browse/HDFS-9145
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Fix For: 2.8.0, 2.7.4
>
> Attachments: HDFS-9145.000.patch, HDFS-9145.001.patch, 
> HDFS-9145.002.patch, HDFS-9145.003.patch, testlog.txt
>
>
> It will be helpful that if we can have a way to track (or at least log a msg) 
> if some operation is holding the FSNamesystem lock for a long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Comment Edited] (HDFS-9145) Tracking methods that hold FSNamesytemLock for too long

2016-08-25 Thread Mingliang Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437857#comment-15437857
 ] 

Mingliang Liu edited comment on HDFS-9145 at 8/25/16 10:40 PM:
---

[~kihwal] are you able to reproduce this failure consistently? How about the 
test without this patch?

On my local machine, I can not reproduce the bug on Java8 against the 
{{branch-2.7}} specifying {{TestFSNamesystem}} class (9 test cases).


was (Author: liuml07):
[~kihwal] are you able to reproduce this failure consistently? How about the 
test without this patch?

On my local machine, I can not reproduce the bug on Java8 against the 
{{branch-2}} specifying {{TestFSNamesystem}} class (9 test cases).

> Tracking methods that hold FSNamesytemLock for too long
> ---
>
> Key: HDFS-9145
> URL: https://issues.apache.org/jira/browse/HDFS-9145
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Fix For: 2.8.0, 2.7.4
>
> Attachments: HDFS-9145.000.patch, HDFS-9145.001.patch, 
> HDFS-9145.002.patch, HDFS-9145.003.patch, testlog.txt
>
>
> It will be helpful that if we can have a way to track (or at least log a msg) 
> if some operation is holding the FSNamesystem lock for a long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-9145) Tracking methods that hold FSNamesytemLock for too long

2016-08-25 Thread Mingliang Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437857#comment-15437857
 ] 

Mingliang Liu commented on HDFS-9145:
-

[~kihwal] are you able to reproduce this failure consistently? How about the 
test without this patch?

On my local machine, I can not reproduce the bug on Java8 against the 
{{branch-2}} specifying {{TestFSNamesystem}} class (9 test cases).

> Tracking methods that hold FSNamesytemLock for too long
> ---
>
> Key: HDFS-9145
> URL: https://issues.apache.org/jira/browse/HDFS-9145
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Fix For: 2.8.0, 2.7.4
>
> Attachments: HDFS-9145.000.patch, HDFS-9145.001.patch, 
> HDFS-9145.002.patch, HDFS-9145.003.patch, testlog.txt
>
>
> It will be helpful that if we can have a way to track (or at least log a msg) 
> if some operation is holding the FSNamesystem lock for a long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-9145) Tracking methods that hold FSNamesytemLock for too long

2016-08-25 Thread Kihwal Lee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-9145:
-
Attachment: testlog.txt

I am not sure how helpful this log will be. I am using openjdk 1.8.0_101 on my 
box.

> Tracking methods that hold FSNamesytemLock for too long
> ---
>
> Key: HDFS-9145
> URL: https://issues.apache.org/jira/browse/HDFS-9145
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Fix For: 2.8.0, 2.7.4
>
> Attachments: HDFS-9145.000.patch, HDFS-9145.001.patch, 
> HDFS-9145.002.patch, HDFS-9145.003.patch, testlog.txt
>
>
> It will be helpful that if we can have a way to track (or at least log a msg) 
> if some operation is holding the FSNamesystem lock for a long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-9145) Tracking methods that hold FSNamesytemLock for too long

2016-08-25 Thread Kihwal Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437840#comment-15437840
 ] 

Kihwal Lee commented on HDFS-9145:
--

For me it fails when I run {{TestFSNamesystem}}, but passes when I specify 
{{TestFSNamesystem#testFSLockGetWaiterCount}}.
I will get the test log and upload here.

> Tracking methods that hold FSNamesytemLock for too long
> ---
>
> Key: HDFS-9145
> URL: https://issues.apache.org/jira/browse/HDFS-9145
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Fix For: 2.8.0, 2.7.4
>
> Attachments: HDFS-9145.000.patch, HDFS-9145.001.patch, 
> HDFS-9145.002.patch, HDFS-9145.003.patch
>
>
> It will be helpful that if we can have a way to track (or at least log a msg) 
> if some operation is holding the FSNamesystem lock for a long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10793) Fix HdfsAuditLogger binary incompatibility introduced by HDFS-9184

2016-08-25 Thread Mingliang Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437821#comment-15437821
 ] 

Mingliang Liu commented on HDFS-10793:
--

Looks good once Andrew's comment is addressed. Thanks.

> Fix HdfsAuditLogger binary incompatibility introduced by HDFS-9184
> --
>
> Key: HDFS-10793
> URL: https://issues.apache.org/jira/browse/HDFS-10793
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Andrew Wang
>Assignee: Manoj Govindassamy
>Priority: Blocker
> Attachments: HDFS-10793.001.patch
>
>
> HDFS-9184 added a new parameter to an existing method signature in 
> HdfsAuditLogger, which is a Public/Evolving class. This breaks binary 
> compatibility with implementing subclasses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10793) Fix HdfsAuditLogger binary incompatibility introduced by HDFS-9184

2016-08-25 Thread Manoj Govindassamy (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437807#comment-15437807
 ] 

Manoj Govindassamy commented on HDFS-10793:
---

Sure [~andrew.wang]. Will do the suggested changes. Will wait for others review 
so that I can post the next patch with all comments incorporated. Thanks for 
the review.

> Fix HdfsAuditLogger binary incompatibility introduced by HDFS-9184
> --
>
> Key: HDFS-10793
> URL: https://issues.apache.org/jira/browse/HDFS-10793
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Andrew Wang
>Assignee: Manoj Govindassamy
>Priority: Blocker
> Attachments: HDFS-10793.001.patch
>
>
> HDFS-9184 added a new parameter to an existing method signature in 
> HdfsAuditLogger, which is a Public/Evolving class. This breaks binary 
> compatibility with implementing subclasses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-9145) Tracking methods that hold FSNamesytemLock for too long

2016-08-25 Thread Zhe Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437796#comment-15437796
 ] 

Zhe Zhang commented on HDFS-9145:
-

Thanks [~liuml07], it looks very likely. 


> Tracking methods that hold FSNamesytemLock for too long
> ---
>
> Key: HDFS-9145
> URL: https://issues.apache.org/jira/browse/HDFS-9145
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Fix For: 2.8.0, 2.7.4
>
> Attachments: HDFS-9145.000.patch, HDFS-9145.001.patch, 
> HDFS-9145.002.patch, HDFS-9145.003.patch
>
>
> It will be helpful that if we can have a way to track (or at least log a msg) 
> if some operation is holding the FSNamesystem lock for a long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Comment Edited] (HDFS-9145) Tracking methods that hold FSNamesytemLock for too long

2016-08-25 Thread Mingliang Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437792#comment-15437792
 ] 

Mingliang Liu edited comment on HDFS-9145 at 8/25/16 10:04 PM:
---

Hm Is this an unrelated bug [HDFS-8915]?


was (Author: liuml07):
Hm Is this a unrelated bug [HDFS-8915]?

> Tracking methods that hold FSNamesytemLock for too long
> ---
>
> Key: HDFS-9145
> URL: https://issues.apache.org/jira/browse/HDFS-9145
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Fix For: 2.8.0, 2.7.4
>
> Attachments: HDFS-9145.000.patch, HDFS-9145.001.patch, 
> HDFS-9145.002.patch, HDFS-9145.003.patch
>
>
> It will be helpful that if we can have a way to track (or at least log a msg) 
> if some operation is holding the FSNamesystem lock for a long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-9145) Tracking methods that hold FSNamesytemLock for too long

2016-08-25 Thread Mingliang Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437792#comment-15437792
 ] 

Mingliang Liu commented on HDFS-9145:
-

Hm Is this a unrelated bug [HDFS-8915]?

> Tracking methods that hold FSNamesytemLock for too long
> ---
>
> Key: HDFS-9145
> URL: https://issues.apache.org/jira/browse/HDFS-9145
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Fix For: 2.8.0, 2.7.4
>
> Attachments: HDFS-9145.000.patch, HDFS-9145.001.patch, 
> HDFS-9145.002.patch, HDFS-9145.003.patch
>
>
> It will be helpful that if we can have a way to track (or at least log a msg) 
> if some operation is holding the FSNamesystem lock for a long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-9145) Tracking methods that hold FSNamesytemLock for too long

2016-08-25 Thread Zhe Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437781#comment-15437781
 ] 

Zhe Zhang commented on HDFS-9145:
-

I can't reproduce the {{testFSLockGetWaiterCount}} failure locally. [~kihwal] 
[~liuml07] How about in your local environments?

> Tracking methods that hold FSNamesytemLock for too long
> ---
>
> Key: HDFS-9145
> URL: https://issues.apache.org/jira/browse/HDFS-9145
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Fix For: 2.8.0, 2.7.4
>
> Attachments: HDFS-9145.000.patch, HDFS-9145.001.patch, 
> HDFS-9145.002.patch, HDFS-9145.003.patch
>
>
> It will be helpful that if we can have a way to track (or at least log a msg) 
> if some operation is holding the FSNamesystem lock for a long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-9145) Tracking methods that hold FSNamesytemLock for too long

2016-08-25 Thread Zhe Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437773#comment-15437773
 ] 

Zhe Zhang commented on HDFS-9145:
-

Thanks Kihwal. I'm trying to fix this.

> Tracking methods that hold FSNamesytemLock for too long
> ---
>
> Key: HDFS-9145
> URL: https://issues.apache.org/jira/browse/HDFS-9145
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Fix For: 2.8.0, 2.7.4
>
> Attachments: HDFS-9145.000.patch, HDFS-9145.001.patch, 
> HDFS-9145.002.patch, HDFS-9145.003.patch
>
>
> It will be helpful that if we can have a way to track (or at least log a msg) 
> if some operation is holding the FSNamesystem lock for a long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10475) Adding metrics for long FSD lock

2016-08-25 Thread Zhe Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437760#comment-15437760
 ] 

Zhe Zhang commented on HDFS-10475:
--

Thanks Xiaoyu for clarifying this. Do you still plan to work on it? Some of my 
colleagues might be interested in implementing this if you are OK with it.

> Adding metrics for long FSD lock
> 
>
> Key: HDFS-10475
> URL: https://issues.apache.org/jira/browse/HDFS-10475
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>
> This is a follow up of the comment on HADOOP-12916 and 
> [here|https://issues.apache.org/jira/browse/HDFS-9924?focusedCommentId=15310837=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15310837]
>  add more metrics and WARN/DEBUG logs for long FSD/FSN locking operations on 
> namenode similar to what we have for slow write/network WARN/metrics on 
> datanode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-9145) Tracking methods that hold FSNamesytemLock for too long

2016-08-25 Thread Kihwal Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437742#comment-15437742
 ] 

Kihwal Lee commented on HDFS-9145:
--

There is a test failure in branch-2.7 after this.
{noformat}
---
 T E S T S
---
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was 
removed in 8.0
Running org.apache.hadoop.hdfs.server.namenode.TestFSNamesystem
Tests run: 9, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.741 sec <<< 
FAILURE! - in org.apache.hadoop.hdfs.server.namenode.TestFSNamesystem
testFSLockGetWaiterCount(org.apache.hadoop.hdfs.server.namenode.TestFSNamesystem)
  Time elapsed: 0.004 sec  <<< FAILURE!
java.lang.AssertionError: Expected number of blocked thread not found 
expected:<3> but was:<2>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at 
org.apache.hadoop.hdfs.server.namenode.TestFSNamesystem.testFSLockGetWaiterCount(TestFSNamesystem.java:244)
{noformat}

It seems to pass when this test case is run separately. It might be due to 
interactions with other tests in the suite.

> Tracking methods that hold FSNamesytemLock for too long
> ---
>
> Key: HDFS-9145
> URL: https://issues.apache.org/jira/browse/HDFS-9145
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Fix For: 2.8.0, 2.7.4
>
> Attachments: HDFS-9145.000.patch, HDFS-9145.001.patch, 
> HDFS-9145.002.patch, HDFS-9145.003.patch
>
>
> It will be helpful that if we can have a way to track (or at least log a msg) 
> if some operation is holding the FSNamesystem lock for a long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10793) Fix HdfsAuditLogger binary incompatibility introduced by HDFS-9184

2016-08-25 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437718#comment-15437718
 ] 

Andrew Wang commented on HDFS-10793:


Overall looks great to me, thanks for picking this up Manoj. One little nit, 
could we undo some of the whitespace changes? I think your IDE is configured to 
align parameters, but I think our normal convention is to double indent. I also 
would mildly prefer if you generate the patch without "--no-prefix", since then 
I can apply with just "git apply hdfs.patch".

[~arpitagarwal] / [~liuml07] either of you want to review too? Should be a 
quick one.

> Fix HdfsAuditLogger binary incompatibility introduced by HDFS-9184
> --
>
> Key: HDFS-10793
> URL: https://issues.apache.org/jira/browse/HDFS-10793
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Andrew Wang
>Assignee: Manoj Govindassamy
>Priority: Blocker
> Attachments: HDFS-10793.001.patch
>
>
> HDFS-9184 added a new parameter to an existing method signature in 
> HdfsAuditLogger, which is a Public/Evolving class. This breaks binary 
> compatibility with implementing subclasses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10768) Optimize mkdir ops

2016-08-25 Thread Kihwal Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437708#comment-15437708
 ] 

Kihwal Lee commented on HDFS-10768:
---

I will review the latest patch by tomorrow.

> Optimize mkdir ops
> --
>
> Key: HDFS-10768
> URL: https://issues.apache.org/jira/browse/HDFS-10768
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-10768.1.patch, HDFS-10768.patch
>
>
> Directory creation causes excessive object allocation: ex. an immutable list 
> builder, containing the string of components converted from the IIP's 
> byte[]s, sublist views of the string list, iterable, followed by string to 
> byte[] conversion.  This can all be eliminated by accessing the component's 
> byte[] in the IIP.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10768) Optimize mkdir ops

2016-08-25 Thread Kihwal Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437701#comment-15437701
 ] 

Kihwal Lee commented on HDFS-10768:
---

{{TestEditLogJournalFailures}} was broken by HADOOP-13465. It has been reverted 
from trunk.
{{TestRollingFileSystemSinkWithHdfs}} passes. Cannot reproduce the failure.

> Optimize mkdir ops
> --
>
> Key: HDFS-10768
> URL: https://issues.apache.org/jira/browse/HDFS-10768
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-10768.1.patch, HDFS-10768.patch
>
>
> Directory creation causes excessive object allocation: ex. an immutable list 
> builder, containing the string of components converted from the IIP's 
> byte[]s, sublist views of the string list, iterable, followed by string to 
> byte[] conversion.  This can all be eliminated by accessing the component's 
> byte[] in the IIP.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10742) Measurement of lock held time in FsDatasetImpl

2016-08-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437671#comment-15437671
 ] 

Hadoop QA commented on HDFS-10742:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 13s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 97m  2s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock |
|   | hadoop.hdfs.server.namenode.TestEditLogJournalFailures |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10742 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12825515/HDFS-10742.008.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux ab90c40c203a 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 1360bd2 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16542/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16542/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16542/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Measurement of lock held time in FsDatasetImpl
> --
>
> Key: HDFS-10742
> URL:

[jira] [Updated] (HDFS-8617) Throttle DiskChecker#checkDirs() speed.

2016-08-25 Thread Lei (Eddy) Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-8617:

   Resolution: Duplicate
Fix Version/s: 3.0.0-beta1
   2.9.0
   Status: Resolved  (was: Patch Available)

Thanks [~arpitagarwal] to point out. I will close this JIRA as duplicated.

> Throttle DiskChecker#checkDirs() speed.
> ---
>
> Key: HDFS-8617
> URL: https://issues.apache.org/jira/browse/HDFS-8617
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.7.0
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
> Fix For: 2.9.0, 3.0.0-beta1
>
> Attachments: HDFS-8617.000.patch
>
>
> As described in HDFS-8564,  {{DiskChecker.checkDirs(finalizedDir)}} is 
> causing excessive I/Os because {{finalizedDirs}} might have up to 64K 
> sub-directories (HDFS-6482).
> This patch proposes to limit the rate of IO operations in 
> {{DiskChecker.checkDirs()}}. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-10792) RedundantEditLogInputStream should log caught exceptions

2016-08-25 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-10792:
---
Issue Type: Improvement  (was: Bug)

> RedundantEditLogInputStream should log caught exceptions
> 
>
> Key: HDFS-10792
> URL: https://issues.apache.org/jira/browse/HDFS-10792
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Minor
>  Labels: supportability
> Attachments: HDFS-10792.01.patch
>
>
> There are a few places in {{RedundantEditLogInputStream}} where an 
> IOException is caught but never logged. We should improve the logging of 
> these exceptions to help debugging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Work started] (HDFS-10799) NameNode should use loginUser(hdfs) to serve iNotify requests

2016-08-25 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-10799 started by Wei-Chiu Chuang.
--
> NameNode should use loginUser(hdfs) to serve iNotify requests
> -
>
> Key: HDFS-10799
> URL: https://issues.apache.org/jira/browse/HDFS-10799
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
> Environment: Kerberized, HA cluster, iNotify client, CDH5.7.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-10799.001.patch
>
>
> When a NameNode serves iNotify requests from a client, it verifies the client 
> has superuser permission and then uses the client's Kerberos principal to 
> read edits from journal nodes.
> However, if the client does not renew its tgt tickets, the connection from 
> NameNode to journal nodes may fail. In which case, the NameNode thinks the 
> edits are corrupt, and prints a scary error message:
> "During automatic edit log failover, we noticed that all of the remaining 
> edit log streams are shorter than the current one!  The best remaining edit 
> log ends at transaction 11577603, but we thought we could read up to 
> transaction 11577606.  If you continue, metadata will be lost forever!"
> However, the edits are actually good. NameNode _should not freak out when an 
> iNotify client's tgt ticket expires_.
> I think that an easy solution to this bug, is that after NameNode verifies 
> client has superuser permission, call {{SecurityUtil.doAsLoginUser}} and then 
> read edits. This will make sure the operation does not fail due to an expired 
> client ticket.
> Excerpt of related logs:
> {noformat}
> 2016-08-18 19:05:13,979 WARN org.apache.hadoop.security.UserGroupInformation: 
> PriviledgedActionException as:h...@example.com (auth:KERBEROS) 
> cause:java.io.IOException: We encountered an error reading 
> http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy,
>  
> http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy.
>   During automatic edit log failover, we noticed that all of the remaining 
> edit log streams are shorter than the current one!  The best remaining edit 
> log ends at transaction 11577603, but we thought we could read up to 
> transaction 11577606.  If you continue, metadata will be lost forever!
> 2016-08-18 19:05:13,979 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
> 112 on 8020, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.getEditsFromTxid from [client 
> IP:port] Call#73 Retry#0
> java.io.IOException: We encountered an error reading 
> http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy,
>  
> http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy.
>   During automatic edit log failover, we noticed that all of the remaining 
> edit log streams are shorter than the current one!  The best remaining edit 
> log ends at transaction 11577603, but we thought we could read up to 
> transaction 11577606.  If you continue, metadata will be lost forever!
> at 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:213)
> at 
> org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.readOp(NameNodeRpcServer.java:1674)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getEditsFromTxid(NameNodeRpcServer.java:1736)
> at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getEditsFromTxid(AuthorizationProviderProxyClientProtocol.java:1010)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getEditsFromTxid(ClientNamenodeProtocolServerSideTranslatorPB.java:1475)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-10793) Fix HdfsAuditLogger binary incompatibility introduced by HDFS-9184

2016-08-25 Thread Manoj Govindassamy (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437625#comment-15437625
 ] 

Manoj Govindassamy commented on HDFS-10793:
---

-- All 3 check style issues are because of number of arguments in the method 
are beyond 7. Not introduced anything new, just a bit refactoring.
-- TestEditLogJournalFailures are not related to this patch and it fails even 
without this patch.



> Fix HdfsAuditLogger binary incompatibility introduced by HDFS-9184
> --
>
> Key: HDFS-10793
> URL: https://issues.apache.org/jira/browse/HDFS-10793
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Andrew Wang
>Assignee: Manoj Govindassamy
>Priority: Blocker
> Attachments: HDFS-10793.001.patch
>
>
> HDFS-9184 added a new parameter to an existing method signature in 
> HdfsAuditLogger, which is a Public/Evolving class. This breaks binary 
> compatibility with implementing subclasses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10793) Fix HdfsAuditLogger binary incompatibility introduced by HDFS-9184

2016-08-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437597#comment-15437597
 ] 

Hadoop QA commented on HDFS-10793:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 25s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 3 new + 182 unchanged - 2 fixed = 185 total (was 184) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 58m 18s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 77m  8s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestEditLogJournalFailures |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10793 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12825511/HDFS-10793.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 29d00ef89787 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 1360bd2 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16541/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16541/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16541/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16541/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Fix HdfsAuditLogger binary

[jira] [Commented] (HDFS-10754) libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, hdfs_chown, hdfs_chmod and hdfs_find.

2016-08-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437491#comment-15437491
 ] 

Hadoop QA commented on HDFS-10754:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
27s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
40s{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
38s{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
16s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m  
9s{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
10s{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
35s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  6m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
41s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  6m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m  
7s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m  
8s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m  
6s{color} | {color:green} hadoop-hdfs-native-client in the patch passed with 
JDK v1.7.0_101. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 57m  3s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0cf5e66 |
| JIRA Issue | HDFS-10754 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12825510/HDFS-10754.HDFS-8707.010.patch
 |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  javadoc  
mvninstall  |
| uname | Linux fc80ed71065d 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 
20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-8707 / 4a74bc4 |
| Default Java | 1.7.0_101 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_101 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_101 |
| JDK v1.7.0_101  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16540/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: 
hadoop-hdfs-project/hadoop-hdfs-native-client |
|

[jira] [Updated] (HDFS-10742) Measurement of lock held time in FsDatasetImpl

2016-08-25 Thread Chen Liang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-10742:
--
Attachment: HDFS-10742.008.patch

fix checkstyle

> Measurement of lock held time in FsDatasetImpl
> --
>
> Key: HDFS-10742
> URL: https://issues.apache.org/jira/browse/HDFS-10742
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.0.0-alpha2
>Reporter: Chen Liang
>Assignee: Chen Liang
> Attachments: HDFS-10742.001.patch, HDFS-10742.002.patch, 
> HDFS-10742.003.patch, HDFS-10742.004.patch, HDFS-10742.005.patch, 
> HDFS-10742.006.patch, HDFS-10742.007.patch, HDFS-10742.008.patch
>
>
> This JIRA proposes to measure the time the of lock of {{FsDatasetImpl}} is 
> held by a thread. Doing so will allow us to measure lock statistics.
> This can be done by extending the {{AutoCloseableLock}} lock object in 
> {{FsDatasetImpl}}. In the future we can also consider replacing the lock with 
> a read-write lock.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10742) Measurement of lock held time in FsDatasetImpl

2016-08-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437482#comment-15437482
 ] 

Hadoop QA commented on HDFS-10742:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 24s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 4 new + 109 unchanged - 0 fixed = 113 total (was 109) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 58m  0s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 77m 38s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestHASafeMode |
|   | hadoop.hdfs.server.namenode.TestEditLogJournalFailures |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10742 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12825507/HDFS-10742.007.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux c2ecb97868a7 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 
20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 1360bd2 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16539/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16539/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16539/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16539/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was

[jira] [Commented] (HDFS-10768) Optimize mkdir ops

2016-08-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437483#comment-15437483
 ] 

Hadoop QA commented on HDFS-10768:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 
0 new + 58 unchanged - 1 fixed = 58 total (was 59) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 77m 36s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 97m 45s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs |
|   | hadoop.hdfs.server.namenode.TestEditLogJournalFailures |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10768 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12825502/HDFS-10768.1.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 1576a6028302 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 1360bd2 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16538/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16538/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16538/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Optimize mkdir ops
> --
>
> Key: HDFS-10768
> URL: https://issues.apache.org/jira/browse/HDFS-10768
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>

[jira] [Updated] (HDFS-10793) Fix HdfsAuditLogger binary incompatibility introduced by HDFS-9184

2016-08-25 Thread Manoj Govindassamy (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-10793:
--
Status: Patch Available  (was: Open)

> Fix HdfsAuditLogger binary incompatibility introduced by HDFS-9184
> --
>
> Key: HDFS-10793
> URL: https://issues.apache.org/jira/browse/HDFS-10793
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Andrew Wang
>Assignee: Manoj Govindassamy
>Priority: Blocker
> Attachments: HDFS-10793.001.patch
>
>
> HDFS-9184 added a new parameter to an existing method signature in 
> HdfsAuditLogger, which is a Public/Evolving class. This breaks binary 
> compatibility with implementing subclasses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-10799) NameNode should use loginUser(hdfs) to serve iNotify requests

2016-08-25 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-10799:
---
Description: 
When a NameNode serves iNotify requests from a client, it verifies the client 
has superuser permission and then uses the client's Kerberos principal to read 
edits from journal nodes.

However, if the client does not renew its tgt tickets, the connection from 
NameNode to journal nodes may fail. In which case, the NameNode thinks the 
edits are corrupt, and prints a scary error message:
"During automatic edit log failover, we noticed that all of the remaining edit 
log streams are shorter than the current one!  The best remaining edit log ends 
at transaction 11577603, but we thought we could read up to transaction 
11577606.  If you continue, metadata will be lost forever!"

However, the edits are actually good. NameNode _should not freak out when an 
iNotify client's tgt ticket expires_.

I think that an easy solution to this bug, is that after NameNode verifies 
client has superuser permission, call {{SecurityUtil.doAsLoginUser}} and then 
read edits. This will make sure the operation does not fail due to an expired 
client ticket.

Excerpt of related logs:
{noformat}
2016-08-18 19:05:13,979 WARN org.apache.hadoop.security.UserGroupInformation: 
PriviledgedActionException as:h...@example.com (auth:KERBEROS) 
cause:java.io.IOException: We encountered an error reading 
http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy,
 
http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy.
  During automatic edit log failover, we noticed that all of the remaining edit 
log streams are shorter than the current one!  The best remaining edit log ends 
at transaction 11577603, but we thought we could read up to transaction 
11577606.  If you continue, metadata will be lost forever!
2016-08-18 19:05:13,979 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
112 on 8020, call 
org.apache.hadoop.hdfs.protocol.ClientProtocol.getEditsFromTxid from [client 
IP:port] Call#73 Retry#0
java.io.IOException: We encountered an error reading 
http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy,
 
http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy.
  During automatic edit log failover, we noticed that all of the remaining edit 
log streams are shorter than the current one!  The best remaining edit log ends 
at transaction 11577603, but we thought we could read up to transaction 
11577606.  If you continue, metadata will be lost forever!
at 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:213)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.readOp(NameNodeRpcServer.java:1674)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getEditsFromTxid(NameNodeRpcServer.java:1736)
at 
org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getEditsFromTxid(AuthorizationProviderProxyClientProtocol.java:1010)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getEditsFromTxid(ClientNamenodeProtocolServerSideTranslatorPB.java:1475)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
{noformat}

  was:
When a NameNode serves iNotify requests from a client, it verifies the client 
has superuser permission and then uses the client's Kerberos principal to read 
edits from journal nodes.

However, if the client does not renew its tgt tickets, the connection from 
NameNode to journal nodes may fail. In which case, the NameNode thinks the 
edits are corrupt, and prints a scary error message:
"During automatic edit log failover, we noticed that all of the remaining edit 
log streams are shorter than the current one!  The best remaining edit log ends 
at transaction 11577603, but we thought we could read up to transaction 
11577606.  If you continue, metadata will be lost forever!"

However, the edits are actually good. NameNode _should not freak out when an 
iNotify client's tgt ticket

[jira] [Comment Edited] (HDFS-10652) Add a unit test for HDFS-4660

2016-08-25 Thread Yongjun Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437462#comment-15437462
 ] 

Yongjun Zhang edited comment on HDFS-10652 at 8/25/16 7:02 PM:
---

Thanks [~jojochuang] for reviewing.

I on-purposely added those messages. My thinking is,  it's worthwhile since it 
helps debugging unit test, and it may help adding clarity in real cluster too. 
Do you agree?

I will address your other comment soon.

Thanks.



was (Author: yzhangal):
Thanks [~jojochuang] for reviewing.

I on-purposely added those messages. My thinking is,  it's worthwhile since it 
helps debugging unit test, and it may help adding clarity in real cluster too.

Thanks.


> Add a unit test for HDFS-4660
> -
>
> Key: HDFS-10652
> URL: https://issues.apache.org/jira/browse/HDFS-10652
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs
>Reporter: Yongjun Zhang
>Assignee: Vinayakumar B
> Attachments: HDFS-10652-002.patch, HDFS-10652.001.patch, 
> HDFS-10652.003.patch, HDFS-10652.004.patch, HDFS-10652.005.patch, 
> HDFS-10652.006.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10652) Add a unit test for HDFS-4660

2016-08-25 Thread Yongjun Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437462#comment-15437462
 ] 

Yongjun Zhang commented on HDFS-10652:
--

Thanks [~jojochuang] for reviewing.

I on-purposely added those messages. My thinking is,  it's worthwhile since it 
helps debugging unit test, and it may help adding clarity in real cluster too.

Thanks.


> Add a unit test for HDFS-4660
> -
>
> Key: HDFS-10652
> URL: https://issues.apache.org/jira/browse/HDFS-10652
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs
>Reporter: Yongjun Zhang
>Assignee: Vinayakumar B
> Attachments: HDFS-10652-002.patch, HDFS-10652.001.patch, 
> HDFS-10652.003.patch, HDFS-10652.004.patch, HDFS-10652.005.patch, 
> HDFS-10652.006.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-10799) NameNode should use loginUser(hdfs) to serve iNotify requests

2016-08-25 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-10799:
---
Attachment: HDFS-10799.001.patch

v01: a quick fix. Need tests, and probably need to wrap 
NameNodeRpcServer#getCurrentEditLogTxid as well.

> NameNode should use loginUser(hdfs) to serve iNotify requests
> -
>
> Key: HDFS-10799
> URL: https://issues.apache.org/jira/browse/HDFS-10799
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
> Environment: Kerberized, HA cluster, iNotify client, CDH5.7.0
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-10799.001.patch
>
>
> When a NameNode serves iNotify requests from a client, it verifies the client 
> has superuser permission and then uses the client's Kerberos principal to 
> read edits from journal nodes.
> However, if the client does not renew its tgt tickets, the connection from 
> NameNode to journal nodes may fail. In which case, the NameNode thinks the 
> edits are corrupt, and prints a scary error message:
> "During automatic edit log failover, we noticed that all of the remaining 
> edit log streams are shorter than the current one!  The best remaining edit 
> log ends at transaction 11577603, but we thought we could read up to 
> transaction 11577606.  If you continue, metadata will be lost forever!"
> However, the edits are actually good. NameNode _should not freak out when an 
> iNotify client's tgt ticket expires_.
> I think that an easy solution to this bug, is that after NameNode verifies 
> client has superuser permission, call {{SecurityUtil.doAsLoginUser}} and then 
> read edits. This will make sure the operation does not fail due to an expired 
> client ticket.
> Expert of related logs:
> {noformat}
> 2016-08-18 19:05:13,979 WARN org.apache.hadoop.security.UserGroupInformation: 
> PriviledgedActionException as:h...@example.com (auth:KERBEROS) 
> cause:java.io.IOException: We encountered an error reading 
> http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy,
>  
> http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy.
>   During automatic edit log failover, we noticed that all of the remaining 
> edit log streams are shorter than the current one!  The best remaining edit 
> log ends at transaction 11577603, but we thought we could read up to 
> transaction 11577606.  If you continue, metadata will be lost forever!
> 2016-08-18 19:05:13,979 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
> 112 on 8020, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.getEditsFromTxid from [client 
> IP:port] Call#73 Retry#0
> java.io.IOException: We encountered an error reading 
> http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy,
>  
> http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy.
>   During automatic edit log failover, we noticed that all of the remaining 
> edit log streams are shorter than the current one!  The best remaining edit 
> log ends at transaction 11577603, but we thought we could read up to 
> transaction 11577606.  If you continue, metadata will be lost forever!
> at 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:213)
> at 
> org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.readOp(NameNodeRpcServer.java:1674)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getEditsFromTxid(NameNodeRpcServer.java:1736)
> at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getEditsFromTxid(AuthorizationProviderProxyClientProtocol.java:1010)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getEditsFromTxid(ClientNamenodeProtocolServerSideTranslatorPB.java:1475)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
>

[jira] [Updated] (HDFS-10793) Fix HdfsAuditLogger binary incompatibility introduced by HDFS-9184

2016-08-25 Thread Manoj Govindassamy (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-10793:
--
Attachment: HDFS-10793.001.patch

Attaching v001 patch to address binary compatibility issue with 
HdfsAuditLogger. 

> Fix HdfsAuditLogger binary incompatibility introduced by HDFS-9184
> --
>
> Key: HDFS-10793
> URL: https://issues.apache.org/jira/browse/HDFS-10793
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Andrew Wang
>Assignee: Manoj Govindassamy
>Priority: Blocker
> Attachments: HDFS-10793.001.patch
>
>
> HDFS-9184 added a new parameter to an existing method signature in 
> HdfsAuditLogger, which is a Public/Evolving class. This breaks binary 
> compatibility with implementing subclasses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-4660) Block corruption can happen during pipeline recovery

2016-08-25 Thread Yongjun Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437454#comment-15437454
 ] 

Yongjun Zhang commented on HDFS-4660:
-

Thank you very much [~nroberts]!

> Block corruption can happen during pipeline recovery
> 
>
> Key: HDFS-4660
> URL: https://issues.apache.org/jira/browse/HDFS-4660
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.0.3-alpha, 3.0.0-alpha1
>Reporter: Peng Zhang
>Assignee: Kihwal Lee
>Priority: Blocker
> Fix For: 2.7.1, 2.6.4
>
> Attachments: HDFS-4660.br26.patch, HDFS-4660.patch, HDFS-4660.patch, 
> HDFS-4660.v2.patch, periodic_hflush.patch
>
>
> pipeline DN1  DN2  DN3
> stop DN2
> pipeline added node DN4 located at 2nd position
> DN1  DN4  DN3
> recover RBW
> DN4 after recover rbw
> 2013-04-01 21:02:31,570 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover 
> RBW replica 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1004
> 2013-04-01 21:02:31,570 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW
>   getNumBytes() = 134144
>   getBytesOnDisk() = 134144
>   getVisibleLength()= 134144
> end at chunk (134144/512=262)
> DN3 after recover rbw
> 2013-04-01 21:02:31,575 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover 
> RBW replica 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_10042013-04-01
>  21:02:31,575 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW
>   getNumBytes() = 134028 
>   getBytesOnDisk() = 134028
>   getVisibleLength()= 134028
> client send packet after recover pipeline
> offset=133632  len=1008
> DN4 after flush 
> 2013-04-01 21:02:31,779 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file 
> offset:134640; meta offset:1063
> // meta end position should be floor(134640/512)*4 + 7 == 1059, but now it is 
> 1063.
> DN3 after flush
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005, 
> type=LAST_IN_PIPELINE, downstreams=0:[]: enqueue Packet(seqno=219, 
> lastPacketInBlock=false, offsetInBlock=134640, 
> ackEnqueueNanoTime=8817026136871545)
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Changing 
> meta file offset of block 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005 from 
> 1055 to 1051
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file 
> offset:134640; meta offset:1059
> After checking meta on DN4, I found checksum of chunk 262 is duplicated, but 
> data not.
> Later after block was finalized, DN4's scanner detected bad block, and then 
> reported it to NM. NM send a command to delete this block, and replicate this 
> block from other DN in pipeline to satisfy duplication num.
> I think this is because in BlockReceiver it skips data bytes already written, 
> but not skips checksum bytes already written. And function 
> adjustCrcFilePosition is only used for last non-completed chunk, but
> not for this situation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-10799) NameNode should use loginUser(hdfs) to serve iNotify requests

2016-08-25 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-10799:
---
Description: 
When a NameNode serves iNotify requests from a client, it verifies the client 
has superuser permission and then uses the client's Kerberos principal to read 
edits from journal nodes.

However, if the client does not renew its tgt tickets, the connection from 
NameNode to journal nodes may fail. In which case, the NameNode thinks the 
edits are corrupt, and prints a scary error message:
"During automatic edit log failover, we noticed that all of the remaining edit 
log streams are shorter than the current one!  The best remaining edit log ends 
at transaction 11577603, but we thought we could read up to transaction 
11577606.  If you continue, metadata will be lost forever!"

However, the edits are actually good. NameNode _should not freak out when an 
iNotify client's tgt ticket expires_.

I think that an easy solution to this bug, is that after NameNode verifies 
client has superuser permission, call {{SecurityUtil.doAsLoginUser}} and then 
read edits. This will make sure the operation does not fail due to an expired 
client ticket.

Expert of related logs:
{noformat}
2016-08-18 19:05:13,979 WARN org.apache.hadoop.security.UserGroupInformation: 
PriviledgedActionException as:h...@example.com (auth:KERBEROS) 
cause:java.io.IOException: We encountered an error reading 
http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy,
 
http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy.
  During automatic edit log failover, we noticed that all of the remaining edit 
log streams are shorter than the current one!  The best remaining edit log ends 
at transaction 11577603, but we thought we could read up to transaction 
11577606.  If you continue, metadata will be lost forever!
2016-08-18 19:05:13,979 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
112 on 8020, call 
org.apache.hadoop.hdfs.protocol.ClientProtocol.getEditsFromTxid from [client 
IP:port] Call#73 Retry#0
java.io.IOException: We encountered an error reading 
http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy,
 
http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy.
  During automatic edit log failover, we noticed that all of the remaining edit 
log streams are shorter than the current one!  The best remaining edit log ends 
at transaction 11577603, but we thought we could read up to transaction 
11577606.  If you continue, metadata will be lost forever!
at 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:213)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.readOp(NameNodeRpcServer.java:1674)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getEditsFromTxid(NameNodeRpcServer.java:1736)
at 
org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getEditsFromTxid(AuthorizationProviderProxyClientProtocol.java:1010)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getEditsFromTxid(ClientNamenodeProtocolServerSideTranslatorPB.java:1475)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
{noformat}

  was:
When a NameNode serves iNotify requests from a client, it verifies the client 
has superuser permission and then uses the client's Kerberos principal to read 
edits from journal nodes.

However, if the client does not renew its tgt tickets, the connection from 
NameNode to journal nodes may fail. In which case, the NameNode thinks the 
edits are corrupt, and prints a scary error message:
"During automatic edit log failover, we noticed that all of the remaining edit 
log streams are shorter than the current one!  The best remaining edit log ends 
at transaction 11577603, but we thought we could read up to transaction 
11577606.  If you continue, metadata will be lost forever!"

However, the edits are actually good. NameNode _should not freak out when an 
iNotify client's tgt ticket

[jira] [Updated] (HDFS-10799) NameNode should use loginUser(hdfs) to serve iNotify requests

2016-08-25 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-10799:
---
Description: 
When a NameNode serves iNotify requests from a client, it verifies the client 
has superuser permission and then uses the client's Kerberos principal to read 
edits from journal nodes.

However, if the client does not renew its tgt tickets, the connection from 
NameNode to journal nodes may fail. In which case, the NameNode thinks the 
edits are corrupt, and prints a scary error message:
"During automatic edit log failover, we noticed that all of the remaining edit 
log streams are shorter than the current one!  The best remaining edit log ends 
at transaction 11577603, but we thought we could read up to transaction 
11577606.  If you continue, metadata will be lost forever!"

However, the edits are actually good. NameNode _should not freak out when an 
iNotify client's tgt ticket expires_.

I think that an easy solution to this bug, is that after NameNode verifies 
client has superuser permission, call {{SecurityUtil.doAsLoginUser}} and then 
read edits. This will make sure the operation does not fail due to an expired 
client ticket.

Appendix:
{noformat}
2016-08-18 19:05:13,979 WARN org.apache.hadoop.security.UserGroupInformation: 
PriviledgedActionException as:h...@example.com (auth:KERBEROS) 
cause:java.io.IOException: We encountered an error reading 
http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy,
 
http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy.
  During automatic edit log failover, we noticed that all of the remaining edit 
log streams are shorter than the current one!  The best remaining edit log ends 
at transaction 11577603, but we thought we could read up to transaction 
11577606.  If you continue, metadata will be lost forever!
2016-08-18 19:05:13,979 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
112 on 8020, call 
org.apache.hadoop.hdfs.protocol.ClientProtocol.getEditsFromTxid from [client 
IP:port] Call#73 Retry#0
java.io.IOException: We encountered an error reading 
http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy,
 
http://jn1.example.com:8480/getJournal?jid=nameservice1=11577487=yyy.
  During automatic edit log failover, we noticed that all of the remaining edit 
log streams are shorter than the current one!  The best remaining edit log ends 
at transaction 11577603, but we thought we could read up to transaction 
11577606.  If you continue, metadata will be lost forever!
at 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:213)
at 
org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.readOp(NameNodeRpcServer.java:1674)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getEditsFromTxid(NameNodeRpcServer.java:1736)
at 
org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getEditsFromTxid(AuthorizationProviderProxyClientProtocol.java:1010)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getEditsFromTxid(ClientNamenodeProtocolServerSideTranslatorPB.java:1475)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
{noformat}

  was:
When a NameNode serves iNotify requests from a client, it verifies the client 
has superuser permission and then uses the client's Kerberos principal to read 
edits from journal nodes.

However, if the client does not renew its tgt tickets, the connection from 
NameNode to journal nodes may fail. In which case, the NameNode thinks the 
edits are corrupt, and prints a scary error message:
"During automatic edit log failover, we noticed that all of the remaining edit 
log streams are shorter than the current one!  The best remaining edit log ends 
at transaction 11577603, but we thought we could read up to transaction 
11577606.  If you continue, metadata will be lost forever!"

However, the edits are actually good. NameNode _should not freak out when an 
iNotify client's tgt ticket expires_.

I

[jira] [Created] (HDFS-10799) NameNode should use loginUser(hdfs) to serve iNotify requests

2016-08-25 Thread Wei-Chiu Chuang (JIRA)

Wei-Chiu Chuang created HDFS-10799:
--

 Summary: NameNode should use loginUser(hdfs) to serve iNotify 
requests
 Key: HDFS-10799
 URL: https://issues.apache.org/jira/browse/HDFS-10799
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.6.0
 Environment: Kerberized, HA cluster, iNotify client, CDH5.7.0
Reporter: Wei-Chiu Chuang
Assignee: Wei-Chiu Chuang


When a NameNode serves iNotify requests from a client, it verifies the client 
has superuser permission and then uses the client's Kerberos principal to read 
edits from journal nodes.

However, if the client does not renew its tgt tickets, the connection from 
NameNode to journal nodes may fail. In which case, the NameNode thinks the 
edits are corrupt, and prints a scary error message:
"During automatic edit log failover, we noticed that all of the remaining edit 
log streams are shorter than the current one!  The best remaining edit log ends 
at transaction 11577603, but we thought we could read up to transaction 
11577606.  If you continue, metadata will be lost forever!"

However, the edits are actually good. NameNode _should not freak out when an 
iNotify client's tgt ticket expires_.

I think that an easy solution to this bug, is that after NameNode verifies 
client has superuser permission, call {{SecurityUtil.doAsLoginUser}} and then 
read edits. This will make sure the operation does not fail due to an expired 
client ticket.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-10798) Make the threshold of reporting FSNamesystem lock contention configurable

2016-08-25 Thread Zhe Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-10798:
-
Assignee: Erik Krogen

> Make the threshold of reporting FSNamesystem lock contention configurable
> -
>
> Key: HDFS-10798
> URL: https://issues.apache.org/jira/browse/HDFS-10798
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Erik Krogen
>  Labels: newbie
>
> Currently {{FSNamesystem#WRITELOCK_REPORTING_THRESHOLD}} is set at 1 second. 
> In a busy system this might add too much overhead. We should make the 
> threshold configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-10754) libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, hdfs_chown, hdfs_chmod and hdfs_find.

2016-08-25 Thread Anatoli Shein (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anatoli Shein updated HDFS-10754:
-
Attachment: HDFS-10754.HDFS-8707.010.patch

New patch is attached. Please review.

> libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, 
> hdfs_chown, hdfs_chmod and hdfs_find.
> ---
>
> Key: HDFS-10754
> URL: https://issues.apache.org/jira/browse/HDFS-10754
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10754.HDFS-8707.000.patch, 
> HDFS-10754.HDFS-8707.001.patch, HDFS-10754.HDFS-8707.002.patch, 
> HDFS-10754.HDFS-8707.003.patch, HDFS-10754.HDFS-8707.004.patch, 
> HDFS-10754.HDFS-8707.005.patch, HDFS-10754.HDFS-8707.006.patch, 
> HDFS-10754.HDFS-8707.007.patch, HDFS-10754.HDFS-8707.008.patch, 
> HDFS-10754.HDFS-8707.009.patch, HDFS-10754.HDFS-8707.010.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10754) libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, hdfs_chown, hdfs_chmod and hdfs_find.

2016-08-25 Thread Anatoli Shein (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437421#comment-15437421
 ] 

Anatoli Shein commented on HDFS-10754:
--

Yes, seems like hdfs_find.cpp did not get included into my last diff. Should be 
good in the new patch.

> libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, 
> hdfs_chown, hdfs_chmod and hdfs_find.
> ---
>
> Key: HDFS-10754
> URL: https://issues.apache.org/jira/browse/HDFS-10754
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10754.HDFS-8707.000.patch, 
> HDFS-10754.HDFS-8707.001.patch, HDFS-10754.HDFS-8707.002.patch, 
> HDFS-10754.HDFS-8707.003.patch, HDFS-10754.HDFS-8707.004.patch, 
> HDFS-10754.HDFS-8707.005.patch, HDFS-10754.HDFS-8707.006.patch, 
> HDFS-10754.HDFS-8707.007.patch, HDFS-10754.HDFS-8707.008.patch, 
> HDFS-10754.HDFS-8707.009.patch, HDFS-10754.HDFS-8707.010.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Comment Edited] (HDFS-10754) libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, hdfs_chown, hdfs_chmod and hdfs_find.

2016-08-25 Thread Anatoli Shein (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437416#comment-15437416
 ] 

Anatoli Shein edited comment on HDFS-10754 at 8/25/16 6:33 PM:
---

Thank for the review, [~bobhansen].

I have addressed your comments as follows:

* The new recursive methods should not use future/promises internally. That 
blocks one of the asio threads waiting for more data; if a consumer tried to do 
one of these with a single thread in the threadpool, it would deadlock waiting 
for the subtasks to complete, but they'd all get queued up behind the initial 
handler.

* Instead, whenever they're all done (request_count == 0), the last one out the 
door (the handler that dropped the request_count to 0) should call into the 
consumer's handler directly with the final status. If any of the other threads 
has received an error, all of the subsequent deliveries to the handler should 
return false, telling find "I'm going to report an error anyway, so don't 
bother recursing any more." It's good to wait until request_count==0, even in 
an error state, so the consumer doesn't have any lame duck requests queued up 
to take care of?

* Also because all of this is asynchronous, you can't allocate the lock and 
state variables on the stack. When the consumer calls SetOwner('/', true, 
handler), the function is going to return as soon as the find operation is 
kicked off, destroying all of the elements on the stack. We'll need to create a 
little struct for SetOwner that is maintained with a shared_ptr and cleaned up 
when the last request is done.
(/) I removed futures and promises from recursive methods, and created a small 
struct to keep the state. Now it is purely async.

Minor points:

* In find, perhaps recursion_counter is a bit of a misnomer at this point. It's 
more outstanding_requests, since for big directories, we'll have more requests 
without recursing.
(/) Done.

* Perhaps FindOperationState is a better name than CurrentState, and 
SharedFindState is better than just SharedState (since we might have many 
shared states in the FileSystem class).
(/) Done.

* In CurrentState, perhaps "depth" is more accurate than position?
(/) Yes, I changed it to "depth" now.

* Do we support a globbing find without recursion? Can I find "/dir?/path*/" 
"*.db", and not have it recurse to the sub-directories of path*?
(/) POSIX find supports globbing without recursion by setting maxdepth to zero. 
I added maxdepth functionality for our tool also.

* Can we push the shims and state into the .cpp file and keep them out of the 
itnerface (even if private)?
(/) Done.


was (Author: anatoli.shein):
Thank for the review, [~bobhansen].

I have addressed your comments as follows:

* The new recursive methods should not use future/promises internally. That 
blocks one of the asio threads waiting for more data; if a consumer tried to do 
one of these with a single thread in the threadpool, it would deadlock waiting 
for the subtasks to complete, but they'd all get queued up behind the initial 
handler.

Instead, whenever they're all done (request_count == 0), the last one out the 
door (the handler that dropped the request_count to 0) should call into the 
consumer's handler directly with the final status. If any of the other threads 
has received an error, all of the subsequent deliveries to the handler should 
return false, telling find "I'm going to report an error anyway, so don't 
bother recursing any more." It's good to wait until request_count==0, even in 
an error state, so the consumer doesn't have any lame duck requests queued up 
to take care of?

Also because all of this is asynchronous, you can't allocate the lock and state 
variables on the stack. When the consumer calls SetOwner('/', true, handler), 
the function is going to return as soon as the find operation is kicked off, 
destroying all of the elements on the stack. We'll need to create a little 
struct for SetOwner that is maintained with a shared_ptr and cleaned up when 
the last request is done.

(/) I removed futures and promises from recursive methods, and created a small 
struct to keep the state. Now it is purely async.

Minor points:

* In find, perhaps recursion_counter is a bit of a misnomer at this point. It's 
more outstanding_requests, since for big directories, we'll have more requests 
without recursing.
(/) Done.

* Perhaps FindOperationState is a better name than CurrentState, and 
SharedFindState is better than just SharedState (since we might have many 
shared states in the FileSystem class).
(/) Done.

* In CurrentState, perhaps "depth" is more accurate than position?
(/) Yes, I changed it to "depth" now.

* Do we support a globbing find without recursion? Can I find "/dir?/path*/" 
"*.db", and not have it recurse to the sub-directories of path*?
(/) POSIX find supports globbing without recursion by setting maxdepth to

[jira] [Commented] (HDFS-10754) libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, hdfs_chown, hdfs_chmod and hdfs_find.

2016-08-25 Thread Anatoli Shein (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437416#comment-15437416
 ] 

Anatoli Shein commented on HDFS-10754:
--

Thank for the review, [~bobhansen].

I have addressed your comments as follows:

* The new recursive methods should not use future/promises internally. That 
blocks one of the asio threads waiting for more data; if a consumer tried to do 
one of these with a single thread in the threadpool, it would deadlock waiting 
for the subtasks to complete, but they'd all get queued up behind the initial 
handler.

Instead, whenever they're all done (request_count == 0), the last one out the 
door (the handler that dropped the request_count to 0) should call into the 
consumer's handler directly with the final status. If any of the other threads 
has received an error, all of the subsequent deliveries to the handler should 
return false, telling find "I'm going to report an error anyway, so don't 
bother recursing any more." It's good to wait until request_count==0, even in 
an error state, so the consumer doesn't have any lame duck requests queued up 
to take care of?

Also because all of this is asynchronous, you can't allocate the lock and state 
variables on the stack. When the consumer calls SetOwner('/', true, handler), 
the function is going to return as soon as the find operation is kicked off, 
destroying all of the elements on the stack. We'll need to create a little 
struct for SetOwner that is maintained with a shared_ptr and cleaned up when 
the last request is done.

(/) I removed futures and promises from recursive methods, and created a small 
struct to keep the state. Now it is purely async.

Minor points:

* In find, perhaps recursion_counter is a bit of a misnomer at this point. It's 
more outstanding_requests, since for big directories, we'll have more requests 
without recursing.
(/) Done.

* Perhaps FindOperationState is a better name than CurrentState, and 
SharedFindState is better than just SharedState (since we might have many 
shared states in the FileSystem class).
(/) Done.

* In CurrentState, perhaps "depth" is more accurate than position?
(/) Yes, I changed it to "depth" now.

* Do we support a globbing find without recursion? Can I find "/dir?/path*/" 
"*.db", and not have it recurse to the sub-directories of path*?
(/) POSIX find supports globbing without recursion by setting maxdepth to zero. 
I added maxdepth functionality for our tool also.

* Can we push the shims and state into the .cpp file and keep them out of the 
itnerface (even if private)?
(/) Done.

> libhdfs++: Create tools directory and implement hdfs_cat, hdfs_chgrp, 
> hdfs_chown, hdfs_chmod and hdfs_find.
> ---
>
> Key: HDFS-10754
> URL: https://issues.apache.org/jira/browse/HDFS-10754
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
> Attachments: HDFS-10754.HDFS-8707.000.patch, 
> HDFS-10754.HDFS-8707.001.patch, HDFS-10754.HDFS-8707.002.patch, 
> HDFS-10754.HDFS-8707.003.patch, HDFS-10754.HDFS-8707.004.patch, 
> HDFS-10754.HDFS-8707.005.patch, HDFS-10754.HDFS-8707.006.patch, 
> HDFS-10754.HDFS-8707.007.patch, HDFS-10754.HDFS-8707.008.patch, 
> HDFS-10754.HDFS-8707.009.patch, HDFS-10754.HDFS-8707.010.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-10742) Measurement of lock held time in FsDatasetImpl

2016-08-25 Thread Chen Liang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-10742:
--
Attachment: HDFS-10742.007.patch

Thanks [~chris.douglas] for the comments! 

Updated a patch to fix Jenkins findbugs complains. Also reverted an unnecessary 
change from previous patch that could be misleading. Will update with unit test 
later on.

Apart from this. I'm considering using hadoop's built-in metric system, instead 
of maintaining and printing locally maintained stats, what do you think about 
this?

> Measurement of lock held time in FsDatasetImpl
> --
>
> Key: HDFS-10742
> URL: https://issues.apache.org/jira/browse/HDFS-10742
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.0.0-alpha2
>Reporter: Chen Liang
>Assignee: Chen Liang
> Attachments: HDFS-10742.001.patch, HDFS-10742.002.patch, 
> HDFS-10742.003.patch, HDFS-10742.004.patch, HDFS-10742.005.patch, 
> HDFS-10742.006.patch, HDFS-10742.007.patch
>
>
> This JIRA proposes to measure the time the of lock of {{FsDatasetImpl}} is 
> held by a thread. Doing so will allow us to measure lock statistics.
> This can be done by extending the {{AutoCloseableLock}} lock object in 
> {{FsDatasetImpl}}. In the future we can also consider replacing the lock with 
> a read-write lock.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-9145) Tracking methods that hold FSNamesytemLock for too long

2016-08-25 Thread Mingliang Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437340#comment-15437340
 ] 

Mingliang Liu commented on HDFS-9145:
-

Thanks [~zhz] for taking care of this. I also noticed that you backported 
[HDFS-10798].

> Tracking methods that hold FSNamesytemLock for too long
> ---
>
> Key: HDFS-9145
> URL: https://issues.apache.org/jira/browse/HDFS-9145
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Fix For: 2.8.0, 2.7.4
>
> Attachments: HDFS-9145.000.patch, HDFS-9145.001.patch, 
> HDFS-9145.002.patch, HDFS-9145.003.patch
>
>
> It will be helpful that if we can have a way to track (or at least log a msg) 
> if some operation is holding the FSNamesystem lock for a long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Comment Edited] (HDFS-9145) Tracking methods that hold FSNamesytemLock for too long

2016-08-25 Thread Mingliang Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437340#comment-15437340
 ] 

Mingliang Liu edited comment on HDFS-9145 at 8/25/16 5:57 PM:
--

Thanks [~zhz] for taking care of this. I also noticed that you backported 
[HDFS-9467].


was (Author: liuml07):
Thanks [~zhz] for taking care of this. I also noticed that you backported 
[HDFS-10798].

> Tracking methods that hold FSNamesytemLock for too long
> ---
>
> Key: HDFS-9145
> URL: https://issues.apache.org/jira/browse/HDFS-9145
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Fix For: 2.8.0, 2.7.4
>
> Attachments: HDFS-9145.000.patch, HDFS-9145.001.patch, 
> HDFS-9145.002.patch, HDFS-9145.003.patch
>
>
> It will be helpful that if we can have a way to track (or at least log a msg) 
> if some operation is holding the FSNamesystem lock for a long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10584) Allow long-running Mover tool to login with keytab

2016-08-25 Thread Rakesh R (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437327#comment-15437327
 ] 

Rakesh R commented on HDFS-10584:
-

Hi [~zhz], I've tried addressing your comments in the latest patch, please 
review it again when you get a chance. Thanks!

> Allow long-running Mover tool to login with keytab
> --
>
> Key: HDFS-10584
> URL: https://issues.apache.org/jira/browse/HDFS-10584
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: balancer & mover
>Reporter: Rakesh R
>Assignee: Rakesh R
> Attachments: HDFS-10584-00.patch, HDFS-10584-01.patch, 
> HDFS-10584-02.patch, HDFS-10584-03.patch
>
>
> The idea of this jira is to support {{mover}} tool the ability to login from 
> a keytab. That way, the RPC client would re-login from the keytab after 
> expiration, which means the process could remain authenticated indefinitely. 
> With some people wanting to run mover non-stop in "daemon mode", that might 
> be a reasonable feature to add. Recently balancer has been enhanced using 
> this feature.
> Thanks [~zhz] for the offline discussions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10791) Delete block meta file when the block file is missing

2016-08-25 Thread Tsz Wo Nicholas Sze (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437326#comment-15437326
 ] 

Tsz Wo Nicholas Sze commented on HDFS-10791:


Hi [~linyiqun], thanks for pointing out the code.  However, there are two 
deficiencies:
# The code is only used by DirectoryScanner on FINALIZED blocks.  RBW block 
meta files won't be cleaned up.
# DirectoryScanner uses filters to list files so that it skips unidentified 
files if these files are not matched by the patterns.

> Delete block meta file when the block file is missing
> -
>
> Key: HDFS-10791
> URL: https://issues.apache.org/jira/browse/HDFS-10791
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Tsz Wo Nicholas Sze
>
> When the block file is missing, the block meta file should be deleted if it 
> exists.
> Note that such situation is possible since the meta file is closed before the 
> block file, the datanode could be killed in-between.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-9467) Fix data race accessing writeLockHeldTimeStamp in FSNamesystem

2016-08-25 Thread Zhe Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-9467:

Fix Version/s: 2.7.4

> Fix data race accessing writeLockHeldTimeStamp in FSNamesystem
> --
>
> Key: HDFS-9467
> URL: https://issues.apache.org/jira/browse/HDFS-9467
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.8.0, 2.7.4
>
> Attachments: HDFS-9467.000.patch, HDFS-9467.001.patch
>
>
> This is a followup of [HDFS-9145].
> Actually the value of {{writeLockInterval}} should be captured within the 
> lock. The current code has a race on {{writeLockHeldTimeStamp}}. Thanks to 
> [~jingzhao] for reporting this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-9145) Tracking methods that hold FSNamesytemLock for too long

2016-08-25 Thread Zhe Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-9145:

Fix Version/s: 2.7.4

> Tracking methods that hold FSNamesytemLock for too long
> ---
>
> Key: HDFS-9145
> URL: https://issues.apache.org/jira/browse/HDFS-9145
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Fix For: 2.8.0, 2.7.4
>
> Attachments: HDFS-9145.000.patch, HDFS-9145.001.patch, 
> HDFS-9145.002.patch, HDFS-9145.003.patch
>
>
> It will be helpful that if we can have a way to track (or at least log a msg) 
> if some operation is holding the FSNamesystem lock for a long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-9145) Tracking methods that hold FSNamesytemLock for too long

2016-08-25 Thread Zhe Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437295#comment-15437295
 ] 

Zhe Zhang commented on HDFS-9145:
-

Thanks [~liuml07], [~jingzhao]. This is pretty good improvement. I just 
backport this change and HDFS-9467 to branch-2.7.

> Tracking methods that hold FSNamesytemLock for too long
> ---
>
> Key: HDFS-9145
> URL: https://issues.apache.org/jira/browse/HDFS-9145
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Jing Zhao
>Assignee: Mingliang Liu
> Fix For: 2.8.0, 2.7.4
>
> Attachments: HDFS-9145.000.patch, HDFS-9145.001.patch, 
> HDFS-9145.002.patch, HDFS-9145.003.patch
>
>
> It will be helpful that if we can have a way to track (or at least log a msg) 
> if some operation is holding the FSNamesystem lock for a long time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-10768) Optimize mkdir ops

2016-08-25 Thread Daryn Sharp (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-10768:
---
Attachment: HDFS-10768.1.patch

Fixed findbugs and the style issue.

> Optimize mkdir ops
> --
>
> Key: HDFS-10768
> URL: https://issues.apache.org/jira/browse/HDFS-10768
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-10768.1.patch, HDFS-10768.patch
>
>
> Directory creation causes excessive object allocation: ex. an immutable list 
> builder, containing the string of components converted from the IIP's 
> byte[]s, sublist views of the string list, iterable, followed by string to 
> byte[] conversion.  This can all be eliminated by accessing the component's 
> byte[] in the IIP.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-8883) NameNode Metrics : Add FSNameSystem lock Queue Length

2016-08-25 Thread Zhe Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-8883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437259#comment-15437259
 ] 

Zhe Zhang commented on HDFS-8883:
-

Thanks [~anu] for the work. This is a pretty good improvement. I just 
backported to branch-2.7.

> NameNode Metrics : Add FSNameSystem lock Queue Length
> -
>
> Key: HDFS-8883
> URL: https://issues.apache.org/jira/browse/HDFS-8883
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: 2.8.0, 2.7.4
>
> Attachments: HDFS-8883.001.patch
>
>
> FSNameSystemLock can have contention when NameNode is under load. This patch 
> adds  LockQueueLength -- the number of threads waiting on FSNameSystemLock -- 
> as a metric in NameNode. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-8883) NameNode Metrics : Add FSNameSystem lock Queue Length

2016-08-25 Thread Zhe Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-8883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8883:

Fix Version/s: 2.7.4

> NameNode Metrics : Add FSNameSystem lock Queue Length
> -
>
> Key: HDFS-8883
> URL: https://issues.apache.org/jira/browse/HDFS-8883
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Fix For: 2.8.0, 2.7.4
>
> Attachments: HDFS-8883.001.patch
>
>
> FSNameSystemLock can have contention when NameNode is under load. This patch 
> adds  LockQueueLength -- the number of threads waiting on FSNameSystemLock -- 
> as a metric in NameNode. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-8773) Few FSNamesystem metrics are not documented in the Metrics page

2016-08-25 Thread Zhe Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-8773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8773:

Fix Version/s: (was: 2.7.4)

> Few FSNamesystem metrics are not documented in the Metrics page
> ---
>
> Key: HDFS-8773
> URL: https://issues.apache.org/jira/browse/HDFS-8773
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Reporter: Rakesh R
>Assignee: Rakesh R
> Fix For: 2.8.0
>
> Attachments: HDFS-8773-00.patch
>
>
> This jira is to document missing metrics in the [Metrics 
> page|https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/Metrics.html#FSNamesystem].
>  Following are not documented:
> {code}
> MissingReplOneBlocks
> NumFilesUnderConstruction
> NumActiveClients
> HAState
> FSState
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-8773) Few FSNamesystem metrics are not documented in the Metrics page

2016-08-25 Thread Zhe Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-8773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8773:

Fix Version/s: 2.7.4

> Few FSNamesystem metrics are not documented in the Metrics page
> ---
>
> Key: HDFS-8773
> URL: https://issues.apache.org/jira/browse/HDFS-8773
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Reporter: Rakesh R
>Assignee: Rakesh R
> Fix For: 2.8.0, 2.7.4
>
> Attachments: HDFS-8773-00.patch
>
>
> This jira is to document missing metrics in the [Metrics 
> page|https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/Metrics.html#FSNamesystem].
>  Following are not documented:
> {code}
> MissingReplOneBlocks
> NumFilesUnderConstruction
> NumActiveClients
> HAState
> FSState
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-8721) Add a metric for number of encryption zones

2016-08-25 Thread Zhe Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437228#comment-15437228
 ] 

Zhe Zhang commented on HDFS-8721:
-

This is a pretty good improvement. I just backported to branch-2.7.

> Add a metric for number of encryption zones
> ---
>
> Key: HDFS-8721
> URL: https://issues.apache.org/jira/browse/HDFS-8721
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: encryption
>Reporter: Rakesh R
>Assignee: Rakesh R
> Fix For: 2.8.0, 2.7.4
>
> Attachments: HDFS-8721-00.patch, HDFS-8721-01.patch
>
>
> Would be good to expose the number of encryption zones.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-8721) Add a metric for number of encryption zones

2016-08-25 Thread Zhe Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-8721:

Fix Version/s: 2.7.4

> Add a metric for number of encryption zones
> ---
>
> Key: HDFS-8721
> URL: https://issues.apache.org/jira/browse/HDFS-8721
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: encryption
>Reporter: Rakesh R
>Assignee: Rakesh R
> Fix For: 2.8.0, 2.7.4
>
> Attachments: HDFS-8721-00.patch, HDFS-8721-01.patch
>
>
> Would be good to expose the number of encryption zones.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Created] (HDFS-10798) Make the threshold of reporting FSNamesystem lock contention configurable

2016-08-25 Thread Zhe Zhang (JIRA)

Zhe Zhang created HDFS-10798:


 Summary: Make the threshold of reporting FSNamesystem lock 
contention configurable
 Key: HDFS-10798
 URL: https://issues.apache.org/jira/browse/HDFS-10798
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Zhe Zhang


Currently {{FSNamesystem#WRITELOCK_REPORTING_THRESHOLD}} is set at 1 second. In 
a busy system this might add too much overhead. We should make the threshold 
configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Created] (HDFS-10797) Disk usage summary of snapshots causes renamed blocks to get counted twice

2016-08-25 Thread Sean Mackrory (JIRA)

Sean Mackrory created HDFS-10797:


 Summary: Disk usage summary of snapshots causes renamed blocks to 
get counted twice
 Key: HDFS-10797
 URL: https://issues.apache.org/jira/browse/HDFS-10797
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Sean Mackrory


DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates how much 
disk usage is used by a snapshot by tallying up the files in the snapshot that 
have since been deleted (that way it won't overlap with regular files whose 
disk usage is computed separately). However that is determined from a diff that 
shows moved (to Trash or otherwise) or renamed files as a deletion and a 
creation operation that may overlap with the list of blocks. Only the deletion 
operation is taken into consideration, and this causes those blocks to get 
represented twice in the disk usage tallying.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10748) TestFileTruncate#testTruncateWithDataNodesRestart runs sometimes timeout

2016-08-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437189#comment-15437189
 ] 

Hudson commented on HDFS-10748:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10346 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10346/])
HDFS-10748. TestFileTruncate#testTruncateWithDataNodesRestart runs (xyao: rev 
4da5000dd33cf013e7212848ed2c44f1e60e860e)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileTruncate.java


> TestFileTruncate#testTruncateWithDataNodesRestart runs sometimes timeout
> 
>
> Key: HDFS-10748
> URL: https://issues.apache.org/jira/browse/HDFS-10748
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Xiaoyu Yao
>Assignee: Yiqun Lin
> Fix For: 2.8.0
>
> Attachments: HDFS-10748.001.patch, HDFS-10748.002.patch
>
>
> This was fixed by HDFS-7886. But some recent [Jenkins 
> Results|https://builds.apache.org/job/PreCommit-HDFS-Build/16390/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt]
>  started seeing this again: 
> {code}
> Tests run: 18, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 172.025 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.server.namenode.TestFileTruncate
> testTruncateWithDataNodesRestart(org.apache.hadoop.hdfs.server.namenode.TestFileTruncate)
>   Time elapsed: 43.861 sec  <<< ERROR!
> java.util.concurrent.TimeoutException: Timed out waiting for 
> /test/testTruncateWithDataNodesRestart to reach 3 replicas
>   at 
> org.apache.hadoop.hdfs.DFSTestUtil.waitReplication(DFSTestUtil.java:751)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestFileTruncate.testTruncateWithDataNodesRestart(TestFileTruncate.java:704)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10768) Optimize mkdir ops

2016-08-25 Thread Kihwal Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437169#comment-15437169
 ] 

Kihwal Lee commented on HDFS-10768:
---

The test failure was already reported in HDFS-10498. I've added the log and 
initial analysis there.
[~daryn], please take care of the findbugs warning.

> Optimize mkdir ops
> --
>
> Key: HDFS-10768
> URL: https://issues.apache.org/jira/browse/HDFS-10768
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-10768.patch
>
>
> Directory creation causes excessive object allocation: ex. an immutable list 
> builder, containing the string of components converted from the IIP's 
> byte[]s, sublist views of the string list, iterable, followed by string to 
> byte[] conversion.  This can all be eliminated by accessing the component's 
> byte[] in the IIP.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-10748) TestFileTruncate#testTruncateWithDataNodesRestart runs sometimes timeout

2016-08-25 Thread Xiaoyu Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-10748:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

Thanks [~linyiqun] for the contribution. I've commit the fix to trunk, branch-2 
and branch-2.8.

> TestFileTruncate#testTruncateWithDataNodesRestart runs sometimes timeout
> 
>
> Key: HDFS-10748
> URL: https://issues.apache.org/jira/browse/HDFS-10748
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Xiaoyu Yao
>Assignee: Yiqun Lin
> Fix For: 2.8.0
>
> Attachments: HDFS-10748.001.patch, HDFS-10748.002.patch
>
>
> This was fixed by HDFS-7886. But some recent [Jenkins 
> Results|https://builds.apache.org/job/PreCommit-HDFS-Build/16390/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt]
>  started seeing this again: 
> {code}
> Tests run: 18, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 172.025 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.server.namenode.TestFileTruncate
> testTruncateWithDataNodesRestart(org.apache.hadoop.hdfs.server.namenode.TestFileTruncate)
>   Time elapsed: 43.861 sec  <<< ERROR!
> java.util.concurrent.TimeoutException: Timed out waiting for 
> /test/testTruncateWithDataNodesRestart to reach 3 replicas
>   at 
> org.apache.hadoop.hdfs.DFSTestUtil.waitReplication(DFSTestUtil.java:751)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestFileTruncate.testTruncateWithDataNodesRestart(TestFileTruncate.java:704)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-10498) Intermittent test failure org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength.testSnapshotfileLength

2016-08-25 Thread Kihwal Lee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-10498:
--
Attachment: test_failure.txt

Attaching a test log from a precommit build.

The test calls {{append()}} and makes an assumption on when the block state 
changes on datanodes. In the failed test, {{getFileChecksum()}} reached a 
datanode right after {{append()}}. This made {{BLOCK_CHECKSUM}} op on the 
datanode to fail with an {{EOFException}} while trying to read the meta file. 
If the {{BLOCK_CHECKSUM}} op reached the datanode a tiny bit later, it would 
have failed with the expected exception.



> Intermittent test failure 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength.testSnapshotfileLength
> ---
>
> Key: HDFS-10498
> URL: https://issues.apache.org/jira/browse/HDFS-10498
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, snapshots
>Affects Versions: 3.0.0-alpha1
>Reporter: Hanisha Koneru
> Attachments: test_failure.txt
>
>
> Error Details
> Per https://builds.apache.org/job/PreCommit-HDFS-Build/15646/testReport/, we 
> had the following failure. Local rerun is successful.
> Error Details:
> {panel}
> Fail to get block MD5 for 
> LocatedBlock{BP-145245805-172.17.0.3-1464981728847:blk_1073741826_1002; 
> getBlockSize()=1; corrupt=false; offset=1024; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:55764,DS-a33d7c97-9d4a-4694-a47e-a3187a33ed5a,DISK]]}
> {panel}
> Stack Trace: 
> {panel}
> java.io.IOException: Fail to get block MD5 for 
> LocatedBlock{BP-145245805-172.17.0.3-1464981728847:blk_1073741826_1002; 
> getBlockSize()=1; corrupt=false; offset=1024; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:55764,DS-a33d7c97-9d4a-4694-a47e-a3187a33ed5a,DISK]]}
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$ReplicatedFileChecksumComputer.checksumBlocks(FileChecksumHelper.java:289)
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:206)
>   at org.apache.hadoop.hdfs.DFSClient.getFileChecksum(DFSClient.java:1731)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$31.doCall(DistributedFileSystem.java:1482)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$31.doCall(DistributedFileSystem.java:1479)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1490)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength.testSnapshotfileLength(TestSnapshotFileLength.java:137)
>  Standard Output  7 sec
> {panel}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10619) Cache path in InodesInPath

2016-08-25 Thread Kihwal Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437129#comment-15437129
 ] 

Kihwal Lee commented on HDFS-10619:
---

[~daryn], you need to rebase the patch.

> Cache path in InodesInPath
> --
>
> Key: HDFS-10619
> URL: https://issues.apache.org/jira/browse/HDFS-10619
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-10619.patch
>
>
> INodesInPath#getPath, a frequently called method, dynamically builds the 
> path.  IIP should cache the path upon construction.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-10748) TestFileTruncate#testTruncateWithDataNodesRestart runs sometimes timeout

2016-08-25 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437123#comment-15437123
 ] 

Xiaoyu Yao commented on HDFS-10748:
---

Thanks [~linyiqun] for working on this. The patch v02 LGTM,  +1. I will commit 
it shortly.

> TestFileTruncate#testTruncateWithDataNodesRestart runs sometimes timeout
> 
>
> Key: HDFS-10748
> URL: https://issues.apache.org/jira/browse/HDFS-10748
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Xiaoyu Yao
>Assignee: Yiqun Lin
> Attachments: HDFS-10748.001.patch, HDFS-10748.002.patch
>
>
> This was fixed by HDFS-7886. But some recent [Jenkins 
> Results|https://builds.apache.org/job/PreCommit-HDFS-Build/16390/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt]
>  started seeing this again: 
> {code}
> Tests run: 18, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 172.025 sec 
> <<< FAILURE! - in org.apache.hadoop.hdfs.server.namenode.TestFileTruncate
> testTruncateWithDataNodesRestart(org.apache.hadoop.hdfs.server.namenode.TestFileTruncate)
>   Time elapsed: 43.861 sec  <<< ERROR!
> java.util.concurrent.TimeoutException: Timed out waiting for 
> /test/testTruncateWithDataNodesRestart to reach 3 replicas
>   at 
> org.apache.hadoop.hdfs.DFSTestUtil.waitReplication(DFSTestUtil.java:751)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestFileTruncate.testTruncateWithDataNodesRestart(TestFileTruncate.java:704)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Updated] (HDFS-4660) Block corruption can happen during pipeline recovery

2016-08-25 Thread Nathan Roberts (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nathan Roberts updated HDFS-4660:
-
Attachment: periodic_hflush.patch

> Block corruption can happen during pipeline recovery
> 
>
> Key: HDFS-4660
> URL: https://issues.apache.org/jira/browse/HDFS-4660
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.0.3-alpha, 3.0.0-alpha1
>Reporter: Peng Zhang
>Assignee: Kihwal Lee
>Priority: Blocker
> Fix For: 2.7.1, 2.6.4
>
> Attachments: HDFS-4660.br26.patch, HDFS-4660.patch, HDFS-4660.patch, 
> HDFS-4660.v2.patch, periodic_hflush.patch
>
>
> pipeline DN1  DN2  DN3
> stop DN2
> pipeline added node DN4 located at 2nd position
> DN1  DN4  DN3
> recover RBW
> DN4 after recover rbw
> 2013-04-01 21:02:31,570 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover 
> RBW replica 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1004
> 2013-04-01 21:02:31,570 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW
>   getNumBytes() = 134144
>   getBytesOnDisk() = 134144
>   getVisibleLength()= 134144
> end at chunk (134144/512=262)
> DN3 after recover rbw
> 2013-04-01 21:02:31,575 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover 
> RBW replica 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_10042013-04-01
>  21:02:31,575 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW
>   getNumBytes() = 134028 
>   getBytesOnDisk() = 134028
>   getVisibleLength()= 134028
> client send packet after recover pipeline
> offset=133632  len=1008
> DN4 after flush 
> 2013-04-01 21:02:31,779 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file 
> offset:134640; meta offset:1063
> // meta end position should be floor(134640/512)*4 + 7 == 1059, but now it is 
> 1063.
> DN3 after flush
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005, 
> type=LAST_IN_PIPELINE, downstreams=0:[]: enqueue Packet(seqno=219, 
> lastPacketInBlock=false, offsetInBlock=134640, 
> ackEnqueueNanoTime=8817026136871545)
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Changing 
> meta file offset of block 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005 from 
> 1055 to 1051
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file 
> offset:134640; meta offset:1059
> After checking meta on DN4, I found checksum of chunk 262 is duplicated, but 
> data not.
> Later after block was finalized, DN4's scanner detected bad block, and then 
> reported it to NM. NM send a command to delete this block, and replicate this 
> block from other DN in pipeline to satisfy duplication num.
> I think this is because in BlockReceiver it skips data bytes already written, 
> but not skips checksum bytes already written. And function 
> adjustCrcFilePosition is only used for last non-completed chunk, but
> not for this situation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-4660) Block corruption can happen during pipeline recovery

2016-08-25 Thread Nathan Roberts (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-4660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437036#comment-15437036
 ] 

Nathan Roberts commented on HDFS-4660:
--

Hi [~yzhangal]. Had to go back to an old git stash, but I'll attach a sample 
patch to TeraOutputFormat.

> Block corruption can happen during pipeline recovery
> 
>
> Key: HDFS-4660
> URL: https://issues.apache.org/jira/browse/HDFS-4660
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.0.3-alpha, 3.0.0-alpha1
>Reporter: Peng Zhang
>Assignee: Kihwal Lee
>Priority: Blocker
> Fix For: 2.7.1, 2.6.4
>
> Attachments: HDFS-4660.br26.patch, HDFS-4660.patch, HDFS-4660.patch, 
> HDFS-4660.v2.patch
>
>
> pipeline DN1  DN2  DN3
> stop DN2
> pipeline added node DN4 located at 2nd position
> DN1  DN4  DN3
> recover RBW
> DN4 after recover rbw
> 2013-04-01 21:02:31,570 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover 
> RBW replica 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1004
> 2013-04-01 21:02:31,570 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW
>   getNumBytes() = 134144
>   getBytesOnDisk() = 134144
>   getVisibleLength()= 134144
> end at chunk (134144/512=262)
> DN3 after recover rbw
> 2013-04-01 21:02:31,575 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Recover 
> RBW replica 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_10042013-04-01
>  21:02:31,575 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Recovering ReplicaBeingWritten, blk_-9076133543772600337_1004, RBW
>   getNumBytes() = 134028 
>   getBytesOnDisk() = 134028
>   getVisibleLength()= 134028
> client send packet after recover pipeline
> offset=133632  len=1008
> DN4 after flush 
> 2013-04-01 21:02:31,779 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file 
> offset:134640; meta offset:1063
> // meta end position should be floor(134640/512)*4 + 7 == 1059, but now it is 
> 1063.
> DN3 after flush
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005, 
> type=LAST_IN_PIPELINE, downstreams=0:[]: enqueue Packet(seqno=219, 
> lastPacketInBlock=false, offsetInBlock=134640, 
> ackEnqueueNanoTime=8817026136871545)
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Changing 
> meta file offset of block 
> BP-325305253-10.2.201.14-1364820083462:blk_-9076133543772600337_1005 from 
> 1055 to 1051
> 2013-04-01 21:02:31,782 DEBUG 
> org.apache.hadoop.hdfs.server.datanode.DataNode: FlushOrsync, file 
> offset:134640; meta offset:1059
> After checking meta on DN4, I found checksum of chunk 262 is duplicated, but 
> data not.
> Later after block was finalized, DN4's scanner detected bad block, and then 
> reported it to NM. NM send a command to delete this block, and replicate this 
> block from other DN in pipeline to satisfy duplication num.
> I think this is because in BlockReceiver it skips data bytes already written, 
> but not skips checksum bytes already written. And function 
> adjustCrcFilePosition is only used for last non-completed chunk, but
> not for this situation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDFS-9038) DFS reserved space is erroneously counted towards non-DFS used.

2016-08-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15437020#comment-15437020
 ] 

Hadoop QA commented on HDFS-9038:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 9 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
6s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
22s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
6s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  1m 52s{color} 
| {color:red} hadoop-hdfs-project generated 1 new + 51 unchanged - 1 fixed = 52 
total (was 52) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 40s{color} | {color:orange} hadoop-hdfs-project: The patch generated 1 new + 
583 unchanged - 5 fixed = 584 total (was 588) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
57s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 77m  0s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}109m  7s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.mover.TestStorageMover |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12825464/HDFS-9038-010.patch |
| JIRA Issue | HDFS-9038 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux 4f982e9d4706 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 525d52b |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| javac | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16537/artifact/patchprocess/diff-compile-javac-hadoop-hdfs-project.txt
 |
| checkstyle |

[jira] [Updated] (HDFS-7859) Erasure Coding: Persist erasure coding policies in NameNode

2016-08-25 Thread Xinwei Qin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xinwei Qin  updated HDFS-7859:
--
Attachment: (was: HDFS-7859.008.patch)

> Erasure Coding: Persist erasure coding policies in NameNode
> ---
>
> Key: HDFS-7859
> URL: https://issues.apache.org/jira/browse/HDFS-7859
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Xinwei Qin 
>  Labels: BB2015-05-TBR, hdfs-ec-3.0-must-do
> Attachments: HDFS-7859-HDFS-7285.002.patch, 
> HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.003.patch, 
> HDFS-7859.001.patch, HDFS-7859.002.patch, HDFS-7859.004.patch, 
> HDFS-7859.005.patch, HDFS-7859.006.patch, HDFS-7859.007.patch
>
>
> In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we 
> persist EC schemas in NameNode centrally and reliably, so that EC zones can 
> reference them by name efficiently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

1 2 >

1 - 100 of 115 matches

Mail list logo