[jira] [Commented] (HDFS-10284) o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode fails intermittently

2016-04-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247215#comment-15247215
 ] 

Hadoop QA commented on HDFS-10284:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 54s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 56s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 7s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 55m 54s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 54m 9s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 135m 29s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead |
|   | hadoop.hdfs.server.namenode.TestEditLog |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitCache |
| JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.TestHFlush |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.TestDecommissionWithStriped |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitCache |
\\
\\
|| Subsystem || 

[jira] [Commented] (HDFS-10301) Blocks removed by thousands due to falsely detected zombie storages

2016-04-18 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247129#comment-15247129
 ] 

Walter Su commented on HDFS-10301:
--

Oh, I see. In this case, the reports are not split. And because the for-loop 
is outside the lock, the two for-loops interleaved.
{code}
for (int r = 0; r < reports.length; r++) {
{code}
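
The interleaving described above can be illustrated with a small, deterministic model. This is only a sketch under stated assumptions: the class {{ZombieInterleaveSketch}}, its methods, and the storage names {{s1}}/{{s2}} are hypothetical and are not the NameNode's actual code. It assumes each loop iteration stamps a storage with the current report ID, and that any storage not carrying that ID after the loop is treated as a zombie.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical model of per-storage report processing.
// It is NOT the NameNode implementation; names are made up for illustration.
class ZombieInterleaveSketch {
  // storage ID -> ID of the last block report that touched the storage
  private final Map<String, Long> lastReportId = new LinkedHashMap<>();

  // One loop iteration. In the scenario discussed, the lock is assumed to be
  // held only around a call like this, not around the whole loop.
  void processStorage(String storageId, long reportId) {
    lastReportId.put(storageId, reportId);
  }

  // After the loop: storages not stamped with this report's ID look like zombies.
  List<String> findZombies(long reportId) {
    List<String> zombies = new ArrayList<>();
    for (Map.Entry<String, Long> e : lastReportId.entrySet()) {
      if (e.getValue() != reportId) {
        zombies.add(e.getKey());
      }
    }
    return zombies;
  }

  // Report 1 and its retransmission (report 2) interleave storage by storage.
  static List<String> interleavedRun() {
    ZombieInterleaveSketch nn = new ZombieInterleaveSketch();
    nn.processStorage("s1", 1L); // handler A, report 1
    nn.processStorage("s1", 2L); // handler B, report 2
    nn.processStorage("s2", 2L); // handler B, report 2
    nn.processStorage("s2", 1L); // handler A, report 1
    return nn.findZombies(1L);   // handler A finishes: s1 carries ID 2
  }

  // The same two reports processed one after the other.
  static List<String> serializedRun() {
    ZombieInterleaveSketch nn = new ZombieInterleaveSketch();
    List<String> zombies = new ArrayList<>();
    nn.processStorage("s1", 1L);
    nn.processStorage("s2", 1L);
    zombies.addAll(nn.findZombies(1L));
    nn.processStorage("s1", 2L);
    nn.processStorage("s2", 2L);
    zombies.addAll(nn.findZombies(2L));
    return zombies;
  }
}
```

In this model the interleaved run flags the live storage {{s1}}, while the serialized run flags nothing.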

> Blocks removed by thousands due to falsely detected zombie storages
> ---
>
> Key: HDFS-10301
> URL: https://issues.apache.org/jira/browse/HDFS-10301
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.1
>Reporter: Konstantin Shvachko
>Priority: Critical
> Attachments: zombieStorageLogs.rtf
>
>
> When the NameNode is busy, a DataNode can time out sending a block report and 
> then send the block report again. While processing these two reports at the 
> same time, the NameNode can interleave processing storages from different 
> reports. This screws up the blockReportId field, which makes the NameNode 
> think that some storages are zombies. Replicas from zombie storages are 
> immediately removed, causing missing blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10284) o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode fails intermittently

2016-04-18 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10284:
-
Attachment: HDFS-10284.003.patch

The v3 patch addresses [~brahmareddy]'s comment about the code comments. 

As the test cases {{testCheckSafeModeX()}} each need more than three words to 
explain, and are grouped together under a detailed leading javadoc that 
explains why we split them, I did not rename the test methods further.

> o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode 
> fails intermittently
> -
>
> Key: HDFS-10284
> URL: https://issues.apache.org/jira/browse/HDFS-10284
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.9.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10284.000.patch, HDFS-10284.001.patch, 
> HDFS-10284.002.patch, HDFS-10284.003.patch
>
>
> *Stacktrace*
> {code}
> org.mockito.exceptions.misusing.UnfinishedStubbingException: 
> Unfinished stubbing detected here:
> -> at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode(TestBlockManagerSafeMode.java:169)
> E.g. thenReturn() may be missing.
> Examples of correct stubbing:
> when(mock.isOk()).thenReturn(true);
> when(mock.isOk()).thenThrow(exception);
> doThrow(exception).when(mock).someVoidMethod();
> Hints:
>  1. missing thenReturn()
>  2. although stubbed methods may return mocks, you cannot inline mock 
> creation (mock()) call inside a thenReturn method (see issue 53)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode(TestBlockManagerSafeMode.java:169)
> {code}
> Sample failing pre-commit UT: 
> https://builds.apache.org/job/PreCommit-HDFS-Build/15153/testReport/org.apache.hadoop.hdfs.server.blockmanagement/TestBlockManagerSafeMode/testCheckSafeMode/





[jira] [Commented] (HDFS-9684) DataNode stopped sending heartbeat after getting OutOfMemoryError form DataTransfer thread.

2016-04-18 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247037#comment-15247037
 ] 

Walter Su commented on HDFS-9684:
-

My previous comment was incorrect. It turns out that the MR tasks consumed all 
the virtual memory.

> DataNode stopped sending heartbeat after getting OutOfMemoryError form 
> DataTransfer thread.
> ---
>
> Key: HDFS-9684
> URL: https://issues.apache.org/jira/browse/HDFS-9684
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.7.1
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Blocker
> Attachments: HDFS-9684.01.patch
>
>
> {noformat}
> java.lang.OutOfMemoryError: unable to create new native thread
>   at java.lang.Thread.start0(Native Method)
>   at java.lang.Thread.start(Thread.java:714)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.transferBlock(DataNode.java:1999)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.transferBlocks(DataNode.java:2008)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:657)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:615)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:857)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:671)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:823)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}





[jira] [Commented] (HDFS-10306) SafeModeMonitor should not leave safe mode if name system is starting active service

2016-04-18 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246999#comment-15246999
 ] 

Mingliang Liu commented on HDFS-10306:
--

Thank you [~brahmareddy] for your review (and previous work). Thanks 
[~jingzhao] for your review and commit.

> SafeModeMonitor should not leave safe mode if name system is starting active 
> service
> 
>
> Key: HDFS-10306
> URL: https://issues.apache.org/jira/browse/HDFS-10306
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.9.0
>
> Attachments: HDFS-10306.000.patch
>
>
> This is a follow-up of [HDFS-10192].
> The {{BlockManagerSafeMode$SafeModeMonitor#canLeave()}} is not checking the 
> {{namesystem#inTransitionToActive()}}, while it should. According to the fix 
> of [HDFS-10192], we should add this check to prevent the {{smmthread}} from 
> calling {{leaveSafeMode()}} too early.
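
The shape of the guard described above can be sketched as follows. This is a minimal sketch, not the committed patch: {{SafeModeMonitorSketch}} and its two suppliers are hypothetical stand-ins for the monitor and for {{namesystem#inTransitionToActive()}}, and all other conditions that the real {{canLeave()}} evaluates are collapsed into a single {{thresholdsMet}} flag.

```java
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for the safe mode monitor; only the guard is modelled.
class SafeModeMonitorSketch {
  private final BooleanSupplier inTransitionToActive; // assumed view of the name system
  private final BooleanSupplier thresholdsMet;        // all other leave conditions, collapsed

  SafeModeMonitorSketch(BooleanSupplier inTransitionToActive,
                        BooleanSupplier thresholdsMet) {
    this.inTransitionToActive = inTransitionToActive;
    this.thresholdsMet = thresholdsMet;
  }

  // Refuse to leave safe mode while active services are still starting.
  boolean canLeave() {
    if (inTransitionToActive.getAsBoolean()) {
      return false;
    }
    return thresholdsMet.getAsBoolean();
  }
}
```

With this guard, a monitor thread that wakes up while the transition is still in progress keeps waiting, even if every threshold is already satisfied.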





[jira] [Commented] (HDFS-10301) Blocks removed by thousands due to falsely detected zombie storages

2016-04-18 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246996#comment-15246996
 ] 

Walter Su commented on HDFS-10301:
--

1. The IPC reader is single-threaded by default. If it is multi-threaded, the 
order of putting RPC requests into {{callQueue}} is unspecified.
2. The IPC {{callQueue}} is FIFO.
3. The IPC Handler is multi-threaded. If two handlers are both waiting for the 
fsn lock, the entry order depends on the fairness of the lock.
bq. When constructed as fair, threads contend for entry using an 
*approximately* arrival-order policy. When the currently held lock is released 
either the longest-waiting single writer thread will be assigned the write 
lock... (quote from 
https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/locks/ReentrantReadWriteLock.html)

I think if the DN can't get an ack from the NN, it shouldn't assume the 
arrival/processing order (esp. when re-establishing a connection). Well, I'm 
still curious about how the interleaving happened. Any thoughts?
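
For reference, the fairness mentioned in the quoted Javadoc is chosen when a {{ReentrantReadWriteLock}} is constructed. The sketch below only demonstrates that JDK API; {{FairLockSketch}} and its methods are hypothetical names, and it makes no claim about how the fsn lock is actually configured.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative only; class and method names are hypothetical.
class FairLockSketch {
  // Fairness is fixed at construction time. The no-argument constructor
  // creates a non-fair lock.
  static ReentrantReadWriteLock newLock(boolean fair) {
    return new ReentrantReadWriteLock(fair);
  }

  // Runs a task while holding the write lock, releasing it even if the task throws.
  static void withWriteLock(ReentrantReadWriteLock lock, Runnable task) {
    lock.writeLock().lock();
    try {
      task.run();
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

Even in fair mode the Javadoc promises only an approximately arrival-order policy, so callers should not rely on a strict ordering between two waiting handlers.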

> Blocks removed by thousands due to falsely detected zombie storages
> ---
>
> Key: HDFS-10301
> URL: https://issues.apache.org/jira/browse/HDFS-10301
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.1
>Reporter: Konstantin Shvachko
>Priority: Critical
> Attachments: zombieStorageLogs.rtf
>
>
> When the NameNode is busy, a DataNode can time out sending a block report and 
> then send the block report again. While processing these two reports at the 
> same time, the NameNode can interleave processing storages from different 
> reports. This screws up the blockReportId field, which makes the NameNode 
> think that some storages are zombies. Replicas from zombie storages are 
> immediately removed, causing missing blocks.





[jira] [Commented] (HDFS-10275) TestDataNodeMetrics failing intermittently due to TotalWriteTime counted incorrectly

2016-04-18 Thread Lin Yiqun (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246990#comment-15246990
 ] 

Lin Yiqun commented on HDFS-10275:
--

Thanks [~walter.k.su] for the commit!

> TestDataNodeMetrics failing intermittently due to TotalWriteTime counted 
> incorrectly
> 
>
> Key: HDFS-10275
> URL: https://issues.apache.org/jira/browse/HDFS-10275
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
> Fix For: 2.7.3
>
> Attachments: HDFS-10275.001.patch
>
>
> The unit test {{TestDataNodeMetrics}} fails intermittently. The failure 
> output is:
> {code}
> Results :
> Failed tests: 
>   
> TestDataNodeVolumeFailureToleration.testVolumeAndTolerableConfiguration:195->testVolumeConfig:232
>  expected: but was:
> Tests in error: 
>   TestOpenFilesWithSnapshot.testWithCheckpoint:94 ? IO Timed out waiting for 
> Min...
>   TestDataNodeMetrics.testDataNodeTimeSpend:279 ? Timeout Timed out waiting 
> for ...
>   TestHFlush.testHFlushInterrupted ? IO The stream is closed
> {code}
> The timeout occurs at line 279 of {{TestDataNodeMetrics}}. I looked into the 
> code and found that the real reason is that the {{TotalWriteTime}} metric 
> frequently stays at 0 in each file-creation iteration, which leads to retries 
> until the timeout.
> I debugged the test locally. The most likely reason the {{TotalWriteTime}} 
> metric always stays at 0 is that the test uses {{SimulatedFSDataset}} to 
> measure time spent. {{SimulatedFSDataset}} counts the write time through its 
> inner class's method {{SimulatedOutputStream#write}}, and that method just 
> updates {{length}} and throws its data away.
> {code}
> @Override
> public void write(byte[] b,
>   int off,
>   int len) throws IOException  {
>   length += len;
> }
> {code} 
> So the write operation takes hardly any time. We should therefore create the 
> file the real way instead of the simulated way. In my local tests, the test 
> passed on the first attempt once I removed the simulated dataset, whereas 
> with the old approach it retried many times to accumulate write time.
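
The effect described above can be reproduced in isolation. The sketch below is an illustration under assumptions: {{DiscardingStreamSketch}} is a hypothetical stand-in for {{SimulatedOutputStream}}, and it assumes the write time is measured at millisecond granularity, which is not confirmed by the report.

```java
import java.io.OutputStream;

// Hypothetical stand-in for a stream that only counts bytes and discards them.
class DiscardingStreamSketch extends OutputStream {
  private long length = 0;

  @Override
  public void write(int b) {
    length++;
  }

  @Override
  public void write(byte[] b, int off, int len) {
    length += len; // no copy and no I/O, so the call returns almost immediately
  }

  long getLength() {
    return length;
  }

  // Times a single write and truncates the result to whole milliseconds
  // (the millisecond granularity is an assumption of this sketch).
  static long timedWriteMillis(DiscardingStreamSketch out, byte[] data) {
    long start = System.nanoTime();
    out.write(data, 0, data.length);
    return (System.nanoTime() - start) / 1_000_000L;
  }
}
```

Because the write does no real work, the truncated duration is usually 0, so a metric that sums such durations can stay at 0 across many iterations.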





[jira] [Commented] (HDFS-10302) BlockPlacementPolicyDefault should use default replication considerload value

2016-04-18 Thread Lin Yiqun (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246988#comment-15246988
 ] 

Lin Yiqun commented on HDFS-10302:
--

Thanks [~kihwal] for the quick review and commit!

> BlockPlacementPolicyDefault should use default replication considerload value
> -
>
> Key: HDFS-10302
> URL: https://issues.apache.org/jira/browse/HDFS-10302
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
>Priority: Trivial
> Fix For: 2.8.0
>
> Attachments: HDFS-10302.001.patch
>
>
> Currently, the method {{BlockPlacementPolicyDefault#initialize}} uses the 
> literal {{true}} as the replication considerload default value rather than 
> the existing constant {{DFS_NAMENODE_REPLICATION_CONSIDERLOAD_DEFAULT}}.
> {code}
>   @Override
>   public void initialize(Configuration conf, FSClusterStats stats,
>                          NetworkTopology clusterMap,
>                          Host2NodesMap host2datanodeMap) {
>     this.considerLoad = conf.getBoolean(
>         DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_KEY, true);
>     this.considerLoadFactor = conf.getDouble(
>         DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_FACTOR,
>         DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_FACTOR_DEFAULT);
>     this.stats = stats;
>     this.clusterMap = clusterMap;
>     this.host2datanodeMap = host2datanodeMap;
>     this.heartbeatInterval = conf.getLong(
>         DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY,
>         DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_DEFAULT) * 1000;
>     this.tolerateHeartbeatMultiplier = conf.getInt(
>         DFSConfigKeys.DFS_NAMENODE_TOLERATE_HEARTBEAT_MULTIPLIER_KEY,
>         DFSConfigKeys.DFS_NAMENODE_TOLERATE_HEARTBEAT_MULTIPLIER_DEFAULT);
>     this.staleInterval = conf.getLong(
>         DFSConfigKeys.DFS_NAMENODE_STALE_DATANODE_INTERVAL_KEY,
>         DFSConfigKeys.DFS_NAMENODE_STALE_DATANODE_INTERVAL_DEFAULT);
>     this.preferLocalNode = conf.getBoolean(
>         DFSConfigKeys.
>             DFS_NAMENODE_BLOCKPLACEMENTPOLICY_DEFAULT_PREFER_LOCAL_NODE_KEY,
>         DFSConfigKeys.
>             DFS_NAMENODE_BLOCKPLACEMENTPOLICY_DEFAULT_PREFER_LOCAL_NODE_DEFAULT);
>   }
> {code}
> As a result, the constant {{DFS_NAMENODE_REPLICATION_CONSIDERLOAD_DEFAULT}} 
> is not used anywhere.
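
The proposed change amounts to passing the named default instead of a literal. The sketch below shows that pattern in a self-contained form; it is not the attached patch. {{ConsiderLoadDefaultSketch}}, its map-based {{getBoolean}} helper, and the key string are hypothetical stand-ins for {{Configuration}} and {{DFSConfigKeys}}.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for Configuration and DFSConfigKeys, for illustration only.
class ConsiderLoadDefaultSketch {
  // Assumed key name; treat it as a placeholder rather than the authoritative value.
  static final String CONSIDERLOAD_KEY = "dfs.namenode.replication.considerLoad";
  static final boolean CONSIDERLOAD_DEFAULT = true;

  // Minimal lookup: returns the default when the key is absent.
  static boolean getBoolean(Map<String, String> conf, String key, boolean defaultValue) {
    String value = conf.get(key);
    return value == null ? defaultValue : Boolean.parseBoolean(value);
  }

  // Uses the named constant as the fallback, so the default lives in one place.
  static boolean readConsiderLoad(Map<String, String> conf) {
    return getBoolean(conf, CONSIDERLOAD_KEY, CONSIDERLOAD_DEFAULT);
  }

  static Map<String, String> confWith(String key, String value) {
    Map<String, String> conf = new HashMap<>();
    conf.put(key, value);
    return conf;
  }
}
```

Behavior is unchanged as long as the constant equals the literal it replaces; the benefit is that a later change to the default cannot leave this call site behind.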





[jira] [Commented] (HDFS-10264) Logging improvements in FSImageFormatProtobuf.Saver

2016-04-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246970#comment-15246970
 ] 

Hadoop QA commented on HDFS-10264:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 30s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 11s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 19s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 31s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 51s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 8s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 11s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 1s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 5s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 53s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 102m 34s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 52s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 41s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 217m 56s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.server.datanode.TestDataNodeUUID |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | hadoop.hdfs.TestFileAppend |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | 

[jira] [Commented] (HDFS-10284) o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode fails intermittently

2016-04-18 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246908#comment-15246908
 ] 

Brahma Reddy Battula commented on HDFS-10284:
-

I have one minor nit:
 I think we can now move the comments before the test case, or change the 
test case name itself.

Like the following:
{code}   
 public void testCheckSafeMode3() {
// PENDING_THRESHOLD -> OFF
{code}  
can be
{code}   
 // PENDING_THRESHOLD -> OFF
 public void testCheckSafeMode3() {
{code}   

> o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode 
> fails intermittently
> -
>
> Key: HDFS-10284
> URL: https://issues.apache.org/jira/browse/HDFS-10284
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.9.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10284.000.patch, HDFS-10284.001.patch, 
> HDFS-10284.002.patch
>
>
> *Stacktrace*
> {code}
> org.mockito.exceptions.misusing.UnfinishedStubbingException: 
> Unfinished stubbing detected here:
> -> at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode(TestBlockManagerSafeMode.java:169)
> E.g. thenReturn() may be missing.
> Examples of correct stubbing:
> when(mock.isOk()).thenReturn(true);
> when(mock.isOk()).thenThrow(exception);
> doThrow(exception).when(mock).someVoidMethod();
> Hints:
>  1. missing thenReturn()
>  2. although stubbed methods may return mocks, you cannot inline mock 
> creation (mock()) call inside a thenReturn method (see issue 53)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode(TestBlockManagerSafeMode.java:169)
> {code}
> Sample failing pre-commit UT: 
> https://builds.apache.org/job/PreCommit-HDFS-Build/15153/testReport/org.apache.hadoop.hdfs.server.blockmanagement/TestBlockManagerSafeMode/testCheckSafeMode/





[jira] [Commented] (HDFS-10306) SafeModeMonitor should not leave safe mode if name system is starting active service

2016-04-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246905#comment-15246905
 ] 

Hudson commented on HDFS-10306:
---

FAILURE: Integrated in Hadoop-trunk-Commit #9629 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9629/])
HDFS-10306. SafeModeMonitor should not leave safe mode if name system is 
(jing9: rev be0bce1b7171c49e2dca22f56d4e750e606862fc)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManagerSafeMode.java


> SafeModeMonitor should not leave safe mode if name system is starting active 
> service
> 
>
> Key: HDFS-10306
> URL: https://issues.apache.org/jira/browse/HDFS-10306
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.9.0
>
> Attachments: HDFS-10306.000.patch
>
>
> This is a follow-up of [HDFS-10192].
> The {{BlockManagerSafeMode$SafeModeMonitor#canLeave()}} is not checking the 
> {{namesystem#inTransitionToActive()}}, while it should. According to the fix 
> of [HDFS-10192], we should add this check to prevent the {{smmthread}} from 
> calling {{leaveSafeMode()}} too early.





[jira] [Updated] (HDFS-10306) SafeModeMonitor should not leave safe mode if name system is starting active service

2016-04-18 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-10306:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.9.0
   Status: Resolved  (was: Patch Available)

I've committed this into trunk and branch-2. Thanks for the fix, [~liuml07].

> SafeModeMonitor should not leave safe mode if name system is starting active 
> service
> 
>
> Key: HDFS-10306
> URL: https://issues.apache.org/jira/browse/HDFS-10306
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.9.0
>
> Attachments: HDFS-10306.000.patch
>
>
> This is a follow-up of [HDFS-10192].
> The {{BlockManagerSafeMode$SafeModeMonitor#canLeave()}} is not checking the 
> {{namesystem#inTransitionToActive()}}, while it should. According to the fix 
> of [HDFS-10192], we should add this check to prevent the {{smmthread}} from 
> calling {{leaveSafeMode()}} too early.





[jira] [Commented] (HDFS-10306) SafeModeMonitor should not leave safe mode if name system is starting active service

2016-04-18 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246891#comment-15246891
 ] 

Brahma Reddy Battula commented on HDFS-10306:
-

LGTM. [~liuml07], thanks for taking care of this.

> SafeModeMonitor should not leave safe mode if name system is starting active 
> service
> 
>
> Key: HDFS-10306
> URL: https://issues.apache.org/jira/browse/HDFS-10306
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10306.000.patch
>
>
> This is a follow-up of [HDFS-10192].
> The {{BlockManagerSafeMode$SafeModeMonitor#canLeave()}} is not checking the 
> {{namesystem#inTransitionToActive()}}, while it should. According to the fix 
> of [HDFS-10192], we should add this check to prevent the {{smmthread}} from 
> calling {{leaveSafeMode()}} too early.





[jira] [Commented] (HDFS-10306) SafeModeMonitor should not leave safe mode if name system is starting active service

2016-04-18 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246886#comment-15246886
 ] 

Jing Zhao commented on HDFS-10306:
--

+1. I will commit the patch shortly.

> SafeModeMonitor should not leave safe mode if name system is starting active 
> service
> 
>
> Key: HDFS-10306
> URL: https://issues.apache.org/jira/browse/HDFS-10306
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10306.000.patch
>
>
> This is a follow-up of [HDFS-10192].
> The {{BlockManagerSafeMode$SafeModeMonitor#canLeave()}} is not checking the 
> {{namesystem#inTransitionToActive()}}, while it should. According to the fix 
> of [HDFS-10192], we should add this check to prevent the {{smmthread}} from 
> calling {{leaveSafeMode()}} too early.





[jira] [Commented] (HDFS-10306) SafeModeMonitor should not leave safe mode if name system is starting active service

2016-04-18 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246844#comment-15246844
 ] 

Mingliang Liu commented on HDFS-10306:
--

Test failures are not related. We didn't add a new test because this is a 
follow-up of [HDFS-10192], which added two unit tests, and [HDFS-10284] refined 
them.

> SafeModeMonitor should not leave safe mode if name system is starting active 
> service
> 
>
> Key: HDFS-10306
> URL: https://issues.apache.org/jira/browse/HDFS-10306
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10306.000.patch
>
>
> This is a follow-up of [HDFS-10192].
> The {{BlockManagerSafeMode$SafeModeMonitor#canLeave()}} is not checking the 
> {{namesystem#inTransitionToActive()}}, while it should. According to the fix 
> of [HDFS-10192], we should add this check to prevent the {{smmthread}} from 
> calling {{leaveSafeMode()}} too early.





[jira] [Updated] (HDFS-10284) o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode fails intermittently

2016-04-18 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10284:
-
Priority: Major  (was: Minor)

> o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode 
> fails intermittently
> -
>
> Key: HDFS-10284
> URL: https://issues.apache.org/jira/browse/HDFS-10284
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.9.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10284.000.patch, HDFS-10284.001.patch, 
> HDFS-10284.002.patch
>
>
> *Stacktrace*
> {code}
> org.mockito.exceptions.misusing.UnfinishedStubbingException: 
> Unfinished stubbing detected here:
> -> at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode(TestBlockManagerSafeMode.java:169)
> E.g. thenReturn() may be missing.
> Examples of correct stubbing:
> when(mock.isOk()).thenReturn(true);
> when(mock.isOk()).thenThrow(exception);
> doThrow(exception).when(mock).someVoidMethod();
> Hints:
>  1. missing thenReturn()
>  2. although stubbed methods may return mocks, you cannot inline mock 
> creation (mock()) call inside a thenReturn method (see issue 53)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode(TestBlockManagerSafeMode.java:169)
> {code}
> Sample failing pre-commit UT: 
> https://builds.apache.org/job/PreCommit-HDFS-Build/15153/testReport/org.apache.hadoop.hdfs.server.blockmanagement/TestBlockManagerSafeMode/testCheckSafeMode/





[jira] [Updated] (HDFS-10232) Ozone: Make config key naming consistent

2016-04-18 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-10232:

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

[~arpitagarwal] Thanks for the review. I have committed this to the feature branch.

> Ozone: Make config key naming consistent
> 
>
> Key: HDFS-10232
> URL: https://issues.apache.org/jira/browse/HDFS-10232
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Anu Engineer
>Assignee: Anu Engineer
>Priority: Trivial
> Attachments: HDFS-10232-HDFS-7240.001.patch, 
> HDFS-10232-HDFS-7240.002.patch, HDFS-10232-HDFS-7240.003.patch
>
>
> We seem to use StorageHandler, ozone, Objectstore, etc. as prefixes. We should 
> pick one, ideally ozone, and use it consistently as the prefix for the 
> ozone config management.





[jira] [Commented] (HDFS-8986) Add option to -du to calculate directory space usage excluding snapshots

2016-04-18 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246800#comment-15246800
 ] 

Xiao Chen commented on HDFS-8986:
-

The remaining test failures are not related (HDFS-10291 fixes 
TestShortCircuitLocalRead). I think the checkstyle warnings should be left as 
they are to keep the code more readable.

Appreciate any review comments. Thanks!

> Add option to -du to calculate directory space usage excluding snapshots
> 
>
> Key: HDFS-8986
> URL: https://issues.apache.org/jira/browse/HDFS-8986
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Reporter: Gautam Gopalakrishnan
>Assignee: Xiao Chen
> Attachments: HDFS-8986.01.patch, HDFS-8986.02.patch
>
>
> When running {{hadoop fs -du}} on a snapshotted directory (or one of its 
> children), the report includes space consumed by blocks that are only present 
> in the snapshots. This is confusing for end users.
> {noformat}
> $  hadoop fs -du -h -s /tmp/parent /tmp/parent/*
> 799.7 M  2.3 G  /tmp/parent
> 799.7 M  2.3 G  /tmp/parent/sub1
> $ hdfs dfs -createSnapshot /tmp/parent snap1
> Created snapshot /tmp/parent/.snapshot/snap1
> $ hadoop fs -rm -skipTrash /tmp/parent/sub1/*
> ...
> $ hadoop fs -du -h -s /tmp/parent /tmp/parent/*
> 799.7 M  2.3 G  /tmp/parent
> 799.7 M  2.3 G  /tmp/parent/sub1
> $ hdfs dfs -deleteSnapshot /tmp/parent snap1
> $ hadoop fs -du -h -s /tmp/parent /tmp/parent/*
> 0  0  /tmp/parent
> 0  0  /tmp/parent/sub1
> {noformat}
> It would be helpful if we had a flag, say -X, to exclude any snapshot-related 
> disk usage from the output.
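
A rough model of the accounting requested above, offered as a sketch only. {{DuSketch}}, {{Entry}} and {{usage}} are hypothetical names, the byte counts are made up for illustration, and the boolean parameter merely plays the role of the {{-X}} flag suggested in the issue; none of this is the HDFS implementation.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of "du" with and without snapshot-only data. */
final class DuSketch {
    /** One file's size, plus whether its blocks survive only inside a snapshot. */
    static final class Entry {
        final String path;
        final long bytes;
        final boolean snapshotOnly;

        Entry(String path, long bytes, boolean snapshotOnly) {
            this.path = path;
            this.bytes = bytes;
            this.snapshotOnly = snapshotOnly;
        }
    }

    /** Sums sizes; when excludeSnapshots is set, snapshot-only entries are skipped. */
    static long usage(List<Entry> entries, boolean excludeSnapshots) {
        long total = 0;
        for (Entry e : entries) {
            if (excludeSnapshots && e.snapshotOnly) {
                continue;
            }
            total += e.bytes;
        }
        return total;
    }

    public static void main(String[] args) {
        // Illustrative sizes only: one deleted file kept alive by a snapshot, one live file.
        List<Entry> entries = Arrays.asList(
            new Entry("/tmp/parent/sub1/deleted-file", 100L, true),
            new Entry("/tmp/parent/sub1/live-file", 40L, false));
        System.out.println("including snapshots: " + usage(entries, false));
        System.out.println("excluding snapshots: " + usage(entries, true));
    }
}
```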





[jira] [Commented] (HDFS-8986) Add option to -du to calculate directory space usage excluding snapshots

2016-04-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246786#comment-15246786
 ] 

Hadoop QA commented on HDFS-8986:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
32s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 42s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 37s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
6s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 20s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
39s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 0s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 16s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 13s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
54s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 52s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 5m 52s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 52s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 39s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 39s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 39s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 4s 
{color} | {color:red} root: patch generated 3 new + 166 unchanged - 11 fixed = 
169 total (was 177) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
39s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 
43s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 19s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 12s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 16m 51s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 50s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 58m 40s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | 

[jira] [Commented] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2016-04-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246785#comment-15246785
 ] 

Hadoop QA commented on HDFS-3702:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 22 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 4s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
51s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 49s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 47s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 24s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
42s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 
12s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 15s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 16s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
58s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 55s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 5m 55s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 55s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 50s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 50s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 50s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 11s 
{color} | {color:red} root: patch generated 10 new + 675 unchanged - 7 fixed = 
685 total (was 682) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 21s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 
54s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 18s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 16s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 6m 44s {color} 
| {color:red} hadoop-common in the patch failed with JDK v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 49s 
{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 57m 42s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 22s 
{color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} 

[jira] [Commented] (HDFS-9869) Erasure Coding: Rename replication-based names in BlockManager to more generic [part-2]

2016-04-18 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246771#comment-15246771
 ] 

Zhe Zhang commented on HDFS-9869:
-

Thanks Rakesh for the update. Patch LGTM overall. A few remaining minor issues:
# Chained deprecation below. Maybe we should combine them?
{code}
| dfs.replication.pending.timeout.sec | 
dfs.namenode.replication.pending.timeout-sec |
| dfs.namenode.replication.pending.timeout-sec | 
dfs.namenode.reconstruction.pending.timeout-sec |
{code}
# As a followup we should also deprecate other replication-related config keys.
# Renaming {{getExcessBlocksCount}} to {{getExtraBlocksCount}} doesn't seem 
necessary. {{FSNamesystem}} is using "excess" anyway. Sorry, my previous comment 
pointed to this one. I just noticed the original name uses "blocks" instead of 
"replicas". Maybe we can also keep the "excess" in {{excessRedundancyMap}}? 
This matches the getter better.

+1 after the above are addressed.
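
To illustrate the first point, here is a small sketch of what combining a chained deprecation could look like. It uses a plain map rather than Hadoop's own deprecation mechanism, and {{DeprecationChainSketch}}, {{resolve}} and {{flatten}} are hypothetical names; only the three config keys are taken from the table above.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper that collapses chained key deprecations. */
final class DeprecationChainSketch {
    /** Follows old -> new mappings until reaching a key that is not deprecated. */
    static String resolve(Map<String, String> deprecations, String key) {
        Set<String> seen = new HashSet<>();
        String current = key;
        while (deprecations.containsKey(current)) {
            if (!seen.add(current)) {
                throw new IllegalStateException("cycle in deprecation table at " + current);
            }
            current = deprecations.get(current);
        }
        return current;
    }

    /** Maps every deprecated key directly to its final replacement. */
    static Map<String, String> flatten(Map<String, String> deprecations) {
        Map<String, String> flat = new HashMap<>();
        for (String oldKey : deprecations.keySet()) {
            flat.put(oldKey, resolve(deprecations, oldKey));
        }
        return flat;
    }

    public static void main(String[] args) {
        Map<String, String> chained = new HashMap<>();
        chained.put("dfs.replication.pending.timeout.sec",
            "dfs.namenode.replication.pending.timeout-sec");
        chained.put("dfs.namenode.replication.pending.timeout-sec",
            "dfs.namenode.reconstruction.pending.timeout-sec");
        for (Map.Entry<String, String> e : flatten(chained).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

After flattening, both older keys point straight at {{dfs.namenode.reconstruction.pending.timeout-sec}}, so a lookup never has to walk through an intermediate deprecated name.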

> Erasure Coding: Rename replication-based names in BlockManager to more 
> generic [part-2]
> ---
>
> Key: HDFS-9869
> URL: https://issues.apache.org/jira/browse/HDFS-9869
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: erasure-coding
>Reporter: Rakesh R
>Assignee: Rakesh R
>  Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-9869-001.patch, HDFS-9869-002.patch, 
> HDFS-9869-003.patch, HDFS-9869-004.patch
>
>
> The idea of this jira is to rename the following entities in BlockManager:
> - {{PendingReplicationBlocks}} to {{PendingReconstructionBlocks}}
> - {{excessReplicateMap}} to {{extraRedundancyMap}}





[jira] [Commented] (HDFS-10305) Hdfs audit shouldn't log mkdir operation if the directory already exists.

2016-04-18 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246716#comment-15246716
 ] 

Rushabh S Shah commented on HDFS-10305:
---

I think it is not a failed operation.

> Hdfs audit shouldn't log mkdir operation if the directory already exists.
> 
>
> Key: HDFS-10305
> URL: https://issues.apache.org/jira/browse/HDFS-10305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
>Priority: Minor
>
> Currently Hdfs audit logs mkdir operation even if the directory already 
> exists.
> This creates confusion while analyzing audit logs.





[jira] [Commented] (HDFS-10305) Hdfs audit shouldn't log mkdir operation if the directory already exists.

2016-04-18 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246710#comment-15246710
 ] 

Mingliang Liu commented on HDFS-10305:
--

Agreed.

> Hdfs audit shouldn't log mkdir operation if the directory already exists.
> 
>
> Key: HDFS-10305
> URL: https://issues.apache.org/jira/browse/HDFS-10305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
>Priority: Minor
>
> Currently Hdfs audit logs mkdir operation even if the directory already 
> exists.
> This creates confusion while analyzing audit logs.





[jira] [Commented] (HDFS-10305) Hdfs audit shouldn't log mkdir operation if the directory already exists.

2016-04-18 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246701#comment-15246701
 ] 

Ravi Prakash commented on HDFS-10305:
-

I believe the audit log was supposed to capture failed operations as well. I'd 
be inclined to close this JIRA as WON'T FIX.

> Hdfs audit shouldn't log mkdir operation if the directory already exists.
> 
>
> Key: HDFS-10305
> URL: https://issues.apache.org/jira/browse/HDFS-10305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
>Priority: Minor
>
> Currently Hdfs audit logs mkdir operation even if the directory already 
> exists.
> This creates confusion while analyzing audit logs.





[jira] [Commented] (HDFS-9530) huge Non-DFS Used in hadoop 2.6.2 & 2.7.1

2016-04-18 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246697#comment-15246697
 ] 

Ravi Prakash commented on HDFS-9530:


To answer one of my own questions: "Could you please point me to the code where 
you see this happening?"
In 2, Brahma is likely referring to 
[BlockReceiver:283|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java#L283]
 -> 
[ReplicaInPipeline:163|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ReplicaInPipeline.java#L163]
 -> 
[FsVolumeImpl:480|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java#L480]
In 3, Brahma is likely referring to 
[BlockReceiver:956|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java#L956]
 -> 
[ReplicaInPipeline:163|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ReplicaInPipeline.java#L163]
 -> 
[FsVolumeImpl:480|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java#L480]
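
As a reading aid for the dfsadmin report quoted below: the per-datanode figures in the first report are consistent with Non DFS Used being Configured Capacity minus DFS Used minus DFS Remaining. This is an inference from the reported numbers, not a statement about the datanode code, and {{NonDfsUsedSketch}} is a hypothetical name. The sketch checks the arithmetic for worker-2.

```java
/** Hypothetical arithmetic check against the quoted dfsadmin report. */
final class NonDfsUsedSketch {
    /** Assumed relation: whatever is neither DFS-used nor DFS-remaining counts as non-DFS. */
    static long nonDfsUsed(long configuredCapacity, long dfsUsed, long dfsRemaining) {
        return configuredCapacity - dfsUsed - dfsRemaining;
    }

    public static void main(String[] args) {
        // worker-2, first report: capacity, DFS Used, DFS Remaining (bytes).
        long result = nonDfsUsed(80256417792L, 7190958080L, 72343986176L);
        // Prints 721473536, which matches the reported "Non DFS Used: 721473536 (688.05 MB)".
        System.out.println(result);
    }
}
```

Under the same relation, any space that leaves DFS Remaining without showing up in DFS Used would be reported as non-DFS usage.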

> huge Non-DFS Used in hadoop 2.6.2 & 2.7.1
> -
>
> Key: HDFS-9530
> URL: https://issues.apache.org/jira/browse/HDFS-9530
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Fei Hui
> Attachments: HDFS-9530-01.patch
>
>
> i think there are bugs in HDFS
> ===
> here is config
> <property>
>   <name>dfs.datanode.data.dir</name>
>   <value>file:///mnt/disk4,file:///mnt/disk1,file:///mnt/disk3,file:///mnt/disk2</value>
> </property>
> here is dfsadmin report 
> [hadoop@worker-1 ~]$ hadoop dfsadmin -report
> DEPRECATED: Use of this script to execute hdfs command is deprecated.
> Instead use the hdfs command for it.
> Configured Capacity: 240769253376 (224.23 GB)
> Present Capacity: 238604832768 (222.22 GB)
> DFS Remaining: 215772954624 (200.95 GB)
> DFS Used: 22831878144 (21.26 GB)
> DFS Used%: 9.57%
> Under replicated blocks: 4
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> -
> Live datanodes (3):
> Name: 10.117.60.59:50010 (worker-2)
> Hostname: worker-2
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 7190958080 (6.70 GB)
> Non DFS Used: 721473536 (688.05 MB)
> DFS Remaining: 72343986176 (67.38 GB)
> DFS Used%: 8.96%
> DFS Remaining%: 90.14%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 1
> Last contact: Wed Dec 09 15:55:02 CST 2015
> Name: 10.168.156.0:50010 (worker-3)
> Hostname: worker-3
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 7219073024 (6.72 GB)
> Non DFS Used: 721473536 (688.05 MB)
> DFS Remaining: 72315871232 (67.35 GB)
> DFS Used%: 9.00%
> DFS Remaining%: 90.11%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 1
> Last contact: Wed Dec 09 15:55:03 CST 2015
> Name: 10.117.15.38:50010 (worker-1)
> Hostname: worker-1
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 8421847040 (7.84 GB)
> Non DFS Used: 721473536 (688.05 MB)
> DFS Remaining: 71113097216 (66.23 GB)
> DFS Used%: 10.49%
> DFS Remaining%: 88.61%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 1
> Last contact: Wed Dec 09 15:55:03 CST 2015
> 
> When running a Hive job, the dfsadmin report is as follows:
> [hadoop@worker-1 ~]$ hadoop dfsadmin -report
> DEPRECATED: Use of this script to execute hdfs command is deprecated.
> Instead use the hdfs command for it.
> Configured Capacity: 240769253376 (224.23 GB)
> Present Capacity: 108266011136 (100.83 GB)
> DFS Remaining: 80078416384 (74.58 GB)
> DFS Used: 28187594752 (26.25 GB)
> DFS Used%: 26.04%
> Under replicated blocks: 7
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> -
> Live datanodes (3):
> Name: 10.117.60.59:50010 (worker-2)
> Hostname: worker-2
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 9015627776 (8.40 GB)
> Non DFS Used: 44303742464 (41.26 GB)
> DFS Remaining: 

[jira] [Commented] (HDFS-9744) TestDirectoryScanner#testThrottling occasionally time out after 300 seconds

2016-04-18 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246648#comment-15246648
 ] 

Daniel Templeton commented on HDFS-9744:


LGTM

> TestDirectoryScanner#testThrottling occasionally time out after 300 seconds
> ---
>
> Key: HDFS-9744
> URL: https://issues.apache.org/jira/browse/HDFS-9744
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
> Environment: Jenkins
>Reporter: Wei-Chiu Chuang
>Assignee: Lin Yiqun
>Priority: Minor
>  Labels: test
> Attachments: HDFS-9744.001.patch
>
>
> I have seen quite a few test failures in TestDirectoryScanner#testThrottling.
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/2793/testReport/org.apache.hadoop.hdfs.server.datanode/TestDirectoryScanner/testThrottling/
> Looking at the log, it does not look like the test got stuck. On my local 
> machine, this test took 219 seconds. It is likely that this test takes more 
> than 300 seconds to complete on a busy jenkins slave. I think it is 
> reasonable to set a longer time out value, or reduce the number of blocks to 
> reduce the duration of the test.
> Error Message
> {noformat}
> test timed out after 300000 milliseconds
> {noformat}
> Stacktrace
> {noformat}
> java.lang.Exception: test timed out after 300000 milliseconds
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:503)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.waitAndQueuePacket(DataStreamer.java:804)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.enqueueCurrentPacket(DFSOutputStream.java:423)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.enqueueCurrentPacketFull(DFSOutputStream.java:432)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream.writeChunk(DFSOutputStream.java:418)
>   at 
> org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(FSOutputSummer.java:217)
>   at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:125)
>   at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:111)
>   at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:57)
>   at java.io.DataOutputStream.write(DataOutputStream.java:107)
>   at org.apache.hadoop.hdfs.DFSTestUtil.createFile(DFSTestUtil.java:418)
>   at org.apache.hadoop.hdfs.DFSTestUtil.createFile(DFSTestUtil.java:376)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDirectoryScanner.createFile(TestDirectoryScanner.java:108)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDirectoryScanner.testThrottling(TestDirectoryScanner.java:584)
> {noformat}





[jira] [Commented] (HDFS-10306) SafeModeMonitor should not leave safe mode if name system is starting active service

2016-04-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246647#comment-15246647
 ] 

Hadoop QA commented on HDFS-10306:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
46s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
20s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
57s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 49s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
48s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 8s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 55m 58s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 53m 44s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
23s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 134m 59s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | hadoop.hdfs.server.namenode.TestEditLog |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead |
| JDK v1.7.0_95 Failed junit tests | 
hadoop.hdfs.server.namenode.TestDecommissioningStatus |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead |
|   | hadoop.hdfs.TestHFlush |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12799326/HDFS-10306.000.patch |
| JIRA Issue | HDFS-10306 |
| Optional Tests |  

[jira] [Commented] (HDFS-10304) implement moveToLocal or remove it from the usage list

2016-04-18 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246620#comment-15246620
 ] 

Xiaobing Zhou commented on HDFS-10304:
--

[~steve_l] thanks for checking this. How about implementing it as a combination 
of get and rm?
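
A minimal sketch of the get-then-rm idea, using local files as a stand-in for 
HDFS. The names MoveToLocalSketch, moveToLocal and demo are hypothetical and 
this is not the Hadoop shell code; it only illustrates the ordering: copy 
first, and remove the source only once the copy has succeeded.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in: local files play the role of the HDFS source and the
// local destination. This is an illustration, not the Hadoop implementation.
class MoveToLocalSketch {

    // "get" followed by "rm": the source is deleted only after a successful copy.
    static void moveToLocal(Path src, Path dst) throws IOException {
        Files.copy(src, dst);  // "get"; throws if dst already exists
        Files.delete(src);     // "rm"; not reached when the copy fails
    }

    // Runs the sketch on temp files; true if the source is gone and the copy exists.
    static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("moveToLocalSketch");
            Path src = Files.write(dir.resolve("src.txt"),
                    "data".getBytes(StandardCharsets.UTF_8));
            Path dst = dir.resolve("dst.txt");
            moveToLocal(src, dst);
            return !Files.exists(src) && Files.exists(dst);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("moved: " + demo());
    }
}
```

If the copy throws, the delete is never reached, so a failed get cannot lose 
the source file.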

> implement moveToLocal or remove it from the usage list
> --
>
> Key: HDFS-10304
> URL: https://issues.apache.org/jira/browse/HDFS-10304
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: scripts
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Xiaobing Zhou
>Priority: Minor
>
> If you get the usage list of {{hdfs dfs}}, it tells you of "-moveToLocal". 
> If you try to use the command, it tells you off: "Option '-moveToLocal' is not 
> implemented yet."
> Either the command should be implemented, or it should be removed from the 
> usage list, as it is not technically a command you can use, except in the 
> special case of "I want my shell to print 'not implemented yet'".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10232) Ozone: Make config key naming consistent

2016-04-18 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246618#comment-15246618
 ] 

Anu Engineer commented on HDFS-10232:
-

Test failures are not related to this patch.

> Ozone: Make config key naming consistent
> 
>
> Key: HDFS-10232
> URL: https://issues.apache.org/jira/browse/HDFS-10232
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Anu Engineer
>Assignee: Anu Engineer
>Priority: Trivial
> Attachments: HDFS-10232-HDFS-7240.001.patch, 
> HDFS-10232-HDFS-7240.002.patch, HDFS-10232-HDFS-7240.003.patch
>
>
> We seem to use StorageHandler, ozone, Objectstore etc. as prefixes. We should 
> pick one (ideally ozone) and use it consistently as the prefix for the 
> ozone config management.





[jira] [Commented] (HDFS-10232) Ozone: Make config key naming consistent

2016-04-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246613#comment-15246613
 ] 

Hadoop QA commented on HDFS-10232:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 11 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 46s {color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 15s {color} | {color:green} HDFS-7240 passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 3s {color} | {color:green} HDFS-7240 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s {color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 17s {color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 22s {color} | {color:green} HDFS-7240 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 45s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in HDFS-7240 has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 13s {color} | {color:green} HDFS-7240 passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 5s {color} | {color:green} HDFS-7240 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 10s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 1s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 6s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 11s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 48s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 77m 47s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 77m 3s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 41s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 197m 2s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.namenode.TestEditLog |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.server.datanode.TestDataNodeUUID |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
|  

[jira] [Commented] (HDFS-3743) QJM: improve formatting behavior for JNs

2016-04-18 Thread Jian Fang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246601#comment-15246601
 ] 

Jian Fang commented on HDFS-3743:
-

More likely the above issue was caused by a race condition in restarting 
name nodes and journal nodes rather than by my code changes. I will create a 
separate JIRA to add the "-newEditsOnly" option to initializeSharedEdits and 
link it here later.

> QJM: improve formatting behavior for JNs
> 
>
> Key: HDFS-3743
> URL: https://issues.apache.org/jira/browse/HDFS-3743
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: QuorumJournalManager (HDFS-3077)
>Reporter: Todd Lipcon
>
> Currently, the JournalNodes automatically format themselves when a new writer 
> takes over, if they don't have any data for that namespace. However, this has 
> a few problems:
> 1) if the administrator accidentally points a new NN at the wrong quorum (eg 
> corresponding to another cluster), it will auto-format a directory on those 
> nodes. This doesn't cause any data loss, but would be better to bail out with 
> an error indicating that they need to be formatted.
> 2) if a journal node crashes and needs to be reformatted, it should be able 
> to re-join the cluster and start storing new segments without having to fail 
> over to a new NN.
> 3) if 2/3 JNs get accidentally reformatted (eg the mount point becomes 
> undone), and the user starts the NN, it should fail to start, because it may 
> end up missing edits. If it auto-formats in this case, the user might have 
> silent "rollback" of the most recent edits.





[jira] [Commented] (HDFS-10264) Logging improvements in FSImageFormatProtobuf.Saver

2016-04-18 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246596#comment-15246596
 ] 

Xiaobing Zhou commented on HDFS-10264:
--

[~boky01] and [~shv] thanks for the comments. v001 uses SLF4J logging.

> Logging improvements in FSImageFormatProtobuf.Saver
> ---
>
> Key: HDFS-10264
> URL: https://issues.apache.org/jira/browse/HDFS-10264
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Konstantin Shvachko
>Assignee: Xiaobing Zhou
>  Labels: newbie
> Attachments: HDFS-10264.000.patch, HDFS-10264.001.patch
>
>
> There are two LOG messages in {{FSImageFormat.Saver}} that are 
> missing in {{FSImageFormatProtobuf.Saver}}, which mark start and end of 
> fsimage saving. Would be good to have them logged for protobuf images as well.





[jira] [Commented] (HDFS-10264) Logging improvements in FSImageFormatProtobuf.Saver

2016-04-18 Thread Andras Bokor (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246587#comment-15246587
 ] 

Andras Bokor commented on HDFS-10264:
-

[~xiaobingo] Please check the comments above. SLF4J is preferred.

> Logging improvements in FSImageFormatProtobuf.Saver
> ---
>
> Key: HDFS-10264
> URL: https://issues.apache.org/jira/browse/HDFS-10264
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Konstantin Shvachko
>Assignee: Xiaobing Zhou
>  Labels: newbie
> Attachments: HDFS-10264.000.patch, HDFS-10264.001.patch
>
>
> There are two LOG messages in {{FSImageFormat.Saver}} that are 
> missing in {{FSImageFormatProtobuf.Saver}}, which mark start and end of 
> fsimage saving. Would be good to have them logged for protobuf images as well.





[jira] [Updated] (HDFS-10264) Logging improvements in FSImageFormatProtobuf.Saver

2016-04-18 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10264:
-
Attachment: HDFS-10264.001.patch

> Logging improvements in FSImageFormatProtobuf.Saver
> ---
>
> Key: HDFS-10264
> URL: https://issues.apache.org/jira/browse/HDFS-10264
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Konstantin Shvachko
>Assignee: Xiaobing Zhou
>  Labels: newbie
> Attachments: HDFS-10264.000.patch, HDFS-10264.001.patch
>
>
> There are two LOG messages in {{FSImageFormat.Saver}} that are 
> missing in {{FSImageFormatProtobuf.Saver}}, which mark start and end of 
> fsimage saving. Would be good to have them logged for protobuf images as well.





[jira] [Commented] (HDFS-10301) Blocks removed by thousands due to falsely detected zombie storages

2016-04-18 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246571#comment-15246571
 ] 

Konstantin Shvachko commented on HDFS-10301:


Hey Daryn, not sure how HDFS-9198 prevents it from occurring. DataNodes are 
still waiting for NN to process each BR, so they can time out and send the same 
block report multiple times. On the NN side, BR ops processing is 
multi-threaded, so it can still interleave processing storages from different 
reports. Could you please clarify what I am missing?

> Blocks removed by thousands due to falsely detected zombie storages
> ---
>
> Key: HDFS-10301
> URL: https://issues.apache.org/jira/browse/HDFS-10301
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.1
>Reporter: Konstantin Shvachko
>Priority: Critical
> Attachments: zombieStorageLogs.rtf
>
>
> When NameNode is busy, a DataNode can time out sending a block report. Then it 
> sends the block report again. The NameNode, while processing these two reports 
> at the same time, can interleave processing storages from different reports. 
> This screws up the blockReportId field, which makes NameNode think that some 
> storages are zombies. Replicas from zombie storages are immediately removed, 
> causing missing blocks.
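
The interleaving described above can be reproduced in a small, simplified 
model. This is only a sketch under stated assumptions, not the NameNode code: 
the class ZombieStorageSketch, its methods and the storage names s1/s2 are 
hypothetical. It assumes each storage records the ID of the last block report 
that touched it, and that once a report finishes, any storage stamped with a 
different ID is treated as a zombie.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical model of zombie detection via a per-storage report ID.
class ZombieStorageSketch {
    // storage ID -> ID of the last block report that touched the storage
    private final Map<String, Long> lastReportId = new LinkedHashMap<>();

    // Process the part of block report `reportId` that covers one storage.
    void processStorage(String storageId, long reportId) {
        lastReportId.put(storageId, reportId);
    }

    // Called once report `reportId` is finished: every storage that is not
    // stamped with that ID is considered a zombie and is dropped.
    List<String> removeZombies(long reportId) {
        List<String> zombies = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastReportId.entrySet()) {
            if (e.getValue() != reportId) {
                zombies.add(e.getKey());
            }
        }
        lastReportId.keySet().removeAll(zombies);
        return zombies;
    }

    // Report 1 timed out and was resent as report 2; the two are interleaved.
    static List<String> interleaved() {
        ZombieStorageSketch nn = new ZombieStorageSketch();
        nn.processStorage("s1", 1L);  // report 1, storage s1
        nn.processStorage("s1", 2L);  // report 2 (the retry) overwrites the stamp
        nn.processStorage("s2", 1L);  // report 1, storage s2: report 1 is done
        return nn.removeZombies(1L);  // s1 carries ID 2, so it looks like a zombie
    }

    // The same two reports processed one after the other.
    static List<String> sequential() {
        ZombieStorageSketch nn = new ZombieStorageSketch();
        List<String> removed = new ArrayList<>();
        nn.processStorage("s1", 1L);
        nn.processStorage("s2", 1L);
        removed.addAll(nn.removeZombies(1L));
        nn.processStorage("s1", 2L);
        nn.processStorage("s2", 2L);
        removed.addAll(nn.removeZombies(2L));
        return removed;
    }

    public static void main(String[] args) {
        System.out.println("interleaved: " + interleaved());
        System.out.println("sequential:  " + sequential());
    }
}
```

In this model the healthy storage s1 is dropped only in the interleaved case, 
which matches the falsely detected zombies described in the report.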



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10264) Logging improvements in FSImageFormatProtobuf.Saver

2016-04-18 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246550#comment-15246550
 ] 

Xiaobing Zhou commented on HDFS-10264:
--

I posted patch v000; please review. Thanks.

> Logging improvements in FSImageFormatProtobuf.Saver
> ---
>
> Key: HDFS-10264
> URL: https://issues.apache.org/jira/browse/HDFS-10264
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Konstantin Shvachko
>Assignee: Xiaobing Zhou
>  Labels: newbie
> Attachments: HDFS-10264.000.patch
>
>
> There are two LOG messages in {{FSImageFormat.Saver}} that are 
> missing in {{FSImageFormatProtobuf.Saver}}, which mark start and end of 
> fsimage saving. Would be good to have them logged for protobuf images as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10264) Logging improvements in FSImageFormatProtobuf.Saver

2016-04-18 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10264:
-
Attachment: HDFS-10264.000.patch

> Logging improvements in FSImageFormatProtobuf.Saver
> ---
>
> Key: HDFS-10264
> URL: https://issues.apache.org/jira/browse/HDFS-10264
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Konstantin Shvachko
>Assignee: Xiaobing Zhou
>  Labels: newbie
> Attachments: HDFS-10264.000.patch
>
>
> There are two LOG messages in {{FSImageFormat.Saver}} that are 
> missing in {{FSImageFormatProtobuf.Saver}}, which mark start and end of 
> fsimage saving. Would be good to have them logged for protobuf images as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10264) Logging improvements in FSImageFormatProtobuf.Saver

2016-04-18 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HDFS-10264:
-
Status: Patch Available  (was: Open)

> Logging improvements in FSImageFormatProtobuf.Saver
> ---
>
> Key: HDFS-10264
> URL: https://issues.apache.org/jira/browse/HDFS-10264
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Konstantin Shvachko
>Assignee: Xiaobing Zhou
>  Labels: newbie
> Attachments: HDFS-10264.000.patch
>
>
> There are two LOG messages in {{FSImageFormat.Saver}} that are 
> missing in {{FSImageFormatProtobuf.Saver}}, which mark start and end of 
> fsimage saving. Would be good to have them logged for protobuf images as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10265) OEV tool fails to read edit xml file if OP_UPDATE_BLOCKS has no BLOCK tag

2016-04-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246526#comment-15246526
 ] 

Hudson commented on HDFS-10265:
---

FAILURE: Integrated in Hadoop-trunk-Commit #9628 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9628/])
HDFS-10265. OEV tool fails to read edit xml file if OP_UPDATE_BLOCKS has 
(cmccabe: rev cb3ca460efb97be8c031bdb14bb7705cc25f2117)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOp.java


> OEV tool fails to read edit xml file if OP_UPDATE_BLOCKS has no BLOCK tag
> -
>
> Key: HDFS-10265
> URL: https://issues.apache.org/jira/browse/HDFS-10265
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.4.1, 2.7.1
>Reporter: Wan Chang
>Assignee: Wan Chang
>Priority: Minor
>  Labels: patch
> Fix For: 2.8.0
>
> Attachments: HDFS-10265-001.patch, HDFS-10265-002.patch
>
>
> I use the OEV tool to convert an editlog to an XML file, then convert the XML 
> file back to a binary editlog file (so that a lower-version NameNode can load 
> edits generated by a higher-version NameNode). But when OP_UPDATE_BLOCKS has no 
> BLOCK tag, the OEV tool doesn't handle the case and exits with InvalidXmlException.
> Here is the stack:
> {code}
> fromXml error decoding opcode null
> {{"/tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5"},
>  {"-2"}, {},
> {"3875711"}}
> Encountered exception. Exiting: no entry found for BLOCK
> org.apache.hadoop.hdfs.util.XMLUtils$InvalidXmlException: no entry found for 
> BLOCK
> at 
> org.apache.hadoop.hdfs.util.XMLUtils$Stanza.getChildren(XMLUtils.java:242)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$UpdateBlocksOp.fromXml(FSEditLogOp.java:908)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogOp.decodeXml(FSEditLogOp.java:3942)
> ...
> {code}
> Here is part of the xml file:
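> {code}
> <RECORD>
>   <OPCODE>OP_UPDATE_BLOCKS</OPCODE>
>   <DATA>
>     <TXID>3875711</TXID>
>     <PATH>/tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5</PATH>
>     <RPC_CLIENTID></RPC_CLIENTID>
>     <RPC_CALLID>-2</RPC_CALLID>
>   </DATA>
> </RECORD>
> {code}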
> I traced the NN's log and found these operations:
> 0. The file 
> /tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5 is 
> very small and contains only one block.
> 1. The client asks the NN to add a block to the file.
> 2. The client fails to write to the DN and asks the NameNode to abandon the block.
> 3. The NN removes the block and writes an OP_UPDATE_BLOCKS to the editlog.
> As a result, the NN generates an OP_UPDATE_BLOCKS with no BLOCK tags.
> In FSEditLogOp$UpdateBlocksOp.fromXml, we need to handle the case above.
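
One way to handle the case is to check for the BLOCK tag before reading it and 
to fall back to an empty block list. The sketch below is self-contained and 
hedged: the Stanza class here is a hypothetical, minimal stand-in written for 
this example (it is not Hadoop's XMLUtils.Stanza), and the PATH tag name is an 
assumption. It is not the committed patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class UpdateBlocksXmlSketch {

    // Minimal hypothetical stand-in for an XML stanza: child tag name -> values.
    static class Stanza {
        private final Map<String, List<String>> children = new HashMap<>();

        void addChild(String name, String value) {
            children.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }

        boolean hasChildren(String name) {
            return children.containsKey(name);
        }

        // Mirrors the reported behavior: asking for a missing tag is an error.
        List<String> getChildren(String name) {
            List<String> found = children.get(name);
            if (found == null) {
                throw new IllegalArgumentException("no entry found for " + name);
            }
            return found;
        }
    }

    // Tolerant decoding: a record without BLOCK tags yields an empty block list
    // instead of an exception.
    static List<String> decodeBlocks(Stanza st) {
        return st.hasChildren("BLOCK")
                ? st.getChildren("BLOCK")
                : Collections.<String>emptyList();
    }

    // A record like the one in the report: a path but no BLOCK children.
    static Stanza recordWithoutBlocks() {
        Stanza st = new Stanza();
        // "PATH" is an assumed tag name, used only for illustration.
        st.addChild("PATH",
                "/tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5");
        return st;
    }

    public static void main(String[] args) {
        System.out.println("blocks decoded: " + decodeBlocks(recordWithoutBlocks()).size());
    }
}
```

Calling getChildren("BLOCK") directly on such a record would throw, which is 
the failure shown in the stack trace; the guard turns it into a record with 
zero blocks.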



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10265) OEV tool fails to read edit xml file if OP_UPDATE_BLOCKS has no BLOCK tag

2016-04-18 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-10265:

  Resolution: Fixed
   Fix Version/s: 2.8.0
Target Version/s: 2.8.0
  Status: Resolved  (was: Patch Available)

> OEV tool fails to read edit xml file if OP_UPDATE_BLOCKS has no BLOCK tag
> -
>
> Key: HDFS-10265
> URL: https://issues.apache.org/jira/browse/HDFS-10265
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.4.1, 2.7.1
>Reporter: Wan Chang
>Assignee: Wan Chang
>Priority: Minor
>  Labels: patch
> Fix For: 2.8.0
>
> Attachments: HDFS-10265-001.patch, HDFS-10265-002.patch
>
>
> I use the OEV tool to convert an editlog to an XML file, then convert the XML 
> file back to a binary editlog file (so that a lower-version NameNode can load 
> edits generated by a higher-version NameNode). But when OP_UPDATE_BLOCKS has no 
> BLOCK tag, the OEV tool doesn't handle the case and exits with InvalidXmlException.
> Here is the stack:
> {code}
> fromXml error decoding opcode null
> {{"/tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5"},
>  {"-2"}, {},
> {"3875711"}}
> Encountered exception. Exiting: no entry found for BLOCK
> org.apache.hadoop.hdfs.util.XMLUtils$InvalidXmlException: no entry found for 
> BLOCK
> at 
> org.apache.hadoop.hdfs.util.XMLUtils$Stanza.getChildren(XMLUtils.java:242)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$UpdateBlocksOp.fromXml(FSEditLogOp.java:908)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogOp.decodeXml(FSEditLogOp.java:3942)
> ...
> {code}
> Here is part of the xml file:
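> {code}
> <RECORD>
>   <OPCODE>OP_UPDATE_BLOCKS</OPCODE>
>   <DATA>
>     <TXID>3875711</TXID>
>     <PATH>/tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5</PATH>
>     <RPC_CLIENTID></RPC_CLIENTID>
>     <RPC_CALLID>-2</RPC_CALLID>
>   </DATA>
> </RECORD>
> {code}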
> I traced the NN's log and found these operations:
> 0. The file 
> /tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5 is 
> very small and contains only one block.
> 1. The client asks the NN to add a block to the file.
> 2. The client fails to write to the DN and asks the NameNode to abandon the block.
> 3. The NN removes the block and writes an OP_UPDATE_BLOCKS to the editlog.
> As a result, the NN generates an OP_UPDATE_BLOCKS with no BLOCK tags.
> In FSEditLogOp$UpdateBlocksOp.fromXml, we need to handle the case above.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10265) OEV tool fails to read edit xml file if OP_UPDATE_BLOCKS has no BLOCK tag

2016-04-18 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246503#comment-15246503
 ] 

Colin Patrick McCabe commented on HDFS-10265:
-

+1.  Thanks, [~wanchang].

> OEV tool fails to read edit xml file if OP_UPDATE_BLOCKS has no BLOCK tag
> -
>
> Key: HDFS-10265
> URL: https://issues.apache.org/jira/browse/HDFS-10265
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.4.1, 2.7.1
>Reporter: Wan Chang
>Assignee: Wan Chang
>Priority: Minor
>  Labels: patch
> Attachments: HDFS-10265-001.patch, HDFS-10265-002.patch
>
>
> I use the OEV tool to convert an editlog to an XML file, then convert the XML 
> file back to a binary editlog file (so that a lower-version NameNode can load 
> edits generated by a higher-version NameNode). But when OP_UPDATE_BLOCKS has no 
> BLOCK tag, the OEV tool doesn't handle the case and exits with InvalidXmlException.
> Here is the stack:
> {code}
> fromXml error decoding opcode null
> {{"/tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5"},
>  {"-2"}, {},
> {"3875711"}}
> Encountered exception. Exiting: no entry found for BLOCK
> org.apache.hadoop.hdfs.util.XMLUtils$InvalidXmlException: no entry found for 
> BLOCK
> at 
> org.apache.hadoop.hdfs.util.XMLUtils$Stanza.getChildren(XMLUtils.java:242)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$UpdateBlocksOp.fromXml(FSEditLogOp.java:908)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogOp.decodeXml(FSEditLogOp.java:3942)
> ...
> {code}
> Here is part of the xml file:
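> {code}
> <RECORD>
>   <OPCODE>OP_UPDATE_BLOCKS</OPCODE>
>   <DATA>
>     <TXID>3875711</TXID>
>     <PATH>/tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5</PATH>
>     <RPC_CLIENTID></RPC_CLIENTID>
>     <RPC_CALLID>-2</RPC_CALLID>
>   </DATA>
> </RECORD>
> {code}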
> I traced the NN's log and found these operations:
> 0. The file 
> /tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5 is 
> very small and contains only one block.
> 1. The client asks the NN to add a block to the file.
> 2. The client fails to write to the DN and asks the NameNode to abandon the block.
> 3. The NN removes the block and writes an OP_UPDATE_BLOCKS to the editlog.
> As a result, the NN generates an OP_UPDATE_BLOCKS with no BLOCK tags.
> In FSEditLogOp$UpdateBlocksOp.fromXml, we need to handle the case above.





[jira] [Commented] (HDFS-10264) Logging improvements in FSImageFormatProtobuf.Saver

2016-04-18 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246500#comment-15246500
 ] 

Konstantin Shvachko commented on HDFS-10264:


Makes sense. Good suggestions, Andras.

> Logging improvements in FSImageFormatProtobuf.Saver
> ---
>
> Key: HDFS-10264
> URL: https://issues.apache.org/jira/browse/HDFS-10264
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Konstantin Shvachko
>Assignee: Xiaobing Zhou
>  Labels: newbie
>
> There are two LOG messages in {{FSImageFormat.Saver}} that are 
> missing in {{FSImageFormatProtobuf.Saver}}, which mark start and end of 
> fsimage saving. Would be good to have them logged for protobuf images as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-9958) BlockManager#createLocatedBlocks can throw NPE for corruptBlocks on failed storages.

2016-04-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246466#comment-15246466
 ] 

Hadoop QA commented on HDFS-9958:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 38s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 54s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 6s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 59m 0s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 53m 20s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 137m 12s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead |
|   | hadoop.hdfs.TestFileCorruption |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl |
| JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead |
|   | hadoop.hdfs.TestFileCorruption |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12799304/HDFS-9958.002.patch |
| JIRA Issue | HDFS-9958 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux 6e7a2dcae2f5 

[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2016-04-18 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-3702:

Attachment: HDFS-3702.012.patch

bq. BTW, we should check if excludedNodes.contains(writer) is already true; 
otherwise, fallback does not help.

Fixed in the newly updated patch. 

[~szetszwo] What do you think about [~cmccabe] and [~stack]'s suggestions? 
Would that work for you?

Thanks!




> Add an option for NOT writing the blocks locally if there is a datanode on 
> the same box as the client
> -
>
> Key: HDFS-3702
> URL: https://issues.apache.org/jira/browse/HDFS-3702
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.5.1
>Reporter: Nicolas Liochon
>Assignee: Lei (Eddy) Xu
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, 
> HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, 
> HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, 
> HDFS-3702.008.patch, HDFS-3702.009.patch, HDFS-3702.010.patch, 
> HDFS-3702.011.patch, HDFS-3702.012.patch, HDFS-3702_Design.pdf
>
>
> This is useful for Write-Ahead-Logs: these files are written for recovery 
> only, and are not read when there are no failures.
> Taking HBase as an example, these files will be read only if the process that 
> wrote them (the 'HBase regionserver') dies. This will likely come from a 
> hardware failure, hence the corresponding datanode will be dead as well. So 
> we're writing 3 replicas, but in reality only 2 of them are really useful.





[jira] [Updated] (HDFS-8986) Add option to -du to calculate directory space usage excluding snapshots

2016-04-18 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-8986:

Attachment: HDFS-8986.02.patch

Patch 2 fixes all reported errors. The 2 added tests in {{DFSShell}} pass 
locally; I'm not sure why they failed on Jenkins.

> Add option to -du to calculate directory space usage excluding snapshots
> 
>
> Key: HDFS-8986
> URL: https://issues.apache.org/jira/browse/HDFS-8986
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Reporter: Gautam Gopalakrishnan
>Assignee: Xiao Chen
> Attachments: HDFS-8986.01.patch, HDFS-8986.02.patch
>
>
> When running {{hadoop fs -du}} on a snapshotted directory (or one of its 
> children), the report includes space consumed by blocks that are only present 
> in the snapshots. This is confusing for end users.
> {noformat}
> $  hadoop fs -du -h -s /tmp/parent /tmp/parent/*
> 799.7 M  2.3 G  /tmp/parent
> 799.7 M  2.3 G  /tmp/parent/sub1
> $ hdfs dfs -createSnapshot /tmp/parent snap1
> Created snapshot /tmp/parent/.snapshot/snap1
> $ hadoop fs -rm -skipTrash /tmp/parent/sub1/*
> ...
> $ hadoop fs -du -h -s /tmp/parent /tmp/parent/*
> 799.7 M  2.3 G  /tmp/parent
> 799.7 M  2.3 G  /tmp/parent/sub1
> $ hdfs dfs -deleteSnapshot /tmp/parent snap1
> $ hadoop fs -du -h -s /tmp/parent /tmp/parent/*
> 0  0  /tmp/parent
> 0  0  /tmp/parent/sub1
> {noformat}
> It would be helpful if we had a flag, say -X, to exclude any snapshot related 
> disk usage in the output
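
The requested behavior can be illustrated with a small model. This is only a sketch under assumptions: the class DuSnapshotSketch, its Entry type, and the excludeSnapshots parameter are hypothetical and do not reflect how the attached patches implement the flag.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a directory's contents; not the HDFS ContentSummary API.
class DuSnapshotSketch {

  /** A file's length and whether its blocks survive only inside a snapshot. */
  static final class Entry {
    final long length;
    final boolean snapshotOnly;

    Entry(long length, boolean snapshotOnly) {
      this.length = length;
      this.snapshotOnly = snapshotOnly;
    }
  }

  /** Sums file lengths, optionally skipping data referenced only by snapshots. */
  static long spaceUsed(List<Entry> entries, boolean excludeSnapshots) {
    long total = 0;
    for (Entry e : entries) {
      if (excludeSnapshots && e.snapshotOnly) {
        continue;
      }
      total += e.length;
    }
    return total;
  }

  public static void main(String[] args) {
    List<Entry> dir = Arrays.asList(new Entry(100, false), new Entry(700, true));
    System.out.println(spaceUsed(dir, false)); // all data, snapshots included
    System.out.println(spaceUsed(dir, true));  // live data only
  }
}
```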





[jira] [Commented] (HDFS-9016) Display upgrade domain information in fsck

2016-04-18 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15246375#comment-15246375
 ] 

Allen Wittenauer commented on HDFS-9016:


fsck has already shipped in a previous release of Hadoop.  Changing its output 
is not a compatible change in the entirety of branch-2.

> Display upgrade domain information in fsck
> --
>
> Key: HDFS-9016
> URL: https://issues.apache.org/jira/browse/HDFS-9016
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-9016.patch
>
>
> This will make it easy for people to use fsck to check block placement when 
> upgrade domain is enabled.





[jira] [Updated] (HDFS-10306) SafeModeMonitor should not leave safe mode if name system is starting active service

2016-04-18 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10306:
-
Description: 
This is a follow-up of [HDFS-10192].

The {{BlockManagerSafeMode$SafeModeMonitor#canLeave()}} is not checking the 
{{namesystem#inTransitionToActive()}}, while it should. According to the fix of 
[HDFS-10192], we should add this check to prevent the {{smmthread}} from 
calling {{leaveSafeMode()}} too early.

  was:
This is a follow-up of [HDFS-10192].

The {{BlockManagerSafeMode$SafeModeMonitor#canLeave90}} is not checking the 
{{namesystem#inTransitionToActive()}}, while it should. According to the fix of 
[HDFS-10192], we should add this check to prevent the {{smmthread}} from 
calling {{leaveSafeMode()}} too early.


> SafeModeMonitor should not leave safe mode if name system is starting active 
> service
> 
>
> Key: HDFS-10306
> URL: https://issues.apache.org/jira/browse/HDFS-10306
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10306.000.patch
>
>
> This is a follow-up of [HDFS-10192].
> The {{BlockManagerSafeMode$SafeModeMonitor#canLeave()}} is not checking the 
> {{namesystem#inTransitionToActive()}}, while it should. According to the fix 
> of [HDFS-10192], we should add this check to prevent the {{smmthread}} from 
> calling {{leaveSafeMode()}} too early.
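
The check described above might look roughly like the following. It is a sketch under assumptions, not the content of HDFS-10306.000.patch: SafeModeMonitorSketch, its nested Namesystem interface, and the thresholdsMet flag are hypothetical stand-ins for the real BlockManagerSafeMode and name system classes.

```java
// Hypothetical stand-ins for the safe mode monitor and the name system.
class SafeModeMonitorSketch {

  /** Minimal view of the name system needed by the monitor. */
  interface Namesystem {
    boolean inTransitionToActive();
  }

  private final Namesystem namesystem;
  private final boolean thresholdsMet;

  SafeModeMonitorSketch(Namesystem namesystem, boolean thresholdsMet) {
    this.namesystem = namesystem;
    this.thresholdsMet = thresholdsMet;
  }

  /** Refuses to leave safe mode while active services are still starting. */
  boolean canLeave() {
    if (namesystem.inTransitionToActive()) {
      return false;
    }
    return thresholdsMet;
  }

  public static void main(String[] args) {
    System.out.println(new SafeModeMonitorSketch(() -> true, true).canLeave());
    System.out.println(new SafeModeMonitorSketch(() -> false, true).canLeave());
  }
}
```

With this ordering, the monitor thread keeps waiting during the transition even when the block thresholds are already satisfied.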





[jira] [Updated] (HDFS-10306) SafeModeMonitor should not leave safe mode if name system is starting active service

2016-04-18 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10306:
-
Status: Patch Available  (was: Open)

> SafeModeMonitor should not leave safe mode if name system is starting active 
> service
> 
>
> Key: HDFS-10306
> URL: https://issues.apache.org/jira/browse/HDFS-10306
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10306.000.patch
>
>
> This is a follow-up of [HDFS-10192].
> The {{BlockManagerSafeMode$SafeModeMonitor#canLeave90}} is not checking the 
> {{namesystem#inTransitionToActive()}}, while it should. According to the fix 
> of [HDFS-10192], we should add this check to prevent the {{smmthread}} from 
> calling {{leaveSafeMode()}} too early.





[jira] [Updated] (HDFS-10306) SafeModeMonitor should not leave safe mode if name system is starting active service

2016-04-18 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10306:
-
Attachment: HDFS-10306.000.patch

Thanks [~walter.k.su] for suggesting separating this code out of [HDFS-10284], 
which was to address 
{{o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode}}
 unit test intermittent failure.

The v0 patch was from the v1 patch of [HDFS-10284]. Please review.


> SafeModeMonitor should not leave safe mode if name system is starting active 
> service
> 
>
> Key: HDFS-10306
> URL: https://issues.apache.org/jira/browse/HDFS-10306
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Attachments: HDFS-10306.000.patch
>
>
> This is a follow-up of [HDFS-10192].
> The {{BlockManagerSafeMode$SafeModeMonitor#canLeave90}} is not checking the 
> {{namesystem#inTransitionToActive()}}, while it should. According to the fix 
> of [HDFS-10192], we should add this check to prevent the {{smmthread}} from 
> calling {{leaveSafeMode()}} too early.





[jira] [Updated] (HDFS-10284) o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode fails intermittently

2016-04-18 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10284:
-
Attachment: HDFS-10284.002.patch

Thank you [~walter.k.su] and [~vinayrpet] for your kind review.

The v2 patch moves the {{namesystem#inTransitionToActive()}} changes into a 
separate patch, as [~walter.k.su] suggested. I created jira [HDFS-10306] 
for this.

> o.a.h.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode 
> fails intermittently
> -
>
> Key: HDFS-10284
> URL: https://issues.apache.org/jira/browse/HDFS-10284
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.9.0
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Minor
> Attachments: HDFS-10284.000.patch, HDFS-10284.001.patch, 
> HDFS-10284.002.patch
>
>
> *Stacktrace*
> {code}
> org.mockito.exceptions.misusing.UnfinishedStubbingException: 
> Unfinished stubbing detected here:
> -> at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode(TestBlockManagerSafeMode.java:169)
> E.g. thenReturn() may be missing.
> Examples of correct stubbing:
> when(mock.isOk()).thenReturn(true);
> when(mock.isOk()).thenThrow(exception);
> doThrow(exception).when(mock).someVoidMethod();
> Hints:
>  1. missing thenReturn()
>  2. although stubbed methods may return mocks, you cannot inline mock 
> creation (mock()) call inside a thenReturn method (see issue 53)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManagerSafeMode.testCheckSafeMode(TestBlockManagerSafeMode.java:169)
> {code}
> Sample failing pre-commit UT: 
> https://builds.apache.org/job/PreCommit-HDFS-Build/15153/testReport/org.apache.hadoop.hdfs.server.blockmanagement/TestBlockManagerSafeMode/testCheckSafeMode/





[jira] [Updated] (HDFS-10306) SafeModeMonitor should not leave safe mode if name system is starting active service

2016-04-18 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10306:
-
Description: 
This is a follow-up of [HDFS-10192].

The {{BlockManagerSafeMode$SafeModeMonitor#canLeave90}} is not checking the 
{{namesystem#inTransitionToActive()}}, while it should. According to the fix of 
[HDFS-10192], we should add this check to prevent the {{smmthread}} from 
calling {{leaveSafeMode()}} too early.

  was:
Scenario:
===
write some blocks
wait till roll edits happen
Stop SNN
Delete some blocks in ANN, wait till the blocks are deleted in DN also.
restart the SNN and Wait till block reports come from datanode to SNN
Kill ANN then make SNN to Active.


> SafeModeMonitor should not leave safe mode if name system is starting active 
> service
> 
>
> Key: HDFS-10306
> URL: https://issues.apache.org/jira/browse/HDFS-10306
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>
> This is a follow-up of [HDFS-10192].
> The {{BlockManagerSafeMode$SafeModeMonitor#canLeave90}} is not checking the 
> {{namesystem#inTransitionToActive()}}, while it should. According to the fix 
> of [HDFS-10192], we should add this check to prevent the {{smmthread}} from 
> calling {{leaveSafeMode()}} too early.





[jira] [Updated] (HDFS-10306) SafeModeMonitor should not leave safe mode if name system is starting active service

2016-04-18 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10306:
-
Fix Version/s: (was: 2.9.0)

> SafeModeMonitor should not leave safe mode if name system is starting active 
> service
> 
>
> Key: HDFS-10306
> URL: https://issues.apache.org/jira/browse/HDFS-10306
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>
> Scenario:
> ===
> write some blocks
> wait till roll edits happen
> Stop SNN
> Delete some blocks in ANN, wait till the blocks are deleted in DN also.
> restart the SNN and Wait till block reports come from datanode to SNN
> Kill ANN then make SNN to Active.





[jira] [Updated] (HDFS-10306) SafeModeMonitor should not leave safe mode if name system is starting active service

2016-04-18 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HDFS-10306:
-
Hadoop Flags:   (was: Reviewed)

> SafeModeMonitor should not leave safe mode if name system is starting active 
> service
> 
>
> Key: HDFS-10306
> URL: https://issues.apache.org/jira/browse/HDFS-10306
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.9.0
>
>
> Scenario:
> ===
> write some blocks
> wait till roll edits happen
> Stop SNN
> Delete some blocks in ANN, wait till the blocks are deleted in DN also.
> restart the SNN and Wait till block reports come from datanode to SNN
> Kill ANN then make SNN to Active.





[jira] [Created] (HDFS-10306) SafeModeMonitor should not leave safe mode if name system is starting active service

2016-04-18 Thread Mingliang Liu (JIRA)
Mingliang Liu created HDFS-10306:


 Summary: SafeModeMonitor should not leave safe mode if name system 
is starting active service
 Key: HDFS-10306
 URL: https://issues.apache.org/jira/browse/HDFS-10306
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Mingliang Liu
Assignee: Brahma Reddy Battula
 Fix For: 2.9.0


Scenario:
===
write some blocks
wait till roll edits happen
Stop SNN
Delete some blocks in ANN, wait till the blocks are deleted in DN also.
restart the SNN and Wait till block reports come from datanode to SNN
Kill ANN then make SNN to Active.





[jira] [Assigned] (HDFS-10306) SafeModeMonitor should not leave safe mode if name system is starting active service

2016-04-18 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu reassigned HDFS-10306:


Assignee: Mingliang Liu  (was: Brahma Reddy Battula)

> SafeModeMonitor should not leave safe mode if name system is starting active 
> service
> 
>
> Key: HDFS-10306
> URL: https://issues.apache.org/jira/browse/HDFS-10306
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
> Fix For: 2.9.0
>
>
> Scenario:
> ===
> write some blocks
> wait till roll edits happen
> Stop SNN
> Delete some blocks in ANN, wait till the blocks are deleted in DN also.
> restart the SNN and Wait till block reports come from datanode to SNN
> Kill ANN then make SNN to Active.





[jira] [Commented] (HDFS-9016) Display upgrade domain information in fsck

2016-04-18 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15246308#comment-15246308
 ] 

Ming Ma commented on HDFS-9016:
---

Thanks, [~eddyxu]! It shouldn't change fsck's output format given upgrade 
domain isn't defined by default and it couldn't be defined until HDFS-9005 was 
available. Given 2.8 hasn't been released yet, it seems ok as long as this jira 
is added to 2.8. [~aw], [~andrew.wang], thoughts?

> Display upgrade domain information in fsck
> --
>
> Key: HDFS-9016
> URL: https://issues.apache.org/jira/browse/HDFS-9016
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-9016.patch
>
>
> This will make it easy for people to use fsck to check block placement when 
> upgrade domain is enabled.





[jira] [Updated] (HDFS-10299) libhdfs++: File length doesn't always count the last block if it's being written to

2016-04-18 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-10299:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> libhdfs++: File length doesn't always count the last block if it's being 
> written to
> ---
>
> Key: HDFS-10299
> URL: https://issues.apache.org/jira/browse/HDFS-10299
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: Xiaowei Zhu
> Attachments: HDFS-10299.HDFS-8707.000.patch
>
>
> It looks like we aren't factoring in the last block of files that are being 
> written to or haven't been closed yet into the length of the file.
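
One way to picture the intended length calculation is below. This is a sketch under assumptions, written in Java for illustration even though libhdfs++ is C++: FileLengthSketch and visibleLength are hypothetical names, and the model assumes the reported length of completed blocks already includes the last block once that block is complete.

```java
// Hypothetical helper; not part of libhdfs++ or the HDFS client API.
class FileLengthSketch {

  /**
   * Returns the length a reader should see. The last block is added
   * separately only while it is still being written.
   */
  static long visibleLength(long completedBlocksLength, long lastBlockLength,
      boolean lastBlockComplete) {
    if (lastBlockComplete) {
      return completedBlocksLength;
    }
    return completedBlocksLength + lastBlockLength;
  }

  public static void main(String[] args) {
    System.out.println(visibleLength(256, 64, false)); // file still open
    System.out.println(visibleLength(320, 64, true));  // file closed
  }
}
```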





[jira] [Commented] (HDFS-10232) Ozone: Make config key naming consistent

2016-04-18 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15246272#comment-15246272
 ] 

Arpit Agarwal commented on HDFS-10232:
--

+1 thanks [~anu].

> Ozone: Make config key naming consistent
> 
>
> Key: HDFS-10232
> URL: https://issues.apache.org/jira/browse/HDFS-10232
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Anu Engineer
>Assignee: Anu Engineer
>Priority: Trivial
> Attachments: HDFS-10232-HDFS-7240.001.patch, 
> HDFS-10232-HDFS-7240.002.patch, HDFS-10232-HDFS-7240.003.patch
>
>
> We seem to use StorageHandler, ozone, Objectstore etc as prefix. We should 
> pick one -- Ideally ozone and use that consistently as the prefix for the 
> ozone config management.





[jira] [Updated] (HDFS-10232) Ozone: Make config key naming consistent

2016-04-18 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-10232:

Attachment: HDFS-10232-HDFS-7240.003.patch

Missed a comment from Arpit in earlier patch. Removed DFS prefix from Ozone 
Keys.

> Ozone: Make config key naming consistent
> 
>
> Key: HDFS-10232
> URL: https://issues.apache.org/jira/browse/HDFS-10232
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Anu Engineer
>Assignee: Anu Engineer
>Priority: Trivial
> Attachments: HDFS-10232-HDFS-7240.001.patch, 
> HDFS-10232-HDFS-7240.002.patch, HDFS-10232-HDFS-7240.003.patch
>
>
> We seem to use StorageHandler, ozone, Objectstore etc as prefix. We should 
> pick one -- Ideally ozone and use that consistently as the prefix for the 
> ozone config management.





[jira] [Commented] (HDFS-9016) Display upgrade domain information in fsck

2016-04-18 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15246223#comment-15246223
 ] 

Allen Wittenauer commented on HDFS-9016:


If it changes the output in any way/shape/form, it's not backward compatible.  
See 
http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html#Command_Line_Interface_CLI
 .

> Display upgrade domain information in fsck
> --
>
> Key: HDFS-9016
> URL: https://issues.apache.org/jira/browse/HDFS-9016
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-9016.patch
>
>
> This will make it easy for people to use fsck to check block placement when 
> upgrade domain is enabled.





[jira] [Updated] (HDFS-9958) BlockManager#createLocatedBlocks can throw NPE for corruptBlocks on failed storages.

2016-04-18 Thread Kuhu Shukla (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kuhu Shukla updated HDFS-9958:
--
Attachment: HDFS-9958.002.patch

Updating patch per comments. Added test is now reliable.

> BlockManager#createLocatedBlocks can throw NPE for corruptBlocks on failed 
> storages.
> 
>
> Key: HDFS-9958
> URL: https://issues.apache.org/jira/browse/HDFS-9958
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: HDFS-9958-Test-v1.txt, HDFS-9958.001.patch, 
> HDFS-9958.002.patch
>
>
> In a scenario where the corrupt replica is on a failed storage, before it is 
> taken out of blocksMap, there is a race which causes the creation of 
> LocatedBlock on a {{machines}} array element that is not populated. 
> Following is the root cause,
> {code}
> final int numCorruptNodes = countNodes(blk).corruptReplicas();
> {code}
> countNodes only looks at nodes with storage state as NORMAL, which in the 
> case where corrupt replica is on failed storage will amount to 
> numCorruptNodes being zero. 
> {code}
> final int numNodes = blocksMap.numNodes(blk);
> {code}
> However, numNodes will count all nodes/storages irrespective of the state of 
> the storage. Therefore numMachines will include such (failed) nodes. The 
> assert fails only if assertions are enabled in the JVM; otherwise the code 
> goes ahead and tries to create a LocatedBlock object for an entry that was 
> never put in the {{machines}} array.
> Here is the stack trace:
> {code}
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.toDatanodeInfos(DatanodeStorageInfo.java:45)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.toDatanodeInfos(DatanodeStorageInfo.java:40)
>   at 
> org.apache.hadoop.hdfs.protocol.LocatedBlock.<init>(LocatedBlock.java:84)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlock(BlockManager.java:878)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlock(BlockManager.java:826)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlockList(BlockManager.java:799)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlocks(BlockManager.java:899)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1849)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1799)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1712)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:588)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:365)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
> {code}
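
The mismatch between the two counts can be reproduced in a small model. This is a simplified sketch under assumptions, not the BlockManager code: LocatedBlockRaceSketch, Replica, and buildMachines are hypothetical, and datanodes are represented as plain strings.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of the counting mismatch described above.
class LocatedBlockRaceSketch {

  enum State { NORMAL, FAILED }

  static final class Replica {
    final String node;
    final State state;
    final boolean corrupt;

    Replica(String node, State state, boolean corrupt) {
      this.node = node;
      this.state = state;
      this.corrupt = corrupt;
    }
  }

  /** Like the countNodes behavior described: only NORMAL storages are counted. */
  static int countCorruptOnNormalStorage(List<Replica> replicas) {
    int n = 0;
    for (Replica r : replicas) {
      if (r.state == State.NORMAL && r.corrupt) {
        n++;
      }
    }
    return n;
  }

  /** Sizes the array from both counts, then fills in non-corrupt replicas only. */
  static String[] buildMachines(List<Replica> replicas) {
    int numCorruptNodes = countCorruptOnNormalStorage(replicas);
    int numNodes = replicas.size(); // every storage, regardless of state
    String[] machines = new String[numNodes - numCorruptNodes];
    int j = 0;
    for (Replica r : replicas) {
      if (!r.corrupt) {
        machines[j++] = r.node;
      }
    }
    return machines; // trailing slots stay null if a corrupt replica went uncounted
  }

  public static void main(String[] args) {
    List<Replica> replicas = Arrays.asList(
        new Replica("dn1", State.NORMAL, false),
        new Replica("dn2", State.NORMAL, false),
        new Replica("dn3", State.FAILED, true));
    System.out.println(Arrays.toString(buildMachines(replicas)));
  }
}
```

In this model the corrupt replica on the failed storage is not subtracted from the array size but is still skipped when filling the array, which leaves a null element for later code to dereference.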





[jira] [Commented] (HDFS-9016) Display upgrade domain information in fsck

2016-04-18 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15246203#comment-15246203
 ] 

Lei (Eddy) Xu commented on HDFS-9016:
-

Hi, [~mingma]

Thanks for the work. 
+1 for the code, but it'd be better for [~aw] or [~andrew.wang] to comment 
on the compatibility.

> Display upgrade domain information in fsck
> --
>
> Key: HDFS-9016
> URL: https://issues.apache.org/jira/browse/HDFS-9016
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: HDFS-9016.patch
>
>
> This will make it easy for people to use fsck to check block placement when 
> upgrade domain is enabled.





[jira] [Assigned] (HDFS-10304) implement moveToLocal or remove it from the usage list

2016-04-18 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou reassigned HDFS-10304:


Assignee: Xiaobing Zhou

> implement moveToLocal or remove it from the usage list
> --
>
> Key: HDFS-10304
> URL: https://issues.apache.org/jira/browse/HDFS-10304
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: scripts
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Xiaobing Zhou
>Priority: Minor
>
> if you get the usage list of {{hdfs dfs}} it tells you of "-moveToLocal". 
> If you try to use the command, it tells you off "Option '-moveToLocal' is not 
> implemented yet."
> Either the command should be implemented, or it should be removed from the 
> usage list, as it is not technically a command you can use, except in the 
> special case of "I want my shell to print "not implemented yet""





[jira] [Commented] (HDFS-9530) huge Non-DFS Used in hadoop 2.6.2 & 2.7.1

2016-04-18 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15246179#comment-15246179
 ] 

Ravi Prakash commented on HDFS-9530:


bq. 1. Reservation happens only when the block is being received using 
BlockReceiver. No other places reservation happens, so no need to release as 
well.
Thanks for reminding me Brahma! Do you think we should change 
{{reservedForReplicas}} when a datanode is started up and an older RBW replica 
is recovered? Specifically 
[BlockPoolSlice.getVolumeMap|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/BlockPoolSlice.java#L361]
 {{addToReplicasMap(volumeMap, rbwDir, lazyWriteReplicaMap, false);}} . Also it 
seems to me, since we aren't calling {{reserveSpaceForReplica}} in 
BlockReceiver but instead at a lower level, we will have to worry about calling 
{{releaseReservedSpace}} at that lower level.
{quote}2. BlockReceiver constructor have a try-catch block where it will 
release all the bytes reserved, if there is any exceptions after reserving.
3. BlockReceiver#receiveBlock() have the try-catch block where it will release 
all the bytes reserved if there is any exceptions during the receiving 
process.{quote}
Could you please point me to the code where you see this happening? I mean 
specific instances of {{FsVolumeImpl.releaseReservedSpace}} being called with 
the stack trace.

bq. Only place left is in DataXceiver#writeBlock(), exception can happen after 
creation of BlockReceiver and before calling BlockReceiver#receiveBlock(), if 
failed to connect to Mirror nodes.
Do you mean to imply that the places I found in [this 
comment|https://issues.apache.org/jira/browse/HDFS-9530?focusedCommentId=15231164&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15231164]
 need not call {{reserveSpaceForReplica}} / {{releaseReservedSpace}} ?
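
The pairing of {{reserveSpaceForReplica}} with {{releaseReservedSpace}} discussed above can be sketched as follows. This is an illustration under assumptions, not the datanode code: ReservationSketch, its Volume class, and receiveBlock are hypothetical, and the model releases the whole reservation when the receive step ends, whether it succeeds or throws.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of per-volume space reservation; not FsVolumeImpl.
class ReservationSketch {

  static final class Volume {
    private final AtomicLong reservedForReplicas = new AtomicLong();

    void reserveSpaceForReplica(long bytes) {
      reservedForReplicas.addAndGet(bytes);
    }

    void releaseReservedSpace(long bytes) {
      reservedForReplicas.addAndGet(-bytes);
    }

    long reserved() {
      return reservedForReplicas.get();
    }
  }

  /** Reserves space, runs the receive step, and always gives the reservation back. */
  static void receiveBlock(Volume volume, long blockSize, Runnable receiveStep) {
    volume.reserveSpaceForReplica(blockSize);
    try {
      receiveStep.run();
    } finally {
      // Released on success and on failure, so no reservation is leaked.
      volume.releaseReservedSpace(blockSize);
    }
  }

  public static void main(String[] args) {
    Volume v = new Volume();
    receiveBlock(v, 128, () -> { });
    try {
      receiveBlock(v, 128, () -> {
        throw new IllegalStateException("mirror unreachable");
      });
    } catch (IllegalStateException expected) {
      // The failure is expected here; the reservation must still be released.
    }
    System.out.println(v.reserved());
  }
}
```

Keeping the reserve and release calls in the same method is one way to make it easy to audit that every path, including the failure between constructing the receiver and receiving the block, gives the space back.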


> huge Non-DFS Used in hadoop 2.6.2 & 2.7.1
> -
>
> Key: HDFS-9530
> URL: https://issues.apache.org/jira/browse/HDFS-9530
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Fei Hui
> Attachments: HDFS-9530-01.patch
>
>
> i think there are bugs in HDFS
> ===
> here is config
> <property>
>   <name>dfs.datanode.data.dir</name>
>   <value>
>     file:///mnt/disk4,file:///mnt/disk1,file:///mnt/disk3,file:///mnt/disk2
>   </value>
> </property>
> here is dfsadmin report 
> [hadoop@worker-1 ~]$ hadoop dfsadmin -report
> DEPRECATED: Use of this script to execute hdfs command is deprecated.
> Instead use the hdfs command for it.
> Configured Capacity: 240769253376 (224.23 GB)
> Present Capacity: 238604832768 (222.22 GB)
> DFS Remaining: 215772954624 (200.95 GB)
> DFS Used: 22831878144 (21.26 GB)
> DFS Used%: 9.57%
> Under replicated blocks: 4
> Blocks with corrupt replicas: 0
> Missing blocks: 0
> -
> Live datanodes (3):
> Name: 10.117.60.59:50010 (worker-2)
> Hostname: worker-2
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 7190958080 (6.70 GB)
> Non DFS Used: 721473536 (688.05 MB)
> DFS Remaining: 72343986176 (67.38 GB)
> DFS Used%: 8.96%
> DFS Remaining%: 90.14%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 1
> Last contact: Wed Dec 09 15:55:02 CST 2015
> Name: 10.168.156.0:50010 (worker-3)
> Hostname: worker-3
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 7219073024 (6.72 GB)
> Non DFS Used: 721473536 (688.05 MB)
> DFS Remaining: 72315871232 (67.35 GB)
> DFS Used%: 9.00%
> DFS Remaining%: 90.11%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 1
> Last contact: Wed Dec 09 15:55:03 CST 2015
> Name: 10.117.15.38:50010 (worker-1)
> Hostname: worker-1
> Decommission Status : Normal
> Configured Capacity: 80256417792 (74.74 GB)
> DFS Used: 8421847040 (7.84 GB)
> Non DFS Used: 721473536 (688.05 MB)
> DFS Remaining: 71113097216 (66.23 GB)
> DFS Used%: 10.49%
> DFS Remaining%: 88.61%
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> Xceivers: 1
> Last contact: Wed Dec 09 15:55:03 CST 2015
> 
> when running hive job , dfsadmin report as follows
> [hadoop@worker-1 ~]$ hadoop dfsadmin -report
> DEPRECATED: Use of this script to execute hdfs command is deprecated.
> Instead use the hdfs command for it.
> Configured Capacity: 240769253376 (224.23 GB)
> Present Capacity: 

[jira] [Commented] (HDFS-10256) Use GenericTestUtils.getTestDir method in tests for temporary directories

2016-04-18 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15246137#comment-15246137
 ] 

Kihwal Lee commented on HDFS-10256:
---

The entire base dir is supposed to be wiped out by the shutdown hook. The space 
won't get cleaned up between test cases running on the same jvm, but I thought 
the space usage increase would be negligible.  I will check it out further.

> Use GenericTestUtils.getTestDir method in tests for temporary directories
> -
>
> Key: HDFS-10256
> URL: https://issues.apache.org/jira/browse/HDFS-10256
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: build, test
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>






[jira] [Updated] (HDFS-10305) Hdfs audit shouldn't log mkdir operation if the directory already exists.

2016-04-18 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-10305:
--
Description: 
Currently Hdfs audit logs mkdir operation even if the directory already exists.
This creates confusion while analyzing audit logs.

  was:
Currently Hdfs audit logs mkdir operation if the directory already exists.
This creates confusion while analyzing audit logs.


> Hdfs audit shouldn't log mkdir operation if the directory already exists.
> 
>
> Key: HDFS-10305
> URL: https://issues.apache.org/jira/browse/HDFS-10305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
>Priority: Minor
>
> Currently Hdfs audit logs mkdir operation even if the directory already 
> exists.
> This creates confusion while analyzing audit logs.
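
The requested behavior can be sketched as follows. This is a minimal model under assumptions, not the namenode code: MkdirAuditSketch, its in-memory directory set, and the audit line format are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model; not FSNamesystem or its audit logger.
class MkdirAuditSketch {

  private final Set<String> directories = new HashSet<>();
  final List<String> auditLog = new ArrayList<>();

  /** Succeeds whether or not the path exists, but audits only real creations. */
  boolean mkdirs(String path) {
    boolean created = directories.add(path);
    if (created) {
      auditLog.add("cmd=mkdirs src=" + path);
    }
    return true;
  }

  public static void main(String[] args) {
    MkdirAuditSketch fs = new MkdirAuditSketch();
    fs.mkdirs("/tmp/a");
    fs.mkdirs("/tmp/a"); // already exists, so no second audit entry
    System.out.println(fs.auditLog);
  }
}
```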





[jira] [Updated] (HDFS-10305) Hdfs audit shouldn't log mkdir operation if the directory already exists.

2016-04-18 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-10305:
--
Summary: Hdfs audit shouldn't log mkdir operation if the directory already 
exists.  (was: Hdfs audit shouldn't log if the directory already exists.)

> Hdfs audit shouldn't log mkdir operation if the directory already exists.
> 
>
> Key: HDFS-10305
> URL: https://issues.apache.org/jira/browse/HDFS-10305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
>Priority: Minor
>
> Hdfs audit logs mkdir operation if the directory already exists.
> This creates confusion while analyzing audit logs.





[jira] [Updated] (HDFS-10305) Hdfs audit shouldn't log mkdir operaton if the directory already exists.

2016-04-18 Thread Rushabh S Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rushabh S Shah updated HDFS-10305:
--
Description: 
Currently Hdfs audit logs mkdir operation if the directory already exists.
This creates confusion while analyzing audit logs.

  was:
Hdfs audit logs mkdir operation if the directory already exists.
This creates confusion while analyzing audit logs.


> Hdfs audit shouldn't log mkdir operaton if the directory already exists.
> 
>
> Key: HDFS-10305
> URL: https://issues.apache.org/jira/browse/HDFS-10305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
>Priority: Minor
>
> Currently Hdfs audit logs mkdir operation if the directory already exists.
> This creates confusion while analyzing audit logs.





[jira] [Created] (HDFS-10305) Hdfs audit shouldn't log if the directory already exists.

2016-04-18 Thread Rushabh S Shah (JIRA)
Rushabh S Shah created HDFS-10305:
-

 Summary: Hdfs audit shouldn't log if the directory already exists.
 Key: HDFS-10305
 URL: https://issues.apache.org/jira/browse/HDFS-10305
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Rushabh S Shah
Assignee: Rushabh S Shah
Priority: Minor


Hdfs audit logs mkdir operation if the directory already exists.
This creates confusion while analyzing audit logs.





[jira] [Commented] (HDFS-10265) OEV tool fails to read edit xml file if OP_UPDATE_BLOCKS has no BLOCK tag

2016-04-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246061#comment-15246061
 ] 

Hadoop QA commented on HDFS-10265:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 12m 47s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
3s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
56s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 48s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 23s 
{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 1 new + 
314 unchanged - 1 fixed = 315 total (was 315) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 7s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 55m 58s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 51m 55s {color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
21s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 146m 5s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | 
hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
| JDK v1.7.0_95 Failed junit tests | 
hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl |
|   | hadoop.hdfs.TestHFlush |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| 

[jira] [Commented] (HDFS-10256) Use GenericTestUtils.getTestDir method in tests for temporary directories

2016-04-18 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246047#comment-15246047
 ] 

Vinayakumar B commented on HDFS-10256:
--

bq. Can we actually make sure each MiniDFSCluster gets a unique base directory?
v2 patch by [~ste...@apache.org] in HADOOP-12984 had actually done this. But as 
mentioned by [~cnauroth] 
[here|https://issues.apache.org/jira/browse/HADOOP-12984?focusedCommentId=14969932=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14969932]
 in HADOOP-12984, it had the side effect of using a lot of disk space by the 
end of a complete test run.
Do you think cleanup is not happening properly?

> Use GenericTestUtils.getTestDir method in tests for temporary directories
> -
>
> Key: HDFS-10256
> URL: https://issues.apache.org/jira/browse/HDFS-10256
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: build, test
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>






[jira] [Commented] (HDFS-10304) implement moveToLocal or remove it from the usage list

2016-04-18 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246040#comment-15246040
 ] 

Steve Loughran commented on HDFS-10304:
---

{code}
$ hdfs dfs
Usage: hadoop fs [generic options]
[-appendToFile  ... ]
[-cat [-ignoreCrc]  ...]
[-checksum  ...]
[-chgrp [-R] GROUP PATH...]
[-chmod [-R]  PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
[-copyFromLocal [-f] [-p] [-l]  ... ]
[-copyToLocal [-p] [-ignoreCrc] [-crc]  ... ]
[-count [-q] [-h] [-v] [-t []]  ...]
[-cp [-f] [-p | -p[topax]]  ... ]
[-createSnapshot  []]
[-deleteSnapshot  ]
[-df [-h] [ ...]]
[-du [-s] [-h]  ...]
[-expunge]
[-find  ...  ...]
[-get [-p] [-ignoreCrc] [-crc]  ... ]
[-getfacl [-R] ]
[-getfattr [-R] {-n name | -d} [-e en] ]
[-getmerge [-nl]  ]
[-help [cmd ...]]
[-ls [-d] [-h] [-R] [ ...]]
[-mkdir [-p]  ...]
[-moveFromLocal  ... ]
[-moveToLocal  ]
[-mv  ... ]
[-put [-f] [-p] [-l]  ... ]
[-renameSnapshot   ]
[-rm [-f] [-r|-R] [-skipTrash] [-safely]  ...]
[-rmdir [--ignore-fail-on-non-empty]  ...]
[-setfacl [-R] [{-b|-k} {-m|-x } ]|[--set  
]]
[-setfattr {-n name [-v value] | -x name} ]
[-setrep [-R] [-w]   ...]
[-stat [format]  ...]
[-tail [-f] ]
[-test -[defsz] ]
[-text [-ignoreCrc]  ...]
[-touchz  ...]
[-truncate [-w]   ...]
[-usage [cmd ...]]

Generic options supported are
-conf  specify an application configuration file
-D 

[jira] [Created] (HDFS-10304) implement moveToLocal or remove it from the usage list

2016-04-18 Thread Steve Loughran (JIRA)
Steve Loughran created HDFS-10304:
-

 Summary: implement moveToLocal or remove it from the usage list
 Key: HDFS-10304
 URL: https://issues.apache.org/jira/browse/HDFS-10304
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: scripts
Affects Versions: 2.8.0
Reporter: Steve Loughran
Priority: Minor


If you get the usage list of {{hdfs dfs}}, it tells you of "-moveToLocal". 

If you try to use the command, it tells you off "Option '-moveToLocal' is not 
implemented yet."

Either the command should be implemented, or it should be removed from the 
usage list, as it is not technically a command you can use, except in the 
special case of "I want my shell to print "not implemented yet""





[jira] [Commented] (HDFS-9543) DiskBalancer : Add Data mover

2016-04-18 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246011#comment-15246011
 ] 

Anu Engineer commented on HDFS-9543:


[~eddyxu] Thank you for the code review comments. Please see some of my 
thoughts on your suggestions below.

bq.Could you put getNextBlock() logic into a separate Iterator, and make it 
Closable, which will include getBlockToCopy(), openPoolIters(), getNextBlock(), 
closePoolIters(). There are a few draw backs of separating them into different 
functions. 

I thought that this was an excellent idea, and I explored this path of code 
organization. But I got stuck on a problem that makes me want to abandon this 
path. Please do let me know if you have suggestions on how I can solve it.

In supporting an Iterator interface, we have to support hasNext. In our case it 
is a scan of the block list to find a block that is smaller than the required 
move size. There are two ways to do this. The first is to look for the block and 
report true if one is found, but there is no guarantee that it will indeed be 
returned by the next() call, since these blocks can go away underneath us. 

So a common pattern of iterator code becomes complex to write: the 
while(hasNext()) next() pattern now needs to worry about next() failing even 
when hasNext() has succeeded.

The second is to keep a pointer to the found block in memory and return it in 
the next() call, but that means we have to do some unnecessary block state 
management in the iterator.

I ended up writing all this, found that the code was getting more complex 
instead of simpler, and decided to abandon this approach.
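
To make the trade-off concrete, here is a minimal sketch of the second option 
(caching the block found by hasNext() until next() is called). It is not the 
code from the patch: the class name, the use of plain block sizes instead of 
real block objects, and the size filter are all assumptions made for 
illustration.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical look-ahead iterator over block sizes; not the patch's code. */
final class LookAheadBlockIterator implements Iterator<Long> {
  private final Iterator<Long> source; // stand-in for the real block list
  private final long maxSize;          // only blocks of at most this size qualify
  private Long cached;                 // block found by hasNext(), held for next()

  LookAheadBlockIterator(List<Long> blockSizes, long maxSize) {
    this.source = blockSizes.iterator();
    this.maxSize = maxSize;
  }

  @Override
  public boolean hasNext() {
    // Scan forward until a qualifying block is found, and remember it.
    while (cached == null && source.hasNext()) {
      Long candidate = source.next();
      if (candidate <= maxSize) {
        cached = candidate;
      }
    }
    return cached != null;
  }

  @Override
  public Long next() {
    if (!hasNext()) {
      throw new NoSuchElementException("no block small enough to move");
    }
    Long result = cached;
    cached = null; // this is the extra state the iterator has to manage
    return result;
  }
}
```

The cached field is the extra block state management mentioned above. In a 
real datanode the cached block could still disappear before next() returns it, 
which this sketch does not attempt to handle.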


bq. 1) The states (i.e. poolIndex,) are stored outside these functions, the 
caller needs maintain these states. 

These are part of the BlockMover class, and copyblocks is the only call made by 
other classes, so this state is not visible to the caller at all.

bq. 2) poolIndex is never initialized and is not able be reset.
poolIndex is an index into a circular array. If you like, I can initialize it 
to 0, but in most cases we just move to the next block pool and get the next 
block. Before each block fetch we initialize the count variable so that we know 
whether we have visited all the block pools. In other words, users should not 
be able to see this variable, nor should they need to reset it.
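
A rough sketch of that circular scan, under the assumption that each block pool 
can be reduced to a "has blocks" flag. The class and method names are 
hypothetical and are not taken from the patch.

```java
/** Hypothetical sketch of a circular scan over block pools. */
final class CircularPoolScan {
  private int poolIndex = 0; // persists across calls and is never reset

  /**
   * Returns the index of the next pool that has blocks, starting from where
   * the previous call stopped, or -1 once every pool has been visited.
   */
  int nextPoolWithBlocks(boolean[] poolHasBlocks) {
    int count = 0; // re-initialized before every fetch
    while (count < poolHasBlocks.length) {
      int current = poolIndex;
      poolIndex = (poolIndex + 1) % poolHasBlocks.length;
      count++;
      if (poolHasBlocks[current]) {
        return current;
      }
    }
    return -1;
  }
}
```

Because count bounds each scan, the starting value of poolIndex only affects 
which pool is tried first, not whether the scan terminates.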

bq. Please always log the IOEs. And I think it is better to throw IOE here as 
well as many other places.
I will log a debug trace here. The reason we are not throwing is that we might 
encounter a damaged block when we try to move a large number of blocks from one 
volume to another. Instead of failing or aborting the action, we keep a count of 
the errors we have encountered. We just ignore that block and continue copying 
until we hit max_error_counts. For each move -- that is, a source disk, a 
destination disk, and the bytes to move -- you can specify the max error count. 
Hence we are not throwing but keeping track of the error count.
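
The error-tolerant copy described above could look roughly like the following. 
This is a sketch under assumptions: BlockCopier, copyAll, and the string block 
ids are hypothetical names, and the real patch may structure this differently.

```java
import java.io.IOException;
import java.util.List;

/** Hypothetical sketch of copying blocks while tolerating a bounded number of errors. */
final class ErrorTolerantCopy {
  /** Stand-in for copying one block; throws if the block is damaged. */
  interface BlockCopier {
    void copy(String blockId) throws IOException;
  }

  /** Returns the number of blocks copied; stops once errors reach maxErrors. */
  static int copyAll(List<String> blocks, BlockCopier copier, int maxErrors) {
    int copied = 0;
    int errors = 0;
    for (String block : blocks) {
      if (errors >= maxErrors) {
        break; // too many failures for this move; give up on the rest
      }
      try {
        copier.copy(block);
        copied++;
      } catch (IOException e) {
        // Log and count the failure instead of propagating it.
        errors++;
        System.err.println("Skipping block " + block + ": " + e.getMessage());
      }
    }
    return copied;
  }
}
```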

bq. Can it be a private static List openPoolIters() ?
Since we are not doing the iterator interface, I am skipping this too.

bq. In a few such places, should we actually break the while loop? Wouldn't 
continue here just generate a lot of LOGS and spend CPU cycles?
You are right, and I did think of writing a break here. I chose continue over 
break because it is easier to reason about a loop if it has only one exit 
point. With break, you have to reason about all the exit points. This loop is a 
pretty complicated one since it can exit in many ways. Here are some:

# when we meet the maximum error count.
# if we have come close enough to the bytes to move -- say, close to 10% of the target.
# if we get an interrupt call from the client.
# if we get a shutdown call.
# if we are not able to find any blocks to move.
# if we are out of destination space.

So instead of making people reason about each of these, we set an exit flag and 
loop back up. Since the while condition will then be false, the loop has one 
single exit and exits without any extra logging.
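
As an illustration of the single-exit style (not the loop from the patch), the 
sketch below models only two of the stop conditions. The class name, 
parameters, and fixed block size are assumptions.

```java
/** Hypothetical sketch of a copy loop with one exit point, driven by a flag. */
final class SingleExitLoop {
  /** Moves fixed-size blocks until a stop condition clears the flag; returns bytes moved. */
  static long run(long bytesToMove, long blockSize, long destinationFree) {
    long moved = 0;
    boolean shouldRun = true;
    while (shouldRun) {
      if (moved >= bytesToMove) {        // target reached
        shouldRun = false;
        continue;                        // loop back; the while condition ends the loop
      }
      if (destinationFree < blockSize) { // out of destination space
        shouldRun = false;
        continue;
      }
      moved += blockSize;
      destinationFree -= blockSize;
    }
    return moved; // the only place the loop's result leaves the method
  }
}
```

Every stop condition follows the same shape: clear the flag, then continue. A 
reader only has to check the while condition to see how the loop ends.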

bq. Why do you need to change float to double. In this case, wouldn't float 
good enough ? 
This is a stylistic change; the Java docs recommend double as the default 
choice over float, hence this fix.

> DiskBalancer : Add Data mover 
> --
>
> Key: HDFS-9543
> URL: https://issues.apache.org/jira/browse/HDFS-9543
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Attachments: HDFS-9543-HDFS-1312.001.patch
>
>
> This patch adds the actual mover logic to the datanode.





[jira] [Commented] (HDFS-10296) FileContext.getDelegationTokens() fails to obtain KMS delegation token

2016-04-18 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246008#comment-15246008
 ] 

Wei-Chiu Chuang commented on HDFS-10296:


Thanks for providing a code snippet to demonstrate the issue, Andreas.

Please note that in your case, if default FS is a local file system, it will 
not have delegation tokens.
While you reported this on a CDH Hadoop, this same behavior also holds for 
Apache Hadoop.

> FileContext.getDelegationTokens() fails to obtain KMS delegation token
> --
>
> Key: HDFS-10296
> URL: https://issues.apache.org/jira/browse/HDFS-10296
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption
>Affects Versions: 2.6.0
> Environment: CDH 5.6 with a Java KMS
>Reporter: Andreas Neumann
>
> This little program demonstrates the problem: With FileSystem, we can get 
> both the HDFS and the kms-dt token, whereas with FileContext, we can only 
> obtain the HDFS delegation token. 
> {code}
> public class SimpleTest {
>   public static void main(String[] args) throws IOException {
> YarnConfiguration hConf = new YarnConfiguration();
> String renewer = "renewer";
> FileContext fc = FileContext.getFileContext(hConf);
> List tokens = fc.getDelegationTokens(new Path("/"), renewer);
> for (Token token : tokens) {
>   System.out.println("Token from FC: " + token);
> }
> FileSystem fs = FileSystem.get(hConf);
> for (Token token : fs.addDelegationTokens(renewer, new Credentials())) 
> {
>   System.out.println("Token from FS: " + token);
> }
>   }
> }
> {code}
> Sample output (host/user name x'ed out):
> {noformat}
> Token from FC: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:xxx, Ident: 
> (HDFS_DELEGATION_TOKEN token 49 for xxx)
> Token from FS: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:xxx, Ident: 
> (HDFS_DELEGATION_TOKEN token 50 for xxx)
> Token from FS: Kind: kms-dt, Service: xx.xx.xx.xx:16000, Ident: 00 04 63 64 
> 61 70 07 72 65 6e 65 77 65 72 00 8a 01 54 16 96 c2 95 8a 01 54 3a a3 46 95 0e 
> 02
> {noformat}
> Apparently FileContext does not return the KMS token. 





[jira] [Commented] (HDFS-10301) Blocks removed by thousands due to falsely detected zombie storages

2016-04-18 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245867#comment-15245867
 ] 

Daryn Sharp commented on HDFS-10301:


Enabling HDFS-9198 will process BRs (block reports) in FIFO order.  It doesn't 
fix this implementation bug, but it virtually eliminates its occurrence.

> Blocks removed by thousands due to falsely detected zombie storages
> ---
>
> Key: HDFS-10301
> URL: https://issues.apache.org/jira/browse/HDFS-10301
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.1
>Reporter: Konstantin Shvachko
>Priority: Critical
> Attachments: zombieStorageLogs.rtf
>
>
> When the NameNode is busy, a DataNode can time out sending a block report. 
> Then it sends the block report again. The NameNode, while processing these 
> two reports at the same time, can interleave processing of storages from 
> different reports. This screws up the blockReportId field, which makes the 
> NameNode think that some storages are zombies. Replicas from zombie storages 
> are immediately removed, causing missing blocks.





[jira] [Updated] (HDFS-10265) OEV tool fails to read edit xml file if OP_UPDATE_BLOCKS has no BLOCK tag

2016-04-18 Thread Wan Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wan Chang updated HDFS-10265:
-
Status: Open  (was: Patch Available)

> OEV tool fails to read edit xml file if OP_UPDATE_BLOCKS has no BLOCK tag
> -
>
> Key: HDFS-10265
> URL: https://issues.apache.org/jira/browse/HDFS-10265
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.7.1, 2.4.1
>Reporter: Wan Chang
>Assignee: Wan Chang
>Priority: Minor
>  Labels: patch
> Attachments: HDFS-10265-001.patch
>
>
> I use the OEV tool to convert an editlog to an xml file, then convert the xml 
> file back to a binary editlog file (so that a lower-version NameNode can load 
> edits that were generated by a higher-version NameNode). But when 
> OP_UPDATE_BLOCKS has no BLOCK tag, the OEV tool doesn't handle the case and 
> exits with InvalidXmlException.
> Here is the stack:
> {code}
> fromXml error decoding opcode null
> {{"/tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5"},
>  {"-2"}, {},
> {"3875711"}}
> Encountered exception. Exiting: no entry found for BLOCK
> org.apache.hadoop.hdfs.util.XMLUtils$InvalidXmlException: no entry found for 
> BLOCK
> at 
> org.apache.hadoop.hdfs.util.XMLUtils$Stanza.getChildren(XMLUtils.java:242)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$UpdateBlocksOp.fromXml(FSEditLogOp.java:908)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogOp.decodeXml(FSEditLogOp.java:3942)
> ...
> {code}
> Here is part of the xml file:
> {code}
> 
>   OP_UPDATE_BLOCKS
>   
> 3875711
> 
> /tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5
> 
> -2
>   
> 
> {code}
> I tracked the NN's log and found these operations:
> 0. The file 
> /tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5 is 
> very small and contains only one block.
> 1. The client asks the NN to add a block to the file.
> 2. The client fails to write to the DN and asks the NameNode to abandon the block.
> 3. The NN removes the block and writes an OP_UPDATE_BLOCKS to the editlog.
> Finally, the NN generated an OP_UPDATE_BLOCKS with no BLOCK tags.
> In FSEditLogOp$UpdateBlocksOp.fromXml, we need to handle the case above.
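
One way to picture the handling the description asks for is a lookup that 
treats a missing tag as an empty list rather than an error. The sketch below 
is self-contained and does not use the real XMLUtils.Stanza class: 
StanzaSketch, getChildrenOrEmpty, and the exception type are hypothetical 
stand-ins, and the attached patches may take a different approach.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for an XML stanza; not Hadoop's XMLUtils.Stanza. */
final class StanzaSketch {
  private final Map<String, List<String>> children = new HashMap<>();

  void addChild(String name, String value) {
    children.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
  }

  boolean hasChildren(String name) {
    return children.containsKey(name);
  }

  /** Strict lookup: fails when the tag is absent, like the reported stack trace. */
  List<String> getChildren(String name) {
    List<String> found = children.get(name);
    if (found == null) {
      throw new IllegalStateException("no entry found for " + name);
    }
    return found;
  }

  /** Tolerant lookup: an absent tag simply means zero entries. */
  List<String> getChildrenOrEmpty(String name) {
    return hasChildren(name) ? getChildren(name) : Collections.<String>emptyList();
  }
}
```

With a lookup like this, an OP_UPDATE_BLOCKS record without any BLOCK tag would 
decode to an operation with zero blocks instead of aborting the conversion.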





[jira] [Updated] (HDFS-10265) OEV tool fails to read edit xml file if OP_UPDATE_BLOCKS has no BLOCK tag

2016-04-18 Thread Wan Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wan Chang updated HDFS-10265:
-
Status: Patch Available  (was: Open)

submit patch

> OEV tool fails to read edit xml file if OP_UPDATE_BLOCKS has no BLOCK tag
> -
>
> Key: HDFS-10265
> URL: https://issues.apache.org/jira/browse/HDFS-10265
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.7.1, 2.4.1
>Reporter: Wan Chang
>Assignee: Wan Chang
>Priority: Minor
>  Labels: patch
> Attachments: HDFS-10265-001.patch, HDFS-10265-002.patch
>
>
> I use the OEV tool to convert an editlog to an xml file, then convert the xml 
> file back to a binary editlog file (so that a lower-version NameNode can load 
> edits that were generated by a higher-version NameNode). But when 
> OP_UPDATE_BLOCKS has no BLOCK tag, the OEV tool doesn't handle the case and 
> exits with InvalidXmlException.
> Here is the stack:
> {code}
> fromXml error decoding opcode null
> {{"/tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5"},
>  {"-2"}, {},
> {"3875711"}}
> Encountered exception. Exiting: no entry found for BLOCK
> org.apache.hadoop.hdfs.util.XMLUtils$InvalidXmlException: no entry found for 
> BLOCK
> at 
> org.apache.hadoop.hdfs.util.XMLUtils$Stanza.getChildren(XMLUtils.java:242)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$UpdateBlocksOp.fromXml(FSEditLogOp.java:908)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogOp.decodeXml(FSEditLogOp.java:3942)
> ...
> {code}
> Here is part of the xml file:
> {code}
> 
>   OP_UPDATE_BLOCKS
>   
> 3875711
> 
> /tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5
> 
> -2
>   
> 
> {code}
> I tracked the NN's log and found these operations:
> 0. The file 
> /tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5 is 
> very small and contains only one block.
> 1. The client asks the NN to add a block to the file.
> 2. The client fails to write to the DN and asks the NameNode to abandon the block.
> 3. The NN removes the block and writes an OP_UPDATE_BLOCKS to the editlog.
> Finally, the NN generated an OP_UPDATE_BLOCKS with no BLOCK tags.
> In FSEditLogOp$UpdateBlocksOp.fromXml, we need to handle the case above.





[jira] [Updated] (HDFS-10265) OEV tool fails to read edit xml file if OP_UPDATE_BLOCKS has no BLOCK tag

2016-04-18 Thread Wan Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wan Chang updated HDFS-10265:
-
Attachment: HDFS-10265-002.patch

> OEV tool fails to read edit xml file if OP_UPDATE_BLOCKS has no BLOCK tag
> -
>
> Key: HDFS-10265
> URL: https://issues.apache.org/jira/browse/HDFS-10265
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.4.1, 2.7.1
>Reporter: Wan Chang
>Assignee: Wan Chang
>Priority: Minor
>  Labels: patch
> Attachments: HDFS-10265-001.patch, HDFS-10265-002.patch
>
>
> I use the OEV tool to convert an editlog to an xml file, then convert the xml 
> file back to a binary editlog file (so that a lower-version NameNode can load 
> edits that were generated by a higher-version NameNode). But when 
> OP_UPDATE_BLOCKS has no BLOCK tag, the OEV tool doesn't handle the case and 
> exits with InvalidXmlException.
> Here is the stack:
> {code}
> fromXml error decoding opcode null
> {{"/tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5"},
>  {"-2"}, {},
> {"3875711"}}
> Encountered exception. Exiting: no entry found for BLOCK
> org.apache.hadoop.hdfs.util.XMLUtils$InvalidXmlException: no entry found for 
> BLOCK
> at 
> org.apache.hadoop.hdfs.util.XMLUtils$Stanza.getChildren(XMLUtils.java:242)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$UpdateBlocksOp.fromXml(FSEditLogOp.java:908)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogOp.decodeXml(FSEditLogOp.java:3942)
> ...
> {code}
> Here is part of the xml file:
> {code}
> 
>   OP_UPDATE_BLOCKS
>   
> 3875711
> 
> /tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5
> 
> -2
>   
> 
> {code}
> I tracked the NN's log and found these operations:
> 0. The file 
> /tmp/100M3/slive/data/subDir_13/subDir_7/subDir_15/subDir_11/subFile_5 is 
> very small and contains only one block.
> 1. The client asks the NN to add a block to the file.
> 2. The client fails to write to the DN and asks the NameNode to abandon the block.
> 3. The NN removes the block and writes an OP_UPDATE_BLOCKS to the editlog.
> Finally, the NN generated an OP_UPDATE_BLOCKS with no BLOCK tags.
> In FSEditLogOp$UpdateBlocksOp.fromXml, we need to handle the case above.





[jira] [Commented] (HDFS-10303) DataStreamer#ResponseProcessor calculate packet acknowledge duration wrongly.

2016-04-18 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245720#comment-15245720
 ] 

Surendra Singh Lilhore commented on HDFS-10303:
---

I am getting the "Slow ReadProcessor read" log in my cluster when I increase 
the socket timeout for the client.

{noformat}
16/04/14 17:57:59 WARN DataStreamer: Slow ReadProcessor read fields for block 
BP-873267638-192.168.100.12-1460002479721:blk_1073752739_11917 took 47858ms 
(threshold=3ms); ack: seqno: 3 reply: SUCCESS reply: SUCCESS reply: SUCCESS 
downstreamAckTimeNanos: 803180 flag: 0 flag: 0 flag: 0, targets: 
[DatanodeInfoWithStorage[192.168.100.9:25009,DS-d552bfd7-1c38-430d-8703-c3b539caf351,DISK],
 
DatanodeInfoWithStorage[192.168.100.11:25009,DS-02897c9b-bceb-4790-b08a-f711d8e3fd81,DISK],
 
DatanodeInfoWithStorage[192.168.100.10:25009,DS-fae7b497-a269-4614-afe5-7006660eafcf,DISK]]
{noformat}

But when I checked the packet send time, it was the same as the packet acknowledge time:

{noformat}
16/04/14 17:57:59 DEBUG DataStreamer: DataStreamer block 
BP-873267638-192.168.100.12-1460002479721:blk_1073752739_11917 sending packet 
packet seqno: 3 offsetInBlock: 8704 lastPacketInBlock: false 
lastByteOffsetInBlock: 12316
{noformat}




This happens because {{ResponseProcessor}} sets the current time as the begin 
time and then waits for the packet ack. After getting the ack, it calculates 
the duration and compares it with {{dfs.client.slow.io.warning.threshold.ms}}.

{code}
  // read an ack from the pipeline
  long begin = Time.monotonicNow();
  ack.readFields(blockReplyStream);
  long duration = Time.monotonicNow() - begin;
{code}

Suppose the client has sent two packets and then has no more data to write; 
after some time it gets more data and sends a third packet.

The client was idle for some time after sending the second packet. The time 
between the second packet and the third packet should not be counted by 
{{ResponseProcessor}} in the packet acknowledge duration.
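
A minimal sketch of the idea, assuming the send time of each packet is recorded 
by sequence number and looked up when its ack arrives. AckTimer and its methods 
are hypothetical and are not part of DataStreamer; the sketch is also 
single-threaded, whereas the real sender and ResponseProcessor run on 
different threads.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: measure ack latency from the packet send time. */
final class AckTimer {
  private final Map<Long, Long> sendTimeMs = new HashMap<>();

  /** Record when the packet with this sequence number was sent. */
  void packetSent(long seqno, long nowMs) {
    sendTimeMs.put(seqno, nowMs);
  }

  /** Returns ms between send and ack for this packet, or -1 if the seqno is unknown. */
  long ackReceived(long seqno, long nowMs) {
    Long sent = sendTimeMs.remove(seqno);
    return sent == null ? -1 : nowMs - sent;
  }
}
```

With this bookkeeping, idle time spent waiting for the client to produce more 
data before a packet is sent no longer counts toward that packet's ack 
duration.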


> DataStreamer#ResponseProcessor calculate packet acknowledge duration wrongly.
> -
>
> Key: HDFS-10303
> URL: https://issues.apache.org/jira/browse/HDFS-10303
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.7.2
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>
> Packets acknowledge duration should be calculated based on the packet send 
> time.





[jira] [Created] (HDFS-10303) DataStreamer#ResponseProcessor calculate packet acknowledge duration wrongly.

2016-04-18 Thread Surendra Singh Lilhore (JIRA)
Surendra Singh Lilhore created HDFS-10303:
-

 Summary: DataStreamer#ResponseProcessor calculate packet 
acknowledge duration wrongly.
 Key: HDFS-10303
 URL: https://issues.apache.org/jira/browse/HDFS-10303
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.7.2
Reporter: Surendra Singh Lilhore
Assignee: Surendra Singh Lilhore


Packets acknowledge duration should be calculated based on the packet send time.






[jira] [Commented] (HDFS-7859) Erasure Coding: Persist erasure coding policies in NameNode

2016-04-18 Thread Xinwei Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245677#comment-15245677
 ] 

Xinwei Qin  commented on HDFS-7859:
---

[~rakeshr] [~drankye], and [~zhz], thanks for your comments and clarifications.
Now is a good time to update this patch, though we should get clearer on the 
details of custom policies. I am glad to rebase the patch on the latest code 
and may attach it tomorrow.

> Erasure Coding: Persist erasure coding policies in NameNode
> ---
>
> Key: HDFS-7859
> URL: https://issues.apache.org/jira/browse/HDFS-7859
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Kai Zheng
>Assignee: Xinwei Qin 
>  Labels: BB2015-05-TBR
> Attachments: HDFS-7859-HDFS-7285.002.patch, 
> HDFS-7859-HDFS-7285.002.patch, HDFS-7859-HDFS-7285.003.patch, 
> HDFS-7859.001.patch, HDFS-7859.002.patch
>
>
> In meetup discussion with [~zhz] and [~jingzhao], it's suggested that we 
> persist EC schemas in NameNode centrally and reliably, so that EC zones can 
> reference them by name efficiently.





[jira] [Updated] (HDFS-9271) Implement basic NN operations

2016-04-18 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-9271:
--
Attachment: HDFS-9271.HDFS-8707.000.patch

First patch, half baked and not ready to use.  Uploading to keep it around 
until I have some more time to finish it up.

-Added blocking rpc calls to the protobuf stub generator
-Started adding wrappers to populate protobuf messages handed off to the rpc 
calls

> Implement basic NN operations
> -
>
> Key: HDFS-9271
> URL: https://issues.apache.org/jira/browse/HDFS-9271
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: James Clampffer
> Attachments: HDFS-9271.HDFS-8707.000.patch
>
>
> Expose via C and C++ API:
> * mkdirs
> * rename
> * delete
> * stat
> * chmod
> * chown
> * getListing
> * setOwner
> * fsync





[jira] [Commented] (HDFS-10299) libhdfs++: File length doesn't always count the last block if it's being written to

2016-04-18 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245660#comment-15245660
 ] 

James Clampffer commented on HDFS-10299:


Committed this to HDFS-8707.

We should add some more tests that include concurrency between reads and writes 
to try and catch this stuff sooner.

> libhdfs++: File length doesn't always count the last block if it's being 
> written to
> ---
>
> Key: HDFS-10299
> URL: https://issues.apache.org/jira/browse/HDFS-10299
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: Xiaowei Zhu
> Attachments: HDFS-10299.HDFS-8707.000.patch
>
>
> It looks like we aren't factoring in the last block of files that are being 
> written to or haven't been closed yet into the length of the file.
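
The length calculation described above can be sketched roughly as follows. This is only an illustration, written in Java to match the other snippets in this archive rather than the C++ of libhdfs++; `BlockInfo`, `FileLengthSketch`, and `fileLength` are hypothetical names, not taken from the patch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a located block; not the libhdfs++ type.
final class BlockInfo {
  final long numBytes;

  BlockInfo(long numBytes) {
    this.numBytes = numBytes;
  }
}

final class FileLengthSketch {
  // Sums the completed blocks and, if present, the last block that is still
  // being written, so that an open file does not report a short length.
  static long fileLength(List<BlockInfo> completedBlocks,
                         BlockInfo lastBlockUnderConstruction) {
    long length = 0;
    for (BlockInfo block : completedBlocks) {
      length += block.numBytes;
    }
    if (lastBlockUnderConstruction != null) {
      length += lastBlockUnderConstruction.numBytes;
    }
    return length;
  }

  public static void main(String[] args) {
    List<BlockInfo> completed =
        Arrays.asList(new BlockInfo(128L), new BlockInfo(128L));
    System.out.println(fileLength(completed, new BlockInfo(40L)));
  }
}
```

Passing `null` for the last block models a closed file, where only the completed blocks count.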





[jira] [Updated] (HDFS-10299) libhdfs++: File length doesn't always count the last block if it's being written to

2016-04-18 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-10299:
---
Summary: libhdfs++: File length doesn't always count the last block if it's 
being written to  (was: libhdfs++: File length doesn't always going the last 
block if it's being written to)

> libhdfs++: File length doesn't always count the last block if it's being 
> written to
> ---
>
> Key: HDFS-10299
> URL: https://issues.apache.org/jira/browse/HDFS-10299
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: Xiaowei Zhu
> Attachments: HDFS-10299.HDFS-8707.000.patch
>
>
> It looks like we aren't factoring in the last block of files that are being 
> written to or haven't been closed yet into the length of the file.





[jira] [Commented] (HDFS-10275) TestDataNodeMetrics failing intermittently due to TotalWriteTime counted incorrectly

2016-04-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245624#comment-15245624
 ] 

Hudson commented on HDFS-10275:
---

FAILURE: Integrated in Hadoop-trunk-Commit #9626 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9626/])
HDFS-10275. TestDataNodeMetrics failing intermittently due to (waltersu4549: 
rev ab903029a9d353677184ff5602966b11ffb408b9)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeMetrics.java


> TestDataNodeMetrics failing intermittently due to TotalWriteTime counted 
> incorrectly
> 
>
> Key: HDFS-10275
> URL: https://issues.apache.org/jira/browse/HDFS-10275
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
> Fix For: 2.7.3
>
> Attachments: HDFS-10275.001.patch
>
>
> The unit test {{TestDataNodeMetrics}} fails intermittently. The failure 
> output is:
> {code}
> Results :
> Failed tests: 
>   
> TestDataNodeVolumeFailureToleration.testVolumeAndTolerableConfiguration:195->testVolumeConfig:232
>  expected: but was:
> Tests in error: 
>   TestOpenFilesWithSnapshot.testWithCheckpoint:94 ? IO Timed out waiting for 
> Min...
>   TestDataNodeMetrics.testDataNodeTimeSpend:279 ? Timeout Timed out waiting 
> for ...
>   TestHFlush.testHFlushInterrupted ? IO The stream is closed
> {code}
> The timeout occurs at line 279 of {{TestDataNodeMetrics}}. Looking into the 
> code, I found the real reason: the {{TotalWriteTime}} metric frequently 
> stays at 0 in each file-creation iteration, so the test keeps retrying until 
> it times out.
> I debugged the test locally. The most likely reason {{TotalWriteTime}} is 
> always 0 is that the time-spend test uses {{SimulatedFSDataset}}. There, the 
> write time is measured around the inner class's method 
> {{SimulatedOutputStream#write}}, which just updates {{length}} and throws 
> the data away.
> {code}
> @Override
> public void write(byte[] b,
>   int off,
>   int len) throws IOException  {
>   length += len;
> }
> {code} 
> So the write operation takes hardly any time. We should therefore create 
> files through a real dataset instead of the simulated one. In my local 
> testing, the test passed on the first attempt once I removed the simulated 
> dataset, whereas with the old approach it retried many times to accumulate 
> write time.
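
To make the point above concrete, here is a minimal sketch of why a write that only bumps a counter tends to register as 0 on a millisecond clock. `DiscardingOutputStream` and `WriteTimeSketch` are hypothetical stand-ins modeled on the quoted `write` method, and the millisecond granularity of the metric is an assumption of this sketch, not something stated in the issue.

```java
import java.io.OutputStream;

// Hypothetical stand-in modeled on the write() quoted above: it only tracks
// the length and discards the data.
final class DiscardingOutputStream extends OutputStream {
  long length = 0;

  @Override
  public void write(int b) {
    length++;
  }

  @Override
  public void write(byte[] b, int off, int len) {
    length += len;
  }
}

final class WriteTimeSketch {
  // Times a single write and truncates to whole milliseconds, the way a
  // millisecond-based counter would see it (assumed granularity).
  static long timedWriteMillis(DiscardingOutputStream out, byte[] data) {
    long start = System.nanoTime();
    out.write(data, 0, data.length);
    return (System.nanoTime() - start) / 1_000_000L;
  }

  public static void main(String[] args) {
    DiscardingOutputStream out = new DiscardingOutputStream();
    long totalMillis = 0;
    for (int i = 0; i < 100; i++) {
      totalMillis += timedWriteMillis(out, new byte[4096]);
    }
    System.out.println("bytes=" + out.length + " totalWriteMillis=" + totalMillis);
  }
}
```

On most machines the accumulated total stays at or near zero even after many writes, which matches the retry-until-timeout behaviour described in the issue.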





[jira] [Commented] (HDFS-10302) BlockPlacementPolicyDefault should use default replication considerload value

2016-04-18 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245623#comment-15245623
 ] 

Hudson commented on HDFS-10302:
---

FAILURE: Integrated in Hadoop-trunk-Commit #9626 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9626/])
HDFS-10302. BlockPlacementPolicyDefault should use default replication (kihwal: 
rev d8b729e16fb253e6c84f414d419b5663d9219a43)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java


> BlockPlacementPolicyDefault should use default replication considerload value
> -
>
> Key: HDFS-10302
> URL: https://issues.apache.org/jira/browse/HDFS-10302
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
>Priority: Trivial
> Fix For: 2.8.0
>
> Attachments: HDFS-10302.001.patch
>
>
> The method {{BlockPlacementPolicyDefault#initialize}} currently hardcodes 
> {{true}} as the default for the replication considerLoad setting instead of 
> using the existing constant 
> {{DFS_NAMENODE_REPLICATION_CONSIDERLOAD_DEFAULT}}.
> {code}
>   @Override
>   public void initialize(Configuration conf,  FSClusterStats stats,
>  NetworkTopology clusterMap, 
>  Host2NodesMap host2datanodeMap) {
> this.considerLoad = conf.getBoolean(
> DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_KEY, true);
> this.considerLoadFactor = conf.getDouble(
> DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_FACTOR,
> DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_FACTOR_DEFAULT);
> this.stats = stats;
> this.clusterMap = clusterMap;
> this.host2datanodeMap = host2datanodeMap;
> this.heartbeatInterval = conf.getLong(
> DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY,
> DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_DEFAULT) * 1000;
> this.tolerateHeartbeatMultiplier = conf.getInt(
> DFSConfigKeys.DFS_NAMENODE_TOLERATE_HEARTBEAT_MULTIPLIER_KEY,
> DFSConfigKeys.DFS_NAMENODE_TOLERATE_HEARTBEAT_MULTIPLIER_DEFAULT);
> this.staleInterval = conf.getLong(
> DFSConfigKeys.DFS_NAMENODE_STALE_DATANODE_INTERVAL_KEY, 
> DFSConfigKeys.DFS_NAMENODE_STALE_DATANODE_INTERVAL_DEFAULT);
> this.preferLocalNode = conf.getBoolean(
> DFSConfigKeys.
> DFS_NAMENODE_BLOCKPLACEMENTPOLICY_DEFAULT_PREFER_LOCAL_NODE_KEY,
> DFSConfigKeys.
> DFS_NAMENODE_BLOCKPLACEMENTPOLICY_DEFAULT_PREFER_LOCAL_NODE_DEFAULT);
>   }
> {code}
> As a result, the constant {{DFS_NAMENODE_REPLICATION_CONSIDERLOAD_DEFAULT}} 
> is currently not used anywhere.
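
The change suggested above amounts to passing the named default instead of a literal. A minimal sketch follows, using hypothetical stand-ins (`SketchConf`, `SketchKeys`) for Hadoop's `Configuration` and `DFSConfigKeys` so that it compiles without Hadoop on the classpath; the key string shown is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, minimal stand-in for Hadoop's Configuration.
final class SketchConf {
  private final Map<String, String> values = new HashMap<>();

  void set(String key, String value) {
    values.put(key, value);
  }

  boolean getBoolean(String key, boolean defaultValue) {
    String value = values.get(key);
    return value == null ? defaultValue : Boolean.parseBoolean(value);
  }
}

// Hypothetical stand-in for DFSConfigKeys; key string and default value are
// assumptions made for this sketch.
final class SketchKeys {
  static final String CONSIDERLOAD_KEY = "dfs.namenode.replication.considerLoad";
  static final boolean CONSIDERLOAD_DEFAULT = true;
}

final class ConsiderLoadSketch {
  // Reads the setting with the named default rather than a literal true, so
  // a later change to the constant takes effect here as well.
  static boolean readConsiderLoad(SketchConf conf) {
    return conf.getBoolean(SketchKeys.CONSIDERLOAD_KEY,
        SketchKeys.CONSIDERLOAD_DEFAULT);
  }
}
```

With the literal, changing the constant would silently have no effect on this code path; with the named default, both stay in sync.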





[jira] [Updated] (HDFS-10302) BlockPlacementPolicyDefault should use default replication considerload value

2016-04-18 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-10302:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

> BlockPlacementPolicyDefault should use default replication considerload value
> -
>
> Key: HDFS-10302
> URL: https://issues.apache.org/jira/browse/HDFS-10302
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
>Priority: Trivial
> Fix For: 2.8.0
>
> Attachments: HDFS-10302.001.patch
>
>
> The method {{BlockPlacementPolicyDefault#initialize}} currently hardcodes 
> {{true}} as the default for the replication considerLoad setting instead of 
> using the existing constant 
> {{DFS_NAMENODE_REPLICATION_CONSIDERLOAD_DEFAULT}}.
> {code}
>   @Override
>   public void initialize(Configuration conf,  FSClusterStats stats,
>  NetworkTopology clusterMap, 
>  Host2NodesMap host2datanodeMap) {
> this.considerLoad = conf.getBoolean(
> DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_KEY, true);
> this.considerLoadFactor = conf.getDouble(
> DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_FACTOR,
> DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_FACTOR_DEFAULT);
> this.stats = stats;
> this.clusterMap = clusterMap;
> this.host2datanodeMap = host2datanodeMap;
> this.heartbeatInterval = conf.getLong(
> DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY,
> DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_DEFAULT) * 1000;
> this.tolerateHeartbeatMultiplier = conf.getInt(
> DFSConfigKeys.DFS_NAMENODE_TOLERATE_HEARTBEAT_MULTIPLIER_KEY,
> DFSConfigKeys.DFS_NAMENODE_TOLERATE_HEARTBEAT_MULTIPLIER_DEFAULT);
> this.staleInterval = conf.getLong(
> DFSConfigKeys.DFS_NAMENODE_STALE_DATANODE_INTERVAL_KEY, 
> DFSConfigKeys.DFS_NAMENODE_STALE_DATANODE_INTERVAL_DEFAULT);
> this.preferLocalNode = conf.getBoolean(
> DFSConfigKeys.
> DFS_NAMENODE_BLOCKPLACEMENTPOLICY_DEFAULT_PREFER_LOCAL_NODE_KEY,
> DFSConfigKeys.
> DFS_NAMENODE_BLOCKPLACEMENTPOLICY_DEFAULT_PREFER_LOCAL_NODE_DEFAULT);
>   }
> {code}
> As a result, the constant {{DFS_NAMENODE_REPLICATION_CONSIDERLOAD_DEFAULT}} 
> is currently not used anywhere.





[jira] [Commented] (HDFS-10302) BlockPlacementPolicyDefault should use default replication considerload value

2016-04-18 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245614#comment-15245614
 ] 

Kihwal Lee commented on HDFS-10302:
---

Committed this to trunk, branch-2 and branch-2.8. Thanks for fixing this, 
[~linyiqun].

> BlockPlacementPolicyDefault should use default replication considerload value
> -
>
> Key: HDFS-10302
> URL: https://issues.apache.org/jira/browse/HDFS-10302
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
>Priority: Trivial
> Attachments: HDFS-10302.001.patch
>
>
> The method {{BlockPlacementPolicyDefault#initialize}} currently hardcodes 
> {{true}} as the default for the replication considerLoad setting instead of 
> using the existing constant 
> {{DFS_NAMENODE_REPLICATION_CONSIDERLOAD_DEFAULT}}.
> {code}
>   @Override
>   public void initialize(Configuration conf,  FSClusterStats stats,
>  NetworkTopology clusterMap, 
>  Host2NodesMap host2datanodeMap) {
> this.considerLoad = conf.getBoolean(
> DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_KEY, true);
> this.considerLoadFactor = conf.getDouble(
> DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_FACTOR,
> DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_FACTOR_DEFAULT);
> this.stats = stats;
> this.clusterMap = clusterMap;
> this.host2datanodeMap = host2datanodeMap;
> this.heartbeatInterval = conf.getLong(
> DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY,
> DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_DEFAULT) * 1000;
> this.tolerateHeartbeatMultiplier = conf.getInt(
> DFSConfigKeys.DFS_NAMENODE_TOLERATE_HEARTBEAT_MULTIPLIER_KEY,
> DFSConfigKeys.DFS_NAMENODE_TOLERATE_HEARTBEAT_MULTIPLIER_DEFAULT);
> this.staleInterval = conf.getLong(
> DFSConfigKeys.DFS_NAMENODE_STALE_DATANODE_INTERVAL_KEY, 
> DFSConfigKeys.DFS_NAMENODE_STALE_DATANODE_INTERVAL_DEFAULT);
> this.preferLocalNode = conf.getBoolean(
> DFSConfigKeys.
> DFS_NAMENODE_BLOCKPLACEMENTPOLICY_DEFAULT_PREFER_LOCAL_NODE_KEY,
> DFSConfigKeys.
> DFS_NAMENODE_BLOCKPLACEMENTPOLICY_DEFAULT_PREFER_LOCAL_NODE_DEFAULT);
>   }
> {code}
> As a result, the constant {{DFS_NAMENODE_REPLICATION_CONSIDERLOAD_DEFAULT}} 
> is currently not used anywhere.





[jira] [Commented] (HDFS-10302) BlockPlacementPolicyDefault should use default replication considerload value

2016-04-18 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245607#comment-15245607
 ] 

Kihwal Lee commented on HDFS-10302:
---

+1

> BlockPlacementPolicyDefault should use default replication considerload value
> -
>
> Key: HDFS-10302
> URL: https://issues.apache.org/jira/browse/HDFS-10302
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
>Priority: Trivial
> Attachments: HDFS-10302.001.patch
>
>
> The method {{BlockPlacementPolicyDefault#initialize}} currently hardcodes 
> {{true}} as the default for the replication considerLoad setting instead of 
> using the existing constant 
> {{DFS_NAMENODE_REPLICATION_CONSIDERLOAD_DEFAULT}}.
> {code}
>   @Override
>   public void initialize(Configuration conf,  FSClusterStats stats,
>  NetworkTopology clusterMap, 
>  Host2NodesMap host2datanodeMap) {
> this.considerLoad = conf.getBoolean(
> DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_KEY, true);
> this.considerLoadFactor = conf.getDouble(
> DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_FACTOR,
> DFSConfigKeys.DFS_NAMENODE_REPLICATION_CONSIDERLOAD_FACTOR_DEFAULT);
> this.stats = stats;
> this.clusterMap = clusterMap;
> this.host2datanodeMap = host2datanodeMap;
> this.heartbeatInterval = conf.getLong(
> DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY,
> DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_DEFAULT) * 1000;
> this.tolerateHeartbeatMultiplier = conf.getInt(
> DFSConfigKeys.DFS_NAMENODE_TOLERATE_HEARTBEAT_MULTIPLIER_KEY,
> DFSConfigKeys.DFS_NAMENODE_TOLERATE_HEARTBEAT_MULTIPLIER_DEFAULT);
> this.staleInterval = conf.getLong(
> DFSConfigKeys.DFS_NAMENODE_STALE_DATANODE_INTERVAL_KEY, 
> DFSConfigKeys.DFS_NAMENODE_STALE_DATANODE_INTERVAL_DEFAULT);
> this.preferLocalNode = conf.getBoolean(
> DFSConfigKeys.
> DFS_NAMENODE_BLOCKPLACEMENTPOLICY_DEFAULT_PREFER_LOCAL_NODE_KEY,
> DFSConfigKeys.
> DFS_NAMENODE_BLOCKPLACEMENTPOLICY_DEFAULT_PREFER_LOCAL_NODE_DEFAULT);
>   }
> {code}
> As a result, the constant {{DFS_NAMENODE_REPLICATION_CONSIDERLOAD_DEFAULT}} 
> is currently not used anywhere.





[jira] [Commented] (HDFS-10264) Logging improvements in FSImageFormatProtobuf.Saver

2016-04-18 Thread Andras Bokor (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245601#comment-15245601
 ] 

Andras Bokor commented on HDFS-10264:
-

[~shv]
Since the migration from Log4j/Commons Logging to SLF4J is gradually in 
progress, I suggest using the following format:
{code}
LOG.info("Saving image file {} using {}.", newFile, compression);
{code}
{code}
LOG.info("Image file {} of size {} bytes saved in {} seconds.", newFile, 
newFile.length(), (now() - startTime)/1000);
{code}

Also, the type of the LOG variable needs to be changed.

What do you think?

> Logging improvements in FSImageFormatProtobuf.Saver
> ---
>
> Key: HDFS-10264
> URL: https://issues.apache.org/jira/browse/HDFS-10264
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.0
>Reporter: Konstantin Shvachko
>Assignee: Xiaobing Zhou
>  Labels: newbie
>
> There are two LOG messages in {{FSImageFormat.Saver}} that are missing in 
> {{FSImageFormatProtobuf.Saver}}; they mark the start and end of fsimage 
> saving. It would be good to have them logged for protobuf images as well.





[jira] [Commented] (HDFS-10276) Different results for exist call for file.ext/name

2016-04-18 Thread Kevin Cox (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245590#comment-15245590
 ] 

Kevin Cox commented on HDFS-10276:
--

Oops, my mistake. That chmod should be 666 (non-executable), which causes 
the `cat` to throw the exception. The new log is below.

{code}
% hdfs --config starscream/hadoop dfs -put <(echo test) /test
2016-04-18 08:15:41,615 WARN  [main] util.NativeCodeLoader 
(NativeCodeLoader.java:(62)) - Unable to load native-hadoop library for 
your platform... using builtin-java classes where applicable
hdfs --config starscream/hadoop dfs -put <(echo test) /test  2.72s user 0.18s 
system 128% cpu 2.269 total
% hdfs --config starscream/hadoop dfs -chmod 666 /test   
2016-04-18 08:16:52,903 WARN  [main] util.NativeCodeLoader 
(NativeCodeLoader.java:(62)) - Unable to load native-hadoop library for 
your platform... using builtin-java classes where applicable
hdfs --config starscream/hadoop dfs -chmod 666 /test  2.37s user 0.16s system 
182% cpu 1.390 total
% hdfs --config starscream/hadoop dfs -cat /test/bar
2016-04-18 08:16:55,743 WARN  [main] util.NativeCodeLoader 
(NativeCodeLoader.java:(62)) - Unable to load native-hadoop library for 
your platform... using builtin-java classes where applicable
cat: Permission denied: user=foo, access=EXECUTE, 
inode="/test/bar":kevincox:supergroup:-rw-rw-rw-
HADOOP_USER_NAME=foo hdfs --config starscream/hadoop dfs -cat /test/bar  2.40s 
user 0.16s system 185% cpu 1.378 total
{code}

> Different results for exist call for file.ext/name
> --
>
> Key: HDFS-10276
> URL: https://issues.apache.org/jira/browse/HDFS-10276
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kevin Cox
>Assignee: Yuanbo Liu
>
> Given a file {{/file}}, an existence check for the path 
> {{/file/whatever}} gives different responses for different 
> implementations of FileSystem.
> LocalFileSystem will return false while DistributedFileSystem will throw 
> {{org.apache.hadoop.security.AccessControlException: Permission denied: ..., 
> access=EXECUTE, ...}}
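
The local-filesystem side of this can be reproduced with a small sketch. It uses `java.nio.file.Files` as an analogue and not Hadoop's `LocalFileSystem`, so it only illustrates the "return false" behaviour under that assumption; `ExistsSketch` and `childOfRegularFileExists` are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class ExistsSketch {
  // Creates a regular file and checks a path "underneath" it, mirroring the
  // /file/whatever case with java.nio instead of Hadoop's FileSystem API.
  static boolean childOfRegularFileExists() {
    try {
      Path file = Files.createTempFile("exists-sketch", ".tmp");
      try {
        return Files.exists(file.resolve("whatever"));
      } finally {
        Files.delete(file);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(childOfRegularFileExists());
  }
}
```

Here the lookup failure is reported as "does not exist" rather than surfaced as an exception, which is the behaviour the issue contrasts with DistributedFileSystem.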





[jira] [Updated] (HDFS-10275) TestDataNodeMetrics failing intermittently due to TotalWriteTime counted incorrectly

2016-04-18 Thread Walter Su (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Walter Su updated HDFS-10275:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.7.3
   Status: Resolved  (was: Patch Available)

Committed to trunk, branch-2, branch-2.8, branch-2.7. Thanks [~linyiqun] for 
the contribution!

> TestDataNodeMetrics failing intermittently due to TotalWriteTime counted 
> incorrectly
> 
>
> Key: HDFS-10275
> URL: https://issues.apache.org/jira/browse/HDFS-10275
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
> Fix For: 2.7.3
>
> Attachments: HDFS-10275.001.patch
>
>
> The unit test {{TestDataNodeMetrics}} fails intermittently. The failure 
> output is:
> {code}
> Results :
> Failed tests: 
>   
> TestDataNodeVolumeFailureToleration.testVolumeAndTolerableConfiguration:195->testVolumeConfig:232
>  expected: but was:
> Tests in error: 
>   TestOpenFilesWithSnapshot.testWithCheckpoint:94 ? IO Timed out waiting for 
> Min...
>   TestDataNodeMetrics.testDataNodeTimeSpend:279 ? Timeout Timed out waiting 
> for ...
>   TestHFlush.testHFlushInterrupted ? IO The stream is closed
> {code}
> The timeout occurs at line 279 of {{TestDataNodeMetrics}}. Looking into the 
> code, I found the real reason: the {{TotalWriteTime}} metric frequently 
> stays at 0 in each file-creation iteration, so the test keeps retrying until 
> it times out.
> I debugged the test locally. The most likely reason {{TotalWriteTime}} is 
> always 0 is that the time-spend test uses {{SimulatedFSDataset}}. There, the 
> write time is measured around the inner class's method 
> {{SimulatedOutputStream#write}}, which just updates {{length}} and throws 
> the data away.
> {code}
> @Override
> public void write(byte[] b,
>   int off,
>   int len) throws IOException  {
>   length += len;
> }
> {code} 
> So the write operation takes hardly any time. We should therefore create 
> files through a real dataset instead of the simulated one. In my local 
> testing, the test passed on the first attempt once I removed the simulated 
> dataset, whereas with the old approach it retried many times to accumulate 
> write time.





[jira] [Commented] (HDFS-10275) TestDataNodeMetrics failing intermittently due to TotalWriteTime counted incorrectly

2016-04-18 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245569#comment-15245569
 ] 

Walter Su commented on HDFS-10275:
--

Sorry, I didn't see that. The patch LGTM. +1.

> TestDataNodeMetrics failing intermittently due to TotalWriteTime counted 
> incorrectly
> 
>
> Key: HDFS-10275
> URL: https://issues.apache.org/jira/browse/HDFS-10275
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
> Attachments: HDFS-10275.001.patch
>
>
> The unit test {{TestDataNodeMetrics}} fails intermittently. The failure 
> output is:
> {code}
> Results :
> Failed tests: 
>   
> TestDataNodeVolumeFailureToleration.testVolumeAndTolerableConfiguration:195->testVolumeConfig:232
>  expected: but was:
> Tests in error: 
>   TestOpenFilesWithSnapshot.testWithCheckpoint:94 ? IO Timed out waiting for 
> Min...
>   TestDataNodeMetrics.testDataNodeTimeSpend:279 ? Timeout Timed out waiting 
> for ...
>   TestHFlush.testHFlushInterrupted ? IO The stream is closed
> {code}
> The timeout occurs at line 279 of {{TestDataNodeMetrics}}. Looking into the 
> code, I found the real reason: the {{TotalWriteTime}} metric frequently 
> stays at 0 in each file-creation iteration, so the test keeps retrying until 
> it times out.
> I debugged the test locally. The most likely reason {{TotalWriteTime}} is 
> always 0 is that the time-spend test uses {{SimulatedFSDataset}}. There, the 
> write time is measured around the inner class's method 
> {{SimulatedOutputStream#write}}, which just updates {{length}} and throws 
> the data away.
> {code}
> @Override
> public void write(byte[] b,
>   int off,
>   int len) throws IOException  {
>   length += len;
> }
> {code} 
> So the write operation takes hardly any time. We should therefore create 
> files through a real dataset instead of the simulated one. In my local 
> testing, the test passed on the first attempt once I removed the simulated 
> dataset, whereas with the old approach it retried many times to accumulate 
> write time.





[jira] [Commented] (HDFS-10299) libhdfs++: File length doesn't always going the last block if it's being written to

2016-04-18 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245552#comment-15245552
 ] 

James Clampffer commented on HDFS-10299:


This looks good to me, thanks for the fix Xiaowei.  I'll commit it momentarily.

> libhdfs++: File length doesn't always going the last block if it's being 
> written to
> ---
>
> Key: HDFS-10299
> URL: https://issues.apache.org/jira/browse/HDFS-10299
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: Xiaowei Zhu
> Attachments: HDFS-10299.HDFS-8707.000.patch
>
>
> It looks like we aren't factoring in the last block of files that are being 
> written to or haven't been closed yet into the length of the file.





[jira] [Commented] (HDFS-8449) Add tasks count metrics to datanode for ECWorker

2016-04-18 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245534#comment-15245534
 ] 

Kai Zheng commented on HDFS-8449:
-

Thanks, Bo, for this work on metrics. I will look at it.

> Add tasks count metrics to datanode for ECWorker
> 
>
> Key: HDFS-8449
> URL: https://issues.apache.org/jira/browse/HDFS-8449
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Li Bo
>Assignee: Li Bo
> Attachments: HDFS-8449-000.patch, HDFS-8449-001.patch, 
> HDFS-8449-002.patch, HDFS-8449-003.patch, HDFS-8449-004.patch
>
>
> This sub-task records the EC recovery tasks that a datanode has performed, 
> including total, failed, and successful tasks.





[jira] [Commented] (HDFS-10275) TestDataNodeMetrics failing intermittently due to TotalWriteTime counted incorrectly

2016-04-18 Thread Lin Yiqun (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245521#comment-15245521
 ] 

Lin Yiqun commented on HDFS-10275:
--

Hi [~walter.k.su], I have already removed 
{{SimulatedFSDataset.setFactory(conf);}} in my patch. Do you mean there is no 
need to additionally bump the timeout?

> TestDataNodeMetrics failing intermittently due to TotalWriteTime counted 
> incorrectly
> 
>
> Key: HDFS-10275
> URL: https://issues.apache.org/jira/browse/HDFS-10275
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Lin Yiqun
>Assignee: Lin Yiqun
> Attachments: HDFS-10275.001.patch
>
>
> The unit test {{TestDataNodeMetrics}} fails intermittently. The failure 
> output is:
> {code}
> Results :
> Failed tests: 
>   
> TestDataNodeVolumeFailureToleration.testVolumeAndTolerableConfiguration:195->testVolumeConfig:232
>  expected: but was:
> Tests in error: 
>   TestOpenFilesWithSnapshot.testWithCheckpoint:94 ? IO Timed out waiting for 
> Min...
>   TestDataNodeMetrics.testDataNodeTimeSpend:279 ? Timeout Timed out waiting 
> for ...
>   TestHFlush.testHFlushInterrupted ? IO The stream is closed
> {code}
> The timeout occurs at line 279 of {{TestDataNodeMetrics}}. Looking into the 
> code, I found the real reason: the {{TotalWriteTime}} metric frequently 
> stays at 0 in each file-creation iteration, so the test keeps retrying until 
> it times out.
> I debugged the test locally. The most likely reason {{TotalWriteTime}} is 
> always 0 is that the time-spend test uses {{SimulatedFSDataset}}. There, the 
> write time is measured around the inner class's method 
> {{SimulatedOutputStream#write}}, which just updates {{length}} and throws 
> the data away.
> {code}
> @Override
> public void write(byte[] b,
>   int off,
>   int len) throws IOException  {
>   length += len;
> }
> {code} 
> So the write operation takes hardly any time. We should therefore create 
> files through a real dataset instead of the simulated one. In my local 
> testing, the test passed on the first attempt once I removed the simulated 
> dataset, whereas with the old approach it retried many times to accumulate 
> write time.





[jira] [Commented] (HDFS-8449) Add tasks count metrics to datanode for ECWorker

2016-04-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245457#comment-15245457
 ] 

Hadoop QA commented on HDFS-8449:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 15m 25s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 4s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 12s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 3s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 31s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 44s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 51s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 6s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 8s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 8s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 58s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 26s {color} | {color:red} hadoop-hdfs-project/hadoop-hdfs: patch generated 2 new + 62 unchanged - 0 fixed = 64 total (was 62) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 48s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 46s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 98m 5s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 94m 27s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 37s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 246m 15s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.server.namenode.TestEditLog |
|   | hadoop.hdfs.TestRollingUpgrade |
|   | hadoop.hdfs.server.datanode.TestDataNodeUUID |
|   | hadoop.hdfs.shortcircuit.TestShortCircuitLocalRead |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | hadoop.hdfs.TestFileAppend |
|   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
|   | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | hadoop.hdfs.TestSafeModeWithStripedFile |
| JDK v1.7.0_95 Failed junit tests | 

[jira] [Commented] (HDFS-10275) TestDataNodeMetrics failing intermittently due to TotalWriteTime counted incorrectly

2016-04-18 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245394#comment-15245394
 ] 

Walter Su commented on HDFS-10275:
--

Good analysis! I think a better way to do this is to use a real FSDataset. Just 
remove {{SimulatedFSDataset.setFactory(conf);}}. What do you think?

> TestDataNodeMetrics failing intermittently due to TotalWriteTime counted 
> incorrectly
> 
>
> Key: HDFS-10275
> URL: https://issues.apache.org/jira/browse/HDFS-10275
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
> Reporter: Lin Yiqun
> Assignee: Lin Yiqun
> Attachments: HDFS-10275.001.patch
>
>
> The unit test {{TestDataNodeMetrics}} fails intermittently. The failure 
> output is:
> {code}
> Results :
> Failed tests: 
>   
> TestDataNodeVolumeFailureToleration.testVolumeAndTolerableConfiguration:195->testVolumeConfig:232
>  expected: but was:
> Tests in error: 
>   TestOpenFilesWithSnapshot.testWithCheckpoint:94 ? IO Timed out waiting for 
> Min...
>   TestDataNodeMetrics.testDataNodeTimeSpend:279 ? Timeout Timed out waiting 
> for ...
>   TestHFlush.testHFlushInterrupted ? IO The stream is closed
> {code}
> The timeout occurs at line 279 of {{TestDataNodeMetrics}}. I looked into the 
> code and found the real cause: the {{TotalWriteTime}} metric is frequently 0 
> after each file-creation iteration, which makes the test retry until it 
> times out.
> I debugged the test locally. The most likely reason the {{TotalWriteTime}} 
> metric stays at 0 is that the test uses {{SimulatedFSDataset}} to measure 
> time spent. With {{SimulatedFSDataset}}, the write time is measured around 
> the inner class's method {{SimulatedOutputStream#write}}, and that method 
> just updates {{length}} and throws its data away.
> {code}
> @Override
> public void write(byte[] b,
>   int off,
>   int len) throws IOException  {
>   length += len;
> }
> {code} 
> So the write operation costs almost no time. We should create the file 
> through a real dataset instead of the simulated one. I tested locally: with 
> the simulated dataset removed, the test passes on the first attempt, while 
> the old way retries many times before any write time is counted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10291) TestShortCircuitLocalRead failing

2016-04-18 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HDFS-10291:
--
   Resolution: Fixed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

Thanks, the patch is in.

> TestShortCircuitLocalRead failing
> -
>
> Key: HDFS-10291
> URL: https://issues.apache.org/jira/browse/HDFS-10291
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
> Affects Versions: 2.8.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Fix For: 2.8.0
>
> Attachments: HDFS-10291-001.patch
>
>
> {{TestShortCircuitLocalRead}} is failing because the length of the read is 
> considered to run off the end of the buffer.




