[jira] [Updated] (HDDS-262) Send SCM healthy and failed volumes in the heartbeat

2018-07-23 Thread Nanda kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nanda kumar updated HDDS-262:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Send SCM healthy and failed volumes in the heartbeat
> 
>
> Key: HDDS-262
> URL: https://issues.apache.org/jira/browse/HDDS-262
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-262.00.patch, HDDS-262.01.patch, HDDS-262.02.patch, 
> HDDS-262.03.patch, HDDS-262.04.patch
>
>
> The current code only sends volumes that were successfully created during 
> datanode startup. For any volume where an error occurred during HddsVolume 
> object creation, we should move that volume to the failedVolume map. This 
> should be sent to SCM as part of the NodeReport.
>  
>  
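A minimal sketch of the requested behavior, with hypothetical names (the real 
classes are HddsVolume/VolumeSet; this is not the actual patch): volumes whose 
creation fails are kept in a failed-volume map instead of being dropped, and 
the node report covers both sets.

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class VolumeSetSketch {
  private final Map<String, String> healthyVolumes = new HashMap<>();
  // Volumes whose creation failed, keyed by path with the failure reason.
  private final Map<String, String> failedVolumes = new HashMap<>();

  void initialize(List<String> volumeRoots) {
    for (String root : volumeRoots) {
      try {
        createVolume(root);                // may throw on a bad disk, permissions, etc.
        healthyVolumes.put(root, "HEALTHY");
      } catch (IOException e) {
        // Instead of silently dropping the volume, remember the failure.
        failedVolumes.put(root, e.getMessage());
      }
    }
  }

  // The heartbeat's NodeReport should carry healthy and failed volumes alike.
  List<String> getNodeReport() {
    List<String> report = new ArrayList<>();
    healthyVolumes.keySet().forEach(v -> report.add(v + " HEALTHY"));
    failedVolumes.keySet().forEach(v -> report.add(v + " FAILED"));
    return report;
  }

  private void createVolume(String root) throws IOException {
    // Stand-in for HddsVolume construction (directory checks, version file, ...).
  }
}
{code}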






[jira] [Commented] (HDDS-262) Send SCM healthy and failed volumes in the heartbeat

2018-07-23 Thread Nanda kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553883#comment-16553883
 ] 

Nanda kumar commented on HDDS-262:
--

Thanks to [~bharatviswa] for the contribution and to [~ajayydv] & [~xyao] for 
the review. I have committed this to trunk.

> Send SCM healthy and failed volumes in the heartbeat
> 
>
> Key: HDDS-262
> URL: https://issues.apache.org/jira/browse/HDDS-262
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-262.00.patch, HDDS-262.01.patch, HDDS-262.02.patch, 
> HDDS-262.03.patch, HDDS-262.04.patch
>
>
> The current code only sends volumes that were successfully created during 
> datanode startup. For any volume where an error occurred during HddsVolume 
> object creation, we should move that volume to the failedVolume map. This 
> should be sent to SCM as part of the NodeReport.
>  
>  






[jira] [Commented] (HDDS-262) Send SCM healthy and failed volumes in the heartbeat

2018-07-23 Thread Nanda kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553860#comment-16553860
 ] 

Nanda kumar commented on HDDS-262:
--

+1, LGTM. I will commit this shortly.

> Send SCM healthy and failed volumes in the heartbeat
> 
>
> Key: HDDS-262
> URL: https://issues.apache.org/jira/browse/HDDS-262
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-262.00.patch, HDDS-262.01.patch, HDDS-262.02.patch, 
> HDDS-262.03.patch, HDDS-262.04.patch
>
>
> The current code only sends volumes that were successfully created during 
> datanode startup. For any volume where an error occurred during HddsVolume 
> object creation, we should move that volume to the failedVolume map. This 
> should be sent to SCM as part of the NodeReport.
>  
>  






[jira] [Commented] (HDDS-285) Create a generic Metadata Iterator

2018-07-23 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553855#comment-16553855
 ] 

genericqa commented on HDDS-285:


| (/) +1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 34s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 30m 17s | trunk passed |
| +1 | compile | 0m 30s | trunk passed |
| +1 | checkstyle | 0m 14s | trunk passed |
| +1 | mvnsite | 0m 33s | trunk passed |
| +1 | shadedclient | 11m 50s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 57s | trunk passed |
| +1 | javadoc | 0m 51s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 31s | the patch passed |
| +1 | compile | 0m 25s | the patch passed |
| +1 | javac | 0m 25s | the patch passed |
| +1 | checkstyle | 0m 10s | the patch passed |
| +1 | mvnsite | 0m 28s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 24s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 1s | the patch passed |
| +1 | javadoc | 0m 48s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 53s | common in the patch passed. |
| +1 | asflicense | 0m 25s | The patch does not generate ASF License warnings. |
| | | 63m 11s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDDS-285 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12932816/HDDS-285.01.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 072c745f9853 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2ced3ef |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-HDDS-Build/602/testReport/ |
| Max. process+thread count | 334 (vs. ulimit of 1) |
| modules | C: hadoop-hdds/common U: hadoop-hdds/common |
| Console output | https://builds.apache.org/job/PreCommit-HDDS-Build/602/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Create a generic Metadata Iterator
> --
>
> Key: HDDS-285
> URL: https://issues.apache.org/jira/browse/HDDS-285
> Project: Hadoop Dis

[jira] [Commented] (HDFS-13752) fs.Path stores file path in java.net.URI causes big memory waste

2018-07-23 Thread Xiao Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553853#comment-16553853
 ] 

Xiao Chen commented on HDFS-13752:
--

Thanks all for the investigation and discussion here.

From the benchmark and analysis above, I agree the uri is consuming a fair 
amount of memory. However, the downside of replacing it with a String is that 
all the current methods in {{Path}} that rely on the convenience of a cached 
uri will incur CPU cost to compute the uri from the string, and possibly 
young-gen object allocations. This includes {{toUri}}, {{isUriPathAbsolute}} 
and {{isAbsolute}}, all of which are widely used. We might use an additional 
boolean to cache isAbsolute, but I don't see a good solution for {{toUri}}, 
and there are 922 usages of it in trunk.

I'd like to see more benchmarks to prove that my concern is false, before we 
make this rather fundamental change.
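To make the trade-off concrete, here is an illustrative sketch (not the real 
org.apache.hadoop.fs.Path): backing the path with a String removes the URI's 
duplicated fields, and one boolean can cache isAbsolute, but toUri() must then 
rebuild a URI object on every call.

{code:java}
import java.net.URI;

final class StringBackedPath {
  private final String path;       // single compact representation
  private final boolean absolute;  // cheap property cached as one boolean

  StringBackedPath(String path) {
    this.path = path;
    this.absolute = path.startsWith("/");
  }

  boolean isAbsolute() {
    return absolute;               // no URI needed for this query
  }

  URI toUri() {
    // Recomputed on every call; with ~922 call sites in trunk, this is
    // exactly the CPU/allocation cost described above.
    return URI.create(path);
  }
}
{code}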

> fs.Path stores file path in java.net.URI causes big memory waste
> 
>
> Key: HDFS-13752
> URL: https://issues.apache.org/jira/browse/HDFS-13752
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 2.7.6
> Environment: Hive 2.1.1 and hadoop 2.7.6 
>Reporter: Barnabas Maidics
>Priority: Major
> Attachments: Screen Shot 2018-07-20 at 11.12.38.png, 
> heapdump-10partitions.html
>
>
> I was looking at HiveServer2 memory usage, and a big percentage of it was 
> caused by org.apache.hadoop.fs.Path, which stores file paths in a 
> java.net.URI object. The URI implementation stores the same string in 3 
> different objects (see the attached image). In Hive, when there are many 
> partitions, this causes high memory usage. In my particular case 42% of 
> memory was used by java.net.URI, so it could be reduced to 14%. 
> I wonder whether the community is open to replacing it with a more 
> memory-efficient implementation, and what other things should be considered 
> here? It could be a huge memory improvement for Hadoop and for Hive as well.






[jira] [Commented] (HDDS-262) Send SCM healthy and failed volumes in the heartbeat

2018-07-23 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553849#comment-16553849
 ] 

genericqa commented on HDDS-262:


| (/) +1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 21s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 26m 50s | trunk passed |
| +1 | compile | 0m 28s | trunk passed |
| +1 | checkstyle | 0m 16s | trunk passed |
| +1 | mvnsite | 0m 31s | trunk passed |
| +1 | shadedclient | 11m 13s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 44s | trunk passed |
| +1 | javadoc | 0m 31s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 28s | the patch passed |
| +1 | compile | 0m 24s | the patch passed |
| +1 | javac | 0m 24s | the patch passed |
| +1 | checkstyle | 0m 10s | the patch passed |
| +1 | mvnsite | 0m 26s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 51s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 48s | the patch passed |
| +1 | javadoc | 0m 27s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 43s | container-service in the patch passed. |
| +1 | asflicense | 0m 27s | The patch does not generate ASF License warnings. |
| | | 57m 5s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDDS-262 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12932818/HDDS-262.04.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 0a5a16b45594 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2ced3ef |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-HDDS-Build/601/testReport/ |
| Max. process+thread count | 410 (vs. ulimit of 1) |
| modules | C: hadoop-hdds/container-service U: hadoop-hdds/container-service |
| Console output | https://builds.apache.org/job/PreCommit-HDDS-Build/601/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Send SCM healthy and failed volumes in the heartbeat
> 
>
> Key: HDDS-262
> URL: https://issu

[jira] [Commented] (HDFS-13672) clearCorruptLazyPersistFiles could crash NameNode

2018-07-23 Thread Xiao Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553830#comment-16553830
 ] 

Xiao Chen commented on HDFS-13672:
--

Thanks all for the discussion here. I tend to agree with Andrew: since the 
scrubber already has safemode checks, this probably isn't worth the effort of 
a behavior change.

The idea of adding counters for lazy persist sounds good; we can do that as an 
improvement jira if anyone is interested. :)

> clearCorruptLazyPersistFiles could crash NameNode
> -
>
> Key: HDFS-13672
> URL: https://issues.apache.org/jira/browse/HDFS-13672
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Major
> Attachments: HDFS-13672.001.patch, HDFS-13672.002.patch, 
> HDFS-13672.003.patch
>
>
> I started a NameNode on a pretty large fsimage. Since the NameNode is started 
> without any DataNodes, all blocks (100 million) are "corrupt".
> Afterwards I observed FSNamesystem#clearCorruptLazyPersistFiles() held write 
> lock for a long time:
> {noformat}
> 18/06/12 12:37:03 INFO namenode.FSNamesystem: FSNamesystem write lock held 
> for 46024 ms via
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:945)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:198)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1689)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.clearCorruptLazyPersistFiles(FSNamesystem.java:5532)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.run(FSNamesystem.java:5543)
> java.lang.Thread.run(Thread.java:748)
> Number of suppressed write-lock reports: 0
> Longest write-lock held interval: 46024
> {noformat}
> Here's the relevant code:
> {code}
>   writeLock();
>   try {
>     final Iterator<Block> it =
>         blockManager.getCorruptReplicaBlockIterator();
>     while (it.hasNext()) {
>       Block b = it.next();
>       BlockInfo blockInfo = blockManager.getStoredBlock(b);
>       if (blockInfo.getBlockCollection().getStoragePolicyID() ==
>           lpPolicy.getId()) {
>         filesToDelete.add(blockInfo.getBlockCollection());
>       }
>     }
>     for (BlockCollection bc : filesToDelete) {
>       LOG.warn("Removing lazyPersist file " + bc.getName()
>           + " with no replicas.");
>       changed |= deleteInternal(bc.getName(), false, false, false);
>     }
>   } finally {
>     writeUnlock();
>   }
> {code}
> In essence, the iteration over the corrupt replica list should be broken 
> down into smaller batches to avoid a single long wait.
> Since this operation holds the NameNode write lock for more than 45 seconds, 
> the default ZKFC connection timeout, an extreme case like this (100 million 
> corrupt blocks) could lead to a NameNode failover.
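A minimal sketch of the batching idea, reusing writeLock()/writeUnlock() and 
blockManager from the snippet above; BLOCK_BATCH and handleCorruptBlock() are 
hypothetical names, not part of any posted patch.

{code:java}
private static final int BLOCK_BATCH = 100_000;

void clearCorruptLazyPersistFilesBatched() {
  boolean more = true;
  while (more) {
    more = false;
    writeLock();
    try {
      int processed = 0;
      final Iterator<Block> it = blockManager.getCorruptReplicaBlockIterator();
      while (it.hasNext()) {
        if (++processed > BLOCK_BATCH) {
          more = true;   // yield the write lock and resume in the next round
          break;
        }
        handleCorruptBlock(it.next());   // collect/delete as in the original
      }
    } finally {
      // Each round holds the lock for at most BLOCK_BATCH iterations, keeping
      // single hold times well below the 45s ZKFC connection timeout.
      writeUnlock();
    }
  }
}
{code}

The point of the sketch is only to bound any single lock hold; since each 
round deletes the lazyPersist files it finds, rescanning in the next round 
still converges.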






[jira] [Updated] (HDDS-188) TestOmMetrcis should not use the deprecated WhiteBox class

2018-07-23 Thread Junping Du (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDDS-188:

Description: TestOmMetrcis should stop using 
{{org.apache.hadoop.test.Whitebox}}.  (was: TestKSMMetrcis (also needs to be 
renamed) should stop using {{org.apache.hadoop.test.Whitebox}}.)

> TestOmMetrcis should not use the deprecated WhiteBox class
> --
>
> Key: HDDS-188
> URL: https://issues.apache.org/jira/browse/HDDS-188
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
>Priority: Major
>  Labels: newbie
> Fix For: 0.2.1
>
>
> TestOmMetrcis should stop using {{org.apache.hadoop.test.Whitebox}}.






[jira] [Commented] (HDDS-188) TestKSMMetrcis should not use the deprecated WhiteBox class

2018-07-23 Thread Junping Du (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553826#comment-16553826
 ] 

Junping Du commented on HDDS-188:
-

Looks like TestKSMMetrics got renamed to TestOmMetrics; updating the title and 
description to reflect the change.

> TestKSMMetrcis should not use the deprecated WhiteBox class
> ---
>
> Key: HDDS-188
> URL: https://issues.apache.org/jira/browse/HDDS-188
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
>Priority: Major
>  Labels: newbie
> Fix For: 0.2.1
>
>
> TestKSMMetrcis (also needs to be renamed) should stop using 
> {{org.apache.hadoop.test.Whitebox}}.






[jira] [Updated] (HDDS-188) TestOmMetrcis should not use the deprecated WhiteBox class

2018-07-23 Thread Junping Du (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDDS-188:

Summary: TestOmMetrcis should not use the deprecated WhiteBox class  (was: 
TestKSMMetrcis should not use the deprecated WhiteBox class)

> TestOmMetrcis should not use the deprecated WhiteBox class
> --
>
> Key: HDDS-188
> URL: https://issues.apache.org/jira/browse/HDDS-188
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Arpit Agarwal
>Assignee: Arpit Agarwal
>Priority: Major
>  Labels: newbie
> Fix For: 0.2.1
>
>
> TestKSMMetrcis (also needs to be renamed) should stop using 
> {{org.apache.hadoop.test.Whitebox}}.






[jira] [Commented] (HDDS-203) Add getCommittedBlockLength API in datanode

2018-07-23 Thread Mukul Kumar Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553818#comment-16553818
 ] 

Mukul Kumar Singh commented on HDDS-203:


Thanks for the updated patch [~shashikant]. Apart from Nicholas's comments, a 
couple more comments:

1) With HDDS-181, the close container should commit the pending block. Can 
TestCommittedBlockLengthAPI:103 be replaced with a writechunk request, so that 
we can check that a key is committed on close and that the committed length is 
correct?
2) At TestCommittedBlockLengthAPI:152, the exception thrown here results in a 
case where "xceiverClientManager.releaseClient(client);" is not called (see 
the sketch below).


> Add getCommittedBlockLength API in datanode
> ---
>
> Key: HDDS-203
> URL: https://issues.apache.org/jira/browse/HDDS-203
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client, Ozone Datanode
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-203.00.patch, HDDS-203.01.patch, HDDS-203.02.patch, 
> HDDS-203.03.patch, HDDS-203.04.patch
>
>
> When a container gets closed on the Datanode while active writes are 
> happening from OzoneClient, client write requests will fail with 
> ContainerClosedException. In such a case, the Ozone client needs to query 
> the last committed block length from the datanodes and update the 
> OzoneMaster with the updated length for the block. This Jira proposes to add 
> an RPC call to get the last committed length of a block on a Datanode.






[jira] [Commented] (HDDS-258) Helper methods to generate NodeReport and ContainerReport for testing

2018-07-23 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553815#comment-16553815
 ] 

Hudson commented on HDDS-258:
-

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #14621 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14621/])
HDDS-258. Helper methods to generate NodeReport and ContainerReport for (xyao: 
rev 2ced3efe94eecc3e6076be1f0341bf6a2f2affab)
* (edit) 
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/placement/algorithms/TestSCMContainerPlacementCapacity.java
* (edit) 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/ozoneimpl/TestOzoneContainer.java
* (edit) 
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/command/TestCommandStatusReportHandler.java
* (edit) 
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/TestUtils.java
* (edit) 
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/ozone/container/common/TestEndPoint.java
* (edit) 
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/MockNodeManager.java
* (edit) 
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/server/TestSCMDatanodeHeartbeatDispatcher.java
* (edit) 
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/node/TestNodeManager.java
* (edit) 
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/replication/TestReplicationManager.java
* (edit) 
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/placement/algorithms/TestSCMContainerPlacementRandom.java
* (edit) 
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/node/TestContainerPlacement.java
* (edit) 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/SCMNodeManager.java
* (edit) 
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/node/TestSCMNodeStorageStatMap.java
* (edit) 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/TestMiniOzoneCluster.java
* (edit) 
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/closer/TestContainerCloser.java
* (edit) 
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/TestContainerMapping.java
* (edit) 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/metrics/TestContainerMetrics.java
* (edit) 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/server/TestContainerServer.java
* (edit) 
hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/node/TestNodeReportHandler.java


> Helper methods to generate NodeReport and ContainerReport for testing
> -
>
> Key: HDDS-258
> URL: https://issues.apache.org/jira/browse/HDDS-258
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-258.000.patch, HDDS-258.001.patch, 
> HDDS-258.002.patch, HDDS-258.003.patch
>
>
> Having helper methods to generate NodeReport and ContainerReport for testing 
> SCM will make our lives easier.
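For illustration, the kind of helper being proposed, as a sketch: the builder 
method names below are assumed from the HDDS protobuf field names rather than 
quoted from the patch.

{code:java}
import java.util.UUID;
import org.apache.hadoop.hdds.protocol.proto
    .StorageContainerDatanodeProtocolProtos.NodeReportProto;
import org.apache.hadoop.hdds.protocol.proto
    .StorageContainerDatanodeProtocolProtos.StorageReportProto;

final class ReportTestUtils {
  // Builds a NodeReport with a single storage location of the given size.
  static NodeReportProto createNodeReport(long capacity, long used) {
    StorageReportProto storage = StorageReportProto.newBuilder()
        .setStorageUuid(UUID.randomUUID().toString())
        .setCapacity(capacity)
        .setScmUsed(used)
        .setRemaining(capacity - used)
        .build();
    return NodeReportProto.newBuilder().addStorageReport(storage).build();
  }
}
{code}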






[jira] [Updated] (HDDS-258) Helper methods to generate NodeReport and ContainerReport for testing

2018-07-23 Thread Xiaoyu Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDDS-258:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks [~nandakumar131] for the contribution. I've committed the patch to 
trunk. The unit test failures are unrelated to this change. Opened HDDS-286 
for one issue that was found.

> Helper methods to generate NodeReport and ContainerReport for testing
> -
>
> Key: HDDS-258
> URL: https://issues.apache.org/jira/browse/HDDS-258
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-258.000.patch, HDDS-258.001.patch, 
> HDDS-258.002.patch, HDDS-258.003.patch
>
>
> Having helper methods to generate NodeReport and ContainerReport for testing 
> SCM will make our lives easier.






[jira] [Updated] (HDDS-286) Fix NodeReportPublisher.getReport NPE

2018-07-23 Thread Xiaoyu Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDDS-286:

Fix Version/s: 0.2.1

> Fix NodeReportPublisher.getReport NPE
> -
>
> Key: HDDS-286
> URL: https://issues.apache.org/jira/browse/HDDS-286
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Priority: Major
> Fix For: 0.2.1
>
>
> This can be reproed with TestKeys#testPutKey
> {code}
> 2018-07-23 21:33:55,598 WARN  concurrent.ExecutorHelper 
> (ExecutorHelper.java:logThrowableFromAfterExecute(63)) - Caught exception in 
> thread Datanode ReportManager Thread - 0: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.ozone.container.common.volume.VolumeInfo.getScmUsed(VolumeInfo.java:107)
>   at 
> org.apache.hadoop.ozone.container.common.volume.VolumeSet.getNodeReport(VolumeSet.java:350)
>   at 
> org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.getNodeReport(OzoneContainer.java:260)
>   at 
> org.apache.hadoop.ozone.container.common.report.NodeReportPublisher.getReport(NodeReportPublisher.java:64)
>   at 
> org.apache.hadoop.ozone.container.common.report.NodeReportPublisher.getReport(NodeReportPublisher.java:39)
>   at 
> org.apache.hadoop.ozone.container.common.report.ReportPublisher.publishReport(ReportPublisher.java:86)
>   at 
> org.apache.hadoop.ozone.container.common.report.ReportPublisher.run(ReportPublisher.java:73)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
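A purely illustrative guard, assuming VolumeInfo holds a nullable usage 
tracker that the ReportManager thread can observe before it is initialized (or 
after shutdown); failing fast with an IOException is one way to avoid the NPE 
above.

{code:java}
long getScmUsed() throws IOException {
  VolumeUsage u = usage;   // single read of the racy field
  if (u == null) {
    throw new IOException("Volume usage information is not available");
  }
  return u.getUsed();
}
{code}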






[jira] [Created] (HDDS-286) Fix NodeReportPublisher.getReport NPE

2018-07-23 Thread Xiaoyu Yao (JIRA)
Xiaoyu Yao created HDDS-286:
---

 Summary: Fix NodeReportPublisher.getReport NPE
 Key: HDDS-286
 URL: https://issues.apache.org/jira/browse/HDDS-286
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Xiaoyu Yao


This can be reproed with TestKeys#testPutKey

{code}
2018-07-23 21:33:55,598 WARN  concurrent.ExecutorHelper 
(ExecutorHelper.java:logThrowableFromAfterExecute(63)) - Caught exception in 
thread Datanode ReportManager Thread - 0: 
java.lang.NullPointerException
at 
org.apache.hadoop.ozone.container.common.volume.VolumeInfo.getScmUsed(VolumeInfo.java:107)
at 
org.apache.hadoop.ozone.container.common.volume.VolumeSet.getNodeReport(VolumeSet.java:350)
at 
org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.getNodeReport(OzoneContainer.java:260)
at 
org.apache.hadoop.ozone.container.common.report.NodeReportPublisher.getReport(NodeReportPublisher.java:64)
at 
org.apache.hadoop.ozone.container.common.report.NodeReportPublisher.getReport(NodeReportPublisher.java:39)
at 
org.apache.hadoop.ozone.container.common.report.ReportPublisher.publishReport(ReportPublisher.java:86)
at 
org.apache.hadoop.ozone.container.common.report.ReportPublisher.run(ReportPublisher.java:73)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}






[jira] [Commented] (HDFS-13583) RBF: Router admin clrQuota is not synchronized with nameservice

2018-07-23 Thread Dibyendu Karmakar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553809#comment-16553809
 ] 

Dibyendu Karmakar commented on HDFS-13583:
--

Thanks [~linyiqun] for committing it.

> RBF: Router admin clrQuota is not synchronized with nameservice
> ---
>
> Key: HDFS-13583
> URL: https://issues.apache.org/jira/browse/HDFS-13583
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.1.0
>Reporter: Dibyendu Karmakar
>Assignee: Dibyendu Karmakar
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 3.1.2
>
> Attachments: HDFS-13583-000.patch, HDFS-13583-001.patch, 
> HDFS-13583-002.patch, HDFS-13583-branch-2-001.patch
>
>
> The Router admin -clrQuota command removes the quota from the mount table 
> only; it is not synchronized with the nameservice.
>  
>  
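A hedged sketch of the behavior the fix restores: clearing a quota should 
update the mount table record and also push the reset to the nameservice. The 
helper names (updateMountTableQuota, getRpcClient) are assumptions; 
HdfsConstants.QUOTA_RESET is the standard reset sentinel.

{code:java}
import java.io.IOException;
import org.apache.hadoop.hdfs.protocol.HdfsConstants;

void clrQuota(String mountPoint) throws IOException {
  // 1. Remove the quota stored on the mount table entry.
  updateMountTableQuota(mountPoint,
      HdfsConstants.QUOTA_RESET, HdfsConstants.QUOTA_RESET);
  // 2. Synchronize with the nameservice so the real directory quota resets too.
  getRpcClient().setQuota(mountPoint,
      HdfsConstants.QUOTA_RESET, HdfsConstants.QUOTA_RESET, null);
}
{code}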






[jira] [Assigned] (HDDS-117) Wrapper for set/get Standalone, Ratis and Rest Ports in DatanodeDetails.

2018-07-23 Thread Junping Du (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du reassigned HDDS-117:
---

Assignee: chao.wu  (was: Junping Du)

> Wrapper for set/get Standalone, Ratis and Rest Ports in DatanodeDetails.
> 
>
> Key: HDDS-117
> URL: https://issues.apache.org/jira/browse/HDDS-117
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Reporter: Nanda kumar
>Assignee: chao.wu
>Priority: Major
>  Labels: newbie
>
> It will be very helpful to have a wrapper for set/get of the Standalone, 
> Ratis and Rest ports in DatanodeDetails.
> Search for and replace direct usage of DatanodeDetails#newPort in the 
> current code.






[jira] [Commented] (HDDS-199) Implement ReplicationManager to handle underreplication of closed containers

2018-07-23 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553771#comment-16553771
 ] 

Anu Engineer commented on HDDS-199:
---

[~xyao] Thanks man, you just let [~elek] escape from rebase hell. He owes you 
a beer for sure :)


> Implement ReplicationManager to handle underreplication of closed containers
> 
>
> Key: HDDS-199
> URL: https://issues.apache.org/jira/browse/HDDS-199
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-199.001.patch, HDDS-199.002.patch, 
> HDDS-199.003.patch, HDDS-199.004.patch, HDDS-199.005.patch, 
> HDDS-199.006.patch, HDDS-199.007.patch, HDDS-199.008.patch, 
> HDDS-199.009.patch, HDDS-199.010.patch, HDDS-199.011.patch, 
> HDDS-199.012.patch, HDDS-199.013.patch, HDDS-199.014.patch, 
> HDDS-199.015.patch, HDDS-199.016.patch, HDDS-199.017.patch
>
>
> HDDS/Ozone supports Open and Closed containers. Under specific conditions 
> (the container is full, or a node has failed) the container will be closed 
> and will be replicated in a different way. The replication of Open 
> containers is handled with Ratis and the PipelineManager.
> The ReplicationManager should handle the replication of the 
> ClosedContainers. The replication information will be sent as an event 
> (UnderReplicated/OverReplicated). 
> The ReplicationManager will collect all of the events in a priority queue 
> (to first replicate the containers with the most missing replicas), 
> calculate the destination datanode (first with a very simple algorithm, 
> later by calculating scatter-width) and send the Copy/Delete container 
> command to the datanode (CommandQueue).
> A CopyCommandWatcher/DeleteCommandWatcher is also included to retry the 
> copy/delete in case of failure. This is an in-memory structure (based on 
> HDDS-195) which requeues the under-replicated/over-replicated events to the 
> priority queue unless confirmation of the copy/delete command arrives.
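A condensed, self-contained sketch of the described flow; every name here is 
an illustrative stand-in for the components the description mentions (priority 
queue of replication events, simple placement, watcher-style retry):

{code:java}
import java.util.Comparator;
import java.util.PriorityQueue;

class ReplicationManagerSketch {
  record Event(long containerId, int missingReplicas) {}

  // Containers missing more replicas are replicated first.
  private final PriorityQueue<Event> queue = new PriorityQueue<>(
      Comparator.comparingInt((Event e) -> e.missingReplicas).reversed());

  void onUnderReplicated(Event e) { queue.add(e); }

  void processOne() {
    Event e = queue.poll();
    if (e == null) return;
    String target = choosePlacement(e);        // very simple algorithm first
    boolean acked = sendCopyCommand(e.containerId(), target);
    if (!acked) queue.add(e);                  // watcher-style retry: requeue
  }

  private String choosePlacement(Event e) { return "datanode-0"; }

  private boolean sendCopyCommand(long containerId, String target) {
    return true;  // stand-in for CommandQueue dispatch plus the ack watcher
  }
}
{code}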






[jira] [Commented] (HDFS-13583) RBF: Router admin clrQuota is not synchronized with nameservice

2018-07-23 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553760#comment-16553760
 ] 

Hudson commented on HDFS-13583:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14620 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14620/])
HDFS-13583. RBF: Router admin clrQuota is not synchronized with (yqlin: rev 
17a87977f29ced49724f561a68565217c8cb4e94)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/tools/federation/RouterAdmin.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterQuotaManager.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/records/impl/pb/MountTablePBImpl.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/store/records/TestMountTable.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/Quota.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterQuotaUsage.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterAdminCLI.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterQuotaManager.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterAdmin.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterAdminServer.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/records/MountTable.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterQuota.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterQuotaUpdateService.java


> RBF: Router admin clrQuota is not synchronized with nameservice
> ---
>
> Key: HDFS-13583
> URL: https://issues.apache.org/jira/browse/HDFS-13583
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.1.0
>Reporter: Dibyendu Karmakar
>Assignee: Dibyendu Karmakar
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 3.1.2
>
> Attachments: HDFS-13583-000.patch, HDFS-13583-001.patch, 
> HDFS-13583-002.patch, HDFS-13583-branch-2-001.patch
>
>
> The Router admin -clrQuota command removes the quota from the mount table 
> only; it is not synchronized with the nameservice.
>  
>  






[jira] [Updated] (HDFS-13583) RBF: Router admin clrQuota is not synchronized with nameservice

2018-07-23 Thread Yiqun Lin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-13583:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.1.2
   3.2.0
   2.10.0
   Status: Resolved  (was: Patch Available)

Committed this to trunk, branch-3.1 and branch-2.
Thanks [~dibyendu_hadoop] for the contribution!

> RBF: Router admin clrQuota is not synchronized with nameservice
> ---
>
> Key: HDFS-13583
> URL: https://issues.apache.org/jira/browse/HDFS-13583
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.1.0
>Reporter: Dibyendu Karmakar
>Assignee: Dibyendu Karmakar
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 3.1.2
>
> Attachments: HDFS-13583-000.patch, HDFS-13583-001.patch, 
> HDFS-13583-002.patch, HDFS-13583-branch-2-001.patch
>
>
> The Router admin -clrQuota command removes the quota from the mount table 
> only; it is not synchronized with the nameservice.
>  
>  






[jira] [Updated] (HDFS-13583) RBF: Router admin clrQuota is not synchronized with nameservice

2018-07-23 Thread Yiqun Lin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-13583:
-
Affects Version/s: 3.1.0

> RBF: Router admin clrQuota is not synchronized with nameservice
> ---
>
> Key: HDFS-13583
> URL: https://issues.apache.org/jira/browse/HDFS-13583
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.1.0
>Reporter: Dibyendu Karmakar
>Assignee: Dibyendu Karmakar
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 3.1.2
>
> Attachments: HDFS-13583-000.patch, HDFS-13583-001.patch, 
> HDFS-13583-002.patch, HDFS-13583-branch-2-001.patch
>
>
> The Router admin -clrQuota command removes the quota from the mount table 
> only; it is not synchronized with the nameservice.
>  
>  






[jira] [Commented] (HDDS-258) Helper methods to generate NodeReport and ContainerReport for testing

2018-07-23 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553737#comment-16553737
 ] 

genericqa commented on HDDS-258:


| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 11s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 18 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 1m 52s | Maven dependency ordering for branch |
| +1 | mvninstall | 28m 51s | trunk passed |
| +1 | compile | 30m 3s | trunk passed |
| +1 | checkstyle | 0m 24s | trunk passed |
| +1 | mvnsite | 1m 12s | trunk passed |
| +1 | shadedclient | 12m 42s | branch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-ozone/integration-test |
| -1 | findbugs | 0m 42s | hadoop-hdds/server-scm in trunk has 1 extant Findbugs warnings. |
| +1 | javadoc | 0m 52s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 21s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 53s | the patch passed |
| +1 | compile | 29m 45s | the patch passed |
| +1 | javac | 29m 45s | the patch passed |
| +1 | checkstyle | 0m 24s | the patch passed |
| +1 | mvnsite | 1m 10s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 9s | patch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-ozone/integration-test |
| +1 | findbugs | 0m 48s | the patch passed |
| +1 | javadoc | 0m 52s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 34s | server-scm in the patch passed. |
| -1 | unit | 6m 44s | integration-test in the patch failed. |
| +1 | asflicense | 0m 42s | The patch does not generate ASF License warnings. |
| | | 131m 30s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.container.ozoneimpl.TestOzoneContainer |
| | hadoop.ozone.container.common.TestBlockDeletingService |
| | hadoop.ozone.web.client.TestKeys |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDDS-258 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12932753/HDDS-258.003.patch |
| Optional Tests | asflicense

[jira] [Updated] (HDFS-13760) improve ZKFC fencing action when network of ZKFC interrupt

2018-07-23 Thread chencan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chencan updated HDFS-13760:
---
Description: 
When the host running the Active NameNode & ZKFC hits a network fault for 
quite a while, HDFS becomes unavailable, since the ZKFC located on the Standby 
NameNode will never succeed in ssh fencing because it cannot ssh to the Active 
NameNode. In such a situation the Client cannot connect to the Active 
NameNode, then fails over to the Standby, but the Standby cannot serve 
READ/WRITE.
{code:xml}
2018-07-23 15:57:10,836 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: rz-data-hdp-nn14.rz.sankuai.com/10.16.70.34:8060. Already tried 40 
time(s); maxRetries=45
2018-07-23 15:57:30,856 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: rz-data-hdp-nn14.rz.sankuai.com/10.16.70.34:8060. Already tried 41 
time(s); maxRetries=45
2018-07-23 15:57:50,872 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: rz-data-hdp-nn14.rz.sankuai.com/10.16.70.34:8060. Already tried 42 
time(s); maxRetries=45
2018-07-23 15:58:10,892 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: rz-data-hdp-nn14.rz.sankuai.com/10.16.70.34:8060. Already tried 43 
time(s); maxRetries=45
2018-07-23 15:58:30,912 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: rz-data-hdp-nn14.rz.sankuai.com/10.16.70.34:8060. Already tried 44 
time(s); maxRetries=45
2018-07-23 15:58:50,933 INFO org.apache.hadoop.ha.ZKFailoverController: get old 
active state exception: org.apache.hadoop.net.ConnectTimeoutException: 2 
millis timeout while waiting for channel to be 
ready for connect. ch : java.nio.channels.SocketChannel[connection-pending 
local=/ip:port remote=hostname]
2018-07-23 15:58:50,933 INFO org.apache.hadoop.ha.ActiveStandbyElector: old 
active is not healthy. need to create znode
2018-07-23 15:58:50,933 INFO org.apache.hadoop.ha.ActiveStandbyElector: Elector 
callbacks for NameNode at standbynn start create node, now time: 
45179010079342817
2018-07-23 15:58:50,936 INFO org.apache.hadoop.ha.ActiveStandbyElector: 
CreateNode result: 0 code:OK for path: /hadoop-ha/ns/ActiveStandbyElectorLock 
connectionState: CONNECTED  for elector id=469098346 
appData=0a07727a2d6e6e313312046e6e31331a1f727a2d646174612d6864702d6e6e31332e727a2e73616e6b7561692e636f6d20e83e28d33e
 cb=Elector callbacks for NameNode at standbynamenode
2018-07-23 15:58:50,936 INFO org.apache.hadoop.ha.ActiveStandbyElector: 
Checking for any old active which needs to be fenced...
2018-07-23 15:58:50,938 INFO org.apache.hadoop.ha.ActiveStandbyElector: Old 
node exists: 
0a07727a2d6e6e313312046e6e31341a1f727a2d646174612d6864702d6e6e31342e727a2e73616e6b7561692e636f6d20e83e28d33e
2018-07-23 15:58:50,939 INFO org.apache.hadoop.ha.ZKFailoverController: Should 
fence: NameNode at activenamenode
2018-07-23 15:59:10,960 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: activenamenode. Already tried 0 time(s); maxRetries=1
2018-07-23 15:59:30,980 WARN org.apache.hadoop.ha.FailoverController: Unable to 
gracefully make NameNode at activenamenode standby (unable to connect)
org.apache.hadoop.net.ConnectTimeoutException: Call From standbynamenode to 
activenamenode failed on socket timeout exception: 
org.apache.hadoop.net.ConnectTimeoutException: 2 millis timeout while 
waiting for channel to be ready for connect. ch : 
java.nio.channels.SocketChannel[connection-pending local=ip:port 
remote=activenamenode]; For more details see:  
http://wiki.apache.org/hadoop/SocketTimeout
{code}

I propose that when the Active NameNode hits a network fault, its ZKFC force 
that NameNode to become Standby, so that the other ZKFC can hold the ZNode for 
election and transition the other NameNode to Active even when ssh fencing 
fails; see the sketch below.

There is no patch available yet, and I am very happy to hear suggestions.
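One possible shape of the proposal, purely illustrative (all names here are 
hypothetical, since no patch exists yet): the ZKFC on the faulty Active host 
demotes its local NameNode itself once the fault has lasted long enough, so 
the peer ZKFC can win the election without ever needing to ssh in.

{code:java}
void onLocalNetworkFault(java.time.Duration faultDuration) {
  if (faultDuration.compareTo(MAX_TOLERATED_FAULT) > 0) {
    localTarget.transitionToStandby();  // demote locally; no remote fencing needed
    quitElection();                     // release the ZK lock so the peer goes Active
  }
}
{code}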

  was:
When the host running the Active NameNode & ZKFC hits a network fault for 
quite a while, HDFS becomes unavailable, since the ZKFC located on the Standby 
NameNode will never succeed in ssh fencing because it cannot ssh to the Active 
NameNode. In such a situation the Client cannot connect to the Active 
NameNode, then fails over to the Standby, but the Standby cannot serve 
READ/WRITE.
{code:xml}
2018-07-23 15:57:10,836 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: rz-data-hdp-nn14.rz.sankuai.com/10.16.70.34:8060. Already tried 40 
time(s); maxRetries=45
2018-07-23 15:57:30,856 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: rz-data-hdp-nn14.rz.sankuai.com/10.16.70.34:8060. Already tried 41 
time(s); maxRetries=45
2018-07-23 15:57:50,872 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: rz-data-hdp-nn14.rz.sankuai.com/10.16.70.34:8060. Already tried 42 
time(s); maxRetries=45
2018-07-23 15:58:10,892 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: rz-data-hdp-nn14.rz.sankuai.com/10.16.70.34:8060. Already tried 43 
time(s); maxRetries=45
20

[jira] [Commented] (HDFS-13583) RBF: Router admin clrQuota is not synchronized with nameservice

2018-07-23 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553718#comment-16553718
 ] 

Yiqun Lin commented on HDFS-13583:
--

Thanks [~dibyendu_hadoop] for updating the patch. I have verified the failed 
UT; it's a flaky test that sometimes fails in my local environment, but I'm 
sure the failure is not related to the current change.
+1. Committing this shortly.

> RBF: Router admin clrQuota is not synchronized with nameservice
> ---
>
> Key: HDFS-13583
> URL: https://issues.apache.org/jira/browse/HDFS-13583
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Dibyendu Karmakar
>Assignee: Dibyendu Karmakar
>Priority: Major
> Attachments: HDFS-13583-000.patch, HDFS-13583-001.patch, 
> HDFS-13583-002.patch, HDFS-13583-branch-2-001.patch
>
>
> The Router admin -clrQuota command removes the quota from the mount table 
> only; it is not synchronized with the nameservice.
>  
>  






[jira] [Commented] (HDFS-13688) Introduce msync API call

2018-07-23 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553716#comment-16553716
 ] 

genericqa commented on HDFS-13688:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-12943 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
16s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 27m 
19s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
41s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 41s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
24s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
21s{color} | {color:green} HDFS-12943 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 27m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 27m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 27m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  4m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 32s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 
10s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
26s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  1m 47s{color} 
| {color:red} hadoop-hdfs-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}104m 11s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 17m 
50s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
48s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}267m 44s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Thread passed where Runnable expected in 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.msync(RpcController,
 ClientNamenodeProtocolProtos$MsyncRequestProto)  At 
ClientNamenodeProtocolServerSideTranslatorPB.java:in 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB |
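For readers unfamiliar with this FindBugs warning, a minimal illustration of the 
flagged pattern and its usual fix (not the actual patch code):

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadAsRunnableDemo {
  static void doWork() {
    System.out.println("work");
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    // Flagged pattern: the Thread object is never started; it is only ever
    // used as a Runnable, which suggests a latent bug.
    pool.submit(new Thread(ThreadAsRunnableDemo::doWork));
    // Usual fix: hand the Runnable to the executor directly.
    pool.submit(ThreadAsRunnableDemo::doWork);
    pool.shutdown();
  }
}
{code}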

[jira] [Commented] (HDFS-13688) Introduce msync API call

2018-07-23 Thread Konstantin Shvachko (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553676#comment-16553676
 ] 

Konstantin Shvachko commented on HDFS-13688:


# As [previously 
discussed|https://issues.apache.org/jira/browse/HDFS-12977?focusedCommentId=16368689&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16368689]
 there is a constant {{HdfsServerConstants.INVALID_TXID}} that can serve as 
{{INVALID_STATEID}}. But I don't think you need it in {{AlignmentContext}} at 
all. If the server ever returns {{INVALID_TXID}}, you will catch it with the 
{{(txid < clientId)}} condition but not the other one, since it never equals 
{{Long.MIN_VALUE}}.
# On the client we should always address {{ClientGSIContext}} instead of 
{{AlignmentContext}}. The former already has {{getLastSeenStateId()}}, so you 
don't need to add it to {{AlignmentContext}}. On the server 
{{getLastSeenStateId()}} doesn't make sense.
# We should not just create a {{ClientGSIContext}} instance in DFSClient. It 
lives deep inside {{ObserverReadProxyProvider}} and is generally not exposed to 
DFSClient. You mostly use it for testing, but there should be another way; 
HDFS-13399 was dedicated to this. [This is the relevant 
comment|https://issues.apache.org/jira/browse/HDFS-13399?focusedCommentId=16459341&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16459341].
 There was also discussion around it in HDFS-12976.
I see you are passing it into {{ObserverReadProxyProvider}}, which we were 
trying to avoid in previous steps.

These are the main issues. I also see some unused imports and non-parameterized 
generics.
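To make point 1 concrete, a rough sketch of the client-side comparison being 
discussed; {{ClientGSIContext}} and {{getLastSeenStateId()}} come from the 
patch, but the body here is illustrative, not the actual implementation:

{code:java}
public class ClientGSIContextSketch {
  private long lastSeenStateId = Long.MIN_VALUE; // no state observed yet

  public long getLastSeenStateId() {
    return lastSeenStateId;
  }

  // Invoked when a response header carries the server's state id.
  public synchronized void receiveResponseState(long serverStateId) {
    // HdfsServerConstants.INVALID_TXID is a negative sentinel, so once any
    // real state has been observed, an invalid id from the server falls into
    // the (serverStateId < lastSeenStateId) branch; it never equals
    // Long.MIN_VALUE, so no separate INVALID_STATEID constant is needed.
    if (serverStateId < lastSeenStateId) {
      return; // stale or invalid state id; keep the newer client-side value
    }
    lastSeenStateId = serverStateId;
  }
}
{code}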

> Introduce msync API call
> 
>
> Key: HDFS-13688
> URL: https://issues.apache.org/jira/browse/HDFS-13688
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13688-HDFS-12943.001.patch, 
> HDFS-13688-HDFS-12943.002.patch, HDFS-13688-HDFS-12943.002.patch, 
> HDFS-13688-HDFS-12943.WIP.002.patch, HDFS-13688-HDFS-12943.WIP.patch
>
>
> As mentioned in the design doc in HDFS-12943, to ensure consistent reads we 
> need to introduce an RPC call {{msync}}. Specifically, the client can issue an 
> msync call to the Observer node along with a transactionID. The msync will only 
> return once the Observer's transactionID has caught up to the given ID. This 
> JIRA is to add this API.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-262) Send SCM healthy and failed volumes in the heartbeat

2018-07-23 Thread Bharat Viswanadham (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553674#comment-16553674
 ] 

Bharat Viswanadham commented on HDDS-262:
-

In the tests, baseDir was not getting deleted.

Added that cleanup code in shutDown().
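A minimal sketch of that kind of cleanup, assuming JUnit 4 and commons-io (the 
test class name and baseDir location are hypothetical):

{code:java}
import java.io.File;
import org.apache.commons.io.FileUtils;
import org.junit.After;

public class TestVolumeReportsExample {
  private final File baseDir =
      new File(System.getProperty("java.io.tmpdir"), "hdds-volume-test");

  @After
  public void shutDown() throws Exception {
    // Delete the test's working directory so repeated runs start clean.
    FileUtils.deleteDirectory(baseDir);
  }
}
{code}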

> Send SCM healthy and failed volumes in the heartbeat
> 
>
> Key: HDDS-262
> URL: https://issues.apache.org/jira/browse/HDDS-262
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-262.00.patch, HDDS-262.01.patch, HDDS-262.02.patch, 
> HDDS-262.03.patch, HDDS-262.04.patch
>
>
> The current code only sends volumes which are successfully created during 
> datanode startup. For any volume where an error occurred during HddsVolume object 
> creation, we should move that volume to the failedVolume map. This should be sent 
> to SCM as part of NodeReports.
>  
>  
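A sketch of the behaviour described above: volumes that fail {{HddsVolume}} 
construction are remembered in a failed-volume map instead of being dropped, so 
node reports can include them. The class and map names below are illustrative, 
not the committed code:

{code:java}
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class VolumeSetSketch {
  private final Map<String, String> volumeMap = new ConcurrentHashMap<>();
  private final Map<String, String> failedVolumeMap = new ConcurrentHashMap<>();

  void addVolume(String location) {
    try {
      createHddsVolume(location); // may throw on a bad disk
      volumeMap.put(location, location);
    } catch (IOException e) {
      // Instead of silently dropping the volume, record it as failed so it
      // is reported to SCM as part of the node report.
      failedVolumeMap.put(location, location);
    }
  }

  private void createHddsVolume(String location) throws IOException {
    // placeholder for the real HddsVolume.Builder(...).build() call
  }
}
{code}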



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-262) Send SCM healthy and failed volumes in the heartbeat

2018-07-23 Thread Bharat Viswanadham (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-262:

Attachment: HDDS-262.04.patch

> Send SCM healthy and failed volumes in the heartbeat
> 
>
> Key: HDDS-262
> URL: https://issues.apache.org/jira/browse/HDDS-262
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-262.00.patch, HDDS-262.01.patch, HDDS-262.02.patch, 
> HDDS-262.03.patch, HDDS-262.04.patch
>
>
> The current code only sends volumes which are successfully created during 
> datanode startup. For any volume where an error occurred during HddsVolume object 
> creation, we should move that volume to the failedVolume map. This should be sent 
> to SCM as part of NodeReports.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-262) Send SCM healthy and failed volumes in the heartbeat

2018-07-23 Thread Bharat Viswanadham (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553666#comment-16553666
 ] 

Bharat Viswanadham commented on HDDS-262:
-

Hi [~xyao]

Thanks for the review.

Addressed your review comment in patch v03.

> Send SCM healthy and failed volumes in the heartbeat
> 
>
> Key: HDDS-262
> URL: https://issues.apache.org/jira/browse/HDDS-262
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-262.00.patch, HDDS-262.01.patch, HDDS-262.02.patch, 
> HDDS-262.03.patch
>
>
> The current code only sends volumes which are successfully created during 
> datanode startup. For any volume where an error occurred during HddsVolume object 
> creation, we should move that volume to the failedVolume map. This should be sent 
> to SCM as part of NodeReports.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-262) Send SCM healthy and failed volumes in the heartbeat

2018-07-23 Thread Bharat Viswanadham (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-262:

Attachment: HDDS-262.03.patch

> Send SCM healthy and failed volumes in the heartbeat
> 
>
> Key: HDDS-262
> URL: https://issues.apache.org/jira/browse/HDDS-262
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-262.00.patch, HDDS-262.01.patch, HDDS-262.02.patch, 
> HDDS-262.03.patch
>
>
> The current code only sends volumes which are successfully created during 
> datanode startup. For any volume where an error occurred during HddsVolume object 
> creation, we should move that volume to the failedVolume map. This should be sent 
> to SCM as part of NodeReports.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-285) Create a generic Metadata Iterator

2018-07-23 Thread Bharat Viswanadham (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-285:

Fix Version/s: 0.2.1

> Create a generic Metadata Iterator
> --
>
> Key: HDDS-285
> URL: https://issues.apache.org/jira/browse/HDDS-285
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-285.00.patch, HDDS-285.01.patch
>
>
> This Jira is to track the work to have a wrapper class Iterator for 
> MetadataStore and to use that iterator when iterating over the DB.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-285) Create a generic Metadata Iterator

2018-07-23 Thread Bharat Viswanadham (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553644#comment-16553644
 ] 

Bharat Viswanadham edited comment on HDDS-285 at 7/24/18 1:16 AM:
--

Fixed the issues Jenkins reported.

With this, the {{Assert.}} prefix needs to be removed from the other test 
methods in the test class; that is done along with this jira.


was (Author: bharatviswa):
Fixed Jenkins reported issues.

> Create a generic Metadata Iterator
> --
>
> Key: HDDS-285
> URL: https://issues.apache.org/jira/browse/HDDS-285
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDDS-285.00.patch, HDDS-285.01.patch
>
>
> This Jira is to track the work to have a wrapper class Iterator for 
> MetadataStore and to use that iterator when iterating over the DB.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-285) Create a generic Metadata Iterator

2018-07-23 Thread Bharat Viswanadham (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-285:

Attachment: HDDS-285.01.patch

> Create a generic Metadata Iterator
> --
>
> Key: HDDS-285
> URL: https://issues.apache.org/jira/browse/HDDS-285
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDDS-285.00.patch, HDDS-285.01.patch
>
>
> This Jira is to track the work to have a wrapper class Iterator for 
> MetadataStore and to use that iterator when iterating over the DB.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-285) Create a generic Metadata Iterator

2018-07-23 Thread Bharat Viswanadham (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-285:

Attachment: (was: HDDS-285.01.patch)

> Create a generic Metadata Iterator
> --
>
> Key: HDDS-285
> URL: https://issues.apache.org/jira/browse/HDDS-285
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDDS-285.00.patch
>
>
> This Jira is to track the work to have a wrapper class Iterator for 
> MetadataStore and to use that iterator when iterating over the DB.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-285) Create a generic Metadata Iterator

2018-07-23 Thread Bharat Viswanadham (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-285:

Attachment: HDDS-285.01.patch

> Create a generic Metadata Iterator
> --
>
> Key: HDDS-285
> URL: https://issues.apache.org/jira/browse/HDDS-285
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDDS-285.00.patch, HDDS-285.01.patch
>
>
> This Jira is to track the work to have a wrapper class Iterator for 
> MetadataStore and to use that iterator when iterating over the DB.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-285) Create a generic Metadata Iterator

2018-07-23 Thread Bharat Viswanadham (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-285:

Attachment: HDDS-285.01.patch

> Create a generic Metadata Iterator
> --
>
> Key: HDDS-285
> URL: https://issues.apache.org/jira/browse/HDDS-285
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDDS-285.00.patch
>
>
> This Jira is to track the work to have a wrapper class Iterator for 
> MetadataStore and to use that iterator when iterating over the DB.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-285) Create a generic Metadata Iterator

2018-07-23 Thread Bharat Viswanadham (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-285:

Attachment: (was: HDDS-285.01.patch)

> Create a generic Metadata Iterator
> --
>
> Key: HDDS-285
> URL: https://issues.apache.org/jira/browse/HDDS-285
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDDS-285.00.patch
>
>
> This Jira is to track the work to have a wrapper class Iterator for 
> MetadataStore and to use that iterator when iterating over the DB.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-282) Consolidate logging in scm/container-service

2018-07-23 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553643#comment-16553643
 ] 

Xiaoyu Yao commented on HDDS-282:
-

Thanks [~elek] for fixing this, +1 for the v1 patch. I will commit it shortly.

> Consolidate logging in scm/container-service 
> -
>
> Key: HDDS-282
> URL: https://issues.apache.org/jira/browse/HDDS-282
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-282.001.patch
>
>
> During real-cluster tests, I found some of the logging/error handling very annoying.
> I propose to improve the following behaviour:
>  # In case of datanode -> scm communication failure we don't log the 
> exception (EndpointStateMachine.java:L206). As the messages are already 
> throttled, I think it's safe to log the exception.
>  # In BlockDeletingService:L123, I would log the message (Plan to choose {} 
> containers for block deletion, actually returns {} valid containers) only if 
> the number of valid containers is greater than 0.
>  # EventQueue could log a warning if a handler is missing for a message, 
> instead of throwing an exception (see the sketch below).
>  # TypedEvent should have a toString method (as it's used in the EventQueue 
> logging).
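A minimal sketch of point 3, with illustrative names for the handler map and 
payload type (this is not the HDDS EventQueue API):

{code:java}
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class EventQueueSketch<P> {
  private static final Logger LOG =
      LoggerFactory.getLogger(EventQueueSketch.class);

  // Event-type name -> registered handlers.
  private final Map<String, List<Consumer<P>>> handlers =
      new ConcurrentHashMap<>();

  public void fireEvent(String eventType, P payload) {
    List<Consumer<P>> registered = handlers.get(eventType);
    if (registered == null || registered.isEmpty()) {
      // A missing handler degrades to a warning rather than an exception,
      // so one unwired event type cannot break the publisher.
      LOG.warn("No event handler registered for event {}", eventType);
      return;
    }
    registered.forEach(h -> h.accept(payload));
  }
}
{code}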



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-285) Create a generic Metadata Iterator

2018-07-23 Thread Bharat Viswanadham (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553644#comment-16553644
 ] 

Bharat Viswanadham commented on HDDS-285:
-

Fixed Jenkins reported issues.

> Create a generic Metadata Iterator
> --
>
> Key: HDDS-285
> URL: https://issues.apache.org/jira/browse/HDDS-285
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDDS-285.00.patch
>
>
> This Jira is to track the work to have a wrapper class Iterator for 
> MetadataStore and to use that iterator when iterating over the DB.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13322) fuse dfs - uid persists when switching between ticket caches

2018-07-23 Thread Istvan Fajth (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Istvan Fajth updated HDFS-13322:

Attachment: TestFuse2.java
TestFuse.java
perftest_old_behaviour_10k_different_1KB.txt
perftest_old_behaviour_1MB.txt
perftest_old_behaviour_1KB.txt
perftest_old_behaviour_1B.txt
perftest_new_behaviour_10k_different_1KB.txt
perftest_new_behaviour_1MB.txt
perftest_new_behaviour_1KB.txt
perftest_new_behaviour_1B.txt
catter2.sh
catter.sh

> fuse dfs - uid persists when switching between ticket caches
> 
>
> Key: HDFS-13322
> URL: https://issues.apache.org/jira/browse/HDFS-13322
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fuse-dfs
>Affects Versions: 2.6.0
> Environment: Linux xx.xx.xx.xxx 3.10.0-514.el7.x86_64 #1 SMP Wed 
> Oct 19 11:24:13 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
>  
>Reporter: Alex Volskiy
>Assignee: Istvan Fajth
>Priority: Minor
> Attachments: HDFS-13322.001.patch, HDFS-13322.002.patch, 
> HDFS-13322.003.patch, TestFuse.java, TestFuse2.java, catter.sh, catter2.sh, 
> perftest_new_behaviour_10k_different_1KB.txt, perftest_new_behaviour_1B.txt, 
> perftest_new_behaviour_1KB.txt, perftest_new_behaviour_1MB.txt, 
> perftest_old_behaviour_10k_different_1KB.txt, perftest_old_behaviour_1B.txt, 
> perftest_old_behaviour_1KB.txt, perftest_old_behaviour_1MB.txt, 
> testHDFS-13322.sh, test_after_patch.out, test_before_patch.out
>
>
> The symptoms of this issue are the same as described in HDFS-3608 except the 
> workaround that was applied (detect changes in UID ticket cache) doesn't 
> resolve the issue when multiple ticket caches are in use by the same user.
> Our use case requires that a job scheduler running as a specific uid obtain 
> separate kerberos sessions per job and that each of these sessions use a 
> separate cache. When switching sessions this way, no change is made to the 
> original ticket cache so the cached filesystem instance doesn't get 
> regenerated.
>  
> {{$ export KRB5CCNAME=/tmp/krb5cc_session1}}
> {{$ kinit user_a@domain}}
> {{$ touch /fuse_mount/tmp/testfile1}}
> {{$ ls -l /fuse_mount/tmp/testfile1}}
> {{ *-rwxrwxr-x 1 user_a user_a 0 Mar 21 13:37 /fuse_mount/tmp/testfile1*}}
> {{$ export KRB5CCNAME=/tmp/krb5cc_session2}}
> {{$ kinit user_b@domain}}
> {{$ touch /fuse_mount/tmp/testfile2}}
> {{$ ls -l /fuse_mount/tmp/testfile2}}
> {{ *-rwxrwxr-x 1 user_a user_a 0 Mar 21 13:37 /fuse_mount/tmp/testfile2*}}
> {{   }}{color:#d04437}*{{** expected owner to be user_b **}}*{color}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13322) fuse dfs - uid persists when switching between ticket caches

2018-07-23 Thread Istvan Fajth (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553641#comment-16553641
 ] 

Istvan Fajth commented on HDFS-13322:
-

Hello [~fabbri],

finally I was able to find some time to set up the environment for testing, and 
to prepare and run the tests.

I am attaching the test code that I have used, and also the measurements from 
10 runs of both tests on the old and the new fuse code.

The tests were run on a single Linux VM against a CDH 5.14 cluster with 3 
DataNodes. The random files I used were a 1MB and a 1KB file created with 
/dev/urandom as the source, and a 1-byte file that contained the letter "a":
||test version||1MB file read * 10k avg.||1KB file read * 10k avg.||1B file read * 10k avg.||10k different 10KB files||
|Original version with catter.sh|174.064 sec|78.725 sec|79.195 sec|90.683 sec|
|Patched version with catter.sh|180.675 sec|81.028 sec|81.187 sec|92.859 sec|
|*Performance degradation*|*3.8%*|*2.9%*|*2.5%*|*2.4%*|
|Original version with TestFuse.java|137.159 sec|65.982 sec|65.713 sec|67.411 sec|
|Patched version with TestFuse.java|139.095 sec|68.457 sec|68.919 sec|69.101 sec|
|*Performance degradation*|*1.4%*|*3.8%*|*4.9%*|*2.5%*|

After running these tests, I checked whether the page cache has any effect, and 
tried with 10k different 10KB files generated from /dev/urandom as well. It 
seems that below a certain size, the network traffic of fetching the data is 
not really a factor, which made me suspicious, hence the run with distinct 
files as the last step.

As the results show, the performance degradation due to the change in the 
proposed patch is under 5% in all of the scenarios I have tested, and mostly 
in the 2-4% range.

Let me know if you have any observations on the provided code and perf tests, 
and also whether these values seem acceptable.
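For reference, the attached TestFuse.java is essentially a timing loop of this 
shape (a sketch, not the attached file itself; the mount path and iteration 
count are the ones described above):

{code:java}
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class FuseReadBench {
  public static void main(String[] args) throws Exception {
    Path file =
        Paths.get(args.length > 0 ? args[0] : "/fuse_mount/tmp/randomfile");
    int iterations = 10_000;
    long start = System.nanoTime();
    for (int i = 0; i < iterations; i++) {
      Files.readAllBytes(file); // each read goes through the fuse-dfs mount
    }
    double seconds = (System.nanoTime() - start) / 1e9;
    System.out.printf("%d reads of %s in %.3f sec%n", iterations, file, seconds);
  }
}
{code}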

> fuse dfs - uid persists when switching between ticket caches
> 
>
> Key: HDFS-13322
> URL: https://issues.apache.org/jira/browse/HDFS-13322
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fuse-dfs
>Affects Versions: 2.6.0
> Environment: Linux xx.xx.xx.xxx 3.10.0-514.el7.x86_64 #1 SMP Wed 
> Oct 19 11:24:13 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
>  
>Reporter: Alex Volskiy
>Assignee: Istvan Fajth
>Priority: Minor
> Attachments: HDFS-13322.001.patch, HDFS-13322.002.patch, 
> HDFS-13322.003.patch, testHDFS-13322.sh, test_after_patch.out, 
> test_before_patch.out
>
>
> The symptoms of this issue are the same as described in HDFS-3608 except the 
> workaround that was applied (detect changes in UID ticket cache) doesn't 
> resolve the issue when multiple ticket caches are in use by the same user.
> Our use case requires that a job scheduler running as a specific uid obtain 
> separate kerberos sessions per job and that each of these sessions use a 
> separate cache. When switching sessions this way, no change is made to the 
> original ticket cache so the cached filesystem instance doesn't get 
> regenerated.
>  
> {{$ export KRB5CCNAME=/tmp/krb5cc_session1}}
> {{$ kinit user_a@domain}}
> {{$ touch /fuse_mount/tmp/testfile1}}
> {{$ ls -l /fuse_mount/tmp/testfile1}}
> {{ *-rwxrwxr-x 1 user_a user_a 0 Mar 21 13:37 /fuse_mount/tmp/testfile1*}}
> {{$ export KRB5CCNAME=/tmp/krb5cc_session2}}
> {{$ kinit user_b@domain}}
> {{$ touch /fuse_mount/tmp/testfile2}}
> {{$ ls -l /fuse_mount/tmp/testfile2}}
> {{ *-rwxrwxr-x 1 user_a user_a 0 Mar 21 13:37 /fuse_mount/tmp/testfile2*}}
> {{   }}{color:#d04437}*{{** expected owner to be user_b **}}*{color}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-3584) Blocks are getting marked as corrupt with append operation under high load.

2018-07-23 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-3584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553637#comment-16553637
 ] 

Wei-Chiu Chuang commented on HDFS-3584:
---

bq. I think we need to block the older clients to close the file at this stage?
My take is that once recoverLease(), or whatever call starts lease recovery, 
reaches the NameNode, the NN should reject follow-up close()/recoverLease() 
calls until it is able to recover the lease. If the lease recovery hasn't 
completed, you just can't safely assume the file can be closed (e.g. the file 
may not have a sufficient number of replicas).

bq. what if append call takes the new lease ownership and removes the older 
client lease?
Meaning append takes over the lease without a lease recovery, so it doesn't 
bump up the GS? That doesn't sound right to me.
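A sketch of the client-side discipline implied above: after {{recoverLease()}}, 
poll {{isFileClosed()}} instead of racing a {{close()}} against the recovery. 
{{DistributedFileSystem.recoverLease()}} and {{isFileClosed()}} are real APIs; 
the retry policy here is illustrative:

{code:java}
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public final class LeaseRecoverySketch {
  static boolean recoverAndWait(DistributedFileSystem dfs, Path p,
      long timeoutMs) throws Exception {
    long deadline = System.currentTimeMillis() + timeoutMs;
    boolean recovered = dfs.recoverLease(p);
    while (!recovered && System.currentTimeMillis() < deadline) {
      Thread.sleep(1000L);
      recovered = dfs.isFileClosed(p); // true once the NN finalized the file
    }
    return recovered;
  }
}
{code}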

> Blocks are getting marked as corrupt with append operation under high load.
> ---
>
> Key: HDFS-3584
> URL: https://issues.apache.org/jira/browse/HDFS-3584
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Brahma Reddy Battula
>Priority: Major
>
> Scenario:
> = 
> 1. There are 2 clients, cli1 and cli2; cli1 writes a file F1 and does not close it
> 2. cli2 then calls append on the unclosed file, which triggers a lease recovery
> 3. cli1 closes the file
> 4. Lease recovery completes with an updated GS on the DN; when the BlockReport 
> arrives, the GS mismatch marks the block as corrupt
> 5. Now we get a CommitBlockSync, which also fails since the file is already 
> closed by cli1 and its state in the NN is Finalized



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13761) Add toString Method to AclFeature Class

2018-07-23 Thread Xiao Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553598#comment-16553598
 ] 

Xiao Chen commented on HDFS-13761:
--

Thanks for filing the jira and providing a patch [~shwetayakkali].

Can we please add the class name in front of the toString? Otherwise it will be 
hard to tell from the log what this message is about. In general it would also 
be helpful to provide an example of what the log message will show exactly. 
(Right now it looks like: {{20059 Size of entries : 2}}.) Minor, but I think 
printing out the hashCode in hexadecimal is more canonical.
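Something along these lines, perhaps (a sketch only; the real AclFeature stores 
its entries differently):

{code:java}
public class AclFeatureSketch {
  private final int[] entries = new int[2]; // stand-in for the encoded entries

  @Override
  public String toString() {
    // Class name up front so the log line is self-describing, and the
    // hashCode printed in hexadecimal as suggested.
    return "AclFeature@" + Integer.toHexString(hashCode())
        + ", entries: " + entries.length;
  }
}
{code}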

> Add toString Method to AclFeature Class
> ---
>
> Key: HDFS-13761
> URL: https://issues.apache.org/jira/browse/HDFS-13761
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Shweta
>Assignee: Shweta
>Priority: Minor
> Attachments: HDFS-13761.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13217) Log audit event only used last EC policy name when add multiple policies from file

2018-07-23 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553594#comment-16553594
 ] 

genericqa commented on HDFS-13217:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
31s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 31m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  9s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 19s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 91m 59s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}162m 48s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.client.impl.TestBlockReaderLocal |
|   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13217 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12920137/HDFS-13217.004.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux b55acedfdf05 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 17e2616 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24642/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24642/testReport/ |
| Max. process+thread count | 2969 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |

[jira] [Updated] (HDFS-13076) [SPS]: Cleanup work for HDFS-10285

2018-07-23 Thread Uma Maheswara Rao G (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-13076:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: HDFS-10285
   Status: Resolved  (was: Patch Available)

> [SPS]: Cleanup work for HDFS-10285
> --
>
> Key: HDFS-13076
> URL: https://issues.apache.org/jira/browse/HDFS-13076
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Major
> Fix For: HDFS-10285
>
> Attachments: HDFS-10285-consolidated-merge-patch-00.patch, 
> HDFS-10285-consolidated-merge-patch-01.patch, HDFS-13076-HDFS-10285-00.patch, 
> HDFS-13076-HDFS-10285-01.patch, HDFS-13076-HDFS-10285-02.patch, 
> HDFS-13076-HDFS-10285-03.patch
>
>
> This Jira is to run aggregated HDFS-10285 branch patch against trunk and 
> check for any jenkins issues.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-13076) [SPS]: Cleanup work for HDFS-10285

2018-07-23 Thread Uma Maheswara Rao G (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G reassigned HDFS-13076:
--

Assignee: Rakesh R

> [SPS]: Cleanup work for HDFS-10285
> --
>
> Key: HDFS-13076
> URL: https://issues.apache.org/jira/browse/HDFS-13076
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
>Priority: Major
> Fix For: HDFS-10285
>
> Attachments: HDFS-10285-consolidated-merge-patch-00.patch, 
> HDFS-10285-consolidated-merge-patch-01.patch, HDFS-13076-HDFS-10285-00.patch, 
> HDFS-13076-HDFS-10285-01.patch, HDFS-13076-HDFS-10285-02.patch, 
> HDFS-13076-HDFS-10285-03.patch
>
>
> This Jira is to run aggregated HDFS-10285 branch patch against trunk and 
> check for any jenkins issues.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-285) Create a generic Metadata Iterator

2018-07-23 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553528#comment-16553528
 ] 

genericqa commented on HDDS-285:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  0s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 25s{color} 
| {color:red} hadoop-hdds_common generated 9 new + 0 unchanged - 0 fixed = 9 
total (was 0) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 26s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
51s{color} | {color:green} common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 62m  0s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDDS-285 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12932783/HDDS-285.00.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 49fa7b11405e 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 17e2616 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| javac | 
https://builds.apache.org/job/PreCommit-HDDS-Build/595/artifact/out/diff-compile-javac-hadoop-hdds_common.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDDS-Build/595/testReport/ |
| Max. process+thread count | 336 (vs. ulimit of 1) |
| modules | C: hadoop-hdds/common U: hadoop-hdds/common |
| Console output | 
https://builds.apache.org/job/PreCommit-HDDS-Build/595/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Create a generic Metadata Iterator

[jira] [Commented] (HDFS-13076) [SPS]: Cleanup work for HDFS-10285

2018-07-23 Thread Uma Maheswara Rao G (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553520#comment-16553520
 ] 

Uma Maheswara Rao G commented on HDFS-13076:


+1 Thanks Rakesh for the cleanup. I have pushed it to the branch.

[~surendrasingh], please take the latest code for your verification. Thanks

> [SPS]: Cleanup work for HDFS-10285
> --
>
> Key: HDFS-13076
> URL: https://issues.apache.org/jira/browse/HDFS-13076
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Priority: Major
> Attachments: HDFS-10285-consolidated-merge-patch-00.patch, 
> HDFS-10285-consolidated-merge-patch-01.patch, HDFS-13076-HDFS-10285-00.patch, 
> HDFS-13076-HDFS-10285-01.patch, HDFS-13076-HDFS-10285-02.patch, 
> HDFS-13076-HDFS-10285-03.patch
>
>
> This Jira is to run aggregated HDFS-10285 branch patch against trunk and 
> check for any jenkins issues.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-3584) Blocks are getting marked as corrupt with append operation under high load.

2018-07-23 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-3584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553493#comment-16553493
 ] 

Wei-Chiu Chuang commented on HDFS-3584:
---

This is definitely the same as HDFS-10240, and it is reproducible even with one 
client thread.

> Blocks are getting marked as corrupt with append operation under high load.
> ---
>
> Key: HDFS-3584
> URL: https://issues.apache.org/jira/browse/HDFS-3584
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Brahma Reddy Battula
>Priority: Major
>
> Scenario:
> = 
> 1. There are 2 clients, cli1 and cli2; cli1 writes a file F1 and does not close it
> 2. cli2 then calls append on the unclosed file, which triggers a lease recovery
> 3. cli1 closes the file
> 4. Lease recovery completes with an updated GS on the DN; when the BlockReport 
> arrives, the GS mismatch marks the block as corrupt
> 5. Now we get a CommitBlockSync, which also fails since the file is already 
> closed by cli1 and its state in the NN is Finalized



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13757) After HDFS-12886, close() can throw AssertionError "Negative replicas!"

2018-07-23 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553488#comment-16553488
 ] 

Wei-Chiu Chuang commented on HDFS-13757:


It wouldn't fail with the "Negative replicas" assertion before HDFS-12886.

> After HDFS-12886, close() can throw AssertionError "Negative replicas!"
> ---
>
> Key: HDFS-13757
> URL: https://issues.apache.org/jira/browse/HDFS-13757
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.0, 2.10.0, 2.9.1, 3.2.0, 3.0.3
>Reporter: Wei-Chiu Chuang
>Priority: Major
> Attachments: HDFS-13757.test.02.patch, HDFS-13757.test.patch
>
>
> While investigating a data corruption bug caused by concurrent recoverLease() 
> and close(), I found HDFS-12886 may cause close() to throw an AssertionError 
> under a corner case, because the block has zero live replicas and the client 
> calls recoverLease() immediately followed by close().
> {noformat}
> org.apache.hadoop.ipc.RemoteException(java.lang.AssertionError): Negative 
> replicas!
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.LowRedundancyBlocks.getPriority(LowRedundancyBlocks.java:197)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.LowRedundancyBlocks.update(LowRedundancyBlocks.java:422)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.updateNeededReconstructions(BlockManager.java:4274)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.commitOrCompleteLastBlock(BlockManager.java:1001)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.commitOrCompleteLastBlock(FSNamesystem.java:3471)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.completeFileInternal(FSDirWriteFileOp.java:713)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.completeFile(FSDirWriteFileOp.java:671)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:2854)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:928)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:607)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1689)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
> {noformat}
> I have a test case to reproduce it.
> [~lukmajercak] [~elgoiri] would you please take a look at it? I think we 
> should add a check to reject completeFile() if the block is under recovery, 
> similar to what's proposed in HDFS-10240.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13688) Introduce msync API call

2018-07-23 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553483#comment-16553483
 ] 

genericqa commented on HDFS-13688:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-12943 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
47s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 
 9s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 30m 
28s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  4m 
13s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 11s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
45s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
51s{color} | {color:green} HDFS-12943 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 38m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 38m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 38m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  4m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 22s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 
37s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
42s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 
55s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  2m 20s{color} 
| {color:red} hadoop-hdfs-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 98m  1s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 16m 
58s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
47s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}292m 25s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Thread passed where Runnable expected in 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.msync(RpcController,
 ClientNamenodeProtocolProtos$MsyncRequestProto)  At 
ClientNamenodeProtocolServerSideTranslatorPB.java:in 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB |

[jira] [Updated] (HDDS-285) Create a generic Metadata Iterator

2018-07-23 Thread Bharat Viswanadham (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-285:

Description: This Jira is to track the work to have a wrapper class 
Iterator for MetadataStore and use that iterator during iterating db.  (was: 
This Jira is to track the work to have a wrapper class Iterator and use that 
iterator during iterating db.)

> Create a generic Metadata Iterator
> --
>
> Key: HDDS-285
> URL: https://issues.apache.org/jira/browse/HDDS-285
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDDS-285.00.patch
>
>
> This Jira is to track the work to have a wrapper class Iterator for 
> MetadataStore and to use that iterator when iterating over the DB.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13752) fs.Path stores file path in java.net.URI causes big memory waste

2018-07-23 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-13752:
---
Affects Version/s: 2.7.6

> fs.Path stores file path in java.net.URI causes big memory waste
> 
>
> Key: HDFS-13752
> URL: https://issues.apache.org/jira/browse/HDFS-13752
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 2.7.6
> Environment: Hive 2.1.1 and hadoop 2.7.6 
>Reporter: Barnabas Maidics
>Priority: Major
> Attachments: Screen Shot 2018-07-20 at 11.12.38.png, 
> heapdump-10partitions.html
>
>
> I was looking at HiveServer2 memory usage, and a big percentage of it was 
> due to org.apache.hadoop.fs.Path, which stores file paths in a 
> java.net.URI object. The URI implementation stores the same string in 3 
> different objects (see the attached image). In Hive, when there are many 
> partitions, this causes big memory usage. In my particular case 42% of memory 
> was used by java.net.URI, so it could be reduced to 14%. 
> I wonder if the community is open to replacing it with a more memory-efficient 
> implementation, and what other things should be considered here? It could be a 
> huge memory improvement for Hadoop and for Hive as well.
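The duplication is easy to see directly (a small demo; the path is made up):

{code:java}
import java.net.URI;

public class UriMemoryDemo {
  public static void main(String[] args) throws Exception {
    URI u = new URI("hdfs", "nn-host", "/user/hive/warehouse/part-0", null);
    // The URI keeps several overlapping String fields alive at once, each
    // with its own char[] holding largely the same characters:
    String full = u.toString();              // "hdfs://nn-host/user/hive/warehouse/part-0"
    String ssp  = u.getSchemeSpecificPart(); // "//nn-host/user/hive/warehouse/part-0"
    String path = u.getPath();               // "/user/hive/warehouse/part-0"
    System.out.println(full + "\n" + ssp + "\n" + path);
  }
}
{code}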



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13752) fs.Path stores file path in java.net.URI causes big memory waste

2018-07-23 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-13752:
---
Environment: Hive 2.1.1 and hadoop 2.7.6 

> fs.Path stores file path in java.net.URI causes big memory waste
> 
>
> Key: HDFS-13752
> URL: https://issues.apache.org/jira/browse/HDFS-13752
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: fs
> Environment: Hive 2.1.1 and hadoop 2.7.6 
>Reporter: Barnabas Maidics
>Priority: Major
> Attachments: Screen Shot 2018-07-20 at 11.12.38.png, 
> heapdump-10partitions.html
>
>
> I was looking at HiveServer2 memory usage, and a big percentage of it was
> due to org.apache.hadoop.fs.Path, which stores file paths in a
> java.net.URI object. The URI implementation stores the same string in 3
> different objects (see the attached image). In Hive, when there are many
> partitions, this causes big memory usage. In my particular case 42% of the
> memory was used by java.net.URI, so it could be reduced to 14%.
> I wonder whether the community is open to replacing it with a more memory
> efficient implementation, and what other things should be considered here?
> It could be a huge memory improvement for Hadoop and for Hive as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-285) Create a generic Metadata Iterator

2018-07-23 Thread Bharat Viswanadham (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553468#comment-16553468
 ] 

Bharat Viswanadham commented on HDDS-285:
-

This Jira adds the new classes but does not integrate them into the existing
code; that will be done in follow-up Jiras.
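For readers following along, the wrapper could look roughly like the sketch below; the names are assumptions, not the classes in the attached patch:

{code:java}
import java.util.Iterator;

/**
 * Hypothetical generic iterator over MetadataStore entries. A concrete
 * store (e.g. a LevelDB- or RocksDB-backed MetadataStore) would implement
 * it on top of its native iterator.
 */
public interface MetaStoreIterator<T> extends Iterator<T> {

  /** Position the iterator at the first entry of the store. */
  void seekToFirst();

  /** Position the iterator at the last entry of the store. */
  void seekToLast();
}
{code}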

> Create a generic Metadata Iterator
> --
>
> Key: HDDS-285
> URL: https://issues.apache.org/jira/browse/HDDS-285
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDDS-285.00.patch
>
>
> This Jira tracks the work to have a wrapper Iterator class and to use that
> iterator when iterating the db.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-271) Create a block iterator to iterate blocks in a container

2018-07-23 Thread Bharat Viswanadham (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553462#comment-16553462
 ] 

Bharat Viswanadham commented on HDDS-271:
-

Hi [~xyao]

This Jira adds a generic iterator for blocks, together with a
key-value-specific iterator for iterating over a container's blocks.

Agreed, getBlockCount() and getCurrentPos() are not required; I will remove
those methods.
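As a rough illustration of that split, the key-value-specific iterator could wrap the generic db iterator like this (names and types are assumptions, not the patch contents):

{code:java}
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical block iterator over one container's metadata entries. */
public class KeyValueBlockIterator implements Iterator<byte[]> {

  private final Iterator<Map.Entry<byte[], byte[]>> dbIterator;

  public KeyValueBlockIterator(Iterator<Map.Entry<byte[], byte[]>> dbIterator) {
    this.dbIterator = dbIterator;
  }

  @Override
  public boolean hasNext() {
    return dbIterator.hasNext();
  }

  @Override
  public byte[] next() {
    if (!dbIterator.hasNext()) {
      throw new NoSuchElementException("No more blocks in this container");
    }
    // The value bytes hold the serialized block info; deserialization
    // is left out of this sketch.
    return dbIterator.next().getValue();
  }
}
{code}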

 

> Create a block iterator to iterate blocks in a container
> 
>
> Key: HDDS-271
> URL: https://issues.apache.org/jira/browse/HDDS-271
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-271.00.patch
>
>
> Create a block iterator to scan all blocks in a container.
> This will be useful during the implementation of the container scanner.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13688) Introduce msync API call

2018-07-23 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553457#comment-16553457
 ] 

genericqa commented on HDFS-13688:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-12943 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m  
4s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
58s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 27m 
21s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
51s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 55s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
26s{color} | {color:green} HDFS-12943 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
24s{color} | {color:green} HDFS-12943 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 26m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 26m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 26m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 32s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 
11s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
24s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  1m 41s{color} 
| {color:red} hadoop-hdfs-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}103m 12s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 17m  
0s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}267m 55s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Thread passed where Runnable expected in 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.msync(RpcController,
 ClientNamenodeProtocolProtos$MsyncRequestProto)  At 
ClientNamenodeProtocolServerSideTranslatorPB.java:in 
org.apache.hadoop.hdfs.protocolPB.Cl

[jira] [Updated] (HDDS-285) Create a generic Metadata Iterator

2018-07-23 Thread Bharat Viswanadham (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-285:

Attachment: HDDS-285.00.patch

> Create a generic Metadata Iterator
> --
>
> Key: HDDS-285
> URL: https://issues.apache.org/jira/browse/HDDS-285
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDDS-285.00.patch
>
>
> This Jira tracks the work to have a wrapper Iterator class and to use that
> iterator when iterating the db.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-285) Create a generic Metadata Iterator

2018-07-23 Thread Bharat Viswanadham (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-285:

Status: Patch Available  (was: In Progress)

> Create a generic Metadata Iterator
> --
>
> Key: HDDS-285
> URL: https://issues.apache.org/jira/browse/HDDS-285
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDDS-285.00.patch
>
>
> This Jira tracks the work to have a wrapper Iterator class and to use that
> iterator when iterating the db.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work started] (HDDS-285) Create a generic Metadata Iterator

2018-07-23 Thread Bharat Viswanadham (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDDS-285 started by Bharat Viswanadham.
---
> Create a generic Metadata Iterator
> --
>
> Key: HDDS-285
> URL: https://issues.apache.org/jira/browse/HDDS-285
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>
> This Jira tracks the work to have a wrapper Iterator class and to use that
> iterator when iterating the db.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-285) Create a generic Metadata Iterator

2018-07-23 Thread Bharat Viswanadham (JIRA)
Bharat Viswanadham created HDDS-285:
---

 Summary: Create a generic Metadata Iterator
 Key: HDDS-285
 URL: https://issues.apache.org/jira/browse/HDDS-285
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Bharat Viswanadham
Assignee: Bharat Viswanadham


This Jira tracks the work to have a wrapper Iterator class and to use that
iterator when iterating the db.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-262) Send SCM healthy and failed volumes in the heartbeat

2018-07-23 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553433#comment-16553433
 ] 

Xiaoyu Yao commented on HDDS-262:
-

Thanks [~bharatviswa] for the patch. Patch v2 looks good to me. I just have one 
minor suggestion:

TestVolumeSet.java

Lines 245-246: can we wrap this in a try{} finally{} so that the read-only
test dirs are cleaned up after the test?

+1 once that is fixed.
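A minimal sketch of the suggested cleanup shape (the directory name and surrounding test code are hypothetical):

{code:java}
import java.io.File;
import org.apache.hadoop.fs.FileUtil;

public class ReadOnlyDirCleanupSketch {
  void runWithReadOnlyDir(File baseDir) {
    File readOnlyDir = new File(baseDir, "readonly");  // hypothetical dir
    readOnlyDir.mkdirs();
    readOnlyDir.setReadOnly();
    try {
      // exercise the VolumeSet code under test against readOnlyDir here
    } finally {
      // restore permissions so the directory can be deleted after the test
      readOnlyDir.setWritable(true);
      FileUtil.fullyDelete(readOnlyDir);
    }
  }
}
{code}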

> Send SCM healthy and failed volumes in the heartbeat
> 
>
> Key: HDDS-262
> URL: https://issues.apache.org/jira/browse/HDDS-262
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-262.00.patch, HDDS-262.01.patch, HDDS-262.02.patch
>
>
> The current code only sends volumes which were successfully created during
> datanode startup. For any volume where an error occurred during HddsVolume
> object creation, we should move that volume to the failedVolume map. This
> should be sent to SCM as part of the NodeReports.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-271) Create a block iterator to iterate blocks in a container

2018-07-23 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553423#comment-16553423
 ] 

Xiaoyu Yao commented on HDDS-271:
-

bq. do we really need those APIs for an Iterator?
[~nandakumar131], I have a similar question as well. Correct me if I'm wrong
[~bharatviswa], but it seems we are trying to build a generic iterator for
container metadata on top of the current one, which is more tightly tied to
key-value container metadata.

[~ajayydv] the container db is indexed by a flat block id space; we don't have
a common prefix except for deleted blocks or secondary indexes. Agree with
[~anu], we can add advanced filters later as new use cases come up.

> Create a block iterator to iterate blocks in a container
> 
>
> Key: HDDS-271
> URL: https://issues.apache.org/jira/browse/HDDS-271
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-271.00.patch
>
>
> Create a block iterator to scan all blocks in a container.
> This will be useful during the implementation of the container scanner.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13217) Log audit event only used last EC policy name when add multiple policies from file

2018-07-23 Thread Xiao Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553415#comment-16553415
 ] 

Xiao Chen commented on HDFS-13217:
--

Hi [~liaoyuxiangqin], do you want to get the last checkstyle warning addressed, 
and push this through the finish line? Thanks for your contribution on this.

> Log audit event only used last EC policy name when add multiple policies from 
> file 
> ---
>
> Key: HDFS-13217
> URL: https://issues.apache.org/jira/browse/HDFS-13217
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: liaoyuxiangqin
>Assignee: liaoyuxiangqin
>Priority: Major
> Attachments: HDFS-13217.001.patch, HDFS-13217.002.patch, 
> HDFS-13217.003.patch, HDFS-13217.004.patch
>
>
> When I read the addErasureCodingPolicies() method of the FSNamesystem class
> in the NameNode, I found that the following code only uses the last EC policy
> name for logAuditEvent; this audit log cannot track all of the policies when
> multiple erasure coding policies are added to the ErasureCodingPolicyManager.
> Thanks.
> {code:java|title=FSNamesystem.java|borderStyle=solid}
> try {
>   checkOperation(OperationCategory.WRITE);
>   checkNameNodeSafeMode("Cannot add erasure coding policy");
>   for (ErasureCodingPolicy policy : policies) {
> try {
>   ErasureCodingPolicy newPolicy =
>   FSDirErasureCodingOp.addErasureCodingPolicy(this, policy,
>   logRetryCache);
>   addECPolicyName = newPolicy.getName();
>   responses.add(new AddErasureCodingPolicyResponse(newPolicy));
> } catch (HadoopIllegalArgumentException e) {
>   responses.add(new AddErasureCodingPolicyResponse(policy, e));
> }
>   }
>   success = true;
>   return responses.toArray(new AddErasureCodingPolicyResponse[0]);
> } finally {
>   writeUnlock(operationName);
>   if (success) {
> getEditLog().logSync();
>   }
>   logAuditEvent(success, operationName,addECPolicyName, null, null);
> }
> {code}
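One possible shape of the fix, sketched from the snippet above (the list variable is an assumption, not the attached patch):

{code:java}
// Collect every successfully added policy name instead of overwriting a
// single variable, then log them all in the audit event.
List<String> addECPolicyNames = new ArrayList<>();
for (ErasureCodingPolicy policy : policies) {
  try {
    ErasureCodingPolicy newPolicy =
        FSDirErasureCodingOp.addErasureCodingPolicy(this, policy,
            logRetryCache);
    addECPolicyNames.add(newPolicy.getName());
    responses.add(new AddErasureCodingPolicyResponse(newPolicy));
  } catch (HadoopIllegalArgumentException e) {
    responses.add(new AddErasureCodingPolicyResponse(policy, e));
  }
}
// ... and in the finally block:
logAuditEvent(success, operationName,
    String.join(", ", addECPolicyNames), null, null);
{code}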



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13658) fsck, dfsadmin -report, and NN WebUI should report number of blocks that have 1 replica

2018-07-23 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553398#comment-16553398
 ] 

genericqa commented on HDFS-13658:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m  
4s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 
 9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 36s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m  
8s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 29m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 29m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 40s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  7m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m  
8s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m 
32s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
48s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}102m  8s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 18m 36s{color} 
| {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
56s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}279m 16s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.client.impl.TestBlockReaderLocal |
|   | hadoop.fs.contract.router.web.TestRouterWebHDFSContractAppend |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13658 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12932726/HDFS-13658.008.patch |
| Optional Tests |  asflicense  mvnsite  compile  javac  javadoc

[jira] [Commented] (HDDS-203) Add getCommittedBlockLength API in datanode

2018-07-23 Thread Tsz Wo Nicholas Sze (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553385#comment-16553385
 ] 

Tsz Wo Nicholas Sze commented on HDDS-203:
--

Thanks Shash for the new patch.  Some thoughts:
- It seems quite expensive to parse and create the entire KeyData object in
order to get the block length.  How about either (1) storing the size as a
field in KeyData or (2) retrieving the chunk sizes without parsing the KeyData
object?  It seems that (1) is better, although it needs more work.
- Let's have blockID in GetCommittedBlockLengthResponseProto.
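For context, option (2) amounts to something like the sketch below; the accessor names follow the KeyData/ChunkInfo shapes loosely and are assumptions:

{code:java}
// Derive the committed block length by summing the chunk lengths recorded
// in the parsed KeyData; option (1) would persist this value directly.
static long committedBlockLength(KeyData keyData) {
  long length = 0;
  for (ContainerProtos.ChunkInfo chunk : keyData.getChunks()) {
    length += chunk.getLen();
  }
  return length;
}
{code}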


> Add getCommittedBlockLength API in datanode
> ---
>
> Key: HDDS-203
> URL: https://issues.apache.org/jira/browse/HDDS-203
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client, Ozone Datanode
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-203.00.patch, HDDS-203.01.patch, HDDS-203.02.patch, 
> HDDS-203.03.patch, HDDS-203.04.patch
>
>
> When a container gets closed on the Datanode while active writes are
> happening from the OzoneClient, client write requests will fail with
> ContainerClosedException. In such a case, the Ozone client needs to inquire
> about the last committed block length from the datanodes and update the
> OzoneMaster with the updated length for the block. This Jira proposes to add
> an RPC call to get the last committed length of a block on a Datanode.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-266) Integrate checksum into .container file

2018-07-23 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553347#comment-16553347
 ] 

genericqa commented on HDDS-266:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 7 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
54s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 46s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-ozone/integration-test {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
44s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 28m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 27s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-ozone/integration-test {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
15s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
35s{color} | {color:red} hadoop-hdds_container-service generated 1 new + 2 
unchanged - 0 fixed = 3 total (was 2) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m  
7s{color} | {color:green} common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
55s{color} | {color:green} container-service in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m 12s{color} 
| {color:red} integration-test in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
32s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}132m 47s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.web.TestDistributedOzoneVolumes |
|   | hadoop.ozone.container.common.TestBlockDeletingService |
|   | 
hadoop.ozone.container.common.statemachine.commandhandler.TestCloseContainerByPipeline
 |
|   | hadoop.ozone.freon.TestDataValidate |
|   | hadoop.ozone.TestMini

[jira] [Commented] (HDFS-13672) clearCorruptLazyPersistFiles could crash NameNode

2018-07-23 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553250#comment-16553250
 ] 

genericqa commented on HDFS-13672:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 21s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  1s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 80m 15s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}143m 49s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
|   | hadoop.tools.TestHdfsConfigFields |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.client.impl.TestBlockReaderLocal |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13672 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12932727/HDFS-13672.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 6d7229b49206 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / bbe2f62 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24636/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test 

[jira] [Commented] (HDDS-75) Ozone: Support CopyContainer

2018-07-23 Thread Nanda kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-75?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553239#comment-16553239
 ] 

Nanda kumar commented on HDDS-75:
-

[~elek], can you rebase the patch? It does not apply anymore.

> Ozone: Support CopyContainer
> 
>
> Key: HDDS-75
> URL: https://issues.apache.org/jira/browse/HDDS-75
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Reporter: Anu Engineer
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-75.005.patch, HDFS-11686-HDFS-7240.001.patch, 
> HDFS-11686-HDFS-7240.002.patch, HDFS-11686-HDFS-7240.003.patch, 
> HDFS-11686-HDFS-7240.004.patch
>
>
> Once a container is closed we need to copy the container to the correct pool
> or re-encode the container to use erasure coding. The copyContainer allows
> users to get the container as a tarball from the remote machine.
> The copyContainer is a basic step in moving the raw container data from one
> datanode to another node. It could be used by higher level components such
> as the SCM, which ensures that the replication rules are satisfied.
> The CopyContainer by default works in a pull model: the destination datanode
> could read the raw data from one or more source datanodes where the container
> exists.
> The source provides a binary representation of the container over a common
> interface which has two methods:
>  # prepare(containerName)
>  # copyData(String containerName, OutputStream destination)
> The prepare phase is called right after the closing event, and the
> implementation could prepare for the copy by pre-creating a compressed tar
> file from the container data. As a first step we can provide a simple
> implementation which creates the tar files on demand.
> The destination datanode should retry the copy if the container on the source
> node is not yet prepared.
> The raw container data is provided over HTTP. The HTTP endpoint should be
> separated from the ObjectStore REST API (similar to the distinction between
> HDFS-7240 and HDFS-13074).
> Long-term the HTTP endpoint should support HTTP Range requests: one container
> could be copied from multiple sources by the destination.
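A minimal Java sketch of that two-method source interface (everything beyond prepare/copyData is an assumption):

{code:java}
import java.io.IOException;
import java.io.OutputStream;

/** Source-side view of a closed container's raw (e.g. tarball) form. */
public interface ContainerReplicationSource {

  /** Called right after the close event; may pre-create a compressed tar. */
  void prepare(String containerName);

  /** Streams the raw container data to the destination. */
  void copyData(String containerName, OutputStream destination)
      throws IOException;
}
{code}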



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-245) Handle ContainerReports in the SCM

2018-07-23 Thread Nanda kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553231#comment-16553231
 ] 

Nanda kumar commented on HDDS-245:
--

[~elek], can you rebase the patch? It does not apply anymore.

> Handle ContainerReports in the SCM
> --
>
> Key: HDDS-245
> URL: https://issues.apache.org/jira/browse/HDDS-245
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-245.001.patch
>
>
> HDDS-242 provides a new class, ContainerReportHandler, which could handle the
> ContainerReports from the SCMHeartbeatDispatcher.
> HDDS-228 introduces a new map to store the container -> datanode[] mapping.
> HDDS-199 implements the ReplicationManager, which could send commands to the
> datanodes to copy the containers.
> To wire all these components together, we need to add the implementation to
> the ContainerReportHandler (created in HDDS-242).
> The ContainerReportHandler should process the new ContainerReportForDatanode
> events, update the containerStateMap and node2ContainerMap, calculate the
> missing/duplicate containers, and send the ReplicateCommand to the
> ReplicateManager.
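In outline, that wiring could look like the following sketch; the event and map names are taken from the description, and the handler body is an assumption:

{code:java}
// Hypothetical sketch of the report-processing flow described above.
public class ContainerReportHandler
    implements EventHandler<ContainerReportFromDatanode> {

  @Override
  public void onMessage(ContainerReportFromDatanode report,
      EventPublisher publisher) {
    // 1. update node2ContainerMap with the datanode's reported containers
    // 2. update the replica info in containerStateMap
    // 3. compute missing / duplicate containers from the two maps
    // 4. publish replicate/delete events for the ReplicationManager
  }
}
{code}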



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-258) Helper methods to generate NodeReport and ContainerReport for testing

2018-07-23 Thread Nanda kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nanda kumar updated HDDS-258:
-
Attachment: HDDS-258.003.patch

> Helper methods to generate NodeReport and ContainerReport for testing
> -
>
> Key: HDDS-258
> URL: https://issues.apache.org/jira/browse/HDDS-258
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-258.000.patch, HDDS-258.001.patch, 
> HDDS-258.002.patch, HDDS-258.003.patch
>
>
> Having helper methods to generate NodeReport and ContainerReport for testing 
> SCM will make our life easy.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-258) Helper methods to generate NodeReport and ContainerReport for testing

2018-07-23 Thread Nanda kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553224#comment-16553224
 ] 

Nanda kumar commented on HDDS-258:
--

Rebased patch v03 on top of the latest changes.

> Helper methods to generate NodeReport and ContainerReport for testing
> -
>
> Key: HDDS-258
> URL: https://issues.apache.org/jira/browse/HDDS-258
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-258.000.patch, HDDS-258.001.patch, 
> HDDS-258.002.patch, HDDS-258.003.patch
>
>
> Having helper methods to generate NodeReport and ContainerReport for testing 
> SCM will make our life easy.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13761) Add toString Method to AclFeature Class

2018-07-23 Thread Shweta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shweta updated HDFS-13761:
--
Attachment: HDFS-13761.01.patch
Status: Patch Available  (was: Open)

Hi [~xiaochen],

 

I have submitted the patch. Please review. Thank you.
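For reference, such a method is typically a one-liner along these lines (a guess at the shape, not the contents of the attached patch):

{code:java}
@Override
public String toString() {
  // getEntriesSize() is AclFeature's existing entry-count accessor
  return "AclFeature : " + Integer.toHexString(hashCode())
      + ", entries : " + getEntriesSize();
}
{code}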

> Add toString Method to AclFeature Class
> ---
>
> Key: HDFS-13761
> URL: https://issues.apache.org/jira/browse/HDFS-13761
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Shweta
>Assignee: Shweta
>Priority: Minor
> Attachments: HDFS-13761.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-272) TestBlockDeletingService is failing with DiskOutOfSpaceException

2018-07-23 Thread Nanda kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nanda kumar reassigned HDDS-272:


Assignee: Lokesh Jain

> TestBlockDeletingService is failing with DiskOutOfSpaceException
> 
>
> Key: HDDS-272
> URL: https://issues.apache.org/jira/browse/HDDS-272
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.2.1
>Reporter: Mukul Kumar Singh
>Assignee: Lokesh Jain
>Priority: Major
> Fix For: 0.2.1
>
>
> {code}
> [INFO] Running 
> org.apache.hadoop.ozone.container.common.TestBlockDeletingService
> [ERROR] Tests run: 5, Failures: 0, Errors: 5, Skipped: 0, Time elapsed: 1.337 
> s <<< FAILURE! - in 
> org.apache.hadoop.ozone.container.common.TestBlockDeletingService
> [ERROR] 
> testContainerThrottle(org.apache.hadoop.ozone.container.common.TestBlockDeletingService)
>   Time elapsed: 1.02 s  <<< ERROR!
> org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: No storage 
> location configured
>   at 
> org.apache.hadoop.ozone.container.common.volume.VolumeSet.initializeVolumeSet(VolumeSet.java:156)
>   at 
> org.apache.hadoop.ozone.container.common.volume.VolumeSet.<init>(VolumeSet.java:114)
>   at 
> org.apache.hadoop.ozone.container.common.volume.VolumeSet.<init>(VolumeSet.java:94)
>   at 
> org.apache.hadoop.ozone.container.common.TestBlockDeletingService.createToDeleteBlocks(TestBlockDeletingService.java:120)
>   at 
> org.apache.hadoop.ozone.container.common.TestBlockDeletingService.testContainerThrottle(TestBlockDeletingService.java:350)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}
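The error indicates the test builds a VolumeSet without any configured data directory; a likely fix direction, sketched here with an assumed config key and constructor shape, is to point the conf at a temp dir first:

{code:java}
// Sketch: give VolumeSet at least one storage location before the test.
OzoneConfiguration conf = new OzoneConfiguration();
conf.set("hdds.datanode.dir", testRoot.getAbsolutePath()); // assumed key
VolumeSet volumeSet = new VolumeSet(datanodeUuid, conf);   // assumed ctor
{code}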



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13523) Support observer nodes in MiniDFSCluster

2018-07-23 Thread Sherwood Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sherwood Zheng updated HDFS-13523:
--
Attachment: HDFS-13523-HDFS-12943.003.patch

> Support observer nodes in MiniDFSCluster
> 
>
> Key: HDFS-13523
> URL: https://issues.apache.org/jira/browse/HDFS-13523
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode, test
>Reporter: Erik Krogen
>Assignee: Sherwood Zheng
>Priority: Major
> Attachments: HADOOP-13523-HADOOP-12943.000.patch, 
> HADOOP-13523-HADOOP-12943.001.patch, HDFS-13523-HDFS-12943.001.patch, 
> HDFS-13523-HDFS-12943.002.patch, HDFS-13523-HDFS-12943.003.patch
>
>
> MiniDFSCluster should support Observer nodes so that we can write decent 
> integration tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13523) Support observer nodes in MiniDFSCluster

2018-07-23 Thread Sherwood Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sherwood Zheng updated HDFS-13523:
--
Attachment: (was: HDFS-13523-HDFS-12943.003.patch)

> Support observer nodes in MiniDFSCluster
> 
>
> Key: HDFS-13523
> URL: https://issues.apache.org/jira/browse/HDFS-13523
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode, test
>Reporter: Erik Krogen
>Assignee: Sherwood Zheng
>Priority: Major
> Attachments: HADOOP-13523-HADOOP-12943.000.patch, 
> HADOOP-13523-HADOOP-12943.001.patch, HDFS-13523-HDFS-12943.001.patch, 
> HDFS-13523-HDFS-12943.002.patch, HDFS-13523-HDFS-12943.003.patch
>
>
> MiniDFSCluster should support Observer nodes so that we can write decent 
> integration tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-258) Helper methods to generate NodeReport and ContainerReport for testing

2018-07-23 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553176#comment-16553176
 ] 

Xiaoyu Yao commented on HDDS-258:
-

[~nandakumar131], thanks for the update. 
TestSCMContainerPlacementCapacity#chooseDatanodes and 
TestSCMContainerPlacementRandom#chooseDatanodes need to be updated to use 
{{datanodes.add(TestUtils.randomDatanodeDetails())}} to fix the build error. 
Otherwise, looks good to me.

> Helper methods to generate NodeReport and ContainerReport for testing
> -
>
> Key: HDDS-258
> URL: https://issues.apache.org/jira/browse/HDDS-258
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-258.000.patch, HDDS-258.001.patch, 
> HDDS-258.002.patch
>
>
> Having helper methods to generate NodeReport and ContainerReport for testing 
> SCM will make our life easy.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-266) Integrate checksum into .container file

2018-07-23 Thread Hanisha Koneru (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru updated HDDS-266:

Attachment: HDDS-266.003.patch

> Integrate checksum into .container file
> ---
>
> Key: HDDS-266
> URL: https://issues.apache.org/jira/browse/HDDS-266
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-266.001.patch, HDDS-266.002.patch, 
> HDDS-266.003.patch
>
>
> Currently, each container's metadata has 2 files: a .container file and a
> .checksum file.
> In this Jira, we propose to integrate the checksum into the .container file
> itself. This will help with synchronization during container updates.
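As a sketch of the idea (the digest algorithm and field name are assumptions): compute a digest over the container metadata bytes and store it as a field inside the same file, rather than in a sibling .checksum file:

{code:java}
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class ContainerChecksumSketch {
  /** Hex digest of the .container YAML body, to be embedded as a field. */
  static String computeChecksum(String containerDataYaml) throws Exception {
    MessageDigest sha = MessageDigest.getInstance("SHA-256"); // assumed algo
    byte[] digest =
        sha.digest(containerDataYaml.getBytes(StandardCharsets.UTF_8));
    StringBuilder hex = new StringBuilder();
    for (byte b : digest) {
      hex.append(String.format("%02x", b));
    }
    return hex.toString();
  }
}
{code}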



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-266) Integrate checksum into .container file

2018-07-23 Thread Hanisha Koneru (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553174#comment-16553174
 ] 

Hanisha Koneru commented on HDDS-266:
-

Thanks for the review [~bharatviswa]. Addressed your comments in patch v03.

Test failures are unrelated.

> Integrate checksum into .container file
> ---
>
> Key: HDDS-266
> URL: https://issues.apache.org/jira/browse/HDDS-266
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-266.001.patch, HDDS-266.002.patch, 
> HDDS-266.003.patch
>
>
> Currently, each container's metadata has 2 files: a .container file and a
> .checksum file.
> In this Jira, we propose to integrate the checksum into the .container file
> itself. This will help with synchronization during container updates.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13688) Introduce msync API call

2018-07-23 Thread Chen Liang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-13688:
--
Attachment: HDFS-13688-HDFS-12943.002.patch

> Introduce msync API call
> 
>
> Key: HDFS-13688
> URL: https://issues.apache.org/jira/browse/HDFS-13688
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13688-HDFS-12943.001.patch, 
> HDFS-13688-HDFS-12943.002.patch, HDFS-13688-HDFS-12943.002.patch, 
> HDFS-13688-HDFS-12943.WIP.002.patch, HDFS-13688-HDFS-12943.WIP.patch
>
>
> As mentioned in the design doc in HDFS-12943, to ensure consistent read, we 
> need to introduce an RPC call {{msync}}. Specifically, client can issue a 
> msync call to Observer node along with a transactionID. The msync will only 
> return when the Observer's transactionID has caught up to the given ID. This 
> JIRA is to add this API.
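In pseudocode terms the contract is roughly the following sketch, which shows the semantics only (the accessor is hypothetical, and real RPC plumbing would park the handler rather than poll):

{code:java}
// Block the caller until this Observer has applied edits up to txId.
void msync(long txId) throws InterruptedException {
  while (getLastAppliedTxId() < txId) {  // hypothetical accessor
    Thread.sleep(10);                    // polling only for illustration
  }
}
{code}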



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13688) Introduce msync API call

2018-07-23 Thread Chen Liang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553152#comment-16553152
 ] 

Chen Liang commented on HDFS-13688:
---

I thought HADOOP-15610 was merged to the branch but I guess I was wrong... I
have cherry-picked HADOOP-15610, so it should actually be fixed now.
Triggering another build.

> Introduce msync API call
> 
>
> Key: HDFS-13688
> URL: https://issues.apache.org/jira/browse/HDFS-13688
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13688-HDFS-12943.001.patch, 
> HDFS-13688-HDFS-12943.002.patch, HDFS-13688-HDFS-12943.WIP.002.patch, 
> HDFS-13688-HDFS-12943.WIP.patch
>
>
> As mentioned in the design doc in HDFS-12943, to ensure consistent read, we 
> need to introduce an RPC call {{msync}}. Specifically, client can issue a 
> msync call to Observer node along with a transactionID. The msync will only 
> return when the Observer's transactionID has caught up to the given ID. This 
> JIRA is to add this API.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13672) clearCorruptLazyPersistFiles could crash NameNode

2018-07-23 Thread Andrew Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553150#comment-16553150
 ] 

Andrew Wang commented on HDFS-13672:


Hi Gabor, I took a quick look. It doesn't look like the iterator will make
forward progress between iterations, since it starts from the beginning each
time. What we need here is a tail iterator that starts at the last processed
element.

I recommend we close this as wontfix and then open a new JIRA to figure out
whether we want to disable this feature (which is incompatible) or do some
kind of smarter detection as to whether it's necessary. As an example, we
check whether encryption is being used by seeing if there are any encryption
zones created.

It's also worth asking whether a behavior change is warranted at all, since
this long blocking scan will probably only happen in debugging situations
(and we have a workaround).
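For concreteness, a batched scan built on such a tail iterator would look roughly like this (a sketch only: the resumable getCorruptReplicaBlockIterator(Block) overload is hypothetical, which is exactly the missing piece noted above):

{code:java}
// Bounded batches, releasing the write lock between rounds.
final int BATCH_SIZE = 10000;
Block lastSeen = null;                    // resume point between rounds
boolean more = true;
while (more) {
  writeLock();
  try {
    // hypothetical tail iterator that starts just after lastSeen
    Iterator<Block> it = blockManager.getCorruptReplicaBlockIterator(lastSeen);
    int processed = 0;
    while (it.hasNext() && processed < BATCH_SIZE) {
      lastSeen = it.next();
      process(lastSeen);                  // e.g. collect lazyPersist files
      processed++;
    }
    more = it.hasNext();
  } finally {
    writeUnlock();
  }
}
{code}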

> clearCorruptLazyPersistFiles could crash NameNode
> -
>
> Key: HDFS-13672
> URL: https://issues.apache.org/jira/browse/HDFS-13672
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Major
> Attachments: HDFS-13672.001.patch, HDFS-13672.002.patch, 
> HDFS-13672.003.patch
>
>
> I started a NameNode on a pretty large fsimage. Since the NameNode is started 
> without any DataNodes, all blocks (100 million) are "corrupt".
> Afterwards I observed FSNamesystem#clearCorruptLazyPersistFiles() held write 
> lock for a long time:
> {noformat}
> 18/06/12 12:37:03 INFO namenode.FSNamesystem: FSNamesystem write lock held 
> for 46024 ms via
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:945)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:198)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1689)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.clearCorruptLazyPersistFiles(FSNamesystem.java:5532)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.run(FSNamesystem.java:5543)
> java.lang.Thread.run(Thread.java:748)
> Number of suppressed write-lock reports: 0
> Longest write-lock held interval: 46024
> {noformat}
> Here's the relevant code:
> {code}
>   writeLock();
>   try {
> final Iterator<Block> it =
> blockManager.getCorruptReplicaBlockIterator();
> while (it.hasNext()) {
>   Block b = it.next();
>   BlockInfo blockInfo = blockManager.getStoredBlock(b);
>   if (blockInfo.getBlockCollection().getStoragePolicyID() == 
> lpPolicy.getId()) {
> filesToDelete.add(blockInfo.getBlockCollection());
>   }
> }
> for (BlockCollection bc : filesToDelete) {
>   LOG.warn("Removing lazyPersist file " + bc.getName() + " with no 
> replicas.");
>   changed |= deleteInternal(bc.getName(), false, false, false);
> }
>   } finally {
> writeUnlock();
>   }
> {code}
> In essence, the iteration over the corrupt replica list should be broken
> down into smaller batches to avoid a single long wait.
> Since this operation holds the NameNode write lock for more than 45 seconds,
> the default ZKFC connection timeout, an extreme case like this (100 million
> corrupt blocks) could lead to a NameNode failover.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-199) Implement ReplicationManager to handle underreplication of closed containers

2018-07-23 Thread Xiaoyu Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDDS-199:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks [~elek] for the contribution and all for the reviews. I've committed the 
latest patch to trunk.

> Implement ReplicationManager to handle underreplication of closed containers
> 
>
> Key: HDDS-199
> URL: https://issues.apache.org/jira/browse/HDDS-199
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-199.001.patch, HDDS-199.002.patch, 
> HDDS-199.003.patch, HDDS-199.004.patch, HDDS-199.005.patch, 
> HDDS-199.006.patch, HDDS-199.007.patch, HDDS-199.008.patch, 
> HDDS-199.009.patch, HDDS-199.010.patch, HDDS-199.011.patch, 
> HDDS-199.012.patch, HDDS-199.013.patch, HDDS-199.014.patch, 
> HDDS-199.015.patch, HDDS-199.016.patch, HDDS-199.017.patch
>
>
> HDDS/Ozone supports Open and Closed containers. Under specific 
> conditions (the container is full, the node has failed) the container will be 
> closed and will be replicated in a different way. The replication of Open 
> containers is handled with Ratis and the PipelineManager.
> The ReplicationManager should handle the replication of the ClosedContainers. 
> The replication information will be sent as an event 
> (UnderReplicated/OverReplicated). 
> The ReplicationManager will collect all of the events in a priority queue 
> (to replicate first the containers where more replicas are missing), calculate 
> the destination datanode (first with a very simple algorithm, later by 
> calculating scatter-width) and send the Copy/Delete container command to the 
> datanode (CommandQueue).
> A CopyCommandWatcher/DeleteCommandWatcher is also included to retry the 
> copy/delete in case of failure. This is an in-memory structure (based on 
> HDDS-195) which can requeue the underreplicated/overreplicated events to the 
> priority queue until the confirmation of the copy/delete command arrives.
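
For context, a toy sketch of the priority ordering described above (class and field names are illustrative, not the actual HDDS classes):

{code}
import java.util.Comparator;
import java.util.PriorityQueue;

/** Sketch: containers missing more replicas are polled from the queue first. */
class ReplicationQueue {
  static class Request {
    final long containerId;
    final int missingReplicas; // expected minus actual replica count

    Request(long containerId, int missingReplicas) {
      this.containerId = containerId;
      this.missingReplicas = missingReplicas;
    }
  }

  private final PriorityQueue<Request> queue = new PriorityQueue<>(
      Comparator.comparingInt((Request r) -> r.missingReplicas).reversed());

  void offer(Request request) {
    queue.offer(request);
  }

  Request poll() {
    return queue.poll(); // the most under-replicated container comes out first
  }
}
{code}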



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-284) Interleaving CRC for ChunksData

2018-07-23 Thread Bharat Viswanadham (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-284:

Attachment: Interleaving CRC and Error Detection for Containers.pdf

> Interleaving CRC for ChunksData
> ---
>
> Key: HDDS-284
> URL: https://issues.apache.org/jira/browse/HDDS-284
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: Interleaving CRC and Error Detection for Containers.pdf
>
>
> This Jira is to add CRC for chunks data.
> Right now, in chunkInfo, data is just a byte array. We want to change this as 
> below:
> {code}
> message Data {
>   required string magic = 1 [default = "Tullys00"];
>   required CRCTYPE crcType = 2;
>   optional string legacyMetadata = 3; // Fields to support in-place data migration
>   optional string legacyData = 4;
>   optional uint32 checksumBlockSize = 5; // Size of the block used to compute the checksums
>   repeated uint32 checksums = 6; // Set of checksums
>   repeated bytes data = 7; // Actual data stream
> }
> {code}
> This will help in error detection for containers during the container scanner.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-284) Interleaving CRC for ChunksData

2018-07-23 Thread Bharat Viswanadham (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553137#comment-16553137
 ] 

Bharat Viswanadham commented on HDDS-284:
-

Attached the design document from [~anu] for reference.

> Interleaving CRC for ChunksData
> ---
>
> Key: HDDS-284
> URL: https://issues.apache.org/jira/browse/HDDS-284
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: Interleaving CRC and Error Detection for Containers.pdf
>
>
> This Jira is to add CRC for chunks data.
> Right now, in chunkInfo, data is just a byte array. We want to change this as 
> below:
> {code}
> message Data {
>   required string magic = 1 [default = "Tullys00"];
>   required CRCTYPE crcType = 2;
>   optional string legacyMetadata = 3; // Fields to support in-place data migration
>   optional string legacyData = 4;
>   optional uint32 checksumBlockSize = 5; // Size of the block used to compute the checksums
>   repeated uint32 checksums = 6; // Set of checksums
>   repeated bytes data = 7; // Actual data stream
> }
> {code}
> This will help in error detection for containers during the container scanner.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-284) Interleaving CRC for ChunksData

2018-07-23 Thread Bharat Viswanadham (JIRA)
Bharat Viswanadham created HDDS-284:
---

 Summary: Interleaving CRC for ChunksData
 Key: HDDS-284
 URL: https://issues.apache.org/jira/browse/HDDS-284
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Bharat Viswanadham
Assignee: Shashikant Banerjee


This Jira is to add CRC for chunks data.

Right now, in chunkInfo, data is just a byte array. We want to change this as 
below:

{code}
message Data {
  required string magic = 1 [default = "Tullys00"];
  required CRCTYPE crcType = 2;
  optional string legacyMetadata = 3; // Fields to support in-place data migration
  optional string legacyData = 4;
  optional uint32 checksumBlockSize = 5; // Size of the block used to compute the checksums
  repeated uint32 checksums = 6; // Set of checksums
  repeated bytes data = 7; // Actual data stream
}
{code}

This will help in error detection for containers during the container scanner.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13688) Introduce msync API call

2018-07-23 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553125#comment-16553125
 ] 

genericqa commented on HDFS-13688:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}  0m  
9s{color} | {color:red} Docker failed to build yetus/hadoop:abb62dd. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-13688 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12932448/HDFS-13688-HDFS-12943.002.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/24637/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Introduce msync API call
> 
>
> Key: HDFS-13688
> URL: https://issues.apache.org/jira/browse/HDFS-13688
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-13688-HDFS-12943.001.patch, 
> HDFS-13688-HDFS-12943.002.patch, HDFS-13688-HDFS-12943.WIP.002.patch, 
> HDFS-13688-HDFS-12943.WIP.patch
>
>
> As mentioned in the design doc in HDFS-12943, to ensure consistent reads, we 
> need to introduce an RPC call {{msync}}. Specifically, a client can issue an 
> msync call to the Observer node along with a transactionID. The msync will only 
> return once the Observer's transactionID has caught up to the given ID. This 
> JIRA is to add this API.
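
A hedged sketch of the described semantics (all names here are hypothetical, not the patch's actual API; the real implementation wires this through the NameNode RPC layer):

{code}
/** Sketch of msync: block the caller until this Observer catches up. */
class ObserverSync {
  private volatile long lastAppliedTxId; // advanced by edit-log tailing (not shown)

  /** Returns only once the Observer has applied edits up to the id the client saw. */
  void msync(long clientSeenTxId) throws InterruptedException {
    while (lastAppliedTxId < clientSeenTxId) {
      Thread.sleep(10); // poll until tailing catches up to the given id
    }
  }
}
{code}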



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-199) Implement ReplicationManager to handle underreplication of closed containers

2018-07-23 Thread Xiaoyu Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553117#comment-16553117
 ] 

Xiaoyu Yao commented on HDDS-199:
-

Thanks [~elek] for the update and [~nandakumar131] for the reviews. Patch v17 
looks good to me, +1. I will commit it shortly. 

> Implement ReplicationManager to handle underreplication of closed containers
> 
>
> Key: HDDS-199
> URL: https://issues.apache.org/jira/browse/HDDS-199
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-199.001.patch, HDDS-199.002.patch, 
> HDDS-199.003.patch, HDDS-199.004.patch, HDDS-199.005.patch, 
> HDDS-199.006.patch, HDDS-199.007.patch, HDDS-199.008.patch, 
> HDDS-199.009.patch, HDDS-199.010.patch, HDDS-199.011.patch, 
> HDDS-199.012.patch, HDDS-199.013.patch, HDDS-199.014.patch, 
> HDDS-199.015.patch, HDDS-199.016.patch, HDDS-199.017.patch
>
>
> HDDS/Ozone supports Open and Closed containers. Under specific 
> conditions (the container is full, the node has failed) the container will be 
> closed and will be replicated in a different way. The replication of Open 
> containers is handled with Ratis and the PipelineManager.
> The ReplicationManager should handle the replication of the ClosedContainers. 
> The replication information will be sent as an event 
> (UnderReplicated/OverReplicated). 
> The ReplicationManager will collect all of the events in a priority queue 
> (to replicate first the containers where more replicas are missing), calculate 
> the destination datanode (first with a very simple algorithm, later by 
> calculating scatter-width) and send the Copy/Delete container command to the 
> datanode (CommandQueue).
> A CopyCommandWatcher/DeleteCommandWatcher is also included to retry the 
> copy/delete in case of failure. This is an in-memory structure (based on 
> HDDS-195) which can requeue the underreplicated/overreplicated events to the 
> priority queue until the confirmation of the copy/delete command arrives.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-13761) Add toString Method to AclFeature Class

2018-07-23 Thread Shweta (JIRA)
Shweta created HDFS-13761:
-

 Summary: Add toString Method to AclFeature Class
 Key: HDFS-13761
 URL: https://issues.apache.org/jira/browse/HDFS-13761
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Shweta
Assignee: Shweta






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-281) Need container size distribution metric in OzoneManager UI

2018-07-23 Thread Ajay Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553097#comment-16553097
 ] 

Ajay Kumar commented on HDDS-281:
-

+1 on this. It would also be useful to show the percentage of closed and open 
containers.

> Need container size distribution metric in OzoneManager UI
> --
>
> Key: HDDS-281
> URL: https://issues.apache.org/jira/browse/HDDS-281
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Manager
>Reporter: Nilotpal Nandi
>Priority: Minor
>
> It would be good to have a metric/histogram in the OzoneManager UI 
> indicating the different container size ranges and the corresponding 
> percentage of containers in the cluster for each range.
> For example:
> 0-2 GB    10%
> 2-4 GB    20%
> 4-5 GB    70%
> 5+ GB      0%
>  
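
For illustration, a small sketch of how such a distribution could be computed (bucket bounds and all names are illustrative only):

{code}
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ContainerSizeHistogram {
  private static final long GB = 1024L * 1024 * 1024;
  private static final long[] UPPER = {2 * GB, 4 * GB, 5 * GB, Long.MAX_VALUE};
  private static final String[] LABEL = {"0-2 GB", "2-4 GB", "4-5 GB", "5+ GB"};

  /** Maps each bucket label to the percentage of containers falling in it. */
  static Map<String, Double> distribution(List<Long> containerSizesBytes) {
    long[] counts = new long[LABEL.length];
    for (long size : containerSizesBytes) {
      for (int i = 0; i < UPPER.length; i++) {
        if (size < UPPER[i]) {
          counts[i]++;
          break;
        }
      }
    }
    Map<String, Double> dist = new LinkedHashMap<>();
    for (int i = 0; i < LABEL.length; i++) {
      dist.put(LABEL[i], containerSizesBytes.isEmpty()
          ? 0.0 : 100.0 * counts[i] / containerSizesBytes.size());
    }
    return dist;
  }
}
{code}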



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13672) clearCorruptLazyPersistFiles could crash NameNode

2018-07-23 Thread Gabor Bota (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553029#comment-16553029
 ] 

Gabor Bota commented on HDFS-13672:
---

We had an offline discussion with [~andrew.wang] on this topic, with the 
following outcome:
* I will upload a new patch for this issue with the iterator created inside the 
loop, inside the write lock (see the sketch below). If this patch is accepted, 
it will be the proposed solution for this issue, so it can be committed.
* Another solution would be to disable the scrubber interval when debugging. In 
real-world/customer environments there are no cases with so many corrupted 
*lazy* persist files.
* The open question is whether we should disable clearCorruptLazyPersistFiles 
by default.
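
A minimal sketch of the approach in the first bullet (the batch bound and per-block handling are hypothetical; the surrounding methods are the ones in the quoted code below). Note that, as pointed out elsewhere in this thread, this only makes forward progress if each batch removes the blocks it processed, since the iterator restarts from the beginning:

{code}
final int BATCH_SIZE = 1000; // hypothetical bound on work per lock hold
boolean done = false;
while (!done) {
  writeLock();
  try {
    // The iterator is re-created inside the loop, under the write lock.
    Iterator<Block> it = blockManager.getCorruptReplicaBlockIterator();
    int processed = 0;
    while (it.hasNext() && processed++ < BATCH_SIZE) {
      handleCorruptBlock(it.next()); // hypothetical per-block handling
    }
    done = !it.hasNext();
  } finally {
    writeUnlock(); // lets other operations run between batches
  }
}
{code}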

> clearCorruptLazyPersistFiles could crash NameNode
> -
>
> Key: HDFS-13672
> URL: https://issues.apache.org/jira/browse/HDFS-13672
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Major
> Attachments: HDFS-13672.001.patch, HDFS-13672.002.patch, 
> HDFS-13672.003.patch
>
>
> I started a NameNode on a pretty large fsimage. Since the NameNode is started 
> without any DataNodes, all blocks (100 million) are "corrupt".
> Afterwards I observed FSNamesystem#clearCorruptLazyPersistFiles() held write 
> lock for a long time:
> {noformat}
> 18/06/12 12:37:03 INFO namenode.FSNamesystem: FSNamesystem write lock held 
> for 46024 ms via
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:945)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:198)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1689)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.clearCorruptLazyPersistFiles(FSNamesystem.java:5532)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.run(FSNamesystem.java:5543)
> java.lang.Thread.run(Thread.java:748)
> Number of suppressed write-lock reports: 0
> Longest write-lock held interval: 46024
> {noformat}
> Here's the relevant code:
> {code}
>   writeLock();
>   try {
>   final Iterator<Block> it =
> blockManager.getCorruptReplicaBlockIterator();
> while (it.hasNext()) {
>   Block b = it.next();
>   BlockInfo blockInfo = blockManager.getStoredBlock(b);
>   if (blockInfo.getBlockCollection().getStoragePolicyID() == 
> lpPolicy.getId()) {
> filesToDelete.add(blockInfo.getBlockCollection());
>   }
> }
> for (BlockCollection bc : filesToDelete) {
>   LOG.warn("Removing lazyPersist file " + bc.getName() + " with no 
> replicas.");
>   changed |= deleteInternal(bc.getName(), false, false, false);
> }
>   } finally {
> writeUnlock();
>   }
> {code}
> In essence, the iteration over the corrupt replica list should be broken down 
> into smaller iterations to avoid a single long wait.
> Since this operation holds the NameNode write lock for more than 45 seconds 
> (the default ZKFC connection timeout), an extreme case like this (100 million 
> corrupt blocks) could lead to a NameNode failover.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-199) Implement ReplicationManager to handle underreplication of closed containers

2018-07-23 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553027#comment-16553027
 ] 

genericqa commented on HDDS-199:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 9 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
18s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 11s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
33s{color} | {color:red} hadoop-hdds/server-scm in trunk has 1 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
58s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 34s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
54s{color} | {color:green} common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
30s{color} | {color:green} framework in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
37s{color} | {color:green} container-service in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
25s{color} | {color:green} server-scm in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 74m 54s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDDS-199 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12932708/HDDS-199.017.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux bc9bd5b5d553 3.13.0-153-generic

[jira] [Updated] (HDFS-13672) clearCorruptLazyPersistFiles could crash NameNode

2018-07-23 Thread Gabor Bota (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HDFS-13672:
--
Status: Patch Available  (was: In Progress)

> clearCorruptLazyPersistFiles could crash NameNode
> -
>
> Key: HDFS-13672
> URL: https://issues.apache.org/jira/browse/HDFS-13672
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Major
> Attachments: HDFS-13672.001.patch, HDFS-13672.002.patch, 
> HDFS-13672.003.patch
>
>
> I started a NameNode on a pretty large fsimage. Since the NameNode is started 
> without any DataNodes, all blocks (100 million) are "corrupt".
> Afterwards I observed FSNamesystem#clearCorruptLazyPersistFiles() held write 
> lock for a long time:
> {noformat}
> 18/06/12 12:37:03 INFO namenode.FSNamesystem: FSNamesystem write lock held 
> for 46024 ms via
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:945)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:198)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1689)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.clearCorruptLazyPersistFiles(FSNamesystem.java:5532)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.run(FSNamesystem.java:5543)
> java.lang.Thread.run(Thread.java:748)
> Number of suppressed write-lock reports: 0
> Longest write-lock held interval: 46024
> {noformat}
> Here's the relevant code:
> {code}
>   writeLock();
>   try {
>   final Iterator<Block> it =
> blockManager.getCorruptReplicaBlockIterator();
> while (it.hasNext()) {
>   Block b = it.next();
>   BlockInfo blockInfo = blockManager.getStoredBlock(b);
>   if (blockInfo.getBlockCollection().getStoragePolicyID() == 
> lpPolicy.getId()) {
> filesToDelete.add(blockInfo.getBlockCollection());
>   }
> }
> for (BlockCollection bc : filesToDelete) {
>   LOG.warn("Removing lazyPersist file " + bc.getName() + " with no 
> replicas.");
>   changed |= deleteInternal(bc.getName(), false, false, false);
> }
>   } finally {
> writeUnlock();
>   }
> {code}
> In essence, the iteration over the corrupt replica list should be broken down 
> into smaller iterations to avoid a single long wait.
> Since this operation holds the NameNode write lock for more than 45 seconds 
> (the default ZKFC connection timeout), an extreme case like this (100 million 
> corrupt blocks) could lead to a NameNode failover.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13672) clearCorruptLazyPersistFiles could crash NameNode

2018-07-23 Thread Gabor Bota (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota updated HDFS-13672:
--
Attachment: HDFS-13672.003.patch

> clearCorruptLazyPersistFiles could crash NameNode
> -
>
> Key: HDFS-13672
> URL: https://issues.apache.org/jira/browse/HDFS-13672
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Gabor Bota
>Priority: Major
> Attachments: HDFS-13672.001.patch, HDFS-13672.002.patch, 
> HDFS-13672.003.patch
>
>
> I started a NameNode on a pretty large fsimage. Since the NameNode is started 
> without any DataNodes, all blocks (100 million) are "corrupt".
> Afterwards I observed FSNamesystem#clearCorruptLazyPersistFiles() held write 
> lock for a long time:
> {noformat}
> 18/06/12 12:37:03 INFO namenode.FSNamesystem: FSNamesystem write lock held 
> for 46024 ms via
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:945)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:198)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1689)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.clearCorruptLazyPersistFiles(FSNamesystem.java:5532)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.run(FSNamesystem.java:5543)
> java.lang.Thread.run(Thread.java:748)
> Number of suppressed write-lock reports: 0
> Longest write-lock held interval: 46024
> {noformat}
> Here's the relevant code:
> {code}
>   writeLock();
>   try {
>   final Iterator<Block> it =
> blockManager.getCorruptReplicaBlockIterator();
> while (it.hasNext()) {
>   Block b = it.next();
>   BlockInfo blockInfo = blockManager.getStoredBlock(b);
>   if (blockInfo.getBlockCollection().getStoragePolicyID() == 
> lpPolicy.getId()) {
> filesToDelete.add(blockInfo.getBlockCollection());
>   }
> }
> for (BlockCollection bc : filesToDelete) {
>   LOG.warn("Removing lazyPersist file " + bc.getName() + " with no 
> replicas.");
>   changed |= deleteInternal(bc.getName(), false, false, false);
> }
>   } finally {
> writeUnlock();
>   }
> {code}
> In essence, the iteration over the corrupt replica list should be broken down 
> into smaller iterations to avoid a single long wait.
> Since this operation holds the NameNode write lock for more than 45 seconds 
> (the default ZKFC connection timeout), an extreme case like this (100 million 
> corrupt blocks) could lead to a NameNode failover.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13658) fsck, dfsadmin -report, and NN WebUI should report number of blocks that have 1 replica

2018-07-23 Thread Kitti Nanasi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553016#comment-16553016
 ] 

Kitti Nanasi commented on HDFS-13658:
-

I uploaded patch v008, in which the new metrics show the number of replicated 
and erasure-coded blocks with the highest priority, and the deprecated methods 
are no longer used.

> fsck, dfsadmin -report, and NN WebUI should report number of blocks that have 
> 1 replica
> ---
>
> Key: HDFS-13658
> URL: https://issues.apache.org/jira/browse/HDFS-13658
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-13658.001.patch, HDFS-13658.002.patch, 
> HDFS-13658.003.patch, HDFS-13658.004.patch, HDFS-13658.005.patch, 
> HDFS-13658.006.patch, HDFS-13658.007.patch, HDFS-13658.008.patch
>
>
> fsck, dfsadmin -report, and the NN WebUI should report the number of blocks 
> that have 1 replica. We have had many cases opened in which a customer lost a 
> disk or a DN and thereby lost files/blocks, because those blocks had only 1 
> replica. We need to make the customer better aware of this situation so that 
> they can take action.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13658) fsck, dfsadmin -report, and NN WebUI should report number of blocks that have 1 replica

2018-07-23 Thread Kitti Nanasi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kitti Nanasi updated HDFS-13658:

Attachment: HDFS-13658.008.patch

> fsck, dfsadmin -report, and NN WebUI should report number of blocks that have 
> 1 replica
> ---
>
> Key: HDFS-13658
> URL: https://issues.apache.org/jira/browse/HDFS-13658
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.1.0
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Major
> Attachments: HDFS-13658.001.patch, HDFS-13658.002.patch, 
> HDFS-13658.003.patch, HDFS-13658.004.patch, HDFS-13658.005.patch, 
> HDFS-13658.006.patch, HDFS-13658.007.patch, HDFS-13658.008.patch
>
>
> fsck, dfsadmin -report, and the NN WebUI should report the number of blocks 
> that have 1 replica. We have had many cases opened in which a customer lost a 
> disk or a DN and thereby lost files/blocks, because those blocks had only 1 
> replica. We need to make the customer better aware of this situation so that 
> they can take action.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-271) Create a block iterator to iterate blocks in a container

2018-07-23 Thread Nanda kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16552924#comment-16552924
 ] 

Nanda kumar commented on HDDS-271:
--

Do we have to use {{getRangeKVs}} and load all the values into memory for an 
iterator implementation? We can take advantage of the db iterator provided by 
RocksDB/LevelDB. We might miss out on richer APIs like {{getBlockCount}}, but do 
we really need those APIs for an iterator? Loading all the values into a list 
for an iteration will have a huge memory impact if we create iterators for 
multiple containers at the same time.
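
For illustration, a sketch of a streaming iterator over RocksDB (only the standard RocksJava iterator API is used; block decoding and the container plumbing are omitted):

{code}
import org.rocksdb.RocksDB;
import org.rocksdb.RocksIterator;

/** Streams block entries without materializing the whole key range in memory. */
class ContainerBlockIterator implements AutoCloseable {
  private final RocksIterator it;

  ContainerBlockIterator(RocksDB db) {
    this.it = db.newIterator();
    it.seekToFirst();
  }

  boolean hasNext() {
    return it.isValid();
  }

  byte[] next() {
    byte[] value = it.value(); // serialized block data; decoding omitted here
    it.next();
    return value;
  }

  @Override
  public void close() {
    it.close();
  }
}
{code}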

> Create a block iterator to iterate blocks in a container
> 
>
> Key: HDDS-271
> URL: https://issues.apache.org/jira/browse/HDDS-271
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-271.00.patch
>
>
> Create a block iterator to scan all blocks in a container.
> This will be useful during the implementation of the container scanner.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-199) Implement ReplicationManager to handle underreplication of closed containers

2018-07-23 Thread Elek, Marton (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16552921#comment-16552921
 ] 

Elek, Marton commented on HDDS-199:
---

With the help of [~nandakumar131] I fixed the eventId handling:

1. Random ID generation is removed and the ID is generated by the SCMCommand.

2. Both ReplicateContainerCommand and ReplicationRequestToRepeat use the same 
id (the first generates it in its constructor, the second uses the id of the 
first one).

3. I improved the unit test to prove that the two commands carry the same id.
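
A condensed sketch of the id flow described above (class shapes heavily simplified; the sequence-based id source is an assumption):

{code}
import java.util.concurrent.atomic.AtomicLong;

abstract class SCMCommand {
  private static final AtomicLong SEQ = new AtomicLong(); // assumed id source
  private final long id = SEQ.incrementAndGet(); // generated once, in the constructor

  long getId() {
    return id;
  }
}

class ReplicateContainerCommand extends SCMCommand {
}

class ReplicationRequestToRepeat {
  private final long commandId;

  ReplicationRequestToRepeat(ReplicateContainerCommand command) {
    this.commandId = command.getId(); // reuses the command's id so watcher events match
  }

  long getId() {
    return commandId;
  }
}
{code}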

> Implement ReplicationManager to handle underreplication of closed containers
> 
>
> Key: HDDS-199
> URL: https://issues.apache.org/jira/browse/HDDS-199
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-199.001.patch, HDDS-199.002.patch, 
> HDDS-199.003.patch, HDDS-199.004.patch, HDDS-199.005.patch, 
> HDDS-199.006.patch, HDDS-199.007.patch, HDDS-199.008.patch, 
> HDDS-199.009.patch, HDDS-199.010.patch, HDDS-199.011.patch, 
> HDDS-199.012.patch, HDDS-199.013.patch, HDDS-199.014.patch, 
> HDDS-199.015.patch, HDDS-199.016.patch, HDDS-199.017.patch
>
>
> HDDS/Ozone supports Open and Closed containers. Under specific 
> conditions (the container is full, the node has failed) the container will be 
> closed and will be replicated in a different way. The replication of Open 
> containers is handled with Ratis and the PipelineManager.
> The ReplicationManager should handle the replication of the ClosedContainers. 
> The replication information will be sent as an event 
> (UnderReplicated/OverReplicated). 
> The ReplicationManager will collect all of the events in a priority queue 
> (to replicate first the containers where more replicas are missing), calculate 
> the destination datanode (first with a very simple algorithm, later by 
> calculating scatter-width) and send the Copy/Delete container command to the 
> datanode (CommandQueue).
> A CopyCommandWatcher/DeleteCommandWatcher is also included to retry the 
> copy/delete in case of failure. This is an in-memory structure (based on 
> HDDS-195) which can requeue the underreplicated/overreplicated events to the 
> priority queue until the confirmation of the copy/delete command arrives.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-13760) improve ZKFC fencing action when network of ZKFC interrupt

2018-07-23 Thread He Xiaoqiao (JIRA)
He Xiaoqiao created HDFS-13760:
--

 Summary: improve ZKFC fencing action when network of ZKFC interrupt
 Key: HDFS-13760
 URL: https://issues.apache.org/jira/browse/HDFS-13760
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha
Reporter: He Xiaoqiao


When the host running the Active NameNode & its ZKFC hits a network fault for 
an extended period, HDFS becomes unavailable: the ZKFC located on the Standby 
NameNode can never fence successfully, since it cannot ssh to the Active 
NameNode. In this situation a client cannot connect to the Active NameNode and 
fails over to the Standby, which cannot serve READ/WRITE requests.
{noformat}
2018-07-23 15:57:10,836 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: rz-data-hdp-nn14.rz.sankuai.com/10.16.70.34:8060. Already tried 40 
time(s); maxRetries=45
2018-07-23 15:57:30,856 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: rz-data-hdp-nn14.rz.sankuai.com/10.16.70.34:8060. Already tried 41 
time(s); maxRetries=45
2018-07-23 15:57:50,872 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: rz-data-hdp-nn14.rz.sankuai.com/10.16.70.34:8060. Already tried 42 
time(s); maxRetries=45
2018-07-23 15:58:10,892 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: rz-data-hdp-nn14.rz.sankuai.com/10.16.70.34:8060. Already tried 43 
time(s); maxRetries=45
2018-07-23 15:58:30,912 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: rz-data-hdp-nn14.rz.sankuai.com/10.16.70.34:8060. Already tried 44 
time(s); maxRetries=45
2018-07-23 15:58:50,933 INFO org.apache.hadoop.ha.ZKFailoverController: get old 
active state exception: org.apache.hadoop.net.ConnectTimeoutException: 2 
millis timeout while waiting for channel to be 
ready for connect. ch : java.nio.channels.SocketChannel[connection-pending 
local=/ip:port remote=hostname]
2018-07-23 15:58:50,933 INFO org.apache.hadoop.ha.ActiveStandbyElector: old 
active is not healthy. need to create znode
2018-07-23 15:58:50,933 INFO org.apache.hadoop.ha.ActiveStandbyElector: Elector 
callbacks for NameNode at standbynn start create node, now time: 
45179010079342817
2018-07-23 15:58:50,936 INFO org.apache.hadoop.ha.ActiveStandbyElector: 
CreateNode result: 0 code:OK for path: /hadoop-ha/ns/ActiveStandbyElectorLock 
connectionState: CONNECTED  for elector id=469098346 
appData=0a07727a2d6e6e313312046e6e31331a1f727a2d646174612d6864702d6e6e31332e727a2e73616e6b7561692e636f6d20e83e28d33e
 cb=Elector callbacks for NameNode at standbynamenode
2018-07-23 15:58:50,936 INFO org.apache.hadoop.ha.ActiveStandbyElector: 
Checking for any old active which needs to be fenced...
2018-07-23 15:58:50,938 INFO org.apache.hadoop.ha.ActiveStandbyElector: Old 
node exists: 
0a07727a2d6e6e313312046e6e31341a1f727a2d646174612d6864702d6e6e31342e727a2e73616e6b7561692e636f6d20e83e28d33e
2018-07-23 15:58:50,939 INFO org.apache.hadoop.ha.ZKFailoverController: Should 
fence: NameNode at activenamenode
2018-07-23 15:59:10,960 INFO org.apache.hadoop.ipc.Client: Retrying connect to 
server: activenamenode. Already tried 0 time(s); maxRetries=1
2018-07-23 15:59:30,980 WARN org.apache.hadoop.ha.FailoverController: Unable to 
gracefully make NameNode at activenamenode standby (unable to connect)
org.apache.hadoop.net.ConnectTimeoutException: Call From standbynamenode to 
activenamenode failed on socket timeout exception: 
org.apache.hadoop.net.ConnectTimeoutException: 2 millis timeout while 
waiting for channel to be ready for connect. ch : 
java.nio.channels.SocketChannel[connection-pending local=ip:port 
remote=activenamenode]; For more details see:  
http://wiki.apache.org/hadoop/SocketTimeout
{noformat}

I propose that when the Active NameNode hits a network fault, its ZKFC forces 
that NameNode to become Standby, so that the other ZKFC can hold the ZNode for 
election and transition its NameNode to Active even when ssh fencing fails.

There is no patch available yet, and I would be glad to hear suggestions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-199) Implement ReplicationManager to handle underreplication of closed containers

2018-07-23 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDDS-199:
--
Attachment: HDDS-199.017.patch

> Implement ReplicationManager to handle underreplication of closed containers
> 
>
> Key: HDDS-199
> URL: https://issues.apache.org/jira/browse/HDDS-199
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
> Fix For: 0.2.1
>
> Attachments: HDDS-199.001.patch, HDDS-199.002.patch, 
> HDDS-199.003.patch, HDDS-199.004.patch, HDDS-199.005.patch, 
> HDDS-199.006.patch, HDDS-199.007.patch, HDDS-199.008.patch, 
> HDDS-199.009.patch, HDDS-199.010.patch, HDDS-199.011.patch, 
> HDDS-199.012.patch, HDDS-199.013.patch, HDDS-199.014.patch, 
> HDDS-199.015.patch, HDDS-199.016.patch, HDDS-199.017.patch
>
>
> HDDS/Ozone supports Open and Closed containers. Under specific 
> conditions (the container is full, the node has failed) the container will be 
> closed and will be replicated in a different way. The replication of Open 
> containers is handled with Ratis and the PipelineManager.
> The ReplicationManager should handle the replication of the ClosedContainers. 
> The replication information will be sent as an event 
> (UnderReplicated/OverReplicated). 
> The ReplicationManager will collect all of the events in a priority queue 
> (to replicate first the containers where more replicas are missing), calculate 
> the destination datanode (first with a very simple algorithm, later by 
> calculating scatter-width) and send the Copy/Delete container command to the 
> datanode (CommandQueue).
> A CopyCommandWatcher/DeleteCommandWatcher is also included to retry the 
> copy/delete in case of failure. This is an in-memory structure (based on 
> HDDS-195) which can requeue the underreplicated/overreplicated events to the 
> priority queue until the confirmation of the copy/delete command arrives.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


