[jira] [Commented] (HDFS-15249) ThrottledAsyncChecker is not thread-safe.

2020-04-06 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076943#comment-17076943
 ] 

Hudson commented on HDFS-15249:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18122 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18122/])
HDFS-15249 ThrottledAsyncChecker is not thread-safe. (#1922) (github: rev 
c12ddbd1de40b32bbe2f6a3e484abf843d6d92ae)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/checker/ThrottledAsyncChecker.java


> ThrottledAsyncChecker is not thread-safe.
> -
>
> Key: HDFS-15249
> URL: https://issues.apache.org/jira/browse/HDFS-15249
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: federation
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.4.0
>
>
> ThrottledAsyncChecker should be thread-safe because it can be used by 
> multiple threads when we have multiple namespaces.
> *checksInProgress* and *completedChecks* are a HashMap and a WeakHashMap 
> respectively, neither of which is thread-safe, so every access to them needs 
> to be wrapped in a synchronized block.
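A minimal sketch of the guarding the description above asks for; the two map names come from the description, while the class shape and method names are purely illustrative rather than the actual ThrottledAsyncChecker code:
{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

class ThrottledAsyncCheckerSketch<K, V> {
  // Neither HashMap nor WeakHashMap is thread-safe, so every access has to
  // hold the same lock when several namespaces share one checker instance.
  private final Map<K, V> checksInProgress = new HashMap<>();
  private final Map<K, V> completedChecks = new WeakHashMap<>();

  void complete(K target, V result) {
    synchronized (this) {
      checksInProgress.remove(target);
      completedChecks.put(target, result);
    }
  }

  boolean isCheckInProgress(K target) {
    synchronized (this) {
      return checksInProgress.containsKey(target);
    }
  }
}
{code}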






[jira] [Commented] (HDFS-15249) ThrottledAsyncChecker is not thread-safe.

2020-04-06 Thread Toshihiro Suzuki (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076935#comment-17076935
 ] 

Toshihiro Suzuki commented on HDFS-15249:
-

Thank you for reviewing and committing this! [~aajisaka] [~weichiu]

> ThrottledAsyncChecker is not thread-safe.
> -
>
> Key: HDFS-15249
> URL: https://issues.apache.org/jira/browse/HDFS-15249
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: federation
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.4.0
>
>
> ThrottledAsyncChecker should be thread-safe because it can be used by 
> multiple threads when we have multiple namespaces.
> *checksInProgress* and *completedChecks* are a HashMap and a WeakHashMap 
> respectively, neither of which is thread-safe, so every access to them needs 
> to be wrapped in a synchronized block.






[jira] [Resolved] (HDFS-15249) ThrottledAsyncChecker is not thread-safe.

2020-04-06 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-15249.
--
Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

Merged the PR into trunk. Thanks [~brfrn169] for the contribution and thanks 
[~weichiu] for the review.

> ThrottledAsyncChecker is not thread-safe.
> -
>
> Key: HDFS-15249
> URL: https://issues.apache.org/jira/browse/HDFS-15249
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: federation
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 3.4.0
>
>
> ThrottledAsyncChecker should be thread-safe because it can be used by 
> multiple threads when we have multiple namespaces.
> *checksInProgress* and *completedChecks* are a HashMap and a WeakHashMap 
> respectively, neither of which is thread-safe, so every access to them needs 
> to be wrapped in a synchronized block.






[jira] [Commented] (HDFS-15249) ThrottledAsyncChecker is not thread-safe.

2020-04-06 Thread Toshihiro Suzuki (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076924#comment-17076924
 ] 

Toshihiro Suzuki commented on HDFS-15249:
-

[~weichiu] How do we proceed with this?

> ThrottledAsyncChecker is not thread-safe.
> -
>
> Key: HDFS-15249
> URL: https://issues.apache.org/jira/browse/HDFS-15249
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: federation
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
>
> ThrottledAsyncChecker should be thread-safe because it can be used by 
> multiple threads when we have multiple namespaces.
> *checksInProgress* and *completedChecks* are a HashMap and a WeakHashMap 
> respectively, neither of which is thread-safe, so every access to them needs 
> to be wrapped in a synchronized block.






[jira] [Updated] (HDFS-15263) Fix the logic of scope and excluded scope in Network Topology

2020-04-06 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-15263:

Attachment: HDFS-15263-05.patch

> Fix the logic of scope and excluded scope in Network Topology
> -
>
> Key: HDFS-15263
> URL: https://issues.apache.org/jira/browse/HDFS-15263
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-15263-01.patch, HDFS-15263-02.patch, 
> HDFS-15263-03.patch, HDFS-15263-04.patch, HDFS-15263-05.patch
>
>
> If scope is d1 and excluded scope is d10, the bare startsWith check still 
> treats one as a prefix of the other, so the current logic returns null. The 
> separator should be appended before doing the prefix check.
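To make the failure mode concrete, here is a small separator-aware prefix check in the spirit of the fix; the method name follows the isChildScope suggestion made later in this thread, and everything else is an illustrative sketch rather than the NetworkTopology code:
{code:java}
class ScopeCheck {
  // "/" stands in for NetworkTopology's path separator.
  static boolean isChildScope(String parentScope, String childScope) {
    // "/d1" must not be treated as an ancestor of "/d10": append the
    // separator before the prefix comparison.
    String prefix = parentScope.endsWith("/") ? parentScope : parentScope + "/";
    return childScope.equals(parentScope) || childScope.startsWith(prefix);
  }

  public static void main(String[] args) {
    System.out.println(isChildScope("/d1", "/d10"));   // false (a bare startsWith says true)
    System.out.println(isChildScope("/d1", "/d1/r1")); // true
  }
}
{code}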






[jira] [Commented] (HDFS-15256) Fix typo in DataXceiverServer#run()

2020-04-06 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076636#comment-17076636
 ] 

Hudson commented on HDFS-15256:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18121 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18121/])
HDFS-15256. Fix typo in DataXceiverServer#run(). Contributed by Lisheng 
(inigoiri: rev 0b855b9f3570f98ff2f2802114241e10520aded8)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiverServer.java


> Fix typo in DataXceiverServer#run()
> ---
>
> Key: HDFS-15256
> URL: https://issues.apache.org/jira/browse/HDFS-15256
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Trivial
> Fix For: 3.4.0
>
> Attachments: HDFS-15256.001.patch
>
>
> There is a typo in DataXceiverServer#run(), see the patch for detail.






[jira] [Commented] (HDFS-15256) Fix typo in DataXceiverServer#run()

2020-04-06 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076621#comment-17076621
 ] 

Íñigo Goiri commented on HDFS-15256:


Thanks [~leosun08] for the patch and [~ayushtkn] for the review.
Committed to trunk.

> Fix typo in DataXceiverServer#run()
> ---
>
> Key: HDFS-15256
> URL: https://issues.apache.org/jira/browse/HDFS-15256
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Trivial
> Fix For: 3.4.0
>
> Attachments: HDFS-15256.001.patch
>
>
> There is a typo in DataXceiverServer#run(), see the patch for detail.






[jira] [Updated] (HDFS-15256) Fix typo in DataXceiverServer#run()

2020-04-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HDFS-15256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-15256:
---
Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Fix typo in DataXceiverServer#run()
> ---
>
> Key: HDFS-15256
> URL: https://issues.apache.org/jira/browse/HDFS-15256
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Trivial
> Fix For: 3.4.0
>
> Attachments: HDFS-15256.001.patch
>
>
> There is a typo in DataXceiverServer#run(), see the patch for detail.






[jira] [Updated] (HDFS-15265) HttpFS: validate content-type in HttpFSUtils

2020-04-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HDFS-15265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-15265:
---
Description: Validate that the content-type in HttpFSUtils is JSON.
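A hedged sketch of what such a validation could look like; the helper name and the HttpURLConnection-based shape are assumptions for illustration, not the actual HttpFSUtils code:
{code:java}
import java.io.IOException;
import java.net.HttpURLConnection;
import java.util.Locale;

class JsonContentTypeCheck {
  // Reject a response whose Content-Type is not JSON before trying to parse it.
  static void validateJsonContentType(HttpURLConnection conn) throws IOException {
    String contentType = conn.getContentType();
    if (contentType == null
        || !contentType.toLowerCase(Locale.ROOT).startsWith("application/json")) {
      throw new IOException(
          "Expected a JSON response but got Content-Type: " + contentType);
    }
  }
}
{code}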

> HttpFS: validate content-type in HttpFSUtils
> 
>
> Key: HDFS-15265
> URL: https://issues.apache.org/jira/browse/HDFS-15265
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-15265.001.patch
>
>
> Validate that the content-type in HttpFSUtils is JSON.






[jira] [Commented] (HDFS-15265) HttpFS: validate content-type in HttpFSUtils

2020-04-06 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076618#comment-17076618
 ] 

Íñigo Goiri commented on HDFS-15265:


+1 on [^HDFS-15265.001.patch].

> HttpFS: validate content-type in HttpFSUtils
> 
>
> Key: HDFS-15265
> URL: https://issues.apache.org/jira/browse/HDFS-15265
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-15265.001.patch
>
>







[jira] [Commented] (HDFS-15263) Fix the logic of scope and excluded scope in Network Topology

2020-04-06 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076617#comment-17076617
 ] 

Íñigo Goiri commented on HDFS-15263:


* For appending PATH_SEPARATOR_STR to a new String, yes, something like that 
is good.
* For isInScope(), technically it would be isScopeInScope(), but that is weird, 
so let's go with isChildScope().

BTW, in the javadocs, let's be consistent about finishing sentences with a 
period.

> Fix the logic of scope and excluded scope in Network Topology
> -
>
> Key: HDFS-15263
> URL: https://issues.apache.org/jira/browse/HDFS-15263
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-15263-01.patch, HDFS-15263-02.patch, 
> HDFS-15263-03.patch, HDFS-15263-04.patch
>
>
> If scope is d1 and excluded scope is d10, the bare startsWith check still 
> treats one as a prefix of the other, so the current logic returns null. The 
> separator should be appended before doing the prefix check.






[jira] [Commented] (HDFS-15255) Consider StorageType when DatanodeManager#sortLocatedBlock()

2020-04-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076533#comment-17076533
 ] 

Hadoop QA commented on HDFS-15255:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
52s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  4m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
21m 57s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  8m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
49s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 
12s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 41s{color} | {color:orange} root: The patch generated 1 new + 565 unchanged 
- 0 fixed = 566 total (was 565) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  4m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 14s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  3m  
6s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m 58s{color} 
| {color:red} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  2m  4s{color} 
| {color:red} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}107m  9s{color} 
| {color:red} hadoop-hdfs in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  7m 55s{color} 
| {color:red} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
48s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}251m 28s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Call to 
org.apache.hadoop.hdfs.protocol.DatanodeInfoWithStorage.equals(org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor)
 in 
org.apache.hadoop.hdfs.server.namenode.CacheManager.setCachedLocations(LocatedBlock)
  At CacheManager.java: At 

[jira] [Commented] (HDFS-15265) HttpFS: validate content-type in HttpFSUtils

2020-04-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076509#comment-17076509
 ] 

Hadoop QA commented on HDFS-15265:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 11s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 38s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  5m 
35s{color} | {color:green} hadoop-hdfs-httpfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 72m 27s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.8 Server=19.03.8 Image:yetus/hadoop:e6455cc864d |
| JIRA Issue | HDFS-15265 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12999078/HDFS-15265.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 075b5994e179 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / ab7495d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_242 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/29088/testReport/ |
| Max. process+thread count | 613 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-httpfs U: 
hadoop-hdfs-project/hadoop-hdfs-httpfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/29088/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> HttpFS: validate content-type in HttpFSUtils
> 
>
> Key: HDFS-15265
> URL: 

[jira] [Updated] (HDFS-15267) Implement Statistics Count for HttpFSFileSystem

2020-04-06 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-15267:

Issue Type: Improvement  (was: Bug)

> Implement Statistics Count for HttpFSFileSystem
> ---
>
> Key: HDFS-15267
> URL: https://issues.apache.org/jira/browse/HDFS-15267
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>
> As of now, there is no count of ops maintained for HttpFSFileSystem, unlike 
> DistributedFileSystem & WebHdfsFileSystem.






[jira] [Commented] (HDFS-15267) Implement Statistics Count for HttpFSFileSystem

2020-04-06 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076467#comment-17076467
 ] 

Ayush Saxena commented on HDFS-15267:
-

Realized this while fixing HDFS-15266; I wanted to check whether any op is 
missed here too, so it can all be fixed in one go.

Couldn't find them tracked anywhere...

> Implement Statistics Count for HttpFSFileSystem
> ---
>
> Key: HDFS-15267
> URL: https://issues.apache.org/jira/browse/HDFS-15267
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>
> As of now, there is no count of ops maintained for HttpFSFileSystem, unlike 
> DistributedFileSystem & WebHdfsFileSystem.






[jira] [Created] (HDFS-15267) Implement Statistics Count for HttpFSFileSystem

2020-04-06 Thread Ayush Saxena (Jira)
Ayush Saxena created HDFS-15267:
---

 Summary: Implement Statistics Count for HttpFSFileSystem
 Key: HDFS-15267
 URL: https://issues.apache.org/jira/browse/HDFS-15267
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ayush Saxena
Assignee: Ayush Saxena


As of now, there is no count of ops maintained for HttpFSFileSystem, unlike 
DistributedFileSystem & WebHdfsFileSystem.






[jira] [Created] (HDFS-15266) Add missing DFSOps Statistics in WebHDFS

2020-04-06 Thread Ayush Saxena (Jira)
Ayush Saxena created HDFS-15266:
---

 Summary: Add missing DFSOps Statistics in WebHDFS
 Key: HDFS-15266
 URL: https://issues.apache.org/jira/browse/HDFS-15266
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ayush Saxena
Assignee: Ayush Saxena


A couple of operations don't increment the count of read/write ops or the 
DFSOpsCountStatistics,

for example getStoragePolicy.
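For context, a hedged sketch of the per-operation counting pattern being referred to; DFSOpsCountStatistics is the real class in the HDFS client, but the enum constant and the stand-in classes below are assumptions so the snippet stays self-contained:
{code:java}
import java.util.EnumMap;
import java.util.concurrent.atomic.AtomicLong;

// Stand-in for DFSOpsCountStatistics: one counter per operation type.
class OpStats {
  enum OpType { GET_STORAGE_POLICY }  // constant name assumed for illustration

  private final EnumMap<OpType, AtomicLong> counts = new EnumMap<>(OpType.class);

  OpStats() {
    for (OpType t : OpType.values()) {
      counts.put(t, new AtomicLong());
    }
  }

  void incrementOpCounter(OpType t) {
    counts.get(t).incrementAndGet();
  }
}

class WebHdfsLikeClient {
  private final OpStats storageStatistics = new OpStats();

  Object getStoragePolicy(String path) {
    // This is the kind of increment the issue says is missing for some ops.
    storageStatistics.incrementOpCounter(OpStats.OpType.GET_STORAGE_POLICY);
    return null;  // the actual RPC is elided
  }
}
{code}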






[jira] [Commented] (HDFS-15255) Consider StorageType when DatanodeManager#sortLocatedBlock()

2020-04-06 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076461#comment-17076461
 ] 

Stephen O'Donnell commented on HDFS-15255:
--

I added some debug to figure out what happens when both options are true for 
the comparator:

{code}
  private Consumer<List<DatanodeInfoWithStorage>> createSecondaryNodeSorter() {
    Consumer<List<DatanodeInfoWithStorage>> secondarySort =
        list -> {
          LOG.info("Running the shuffle");
          Collections.shuffle(list);
        };
    if (readConsiderStorageType) {
      LOG.info("Read consider storage set");
      Comparator<DatanodeInfoWithStorage> comp =
          Comparator.comparing(DatanodeInfoWithStorage::getStorageType);
      secondarySort = list -> {
        LOG.info("Running storage sort");
        Collections.sort(list, comp);
      };
    }

    if (readConsiderLoad) {
      LOG.info("Read consider load set");
      Comparator<DatanodeInfoWithStorage> comp =
          Comparator.comparingInt(DatanodeInfo::getXceiverCount);
      secondarySort = list -> {
        LOG.info("Running with load set");
        Collections.sort(list, comp);
      };
    }
    return secondarySort;
  }
{code}

Changing one of the unit tests and running with this extra logging shows that 
only the last one set is used, which makes the two features incompatible. I 
think that is OK; it just needs to be documented in hdfs-site.xml for both 
parameters.

> Consider StorageType when DatanodeManager#sortLocatedBlock()
> 
>
> Key: HDFS-15255
> URL: https://issues.apache.org/jira/browse/HDFS-15255
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-15255.001.patch, HDFS-15255.002.patch
>
>
> When only one replica of a block is on SSD and the others are on HDD, and 
> the client reads the data, the current logic only considers the distance 
> between the client and the DN. I think it should also consider the 
> StorageType of the replica, and prefer returning a replica of the specified 
> StorageType.






[jira] [Updated] (HDFS-15265) HttpFS: validate content-type in HttpFSUtils

2020-04-06 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-15265:
-
Attachment: HDFS-15265.001.patch
Status: Patch Available  (was: Open)

> HttpFS: validate content-type in HttpFSUtils
> 
>
> Key: HDFS-15265
> URL: https://issues.apache.org/jira/browse/HDFS-15265
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-15265.001.patch
>
>







[jira] [Created] (HDFS-15265) HttpFS: validate content-type in HttpFSUtils

2020-04-06 Thread hemanthboyina (Jira)
hemanthboyina created HDFS-15265:


 Summary: HttpFS: validate content-type in HttpFSUtils
 Key: HDFS-15265
 URL: https://issues.apache.org/jira/browse/HDFS-15265
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: hemanthboyina
Assignee: hemanthboyina









[jira] [Updated] (HDFS-15246) ArrayIndexOfboundsException in BlockManager CreateLocatedBlock

2020-04-06 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-15246:
-
Attachment: HDFS-15246-testrepro.patch

> ArrayIndexOfboundsException in BlockManager CreateLocatedBlock
> --
>
> Key: HDFS-15246
> URL: https://issues.apache.org/jira/browse/HDFS-15246
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-15246-testrepro.patch
>
>
> java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 1
>  
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlock(BlockManager.java:1362)
>  at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.createLocatedBlocks(BlockManager.java:1501)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:179)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2047)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:770)






[jira] [Commented] (HDFS-15255) Consider StorageType when DatanodeManager#sortLocatedBlock()

2020-04-06 Thread Lisheng Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076423#comment-17076423
 ] 

Lisheng Sun commented on HDFS-15255:


If the last one overwrote the one before, then according to the original code, 
Collections.shuffle(list) would also be overwritten:
{code:java}
private Consumer<List<DatanodeInfo>> createSecondaryNodeSorter() {
  Consumer<List<DatanodeInfo>> secondarySort =
      list -> Collections.shuffle(list);
  if (readConsiderLoad) {
    Comparator<DatanodeInfo> comp =
        Comparator.comparingInt(DatanodeInfo::getXceiverCount);
    secondarySort = list -> Collections.sort(list, comp);
  }
  return secondarySort;
}
{code}
So the last one does not overwrite the earlier ones.


> Consider StorageType when DatanodeManager#sortLocatedBlock()
> 
>
> Key: HDFS-15255
> URL: https://issues.apache.org/jira/browse/HDFS-15255
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-15255.001.patch, HDFS-15255.002.patch
>
>
> When only one replica of a block is on SSD and the others are on HDD, and 
> the client reads the data, the current logic only considers the distance 
> between the client and the DN. I think it should also consider the 
> StorageType of the replica, and prefer returning a replica of the specified 
> StorageType.






[jira] [Commented] (HDFS-15255) Consider StorageType when DatanodeManager#sortLocatedBlock()

2020-04-06 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076380#comment-17076380
 ] 

Stephen O'Donnell commented on HDFS-15255:
--

{quote}
If both are set, then the comparator here would return both.
Since in follow code, when call Collections.sort(list, comp), list has been 
sorted.
{quote}

I'm not sure if that is correct. This method does not actually do the sorting. 
Rather it returns a `Consumer<>` object which seems to contain a comparator. 
This is then passed to `networktopology.sortByDistance` where the actual sort 
is applied.

Reading this method, I think it returns only one "sort". It starts as a 
shuffle. If `readConsiderLoad` is set, it overwrites it. If 
`readConsiderStorageType` is also set, it overwrites it again and the last one 
set wins and is returned. I don't think each sort is applied on top of the last.

In saying the above, this syntax is a bit unfamiliar to me, so I may be wrong!
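A tiny self-contained illustration of that reading: reassigning the Consumer replaces the earlier lambda rather than composing with it, so only the last sort ever runs (the names here are generic stand-ins, not the DatanodeManager fields):
{code:java}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

public class LastSorterWins {
  public static void main(String[] args) {
    Consumer<List<Integer>> secondarySort = Collections::shuffle;    // default: shuffle
    secondarySort = list -> Collections.sort(list);                  // stands in for "consider load"
    secondarySort = list -> list.sort(Collections.reverseOrder());   // stands in for "consider storage type"

    List<Integer> nodes = new ArrayList<>(List.of(3, 1, 2));
    secondarySort.accept(nodes);
    // Only the last assignment is applied: prints [3, 2, 1]; the earlier sorts never run.
    System.out.println(nodes);
  }
}
{code}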

> Consider StorageType when DatanodeManager#sortLocatedBlock()
> 
>
> Key: HDFS-15255
> URL: https://issues.apache.org/jira/browse/HDFS-15255
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-15255.001.patch, HDFS-15255.002.patch
>
>
> When only one replica of a block is on SSD and the others are on HDD, and 
> the client reads the data, the current logic only considers the distance 
> between the client and the DN. I think it should also consider the 
> StorageType of the replica, and prefer returning a replica of the specified 
> StorageType.






[jira] [Commented] (HDFS-15255) Consider StorageType when DatanodeManager#sortLocatedBlock()

2020-04-06 Thread Lisheng Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076365#comment-17076365
 ] 

Lisheng Sun commented on HDFS-15255:


The 002 patch adds the UT and fixes the checkstyle issue.

> Consider StorageType when DatanodeManager#sortLocatedBlock()
> 
>
> Key: HDFS-15255
> URL: https://issues.apache.org/jira/browse/HDFS-15255
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-15255.001.patch, HDFS-15255.002.patch
>
>
> When only one replica of a block is on SSD and the others are on HDD, and 
> the client reads the data, the current logic only considers the distance 
> between the client and the DN. I think it should also consider the 
> StorageType of the replica, and prefer returning a replica of the specified 
> StorageType.






[jira] [Updated] (HDFS-15255) Consider StorageType when DatanodeManager#sortLocatedBlock()

2020-04-06 Thread Lisheng Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lisheng Sun updated HDFS-15255:
---
Attachment: HDFS-15255.002.patch

> Consider StorageType when DatanodeManager#sortLocatedBlock()
> 
>
> Key: HDFS-15255
> URL: https://issues.apache.org/jira/browse/HDFS-15255
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-15255.001.patch, HDFS-15255.002.patch
>
>
> When only one replica of a block is on SSD and the others are on HDD, and 
> the client reads the data, the current logic only considers the distance 
> between the client and the DN. I think it should also consider the 
> StorageType of the replica, and prefer returning a replica of the specified 
> StorageType.






[jira] [Commented] (HDFS-15207) VolumeScanner skip to scan blocks accessed during recent scan period

2020-04-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076335#comment-17076335
 ] 

Hadoop QA commented on HDFS-15207:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
47s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
19m 43s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 55s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 1 new + 501 unchanged - 0 fixed = 502 total (was 501) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 30s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}118m 41s{color} 
| {color:red} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}198m 24s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
|   | hadoop.hdfs.tools.TestDFSZKFailoverController |
|   | hadoop.hdfs.server.datanode.TestBPOfferService |
|   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
|   | hadoop.hdfs.server.balancer.TestBalancer |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestSpaceReservation |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.8 Server=19.03.8 Image:yetus/hadoop:e6455cc864d |
| JIRA Issue | HDFS-15207 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12999035/HDFS-15207.005.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 1216d2fdb046 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / ab7495d |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 

[jira] [Commented] (HDFS-15255) Consider StorageType when DatanodeManager#sortLocatedBlock()

2020-04-06 Thread Lisheng Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076313#comment-17076313
 ] 

Lisheng Sun commented on HDFS-15255:


Thanks [~sodonnell] for the patient review and the good suggestions.
{quote}
Looking at the code, it seems this new change (sort by storage type) is not 
compatible with "sort by load" (HDFS-14882). Is that correct? If both are set, 
then the comparator here would return just the last one:
{quote}
If both are set, then the comparator here would effectively apply both: by the 
time Collections.sort(list, comp) is called in the code below, the list has 
already been sorted.
{code:java}
if (readConsiderLoad) {
  Comparator<DatanodeInfo> comp =
      Comparator.comparingInt(DatanodeInfo::getXceiverCount);
  secondarySort = list -> Collections.sort(list, comp);
}
{code}

{quote}
Just a question - if network distance is always the same, and there is only 1 
SSD storage and this new feature is enabled. Would the SSD storage be the only 
one ever picked? Could like result in load problems on that host?
{quote}
I think this is a trade-off: either the client gets faster read latency, or we 
focus on the DN's load.
I can apply this new feature before the LOAD sort from HDFS-14882.



> Consider StorageType when DatanodeManager#sortLocatedBlock()
> 
>
> Key: HDFS-15255
> URL: https://issues.apache.org/jira/browse/HDFS-15255
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-15255.001.patch
>
>
> When only one replica of a block is on SSD and the others are on HDD, and 
> the client reads the data, the current logic only considers the distance 
> between the client and the DN. I think it should also consider the 
> StorageType of the replica, and prefer returning a replica of the specified 
> StorageType.






[jira] [Commented] (HDFS-15255) Consider StorageType when DatanodeManager#sortLocatedBlock()

2020-04-06 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076293#comment-17076293
 ] 

Stephen O'Donnell commented on HDFS-15255:
--

For the findbugs warning, it is related to this code:

{code}
  for (DatanodeInfo loc : block.getLocations()) {
    if (loc.equals(datanode)) {  // Line 955
      block.addCachedLoc(loc);
      found = true;
      break;
    }
  }
{code}

It's probably because you have changed block.getLocations() to return 
DatanodeInfoWithStorage instead of DatanodeInfo. The equals and hashCode 
methods are delegated to super, so I am not sure why it is throwing this 
warning. Might be somehow related to the object type check in DatanodeID, but I 
am not sure.

The more detailed report says:

{code}
This method calls equals(Object) on two references of different class types and 
analysis suggests they will be to objects of different classes at runtime. 
Further, examination of the equals methods that would be invoked suggest that 
either this call will always return false, or else the equals method is not be 
symmetric (which is a property required by the contract for equals in class 
Object). 
{code}

> Consider StorageType when DatanodeManager#sortLocatedBlock()
> 
>
> Key: HDFS-15255
> URL: https://issues.apache.org/jira/browse/HDFS-15255
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-15255.001.patch
>
>
> When only one replica of a block is on SSD and the others are on HDD, and 
> the client reads the data, the current logic only considers the distance 
> between the client and the DN. I think it should also consider the 
> StorageType of the replica, and prefer returning a replica of the specified 
> StorageType.






[jira] [Commented] (HDFS-15255) Consider StorageType when DatanodeManager#sortLocatedBlock()

2020-04-06 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076284#comment-17076284
 ] 

Stephen O'Donnell commented on HDFS-15255:
--

I'd like to confirm I understand how this works correctly.

The nodes are first sorted by "distance from the client". There is then a 
secondary sort, which by default is a shuffle which is applied to nodes with 
the same distance.

This change will allow the "same distance nodes" to be sorted by storage type, 
putting the fastest storage first. We also have an option to sort by DN load, 
added by HDFS-14882.

Looking at the code, it seems this new change (sort by storage type) is not 
compatible with "sort by load" (HDFS-14882). Is that correct? If both are set, 
then the comparator here would return just the last one:

{code}
  private Consumer<List<DatanodeInfoWithStorage>> createSecondaryNodeSorter() {
    Consumer<List<DatanodeInfoWithStorage>> secondarySort =
        list -> Collections.shuffle(list);
    if (readConsiderLoad) {
      Comparator<DatanodeInfoWithStorage> comp =
          Comparator.comparingInt(DatanodeInfo::getXceiverCount);
      secondarySort = list -> Collections.sort(list, comp);
    }

    if (readConsiderStorageType) {
      Comparator<DatanodeInfoWithStorage> comp =
          Comparator.comparing(DatanodeInfoWithStorage::getStorageType);
      secondarySort = list -> Collections.sort(list, comp);
    }
    return secondarySort;
  }
{code}

1. We would need to document this limitation in the hdfs-site.xml for both 
`dfs.namenode.read.consider.storageType` and `dfs.namenode.read.considerLoad` 
to make that clear.

2. We should probably also align the two parameters so they have similar names 
for consistency, eg change this one to `dfs.namenode.read.considerStorageType`.

3. Could you add a test for this change please - you could probably use the one 
in HDFS-14882 as a starting point.

4. Just a question - if the network distance is always the same, there is only 
1 SSD storage, and this new feature is enabled, would the SSD storage be the 
only one ever picked? Could that result in load problems on that host?

> Consider StorageType when DatanodeManager#sortLocatedBlock()
> 
>
> Key: HDFS-15255
> URL: https://issues.apache.org/jira/browse/HDFS-15255
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-15255.001.patch
>
>
> When only one replica of a block is on SSD and the others are on HDD, and 
> the client reads the data, the current logic only considers the distance 
> between the client and the DN. I think it should also consider the 
> StorageType of the replica, and prefer returning a replica of the specified 
> StorageType.






[jira] [Commented] (HDFS-15207) VolumeScanner skip to scan blocks accessed during recent scan period

2020-04-06 Thread Yang Yun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076191#comment-17076191
 ] 

Yang Yun commented on HDFS-15207:
-

Thanks [~elgoiri] for the review.

Updated to HDFS-15207.005.patch with a new approach that uses 
getBlockLocalPathInfo instead of the deprecated getReplica.

> VolumeScanner skip to scan blocks accessed during recent scan period
> 
>
> Key: HDFS-15207
> URL: https://issues.apache.org/jira/browse/HDFS-15207
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15207.002.patch, HDFS-15207.003.patch, 
> HDFS-15207.004.patch, HDFS-15207.005.patch, HDFS-15207.patch, HDFS-15207.patch
>
>
> Check the access time of block file to avoid scanning recently changed 
> blocks, reducing disk IO.
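A minimal sketch of the kind of access-time check being described; the method shape is illustrative, and note that last-access times are not reliable on filesystems mounted with noatime:
{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

class RecentAccessCheck {
  // Returns true when the block file was accessed within the given period,
  // i.e. the VolumeScanner could skip it and save the disk IO.
  static boolean accessedWithin(Path blockFile, long periodMs) throws IOException {
    BasicFileAttributes attrs =
        Files.readAttributes(blockFile, BasicFileAttributes.class);
    long lastAccessMs = attrs.lastAccessTime().toMillis();
    return System.currentTimeMillis() - lastAccessMs < periodMs;
  }
}
{code}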






[jira] [Updated] (HDFS-15207) VolumeScanner skip to scan blocks accessed during recent scan period

2020-04-06 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15207:

Attachment: HDFS-15207.005.patch
Status: Patch Available  (was: Open)

> VolumeScanner skip to scan blocks accessed during recent scan period
> 
>
> Key: HDFS-15207
> URL: https://issues.apache.org/jira/browse/HDFS-15207
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15207.002.patch, HDFS-15207.003.patch, 
> HDFS-15207.004.patch, HDFS-15207.005.patch, HDFS-15207.patch, HDFS-15207.patch
>
>
> Check the access time of block file to avoid scanning recently changed 
> blocks, reducing disk IO.






[jira] [Updated] (HDFS-15207) VolumeScanner skip to scan blocks accessed during recent scan period

2020-04-06 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15207:

Status: Open  (was: Patch Available)

> VolumeScanner skip to scan blocks accessed during recent scan period
> 
>
> Key: HDFS-15207
> URL: https://issues.apache.org/jira/browse/HDFS-15207
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15207.002.patch, HDFS-15207.003.patch, 
> HDFS-15207.004.patch, HDFS-15207.005.patch, HDFS-15207.patch, HDFS-15207.patch
>
>
> Check the access time of block file to avoid scanning recently changed 
> blocks, reducing disk IO.






[jira] [Commented] (HDFS-15202) HDFS-client: boost ShortCircuit Cache

2020-04-06 Thread Danil Lipovoy (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076149#comment-17076149
 ] 

Danil Lipovoy commented on HDFS-15202:
--

[~leosun08]

Sorry for the wait, I was a little bit busy :)
I added a few UTs, and all passed:
[INFO] Tests run: 45, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 116.58 
s - in org.apache.hadoop.hdfs.client.impl.TestBlockReaderLocal


> HDFS-client: boost ShortCircuit Cache
> -
>
> Key: HDFS-15202
> URL: https://issues.apache.org/jira/browse/HDFS-15202
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: dfsclient
> Environment: 4 nodes E5-2698 v4 @ 2.20GHz, 700 Gb Mem.
> 8 RegionServers (2 by host)
> 8 tables by 64 regions by 1.88 Gb data in each = 900 Gb total
> Random read in 800 threads via YCSB and a little bit updates (10% of reads)
>Reporter: Danil Lipovoy
>Assignee: Danil Lipovoy
>Priority: Minor
> Attachments: HDFS_CPU_full_cycle.png, cpu_SSC.png, cpu_SSC2.png, 
> hdfs_cpu.png, hdfs_reads.png, hdfs_scc_3_test.png, 
> hdfs_scc_test_full-cycle.png, locks.png, requests_SSC.png
>
>
> I want to propose how to improve HDFS-client reading performance. The 
> idea: create a few ShortCircuit cache instances instead of one.
> The key points:
> 1. Create array of caches (set by 
> clientShortCircuitNum=*dfs.client.short.circuit.num*, see in the pull 
> requests below):
> {code:java}
> private ClientContext(String name, DfsClientConf conf, Configuration config) {
> ...
> shortCircuitCache = new ShortCircuitCache[this.clientShortCircuitNum];
> for (int i = 0; i < this.clientShortCircuitNum; i++) {
>   this.shortCircuitCache[i] = ShortCircuitCache.fromConf(scConf);
> }
> {code}
> 2. Then divide blocks between the caches:
> {code:java}
>   public ShortCircuitCache getShortCircuitCache(long idx) {
> return shortCircuitCache[(int) (idx % clientShortCircuitNum)];
>   }
> {code}
> 3. And how to call it:
> {code:java}
> ShortCircuitCache cache = 
> clientContext.getShortCircuitCache(block.getBlockId());
> {code}
> The last digit of the block id passed in above is evenly distributed from 0 
> to 9 - that's why all the caches fill approximately equally.
> That is good for performance. The attached load test reads HDFS via HBase 
> with clientShortCircuitNum = 1 vs 3. We can see that performance grows ~30%, 
> with CPU usage up about +15%.
> Hope it is interesting for someone.
> Ready to explain some unobvious things.
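A tiny stand-alone check of the distribution claim above (pure illustration, not part of the patch): block ids taken modulo the cache count spread close to uniformly, so each cache ends up with roughly the same share.
{code:java}
import java.util.concurrent.ThreadLocalRandom;

public class CacheIndexDistribution {
  public static void main(String[] args) {
    int caches = 3;                       // stands in for dfs.client.short.circuit.num
    long[] hits = new long[caches];
    for (int i = 0; i < 1_000_000; i++) {
      long blockId = ThreadLocalRandom.current().nextLong(Long.MAX_VALUE);
      hits[(int) (blockId % caches)]++;   // same idx % clientShortCircuitNum as in the patch
    }
    for (int i = 0; i < caches; i++) {
      System.out.println("cache " + i + ": " + hits[i]);
    }
  }
}
{code}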






[jira] [Created] (HDFS-15264) Backport HDFS-13571 to branch-3.1

2020-04-06 Thread Lisheng Sun (Jira)
Lisheng Sun created HDFS-15264:
--

 Summary: Backport HDFS-13571 to branch-3.1
 Key: HDFS-15264
 URL: https://issues.apache.org/jira/browse/HDFS-15264
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Lisheng Sun









[jira] [Commented] (HDFS-15256) Fix typo in DataXceiverServer#run()

2020-04-06 Thread Lisheng Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076106#comment-17076106
 ] 

Lisheng Sun commented on HDFS-15256:


hi [~ayushtkn] [~elgoiri] Should we commit this patch to the trunk? Thank you.

> Fix typo in DataXceiverServer#run()
> ---
>
> Key: HDFS-15256
> URL: https://issues.apache.org/jira/browse/HDFS-15256
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Trivial
> Attachments: HDFS-15256.001.patch
>
>
> There is a typo in DataXceiverServer#run(), see the patch for detail.






[jira] [Commented] (HDFS-15255) Consider StorageType when DatanodeManager#sortLocatedBlock()

2020-04-06 Thread Lisheng Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076104#comment-17076104
 ] 

Lisheng Sun commented on HDFS-15255:


{quote}FindBugs:Call to 
org.apache.hadoop.hdfs.protocol.DatanodeInfoWithStorage.equals(org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor)
 in 
org.apache.hadoop.hdfs.server.namenode.CacheManager.setCachedLocations(LocatedBlock)
 At CacheManager.java: At CacheManager.java:[line 955]
 Bug type EC_UNRELATED_TYPES (click for details)
{quote}
About this error, I don't know where the problem is.

Both DatanodeInfoWithStorage and DatanodeDescriptor extend DatanodeInfo and 
override equals().
I ran TestCacheDirectives#testNoLookupsWhenNotUsed, which calls 
CacheManager#setCachedLocations, locally and it passed.
The JDK I use is OpenJDK 1.8.0_242.

Hi [~elgoiri] [~sodonnell] [~weichiu] [~ayushtkn], could you find time to help 
review this patch? Thank you.

> Consider StorageType when DatanodeManager#sortLocatedBlock()
> 
>
> Key: HDFS-15255
> URL: https://issues.apache.org/jira/browse/HDFS-15255
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-15255.001.patch
>
>
> When only one replica of a block is on SSD and the others are on HDD, and 
> the client reads the data, the current logic only considers the distance 
> between the client and the DN. I think it should also consider the 
> StorageType of the replica, and prefer returning a replica of the specified 
> StorageType.


