[jira] [Commented] (HDFS-14654) RBF: TestRouterRpc#testNamenodeMetrics is flaky

2019-09-01 Thread Chen Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920648#comment-16920648
 ] 

Chen Zhang commented on HDFS-14654:
---

Thanks [~ayushtkn], I've filed another Jira to track the fix for 
{{TestRouterRpc#testErasureCoding}}.

> RBF: TestRouterRpc#testNamenodeMetrics is flaky
> ---
>
> Key: HDFS-14654
> URL: https://issues.apache.org/jira/browse/HDFS-14654
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Takanobu Asanuma
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-14654.001.patch, HDFS-14654.002.patch, 
> HDFS-14654.003.patch, HDFS-14654.004.patch, HDFS-14654.005.patch, error.log
>
>
> They sometimes pass and sometimes fail.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-14811) RBF: TestRouterRpc#testErasureCoding is flaky

2019-09-01 Thread Chen Zhang (Jira)
Chen Zhang created HDFS-14811:
-

 Summary: RBF: TestRouterRpc#testErasureCoding is flaky
 Key: HDFS-14811
 URL: https://issues.apache.org/jira/browse/HDFS-14811
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Chen Zhang
Assignee: Chen Zhang


Failure reason:

{code:java}
2019-09-01 18:19:20,940 [IPC Server handler 5 on default port 53140] INFO  
blockmanagement.BlockPlacementPolicy 
(BlockPlacementPolicyDefault.java:chooseRandom(838)) - [
Node /default-rack/127.0.0.1:53148 [
]
Node /default-rack/127.0.0.1:53161 [
]
Node /default-rack/127.0.0.1:53157 [
  Datanode 127.0.0.1:53157 is not chosen since the node is too busy (load: 3 > 
2.6665).
Node /default-rack/127.0.0.1:53143 [
]
Node /default-rack/127.0.0.1:53165 [
]
2019-09-01 18:19:20,940 [IPC Server handler 5 on default port 53140] INFO  
blockmanagement.BlockPlacementPolicy 
(BlockPlacementPolicyDefault.java:chooseRandom(846)) - Not enough replicas was 
chosen. Reason: {NODE_TOO_BUSY=1}
2019-09-01 18:19:20,941 [IPC Server handler 5 on default port 53140] WARN  
blockmanagement.BlockPlacementPolicy 
(BlockPlacementPolicyDefault.java:chooseTarget(449)) - Failed to place enough 
replicas, still in need of 1 to reach 6 (unavailableStorages=[], 
storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], 
creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) 
2019-09-01 18:19:20,941 [IPC Server handler 5 on default port 53140] WARN  
protocol.BlockStoragePolicy (BlockStoragePolicy.java:chooseStorageTypes(161)) - 
Failed to place enough replicas: expected size is 1 but only 0 storage types 
can be selected (replication=6, selected=[], unavailable=[DISK], 
removed=[DISK], policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], 
creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
2019-09-01 18:19:20,941 [IPC Server handler 5 on default port 53140] WARN  
blockmanagement.BlockPlacementPolicy 
(BlockPlacementPolicyDefault.java:chooseTarget(449)) - Failed to place enough 
replicas, still in need of 1 to reach 6 (unavailableStorages=[DISK], 
storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], 
creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) All 
required storage types are unavailable:  unavailableStorages=[DISK], 
storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], 
creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
2019-09-01 18:19:20,941 [IPC Server handler 5 on default port 53140] INFO  
ipc.Server (Server.java:logException(2982)) - IPC Server handler 5 on default 
port 53140, call Call#1270 Retry#0 
org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 127.0.0.1:53202
java.io.IOException: File /testec/testfile2 could only be written to 5 of the 6 
required nodes for RS-6-3-1024k. There are 6 datanode(s) running and 6 node(s) 
are excluded in this operation.
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2815)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:893)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:574)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:529)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1001)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:929)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2921)
2019-09-01 18:19:20,942 [IPC Server handler 6 on default port 53197] INFO  
ipc.Server (Server.java:logException(2975)) - IPC Server handler 6 on default 
port 53197, call Call#1268 Retry#0 
org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 
192.168.1.112:53201: java.io.IOException: File /testec/testfile2 could only be 
written to 5 of the 6 required nodes for RS-6-3-1024k. There are 6 datanode(s) 
running and 6 node(s) are excluded in this operation.
{code}
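The NODE_TOO_BUSY rejection in the log above can be illustrated with a small sketch. It assumes the placement policy excludes a node whose load exceeds a factor times the cluster average, with a factor of 2.0; the class, the method names, and the per-node loads are hypothetical and chosen only so that one node lands above the threshold, as in the log. It is not the BlockPlacementPolicyDefault code.

```java
// Simplified sketch of a load-based exclusion check; all names are hypothetical.
final class LoadCheckSketch {

    /** Returns true when a node's load exceeds factor * average load. */
    static boolean isTooBusy(int nodeLoad, double averageLoad, double factor) {
        return nodeLoad > factor * averageLoad;
    }

    /** Counts the nodes that pass the load check. */
    static int countEligible(int[] loads, double factor) {
        double sum = 0;
        for (int load : loads) {
            sum += load;
        }
        double average = loads.length == 0 ? 0 : sum / loads.length;
        int eligible = 0;
        for (int load : loads) {
            if (!isTooBusy(load, average, factor)) {
                eligible++;
            }
        }
        return eligible;
    }

    public static void main(String[] args) {
        int[] loads = {1, 1, 3, 1, 1, 1}; // assumed per-node loads
        int required = 6;                 // nodes required, as reported in the log
        int eligible = countEligible(loads, 2.0);
        System.out.println("eligible=" + eligible + ", required=" + required
            + ", placementFails=" + (eligible < required));
    }
}
```

With six datanodes and six required nodes, a single node above the threshold is enough to make the write fail, which would explain why the test is sensitive to transient load.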

For more discussion, see: 
[HDFS-14654|https://issues.apache.org/jira/browse/HDFS-14654?focusedCommentId=16920439&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16

[jira] [Commented] (HDFS-14810) review FSNameSystem editlog sync

2019-09-01 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920636#comment-16920636
 ] 

Ayush Saxena commented on HDFS-14810:
-

Thanks [~hexiaoqiao] for the patch. I think that was missed; you can add it in 
an update and keep it consistent with all the others.

It seems a couple of comments from HDFS-11246 were missed, such as adding a 
success audit for {{isFileClosed}} and {{checkAccess}}. Apart from that, if 
Jenkins stays clean, this should be all.

> review FSNameSystem editlog sync
> 
>
> Key: HDFS-14810
> URL: https://issues.apache.org/jira/browse/HDFS-14810
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Attachments: HDFS-14810.001.patch, HDFS-14810.002.patch
>
>
> Refactor and unify the type of edit log sync in FSNamesystem, as mentioned 
> in HDFS-11246.






[jira] [Commented] (HDFS-13157) Do Not Remove Blocks Sequentially During Decommission

2019-09-01 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920633#comment-16920633
 ] 

Hadoop QA commented on HDFS-13157:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
26s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 44s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 47s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 2 new + 162 unchanged - 1 fixed = 164 total (was 163) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 45s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}101m  0s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}164m 26s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.balancer.TestBalancer |
|   | hadoop.hdfs.server.namenode.TestDefaultBlockPlacementPolicy |
|   | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
|   | hadoop.hdfs.TestErasureCodingPolicyWithSnapshot |
|   | hadoop.hdfs.server.namenode.TestNamenodeRetryCache |
|   | hadoop.hdfs.TestDecommission |
|   | hadoop.hdfs.server.namenode.TestFSNamesystemMBean |
|   | hadoop.hdfs.server.namenode.TestAddStripedBlocks |
|   | hadoop.hdfs.server.namenode.ha.TestBootstrapStandbyWithQJM |
|   | hadoop.hdfs.TestClientProtocolForPipelineRecovery |
|   | hadoop.hdfs.server.namenode.TestPersistentStoragePolicySatisfier |
|   | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
|   | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy |
|   | hadoop.hdfs.TestStripedFileAppend |
|   | hadoop.hdfs.server.namenode.TestCacheDirectives |
|   | hadoop.hdfs.server.namenode.TestStoragePolicySatisfierWithHA |
|   | hadoop.hdfs.TestGetBlocks |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | HDFS-13157 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12979

[jira] [Commented] (HDFS-14654) RBF: TestRouterRpc#testNamenodeMetrics is flaky

2019-09-01 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920631#comment-16920631
 ] 

Ayush Saxena commented on HDFS-14654:
-

v005 LGTM +1

> RBF: TestRouterRpc#testNamenodeMetrics is flaky
> ---
>
> Key: HDFS-14654
> URL: https://issues.apache.org/jira/browse/HDFS-14654
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Takanobu Asanuma
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-14654.001.patch, HDFS-14654.002.patch, 
> HDFS-14654.003.patch, HDFS-14654.004.patch, HDFS-14654.005.patch, error.log
>
>
> They sometimes pass and sometimes fail.






[jira] [Updated] (HDFS-14801) PrometheusMetricsSink: Better support for NNTop

2019-09-01 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-14801:
-
Status: Patch Available  (was: Open)

> PrometheusMetricsSink: Better support for NNTop
> ---
>
> Key: HDFS-14801
> URL: https://issues.apache.org/jira/browse/HDFS-14801
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: metrics
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
>
> Currently, nntop metrics are flattened as 
> dfs.NNTopUserOpCounts.windowMs=.op=.user=.count.
> I'd like to make windowMs, op, and user labels instead of parts of the name, 
> for more Prometheus-friendly metrics.
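To illustrate the proposal, here is a minimal sketch that turns a flattened nntop name into a labeled sample in the Prometheus text format. The class name, the example values (60000, create, alice), the regular expression, and the resulting metric name are all assumptions made for illustration; this is not the PrometheusMetricsSink implementation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: converts a flattened nntop name into a labeled sample.
final class NNTopLabelSketch {

    // Assumed shape of the flattened name: <prefix>.windowMs=<n>.op=<op>.user=<user>.count
    private static final Pattern FLATTENED = Pattern.compile(
        "^(.+)\\.windowMs=(\\d+)\\.op=([^.]+)\\.user=(.+)\\.count$");

    /** Escapes backslashes and double quotes inside a label value. */
    private static String escape(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Builds one exposition line with windowMs, op, and user as labels. */
    static String toLabeledSample(String flattenedName, long value) {
        Matcher m = FLATTENED.matcher(flattenedName);
        if (!m.matches()) {
            throw new IllegalArgumentException(
                "Not a flattened nntop name: " + flattenedName);
        }
        // Dots are not allowed in Prometheus metric names, so replace them.
        String metric = m.group(1).replace('.', '_') + "_count";
        return metric
            + "{windowMs=\"" + m.group(2)
            + "\",op=\"" + escape(m.group(3))
            + "\",user=\"" + escape(m.group(4))
            + "\"} " + value;
    }

    public static void main(String[] args) {
        System.out.println(toLabeledSample(
            "dfs.NNTopUserOpCounts.windowMs=60000.op=create.user=alice.count", 7));
    }
}
```

Keeping the three dimensions as labels gives one metric name with many series, so queries can aggregate by op or user without matching on the metric name.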






[jira] [Updated] (HDFS-14740) HDFS read cache persistence support

2019-09-01 Thread Feilong He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feilong He updated HDFS-14740:
--
Description: In HDFS-13762, persistent memory (PM) is enabled in HDFS 
centralized cache management. Even though PM can persist cache data, to 
simplify the initial implementation, the previous cache data is cleaned up 
when the DataNode restarts. Here, we propose to improve the HDFS PM cache by 
taking advantage of PM's data persistence, i.e., recovering the cache status 
when the DataNode restarts, so that cache warm-up time is saved for the 
user.  (was: In HDFS-13762, persistent memory is enabled in HDFS centralized 
cache management. Even though persistent memory can persist cache data, for 
simplifying the implementation, the previous cache data will be cleaned up 
during DataNode restarts. We propose to improve HDFS persistent memory (PM) 
cache by taking advantage of PM's data persistence characteristic, i.e., 
recovering the cache status when DataNode restarts, thus, cache warm up time 
can be saved for user.)

> HDFS read cache persistence support
> ---
>
> Key: HDFS-14740
> URL: https://issues.apache.org/jira/browse/HDFS-14740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Feilong He
>Assignee: Rui Mo
>Priority: Major
> Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, 
> HDFS-14740.002.patch
>
>
> In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache 
> management. Even though PM can persist cache data, to simplify the initial 
> implementation, the previous cache data is cleaned up when the DataNode 
> restarts. Here, we propose to improve the HDFS PM cache by taking advantage 
> of PM's data persistence, i.e., recovering the cache status when the 
> DataNode restarts, so that cache warm-up time is saved for the user.






[jira] [Commented] (HDDS-2054) Bad preamble for HttpChannelOverHttp In the Ozone

2019-09-01 Thread Elek, Marton (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920620#comment-16920620
 ] 

Elek, Marton commented on HDDS-2054:


Thank you very much for reporting this problem, [~Jack-Lee].

I tried to reproduce it, but it worked well for me.

I have only two ideas:

 * Ozone may have started slowly. The services need 10-30 seconds to start up. 
Can you please confirm that you see the same error after 1-2 minutes?
 * AWS credentials may be missing or in the wrong format. I am not sure about 
this: I have a configured, valid AWS key by default, and when I tried deleting 
it I didn't get the same error message. Can you please confirm whether you set 
your AWS credentials (or already had some)?

You can also upload a detailed log by using the debug flag:

{code}
aws s3api --debug --endpoint http://192.168.99.100:9878 create-bucket --bucket bucket1
{code}

It would help me to reproduce it locally, as the whole request is printed to 
the console with all the headers. (If you have any sensitive data in the log, 
please remove it. The AWS access key id can be there, but the secret is not; 
only the signature generated from the secret is.)

> Bad preamble for HttpChannelOverHttp In the Ozone
> -
>
> Key: HDDS-2054
> URL: https://issues.apache.org/jira/browse/HDDS-2054
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>  Components: Ozone Client, Ozone Filesystem, Ozone Manager
>Affects Versions: 0.4.0
> Environment: MacOS
>Reporter: lqjacklee
>Priority: Minor
>
> Following the guide: 
> https://cwiki.apache.org/confluence/display/HADOOP/Running+via+DockerHub 
> I deployed Ozone in Docker, then executed the command 
> aws s3api --endpoint http://192.168.99.100:9878 create-bucket --bucket bucket1
> The log shows:
> 2019-08-29 02:07:13 WARN  HttpParser:1454 - bad HTTP parsed: 400 Bad preamble 
> for HttpChannelOverHttp@49ddb402{r=0,c=false,a=IDLE,uri=null}






[jira] [Commented] (HDFS-14810) review FSNameSystem editlog sync

2019-09-01 Thread He Xiaoqiao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920612#comment-16920612
 ] 

He Xiaoqiao commented on HDFS-14810:


[^HDFS-14810.002.patch] tries to fix some exceptions and follows the comments 
[~ayushtkn] made above.
I am confused about #enableErasureCodingPolicy and #disableErasureCodingPolicy, 
which do not throw AccessControlException when they encounter one. Is there any 
special consideration?

> review FSNameSystem editlog sync
> 
>
> Key: HDFS-14810
> URL: https://issues.apache.org/jira/browse/HDFS-14810
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Attachments: HDFS-14810.001.patch, HDFS-14810.002.patch
>
>
> Refactor and unify the type of edit log sync in FSNamesystem, as mentioned 
> in HDFS-11246.






[jira] [Updated] (HDFS-14810) review FSNameSystem editlog sync

2019-09-01 Thread He Xiaoqiao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Xiaoqiao updated HDFS-14810:
---
Attachment: HDFS-14810.002.patch

> review FSNameSystem editlog sync
> 
>
> Key: HDFS-14810
> URL: https://issues.apache.org/jira/browse/HDFS-14810
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Attachments: HDFS-14810.001.patch, HDFS-14810.002.patch
>
>
> Refactor and unify the type of edit log sync in FSNamesystem, as mentioned 
> in HDFS-11246.






[jira] [Commented] (HDFS-13843) RBF: Add optional parameter -d for detailed listing of mount points.

2019-09-01 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920600#comment-16920600
 ] 

Hudson commented on HDFS-13843:
---

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17220 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17220/])
HDFS-13843. RBF: Add optional parameter -d for detailed listing of mount 
(ayushsaxena: rev c3abfcefdd256650b2a45ae2aac53c4a22721a46)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterAdminCLI.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSCommands.md
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/tools/federation/RouterAdmin.java


> RBF: Add optional parameter -d for detailed listing of mount points.
> 
>
> Key: HDFS-13843
> URL: https://issues.apache.org/jira/browse/HDFS-13843
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: federation
>Reporter: Soumyapn
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: RBF
> Fix For: 3.3.0
>
> Attachments: HDFS-13843-03.patch, HDFS-13843-04.patch, 
> HDFS-13843.01.patch, HDFS-13843.02.patch
>
>
> *Scenario:*
> Execute the add/update commands below for a single mount entry on a single 
> nameservice pointing to multiple destinations. 
>  # hdfs dfsrouteradmin -add /apps1 hacluster /tmp1
>  # hdfs dfsrouteradmin -add /apps1 hacluster /tmp1,/tmp2,/tmp3
>  # hdfs dfsrouteradmin -update /apps1 hacluster /tmp1,/tmp2,/tmp3 -order 
> RANDOM
> *Actual:* With the above commands, the mount entry is successfully updated.
> But order information such as HASH or RANDOM is not displayed in the mount 
> entries or in the federation router UI. However, order information is 
> updated properly when there are multiple nameservices. The issue is with a 
> single nameservice having multiple destinations.
> *Expected:* 
> *Order information should be updated in mount entries so that the user 
> knows which order has been set.*
>  






[jira] [Updated] (HDFS-13913) LazyPersistFileScrubber.run() error handling is poor

2019-09-01 Thread Daniel Templeton (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton updated HDFS-13913:

Status: Patch Available  (was: Open)

> LazyPersistFileScrubber.run() error handling is poor
> 
>
> Key: HDFS-13913
> URL: https://issues.apache.org/jira/browse/HDFS-13913
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.1.0
>Reporter: Daniel Templeton
>Assignee: Daniel Green
>Priority: Minor
> Attachments: HDFS-13913.001.patch
>
>
> In {{LazyPersistFileScrubber.run()}} we have:
> {code}
> try {
>   clearCorruptLazyPersistFiles();
> } catch (Exception e) {
>   FSNamesystem.LOG.error(
>   "Ignoring exception in LazyPersistFileScrubber:", e);
> }
> {code}
> First problem is that catching {{Exception}} is sloppy.  It should instead be 
> a multicatch for the actual exceptions thrown or better a set of separate 
> catch statements that react appropriately to the type of exception.
> Second problem is that it's bad to log an ERROR that's not actionable and 
> that can be safely ignored.  The log message should be logged at WARN or INFO 
> level.
> Third, the log message is useless.  If it's going to be a WARN or ERROR, a 
> log message should be actionable.  Otherwise it's an info.  A log message 
> should contain enough information for an admin to understand what it means.
> In the end, I think the right thing here is to leave the high-level behavior 
> unchanged: log a message and ignore the error, hoping that the next run will 
> go better.
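A minimal sketch of the shape described above: separate catch clauses, a WARN-level message that says what happened and what to do, and the same high-level behavior of logging and moving on. It uses java.util.logging so that it is self-contained. The ScrubTask interface, the class name, the message wording, and the assumption that the scrubbing work throws IOException are all hypothetical, not taken from FSNamesystem.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the scrubber loop body; not the FSNamesystem code.
final class ScrubberSketch {
    private static final Logger LOG = Logger.getLogger("ScrubberSketch");

    /** Assumed shape of the work done in each scrubber pass. */
    interface ScrubTask {
        void run() throws IOException;
    }

    /**
     * Runs one pass. Returns true on success, or false if the failure was
     * logged and ignored so that the next pass can try again.
     */
    static boolean runOnce(ScrubTask task) {
        try {
            task.run();
            return true;
        } catch (IOException e) {
            // Expected failure mode: tell the admin what failed and what to do.
            LOG.log(Level.WARNING, "Lazy-persist scrubber pass failed with an I/O "
                + "error and will be retried on the next pass; investigate only "
                + "if this message repeats", e);
            return false;
        } catch (RuntimeException e) {
            // Unexpected failure: keep the thread alive, but say so explicitly.
            LOG.log(Level.WARNING, "Lazy-persist scrubber pass hit an unexpected "
                + "error and will be retried on the next pass", e);
            return false;
        }
    }
}
```

Returning a boolean is only a convenience for testing the sketch; the point is that each catch clause can carry its own message and level.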






[jira] [Commented] (HDFS-14806) Bootstrap standby may fail if used in-progress tailing

2019-09-01 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920593#comment-16920593
 ] 

Ayush Saxena commented on HDFS-14806:
-

Thanks [~vagarychen] for the report. I had suspicions about this logic earlier 
too, but I didn't investigate further, since my problem diverged onto a 
different path.
The idea seems quite fair to me. However, if the number of transactions is very 
large, won't there be a flood of RPCs? Could we try increasing the number 
dynamically, according to the situation?
Secondly, {{dfs.ha.tail-edits.qjm.rpc.max-txns}} seems to be quite a critical 
conf; shouldn't it be exposed too?

> Bootstrap standby may fail if used in-progress tailing
> --
>
> Key: HDFS-14806
> URL: https://issues.apache.org/jira/browse/HDFS-14806
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Major
> Attachments: HDFS-14806.001.patch
>
>
> One issue we came across was that if in-progress tailing is enabled, 
> bootstrap standby could fail.
> When in-progress tailing is enabled, Bootstrap uses the RPC mechanism to get 
> edits. There is a config {{dfs.ha.tail-edits.qjm.rpc.max-txns}} that sets an 
> upper bound on how many txnids can be included in one RPC call. The default 
> is 5000, meaning the bootstrapping NN (say NN1) can pull at most 5000 edits 
> from the JNs. However, as part of bootstrap, NN1 queries another NN (say NN2) 
> for NN2's current transaction ID, and NN2 may return a state that is > 5000 
> txnids ahead of NN1's current image. But NN1 can only see 5000 more txnids 
> from the JNs. At this point NN1 panics, because the txnid returned by the JNs 
> is behind NN2's returned state, and bootstrap then fails.
> Essentially, bootstrap standby can fail if both of the following conditions 
> are met:
>  # in-progress tailing is enabled AND
>  # the bootstrapping NN is too far (> 5000 txnids) behind 
> Increasing the value of {{dfs.ha.tail-edits.qjm.rpc.max-txns}} to some very 
> large value allowed bootstrap to continue. But this is hardly the ideal 
> solution.
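The failure condition above comes down to simple arithmetic, sketched here. The class and method names are hypothetical, and callsNeeded only shows how many bounded calls would cover the gap if calls were repeated; it is not a description of the attached patch.

```java
// Hypothetical arithmetic for the bootstrap gap; not code from the patch.
final class BootstrapTailSketch {

    /** Highest txid visible after one RPC returning at most maxTxns edits. */
    static long reachableInOneCall(long imageTxId, long maxTxns) {
        return imageTxId + maxTxns;
    }

    /** True if a single bounded RPC can reach the txid reported by the other NN. */
    static boolean singleCallSuffices(long imageTxId, long targetTxId, long maxTxns) {
        return reachableInOneCall(imageTxId, maxTxns) >= targetTxId;
    }

    /** Number of bounded calls needed to cover the gap if calls were repeated. */
    static long callsNeeded(long imageTxId, long targetTxId, long maxTxns) {
        long gap = Math.max(0, targetTxId - imageTxId);
        return (gap + maxTxns - 1) / maxTxns; // ceiling division
    }

    public static void main(String[] args) {
        long imageTxId = 1000;   // assumed txid of NN1's current image
        long targetTxId = 13000; // assumed txid reported by NN2
        long maxTxns = 5000;     // default stated in the description
        System.out.println("single call suffices: "
            + singleCallSuffices(imageTxId, targetTxId, maxTxns));
        System.out.println("calls needed: "
            + callsNeeded(imageTxId, targetTxId, maxTxns));
    }
}
```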






[jira] [Updated] (HDFS-14802) The feature of protect directories should be used in RenameOp

2019-09-01 Thread Fei Hui (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fei Hui updated HDFS-14802:
---
Attachment: HDFS-14802.003.patch

> The feature of protect directories should be used in RenameOp
> -
>
> Key: HDFS-14802
> URL: https://issues.apache.org/jira/browse/HDFS-14802
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: HDFS-14802.001.patch, HDFS-14802.002.patch, 
> HDFS-14802.003.patch
>
>
> Currently, we can set fs.protected.directories to prevent users from deleting 
> important directories. But users can work around the limitation and delete them:
> 1. Rename the directories and delete them.
> 2. Move the directories to trash, and the namenode will delete them.
> So I think we should apply the protected directories feature in RenameOp.
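The idea can be sketched as a check run before a rename. This is a simplified illustration under stated assumptions: the class and method names are hypothetical, paths are plain strings, and the sketch blocks a rename whenever the source is a protected directory or an ancestor of one, without looking at whether the directory is empty. The actual patch may differ.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical pre-rename check; not the FSDirRenameOp implementation.
final class ProtectedRenameSketch {
    private final Set<String> protectedDirs = new TreeSet<>();

    ProtectedRenameSketch(String... dirs) {
        for (String dir : dirs) {
            protectedDirs.add(normalize(dir));
        }
    }

    /** Strips trailing slashes, keeping the root path "/" intact. */
    private static String normalize(String path) {
        String p = path;
        while (p.length() > 1 && p.endsWith("/")) {
            p = p.substring(0, p.length() - 1);
        }
        return p;
    }

    /** True if renaming src would move a protected directory. */
    boolean isRenameBlocked(String src) {
        String s = normalize(src);
        String prefix = s.equals("/") ? "/" : s + "/";
        for (String dir : protectedDirs) {
            if (dir.equals(s) || dir.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```

Since moving to trash is itself a rename, a check at this point would cover both workarounds listed above.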






[jira] [Commented] (HDFS-14802) The feature of protect directories should be used in RenameOp

2019-09-01 Thread Fei Hui (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920579#comment-16920579
 ] 

Fei Hui commented on HDFS-14802:


Uploaded the v003 patch, which fixes checkstyle.

> The feature of protect directories should be used in RenameOp
> -
>
> Key: HDFS-14802
> URL: https://issues.apache.org/jira/browse/HDFS-14802
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: HDFS-14802.001.patch, HDFS-14802.002.patch, 
> HDFS-14802.003.patch
>
>
> Currently, we can set fs.protected.directories to prevent users from deleting 
> important directories. But users can work around the limitation and delete them:
> 1. Rename the directories and delete them.
> 2. Move the directories to trash, and the namenode will delete them.
> So I think we should apply the protected directories feature in RenameOp.






[jira] [Comment Edited] (HDFS-14751) TestNameNodeMetadataConsistency#testGenerationStampInFuture fail in trunk

2019-09-01 Thread Sean Chow (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920574#comment-16920574
 ] 

Sean Chow edited comment on HDFS-14751 at 9/2/19 2:38 AM:
--

Hi [~leosun08], shall we just synchronize {{diffs}}?

{code:java}
  void reconcile() throws IOException {
    scan();
    synchronized (diffs) {
      for (Entry<String, LinkedList<ScanInfo>> entry : diffs.entrySet()) {
        String bpid = entry.getKey();
        LinkedList<ScanInfo> diff = entry.getValue();

        for (ScanInfo info : diff) {
          dataset.checkAndUpdate(bpid, info.getBlockId(), info.getBlockFile(),
              info.getMetaFile(), info.getVolume());
        }
      }
    }
    if (!retainDiffs) clear();
  }
{code}

And in {{scan()}}:
{code:java}
// Hold FSDataset lock to prevent further changes to the block map
synchronized (dataset) {
  synchronized (diffs) {...}
}
{code}
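As a self-contained illustration of the suggestion, the sketch below guards a shared map with the same monitor in both the writer and the reader, which is what prevents a ConcurrentModificationException during iteration. The class name, the simplified value type, and the method signatures are hypothetical and much simpler than DirectoryScanner.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of a scanner whose diffs map is shared.
final class DiffsLockSketch {
    private final Map<String, List<Long>> diffs = new HashMap<>();

    /** Writer: records the differences for a block pool under the diffs lock. */
    void scan(String bpid, List<Long> blockIds) {
        synchronized (diffs) {
            diffs.put(bpid, new ArrayList<>(blockIds));
        }
    }

    /** Reader: iterates under the same lock, so no writer can interleave. */
    int reconcile() {
        int processed = 0;
        synchronized (diffs) {
            for (Map.Entry<String, List<Long>> entry : diffs.entrySet()) {
                processed += entry.getValue().size();
            }
        }
        return processed;
    }
}
```

One thing to watch with nested locks such as the dataset and diffs pair above: every code path that takes both must take them in the same order, otherwise two threads can deadlock.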



was (Author: seanlook):
Shall we just synchronize {{diffs}}

{code:java}
  void reconcile() throws IOException {
    scan();
    synchronized (diffs) {
      for (Entry<String, LinkedList<ScanInfo>> entry : diffs.entrySet()) {
        String bpid = entry.getKey();
        LinkedList<ScanInfo> diff = entry.getValue();

        for (ScanInfo info : diff) {
          dataset.checkAndUpdate(bpid, info.getBlockId(), info.getBlockFile(),
              info.getMetaFile(), info.getVolume());
        }
      }
    }
    if (!retainDiffs) clear();
  }
{code}

And in {{scan()}}:
{code:java}
// Hold FSDataset lock to prevent further changes to the block map
synchronized (dataset) {
  synchronized (diffs) {...}
}
{code}


> TestNameNodeMetadataConsistency#testGenerationStampInFuture fail in trunk
> -
>
> Key: HDFS-14751
> URL: https://issues.apache.org/jira/browse/HDFS-14751
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Minor
> Attachments: HDFS-14751.001.patch
>
>
> {code:java}
> [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 21.693 s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency
> [ERROR] 
> testGenerationStampInFuture(org.apache.hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency)
>   Time elapsed: 7.572 s  <<< ERROR!
> java.util.ConcurrentModificationException
>   at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:909)
>   at java.util.ArrayList$Itr.next(ArrayList.java:859)
>   at 
> com.google.common.collect.AbstractMapBasedMultimap$Itr.next(AbstractMapBasedMultimap.java:1153)
>   at 
> java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1044)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:433)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNodeTestUtils.runDirectoryScanner(DataNodeTestUtils.java:202)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency.testGenerationStampInFuture(TestNameNodeMetadataConsistency.java:92)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.suref

[jira] [Commented] (HDFS-14751) TestNameNodeMetadataConsistency#testGenerationStampInFuture fail in trunk

2019-09-01 Thread Sean Chow (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920574#comment-16920574
 ] 

Sean Chow commented on HDFS-14751:
--

Shall we just synchronize on {{diffs}}?

{code:java}
  void reconcile() throws IOException {
    scan();
    synchronized (diffs) {
      for (Entry<String, LinkedList<ScanInfo>> entry : diffs.entrySet()) {
        String bpid = entry.getKey();
        LinkedList<ScanInfo> diff = entry.getValue();

        for (ScanInfo info : diff) {
          dataset.checkAndUpdate(bpid, info.getBlockId(), info.getBlockFile(),
              info.getMetaFile(), info.getVolume());
        }
      }
    }
    if (!retainDiffs) clear();
  }
{code}

And in {{scan()}}:
{code:java}
// Hold FSDataset lock to prevent further changes to the block map
synchronized(dataset) {
synchronized(diffs) {...}
}
{code}


> TestNameNodeMetadataConsistency#testGenerationStampInFuture fail in trunk
> -
>
> Key: HDFS-14751
> URL: https://issues.apache.org/jira/browse/HDFS-14751
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Lisheng Sun
>Assignee: Lisheng Sun
>Priority: Minor
> Attachments: HDFS-14751.001.patch
>
>
> {code:java}
> [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 21.693 s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency
> [ERROR] 
> testGenerationStampInFuture(org.apache.hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency)
>   Time elapsed: 7.572 s  <<< ERROR!
> java.util.ConcurrentModificationException
>   at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:909)
>   at java.util.ArrayList$Itr.next(ArrayList.java:859)
>   at 
> com.google.common.collect.AbstractMapBasedMultimap$Itr.next(AbstractMapBasedMultimap.java:1153)
>   at 
> java.util.Collections$UnmodifiableCollection$1.next(Collections.java:1044)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:433)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DataNodeTestUtils.runDirectoryScanner(DataNodeTestUtils.java:202)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency.testGenerationStampInFuture(TestNameNodeMetadataConsistency.java:92)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> {code}
> Ref:[https://builds.apache.org/job/PreCommit-HDFS-Build/27567/artifact/out/patch-unit-hadoop-hdf

[jira] [Updated] (HDFS-13843) RBF: Add optional parameter -d for detailed listing of mount points.

2019-09-01 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-13843:

Fix Version/s: 3.3.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Committed to trunk.
Thanks [~elgoiri] for the review!

> RBF: Add optional parameter -d for detailed listing of mount points.
> 
>
> Key: HDFS-13843
> URL: https://issues.apache.org/jira/browse/HDFS-13843
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: federation
>Reporter: Soumyapn
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: RBF
> Fix For: 3.3.0
>
> Attachments: HDFS-13843-03.patch, HDFS-13843-04.patch, 
> HDFS-13843.01.patch, HDFS-13843.02.patch
>
>
> *Scenario:*
> Execute the add/update commands below for a single mount entry on a single 
> nameservice pointing to multiple destinations. 
>  # hdfs dfsrouteradmin -add /apps1 hacluster /tmp1
>  # hdfs dfsrouteradmin -add /apps1 hacluster /tmp1,/tmp2,/tmp3
>  # hdfs dfsrouteradmin -update /apps1 hacluster /tmp1,/tmp2,/tmp3 -order 
> RANDOM
> *Actual:* With the above commands, the mount entry is updated successfully, 
> but order information such as HASH or RANDOM is not displayed in the mount 
> entries or in the federation router UI. Order information is updated properly 
> when there are multiple nameservices; the issue occurs only with a single 
> nameservice that has multiple destinations.
> *Expected:* 
> *Order information should be updated in the mount entries so that the user 
> can see which order has been set.*
>  






[jira] [Updated] (HDFS-14787) NameNode error

2019-09-01 Thread Cao, Lionel (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao, Lionel updated HDFS-14787:
---
Summary: NameNode error   (was: [Help] NameNode error )

> NameNode error 
> ---
>
> Key: HDFS-14787
> URL: https://issues.apache.org/jira/browse/HDFS-14787
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Cao, Lionel
>Priority: Major
> Attachments: core-site.xml, 
> hadoop-cmf-hdfs-NAMENODE-smc-nn02.jq.log.out.20190827, hdfs-site.xml, 
> move&concat.java, rt-Append.txt
>
>
> Hi committee,
> We encountered the NameNode (NN) error shown below.
> The primary NN shut down last Thursday, and we recovered it by removing some 
> ops from the edit log. But the standby NN shut down again yesterday with the 
> same error.
> Could you please help identify the possible root cause?
>  
> Part of the error log is included below; for the full log and the NameNode 
> configuration, please refer to the attachments.
> I have also attached some Java code that could be causing the error:
>  # We do some append actions in a Spark Streaming program (rt-Append.txt), 
> which caused the primary NN shutdown last Thursday.
>  # We do some move and concat operations in a data conversion program 
> (move&concat.java), which caused the standby NN shutdown yesterday.
> 2019-08-27 09:51:12,409 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 766146/953617 transactions completed. (80%)
> 2019-08-27 09:51:12,409 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 766146/953617 transactions completed. (80%)
> 2019-08-27 09:51:12,858 INFO org.apache.hadoop.hdfs.server.namenode.FSDirectory: Increasing replication from 2 to 2 for /user/smcjob/.sparkStaging/application_1561429828507_20423/__spark_libs__2381992047634476351.zip
> 2019-08-27 09:51:12,870 INFO org.apache.hadoop.hdfs.server.namenode.FSDirectory: Increasing replication from 2 to 2 for /user/smcjob/.sparkStaging/application_1561429828507_20423/oozietest2-0.0.1-SNAPSHOT.jar
> 2019-08-27 09:51:12,898 INFO org.apache.hadoop.hdfs.server.namenode.FSDirectory: Increasing replication from 2 to 2 for /user/smcjob/.sparkStaging/application_1561429828507_20423/__spark_conf__.zip
> 2019-08-27 09:51:12,910 INFO org.apache.hadoop.hdfs.server.namenode.FSDirectory: Increasing replication from 2 to 2 for /user/smctest/.sparkStaging/application_1561429828507_20424/__spark_libs__8875310030853528804.zip
> 2019-08-27 09:51:12,927 INFO org.apache.hadoop.hdfs.server.namenode.FSDirectory: Increasing replication from 2 to 2 for /user/smctest/.sparkStaging/application_1561429828507_20424/__spark_conf__.zip
> 2019-08-27 09:51:13,777 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 857745/953617 transactions completed. (90%)
> 2019-08-27 09:51:14,035 INFO org.apache.hadoop.hdfs.server.namenode.FSDirectory: Increasing replication from 2 to 2 for /user/smc_ss/.sparkStaging/application_1561429828507_20425/__spark_libs__749681005558653.zip
> 2019-08-27 09:51:14,067 INFO org.apache.hadoop.hdfs.server.namenode.FSDirectory: Increasing replication from 2 to 2 for /user/smc_ss/.sparkStaging/application_1561429828507_20426/__spark_libs__7479542421029947753.zip
> 2019-08-27 09:51:14,070 INFO org.apache.hadoop.hdfs.server.namenode.FSDirectory: Increasing replication from 2 to 2 for /user/smctest/.sparkStaging/application_1561429828507_20428/__spark_libs__7647933078788028649.zip
> 2019-08-27 09:51:14,075 ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: 
> Encountered exception on operation CloseOp [length=0, inodeId=0, 
> path=/**/v2-data-20190826.mayfly.data, replication=2, 
> mtime=1566870616821, atime=1566870359230, blockSize=134217728, 
> blocks=[blk_1270599798_758966421, blk_1270599852_758967928, 
> blk_1270601282_759026903, blk_1270602443_759027052, blk_1270602446_759061086, 
> blk_1270603081_759050235], permissions=smc_ss:smc_ss:rw-r--r--, 
> aclEntries=null, clientName=, clientMachine=, overwrite=false, 
> storagePolicyId=0, erasureCodingPolicyId=0, opCode=OP_CLOSE, 
> txid=4359520942]
> java.io.IOException: Mismatched block IDs or generation stamps, attempting to 
> replace block blk_1270602446_759027503 with blk_1270602446_759061086 as 
> block # 4/6 of /**/v2-data-20190826.mayfly.data
>  at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.updateBlocks(FSEditLogLoader.java:1096)
>  at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:452)
>  at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:249)
>  at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdi

[jira] [Commented] (HDFS-14802) The feature of protect directories should be used in RenameOp

2019-09-01 Thread Fei Hui (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920559#comment-16920559
 ] 

Fei Hui commented on HDFS-14802:


[~hexiaoqiao] Thanks for your comments, good suggestions!
Regex matching is one way; a separate configurable file for protected 
directories is another. We chose the latter in our scenario. I plan to file a 
new issue to discuss improving the protected directories configuration.

> The feature of protect directories should be used in RenameOp
> -
>
> Key: HDFS-14802
> URL: https://issues.apache.org/jira/browse/HDFS-14802
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: HDFS-14802.001.patch, HDFS-14802.002.patch
>
>
> Currently we can set fs.protected.directories to prevent users from deleting 
> important directories. But users can work around the limitation and still 
> delete them:
> 1. Rename the directories and then delete them.
> 2. Move the directories to trash, and the namenode will delete them.
> So I think we should apply the protected directories feature in RenameOp.
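
As a rough illustration of the proposal, a rename-side guard could look like the sketch below. This is not the attached patch: {{ProtectedRenameSketch}} and {{isRenameBlocked}} are hypothetical names, paths are plain strings, and details such as whether the directory is empty are deliberately left out. It only shows the core idea of rejecting a rename whose source is a protected directory or an ancestor of one.

```java
import java.util.Set;

/** Hypothetical helper; it does not mirror the real NameNode code. */
class ProtectedRenameSketch {

  /**
   * Returns true when renaming src would move a protected directory,
   * i.e. src is a protected directory itself or an ancestor of one.
   */
  static boolean isRenameBlocked(Set<String> protectedDirs, String src) {
    // Drop a trailing slash so "/apps/" and "/apps" compare equal.
    String normalized = (src.length() > 1 && src.endsWith("/"))
        ? src.substring(0, src.length() - 1)
        : src;
    // The prefix that every descendant path of src must start with.
    String childPrefix = normalized.equals("/") ? "/" : normalized + "/";
    for (String dir : protectedDirs) {
      if (dir.equals(normalized) || dir.startsWith(childPrefix)) {
        return true;
      }
    }
    return false;
  }
}
```

With {{/apps/prod}} protected, this sketch blocks renaming {{/apps/prod}} and {{/apps}}, but allows renaming {{/apps/prod2}} or a child such as {{/apps/prod/sub}}. Since moving to trash is itself a rename, the same check would cover the second workaround listed in the description.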






[jira] [Commented] (HDFS-14762) "Path(Path/String parent, String child)" will fail when "child" contains ":"

2019-09-01 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920552#comment-16920552
 ] 

Hadoop QA commented on HDFS-14762:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
36s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 54s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 54s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  9m  3s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
45s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 94m 52s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.fs.TestLocalFileSystem |
|   | hadoop.fs.TestPath |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | HDFS-14762 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12979082/HDFS-14762.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux be7e63a4ab8e 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 751b5a1 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27758/artifact/out/whitespace-eol.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27758/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27758/testReport/ |
| Max. process+thread count | 1516 (vs. ulimit of 5500) |
| modules | C: hadoop-co

[jira] [Commented] (HDFS-13157) Do Not Remove Blocks Sequentially During Decommission

2019-09-01 Thread David Mollitor (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920549#comment-16920549
 ] 

David Mollitor commented on HDFS-13157:
---

OK.

 

I dropped the first patch.  I'm in a bit of a rush and didn't get a chance to 
run all the unit tests locally, but wanted to get some movement on this 
nonetheless.

 

As is often the case with this project, I had to touch more than I would have 
wanted.  There is a weird dependency on the Iterator with 
[BlockManager.java|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java#L1589]

I really don't understand why {{BlockManager}} is trying to take a random 
sample here.  Maybe someone else knows why?

> Do Not Remove Blocks Sequentially During Decommission 
> --
>
> Key: HDFS-13157
> URL: https://issues.apache.org/jira/browse/HDFS-13157
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode
>Affects Versions: 3.0.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
> Attachments: HDFS-13157.1.patch
>
>
> From what I understand of [DataNode 
> decommissioning|https://github.com/apache/hadoop/blob/42a1c98597e6dba2e371510a6b2b6b1fb94e4090/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminManager.java]
>  it appears that all the blocks are scheduled for removal _in order_. I'm 
> not 100% sure what the ordering is exactly, but I think it loops through each 
> data volume and schedules each block to be replicated elsewhere. The net 
> effect is that during a decommission, all of the DataNode transfer threads 
> slam on a single volume until it is cleaned out. At which point, they all 
> slam on the next volume, etc.
> Please randomize the block list so that there is a more even distribution 
> across all volumes when decommissioning a node.
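
The request above amounts to shuffling the block list before scheduling. A minimal sketch, assuming blocks are represented by plain strings and using hypothetical names ({{ShuffleBlocksSketch}}, {{randomized}}) rather than anything from the attached patch:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper showing the randomization idea only. */
class ShuffleBlocksSketch {

  /**
   * Returns a shuffled copy of the volume-ordered block list, so that
   * consecutive entries are no longer clustered on one volume.
   * The input list is left untouched.
   */
  static List<String> randomized(List<String> blocksInVolumeOrder, long seed) {
    List<String> copy = new ArrayList<>(blocksInVolumeOrder);
    Collections.shuffle(copy, new Random(seed));
    return copy;
  }
}
```

The seed parameter is there only to make the sketch repeatable in a test. Copying and shuffling the full list costs memory proportional to the number of blocks, which may matter for a large DataNode.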






[jira] [Updated] (HDFS-13157) Do Not Remove Blocks Sequentially During Decommission

2019-09-01 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HDFS-13157:
--
Status: Patch Available  (was: Open)

> Do Not Remove Blocks Sequentially During Decommission 
> --
>
> Key: HDFS-13157
> URL: https://issues.apache.org/jira/browse/HDFS-13157
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode
>Affects Versions: 3.0.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
> Attachments: HDFS-13157.1.patch
>
>
> From what I understand of [DataNode 
> decommissioning|https://github.com/apache/hadoop/blob/42a1c98597e6dba2e371510a6b2b6b1fb94e4090/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminManager.java]
>  it appears that all the blocks are scheduled for removal _in order_. I'm 
> not 100% sure what the ordering is exactly, but I think it loops through each 
> data volume and schedules each block to be replicated elsewhere. The net 
> effect is that during a decommission, all of the DataNode transfer threads 
> slam on a single volume until it is cleaned out. At which point, they all 
> slam on the next volume, etc.
> Please randomize the block list so that there is a more even distribution 
> across all volumes when decommissioning a node.






[jira] [Updated] (HDFS-13157) Do Not Remove Blocks Sequentially During Decommission

2019-09-01 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HDFS-13157:
--
Attachment: HDFS-13157.1.patch

> Do Not Remove Blocks Sequentially During Decommission 
> --
>
> Key: HDFS-13157
> URL: https://issues.apache.org/jira/browse/HDFS-13157
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode
>Affects Versions: 3.0.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
> Attachments: HDFS-13157.1.patch
>
>
> From what I understand of [DataNode 
> decommissioning|https://github.com/apache/hadoop/blob/42a1c98597e6dba2e371510a6b2b6b1fb94e4090/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminManager.java]
>  it appears that all the blocks are scheduled for removal _in order_. I'm 
> not 100% sure what the ordering is exactly, but I think it loops through each 
> data volume and schedules each block to be replicated elsewhere. The net 
> effect is that during a decommission, all of the DataNode transfer threads 
> slam on a single volume until it is cleaned out. At which point, they all 
> slam on the next volume, etc.
> Please randomize the block list so that there is a more even distribution 
> across all volumes when decommissioning a node.






[jira] [Commented] (HDFS-14654) RBF: TestRouterRpc#testNamenodeMetrics is flaky

2019-09-01 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920542#comment-16920542
 ] 

Hadoop QA commented on HDFS-14654:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 48s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 53s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 24m 34s{color} 
| {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 80m  0s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.federation.router.TestRouterWithSecureStartup |
|   | hadoop.hdfs.server.federation.security.TestRouterHttpDelegationToken |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | HDFS-14654 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12979073/HDFS-14654.005.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 790c9e19ecef 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 751b5a1 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27757/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27757/testReport/ |
| Max. process+thread count | 1605 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | 
https://build

[jira] [Updated] (HDFS-14762) "Path(Path/String parent, String child)" will fail when "child" contains ":"

2019-09-01 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-14762:
-
Attachment: HDFS-14762.002.patch

> "Path(Path/String parent, String child)" will fail when "child" contains ":"
> 
>
> Key: HDFS-14762
> URL: https://issues.apache.org/jira/browse/HDFS-14762
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Shixiong Zhu
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14762.001.patch, HDFS-14762.002.patch
>
>
> When the "child" parameter contains ":", "Path(Path/String parent, String 
> child)" will throw the following exception:
> {code}
> java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative 
> path in absolute URI: ...
> {code}
> Not sure if this is a legit bug. But the following places will hit this error 
> when seeing a Path with a file name containing ":":
> https://github.com/apache/hadoop/blob/f9029c4070e8eb046b403f5cb6d0a132c5d58448/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ChecksumFileSystem.java#L101
> https://github.com/apache/hadoop/blob/f9029c4070e8eb046b403f5cb6d0a132c5d58448/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/Globber.java#L270
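The exception quoted above can be reproduced with plain java.net.URI, without any Hadoop classes. The following is a minimal sketch, assuming the path string is split so that the text before the first ":" is treated as a URI scheme when no "/" precedes it. The class ColonInChildSketch and its methods are hypothetical names for illustration and are not taken from the Path source.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class ColonInChildSketch {

    /**
     * Hypothetical stand-in for splitting a path string: text before the
     * first ':' becomes the scheme when no '/' comes before that ':'.
     */
    static URI parse(String pathString) throws URISyntaxException {
        String scheme = null;
        int start = 0;
        int colon = pathString.indexOf(':');
        int slash = pathString.indexOf('/');
        if (colon != -1 && (slash == -1 || colon < slash)) {
            scheme = pathString.substring(0, colon);
            start = colon + 1;
        }
        String path = pathString.substring(start);
        // java.net.URI rejects a non-null scheme combined with a
        // non-empty path that does not start with '/'.
        return new URI(scheme, null, path, null, null);
    }

    /** Returns the URISyntaxException reason, or null if parsing succeeded. */
    static String failureReason(String pathString) {
        try {
            parse(pathString);
            return null;
        } catch (URISyntaxException e) {
            return e.getReason();
        }
    }

    public static void main(String[] args) {
        // A plain file name parses as a relative path.
        System.out.println(failureReason("part-00000"));
        // A file name with ':' is split into scheme + relative path and fails.
        System.out.println(failureReason("2019-09-01T18:19:20"));
        // In this sketch, a '/' before the ':' prevents the scheme split.
        System.out.println(failureReason("./2019-09-01T18:19:20"));
    }
}
```

The third call only shows how the sketch behaves; it is not meant as the fix proposed in the attached patches.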






[jira] [Commented] (HDFS-14654) RBF: TestRouterRpc#testNamenodeMetrics is flaky

2019-09-01 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920490#comment-16920490
 ] 

Hadoop QA commented on HDFS-14654:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 21s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 31s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 23m  9s{color} 
| {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 70m 55s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.federation.router.TestRouterWithSecureStartup |
|   | hadoop.hdfs.server.federation.router.TestRouterRpc |
|   | hadoop.hdfs.server.federation.security.TestRouterHttpDelegationToken |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | HDFS-14654 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12979073/HDFS-14654.005.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 1087ea46d0f6 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 751b5a1 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27756/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27756/testReport/ |
| Max. process+thread count | 1613 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-

[jira] [Commented] (HDFS-13843) RBF: Add optional parameter -d for detailed listing of mount points.

2019-09-01 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-13843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920469#comment-16920469
 ] 

Íñigo Goiri commented on HDFS-13843:


Thanks [~ayushtkn] for the update.
+1 on  [^HDFS-13843-04.patch].

> RBF: Add optional parameter -d for detailed listing of mount points.
> 
>
> Key: HDFS-13843
> URL: https://issues.apache.org/jira/browse/HDFS-13843
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: federation
>Reporter: Soumyapn
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-13843-03.patch, HDFS-13843-04.patch, 
> HDFS-13843.01.patch, HDFS-13843.02.patch
>
>
> *Scenario:*
> Execute the below add/update command for single mount entry for single 
> nameservice pointing to multiple destinations. 
>  # hdfs dfsrouteradmin -add /apps1 hacluster /tmp1
>  # hdfs dfsrouteradmin -add /apps1 hacluster /tmp1,/tmp2,/tmp3
>  # hdfs dfsrouteradmin -update /apps1 hacluster /tmp1,/tmp2,/tmp3 -order 
> RANDOM
> *Actual*. With the above commands, mount entry is successfully updated.
> But order information like HASH, RANDOM is not displayed in mount entries and 
> also not displayed in federation router UI. However order information is 
> updated properly when there are multiple nameservices. This issue is with 
> single nameservice having multiple destinations.
> *Expected:* 
> *Order information should be updated in mount entries so that the user will 
> come to know which order has been set.*
>  






[jira] [Commented] (HDFS-14762) "Path(Path/String parent, String child)" will fail when "child" contains ":"

2019-09-01 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920452#comment-16920452
 ] 

Ayush Saxena commented on HDFS-14762:
-

Test failures seem related. Please check!

> "Path(Path/String parent, String child)" will fail when "child" contains ":"
> 
>
> Key: HDFS-14762
> URL: https://issues.apache.org/jira/browse/HDFS-14762
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Shixiong Zhu
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14762.001.patch
>
>
> When the "child" parameter contains ":", "Path(Path/String parent, String 
> child)" will throw the following exception:
> {code}
> java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative 
> path in absolute URI: ...
> {code}
> Not sure if this is a legit bug. But the following places will hit this error 
> when seeing a Path with a file name containing ":":
> https://github.com/apache/hadoop/blob/f9029c4070e8eb046b403f5cb6d0a132c5d58448/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ChecksumFileSystem.java#L101
> https://github.com/apache/hadoop/blob/f9029c4070e8eb046b403f5cb6d0a132c5d58448/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/Globber.java#L270






[jira] [Commented] (HDFS-14810) review FSNameSystem editlog sync

2019-09-01 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920449#comment-16920449
 ] 

Hadoop QA commented on HDFS-14810:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m  
4s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m  1s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 48s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}108m  7s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
41s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}173m  6s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.balancer.TestBalancer |
|   | hadoop.hdfs.server.namenode.TestEditLog |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | HDFS-14810 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12979070/HDFS-14810.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 46550df79427 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 18d74fe |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27755/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27755/testReport/ |
| Max. process+thread count | 3429 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hd

[jira] [Commented] (HDFS-13157) Do Not Remove Blocks Sequentially During Decommission

2019-09-01 Thread He Xiaoqiao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920441#comment-16920441
 ] 

He Xiaoqiao commented on HDFS-13157:


Thanks [~belugabehr] for the great work and detailed analysis. I believe this 
issue is more obvious in a Federation setup. +1 for the deep dig by 
[~sodonnell]. We can tune the parameters [blocksReplWorkMultiplier, 
maxReplicationStreams, maxReplicationStreamsHardLimit] only per namespace, 
but decommissioning a node is usually triggered from the shell for all 
namespaces at the same time, so if the node reports to several namespaces, the 
different namenodes also send their replication commands at the same time. 
The load on the decommission-in-progress node is then out of control. I have 
hit both network and single-disk I/O bottlenecks.
I believe the current parameters are enough to solve the network bottleneck.
For the single-disk I/O bottleneck, +1 for updating 
DatanodeDescriptor#BlockIterator to iterate over blocks from alternating disks 
rather than draining the disks one by one.
Another thought: we should not dispatch write operations to a 
decommission-in-progress node, and we should lower its read priority to the 
lowest, just as for a decommissioned node. Then a high load on decommissioning 
nodes would not affect clients or the cluster at all.
This discussion does not cover RAID or the scenarios [~zhangchen] mentioned 
above.
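
The idea of iterating over blocks from alternating disks can be sketched as a round-robin interleave over per-volume block lists. This is only an illustration under assumptions: VolumeInterleaveSketch and interleave are hypothetical names, blocks are represented by plain strings, and nothing here reflects how DatanodeDescriptor#BlockIterator is actually implemented.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

final class VolumeInterleaveSketch {

    /**
     * Takes one block from each volume in turn until every volume is
     * drained, so consecutive blocks come from different volumes
     * whenever more than one volume still has blocks left.
     */
    static <T> List<T> interleave(List<List<T>> blocksPerVolume) {
        Deque<Iterator<T>> pending = new ArrayDeque<>();
        for (List<T> volume : blocksPerVolume) {
            if (!volume.isEmpty()) {
                pending.add(volume.iterator());
            }
        }
        List<T> order = new ArrayList<>();
        while (!pending.isEmpty()) {
            Iterator<T> it = pending.poll();
            order.add(it.next());
            if (it.hasNext()) {
                pending.add(it); // re-queue the volume behind the others
            }
        }
        return order;
    }

    public static void main(String[] args) {
        // Hypothetical block names: three volumes of uneven size.
        List<List<String>> volumes = Arrays.asList(
            Arrays.asList("d1-b1", "d1-b2", "d1-b3"),
            Arrays.asList("d2-b1"),
            Arrays.asList("d3-b1", "d3-b2"));
        System.out.println(interleave(volumes));
    }
}
```

Compared with shuffling the whole block list, a round-robin order is deterministic and spreads consecutive transfers across volumes by construction; which of the two fits the decommission path better is left open here.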

> Do Not Remove Blocks Sequentially During Decommission 
> --
>
> Key: HDFS-13157
> URL: https://issues.apache.org/jira/browse/HDFS-13157
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode
>Affects Versions: 3.0.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>
> From what I understand of [DataNode 
> decommissioning|https://github.com/apache/hadoop/blob/42a1c98597e6dba2e371510a6b2b6b1fb94e4090/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminManager.java]
>  it appears that all the blocks are scheduled for removal _in order_. I'm 
> not 100% sure what the ordering is exactly, but I think it loops through each 
> data volume and schedules each block to be replicated elsewhere. The net 
> effect is that during a decommission, all of the DataNode transfer threads 
> slam on a single volume until it is cleaned out. At which point, they all 
> slam on the next volume, etc.
> Please randomize the block list so that there is a more even distribution 
> across all volumes when decommissioning a node.






[jira] [Commented] (HDFS-14654) RBF: TestRouterRpc#testNamenodeMetrics is flaky

2019-09-01 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920439#comment-16920439
 ] 

Ayush Saxena commented on HDFS-14654:
-

Thanks [~zhangchen] for the analysis. I remember an earlier discussion about 
whether the number should be 6 or more, and we reached a consensus to keep it 
at 6. I am not sure, but I guess that was to verify an under-replicated blocks 
scenario or something similar. So changing it to 9 isn't a solution; preventing 
the DNs from rejecting, or analyzing why they reject, should be the solution. 
Anyway, you can raise a separate JIRA for it if you happen to find a fix or a 
root cause.

> RBF: TestRouterRpc#testNamenodeMetrics is flaky
> ---
>
> Key: HDFS-14654
> URL: https://issues.apache.org/jira/browse/HDFS-14654
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Takanobu Asanuma
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-14654.001.patch, HDFS-14654.002.patch, 
> HDFS-14654.003.patch, HDFS-14654.004.patch, HDFS-14654.005.patch, error.log
>
>
> They sometimes pass and sometimes fail.






[jira] [Comment Edited] (HDFS-14654) RBF: TestRouterRpc#testNamenodeMetrics is flaky

2019-09-01 Thread Chen Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920438#comment-16920438
 ] 

Chen Zhang edited comment on HDFS-14654 at 9/1/19 4:27 PM:
---

BTW, the test {{testErasureCoding}} happened to fail again on my machine; we 
also encountered this failure in the penultimate build. The failure reason is 
that some node was too busy when allocating a block:
{code:java}
2019-09-01 18:19:20,940 [IPC Server handler 5 on default port 53140] INFO  
blockmanagement.BlockPlacementPolicy 
(BlockPlacementPolicyDefault.java:chooseRandom(838)) - [
Node /default-rack/127.0.0.1:53148 [
]
Node /default-rack/127.0.0.1:53161 [
]
Node /default-rack/127.0.0.1:53157 [
  Datanode 127.0.0.1:53157 is not chosen since the node is too busy (load: 3 > 
2.6665).
Node /default-rack/127.0.0.1:53143 [
]
Node /default-rack/127.0.0.1:53165 [
]
2019-09-01 18:19:20,940 [IPC Server handler 5 on default port 53140] INFO  
blockmanagement.BlockPlacementPolicy 
(BlockPlacementPolicyDefault.java:chooseRandom(846)) - Not enough replicas was 
chosen. Reason: {NODE_TOO_BUSY=1}
2019-09-01 18:19:20,941 [IPC Server handler 5 on default port 53140] WARN  
blockmanagement.BlockPlacementPolicy 
(BlockPlacementPolicyDefault.java:chooseTarget(449)) - Failed to place enough 
replicas, still in need of 1 to reach 6 (unavailableStorages=[], 
storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], 
creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) 
2019-09-01 18:19:20,941 [IPC Server handler 5 on default port 53140] WARN  
protocol.BlockStoragePolicy (BlockStoragePolicy.java:chooseStorageTypes(161)) - 
Failed to place enough replicas: expected size is 1 but only 0 storage types 
can be selected (replication=6, selected=[], unavailable=[DISK], 
removed=[DISK], policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], 
creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
2019-09-01 18:19:20,941 [IPC Server handler 5 on default port 53140] WARN  
blockmanagement.BlockPlacementPolicy 
(BlockPlacementPolicyDefault.java:chooseTarget(449)) - Failed to place enough 
replicas, still in need of 1 to reach 6 (unavailableStorages=[DISK], 
storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], 
creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) All 
required storage types are unavailable:  unavailableStorages=[DISK], 
storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], 
creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
2019-09-01 18:19:20,941 [IPC Server handler 5 on default port 53140] INFO  
ipc.Server (Server.java:logException(2982)) - IPC Server handler 5 on default 
port 53140, call Call#1270 Retry#0 
org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 127.0.0.1:53202
java.io.IOException: File /testec/testfile2 could only be written to 5 of the 6 
required nodes for RS-6-3-1024k. There are 6 datanode(s) running and 6 node(s) 
are excluded in this operation.
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2815)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:893)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:574)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:529)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1001)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:929)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2921)
2019-09-01 18:19:20,942 [IPC Server handler 6 on default port 53197] INFO  
ipc.Server (Server.java:logException(2975)) - IPC Server handler 6 on default 
port 53197, call Call#1268 Retry#0 
org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 
192.168.1.112:53201: java.io.IOException: File /testec/testfile2 could only be 
written to 5 of the 6 required nodes for RS-6-3-1024k. There are 6 datanode(s) 
running and 6 node(s) are excluded in this operation.
{code}
When we create an EC file with the 6+3 policy, it requires at least 6 block 
succeed
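
The "too busy" rejection in the log above compares one datanode's load against a threshold. A minimal sketch of that arithmetic follows, assuming the threshold is a factor (2.0 here, an assumed default) times the average load over all datanodes. That assumption is roughly consistent with the logged threshold if 8 active transfers are spread over 6 datanodes. NodeLoadCheckSketch, its methods, and the per-node loads are hypothetical and are not taken from BlockPlacementPolicyDefault.

```java
final class NodeLoadCheckSketch {

    /** Threshold above which a node counts as too busy: factor x average load. */
    static double maxLoad(int totalLoad, int nodeCount, double factor) {
        return factor * ((double) totalLoad / nodeCount);
    }

    /** True if a single node's load exceeds the threshold. */
    static boolean isTooBusy(int nodeLoad, int totalLoad, int nodeCount, double factor) {
        return nodeLoad > maxLoad(totalLoad, nodeCount, factor);
    }

    /** Counts the nodes that pass the load check. */
    static int eligibleNodes(int[] loads, double factor) {
        int total = 0;
        for (int load : loads) {
            total += load;
        }
        int eligible = 0;
        for (int load : loads) {
            if (!isTooBusy(load, total, loads.length, factor)) {
                eligible++;
            }
        }
        return eligible;
    }

    public static void main(String[] args) {
        // Hypothetical loads for 6 datanodes; one node carries 3 transfers.
        int[] loads = {3, 1, 1, 1, 1, 1};
        System.out.println(maxLoad(8, 6, 2.0));
        System.out.println(eligibleNodes(loads, 2.0));
    }
}
```

With these assumed loads only 5 of the 6 datanodes pass the check, which matches the shape of the "could only be written to 5 of the 6 required nodes" error: on a small test cluster a single busy node is enough to fail the allocation.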

[jira] [Commented] (HDFS-14654) RBF: TestRouterRpc#testNamenodeMetrics is flaky

2019-09-01 Thread Chen Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920438#comment-16920438
 ] 

Chen Zhang commented on HDFS-14654:
---

BTW, the test {{testErasureCoding}} happened to fail again on my machine; we 
also encountered this failure in the penultimate build. The failure reason is 
that some node was too busy when allocating a block:
{code:java}
2019-09-01 18:19:20,940 [IPC Server handler 5 on default port 53140] INFO  
blockmanagement.BlockPlacementPolicy 
(BlockPlacementPolicyDefault.java:chooseRandom(838)) - [
Node /default-rack/127.0.0.1:53148 [
]
Node /default-rack/127.0.0.1:53161 [
]
Node /default-rack/127.0.0.1:53157 [
  Datanode 127.0.0.1:53157 is not chosen since the node is too busy (load: 3 > 
2.6665).
Node /default-rack/127.0.0.1:53143 [
]
Node /default-rack/127.0.0.1:53165 [
]
2019-09-01 18:19:20,940 [IPC Server handler 5 on default port 53140] INFO  
blockmanagement.BlockPlacementPolicy 
(BlockPlacementPolicyDefault.java:chooseRandom(846)) - Not enough replicas was 
chosen. Reason: {NODE_TOO_BUSY=1}
2019-09-01 18:19:20,941 [IPC Server handler 5 on default port 53140] WARN  
blockmanagement.BlockPlacementPolicy 
(BlockPlacementPolicyDefault.java:chooseTarget(449)) - Failed to place enough 
replicas, still in need of 1 to reach 6 (unavailableStorages=[], 
storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], 
creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) 
2019-09-01 18:19:20,941 [IPC Server handler 5 on default port 53140] WARN  
protocol.BlockStoragePolicy (BlockStoragePolicy.java:chooseStorageTypes(161)) - 
Failed to place enough replicas: expected size is 1 but only 0 storage types 
can be selected (replication=6, selected=[], unavailable=[DISK], 
removed=[DISK], policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], 
creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
2019-09-01 18:19:20,941 [IPC Server handler 5 on default port 53140] WARN  
blockmanagement.BlockPlacementPolicy 
(BlockPlacementPolicyDefault.java:chooseTarget(449)) - Failed to place enough 
replicas, still in need of 1 to reach 6 (unavailableStorages=[DISK], 
storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], 
creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) All 
required storage types are unavailable:  unavailableStorages=[DISK], 
storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], 
creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
2019-09-01 18:19:20,941 [IPC Server handler 5 on default port 53140] INFO  
ipc.Server (Server.java:logException(2982)) - IPC Server handler 5 on default 
port 53140, call Call#1270 Retry#0 
org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 127.0.0.1:53202
java.io.IOException: File /testec/testfile2 could only be written to 5 of the 6 
required nodes for RS-6-3-1024k. There are 6 datanode(s) running and 6 node(s) 
are excluded in this operation.
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2815)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:893)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:574)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:529)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1001)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:929)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2921)
2019-09-01 18:19:20,942 [IPC Server handler 6 on default port 53197] INFO  
ipc.Server (Server.java:logException(2975)) - IPC Server handler 6 on default 
port 53197, call Call#1268 Retry#0 
org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 
192.168.1.112:53201: java.io.IOException: File /testec/testfile2 could only be 
written to 5 of the 6 required nodes for RS-6-3-1024k. There are 6 datanode(s) 
running and 6 node(s) are excluded in this operation.
{code}
When we create an EC file with the 6+3 policy, it requires at least 6 bl

[jira] [Commented] (HDFS-14799) Do Not Call Map containsKey In Conjunction with get

2019-09-01 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920437#comment-16920437
 ] 

Ayush Saxena commented on HDFS-14799:
-

+1

> Do Not Call Map containsKey In Conjunction with get
> ---
>
> Key: HDFS-14799
> URL: https://issues.apache.org/jira/browse/HDFS-14799
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.2.0
>Reporter: David Mollitor
>Assignee: hemanthboyina
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HDFS-14799.001.patch
>
>
> {code:java|title=InvalidateBlocks.java}
>   private final Map<DatanodeInfo, LightWeightHashSet<Block>>
>       nodeToBlocks = new HashMap<>();
>   private final Map<DatanodeInfo, LightWeightHashSet<Block>>
>       nodeToECBlocks = new HashMap<>();
> ...
>   private LightWeightHashSet<Block> getBlocksSet(final DatanodeInfo dn) {
>     if (nodeToBlocks.containsKey(dn)) {
>       return nodeToBlocks.get(dn);
>     }
>     return null;
>   }
>   private LightWeightHashSet<Block> getECBlocksSet(final DatanodeInfo dn) {
>     if (nodeToECBlocks.containsKey(dn)) {
>       return nodeToECBlocks.get(dn);
>     }
>     return null;
>   }
> {code}
> There is no need to check for {{containsKey}} here since a call to {{get}} 
> will already return 'null' if the key is not there.  This just adds overhead 
> of having to dive into the Map twice to get the value.
> {code}
>   private LightWeightHashSet<Block> getECBlocksSet(final DatanodeInfo dn) {
>     return nodeToECBlocks.get(dn);
>   }
> {code}
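
The equivalence claimed in the description can be checked with a small stand-alone sketch. It uses only java.util types; SingleLookupSketch, its methods, and the String and Long stand-ins for datanodes and blocks are hypothetical and do not come from InvalidateBlocks.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class SingleLookupSketch {

    // Stand-in for the node-to-blocks map: node name -> set of block ids.
    private final Map<String, Set<Long>> nodeToBlocks = new HashMap<>();

    void add(String node, long blockId) {
        nodeToBlocks.computeIfAbsent(node, k -> new HashSet<>()).add(blockId);
    }

    /** Two lookups: containsKey followed by get. */
    Set<Long> getWithContainsKey(String node) {
        if (nodeToBlocks.containsKey(node)) {
            return nodeToBlocks.get(node);
        }
        return null;
    }

    /** One lookup: Map.get already returns null for a missing key. */
    Set<Long> getDirect(String node) {
        return nodeToBlocks.get(node);
    }

    public static void main(String[] args) {
        SingleLookupSketch sketch = new SingleLookupSketch();
        sketch.add("dn1", 42L);
        // Same set instance for a present key, null for a missing key.
        System.out.println(sketch.getDirect("dn1") == sketch.getWithContainsKey("dn1"));
        System.out.println(sketch.getDirect("dn2") == sketch.getWithContainsKey("dn2"));
    }
}
```

Both variants return the same reference for a present key and null for a missing one, so dropping the containsKey check changes only the number of map lookups, not the result.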






[jira] [Commented] (HDFS-14799) Do Not Call Map containsKey In Conjunction with get

2019-09-01 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920436#comment-16920436
 ] 

hemanthboyina commented on HDFS-14799:
--

Updated the patch, please review [~ayushtkn]

> Do Not Call Map containsKey In Conjunction with get
> ---
>
> Key: HDFS-14799
> URL: https://issues.apache.org/jira/browse/HDFS-14799
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.2.0
>Reporter: David Mollitor
>Assignee: hemanthboyina
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HDFS-14799.001.patch
>
>
> {code:java|title=InvalidateBlocks.java}
>   private final Map<DatanodeInfo, LightWeightHashSet<Block>>
>       nodeToBlocks = new HashMap<>();
>   private final Map<DatanodeInfo, LightWeightHashSet<Block>>
>       nodeToECBlocks = new HashMap<>();
> ...
>   private LightWeightHashSet<Block> getBlocksSet(final DatanodeInfo dn) {
>     if (nodeToBlocks.containsKey(dn)) {
>       return nodeToBlocks.get(dn);
>     }
>     return null;
>   }
>   private LightWeightHashSet<Block> getECBlocksSet(final DatanodeInfo dn) {
>     if (nodeToECBlocks.containsKey(dn)) {
>       return nodeToECBlocks.get(dn);
>     }
>     return null;
>   }
> {code}
> There is no need to check for {{containsKey}} here since a call to {{get}} 
> will already return 'null' if the key is not there.  This just adds overhead 
> of having to dive into the Map twice to get the value.
> {code}
>   private LightWeightHashSet<Block> getECBlocksSet(final DatanodeInfo dn) {
>     return nodeToECBlocks.get(dn);
>   }
> {code}
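
The point in the description can be checked with a small, self-contained sketch. It uses a plain {{HashMap}} with {{String}} keys and {{Long}} block ids in place of the HDFS types; the class and method names are illustrative and are not taken from the patch. It also assumes the map never stores {{null}} values, which is the condition under which the two lookups are interchangeable.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative stand-in for InvalidateBlocks; names and types are assumptions.
final class MapLookupDemo {
  private final Map<String, Set<Long>> nodeToBlocks = new HashMap<>();

  void add(String node, long blockId) {
    nodeToBlocks.computeIfAbsent(node, k -> new HashSet<>()).add(blockId);
  }

  // Original shape: two hash lookups for a key that is present.
  Set<Long> getWithContainsKey(String node) {
    if (nodeToBlocks.containsKey(node)) {
      return nodeToBlocks.get(node);
    }
    return null;
  }

  // Proposed shape: one lookup; get() already returns null for a missing key.
  Set<Long> getDirect(String node) {
    return nodeToBlocks.get(node);
  }

  public static void main(String[] args) {
    MapLookupDemo demo = new MapLookupDemo();
    demo.add("dn1", 42L);
    System.out.println(demo.getDirect("dn1").equals(demo.getWithContainsKey("dn1")));
    System.out.println(demo.getDirect("missing") == null
        && demo.getWithContainsKey("missing") == null);
  }
}
```

If a map could hold {{null}} values, {{containsKey}} would still be needed to tell "absent" from "present but null"; that does not appear to apply here.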






[jira] [Commented] (HDFS-14762) "Path(Path/String parent, String child)" will fail when "child" contains ":"

2019-09-01 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920435#comment-16920435
 ] 

hemanthboyina commented on HDFS-14762:
--

Updated the patch; please review, [~ayushtkn].

> "Path(Path/String parent, String child)" will fail when "child" contains ":"
> 
>
> Key: HDFS-14762
> URL: https://issues.apache.org/jira/browse/HDFS-14762
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Shixiong Zhu
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14762.001.patch
>
>
> When the "child" parameter contains ":", "Path(Path/String parent, String 
> child)" will throw the following exception:
> {code}
> java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative 
> path in absolute URI: ...
> {code}
> Not sure if this is a legit bug. But the following places will hit this error 
> when seeing a Path with a file name containing ":":
> https://github.com/apache/hadoop/blob/f9029c4070e8eb046b403f5cb6d0a132c5d58448/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ChecksumFileSystem.java#L101
> https://github.com/apache/hadoop/blob/f9029c4070e8eb046b403f5cb6d0a132c5d58448/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/Globber.java#L270
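
The exception text can be reproduced with the JDK alone. The sketch below assumes, as the message suggests, that the text before the first ":" in the child ends up being treated as a URI scheme and the remainder as a relative path; {{tryBuild}} is a hypothetical helper, not Hadoop code, and it does not claim to mirror the exact logic inside {{Path}}.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper showing why a ':' in a relative name trips java.net.URI.
final class ColonChildDemo {
  static String tryBuild(String child) {
    int colon = child.indexOf(':');
    // Assumption: everything before the first ':' is taken as the scheme.
    String scheme = colon >= 0 ? child.substring(0, colon) : null;
    String path = colon >= 0 ? child.substring(colon + 1) : child;
    try {
      // A non-null scheme combined with a path that does not start with '/'
      // is rejected by this constructor.
      return new URI(scheme, null, path, null, null).toString();
    } catch (URISyntaxException e) {
      return "error: " + e.getReason();
    }
  }

  public static void main(String[] args) {
    System.out.println(tryBuild("part-0001"));
    System.out.println(tryBuild("a:b"));
  }
}
```

A name without ":" goes through unchanged, while "a:b" is reported as a relative path in an absolute URI.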






[jira] [Commented] (HDFS-14630) Configuration.getTimeDurationHelper() should not log time unit warning in info log.

2019-09-01 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920434#comment-16920434
 ] 

hemanthboyina commented on HDFS-14630:
--

Updated the patch

> Configuration.getTimeDurationHelper() should not log time unit warning in 
> info log.
> ---
>
> Key: HDFS-14630
> URL: https://issues.apache.org/jira/browse/HDFS-14630
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: hemanthboyina
>Priority: Minor
> Attachments: HDFS-14630.001.patch, HDFS-14630.patch
>
>
> To solve the [HDFS-12920|https://issues.apache.org/jira/browse/HDFS-12920] issue, 
> we configured "dfs.client.datanode-restart.timeout" without a time unit. Now the log 
> file is full of
> {noformat}
> 2019-06-22 20:13:14,605 | INFO  | pool-12-thread-1 | No unit for 
> dfs.client.datanode-restart.timeout(30) assuming SECONDS 
> org.apache.hadoop.conf.Configuration.logDeprecation(Configuration.java:1409){noformat}
> There is no need to log this; just describe the behavior in the property description.






[jira] [Commented] (HDFS-14654) RBF: TestRouterRpc#testNamenodeMetrics is flaky

2019-09-01 Thread Chen Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920426#comment-16920426
 ] 

Chen Zhang commented on HDFS-14654:
---

Good catch, [~ayushtkn]. I added a finally block and uploaded patch v5.

> RBF: TestRouterRpc#testNamenodeMetrics is flaky
> ---
>
> Key: HDFS-14654
> URL: https://issues.apache.org/jira/browse/HDFS-14654
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Takanobu Asanuma
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-14654.001.patch, HDFS-14654.002.patch, 
> HDFS-14654.003.patch, HDFS-14654.004.patch, HDFS-14654.005.patch, error.log
>
>
> They sometimes pass and sometimes fail.






[jira] [Updated] (HDFS-14654) RBF: TestRouterRpc#testNamenodeMetrics is flaky

2019-09-01 Thread Chen Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Zhang updated HDFS-14654:
--
Attachment: HDFS-14654.005.patch

> RBF: TestRouterRpc#testNamenodeMetrics is flaky
> ---
>
> Key: HDFS-14654
> URL: https://issues.apache.org/jira/browse/HDFS-14654
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Takanobu Asanuma
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-14654.001.patch, HDFS-14654.002.patch, 
> HDFS-14654.003.patch, HDFS-14654.004.patch, HDFS-14654.005.patch, error.log
>
>
> They sometimes pass and sometimes fail.






[jira] [Commented] (HDFS-14810) review FSNameSystem editlog sync

2019-09-01 Thread He Xiaoqiao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920413#comment-16920413
 ] 

He Xiaoqiao commented on HDFS-14810:


Thanks [~ayushtkn] for your feedback. I will address it in the next patch. Thanks again.

> review FSNameSystem editlog sync
> 
>
> Key: HDFS-14810
> URL: https://issues.apache.org/jira/browse/HDFS-14810
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Attachments: HDFS-14810.001.patch
>
>
> refactor and unified type of edit log sync in FSNamesystem as HDFS-11246 
> mentioned.






[jira] [Commented] (HDFS-14810) review FSNameSystem editlog sync

2019-09-01 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920407#comment-16920407
 ] 

Ayush Saxena commented on HDFS-14810:
-

Thanks [~hexiaoqiao].

I had a quick look at the patch. It looks good overall. A couple of doubts:

* In SetReplication(), will we now call {{getEditLog().logSync()}} even when 
success is false?
* Similarly, {{clearCorruptLazyPersistFiles()}} and renameTo now sync even when 
nothing changed.
* In some places we now log the success variable instead of true, as in setRep(). 
In the {{HDFS-13772}} discussion, Xiao Chen mentioned that audit false should 
be used only for ACE. Maybe you can check the discussion there.

SetErasureCodingPolicy doesn't log audit for ACE? 

> review FSNameSystem editlog sync
> 
>
> Key: HDFS-14810
> URL: https://issues.apache.org/jira/browse/HDFS-14810
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Attachments: HDFS-14810.001.patch
>
>
> refactor and unified type of edit log sync in FSNamesystem as HDFS-11246 
> mentioned.






[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes

2019-09-01 Thread He Xiaoqiao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920397#comment-16920397
 ] 

He Xiaoqiao commented on HDFS-14305:


Thanks [~csun] for your comments. To be honest, I have no hands-on experience with 
multi-NameNode setups, so I do not know whether relying on the configuration is 
stable. Is there any case where an Observer NameNode runs without SBN config? We 
can make sure that the ANN and SBN have the same configuration for all NameNode 
items in HA mode. Please confirm the result for the multi-NameNode install case if 
you have any experience with it. 
On the other hand, as you said above, relying only on configuration cannot handle 
the case of adding NameNodes to the cluster. FYI.
Thanks again, [~csun].

> Serial number in BlockTokenSecretManager could overlap between different 
> namenodes
> --
>
> Key: HDFS-14305
> URL: https://issues.apache.org/jira/browse/HDFS-14305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, security
>Reporter: Chao Sun
>Assignee: He Xiaoqiao
>Priority: Major
> Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>
> Attachments: HDFS-14305.001.patch, HDFS-14305.002.patch, 
> HDFS-14305.003.patch, HDFS-14305.004.patch, HDFS-14305.005.patch, 
> HDFS-14305.006.patch
>
>
> Currently, a {{BlockTokenSecretManager}} starts with a random integer as the 
> initial serial number, and then uses this formula to rotate it:
> {code:java}
> this.intRange = Integer.MAX_VALUE / numNNs;
> this.nnRangeStart = intRange * nnIndex;
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
>  {code}
> where {{numNNs}} is the total number of NameNodes in the cluster, and 
> {{nnIndex}} is the index of the current NameNode specified in the 
> configuration {{dfs.ha.namenodes.}}.
> However, with this approach, different NameNodes could have overlapping ranges 
> for the serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, 
> and we have 2 NameNodes {{nn1}} and {{nn2}} in the configuration. Then the ranges 
> for these two are:
> {code}
> nn1 -> [-49, 49]
> nn2 -> [1, 99]
> {code}
> This is because the initial serial number could be any negative integer.
> Moreover, when the keys are updated, the serial number will again be updated 
> with the formula:
> {code}
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
> {code}
> which means the new serial number could be updated to a range that belongs to 
> a different NameNode, thus increasing the chance of collision again.
> When the collision happens, DataNodes could overwrite an existing key which 
> will cause clients to fail because of {{InvalidToken}} error.
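
The overlapping ranges in the description can be reproduced with a short sketch. It applies the quoted formula, but takes the maximum value as a parameter so that the simplified value of 100 can be used; the class and method names are illustrative only. The overlap comes from Java's {{%}} operator keeping the sign of the dividend.

```java
// Illustrative sketch of the rotation formula from the description.
final class SerialRangeDemo {
  // Same arithmetic as the quoted formula, with Integer.MAX_VALUE as a parameter.
  static int rotate(int serialNo, int maxValue, int numNNs, int nnIndex) {
    int intRange = maxValue / numNNs;
    int nnRangeStart = intRange * nnIndex;
    return (serialNo % intRange) + nnRangeStart;
  }

  // Smallest and largest rotated value over every serialNo in [-maxValue, maxValue].
  static int[] range(int maxValue, int numNNs, int nnIndex) {
    int min = Integer.MAX_VALUE;
    int max = Integer.MIN_VALUE;
    for (int s = -maxValue; s <= maxValue; s++) {
      int r = rotate(s, maxValue, numNNs, nnIndex);
      min = Math.min(min, r);
      max = Math.max(max, r);
    }
    return new int[] {min, max};
  }

  public static void main(String[] args) {
    int[] nn1 = range(100, 2, 0);
    int[] nn2 = range(100, 2, 1);
    System.out.println("nn1 -> [" + nn1[0] + ", " + nn1[1] + "]"); // nn1 -> [-49, 49]
    System.out.println("nn2 -> [" + nn2[0] + ", " + nn2[1] + "]"); // nn2 -> [1, 99]
  }
}
```

Every value from 1 to 49 can be produced by both NameNodes, which is the collision window the description refers to.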






[jira] [Commented] (HDFS-13157) Do Not Remove Blocks Sequentially During Decommission

2019-09-01 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920396#comment-16920396
 ] 

Stephen O'Donnell commented on HDFS-13157:
--

The way I believe the redundancyMonitor works is as follows:

It picks the next live_nodes * work_multiplier (default 2) blocks from the 
needs_replication queue in the order they were added to the queue.

Then it looks at the nodes which host the block and randomly picks one of them 
to do the replication that does not have more than maxReplicationStreams 
already allocated. I see the comment in chooseSourceDatanodes() that states it 
prefers decommissioning nodes, and I think it implements this by allocating 
max_replication_streams blocks to IN_SERVICE nodes but 
max_replication_stream_Hard_Limit to decommissioning nodes, ie the 
decommissioning nodes have a higher limit normally.

Imagine we have maxReplicationStreams of 5 (a very low setting - many clusters 
will have this set to 50 or more), maxReplicationStreamsHardLimit of 10, and 
200 live nodes.

This means the redundancy monitor will pick 200 * 2 (work multiplier default) = 
400 blocks to process on each iteration.

It will then randomly select 1 of the datanodes as a source, meaning the 
decommissioning node will get allocated 10 (hard limit) out of the first 30 
blocks on average. However, then it will have reached its maxStreamsLimit and 
the remaining 370 blocks should be assigned to other nodes (assuming they have 
capacity).

Therefore for replication factor 3 blocks, the decommissioning node will likely 
replicate much less than a third of its own blocks, but all those blocks will 
likely be on one disk.

My reading of the logic therefore also suggests that if you decommission 
several nodes, then it will take the redundancy monitor some time to consider 
the blocks from the second node, as it will work through the list of blocks in 
order. However there is a good chance those other decommissioning nodes will be 
participating in replicating blocks from the first node, so they are unlikely 
to be idle.

However the scenario [~zhangchen] mentioned is an interesting one, where blocks 
have replication factor 1 as the decommissioning node must be the source. In 
that case I think the redundancy monitor would process 400 blocks, assign 10 of 
them and skip the next 390 on each iteration until it hits the second node 
being decommissioned and repeat this until it cycles back to the start of the 
under-replicated list again. So if you are decommissioning nodes with replication 
factor 1 blocks, not only would it use only 1 disk, but it would only work on 
one decommissioning node at a time. I have not tested this, so there may be 
some logic I have not understood fully to handle this sort of case.
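
The numbers in the example above can be written out as a small calculation. This is only a sketch of the arithmetic in this comment, under the same assumptions (uniform random choice of source among the replica holders, other nodes having capacity); it does not model the actual BlockManager code, and the class and method names are made up for illustration.

```java
// Back-of-the-envelope arithmetic for the example in this comment.
final class RedundancyMonitorEstimate {
  // Blocks picked from the needs-replication queue per iteration.
  static int blocksPerIteration(int liveNodes, int workMultiplier) {
    return liveNodes * workMultiplier;
  }

  // Expected number of blocks examined before the decommissioning node has been
  // chosen hardLimit times, if it is chosen with probability 1 / replication.
  static int blocksUntilHardLimit(int hardLimit, int replication) {
    return hardLimit * replication;
  }

  public static void main(String[] args) {
    int perIteration = blocksPerIteration(200, 2);
    int rf3 = blocksUntilHardLimit(10, 3);
    int rf1 = blocksUntilHardLimit(10, 1);
    System.out.println("blocks per iteration: " + perIteration);        // 400
    System.out.println("RF3 blocks until hard limit: " + rf3);          // 30
    System.out.println("RF3 left for other nodes: " + (perIteration - rf3)); // 370
    System.out.println("RF1 skipped per iteration: " + (perIteration - rf1)); // 390
  }
}
```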

> Do Not Remove Blocks Sequentially During Decommission 
> --
>
> Key: HDFS-13157
> URL: https://issues.apache.org/jira/browse/HDFS-13157
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode
>Affects Versions: 3.0.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>
> From what I understand of [DataNode 
> decommissioning|https://github.com/apache/hadoop/blob/42a1c98597e6dba2e371510a6b2b6b1fb94e4090/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminManager.java]
>  it appears that all the blocks are scheduled for removal _in order._. I'm 
> not 100% sure what the ordering is exactly, but I think it loops through each 
> data volume and schedules each block to be replicated elsewhere. The net 
> affect is that during a decommission, all of the DataNode transfer threads 
> slam on a single volume until it is cleaned out. At which point, they all 
> slam on the next volume, etc.
> Please randomize the block list so that there is a more even distribution 
> across all volumes when decommissioning a node.






[jira] [Commented] (HDFS-14802) The feature of protect directories should be used in RenameOp

2019-09-01 Thread He Xiaoqiao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920394#comment-16920394
 ] 

He Xiaoqiao commented on HDFS-14802:


Thanks [~ferhui] for your contribution. I think we may need regex matching for 
the protected directories, for example to protect db directories in the 
warehouse from deletion. In my own practice I have to configure a great many 
paths. If we support regex matching, it would be simpler. FYI.
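
To make the suggestion concrete, here is a minimal sketch of what regex-based matching could look like. Everything in it is hypothetical: the class name, the idea of passing a list of patterns, and the example warehouse path are illustrations only, not an existing HDFS API or configuration key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical matcher; not part of HDFS.
final class ProtectedDirMatcher {
  private final List<Pattern> patterns = new ArrayList<>();

  ProtectedDirMatcher(List<String> regexes) {
    for (String regex : regexes) {
      patterns.add(Pattern.compile(regex));
    }
  }

  // A path is protected when any pattern matches the whole path.
  boolean isProtected(String path) {
    for (Pattern p : patterns) {
      if (p.matcher(path).matches()) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // One pattern covers every "<name>.db" directory directly under /warehouse.
    ProtectedDirMatcher matcher =
        new ProtectedDirMatcher(Arrays.asList("/warehouse/[^/]+\\.db"));
    System.out.println(matcher.isProtected("/warehouse/sales.db"));
    System.out.println(matcher.isProtected("/tmp/scratch"));
  }
}
```

With this approach a single pattern replaces one configured path per database directory.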

> The feature of protect directories should be used in RenameOp
> -
>
> Key: HDFS-14802
> URL: https://issues.apache.org/jira/browse/HDFS-14802
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.0.4, 3.3.0, 3.2.1, 3.1.3
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Major
> Attachments: HDFS-14802.001.patch, HDFS-14802.002.patch
>
>
> Now we can set fs.protected.directories to prevent users from deleting 
> important directories. But users can work around this restriction and still delete them:
> 1. Rename the directories and then delete them.
> 2. Move the directories to trash, and the namenode will delete them.
> So I think the protected directories feature should also be applied in RenameOp.






[jira] [Commented] (HDFS-14807) SetTimes updates all negative values apart from -1

2019-09-01 Thread He Xiaoqiao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920393#comment-16920393
 ] 

He Xiaoqiao commented on HDFS-14807:


[^HDFS-14807-02.patch] LGTM.
+1 (non-binding).

> SetTimes updates all negative values apart from -1
> --
>
> Key: HDFS-14807
> URL: https://issues.apache.org/jira/browse/HDFS-14807
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Harshakiran Reddy
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-14807-01.patch, HDFS-14807-02.patch
>
>
> Set Times API, updates negative time on all negative values apart from -1.






[jira] [Commented] (HDFS-12733) Option to disable to namenode local edits

2019-09-01 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920392#comment-16920392
 ] 

Ayush Saxena commented on HDFS-12733:
-

We can take the solution as mentioned by Konstantin, as I mentioned in my 
previous comment and have proper documentation for it.

> Option to disable to namenode local edits
> -
>
> Key: HDFS-12733
> URL: https://issues.apache.org/jira/browse/HDFS-12733
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, performance
>Reporter: Brahma Reddy Battula
>Assignee: He Xiaoqiao
>Priority: Major
> Attachments: HDFS-12733-001.patch, HDFS-12733-002.patch, 
> HDFS-12733-003.patch, HDFS-12733.004.patch, HDFS-12733.005.patch, 
> HDFS-12733.006.patch
>
>
> As of now, Edits will be written in local and shared locations which will be 
> redundant and local edits never used in HA setup.
> Disabling local edits gives little performance improvement.






[jira] [Assigned] (HDFS-12733) Option to disable to namenode local edits

2019-09-01 Thread He Xiaoqiao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Xiaoqiao reassigned HDFS-12733:
--

Assignee: He Xiaoqiao  (was: Brahma Reddy Battula)

> Option to disable to namenode local edits
> -
>
> Key: HDFS-12733
> URL: https://issues.apache.org/jira/browse/HDFS-12733
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, performance
>Reporter: Brahma Reddy Battula
>Assignee: He Xiaoqiao
>Priority: Major
> Attachments: HDFS-12733-001.patch, HDFS-12733-002.patch, 
> HDFS-12733-003.patch, HDFS-12733.004.patch, HDFS-12733.005.patch, 
> HDFS-12733.006.patch
>
>
> As of now, Edits will be written in local and shared locations which will be 
> redundant and local edits never used in HA setup.
> Disabling local edits gives little performance improvement.






[jira] [Commented] (HDFS-12733) Option to disable to namenode local edits

2019-09-01 Thread He Xiaoqiao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920390#comment-16920390
 ] 

He Xiaoqiao commented on HDFS-12733:


To [~brahmareddy]: I have just assigned this JIRA to myself. Please feel free to 
assign it back if you return and would like to continue following up on this ticket. 
Thanks.

> Option to disable to namenode local edits
> -
>
> Key: HDFS-12733
> URL: https://issues.apache.org/jira/browse/HDFS-12733
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, performance
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Major
> Attachments: HDFS-12733-001.patch, HDFS-12733-002.patch, 
> HDFS-12733-003.patch, HDFS-12733.004.patch, HDFS-12733.005.patch, 
> HDFS-12733.006.patch
>
>
> As of now, Edits will be written in local and shared locations which will be 
> redundant and local edits never used in HA setup.
> Disabling local edits gives little performance improvement.






[jira] [Commented] (HDFS-12733) Option to disable to namenode local edits

2019-09-01 Thread He Xiaoqiao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920388#comment-16920388
 ] 

He Xiaoqiao commented on HDFS-12733:


Thanks [~ayushtkn] for picking up this ticket. I would like to continue updating 
it, and patches are ready for the different solutions. However, it seems that we 
have not reached agreement. Any thoughts? Do we need more comments 
and discussion, or a vote on the mailing list? Thanks again.

> Option to disable to namenode local edits
> -
>
> Key: HDFS-12733
> URL: https://issues.apache.org/jira/browse/HDFS-12733
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, performance
>Reporter: Brahma Reddy Battula
>Assignee: Brahma Reddy Battula
>Priority: Major
> Attachments: HDFS-12733-001.patch, HDFS-12733-002.patch, 
> HDFS-12733-003.patch, HDFS-12733.004.patch, HDFS-12733.005.patch, 
> HDFS-12733.006.patch
>
>
> As of now, Edits will be written in local and shared locations which will be 
> redundant and local edits never used in HA setup.
> Disabling local edits gives little performance improvement.






[jira] [Commented] (HDFS-14810) review FSNameSystem editlog sync

2019-09-01 Thread He Xiaoqiao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920386#comment-16920386
 ] 

He Xiaoqiao commented on HDFS-14810:


Some unnecessary edit log syncs, such as the one reported in HDFS-11291, are not 
addressed in this ticket.

> review FSNameSystem editlog sync
> 
>
> Key: HDFS-14810
> URL: https://issues.apache.org/jira/browse/HDFS-14810
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Attachments: HDFS-14810.001.patch
>
>
> refactor and unified type of edit log sync in FSNamesystem as HDFS-11246 
> mentioned.






[jira] [Updated] (HDFS-14810) review FSNameSystem editlog sync

2019-09-01 Thread He Xiaoqiao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Xiaoqiao updated HDFS-14810:
---
Attachment: HDFS-14810.001.patch
Status: Patch Available  (was: Open)

Submitted the initial patch; pending Jenkins.

> review FSNameSystem editlog sync
> 
>
> Key: HDFS-14810
> URL: https://issues.apache.org/jira/browse/HDFS-14810
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: He Xiaoqiao
>Assignee: He Xiaoqiao
>Priority: Major
> Attachments: HDFS-14810.001.patch
>
>
> refactor and unified type of edit log sync in FSNamesystem as HDFS-11246 
> mentioned.






[jira] [Created] (HDFS-14810) review FSNameSystem editlog sync

2019-09-01 Thread He Xiaoqiao (Jira)
He Xiaoqiao created HDFS-14810:
--

 Summary: review FSNameSystem editlog sync
 Key: HDFS-14810
 URL: https://issues.apache.org/jira/browse/HDFS-14810
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: He Xiaoqiao
Assignee: He Xiaoqiao


refactor and unified type of edit log sync in FSNamesystem as HDFS-11246 
mentioned.






[jira] [Commented] (HDFS-14630) Configuration.getTimeDurationHelper() should not log time unit warning in info log.

2019-09-01 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920381#comment-16920381
 ] 

Hadoop QA commented on HDFS-14630:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 51s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 26s | Maven dependency ordering for branch |
| +1 | mvninstall | 19m 16s | trunk passed |
| +1 | compile | 17m 12s | trunk passed |
| +1 | checkstyle | 2m 23s | trunk passed |
| +1 | mvnsite | 2m 30s | trunk passed |
| +1 | shadedclient | 17m 31s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 55s | trunk passed |
| +1 | javadoc | 1m 57s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 22s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 46s | the patch passed |
| +1 | compile | 15m 5s | the patch passed |
| +1 | javac | 15m 5s | the patch passed |
| +1 | checkstyle | 2m 22s | the patch passed |
| +1 | mvnsite | 2m 23s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 2s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 11m 47s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 4m 12s | the patch passed |
| +1 | javadoc | 1m 59s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 8m 33s | hadoop-common in the patch failed. |
| -1 | unit | 102m 1s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 47s | The patch does not generate ASF License warnings. |
| | | 215m 38s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.security.TestFixKerberosTicketOrder |
| | hadoop.security.TestRaceWhenRelogin |
| | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |

|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | HDFS-14630 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12979069/HDFS-14630.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
| uname | Linux 8eaf73c66191 4.15.0-52-g

[jira] [Comment Edited] (HDFS-13157) Do Not Remove Blocks Sequentially During Decommission

2019-09-01 Thread Chen Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920359#comment-16920359
 ] 

Chen Zhang edited comment on HDFS-13157 at 9/1/19 9:55 AM:
---

We observed the same problem on our production cluster about half a year ago. It is 
very easy to reproduce when decommissioning a node with warm data: 
we use HDFS Raid to convert this data to Erasure Coding, so every block has 
only 1 replica and therefore only 1 source node to replicate the data from.

We observed that disk I/O utilization rises to 100% one disk at a time on the 
decommissioning node, which makes decommissioning very slow.


was (Author: zhangchen):
We've observed same problem on our production cluster about half year ago, it's 
very easy to repro the issue when we decommissioning the node with warm data, 
we use HDFS Raid to convert these data to Erasure Coding, so every block have 
only 1 replica, so every block only have 1 source node to replicate the data.

We observed that the disks I/O utilization raise to 100% one by one on the 
decommissioning node, it make the decommission progress very slow.

> Do Not Remove Blocks Sequentially During Decommission 
> --
>
> Key: HDFS-13157
> URL: https://issues.apache.org/jira/browse/HDFS-13157
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode
>Affects Versions: 3.0.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>
> From what I understand of [DataNode 
> decommissioning|https://github.com/apache/hadoop/blob/42a1c98597e6dba2e371510a6b2b6b1fb94e4090/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminManager.java]
>  it appears that all the blocks are scheduled for removal _in order_. I'm 
> not 100% sure what the ordering is exactly, but I think it loops through each 
> data volume and schedules each block to be replicated elsewhere. The net 
> effect is that during a decommission, all of the DataNode transfer threads 
> slam on a single volume until it is cleaned out. At which point, they all 
> slam on the next volume, etc.
> Please randomize the block list so that there is a more even distribution 
> across all volumes when decommissioning a node.
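
To illustrate the request above, here is a minimal sketch of the difference between a volume-by-volume ordering and a shuffled one. It is not the DatanodeAdminManager code; the Replica class, the method names and the volume/block counts are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

/** Hypothetical illustration only; not the DatanodeAdminManager code. */
class ShuffledDecommissionSketch {

  /** Hypothetical stand-in for a block replica stored on one data volume. */
  static final class Replica {
    final long blockId;
    final int volume;

    Replica(long blockId, int volume) {
      this.blockId = blockId;
      this.volume = volume;
    }
  }

  /** Builds a list ordered volume by volume, as the issue suspects happens today. */
  static List<Replica> sequentialOrder(int volumes, int blocksPerVolume) {
    List<Replica> out = new ArrayList<>();
    long id = 0;
    for (int v = 0; v < volumes; v++) {
      for (int b = 0; b < blocksPerVolume; b++) {
        out.add(new Replica(id++, v));
      }
    }
    return out;
  }

  /** Returns a shuffled copy, so that neighbouring entries tend to sit on different volumes. */
  static List<Replica> shuffled(List<Replica> replicas, Random rng) {
    List<Replica> copy = new ArrayList<>(replicas);
    Collections.shuffle(copy, rng);
    return copy;
  }

  /** Counts the distinct volumes touched by the first n entries of an ordering. */
  static int volumesInPrefix(List<Replica> order, int n) {
    Set<Integer> seen = new HashSet<>();
    for (int i = 0; i < n && i < order.size(); i++) {
      seen.add(order.get(i).volume);
    }
    return seen.size();
  }

  public static void main(String[] args) {
    List<Replica> seq = sequentialOrder(4, 1000);
    List<Replica> rnd = shuffled(seq, new Random(42));
    System.out.println("volumes hit by first 100, sequential: " + volumesInPrefix(seq, 100));
    System.out.println("volumes hit by first 100, shuffled:   " + volumesInPrefix(rnd, 100));
  }
}
```

In the sequential ordering the first 100 transfers all read from one volume, which matches the "one disk at 100%" symptom; after shuffling, the same 100 transfers are spread over the volumes.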






[jira] [Comment Edited] (HDFS-13157) Do Not Remove Blocks Sequentially During Decommission

2019-09-01 Thread Chen Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920359#comment-16920359
 ] 

Chen Zhang edited comment on HDFS-13157 at 9/1/19 9:54 AM:
---

We observed the same problem on our production cluster about half a year ago. It is 
very easy to reproduce when decommissioning a node that holds warm data: 
we use HDFS Raid to convert this data to Erasure Coding, so every block has 
only 1 replica, so every block has only 1 source node to replicate the data from.

We observed that disk I/O utilization rises to 100% on one disk after another on the 
decommissioning node, which makes the decommission progress very slow.


was (Author: zhangchen):
We've observed same problem on our production cluster about half year ago, it's 
very to repro the issue when we decommissioning the node with warm data, we use 
HDFS Raid to convert these data to Erasure Coding, so every block have only 1 
replica, so every block only have 1 source node to replicate the data.

We observed that the disks I/O utilization raise to 100% one by one on the 
decommissioning node, it make the decommission progress very slow.

> Do Not Remove Blocks Sequentially During Decommission 
> --
>
> Key: HDFS-13157
> URL: https://issues.apache.org/jira/browse/HDFS-13157
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode
>Affects Versions: 3.0.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>
> From what I understand of [DataNode 
> decommissioning|https://github.com/apache/hadoop/blob/42a1c98597e6dba2e371510a6b2b6b1fb94e4090/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminManager.java]
>  it appears that all the blocks are scheduled for removal _in order_. I'm 
> not 100% sure what the ordering is exactly, but I think it loops through each 
> data volume and schedules each block to be replicated elsewhere. The net 
> effect is that during a decommission, all of the DataNode transfer threads 
> slam on a single volume until it is cleaned out. At which point, they all 
> slam on the next volume, etc.
> Please randomize the block list so that there is a more even distribution 
> across all volumes when decommissioning a node.






[jira] [Commented] (HDFS-13157) Do Not Remove Blocks Sequentially During Decommission

2019-09-01 Thread Chen Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920359#comment-16920359
 ] 

Chen Zhang commented on HDFS-13157:
---

We observed the same problem on our production cluster about half a year ago. It is 
very easy to reproduce when decommissioning a node that holds warm data: we use 
HDFS Raid to convert this data to Erasure Coding, so every block has only 1 
replica, so every block has only 1 source node to replicate the data from.

We observed that disk I/O utilization rises to 100% on one disk after another on the 
decommissioning node, which makes the decommission progress very slow.

> Do Not Remove Blocks Sequentially During Decommission 
> --
>
> Key: HDFS-13157
> URL: https://issues.apache.org/jira/browse/HDFS-13157
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, namenode
>Affects Versions: 3.0.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>
> From what I understand of [DataNode 
> decommissioning|https://github.com/apache/hadoop/blob/42a1c98597e6dba2e371510a6b2b6b1fb94e4090/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeAdminManager.java]
>  it appears that all the blocks are scheduled for removal _in order_. I'm 
> not 100% sure what the ordering is exactly, but I think it loops through each 
> data volume and schedules each block to be replicated elsewhere. The net 
> effect is that during a decommission, all of the DataNode transfer threads 
> slam on a single volume until it is cleaned out. At which point, they all 
> slam on the next volume, etc.
> Please randomize the block list so that there is a more even distribution 
> across all volumes when decommissioning a node.






[jira] [Commented] (HDFS-14799) Do Not Call Map containsKey In Conjunction with get

2019-09-01 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920354#comment-16920354
 ] 

Hadoop QA commented on HDFS-14799:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  5s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 20s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 82m 25s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}138m 52s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.fs.viewfs.TestViewFsAtHdfsRoot |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | HDFS-14799 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12979065/HDFS-14799.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux cadbc2ceaa3a 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / fef65b4 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27753/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/27753/testReport/ |
| Max. process+thread count | 3223 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https:

[jira] [Comment Edited] (HDFS-14699) Erasure Coding: Can NOT trigger the reconstruction when have the dup internal blocks and missing one internal block

2019-09-01 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920346#comment-16920346
 ] 

Ayush Saxena edited comment on HDFS-14699 at 9/1/19 9:16 AM:
-

Thanks [~zhaoyim] for the patch. The UT passes for me even without the fix. Could 
you check it once more?
Also, why can't we just pull the whole if block up, rather than pulling up only 
half of it?


was (Author: ayushtkn):
Thanx [~zhaoyim] for the patch. The UT passes without the fix too, for me. Can 
you give a check once!!!

> Erasure Coding: Can NOT trigger the reconstruction when have the dup internal 
> blocks and missing one internal block
> ---
>
> Key: HDFS-14699
> URL: https://issues.apache.org/jira/browse/HDFS-14699
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ec
>Affects Versions: 3.2.0, 3.1.1, 3.3.0
>Reporter: Zhao Yi Ming
>Assignee: Zhao Yi Ming
>Priority: Critical
>  Labels: patch
> Attachments: HDFS-14699.00.patch, HDFS-14699.01.patch, 
> HDFS-14699.02.patch, HDFS-14699.03.patch, image-2019-08-20-19-58-51-872.png
>
>
> We tried the EC function on an 80-node cluster with Hadoop 3.1.1 and hit the 
> same scenario as described in https://issues.apache.org/jira/browse/HDFS-8881. 
> Our testing steps follow; we hope they are helpful. (The DNs below hold the 
> internal blocks under test.)
>  # We customized a new 10-2-1024k policy and used it on a path; now we have 12 
> internal blocks (12 live blocks).
>  # Decommission one DN. After the decommission completes, we have 13 
> internal blocks (12 live blocks and 1 decommissioned block).
>  # Then shut down one DN that does not hold the same block ID as the 
> decommissioned block; now we have 12 internal blocks (11 live blocks and 1 
> decommissioned block).
>  # After waiting about 600s (before the heartbeat comes), recommission the 
> decommissioned DN; now we have 12 internal blocks (11 live blocks and 1 
> duplicate block).
>  # EC then does not reconstruct the missing block.
> We think this is a critical issue for using the EC function in a production 
> environment. Could you help? Thanks a lot!
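
The end state of the steps above (12 internal blocks reported, but only 11 distinct ones) can be sketched as follows. This is an assumption about why a raw count could hide the missing block, not the actual BlockManager logic; the class and method names are hypothetical.

```java
import java.util.BitSet;

/** Hypothetical illustration only; not the BlockManager implementation. */
class EcLiveIndexSketch {

  /** Naive check: compares the raw number of reported internal blocks with the expected total. */
  static boolean needsReconstructionNaive(int[] liveIndices, int totalBlocks) {
    return liveIndices.length < totalBlocks;
  }

  /** Returns the internal block indices that no live replica covers. */
  static BitSet missingIndices(int[] liveIndices, int totalBlocks) {
    BitSet missing = new BitSet(totalBlocks);
    missing.set(0, totalBlocks);
    for (int idx : liveIndices) {
      missing.clear(idx);
    }
    return missing;
  }

  /** Distinct-index check: a duplicated index does not stand in for a missing one. */
  static boolean needsReconstructionDistinct(int[] liveIndices, int totalBlocks) {
    return !missingIndices(liveIndices, totalBlocks).isEmpty();
  }

  public static void main(String[] args) {
    // 10 data + 2 parity = 12 internal blocks; index 3 is reported twice, index 4 is gone.
    int[] live = {0, 1, 2, 3, 3, 5, 6, 7, 8, 9, 10, 11};
    System.out.println("naive check:    " + needsReconstructionNaive(live, 12));
    System.out.println("distinct check: " + needsReconstructionDistinct(live, 12));
    System.out.println("missing:        " + missingIndices(live, 12));
  }
}
```

With one duplicate and one missing index, a count-based check sees 12 of 12 and does nothing, while a per-index check still finds the gap.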






[jira] [Commented] (HDFS-14699) Erasure Coding: Can NOT trigger the reconstruction when have the dup internal blocks and missing one internal block

2019-09-01 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920346#comment-16920346
 ] 

Ayush Saxena commented on HDFS-14699:
-

Thanks [~zhaoyim] for the patch. The UT passes for me even without the fix. Could 
you check it once more?

> Erasure Coding: Can NOT trigger the reconstruction when have the dup internal 
> blocks and missing one internal block
> ---
>
> Key: HDFS-14699
> URL: https://issues.apache.org/jira/browse/HDFS-14699
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ec
>Affects Versions: 3.2.0, 3.1.1, 3.3.0
>Reporter: Zhao Yi Ming
>Assignee: Zhao Yi Ming
>Priority: Critical
>  Labels: patch
> Attachments: HDFS-14699.00.patch, HDFS-14699.01.patch, 
> HDFS-14699.02.patch, HDFS-14699.03.patch, image-2019-08-20-19-58-51-872.png
>
>
> We tried the EC function on an 80-node cluster with Hadoop 3.1.1 and hit the 
> same scenario as described in https://issues.apache.org/jira/browse/HDFS-8881. 
> Our testing steps follow; we hope they are helpful. (The DNs below hold the 
> internal blocks under test.)
>  # We customized a new 10-2-1024k policy and used it on a path; now we have 12 
> internal blocks (12 live blocks).
>  # Decommission one DN. After the decommission completes, we have 13 
> internal blocks (12 live blocks and 1 decommissioned block).
>  # Then shut down one DN that does not hold the same block ID as the 
> decommissioned block; now we have 12 internal blocks (11 live blocks and 1 
> decommissioned block).
>  # After waiting about 600s (before the heartbeat comes), recommission the 
> decommissioned DN; now we have 12 internal blocks (11 live blocks and 1 
> duplicate block).
>  # EC then does not reconstruct the missing block.
> We think this is a critical issue for using the EC function in a production 
> environment. Could you help? Thanks a lot!






[jira] [Updated] (HDFS-14630) Configuration.getTimeDurationHelper() should not log time unit warning in info log.

2019-09-01 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-14630:
-
Attachment: HDFS-14630.001.patch

> Configuration.getTimeDurationHelper() should not log time unit warning in 
> info log.
> ---
>
> Key: HDFS-14630
> URL: https://issues.apache.org/jira/browse/HDFS-14630
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: hemanthboyina
>Priority: Minor
> Attachments: HDFS-14630.001.patch, HDFS-14630.patch
>
>
> To solve the [HDFS-12920|https://issues.apache.org/jira/browse/HDFS-12920] issue, 
> we configured "dfs.client.datanode-restart.timeout" without a time unit. Now the log 
> file is full of
> {noformat}
> 2019-06-22 20:13:14,605 | INFO  | pool-12-thread-1 | No unit for 
> dfs.client.datanode-restart.timeout(30) assuming SECONDS 
> org.apache.hadoop.conf.Configuration.logDeprecation(Configuration.java:1409){noformat}
> There is no need to log this; just describe the behavior in the property description.
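
As a rough sketch of the behavior being asked for, the snippet below parses a duration value with an optional unit suffix and silently falls back to a default unit when none is given. It is a simplified stand-in written for this illustration, not Hadoop's Configuration code; the class name, the method and the reduced suffix table are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration only; not org.apache.hadoop.conf.Configuration. */
class DurationParseSketch {

  // Insertion order matters: "ms" must be tried before "s" and "m".
  private static final Map<String, TimeUnit> SUFFIXES = new LinkedHashMap<>();
  static {
    SUFFIXES.put("ms", TimeUnit.MILLISECONDS);
    SUFFIXES.put("s", TimeUnit.SECONDS);
    SUFFIXES.put("m", TimeUnit.MINUTES);
    SUFFIXES.put("h", TimeUnit.HOURS);
    SUFFIXES.put("d", TimeUnit.DAYS);
  }

  /**
   * Parses values such as "30", "30s" or "500ms". A value without a suffix is
   * interpreted in defaultUnit, and no message is logged in that case.
   */
  static long parse(String value, TimeUnit defaultUnit, TimeUnit returnUnit) {
    String v = value.trim().toLowerCase(Locale.ROOT);
    TimeUnit unit = defaultUnit;
    for (Map.Entry<String, TimeUnit> e : SUFFIXES.entrySet()) {
      if (v.endsWith(e.getKey())) {
        unit = e.getValue();
        v = v.substring(0, v.length() - e.getKey().length()).trim();
        break;
      }
    }
    return returnUnit.convert(Long.parseLong(v), unit);
  }

  public static void main(String[] args) {
    // Both spellings describe the same 30-second timeout.
    System.out.println(parse("30", TimeUnit.SECONDS, TimeUnit.MILLISECONDS));
    System.out.println(parse("30s", TimeUnit.SECONDS, TimeUnit.MILLISECONDS));
  }
}
```

Independently of any code change, writing the value with an explicit unit (for example "30s") avoids the ambiguity that the log line reports.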






[jira] [Work logged] (HDDS-2015) Encrypt/decrypt key using symmetric key while writing/reading

2019-09-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2015?focusedWorklogId=304957&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304957
 ]

ASF GitHub Bot logged work on HDDS-2015:


Author: ASF GitHub Bot
Created on: 01/Sep/19 08:28
Start Date: 01/Sep/19 08:28
Worklog Time Spent: 10m 
  Work Description: dineshchitlangia commented on issue #1386: HDDS-2015. 
Encrypt/decrypt key using symmetric key while writing/reading
URL: https://github.com/apache/hadoop/pull/1386#issuecomment-526898798
 
 
   @anuengineer the failures seem unrelated to the patch. Please confirm.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304957)
Time Spent: 1h 50m  (was: 1h 40m)

> Encrypt/decrypt key using symmetric key while writing/reading
> -
>
> Key: HDDS-2015
> URL: https://issues.apache.org/jira/browse/HDDS-2015
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Dinesh Chitlangia
>Assignee: Dinesh Chitlangia
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> *Key Write Path (Encryption)*
> When bucket metadata has gdprEnabled=true, we generate the GDPRSymmetricKey 
> and add it to Key Metadata before we create the Key.
> This ensures that the key is encrypted before writing.
> *Key Read Path (Decryption)*
> While reading the Key, we check for gdprEnabled=true and then get the 
> GDPRSymmetricKey based on the secret/algorithm fetched from Key Metadata.
> We then create a stream to decrypt the key and pass it on to the client.
> *Test*
> Create Key in GDPR Enabled Bucket -> Read Key -> Verify content is as 
> expected -> Update Key Metadata to remove the gdprEnabled flag -> Read Key -> 
> Confirm the content is not as expected.
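
The write/read/verify flow described above can be sketched with plain javax.crypto. This does not show GDPRSymmetricKey or any Ozone API, whose internals are not part of this message; the class name, the 16-byte secret and the sample payload are hypothetical. The bare "AES" transformation is used only for brevity; on the stock JDK provider it typically resolves to ECB mode, which is not a recommendation.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical illustration only; not the Ozone GDPR implementation. */
class SymmetricKeyRoundTripSketch {

  /** Runs the cipher in the given mode with a key rebuilt from secret/algorithm metadata. */
  private static byte[] apply(int mode, String secret, String algorithm, byte[] data) {
    try {
      SecretKeySpec key =
          new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), algorithm);
      Cipher cipher = Cipher.getInstance(algorithm);
      cipher.init(mode, key);
      return cipher.doFinal(data);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("cipher operation failed", e);
    }
  }

  /** Write path: content is encrypted before it is stored. */
  static byte[] encrypt(String secret, String algorithm, byte[] plain) {
    return apply(Cipher.ENCRYPT_MODE, secret, algorithm, plain);
  }

  /** Read path: stored bytes are decrypted with the same secret/algorithm. */
  static byte[] decrypt(String secret, String algorithm, byte[] stored) {
    return apply(Cipher.DECRYPT_MODE, secret, algorithm, stored);
  }

  public static void main(String[] args) {
    String secret = "0123456789abcdef"; // hypothetical 16-byte secret
    byte[] plain = "hello ozone".getBytes(StandardCharsets.UTF_8);

    byte[] stored = encrypt(secret, "AES", plain);
    byte[] decrypted = decrypt(secret, "AES", stored);

    // Reading with decryption returns the original content.
    System.out.println("decrypted matches: " + Arrays.equals(plain, decrypted));
    // Reading the stored bytes without decryption (flag removed) does not.
    System.out.println("raw read matches:  " + Arrays.equals(plain, stored));
  }
}
```

The second comparison corresponds to the last step of the test plan: once the gdprEnabled flag is gone and the reader skips decryption, the content no longer matches what was written.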






[jira] [Commented] (HDFS-14711) RBF: RBFMetrics throws NullPointerException if stateStore disabled

2019-09-01 Thread Chen Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920340#comment-16920340
 ] 

Chen Zhang commented on HDFS-14711:
---

Thanks [~ayushtkn] for the commit

> RBF: RBFMetrics throws NullPointerException if stateStore disabled
> --
>
> Key: HDFS-14711
> URL: https://issues.apache.org/jira/browse/HDFS-14711
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chen Zhang
>Assignee: Chen Zhang
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14711.001.patch, HDFS-14711.002.patch, 
> HDFS-14711.003.patch, HDFS-14711.004.patch, HDFS-14711.005.patch
>
>
> In the current implementation, if {{stateStore}} fails to initialize, only an 
> error message is logged. In fact, RBFMetrics cannot work normally in that state.
> {code:java}
> 2019-08-08 22:43:58,024 [qtp812446698-28] ERROR jmx.JMXJsonServlet 
> (JMXJsonServlet.java:writeAttribute(345)) - getting attribute FilesTotal of 
> Hadoop:service=NameNode,name=FSNamesystem-2 threw an exception
> javax.management.RuntimeMBeanException: java.lang.NullPointerException
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.rethrow(DefaultMBeanServerInterceptor.java:839)
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.rethrowMaybeMBeanException(DefaultMBeanServerInterceptor.java:852)
> at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:651)
> at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678)
> at 
> org.apache.hadoop.jmx.JMXJsonServlet.writeAttribute(JMXJsonServlet.java:338)
> at org.apache.hadoop.jmx.JMXJsonServlet.listBeans(JMXJsonServlet.java:316)
> at org.apache.hadoop.jmx.JMXJsonServlet.doGet(JMXJsonServlet.java:210)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
> at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
> at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
> at 
> org.apache.hadoop.security.authentication.server.ProxyUserAuthenticationFilter.doFilter(ProxyUserAuthenticationFilter.java:104)
> at 
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
> at org.apache.hadoop.hdfs.web.AuthFilter.doFilter(AuthFilter.java:51)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
> at 
> org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:110)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
> at 
> org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1604)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
> at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
> at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
> at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
> at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
> at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
> at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
> at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> at org.eclipse.jetty.server.Server.handle(Server.java:539)
> at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:333)
> at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
> at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
> at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
> at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
> at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
> at 
> org.eclipse.jetty.util.thread.strategy.Exec

[jira] [Commented] (HDFS-13843) RBF: Add optional parameter -d for detailed listing of mount points.

2019-09-01 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920339#comment-16920339
 ] 

Hadoop QA commented on HDFS-13843:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  8s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
33s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 12s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 90m 
36s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 23m  3s{color} 
| {color:red} hadoop-hdfs-rbf in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}176m  5s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.federation.security.TestRouterHttpDelegationToken |
|   | hadoop.hdfs.server.federation.router.TestRouterWithSecureStartup |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | HDFS-13843 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12979060/HDFS-13843-04.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 0cbc92e6f966 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / c7ef4fb |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |

[jira] [Commented] (HDFS-14762) "Path(Path/String parent, String child)" will fail when "child" contains ":"

2019-09-01 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920336#comment-16920336
 ] 

Hadoop QA commented on HDFS-14762:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 35s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 18m 21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 54s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 12s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 17m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 18s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 43s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  8m 35s{color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 46s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}104m  6s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.fs.TestAfsCheckPath |
|   | hadoop.fs.TestFileUtil |
|   | hadoop.fs.TestDelegateToFsCheckPath |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | HDFS-14762 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12979062/HDFS-14762.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux c6bcf321d609 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / c7ef4fb |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/27752/artifact/out/whitespace-eol.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/27752/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/27752/testReport/ |
| Max. process+thread count | 1497 (v

[jira] [Commented] (HDFS-14630) Configuration.getTimeDurationHelper() should not log time unit warning in info log.

2019-09-01 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920334#comment-16920334
 ] 

Surendra Singh Lilhore commented on HDFS-14630:
---

Changes LGTM, please rebase the patch.

> Configuration.getTimeDurationHelper() should not log time unit warning in 
> info log.
> ---
>
> Key: HDFS-14630
> URL: https://issues.apache.org/jira/browse/HDFS-14630
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: hemanthboyina
>Priority: Minor
> Attachments: HDFS-14630.patch
>
>
> To solve the [HDFS-12920|https://issues.apache.org/jira/browse/HDFS-12920] issue 
> we configured "dfs.client.datanode-restart.timeout" without a time unit. Now the log 
> file is full of
> {noformat}
> 2019-06-22 20:13:14,605 | INFO  | pool-12-thread-1 | No unit for 
> dfs.client.datanode-restart.timeout(30) assuming SECONDS 
> org.apache.hadoop.conf.Configuration.logDeprecation(Configuration.java:1409){noformat}
> There is no need to log this; just document the behavior in the property description.






[jira] [Commented] (HDFS-14807) SetTimes updates all negative values apart from -1

2019-09-01 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920335#comment-16920335
 ] 

Hadoop QA commented on HDFS-14807:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 41s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m  4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 33s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 53s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}103m 25s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 32s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}183m  1s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.tools.TestDFSZKFailoverController |
|   | hadoop.hdfs.TestMultipleNNPortQOP |
|   | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
|   | hadoop.hdfs.server.balancer.TestBalancerService |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | HDFS-14807 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12979059/HDFS-14807-02.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 72e244517622 4.15.0-52-generic #56-Ubuntu SMP Tue Jun 4 22:49:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / c7ef

[jira] [Commented] (HDFS-13276) TestDistributedFileSystem doesn't cleanup MiniDFSCluster if test times out

2019-09-01 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920327#comment-16920327
 ] 

Ayush Saxena commented on HDFS-13276:
-

Thanks [~elgoiri] for the explanation, it makes sense. I just checked and the trunk 
patch doesn't apply now, so I think you need to rebase. The idea seems good, and 
ideally we should only do it this way. :)

> TestDistributedFileSystem doesn't cleanup MiniDFSCluster if test times out
> --
>
> Key: HDFS-13276
> URL: https://issues.apache.org/jira/browse/HDFS-13276
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Íñigo Goiri
>Assignee: Lukas Majercak
>Priority: Major
> Attachments: HDFS-13276-branch-2.000.patch, HDFS-13276.000.patch, 
> HDFS-13276.001.patch, HDFS-13276.002.patch
>
>
> If a unit test times out, it may leave a MiniDFSCluster behind. This is 
> particularly bad on Windows, where the new MiniDFSCluster cannot start and all 
> tests will fail after this one.






[jira] [Comment Edited] (HDFS-14654) RBF: TestRouterRpc#testNamenodeMetrics is flaky

2019-09-01 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920307#comment-16920307
 ] 

Ayush Saxena edited comment on HDFS-14654 at 9/1/19 7:38 AM:
-

Thanks [~zhangchen] for the patch.
I guess in {{TestRouterRpc}} the {{resolver.setDisableRegistration(true);}} call 
should be reverted in a finally block; if there is an error before we 
set it back to false, it may affect the following tests.
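
The suggestion above can be sketched as follows. This is a minimal illustration, not the code from the patch: {{FakeResolver}} and {{runWithRegistrationDisabled}} are hypothetical stand-ins, and only the {{setDisableRegistration}} name comes from the comment. The point is that the reset sits in a finally block, so it runs even when the test body throws.

```java
// Hypothetical stand-in for the resolver used by TestRouterRpc; only the
// setDisableRegistration name is taken from the comment above.
class RegistrationToggleSketch {
  static class FakeResolver {
    private boolean disableRegistration;

    void setDisableRegistration(boolean disable) {
      this.disableRegistration = disable;
    }

    boolean isRegistrationDisabled() {
      return disableRegistration;
    }
  }

  // Runs the body with registration disabled and always restores the flag,
  // even if the body throws.
  static void runWithRegistrationDisabled(FakeResolver resolver, Runnable body) {
    resolver.setDisableRegistration(true);
    try {
      body.run();
    } finally {
      resolver.setDisableRegistration(false);
    }
  }

  public static void main(String[] args) {
    FakeResolver resolver = new FakeResolver();
    try {
      runWithRegistrationDisabled(resolver, () -> {
        throw new IllegalStateException("simulated test failure");
      });
    } catch (IllegalStateException expected) {
      // The failure still propagates; the flag has been reset regardless.
    }
    System.out.println("registration disabled after failure: "
        + resolver.isRegistrationDisabled());
  }
}
```

With this shape, a failure in one test cannot leave registration disabled for the tests that run after it.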


was (Author: ayushtkn):
Thanks [~zhangchen] for the patch.
v004 LGTM +1

> RBF: TestRouterRpc#testNamenodeMetrics is flaky
> ---
>
> Key: HDFS-14654
> URL: https://issues.apache.org/jira/browse/HDFS-14654
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Takanobu Asanuma
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-14654.001.patch, HDFS-14654.002.patch, 
> HDFS-14654.003.patch, HDFS-14654.004.patch, error.log
>
>
> They sometimes pass and sometimes fail.






[jira] [Updated] (HDFS-14711) RBF: RBFMetrics throws NullPointerException if stateStore disabled

2019-09-01 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-14711:

Fix Version/s: 3.3.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> RBF: RBFMetrics throws NullPointerException if stateStore disabled
> --
>
> Key: HDFS-14711
> URL: https://issues.apache.org/jira/browse/HDFS-14711
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chen Zhang
>Assignee: Chen Zhang
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14711.001.patch, HDFS-14711.002.patch, 
> HDFS-14711.003.patch, HDFS-14711.004.patch, HDFS-14711.005.patch
>
>
> In the current implementation, if {{stateStore}} fails to initialize, only an 
> error message is logged. In fact, RBFMetrics can't work normally in that state.
> {code:java}
> 2019-08-08 22:43:58,024 [qtp812446698-28] ERROR jmx.JMXJsonServlet (JMXJsonServlet.java:writeAttribute(345)) - getting attribute FilesTotal of Hadoop:service=NameNode,name=FSNamesystem-2 threw an exception
> javax.management.RuntimeMBeanException: java.lang.NullPointerException
> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.rethrow(DefaultMBeanServerInterceptor.java:839)
> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.rethrowMaybeMBeanException(DefaultMBeanServerInterceptor.java:852)
> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:651)
> at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678)
> at org.apache.hadoop.jmx.JMXJsonServlet.writeAttribute(JMXJsonServlet.java:338)
> at org.apache.hadoop.jmx.JMXJsonServlet.listBeans(JMXJsonServlet.java:316)
> at org.apache.hadoop.jmx.JMXJsonServlet.doGet(JMXJsonServlet.java:210)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
> at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)
> at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
> at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
> at org.apache.hadoop.security.authentication.server.ProxyUserAuthenticationFilter.doFilter(ProxyUserAuthenticationFilter.java:104)
> at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
> at org.apache.hadoop.hdfs.web.AuthFilter.doFilter(AuthFilter.java:51)
> at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
> at org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:110)
> at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
> at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1604)
> at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
> at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
> at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
> at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
> at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
> at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
> at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
> at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> at org.eclipse.jetty.server.Server.handle(Server.java:539)
> at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:333)
> at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
> at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
> at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
> at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
> at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
> at 
> o

[jira] [Commented] (HDFS-14711) RBF: RBFMetrics throws NullPointerException if stateStore disabled

2019-09-01 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920325#comment-16920325
 ] 

Hudson commented on HDFS-14711:
---

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17218 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17218/])
HDFS-14711. RBF: RBFMetrics throws NullPointerException if stateStore 
(ayushsaxena: rev 18d74fe41c0982dc1540367805b0c3d0d4fc29d3)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterRpc.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/metrics/RBFMetrics.java


> RBF: RBFMetrics throws NullPointerException if stateStore disabled
> --
>
> Key: HDFS-14711
> URL: https://issues.apache.org/jira/browse/HDFS-14711
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chen Zhang
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-14711.001.patch, HDFS-14711.002.patch, 
> HDFS-14711.003.patch, HDFS-14711.004.patch, HDFS-14711.005.patch
>
>
> In the current implementation, if {{stateStore}} fails to initialize, only an 
> error message is logged. In fact, RBFMetrics can't work normally in that state.

[jira] [Commented] (HDFS-14711) RBF: RBFMetrics throws NullPointerException if stateStore disabled

2019-09-01 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920326#comment-16920326
 ] 

Ayush Saxena commented on HDFS-14711:
-

Committed to trunk.
Thanx [~zhangchen] for the contribution!!!

> RBF: RBFMetrics throws NullPointerException if stateStore disabled
> --
>
> Key: HDFS-14711
> URL: https://issues.apache.org/jira/browse/HDFS-14711
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Chen Zhang
>Assignee: Chen Zhang
>Priority: Major
> Attachments: HDFS-14711.001.patch, HDFS-14711.002.patch, 
> HDFS-14711.003.patch, HDFS-14711.004.patch, HDFS-14711.005.patch
>
>
> In the current implementation, if {{stateStore}} fails to initialize, only an 
> error message is logged. In fact, RBFMetrics can't work normally in that state.

[jira] [Updated] (HDFS-14799) Do Not Call Map containsKey In Conjunction with get

2019-09-01 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-14799:
-
Attachment: HDFS-14799.001.patch
Status: Patch Available  (was: Open)

> Do Not Call Map containsKey In Conjunction with get
> ---
>
> Key: HDFS-14799
> URL: https://issues.apache.org/jira/browse/HDFS-14799
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.2.0
>Reporter: David Mollitor
>Assignee: hemanthboyina
>Priority: Minor
>  Labels: newbie, noob
> Attachments: HDFS-14799.001.patch
>
>
> {code:java|title=InvalidateBlocks.java}
>   private final Map<DatanodeInfo, LightWeightHashSet<Block>>
>       nodeToBlocks = new HashMap<>();
>   private final Map<DatanodeInfo, LightWeightHashSet<Block>>
>       nodeToECBlocks = new HashMap<>();
> ...
>   private LightWeightHashSet<Block> getBlocksSet(final DatanodeInfo dn) {
>     if (nodeToBlocks.containsKey(dn)) {
>       return nodeToBlocks.get(dn);
>     }
>     return null;
>   }
>   private LightWeightHashSet<Block> getECBlocksSet(final DatanodeInfo dn) {
>     if (nodeToECBlocks.containsKey(dn)) {
>       return nodeToECBlocks.get(dn);
>     }
>     return null;
>   }
> {code}
> There is no need to check for {{containsKey}} here since a call to {{get}} 
> will already return 'null' if the key is not there.  This just adds overhead 
> of having to dive into the Map twice to get the value.
> {code}
>   private LightWeightHashSet<Block> getECBlocksSet(final DatanodeInfo dn) {
>     return nodeToECBlocks.get(dn);
>   }
> {code}
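
The equivalence described above can be checked with a small self-contained sketch. It does not use the Hadoop classes: {{String}} and {{Long}} are hypothetical stand-ins for {{DatanodeInfo}} and {{Block}}, and {{BlockLookupSketch}} is an illustrative name. It also assumes the map never stores a null value, which is the case where {{get}} alone and {{containsKey}} followed by {{get}} return the same result.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for InvalidateBlocks; String and Long replace
// DatanodeInfo and Block so the sketch compiles without Hadoop.
class BlockLookupSketch {
  private final Map<String, Set<Long>> nodeToBlocks = new HashMap<>();

  void add(String node, long blockId) {
    nodeToBlocks.computeIfAbsent(node, k -> new HashSet<>()).add(blockId);
  }

  // Two map lookups: containsKey, then get.
  Set<Long> getBlocksSetTwoLookups(String node) {
    if (nodeToBlocks.containsKey(node)) {
      return nodeToBlocks.get(node);
    }
    return null;
  }

  // One map lookup: get already returns null for an absent key.
  Set<Long> getBlocksSet(String node) {
    return nodeToBlocks.get(node);
  }

  public static void main(String[] args) {
    BlockLookupSketch sketch = new BlockLookupSketch();
    sketch.add("dn1", 42L);
    // Present key: both variants return the same set instance.
    System.out.println(
        sketch.getBlocksSet("dn1") == sketch.getBlocksSetTwoLookups("dn1"));
    // Absent key: both variants return null.
    System.out.println(sketch.getBlocksSet("dn2") == null
        && sketch.getBlocksSetTwoLookups("dn2") == null);
  }
}
```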






[jira] [Commented] (HDFS-14809) Make a new BlockReader for hdfs client lib

2019-09-01 Thread KenCao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16920322#comment-16920322
 ] 

KenCao commented on HDFS-14809:
---

[~jojochuang] I will give it a try. But I don't think I am a good coder, so it may 
take me a long time to contribute an implementation. I also need 
confirmation of my analysis above from some HDFS committers. :)

> Make a new BlockReader  for hdfs client lib
> ---
>
> Key: HDFS-14809
> URL: https://issues.apache.org/jira/browse/HDFS-14809
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Affects Versions: 2.6.0
>Reporter: KenCao
>Priority: Major
>
> As we know, the HDFS client Java library uses BlockReaderLocal for short-circuit 
> reads by default. It first allocates shared memory and creates a slot within 
> it, and only after these steps does it request the fds from the DataNode. 
> However, the slot and shared memory structure are only used by the DataNode when 
> uncaching replicas; the client process works fine with just the fds requested 
> later, and caching replicas is very unlikely in a production environment. 
> The API to release fds is called by the client with only the slot given, and the fds 
> are finally closed in the client process.  
> So I think we can make a new BlockReader implementation which just requests 
> the fds; this would reduce the RPC calls from 3 (allocate shm, request fds, 
> release fds) to 1 (request fds).


