[jira] [Commented] (HDFS-15199) NPE in BlockSender

2020-02-27 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047262#comment-17047262
 ] 

Ayush Saxena commented on HDFS-15199:
-

{{String ioem = e.getMessage();}} returns null for {{ClosedChannelException}}, so dereferencing {{ioem}} throws the NPE.

 

Will add a null check for ioem.
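
A minimal sketch of the kind of null check meant here, not the actual patch. The class name, the helper {{isExpectedDisconnect}}, and the two message prefixes are assumptions made for illustration; only the fact that {{getMessage()}} can return null for {{ClosedChannelException}} comes from this issue.

```java
import java.io.IOException;
import java.nio.channels.ClosedChannelException;

// Hypothetical helper class, not part of BlockSender.
final class NullSafeIoMessage {

    // Returns true only when the exception carries a message that marks an
    // expected client disconnect. The prefixes below are illustrative assumptions.
    public static boolean isExpectedDisconnect(IOException e) {
        String ioem = e.getMessage(); // null for ClosedChannelException
        return ioem != null
            && (ioem.startsWith("Broken pipe") || ioem.startsWith("Connection reset"));
    }

    public static void main(String[] args) {
        // An exception with a message is classified by its prefix.
        System.out.println(isExpectedDisconnect(new IOException("Broken pipe")));
        // An exception without a message no longer triggers an NPE.
        System.out.println(isExpectedDisconnect(new ClosedChannelException()));
    }
}
```

Checking {{ioem != null}} before calling {{startsWith}} keeps exceptions without a message on the ordinary error path instead of failing inside the catch block.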

 

> NPE in BlockSender
> --
>
> Key: HDFS-15199
> URL: https://issues.apache.org/jira/browse/HDFS-15199
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>
> {noformat}
> java.lang.NullPointerException
>   at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:662)
>   at org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:819)
>   at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:766)
>   at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:607)
>   at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:152)
>   at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:104)
>   at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290)
>   at java.lang.Thread.run(Thread.java:748)
> 2020-02-28 11:49:13,357 [stripedRead-0] INFO  datanode.DataNode (StripedBlockReader.java:call(182)) - Premature EOF reading from org.apache.hadoop.net.SocketInputStream@8a99d11
> 2020-02-28 11:49:13,362 [ResponseProcessor for block BP-1162371257-10.19.127.112-1582870703783:blk_-9223372036854775774_1004] WARN  hdfs.DataStreamer (DataStreamer.java:run(1217)) - Exception for BP-1162371257-10.19.127.112-1582870703783:blk
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15199) NPE in BlockSender

2020-02-27 Thread Ayush Saxena (Jira)
Ayush Saxena created HDFS-15199:
---

 Summary: NPE in BlockSender
 Key: HDFS-15199
 URL: https://issues.apache.org/jira/browse/HDFS-15199
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ayush Saxena
Assignee: Ayush Saxena



{noformat}
java.lang.NullPointerException
  at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:662)
  at org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:819)
  at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:766)
  at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:607)
  at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:152)
  at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:104)
  at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:290)
  at java.lang.Thread.run(Thread.java:748)
2020-02-28 11:49:13,357 [stripedRead-0] INFO  datanode.DataNode (StripedBlockReader.java:call(182)) - Premature EOF reading from org.apache.hadoop.net.SocketInputStream@8a99d11
2020-02-28 11:49:13,362 [ResponseProcessor for block BP-1162371257-10.19.127.112-1582870703783:blk_-9223372036854775774_1004] WARN  hdfs.DataStreamer (DataStreamer.java:run(1217)) - Exception for BP-1162371257-10.19.127.112-1582870703783:blk
{noformat}







[jira] [Commented] (HDFS-15033) Support to save replica cached files to other place and make expired time configurable

2020-02-27 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047260#comment-17047260
 ] 

Hadoop QA commented on HDFS-15033:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 45s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 54s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 47s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m  8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 17s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 40s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 95m 54s{color} | {color:red} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 37s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}168m 33s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks |
|   | hadoop.hdfs.server.namenode.TestEditLogRace |
|   | hadoop.hdfs.tools.TestECAdmin |
|   | hadoop.hdfs.server.datanode.TestNNHandlesCombinedBlockReport |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15033 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12994836/HDFS-15033.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux f611f31774d1 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a43510e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_242 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28869/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
|  Test Results | 

[jira] [Commented] (HDFS-15033) Support to save replica cached files to other place and make expired time configurable

2020-02-27 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047233#comment-17047233
 ] 

Hadoop QA commented on HDFS-15033:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 42s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 49s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m  2s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 39s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}113m 11s{color} | {color:red} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 34s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}192m 45s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDeadNodeDetection |
|   | hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits |
|   | hadoop.hdfs.server.balancer.TestBalancer |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15033 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12994833/HDFS-15033.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux efd5d3d8e7e0 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a43510e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_242 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28868/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28868/testReport/ |
| Max. process+thread count | 2958 (vs. ulimit of 5500) 

[jira] [Updated] (HDFS-15188) Add option to set Write/Read timeout extension for different StorageType

2020-02-27 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15188:

Attachment: HDFS-15188.patch
Status: Patch Available  (was: Open)

> Add option to set Write/Read timeout extension for different StorageType
> 
>
> Key: HDFS-15188
> URL: https://issues.apache.org/jira/browse/HDFS-15188
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, dfsclient
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15188.patch, HDFS-15188.patch, HDFS-15188.patch
>
>
> Different storage types have different speeds. For low-speed Archive 
> volumes in particular, errors are often reported under the current timeout. 
> Add a unified way to set timeout options per StorageType.






[jira] [Commented] (HDFS-15198) In Secure Mode, Router can't refresh other router's mountTableEntries

2020-02-27 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047228#comment-17047228
 ] 

Ayush Saxena commented on HDFS-15198:
-

Can you extend a test for this?

> In Secure Mode, Router can't refresh other router's mountTableEntries
> -
>
> Key: HDFS-15198
> URL: https://issues.apache.org/jira/browse/HDFS-15198
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Reporter: zhengchenyu
>Priority: Major
> Attachments: HDFS-15198.001.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> In issue HDFS-13443, the mount table cache is updated immediately. The specified 
> router updates its own mount table cache immediately, then updates the other routers' caches 
> through the RPC call refreshMountTableEntries. But in secure mode, it cannot refresh 
> the other routers' caches. The specified router's log shows an error like this:
> {code}
> 2020-02-27 22:59:07,212 WARN org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
> 2020-02-27 22:59:07,213 ERROR org.apache.hadoop.hdfs.server.federation.router.MountTableRefresherThread: Failed to refresh mount table entries cache at router $host:8111
> java.io.IOException: DestHost:destPort host:8111 , LocalHost:localPort $host/$ip:0. Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>   at org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocolTranslatorPB.refreshMountTableEntries(RouterAdminProtocolTranslatorPB.java:288)
>   at org.apache.hadoop.hdfs.server.federation.router.MountTableRefresherThread.run(MountTableRefresherThread.java:65)
> 2020-02-27 22:59:07,214 INFO org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver: Added new mount point /test_11 to resolver
> {code}






[jira] [Updated] (HDFS-15188) Add option to set Write/Read timeout extension for different StorageType

2020-02-27 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15188:

Status: Open  (was: Patch Available)

> Add option to set Write/Read timeout extension for different StorageType
> 
>
> Key: HDFS-15188
> URL: https://issues.apache.org/jira/browse/HDFS-15188
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, dfsclient
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15188.patch, HDFS-15188.patch, HDFS-15188.patch
>
>
> Different storage types have different speeds. For low-speed Archive 
> volumes in particular, errors are often reported under the current timeout. 
> Add a unified way to set timeout options per StorageType.






[jira] [Commented] (HDFS-15198) In Secure Mode, Router can't refresh other router's mountTableEntries

2020-02-27 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047226#comment-17047226
 ] 

Hadoop QA commented on HDFS-15198:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 56s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 42s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 37s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 34s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m  1s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 67m 40s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15198 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12994834/HDFS-15198.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux a40d25b771f2 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a43510e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_242 |
| findbugs | v3.1.0-RC1 |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28870/testReport/ |
| Max. process+thread count | 3192 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28870/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> In Secure Mode, Router can't refresh other 

[jira] [Commented] (HDFS-15188) Add option to set Write/Read timeout extension for different StorageType

2020-02-27 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047219#comment-17047219
 ] 

Hadoop QA commented on HDFS-15188:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 35s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  9m 21s{color} | {color:red} root in trunk failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 36s{color} | {color:red} root in trunk failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 22m 55s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  8m  4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  4s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 28s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 59s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  8m 54s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  8m 54s{color} | {color:red} root in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 23s{color} | {color:orange} The patch fails to run checkstyle in root {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 27s{color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 25s{color} | {color:red} hadoop-hdfs-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 25s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 43 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  1s{color} | {color:red} The patch 600 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 38s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 45s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 59s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 99m 48s{color} | {color:red} hadoop-hdfs in the patch passed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 39s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}194m 30s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestSafeModeWithStripedFileWithRandomECPolicy |
|   | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
|   | 

[jira] [Updated] (HDFS-15198) In Secure Mode, Router can't refresh other router's mountTableEntries

2020-02-27 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-15198:
-
Status: Patch Available  (was: Open)

> In Secure Mode, Router can't refresh other router's mountTableEntries
> -
>
> Key: HDFS-15198
> URL: https://issues.apache.org/jira/browse/HDFS-15198
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Reporter: zhengchenyu
>Priority: Major
> Attachments: HDFS-15198.001.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> In issue HDFS-13443, the mount table cache is updated immediately. The specified 
> router updates its own mount table cache immediately, then updates the other routers' caches 
> through the RPC call refreshMountTableEntries. But in secure mode, it cannot refresh 
> the other routers' caches. The specified router's log shows an error like this:
> {code}
> 2020-02-27 22:59:07,212 WARN org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
> 2020-02-27 22:59:07,213 ERROR org.apache.hadoop.hdfs.server.federation.router.MountTableRefresherThread: Failed to refresh mount table entries cache at router $host:8111
> java.io.IOException: DestHost:destPort host:8111 , LocalHost:localPort $host/$ip:0. Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>   at org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocolTranslatorPB.refreshMountTableEntries(RouterAdminProtocolTranslatorPB.java:288)
>   at org.apache.hadoop.hdfs.server.federation.router.MountTableRefresherThread.run(MountTableRefresherThread.java:65)
> 2020-02-27 22:59:07,214 INFO org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver: Added new mount point /test_11 to resolver
> {code}






[jira] [Commented] (HDFS-15033) Support to save replica cached files to other place and make expired time configurable

2020-02-27 Thread Yang Yun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047195#comment-17047195
 ] 

Yang Yun commented on HDFS-15033:
-

Thanks [~ayushtkn] for the review.

Got the time usage and removed the import tweaks.

 

> Support to save replica cached files to other place and make expired time 
> configurable
> --
>
> Key: HDFS-15033
> URL: https://issues.apache.org/jira/browse/HDFS-15033
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15033.patch, HDFS-15033.patch
>
>
> For a slow volume with many replicas, add an option to save the replica files 
> to a high-speed disk and speed up the saving.
> Also add an option to change the expiry time of the replica file.






[jira] [Updated] (HDFS-15033) Support to save replica cached files to other place and make expired time configurable

2020-02-27 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15033:

Attachment: HDFS-15033.patch
Status: Patch Available  (was: Open)

> Support to save replica cached files to other place and make expired time 
> configurable
> --
>
> Key: HDFS-15033
> URL: https://issues.apache.org/jira/browse/HDFS-15033
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15033.patch, HDFS-15033.patch
>
>
> For a slow volume with many replicas, add an option to save the replica files 
> to a high-speed disk to speed up the saving.
> Also add an option to change the expiry time of the replica file.






[jira] [Updated] (HDFS-15033) Support to save replica cached files to other place and make expired time configurable

2020-02-27 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15033:

Status: Open  (was: Patch Available)

> Support to save replica cached files to other place and make expired time 
> configurable
> --
>
> Key: HDFS-15033
> URL: https://issues.apache.org/jira/browse/HDFS-15033
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15033.patch
>
>
> For a slow volume with many replicas, add an option to save the replica files 
> to a high-speed disk to speed up the saving.
> Also add an option to change the expiry time of the replica file.






[jira] [Updated] (HDFS-15198) In Secure Mode, Router can't refresh other router's mountTableEntries

2020-02-27 Thread zhengchenyu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhengchenyu updated HDFS-15198:
---
Attachment: HDFS-15198.001.patch

> In Secure Mode, Router can't refresh other router's mountTableEntries
> -
>
> Key: HDFS-15198
> URL: https://issues.apache.org/jira/browse/HDFS-15198
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Reporter: zhengchenyu
>Priority: Major
> Attachments: HDFS-15198.001.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> HDFS-13443 made mount table cache updates immediate. The specified 
> router updates its own mount table cache immediately, then updates the others' 
> through the refreshMountTableEntries RPC. In secure mode, however, it cannot 
> refresh the other routers' caches. The specified router's log shows an error like this:
> {code}
> 2020-02-27 22:59:07,212 WARN org.apache.hadoop.ipc.Client: Exception 
> encountered while connecting to the server : 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> 2020-02-27 22:59:07,213 ERROR 
> org.apache.hadoop.hdfs.server.federation.router.MountTableRefresherThread: 
> Failed to refresh mount table entries cache at router $host:8111
> java.io.IOException: DestHost:destPort host:8111 , LocalHost:localPort 
> $host/$ip:0. Failed on local exception: java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> at 
> org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocolTranslatorPB.refreshMountTableEntries(RouterAdminProtocolTranslatorPB.java:288)
> at 
> org.apache.hadoop.hdfs.server.federation.router.MountTableRefresherThread.run(MountTableRefresherThread.java:65)
> 2020-02-27 22:59:07,214 INFO 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver: Added 
> new mount point /test_11 to resolver
> {code}






[jira] [Commented] (HDFS-13377) The owner of folder can set quota for his sub folder

2020-02-27 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047191#comment-17047191
 ] 

Ayush Saxena commented on HDFS-13377:
-

Thanks [~hadoop_yangyun] for the patch. Overall LGTM.

The {{TestHdfsConfigFields}} failure is related. You need to add the new config to the 
{{hdfs-default.xml}} file too for the test to pass.

> The owner of folder can set quota for his sub folder
> 
>
> Key: HDFS-13377
> URL: https://issues.apache.org/jira/browse/HDFS-13377
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-13377.patch, HDFS-13377.patch, HDFS-13377.patch, 
> HDFS-13377.patch
>
>
> Currently, only the super user can set quotas. That is a huge burden for 
> the administrator of a large system. Add a new feature that lets the owner of a 
> folder also set quotas for his sub folders.






[jira] [Commented] (HDFS-15198) In Secure Mode, Router can't refresh other router's mountTableEntries

2020-02-27 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047190#comment-17047190
 ] 

Ayush Saxena commented on HDFS-15198:
-

Thanks [~zhengchenyu] for the report.

Do you want to contribute a fix?

> In Secure Mode, Router can't refresh other router's mountTableEntries
> -
>
> Key: HDFS-15198
> URL: https://issues.apache.org/jira/browse/HDFS-15198
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Reporter: zhengchenyu
>Priority: Major
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> HDFS-13443 made mount table cache updates immediate. The specified 
> router updates its own mount table cache immediately, then updates the others' 
> through the refreshMountTableEntries RPC. In secure mode, however, it cannot 
> refresh the other routers' caches. The specified router's log shows an error like this:
> {code}
> 2020-02-27 22:59:07,212 WARN org.apache.hadoop.ipc.Client: Exception 
> encountered while connecting to the server : 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> 2020-02-27 22:59:07,213 ERROR 
> org.apache.hadoop.hdfs.server.federation.router.MountTableRefresherThread: 
> Failed to refresh mount table entries cache at router $host:8111
> java.io.IOException: DestHost:destPort host:8111 , LocalHost:localPort 
> $host/$ip:0. Failed on local exception: java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> at 
> org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocolTranslatorPB.refreshMountTableEntries(RouterAdminProtocolTranslatorPB.java:288)
> at 
> org.apache.hadoop.hdfs.server.federation.router.MountTableRefresherThread.run(MountTableRefresherThread.java:65)
> 2020-02-27 22:59:07,214 INFO 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver: Added 
> new mount point /test_11 to resolver
> {code}






[jira] [Updated] (HDFS-15198) In Secure Mode, Router can't refresh other router's mountTableEntries

2020-02-27 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-15198:

Fix Version/s: (was: HDFS-13891)
   (was: 3.3.0)

> In Secure Mode, Router can't refresh other router's mountTableEntries
> -
>
> Key: HDFS-15198
> URL: https://issues.apache.org/jira/browse/HDFS-15198
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Reporter: zhengchenyu
>Priority: Major
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> HDFS-13443 made mount table cache updates immediate. The specified 
> router updates its own mount table cache immediately, then updates the others' 
> through the refreshMountTableEntries RPC. In secure mode, however, it cannot 
> refresh the other routers' caches. The specified router's log shows an error like this:
> {code}
> 2020-02-27 22:59:07,212 WARN org.apache.hadoop.ipc.Client: Exception 
> encountered while connecting to the server : 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> 2020-02-27 22:59:07,213 ERROR 
> org.apache.hadoop.hdfs.server.federation.router.MountTableRefresherThread: 
> Failed to refresh mount table entries cache at router $host:8111
> java.io.IOException: DestHost:destPort host:8111 , LocalHost:localPort 
> $host/$ip:0. Failed on local exception: java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> at 
> org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocolTranslatorPB.refreshMountTableEntries(RouterAdminProtocolTranslatorPB.java:288)
> at 
> org.apache.hadoop.hdfs.server.federation.router.MountTableRefresherThread.run(MountTableRefresherThread.java:65)
> 2020-02-27 22:59:07,214 INFO 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver: Added 
> new mount point /test_11 to resolver
> {code}






[jira] [Commented] (HDFS-15033) Support to save replica cached files to other place and make expired time configurable

2020-02-27 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047188#comment-17047188
 ] 

Ayush Saxena commented on HDFS-15033:
-

Thanks [~hadoop_yangyun] for the patch. I had a quick look at this; overall it seems 
fine.
{code:java}
+
+<property>
+  <name>dfs.datanode.replica.cache.expiry.time.ms</name>
+  <value>30</value>
+  <description>
+    Living time of replica cached files in milliseconds.
+  </description>
 </property>
{code}
Since we are using time units here, the value can also be set with a time suffix 
like 3ms, or you could set it in seconds for better readability. 
I also think we can remove ms from the configuration name and description, as it 
will support all time units.
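
To illustrate the suggestion, here is a minimal JDK-only sketch of reading a duration that may carry a unit suffix. It is not Hadoop code: the class {{DurationSuffixSketch}} and its method are hypothetical, and the choice to treat a bare number as milliseconds is an assumption made for the example. In Hadoop itself, {{Configuration#getTimeDuration}} is the usual way to read such values.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Hadoop: parses values such as "5m" or "300s".
final class DurationSuffixSketch {

  // "ms" is listed first so that it is matched before the shorter "s" and "m".
  private static final String[] SUFFIXES = {"ms", "s", "m", "h", "d"};
  private static final TimeUnit[] UNITS = {
      TimeUnit.MILLISECONDS, TimeUnit.SECONDS, TimeUnit.MINUTES,
      TimeUnit.HOURS, TimeUnit.DAYS};

  /** Returns the duration in milliseconds; a bare number is assumed to be milliseconds. */
  static long toMillis(String raw) {
    String value = raw.trim().toLowerCase();
    for (int i = 0; i < SUFFIXES.length; i++) {
      if (value.endsWith(SUFFIXES[i])) {
        String number = value.substring(0, value.length() - SUFFIXES[i].length());
        return UNITS[i].toMillis(Long.parseLong(number.trim()));
      }
    }
    return Long.parseLong(value);
  }

  public static void main(String[] args) {
    // The same expiry expressed in minutes, in seconds, and as a bare number.
    System.out.println(toMillis("5m"));
    System.out.println(toMillis("300s"));
    System.out.println(toMillis("300000"));
  }
}
```

With suffix support, the key name no longer needs to end in ms, because the unit travels with the value.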

Avoid tweaking the imports:
{code:java}
-import java.util.concurrent.ConcurrentLinkedQueue;
-import java.util.concurrent.ExecutionException;
-import java.util.concurrent.ForkJoinPool;
-import java.util.concurrent.ForkJoinTask;
-import java.util.concurrent.RecursiveAction;
+import java.util.concurrent.*;
{code}

> Support to save replica cached files to other place and make expired time 
> configurable
> --
>
> Key: HDFS-15033
> URL: https://issues.apache.org/jira/browse/HDFS-15033
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15033.patch
>
>
> For a slow volume with many replicas, add an option to save the replica files 
> to a high-speed disk to speed up the saving.
> Also add an option to change the expiry time of the replica file.






[jira] [Created] (HDFS-15198) In Secure Mode, Router can't refresh other router's mountTableEntries

2020-02-27 Thread zhengchenyu (Jira)
zhengchenyu created HDFS-15198:
--

 Summary: In Secure Mode, Router can't refresh other router's 
mountTableEntries
 Key: HDFS-15198
 URL: https://issues.apache.org/jira/browse/HDFS-15198
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: rbf
Reporter: zhengchenyu
 Fix For: 3.3.0, HDFS-13891


HDFS-13443 made mount table cache updates immediate. The specified router 
updates its own mount table cache immediately, then updates the others' through 
the refreshMountTableEntries RPC. In secure mode, however, it cannot refresh the 
other routers' caches. The specified router's log shows an error like this:

{code}

2020-02-27 22:59:07,212 WARN org.apache.hadoop.ipc.Client: Exception 
encountered while connecting to the server : javax.security.sasl.SaslException: 
GSS initiate failed [Caused by GSSException: No valid credentials provided 
(Mechanism level: Failed to find any Kerberos tgt)]
2020-02-27 22:59:07,213 ERROR 
org.apache.hadoop.hdfs.server.federation.router.MountTableRefresherThread: 
Failed to refresh mount table entries cache at router $host:8111
java.io.IOException: DestHost:destPort host:8111 , LocalHost:localPort 
$host/$ip:0. Failed on local exception: java.io.IOException: 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]
at 
org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocolTranslatorPB.refreshMountTableEntries(RouterAdminProtocolTranslatorPB.java:288)
at 
org.apache.hadoop.hdfs.server.federation.router.MountTableRefresherThread.run(MountTableRefresherThread.java:65)
2020-02-27 22:59:07,214 INFO 
org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver: Added new 
mount point /test_11 to resolver

{code}






[jira] [Commented] (HDFS-15155) writeIoRate of DataNodeVolumeMetrics is never used

2020-02-27 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047182#comment-17047182
 ] 

Haibin Huang commented on HDFS-15155:
-

OK, I will update the patch soon.

> writeIoRate of DataNodeVolumeMetrics is never used
> --
>
> Key: HDFS-15155
> URL: https://issues.apache.org/jira/browse/HDFS-15155
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-15155.001.patch, HDFS-15155.002.patch
>
>
> DataNodeVolumeMetrics uses the wrong object in its write-IO getters: writeIoRate is 
> never used, and syncIoRate should be replaced by writeIoRate in the following 
> code:
> {code:java}
> // Based on writeIoRate
> public long getWriteIoSampleCount() {
>   return syncIoRate.lastStat().numSamples();
> }
> public double getWriteIoMean() {
>   return syncIoRate.lastStat().mean();
> }
> public double getWriteIoStdDev() {
>   return syncIoRate.lastStat().stddev();
> }
> {code}
>  
>  
>  
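
The replacement described in the issue can be sketched as follows. This is a self-contained illustration, not the attached patch: {{StatSketch}}, {{RateSketch}} and {{VolumeMetricsSketch}} are hypothetical stand-ins for the Hadoop metrics types, kept only so the snippet compiles on its own.

```java
// Stand-in for the per-interval statistics object (hypothetical).
final class StatSketch {
  private final long numSamples;
  private final double mean;
  private final double stddev;

  StatSketch(long numSamples, double mean, double stddev) {
    this.numSamples = numSamples;
    this.mean = mean;
    this.stddev = stddev;
  }

  long numSamples() { return numSamples; }
  double mean() { return mean; }
  double stddev() { return stddev; }
}

// Stand-in for a rate metric that exposes its last statistics (hypothetical).
final class RateSketch {
  private final StatSketch last;

  RateSketch(StatSketch last) { this.last = last; }

  StatSketch lastStat() { return last; }
}

final class VolumeMetricsSketch {
  private final RateSketch syncIoRate;
  private final RateSketch writeIoRate;

  VolumeMetricsSketch(RateSketch syncIoRate, RateSketch writeIoRate) {
    this.syncIoRate = syncIoRate;
    this.writeIoRate = writeIoRate;
  }

  // Based on writeIoRate: each getter now reads writeIoRate instead of syncIoRate.
  long getWriteIoSampleCount() { return writeIoRate.lastStat().numSamples(); }
  double getWriteIoMean() { return writeIoRate.lastStat().mean(); }
  double getWriteIoStdDev() { return writeIoRate.lastStat().stddev(); }

  // The sync getter keeps reading syncIoRate, so the two metrics stay distinct.
  long getSyncIoSampleCount() { return syncIoRate.lastStat().numSamples(); }
}
```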






[jira] [Commented] (HDFS-15196) RBF: RouterRpcServer getListing cannot list large dirs correctly

2020-02-27 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047180#comment-17047180
 ] 

Ayush Saxena commented on HDFS-15196:
-

Thanks [~fengnanli] for the patch. 
You can add a test to, say, {{TestRouterRpc}} or any other RBF test class 
rather than tweaking the common tests.

> RBF: RouterRpcServer getListing cannot list large dirs correctly
> 
>
> Key: HDFS-15196
> URL: https://issues.apache.org/jira/browse/HDFS-15196
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Critical
> Attachments: HDFS-15196.001.patch
>
>
> In RouterRpcServer, the getListing function is handled in two steps:
>  # Union all partial listings from the destination ns + paths
>  # Append mount points for the dir to be listed
> For a large dir with more entries than DFSConfigKeys.DFS_LIST_LIMIT 
> (default value 1k), batch listing is used and startAfter 
> defines the boundary of each batch. However, step 2 
> adds existing mount points to the batch, which corrupts the batch boundary 
> and makes the startAfter of the next batch wrong.
> The fix is to append the mount points only when no further batch query 
> is necessary.
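
The fix described above can be sketched with a small in-memory model. This is not the RouterRpcServer code: {{BatchListingSketch}}, its {{Batch}} type and the {{list}} method are hypothetical, names are plain strings, and the sketch ignores the case where the appended mount points would push the last page over the limit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory model of batched directory listing.
final class BatchListingSketch {

  /** One page of a listing plus the number of entries still to be fetched. */
  static final class Batch {
    final List<String> names;
    final int remaining;

    Batch(List<String> names, int remaining) {
      this.names = names;
      this.remaining = remaining;
    }
  }

  /**
   * Returns up to limit names that sort after startAfter.
   * Mount points are appended only to the final page (remaining == 0),
   * so the last name of every earlier page is a valid startAfter cursor.
   */
  static Batch list(List<String> sortedNames, List<String> mountPoints,
      String startAfter, int limit) {
    List<String> page = new ArrayList<>();
    int remaining = 0;
    for (String name : sortedNames) {
      if (name.compareTo(startAfter) <= 0) {
        continue; // already returned in an earlier batch
      }
      if (page.size() < limit) {
        page.add(name);
      } else {
        remaining++;
      }
    }
    if (remaining == 0) {
      for (String mountPoint : mountPoints) {
        if (!page.contains(mountPoint)) {
          page.add(mountPoint);
        }
      }
    }
    return new Batch(page, remaining);
  }

  public static void main(String[] args) {
    List<String> names = Arrays.asList("a", "b", "c", "d", "e");
    List<String> mounts = Arrays.asList("mnt");
    List<String> all = new ArrayList<>();
    String cursor = "";
    Batch batch;
    do {
      batch = list(names, mounts, cursor, 2);
      all.addAll(batch.names);
      if (batch.remaining > 0) {
        // The cursor is always a real directory entry, never a mount point.
        cursor = batch.names.get(batch.names.size() - 1);
      }
    } while (batch.remaining > 0);
    System.out.println(all);
  }
}
```

If the mount point were appended to the first page instead, the client would use it as the next cursor and skip every entry that sorts before it, which is the failure mode the issue describes.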






[jira] [Commented] (HDFS-15155) writeIoRate of DataNodeVolumeMetrics is never used

2020-02-27 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047179#comment-17047179
 ] 

Íñigo Goiri commented on HDFS-15155:


I do think we need some asserts; right now we are not really checking whether the 
values make sense.
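
As a rough idea of what such checks could look like, here is a JDK-only sketch. The class {{WriteIoSanitySketch}} and the thresholds are assumptions made for illustration; a real test would use the project's JUnit asserts against the actual metrics object.

```java
// Hypothetical sanity checks for the three write-IO values reported by the metrics.
final class WriteIoSanitySketch {

  /** Throws AssertionError if the values cannot describe a real set of samples. */
  static void check(long sampleCount, double mean, double stdDev) {
    if (sampleCount <= 0) {
      throw new AssertionError("expected at least one write sample, got " + sampleCount);
    }
    if (Double.isNaN(mean) || mean < 0) {
      throw new AssertionError("write IO mean should be a non-negative number, got " + mean);
    }
    if (Double.isNaN(stdDev) || stdDev < 0) {
      throw new AssertionError("write IO stddev should be a non-negative number, got " + stdDev);
    }
  }
}
```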

> writeIoRate of DataNodeVolumeMetrics is never used
> --
>
> Key: HDFS-15155
> URL: https://issues.apache.org/jira/browse/HDFS-15155
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-15155.001.patch, HDFS-15155.002.patch
>
>
> DataNodeVolumeMetrics uses the wrong object in its write-IO getters: writeIoRate is 
> never used, and syncIoRate should be replaced by writeIoRate in the following 
> code:
> {code:java}
> // Based on writeIoRate
> public long getWriteIoSampleCount() {
>   return syncIoRate.lastStat().numSamples();
> }
> public double getWriteIoMean() {
>   return syncIoRate.lastStat().mean();
> }
> public double getWriteIoStdDev() {
>   return syncIoRate.lastStat().stddev();
> }
> {code}
>  
>  
>  






[jira] [Commented] (HDFS-15196) RBF: RouterRpcServer getListing cannot list large dirs correctly

2020-02-27 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047178#comment-17047178
 ] 

Íñigo Goiri commented on HDFS-15196:


The tests don't look very happy.
Minor comments:
* Avoid changes in RouterRpcServer.
* Can we do the test without using touchf in AbstractContractGetFileStatusTest?
* I don't think we should do the test in TestRouterHDFSContractGetFileStatus as 
it overwrites the standard one.

> RBF: RouterRpcServer getListing cannot list large dirs correctly
> 
>
> Key: HDFS-15196
> URL: https://issues.apache.org/jira/browse/HDFS-15196
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Critical
> Attachments: HDFS-15196.001.patch
>
>
> In RouterRpcServer, the getListing function is handled in two steps:
>  # Union all partial listings from the destination ns + paths
>  # Append mount points for the dir to be listed
> For a large dir with more entries than DFSConfigKeys.DFS_LIST_LIMIT 
> (default value 1k), batch listing is used and startAfter 
> defines the boundary of each batch. However, step 2 
> adds existing mount points to the batch, which corrupts the batch boundary 
> and makes the startAfter of the next batch wrong.
> The fix is to append the mount points only when no further batch query 
> is necessary.






[jira] [Updated] (HDFS-15196) RBF: RouterRpcServer getListing cannot list large dirs correctly

2020-02-27 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HDFS-15196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-15196:
---
Summary: RBF: RouterRpcServer getListing cannot list large dirs correctly  
(was: RouterRpcServer getListing cannot list large dirs correctly)

> RBF: RouterRpcServer getListing cannot list large dirs correctly
> 
>
> Key: HDFS-15196
> URL: https://issues.apache.org/jira/browse/HDFS-15196
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Critical
> Attachments: HDFS-15196.001.patch
>
>
> In RouterRpcServer, the getListing function is handled in two steps:
>  # Union all partial listings from the destination ns + paths
>  # Append mount points for the dir to be listed
> For a large dir with more entries than DFSConfigKeys.DFS_LIST_LIMIT 
> (default value 1k), batch listing is used and startAfter 
> defines the boundary of each batch. However, step 2 
> adds existing mount points to the batch, which corrupts the batch boundary 
> and makes the startAfter of the next batch wrong.
> The fix is to append the mount points only when no further batch query 
> is necessary.






[jira] [Commented] (HDFS-15033) Support to save replica cached files to other place and make expired time configurable

2020-02-27 Thread Yang Yun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047173#comment-17047173
 ] 

Yang Yun commented on HDFS-15033:
-

Thanks [~elgoiri] for the review.

Added the time suffix in hdfs-default.xml and now read the time value directly in 
millis.

> Support to save replica cached files to other place and make expired time 
> configurable
> --
>
> Key: HDFS-15033
> URL: https://issues.apache.org/jira/browse/HDFS-15033
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15033.patch
>
>
> For a slow volume with many replicas, add an option to save the replica files 
> to a high-speed disk to speed up the saving.
> Also add an option to change the expiry time of the replica file.






[jira] [Updated] (HDFS-15033) Support to save replica cached files to other place and make expired time configurable

2020-02-27 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15033:

Attachment: (was: HDFS-15033.patch)

> Support to save replica cached files to other place and make expired time 
> configurable
> --
>
> Key: HDFS-15033
> URL: https://issues.apache.org/jira/browse/HDFS-15033
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15033.patch
>
>
> For a slow volume with many replicas, add an option to save the replica files 
> to a high-speed disk to speed up the saving.
> Also add an option to change the expiry time of the replica file.






[jira] [Updated] (HDFS-15033) Support to save replica cached files to other place and make expired time configurable

2020-02-27 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15033:

Attachment: (was: HDFS-15033.patch)

> Support to save replica cached files to other place and make expired time 
> configurable
> --
>
> Key: HDFS-15033
> URL: https://issues.apache.org/jira/browse/HDFS-15033
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15033.patch
>
>
> For a slow volume with many replicas, add an option to save the replica files 
> to a high-speed disk to speed up the saving.
> Also add an option to change the expiry time of the replica file.






[jira] [Updated] (HDFS-15033) Support to save replica cached files to other place and make expired time configurable

2020-02-27 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15033:

Attachment: (was: HDFS-15033.patch)

> Support to save replica cached files to other place and make expired time 
> configurable
> --
>
> Key: HDFS-15033
> URL: https://issues.apache.org/jira/browse/HDFS-15033
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15033.patch
>
>
> For a slow volume with many replicas, add an option to save the replica files 
> to a high-speed disk to speed up the saving.
> Also add an option to change the expiry time of the replica file.






[jira] [Updated] (HDFS-15033) Support to save replica cached files to other place and make expired time configurable

2020-02-27 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15033:

Status: Open  (was: Patch Available)

> Support to save replica cached files to other place and make expired time 
> configurable
> --
>
> Key: HDFS-15033
> URL: https://issues.apache.org/jira/browse/HDFS-15033
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15033.patch, HDFS-15033.patch, HDFS-15033.patch, 
> HDFS-15033.patch
>
>
> For a slow volume with many replicas, add an option to save the replica files 
> to a high-speed disk to speed up the saving.
> Also add an option to change the expiry time of the replica file.






[jira] [Updated] (HDFS-15033) Support to save replica cached files to other place and make expired time configurable

2020-02-27 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15033:

Attachment: HDFS-15033.patch
Status: Patch Available  (was: Open)

> Support to save replica cached files to other place and make expired time 
> configurable
> --
>
> Key: HDFS-15033
> URL: https://issues.apache.org/jira/browse/HDFS-15033
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15033.patch, HDFS-15033.patch, HDFS-15033.patch, 
> HDFS-15033.patch
>
>
> For a slow volume with many replicas, add an option to save the replica files 
> to a high-speed disk to speed up the saving.
> Also add an option to change the expiry time of the replica file.






[jira] [Comment Edited] (HDFS-15155) writeIoRate of DataNodeVolumeMetrics is never used

2020-02-27 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047171#comment-17047171
 ] 

Haibin Huang edited comment on HDFS-15155 at 2/28/20 3:05 AM:
--

[~elgoiri], thanks for reviewing this patch. I think 
TestDataNodeVolumeMetrics#testVolumeMetrics already checks the writeIo metrics, 
but this line needs to be removed before building the MiniDFSCluster:

 
{code:java}
SimulatedFSDataset.setFactory(conf);
{code}
You will then see the difference in 
TestDataNodeVolumeMetrics#verifyDataNodeVolumeMetrics after applying this 
patch. Focus on these output lines:
{code:java}
LOG.info("writeIoSampleCount : " + metrics.getWriteIoSampleCount());
LOG.info("writeIoMean : " + metrics.getWriteIoMean());
LOG.info("writeIoStdDev : " + metrics.getWriteIoStdDev());
{code}
If more asserts are needed, I will update the patch soon.

 


was (Author: huanghaibin):
I think TestDataNodeVolumeMetrics#testVolumeMetrics already checks the writeIo 
metrics, but this line needs to be removed before building the MiniDFSCluster:

 
{code:java}
SimulatedFSDataset.setFactory(conf);
{code}
You will then see the difference in 
TestDataNodeVolumeMetrics#verifyDataNodeVolumeMetrics after applying this 
patch. Focus on these output lines:
{code:java}
LOG.info("writeIoSampleCount : " + metrics.getWriteIoSampleCount());
LOG.info("writeIoMean : " + metrics.getWriteIoMean());
LOG.info("writeIoStdDev : " + metrics.getWriteIoStdDev());
{code}
If more asserts are needed, I will update the patch soon. Thanks for reviewing.

 

> writeIoRate of DataNodeVolumeMetrics is never used
> --
>
> Key: HDFS-15155
> URL: https://issues.apache.org/jira/browse/HDFS-15155
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-15155.001.patch, HDFS-15155.002.patch
>
>
> DataNodeVolumeMetrics uses the wrong object in its write-IO getters: writeIoRate is 
> never used, and syncIoRate should be replaced by writeIoRate in the following 
> code:
> {code:java}
> // Based on writeIoRate
> public long getWriteIoSampleCount() {
>   return syncIoRate.lastStat().numSamples();
> }
> public double getWriteIoMean() {
>   return syncIoRate.lastStat().mean();
> }
> public double getWriteIoStdDev() {
>   return syncIoRate.lastStat().stddev();
> }
> {code}
>  
>  
>  






[jira] [Commented] (HDFS-15155) writeIoRate of DataNodeVolumeMetrics is never used

2020-02-27 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047171#comment-17047171
 ] 

Haibin Huang commented on HDFS-15155:
-

I think TestDataNodeVolumeMetrics#testVolumeMetrics already checks the 
writeIo metrics, but you need to remove this line before building the 
MiniDFSCluster:

{code:java}
SimulatedFSDataset.setFactory(conf);
{code}
After applying this patch you will see the difference in 
TestDataNodeVolumeMetrics#verifyDataNodeVolumeMetrics. Focus on these output 
lines:
{code:java}
LOG.info("writeIoSampleCount : " + metrics.getWriteIoSampleCount());
LOG.info("writeIoMean : " + metrics.getWriteIoMean());
LOG.info("writeIoStdDev : " + metrics.getWriteIoStdDev());
{code}
If more asserts are needed, I will update soon. Thanks for reviewing.







[jira] [Commented] (HDFS-15196) RouterRpcServer getListing cannot list large dirs correctly

2020-02-27 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047165#comment-17047165
 ] 

Hadoop QA commented on HDFS-15196:
--

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 53s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 1m 14s | Maven dependency ordering for branch |
| +1 | mvninstall | 22m 32s | trunk passed |
| +1 | compile | 16m 55s | trunk passed |
| +1 | checkstyle | 2m 41s | trunk passed |
| +1 | mvnsite | 1m 58s | trunk passed |
| +1 | shadedclient | 20m 28s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 4m 10s | trunk passed |
| +1 | javadoc | 1m 49s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 25s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 26s | the patch passed |
| +1 | compile | 22m 52s | the patch passed |
| +1 | javac | 22m 52s | the patch passed |
| -0 | checkstyle | 3m 20s | root: The patch generated 1 new + 6 unchanged - 0 fixed = 7 total (was 6) |
| +1 | mvnsite | 2m 34s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 17m 43s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 4m 37s | the patch passed |
| +1 | javadoc | 1m 58s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 17m 35s | hadoop-common in the patch passed. |
| -1 | unit | 18m 42s | hadoop-hdfs-rbf in the patch passed. |
| +1 | asflicense | 0m 58s | The patch does not generate ASF License warnings. |
| | | 164m 9s | |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.metrics2.source.TestJvmMetrics |
|   | hadoop.ipc.TestRetryCache |
|   | hadoop.hdfs.server.federation.router.TestRouterQuota |
|   | hadoop.hdfs.server.federation.router.TestRouterMultiRack |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15196 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12994827/HDFS-15196.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux b09a3dd8905f 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git 

[jira] [Commented] (HDFS-15188) Add option to set Write/Read timeout extension for different StorageType

2020-02-27 Thread Yang Yun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047159#comment-17047159
 ] 

Yang Yun commented on HDFS-15188:
-

Thanks [~elgoiri] for the review.

Fixed the checkstyle warnings. Yes, it should be getReadTimeout().

Please help review it again.


> Add option to set Write/Read timeout extension for different StorageType
> 
>
> Key: HDFS-15188
> URL: https://issues.apache.org/jira/browse/HDFS-15188
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, dfsclient
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15188.patch, HDFS-15188.patch
>
>
> Different storage types have different speeds. Especially for low-speed 
> Archive volumes, errors are often reported under the current timeout. Add a 
> unified solution to set options for different StorageTypes.
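As a rough sketch of the idea only (not the attached patch): a per-storage-type extension could be added on top of a base timeout. `StorageKind`, `TimeoutExtensionSketch`, and the millisecond values are all hypothetical names and numbers chosen for this example; no real HDFS configuration keys are implied.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical illustration; names are not taken from the HDFS code base.
final class TimeoutExtensionSketch {
  // Local stand-in enum, deliberately not Hadoop's StorageType.
  enum StorageKind { DISK, SSD, ARCHIVE }

  private final int baseTimeoutMs;
  private final Map<StorageKind, Integer> extensionMs =
      new EnumMap<>(StorageKind.class);

  TimeoutExtensionSketch(int baseTimeoutMs) {
    this.baseTimeoutMs = baseTimeoutMs;
  }

  // Configure how much extra time a given storage kind gets.
  void setExtension(StorageKind kind, int extraMs) {
    extensionMs.put(kind, extraMs);
  }

  // Effective read timeout: base plus the extension, if one is configured.
  int readTimeoutFor(StorageKind kind) {
    return baseTimeoutMs + extensionMs.getOrDefault(kind, 0);
  }
}
```

Storage kinds without a configured extension keep the base timeout, so existing behaviour is unchanged unless an extension is set.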






[jira] [Updated] (HDFS-15188) Add option to set Write/Read timeout extension for different StorageType

2020-02-27 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15188:

Attachment: HDFS-15188.patch
Status: Patch Available  (was: Open)







[jira] [Updated] (HDFS-15188) Add option to set Write/Read timeout extension for different StorageType

2020-02-27 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15188:

Status: Open  (was: Patch Available)

> Add option to set Write/Read timeout extension for different StorageType
> 
>
> Key: HDFS-15188
> URL: https://issues.apache.org/jira/browse/HDFS-15188
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, dfsclient
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15188.patch
>
>
> Different storage types have different speeds. Especially for low-speed 
> Archive volumes, errors are often reported under the current timeout. Add a 
> unified solution to set options for different StorageTypes.






[jira] [Commented] (HDFS-12090) Handling writes from HDFS to Provided storages

2020-02-27 Thread Uma Maheswara Rao G (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047136#comment-17047136
 ] 

Uma Maheswara Rao G commented on HDFS-12090:


Hi [~ehiggs] / [~Thomas Demoor]

Do you have an updated design doc? [~PhiloHe] is interested in spending his 
time on this task. It would be great if you could post the latest docs based 
on the current state of this work.

> Handling writes from HDFS to Provided storages
> --
>
> Key: HDFS-12090
> URL: https://issues.apache.org/jira/browse/HDFS-12090
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Virajith Jalaparti
>Priority: Major
> Attachments: External-SyncService-CreateFile.001.png, 
> HDFS-12090-Functional-Specification.001.pdf, 
> HDFS-12090-Functional-Specification.002.pdf, 
> HDFS-12090-Functional-Specification.003.pdf, HDFS-12090-design.001.pdf, 
> HDFS-12090..patch, HDFS-12090.0001.patch
>
>
> HDFS-9806 introduces the concept of {{PROVIDED}} storage, which makes data in 
> external storage systems accessible through HDFS. However, HDFS-9806 is 
> limited to data being read through HDFS. This JIRA will deal with how data 
> can be written to such {{PROVIDED}} storages from HDFS.






[jira] [Commented] (HDFS-15197) [SBN read] Change ObserverRetryOnActiveException log to debug

2020-02-27 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047121#comment-17047121
 ] 

Hadoop QA commented on HDFS-15197:
--

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 2m 47s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 24m 6s | trunk passed |
| +1 | compile | 0m 52s | trunk passed |
| +1 | checkstyle | 0m 27s | trunk passed |
| +1 | mvnsite | 0m 57s | trunk passed |
| +1 | shadedclient | 16m 55s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 41s | trunk passed |
| +1 | javadoc | 0m 37s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 54s | the patch passed |
| +1 | compile | 0m 48s | the patch passed |
| +1 | javac | 0m 48s | the patch passed |
| -0 | checkstyle | 0m 19s | hadoop-hdfs-project/hadoop-hdfs-client: The patch generated 1 new + 1 unchanged - 0 fixed = 2 total (was 1) |
| +1 | mvnsite | 0m 57s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 16m 41s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 44s | the patch passed |
| +1 | javadoc | 0m 34s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 2m 15s | hadoop-hdfs-client in the patch passed. |
| +1 | asflicense | 0m 37s | The patch does not generate ASF License warnings. |
| | | 76m 39s | |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15197 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12994828/HDFS-15197.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 6d849ce3dc83 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a43510e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_242 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/28866/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs-client.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28866/testReport/ |
| Max. process+thread count | 337 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-client U: 

[jira] [Updated] (HDFS-15113) Missing IBR when NameNode restart if open processCommand async feature

2020-02-27 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-15113:
---
Target Version/s: 3.3.0

> Missing IBR when NameNode restart if open processCommand async feature
> --
>
> Key: HDFS-15113
> URL: https://issues.apache.org/jira/browse/HDFS-15113
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Blocker
> Attachments: HDFS-15113.001.patch, HDFS-15113.002.patch, 
> HDFS-15113.003.patch
>
>
> Recently I hit a case where the NameNode was missing blocks after a restart, 
> which is related to HDFS-14997.
> a. During a NameNode restart, it returns the `DNA_REGISTER` command to a 
> DataNode when it receives an RPC request from that DataNode.
> b. When the DataNode receives the `DNA_REGISTER` command, it runs 
> #reRegister asynchronously.
> {code:java}
>   void reRegister() throws IOException {
>     if (shouldRun()) {
>       // re-retrieve namespace info to make sure that, if the NN
>       // was restarted, we still match its version (HDFS-2120)
>       NamespaceInfo nsInfo = retrieveNamespaceInfo();
>       // and re-register
>       register(nsInfo);
>       scheduler.scheduleHeartbeat();
>       // HDFS-9917,Standby NN IBR can be very huge if standby namenode is down
>       // for sometime.
>       if (state == HAServiceState.STANDBY
>           || state == HAServiceState.OBSERVER) {
>         ibrManager.clearIBRs();
>       }
>     }
>   }
> {code}
> c. As we know, #register triggers a block report (BR) immediately.
> d. Because #reRegister runs asynchronously, there is no guarantee which 
> runs first: sending the full block report (FBR) or clearing the IBRs. If 
> the IBRs are cleared first, everything is fine. But if the FBR is sent first 
> and the IBRs are cleared afterwards, the blocks received between these two 
> points in time are missing on the NameNode until the next FBR.
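The ordering problem in step d can be illustrated with a small single-threaded simulation. This is a sketch under simplifying assumptions: `IbrOrderingSketch` and all of its members are hypothetical, the NameNode and DataNode are reduced to plain sets, and the two possible interleavings are replayed deterministically rather than raced on real threads.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the FBR / clear-IBR ordering; not HDFS code.
final class IbrOrderingSketch {
  // Blocks the simulated NameNode knows about.
  final Set<String> nameNodeView = new HashSet<>();
  // Blocks actually stored on the simulated DataNode.
  final Set<String> stored = new HashSet<>();
  // Incremental block reports queued but not yet sent.
  final List<String> pendingIbrs = new ArrayList<>();

  void receiveBlock(String block) {
    stored.add(block);
    pendingIbrs.add(block);
  }

  void sendFullBlockReport() {
    nameNodeView.clear();
    nameNodeView.addAll(stored);
  }

  void clearIbrs() {
    pendingIbrs.clear();
  }

  void sendIbrs() {
    nameNodeView.addAll(pendingIbrs);
    pendingIbrs.clear();
  }

  // Replays one interleaving and returns blocks the NameNode does not know.
  static Set<String> missingAfter(boolean clearIbrsFirst) {
    IbrOrderingSketch dn = new IbrOrderingSketch();
    dn.receiveBlock("blk_1");
    if (clearIbrsFirst) {
      dn.clearIbrs();
      dn.receiveBlock("blk_2"); // arrives between the two steps
      dn.sendFullBlockReport();
    } else {
      dn.sendFullBlockReport();
      dn.receiveBlock("blk_2"); // arrives between the two steps
      dn.clearIbrs();           // drops the only report for blk_2
    }
    dn.sendIbrs();
    Set<String> missing = new HashSet<>(dn.stored);
    missing.removeAll(dn.nameNodeView);
    return missing;
  }
}
```

In this model, clearing first leaves nothing missing, while sending the FBR first leaves the block received in between unknown to the NameNode until another full report is sent.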






[jira] [Updated] (HDFS-15113) Missing IBR when NameNode restart if open processCommand async feature

2020-02-27 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-15113:
---
Priority: Blocker  (was: Major)







[jira] [Commented] (HDFS-15113) Missing IBR when NameNode restart if open processCommand async feature

2020-02-27 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047116#comment-17047116
 ] 

Wei-Chiu Chuang commented on HDFS-15113:


This has to be a blocker for 3.3.0. Updated jira to reflect the reality.







[jira] [Updated] (HDFS-13660) DistCp job fails when new data is appended in the file while the distCp copy job is running

2020-02-27 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-13660:
---
Fix Version/s: 3.2.2
   3.1.4

> DistCp job fails when new data is appended in the file while the distCp copy 
> job is running
> ---
>
> Key: HDFS-13660
> URL: https://issues.apache.org/jira/browse/HDFS-13660
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: distcp
>Reporter: Mukund Thakur
>Assignee: Mukund Thakur
>Priority: Critical
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: distcp_failure_when_file_append.log
>
>
> Steps to reproduce: 
> Suppose a distcp MR job is copying the file 
> /tmp/web_returns_merged/data-m-002 and we append more data to this file 
> using the command 
> hadoop fs -appendToFile xaa  /tmp/web_returns_merged/data-m-002
> The job then fails with the exception 
>  Mismatch in length of 
> source:hdfs://mycluster0/tmp/web_returns_merged/data-m-002 and target.
> The logs are attached.






[jira] [Updated] (HDFS-12999) When reach the end of the block group, it may not need to flush all the data packets(flushAllInternals) twice.

2020-02-27 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-12999:
---
Fix Version/s: 3.2.2
   3.1.4

> When reach the end of the block group, it may not need to flush all the data 
> packets(flushAllInternals) twice. 
> ---
>
> Key: HDFS-12999
> URL: https://issues.apache.org/jira/browse/HDFS-12999
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: erasure-coding, hdfs-client
>Affects Versions: 3.0.0-beta1, 3.1.0
>Reporter: lufei
>Assignee: lufei
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-12999.001.patch, HDFS-12999.002.patch, 
> HDFS-12999.003.patch
>
>
> To simplify the process, there is no need to flush all the data packets 
> (flushAllInternals) twice when reaching the end of the block group.






[jira] [Updated] (HDFS-15068) DataNode could meet deadlock if invoke refreshVolumes when register

2020-02-27 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-15068:
---
Fix Version/s: 3.2.2
   3.1.4

> DataNode could meet deadlock if invoke refreshVolumes when register
> ---
>
> Key: HDFS-15068
> URL: https://issues.apache.org/jira/browse/HDFS-15068
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Xiaoqiao He
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-15068.001.patch, HDFS-15068.002.patch, 
> HDFS-15068.003.patch, HDFS-15068.004.patch, HDFS-15068.005.patch
>
>
> A DataNode can deadlock when `dfsadmin -reconfig datanode host:ipc_port 
> start` is invoked to trigger #refreshVolumes.
> 1. DataNode#refreshVolumes first takes the DataNode instance's ownable 
> {{synchronizer}} on entry, then tries to take the BPOfferService 
> {{readlock}} when calling `bpos.getNamespaceInfo()` in the following code 
> segment.
> {code:java}
> for (BPOfferService bpos : blockPoolManager.getAllNamenodeThreads()) {
>   nsInfos.add(bpos.getNamespaceInfo());
> }
> {code}
> 2. BPOfferService#registrationSucceeded (invoked by #register when the 
> DataNode starts, or by #reregister in processCommandFromActor) first takes 
> the BPOfferService {{writelock}}, then tries to take the DataNode instance's 
> ownable {{synchronizer}} in the following method. The two paths therefore 
> take the same two locks in opposite order, so each can block the other.
> {code:java}
>   synchronized void bpRegistrationSucceeded(
>       DatanodeRegistration bpRegistration, String blockPoolId)
>       throws IOException {
>     id = bpRegistration;
>     if (!storage.getDatanodeUuid().equals(bpRegistration.getDatanodeUuid())) {
>       throw new IOException("Inconsistent Datanode IDs. Name-node returned "
>           + bpRegistration.getDatanodeUuid()
>           + ". Expecting " + storage.getDatanodeUuid());
>     }
>
>     registerBlockPoolWithSecretManager(bpRegistration, blockPoolId);
>   }
> {code}
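The two acquisition orders described above can be made visible with a small sketch. Everything here is hypothetical (`LockOrderSketch`, `dataNodeMonitor`, `bposLock`, and the label strings are invented for illustration); the two paths are run one after the other on a single thread, so the sketch records the lock order without actually deadlocking.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model of the two lock-acquisition paths; not HDFS code.
final class LockOrderSketch {
  // Stands in for the DataNode instance monitor.
  private final Object dataNodeMonitor = new Object();
  // Stands in for the BPOfferService read/write lock.
  private final ReentrantReadWriteLock bposLock = new ReentrantReadWriteLock();

  // Path 1, modelled on refreshVolumes: monitor first, then read lock.
  List<String> refreshVolumesPath() {
    List<String> order = new ArrayList<>();
    synchronized (dataNodeMonitor) {
      order.add("datanode-monitor");
      bposLock.readLock().lock();
      try {
        order.add("bpos-readlock");
      } finally {
        bposLock.readLock().unlock();
      }
    }
    return order;
  }

  // Path 2, modelled on registrationSucceeded: write lock first, then monitor.
  List<String> registrationPath() {
    List<String> order = new ArrayList<>();
    bposLock.writeLock().lock();
    try {
      order.add("bpos-writelock");
      synchronized (dataNodeMonitor) {
        order.add("datanode-monitor");
      }
    } finally {
      bposLock.writeLock().unlock();
    }
    return order;
  }
}
```

Run concurrently, path 1 could hold the monitor while waiting for the read lock, and path 2 could hold the write lock while waiting for the monitor, which is the inversion the description reports.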






[jira] [Updated] (HDFS-15197) [SBN read] Change ObserverRetryOnActiveException log to debug

2020-02-27 Thread Chao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-15197:

Summary: [SBN read] Change ObserverRetryOnActiveException log to debug  
(was: Change ObserverRetryOnActiveException log to debug)

> [SBN read] Change ObserverRetryOnActiveException log to debug
> -
>
> Key: HDFS-15197
> URL: https://issues.apache.org/jira/browse/HDFS-15197
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Minor
> Attachments: HDFS-15197.001.patch
>
>
> Currently in ObserverReadProxyProvider, when an 
> ObserverRetryOnActiveException happens, ObserverReadProxyProvider logs a 
> message at INFO level. This can produce a large volume of logs in some 
> scenarios. For example, when a job tries to access lots of files that 
> haven't been accessed for a long time, all these accesses may trigger atime 
> updates, which lead to ObserverRetryOnActiveException. We should change this 
> log to DEBUG.
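The effect of lowering the level can be sketched as follows. This uses `java.util.logging` purely as a self-contained stand-in (its `FINE` level plays the role of DEBUG); the class name, logger name, and message text are hypothetical and not taken from ObserverReadProxyProvider.

```java
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical illustration of logging a retry at a debug-like level.
final class RetryLogSketch {
  static final Logger LOG = Logger.getLogger("RetryLogSketch");

  // Logged at FINE, the java.util.logging counterpart of DEBUG.
  static void onObserverRetry(String method) {
    LOG.log(Level.FINE, "Retrying {0} on the active NameNode", method);
  }

  // Counts how many records reach a handler when the logger runs at `level`.
  static int countPublished(Level level, int calls) {
    final int[] count = {0};
    Handler counter = new Handler() {
      @Override public void publish(LogRecord record) { count[0]++; }
      @Override public void flush() { }
      @Override public void close() { }
    };
    Level previous = LOG.getLevel();
    LOG.setUseParentHandlers(false);
    LOG.addHandler(counter);
    LOG.setLevel(level);
    try {
      for (int i = 0; i < calls; i++) {
        onObserverRetry("getFileInfo");
      }
    } finally {
      LOG.removeHandler(counter);
      LOG.setLevel(previous);
    }
    return count[0];
  }
}
```

With the logger at INFO, none of the retry messages are emitted; they only appear once the logger is switched to the debug-like level.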






[jira] [Commented] (HDFS-15154) Allow only hdfs superusers the ability to assign HDFS storage policies

2020-02-27 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047086#comment-17047086
 ] 

Hadoop QA commented on HDFS-15154:
--

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 32s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 6 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 17m 32s | trunk passed |
| +1 | compile | 1m 1s | trunk passed |
| +1 | checkstyle | 1m 7s | trunk passed |
| +1 | mvnsite | 1m 7s | trunk passed |
| +1 | shadedclient | 15m 23s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 37s | trunk passed |
| +1 | javadoc | 0m 46s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 0s | the patch passed |
| +1 | compile | 0m 57s | the patch passed |
| +1 | javac | 0m 57s | the patch passed |
| +1 | checkstyle | 0m 59s | hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 1062 unchanged - 1 fixed = 1062 total (was 1063) |
| +1 | mvnsite | 1m 1s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 2s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 12m 56s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 43s | the patch passed |
| +1 | javadoc | 0m 40s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 39m 5s | hadoop-hdfs in the patch passed. |
| +1 | asflicense | 0m 55s | The patch does not generate ASF License warnings. |
| | | 100m 16s | |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSMkdirs |
|   | hadoop.hdfs.TestParallelUnixDomainRead |
|   | hadoop.hdfs.TestDFSStripedOutputStream |
|   | hadoop.hdfs.TestReadStripedFileWithDNFailure |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15154 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12994820/HDFS-15154.09.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 6a40fa564098 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a43510e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_242 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28864/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
|  Test Results | 

[jira] [Commented] (HDFS-15197) Change ObserverRetryOnActiveException log to debug

2020-02-27 Thread Chao Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047081#comment-17047081
 ] 

Chao Sun commented on HDFS-15197:
-

+1

> Change ObserverRetryOnActiveException log to debug
> --
>
> Key: HDFS-15197
> URL: https://issues.apache.org/jira/browse/HDFS-15197
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Minor
> Attachments: HDFS-15197.001.patch
>
>
> Currently in ObserverReadProxyProvider, when an ObserverRetryOnActiveException 
> happens, ObserverReadProxyProvider logs a message at INFO level. This can 
> produce a large volume of logs in some scenarios. For example, when some job 
> tries to access lots of files that haven't been accessed for a long time, all 
> these accesses may trigger atime updates, which lead to 
> ObserverRetryOnActiveException. We should change this log to DEBUG.






[jira] [Updated] (HDFS-14731) [FGL] Remove redundant locking on NameNode.

2020-02-27 Thread Konstantin Shvachko (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-14731:
---
Fix Version/s: 3.2.2
   3.1.4
   3.3.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Committed this to branches 3.2 and 3.1 as well.

> [FGL] Remove redundant locking on NameNode.
> ---
>
> Key: HDFS-14731
> URL: https://issues.apache.org/jira/browse/HDFS-14731
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Konstantin Shvachko
>Assignee: Konstantin Shvachko
>Priority: Major
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-14731.001.patch
>
>
> Currently NameNode has two global locks: FSNamesystemLock and 
> FSDirectoryLock. An analysis shows that single FSNamesystemLock is sufficient 
> to guarantee consistency of the NameNode state. FSDirectoryLock can be 
> removed.






[jira] [Commented] (HDFS-15033) Support to save replica cached files to other place and make expired time configurable

2020-02-27 Thread Íñigo Goiri (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047075#comment-17047075
 ] 

Íñigo Goiri commented on HDFS-15033:


Can we use a time suffix in hdfs-default.xml? I'm guessing it is 300 seconds? 
With getTimeDuration you can get the value directly in millis.
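
To illustrate the suggestion, here is a minimal standalone sketch of reading a time-suffixed value directly in milliseconds. It does not use Hadoop's Configuration class; the class TimeSuffixSketch and the method toMillis are hypothetical names, and only a handful of suffixes (ms, s, m, h, d) are handled as an assumption.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, NOT Hadoop's Configuration: it only illustrates how a
// value written with a time suffix can be read directly in milliseconds.
class TimeSuffixSketch {

  // Longer suffixes come first so that "ms" is not mistaken for "s".
  private static final String[] SUFFIXES = {"ms", "s", "m", "h", "d"};
  private static final TimeUnit[] UNITS = {
      TimeUnit.MILLISECONDS, TimeUnit.SECONDS, TimeUnit.MINUTES,
      TimeUnit.HOURS, TimeUnit.DAYS};

  /** Parses "300s", "5m", "1500ms"; a bare number is read in defaultUnit. */
  static long toMillis(String value, TimeUnit defaultUnit) {
    String v = value.trim().toLowerCase();
    for (int i = 0; i < SUFFIXES.length; i++) {
      if (v.endsWith(SUFFIXES[i])) {
        String number = v.substring(0, v.length() - SUFFIXES[i].length());
        return UNITS[i].toMillis(Long.parseLong(number.trim()));
      }
    }
    // No suffix: fall back to the unit the caller assumes for bare numbers.
    return defaultUnit.toMillis(Long.parseLong(v));
  }

  public static void main(String[] args) {
    System.out.println(toMillis("300s", TimeUnit.SECONDS));
    System.out.println(toMillis("300", TimeUnit.SECONDS));
  }
}
```

With a suffix in the default value, a reader of hdfs-default.xml no longer has to guess the unit, and the caller receives milliseconds without converting by hand.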

> Support to save replica cached files to other place and make expired time 
> configurable
> --
>
> Key: HDFS-15033
> URL: https://issues.apache.org/jira/browse/HDFS-15033
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15033.patch, HDFS-15033.patch, HDFS-15033.patch
>
>
> For slow volumes with many replicas, add an option to save the replica files 
> to a high-speed disk and speed up the saving.
>  Also add an option to change the expire time of the replica file.






[jira] [Comment Edited] (HDFS-15196) RouterRpcServer getListing cannot list large dirs correctly

2020-02-27 Thread Chao Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047074#comment-17047074
 ] 

Chao Sun edited comment on HDFS-15196 at 2/28/20 12:05 AM:
---

+1. Patch LGTM but will be great if [~elgoiri] or others who're familiar with 
RBF can take a look.


was (Author: csun):
Patch LGTM but will be great if [~elgoiri] or others who're familiar with RBF 
can take a look.

> RouterRpcServer getListing cannot list large dirs correctly
> ---
>
> Key: HDFS-15196
> URL: https://issues.apache.org/jira/browse/HDFS-15196
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Critical
> Attachments: HDFS-15196.001.patch
>
>
> In RouterRpcServer, getListing function is handled as two parts:
>  # Union all partial listings from destination ns + paths
>  # Append mount points for the dir to be listed
> In the case of large dir which is bigger than DFSConfigKeys.DFS_LIST_LIMIT 
> (with default value 1k), the batch listing will be used and the startAfter 
> will be used to define the boundary of each batch listing. However, step 2 
> here will add existing mount points, which will mess up with the boundary of 
> the batch, thus making the next batch startAfter wrong.
> The fix is just to append the mount points when there is no more batch query 
> necessary.






[jira] [Commented] (HDFS-15196) RouterRpcServer getListing cannot list large dirs correctly

2020-02-27 Thread Chao Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047074#comment-17047074
 ] 

Chao Sun commented on HDFS-15196:
-

Patch LGTM but will be great if [~elgoiri] or others who're familiar with RBF 
can take a look.

> RouterRpcServer getListing cannot list large dirs correctly
> ---
>
> Key: HDFS-15196
> URL: https://issues.apache.org/jira/browse/HDFS-15196
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Critical
> Attachments: HDFS-15196.001.patch
>
>
> In RouterRpcServer, getListing function is handled as two parts:
>  # Union all partial listings from destination ns + paths
>  # Append mount points for the dir to be listed
> In the case of large dir which is bigger than DFSConfigKeys.DFS_LIST_LIMIT 
> (with default value 1k), the batch listing will be used and the startAfter 
> will be used to define the boundary of each batch listing. However, step 2 
> here will add existing mount points, which will mess up with the boundary of 
> the batch, thus making the next batch startAfter wrong.
> The fix is just to append the mount points when there is no more batch query 
> necessary.






[jira] [Commented] (HDFS-15188) Add option to set Write/Read timeout extension for different StorageType

2020-02-27 Thread Íñigo Goiri (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047070#comment-17047070
 ] 

Íñigo Goiri commented on HDFS-15188:


Thanks for the patch, [~hadoop_yangyun].
* Can you take care of the checkstyle warnings?
* Should it be getReadTimeout() in DataXceiver#810? The unit test should catch 
this issue.

> Add option to set Write/Read timeout extension for different StorageType
> 
>
> Key: HDFS-15188
> URL: https://issues.apache.org/jira/browse/HDFS-15188
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, dfsclient
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15188.patch
>
>
> Different storage types have different speeds. Especially for low-speed 
> Archive volume, errors are often reported under current timeout. Add a 
> unified solution to set options for different StorageType.






[jira] [Commented] (HDFS-15196) RouterRpcServer getListing cannot list large dirs correctly

2020-02-27 Thread Chao Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047069#comment-17047069
 ] 

Chao Sun commented on HDFS-15196:
-

Thanks Fengnan for the patch. Raising this to Critical since it is a 
correctness issue.

> RouterRpcServer getListing cannot list large dirs correctly
> ---
>
> Key: HDFS-15196
> URL: https://issues.apache.org/jira/browse/HDFS-15196
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Critical
> Attachments: HDFS-15196.001.patch
>
>
> In RouterRpcServer, getListing function is handled as two parts:
>  # Union all partial listings from destination ns + paths
>  # Append mount points for the dir to be listed
> In the case of large dir which is bigger than DFSConfigKeys.DFS_LIST_LIMIT 
> (with default value 1k), the batch listing will be used and the startAfter 
> will be used to define the boundary of each batch listing. However, step 2 
> here will add existing mount points, which will mess up with the boundary of 
> the batch, thus making the next batch startAfter wrong.
> The fix is just to append the mount points when there is no more batch query 
> necessary.






[jira] [Updated] (HDFS-15196) RouterRpcServer getListing cannot list large dirs correctly

2020-02-27 Thread Chao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HDFS-15196:

Priority: Critical  (was: Major)

> RouterRpcServer getListing cannot list large dirs correctly
> ---
>
> Key: HDFS-15196
> URL: https://issues.apache.org/jira/browse/HDFS-15196
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Critical
> Attachments: HDFS-15196.001.patch
>
>
> In RouterRpcServer, getListing function is handled as two parts:
>  # Union all partial listings from destination ns + paths
>  # Append mount points for the dir to be listed
> In the case of large dir which is bigger than DFSConfigKeys.DFS_LIST_LIMIT 
> (with default value 1k), the batch listing will be used and the startAfter 
> will be used to define the boundary of each batch listing. However, step 2 
> here will add existing mount points, which will mess up with the boundary of 
> the batch, thus making the next batch startAfter wrong.
> The fix is just to append the mount points when there is no more batch query 
> necessary.






[jira] [Updated] (HDFS-15197) Change ObserverRetryOnActiveException log to debug

2020-02-27 Thread Chen Liang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-15197:
--
Status: Patch Available  (was: Open)

> Change ObserverRetryOnActiveException log to debug
> --
>
> Key: HDFS-15197
> URL: https://issues.apache.org/jira/browse/HDFS-15197
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Minor
> Attachments: HDFS-15197.001.patch
>
>
> Currently in ObserverReadProxyProvider, when an ObserverRetryOnActiveException 
> happens, ObserverReadProxyProvider logs a message at INFO level. This can 
> produce a large volume of logs in some scenarios. For example, when some job 
> tries to access lots of files that haven't been accessed for a long time, all 
> these accesses may trigger atime updates, which lead to 
> ObserverRetryOnActiveException. We should change this log to DEBUG.






[jira] [Updated] (HDFS-15197) Change ObserverRetryOnActiveException log to debug

2020-02-27 Thread Chen Liang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-15197:
--
Attachment: HDFS-15197.001.patch

> Change ObserverRetryOnActiveException log to debug
> --
>
> Key: HDFS-15197
> URL: https://issues.apache.org/jira/browse/HDFS-15197
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Chen Liang
>Assignee: Chen Liang
>Priority: Minor
> Attachments: HDFS-15197.001.patch
>
>
> Currently in ObserverReadProxyProvider, when an ObserverRetryOnActiveException 
> happens, ObserverReadProxyProvider logs a message at INFO level. This can 
> produce a large volume of logs in some scenarios. For example, when some job 
> tries to access lots of files that haven't been accessed for a long time, all 
> these accesses may trigger atime updates, which lead to 
> ObserverRetryOnActiveException. We should change this log to DEBUG.






[jira] [Created] (HDFS-15197) Change ObserverRetryOnActiveException log to debug

2020-02-27 Thread Chen Liang (Jira)
Chen Liang created HDFS-15197:
-

 Summary: Change ObserverRetryOnActiveException log to debug
 Key: HDFS-15197
 URL: https://issues.apache.org/jira/browse/HDFS-15197
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Reporter: Chen Liang
Assignee: Chen Liang


Currently in ObserverReadProxyProvider, when an ObserverRetryOnActiveException 
happens, ObserverReadProxyProvider logs a message at INFO level. This can 
produce a large volume of logs in some scenarios. For example, when some job 
tries to access lots of files that haven't been accessed for a long time, all 
these accesses may trigger atime updates, which lead to 
ObserverRetryOnActiveException. We should change this log to DEBUG.
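
The effect of the proposed level change can be sketched as follows. This is a standalone illustration using java.util.logging, where FINE stands in for DEBUG; it is not the ObserverReadProxyProvider code, and the class RetryLogLevelSketch, the method simulate, and the message text are all hypothetical.

```java
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical sketch: counts how many records a logger emits when the same
// retry message is logged at INFO versus at FINE (the JUL analogue of DEBUG).
class RetryLogLevelSketch {
  private static final Logger LOG = Logger.getLogger("RetryLogLevelSketch");

  /** Handler that only counts the records it receives. */
  static class CountingHandler extends Handler {
    int count;
    @Override public void publish(LogRecord record) {
      if (isLoggable(record)) {
        count++;
      }
    }
    @Override public void flush() { }
    @Override public void close() { }
  }

  /** Logs `retries` messages at messageLevel; returns how many were emitted. */
  static int simulate(Level loggerLevel, Level messageLevel, int retries) {
    CountingHandler handler = new CountingHandler();
    handler.setLevel(Level.ALL);
    LOG.setUseParentHandlers(false); // keep the demo quiet on the console
    LOG.addHandler(handler);
    LOG.setLevel(loggerLevel);
    try {
      for (int i = 0; i < retries; i++) {
        LOG.log(messageLevel, "Retrying the call on the active NameNode");
      }
    } finally {
      LOG.removeHandler(handler);
    }
    return handler.count;
  }

  public static void main(String[] args) {
    System.out.println("INFO message: " + simulate(Level.INFO, Level.INFO, 1000));
    System.out.println("FINE message: " + simulate(Level.INFO, Level.FINE, 1000));
  }
}
```

With the logger left at its usual INFO threshold, every retry produces a record when logged at INFO and none when logged at the debug-equivalent level, which is the volume reduction the issue asks for.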






[jira] [Updated] (HDFS-15196) RouterRpcServer getListing cannot list large dirs correctly

2020-02-27 Thread Fengnan Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fengnan Li updated HDFS-15196:
--
Status: Patch Available  (was: Open)

[~inigoiri] Can you help take a look? Thanks very much!

Tagging [~sunchao] as well

> RouterRpcServer getListing cannot list large dirs correctly
> ---
>
> Key: HDFS-15196
> URL: https://issues.apache.org/jira/browse/HDFS-15196
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
> Attachments: HDFS-15196.001.patch
>
>
> In RouterRpcServer, getListing function is handled as two parts:
>  # Union all partial listings from destination ns + paths
>  # Append mount points for the dir to be listed
> In the case of large dir which is bigger than DFSConfigKeys.DFS_LIST_LIMIT 
> (with default value 1k), the batch listing will be used and the startAfter 
> will be used to define the boundary of each batch listing. However, step 2 
> here will add existing mount points, which will mess up with the boundary of 
> the batch, thus making the next batch startAfter wrong.
> The fix is just to append the mount points when there is no more batch query 
> necessary.






[jira] [Updated] (HDFS-15196) RouterRpcServer getListing cannot list large dirs correctly

2020-02-27 Thread Fengnan Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fengnan Li updated HDFS-15196:
--
Attachment: HDFS-15196.001.patch

> RouterRpcServer getListing cannot list large dirs correctly
> ---
>
> Key: HDFS-15196
> URL: https://issues.apache.org/jira/browse/HDFS-15196
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
> Attachments: HDFS-15196.001.patch
>
>
> In RouterRpcServer, getListing function is handled as two parts:
>  # Union all partial listings from destination ns + paths
>  # Append mount points for the dir to be listed
> In the case of large dir which is bigger than DFSConfigKeys.DFS_LIST_LIMIT 
> (with default value 1k), the batch listing will be used and the startAfter 
> will be used to define the boundary of each batch listing. However, step 2 
> here will add existing mount points, which will mess up with the boundary of 
> the batch, thus making the next batch startAfter wrong.
> The fix is just to append the mount points when there is no more batch query 
> necessary.






[jira] [Created] (HDFS-15196) RouterRpcServer getListing cannot list large dirs correctly

2020-02-27 Thread Fengnan Li (Jira)
Fengnan Li created HDFS-15196:
-

 Summary: RouterRpcServer getListing cannot list large dirs 
correctly
 Key: HDFS-15196
 URL: https://issues.apache.org/jira/browse/HDFS-15196
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Fengnan Li
Assignee: Fengnan Li


In RouterRpcServer, getListing function is handled as two parts:
 # Union all partial listings from destination ns + paths
 # Append mount points for the dir to be listed

In the case of large dir which is bigger than DFSConfigKeys.DFS_LIST_LIMIT 
(with default value 1k), the batch listing will be used and the startAfter will 
be used to define the boundary of each batch listing. However, step 2 here will 
add existing mount points, which will mess up with the boundary of the batch, 
thus making the next batch startAfter wrong.

The fix is just to append the mount points when there is no more batch query 
necessary.
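
The fix described above can be sketched in a simplified, standalone form. This is not the RouterRpcServer implementation: BatchListingSketch, Batch, getListing and listAll are hypothetical names, directory entries are plain strings in a sorted set, and the batch limit is assumed to be at least 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical model of batched listing: mount points are appended only to
// the final batch, so startAfter always refers to a real directory entry.
class BatchListingSketch {

  /** One batch of names plus a flag telling the client to ask for more. */
  static final class Batch {
    final List<String> names;
    final boolean hasMore;
    Batch(List<String> names, boolean hasMore) {
      this.names = names;
      this.hasMore = hasMore;
    }
  }

  /** Returns up to `limit` entries strictly after startAfter (limit >= 1). */
  static Batch getListing(TreeSet<String> entries, List<String> mountPoints,
                          String startAfter, int limit) {
    List<String> names = new ArrayList<>();
    boolean hasMore = false;
    for (String name : entries.tailSet(startAfter, false)) {
      if (names.size() == limit) {
        hasMore = true; // at least one entry is left for the next batch
        break;
      }
      names.add(name);
    }
    if (!hasMore) {
      // Last batch only: appending earlier would shift the batch boundary.
      for (String mountPoint : mountPoints) {
        if (!names.contains(mountPoint)) {
          names.add(mountPoint);
        }
      }
    }
    return new Batch(names, hasMore);
  }

  /** Client-side loop: keeps asking until no further batch is needed. */
  static List<String> listAll(TreeSet<String> entries,
                              List<String> mountPoints, int limit) {
    List<String> all = new ArrayList<>();
    String startAfter = "";
    while (true) {
      Batch batch = getListing(entries, mountPoints, startAfter, limit);
      all.addAll(batch.names);
      if (!batch.hasMore) {
        return all;
      }
      startAfter = batch.names.get(batch.names.size() - 1);
    }
  }

  public static void main(String[] args) {
    TreeSet<String> entries = new TreeSet<>(List.of("a", "b", "c", "d", "e"));
    System.out.println(listAll(entries, List.of("mnt"), 2));
  }
}
```

Because intermediate batches contain only directory entries, the last name of each batch is a valid startAfter for the next one, and every entry is returned exactly once.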






[jira] [Updated] (HDFS-15111) stopStandbyServices() should log which service state it is transitioning from.

2020-02-27 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-15111:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Cherrypicked to branch-3.1

> stopStandbyServices() should log which service state it is transitioning from.
> --
>
> Key: HDFS-15111
> URL: https://issues.apache.org/jira/browse/HDFS-15111
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, logging
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Xieming Li
>Priority: Major
>  Labels: newbie++
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-15111.001.patch, HDFS-15111.002.patch, 
> HDFS-15111.003.patch
>
>
> Trying to transition Observer to Standby state. {{stopStandbyServices()}} 
> logs that it is "Stopping services started for standby state". It should be 
> "Stopping services started for observer state"






[jira] [Updated] (HDFS-15111) stopStandbyServices() should log which service state it is transitioning from.

2020-02-27 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-15111:
---
Fix Version/s: 3.1.4

> stopStandbyServices() should log which service state it is transitioning from.
> --
>
> Key: HDFS-15111
> URL: https://issues.apache.org/jira/browse/HDFS-15111
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, logging
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Assignee: Xieming Li
>Priority: Major
>  Labels: newbie++
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-15111.001.patch, HDFS-15111.002.patch, 
> HDFS-15111.003.patch
>
>
> Trying to transition Observer to Standby state. {{stopStandbyServices()}} 
> logs that it is "Stopping services started for standby state". It should be 
> "Stopping services started for observer state"






[jira] [Commented] (HDFS-15154) Allow only hdfs superusers the ability to assign HDFS storage policies

2020-02-27 Thread Siddharth Wagle (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047016#comment-17047016
 ] 

Siddharth Wagle commented on HDFS-15154:


09 => rebased 08.

> Allow only hdfs superusers the ability to assign HDFS storage policies
> --
>
> Key: HDFS-15154
> URL: https://issues.apache.org/jira/browse/HDFS-15154
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Bob Cauthen
>Assignee: Siddharth Wagle
>Priority: Major
> Attachments: HDFS-15154.01.patch, HDFS-15154.02.patch, 
> HDFS-15154.03.patch, HDFS-15154.04.patch, HDFS-15154.05.patch, 
> HDFS-15154.06.patch, HDFS-15154.07.patch, HDFS-15154.08.patch, 
> HDFS-15154.09.patch
>
>
> Please provide a way to limit only HDFS superusers the ability to assign HDFS 
> Storage Policies to HDFS directories.
> Currently, and based on Jira HDFS-7093, all storage policies can be disabled 
> cluster wide by setting the following:
> dfs.storage.policy.enabled to false
> But we need a way to allow only HDFS superusers the ability to assign an HDFS 
> Storage Policy to an HDFS directory.






[jira] [Updated] (HDFS-15154) Allow only hdfs superusers the ability to assign HDFS storage policies

2020-02-27 Thread Siddharth Wagle (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle updated HDFS-15154:
---
Attachment: HDFS-15154.09.patch

> Allow only hdfs superusers the ability to assign HDFS storage policies
> --
>
> Key: HDFS-15154
> URL: https://issues.apache.org/jira/browse/HDFS-15154
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Bob Cauthen
>Assignee: Siddharth Wagle
>Priority: Major
> Attachments: HDFS-15154.01.patch, HDFS-15154.02.patch, 
> HDFS-15154.03.patch, HDFS-15154.04.patch, HDFS-15154.05.patch, 
> HDFS-15154.06.patch, HDFS-15154.07.patch, HDFS-15154.08.patch, 
> HDFS-15154.09.patch
>
>
> Please provide a way to limit only HDFS superusers the ability to assign HDFS 
> Storage Policies to HDFS directories.
> Currently, and based on Jira HDFS-7093, all storage policies can be disabled 
> cluster wide by setting the following:
> dfs.storage.policy.enabled to false
> But we need a way to allow only HDFS superusers the ability to assign an HDFS 
> Storage Policy to an HDFS directory.






[jira] [Updated] (HDFS-15154) Allow only hdfs superusers the ability to assign HDFS storage policies

2020-02-27 Thread Siddharth Wagle (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Wagle updated HDFS-15154:
---
Attachment: HDFS-15154.08.patch

> Allow only hdfs superusers the ability to assign HDFS storage policies
> --
>
> Key: HDFS-15154
> URL: https://issues.apache.org/jira/browse/HDFS-15154
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Bob Cauthen
>Assignee: Siddharth Wagle
>Priority: Major
> Attachments: HDFS-15154.01.patch, HDFS-15154.02.patch, 
> HDFS-15154.03.patch, HDFS-15154.04.patch, HDFS-15154.05.patch, 
> HDFS-15154.06.patch, HDFS-15154.07.patch, HDFS-15154.08.patch
>
>
> Please provide a way to limit only HDFS superusers the ability to assign HDFS 
> Storage Policies to HDFS directories.
> Currently, and based on Jira HDFS-7093, all storage policies can be disabled 
> cluster wide by setting the following:
> dfs.storage.policy.enabled to false
> But we need a way to allow only HDFS superusers the ability to assign an HDFS 
> Storage Policy to an HDFS directory.






[jira] [Commented] (HDFS-15154) Allow only hdfs superusers the ability to assign HDFS storage policies

2020-02-27 Thread Siddharth Wagle (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047005#comment-17047005
 ] 

Siddharth Wagle commented on HDFS-15154:


08 => Explicitly verified the deprecated config still takes effect in the 
absence of new config.
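
The behavior being verified can be sketched generically as follows. This is a standalone illustration, not the patch and not Hadoop's Configuration deprecation mechanism; the class DeprecatedKeySketch and both key names are hypothetical placeholders.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of "the deprecated key still takes effect when the new
// key is absent". Both key names below are placeholders, not real HDFS keys.
class DeprecatedKeySketch {
  static final String NEW_KEY = "example.feature.new-key";
  static final String OLD_KEY = "example.feature.old-key"; // deprecated

  /** The new key wins when present; otherwise fall back to the old key. */
  static boolean getBoolean(Map<String, String> conf, boolean defaultValue) {
    String value = conf.get(NEW_KEY);
    if (value == null) {
      value = conf.get(OLD_KEY);
    }
    return value == null ? defaultValue : Boolean.parseBoolean(value.trim());
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(OLD_KEY, "true");
    System.out.println(getBoolean(conf, false)); // only the old key is set
    conf.put(NEW_KEY, "false");
    System.out.println(getBoolean(conf, false)); // the new key overrides it
  }
}
```

A test along these lines would cover three cases: only the deprecated key set, both keys set, and neither set.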

> Allow only hdfs superusers the ability to assign HDFS storage policies
> --
>
> Key: HDFS-15154
> URL: https://issues.apache.org/jira/browse/HDFS-15154
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Bob Cauthen
>Assignee: Siddharth Wagle
>Priority: Major
> Attachments: HDFS-15154.01.patch, HDFS-15154.02.patch, 
> HDFS-15154.03.patch, HDFS-15154.04.patch, HDFS-15154.05.patch, 
> HDFS-15154.06.patch, HDFS-15154.07.patch, HDFS-15154.08.patch
>
>
> Please provide a way to limit only HDFS superusers the ability to assign HDFS 
> Storage Policies to HDFS directories.
> Currently, and based on Jira HDFS-7093, all storage policies can be disabled 
> cluster wide by setting the following:
> dfs.storage.policy.enabled to false
> But we need a way to allow only HDFS superusers the ability to assign an HDFS 
> Storage Policy to an HDFS directory.






[jira] [Commented] (HDFS-15154) Allow only hdfs superusers the ability to assign HDFS storage policies

2020-02-27 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047006#comment-17047006
 ] 

Hadoop QA commented on HDFS-15154:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  9s{color} | {color:red} HDFS-15154 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-15154 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12994819/HDFS-15154.08.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28863/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Allow only hdfs superusers the ability to assign HDFS storage policies
> --
>
> Key: HDFS-15154
> URL: https://issues.apache.org/jira/browse/HDFS-15154
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.0.0
>Reporter: Bob Cauthen
>Assignee: Siddharth Wagle
>Priority: Major
> Attachments: HDFS-15154.01.patch, HDFS-15154.02.patch, 
> HDFS-15154.03.patch, HDFS-15154.04.patch, HDFS-15154.05.patch, 
> HDFS-15154.06.patch, HDFS-15154.07.patch, HDFS-15154.08.patch
>
>
> Please provide a way to restrict the ability to assign HDFS Storage Policies 
> to HDFS directories to HDFS superusers only.
> Currently, per HDFS-7093, all storage policies can be disabled cluster-wide 
> by setting dfs.storage.policy.enabled to false.
> But we need a way to allow only HDFS superusers to assign an HDFS Storage 
> Policy to an HDFS directory.
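
A minimal sketch of the requested behaviour, in plain Java with no Hadoop dependencies. The class StoragePolicyGate, the superuserOnly flag and the set of superuser names are hypothetical and only illustrate the rule being asked for; they are not the HDFS API and not the attached patch.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of the requested check; not an HDFS class. */
final class StoragePolicyGate {
    private final boolean policiesEnabled; // models dfs.storage.policy.enabled
    private final boolean superuserOnly;   // hypothetical switch asked for in this issue
    private final Set<String> superusers;  // stand-in for the real superuser check

    StoragePolicyGate(boolean policiesEnabled, boolean superuserOnly,
                      Set<String> superusers) {
        this.policiesEnabled = policiesEnabled;
        this.superuserOnly = superuserOnly;
        this.superusers = Collections.unmodifiableSet(new HashSet<>(superusers));
    }

    /** Returns true if the given user may assign a storage policy. */
    boolean mayAssign(String user) {
        if (!policiesEnabled) {
            return false; // existing cluster-wide switch: nobody can assign
        }
        // With the hypothetical switch on, only superusers pass.
        return !superuserOnly || superusers.contains(user);
    }
}
```

With both flags set to true, only users in the superuser set may assign a policy; with superuserOnly set to false, the behaviour falls back to the existing cluster-wide switch.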



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15149) TestDeadNodeDetection test cases time-out

2020-02-27 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046990#comment-17046990
 ] 

Ahmed Hussein commented on HDFS-15149:
--

Thanks [~leosun08] for making the changes.
 Thanks [~inigoiri] for helping with the review.
 +1 (non-binding) [^HDFS-15149.005.patch]

> TestDeadNodeDetection test cases time-out
> -
>
> Key: HDFS-15149
> URL: https://issues.apache.org/jira/browse/HDFS-15149
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Ahmed Hussein
>Assignee: Lisheng Sun
>Priority: Major
> Attachments: HDFS-15149-001.patch, HDFS-15149-002.patch, 
> HDFS-15149-003.patch, HDFS-15149.003.patch, HDFS-15149.004.patch, 
> HDFS-15149.005.patch
>
>
> The TestDeadNodeDetection JUnit tests time out with the following stack 
> traces:
> *1- testDeadNodeDetectionInBackground*
> {code:bash}
> [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 264.757 s <<< FAILURE! - in org.apache.hadoop.hdfs.TestDeadNodeDetection
> [ERROR] 
> testDeadNodeDetectionInBackground(org.apache.hadoop.hdfs.TestDeadNodeDetection)
>   Time elapsed: 125.806 s  <<< ERROR!
> java.util.concurrent.TimeoutException: 
> Timed out waiting for condition. Thread diagnostics:
> Timestamp: 2020-01-24 08:31:07,023
> "client DomainSocketWatcher" daemon prio=5 tid=117 runnable
> java.lang.Thread.State: RUNNABLE
> at org.apache.hadoop.net.unix.DomainSocketWatcher.doPoll0(Native 
> Method)
> at 
> org.apache.hadoop.net.unix.DomainSocketWatcher.access$900(DomainSocketWatcher.java:52)
> at 
> org.apache.hadoop.net.unix.DomainSocketWatcher$2.run(DomainSocketWatcher.java:503)
> at java.lang.Thread.run(Thread.java:748)
> "Session-HouseKeeper-48c3205a"  prio=5 tid=350 timed_waiting
> java.lang.Thread.State: TIMED_WAITING
> at sun.misc.Unsafe.park(Native Method)
> at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> "java.util.concurrent.ThreadPoolExecutor$Worker@3ae54156[State = -1, empty 
> queue]" daemon prio=5 tid=752 in Object.wait()
> java.lang.Thread.State: WAITING (on object monitor)
> at sun.misc.Unsafe.park(Native Method)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> "CacheReplicationMonitor(1960356187)"  prio=5 tid=386 timed_waiting
> java.lang.Thread.State: TIMED_WAITING
> at sun.misc.Unsafe.park(Native Method)
> at 
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2163)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor.run(CacheReplicationMonitor.java:181)
> "Timer for 'NameNode' metrics system" daemon prio=5 tid=339 timed_waiting
> java.lang.Thread.State: TIMED_WAITING
> at java.lang.Object.wait(Native Method)
> at java.util.TimerThread.mainLoop(Timer.java:552)
> at java.util.TimerThread.run(Timer.java:505)
> "org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber@6b760460"
>  daemon prio=5 tid=385 timed_waiting
> java.lang.Thread.State: TIMED_WAITING
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.run(FSNamesystem.java:4420)
> at java.lang.Thread.run(Thread.java:748)
> "qtp164757726-349" daemon prio=5 tid=349 runnable
> 

[jira] [Commented] (HDFS-15149) TestDeadNodeDetection test cases time-out

2020-02-27 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046986#comment-17046986
 ] 

Íñigo Goiri commented on HDFS-15149:


[^HDFS-15149.005.patch] looks good to me.
[~ahussein], further comments?


[jira] [Commented] (HDFS-15147) LazyPersistTestCase wait logic is error-prone

2020-02-27 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046968#comment-17046968
 ] 

Ahmed Hussein commented on HDFS-15147:
--

Thanks [~kihwal] for committing the patches.

> LazyPersistTestCase wait logic is error-prone
> -
>
> Key: HDFS-15147
> URL: https://issues.apache.org/jira/browse/HDFS-15147
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1
>
> Attachments: HDFS-15147-branch-2.10.001.patch, 
> HDFS-15147-branch-3.2.001.patch, HDFS-15147.001.patch, HDFS-15147.002.patch, 
> HDFS-15147.003.patch
>
>
> {{LazyPersistTestCase}} has some issues that lead to inconsistent results in 
> the test cases:
> * The wait periods for status changes are too long, reaching 10 seconds in 
> some cases.
> * {{triggerBlockReport()}} only triggers a full block report (FBR) from the 
> DN with index 0. This is counterintuitive because the JUnit tests restart 
> the DN assuming that the restarted DN will send an FBR. However, this never 
> happens because the DN gets a new index after the restart.
> {code:java}
>   protected final void triggerBlockReport()
>       throws IOException, InterruptedException {
>     // Trigger block report to NN
>     DataNodeTestUtils.triggerBlockReport(cluster.getDataNodes().get(0));
>     Thread.sleep(10 * 1000);
>   }
> {code}
> [~inigoiri] suggested that we propagate the findings and fixes from 
> HDFS-13179 and HDFS-15144 into {{LazyPersistTestCase.java}}. This will 
> eventually reduce the runtime and make the test cases more stable.
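
The fixed 10-second sleep quoted above can be replaced by polling for the expected state with a deadline. Below is a minimal standalone sketch of that idea; PollingWait and its parameter names are made up for illustration and are not the helper used in the committed patches.

```java
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

/** Hypothetical polling helper; a sketch, not the Hadoop test utility. */
final class PollingWait {
    /**
     * Re-checks the condition every intervalMs milliseconds and returns as
     * soon as it holds; gives up with a TimeoutException after timeoutMs.
     */
    static void waitFor(BooleanSupplier check, long intervalMs, long timeoutMs)
            throws TimeoutException, InterruptedException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!check.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException("Timed out waiting for condition");
            }
            Thread.sleep(intervalMs);
        }
    }
}
```

A test would then wait only as long as the state change actually takes, for example by polling every 100 ms for up to 10 s, instead of always sleeping for the full 10 seconds.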






[jira] [Commented] (HDFS-15190) HttpFS : Add Support for Storage Policy Satisfier

2020-02-27 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046956#comment-17046956
 ] 

Hadoop QA commented on HDFS-15190:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 56s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 28s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 23s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 26s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs-httpfs: The patch generated 1 new + 485 unchanged - 1 fixed = 486 total (was 486) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 50s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 29s{color} | {color:green} hadoop-hdfs-httpfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 70m 55s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15190 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12994812/HDFS-15190.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux d526133de7df 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 791270a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_242 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/28862/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs-httpfs.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28862/testReport/ |
| Max. process+thread count | 624 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-httpfs U: hadoop-hdfs-project/hadoop-hdfs-httpfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28862/console |
| Powered by | Apache 

[jira] [Commented] (HDFS-15155) writeIoRate of DataNodeVolumeMetrics is never used

2020-02-27 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046949#comment-17046949
 ] 

Hadoop QA commented on HDFS-15155:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 53s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m  0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 33s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 44s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 16s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}110m  5s{color} | {color:red} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}179m 24s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSInotifyEventInputStreamKerberized |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15155 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12994804/HDFS-15155.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 31d4d1425266 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 57aa048 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_242 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28860/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28860/testReport/ |
| Max. process+thread count | 2897 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28860/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |



[jira] [Commented] (HDFS-14977) Quota Usage and Content summary are not same in Truncate with Snapshot

2020-02-27 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046948#comment-17046948
 ] 

Íñigo Goiri commented on HDFS-14977:


Thanks [~hemanthboyina] for the patch.
Let's add a few comments to the test and make it a little more readable.

> Quota Usage and Content summary are not same in Truncate with Snapshot 
> ---
>
> Key: HDFS-14977
> URL: https://issues.apache.org/jira/browse/HDFS-14977
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14977.001.patch
>
>
> Steps:
>   hdfs dfs -mkdir /dir
>   hdfs dfs -put file /dir          (file size = 10 bytes)
>   hdfs dfsadmin -allowSnapshot /dir
>   hdfs dfs -createSnapshot /dir s1
> Space consumed according to both Quota Usage and Content Summary is 30 bytes.
>   hdfs dfs -truncate -w 5 /dir/file
> Space consumed according to both Quota Usage and Content Summary is 45 bytes.
>   hdfs dfs -deleteSnapshot /dir s1
> Space consumed according to Quota Usage is 45 bytes, while Content Summary 
> reports 15 bytes.
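
The numbers in the steps above can be reproduced with simple arithmetic. The sketch below assumes a replication factor of 3 (inferred from the 10-byte file being reported as 30 bytes) and assumes that the truncate keeps the original 10-byte block for the snapshot alongside a new 5-byte block; TruncateSnapshotSpace is a hypothetical name, not HDFS code.

```java
/** Hypothetical arithmetic model of the reported numbers; not HDFS code. */
final class TruncateSnapshotSpace {
    // Assumption: replication factor 3, inferred from 10 bytes -> 30 bytes.
    static final int REPLICATION = 3;

    /** Space consumed by all block lengths still retained, times replication. */
    static long consumed(long... retainedBlockLengths) {
        long sum = 0;
        for (long len : retainedBlockLengths) {
            sum += len;
        }
        return sum * REPLICATION;
    }
}
```

Under these assumptions, consumed(10) is 30 before the truncate, consumed(5, 10) is 45 while the snapshot still holds the old block, and consumed(5) is 15 once the snapshot is deleted, which matches Content Summary but not the 45 bytes still reported by Quota Usage.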






[jira] [Commented] (HDFS-15190) HttpFS : Add Support for Storage Policy Satisfier

2020-02-27 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046947#comment-17046947
 ] 

Íñigo Goiri commented on HDFS-15190:


It makes sense.
The checkstyle is just for consistency so let's leave it as is.
+1 on  [^HDFS-15190.002.patch].

> HttpFS : Add Support for Storage Policy Satisfier 
> --
>
> Key: HDFS-15190
> URL: https://issues.apache.org/jira/browse/HDFS-15190
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-15190.001.patch, HDFS-15190.002.patch
>
>
> Add support for SPS in httpfs






[jira] [Commented] (HDFS-14977) Quota Usage and Content summary are not same in Truncate with Snapshot

2020-02-27 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046932#comment-17046932
 ] 

Hadoop QA commented on HDFS-14977:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 40s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m  1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m  4s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 38s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 57s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 37s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 19 unchanged - 0 fixed = 22 total (was 19) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m  5s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 36s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 89m 22s{color} | {color:red} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 31s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}154m  9s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.diskbalancer.command.TestDiskBalancerCommand |
|   | hadoop.hdfs.TestDecommissionWithStripedBackoffMonitor |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.6 Server=19.03.6 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-14977 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12994803/HDFS-14977.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 9905a75c30e8 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 57aa048 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_242 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/28861/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28861/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
|  Test Results | 

[jira] [Commented] (HDFS-15186) Erasure Coding: Decommission may generate the parity block's content with all 0 in some case

2020-02-27 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046923#comment-17046923
 ] 

Hudson commented on HDFS-15186:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18006 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18006/])
HDFS-15186. Erasure Coding: Decommission may generate the parity block's 
(ayushsaxena: rev 429da635ec70f9abe5ab71e24c9f2eec0aa36e18)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommissionWithStriped.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java


> Erasure Coding: Decommission may generate the parity block's content with all 
> 0 in some case
> 
>
> Key: HDFS-15186
> URL: https://issues.apache.org/jira/browse/HDFS-15186
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Affects Versions: 3.0.3, 3.2.1, 3.1.3
>Reporter: Yao Guangdong
>Assignee: Yao Guangdong
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: HDFS-15186.001.patch, HDFS-15186.002.patch, 
> HDFS-15186.003.patch, HDFS-15186.004.patch, HDFS-15186.005.patch
>
>
> I found parity blocks whose content is all zeros after decommissioning more 
> than one DataNode from a cluster. The probability is high (on the order of 
> parts per thousand). This is a serious problem: if we read data from a 
> zeroed parity block, or use it to recover another block, we end up using 
> corrupt data without knowing it.
> Some example cases are listed below:
> B: Busy DataNode, 
> D: Decommissioning DataNode,
> Others are normal.
> 1. Group indices are [0, 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 2. Group indices are [0(B,D), 1, 2, 3, 4, 5, 6(B,D), 7, 8(D)].
> 
> In the first case, when the block group indices are [0, 1, 2, 3, 4, 5, 
> 6(B,D), 7, 8(D)], the DN may receive a block reconstruction command with 
> liveIndices=[0, 1, 2, 3, 4, 5, 7, 8] and a targets field (in the class 
> StripedReconstructionInfo) of length 2. 
> A targets length of 2 means that, in the current code, the DataNode needs to 
> recover 2 internal blocks. But from liveIndices we can find only 1 missing 
> block, so the method StripedWriter#initTargetIndices uses 0 as the default 
> index for the second block to recover, without checking whether index 0 is 
> already among the source indices.
> The EC algorithm is then asked to use source indices [0, 1, 2, 3, 4, 5] to 
> recover indices [6, 0], so index 0 appears in both the source indices and 
> the target indices. In this case the target buffer returned by the EC 
> algorithm for index 6 is always all zeros. At first I thought this was a 
> problem in the EC algorithm, which should be more fault tolerant, and I 
> tried to fix it there. But that proved too hard because there are too many 
> cases. The second case in the example above is one of them (source indices 
> [1, 2, 3, 4, 5, 7] are used to recover indices [0, 6, 0]). So I changed my 
> approach: invoke the EC algorithm with correct parameters, which means 
> removing the duplicate target index 0 in this case. In the end, I fixed it 
> this way.
>  
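The idea of passing only genuinely missing indices to the EC algorithm can be sketched as follows. This is a minimal illustration, not the code from the attached patches: the class TargetIndexSketch and the method missingTargets are hypothetical names, and the 9-block group size is taken from the example indices in the report.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical helper for illustration; not the actual StripedWriter code. */
final class TargetIndexSketch {
  private TargetIndexSketch() {}

  /**
   * Returns the indices that really need reconstruction: those absent from
   * liveIndices, capped at the number of requested targets. Unused target
   * slots are simply dropped instead of being padded with index 0.
   */
  static List<Integer> missingTargets(int[] liveIndices, int totalBlocks,
      int numTargets) {
    BitSet live = new BitSet(totalBlocks);
    for (int idx : liveIndices) {
      live.set(idx);
    }
    List<Integer> targets = new ArrayList<>();
    for (int i = 0; i < totalBlocks && targets.size() < numTargets; i++) {
      if (!live.get(i)) {
        targets.add(i);
      }
    }
    return targets;
  }

  public static void main(String[] args) {
    // First case from the report: index 6 is the only missing block,
    // although two targets were requested.
    int[] live = {0, 1, 2, 3, 4, 5, 7, 8};
    System.out.println(missingTargets(live, 9, 2));
  }
}
```

With the inputs of the first case, the result contains only index 6, so index 0 never appears as both a source and a target.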



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14892) Close the output stream if createWrappedOutputStream() fails

2020-02-27 Thread Xiaoyu Yao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046910#comment-17046910
 ] 

Xiaoyu Yao commented on HDFS-14892:
---

Thanks [~weichiu] for linking the issues. Looks like the same issue has been 
reported by [~kihwal] here. 

> Close the output stream if createWrappedOutputStream() fails
> 
>
> Key: HDFS-14892
> URL: https://issues.apache.org/jira/browse/HDFS-14892
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption
>Reporter: Kihwal Lee
>Priority: Major
>
> create() in an encryption zone is a two-step process on the client. First, a 
> regular FSOutputStream is created, and then it is wrapped with an encrypted 
> stream.  When there is a system issue or a KMS ACL-based denial, the second 
> phase will fail. If the client terminates right away, the shutdown hook 
> closes the output stream opened in the first phase.  But if the client lives 
> on, the output stream will leak.
> Datanode's WebHdfsHandler, DFSClient, DistributedFileSystem, Hdfs 
> (FileContext) and RpcProgramNfs3 do this.  
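The close-on-failure pattern asked for in the issue title can be sketched in plain Java. This is only an illustration under assumptions: Wrapper is a hypothetical stand-in for the second phase (wrapping with an encrypted stream), and createWrapped is not a Hadoop method.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Illustrative sketch; the names here are hypothetical, not Hadoop APIs. */
final class WrapOrCloseSketch {
  private WrapOrCloseSketch() {}

  /** Stand-in for the second phase, which may fail (e.g. on a KMS denial). */
  interface Wrapper {
    OutputStream wrap(OutputStream raw) throws IOException;
  }

  /** Test double that records whether close() was called. */
  static final class TrackingStream extends ByteArrayOutputStream {
    boolean closed;

    @Override
    public void close() throws IOException {
      closed = true;
      super.close();
    }
  }

  /** Wraps raw; if wrapping fails, closes raw before rethrowing. */
  static OutputStream createWrapped(OutputStream raw, Wrapper wrapper)
      throws IOException {
    try {
      return wrapper.wrap(raw);
    } catch (IOException | RuntimeException e) {
      try {
        raw.close();
      } catch (IOException closeFailure) {
        // Keep the original failure as the primary exception.
        e.addSuppressed(closeFailure);
      }
      throw e;
    }
  }

  public static void main(String[] args) {
    TrackingStream raw = new TrackingStream();
    try {
      createWrapped(raw, s -> {
        throw new IOException("simulated denial");
      });
    } catch (IOException expected) {
      System.out.println("closed after failure: " + raw.closed);
    }
  }
}
```

The stream from the first phase is released even when the client process keeps running after the failure.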






[jira] [Updated] (HDFS-15186) Erasure Coding: Decommission may generate the parity block's content with all 0 in some case

2020-02-27 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-15186:

Fix Version/s: 3.3.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Erasure Coding: Decommission may generate the parity block's content with all 
> 0 in some case
> 
>
> Key: HDFS-15186
> URL: https://issues.apache.org/jira/browse/HDFS-15186
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Affects Versions: 3.0.3, 3.2.1, 3.1.3
>Reporter: Yao Guangdong
>Assignee: Yao Guangdong
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: HDFS-15186.001.patch, HDFS-15186.002.patch, 
> HDFS-15186.003.patch, HDFS-15186.004.patch, HDFS-15186.005.patch
>
>






[jira] [Commented] (HDFS-15186) Erasure Coding: Decommission may generate the parity block's content with all 0 in some case

2020-02-27 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046900#comment-17046900
 ] 

Ayush Saxena commented on HDFS-15186:
-

Committed to trunk.
Thanks [~yaoguangdong] for the contribution, and [~ferhui] and [~weichiu] for 
the reviews!

Will cherry-pick this after HDFS-14768.

> Erasure Coding: Decommission may generate the parity block's content with all 
> 0 in some case
> 
>
> Key: HDFS-15186
> URL: https://issues.apache.org/jira/browse/HDFS-15186
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Affects Versions: 3.0.3, 3.2.1, 3.1.3
>Reporter: Yao Guangdong
>Assignee: Yao Guangdong
>Priority: Critical
> Attachments: HDFS-15186.001.patch, HDFS-15186.002.patch, 
> HDFS-15186.003.patch, HDFS-15186.004.patch, HDFS-15186.005.patch
>
>






[jira] [Commented] (HDFS-15186) Erasure Coding: Decommission may generate the parity block's content with all 0 in some case

2020-02-27 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046896#comment-17046896
 ] 

Ayush Saxena commented on HDFS-15186:
-

v005 LGTM +1

> Erasure Coding: Decommission may generate the parity block's content with all 
> 0 in some case
> 
>
> Key: HDFS-15186
> URL: https://issues.apache.org/jira/browse/HDFS-15186
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, erasure-coding
>Affects Versions: 3.0.3, 3.2.1, 3.1.3
>Reporter: Yao Guangdong
>Assignee: Yao Guangdong
>Priority: Critical
> Attachments: HDFS-15186.001.patch, HDFS-15186.002.patch, 
> HDFS-15186.003.patch, HDFS-15186.004.patch, HDFS-15186.005.patch
>
>






[jira] [Updated] (HDFS-15190) HttpFS : Add Support for Storage Policy Satisfier

2020-02-27 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-15190:
-
Attachment: HDFS-15190.002.patch

> HttpFS : Add Support for Storage Policy Satisfier 
> --
>
> Key: HDFS-15190
> URL: https://issues.apache.org/jira/browse/HDFS-15190
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-15190.001.patch, HDFS-15190.002.patch
>
>
> Add support for SPS in httpfs






[jira] [Commented] (HDFS-15155) writeIoRate of DataNodeVolumeMetrics is never used

2020-02-27 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046836#comment-17046836
 ] 

Íñigo Goiri commented on HDFS-15155:


The test is not checking anything; can we add some meaningful asserts there?

> writeIoRate of DataNodeVolumeMetrics is never used
> --
>
> Key: HDFS-15155
> URL: https://issues.apache.org/jira/browse/HDFS-15155
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-15155.001.patch, HDFS-15155.002.patch
>
>
> DataNodeVolumeMetrics uses the wrong object: writeIoRate is never used, and 
> syncIoRate should be replaced by writeIoRate in the following code:
> {code:java}
> // Based on writeIoRate
> public long getWriteIoSampleCount() {
>   return syncIoRate.lastStat().numSamples();
> }
> public double getWriteIoMean() {
>   return syncIoRate.lastStat().mean();
> }
> public double getWriteIoStdDev() {
>   return syncIoRate.lastStat().stddev();
> }
> {code}
>  
>  
>  
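For illustration, the getters after the proposed replacement might look like the sketch below. It is self-contained and makes assumptions: Rate is a hypothetical minimal stand-in for the metric object, and its methods are called directly rather than through lastStat(), so this is not the patched Hadoop class.

```java
/** Illustrative sketch; not the actual DataNodeVolumeMetrics class. */
final class VolumeMetricsSketch {

  /** Hypothetical stand-in for a rate metric that tracks simple statistics. */
  static final class Rate {
    private long count;
    private double sum;
    private double sumSquares;

    void add(double value) {
      count++;
      sum += value;
      sumSquares += value * value;
    }

    long numSamples() {
      return count;
    }

    double mean() {
      return count == 0 ? 0.0 : sum / count;
    }

    /** Sample standard deviation; 0 when fewer than two samples exist. */
    double stddev() {
      if (count < 2) {
        return 0.0;
      }
      double variance = (sumSquares - sum * sum / count) / (count - 1);
      return Math.sqrt(Math.max(variance, 0.0));
    }
  }

  final Rate syncIoRate = new Rate();
  final Rate writeIoRate = new Rate();

  // Based on writeIoRate (the reported code read syncIoRate here).
  long getWriteIoSampleCount() {
    return writeIoRate.numSamples();
  }

  double getWriteIoMean() {
    return writeIoRate.mean();
  }

  double getWriteIoStdDev() {
    return writeIoRate.stddev();
  }
}
```

A test with meaningful asserts could feed different values into the two rates and check that the write getters reflect only the write samples.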






[jira] [Updated] (HDFS-15195) In place namenode federation

2020-02-27 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HDFS-15195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-15195:
---
Summary: In place namenode federation  (was: In place namenode fedaration)

> In place namenode federation
> 
>
> Key: HDFS-15195
> URL: https://issues.apache.org/jira/browse/HDFS-15195
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Amithsha
>Priority: Major
>
> At present, federating existing data in place is not possible. This hinders 
> the adoption of HDFS federation on production clusters holding more than a 
> PB of data, because the data has to be copied from the old set of namenodes 
> to the new set. From the datanode directory structure it is clear that 
> moving the blocks of the relevant data from the namenode_set_1 dir 
> (dfs/data/current/BP-xxx) to the namenode_set_2 dir 
> (dfs/data/current/BP-yyy) would solve the issue. Why can't we make this a 
> new feature that asks for the dir to be federated and stops the write 
> process until the move completes?






[jira] [Commented] (HDFS-15195) In place namenode fedaration

2020-02-27 Thread Kihwal Lee (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046829#comment-17046829
 ] 

Kihwal Lee commented on HDFS-15195:
---

There were some discussions in the past including HDFS-7702. This will be a 
nice feature.

> In place namenode fedaration
> 
>
> Key: HDFS-15195
> URL: https://issues.apache.org/jira/browse/HDFS-15195
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Amithsha
>Priority: Major
>






[jira] [Commented] (HDFS-15195) In place namenode fedaration

2020-02-27 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046826#comment-17046826
 ] 

Ayush Saxena commented on HDFS-15195:
-

Do you intend to change the BlockPool of the blocks in order to move the data 
to another namenode?
If so, it isn't as simple as moving the block data from one directory to 
another in the namenode. For the other namenode to recognize the blocks, the 
metadata needs to be present at that namenode too.
HDFS-2139 had some interesting ideas; it used hard links to move blocks within 
the same datanode. Maybe you can take a look.
If you have some design work done, please share the design document.
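As a rough illustration of the hard-link idea only, the sketch below gives a file a new name in another directory without copying its data. HardLinkMoveSketch, moveByHardLink and the BP-xxx/BP-yyy directory names are hypothetical placeholders, it assumes a filesystem that supports hard links, and it does nothing about the namenode metadata mentioned above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustrative sketch; not the HDFS-2139 implementation. */
final class HardLinkMoveSketch {
  private HardLinkMoveSketch() {}

  /** Links blockFile into targetDir, then removes the original name. */
  static Path moveByHardLink(Path blockFile, Path targetDir)
      throws IOException {
    Files.createDirectories(targetDir);
    Path linked = targetDir.resolve(blockFile.getFileName());
    Files.createLink(linked, blockFile); // new name, same data on disk
    Files.delete(blockFile);             // drops the old name only
    return linked;
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("bp-sketch");
    Path src = Files.createDirectories(root.resolve("BP-xxx"));
    Path block = Files.write(src.resolve("blk_1"),
        "data".getBytes(StandardCharsets.UTF_8));
    Path moved = moveByHardLink(block, root.resolve("BP-yyy"));
    System.out.println(moved + " exists: " + Files.exists(moved));
  }
}
```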

> In place namenode fedaration
> 
>
> Key: HDFS-15195
> URL: https://issues.apache.org/jira/browse/HDFS-15195
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Amithsha
>Priority: Major
>






[jira] [Commented] (HDFS-15124) Crashing bugs in NameNode when using a valid configuration for `dfs.namenode.audit.loggers`

2020-02-27 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046823#comment-17046823
 ] 

Hudson commented on HDFS-15124:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18004 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18004/])
HDFS-15124. Crashing bugs in NameNode when using a valid configuration 
(ayushsaxena: rev cd2c6b1aac470991b9b90339ce2721ba179e7c48)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/top/TopAuditLogger.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSNamesystem.java


> Crashing bugs in NameNode when using a valid configuration for 
> `dfs.namenode.audit.loggers`
> ---
>
> Key: HDFS-15124
> URL: https://issues.apache.org/jira/browse/HDFS-15124
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.10.0
>Reporter: Ctest
>Assignee: Ctest
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: HDFS-15124.000.patch, HDFS-15124.001.patch, 
> HDFS-15124.002.patch, HDFS-15124.003.patch, HDFS-15124.004.patch, 
> HDFS-15124.005.patch, HDFS-15124.006.patch
>
>
> I am using Hadoop-2.10.0.
> The configuration parameter `dfs.namenode.audit.loggers` allows `default` 
> (which is the default value) and 
> `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`.
> When I use `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`, 
> namenode will not be started successfully because of an 
> `InstantiationException` thrown from 
> `org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initAuditLoggers`. 
> The root cause is that while initializing namenode, `initAuditLoggers` will 
> be called and it will try to call the default constructor of 
> `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger` which doesn't 
> have a default constructor. Thus the `InstantiationException` exception is 
> thrown.
>  
> *Symptom*
> *$ ./start-dfs.sh*
> {code:java}
> 2019-12-18 14:05:20,670 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
> java.lang.RuntimeException: java.lang.InstantiationException: org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initAuditLoggers(FSNamesystem.java:1024)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:858)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:677)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:674)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:736)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:961)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:940)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1714)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1782)
> Caused by: java.lang.InstantiationException: org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger
>   at java.lang.Class.newInstance(Class.java:427)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initAuditLoggers(FSNamesystem.java:1017)
>   ... 8 more
> Caused by: java.lang.NoSuchMethodException: org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger.<init>()
>   at java.lang.Class.getConstructor0(Class.java:3082)
>   at java.lang.Class.newInstance(Class.java:412)
>   ... 9 more
> {code}
>  
>  
> *Detailed Root Cause*
> There is no default constructor in 
> `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`: 
> {code:java}
> /** 
>  * An {@link AuditLogger} that sends logged data directly to the metrics 
>  * systems. It is used when the top service is used directly by the name node 
>  */ 
> @InterfaceAudience.Private 
> public class TopAuditLogger implements AuditLogger {
>   public static final Logger LOG =
>       LoggerFactory.getLogger(TopAuditLogger.class);
>   private final TopMetrics topMetrics;
>   public TopAuditLogger(TopMetrics topMetrics) {
>     Preconditions.checkNotNull(topMetrics, "Cannot init with a null " +
>         "TopMetrics");
>     this.topMetrics = topMetrics;
>   }
>   @Override
>   public void initialize(Configuration conf) { 
>   }
> {code}
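The failure described above can be reproduced outside Hadoop with plain reflection. The sketch below is only an illustration: AuditLogger, Metrics, DefaultLogger and MetricsLogger are hypothetical stand-ins, and the guard in create() is one possible approach, not necessarily what the committed patch does.

```java
/** Illustrative sketch; the nested types are stand-ins, not Hadoop classes. */
final class AuditLoggerInstantiationSketch {
  private AuditLoggerInstantiationSketch() {}

  interface AuditLogger {}

  static final class Metrics {}

  /** Has a no-argument constructor, so reflective creation works. */
  static final class DefaultLogger implements AuditLogger {
    DefaultLogger() {}
  }

  /** Has only a one-argument constructor, like the class in the report. */
  static final class MetricsLogger implements AuditLogger {
    final Metrics metrics;

    MetricsLogger(Metrics metrics) {
      this.metrics = metrics;
    }
  }

  /** Returns true if clazz declares a constructor without parameters. */
  static boolean hasNoArgConstructor(Class<?> clazz) {
    try {
      clazz.getDeclaredConstructor();
      return true;
    } catch (NoSuchMethodException e) {
      return false;
    }
  }

  /** Uses the no-arg constructor when present, else the Metrics one. */
  static AuditLogger create(Class<? extends AuditLogger> clazz,
      Metrics metrics) throws ReflectiveOperationException {
    if (hasNoArgConstructor(clazz)) {
      return clazz.getDeclaredConstructor().newInstance();
    }
    return clazz.getDeclaredConstructor(Metrics.class).newInstance(metrics);
  }

  public static void main(String[] args) throws Exception {
    System.out.println(hasNoArgConstructor(MetricsLogger.class));
    System.out.println(create(MetricsLogger.class, new Metrics()).getClass());
  }
}
```

Checking for the constructor up front turns the startup crash into an explicit decision about how to build the logger.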
> As long as the configuration parameter `dfs.namenode.audit.loggers` is set to 
> `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`, 
> `initAuditLoggers` will try to call its default constructor to make a new 
> instance: 
> {code:java}
> private List 

[jira] [Commented] (HDFS-15155) writeIoRate of DataNodeVolumeMetrics is never used

2020-02-27 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046809#comment-17046809
 ] 

Haibin Huang commented on HDFS-15155:
-

[~elgoiri] [~ayushtkn], I have updated the patch. Can you take a look at it? 
Sorry for taking so long.

> writeIoRate of DataNodeVolumeMetrics is never used
> --
>
> Key: HDFS-15155
> URL: https://issues.apache.org/jira/browse/HDFS-15155
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-15155.001.patch, HDFS-15155.002.patch
>
>






[jira] [Commented] (HDFS-15124) Crashing bugs in NameNode when using a valid configuration for `dfs.namenode.audit.loggers`

2020-02-27 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046803#comment-17046803
 ] 

Ayush Saxena commented on HDFS-15124:
-

Committed to trunk.
Thanks [~ctest.team] for the contribution and [~elgoiri] for the review!

> Crashing bugs in NameNode when using a valid configuration for 
> `dfs.namenode.audit.loggers`
> ---
>
> Key: HDFS-15124
> URL: https://issues.apache.org/jira/browse/HDFS-15124
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.10.0
>Reporter: Ctest
>Assignee: Ctest
>Priority: Critical
> Attachments: HDFS-15124.000.patch, HDFS-15124.001.patch, 
> HDFS-15124.002.patch, HDFS-15124.003.patch, HDFS-15124.004.patch, 
> HDFS-15124.005.patch, HDFS-15124.006.patch
>
>
> I am using Hadoop-2.10.0.
> The configuration parameter `dfs.namenode.audit.loggers` allows `default` 
> (which is the default value) and 
> `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`.
> When I use `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`, 
> namenode will not be started successfully because of an 
> `InstantiationException` thrown from 
> `org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initAuditLoggers`. 
> The root cause is that while initializing namenode, `initAuditLoggers` will 
> be called and it will try to call the default constructor of 
> `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger` which doesn't 
> have a default constructor. Thus the `InstantiationException` exception is 
> thrown.
>  
> *Symptom*
> *$ ./start-dfs.sh*
> {code:java}
> 2019-12-18 14:05:20,670 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
> java.lang.RuntimeException: java.lang.InstantiationException: org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initAuditLoggers(FSNamesystem.java:1024)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:858)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:677)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:674)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:736)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:961)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:940)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1714)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1782)
> Caused by: java.lang.InstantiationException: org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger
>   at java.lang.Class.newInstance(Class.java:427)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initAuditLoggers(FSNamesystem.java:1017)
>   ... 8 more
> Caused by: java.lang.NoSuchMethodException: org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger.<init>()
>   at java.lang.Class.getConstructor0(Class.java:3082)
>   at java.lang.Class.newInstance(Class.java:412)
>   ... 9 more
> {code}
>  
>  
> *Detailed Root Cause*
> There is no default constructor in 
> `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`: 
> {code:java}
> /** 
>  * An {@link AuditLogger} that sends logged data directly to the metrics 
>  * systems. It is used when the top service is used directly by the name node 
>  */ 
> @InterfaceAudience.Private 
> public class TopAuditLogger implements AuditLogger {
>   public static final Logger LOG =
>       LoggerFactory.getLogger(TopAuditLogger.class);
>   private final TopMetrics topMetrics;
>   public TopAuditLogger(TopMetrics topMetrics) {
>     Preconditions.checkNotNull(topMetrics, "Cannot init with a null " +
>         "TopMetrics");
>     this.topMetrics = topMetrics;
>   }
>   @Override
>   public void initialize(Configuration conf) { 
>   }
> {code}
> As long as the configuration parameter `dfs.namenode.audit.loggers` is set to 
> `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`, 
> `initAuditLoggers` will try to call its default constructor to make a new 
> instance: 
> {code:java}
> private List<AuditLogger> initAuditLoggers(Configuration conf) {
>   // Initialize the custom access loggers if configured.
>   Collection<String> alClasses =
>       conf.getTrimmedStringCollection(DFS_NAMENODE_AUDIT_LOGGERS_KEY);
>   List<AuditLogger> auditLoggers = Lists.newArrayList();
>   if (alClasses != null && !alClasses.isEmpty()) {
>     for (String className : alClasses) {
>       try {
>         AuditLogger logger;
>         if (DFS_NAMENODE_DEFAULT_AUDIT_LOGGER_NAME.equals(className)) {
>           logger = new DefaultAuditLogger();
>         } else {
>           logger = 

[jira] [Updated] (HDFS-15124) Crashing bugs in NameNode when using a valid configuration for `dfs.namenode.audit.loggers`

2020-02-27 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-15124:

Fix Version/s: 3.3.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Crashing bugs in NameNode when using a valid configuration for 
> `dfs.namenode.audit.loggers`
> ---
>
> Key: HDFS-15124
> URL: https://issues.apache.org/jira/browse/HDFS-15124
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.10.0
>Reporter: Ctest
>Assignee: Ctest
>Priority: Critical
> Fix For: 3.3.0
>
> Attachments: HDFS-15124.000.patch, HDFS-15124.001.patch, 
> HDFS-15124.002.patch, HDFS-15124.003.patch, HDFS-15124.004.patch, 
> HDFS-15124.005.patch, HDFS-15124.006.patch
>
>
> I am using Hadoop-2.10.0.
> The configuration parameter `dfs.namenode.audit.loggers` allows `default` 
> (which is the default value) and 
> `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`.
> When I use `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`, 
> namenode will not be started successfully because of an 
> `InstantiationException` thrown from 
> `org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initAuditLoggers`. 
> The root cause is that while initializing namenode, `initAuditLoggers` will 
> be called and it will try to call the default constructor of 
> `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger` which doesn't 
> have a default constructor. Thus the `InstantiationException` exception is 
> thrown.
>  
> *Symptom*
> *$ ./start-dfs.sh*
> {code:java}
> 2019-12-18 14:05:20,670 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
> java.lang.RuntimeException: java.lang.InstantiationException: org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initAuditLoggers(FSNamesystem.java:1024)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:858)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:677)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:674)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:736)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:961)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:940)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1714)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1782)
> Caused by: java.lang.InstantiationException: org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger
>   at java.lang.Class.newInstance(Class.java:427)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initAuditLoggers(FSNamesystem.java:1017)
>   ... 8 more
> Caused by: java.lang.NoSuchMethodException: org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger.<init>()
>   at java.lang.Class.getConstructor0(Class.java:3082)
>   at java.lang.Class.newInstance(Class.java:412)
>   ... 9 more
> {code}
>  
>  
> *Detailed Root Cause*
> There is no default constructor in 
> `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`: 
> {code:java}
> /** 
>  * An {@link AuditLogger} that sends logged data directly to the metrics 
>  * systems. It is used when the top service is used directly by the name node 
>  */ 
> @InterfaceAudience.Private 
> public class TopAuditLogger implements AuditLogger { 
>   public static final Logger LOG =
>       LoggerFactory.getLogger(TopAuditLogger.class);
>   private final TopMetrics topMetrics;
>   public TopAuditLogger(TopMetrics topMetrics) {
>     Preconditions.checkNotNull(topMetrics, "Cannot init with a null " +
>         "TopMetrics");
>     this.topMetrics = topMetrics;
>   }
>   @Override
>   public void initialize(Configuration conf) { 
>   }
> {code}
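
Editor's note: the failure mode described above can be reproduced outside Hadoop. The following is a minimal sketch, not the NameNode code: all type names are hypothetical stand-ins, and it uses `getDeclaredConstructor().newInstance()` rather than the deprecated `Class.newInstance()` seen in the stack trace. With that call, the missing no-arg constructor surfaces as a `NoSuchMethodException`; both exceptions are subclasses of `ReflectiveOperationException`.

```java
class ReflectiveLoggerDemo {
  // Instantiate a logger by class name, the way a config-driven loader might.
  static DemoAuditLogger create(String className)
      throws ReflectiveOperationException {
    Class<?> clazz = Class.forName(className);
    // getDeclaredConstructor() with no arguments throws NoSuchMethodException
    // when the class declares no no-arg constructor.
    return (DemoAuditLogger) clazz.getDeclaredConstructor().newInstance();
  }

  // Returns true when the class can be created reflectively with no arguments.
  static boolean canCreate(String className) {
    try {
      create(className);
      return true;
    } catch (ReflectiveOperationException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(canCreate(DemoDefaultLogger.class.getName()));
    System.out.println(canCreate(DemoMetricsLogger.class.getName()));
  }
}

// Hypothetical stand-in for an audit logger interface; not the Hadoop type.
interface DemoAuditLogger {
  String name();
}

// Has a public no-arg constructor, so reflective creation works.
class DemoDefaultLogger implements DemoAuditLogger {
  public DemoDefaultLogger() { }
  public String name() { return "default"; }
}

// Like TopAuditLogger in the report: its only constructor takes an argument.
class DemoMetricsLogger implements DemoAuditLogger {
  private final Object metrics;
  public DemoMetricsLogger(Object metrics) { this.metrics = metrics; }
  public String name() { return "metrics"; }
}
```

A loader written this way can only accept classes with a no-arg constructor, so any class advertised as a valid configuration value needs one, or the loader needs a special case for it.
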
> As long as the configuration parameter `dfs.namenode.audit.loggers` is set to 
> `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`, 
> `initAuditLoggers` will try to call its default constructor to make a new 
> instance: 
> {code:java}
> private List<AuditLogger> initAuditLoggers(Configuration conf) {
>   // Initialize the custom access loggers if configured.
>   Collection<String> alClasses =
>       conf.getTrimmedStringCollection(DFS_NAMENODE_AUDIT_LOGGERS_KEY);
>   List<AuditLogger> auditLoggers = Lists.newArrayList();
>   if (alClasses != null && !alClasses.isEmpty()) {
>     for (String className : alClasses) {
>       try {
>         AuditLogger logger;
>         if (DFS_NAMENODE_DEFAULT_AUDIT_LOGGER_NAME.equals(className)) {
>           logger = new DefaultAuditLogger();
>         } else {
>     

[jira] [Commented] (HDFS-15190) HttpFS : Add Support for Storage Policy Satisfier

2020-02-27 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046797#comment-17046797
 ] 

hemanthboyina commented on HDFS-15190:
--

Thanks for the comment, [~elgoiri].

In our cluster, using WebHDFS through HttpFS, we got this exception:
{code:java}
org.apache.hadoop.ipc.RemoteException(com.sun.jersey.api.ParamException$QueryParamException):
 java.lang.IllegalArgumentException: No enum constant 
org.apache.hadoop.fs.http.client.HttpFSFileSystem.Operation.SATISFYSTORAGEPOLICY
 at 
org.apache.hadoop.hdfs.web.JsonUtilClient.toRemoteException(JsonUtilClient.java:89)
 at or{code}
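
Editor's note: "No enum constant" is the message Java's `Enum.valueOf` produces when the requested name is not a declared constant, which suggests the operation name was missing from the enum on the side that parsed it. A minimal sketch of that behavior follows; `DemoOperation` is a hypothetical, reduced enum, not the real HttpFS one.

```java
class EnumLookupDemo {
  // Returns the matching constant, or null when the name is unknown.
  static DemoOperation parse(String name) {
    try {
      return DemoOperation.valueOf(name);
    } catch (IllegalArgumentException e) {
      // valueOf rejects names that are not declared constants.
      return null;
    }
  }

  public static void main(String[] args) {
    System.out.println(parse("OPEN"));
    System.out.println(parse("SATISFYSTORAGEPOLICY"));
  }
}

// Hypothetical operation set that lacks SATISFYSTORAGEPOLICY.
enum DemoOperation { OPEN, GETFILESTATUS, SETSTORAGEPOLICY }
```
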

> HttpFS : Add Support for Storage Policy Satisfier 
> --
>
> Key: HDFS-15190
> URL: https://issues.apache.org/jira/browse/HDFS-15190
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-15190.001.patch
>
>
> Add support for SPS in httpfs






[jira] [Created] (HDFS-15195) In place namenode federation

2020-02-27 Thread Amithsha (Jira)
Amithsha created HDFS-15195:
---

 Summary: In place namenode federation
 Key: HDFS-15195
 URL: https://issues.apache.org/jira/browse/HDFS-15195
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Amithsha


Currently, federating existing data in place is not possible. This hinders 
adopting HDFS federation on production clusters holding more than a PB of 
data, because the data has to be copied from the old set of namenodes to the 
new set of namenodes. From the datanode directory structure, it is clear that 
moving the blocks of the affected data from the namenode_set_1 dir 
(dfs/data/current/BP-xxx) to the namenode_set_2 dir (dfs/data/current/BP-yyy) 
would solve the issue. Why can't we make this a new feature that asks for the 
dir to federate and stops writes until the move completes?






[jira] [Updated] (HDFS-15155) writeIoRate of DataNodeVolumeMetrics is never used

2020-02-27 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-15155:

Attachment: HDFS-15155.002.patch

> writeIoRate of DataNodeVolumeMetrics is never used
> --
>
> Key: HDFS-15155
> URL: https://issues.apache.org/jira/browse/HDFS-15155
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-15155.001.patch, HDFS-15155.002.patch
>
>
> DataNodeVolumeMetrics uses the wrong object: writeIoRate is never used, and 
> syncIoRate should be replaced by writeIoRate in the following code:
> {code:java}
> // Based on writeIoRate
> public long getWriteIoSampleCount() {
>   return syncIoRate.lastStat().numSamples();
> }
> public double getWriteIoMean() {
>   return syncIoRate.lastStat().mean();
> }
> public double getWriteIoStdDev() {
>   return syncIoRate.lastStat().stddev();
> }
> {code}
>  
>  
>  
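
Editor's note: the intended behavior can be sketched as below, assuming the fix is simply to read from `writeIoRate` in the write getters. `DemoRate` and `DemoVolumeMetrics` are hypothetical stand-ins, not the Hadoop metrics classes, and the sketch omits the standard-deviation getter.

```java
class DemoVolumeMetrics {
  final DemoRate syncIoRate = new DemoRate();
  final DemoRate writeIoRate = new DemoRate();

  // Write getters read writeIoRate, not syncIoRate.
  long getWriteIoSampleCount() { return writeIoRate.numSamples(); }
  double getWriteIoMean() { return writeIoRate.mean(); }

  public static void main(String[] args) {
    DemoVolumeMetrics m = new DemoVolumeMetrics();
    m.writeIoRate.add(4.0);
    m.writeIoRate.add(6.0);
    m.syncIoRate.add(100.0); // must not leak into the write statistics
    System.out.println(m.getWriteIoSampleCount());
    System.out.println(m.getWriteIoMean());
  }
}

// Hypothetical stand-in for a rate metric: tracks sample count and mean.
class DemoRate {
  private long numSamples;
  private double sum;

  void add(double value) { numSamples++; sum += value; }
  long numSamples() { return numSamples; }
  double mean() { return numSamples == 0 ? 0.0 : sum / numSamples; }
}
```
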






[jira] [Updated] (HDFS-14977) Quota Usage and Content summary are not same in Truncate with Snapshot

2020-02-27 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-14977:
-
Attachment: HDFS-14977.001.patch
Status: Patch Available  (was: Open)

> Quota Usage and Content summary are not same in Truncate with Snapshot 
> ---
>
> Key: HDFS-14977
> URL: https://issues.apache.org/jira/browse/HDFS-14977
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-14977.001.patch
>
>
> Steps:
>   hdfs dfs -mkdir /dir
>   hdfs dfs -put file /dir          (file size = 10 bytes)
>   hdfs dfsadmin -allowSnapshot /dir
>   hdfs dfs -createSnapshot /dir s1
> Space consumed per Quota Usage and Content Summary is 30 bytes.
>   hdfs dfs -truncate -w 5 /dir/file
> Space consumed per Quota Usage and Content Summary is 45 bytes.
>   hdfs dfs -deleteSnapshot /dir s1
> Space consumed per Quota Usage is 45 bytes and per Content Summary is 15 bytes.
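
Editor's note: the reported numbers are consistent with a replication factor of 3. That factor is an assumption inferred from 10 bytes consuming 30; it is not stated in the report. Under that reading, the snapshot keeps the original 10 bytes alive alongside the 5-byte truncated file until it is deleted. A small sketch of the arithmetic:

```java
class TruncateSpaceDemo {
  // Space consumed counts every replica of every byte.
  static long spaceConsumed(long bytes, int replication) {
    return bytes * replication;
  }

  public static void main(String[] args) {
    int replication = 3;  // assumption: inferred from 10 bytes -> 30 bytes
    long original = 10;   // file size before truncate
    long truncated = 5;   // file size after truncate

    // After put: 30
    System.out.println(spaceConsumed(original, replication));
    // After truncate, with snapshot s1 still holding the old data: 45
    System.out.println(spaceConsumed(original + truncated, replication));
    // After deleting s1, only the truncated file should remain: 15
    System.out.println(spaceConsumed(truncated, replication));
  }
}
```

On this reading, Content Summary reports the expected 15 bytes after the snapshot is deleted, while Quota Usage stays at the pre-deletion value of 45.
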






[jira] [Commented] (HDFS-15124) Crashing bugs in NameNode when using a valid configuration for `dfs.namenode.audit.loggers`

2020-02-27 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046791#comment-17046791
 ] 

Ayush Saxena commented on HDFS-15124:
-

v006 LGTM +1

> Crashing bugs in NameNode when using a valid configuration for 
> `dfs.namenode.audit.loggers`
> ---
>
> Key: HDFS-15124
> URL: https://issues.apache.org/jira/browse/HDFS-15124
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.10.0
>Reporter: Ctest
>Assignee: Ctest
>Priority: Critical
> Attachments: HDFS-15124.000.patch, HDFS-15124.001.patch, 
> HDFS-15124.002.patch, HDFS-15124.003.patch, HDFS-15124.004.patch, 
> HDFS-15124.005.patch, HDFS-15124.006.patch
>
>
> I am using Hadoop-2.10.0.
> The configuration parameter `dfs.namenode.audit.loggers` allows `default` 
> (which is the default value) and 
> `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`.
> When I use `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`, 
> the NameNode fails to start because of an `InstantiationException` thrown 
> from `org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initAuditLoggers`. 
> The root cause is that, while the NameNode initializes, `initAuditLoggers` 
> tries to call the default constructor of 
> `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`, which has no 
> default constructor, so the `InstantiationException` is thrown.
>  
> *Symptom*
> *$ ./start-dfs.sh*
> {code:java}
> 2019-12-18 14:05:20,670 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
> java.lang.RuntimeException: java.lang.InstantiationException: org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initAuditLoggers(FSNamesystem.java:1024)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:858)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:677)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:674)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:736)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:961)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:940)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1714)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1782)
> Caused by: java.lang.InstantiationException: org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger
> at java.lang.Class.newInstance(Class.java:427)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initAuditLoggers(FSNamesystem.java:1017)
> ... 8 more
> Caused by: java.lang.NoSuchMethodException: org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger.<init>()
> at java.lang.Class.getConstructor0(Class.java:3082)
> at java.lang.Class.newInstance(Class.java:412)
> ... 9 more
> {code}
>  
>  
> *Detailed Root Cause*
> There is no default constructor in 
> `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`: 
> {code:java}
> /** 
>  * An {@link AuditLogger} that sends logged data directly to the metrics 
>  * systems. It is used when the top service is used directly by the name node 
>  */ 
> @InterfaceAudience.Private 
> public class TopAuditLogger implements AuditLogger { 
>   public static final Logger LOG =
>       LoggerFactory.getLogger(TopAuditLogger.class);
>   private final TopMetrics topMetrics;
>   public TopAuditLogger(TopMetrics topMetrics) {
>     Preconditions.checkNotNull(topMetrics, "Cannot init with a null " +
>         "TopMetrics");
>     this.topMetrics = topMetrics;
>   }
>   @Override
>   public void initialize(Configuration conf) { 
>   }
> {code}
> As long as the configuration parameter `dfs.namenode.audit.loggers` is set to 
> `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`, 
> `initAuditLoggers` will try to call its default constructor to make a new 
> instance: 
> {code:java}
> private List<AuditLogger> initAuditLoggers(Configuration conf) {
>   // Initialize the custom access loggers if configured.
>   Collection<String> alClasses =
>       conf.getTrimmedStringCollection(DFS_NAMENODE_AUDIT_LOGGERS_KEY);
>   List<AuditLogger> auditLoggers = Lists.newArrayList();
>   if (alClasses != null && !alClasses.isEmpty()) {
>     for (String className : alClasses) {
>       try {
>         AuditLogger logger;
>         if (DFS_NAMENODE_DEFAULT_AUDIT_LOGGER_NAME.equals(className)) {
>           logger = new DefaultAuditLogger();
>         } else {
>           logger = (AuditLogger) Class.forName(className).newInstance();
>         }
>         

[jira] [Commented] (HDFS-15147) LazyPersistTestCase wait logic is error-prone

2020-02-27 Thread Kihwal Lee (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046783#comment-17046783
 ] 

Kihwal Lee commented on HDFS-15147:
---

It has been committed to trunk through branch-2.10. Thanks for working on the 
patch, Ahmed. Thanks for the review, [~elgoiri].

> LazyPersistTestCase wait logic is error-prone
> -
>
> Key: HDFS-15147
> URL: https://issues.apache.org/jira/browse/HDFS-15147
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1
>
> Attachments: HDFS-15147-branch-2.10.001.patch, 
> HDFS-15147-branch-3.2.001.patch, HDFS-15147.001.patch, HDFS-15147.002.patch, 
> HDFS-15147.003.patch
>
>
> {{LazyPersistTestCase}} has some issues that lead to inconsistent results in 
> the test cases:
> * The wait periods for a status change are too long, reaching 10 seconds in 
> some cases.
> * triggerBlockReport() only triggers an FBR from the DN with index 0. This is 
> counterintuitive because the JUnit tests restart the DN assuming that the 
> restarted DN will send an FBR. However, this never happens because the DN 
> gets a new index after the restart.
> {code:java}
>   protected final void triggerBlockReport()
>       throws IOException, InterruptedException {
>     // Trigger block report to NN
>     DataNodeTestUtils.triggerBlockReport(cluster.getDataNodes().get(0));
>     Thread.sleep(10 * 1000);
>   }
> {code}
> [~inigoiri] suggested that we propagate the findings and fixes from 
> HDFS-13179 and HDFS-15144 into {{LazyPersistTestCase.java}}. This will 
> eventually reduce the runtime and make the test cases more stable.
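
Editor's note: one common way to shorten fixed waits such as the `Thread.sleep(10 * 1000)` quoted above is to poll for the expected condition and return as soon as it holds. The following is a minimal, self-contained sketch of that idea, not the committed patch; `waitFor` and the condition in `main` are hypothetical. The index-0 problem could presumably be addressed separately by iterating over `cluster.getDataNodes()` instead of calling `get(0)`.

```java
import java.util.function.BooleanSupplier;

class PollingWaitDemo {
  // Polls the condition every intervalMs until it holds or timeoutMs elapses.
  // Returns true if the condition held, false on timeout or interruption.
  static boolean waitFor(BooleanSupplier condition, long intervalMs,
      long timeoutMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() >= deadline) {
        return false;
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        // Preserve the interrupt status for the caller and give up.
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  public static void main(String[] args) {
    long start = System.currentTimeMillis();
    // Hypothetical condition standing in for "the block report was processed".
    boolean ok = waitFor(() -> System.currentTimeMillis() - start >= 50, 10, 2000);
    System.out.println(ok);
  }
}
```

With polling, a test waits only as long as the condition takes to become true, and the timeout bounds the worst case.
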






[jira] [Commented] (HDFS-14668) Support Fuse with Users from multiple Security Realms

2020-02-27 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046761#comment-17046761
 ] 

Hudson commented on HDFS-14668:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18003 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18003/])
HDFS-14668 Support Fuse with Users from multiple Security Realms (#1739) 
(github: rev 57aa048516f5c5fe02441d213b52ce1bbeddf823)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/fuse_connect.c


> Support Fuse with Users from multiple Security Realms
> -
>
> Key: HDFS-14668
> URL: https://issues.apache.org/jira/browse/HDFS-14668
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fuse-dfs
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Sailesh Patel
>Assignee: Istvan Fajth
>Priority: Critical
>  Labels: regression
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
>
> UPDATE:
> See 
> [this|https://issues.apache.org/jira/browse/HDFS-14668?focusedCommentId=16979466=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16979466]
>  comment for the complete description of what is happening here.
> Users from a non-default krb5 domain can't use hadoop-fuse.
> There are 2 realms with a KDC:
> - one realm is for human users (USERS.COM.US)
> - the other is for service principals (SERVICE.COM.US)
> Cross-realm trust is set up.
> In krb5.conf, the default domain is set to SERVICE.COM.US.
> Users within the USERS.COM.US realm are not able to put any files to the 
> FUSE-mounted location.
> The client shows:
>   cp: cannot create regular file ‘/hdfs_mount/tmp/hello_from_fuse.txt’: 
> Input/output error






[jira] [Resolved] (HDFS-14668) Support Fuse with Users from multiple Security Realms

2020-02-27 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-14668.

Fix Version/s: 3.2.2
   3.1.4
   3.3.0
   Resolution: Fixed

Thanks [~pifta]!

> Support Fuse with Users from multiple Security Realms
> -
>
> Key: HDFS-14668
> URL: https://issues.apache.org/jira/browse/HDFS-14668
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fuse-dfs
>Affects Versions: 3.1.0, 3.0.3
>Reporter: Sailesh Patel
>Assignee: Istvan Fajth
>Priority: Critical
>  Labels: regression
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
>
> UPDATE:
> See 
> [this|https://issues.apache.org/jira/browse/HDFS-14668?focusedCommentId=16979466=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16979466]
>  comment for the complete description of what is happening here.
> Users from a non-default krb5 domain can't use hadoop-fuse.
> There are 2 realms with a KDC:
> - one realm is for human users (USERS.COM.US)
> - the other is for service principals (SERVICE.COM.US)
> Cross-realm trust is set up.
> In krb5.conf, the default domain is set to SERVICE.COM.US.
> Users within the USERS.COM.US realm are not able to put any files to the 
> FUSE-mounted location.
> The client shows:
>   cp: cannot create regular file ‘/hdfs_mount/tmp/hello_from_fuse.txt’: 
> Input/output error






[jira] [Resolved] (HDFS-15147) LazyPersistTestCase wait logic is error-prone

2020-02-27 Thread Kihwal Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee resolved HDFS-15147.
---
Hadoop Flags: Reviewed
  Resolution: Fixed

> LazyPersistTestCase wait logic is error-prone
> -
>
> Key: HDFS-15147
> URL: https://issues.apache.org/jira/browse/HDFS-15147
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1
>
> Attachments: HDFS-15147-branch-2.10.001.patch, 
> HDFS-15147-branch-3.2.001.patch, HDFS-15147.001.patch, HDFS-15147.002.patch, 
> HDFS-15147.003.patch
>
>
> {{LazyPersistTestCase}} has some issues that lead to inconsistent results in 
> the test cases:
> * The wait periods for a status change are too long, reaching 10 seconds in 
> some cases.
> * triggerBlockReport() only triggers an FBR from the DN with index 0. This is 
> counterintuitive because the JUnit tests restart the DN assuming that the 
> restarted DN will send an FBR. However, this never happens because the DN 
> gets a new index after the restart.
> {code:java}
>   protected final void triggerBlockReport()
>       throws IOException, InterruptedException {
>     // Trigger block report to NN
>     DataNodeTestUtils.triggerBlockReport(cluster.getDataNodes().get(0));
>     Thread.sleep(10 * 1000);
>   }
> {code}
> [~inigoiri] suggested that we propagate the findings and fixes from 
> HDFS-13179 and HDFS-15144 into {{LazyPersistTestCase.java}}. This will 
> eventually reduce the runtime and make the test cases more stable.






[jira] [Updated] (HDFS-15147) LazyPersistTestCase wait logic is error-prone

2020-02-27 Thread Kihwal Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-15147:
--
Fix Version/s: 2.10.1

> LazyPersistTestCase wait logic is error-prone
> -
>
> Key: HDFS-15147
> URL: https://issues.apache.org/jira/browse/HDFS-15147
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1
>
> Attachments: HDFS-15147-branch-2.10.001.patch, 
> HDFS-15147-branch-3.2.001.patch, HDFS-15147.001.patch, HDFS-15147.002.patch, 
> HDFS-15147.003.patch
>
>
> {{LazyPersistTestCase}} has some issues that lead to inconsistent results in 
> the test cases:
> * The wait periods for a status change are too long, reaching 10 seconds in 
> some cases.
> * triggerBlockReport() only triggers an FBR from the DN with index 0. This is 
> counterintuitive because the JUnit tests restart the DN assuming that the 
> restarted DN will send an FBR. However, this never happens because the DN 
> gets a new index after the restart.
> {code:java}
>   protected final void triggerBlockReport()
>       throws IOException, InterruptedException {
>     // Trigger block report to NN
>     DataNodeTestUtils.triggerBlockReport(cluster.getDataNodes().get(0));
>     Thread.sleep(10 * 1000);
>   }
> {code}
> [~inigoiri] suggested that we propagate the findings and fixes from 
> HDFS-13179 and HDFS-15144 into {{LazyPersistTestCase.java}}. This will 
> eventually reduce the runtime and make the test cases more stable.






[jira] [Updated] (HDFS-15147) LazyPersistTestCase wait logic is error-prone

2020-02-27 Thread Kihwal Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-15147:
--
Fix Version/s: 3.1.4

> LazyPersistTestCase wait logic is error-prone
> -
>
> Key: HDFS-15147
> URL: https://issues.apache.org/jira/browse/HDFS-15147
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Minor
> Fix For: 3.3.0, 3.1.4, 3.2.2
>
> Attachments: HDFS-15147-branch-2.10.001.patch, 
> HDFS-15147-branch-3.2.001.patch, HDFS-15147.001.patch, HDFS-15147.002.patch, 
> HDFS-15147.003.patch
>
>
> {{LazyPersistTestCase}} has some issues that lead to inconsistent results in 
> the test cases:
> * The wait periods for a status change are too long, reaching 10 seconds in 
> some cases.
> * triggerBlockReport() only triggers an FBR from the DN with index 0. This is 
> counterintuitive because the JUnit tests restart the DN assuming that the 
> restarted DN will send an FBR. However, this never happens because the DN 
> gets a new index after the restart.
> {code:java}
>   protected final void triggerBlockReport()
>       throws IOException, InterruptedException {
>     // Trigger block report to NN
>     DataNodeTestUtils.triggerBlockReport(cluster.getDataNodes().get(0));
>     Thread.sleep(10 * 1000);
>   }
> {code}
> [~inigoiri] suggested that we propagate the findings and fixes from 
> HDFS-13179 and HDFS-15144 into {{LazyPersistTestCase.java}}. This will 
> eventually reduce the runtime and make the test cases more stable.





