[jira] [Commented] (HDFS-14612) SlowDiskReport won't update when SlowDisks is always empty in heartbeat

2020-03-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056656#comment-17056656
 ] 

Hadoop QA commented on HDFS-14612:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
45s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m  5s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  9s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}105m 
51s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}170m 33s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.7 Server=19.03.7 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-14612 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12996364/HDFS-14612-008.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 8a8a6f98e468 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / cf9cf83 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_242 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28925/testReport/ |
| Max. process+thread count | 2833 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28925/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> SlowDiskReport won't update when SlowDisks is always empty in heartbeat
> ---
>
> Key: HDFS-14612

[jira] [Commented] (HDFS-15211) EC: File write hangs during close in case of Exception during updatePipeline

2020-03-10 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056652#comment-17056652
 ] 

Ayush Saxena commented on HDFS-15211:
-

Test failures are not related to the patch.
[~brahmareddy] [~surendrasingh] can you help review?

> EC: File write hangs during close in case of Exception during updatePipeline
> 
>
> Key: HDFS-15211
> URL: https://issues.apache.org/jira/browse/HDFS-15211
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.1, 3.3.0, 3.2.1
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Critical
> Attachments: HDFS-15211-01.patch, HDFS-15211-02.patch, 
> HDFS-15211-03.patch, TestToRepro-01.patch, Thread-Dump, Thread-Dump-02
>
>
> EC file write hangs during file close if an exception occurs due to the closure 
> of a slow stream and the number of failed data streamers exceeds the number of 
> parity blocks.
> During close, the stream tries to flush all the healthy streamers, but those 
> streamers never receive a result because of the exception, so they stay stuck.
> Hence the close also gets stuck.
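> A minimal sketch of the guard the close path seems to need (illustrative only; 
> the counts and field names are assumptions, not the attached patch):
> {code:java}
> // Sketch: before waiting on the healthy streamers' result queues in
> // close(), fail fast once more streamers have died than the parity
> // blocks can cover, because the survivors will never produce a result.
> int failed = numDataBlocks + numParityBlocks - healthyStreamers.size();
> if (failed > numParityBlocks) {
>   throw new IOException("Failed streamers (" + failed
>       + ") exceed parity blocks (" + numParityBlocks + "); aborting close");
> }
> {code}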



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15113) Missing IBR when NameNode restart if open processCommand async feature

2020-03-10 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056635#comment-17056635
 ] 

Xiaoqiao He commented on HDFS-15113:


Thanks [~brahmareddy], [~elgoiri] for your reviews.
[~weichiu] Would you have the bandwidth to review or commit it? Thanks.

> Missing IBR when NameNode restart if open processCommand async feature
> --
>
> Key: HDFS-15113
> URL: https://issues.apache.org/jira/browse/HDFS-15113
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Blocker
> Attachments: HDFS-15113.001.patch, HDFS-15113.002.patch, 
> HDFS-15113.003.patch, HDFS-15113.004.patch, HDFS-15113.005.patch
>
>
> Recently, I met a case where the NameNode was missing blocks after a restart, 
> which is related to HDFS-14997.
> a. During NameNode restart, it returns the command `DNA_REGISTER` to a DataNode 
> when it receives an RPC request from that DataNode.
> b. When the DataNode receives the `DNA_REGISTER` command, it runs #reRegister 
> asynchronously.
> {code:java}
>   void reRegister() throws IOException {
> if (shouldRun()) {
>   // re-retrieve namespace info to make sure that, if the NN
>   // was restarted, we still match its version (HDFS-2120)
>   NamespaceInfo nsInfo = retrieveNamespaceInfo();
>   // and re-register
>   register(nsInfo);
>   scheduler.scheduleHeartbeat();
>   // HDFS-9917,Standby NN IBR can be very huge if standby namenode is down
>   // for sometime.
>   if (state == HAServiceState.STANDBY || state == 
> HAServiceState.OBSERVER) {
> ibrManager.clearIBRs();
>   }
> }
>   }
> {code}
> c. As we know, #register triggers a full block report (FBR) immediately.
> d. Because #reRegister runs asynchronously, we cannot guarantee whether the FBR 
> is sent before or after the IBRs are cleared. If the IBRs are cleared first, 
> everything is fine. But if the FBR is sent first and the IBRs are cleared 
> afterwards, any blocks received between those two points are missing until the 
> next FBR.
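> The race in (d) disappears if the IBRs are cleared before the re-registration 
> that triggers the FBR. A minimal sketch of that reordering (illustrative; the 
> actual patch may take a different approach):
> {code:java}
>   void reRegister() throws IOException {
>     if (shouldRun()) {
>       NamespaceInfo nsInfo = retrieveNamespaceInfo();
>       // Clear pending IBRs *before* register(), because register()
>       // triggers the FBR; the FBR can then never be followed by a
>       // clear that drops blocks received in between.
>       if (state == HAServiceState.STANDBY
>           || state == HAServiceState.OBSERVER) {
>         ibrManager.clearIBRs();
>       }
>       register(nsInfo);
>       scheduler.scheduleHeartbeat();
>     }
>   }
> {code}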



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15215) The Timestamp for longest write/read lock held log is wrong

2020-03-10 Thread Toshihiro Suzuki (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056609#comment-17056609
 ] 

Toshihiro Suzuki commented on HDFS-15215:
-

[~elgoiri] Thank you for reviewing this. I just force-pushed a new commit which 
modified the existing test to verify the fix.

> The Timestamp for longest write/read lock held log is wrong
> ---
>
> Key: HDFS-15215
> URL: https://issues.apache.org/jira/browse/HDFS-15215
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
>
> I found that the timestamp in the longest write/read lock held log is wrong in trunk:
> {code}
> 2020-03-10 16:01:26,585 [main] INFO  namenode.FSNamesystem 
> (FSNamesystemLock.java:writeUnlock(281)) - Number of suppressed 
> write-lock reports: 0
>   Longest write-lock held at 1970-01-03 07:07:40,841+0900 for 3ms via 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> ...
> {code}
> Looking at the code, the timestamp comes from System.nanoTime(), which returns 
> the current value of the running Java Virtual Machine's high-resolution time 
> source and can only be used to measure elapsed time:
> https://docs.oracle.com/javase/8/docs/api/java/lang/System.html#nanoTime--
> We should derive the timestamp from System.currentTimeMillis() instead.
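> A small self-contained demonstration (illustrative, not part of the patch): a 
> System.nanoTime()-derived value formatted as a date lands near the 1970 epoch, 
> while System.currentTimeMillis() gives the actual wall-clock time:
> {code:java}
> import java.text.SimpleDateFormat;
> import java.util.Date;
> 
> public class LockTimestampDemo {
>   public static void main(String[] args) {
>     SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss,SSSZ");
>     // Wrong: nanoTime() counts from an arbitrary origin (often JVM/host
>     // start), so interpreting it as an epoch offset yields a 1970-era date.
>     long fromNano = System.nanoTime() / 1_000_000; // ns -> ms, still not epoch
>     // Right: currentTimeMillis() is milliseconds since the Unix epoch.
>     long fromWallClock = System.currentTimeMillis();
>     System.out.println("nanoTime-based:    " + fmt.format(new Date(fromNano)));
>     System.out.println("currentTimeMillis: " + fmt.format(new Date(fromWallClock)));
>   }
> }
> {code}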



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14612) SlowDiskReport won't update when SlowDisks is always empty in heartbeat

2020-03-10 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056599#comment-17056599
 ] 

Haibin Huang commented on HDFS-14612:
-

[~elgoiri] [~hanishakoneru], I have updated the test part. Could you take a 
look, please?

> SlowDiskReport won't update when SlowDisks is always empty in heartbeat
> ---
>
> Key: HDFS-14612
> URL: https://issues.apache.org/jira/browse/HDFS-14612
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-14612-001.patch, HDFS-14612-002.patch, 
> HDFS-14612-003.patch, HDFS-14612-004.patch, HDFS-14612-005.patch, 
> HDFS-14612-006.patch, HDFS-14612-007.patch, HDFS-14612-008.patch, 
> HDFS-14612.patch
>
>
> I found that the SlowDiskReport won't update when slowDisks is always empty in 
> org.apache.hadoop.hdfs.server.blockmanagement.*handleHeartbeat*. This may leave 
> an outdated SlowDiskReport in the NameNode's JMX until the next time slowDisks 
> isn't empty. So I think the method *checkAndUpdateReportIfNecessary()* should 
> be called first whenever the SlowDiskReport is fetched through JMX; this keeps 
> the report in JMX valid. 
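> A minimal sketch of the proposed call order (illustrative; the JMX getter and 
> tracker names are assumptions based on this description, not the patch itself):
> {code:java}
> // Refresh the report before serving it over JMX, so heartbeats that keep
> // reporting an empty slowDisks set cannot leave a stale report behind.
> public String getSlowDisksReport() {
>   slowDiskTracker.checkAndUpdateReportIfNecessary();
>   return slowDiskTracker.getSlowDiskReportAsJsonString();
> }
> {code}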
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14612) SlowDiskReport won't update when SlowDisks is always empty in heartbeat

2020-03-10 Thread Haibin Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibin Huang updated HDFS-14612:

Attachment: HDFS-14612-008.patch

> SlowDiskReport won't update when SlowDisks is always empty in heartbeat
> ---
>
> Key: HDFS-14612
> URL: https://issues.apache.org/jira/browse/HDFS-14612
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-14612-001.patch, HDFS-14612-002.patch, 
> HDFS-14612-003.patch, HDFS-14612-004.patch, HDFS-14612-005.patch, 
> HDFS-14612-006.patch, HDFS-14612-007.patch, HDFS-14612-008.patch, 
> HDFS-14612.patch
>
>
> I found that the SlowDiskReport won't update when slowDisks is always empty in 
> org.apache.hadoop.hdfs.server.blockmanagement.*handleHeartbeat*. This may leave 
> an outdated SlowDiskReport in the NameNode's JMX until the next time slowDisks 
> isn't empty. So I think the method *checkAndUpdateReportIfNecessary()* should 
> be called first whenever the SlowDiskReport is fetched through JMX; this keeps 
> the report in JMX valid. 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-6874) Add GETFILEBLOCKLOCATIONS operation to HttpFS

2020-03-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056493#comment-17056493
 ] 

Hadoop QA commented on HDFS-6874:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
48s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
33s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m  1s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
43s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  0s{color} | {color:orange} hadoop-hdfs-project: The patch generated 3 new + 
541 unchanged - 1 fixed = 544 total (was 542) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m  5s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
15s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}114m 42s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  4m 49s{color} 
| {color:red} hadoop-hdfs-httpfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}211m  4s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.balancer.TestBalancer |
|   | hadoop.fs.http.client.TestHttpFSFWithSWebhdfsFileSystem |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.7 Server=19.03.7 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-6874 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12996340/HDFS-6874.011.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 4c392da7fa47 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| 

[jira] [Commented] (HDFS-15210) EC : File write hanged when DN is shutdown by admin command.

2020-03-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056458#comment-17056458
 ] 

Hadoop QA commented on HDFS-15210:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
56s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
23s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 11m 
28s{color} | {color:red} root in trunk failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  1m  
4s{color} | {color:red} hadoop-hdfs-project in trunk failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
27s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
25s{color} | {color:red} hadoop-hdfs-client in trunk failed. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
20m 30s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
30s{color} | {color:red} hadoop-hdfs in trunk failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
26s{color} | {color:red} hadoop-hdfs-client in trunk failed. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  1m 
31s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
23s{color} | {color:red} hadoop-hdfs-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  1m  
2s{color} | {color:red} hadoop-hdfs-project in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  1m  2s{color} 
| {color:red} hadoop-hdfs-project in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 22s{color} | {color:orange} The patch fails to run checkstyle in 
hadoop-hdfs-project {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
23s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 29 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
1s{color} | {color:red} The patch has 400 line(s) that contain tabs. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 26s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
48s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
27s{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs-client generated 100 
new + 0 unchanged - 0 fixed = 100 total (was 0) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}108m 25s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
57s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
32s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}182m 21s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestMultipleNNPortQOP |
|   | hadoop.hdfs.server.namenode.TestDeadDatanode |
|   | hadoop.hdfs.TestRollingUpgrade |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.7 Server=19.03.7 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15210 |
| 

[jira] [Commented] (HDFS-15214) WebHDFS: Add snapshot counts to Content Summary

2020-03-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056420#comment-17056420
 ] 

Hadoop QA commented on HDFS-15214:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  2m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
6s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
22m 18s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  7m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
36s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 9s{color} | {color:green} hadoop-hdfs-project: The patch generated 0 new + 67 
unchanged - 1 fixed = 67 total (was 68) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m 28s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
13s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 81m 41s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
 2s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}192m 37s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestBlockTokenWrappingQOP |
|   | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
|   | hadoop.hdfs.tools.TestDFSAdminWithHA |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestDatanodeRestart |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestPmemCacheRecovery |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles |
|   | hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer |
|   | hadoop.hdfs.server.balancer.TestBalancerRPCDelay |
|   | hadoop.hdfs.server.balancer.TestBalancer |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithRandomECPolicy |
|   | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
|   | hadoop.hdfs.TestDecommissionWithStripedBackoffMonitor |
|   | 

[jira] [Commented] (HDFS-15216) Wrong Use Case of -showprogress in fsck

2020-03-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056419#comment-17056419
 ] 

Hadoop QA commented on HDFS-15216:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
48s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 21s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 
0 new + 19 unchanged - 1 fixed = 19 total (was 20) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 49s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}131m 53s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
32s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}202m 25s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestAuditLogs |
|   | hadoop.hdfs.server.namenode.TestDecommissioningStatusWithBackoffMonitor |
|   | hadoop.hdfs.server.namenode.TestEditLog |
|   | hadoop.hdfs.server.namenode.TestAddBlockRetry |
|   | hadoop.hdfs.TestMultipleNNPortQOP |
|   | hadoop.hdfs.server.namenode.TestPersistentStoragePolicySatisfier |
|   | hadoop.hdfs.server.namenode.TestNameNodeRpcServerMethods |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.server.namenode.TestLargeDirectoryDelete |
|   | hadoop.hdfs.TestDFSStripedOutputStream |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.7 Server=19.03.7 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15216 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12996314/HDFS-15216.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux c3371219c8a2 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build 

[jira] [Commented] (HDFS-15218) RBF: MountTableRefresherService fail in secure cluster.

2020-03-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056340#comment-17056340
 ] 

Hadoop QA commented on HDFS-15218:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
41s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 12s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  9s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  7m 
37s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 73m 45s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.7 Server=19.03.7 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15218 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12996336/HDFS-15218.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f15cd1865535 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / cf9cf83 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_242 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28922/testReport/ |
| Max. process+thread count | 2779 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28922/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> RBF: MountTableRefresherService fail in 

[jira] [Updated] (HDFS-6874) Add GETFILEBLOCKLOCATIONS operation to HttpFS

2020-03-10 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-6874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-6874:

Attachment: HDFS-6874.011.patch

> Add GETFILEBLOCKLOCATIONS operation to HttpFS
> -
>
> Key: HDFS-6874
> URL: https://issues.apache.org/jira/browse/HDFS-6874
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: httpfs
>Affects Versions: 2.4.1, 2.7.3
>Reporter: Gao Zhong Liang
>Assignee: Weiwei Yang
>Priority: Major
>  Labels: BB2015-05-TBR
> Attachments: HDFS-6874-1.patch, HDFS-6874-branch-2.6.0.patch, 
> HDFS-6874.011.patch, HDFS-6874.02.patch, HDFS-6874.03.patch, 
> HDFS-6874.04.patch, HDFS-6874.05.patch, HDFS-6874.06.patch, 
> HDFS-6874.07.patch, HDFS-6874.08.patch, HDFS-6874.09.patch, 
> HDFS-6874.10.patch, HDFS-6874.patch
>
>
> The GETFILEBLOCKLOCATIONS operation, which WebHDFS already supports, is missing 
> from HttpFS. For a GETFILEBLOCKLOCATIONS request, 
> org.apache.hadoop.fs.http.server.HttpFSServer currently returns BAD_REQUEST:
> ...
>  case GETFILEBLOCKLOCATIONS: {
> response = Response.status(Response.Status.BAD_REQUEST).build();
> break;
>   }
>  
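> A hedged sketch of what the implemented case could look like (the parameter 
> extraction and the toJson helper are assumptions for illustration, not the 
> attached patch):
> {code:java}
> case GETFILEBLOCKLOCATIONS: {
>   // Sketch: delegate to the generic FileSystem API instead of returning
>   // BAD_REQUEST, then render the locations as JSON like WebHDFS does.
>   long offset = params.get(OffsetParam.NAME, OffsetParam.class);
>   long len = params.get(LenParam.NAME, LenParam.class);
>   BlockLocation[] locations =
>       fs.getFileBlockLocations(new Path(path), offset, len);
>   response = Response.ok(toJson(locations), MediaType.APPLICATION_JSON).build();
>   break;
> }
> {code}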



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15198) RBF: In Secure Mode, Router can't refresh other router's mountTableEntries

2020-03-10 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056291#comment-17056291
 ] 

Surendra Singh Lilhore commented on HDFS-15198:
---

Thanks [~zhengchenyu] for the patch.

We can't change the RouterClient.
{code:java}
-this.ugi = UserGroupInformation.getCurrentUser();
+if (UserGroupInformation.isSecurityEnabled()) {
+  this.ugi = UserGroupInformation.getLoginUser();
+} else {
+  this.ugi = UserGroupInformation.getCurrentUser();
+} {code}

It is also used in RouterAdmin, where it should remain getCurrentUser(); please 
refer to DFSAdmin.java.
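Instead of branching inside RouterClient, the refresher service could create its 
client as the login user at the call site, leaving RouterAdmin on the current 
user. A rough sketch (illustrative; adminAddress and conf are assumed to be in 
scope, and this is not the committed change):
{code:java}
// In the router's refresher service (sketch): create the admin client
// under the service's Kerberos login identity so the SASL handshake has a
// TGT, while the RouterAdmin CLI keeps using getCurrentUser().
UserGroupInformation loginUser = UserGroupInformation.getLoginUser();
RouterClient client = loginUser.doAs(
    (PrivilegedExceptionAction<RouterClient>) () ->
        new RouterClient(adminAddress, conf));
{code}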

> RBF: In Secure Mode, Router can't refresh other router's mountTableEntries
> --
>
> Key: HDFS-15198
> URL: https://issues.apache.org/jira/browse/HDFS-15198
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Reporter: zhengchenyu
>Assignee: zhengchenyu
>Priority: Major
> Attachments: HDFS-15198.001.patch, HDFS-15198.002.patch, 
> HDFS-15198.003.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> HDFS-13443 added immediate mount table cache updates: the router that receives 
> the change updates its own mount table cache immediately, then updates the 
> other routers via the refreshMountTableEntries RPC. But in secure mode it can't 
> refresh the other routers' caches. The initiating router's log shows errors 
> like this:
> {code}
> 2020-02-27 22:59:07,212 WARN org.apache.hadoop.ipc.Client: Exception 
> encountered while connecting to the server : 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> 2020-02-27 22:59:07,213 ERROR 
> org.apache.hadoop.hdfs.server.federation.router.MountTableRefresherThread: 
> Failed to refresh mount table entries cache at router $host:8111
> java.io.IOException: DestHost:destPort host:8111 , LocalHost:localPort 
> $host/$ip:0. Failed on local exception: java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> at 
> org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocolTranslatorPB.refreshMountTableEntries(RouterAdminProtocolTranslatorPB.java:288)
> at 
> org.apache.hadoop.hdfs.server.federation.router.MountTableRefresherThread.run(MountTableRefresherThread.java:65)
> 2020-02-27 22:59:07,214 INFO 
> org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver: Added 
> new mount point /test_11 to resolver
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15218) RBF: MountTableRefresherService fail in secure cluster.

2020-03-10 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056285#comment-17056285
 ] 

Surendra Singh Lilhore commented on HDFS-15218:
---

[~elgoiri], yes, it is the same.

> RBF: MountTableRefresherService fail in secure cluster.
> ---
>
> Key: HDFS-15218
> URL: https://issues.apache.org/jira/browse/HDFS-15218
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-15218.001.patch
>
>
> {code:java}
> 2020-03-09 12:43:50,082 | ERROR | MountTableRefresh_linux-133:25020 | Failed 
> to refresh mount table entries cache at router X:25020 | 
> MountTableRefresherThread.java:69
> java.io.IOException: DestHost:destPort X:25020 , LocalHost:localPort 
> XXX/XXX:0. Failed on local exception: java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> at 
> org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocolTranslatorPB.refreshMountTableEntries(RouterAdminProtocolTranslatorPB.java:284)
> at 
> org.apache.hadoop.hdfs.server.federation.router.MountTableRefresherThread.run(MountTableRefresherThread.java:65)
>  {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15218) RBF: MountTableRefresherService fail in secure cluster.

2020-03-10 Thread Surendra Singh Lilhore (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-15218:
--
Status: Patch Available  (was: Open)

> RBF: MountTableRefresherService fail in secure cluster.
> ---
>
> Key: HDFS-15218
> URL: https://issues.apache.org/jira/browse/HDFS-15218
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-15218.001.patch
>
>
> {code:java}
> 2020-03-09 12:43:50,082 | ERROR | MountTableRefresh_linux-133:25020 | Failed 
> to refresh mount table entries cache at router X:25020 | 
> MountTableRefresherThread.java:69
> java.io.IOException: DestHost:destPort X:25020 , LocalHost:localPort 
> XXX/XXX:0. Failed on local exception: java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> at 
> org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocolTranslatorPB.refreshMountTableEntries(RouterAdminProtocolTranslatorPB.java:284)
> at 
> org.apache.hadoop.hdfs.server.federation.router.MountTableRefresherThread.run(MountTableRefresherThread.java:65)
>  {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15218) RBF: MountTableRefresherService fail in secure cluster.

2020-03-10 Thread Surendra Singh Lilhore (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-15218:
--
Attachment: HDFS-15218.001.patch

> RBF: MountTableRefresherService fail in secure cluster.
> ---
>
> Key: HDFS-15218
> URL: https://issues.apache.org/jira/browse/HDFS-15218
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-15218.001.patch
>
>
> {code:java}
> 2020-03-09 12:43:50,082 | ERROR | MountTableRefresh_linux-133:25020 | Failed 
> to refresh mount table entries cache at router X:25020 | 
> MountTableRefresherThread.java:69
> java.io.IOException: DestHost:destPort X:25020 , LocalHost:localPort 
> XXX/XXX:0. Failed on local exception: java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> at 
> org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocolTranslatorPB.refreshMountTableEntries(RouterAdminProtocolTranslatorPB.java:284)
> at 
> org.apache.hadoop.hdfs.server.federation.router.MountTableRefresherThread.run(MountTableRefresherThread.java:65)
>  {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15210) EC : File write hanged when DN is shutdown by admin command.

2020-03-10 Thread Surendra Singh Lilhore (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-15210:
--
Status: Patch Available  (was: Open)

> EC : File write hanged when DN is shutdown by admin command.
> 
>
> Key: HDFS-15210
> URL: https://issues.apache.org/jira/browse/HDFS-15210
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ec
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-15210.001.patch, dump.txt
>
>
> EC Blocks : blk_-9223372036854291632_10668910, 
> blk_-9223372036854291631_10668910, blk_-9223372036854291630_10668910, 
> blk_-9223372036854291629_10668910, blk_-9223372036854291628_10668910
>  
> The DataNodes hosting two of the blocks were restarted: 
> blk_-9223372036854291630_10668910 & blk_-9223372036854291632_10668910
> {code:java}
> 2020-03-03 18:12:17,074 DEBUG hdfs.DataStreamer: DFSClient seqno: -2 reply: 
> OOB_RESTART downstreamAckTimeNanos: 0 flag: 8
> 2020-03-03 18:13:39,469 DEBUG hdfs.DataStreamer: DFSClient seqno: -2 reply: 
> OOB_RESTART downstreamAckTimeNanos: 0 flag: 8 {code}
>  
> The streamers for the restarted blocks are stuck in the stack trace below:
> {code}
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) 
> at 
> org.apache.hadoop.hdfs.DFSStripedOutputStream$MultipleBlockingQueue.take(DFSStripedOutputStream.java:110)
>  at 
> org.apache.hadoop.hdfs.StripedDataStreamer.setupPipelineInternal(StripedDataStreamer.java:140)
>  at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1540)
>  at 
> org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1276)
>  at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:669) at 
> org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:46)
> {code}
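> One way to avoid the permanent hang above would be to bound the wait on the 
> coordinator's queue instead of blocking forever. A sketch only, with an 
> illustrative timeout and assumed names (not the attached patch):
> {code:java}
> // Sketch: poll with a timeout instead of LinkedBlockingQueue#take(), so a
> // streamer whose pipeline restarted without posting a result surfaces an
> // error rather than blocking pipeline recovery indefinitely.
> LocatedBlock updated = newBlockQueue.poll(60, TimeUnit.SECONDS);
> if (updated == null) {
>   throw new IOException(
>       "Timed out waiting for the updated block from the coordinator");
> }
> {code}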



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15210) EC : File write hanged when DN is shutdown by admin command.

2020-03-10 Thread Surendra Singh Lilhore (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-15210:
--
Attachment: HDFS-15210.001.patch

> EC : File write hanged when DN is shutdown by admin command.
> 
>
> Key: HDFS-15210
> URL: https://issues.apache.org/jira/browse/HDFS-15210
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ec
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Major
> Attachments: HDFS-15210.001.patch, dump.txt
>
>
> EC Blocks : blk_-9223372036854291632_10668910, 
> blk_-9223372036854291631_10668910, blk_-9223372036854291630_10668910, 
> blk_-9223372036854291629_10668910, blk_-9223372036854291628_10668910
>  
> The DataNodes hosting two of the blocks were restarted: 
> blk_-9223372036854291630_10668910 & blk_-9223372036854291632_10668910
> {code:java}
> 2020-03-03 18:12:17,074 DEBUG hdfs.DataStreamer: DFSClient seqno: -2 reply: 
> OOB_RESTART downstreamAckTimeNanos: 0 flag: 8
> 2020-03-03 18:13:39,469 DEBUG hdfs.DataStreamer: DFSClient seqno: -2 reply: 
> OOB_RESTART downstreamAckTimeNanos: 0 flag: 8 {code}
>  
> The streamers for the restarted blocks are stuck in the stack trace below:
> {code}
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) 
> at 
> org.apache.hadoop.hdfs.DFSStripedOutputStream$MultipleBlockingQueue.take(DFSStripedOutputStream.java:110)
>  at 
> org.apache.hadoop.hdfs.StripedDataStreamer.setupPipelineInternal(StripedDataStreamer.java:140)
>  at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1540)
>  at 
> org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1276)
>  at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:669) at 
> org.apache.hadoop.hdfs.StripedDataStreamer.run(StripedDataStreamer.java:46)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15135) EC : ArrayIndexOutOfBoundsException in BlockRecoveryWorker#RecoveryTaskStriped.

2020-03-10 Thread Surendra Singh Lilhore (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-15135:
--
Fix Version/s: 3.2.2
   3.3.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Thanks [~Sushma_28] for the contribution.

> EC : ArrayIndexOutOfBoundsException in 
> BlockRecoveryWorker#RecoveryTaskStriped.
> ---
>
> Key: HDFS-15135
> URL: https://issues.apache.org/jira/browse/HDFS-15135
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Surendra Singh Lilhore
>Assignee: Ravuri Sushma sree
>Priority: Major
> Fix For: 3.3.0, 3.2.2
>
> Attachments: HDFS-15135-branch-3.2.001.patch, 
> HDFS-15135-branch-3.2.002.patch, HDFS-15135.001.patch, HDFS-15135.002.patch, 
> HDFS-15135.003.patch, HDFS-15135.004.patch, HDFS-15135.005.patch
>
>
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: 8
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$RecoveryTaskStriped.recover(BlockRecoveryWorker.java:464)
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1.run(BlockRecoveryWorker.java:602)
>at java.lang.Thread.run(Thread.java:745) {noformat}
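> A hedged sketch of the kind of bounds check that would avoid the exception 
> (variable names are illustrative; not necessarily the committed fix):
> {code:java}
> // Sketch: a striped recovery task must not index past the replicas that
> // were actually reported for the block group; skip out-of-range indices
> // instead of throwing ArrayIndexOutOfBoundsException.
> if (blockIndex < 0 || blockIndex >= totalBlkNum) {
>   LOG.warn("Ignoring replica with invalid block index {} (total blocks {})",
>       blockIndex, totalBlkNum);
>   continue;
> }
> {code}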



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15214) WebHDFS: Add snapshot counts to Content Summary

2020-03-10 Thread hemanthboyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hemanthboyina updated HDFS-15214:
-
Attachment: HDFS-15214.002.patch

> WebHDFS: Add snapshot counts to Content Summary
> ---
>
> Key: HDFS-15214
> URL: https://issues.apache.org/jira/browse/HDFS-15214
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HDFS-15214.001.patch, HDFS-15214.002.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15216) Wrong Use Case of -showprogress in fsck

2020-03-10 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056184#comment-17056184
 ] 

Stephen O'Donnell commented on HDFS-15216:
--

Hi [~Sushma_28] - thanks for raising this issue. I made the changes in 
HDFS-7175 to deprecate the -showProgress switch as we were seeing a lot of 
timeouts when progress was not enabled. I completely missed updating the help 
text.

> Wrong Use Case of -showprogress in fsck 
> 
>
> Key: HDFS-15216
> URL: https://issues.apache.org/jira/browse/HDFS-15216
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15216.001.patch
>
>
> *-showprogress* is deprecated and progress is now shown by default, but fsck 
> --help still shows the outdated usage for it:
>  
> Usage: hdfs fsck  [-list-corruptfileblocks | [-move | -delete | 
> -openforwrite] [-files [-blocks [-locations | -racks | -replicaDetails | 
> -upgradedomains [-includeSnapshots] [-showprogress] [-storagepolicies] 
> [-maintenance] [-blockId ]
>   start checking from this path
>  -move move corrupted files to /lost+found
>  -delete delete corrupted files
>  -files print out files being checked
>  -openforwrite print out files opened for write
>  -includeSnapshots include snapshot data if the given path indicates a 
> snapshottable directory or there are snapshottable directories under it
>  -list-corruptfileblocks print out list of missing blocks and files they 
> belong to
>  -files -blocks print out block report
>  -files -blocks -locations print out locations for every block
>  -files -blocks -racks print out network topology for data-node locations
>  -files -blocks -replicaDetails print out each replica details
>  -files -blocks -upgradedomains print out upgrade domains for every block
>  -storagepolicies print out storage policy summary for the blocks
>  -maintenance print out maintenance state node details
>  *-showprogress show progress in output. Default is OFF (no progress)*
>  -blockId print out which file this blockId belongs to, locations (nodes, 
> racks) of this block, and other diagnostics info (under replicated, corrupted 
> or not, etc)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15216) Wrong Use Case of -showprogress in fsck

2020-03-10 Thread Stephen O'Donnell (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen O'Donnell updated HDFS-15216:
-
 Target Version/s: 3.3.0
Affects Version/s: 3.3.0
   Status: Patch Available  (was: Open)

> Wrong Use Case of -showprogress in fsck 
> 
>
> Key: HDFS-15216
> URL: https://issues.apache.org/jira/browse/HDFS-15216
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15216.001.patch
>
>
> *-showprogress* is deprecated and progress is now shown by default, but fsck 
> --help still shows an incorrect usage for it:
>  
> Usage: hdfs fsck <path> [-list-corruptfileblocks | [-move | -delete | 
> -openforwrite] [-files [-blocks [-locations | -racks | -replicaDetails | 
> -upgradedomains]]]] [-includeSnapshots] [-showprogress] [-storagepolicies] 
> [-maintenance] [-blockId <blk_Id>]
>  <path>  start checking from this path
>  -move move corrupted files to /lost+found
>  -delete delete corrupted files
>  -files print out files being checked
>  -openforwrite print out files opened for write
>  -includeSnapshots include snapshot data if the given path indicates a 
> snapshottable directory or there are snapshottable directories under it
>  -list-corruptfileblocks print out list of missing blocks and files they 
> belong to
>  -files -blocks print out block report
>  -files -blocks -locations print out locations for every block
>  -files -blocks -racks print out network topology for data-node locations
>  -files -blocks -replicaDetails print out each replica details
>  -files -blocks -upgradedomains print out upgrade domains for every block
>  -storagepolicies print out storage policy summary for the blocks
>  -maintenance print out maintenance state node details
>  *-showprogress show progress in output. Default is OFF (no progress)*
>  -blockId print out which file this blockId belongs to, locations (nodes, 
> racks) of this block, and other diagnostics info (under replicated, corrupted 
> or not, etc)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15215) The Timestamp for longest write/read lock held log is wrong

2020-03-10 Thread Íñigo Goiri (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056173#comment-17056173
 ] 

Íñigo Goiri commented on HDFS-15215:


Can we add a test that checks that the date is around the current time?

> The Timestamp for longest write/read lock held log is wrong
> ---
>
> Key: HDFS-15215
> URL: https://issues.apache.org/jira/browse/HDFS-15215
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
>
> I found the Timestamp for longest write/read lock held log is wrong in trunk:
> {code}
> 2020-03-10 16:01:26,585 [main] INFO  namenode.FSNamesystem 
> (FSNamesystemLock.java:writeUnlock(281)) - Number of suppressed 
> write-lock reports: 0
>   Longest write-lock held at 1970-01-03 07:07:40,841+0900 for 3ms via 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> ...
> {code}
> Looking at the code, the timestamp comes from System.nanoTime(), which returns 
> the current value of the running Java Virtual Machine's high-resolution time 
> source; that method can only be used to measure elapsed time, not wall-clock 
> time:
> https://docs.oracle.com/javase/8/docs/api/java/lang/System.html#nanoTime--
> We should derive the timestamp from System.currentTimeMillis() instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15113) Missing IBR when NameNode restart if open processCommand async feature

2020-03-10 Thread Íñigo Goiri (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056167#comment-17056167
 ] 

Íñigo Goiri commented on HDFS-15113:


+1 on  [^HDFS-15113.005.patch].

> Missing IBR when NameNode restart if open processCommand async feature
> --
>
> Key: HDFS-15113
> URL: https://issues.apache.org/jira/browse/HDFS-15113
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Blocker
> Attachments: HDFS-15113.001.patch, HDFS-15113.002.patch, 
> HDFS-15113.003.patch, HDFS-15113.004.patch, HDFS-15113.005.patch
>
>
> Recently, I hit a case where the NameNode was missing blocks after a restart, 
> which is related to HDFS-14997.
> a. During NameNode restart, the NameNode returns the `DNA_REGISTER` command to 
> a DataNode when it receives certain RPC requests from that DataNode.
> b. When the DataNode receives the `DNA_REGISTER` command, it runs #reRegister 
> asynchronously.
> {code:java}
>   void reRegister() throws IOException {
> if (shouldRun()) {
>   // re-retrieve namespace info to make sure that, if the NN
>   // was restarted, we still match its version (HDFS-2120)
>   NamespaceInfo nsInfo = retrieveNamespaceInfo();
>   // and re-register
>   register(nsInfo);
>   scheduler.scheduleHeartbeat();
>   // HDFS-9917,Standby NN IBR can be very huge if standby namenode is down
>   // for sometime.
>   if (state == HAServiceState.STANDBY || state == 
> HAServiceState.OBSERVER) {
> ibrManager.clearIBRs();
>   }
> }
>   }
> {code}
> c. As we know, #register triggers a full block report (FBR) immediately.
> d. Because #reRegister runs asynchronously, we cannot guarantee the order 
> between sending the FBR and clearing the IBRs. If the IBRs are cleared first, 
> everything is OK. But if the FBR is sent first and the IBRs are cleared 
> afterwards, blocks received between those two points are missed until the next 
> FBR.
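To make the race concrete, here is a minimal sketch of an ordering that would avoid it, reusing the identifiers from the snippet above. It is illustrative only, not the committed HDFS-15113 patch:

{code:java}
// Illustrative reordering only -- not the actual HDFS-15113 fix.
void reRegister() throws IOException {
  if (shouldRun()) {
    NamespaceInfo nsInfo = retrieveNamespaceInfo();
    // Clear stale IBRs *before* register() triggers the full block report,
    // so blocks received after this point are never discarded.
    if (state == HAServiceState.STANDBY || state == HAServiceState.OBSERVER) {
      ibrManager.clearIBRs();
    }
    register(nsInfo);              // triggers the FBR immediately
    scheduler.scheduleHeartbeat();
  }
}
{code}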



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15218) RBF: MountTableRefresherService fail in secure cluster.

2020-03-10 Thread Íñigo Goiri (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056162#comment-17056162
 ] 

Íñigo Goiri commented on HDFS-15218:


Is this the same as HDFS-15198?

> RBF: MountTableRefresherService fail in secure cluster.
> ---
>
> Key: HDFS-15218
> URL: https://issues.apache.org/jira/browse/HDFS-15218
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Major
>
> {code:java}
> 2020-03-09 12:43:50,082 | ERROR | MountTableRefresh_linux-133:25020 | Failed 
> to refresh mount table entries cache at router X:25020 | 
> MountTableRefresherThread.java:69
> java.io.IOException: DestHost:destPort X:25020 , LocalHost:localPort 
> XXX/XXX:0. Failed on local exception: java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> at 
> org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocolTranslatorPB.refreshMountTableEntries(RouterAdminProtocolTranslatorPB.java:284)
> at 
> org.apache.hadoop.hdfs.server.federation.router.MountTableRefresherThread.run(MountTableRefresherThread.java:65)
>  {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15218) RBF: MountTableRefresherService fail in secure cluster.

2020-03-10 Thread Íñigo Goiri (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-15218:
---
Summary: RBF: MountTableRefresherService fail in secure cluster.  (was: RBF 
: MountTableRefresherService fail in secure cluster.)

> RBF: MountTableRefresherService fail in secure cluster.
> ---
>
> Key: HDFS-15218
> URL: https://issues.apache.org/jira/browse/HDFS-15218
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rbf
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Major
>
> {code:java}
> 2020-03-09 12:43:50,082 | ERROR | MountTableRefresh_linux-133:25020 | Failed 
> to refresh mount table entries cache at router X:25020 | 
> MountTableRefresherThread.java:69
> java.io.IOException: DestHost:destPort X:25020 , LocalHost:localPort 
> XXX/XXX:0. Failed on local exception: java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
> at 
> org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocolTranslatorPB.refreshMountTableEntries(RouterAdminProtocolTranslatorPB.java:284)
> at 
> org.apache.hadoop.hdfs.server.federation.router.MountTableRefresherThread.run(MountTableRefresherThread.java:65)
>  {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14612) SlowDiskReport won't update when SlowDisks is always empty in heartbeat

2020-03-10 Thread Hanisha Koneru (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056144#comment-17056144
 ] 

Hanisha Koneru commented on HDFS-14612:
---

Overall, the patch LGTM.
The current placement of the checkAndUpdateReportIfNecessary call seems correct.

> SlowDiskReport won't update when SlowDisks is always empty in heartbeat
> ---
>
> Key: HDFS-14612
> URL: https://issues.apache.org/jira/browse/HDFS-14612
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-14612-001.patch, HDFS-14612-002.patch, 
> HDFS-14612-003.patch, HDFS-14612-004.patch, HDFS-14612-005.patch, 
> HDFS-14612-006.patch, HDFS-14612-007.patch, HDFS-14612.patch
>
>
> I found that the SlowDiskReport won't update when slowDisks is always empty in 
> org.apache.hadoop.hdfs.server.blockmanagement.*handleHeartbeat*. This can leave 
> an outdated SlowDiskReport in the NameNode's JMX until the next time slowDisks 
> is non-empty. So I think *checkAndUpdateReportIfNecessary()* should be called 
> first whenever we fetch the SlowDiskReport information over JMX; this keeps the 
> SlowDiskReport exposed via JMX always valid. 
>  
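A minimal sketch of the call order proposed above; only checkAndUpdateReportIfNecessary() is named in the issue, while the getter and field shown here are hypothetical:

{code:java}
// Hypothetical JMX getter illustrating the proposed call order.
public String getSlowDisksReport() {
  // Age out entries past the report validity window first, even if no
  // heartbeat has carried a non-empty slowDisks recently.
  checkAndUpdateReportIfNecessary();
  return slowDisksReportJson;   // hypothetical cached JSON representation
}
{code}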



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-03-10 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056110#comment-17056110
 ] 

Stephen O'Donnell commented on HDFS-15160:
--

Thanks for the comments. I will check 
DataNode#transferReplicaForPipelineRecovery and push a new patch soon.

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> ---
>
> Key: HDFS-15160
> URL: https://issues.apache.org/jira/browse/HDFS-15160
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch
>
>
> Now that we have HDFS-15150, we can start to move some DN operations to use 
> the read lock rather than the write lock to improve concurrency. The first 
> step is to make the changes to ReplicaMap, as many other methods make calls to 
> it.
> This Jira switches read operations against the volume map to use the readLock 
> rather than the write lock.
> Additionally, some methods make a call to replicaMap.replicas() (e.g. 
> getBlockReports, getFinalizedBlocks, deepCopyReplica) and only use the result 
> in a read-only fashion, so they can also be switched to using a readLock.
> Next is the directory scanner and disk balancer, which only require a read 
> lock.
> Finally (for this Jira) are various "low hanging fruit" items in BlockSender 
> and FsDatasetImpl where it is fairly obvious they only need a read lock.
> For now, I have avoided changing anything that looks too risky, as I think 
> it's better to do any larger refactoring or risky changes each in their own 
> Jira.
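As a minimal sketch of this locking pattern, assuming a fair ReentrantReadWriteLock guards the volume map (the real FsDatasetImpl code differs, and lookupAndCopy/insert are hypothetical helpers). The fair lock keeps a steady stream of readers from starving writers:

{code:java}
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class VolumeMapLocking {
  private final ReadWriteLock datasetLock = new ReentrantReadWriteLock(true); // fair

  Object deepCopyReplica(String bpid) {
    datasetLock.readLock().lock();   // readers run concurrently with each other
    try {
      return lookupAndCopy(bpid);    // hypothetical read-only helper
    } finally {
      datasetLock.readLock().unlock();
    }
  }

  void addReplica(String bpid, Object replica) {
    datasetLock.writeLock().lock();  // exclusive: mutates the volume map
    try {
      insert(bpid, replica);         // hypothetical mutator
    } finally {
      datasetLock.writeLock().unlock();
    }
  }

  private Object lookupAndCopy(String bpid) { return new Object(); }
  private void insert(String bpid, Object replica) { }
}
{code}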



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15216) Wrong Use Case of -showprogress in fsck

2020-03-10 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15216:
--
Attachment: HDFS-15216.001.patch

> Wrong Use Case of -showprogress in fsck 
> 
>
> Key: HDFS-15216
> URL: https://issues.apache.org/jira/browse/HDFS-15216
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15216.001.patch
>
>
> *-showprogress* is deprecated and progress is now shown by default, but fsck 
> --help still shows an incorrect usage for it:
>  
> Usage: hdfs fsck <path> [-list-corruptfileblocks | [-move | -delete | 
> -openforwrite] [-files [-blocks [-locations | -racks | -replicaDetails | 
> -upgradedomains]]]] [-includeSnapshots] [-showprogress] [-storagepolicies] 
> [-maintenance] [-blockId <blk_Id>]
>  <path>  start checking from this path
>  -move move corrupted files to /lost+found
>  -delete delete corrupted files
>  -files print out files being checked
>  -openforwrite print out files opened for write
>  -includeSnapshots include snapshot data if the given path indicates a 
> snapshottable directory or there are snapshottable directories under it
>  -list-corruptfileblocks print out list of missing blocks and files they 
> belong to
>  -files -blocks print out block report
>  -files -blocks -locations print out locations for every block
>  -files -blocks -racks print out network topology for data-node locations
>  -files -blocks -replicaDetails print out each replica details
>  -files -blocks -upgradedomains print out upgrade domains for every block
>  -storagepolicies print out storage policy summary for the blocks
>  -maintenance print out maintenance state node details
>  *-showprogress show progress in output. Default is OFF (no progress)*
>  -blockId print out which file this blockId belongs to, locations (nodes, 
> racks) of this block, and other diagnostics info (under replicated, corrupted 
> or not, etc)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15193) Improving the error message for missing `dfs.namenode.rpc-address.$NAMESERVICE`

2020-03-10 Thread Ctest (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056085#comment-17056085
 ] 

Ctest commented on HDFS-15193:
--

[~ayushtkn] 

Thank you for your help. I just uploaded a new patch (004).

> Improving the error message for missing 
> `dfs.namenode.rpc-address.$NAMESERVICE`
> ---
>
> Key: HDFS-15193
> URL: https://issues.apache.org/jira/browse/HDFS-15193
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Ctest
>Assignee: Ctest
>Priority: Minor
> Attachments: HDFS-15193-001.patch, HDFS-15193-002.patch, 
> HDFS-15193-003.patch, HDFS-15193-004.patch
>
>
> I set `dfs.nameservices` with the value of one name service (let’s say `ns1` 
> for simplicity) and forgot to set `dfs.namenode.rpc-address.ns1`. 
> Then I got an error message:
> {code:java}
> [ERROR] testNonFederation(org.apache.hadoop.hdfs.tools.TestGetConf)  Time 
> elapsed: 0.195 s <<< ERROR!
> java.io.IOException: Incorrect configuration: namenode address 
> dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured.
> at 
> org.apache.hadoop.hdfs.DFSUtil.getNNServiceRpcAddressesForCluster(DFSUtil.java:629)
> at 
> org.apache.hadoop.hdfs.tools.TestGetConf.getAddressListFromConf(TestGetConf.java:132)
> at 
> org.apache.hadoop.hdfs.tools.TestGetConf.verifyAddresses(TestGetConf.java:234)
> at 
> org.apache.hadoop.hdfs.tools.TestGetConf.testNonFederation(TestGetConf.java:308)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
> at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.lang.Thread.run(Thread.java:748)
> {code}
>  
> The error message above told me that `dfs.namenode.rpc-address` or 
> `dfs.namenode.servicerpc-address` should be set.
> However, the actual reason for the error is that 
> `dfs.namenode.rpc-address.ns1` or `dfs.namenode.servicerpc-address.ns1` is 
> not set.
>  
> *How to improve* 
> I wrote a patch to improve the error message. Here is the error message 
> produced with the patch:
> {code:java}
> [ERROR] testNonFederation(org.apache.hadoop.hdfs.tools.TestGetConf)  Time 
> elapsed: 0.195 s <<< ERROR!
> java.io.IOException: Incorrect configuration: namenode address 
> dfs.namenode.servicerpc-address[.$NAMESERVICE] or 
> dfs.namenode.rpc-address[.$NAMESERVICE] is not configured.
> {code}
> Then the users can immediately know that they should set 
> `dfs.namenode.rpc-address.ns1` or `dfs.namenode.servicerpc-address.ns1` 
> according to the error message.
>  
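A minimal sketch of how such a message could be assembled; the key names come from the description above, while the helper itself is hypothetical and not the actual patch code:

{code:java}
// Hypothetical helper, not the actual HDFS-15193 patch code.
class NNAddressErrors {
  // Append the [.$NAMESERVICE] hint when federation (dfs.nameservices) is
  // configured, so users know the keys must be suffixed per nameservice.
  static String missingAddressMessage(boolean federationConfigured) {
    String suffix = federationConfigured ? "[.$NAMESERVICE]" : "";
    return "Incorrect configuration: namenode address "
        + "dfs.namenode.servicerpc-address" + suffix + " or "
        + "dfs.namenode.rpc-address" + suffix + " is not configured.";
  }
}
{code}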



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-03-10 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056054#comment-17056054
 ] 

zhuqi edited comment on HDFS-15160 at 3/10/20, 3:23 PM:


Thanks [~sodonnell] for your great work.

LGTM. I agree with [~hexiaoqiao] that DataNode#transferReplicaForPipelineRecovery 
should change data.acquireDatasetLock() to data.acquireDatasetReadLock() to get 
replica information.


was (Author: zhuqi):
Thanks [~sodonnell] for your great work.

LGTM. I agree with [~hexiaoqiao] that DataNode#transferReplicaForPipelineRecovery 
should change data.acquireDatasetLock() to data.acquireDatasetReadLock() to get 
replica information.

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> ---
>
> Key: HDFS-15160
> URL: https://issues.apache.org/jira/browse/HDFS-15160
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch
>
>
> Now that we have HDFS-15150, we can start to move some DN operations to use 
> the read lock rather than the write lock to improve concurrency. The first 
> step is to make the changes to ReplicaMap, as many other methods make calls to 
> it.
> This Jira switches read operations against the volume map to use the readLock 
> rather than the write lock.
> Additionally, some methods make a call to replicaMap.replicas() (e.g. 
> getBlockReports, getFinalizedBlocks, deepCopyReplica) and only use the result 
> in a read-only fashion, so they can also be switched to using a readLock.
> Next is the directory scanner and disk balancer, which only require a read 
> lock.
> Finally (for this Jira) are various "low hanging fruit" items in BlockSender 
> and FsDatasetImpl where it is fairly obvious they only need a read lock.
> For now, I have avoided changing anything that looks too risky, as I think 
> it's better to do any larger refactoring or risky changes each in their own 
> Jira.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-03-10 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056054#comment-17056054
 ] 

zhuqi commented on HDFS-15160:
--

Thanks [~sodonnell] for your great work.

LGTM. I agree with [~hexiaoqiao] that DataNode#transferReplicaForPipelineRecovery 
should change data.acquireDatasetLock() to data.acquireDatasetReadLock() to get 
replica information.

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> ---
>
> Key: HDFS-15160
> URL: https://issues.apache.org/jira/browse/HDFS-15160
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch
>
>
> Now that we have HDFS-15150, we can start to move some DN operations to use 
> the read lock rather than the write lock to improve concurrency. The first 
> step is to make the changes to ReplicaMap, as many other methods make calls to 
> it.
> This Jira switches read operations against the volume map to use the readLock 
> rather than the write lock.
> Additionally, some methods make a call to replicaMap.replicas() (e.g. 
> getBlockReports, getFinalizedBlocks, deepCopyReplica) and only use the result 
> in a read-only fashion, so they can also be switched to using a readLock.
> Next is the directory scanner and disk balancer, which only require a read 
> lock.
> Finally (for this Jira) are various "low hanging fruit" items in BlockSender 
> and FsDatasetImpl where it is fairly obvious they only need a read lock.
> For now, I have avoided changing anything that looks too risky, as I think 
> it's better to do any larger refactoring or risky changes each in their own 
> Jira.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-03-10 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056011#comment-17056011
 ] 

zhuqi commented on HDFS-15180:
--

Hi [~sodonnell]

Yeah, your comment is exactly what I meant.

We could add the top lock-hold times to the DataNode metrics, so that they can 
help inform future performance-improvement decisions.

 

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: image-2020-03-10-17-22-57-391.png, 
> image-2020-03-10-17-31-58-830.png, image-2020-03-10-17-34-26-368.png
>
>
> The FsDatasetImpl datasetLock is currently heavy when there are many 
> namespaces in a big cluster. We could split the FsDatasetImpl datasetLock by 
> block pool.
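As an illustration of the proposal's shape only (not the HDFS-15180 patch), a per-block-pool lock table could look like the sketch below; the class name echoes the BlockPoolLockManager idea mentioned elsewhere in this thread:

{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: one lock per block pool instead of one dataset-wide
// lock, so operations for different namespaces do not contend.
class BlockPoolLockManager {
  private final ConcurrentHashMap<String, ReadWriteLock> locks =
      new ConcurrentHashMap<>();

  ReadWriteLock lockFor(String bpid) {
    // Create a fair RW lock per block pool id on first use.
    return locks.computeIfAbsent(bpid, id -> new ReentrantReadWriteLock(true));
  }
}
{code}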



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15218) RBF : MountTableRefresherService fail in secure cluster.

2020-03-10 Thread Surendra Singh Lilhore (Jira)
Surendra Singh Lilhore created HDFS-15218:
-

 Summary: RBF : MountTableRefresherService fail in secure cluster.
 Key: HDFS-15218
 URL: https://issues.apache.org/jira/browse/HDFS-15218
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: rbf
Affects Versions: 3.1.1
Reporter: Surendra Singh Lilhore
Assignee: Surendra Singh Lilhore


{code:java}
2020-03-09 12:43:50,082 | ERROR | MountTableRefresh_linux-133:25020 | Failed to 
refresh mount table entries cache at router X:25020 | 
MountTableRefresherThread.java:69
java.io.IOException: DestHost:destPort X:25020 , LocalHost:localPort 
XXX/XXX:0. Failed on local exception: java.io.IOException: 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]
at 
org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocolTranslatorPB.refreshMountTableEntries(RouterAdminProtocolTranslatorPB.java:284)
at 
org.apache.hadoop.hdfs.server.federation.router.MountTableRefresherThread.run(MountTableRefresherThread.java:65)
 {code}
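One common cause of this GSS failure in a long-lived refresher thread is a TGT that expired without a relogin. A minimal sketch of that mitigation, using Hadoop's UserGroupInformation (illustrative only; the committed fix may differ):

{code:java}
import java.io.IOException;
import org.apache.hadoop.security.UserGroupInformation;

void refreshWithRelogin() throws IOException {
  // Renew the TGT from the keytab if it is close to expiring, then issue
  // the admin RPC as before.
  UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab();
  // ... call refreshMountTableEntries() over RPC here ...
}
{code}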



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15217) Add more information to longest write/read lock held log

2020-03-10 Thread Toshihiro Suzuki (Jira)
Toshihiro Suzuki created HDFS-15217:
---

 Summary: Add more information to longest write/read lock held log
 Key: HDFS-15217
 URL: https://issues.apache.org/jira/browse/HDFS-15217
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Toshihiro Suzuki
Assignee: Toshihiro Suzuki


Currently, we can see the stack trace in the longest write/read lock held log, 
but sometimes we need more information, for example the target path of a deletion:
{code:java}
2020-03-10 21:51:21,116 [main] INFO  namenode.FSNamesystem 
(FSNamesystemLock.java:writeUnlock(276)) -   Number of suppressed write-lock 
reports: 0
Longest write-lock held at 2020-03-10 21:51:21,107+0900 for 6ms via 
java.lang.Thread.getStackTrace(Thread.java:1559)
org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1058)
org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:257)
org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:233)
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1706)
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3188)
...
{code}
Adding more information (opName, path, etc.) to the log would be useful for 
troubleshooting.
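A minimal sketch of the idea: thread the operation name through the unlock path so it lands in the report. The class, fields, and threshold here are illustrative, not the actual FSNamesystemLock internals:

{code:java}
import java.util.concurrent.locks.ReentrantReadWriteLock;
import org.apache.hadoop.util.StringUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical sketch, not the actual FSNamesystemLock internals.
class OpAwareLock {
  private static final Logger LOG = LoggerFactory.getLogger(OpAwareLock.class);
  private final ReentrantReadWriteLock coarseLock = new ReentrantReadWriteLock(true);
  private final long reportingThresholdMs = 5000;
  private long writeLockAcquiredMs;

  void writeLock() {
    coarseLock.writeLock().lock();
    writeLockAcquiredMs = System.currentTimeMillis(); // wall-clock, see HDFS-15215
  }

  void writeUnlock(String opName) {
    long heldMs = System.currentTimeMillis() - writeLockAcquiredMs;
    if (heldMs >= reportingThresholdMs) {
      // The opName (and, where available, the path) lands in the report.
      LOG.info("Write lock held for {} ms by op \"{}\" via\n{}", heldMs, opName,
          StringUtils.getStackTrace(Thread.currentThread()));
    }
    coarseLock.writeLock().unlock();
  }
}
{code}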



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-03-10 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055839#comment-17055839
 ] 

Xiaoqiao He commented on HDFS-15160:


Thanks [~sodonnell] for your work; based on the monitoring chart offered by 
[~zhuqi], it is a great improvement for the DataNode. [^HDFS-15160.002.patch] 
almost LGTM, with just one minor comment: 
DataNode#transferReplicaForPipelineRecovery also holds the FsDataset write lock 
currently, and in my opinion it could also be changed to the read lock. Would 
you like to double-check?

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> ---
>
> Key: HDFS-15160
> URL: https://issues.apache.org/jira/browse/HDFS-15160
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch
>
>
> Now that we have HDFS-15150, we can start to move some DN operations to use 
> the read lock rather than the write lock to improve concurrency. The first 
> step is to make the changes to ReplicaMap, as many other methods make calls to 
> it.
> This Jira switches read operations against the volume map to use the readLock 
> rather than the write lock.
> Additionally, some methods make a call to replicaMap.replicas() (e.g. 
> getBlockReports, getFinalizedBlocks, deepCopyReplica) and only use the result 
> in a read-only fashion, so they can also be switched to using a readLock.
> Next is the directory scanner and disk balancer, which only require a read 
> lock.
> Finally (for this Jira) are various "low hanging fruit" items in BlockSender 
> and FsDatasetImpl where it is fairly obvious they only need a read lock.
> For now, I have avoided changing anything that looks too risky, as I think 
> it's better to do any larger refactoring or risky changes each in their own 
> Jira.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15160) ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl methods should use datanode readlock

2020-03-10 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055802#comment-17055802
 ] 

Stephen O'Donnell commented on HDFS-15160:
--

See this comment 
https://issues.apache.org/jira/browse/HDFS-15180?focusedCommentId=17055739=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17055739
 where [~zhuqi] has tried out this patch on a few nodes in his busy cluster. It 
seems the blocked thread count has reduced significantly with this change.

Would anyone like to give this patch a review so we can move toward committing 
it?

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> ---
>
> Key: HDFS-15160
> URL: https://issues.apache.org/jira/browse/HDFS-15160
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.3.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch
>
>
> Now that we have HDFS-15150, we can start to move some DN operations to use 
> the read lock rather than the write lock to improve concurrency. The first 
> step is to make the changes to ReplicaMap, as many other methods make calls to 
> it.
> This Jira switches read operations against the volume map to use the readLock 
> rather than the write lock.
> Additionally, some methods make a call to replicaMap.replicas() (e.g. 
> getBlockReports, getFinalizedBlocks, deepCopyReplica) and only use the result 
> in a read-only fashion, so they can also be switched to using a readLock.
> Next is the directory scanner and disk balancer, which only require a read 
> lock.
> Finally (for this Jira) are various "low hanging fruit" items in BlockSender 
> and FsDatasetImpl where it is fairly obvious they only need a read lock.
> For now, I have avoided changing anything that looks too risky, as I think 
> it's better to do any larger refactoring or risky changes each in their own 
> Jira.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15216) Wrong Use Case of -showprogress in fsck

2020-03-10 Thread Ravuri Sushma sree (Jira)
Ravuri Sushma sree created HDFS-15216:
-

 Summary: Wrong Use Case of -showprogress in fsck 
 Key: HDFS-15216
 URL: https://issues.apache.org/jira/browse/HDFS-15216
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ravuri Sushma sree


*-showprogress* is deprecated and progress is now shown by default, but fsck 
--help still shows an incorrect usage for it:

Usage: hdfs fsck <path> [-list-corruptfileblocks | [-move | -delete | 
-openforwrite] [-files [-blocks [-locations | -racks | -replicaDetails | 
-upgradedomains]]]] [-includeSnapshots] [-showprogress] [-storagepolicies] 
[-maintenance] [-blockId <blk_Id>]
 <path>  start checking from this path
 -move move corrupted files to /lost+found
 -delete delete corrupted files
 -files print out files being checked
 -openforwrite print out files opened for write
 -includeSnapshots include snapshot data if the given path indicates a 
snapshottable directory or there are snapshottable directories under it
 -list-corruptfileblocks print out list of missing blocks and files they belong 
to
 -files -blocks print out block report
 -files -blocks -locations print out locations for every block
 -files -blocks -racks print out network topology for data-node locations
 -files -blocks -replicaDetails print out each replica details
 -files -blocks -upgradedomains print out upgrade domains for every block
 -storagepolicies print out storage policy summary for the blocks
 -maintenance print out maintenance state node details
 *-showprogress show progress in output. Default is OFF (no progress)*
 -blockId print out which file this blockId belongs to, locations (nodes, 
racks) of this block, and other diagnostics info (under replicated, corrupted 
or not, etc)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-15216) Wrong Use Case of -showprogress in fsck

2020-03-10 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree reassigned HDFS-15216:
-

Assignee: Ravuri Sushma sree

> Wrong Use Case of -showprogress in fsck 
> 
>
> Key: HDFS-15216
> URL: https://issues.apache.org/jira/browse/HDFS-15216
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
>
> *-showprogress* is deprecated and progress is now shown by default, but fsck 
> --help still shows an incorrect usage for it:
>  
> Usage: hdfs fsck <path> [-list-corruptfileblocks | [-move | -delete | 
> -openforwrite] [-files [-blocks [-locations | -racks | -replicaDetails | 
> -upgradedomains]]]] [-includeSnapshots] [-showprogress] [-storagepolicies] 
> [-maintenance] [-blockId <blk_Id>]
>  <path>  start checking from this path
>  -move move corrupted files to /lost+found
>  -delete delete corrupted files
>  -files print out files being checked
>  -openforwrite print out files opened for write
>  -includeSnapshots include snapshot data if the given path indicates a 
> snapshottable directory or there are snapshottable directories under it
>  -list-corruptfileblocks print out list of missing blocks and files they 
> belong to
>  -files -blocks print out block report
>  -files -blocks -locations print out locations for every block
>  -files -blocks -racks print out network topology for data-node locations
>  -files -blocks -replicaDetails print out each replica details
>  -files -blocks -upgradedomains print out upgrade domains for every block
>  -storagepolicies print out storage policy summary for the blocks
>  -maintenance print out maintenance state node details
>  *-showprogress show progress in output. Default is OFF (no progress)*
>  -blockId print out which file this blockId belongs to, locations (nodes, 
> racks) of this block, and other diagnostics info (under replicated, corrupted 
> or not, etc)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-03-10 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055798#comment-17055798
 ] 

Stephen O'Donnell commented on HDFS-15180:
--

[~zhuqi] Thanks for sharing this, it looks promising. It is also good to see 
the patch used in a real world cluster without any issues.

Looking at your chart, are the orange and blue lines using the old code until 
around 5th / 6th March, then you switched to the new RW Fair lock plus 
HDFS-15160 and the blocked thread count has reduced to almost zero? The green 
line has been using HDFS-15160 since at least 4th March?

I am not sure what metrics we should track to prove this change is good, but 
blocked thread count seems like a good one for now. Your chart looks promising. 
It would also be good to see a line on your chart for 2 or 3 nodes where 
HDFS-15160 is NOT applied for a comparison over time, so we can clearly see the 
nodes with HDFS-15160 against nodes without it.

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: image-2020-03-10-17-22-57-391.png, 
> image-2020-03-10-17-31-58-830.png, image-2020-03-10-17-34-26-368.png
>
>
> The FsDatasetImpl datasetLock is currently heavy when there are many 
> namespaces in a big cluster. We could split the FsDatasetImpl datasetLock by 
> block pool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15113) Missing IBR when NameNode restart if open processCommand async feature

2020-03-10 Thread Brahma Reddy Battula (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055774#comment-17055774
 ] 

Brahma Reddy Battula commented on HDFS-15113:
-

+1 from my side. Holding off on committing until others have reviewed.

> Missing IBR when NameNode restart if open processCommand async feature
> --
>
> Key: HDFS-15113
> URL: https://issues.apache.org/jira/browse/HDFS-15113
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Blocker
> Attachments: HDFS-15113.001.patch, HDFS-15113.002.patch, 
> HDFS-15113.003.patch, HDFS-15113.004.patch, HDFS-15113.005.patch
>
>
> Recently, I hit a case where the NameNode was missing blocks after a restart, 
> which is related to HDFS-14997.
> a. During NameNode restart, the NameNode returns the `DNA_REGISTER` command to 
> a DataNode when it receives certain RPC requests from that DataNode.
> b. When the DataNode receives the `DNA_REGISTER` command, it runs #reRegister 
> asynchronously.
> {code:java}
>   void reRegister() throws IOException {
> if (shouldRun()) {
>   // re-retrieve namespace info to make sure that, if the NN
>   // was restarted, we still match its version (HDFS-2120)
>   NamespaceInfo nsInfo = retrieveNamespaceInfo();
>   // and re-register
>   register(nsInfo);
>   scheduler.scheduleHeartbeat();
>   // HDFS-9917,Standby NN IBR can be very huge if standby namenode is down
>   // for sometime.
>   if (state == HAServiceState.STANDBY || state == 
> HAServiceState.OBSERVER) {
> ibrManager.clearIBRs();
>   }
> }
>   }
> {code}
> c. As we know, #register triggers a full block report (FBR) immediately.
> d. Because #reRegister runs asynchronously, we cannot guarantee the order 
> between sending the FBR and clearing the IBRs. If the IBRs are cleared first, 
> everything is OK. But if the FBR is sent first and the IBRs are cleared 
> afterwards, blocks received between those two points are missed until the next 
> FBR.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-03-10 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055739#comment-17055739
 ] 

zhuqi edited comment on HDFS-15180 at 3/10/20, 9:44 AM:


!image-2020-03-10-17-22-57-391.png|width=604,height=137!

 

Hi [~sodonnell],
This is the average blocked-threads-per-day monitor.

I gray-released 3 nodes in a busy cluster at different times; the green one has 
been running the longest, and there are almost no blocked threads now. The 
other two show a good improvement since they were switched to the RW lock.
Analyzing the logs, the long lock-hold times happen during DirectoryScanner 
scan operations:
 !image-2020-03-10-17-31-58-830.png|width=611,height=148!

Other operations, such as the deep copy for the calculation of dfsUsed, do not 
cost too much time even when the node is very busy:
 !image-2020-03-10-17-34-26-368.png|width=812,height=136!


was (Author: zhuqi):
!image-2020-03-10-17-22-57-391.png|width=604,height=137!

 

Hi [~sodonnell]

I gray-released 3 nodes in a busy cluster at different times; the green one has 
been running the longest, and there are almost no blocked threads now. The 
other two show a good improvement since they were switched to the RW lock.
Analyzing the logs, the long lock-hold times happen during DirectoryScanner 
scan operations:
!image-2020-03-10-17-31-58-830.png|width=611,height=148!

Other operations, such as the deep copy for the calculation of dfsUsed, do not 
cost too much time even when the node is very busy:
!image-2020-03-10-17-34-26-368.png|width=812,height=136!

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: image-2020-03-10-17-22-57-391.png, 
> image-2020-03-10-17-31-58-830.png, image-2020-03-10-17-34-26-368.png
>
>
> The FsDatasetImpl datasetLock is currently heavy when there are many 
> namespaces in a big cluster. We could split the FsDatasetImpl datasetLock by 
> block pool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-03-10 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055747#comment-17055747
 ] 

zhuqi commented on HDFS-15180:
--

CC [~sodonnell]
My version is based on CDH 5.16.1, and my lock is a fair lock. Let me know 
which key performance factors of the RW lock in the DataNode you want to see; 
I can try to confirm the RW improvement when I next gray-release more nodes in 
our busy cluster.
Maybe we need more metrics to help confirm the performance improvement; what 
do you think?

Thanks a lot.

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: image-2020-03-10-17-22-57-391.png, 
> image-2020-03-10-17-31-58-830.png, image-2020-03-10-17-34-26-368.png
>
>
> The FsDatasetImpl datasetLock is currently heavy when there are many 
> namespaces in a big cluster. We could split the FsDatasetImpl datasetLock by 
> block pool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-03-10 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055739#comment-17055739
 ] 

zhuqi edited comment on HDFS-15180 at 3/10/20, 9:35 AM:


!image-2020-03-10-17-22-57-391.png|width=604,height=137!

 

Hi [~sodonnell]

I gray-released 3 nodes in a busy cluster at different times; the green one has 
been running the longest, and there are almost no blocked threads now. The 
other two show a good improvement since they were switched to the RW lock.
Analyzing the logs, the long lock-hold times happen during DirectoryScanner 
scan operations:
!image-2020-03-10-17-31-58-830.png|width=611,height=148!

Other operations, such as the deep copy for the calculation of dfsUsed, do not 
cost too much time even when the node is very busy:
!image-2020-03-10-17-34-26-368.png|width=812,height=136!


was (Author: zhuqi):
!image-2020-03-10-17-22-57-391.png!

 

 

 

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: image-2020-03-10-17-22-57-391.png, 
> image-2020-03-10-17-31-58-830.png, image-2020-03-10-17-34-26-368.png
>
>
> The FsDatasetImpl datasetLock is currently heavy when there are many 
> namespaces in a big cluster. We could split the FsDatasetImpl datasetLock by 
> block pool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-03-10 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055739#comment-17055739
 ] 

zhuqi commented on HDFS-15180:
--

!image-2020-03-10-17-22-57-391.png!

 

 

 

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: image-2020-03-10-17-22-57-391.png
>
>
> The FsDatasetImpl datasetLock is currently heavy when there are many 
> namespaces in a big cluster. We could split the FsDatasetImpl datasetLock by 
> block pool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-03-10 Thread zhuqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhuqi updated HDFS-15180:
-
Attachment: image-2020-03-10-17-22-57-391.png

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
> Attachments: image-2020-03-10-17-22-57-391.png
>
>
> The FsDatasetImpl datasetLock is currently heavy when there are many 
> namespaces in a big cluster. We could split the FsDatasetImpl datasetLock by 
> block pool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-03-10 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055694#comment-17055694
 ] 

Stephen O'Donnell commented on HDFS-15180:
--

Hi [~zhuqi] - I just wanted to confirm from your last message - Are you 
actually running HDFS-15160 on a busy cluster and seeing a big improvement from 
it? Have you seen any issues or has it worked OK? If so, that is great feedback 
and thanks for trying it and sharing the results.

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
>
> The FsDatasetImpl datasetLock is currently heavy when there are many 
> namespaces in a big cluster. We could split the FsDatasetImpl datasetLock by 
> block pool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15135) EC : ArrayIndexOutOfBoundsException in BlockRecoveryWorker#RecoveryTaskStriped.

2020-03-10 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055688#comment-17055688
 ] 

Surendra Singh Lilhore commented on HDFS-15135:
---

+1, will merge today.

> EC : ArrayIndexOutOfBoundsException in 
> BlockRecoveryWorker#RecoveryTaskStriped.
> ---
>
> Key: HDFS-15135
> URL: https://issues.apache.org/jira/browse/HDFS-15135
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Surendra Singh Lilhore
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15135-branch-3.2.001.patch, 
> HDFS-15135-branch-3.2.002.patch, HDFS-15135.001.patch, HDFS-15135.002.patch, 
> HDFS-15135.003.patch, HDFS-15135.004.patch, HDFS-15135.005.patch
>
>
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: 8
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$RecoveryTaskStriped.recover(BlockRecoveryWorker.java:464)
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1.run(BlockRecoveryWorker.java:602)
>at java.lang.Thread.run(Thread.java:745) {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-03-10 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055671#comment-17055671
 ] 

zhuqi commented on HDFS-15180:
--

[~sodonnell], [~hexiaoqiao]

Thanks for your patient replies.

[~sodonnell] has done some work here: HDFS-15150 introduced the read/write 
lock, and HDFS-15160 is currently in progress. I have gray-released HDFS-15160 
in our production cluster, and the number of blocked threads in the DataNode 
has been reduced a lot.

[~hexiaoqiao], I am looking forward to the {{BlockPoolLockManager}} splitting 
the {{dataLock}} more fine-grained; I can assign this to [~Aiphag0] anytime if 
he wants to take it.

 

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: zhuqi
>Priority: Major
>
> The FsDatasetImpl datasetLock is currently heavy when there are many 
> namespaces in a big cluster. We could split the FsDatasetImpl datasetLock by 
> block pool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15215) The Timestamp for longest write/read lock held log is wrong

2020-03-10 Thread Toshihiro Suzuki (Jira)
Toshihiro Suzuki created HDFS-15215:
---

 Summary: The Timestamp for longest write/read lock held log is 
wrong
 Key: HDFS-15215
 URL: https://issues.apache.org/jira/browse/HDFS-15215
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Toshihiro Suzuki
Assignee: Toshihiro Suzuki


I found the Timestamp for longest write/read lock held log is wrong in trunk:

{code}
2020-03-10 16:01:26,585 [main] INFO  namenode.FSNamesystem 
(FSNamesystemLock.java:writeUnlock(281)) -   Number of suppressed write-lock 
reports: 0
Longest write-lock held at 1970-01-03 07:07:40,841+0900 for 3ms via 
java.lang.Thread.getStackTrace(Thread.java:1559)
...
{code}

Looking at the code, the timestamp comes from System.nanoTime(), which returns 
the current value of the running Java Virtual Machine's high-resolution time 
source; that method can only be used to measure elapsed time, not wall-clock 
time:
https://docs.oracle.com/javase/8/docs/api/java/lang/System.html#nanoTime--

We should derive the timestamp from System.currentTimeMillis() instead.
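A self-contained sketch of the bug and the fix direction (illustrative only):

{code:java}
public class LockTimestampDemo {
  public static void main(String[] args) throws InterruptedException {
    long startNanos = System.nanoTime();       // fine for measuring duration
    Thread.sleep(3);                           // stand-in for work under the lock
    long elapsedMs = (System.nanoTime() - startNanos) / 1_000_000;

    // Wrong: nanoTime() has an arbitrary origin, so formatting it as a date
    // yields a 1970-era timestamp like the one in the log above.
    System.out.println("wrong: " + new java.util.Date(startNanos / 1_000_000));

    // Right: derive the wall-clock acquisition time from currentTimeMillis().
    long acquiredAtMs = System.currentTimeMillis() - elapsedMs;
    System.out.println("right: " + new java.util.Date(acquiredAtMs));
  }
}
{code}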





--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15207) VolumeScanner skip to scan blocks accessed during recent scan period

2020-03-10 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055652#comment-17055652
 ] 

Hadoop QA commented on HDFS-15207:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
46s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 22s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 57s{color} 
| {color:red} hadoop-hdfs-project_hadoop-hdfs generated 1 new + 579 unchanged - 
0 fixed = 580 total (was 579) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 46s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 1 new + 501 unchanged - 0 fixed = 502 total (was 501) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 14s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}108m 11s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
38s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}173m 20s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.namenode.TestPersistentStoragePolicySatisfier |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.7 Server=19.03.7 Image:yetus/hadoop:c44943d1fc3 |
| JIRA Issue | HDFS-15207 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12996239/HDFS-15207.003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 65399ab7a704 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 44afe11 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_242 |
| findbugs | v3.1.0-RC1 |
| javac | 
https://builds.apache.org/job/PreCommit-HDFS-Build/28919/artifact/out/diff-compile-javac-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| checkstyle |