[jira] [Commented] (HDFS-14529) NPE while Loading the Editlogs

2021-07-29 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389296#comment-17389296
 ] 

Akira Ajisaka commented on HDFS-14529:
--

We recently hit this edit log corruption in a Hadoop 2.x cluster. The edit 
log contained a SetTimeOp that set the atime of a non-existent file. We worked 
around the NPE by setting the atime to -1 in the corrupt edit log via the 
offline edits viewer. After this PR is applied, the NPE can be worked around by 
simply supplying the -recover option. Thanks.
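
For reference, a minimal sketch of the two workarounds mentioned above; the edit 
log segment names are placeholders, not values from this cluster:

{noformat}
# Workaround 1: rewrite the bad atime with the offline edits viewer (oev)
hdfs oev -i edits_000...-000... -o edits.xml                  # binary -> XML
#   (manually change the offending atime of the TimesOp to -1 in edits.xml)
hdfs oev -i edits.xml -o edits_000...-000... -p binary        # XML -> binary

# Workaround 2 (after this PR): let the NameNode skip the bad op
hdfs namenode -recover
{noformat}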

> NPE while Loading the Editlogs
> --
>
> Key: HDFS-14529
> URL: https://issues.apache.org/jira/browse/HDFS-14529
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.1.1
>Reporter: Harshakiran Reddy
>Assignee: Wei-Chiu Chuang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {noformat}
> 2019-05-31 15:15:42,397 ERROR namenode.FSEditLogLoader: Encountered exception 
> on operation TimesOp [length=0, 
> path=/testLoadSpace/dir0/dir0/dir0/dir2/_file_9096763, mtime=-1, 
> atime=1559294343288, opCode=OP_TIMES, txid=18927893]
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.unprotectedSetTimes(FSDirAttrOp.java:490)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:711)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:286)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:181)
> at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:924)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:771)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:331)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1105)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:726)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.doRecovery(NameNode.java:1558)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1640)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1725){noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15886) Add a way to get protected dirs from a special configuration file

2021-07-29 Thread Max Xie (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max  Xie updated HDFS-15886:

Attachment: HDFS-15886.patch
Status: Patch Available  (was: In Progress)

> Add a way to get protected dirs from a special configuration file
> -
>
> Key: HDFS-15886
> URL: https://issues.apache.org/jira/browse/HDFS-15886
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: namenode
>Affects Versions: 3.4.0
>Reporter: Max  Xie
>Assignee: Max  Xie
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HDFS-15886.patch, HDFS-15886.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> We use protected directories to ensure that important data directories cannot 
> be deleted by mistake, but protected directories can currently only be 
> configured in hdfs-site.xml.
> For ease of management, this change adds a way to load the list of protected 
> directories from a separate configuration file.
> How to use it (a sketch of one possible loader follows this description):
> 1. Set the config in hdfs-site.xml:
> ```
> <property>
>   <name>fs.protected.directories</name>
>   <value>/hdfs/path/1,/hdfs/path/2,file:///path/to/protected.dirs.config</value>
> </property>
> ```
> 2. Add some protected directories to the config file 
> (file:///path/to/protected.dirs.config):
> ```
> /hdfs/path/4
> /hdfs/path/5
> ```
> 3. Use a command to refresh fs.protected.directories instead of calling 
> FSDirectory.setProtectedDirectories(..):
> ```
> hdfs dfsadmin -refreshProtectedDirectories
> ```
>  
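
A minimal sketch of how the file reference inside fs.protected.directories could 
be expanded into the final directory list; the class and method names and the 
overall approach are illustrative assumptions, not necessarily what the 
HDFS-15886 patch does:

{code:java}
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Collection;
import java.util.TreeSet;
import org.apache.hadoop.conf.Configuration;

/** Hypothetical helper: expand file:// entries of fs.protected.directories. */
final class ProtectedDirsLoader {
  static Collection<String> load(Configuration conf) throws IOException {
    Collection<String> dirs = new TreeSet<>();
    for (String entry : conf.getTrimmedStringCollection("fs.protected.directories")) {
      if (entry.startsWith("file://")) {
        // the entry points at a local file listing one protected directory per line
        for (String line : Files.readAllLines(Paths.get(URI.create(entry)),
            StandardCharsets.UTF_8)) {
          if (!line.trim().isEmpty()) {
            dirs.add(line.trim());
          }
        }
      } else {
        dirs.add(entry);  // plain HDFS path listed directly in hdfs-site.xml
      }
    }
    return dirs;
  }
}
{code}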



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15886) Add a way to get protected dirs from a special configuration file

2021-07-29 Thread Max Xie (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max  Xie updated HDFS-15886:

Attachment: (was: HDFS-15886.patch)

> Add a way to get protected dirs from a special configuration file
> -
>
> Key: HDFS-15886
> URL: https://issues.apache.org/jira/browse/HDFS-15886
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: namenode
>Affects Versions: 3.4.0
>Reporter: Max  Xie
>Assignee: Max  Xie
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HDFS-15886.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> We use protected directories to ensure that important data directories cannot 
> be deleted by mistake, but protected directories can currently only be 
> configured in hdfs-site.xml.
> For ease of management, this change adds a way to load the list of protected 
> directories from a separate configuration file.
> How to use it:
> 1. Set the config in hdfs-site.xml:
> ```
> <property>
>   <name>fs.protected.directories</name>
>   <value>/hdfs/path/1,/hdfs/path/2,file:///path/to/protected.dirs.config</value>
> </property>
> ```
> 2. Add some protected directories to the config file 
> (file:///path/to/protected.dirs.config):
> ```
> /hdfs/path/4
> /hdfs/path/5
> ```
> 3. Use a command to refresh fs.protected.directories instead of calling 
> FSDirectory.setProtectedDirectories(..):
> ```
> hdfs dfsadmin -refreshProtectedDirectories
> ```
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-9266) Avoid unsafe split and append on fields that might be IPv6 literals

2021-07-29 Thread Hemanth Boyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Boyina updated HDFS-9266:
-
Attachment: HDFS-9266-HADOOP-17800.002.patch

> Avoid unsafe split and append on fields that might be IPv6 literals
> ---
>
> Key: HDFS-9266
> URL: https://issues.apache.org/jira/browse/HDFS-9266
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Nemanja Matkovic
>Assignee: Nemanja Matkovic
>Priority: Major
>  Labels: ipv6
> Attachments: HDFS-9266-HADOOP-11890.1.patch, 
> HDFS-9266-HADOOP-11890.2.patch, HDFS-9266-HADOOP-17800.001.patch, 
> HDFS-9266-HADOOP-17800.002.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-15936) Solve BlockSender#sendPacket() does not record SocketTimeout exception

2021-07-29 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-15936.

Fix Version/s: 3.3.2
   3.4.0
   Resolution: Fixed

Thanks!

> Solve BlockSender#sendPacket() does not record SocketTimeout exception
> --
>
> Key: HDFS-15936
> URL: https://issues.apache.org/jira/browse/HDFS-15936
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> In BlockSender#sendPacket(), if a SocketTimeoutException occurs, nothing is 
> recorded:
> {code:java}
> try {
>   ..
> } catch (IOException e) {
>   if (e instanceof SocketTimeoutException) {
>     /*
>      * writing to client timed out. This happens if the client reads
>      * part of a block and then decides not to read the rest (but leaves
>      * the socket open).
>      *
>      * Reporting of this case is done in DataXceiver#run
>      */
>   }
> }
> {code}
> No log record is generated here, which makes troubleshooting harder. We should 
> add a warning-level log line.
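
A minimal sketch of the kind of warning described above; the exact message, and 
the use of the enclosing class's LOG and block fields, are assumptions rather 
than the text of the committed change:

{code:java}
} catch (IOException e) {
  if (e instanceof SocketTimeoutException) {
    // Reporting is still done in DataXceiver#run, but leave a trace here so
    // the timeout is also visible in the DataNode log.
    LOG.warn("Sending packet for block {} timed out; the client may have "
        + "stopped reading mid-block", block, e);
  }
}
{code}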



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14540) Block deletion failure causes an infinite polling in TestDeleteBlockPool

2021-07-29 Thread Anton Kutuzov (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389725#comment-17389725
 ] 

Anton Kutuzov commented on HDFS-14540:
--

[~shv], do you like my proposal, or are there better ways to solve the 
problem?

> Block deletion failure causes an infinite polling in TestDeleteBlockPool
> 
>
> Key: HDFS-14540
> URL: https://issues.apache.org/jira/browse/HDFS-14540
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: John Doe
>Priority: Major
>
> In the testDeleteBlockPool function, when the file deletion fails, the while 
> loop below polls forever and the test hangs.
> {code:java}
>   fs1.delete(new Path("/alpha"), true); //deletion failure
>   
>   // Wait till all blocks are deleted from the dn2 for bpid1.
>   while ((MiniDFSCluster.getFinalizedDir(dn2StorageDir1, 
>   bpid1).list().length != 0) || (MiniDFSCluster.getFinalizedDir(
>   dn2StorageDir2, bpid1).list().length != 0)) {
> try {
>   Thread.sleep(3000); 
> } catch (Exception ignored) {
> }
>   }
> {code}
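
One possible shape of a fix is to fail fast on the delete and bound the wait; 
this is an illustrative sketch using GenericTestUtils.waitFor, not necessarily 
the approach taken by any proposed patch:

{code:java}
// requires org.apache.hadoop.test.GenericTestUtils and JUnit's assertTrue
assertTrue("delete of /alpha failed", fs1.delete(new Path("/alpha"), true));

// Poll every 3 seconds, but give up (and fail the test) after 60 seconds
// instead of spinning forever.
GenericTestUtils.waitFor(
    () -> MiniDFSCluster.getFinalizedDir(dn2StorageDir1, bpid1).list().length == 0
        && MiniDFSCluster.getFinalizedDir(dn2StorageDir2, bpid1).list().length == 0,
    3000, 60000);
{code}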



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16143) TestEditLogTailer#testStandbyTriggersLogRollsWhenTailInProgressEdits is flaky

2021-07-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16143?focusedWorklogId=631039&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-631039
 ]

ASF GitHub Bot logged work on HDFS-16143:
-

Author: ASF GitHub Bot
Created on: 29/Jul/21 07:48
Start Date: 29/Jul/21 07:48
Worklog Time Spent: 10m 
  Work Description: virajjasani edited a comment on pull request #3235:
URL: https://github.com/apache/hadoop/pull/3235#issuecomment-83134


   @aajisaka @jojochuang @tasanuma Could you please review this PR? After the 
latest revision, 4 builds were triggered and we don't see failures in 
`TestEditLogTailer` anymore.
   Thanks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 631039)
Time Spent: 3h  (was: 2h 50m)

> TestEditLogTailer#testStandbyTriggersLogRollsWhenTailInProgressEdits is flaky
> -
>
> Key: HDFS-16143
> URL: https://issues.apache.org/jira/browse/HDFS-16143
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Akira Ajisaka
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3229/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
> {quote}
> [ERROR] 
> testStandbyTriggersLogRollsWhenTailInProgressEdits[0](org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer)
>   Time elapsed: 6.862 s  <<< FAILURE!
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:87)
>   at org.junit.Assert.assertTrue(Assert.java:42)
>   at org.junit.Assert.assertTrue(Assert.java:53)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testStandbyTriggersLogRollsWhenTailInProgressEdits(TestEditLogTailer.java:444)
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16143) TestEditLogTailer#testStandbyTriggersLogRollsWhenTailInProgressEdits is flaky

2021-07-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16143?focusedWorklogId=631038&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-631038
 ]

ASF GitHub Bot logged work on HDFS-16143:
-

Author: ASF GitHub Bot
Created on: 29/Jul/21 07:48
Start Date: 29/Jul/21 07:48
Worklog Time Spent: 10m 
  Work Description: virajjasani commented on pull request #3235:
URL: https://github.com/apache/hadoop/pull/3235#issuecomment-83134


   @aajisaka @jojochuang @tasanuma Could you please review this PR? After the 
latest revision, 4 builds were triggered and we don't see failures in 
`TestEditLogTailer`.
   Thanks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 631038)
Time Spent: 2h 50m  (was: 2h 40m)

> TestEditLogTailer#testStandbyTriggersLogRollsWhenTailInProgressEdits is flaky
> -
>
> Key: HDFS-16143
> URL: https://issues.apache.org/jira/browse/HDFS-16143
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Akira Ajisaka
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Attachments: patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3229/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
> {quote}
> [ERROR] 
> testStandbyTriggersLogRollsWhenTailInProgressEdits[0](org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer)
>   Time elapsed: 6.862 s  <<< FAILURE!
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:87)
>   at org.junit.Assert.assertTrue(Assert.java:42)
>   at org.junit.Assert.assertTrue(Assert.java:53)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestEditLogTailer.testStandbyTriggersLogRollsWhenTailInProgressEdits(TestEditLogTailer.java:444)
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16146) All three replicas are lost due to not adding a new DataNode in time

2021-07-29 Thread Shuyan Zhang (Jira)
Shuyan Zhang created HDFS-16146:
---

 Summary: All three replicas are lost due to not adding a new 
DataNode in time
 Key: HDFS-16146
 URL: https://issues.apache.org/jira/browse/HDFS-16146
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode, hdfs
Reporter: Shuyan Zhang
Assignee: Shuyan Zhang


We have a three-replica file, and all replicas of a block were lost while the 
default datanode replacement strategy was in use. It happened like this:
1. addBlock() applies for a new block and successfully connects three datanodes 
(dn1, dn2 and dn3) to build a pipeline;
2. Data is written;
3. dn1 hits an error and is kicked out. At this point the number of remaining 
datanodes in the pipeline is > 1, so according to the replacement strategy there 
is no need to add a new datanode;
4. After writing is completed, the pipeline enters PIPELINE_CLOSE;
5. dn2 hits an error and is kicked out. Because the pipeline is already in the 
close phase, addDatanode2ExistingPipeline() decides to hand over the task of 
transferring the replica to the NameNode. At this point there is only one 
datanode left in the pipeline;
6. dn3 hits an error, and all replicas are lost.
If we add a new datanode in step 5, we can avoid losing all replicas in this 
case. An error in PIPELINE_CLOSE carries the same risk of losing replicas as an 
error in DATA_STREAMING, so we should not skip adding a new datanode during 
PIPELINE_CLOSE.
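
For context, the replacement behaviour described above is steered by the 
client-side replace-datanode-on-failure settings; forcing replacement on every 
failure is only a blunt client-side mitigation, not the fix proposed here:

{code:xml}
<!-- hdfs-site.xml (client side); existing settings, shown for context only -->
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
  <value>ALWAYS</value>  <!-- default is DEFAULT, which applies the heuristic above -->
</property>
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.best-effort</name>
  <value>true</value>    <!-- keep writing even if a replacement cannot be found -->
</property>
{code}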



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16146) All three replicas are lost due to not adding a new DataNode in time

2021-07-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-16146:
--
Labels: pull-request-available  (was: )

> All three replicas are lost due to not adding a new DataNode in time
> 
>
> Key: HDFS-16146
> URL: https://issues.apache.org/jira/browse/HDFS-16146
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs
>Reporter: Shuyan Zhang
>Assignee: Shuyan Zhang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We have a three-replica file, and all replicas of a block were lost while the 
> default datanode replacement strategy was in use. It happened like this:
> 1. addBlock() applies for a new block and successfully connects three 
> datanodes (dn1, dn2 and dn3) to build a pipeline;
> 2. Data is written;
> 3. dn1 hits an error and is kicked out. At this point the number of remaining 
> datanodes in the pipeline is > 1, so according to the replacement strategy 
> there is no need to add a new datanode;
> 4. After writing is completed, the pipeline enters PIPELINE_CLOSE;
> 5. dn2 hits an error and is kicked out. Because the pipeline is already in the 
> close phase, addDatanode2ExistingPipeline() decides to hand over the task of 
> transferring the replica to the NameNode. At this point there is only one 
> datanode left in the pipeline;
> 6. dn3 hits an error, and all replicas are lost.
> If we add a new datanode in step 5, we can avoid losing all replicas in this 
> case. An error in PIPELINE_CLOSE carries the same risk of losing replicas as 
> an error in DATA_STREAMING, so we should not skip adding a new datanode during 
> PIPELINE_CLOSE.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16146) All three replicas are lost due to not adding a new DataNode in time

2021-07-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16146?focusedWorklogId=631051&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-631051
 ]

ASF GitHub Bot logged work on HDFS-16146:
-

Author: ASF GitHub Bot
Created on: 29/Jul/21 08:47
Start Date: 29/Jul/21 08:47
Worklog Time Spent: 10m 
  Work Description: zhangshuyan0 opened a new pull request #3247:
URL: https://github.com/apache/hadoop/pull/3247


   …ode in time.
   
   ## NOTICE
   
   Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HADOOP-X. Fix a typo in YYY.)
   For more details, please see 
https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 631051)
Remaining Estimate: 0h
Time Spent: 10m

> All three replicas are lost due to not adding a new DataNode in time
> 
>
> Key: HDFS-16146
> URL: https://issues.apache.org/jira/browse/HDFS-16146
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs
>Reporter: Shuyan Zhang
>Assignee: Shuyan Zhang
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We have a three-replica file, and all replicas of a block were lost while the 
> default datanode replacement strategy was in use. It happened like this:
> 1. addBlock() applies for a new block and successfully connects three 
> datanodes (dn1, dn2 and dn3) to build a pipeline;
> 2. Data is written;
> 3. dn1 hits an error and is kicked out. At this point the number of remaining 
> datanodes in the pipeline is > 1, so according to the replacement strategy 
> there is no need to add a new datanode;
> 4. After writing is completed, the pipeline enters PIPELINE_CLOSE;
> 5. dn2 hits an error and is kicked out. Because the pipeline is already in the 
> close phase, addDatanode2ExistingPipeline() decides to hand over the task of 
> transferring the replica to the NameNode. At this point there is only one 
> datanode left in the pipeline;
> 6. dn3 hits an error, and all replicas are lost.
> If we add a new datanode in step 5, we can avoid losing all replicas in this 
> case. An error in PIPELINE_CLOSE carries the same risk of losing replicas as 
> an error in DATA_STREAMING, so we should not skip adding a new datanode during 
> PIPELINE_CLOSE.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15936) Solve BlockSender#sendPacket() does not record SocketTimeout exception

2021-07-29 Thread JiangHua Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389760#comment-17389760
 ] 

JiangHua Zhu commented on HDFS-15936:
-

[~weichiu], thank you for your comment and review.


> Solve BlockSender#sendPacket() does not record SocketTimeout exception
> --
>
> Key: HDFS-15936
> URL: https://issues.apache.org/jira/browse/HDFS-15936
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.2
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> In BlockSender#sendPacket(), if a SocketTimeoutException occurs, nothing is 
> recorded:
> {code:java}
> try {
>   ..
> } catch (IOException e) {
>   if (e instanceof SocketTimeoutException) {
>     /*
>      * writing to client timed out. This happens if the client reads
>      * part of a block and then decides not to read the rest (but leaves
>      * the socket open).
>      *
>      * Reporting of this case is done in DataXceiver#run
>      */
>   }
> }
> {code}
> No log record is generated here, which makes troubleshooting harder. We should 
> add a warning-level log line.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-14529) NPE while Loading the Editlogs

2021-07-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14529?focusedWorklogId=631055&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-631055
 ]

ASF GitHub Bot logged work on HDFS-14529:
-

Author: ASF GitHub Bot
Created on: 29/Jul/21 09:14
Start Date: 29/Jul/21 09:14
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3243:
URL: https://github.com/apache/hadoop/pull/3243#issuecomment-888949312


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 48s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  30m 57s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 20s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 18s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 59s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 28s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 27s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 13s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  16m 27s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  8s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 51s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 49s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 22s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m  5s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  15m 51s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 229m 12s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 47s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 313m 48s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3243/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3243 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux f27e9bfef1cf 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 
16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 458a11982967070f763025ffbe4b05b5e6f854eb |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3243/3/testReport/ |
   | Max. process+thread count | 3496 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3243/3/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   

[jira] [Commented] (HDFS-16146) All three replicas are lost due to not adding a new DataNode in time

2021-07-29 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389769#comment-17389769
 ] 

Xiaoqiao He commented on HDFS-16146:


Great catch here! LGTM. Thanks [~zhangshuyan] for the report and contribution.
cc [~weichiu], [~sodonnell], would you mind taking another look?

> All three replicas are lost due to not adding a new DataNode in time
> 
>
> Key: HDFS-16146
> URL: https://issues.apache.org/jira/browse/HDFS-16146
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, hdfs
>Reporter: Shuyan Zhang
>Assignee: Shuyan Zhang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We have a three-replica file, and all replicas of a block were lost while the 
> default datanode replacement strategy was in use. It happened like this:
> 1. addBlock() applies for a new block and successfully connects three 
> datanodes (dn1, dn2 and dn3) to build a pipeline;
> 2. Data is written;
> 3. dn1 hits an error and is kicked out. At this point the number of remaining 
> datanodes in the pipeline is > 1, so according to the replacement strategy 
> there is no need to add a new datanode;
> 4. After writing is completed, the pipeline enters PIPELINE_CLOSE;
> 5. dn2 hits an error and is kicked out. Because the pipeline is already in the 
> close phase, addDatanode2ExistingPipeline() decides to hand over the task of 
> transferring the replica to the NameNode. At this point there is only one 
> datanode left in the pipeline;
> 6. dn3 hits an error, and all replicas are lost.
> If we add a new datanode in step 5, we can avoid losing all replicas in this 
> case. An error in PIPELINE_CLOSE carries the same risk of losing replicas as 
> an error in DATA_STREAMING, so we should not skip adding a new datanode during 
> PIPELINE_CLOSE.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16146) All three replicas are lost due to not adding a new DataNode in time

2021-07-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16146?focusedWorklogId=631070&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-631070
 ]

ASF GitHub Bot logged work on HDFS-16146:
-

Author: ASF GitHub Bot
Created on: 29/Jul/21 10:19
Start Date: 29/Jul/21 10:19
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3247:
URL: https://github.com/apache/hadoop/pull/3247#issuecomment-888994716


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m 22s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 42s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m  9s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   0m 55s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 29s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m  6s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 46s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 38s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   2m 42s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  15m 52s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 50s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 53s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   0m 53s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 21s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 51s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 34s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 36s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m  7s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  19m 16s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 39s |  |  hadoop-hdfs-client in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 39s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   |  89m 45s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3247/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3247 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 85f243d811a3 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 
17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / dd5758bc92a1a234b9fb897a48656bf48184b9a6 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3247/1/testReport/ |
   | Max. process+thread count | 670 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-client U: 
hadoop-hdfs-project/hadoop-hdfs-client |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3247/1/console |
  

[jira] [Commented] (HDFS-15175) Multiple CloseOp shared block instance causes the standby namenode to crash when rolling editlog

2021-07-29 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389810#comment-17389810
 ] 

Stephen O'Donnell commented on HDFS-15175:
--

We should backport this to branch-3.3 too, otherwise the change might get lost 
if someone on 3.2.x upgrades. 3.2, 3.3 and trunk are the active 3.x branches 
now; 3.1 is end of life.

> Multiple CloseOp shared block instance causes the standby namenode to crash 
> when rolling editlog
> 
>
> Key: HDFS-15175
> URL: https://issues.apache.org/jira/browse/HDFS-15175
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.9.2
>Reporter: Yicong Cai
>Assignee: Wan Chang
>Priority: Critical
>  Labels: NameNode
> Fix For: 3.4.0, 3.2.3, 3.2.4
>
> Attachments: HDFS-15175-trunk.1.patch
>
>
>  
> {panel:title=Crash exception}
> 2020-02-16 09:24:46,426 [507844305] - ERROR [Edit log 
> tailer:FSEditLogLoader@245] - Encountered exception on operation CloseOp 
> [length=0, inodeId=0, path=..., replication=3, mtime=1581816138774, 
> atime=1581814760398, blockSize=536870912, blocks=[blk_5568434562_4495417845], 
> permissions=da_music:hdfs:rw-r-, aclEntries=null, clientName=, 
> clientMachine=, overwrite=false, storagePolicyId=0, opCode=OP_CLOSE, 
> txid=32625024993]
>  java.io.IOException: File is not under construction: ..
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:442)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:237)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:146)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:891)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:872)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:262)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:395)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$300(EditLogTailer.java:348)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:365)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:360)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1873)
>  at 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:479)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:361)
> {panel}
>  
> {panel:title=Editlog}
> 
>  OP_REASSIGN_LEASE
>  
>  32625021150
>  DFSClient_NONMAPREDUCE_-969060727_197760
>  ..
>  DFSClient_NONMAPREDUCE_1000868229_201260
>  
>  
> ..
> 
>  OP_CLOSE
>  
>  32625023743
>  0
>  0
>  ..
>  3
>  1581816135883
>  1581814760398
>  536870912
>  
>  
>  false
>  
>  5568434562
>  185818644
>  4495417845
>  
>  
>  da_music
>  hdfs
>  416
>  
>  
>  
> ..
> 
>  OP_TRUNCATE
>  
>  32625024049
>  ..
>  DFSClient_NONMAPREDUCE_1000868229_201260
>  ..
>  185818644
>  1581816136336
>  
>  5568434562
>  185818648
>  4495417845
>  
>  
>  
> ..
> 
>  OP_CLOSE
>  
>  32625024993
>  0
>  0
>  ..
>  3
>  1581816138774
>  1581814760398
>  536870912
>  
>  
>  false
>  
>  5568434562
>  185818644
>  4495417845
>  
>  
>  da_music
>  hdfs
>  416
>  
>  
>  
> {panel}
>  
>  
> The block size should be 185818648 in the first CloseOp. When truncate is 
> used, the block size becomes 185818644. The CloseOp/TruncateOp/CloseOp ops are 
> synchronized to the JournalNode in the same batch, and the block used by the 
> two CloseOps is the same instance, which causes the first CloseOp to carry the 
> wrong block size. When the SNN rolls the edit log, the TruncateOp does not put 
> the file into the UnderConstruction state. Then, when the second CloseOp is 
> executed, the file is not in the UnderConstruction state, and the SNN crashes.
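
A minimal sketch of the underlying idea of a fix, snapshotting the block fields 
when the CloseOp is logged instead of sharing the live block instance; the names 
mirror HDFS types, but this is an illustration, not the actual HDFS-15175 patch:

{code:java}
// Copy the block fields at the moment the CloseOp is built, so that a later
// TruncateOp mutating the live BlockInfo cannot retroactively change the
// length recorded for an op that is still waiting to be flushed in the batch.
Block[] snapshot = new Block[file.getBlocks().length];
for (int i = 0; i < snapshot.length; i++) {
  BlockInfo live = file.getBlocks()[i];
  snapshot[i] = new Block(live.getBlockId(), live.getNumBytes(),
      live.getGenerationStamp());
}
closeOp.setBlocks(snapshot);  // the edit log now records the size as of close time
{code}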



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15175) Multiple CloseOp shared block instance causes the standby namenode to crash when rolling editlog

2021-07-29 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated HDFS-15175:
---
Fix Version/s: 3.3.2

> Multiple CloseOp shared block instance causes the standby namenode to crash 
> when rolling editlog
> 
>
> Key: HDFS-15175
> URL: https://issues.apache.org/jira/browse/HDFS-15175
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.9.2
>Reporter: Yicong Cai
>Assignee: Wan Chang
>Priority: Critical
>  Labels: NameNode
> Fix For: 3.4.0, 3.2.3, 3.3.2, 3.2.4
>
> Attachments: HDFS-15175-trunk.1.patch
>
>
>  
> {panel:title=Crash exception}
> 2020-02-16 09:24:46,426 [507844305] - ERROR [Edit log 
> tailer:FSEditLogLoader@245] - Encountered exception on operation CloseOp 
> [length=0, inodeId=0, path=..., replication=3, mtime=1581816138774, 
> atime=1581814760398, blockSize=536870912, blocks=[blk_5568434562_4495417845], 
> permissions=da_music:hdfs:rw-r-, aclEntries=null, clientName=, 
> clientMachine=, overwrite=false, storagePolicyId=0, opCode=OP_CLOSE, 
> txid=32625024993]
>  java.io.IOException: File is not under construction: ..
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:442)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:237)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:146)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:891)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:872)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:262)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:395)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$300(EditLogTailer.java:348)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:365)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:360)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1873)
>  at 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:479)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:361)
> {panel}
>  
> {panel:title=Editlog}
> 
>  OP_REASSIGN_LEASE
>  
>  32625021150
>  DFSClient_NONMAPREDUCE_-969060727_197760
>  ..
>  DFSClient_NONMAPREDUCE_1000868229_201260
>  
>  
> ..
> 
>  OP_CLOSE
>  
>  32625023743
>  0
>  0
>  ..
>  3
>  1581816135883
>  1581814760398
>  536870912
>  
>  
>  false
>  
>  5568434562
>  185818644
>  4495417845
>  
>  
>  da_music
>  hdfs
>  416
>  
>  
>  
> ..
> 
>  OP_TRUNCATE
>  
>  32625024049
>  ..
>  DFSClient_NONMAPREDUCE_1000868229_201260
>  ..
>  185818644
>  1581816136336
>  
>  5568434562
>  185818648
>  4495417845
>  
>  
>  
> ..
> 
>  OP_CLOSE
>  
>  32625024993
>  0
>  0
>  ..
>  3
>  1581816138774
>  1581814760398
>  536870912
>  
>  
>  false
>  
>  5568434562
>  185818644
>  4495417845
>  
>  
>  da_music
>  hdfs
>  416
>  
>  
>  
> {panel}
>  
>  
> The block size should be 185818648 in the first CloseOp. When truncate is 
> used, the block size becomes 185818644. The CloseOp/TruncateOp/CloseOp ops are 
> synchronized to the JournalNode in the same batch, and the block used by the 
> two CloseOps is the same instance, which causes the first CloseOp to carry the 
> wrong block size. When the SNN rolls the edit log, the TruncateOp does not put 
> the file into the UnderConstruction state. Then, when the second CloseOp is 
> executed, the file is not in the UnderConstruction state, and the SNN crashes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15175) Multiple CloseOp shared block instance causes the standby namenode to crash when rolling editlog

2021-07-29 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389820#comment-17389820
 ] 

Xiaoqiao He commented on HDFS-15175:


Thanks [~sodonnell] for your reminder. cherry-pick-ed to branch-3.3.

> Multiple CloseOp shared block instance causes the standby namenode to crash 
> when rolling editlog
> 
>
> Key: HDFS-15175
> URL: https://issues.apache.org/jira/browse/HDFS-15175
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.9.2
>Reporter: Yicong Cai
>Assignee: Wan Chang
>Priority: Critical
>  Labels: NameNode
> Fix For: 3.4.0, 3.2.3, 3.3.2, 3.2.4
>
> Attachments: HDFS-15175-trunk.1.patch
>
>
>  
> {panel:title=Crash exception}
> 2020-02-16 09:24:46,426 [507844305] - ERROR [Edit log 
> tailer:FSEditLogLoader@245] - Encountered exception on operation CloseOp 
> [length=0, inodeId=0, path=..., replication=3, mtime=1581816138774, 
> atime=1581814760398, blockSize=536870912, blocks=[blk_5568434562_4495417845], 
> permissions=da_music:hdfs:rw-r-, aclEntries=null, clientName=, 
> clientMachine=, overwrite=false, storagePolicyId=0, opCode=OP_CLOSE, 
> txid=32625024993]
>  java.io.IOException: File is not under construction: ..
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:442)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:237)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:146)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:891)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:872)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:262)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:395)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$300(EditLogTailer.java:348)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:365)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:360)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1873)
>  at 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:479)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:361)
> {panel}
>  
> {panel:title=Editlog}
> 
>  OP_REASSIGN_LEASE
>  
>  32625021150
>  DFSClient_NONMAPREDUCE_-969060727_197760
>  ..
>  DFSClient_NONMAPREDUCE_1000868229_201260
>  
>  
> ..
> 
>  OP_CLOSE
>  
>  32625023743
>  0
>  0
>  ..
>  3
>  1581816135883
>  1581814760398
>  536870912
>  
>  
>  false
>  
>  5568434562
>  185818644
>  4495417845
>  
>  
>  da_music
>  hdfs
>  416
>  
>  
>  
> ..
> 
>  OP_TRUNCATE
>  
>  32625024049
>  ..
>  DFSClient_NONMAPREDUCE_1000868229_201260
>  ..
>  185818644
>  1581816136336
>  
>  5568434562
>  185818648
>  4495417845
>  
>  
>  
> ..
> 
>  OP_CLOSE
>  
>  32625024993
>  0
>  0
>  ..
>  3
>  1581816138774
>  1581814760398
>  536870912
>  
>  
>  false
>  
>  5568434562
>  185818644
>  4495417845
>  
>  
>  da_music
>  hdfs
>  416
>  
>  
>  
> {panel}
>  
>  
> The block size should be 185818648 in the first CloseOp. When truncate is 
> used, the block size becomes 185818644. The CloseOp/TruncateOp/CloseOp ops are 
> synchronized to the JournalNode in the same batch, and the block used by the 
> two CloseOps is the same instance, which causes the first CloseOp to carry the 
> wrong block size. When the SNN rolls the edit log, the TruncateOp does not put 
> the file into the UnderConstruction state. Then, when the second CloseOp is 
> executed, the file is not in the UnderConstruction state, and the SNN crashes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15175) Multiple CloseOp shared block instance causes the standby namenode to crash when rolling editlog

2021-07-29 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389821#comment-17389821
 ] 

Stephen O'Donnell commented on HDFS-15175:
--

Thanks for committing it. It saved me some work today!

> Multiple CloseOp shared block instance causes the standby namenode to crash 
> when rolling editlog
> 
>
> Key: HDFS-15175
> URL: https://issues.apache.org/jira/browse/HDFS-15175
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.9.2
>Reporter: Yicong Cai
>Assignee: Wan Chang
>Priority: Critical
>  Labels: NameNode
> Fix For: 3.4.0, 3.2.3, 3.3.2, 3.2.4
>
> Attachments: HDFS-15175-trunk.1.patch
>
>
>  
> {panel:title=Crash exception}
> 2020-02-16 09:24:46,426 [507844305] - ERROR [Edit log 
> tailer:FSEditLogLoader@245] - Encountered exception on operation CloseOp 
> [length=0, inodeId=0, path=..., replication=3, mtime=1581816138774, 
> atime=1581814760398, blockSize=536870912, blocks=[blk_5568434562_4495417845], 
> permissions=da_music:hdfs:rw-r-, aclEntries=null, clientName=, 
> clientMachine=, overwrite=false, storagePolicyId=0, opCode=OP_CLOSE, 
> txid=32625024993]
>  java.io.IOException: File is not under construction: ..
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:442)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:237)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:146)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:891)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:872)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:262)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:395)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$300(EditLogTailer.java:348)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:365)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:360)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1873)
>  at 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:479)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:361)
> {panel}
>  
> {panel:title=Editlog}
> 
>  OP_REASSIGN_LEASE
>  
>  32625021150
>  DFSClient_NONMAPREDUCE_-969060727_197760
>  ..
>  DFSClient_NONMAPREDUCE_1000868229_201260
>  
>  
> ..
> 
>  OP_CLOSE
>  
>  32625023743
>  0
>  0
>  ..
>  3
>  1581816135883
>  1581814760398
>  536870912
>  
>  
>  false
>  
>  5568434562
>  185818644
>  4495417845
>  
>  
>  da_music
>  hdfs
>  416
>  
>  
>  
> ..
> 
>  OP_TRUNCATE
>  
>  32625024049
>  ..
>  DFSClient_NONMAPREDUCE_1000868229_201260
>  ..
>  185818644
>  1581816136336
>  
>  5568434562
>  185818648
>  4495417845
>  
>  
>  
> ..
> 
>  OP_CLOSE
>  
>  32625024993
>  0
>  0
>  ..
>  3
>  1581816138774
>  1581814760398
>  536870912
>  
>  
>  false
>  
>  5568434562
>  185818644
>  4495417845
>  
>  
>  da_music
>  hdfs
>  416
>  
>  
>  
> {panel}
>  
>  
> The block size should be 185818648 in the first CloseOp. When truncate is 
> used, the block size becomes 185818644. The CloseOp/TruncateOp/CloseOp ops are 
> synchronized to the JournalNode in the same batch, and the block used by the 
> two CloseOps is the same instance, which causes the first CloseOp to carry the 
> wrong block size. When the SNN rolls the edit log, the TruncateOp does not put 
> the file into the UnderConstruction state. Then, when the second CloseOp is 
> executed, the file is not in the UnderConstruction state, and the SNN crashes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16129) HttpFS signature secret file misusage

2021-07-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16129?focusedWorklogId=631092&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-631092
 ]

ASF GitHub Bot logged work on HDFS-16129:
-

Author: ASF GitHub Bot
Created on: 29/Jul/21 11:27
Start Date: 29/Jul/21 11:27
Worklog Time Spent: 10m 
  Work Description: szilard-nemeth commented on pull request #3209:
URL: https://github.com/apache/hadoop/pull/3209#issuecomment-889038363


   Retriggered build: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/view/change-requests/job/PR-3209/2/console


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 631092)
Time Spent: 20m  (was: 10m)

> HttpFS signature secret file misusage
> -
>
> Key: HDFS-16129
> URL: https://issues.apache.org/jira/browse/HDFS-16129
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: httpfs
>Affects Versions: 3.4.0
>Reporter: Tamas Domok
>Assignee: Tamas Domok
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I started to work on YARN-10814 and found this bug in HttpFS. I investigated 
> the problem and already have a fix for it.
>  
> If the deprecated *httpfs.authentication.signature.secret.file* is not set in 
> the configuration (e.g. httpfs-site.xml), then the new 
> *hadoop.http.authentication.signature.secret.file* config option is not used 
> either; HttpFS silently falls back to the random secret provider.
> _HttpFSServerWebServer_ sets an _authFilterConfigurationPrefix_ for the old 
> prefix (*httpfs.authentication.*) when building the server. Later, 
> _AuthenticationFilter.constructSecretProvider_ immediately falls back to 
> +random+ because the filter config does not contain the secret file. If the 
> old property was set as well, the file was handled and the provider was set to 
> the +file+ type.
> The filter configuration should be built from both the old and the new prefix, 
> merging the two; in my opinion the new config option should win.
>  
> There is another, closely related issue in _HttpFSAuthenticationFilter_.
> If both config options are set, _HttpFSAuthenticationFilter_ fails with an 
> impossible file path (e.g. *${httpfs.config.dir}/httpfs-signature.secret*, the 
> value with the variable left unexpanded).
> _HttpFSAuthenticationFilter_ constructs the configuration by filtering first 
> the new config prefix and then the old one. The old-prefix code works 
> correctly because it uses _conf.get(key)_ instead of _entry.getValue()_, which 
> returns the unexpanded file path mentioned earlier. The code duplication can 
> be eliminated, and I think it would be better to change the order: first add 
> the config options from the old prefix, then the new ones so that the new 
> values overwrite the old, with a warning log message.
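
A minimal sketch of the merge order suggested above (old prefix first, then the 
new prefix so its values win); the class and method names are illustrative and 
not part of the actual patch:

{code:java}
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;

final class AuthFilterConfigMerger {
  private static final String OLD_PREFIX = "httpfs.authentication.";
  private static final String NEW_PREFIX = "hadoop.http.authentication.";

  /** Hypothetical helper: merge both prefixes, new values overwriting old. */
  static Map<String, String> merge(Configuration conf) {
    Map<String, String> filterConfig = new HashMap<>();
    copyPrefixed(conf, OLD_PREFIX, filterConfig);
    copyPrefixed(conf, NEW_PREFIX, filterConfig);  // the new prefix wins
    return filterConfig;
  }

  private static void copyPrefixed(Configuration conf, String prefix,
      Map<String, String> out) {
    for (String name : conf.getPropsWithPrefix(prefix).keySet()) {
      // conf.get() applies variable expansion such as ${httpfs.config.dir}
      out.put(name, conf.get(prefix + name));
    }
  }
}
{code}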



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16147) load fsimage with parallelization and compression

2021-07-29 Thread liuyongpan (Jira)
liuyongpan created HDFS-16147:
-

 Summary: load fsimage with parallelization and compression
 Key: HDFS-16147
 URL: https://issues.apache.org/jira/browse/HDFS-16147
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 3.3.0
Reporter: liuyongpan
 Fix For: 3.3.0


   HDFS-14617 allows the inode and inode directory sections of the fsimage to 
be loaded in parallel, but parallel loading and compression cannot be enabled 
at the same time. This issue fixes that limitation.
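
For context, these are the two existing hdfs-site.xml settings involved; today 
enabling them together disables parallel loading, which is the limitation this 
issue addresses:

{code:xml}
<property>
  <name>dfs.image.parallel.load</name>
  <value>true</value>
</property>
<property>
  <name>dfs.image.compress</name>
  <value>true</value>
</property>
<property>
  <name>dfs.image.compression.codec</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec</value>
</property>
{code}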



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16147) load fsimage with parallelization and compression

2021-07-29 Thread liuyongpan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liuyongpan updated HDFS-16147:
--
Attachment: HDFS-16147.001.patch
Status: Patch Available  (was: Open)

> load fsimage with parallelization and compression
> -
>
> Key: HDFS-16147
> URL: https://issues.apache.org/jira/browse/HDFS-16147
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: liuyongpan
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: HDFS-16147.001.patch
>
>
> HDFS-14617 allows the inode and inode directory sections of the fsimage to 
> be loaded in parallel, but parallelism and compression cannot be enabled at 
> the same time. I fixed this defect.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16147) load fsimage with parallelization and compression

2021-07-29 Thread liuyongpan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liuyongpan updated HDFS-16147:
--
Release Note: In HDFS-14617, it allows the inode and inode directory 
sections of the fsimage to be loaded in parallel. But it can't turn on 
parallelism and compression at the same time. I fixed this defect.
 Description: (was:In HDFS-14617,  it  allows the inode and 
inode directory sections of the fsimage to be loaded in parallel. But it can't 
turn on parallelism and compression at the same time. I fixed this defect.)

> load fsimage with parallelization and compression
> -
>
> Key: HDFS-16147
> URL: https://issues.apache.org/jira/browse/HDFS-16147
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: liuyongpan
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: HDFS-16147.001.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14617) Improve fsimage load time by writing sub-sections to the fsimage index

2021-07-29 Thread liuyongpan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389857#comment-17389857
 ] 

liuyongpan commented on HDFS-14617:
---

Stephen O'Donnell, I have created the issue HDFS-16147.

> Improve fsimage load time by writing sub-sections to the fsimage index
> --
>
> Key: HDFS-14617
> URL: https://issues.apache.org/jira/browse/HDFS-14617
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Fix For: 2.10.0, 3.3.0
>
> Attachments: HDFS-14617.001.patch, ParallelLoading.svg, 
> SerialLoading.svg, dirs-single.svg, flamegraph.parallel.svg, 
> flamegraph.serial.svg, inodes.svg
>
>
> Loading an fsimage is basically a single threaded process. The current 
> fsimage is written out in sections, eg iNode, iNode_Directory, Snapshots, 
> Snapshot_Diff etc. Then at the end of the file, an index is written that 
> contains the offset and length of each section. The image loader code uses 
> this index to initialize an input stream to read and process each section. It 
> is important that one section is fully loaded before another is started, as 
> the next section depends on the results of the previous one.
> What I would like to propose is the following:
> 1. When writing the image, we can optionally output sub_sections to the 
> index. That way, a given section would effectively be split into several 
> sections, eg:
> {code:java}
>inode_section offset 10 length 1000
>  inode_sub_section offset 10 length 500
>  inode_sub_section offset 510 length 500
>  
>inode_dir_section offset 1010 length 1000
>  inode_dir_sub_section offset 1010 length 500
>  inode_dir_sub_section offset 1510 length 500
> {code}
> Here you can see we still have the original section index, but then we also 
> have sub-section entries that cover the entire section. Then a processor can 
> either read the full section in serial, or read each sub-section in parallel.
> 2. In the Image Writer code, we should set a target number of sub-sections, 
> and then based on the total inodes in memory, it will create that many 
> sub-sections per major image section. I think the only sections worth doing 
> this for are inode, inode_reference, inode_dir and snapshot_diff. All others 
> tend to be fairly small in practice.
> 3. If there are under some threshold of inodes (eg 10M) then don't bother 
> with the sub-sections as a serial load only takes a few seconds at that scale.
> 4. The image loading code can then have a switch to enable 'parallel loading' 
> and a 'number of threads' where it uses the sub-sections, or if not enabled 
> falls back to the existing logic to read the entire section in serial.
> Working with a large image of 316M inodes and 35GB on disk, I have a proof of 
> concept of this change working, allowing just inode and inode_dir to be 
> loaded in parallel, but I believe inode_reference and snapshot_diff can be 
> made parallel with the same technique.
> Some benchmarks I have are as follows:
> {code:java}
> Threads   1 2 3 4 
> 
> inodes448   290   226   189 
> inode_dir 326   211   170   161 
> Total 927   651   535   488 (MD5 calculation about 100 seconds)
> {code}
> The above table shows the time in seconds to load the inode section and the 
> inode_directory section, and then the total load time of the image.
> With 4 threads using the above technique, we are able to more than halve the 
> load time of the two sections. With the patch in HDFS-13694 it would take a 
> further 100 seconds off the run time, going from 927 seconds to 388, which is 
> a significant improvement. Adding more threads beyond 4 has diminishing 
> returns as there are some synchronized points in the loading code to protect 
> the in memory structures.
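
To make the sub-section idea concrete, here is a minimal, hypothetical sketch 
of loading one image section from its sub-section index entries in parallel. 
The SubSection class and loadSubSection() method are illustrative stand-ins, 
not the actual FSImageFormatProtobuf loader code.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelSectionLoader {

  /** One sub-section entry from the image index: a byte range of the file. */
  static class SubSection {
    final long offset;
    final long length;
    SubSection(long offset, long length) {
      this.offset = offset;
      this.length = length;
    }
  }

  /**
   * Load one section. With a single sub-section (or one thread) this is the
   * existing serial load; otherwise each sub-section is read by its own task
   * over its own input stream.
   */
  static void loadSection(List<SubSection> subSections, int numThreads)
      throws InterruptedException, ExecutionException {
    if (subSections.size() <= 1 || numThreads <= 1) {
      for (SubSection s : subSections) {
        loadSubSection(s);
      }
      return;
    }
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (SubSection s : subSections) {
        futures.add(pool.submit(() -> loadSubSection(s)));
      }
      // Wait for every sub-section: the whole section must be fully loaded
      // before the next section starts, since later sections depend on it.
      for (Future<?> f : futures) {
        f.get();
      }
    } finally {
      pool.shutdown();
    }
  }

  private static void loadSubSection(SubSection s) {
    // Placeholder: open a stream at s.offset and deserialize s.length bytes
    // of inodes, synchronizing only around shared in-memory structures.
  }
}
{code}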



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (HDFS-14617) Improve fsimage load time by writing sub-sections to the fsimage index

2021-07-29 Thread liuyongpan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liuyongpan updated HDFS-14617:
--
Comment: was deleted

(was: of course)

> Improve fsimage load time by writing sub-sections to the fsimage index
> --
>
> Key: HDFS-14617
> URL: https://issues.apache.org/jira/browse/HDFS-14617
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Fix For: 2.10.0, 3.3.0
>
> Attachments: HDFS-14617.001.patch, ParallelLoading.svg, 
> SerialLoading.svg, dirs-single.svg, flamegraph.parallel.svg, 
> flamegraph.serial.svg, inodes.svg
>
>
> Loading an fsimage is basically a single threaded process. The current 
> fsimage is written out in sections, eg iNode, iNode_Directory, Snapshots, 
> Snapshot_Diff etc. Then at the end of the file, an index is written that 
> contains the offset and length of each section. The image loader code uses 
> this index to initialize an input stream to read and process each section. It 
> is important that one section is fully loaded before another is started, as 
> the next section depends on the results of the previous one.
> What I would like to propose is the following:
> 1. When writing the image, we can optionally output sub_sections to the 
> index. That way, a given section would effectively be split into several 
> sections, eg:
> {code:java}
>inode_section offset 10 length 1000
>  inode_sub_section offset 10 length 500
>  inode_sub_section offset 510 length 500
>  
>inode_dir_section offset 1010 length 1000
>  inode_dir_sub_section offset 1010 length 500
>  inode_dir_sub_section offset 1510 length 500
> {code}
> Here you can see we still have the original section index, but then we also 
> have sub-section entries that cover the entire section. Then a processor can 
> either read the full section in serial, or read each sub-section in parallel.
> 2. In the Image Writer code, we should set a target number of sub-sections, 
> and then based on the total inodes in memory, it will create that many 
> sub-sections per major image section. I think the only sections worth doing 
> this for are inode, inode_reference, inode_dir and snapshot_diff. All others 
> tend to be fairly small in practice.
> 3. If there are under some threshold of inodes (eg 10M) then don't bother 
> with the sub-sections as a serial load only takes a few seconds at that scale.
> 4. The image loading code can then have a switch to enable 'parallel loading' 
> and a 'number of threads' where it uses the sub-sections, or if not enabled 
> falls back to the existing logic to read the entire section in serial.
> Working with a large image of 316M inodes and 35GB on disk, I have a proof of 
> concept of this change working, allowing just inode and inode_dir to be 
> loaded in parallel, but I believe inode_reference and snapshot_diff can be 
> made parallel with the same technique.
> Some benchmarks I have are as follows:
> {code:java}
> Threads   1 2 3 4 
> 
> inodes448   290   226   189 
> inode_dir 326   211   170   161 
> Total 927   651   535   488 (MD5 calculation about 100 seconds)
> {code}
> The above table shows the time in seconds to load the inode section and the 
> inode_directory section, and then the total load time of the image.
> With 4 threads using the above technique, we are able to more than halve the 
> load time of the two sections. With the patch in HDFS-13694 it would take a 
> further 100 seconds off the run time, going from 927 seconds to 388, which is 
> a significant improvement. Adding more threads beyond 4 has diminishing 
> returns as there are some synchronized points in the loading code to protect 
> the in memory structures.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14617) Improve fsimage load time by writing sub-sections to the fsimage index

2021-07-29 Thread liuyongpan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389861#comment-17389861
 ] 

liuyongpan commented on HDFS-14617:
---

Stephen O'Donnell, I have created the issue HDFS-16147; you can take a look at 
it sometime.


> Improve fsimage load time by writing sub-sections to the fsimage index
> --
>
> Key: HDFS-14617
> URL: https://issues.apache.org/jira/browse/HDFS-14617
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Fix For: 2.10.0, 3.3.0
>
> Attachments: HDFS-14617.001.patch, ParallelLoading.svg, 
> SerialLoading.svg, dirs-single.svg, flamegraph.parallel.svg, 
> flamegraph.serial.svg, inodes.svg
>
>
> Loading an fsimage is basically a single threaded process. The current 
> fsimage is written out in sections, eg iNode, iNode_Directory, Snapshots, 
> Snapshot_Diff etc. Then at the end of the file, an index is written that 
> contains the offset and length of each section. The image loader code uses 
> this index to initialize an input stream to read and process each section. It 
> is important that one section is fully loaded before another is started, as 
> the next section depends on the results of the previous one.
> What I would like to propose is the following:
> 1. When writing the image, we can optionally output sub_sections to the 
> index. That way, a given section would effectively be split into several 
> sections, eg:
> {code:java}
>inode_section offset 10 length 1000
>  inode_sub_section offset 10 length 500
>  inode_sub_section offset 510 length 500
>  
>inode_dir_section offset 1010 length 1000
>  inode_dir_sub_section offset 1010 length 500
>  inode_dir_sub_section offset 1510 length 500
> {code}
> Here you can see we still have the original section index, but then we also 
> have sub-section entries that cover the entire section. Then a processor can 
> either read the full section in serial, or read each sub-section in parallel.
> 2. In the Image Writer code, we should set a target number of sub-sections, 
> and then based on the total inodes in memory, it will create that many 
> sub-sections per major image section. I think the only sections worth doing 
> this for are inode, inode_reference, inode_dir and snapshot_diff. All others 
> tend to be fairly small in practice.
> 3. If there are under some threshold of inodes (eg 10M) then don't bother 
> with the sub-sections as a serial load only takes a few seconds at that scale.
> 4. The image loading code can then have a switch to enable 'parallel loading' 
> and a 'number of threads' where it uses the sub-sections, or if not enabled 
> falls back to the existing logic to read the entire section in serial.
> Working with a large image of 316M inodes and 35GB on disk, I have a proof of 
> concept of this change working, allowing just inode and inode_dir to be 
> loaded in parallel, but I believe inode_reference and snapshot_diff can be 
> made parallel with the same technique.
> Some benchmarks I have are as follows:
> {code:java}
> Threads   1 2 3 4 
> 
> inodes448   290   226   189 
> inode_dir 326   211   170   161 
> Total 927   651   535   488 (MD5 calculation about 100 seconds)
> {code}
> The above table shows the time in seconds to load the inode section and the 
> inode_directory section, and then the total load time of the image.
> With 4 threads using the above technique, we are able to more than halve the 
> load time of the two sections. With the patch in HDFS-13694 it would take a 
> further 100 seconds off the run time, going from 927 seconds to 388, which is 
> a significant improvement. Adding more threads beyond 4 has diminishing 
> returns as there are some synchronized points in the loading code to protect 
> the in memory structures.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (HDFS-14617) Improve fsimage load time by writing sub-sections to the fsimage index

2021-07-29 Thread liuyongpan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liuyongpan updated HDFS-14617:
--
Comment: was deleted

(was:  Stephen O'Donnell , I have created the issue HDFS-16147.)

> Improve fsimage load time by writing sub-sections to the fsimage index
> --
>
> Key: HDFS-14617
> URL: https://issues.apache.org/jira/browse/HDFS-14617
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Fix For: 2.10.0, 3.3.0
>
> Attachments: HDFS-14617.001.patch, ParallelLoading.svg, 
> SerialLoading.svg, dirs-single.svg, flamegraph.parallel.svg, 
> flamegraph.serial.svg, inodes.svg
>
>
> Loading an fsimage is basically a single threaded process. The current 
> fsimage is written out in sections, eg iNode, iNode_Directory, Snapshots, 
> Snapshot_Diff etc. Then at the end of the file, an index is written that 
> contains the offset and length of each section. The image loader code uses 
> this index to initialize an input stream to read and process each section. It 
> is important that one section is fully loaded before another is started, as 
> the next section depends on the results of the previous one.
> What I would like to propose is the following:
> 1. When writing the image, we can optionally output sub_sections to the 
> index. That way, a given section would effectively be split into several 
> sections, eg:
> {code:java}
>inode_section offset 10 length 1000
>  inode_sub_section offset 10 length 500
>  inode_sub_section offset 510 length 500
>  
>inode_dir_section offset 1010 length 1000
>  inode_dir_sub_section offset 1010 length 500
>  inode_dir_sub_section offset 1510 length 500
> {code}
> Here you can see we still have the original section index, but then we also 
> have sub-section entries that cover the entire section. Then a processor can 
> either read the full section in serial, or read each sub-section in parallel.
> 2. In the Image Writer code, we should set a target number of sub-sections, 
> and then based on the total inodes in memory, it will create that many 
> sub-sections per major image section. I think the only sections worth doing 
> this for are inode, inode_reference, inode_dir and snapshot_diff. All others 
> tend to be fairly small in practice.
> 3. If there are under some threshold of inodes (eg 10M) then don't bother 
> with the sub-sections as a serial load only takes a few seconds at that scale.
> 4. The image loading code can then have a switch to enable 'parallel loading' 
> and a 'number of threads' where it uses the sub-sections, or if not enabled 
> falls back to the existing logic to read the entire section in serial.
> Working with a large image of 316M inodes and 35GB on disk, I have a proof of 
> concept of this change working, allowing just inode and inode_dir to be 
> loaded in parallel, but I believe inode_reference and snapshot_diff can be 
> made parallel with the same technique.
> Some benchmarks I have are as follows:
> {code:java}
> Threads   1 2 3 4 
> 
> inodes448   290   226   189 
> inode_dir 326   211   170   161 
> Total 927   651   535   488 (MD5 calculation about 100 seconds)
> {code}
> The above table shows the time in seconds to load the inode section and the 
> inode_directory section, and then the total load time of the image.
> With 4 threads using the above technique, we are able to more than halve the 
> load time of the two sections. With the patch in HDFS-13694 it would take a 
> further 100 seconds off the run time, going from 927 seconds to 388, which is 
> a significant improvement. Adding more threads beyond 4 has diminishing 
> returns as there are some synchronized points in the loading code to protect 
> the in memory structures.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16147) load fsimage with parallelization and compression

2021-07-29 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389880#comment-17389880
 ] 

Stephen O'Donnell commented on HDFS-16147:
--

I only quickly looked at this, but I have a few questions.

With this change, will tools like OIV be able to read the image? It looks like 
there are a series of new compressed sections, and OIV currently does not read 
the image in parallel - it will just try to read it from the start to the end - 
will this work?

If we have parallel loading enabled and compressed sub-sections, and we then 
disable parallel loading for some reason, will the image still be readable?

Have you been able to benchmark loading a large image in parallel with 
compression enabled and disabled so we can see if compression makes it faster 
or slower?

> load fsimage with parallelization and compression
> -
>
> Key: HDFS-16147
> URL: https://issues.apache.org/jira/browse/HDFS-16147
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: liuyongpan
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: HDFS-16147.001.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14617) Improve fsimage load time by writing sub-sections to the fsimage index

2021-07-29 Thread liuyongpan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389861#comment-17389861
 ] 

liuyongpan edited comment on HDFS-14617 at 7/29/21, 12:53 PM:
--

[~sodonnell], I have created the issue HDFS-16147; you can take a look at it 
sometime.


was (Author: mofei):
Stephen O'Donnell , I have created the issue HDFS-16147, you can take a look at 
it sometime.


> Improve fsimage load time by writing sub-sections to the fsimage index
> --
>
> Key: HDFS-14617
> URL: https://issues.apache.org/jira/browse/HDFS-14617
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Fix For: 2.10.0, 3.3.0
>
> Attachments: HDFS-14617.001.patch, ParallelLoading.svg, 
> SerialLoading.svg, dirs-single.svg, flamegraph.parallel.svg, 
> flamegraph.serial.svg, inodes.svg
>
>
> Loading an fsimage is basically a single threaded process. The current 
> fsimage is written out in sections, eg iNode, iNode_Directory, Snapshots, 
> Snapshot_Diff etc. Then at the end of the file, an index is written that 
> contains the offset and length of each section. The image loader code uses 
> this index to initialize an input stream to read and process each section. It 
> is important that one section is fully loaded before another is started, as 
> the next section depends on the results of the previous one.
> What I would like to propose is the following:
> 1. When writing the image, we can optionally output sub_sections to the 
> index. That way, a given section would effectively be split into several 
> sections, eg:
> {code:java}
>inode_section offset 10 length 1000
>  inode_sub_section offset 10 length 500
>  inode_sub_section offset 510 length 500
>  
>inode_dir_section offset 1010 length 1000
>  inode_dir_sub_section offset 1010 length 500
>  inode_dir_sub_section offset 1510 length 500
> {code}
> Here you can see we still have the original section index, but then we also 
> have sub-section entries that cover the entire section. Then a processor can 
> either read the full section in serial, or read each sub-section in parallel.
> 2. In the Image Writer code, we should set a target number of sub-sections, 
> and then based on the total inodes in memory, it will create that many 
> sub-sections per major image section. I think the only sections worth doing 
> this for are inode, inode_reference, inode_dir and snapshot_diff. All others 
> tend to be fairly small in practice.
> 3. If there are under some threshold of inodes (eg 10M) then don't bother 
> with the sub-sections as a serial load only takes a few seconds at that scale.
> 4. The image loading code can then have a switch to enable 'parallel loading' 
> and a 'number of threads' where it uses the sub-sections, or if not enabled 
> falls back to the existing logic to read the entire section in serial.
> Working with a large image of 316M inodes and 35GB on disk, I have a proof of 
> concept of this change working, allowing just inode and inode_dir to be 
> loaded in parallel, but I believe inode_reference and snapshot_diff can be 
> made parallel with the same technique.
> Some benchmarks I have are as follows:
> {code:java}
> Threads   1 2 3 4 
> 
> inodes448   290   226   189 
> inode_dir 326   211   170   161 
> Total 927   651   535   488 (MD5 calculation about 100 seconds)
> {code}
> The above table shows the time in seconds to load the inode section and the 
> inode_directory section, and then the total load time of the image.
> With 4 threads using the above technique, we are able to more than halve the 
> load time of the two sections. With the patch in HDFS-13694 it would take a 
> further 100 seconds off the run time, going from 927 seconds to 388, which is 
> a significant improvement. Adding more threads beyond 4 has diminishing 
> returns as there are some synchronized points in the loading code to protect 
> the in memory structures.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16147) load fsimage with parallelization and compression

2021-07-29 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-16147:

Target Version/s:   (was: 3.3.0)

> load fsimage with parallelization and compression
> -
>
> Key: HDFS-16147
> URL: https://issues.apache.org/jira/browse/HDFS-16147
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: liuyongpan
>Priority: Minor
> Attachments: HDFS-16147.001.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16147) load fsimage with parallelization and compression

2021-07-29 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-16147:

Fix Version/s: (was: 3.3.0)

> load fsimage with parallelization and compression
> -
>
> Key: HDFS-16147
> URL: https://issues.apache.org/jira/browse/HDFS-16147
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: liuyongpan
>Priority: Minor
> Attachments: HDFS-16147.001.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16147) load fsimage with parallelization and compression

2021-07-29 Thread liuyongpan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389888#comment-17389888
 ] 

liuyongpan commented on HDFS-16147:
---

[~sodonnell]

I tested it in my test environment and checked your questions.

1. OIV can read the image, but it cannot read the image in parallel; I will do 
that later.

2. The current logic can read both the old fsimage and the new fsimage; I have 
tested this.

3. It can load a 300M fsimage with compression and parallelization, which can 
improve loading time by 50%.

> load fsimage with parallelization and compression
> -
>
> Key: HDFS-16147
> URL: https://issues.apache.org/jira/browse/HDFS-16147
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: liuyongpan
>Priority: Minor
> Attachments: HDFS-16147.001.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9266) Avoid unsafe split and append on fields that might be IPv6 literals

2021-07-29 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389890#comment-17389890
 ] 

Hadoop QA commented on HDFS-9266:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
49s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
1s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 9 
new or modified test files. {color} |
|| || || || {color:brown} HADOOP-17800 Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 12m 
36s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
34s{color} | {color:green}{color} | {color:green} HADOOP-17800 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
19s{color} | {color:green}{color} | {color:green} HADOOP-17800 passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
58s{color} | {color:green}{color} | {color:green} HADOOP-17800 passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
20s{color} | {color:green}{color} | {color:green} HADOOP-17800 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
31s{color} | {color:green}{color} | {color:green} HADOOP-17800 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 58s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
41s{color} | {color:green}{color} | {color:green} HADOOP-17800 passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
11s{color} | {color:green}{color} | {color:green} HADOOP-17800 passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 27m 
29s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are 
enabled, using SpotBugs. {color} |
| {color:green}+1{color} | {color:green} spotbugs {color} | {color:green}  5m 
41s{color} | {color:green}{color} | {color:green} HADOOP-17800 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
28s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 4s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m  
0s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m  
0s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
50s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 
50s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 8s{color} | {color:green}{color} | {color:green} hadoop-hdfs-project: The 
patch generated 0 new + 403 unchanged - 2 fixed = 403 total (was 405) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
13s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace 
issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  6s{color} | {color:green}{color} | {color:green} patch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javad

[jira] [Commented] (HDFS-16147) load fsimage with parallelization and compression

2021-07-29 Thread liuyongpan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389905#comment-17389905
 ] 

liuyongpan commented on HDFS-16147:
---

Sorry, OIV can't read the image.

> load fsimage with parallelization and compression
> -
>
> Key: HDFS-16147
> URL: https://issues.apache.org/jira/browse/HDFS-16147
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: liuyongpan
>Priority: Minor
> Attachments: HDFS-16147.001.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16147) load fsimage with parallelization and compression

2021-07-29 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389923#comment-17389923
 ] 

Stephen O'Donnell commented on HDFS-16147:
--

It is important that OIV can read these images.

If you create a parallel compressed image with this patch, and then try to load 
it on a NN without this patch and with parallel loading disabled, is the NN 
still able to load it? I suspect it won't be, as there are multiple compressed 
sections, so it cannot read a single compressed stream from end to end.

{quote}
It can load 300M Fsimage with  compression and parallelization , which can 
improve 50% loading time.
{quote}

Is the 50% improvement measured against a compressed single-threaded load 
versus parallel compressed loading?

How do the load times compare between "parallel, not compressed" and "parallel, 
compressed"?
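
To illustrate the concern, here is a minimal, hypothetical sketch of the writer 
side when each sub-section is compressed independently: a reader that expects 
one compressed stream spanning the whole section cannot simply decompress from 
the section offset to its end, it needs the sub-section boundaries from the 
index. This is not the actual image writer code.

{code:java}
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.List;
import java.util.zip.DeflaterOutputStream;

public class PerSubSectionCompression {

  /** Compress each sub-section independently and record its compressed size. */
  static long[] writeCompressed(List<byte[]> subSections,
      ByteArrayOutputStream image) throws IOException {
    long[] compressedLengths = new long[subSections.size()];
    for (int i = 0; i < subSections.size(); i++) {
      long before = image.size();
      DeflaterOutputStream out = new DeflaterOutputStream(image);
      out.write(subSections.get(i));
      out.finish();                // ends this sub-section's compressed stream
      compressedLengths[i] = image.size() - before;
      // The (offset, length) of every sub-section must be written to the
      // image index; without it a reader cannot find the stream boundaries.
    }
    return compressedLengths;
  }
}
{code}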

> load fsimage with parallelization and compression
> -
>
> Key: HDFS-16147
> URL: https://issues.apache.org/jira/browse/HDFS-16147
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: liuyongpan
>Priority: Minor
> Attachments: HDFS-16147.001.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16147) load fsimage with parallelization and compression

2021-07-29 Thread liuyongpan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389947#comment-17389947
 ] 

liuyongpan commented on HDFS-16147:
---

[~sodonnell], thank you very much for your advice. I will check it carefully.

> load fsimage with parallelization and compression
> -
>
> Key: HDFS-16147
> URL: https://issues.apache.org/jira/browse/HDFS-16147
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.3.0
>Reporter: liuyongpan
>Priority: Minor
> Attachments: HDFS-16147.001.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16147) load fsimage with parallelization and compression

2021-07-29 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17390089#comment-17390089
 ] 

Hadoop QA commented on HDFS-16147:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 13m 
46s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 1 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 32m 
56s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
29s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
18s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 1s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
25s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 17s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
31s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 22m  
7s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are 
enabled, using SpotBugs. {color} |
| {color:green}+1{color} | {color:green} spotbugs {color} | {color:green}  3m 
21s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
19s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
22s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
22s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
11s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
11s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
53s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
21s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace 
issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 29s{color} | {color:green}{color} | {color:green} patch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
22s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} spotbugs {color} | {color:green}  3m 
21s{color} | {color:g

[jira] [Work logged] (HDFS-16129) HttpFS signature secret file misusage

2021-07-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16129?focusedWorklogId=631337&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-631337
 ]

ASF GitHub Bot logged work on HDFS-16129:
-

Author: ASF GitHub Bot
Created on: 29/Jul/21 20:50
Start Date: 29/Jul/21 20:50
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3209:
URL: https://github.com/apache/hadoop/pull/3209#issuecomment-889439594


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m  3s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  2s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 4 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  13m  7s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  23m  9s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  29m 25s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | -1 :x: |  compile  |  22m 56s | 
[/branch-compile-root-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3209/3/artifact/out/branch-compile-root-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt)
 |  root in trunk failed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.  |
   | -0 :warning: |  checkstyle  |   0m 38s | 
[/buildtool-branch-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3209/3/artifact/out/buildtool-branch-checkstyle-root.txt)
 |  The patch fails to run checkstyle in root  |
   | -1 :x: |  mvnsite  |   0m 39s | 
[/branch-mvnsite-hadoop-common-project_hadoop-auth.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3209/3/artifact/out/branch-mvnsite-hadoop-common-project_hadoop-auth.txt)
 |  hadoop-auth in trunk failed.  |
   | -1 :x: |  mvnsite  |   0m 38s | 
[/branch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3209/3/artifact/out/branch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in trunk failed.  |
   | -1 :x: |  mvnsite  |   0m 38s | 
[/branch-mvnsite-hadoop-common-project_hadoop-kms.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3209/3/artifact/out/branch-mvnsite-hadoop-common-project_hadoop-kms.txt)
 |  hadoop-kms in trunk failed.  |
   | -1 :x: |  mvnsite  |   0m 38s | 
[/branch-mvnsite-hadoop-hdfs-project_hadoop-hdfs-httpfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3209/3/artifact/out/branch-mvnsite-hadoop-hdfs-project_hadoop-hdfs-httpfs.txt)
 |  hadoop-hdfs-httpfs in trunk failed.  |
   | -1 :x: |  javadoc  |   0m 39s | 
[/branch-javadoc-hadoop-common-project_hadoop-auth-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3209/3/artifact/out/branch-javadoc-hadoop-common-project_hadoop-auth-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt)
 |  hadoop-auth in trunk failed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.  |
   | -1 :x: |  javadoc  |   0m 38s | 
[/branch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3209/3/artifact/out/branch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt)
 |  hadoop-common in trunk failed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.  |
   | -1 :x: |  javadoc  |   0m 39s | 
[/branch-javadoc-hadoop-common-project_hadoop-kms-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3209/3/artifact/out/branch-javadoc-hadoop-common-project_hadoop-kms-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt)
 |  hadoop-kms in trunk failed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04. 
 |
   | -1 :x: |  javadoc  |   0m 39s | 
[/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-httpfs-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3209/3/artifact/out/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-httpfs-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt)
 |  hadoop-hdfs-httpfs in trunk failed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.  |
   | -1 :x: |  javadoc  |   0m 39s | 
[/branch-javadoc-hadoop-common-project_hadoop-auth-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-320

[jira] [Commented] (HDFS-15886) Add a way to get protected dirs from a special configuration file

2021-07-29 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17390155#comment-17390155
 ] 

Hadoop QA commented on HDFS-15886:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  2m  
5s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:blue}0{color} | {color:blue} buf {color} | {color:blue}  0m  1s{color} 
| {color:blue}{color} | {color:blue} buf was not available. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 4 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 12m 
42s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
35s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 26m  
6s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 
38s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  4m 
15s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  5m 
35s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
26m 12s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
57s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m 
29s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 46m 
30s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are 
enabled, using SpotBugs. {color} |
| {color:green}+1{color} | {color:green} spotbugs {color} | {color:green} 10m 
55s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
51s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 24m 
53s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 24m 53s{color} | 
{color:red}https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/694/artifact/out/diff-compile-cc-root-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt{color}
 | {color:red} root-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 generated 25 new + 298 unchanged - 25 
fixed = 323 total (was 323) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 24m 
53s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 
42s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 21m 42s{color} | 
{color:red}https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/694/artifact/out/diff-compile-cc-root-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt{color}
 | {color:red} root-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 with 
JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 generated 38 new + 285 
unchanged - 38 fixed = 323 total (was 323) {color} |
| 

[jira] [Commented] (HDFS-12920) HDFS default value change (with adding time unit) breaks old version MR tarball work with Hadoop 3.x

2021-07-29 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17390232#comment-17390232
 ] 

Akira Ajisaka commented on HDFS-12920:
--

Yes, the default values are different. Anyone who has written a script or test 
case against them will need to update it to accept the values without a time 
unit. I think more people hit this problem than write scripts against these 
values, so I reverted it.

> HDFS default value change (with adding time unit) breaks old version MR 
> tarball work with Hadoop 3.x
> 
>
> Key: HDFS-12920
> URL: https://issues.apache.org/jira/browse/HDFS-12920
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: configuration, hdfs
>Reporter: Junping Du
>Assignee: Akira Ajisaka
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.3, 3.3.2
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> After HADOOP-15059 was resolved, I tried to deploy the 2.9.0 tarball with 
> 3.0.0 RC1 and ran a job, hitting the following errors:
> {noformat}
> 2017-12-12 13:29:06,824 INFO [main] 
> org.apache.hadoop.service.AbstractService: Service 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state INITED; cause: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.lang.NumberFormatException: For input string: "30s"
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.lang.NumberFormatException: For input string: "30s"
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$2.call(MRAppMaster.java:542)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$2.call(MRAppMaster.java:522)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1764)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:522)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:308)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$5.run(MRAppMaster.java:1722)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1886)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1719)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1650)
> {noformat}
> This is because of HDFS-10845: we added time units to hdfs-default.xml, but 
> they cannot be recognized by old-version MR jars.
> This breaks our rolling-upgrade story, so it should be marked as a blocker.
> A quick workaround is to add the values to hdfs-site.xml with all time units 
> removed, but the right way may be to revert HDFS-10845 (and get rid of the 
> noisy warnings).
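
For context, a minimal sketch of the incompatibility (the property name below 
is made up for illustration): 3.x-style code reads such values with 
Configuration.getTimeDuration(), which understands unit suffixes, while an old 
MR jar reads the same key with getLong()/getInt() and hits the 
NumberFormatException shown in the stack trace above.

{code:java}
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.conf.Configuration;

public class TimeUnitSuffixDemo {
  public static void main(String[] args) {
    Configuration conf = new Configuration(false);
    // Hypothetical key standing in for an hdfs-default.xml value like "30s".
    conf.set("dfs.example.check.interval", "30s");

    // New-style read: suffix-aware, returns 30.
    long seconds =
        conf.getTimeDuration("dfs.example.check.interval", 60, TimeUnit.SECONDS);
    System.out.println("getTimeDuration -> " + seconds);

    try {
      // Old-style read, as a 2.x MR jar would do: no suffix handling.
      conf.getLong("dfs.example.check.interval", 60);
    } catch (NumberFormatException e) {
      System.out.println("getLong -> " + e);   // For input string: "30s"
    }
  }
}
{code}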



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-9266) Avoid unsafe split and append on fields that might be IPv6 literals

2021-07-29 Thread Brahma Reddy Battula (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated HDFS-9266:
---
Fix Version/s: HADOOP-17800
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Committed to the HADOOP-17800 branch. [~newanja] and [~hemanthboyina], thanks 
for your contribution.

> Avoid unsafe split and append on fields that might be IPv6 literals
> ---
>
> Key: HDFS-9266
> URL: https://issues.apache.org/jira/browse/HDFS-9266
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: Nemanja Matkovic
>Assignee: Nemanja Matkovic
>Priority: Major
>  Labels: ipv6
> Fix For: HADOOP-17800
>
> Attachments: HDFS-9266-HADOOP-11890.1.patch, 
> HDFS-9266-HADOOP-11890.2.patch, HDFS-9266-HADOOP-17800.001.patch, 
> HDFS-9266-HADOOP-17800.002.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
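
As background on the issue title: naive code such as addr.split(":") silently 
mis-parses IPv6 literals, which contain colons themselves. A minimal, 
hypothetical sketch of the kind of defensive parsing this issue is about might 
look like the following; it is illustrative only and not the actual patch 
(Hadoop generally funnels such parsing through helpers like 
NetUtils.createSocketAddr).

{code:java}
public final class HostPort {

  final String host;
  final int port;

  private HostPort(String host, int port) {
    this.host = host;
    this.port = port;
  }

  static HostPort parse(String addr) {
    // Bracketed IPv6 literal, e.g. "[2001:db8::1]:8020".
    if (addr.startsWith("[")) {
      int close = addr.indexOf(']');
      if (close < 0 || close + 1 >= addr.length()
          || addr.charAt(close + 1) != ':') {
        throw new IllegalArgumentException("Malformed address: " + addr);
      }
      return new HostPort(addr.substring(1, close),
          Integer.parseInt(addr.substring(close + 2)));
    }
    // Hostname or IPv4: split on the *last* colon instead of addr.split(":"),
    // which would scatter an unbracketed IPv6 literal across many fields.
    int colon = addr.lastIndexOf(':');
    if (colon < 0) {
      throw new IllegalArgumentException("No port in address: " + addr);
    }
    return new HostPort(addr.substring(0, colon),
        Integer.parseInt(addr.substring(colon + 1)));
  }
}
{code}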




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org