[jira] [Created] (HDFS-17004) Mis-spellings about test

2023-05-08 Thread lihanran (Jira)
lihanran created HDFS-17004:
---

 Summary: Mis-spellings about test
 Key: HDFS-17004
 URL: https://issues.apache.org/jira/browse/HDFS-17004
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: lihanran
Assignee: lihanran






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16999) Fix wrong use of processFirstBlockReport()

2023-05-08 Thread Xiaoqiao He (Jira)


 [ https://issues.apache.org/jira/browse/HDFS-16999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiaoqiao He resolved HDFS-16999.

Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Fix wrong use of processFirstBlockReport()
> --
>
> Key: HDFS-16999
> URL: https://issues.apache.org/jira/browse/HDFS-16999
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Shuyan Zhang
>Assignee: Shuyan Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> `processFirstBlockReport()` is used to process the first block report from a 
> datanode. It does not calculate the `toRemove` list because it assumes that 
> the namenode holds no metadata about the datanode. However, if a datanode 
> re-registers after restarting, its `blockReportCount` is reset to 0. That is 
> to say, the first block report after a datanode restarts is processed by 
> `processFirstBlockReport()`. This is unreasonable because the metadata of the 
> datanode already exists in the namenode at this time, and if redundant replica 
> metadata is not removed in time, blocks with insufficient replicas cannot be 
> reconstructed in time, which increases the risk of missing blocks. In summary, 
> `processFirstBlockReport()` should only be used when the namenode restarts, 
> not when the datanode restarts.
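
The difference between the two report paths described above can be illustrated with a toy model. This is only a sketch, not the real BlockManager code: the class `BlockReportSketch`, its methods, and the use of plain sets of block IDs are all hypothetical simplifications. It shows that a path which skips the `toRemove` calculation leaves stale replica metadata in place when the namenode already has records for the datanode.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Toy model of block report handling; hypothetical, not the real BlockManager code. */
public class BlockReportSketch {

    /** Result of diffing a report against what the namenode has recorded. */
    public static final class Diff {
        public final Set<Long> toAdd = new HashSet<>();
        public final Set<Long> toRemove = new HashSet<>();
    }

    /** First-report path: assumes nothing is recorded, so only additions are computed. */
    public static Diff processFirstReport(Set<Long> recorded, Set<Long> reported) {
        Diff diff = new Diff();
        for (long id : reported) {
            if (!recorded.contains(id)) {
                diff.toAdd.add(id);
            }
        }
        return diff; // toRemove is intentionally left empty
    }

    /** Regular path: also finds recorded replicas that the datanode no longer reports. */
    public static Diff processReport(Set<Long> recorded, Set<Long> reported) {
        Diff diff = processFirstReport(recorded, reported);
        for (long id : recorded) {
            if (!reported.contains(id)) {
                diff.toRemove.add(id);
            }
        }
        return diff;
    }

    public static void main(String[] args) {
        // The namenode still remembers blocks 1, 2, 3 for a datanode that just restarted.
        Set<Long> recorded = new HashSet<>(Arrays.asList(1L, 2L, 3L));
        // After the restart the datanode only reports blocks 1 and 2.
        Set<Long> reported = new HashSet<>(Arrays.asList(1L, 2L));

        System.out.println("first-report path toRemove: "
            + processFirstReport(recorded, reported).toRemove);
        System.out.println("regular path toRemove: "
            + processReport(recorded, reported).toRemove);
    }
}
```

In this model the first-report path never notices that block 3 is gone, while the regular path flags it for removal.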






Nightly Jenkins CI for Hadoop on Windows 10

2023-05-08 Thread Gautham Banasandra
Dear Hadoop community,

It is my pleasure to announce that I've set up the Nightly Jenkins CI for
Hadoop on the Windows 10 platform[1]. The effort mainly involved getting
Yetus to run on Windows against Hadoop.
The nightly CI will run every 36 hours and send out the build report to the
same recipients as this email, upon completion.
There are still quite a few things that need to be sorted out. Currently,
mvninstall has a +1. Other phases like mvnsite, javadoc, etc. still need to be
fixed.

[1]
https://ci-hadoop.apache.org/view/Hadoop/job/hadoop-qbt-trunk-java8-win10-x86_64/

Thanks,
--Gautham


[jira] [Created] (HDFS-17003) Erasure coding: invalidate wrong block after reporting bad blocks from datanode

2023-05-08 Thread farmmamba (Jira)
farmmamba created HDFS-17003:


 Summary: Erasure coding: invalidate wrong block after reporting bad blocks from datanode
 Key: HDFS-17003
 URL: https://issues.apache.org/jira/browse/HDFS-17003
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: farmmamba


After receiving a reportBadBlocks RPC from a datanode, the NameNode computes the 
wrong block to invalidate. This is dangerous behaviour and may cause data loss. 
Some logs from our production cluster are shown below:

 

NameNode log:
{code:java}
2023-05-08 14:39:42,241 INFO org.apache.hadoop.hdfs.StateChange: *DIR* reportBadBlocks for block: BP-932824627--1680179358678:blk_-9223372036846808880_1669008 on datanode: datanode1:50010
{code}
datanode1 log:
{code:java}
2023-05-08 14:39:42,183 WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad BP-932824627--1680179358678:blk_-9223372036846808880_1669008 on /data1/hadoop/hdfs/datanode

2023-05-08 14:39:47,338 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed to delete replica blk_-9223372036846808879_1669008: ReplicaInfo not found.
{code}
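
The logs name two different block IDs: the bad block was reported as blk_-9223372036846808880, but the datanode was asked to delete blk_-9223372036846808879. The sketch below shows how the two IDs relate, under an assumption that is not stated in this report: that a striped internal block ID is the block group ID with the internal block index stored in the low 4 bits. The class name and the mask constant are illustrative, not taken from the HDFS source.

```java
/** Illustrative arithmetic on striped block IDs; the bit layout is an assumption. */
public class StripedBlockIdSketch {

    // Assumed: the low 4 bits of a striped block ID hold the index inside the block group.
    static final long BLOCK_GROUP_INDEX_MASK = 15L;

    /** ID of the block group that an internal block belongs to. */
    public static long blockGroupId(long internalBlockId) {
        return internalBlockId & ~BLOCK_GROUP_INDEX_MASK;
    }

    /** Position of an internal block inside its block group. */
    public static int blockIndex(long internalBlockId) {
        return (int) (internalBlockId & BLOCK_GROUP_INDEX_MASK);
    }

    public static void main(String[] args) {
        long reported = -9223372036846808880L; // from the reportBadBlocks log line
        long deleted = -9223372036846808879L;  // from the "Failed to delete replica" log line

        System.out.println("reported: group=" + blockGroupId(reported)
            + " index=" + blockIndex(reported));
        System.out.println("deleted:  group=" + blockGroupId(deleted)
            + " index=" + blockIndex(deleted));
    }
}
```

Under that assumption both IDs belong to the same block group, with the reported block at index 0 and the block chosen for deletion at index 1, which would be consistent with datanode1 not finding the replica it was told to delete.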






[jira] [Created] (HDFS-17002) [EC]:Generate parity blocks in time to prevent file corruption

2023-05-08 Thread farmmamba (Jira)
farmmamba created HDFS-17002:


 Summary: [EC]:Generate parity blocks in time to prevent file corruption
 Key: HDFS-17002
 URL: https://issues.apache.org/jira/browse/HDFS-17002
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: erasure-coding
Affects Versions: 3.4.0
Reporter: farmmamba


In the current EC implementation, a corrupted parity block is not 
regenerated in time. 

Consider the following scenario when using the RS-6-3-1024k EC policy:

If the three parity blocks p1, p2, p3 are all corrupted or deleted, we are not 
aware of it.

If a data block is also corrupted during this period, the file becomes 
corrupted and cannot be read by decoding.

 

So we should always regenerate a parity block promptly when it is unhealthy.
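
The scenario above comes down to simple counting. As a minimal sketch, assuming the usual Reed-Solomon property that a block group of k data and m parity blocks can be decoded from any k healthy blocks, the hypothetical helper below checks the RS-6-3 case. The class and method names are invented for illustration and are not HDFS APIs.

```java
/** Toy recoverability check for a Reed-Solomon block group; not HDFS code. */
public class EcRecoverabilitySketch {

    /**
     * A block group with dataUnits + parityUnits internal blocks can be decoded
     * as long as at least dataUnits of them are still healthy.
     */
    public static boolean isRecoverable(int dataUnits, int parityUnits, int lostBlocks) {
        int healthy = dataUnits + parityUnits - lostBlocks;
        return healthy >= dataUnits;
    }

    public static void main(String[] args) {
        int dataUnits = 6;   // RS-6-3: six data blocks
        int parityUnits = 3; // RS-6-3: three parity blocks

        // p1, p2, p3 lost: six healthy blocks remain, the file is still readable.
        System.out.println(isRecoverable(dataUnits, parityUnits, 3)); // true
        // p1, p2, p3 and one data block lost: only five healthy blocks remain.
        System.out.println(isRecoverable(dataUnits, parityUnits, 4)); // false
    }
}
```

Once all three parity blocks are gone the group has no remaining margin, so a single further loss makes it undecodable.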

 






[jira] [Created] (HDFS-17001) Support getStatus API in WebHDFS

2023-05-08 Thread Hualong Zhang (Jira)
Hualong Zhang created HDFS-17001:


 Summary: Support getStatus API in WebHDFS
 Key: HDFS-17001
 URL: https://issues.apache.org/jira/browse/HDFS-17001
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Affects Versions: 3.4.0
Reporter: Hualong Zhang
Assignee: Hualong Zhang
 Attachments: image-2023-05-08-14-34-51-873.png

WebHDFS should support getStatus:

!image-2023-05-08-14-34-51-873.png!


