[jira] [Resolved] (HDFS-17024) Potential data race introduced by HDFS-15865

2023-10-26 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-17024.

Fix Version/s: 3.4.0
   Resolution: Fixed

> Potential data race introduced by HDFS-15865
> 
>
> Key: HDFS-17024
> URL: https://issues.apache.org/jira/browse/HDFS-17024
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: dfsclient
>Affects Versions: 3.3.1
>Reporter: Wei-Chiu Chuang
>Assignee: Segawa Hiroaki
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> After HDFS-15865, we found that a client aborted due to an NPE.
> {noformat}
> 2023-04-10 16:07:43,409 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: * ABORTING region server kqhdp36,16020,1678077077562: Replay of WAL required. Forcing server shutdown *
> org.apache.hadoop.hbase.DroppedSnapshotException: region: WAFER_ALL,16|CM 
> RIE.MA1|CP1114561.18|PROC|,1625899466315.0fbdf0f1810efa9e68af831247e6555f.
> at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2870)
> at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2539)
> at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2511)
> at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:2401)
> at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:613)
> at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:582)
> at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$1000(MemStoreFlusher.java:69)
> at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:362)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException
> at org.apache.hadoop.hdfs.DataStreamer.waitForAckedSeqno(DataStreamer.java:880)
> at org.apache.hadoop.hdfs.DFSOutputStream.flushInternal(DFSOutputStream.java:781)
> at org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:898)
> at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:850)
> at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:76)
> at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:105)
> at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.finishClose(HFileWriterImpl.java:859)
> at org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.close(HFileWriterImpl.java:687)
> at org.apache.hadoop.hbase.regionserver.StoreFileWriter.close(StoreFileWriter.java:393)
> at org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:69)
> at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:78)
> at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:1047)
> at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2349)
> at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2806)
> {noformat}
> This is only possible if a data race happened. File this jira to improve the 
> data and eliminate the data race.
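
For illustration only, here is a minimal sketch of how an NPE of this kind can come from a data race. The trace does not show which reference was null, so the class SnapshotRead and the field `nodes` below are hypothetical. This is not the DataStreamer code and not the fix that went into 3.4.0. The point is that a null check followed by a second read of a shared field can see two different values, while copying the field into a local once cannot.

```java
// Hypothetical sketch: names are made up, this is not HDFS code.
class SnapshotRead {
    // Shared field that another thread may set to null at any time.
    private volatile String[] nodes = {"dn1", "dn2"};

    // Simulates another thread tearing down the state.
    public void closePipeline() {
        nodes = null;
    }

    // Racy pattern: the field is read twice. If closePipeline() runs
    // between the null check and the .length access, this throws an NPE.
    public int racyCount() {
        return nodes != null ? nodes.length : 0;
    }

    // Safer pattern: read the field once into a local, then only use the local.
    public int safeCount() {
        String[] local = nodes;
        return local == null ? 0 : local.length;
    }

    public static void main(String[] args) {
        SnapshotRead s = new SnapshotRead();
        System.out.println(s.safeCount());
        s.closePipeline();
        System.out.println(s.safeCount());
    }
}
```

Run single-threaded, both methods behave the same. The difference only shows up when closePipeline() is called concurrently from another thread, which matches the intermittent failure described in the report.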



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-17241) long write lock on active NN from rollEditLog()

2023-10-26 Thread shuaiqi.guo (Jira)
shuaiqi.guo created HDFS-17241:
--

 Summary: long write lock on active NN from rollEditLog()
 Key: HDFS-17241
 URL: https://issues.apache.org/jira/browse/HDFS-17241
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.1.2
Reporter: shuaiqi.guo


When the standby NN triggers a log roll on the active NN and sends an fsimage 
to the active NN at the same time, the active NN will hold the write lock for a 
long time, which blocks almost all requests. For example:
{code:java}
INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem write lock held for 27179 ms via
java.lang.Thread.getStackTrace(Thread.java:1559)
org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:273)
org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:235)
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1617)
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:4663)
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1292)
org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:146)
org.apache.hadoop.hdfs.protocol.proto.NamenodeProtocolProtos$NamenodeProtocolService$2.callBlockingMethod(NamenodeProtocolProtos.java:12974)
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
java.security.AccessController.doPrivileged(Native Method)
javax.security.auth.Subject.doAs(Subject.java:422)
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
 {code}






Apache Hadoop qbt Report: branch-2.10+JDK7 on Linux/x86_64

2023-10-26 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1192/

No changes


ERROR: File 'out/email-report.txt' does not exist


Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86_64

2023-10-26 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1391/

[Oct 25, 2023, 3:43:12 AM] (github) HDFS-17231. HA: Safemode should exit when 
resources are from low to available. (#6207). Contributed by Gu Peng.
[Oct 25, 2023, 5:56:39 AM] (github) HADOOP-18920. RPC Metrics : Optimize logic 
for log slow RPCs (#6146)
[Oct 25, 2023, 1:06:13 PM] (github) HADOOP-18933. upgrade to netty 4.1.100 due 
to CVE (#6173)
[Oct 25, 2023, 4:39:16 PM] (github) HADOOP-18948. S3A. Add option 
fs.s3a.directory.operations.purge.uploads to purge on rename/delete (#6218)
[Oct 26, 2023, 12:22:18 AM] (github) YARN-11593. [Federation] Improve command 
line help information. (#6199) Contributed by Shilun Fan.




-1 overall


The following subsystems voted -1:
blanks hadolint pathlen xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

   XML :

      Parsing Error(s):
      hadoop-common-project/hadoop-common/src/test/resources/xml/external-dtd.xml
      hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
      hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
      hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
      hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
      hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml
      hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml

   cc:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1391/artifact/out/results-compile-cc-root.txt [96K]

   javac:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1391/artifact/out/results-compile-javac-root.txt [12K]

   blanks:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1391/artifact/out/blanks-eol.txt [15M]
      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1391/artifact/out/blanks-tabs.txt [2.0M]

   checkstyle:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1391/artifact/out/results-checkstyle-root.txt [13M]

   hadolint:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1391/artifact/out/results-hadolint.txt [20K]

   pathlen:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1391/artifact/out/results-pathlen.txt [16K]

   pylint:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1391/artifact/out/results-pylint.txt [20K]

   shellcheck:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1391/artifact/out/results-shellcheck.txt [24K]

   xml:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1391/artifact/out/xml.txt [24K]

   javadoc:

      https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1391/artifact/out/results-javadoc-javadoc-root.txt [244K]

Powered by Apache Yetus 0.14.0-SNAPSHOT   https://yetus.apache.org


[jira] [Created] (HDFS-17240) Fix a typo in DataStorage.java

2023-10-26 Thread Yu Wang (Jira)
Yu Wang created HDFS-17240:
--

 Summary: Fix a typo in DataStorage.java
 Key: HDFS-17240
 URL: https://issues.apache.org/jira/browse/HDFS-17240
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Reporter: Yu Wang


Fix a typo in DataStorage.java

 
{code:java}
   /**
-   * Analize which and whether a transition of the fs state is required
+   * Analyze which and whether a transition of the fs state is required
    * and perform it if necessary.
    *
{code}
 






[jira] [Created] (HDFS-17239) Remove logging of method invalidate which is in BLOCK_POOl write lock.

2023-10-26 Thread farmmamba (Jira)
farmmamba created HDFS-17239:


 Summary: Remove logging of method invalidate which is in 
BLOCK_POOl write lock. 
 Key: HDFS-17239
 URL: https://issues.apache.org/jira/browse/HDFS-17239
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 3.4.0
Reporter: farmmamba
Assignee: farmmamba


In method FsDatasetImpl#invalidate, there are some log statements inside the 
BLOCK_POOl write lock. We should move them out of the write lock.
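
As a rough sketch of the general idea (not the actual FsDatasetImpl code and not the proposed patch), one common way to keep logging out of a critical section is to collect the messages while the lock is held and emit them only after it is released. The class DeferredLogging, its fields, and the message text below are all hypothetical, and a plain ReentrantReadWriteLock stands in for the datanode's lock manager.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: names are made up, this is not FsDatasetImpl.
class DeferredLogging {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<Long> blocks = new ArrayList<>();

    public DeferredLogging(List<Long> initialBlocks) {
        blocks.addAll(initialBlocks);
    }

    // Removes the given block IDs and returns the messages that were logged.
    public List<String> invalidate(List<Long> toRemove) {
        List<String> pending = new ArrayList<>();
        lock.writeLock().lock();
        try {
            for (Long id : toRemove) {
                // Only build the message here; do no I/O under the lock.
                if (blocks.remove(id)) {
                    pending.add("Deleted block " + id);
                } else {
                    pending.add("Block " + id + " not found");
                }
            }
        } finally {
            lock.writeLock().unlock();
        }
        // The write lock is released, so slow logging no longer blocks others.
        for (String msg : pending) {
            System.out.println(msg);
        }
        return pending;
    }

    public static void main(String[] args) {
        DeferredLogging d = new DeferredLogging(Arrays.asList(1L, 2L));
        d.invalidate(Arrays.asList(2L, 3L));
    }
}
```

The trade-off is a small amount of extra memory for the buffered messages in exchange for a shorter lock hold time.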






[jira] [Created] (HDFS-17238) Setting the value of "dfs.blocksize" too large will cause HDFS to be unable to write to files

2023-10-26 Thread ECFuzz (Jira)
ECFuzz created HDFS-17238:
-

 Summary: Setting the value of "dfs.blocksize" too large will cause 
HDFS to be unable to write to files
 Key: HDFS-17238
 URL: https://issues.apache.org/jira/browse/HDFS-17238
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs
Affects Versions: 3.3.6
Reporter: ECFuzz


My Hadoop version is 3.3.6, and I am running in Pseudo-Distributed Operation mode.

core-site.xml is as follows.
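{code:java}
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/Mutil_Component/tmp</value>
    </property>
</configuration>
{code}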
hdfs-site.xml is as follows.
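{code:java}
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.blocksize</name>
        <value>134217728</value>
    </property>
</configuration>
{code}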


