[jira] [Created] (HDFS-16953) RBF Mount table store APIs should update cache only if state store record is successfully updated

2023-03-15 Thread Viraj Jasani (Jira)
Viraj Jasani created HDFS-16953:
---

 Summary: RBF Mount table store APIs should update cache only if 
state store record is successfully updated
 Key: HDFS-16953
 URL: https://issues.apache.org/jira/browse/HDFS-16953
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Viraj Jasani
Assignee: Viraj Jasani


RBF Mount table state store APIs addMountTableEntry, updateMountTableEntry and 
removeMountTableEntry performs cache refresh for all routers regardless of the 
actual record update result. If the record fails to get updated on 
zookeeper/file based store impl, reloading the cache for all routers would be 
unnecessary.

 

For instance, simultaneously adding new mount point could lead to failure for 
the second call if first call has not added new entry by the time second call 
retrieves mount table entry from getMountTableEntries before attempting to call 
addMountTableEntry.
{code:java}
DEBUG [{cluster}/{ip}:8111] ipc.Client - IPC Client (1826699684) connection to 
nn-0-{ns}.{cluster}/{ip}:8111 from {user}IPC Client (1826699684) connection to 
nn-0-{ns}.{cluster}/{ip}:8111 from {user} sending #1 
org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocol.addMountTableEntry

DEBUG [{cluster}/{ip}:8111 from {user}] ipc.Client - IPC Client (1826699684) 
connection to nn-0-{ns}.{cluster}/{ip}:8111 from {user} got value #1

DEBUG [main] ipc.ProtobufRpcEngine2 - Call: addMountTableEntry took 24ms

DEBUG [{cluster}/{ip}:8111 from {user}] ipc.Client - IPC Client (1826699684) 
connection to nn-0-{ns}.{cluster}/{ip}:8111 from {user}: closed

DEBUG [{cluster}/{ip}:8111 from {user}] ipc.Client - IPC Client (1826699684) 
connection to nn-0-{ns}.{cluster}/{ip}:8111 from {user}: stopped, remaining 
connections 0

TRACE [main] ipc.ProtobufRpcEngine2 - 1: Response <- 
nn-0-{ns}.{cluster}/{ip}:8111: addMountTableEntry {status: false}

Cannot add mount point /data503 {code}
The failure to write new record:
{code:java}
INFO  [IPC Server handler 0 on default port 8111] impl.StateStoreZooKeeperImpl 
- Cannot write record "/hdfs-federation/MountTable/0SLASH0data503", it already 
exists {code}
Since the successful call has already refreshed cache for all routers, second 
call that failed should not have refreshed cache for all routers again as 
everyone already has updated records in cache.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Apache Hadoop qbt Report: branch-2.10+JDK7 on Linux/x86_64

2023-03-15 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/967/

No changes


ERROR: File 'out/email-report.txt' does not exist

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org

Apache Hadoop qbt Report: trunk+JDK11 on Linux/x86_64

2023-03-15 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/458/

[Mar 13, 2023, 12:24:36 PM] (Steve Loughran) HADOOP-18661. Fix bin/hadoop usage 
script terminology. (#5473)
[Mar 13, 2023, 12:30:12 PM] (github) HADOOP-18653. LogLevel servlet to 
determine log impl before using setLevel (#5456)
[Mar 15, 2023, 4:33:00 AM] (github) HDFS-16942. Addendum. Send error to 
datanode if FBR is rejected due to bad lease (#5478). Contributed by Stephen 
O'Donnell/




-1 overall


The following subsystems voted -1:
blanks hadolint mvnsite pathlen spotbugs unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   
hadoop-common-project/hadoop-common/src/test/resources/xml/external-dtd.xml 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml
 

spotbugs :

   module:hadoop-hdfs-project/hadoop-hdfs 
   Redundant nullcheck of oldLock, which is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.DataStorage.isPreUpgradableLayout(Storage$StorageDirectory))
 Redundant null check at DataStorage.java:is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.DataStorage.isPreUpgradableLayout(Storage$StorageDirectory))
 Redundant null check at DataStorage.java:[line 695] 
   Redundant nullcheck of metaChannel, which is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.MappableBlockLoader.verifyChecksum(long,
 FileInputStream, FileChannel, String) Redundant null check at 
MappableBlockLoader.java:is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.MappableBlockLoader.verifyChecksum(long,
 FileInputStream, FileChannel, String) Redundant null check at 
MappableBlockLoader.java:[line 138] 
   Redundant nullcheck of blockChannel, which is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.MemoryMappableBlockLoader.load(long,
 FileInputStream, FileInputStream, String, ExtendedBlockId) Redundant null 
check at MemoryMappableBlockLoader.java:is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.MemoryMappableBlockLoader.load(long,
 FileInputStream, FileInputStream, String, ExtendedBlockId) Redundant null 
check at MemoryMappableBlockLoader.java:[line 75] 
   Redundant nullcheck of blockChannel, which is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.NativePmemMappableBlockLoader.load(long,
 FileInputStream, FileInputStream, String, ExtendedBlockId) Redundant null 
check at NativePmemMappableBlockLoader.java:is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.NativePmemMappableBlockLoader.load(long,
 FileInputStream, FileInputStream, String, ExtendedBlockId) Redundant null 
check at NativePmemMappableBlockLoader.java:[line 85] 
   Redundant nullcheck of metaChannel, which is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.NativePmemMappableBlockLoader.verifyChecksumAndMapBlock(NativeIO$POSIX$$PmemMappedRegion,,
 long, FileInputStream, FileChannel, String) Redundant null check at 
NativePmemMappableBlockLoader.java:is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.NativePmemMappableBlockLoader.verifyChecksumAndMapBlock(NativeIO$POSIX$$PmemMappedRegion,,
 long, FileInputStream, FileChannel, String) Redundant null check at 
NativePmemMappableBlockLoader.java:[line 130] 
   
org.apache.hadoop.hdfs.server.namenode.top.window.RollingWindowManager$UserCounts
  doesn't override java.util.ArrayList.equals(Object) At 
RollingWindowManager.java:At RollingWindowManager.java:[line 1] 

spotbugs :

   module:hadoop-yarn-project/hadoop-yarn 
   Redundant nullcheck of it, which is known to be non-null in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.recoverTrackerResources(LocalResourcesTracker,
 NMStateStoreService$Loca

Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86_64

2023-03-15 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1166/

No changes




-1 overall


The following subsystems voted -1:
blanks hadolint mvnsite pathlen spotbugs unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   
hadoop-common-project/hadoop-common/src/test/resources/xml/external-dtd.xml 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml
 

spotbugs :

   module:hadoop-mapreduce-project/hadoop-mapreduce-client 
   Write to static field 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.nextId from instance method new 
org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, 
ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, 
ExceptionReporter, SecretKey) At Fetcher.java:from instance method new 
org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, 
ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, 
ExceptionReporter, SecretKey) At Fetcher.java:[line 120] 

spotbugs :

   
module:hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core
 
   Write to static field 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.nextId from instance method new 
org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, 
ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, 
ExceptionReporter, SecretKey) At Fetcher.java:from instance method new 
org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, 
ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, 
ExceptionReporter, SecretKey) At Fetcher.java:[line 120] 

spotbugs :

   module:hadoop-mapreduce-project 
   Write to static field 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.nextId from instance method new 
org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, 
ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, 
ExceptionReporter, SecretKey) At Fetcher.java:from instance method new 
org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, 
ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, 
ExceptionReporter, SecretKey) At Fetcher.java:[line 120] 

spotbugs :

   module:root 
   Write to static field 
org.apache.hadoop.mapreduce.task.reduce.Fetcher.nextId from instance method new 
org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, 
ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, 
ExceptionReporter, SecretKey) At Fetcher.java:from instance method new 
org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, 
ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, 
ExceptionReporter, SecretKey) At Fetcher.java:[line 120] 

Failed junit tests :

   hadoop.hdfs.server.datanode.TestDirectoryScanner 
  

   cc:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1166/artifact/out/results-compile-cc-root.txt
 [96K]

   javac:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1166/artifact/out/results-compile-javac-root.txt
 [528K]

   blanks:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1166/artifact/out/blanks-eol.txt
 [14M]
  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1166/artifact/out/blanks-tabs.txt
 [2.0M]

   checkstyle:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1166/artifact/out/results-checkstyle-root.txt
 [13M]

   hadolint:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1166/artifact/out/results-hadolint.txt
 [8.0K]

   mvnsite:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1166/artifact/out/patch-mvnsite-root.txt
 [672K]

   pathlen:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1166/artifact/o

[VOTE] Release Apache Hadoop 3.3.5 (RC3)

2023-03-15 Thread Steve Loughran
Apache Hadoop 3.3.5

Mukund and I have put together a release candidate (RC3) for Hadoop 3.3.5.

What we would like is for anyone who can to verify the tarballs, especially
anyone who can try the arm64 binaries as we want to include them too.

The RC is available at:
https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC3/

The git tag is release-3.3.5-RC3, commit 706d88266ab

The maven artifacts are staged at
https://repository.apache.org/content/repositories/orgapachehadoop-1369/

You can find my public key at:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS

Change log
https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC3/CHANGELOG.md

Release notes
https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC3/RELEASENOTES.md

This is off branch-3.3 and is the first big release since 3.3.2.

Key changes include

* Big update of dependencies to try and keep those reports of
  transitive CVEs under control -both genuine and false positives.
* HDFS RBF enhancements
* Critical fix to ABFS input stream prefetching for correct reading.
* Vectored IO API for all FSDataInputStream implementations, with
  high-performance versions for file:// and s3a:// filesystems.
  file:// through java native io
  s3a:// parallel GET requests.
* This release includes Arm64 binaries. Please can anyone with
  compatible systems validate these.
* and compared to the previous RC, all the major changes are
  HDFS issues.

Note, because the arm64 binaries are built separately on a different
platform and JVM, their jar files may not match those of the x86
release -and therefore the maven artifacts. I don't think this is
an issue (the ASF actually releases source tarballs, the binaries are
there for help only, though with the maven repo that's a bit blurred).

The only way to be consistent would actually untar the x86.tar.gz,
overwrite its binaries with the arm stuff, retar, sign and push out
for the vote. Even automating that would be risky.

Please try the release and vote. The vote will run for 5 days.

-Steve


[jira] [Resolved] (HDFS-16947) RBF NamenodeHeartbeatService to report error for not being able to register namenode in state store

2023-03-15 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16947.

Resolution: Fixed

> RBF NamenodeHeartbeatService to report error for not being able to register 
> namenode in state store
> ---
>
> Key: HDFS-16947
> URL: https://issues.apache.org/jira/browse/HDFS-16947
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Namenode heartbeat service should provide error with full stacktrace if it 
> cannot register namenode in the state store. As of today, we only log info 
> msg.
> For zookeeper based impl, this might mean either a) curator manager is not 
> initialized or b) if it fails to write to znode after exhausting retries. For 
> either of these cases, reporting only INFO log might not be good enough and 
> we might have to look for errors elsewhere.
>  
> Sample example:
> {code:java}
> 2023-02-20 23:10:33,714 DEBUG [NamenodeHeartbeatService {ns} nn0-0] 
> router.NamenodeHeartbeatService - Received service state: ACTIVE from HA 
> namenode: {ns}-nn0:nn-0-{ns}.{cluster}:9000
> 2023-02-20 23:10:33,731 INFO  [NamenodeHeartbeatService {ns} nn0-0] 
> impl.MembershipStoreImpl - Inserting new NN registration: 
> nn-0.namenode.{cluster}:->{ns}:nn0:nn-0-{ns}.{cluster}:9000-ACTIVE
> 2023-02-20 23:10:33,731 INFO  [NamenodeHeartbeatService {ns} nn0-0] 
> router.NamenodeHeartbeatService - Cannot register namenode in the State Store
>  {code}
> If we could log full stacktrace:
> {code:java}
> 2023-02-21 00:20:24,691 ERROR [NamenodeHeartbeatService {ns} nn0-0] 
> router.NamenodeHeartbeatService - Cannot register namenode in the State Store
> org.apache.hadoop.hdfs.server.federation.store.StateStoreUnavailableException:
>  State Store driver StateStoreZooKeeperImpl in nn-0.namenode.{cluster} is not 
> ready.
>         at 
> org.apache.hadoop.hdfs.server.federation.store.driver.StateStoreDriver.verifyDriverReady(StateStoreDriver.java:158)
>         at 
> org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl.putAll(StateStoreZooKeeperImpl.java:235)
>         at 
> org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreBaseImpl.put(StateStoreBaseImpl.java:74)
>         at 
> org.apache.hadoop.hdfs.server.federation.store.impl.MembershipStoreImpl.namenodeHeartbeat(MembershipStoreImpl.java:179)
>         at 
> org.apache.hadoop.hdfs.server.federation.resolver.MembershipNamenodeResolver.registerNamenode(MembershipNamenodeResolver.java:381)
>         at 
> org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.updateState(NamenodeHeartbeatService.java:317)
>         at 
> org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.lambda$periodicInvoke$0(NamenodeHeartbeatService.java:244)
> ...
> ... {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-16952) Add description of GETSERVERDEFAULTS to WebHDFS doc

2023-03-15 Thread Hualong Zhang (Jira)
Hualong Zhang created HDFS-16952:


 Summary: Add description of GETSERVERDEFAULTS to WebHDFS doc
 Key: HDFS-16952
 URL: https://issues.apache.org/jira/browse/HDFS-16952
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Affects Versions: 3.4.0
Reporter: Hualong Zhang


Add description of GETSERVERDEFAULTS to WebHDFS doc



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-16951) Add description of GETSERVERDEFAULTS to WebHDFS doc

2023-03-15 Thread Hualong Zhang (Jira)
Hualong Zhang created HDFS-16951:


 Summary: Add description of GETSERVERDEFAULTS to WebHDFS doc
 Key: HDFS-16951
 URL: https://issues.apache.org/jira/browse/HDFS-16951
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: webhdfs
Affects Versions: 3.4.0
Reporter: Hualong Zhang


Add description of GETSERVERDEFAULTS to WebHDFS doc



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org