[jira] [Created] (HDFS-16953) RBF Mount table store APIs should update cache only if state store record is successfully updated
Viraj Jasani created HDFS-16953: --- Summary: RBF Mount table store APIs should update cache only if state store record is successfully updated Key: HDFS-16953 URL: https://issues.apache.org/jira/browse/HDFS-16953 Project: Hadoop HDFS Issue Type: Improvement Reporter: Viraj Jasani Assignee: Viraj Jasani RBF Mount table state store APIs addMountTableEntry, updateMountTableEntry and removeMountTableEntry performs cache refresh for all routers regardless of the actual record update result. If the record fails to get updated on zookeeper/file based store impl, reloading the cache for all routers would be unnecessary. For instance, simultaneously adding new mount point could lead to failure for the second call if first call has not added new entry by the time second call retrieves mount table entry from getMountTableEntries before attempting to call addMountTableEntry. {code:java} DEBUG [{cluster}/{ip}:8111] ipc.Client - IPC Client (1826699684) connection to nn-0-{ns}.{cluster}/{ip}:8111 from {user}IPC Client (1826699684) connection to nn-0-{ns}.{cluster}/{ip}:8111 from {user} sending #1 org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocol.addMountTableEntry DEBUG [{cluster}/{ip}:8111 from {user}] ipc.Client - IPC Client (1826699684) connection to nn-0-{ns}.{cluster}/{ip}:8111 from {user} got value #1 DEBUG [main] ipc.ProtobufRpcEngine2 - Call: addMountTableEntry took 24ms DEBUG [{cluster}/{ip}:8111 from {user}] ipc.Client - IPC Client (1826699684) connection to nn-0-{ns}.{cluster}/{ip}:8111 from {user}: closed DEBUG [{cluster}/{ip}:8111 from {user}] ipc.Client - IPC Client (1826699684) connection to nn-0-{ns}.{cluster}/{ip}:8111 from {user}: stopped, remaining connections 0 TRACE [main] ipc.ProtobufRpcEngine2 - 1: Response <- nn-0-{ns}.{cluster}/{ip}:8111: addMountTableEntry {status: false} Cannot add mount point /data503 {code} The failure to write new record: {code:java} INFO [IPC Server handler 0 on default port 8111] impl.StateStoreZooKeeperImpl - Cannot write record "/hdfs-federation/MountTable/0SLASH0data503", it already exists {code} Since the successful call has already refreshed cache for all routers, second call that failed should not have refreshed cache for all routers again as everyone already has updated records in cache. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
Apache Hadoop qbt Report: branch-2.10+JDK7 on Linux/x86_64
For more details, see https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/967/ No changes ERROR: File 'out/email-report.txt' does not exist - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
Apache Hadoop qbt Report: trunk+JDK11 on Linux/x86_64
For more details, see https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/458/ [Mar 13, 2023, 12:24:36 PM] (Steve Loughran) HADOOP-18661. Fix bin/hadoop usage script terminology. (#5473) [Mar 13, 2023, 12:30:12 PM] (github) HADOOP-18653. LogLevel servlet to determine log impl before using setLevel (#5456) [Mar 15, 2023, 4:33:00 AM] (github) HDFS-16942. Addendum. Send error to datanode if FBR is rejected due to bad lease (#5478). Contributed by Stephen O'Donnell/ -1 overall The following subsystems voted -1: blanks hadolint mvnsite pathlen spotbugs unit xml The following subsystems voted -1 but were configured to be filtered/ignored: cc checkstyle javac javadoc pylint shellcheck The following subsystems are considered long running: (runtime bigger than 1h 0m 0s) unit Specific tests: XML : Parsing Error(s): hadoop-common-project/hadoop-common/src/test/resources/xml/external-dtd.xml hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml spotbugs : module:hadoop-hdfs-project/hadoop-hdfs Redundant nullcheck of oldLock, which is known to be non-null in org.apache.hadoop.hdfs.server.datanode.DataStorage.isPreUpgradableLayout(Storage$StorageDirectory)) Redundant null check at DataStorage.java:is known to be non-null in org.apache.hadoop.hdfs.server.datanode.DataStorage.isPreUpgradableLayout(Storage$StorageDirectory)) Redundant null check at DataStorage.java:[line 695] Redundant nullcheck of metaChannel, which is known to be non-null in org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.MappableBlockLoader.verifyChecksum(long, FileInputStream, FileChannel, String) Redundant null check at MappableBlockLoader.java:is known to be non-null in org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.MappableBlockLoader.verifyChecksum(long, FileInputStream, FileChannel, String) Redundant null check at MappableBlockLoader.java:[line 138] Redundant nullcheck of blockChannel, which is known to be non-null in org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.MemoryMappableBlockLoader.load(long, FileInputStream, FileInputStream, String, ExtendedBlockId) Redundant null check at MemoryMappableBlockLoader.java:is known to be non-null in org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.MemoryMappableBlockLoader.load(long, FileInputStream, FileInputStream, String, ExtendedBlockId) Redundant null check at MemoryMappableBlockLoader.java:[line 75] Redundant nullcheck of blockChannel, which is known to be non-null in org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.NativePmemMappableBlockLoader.load(long, FileInputStream, FileInputStream, String, ExtendedBlockId) Redundant null check at NativePmemMappableBlockLoader.java:is known to be non-null in org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.NativePmemMappableBlockLoader.load(long, FileInputStream, FileInputStream, String, ExtendedBlockId) Redundant null check at NativePmemMappableBlockLoader.java:[line 85] Redundant nullcheck of metaChannel, which is known to be non-null in org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.NativePmemMappableBlockLoader.verifyChecksumAndMapBlock(NativeIO$POSIX$$PmemMappedRegion,, long, FileInputStream, FileChannel, String) Redundant null check at NativePmemMappableBlockLoader.java:is known to be non-null in org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.NativePmemMappableBlockLoader.verifyChecksumAndMapBlock(NativeIO$POSIX$$PmemMappedRegion,, long, FileInputStream, FileChannel, String) Redundant null check at NativePmemMappableBlockLoader.java:[line 130] org.apache.hadoop.hdfs.server.namenode.top.window.RollingWindowManager$UserCounts doesn't override java.util.ArrayList.equals(Object) At RollingWindowManager.java:At RollingWindowManager.java:[line 1] spotbugs : module:hadoop-yarn-project/hadoop-yarn Redundant nullcheck of it, which is known to be non-null in org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.recoverTrackerResources(LocalResourcesTracker, NMStateStoreService$Loca
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86_64
For more details, see https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1166/ No changes -1 overall The following subsystems voted -1: blanks hadolint mvnsite pathlen spotbugs unit xml The following subsystems voted -1 but were configured to be filtered/ignored: cc checkstyle javac javadoc pylint shellcheck The following subsystems are considered long running: (runtime bigger than 1h 0m 0s) unit Specific tests: XML : Parsing Error(s): hadoop-common-project/hadoop-common/src/test/resources/xml/external-dtd.xml hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml spotbugs : module:hadoop-mapreduce-project/hadoop-mapreduce-client Write to static field org.apache.hadoop.mapreduce.task.reduce.Fetcher.nextId from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:[line 120] spotbugs : module:hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core Write to static field org.apache.hadoop.mapreduce.task.reduce.Fetcher.nextId from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:[line 120] spotbugs : module:hadoop-mapreduce-project Write to static field org.apache.hadoop.mapreduce.task.reduce.Fetcher.nextId from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:[line 120] spotbugs : module:root Write to static field org.apache.hadoop.mapreduce.task.reduce.Fetcher.nextId from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:from instance method new org.apache.hadoop.mapreduce.task.reduce.Fetcher(JobConf, TaskAttemptID, ShuffleSchedulerImpl, MergeManager, Reporter, ShuffleClientMetrics, ExceptionReporter, SecretKey) At Fetcher.java:[line 120] Failed junit tests : hadoop.hdfs.server.datanode.TestDirectoryScanner cc: https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1166/artifact/out/results-compile-cc-root.txt [96K] javac: https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1166/artifact/out/results-compile-javac-root.txt [528K] blanks: https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1166/artifact/out/blanks-eol.txt [14M] https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1166/artifact/out/blanks-tabs.txt [2.0M] checkstyle: https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1166/artifact/out/results-checkstyle-root.txt [13M] hadolint: https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1166/artifact/out/results-hadolint.txt [8.0K] mvnsite: https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1166/artifact/out/patch-mvnsite-root.txt [672K] pathlen: https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1166/artifact/o
[VOTE] Release Apache Hadoop 3.3.5 (RC3)
Apache Hadoop 3.3.5 Mukund and I have put together a release candidate (RC3) for Hadoop 3.3.5. What we would like is for anyone who can to verify the tarballs, especially anyone who can try the arm64 binaries as we want to include them too. The RC is available at: https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC3/ The git tag is release-3.3.5-RC3, commit 706d88266ab The maven artifacts are staged at https://repository.apache.org/content/repositories/orgapachehadoop-1369/ You can find my public key at: https://dist.apache.org/repos/dist/release/hadoop/common/KEYS Change log https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC3/CHANGELOG.md Release notes https://dist.apache.org/repos/dist/dev/hadoop/hadoop-3.3.5-RC3/RELEASENOTES.md This is off branch-3.3 and is the first big release since 3.3.2. Key changes include * Big update of dependencies to try and keep those reports of transitive CVEs under control -both genuine and false positives. * HDFS RBF enhancements * Critical fix to ABFS input stream prefetching for correct reading. * Vectored IO API for all FSDataInputStream implementations, with high-performance versions for file:// and s3a:// filesystems. file:// through java native io s3a:// parallel GET requests. * This release includes Arm64 binaries. Please can anyone with compatible systems validate these. * and compared to the previous RC, all the major changes are HDFS issues. Note, because the arm64 binaries are built separately on a different platform and JVM, their jar files may not match those of the x86 release -and therefore the maven artifacts. I don't think this is an issue (the ASF actually releases source tarballs, the binaries are there for help only, though with the maven repo that's a bit blurred). The only way to be consistent would actually untar the x86.tar.gz, overwrite its binaries with the arm stuff, retar, sign and push out for the vote. Even automating that would be risky. Please try the release and vote. The vote will run for 5 days. -Steve
[jira] [Resolved] (HDFS-16947) RBF NamenodeHeartbeatService to report error for not being able to register namenode in state store
[ https://issues.apache.org/jira/browse/HDFS-16947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang resolved HDFS-16947. Resolution: Fixed > RBF NamenodeHeartbeatService to report error for not being able to register > namenode in state store > --- > > Key: HDFS-16947 > URL: https://issues.apache.org/jira/browse/HDFS-16947 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > Namenode heartbeat service should provide error with full stacktrace if it > cannot register namenode in the state store. As of today, we only log info > msg. > For zookeeper based impl, this might mean either a) curator manager is not > initialized or b) if it fails to write to znode after exhausting retries. For > either of these cases, reporting only INFO log might not be good enough and > we might have to look for errors elsewhere. > > Sample example: > {code:java} > 2023-02-20 23:10:33,714 DEBUG [NamenodeHeartbeatService {ns} nn0-0] > router.NamenodeHeartbeatService - Received service state: ACTIVE from HA > namenode: {ns}-nn0:nn-0-{ns}.{cluster}:9000 > 2023-02-20 23:10:33,731 INFO [NamenodeHeartbeatService {ns} nn0-0] > impl.MembershipStoreImpl - Inserting new NN registration: > nn-0.namenode.{cluster}:->{ns}:nn0:nn-0-{ns}.{cluster}:9000-ACTIVE > 2023-02-20 23:10:33,731 INFO [NamenodeHeartbeatService {ns} nn0-0] > router.NamenodeHeartbeatService - Cannot register namenode in the State Store > {code} > If we could log full stacktrace: > {code:java} > 2023-02-21 00:20:24,691 ERROR [NamenodeHeartbeatService {ns} nn0-0] > router.NamenodeHeartbeatService - Cannot register namenode in the State Store > org.apache.hadoop.hdfs.server.federation.store.StateStoreUnavailableException: > State Store driver StateStoreZooKeeperImpl in nn-0.namenode.{cluster} is not > ready. > at > org.apache.hadoop.hdfs.server.federation.store.driver.StateStoreDriver.verifyDriverReady(StateStoreDriver.java:158) > at > org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl.putAll(StateStoreZooKeeperImpl.java:235) > at > org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreBaseImpl.put(StateStoreBaseImpl.java:74) > at > org.apache.hadoop.hdfs.server.federation.store.impl.MembershipStoreImpl.namenodeHeartbeat(MembershipStoreImpl.java:179) > at > org.apache.hadoop.hdfs.server.federation.resolver.MembershipNamenodeResolver.registerNamenode(MembershipNamenodeResolver.java:381) > at > org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.updateState(NamenodeHeartbeatService.java:317) > at > org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.lambda$periodicInvoke$0(NamenodeHeartbeatService.java:244) > ... > ... {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-16952) Add description of GETSERVERDEFAULTS to WebHDFS doc
Hualong Zhang created HDFS-16952: Summary: Add description of GETSERVERDEFAULTS to WebHDFS doc Key: HDFS-16952 URL: https://issues.apache.org/jira/browse/HDFS-16952 Project: Hadoop HDFS Issue Type: Improvement Components: webhdfs Affects Versions: 3.4.0 Reporter: Hualong Zhang Add description of GETSERVERDEFAULTS to WebHDFS doc -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-16951) Add description of GETSERVERDEFAULTS to WebHDFS doc
Hualong Zhang created HDFS-16951: Summary: Add description of GETSERVERDEFAULTS to WebHDFS doc Key: HDFS-16951 URL: https://issues.apache.org/jira/browse/HDFS-16951 Project: Hadoop HDFS Issue Type: Improvement Components: webhdfs Affects Versions: 3.4.0 Reporter: Hualong Zhang Add description of GETSERVERDEFAULTS to WebHDFS doc -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org