[jira] [Resolved] (YARN-11138) TestRouterWebServicesREST Junit Test Error Fix

2022-05-13 Thread fanshilun (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

fanshilun resolved YARN-11138.
--
Fix Version/s: 3.4.0
   Resolution: Fixed

> TestRouterWebServicesREST Junit Test Error Fix
> --
>
> Key: YARN-11138
> URL: https://issues.apache.org/jira/browse/YARN-11138
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Major
> Fix For: 3.4.0
>
>
> [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 28.818 s <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServicesREST
> [ERROR] org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServicesREST 
>  Time elapsed: 28.817 s  <<< FAILURE!
> java.lang.AssertionError: Web app not running
> at org.junit.Assert.fail(Assert.java:89)
> at 
> org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServicesREST.waitWebAppRunning(TestRouterWebServicesREST.java:199)
> at 
> org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServicesREST.setUp(TestRouterWebServicesREST.java:217)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> at 
> org.junit.internal.runners.statements.RunBefores.invokeMethod(RunBefores.java:33)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
> at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)






Apache Hadoop qbt Report: trunk+JDK11 on Linux/x86_64

2022-05-13 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/302/

[May 11, 2022 6:44:52 AM] (noreply) YARN-11130. removed unused import (#4276)
[May 11, 2022 12:29:05 PM] (Szilard Nemeth) YARN-11141. Capacity Scheduler does 
not support ambiguous queue names when moving application across queues. 
Contributed by Andras Gyori
[May 11, 2022 12:39:42 PM] (Szilard Nemeth) YARN-10850. TimelineService v2 
lists containers for all attempts when filtering for one. Contributed by 
Benjamin Teke
[May 11, 2022 2:55:19 PM] (Szilard Nemeth) MAPREDUCE-7379. 
RMContainerRequestor#makeRemoteRequest has confusing log message. Contributed 
by Ashutosh Gupta
[May 11, 2022 4:01:31 PM] (Benjamin Teke) YARN-4. RMWebServices returns 
only apps matching exactly the submitted queue name. Contributed by Szilard 
Nemeth
[May 11, 2022 5:34:22 PM] (noreply) HDFS-16465. Remove redundant strings.h 
inclusions (#4279)
[May 12, 2022 12:01:21 AM] (Owen O'Malley) HADOOP-18193:Support nested mount 
points in INodeTree
[May 12, 2022 8:53:09 AM] (noreply) HDFS-16525.System.err should be used when 
error occurs in multiple methods in DFSAdmin class (#4122)
[May 12, 2022 11:42:06 AM] (Szilard Nemeth) YARN-11126. ZKConfigurationStore 
Java deserialisation vulnerability. Contributed by Tamas Domok




-1 overall


The following subsystems voted -1:
blanks pathlen spotbugs unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml
 

spotbugs :

   module:hadoop-hdfs-project/hadoop-hdfs 
   Redundant nullcheck of oldLock, which is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.DataStorage.isPreUpgradableLayout(Storage$StorageDirectory)
 Redundant null check at DataStorage.java:is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.DataStorage.isPreUpgradableLayout(Storage$StorageDirectory)
 Redundant null check at DataStorage.java:[line 695] 
   Redundant nullcheck of metaChannel, which is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.MappableBlockLoader.verifyChecksum(long,
 FileInputStream, FileChannel, String) Redundant null check at 
MappableBlockLoader.java:is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.MappableBlockLoader.verifyChecksum(long,
 FileInputStream, FileChannel, String) Redundant null check at 
MappableBlockLoader.java:[line 138] 
   Redundant nullcheck of blockChannel, which is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.MemoryMappableBlockLoader.load(long,
 FileInputStream, FileInputStream, String, ExtendedBlockId) Redundant null 
check at MemoryMappableBlockLoader.java:is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.MemoryMappableBlockLoader.load(long,
 FileInputStream, FileInputStream, String, ExtendedBlockId) Redundant null 
check at MemoryMappableBlockLoader.java:[line 75] 
   Redundant nullcheck of blockChannel, which is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.NativePmemMappableBlockLoader.load(long,
 FileInputStream, FileInputStream, String, ExtendedBlockId) Redundant null 
check at NativePmemMappableBlockLoader.java:is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.NativePmemMappableBlockLoader.load(long,
 FileInputStream, FileInputStream, String, ExtendedBlockId) Redundant null 
check at NativePmemMappableBlockLoader.java:[line 85] 
   Redundant nullcheck of metaChannel, which is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.NativePmemMappableBlockLoader.verifyChecksumAndMapBlock(NativeIO$POSIX$PmemMappedRegion,
 long, FileInputStream, FileChannel, String) Redundant null check at 
NativePmemMappableBlockLoader.java:is known to be non-null in 

Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86_64

2022-05-13 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/868/

[May 12, 2022 8:53:09 AM] (noreply) HDFS-16525.System.err should be used when 
error occurs in multiple methods in DFSAdmin class (#4122)
[May 12, 2022 11:42:06 AM] (Szilard Nemeth) YARN-11126. ZKConfigurationStore 
Java deserialisation vulnerability. Contributed by Tamas Domok




-1 overall


The following subsystems voted -1:
blanks pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml
 

Failed junit tests :

   hadoop.hdfs.TestClientProtocolForPipelineRecovery 
   hadoop.hdfs.tools.TestDFSAdmin 
   hadoop.mapred.TestLocalDistributedCacheManager 
  

   cc:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/868/artifact/out/results-compile-cc-root.txt
 [96K]

   javac:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/868/artifact/out/results-compile-javac-root.txt
 [340K]

   blanks:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/868/artifact/out/blanks-eol.txt
 [13M]
  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/868/artifact/out/blanks-tabs.txt
 [2.0M]

   checkstyle:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/868/artifact/out/results-checkstyle-root.txt
 [14M]

   pathlen:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/868/artifact/out/results-pathlen.txt
 [16K]

   pylint:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/868/artifact/out/results-pylint.txt
 [20K]

   shellcheck:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/868/artifact/out/results-shellcheck.txt
 [28K]

   xml:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/868/artifact/out/xml.txt
 [24K]

   javadoc:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/868/artifact/out/results-javadoc-javadoc-root.txt
 [400K]

   unit:

  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/868/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 [756K]
  
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/868/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-common.txt
 [48K]

Powered by Apache Yetus 0.14.0-SNAPSHOT   https://yetus.apache.org


Re: [VOTE] Release Apache Hadoop 3.3.3 (RC1)

2022-05-13 Thread Mukund Madhav Thakur
+1 (binding)

Signature: good
Checksum: good.
Compiled from source: good.

mvn clean install -DskipTests

Ran AWS integration tests: good.

mvn -Dparallel-tests -DtestsThreadCount=8 clean verify

Compiled GCS against the staging repo with version 3.3.3: successful.

mvn clean install -DskipTests -Dhadoop.version=3.3.3 -Psnapshots-and-staging


On Thu, May 12, 2022 at 7:56 PM Stack wrote:

> +1 (binding)
>
> * Signature: ok
> * Checksum : ok
> * Rat check (10.0.2): ok
>  - mvn clean apache-rat:check
> * Built from source (10.0.2): ok
>  - mvn clean install  -DskipTests
> * Unit tests pass (10.0.2): ok
>  - mvn package -P runAllTests  -Dsurefire.rerunFailingTestsCount=3
>
> 
> [INFO] Apache Hadoop Cloud Storage Project  SUCCESS [  0.026 s]
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time:  12:51 h
> [INFO] Finished at: 2022-05-12T06:25:19-07:00
> [INFO] ------------------------------------------------------------------------
> [WARNING] The requested profile "runAllTests" could not be activated
> because it does not exist.
>
> Built a downstreamer against this RC and ran it in-the-small. Seemed fine.
>
> S
>
>
> On Wed, May 11, 2022 at 10:25 AM Steve Loughran wrote:
>
> > I have put together a release candidate (RC1) for Hadoop 3.3.3
> >
> > The RC is available at:
> > https://dist.apache.org/repos/dist/dev/hadoop/3.3.3-RC1/
> >
> > The git tag is release-3.3.3-RC1, commit d37586cbda3
> >
> > The maven artifacts are staged at
> > https://repository.apache.org/content/repositories/orgapachehadoop-1349/
> >
> > You can find my public key at:
> > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> >
> > Change log
> > https://dist.apache.org/repos/dist/dev/hadoop/3.3.3-RC1/CHANGELOG.md
> >
> > Release notes
> > https://dist.apache.org/repos/dist/dev/hadoop/3.3.3-RC1/RELEASENOTES.md
> >
> > There's a very small number of changes, primarily critical code/packaging
> > issues and security fixes.
> >
> > * The critical fixes which shipped in the 3.2.3 release.
> > * CVEs in our code and dependencies
> > * Shaded client packaging issues.
> > * A switch from log4j to reload4j
> >
> > reload4j is an active fork of the log4j 1.2.17 library with the classes
> > which contain CVEs removed. Even though hadoop never used those classes,
> > they regularly raised alerts on security scans and concern from users.
> > Switching to the forked project allows us to ship a secure logging
> > framework. It will complicate the builds of downstream
> > maven/ivy/gradle projects which exclude our log4j artifacts, as they
> > need to cut the new dependency instead/as well.
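> >
> > For illustration only (this snippet is not from the release notes): a
> > downstream Maven project that excluded log4j:log4j from a Hadoop artifact
> > would now exclude ch.qos.reload4j:reload4j as well, along the lines of:
> >
> >   <dependency>
> >     <groupId>org.apache.hadoop</groupId>
> >     <artifactId>hadoop-client</artifactId>
> >     <version>3.3.3</version>
> >     <exclusions>
> >       <exclusion>
> >         <groupId>log4j</groupId>
> >         <artifactId>log4j</artifactId>
> >       </exclusion>
> >       <exclusion>
> >         <groupId>ch.qos.reload4j</groupId>
> >         <artifactId>reload4j</artifactId>
> >       </exclusion>
> >     </exclusions>
> >   </dependency>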
> >
> > See the release notes for details.
> >
> > This is the second release attempt. It is the same git commit as before,
> > but fully recompiled with another republish to maven staging, which has
> > been verified by building spark, as well as a minimal test project.
> >
> > Please try the release and vote. The vote will run for 5 days.
> >
> > -Steve
> >
>


[jira] [Resolved] (YARN-11125) Backport YARN-6483 to branch-2.10

2022-05-13 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved YARN-11125.
--
Fix Version/s: 2.10.2
   Resolution: Fixed

Merged the PR into branch-2.10. Thank you [~groot] for your contribution.

> Backport YARN-6483 to branch-2.10
> -
>
> Key: YARN-11125
> URL: https://issues.apache.org/jira/browse/YARN-11125
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Reporter: Ashutosh Gupta
>Assignee: Ashutosh Gupta
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.10.2
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Backport YARN-6483 to branch-2.10






[jira] [Created] (YARN-11149) Add regression test cases for YARN-11073

2022-05-13 Thread Akira Ajisaka (Jira)
Akira Ajisaka created YARN-11149:


 Summary: Add regression test cases for YARN-11073
 Key: YARN-11149
 URL: https://issues.apache.org/jira/browse/YARN-11149
 Project: Hadoop YARN
  Issue Type: Test
  Components: test
Reporter: Akira Ajisaka


Add regression test cases for YARN-11073






[jira] [Resolved] (YARN-11073) CapacityScheduler DRF Preemption kicked in incorrectly for low-capacity queues

2022-05-13 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved YARN-11073.
--
Fix Version/s: 3.4.0
   Resolution: Fixed

Merged the PR into trunk. Let's add test cases in a separate JIRA.

> CapacityScheduler DRF Preemption kicked in incorrectly for low-capacity queues
> --
>
> Key: YARN-11073
> URL: https://issues.apache.org/jira/browse/YARN-11073
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, scheduler preemption
>Affects Versions: 2.10.1
>Reporter: Jian Chen
>Assignee: Jian Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: YARN-11073.tmp-1.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When running a Hive job in a low-capacity queue on an idle cluster, 
> preemption kicked in to preempt job containers even though there's no other 
> job running and competing for resources. 
> Let's take this scenario as an example:
>  * cluster resource : 
>  ** queue_low: min_capacity 1%
>  ** queue_mid: min_capacity 19%
>  ** queue_high: min_capacity 80%
>  * CapacityScheduler with DRF
> During the FIFO preemption candidate selection process, the 
> preemptableAmountCalculator needs to first "computeIdealAllocation", 
> which depends on each queue's guaranteed/min capacity. A queue's guaranteed 
> capacity is currently calculated as 
> "Resources.multiply(totalPartitionResource, absCapacity)", so the guaranteed 
> capacity of queue_low is totalPartitionResource * 0.01. But since the 
> Resource object takes only Long values, the double results get cast to 
> Long, and the final guaranteed capacity of queue_low becomes zero.
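> A minimal sketch of the truncation (the 40-vcore figure is hypothetical, 
> not from the cluster above):
> {code:java}
> long totalVcores = 40;       // hypothetical total partition resource
> double absCapacity = 0.01;   // queue_low: min_capacity 1%
> // the multiply produces 0.4, and the narrowing cast to long truncates it,
> // so queue_low ends up with a guaranteed capacity of 0
> long guaranteedVcores = (long) (totalVcores * absCapacity); // == 0
> {code}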
> Because the guaranteed capacity of queue_low is 0, its normalized guaranteed 
> capacity based on active queues is also 0 under the current algorithm in 
> "resetCapacity". This eventually leads to the continuous preemption of 
> job containers running in queue_low. 
> In order to work around this corner case, I made a small patch (for my own 
> use case) around "resetCapacity" to consider a couple of new scenarios 
> (a rough sketch of the combined logic follows the list): 
>  * if the sum of absoluteCapacity/minCapacity of all active queues is zero, 
> we should normalize their guaranteed capacity evenly
> {code:java}
> 1.0f / num_of_queues
> {code}
>  * if the sum of pre-normalized guaranteed capacity values (MB or 
> VCores) of all active queues is zero, meaning we might have several queues 
> like queue_low whose capacity value got cast to 0, we should normalize 
> evenly as well, like the first scenario (if they are all tiny, it really makes 
> no big difference, for example, 1% vs 1.2%).
>  * if one of the active queues has a zero pre-normalized guaranteed capacity 
> value but its absoluteCapacity/minCapacity is *not* zero, then we should 
> normalize based on the weight of their configured queue 
> absoluteCapacity/minCapacity. This is to make sure queue_low gets a small 
> but fair normalized value when queue_mid is also active. 
> {code:java}
> minCapacity / (sum_of_min_capacity_of_active_queues)
> {code}
>  
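> As a rough sketch only (the method and variable names below are mine, not 
> the actual resetCapacity code), the combined fallback logic looks like:
> {code:java}
> // absCapacity[i]: configured min capacity of active queue i (e.g. 0.01)
> // guaranteed[i]:  pre-normalized guaranteed capacity after the lossy cast
> static double[] normalize(double[] absCapacity, long[] guaranteed) {
>   int n = absCapacity.length;
>   double absSum = 0;
>   long guarSum = 0;
>   boolean anyCastToZero = false;
>   for (int i = 0; i < n; i++) {
>     absSum += absCapacity[i];
>     guarSum += guaranteed[i];
>     if (guaranteed[i] == 0 && absCapacity[i] > 0) {
>       anyCastToZero = true; // a queue_low-like queue lost its capacity to the cast
>     }
>   }
>   double[] norm = new double[n];
>   for (int i = 0; i < n; i++) {
>     if (absSum == 0 || guarSum == 0) {
>       norm[i] = 1.0 / n;                          // scenarios 1 and 2: share evenly
>     } else if (anyCastToZero) {
>       norm[i] = absCapacity[i] / absSum;          // scenario 3: weight by configured capacity
>     } else {
>       norm[i] = (double) guaranteed[i] / guarSum; // unchanged normal path
>     }
>   }
>   return norm;
> }
> {code}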
> This is how I currently work around this issue; it might need someone who's 
> more familiar with this component to do a systematic review of the entire 
> preemption process to fix it properly. Maybe we can always apply the 
> weight-based approach using absoluteCapacity, rewrite the Resource code 
> to remove the casting, or always round up when calculating a queue's 
> guaranteed capacity, etc.






Apache Hadoop qbt Report: branch-2.10+JDK7 on Linux/x86_64

2022-05-13 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/660/

No changes




-1 overall


The following subsystems voted -1:
asflicense hadolint mvnsite pathlen unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

Failed junit tests :

   hadoop.fs.TestTrash 
   hadoop.fs.TestFileUtil 
   hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys 
   
hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithUpgradeDomain 
   hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints 
   hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints 
   hadoop.hdfs.server.federation.router.TestRouterNamenodeHeartbeat 
   hadoop.hdfs.server.federation.router.TestRouterQuota 
   hadoop.hdfs.server.federation.resolver.TestMultipleDestinationResolver 
   hadoop.hdfs.server.federation.resolver.order.TestLocalResolver 
   hadoop.yarn.server.resourcemanager.TestClientRMService 
   
hadoop.yarn.server.resourcemanager.monitor.invariants.TestMetricsInvariantChecker
 
   hadoop.mapreduce.jobhistory.TestHistoryViewerPrinter 
   hadoop.mapreduce.lib.input.TestLineRecordReader 
   hadoop.mapred.TestLineRecordReader 
   hadoop.tools.TestDistCpSystem 
   hadoop.yarn.sls.TestSLSRunner 
   hadoop.resourceestimator.solver.impl.TestLpSolver 
   hadoop.resourceestimator.service.TestResourceEstimatorService 
  

   cc:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/660/artifact/out/diff-compile-cc-root.txt
  [4.0K]

   javac:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/660/artifact/out/diff-compile-javac-root.txt
  [472K]

   checkstyle:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/660/artifact/out/diff-checkstyle-root.txt
  [14M]

   hadolint:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/660/artifact/out/diff-patch-hadolint.txt
  [4.0K]

   mvnsite:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/660/artifact/out/patch-mvnsite-root.txt
  [556K]

   pathlen:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/660/artifact/out/pathlen.txt
  [12K]

   pylint:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/660/artifact/out/diff-patch-pylint.txt
  [20K]

   shellcheck:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/660/artifact/out/diff-patch-shellcheck.txt
  [72K]

   whitespace:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/660/artifact/out/whitespace-eol.txt
  [12M]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/660/artifact/out/whitespace-tabs.txt
  [1.3M]

   javadoc:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/660/artifact/out/patch-javadoc-root.txt
  [40K]

   unit:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/660/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt
  [220K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/660/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
  [428K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/660/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs_src_contrib_bkjournal.txt
  [12K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/660/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
  [36K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/660/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt
  [20K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/660/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
  [112K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/660/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt
  [104K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/660/artifact/out/patch-unit-hadoop-tools_hadoop-distcp.txt
  [24K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/660/artifact/out/patch-unit-hadoop-tools_hadoop-azure.txt
  [20K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/660/artifact/out/patch-unit-hadoop-tools_hadoop-sls.txt
  [28K]
   

[jira] [Created] (YARN-11148) In federation and security mode, NM recovery may fail.

2022-05-13 Thread zhengchenyu (Jira)
zhengchenyu created YARN-11148:
--

 Summary: In federation and security mode, NM recovery may fail.
 Key: YARN-11148
 URL: https://issues.apache.org/jira/browse/YARN-11148
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.2.1
Reporter: zhengchenyu
Assignee: zhengchenyu


Exception stack:
{code:java}
2022-05-08 00:44:11,536 WARN org.apache.hadoop.ipc.Client: Exception 
encountered while connecting to the server : 
org.apache.hadoop.security.AccessControlException: Client cannot authenticate 
via:[TOKEN, KERBEROS]
2022-05-08 00:44:11,540 ERROR 
org.apache.hadoop.yarn.server.nodemanager.amrmproxy.AMRMProxyService: Exception 
when recovering appattempt_1650635484875_0036_02, removing it from 
NMStateStore and move on
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: 
DestHost:destPort host:8032 , LocalHost:localPort node/10.x.x.x:0. Failed on 
local exception: java.io.IOException: 
org.apache.hadoop.security.AccessControlException: Client cannot authenticate 
via:[TOKEN, KERBEROS]
    at 
org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor.recover(FederationInterceptor.java:441)
    at 
org.apache.hadoop.yarn.server.nodemanager.amrmproxy.AMRMProxyService.initializePipeline(AMRMProxyService.java:466)
    at 
org.apache.hadoop.yarn.server.nodemanager.amrmproxy.AMRMProxyService.recover(AMRMProxyService.java:270)
    at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover(ContainerManagerImpl.java:389)
    at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:324)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
    at 
org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
    at 
org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:516)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
    at 
org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:974)
    at 
org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1054)
Caused by: java.io.IOException: DestHost:destPort host:8032 , 
LocalHost:localPort host/10.x.x.x:0. Failed on local exception: 
java.io.IOException: org.apache.hadoop.security.AccessControlException: Client 
cannot authenticate via:[TOKEN, KERBEROS]
    at sun.reflect.GeneratedConstructorAccessor30.newInstance(Unknown Source)
    at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:833)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:808)
    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1558)
    at org.apache.hadoop.ipc.Client.call(Client.java:1492)
    at org.apache.hadoop.ipc.Client.call(Client.java:1389)
    at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
    at com.sun.proxy.$Proxy30.getContainers(Unknown Source)
    at 
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getContainers(ApplicationClientProtocolPBClientImpl.java:479)
    at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
    at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
    at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
    at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
    at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
    at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
    at com.sun.proxy.$Proxy31.getContainers(Unknown Source)
    at 
org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor.recover(FederationInterceptor.java:418)
    ... 10 more
Caused by: java.io.IOException: 
org.apache.hadoop.security.AccessControlException: Client cannot authenticate 
via:[TOKEN, KERBEROS]
    at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:771)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at