JIRA privilege for committers/PMCs
Hi,

Is there anyone who is a Hadoop committer or PMC member but does not have the committer/administrator privileges (e.g. to assign or resolve jiras)?

JIRA projects like HADOOP have the "hadoop" and "hadoop-pmc" groups added as administrators.

[image: Screen Shot 2020-10-14 at 3.20.59 PM.png]

So I would have thought all Hadoop committers/PMC members have administrator privileges (assuming the hadoop group maps to all Hadoop committers, and hadoop-pmc to PMC members). But that seems not to be the case, and I had to grant the administrator privileges to committers manually.

Does anyone know how it's supposed to work?

Thanks,
Weichiu
Apache Hadoop qbt Report: trunk+JDK11 on Linux/x86_64
For more details, see https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/29/

[Oct 13, 2020 8:11:10 AM] (Szilard Nemeth) YARN-10454: Add applicationName policy. Contributed by Peter Bacsko
[Oct 13, 2020 8:54:41 AM] (noreply) HADOOP-17223 update org.apache.httpcomponents:httpclient to 4.5.13 and httpcore to 4.4.13 (#2242)
[Oct 13, 2020 12:27:55 PM] (noreply) HADOOP-17303. TestWebHDFS.testLargeDirectory failing (#2380)
[Oct 13, 2020 12:30:38 PM] (Steve Loughran) Revert "HADOOP-17303. TestWebHDFS.testLargeDirectory failing (#2380)"
[Oct 13, 2020 12:31:00 PM] (Steve Loughran) HDFS-15626. TestWebHDFS.testLargeDirectory failing (#2380)
[Oct 13, 2020 3:17:44 PM] (noreply) HADOOP-16878. FileUtil.copy() to throw IOException if the source and destination are the same
[Oct 13, 2020 3:30:34 PM] (noreply) HADOOP-17301. ABFS: read-ahead error reporting breaks buffer management (#2369)
[Oct 13, 2020 3:54:15 PM] (Adam Antal) YARN-10448. SLS should set default user to handle SYNTH format. Contributed by zhuqi
[Oct 13, 2020 5:59:42 PM] (noreply) HDFS-15614. Initialize snapshot trash root during NameNode startup if enabled (#2370)

-1 overall

The following subsystems voted -1:
   blanks findbugs mvnsite pathlen shadedclient unit xml

The following subsystems voted -1 but were configured to be filtered/ignored:
   cc checkstyle javac javadoc pylint shellcheck shelldocs

The following subsystems are considered long running (runtime bigger than 1h 0m 0s):
   unit

Specific tests:

   XML :

      Parsing Error(s):
         hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
         hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
         hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
         hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
         hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml
         hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml

   findbugs :

      module:hadoop-hdfs-project/hadoop-hdfs
         Redundant nullcheck of oldLock, which is known to be non-null in org.apache.hadoop.hdfs.server.datanode.DataStorage.isPreUpgradableLayout(Storage$StorageDirectory); redundant null check at DataStorage.java:[line 695]

      module:hadoop-hdfs-project
         Redundant nullcheck of oldLock, which is known to be non-null in org.apache.hadoop.hdfs.server.datanode.DataStorage.isPreUpgradableLayout(Storage$StorageDirectory); redundant null check at DataStorage.java:[line 695]

      module:hadoop-yarn-project/hadoop-yarn
         Redundant nullcheck of it, which is known to be non-null in org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.recoverTrackerResources(LocalResourcesTracker, NMStateStoreService$LocalResourceTrackerState); redundant null check at ResourceLocalizationService.java:[line 343]
         Redundant nullcheck of it, which is known to be non-null in org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.recoverTrackerResources(LocalResourcesTracker, NMStateStoreService$LocalResourceTrackerState); redundant null check at ResourceLocalizationService.java:[line 356]
         Boxed value is unboxed and then immediately reboxed in org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result, byte[], byte[], KeyConverter, ValueConverter, boolean) at ColumnRWHelper.java
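For context, the findbugs "Redundant nullcheck" warnings above fire when a value is tested against null at a point where it is provably non-null. A minimal, hypothetical Java illustration of the flagged pattern (not the actual DataStorage code):

{noformat}
import java.io.File;

public class RedundantNullCheckExample {

  // Illustrates findbugs' RCN_REDUNDANT_NULLCHECK warning.
  boolean isLocked(File dir) {
    File oldLock = new File(dir, "in_use.lock"); // 'new' never yields null
    // findbugs flags the next line: oldLock is known to be non-null here,
    // so the null check is redundant and may hide the intended logic.
    return oldLock != null && oldLock.exists();
  }
}
{noformat}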
[jira] [Created] (YARN-10461) Update the ResourceManagerRest.md with the introduced endpoints
Benjamin Teke created YARN-10461:
------------------------------------

             Summary: Update the ResourceManagerRest.md with the introduced endpoints
                 Key: YARN-10461
                 URL: https://issues.apache.org/jira/browse/YARN-10461
             Project: Hadoop YARN
          Issue Type: Sub-task
            Reporter: Benjamin Teke


The new APIs introduced in YARN-10421 should be added to the file hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerRest.md.
[jira] [Resolved] (YARN-10433) Extend the DiagnosticService to initiate the diagnostic bundle collection
[ https://issues.apache.org/jira/browse/YARN-10433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Teke resolved YARN-10433.
----------------------------------
    Resolution: Abandoned

Migrated to YARN-10421

> Extend the DiagnosticService to initiate the diagnostic bundle collection
> --------------------------------------------------------------------------
>
>                 Key: YARN-10433
>                 URL: https://issues.apache.org/jira/browse/YARN-10433
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Benjamin Teke
>            Assignee: Benjamin Teke
>            Priority: Major
>
> YARN-10421 introduces the new DiagnosticService class, the two new endpoints
> for listing the available actions and starting the diagnostic script's collect
> method, and a basic diagnostic collector script. After the script's form is
> finalized (YARN-10422), the DiagnosticService should be extended to spawn the
> requested collection method based on the input parameters and return the
> collected bundle as a response.
> To ease the load on the RM, the servlet should allow only one HTTP request at
> a time. If a new request comes in while serving another, an appropriate
> response code should be returned with the message "Diagnostics Collection in
> Progress".
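To illustrate the single-request constraint described above, here is a minimal, hypothetical sketch using a plain javax.servlet endpoint and an invented collectBundle() helper; the real endpoints and DiagnosticService API are defined in YARN-10421, not here:

{noformat}
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical sketch: permits only one in-flight collection at a time.
public class DiagnosticCollectorServlet extends HttpServlet {

  // Guards the single permitted in-flight request.
  private final AtomicBoolean inProgress = new AtomicBoolean(false);

  @Override
  protected void doGet(HttpServletRequest req, HttpServletResponse resp)
      throws IOException {
    // Atomically claim the slot; reject concurrent callers with 409 Conflict.
    if (!inProgress.compareAndSet(false, true)) {
      resp.sendError(HttpServletResponse.SC_CONFLICT,
          "Diagnostics Collection in Progress");
      return;
    }
    try {
      collectBundle(resp); // hypothetical: runs the script, streams the bundle
    } finally {
      inProgress.set(false); // release the slot even if collection fails
    }
  }

  private void collectBundle(HttpServletResponse resp) throws IOException {
    // Placeholder for the actual diagnostic bundle collection.
    resp.setContentType("application/octet-stream");
  }
}
{noformat}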
[jira] [Created] (YARN-10460) Upgrading to JUnit 4.13 causes tests in TestNodeStatusUpdater to fail
Peter Bacsko created YARN-10460:
---

             Summary: Upgrading to JUnit 4.13 causes tests in TestNodeStatusUpdater to fail
                 Key: YARN-10460
                 URL: https://issues.apache.org/jira/browse/YARN-10460
             Project: Hadoop YARN
          Issue Type: Bug
          Components: nodemanager, test
            Reporter: Peter Bacsko
            Assignee: Peter Bacsko


In our downstream build environment, we're using JUnit 4.13. Recently, we discovered a truly weird test failure in TestNodeStatusUpdater.

The problem is that timeout handling has changed in JUnit 4.13. See the difference between these two snippets:

4.12:
{noformat}
@Override
public void evaluate() throws Throwable {
    CallableStatement callable = new CallableStatement();
    FutureTask<Throwable> task = new FutureTask<Throwable>(callable);
    threadGroup = new ThreadGroup("FailOnTimeoutGroup");
    Thread thread = new Thread(threadGroup, task, "Time-limited test");
    thread.setDaemon(true);
    thread.start();
    callable.awaitStarted();
    Throwable throwable = getResult(task, thread);
    if (throwable != null) {
        throw throwable;
    }
}
{noformat}

4.13:
{noformat}
@Override
public void evaluate() throws Throwable {
    CallableStatement callable = new CallableStatement();
    FutureTask<Throwable> task = new FutureTask<Throwable>(callable);
    ThreadGroup threadGroup = new ThreadGroup("FailOnTimeoutGroup");
    Thread thread = new Thread(threadGroup, task, "Time-limited test");
    try {
        thread.setDaemon(true);
        thread.start();
        callable.awaitStarted();
        Throwable throwable = getResult(task, thread);
        if (throwable != null) {
            throw throwable;
        }
    } finally {
        try {
            thread.join(1);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        try {
            threadGroup.destroy();   // <--- This
        } catch (IllegalThreadStateException e) {
            // If a thread from the group is still alive, the ThreadGroup cannot be destroyed.
            // Swallow the exception to keep the same behavior prior to this change.
        }
    }
}
{noformat}

The change comes from [https://github.com/junit-team/junit4/pull/1517].

Unfortunately, destroying the thread group causes an issue because there is all sorts of object caching in the IPC layer.
The exception is:

{noformat}
java.lang.IllegalThreadStateException
    at java.lang.ThreadGroup.addUnstarted(ThreadGroup.java:867)
    at java.lang.Thread.init(Thread.java:402)
    at java.lang.Thread.init(Thread.java:349)
    at java.lang.Thread.<init>(Thread.java:675)
    at java.util.concurrent.Executors$DefaultThreadFactory.newThread(Executors.java:613)
    at com.google.common.util.concurrent.ThreadFactoryBuilder$1.newThread(ThreadFactoryBuilder.java:163)
    at java.util.concurrent.ThreadPoolExecutor$Worker.<init>(ThreadPoolExecutor.java:612)
    at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:925)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1368)
    at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
    at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1136)
    at org.apache.hadoop.ipc.Client.call(Client.java:1458)
    at org.apache.hadoop.ipc.Client.call(Client.java:1405)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
    at com.sun.proxy.$Proxy81.startContainers(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:128)
    at org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown.startContainer(TestNodeManagerShutdown.java:251)
    at org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater.testNodeStatusUpdaterRetryAndNMShutdown(TestNodeStatusUpdater.java:1576)
{noformat}

Both the {{clientExecutor}} in {{org.apache.hadoop.ipc.Client}} and the client object in {{ProtobufRpcEngine}}/{{ProtobufRpcEngine2}} are stored for as long as they're needed. But since the backing thread group was destroyed in the previous test, it's no longer possible to create new threads.

A quick workaround is to stop the clients and completely clear the {{ClientCache}} in {{ProtobufRpcEngine}} before each testcase. I tried this and it solves the problem, but it feels hacky. Not sure if there is a better approach.
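To see concretely why a destroyed thread group breaks later RPC calls, here is a small self-contained demo (hypothetical names, runnable on a pre-JDK-19 runtime where ThreadGroup.destroy() still works); the cached factory plays the role of the executor that org.apache.hadoop.ipc.Client keeps alive across tests:

{noformat}
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

public class DestroyedGroupDemo {

  // Stands in for the thread factory/executor cached across tests in the IPC layer.
  static ThreadFactory cachedFactory;

  public static void main(String[] args) throws Exception {
    ThreadGroup group = new ThreadGroup("FailOnTimeoutGroup");

    Thread testThread = new Thread(group, () -> {
      // First "test": the factory is created inside the time-limited test thread,
      // so it captures FailOnTimeoutGroup as the parent group for all its threads.
      cachedFactory = Executors.defaultThreadFactory();
    }, "Time-limited test");
    testThread.start();
    testThread.join();

    group.destroy(); // what JUnit 4.13 does once the test's threads have finished

    // Second "test": the cached factory still points at the destroyed group, so
    // creating a thread fails in ThreadGroup.addUnstarted, as in the trace above.
    cachedFactory.newThread(() -> {}); // throws java.lang.IllegalThreadStateException
  }
}
{noformat}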
[jira] [Created] (YARN-10459) containerLaunchedOnNode method does not need to hold the SchedulerApplicationAttempt lock
Ryan Wu created YARN-10459:
--

             Summary: containerLaunchedOnNode method does not need to hold the SchedulerApplicationAttempt lock
                 Key: YARN-10459
                 URL: https://issues.apache.org/jira/browse/YARN-10459
             Project: Hadoop YARN
          Issue Type: Improvement
    Affects Versions: 3.1.3, 3.2.0
            Reporter: Ryan Wu
             Fix For: 3.2.1


Currently, containerLaunchedOnNode takes the SchedulerApplicationAttempt write lock, but looking at the method, it does not change any field. More seriously, holding the write lock here can slow down the scheduler:

{noformat}
public void containerLaunchedOnNode(ContainerId containerId, NodeId nodeId) {
    // Inform the container
    writeLock.lock();
    try {
        RMContainer rmContainer = getRMContainer(containerId);
        if (rmContainer == null) {
            // Some unknown container sneaked into the system. Kill it.
            rmContext.getDispatcher().getEventHandler().handle(
                new RMNodeCleanContainerEvent(nodeId, containerId));
            return;
        }
        rmContainer.handle(
            new RMContainerEvent(containerId, RMContainerEventType.LAUNCHED));
    } finally {
        writeLock.unlock();
    }
}
{noformat}
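A minimal sketch of the change the report appears to suggest, assuming the method truly performs no writes and that a readLock paired with the writeLock above exists on SchedulerApplicationAttempt (whether the container lookup and event dispatch are safe under a read lock is exactly what a patch would need to verify):

{noformat}
public void containerLaunchedOnNode(ContainerId containerId, NodeId nodeId) {
    // The method only looks up the container and dispatches events; it mutates
    // no SchedulerApplicationAttempt state, so a read lock may be sufficient.
    readLock.lock();
    try {
        RMContainer rmContainer = getRMContainer(containerId);
        if (rmContainer == null) {
            // Some unknown container sneaked into the system. Kill it.
            rmContext.getDispatcher().getEventHandler().handle(
                new RMNodeCleanContainerEvent(nodeId, containerId));
            return;
        }
        rmContainer.handle(
            new RMContainerEvent(containerId, RMContainerEventType.LAUNCHED));
    } finally {
        readLock.unlock();
    }
}
{noformat}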