JIRA privilege for committers/PMCs

2020-10-14 Thread Wei-Chiu Chuang
Hi,

Is there anyone who is a Hadoop committer or PMC member but does not have
the committer/administrator privilege (e.g. to assign/resolve jiras)?

JIRA projects like HADOOP have the "hadoop" and "hadoop-pmc" groups added as
administrators.
[image: Screen Shot 2020-10-14 at 3.20.59 PM.png]

So I would have thought all Hadoop committers/PMC members have administrator
privileges (assuming the hadoop group maps to all Hadoop committers and
hadoop-pmc to PMC members). But that does not seem to be the case, and I had
to add the administrator privileges for some committers manually. Does anyone
know how it's supposed to work?

Thanks,
Weichiu


Apache Hadoop qbt Report: trunk+JDK11 on Linux/x86_64

2020-10-14 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/29/

[Oct 13, 2020 8:11:10 AM] (Szilard Nemeth) YARN-10454: Add applicationName 
policy. Contributed by Peter Bacsko
[Oct 13, 2020 8:54:41 AM] (noreply) HADOOP-17223 update 
org.apache.httpcomponents:httpclient to 4.5.13 and httpcore to 4.4.13 (#2242)
[Oct 13, 2020 12:27:55 PM] (noreply) HADOOP-17303. 
TestWebHDFS.testLargeDirectory failing (#2380)
[Oct 13, 2020 12:30:38 PM] (Steve Loughran) Revert "HADOOP-17303. 
TestWebHDFS.testLargeDirectory failing (#2380)"
[Oct 13, 2020 12:31:00 PM] (Steve Loughran) HDFS-15626. 
TestWebHDFS.testLargeDirectory failing (#2380)
[Oct 13, 2020 3:17:44 PM] (noreply) HADOOP-16878. FileUtil.copy() to throw 
IOException if the source and destination are the same
[Oct 13, 2020 3:30:34 PM] (noreply) HADOOP-17301. ABFS: read-ahead error 
reporting breaks buffer management (#2369)
[Oct 13, 2020 3:54:15 PM] (Adam Antal) YARN-10448. SLS should set default user 
to handle SYNTH format. Contributed by zhuqi
[Oct 13, 2020 5:59:42 PM] (noreply) HDFS-15614. Initialize snapshot trash root 
during NameNode startup if enabled (#2370)




-1 overall


The following subsystems voted -1:
blanks findbugs mvnsite pathlen shadedclient unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml
 

findbugs :

   module:hadoop-hdfs-project/hadoop-hdfs 
   Redundant nullcheck of oldLock, which is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.DataStorage.isPreUpgradableLayout(Storage$StorageDirectory)
 Redundant null check at DataStorage.java:is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.DataStorage.isPreUpgradableLayout(Storage$StorageDirectory)
 Redundant null check at DataStorage.java:[line 695] 

findbugs :

   module:hadoop-hdfs-project 
   Redundant nullcheck of oldLock, which is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.DataStorage.isPreUpgradableLayout(Storage$StorageDirectory)
 Redundant null check at DataStorage.java:is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.DataStorage.isPreUpgradableLayout(Storage$StorageDirectory)
 Redundant null check at DataStorage.java:[line 695] 

findbugs :

   module:hadoop-yarn-project/hadoop-yarn 
   Redundant nullcheck of it, which is known to be non-null in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.recoverTrackerResources(LocalResourcesTracker,
 NMStateStoreService$LocalResourceTrackerState) Redundant null check at 
ResourceLocalizationService.java:is known to be non-null in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.recoverTrackerResources(LocalResourcesTracker,
 NMStateStoreService$LocalResourceTrackerState) Redundant null check at 
ResourceLocalizationService.java:[line 343] 
   Redundant nullcheck of it, which is known to be non-null in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.recoverTrackerResources(LocalResourcesTracker,
 NMStateStoreService$LocalResourceTrackerState) Redundant null check at 
ResourceLocalizationService.java:is known to be non-null in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.recoverTrackerResources(LocalResourcesTracker,
 NMStateStoreService$LocalResourceTrackerState) Redundant null check at 
ResourceLocalizationService.java:[line 356] 
   Boxed value is unboxed and then immediately reboxed in 
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result,
 byte[], byte[], KeyConverter, ValueConverter, boolean) At 
ColumnRWHelper.java:then immediately reboxed in 
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWit

[jira] [Created] (YARN-10461) Update the ResourceManagerRest.md with the introduced endpoints

2020-10-14 Thread Benjamin Teke (Jira)
Benjamin Teke created YARN-10461:


 Summary: Update the ResourceManagerRest.md with the introduced 
endpoints
 Key: YARN-10461
 URL: https://issues.apache.org/jira/browse/YARN-10461
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Benjamin Teke


The new APIs introduced in YARN-10421 should be added to the file 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerRest.md.






[jira] [Resolved] (YARN-10433) Extend the DiagnosticService to initiate the diagnostic bundle collection

2020-10-14 Thread Benjamin Teke (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Teke resolved YARN-10433.
--
Resolution: Abandoned

Migrated to YARN-10421

 

> Extend the DiagnosticService to initiate the diagnostic bundle collection
> -
>
> Key: YARN-10433
> URL: https://issues.apache.org/jira/browse/YARN-10433
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Major
>
> YARN-10421 introduces the new DiagnosticService class, the two new endpoints 
> for listing the available actions and starting the diagnostic script collect 
> method, and a basic diagnostic collector script. After the script's form is 
> finalized (YARN-10422), the DiagnosticService should be extended to spawn the 
> requested collection method based on the input parameters and return the 
> collected bundle as a response.
> To ease the load on the RM, the servlet should allow only one HTTP request at 
> a time. If a new request comes in while another is being served, an appropriate 
> response code should be returned, with the message "Diagnostics Collection in 
> Progress".
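
A minimal, illustrative sketch of the single-request guard this description calls for (not the actual YARN implementation; the class name, the endpoint wiring, and the 409 status code are assumptions):

{noformat}
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class DiagnosticBundleServlet extends HttpServlet {

  // Single flag guarding the whole servlet: only one bundle collection at a time.
  private final AtomicBoolean inProgress = new AtomicBoolean(false);

  @Override
  protected void doGet(HttpServletRequest req, HttpServletResponse resp)
      throws IOException {
    // Reject concurrent callers immediately instead of queueing more work on the RM.
    if (!inProgress.compareAndSet(false, true)) {
      resp.sendError(HttpServletResponse.SC_CONFLICT,
          "Diagnostics Collection in Progress");
      return;
    }
    try {
      collectAndStreamBundle(req, resp);
    } finally {
      inProgress.set(false);
    }
  }

  // Placeholder for spawning the requested collector script and streaming the
  // resulting bundle back to the caller.
  private void collectAndStreamBundle(HttpServletRequest req,
      HttpServletResponse resp) throws IOException {
    resp.setContentType("application/octet-stream");
  }
}
{noformat}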






[jira] [Created] (YARN-10460) Upgrading to JUnit 4.13 causes tests in TestNodeStatusUpdater to fail

2020-10-14 Thread Peter Bacsko (Jira)
Peter Bacsko created YARN-10460:
---

 Summary: Upgrading to JUnit 4.13 causes tests in 
TestNodeStatusUpdater to fail
 Key: YARN-10460
 URL: https://issues.apache.org/jira/browse/YARN-10460
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager, test
Reporter: Peter Bacsko
Assignee: Peter Bacsko


In our downstream build environment, we're using JUnit 4.13. Recently, we 
discovered a truly weird test failure in TestNodeStatusUpdater.

The problem is that timeout handling has changed in JUnit 4.13. See the 
difference between these two snippets:

4.12
{noformat}
@Override
public void evaluate() throws Throwable {
    CallableStatement callable = new CallableStatement();
    FutureTask<Throwable> task = new FutureTask<Throwable>(callable);
    threadGroup = new ThreadGroup("FailOnTimeoutGroup");
    Thread thread = new Thread(threadGroup, task, "Time-limited test");
    thread.setDaemon(true);
    thread.start();
    callable.awaitStarted();
    Throwable throwable = getResult(task, thread);
    if (throwable != null) {
        throw throwable;
    }
}
{noformat}
 
4.13
{noformat}
@Override
public void evaluate() throws Throwable {
    CallableStatement callable = new CallableStatement();
    FutureTask<Throwable> task = new FutureTask<Throwable>(callable);
    ThreadGroup threadGroup = new ThreadGroup("FailOnTimeoutGroup");
    Thread thread = new Thread(threadGroup, task, "Time-limited test");
    try {
        thread.setDaemon(true);
        thread.start();
        callable.awaitStarted();
        Throwable throwable = getResult(task, thread);
        if (throwable != null) {
            throw throwable;
        }
    } finally {
        try {
            thread.join(1);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        try {
            threadGroup.destroy();  // <--- This
        } catch (IllegalThreadStateException e) {
            // If a thread from the group is still alive, the ThreadGroup
            // cannot be destroyed.
            // Swallow the exception to keep the same behavior prior to
            // this change.
        }
    }
}
{noformat}
The change comes from [https://github.com/junit-team/junit4/pull/1517].

Unfortunately, destroying the thread group causes an issue, because the IPC 
layer caches all sorts of objects. The exception is:
{noformat}
java.lang.IllegalThreadStateException
at java.lang.ThreadGroup.addUnstarted(ThreadGroup.java:867)
at java.lang.Thread.init(Thread.java:402)
at java.lang.Thread.init(Thread.java:349)
at java.lang.Thread.<init>(Thread.java:675)
at java.util.concurrent.Executors$DefaultThreadFactory.newThread(Executors.java:613)
at com.google.common.util.concurrent.ThreadFactoryBuilder$1.newThread(ThreadFactoryBuilder.java:163)
at java.util.concurrent.ThreadPoolExecutor$Worker.<init>(ThreadPoolExecutor.java:612)
at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:925)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1368)
at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1136)
at org.apache.hadoop.ipc.Client.call(Client.java:1458)
at org.apache.hadoop.ipc.Client.call(Client.java:1405)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
at com.sun.proxy.$Proxy81.startContainers(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:128)
at org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown.startContainer(TestNodeManagerShutdown.java:251)
at org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater.testNodeStatusUpdaterRetryAndNMShutdown(TestNodeStatusUpdater.java:1576)
{noformat}
Both the {{clientExecutor}} in {{org.apache.hadoop.ipc.Client}} and the client 
object in {{ProtobufRpcEngine}}/{{ProtobufRpcEngine2}} are kept around for as long 
as they're needed. But since the backing thread group was destroyed by the 
previous test, it's no longer possible to create new threads in it.
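
As a minimal, self-contained illustration of that ThreadGroup behavior (not code from this Jira, and assuming a JDK 8/11 runtime where {{ThreadGroup.destroy()}} is still supported): once the group is destroyed, any later attempt to start a thread in it fails with exactly this exception.

{noformat}
public class DestroyedGroupDemo {
  public static void main(String[] args) throws Exception {
    ThreadGroup group = new ThreadGroup("FailOnTimeoutGroup");
    Thread worker = new Thread(group, () -> { }, "Time-limited test");
    worker.start();
    worker.join();

    // This is what JUnit 4.13's FailOnTimeout now does once the test thread ends.
    group.destroy();

    // Anything that cached a thread factory pointing at the destroyed group --
    // such as the IPC clientExecutor -- fails like this the next time it needs
    // a thread: java.lang.IllegalThreadStateException.
    new Thread(group, () -> { }, "next-test-thread").start();
  }
}
{noformat}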

A quick workaround is to stop the clients and completely clear the 
{{ClientCache}} in {{ProtobufRpcEngine}} before each test case (a rough sketch 
follows below). I tried this and it solves the problem, but it feels hacky. Not 
sure if there is a better approach.
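
For concreteness, here is a rough sketch of that workaround, not the actual patch: it reaches into Hadoop's IPC internals via reflection, and the member names it uses ({{CLIENTS}} in {{ProtobufRpcEngine}} and {{clients}} inside {{ClientCache}}) are assumptions about package-private fields, not a public API.

{noformat}
import java.lang.reflect.Field;
import java.util.Map;

import org.apache.hadoop.ipc.Client;
import org.junit.Before;

public abstract class IpcClientCacheCleanup {

  // Sketch of the per-test cleanup described above: stop and drop every cached
  // IPC client so the next test creates fresh threads in a live ThreadGroup.
  // "CLIENTS" and "clients" are assumed (package-private) member names.
  @Before
  public void clearIpcClientCache() throws Exception {
    Class<?> engine = Class.forName("org.apache.hadoop.ipc.ProtobufRpcEngine");
    Field cacheField = engine.getDeclaredField("CLIENTS");
    cacheField.setAccessible(true);
    Object clientCache = cacheField.get(null);

    Field clientsField = clientCache.getClass().getDeclaredField("clients");
    clientsField.setAccessible(true);
    @SuppressWarnings("unchecked")
    Map<Object, Client> clients = (Map<Object, Client>) clientsField.get(clientCache);

    synchronized (clientCache) {
      for (Client client : clients.values()) {
        client.stop();  // shuts down the client's connections and executor
      }
      clients.clear();
    }
  }
}
{noformat}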




[jira] [Created] (YARN-10459) containerLaunchedOnNode method does not need to hold the SchedulerApplicationAttempt lock

2020-10-14 Thread Ryan Wu (Jira)
Ryan Wu created YARN-10459:
--

 Summary: containerLaunchedOnNode method does not need to hold the 
SchedulerApplicationAttempt lock
 Key: YARN-10459
 URL: https://issues.apache.org/jira/browse/YARN-10459
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 3.1.3, 3.2.0
Reporter: Ryan Wu
 Fix For: 3.2.1


 

Now, containerLaunchedOnNode holds the SchedulerApplicationAttempt write lock, 
but looking at the method, it does not change any field. More seriously, this 
lock contention can affect the scheduler.

{noformat}
public void containerLaunchedOnNode(ContainerId containerId,
    NodeId nodeId) {
  writeLock.lock();
  try {
    // Inform the container
    RMContainer rmContainer = getRMContainer(containerId);
    if (rmContainer == null) {
      // Some unknown container sneaked into the system. Kill it.
      rmContext.getDispatcher().getEventHandler().handle(
          new RMNodeCleanContainerEvent(nodeId, containerId));
      return;
    }

    rmContainer.handle(
        new RMContainerEvent(containerId, RMContainerEventType.LAUNCHED));
  } finally {
    writeLock.unlock();
  }
}
{noformat}
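
A hedged sketch of the kind of change this description points at (not a committed patch, and untested): since the method only reads the attempt's state, it could take the read lock instead, so launch notifications no longer contend with the scheduler's writers.

{noformat}
public void containerLaunchedOnNode(ContainerId containerId,
    NodeId nodeId) {
  // Read lock only: getRMContainer() does not mutate attempt state, so
  // concurrent scheduler writers are not blocked by launch notifications.
  readLock.lock();
  try {
    RMContainer rmContainer = getRMContainer(containerId);
    if (rmContainer == null) {
      // Unknown container: ask the node to clean it up.
      rmContext.getDispatcher().getEventHandler().handle(
          new RMNodeCleanContainerEvent(nodeId, containerId));
      return;
    }
    rmContainer.handle(
        new RMContainerEvent(containerId, RMContainerEventType.LAUNCHED));
  } finally {
    readLock.unlock();
  }
}
{noformat}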

 


