[jira] [Created] (YARN-9450) TestCapacityOverTimePolicy#testAllocation fails sporadically

2019-04-05 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created YARN-9450:
---

 Summary: TestCapacityOverTimePolicy#testAllocation fails sporadically
 Key: YARN-9450
 URL: https://issues.apache.org/jira/browse/YARN-9450
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacity scheduler, test
Affects Versions: 3.2.0
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph


TestCapacityOverTimePolicy#testAllocation fails sporadically. Observed in 
multiple builds run for YARN-9447, YARN-8193 and YARN-8051.

{code}

Failed
org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy.testAllocation[Duration 90,000,000, height 0.25, numSubmission 1, periodic 8640)]

Failing for the past 1 build (Since Failed#23900 )
Took 34 ms.
Stacktrace
junit.framework.AssertionFailedError
at junit.framework.Assert.fail(Assert.java:55)
at junit.framework.Assert.fail(Assert.java:64)
at junit.framework.TestCase.fail(TestCase.java:235)
at org.apache.hadoop.yarn.server.resourcemanager.reservation.BaseSharingPolicyTest.runTest(BaseSharingPolicyTest.java:146)
at org.apache.hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy.testAllocation(TestCapacityOverTimePolicy.java:136)
at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runners.Suite.runChild(Suite.java:128)
at org.junit.runners.Suite.runChild(Suite.java:27)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
Standard Output
2019-04-05 23:46:19,022 INFO  [main] recovery.RMStateStore (RMStateStore.java:transition(591)) - Storing reservation allocation.reservation_-4277767163553399219_8391370105871519867
2019-04-05 23:46:19,022 INFO  [main] recovery.RMStateStore (MemoryRMStateStore.java:storeReservationState(258)) - Storing reservationallocation for reservation_-4277767163553399219_8391370105871519867 for plan dedicated
2019-04-05 23:46:19,023 INFO  [main] reservation.InMemoryPlan (InMemoryPlan.java:addReservation(373)) - Successfully added reservation: reservation_-4277767163553399219_8391370105871519867 to plan.
In-memory Plan: Parent Queue: dedicated Total Capacity:  Step: 1000
reservation_-4277767163553399219_8391370105871519867 user:u1 startTime: 0 endTime: 8640 Periodiciy: 8640 alloc:
[Period: 8640
0: 
 2779008: 
 85579008: 
 8640: 
 9223372036854775807: null
 ]

{code}

[jira] [Created] (YARN-9449) Non-exclusive labels can create reservation loop on cluster without unlabeled node

2019-04-05 Thread Brandon Scheller (JIRA)
Brandon Scheller created YARN-9449:
--

 Summary: Non-exclusive labels can create reservation loop on cluster without unlabeled node
 Key: YARN-9449
 URL: https://issues.apache.org/jira/browse/YARN-9449
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.8.5
Reporter: Brandon Scheller


https://issues.apache.org/jira/browse/YARN-5342 added a counter to YARN so that 
unscheduled resource requests are attempted on unlabeled nodes first. This 
counter is reset only when a scheduling attempt happens on an unlabeled node.

On Hadoop clusters with only labeled nodes, the counter can therefore never be 
reset, and it permanently blocks skipping the labeled node. Because the node is 
never skipped, the scheduler enters the loop shown in the YARN RM logs below.
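
Sketched below is the counter behaviour as described (a hedged illustration; the names and threshold are assumptions, not the actual RegularContainerAllocator source):

{code:java}
// Illustrative sketch of the YARN-5342 counter described above; field
// and method names are assumptions, not the real Hadoop source.
class NonExclusiveSkipSketch {
  private long missedSchedulingOpportunities = 0;
  private static final long THRESHOLD = 1; // illustrative threshold

  boolean shouldSkipLabeledNode(boolean nodeIsUnlabeled) {
    if (nodeIsUnlabeled) {
      // The ONLY reset point: a scheduling attempt on an unlabeled node.
      missedSchedulingOpportunities = 0;
      return false;
    }
    missedSchedulingOpportunities++;
    // On a cluster with no unlabeled nodes the counter is never reset,
    // so once it passes the threshold the labeled node is never skipped
    // again and the scheduler keeps reserving/unreserving on it -- the
    // loop visible in the RM log below.
    return missedSchedulingOpportunities <= THRESHOLD;
  }
}
{code}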

 

{noformat}
2019-02-18 23:54:22,591 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl (ResourceManager Event Processor): container_1550533628872_0003_01_23 Container Transitioned from NEW to RESERVED
2019-02-18 23:54:22,591 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.AbstractContainerAllocator (ResourceManager Event Processor): Reserved container application=application_1550533628872_0003 resource= queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@6ffe0dc3 cluster=
2019-02-18 23:54:22,592 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue (ResourceManager Event Processor): assignedContainer queue=root usedCapacity=0.0 absoluteUsedCapacity=0.0 used= cluster=
2019-02-18 23:54:23,592 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler (ResourceManager Event Processor): Trying to fulfill reservation for application application_1550533628872_0003 on node: ip-10-0-0-122.ec2.internal:8041
2019-02-18 23:54:23,592 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp (ResourceManager Event Processor): Application application_1550533628872_0003 unreserved on node host: ip-10-0-0-122.ec2.internal:8041 #containers=1 available= used=, currently has 0 at priority 1; currentReservation  on node-label=LABELED
2019-02-18 23:54:23,593 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl (ResourceManager Event Processor): container_1550533628872_0003_01_24 Container Transitioned from NEW to RESERVED
2019-02-18 23:54:23,593 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.AbstractContainerAllocator (ResourceManager Event Processor): Reserved container application=application_1550533628872_0003 resource= queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@6ffe0dc3 cluster=
2019-02-18 23:54:23,593 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue (ResourceManager Event Processor): assignedContainer queue=root usedCapacity=0.0 absoluteUsedCapacity=0.0 used= cluster=
2019-02-18 23:54:24,593 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler (ResourceManager Event Processor): Trying to fulfill reservation for application application_1550533628872_0003 on node: ip-10-0-0-122.ec2.internal:8041
2019-02-18 23:54:24,593 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp (ResourceManager Event Processor): Application application_1550533628872_0003 unreserved on node host: ip-10-0-0-122.ec2.internal:8041 #containers=1 available= used=, currently has 0 at priority 1; currentReservation  on node-label=LABELED
2019-02-18 23:54:24,594 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl (ResourceManager Event Processor): container_1550533628872_0003_01_25 Container Transitioned from NEW to RESERVED
2019-02-18 23:54:24,594 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.AbstractContainerAllocator (ResourceManager Event Processor): Reserved container application=application_1550533628872_0003 resource= queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@6ffe0dc3 cluster=
2019-02-18 23:54:24,594 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue (ResourceManager Event Processor): assignedContainer queue=root usedCapacity=0.0 absoluteUsedCapacity=0.0 used= cluster=
2019-02-18 23:54:25,594 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler (ResourceManager Event Processor): Trying to fulfill reservation for application application_1550533628872_0003 on node: ip-10-0-0-122.ec2.internal:8041
2019-02-18 23:54:25,595 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp
{noformat}

[jira] [Created] (YARN-9448) Fix Opportunistic Scheduling for node local allocations.

2019-04-05 Thread Abhishek Modi (JIRA)
Abhishek Modi created YARN-9448:
---

 Summary: Fix Opportunistic Scheduling for node local allocations.
 Key: YARN-9448
 URL: https://issues.apache.org/jira/browse/YARN-9448
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Abhishek Modi
Assignee: Abhishek Modi


Right now, an opportunistic container might not get allocated on a rack-local 
node even if one is available.

Nodes are currently blacklisted if any container other than a node-local 
container is allocated on them. If a container was previously allocated on a 
node that way, the node will not be considered at all, even when there is an 
ask for a node-local request.
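
A hedged sketch of the described blacklisting behaviour (illustrative names only, not the actual opportunistic-scheduler source):

{code:java}
import java.util.HashSet;
import java.util.Set;

// Illustrative only: models the behaviour described in this issue, not
// the real opportunistic container allocator implementation.
class OpportunisticBlacklistSketch {
  private final Set<String> blacklistedNodes = new HashSet<>();

  // A node gets blacklisted whenever a non-node-local container lands on it.
  void onContainerAllocated(String node, boolean nodeLocal) {
    if (!nodeLocal) {
      blacklistedNodes.add(node);
    }
  }

  // Bug as described: the node stays blacklisted, so a later node-local
  // ask for exactly this node is never satisfied.
  boolean mayAllocateNodeLocal(String requestedNode) {
    return !blacklistedNodes.contains(requestedNode);
  }
}
{code}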





Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2019-04-05 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1097/

[Apr 4, 2019 5:50:28 AM] (github) HDDS-1339. Implement ratis snapshots on OM (#651)
[Apr 4, 2019 5:53:05 AM] (vrushali) YARN-9303 Username splits won't help timelineservice.app_flow table.
[Apr 4, 2019 9:33:37 AM] (yqlin) HDDS-1189. Recon Aggregate DB schema and ORM. Contributed by Siddharth
[Apr 4, 2019 10:05:01 AM] (wwei) YARN-9394. Use new API of RackResolver to get better performance.
[Apr 4, 2019 11:02:59 AM] (github) HDDS-1207. Refactor Container Report Processing logic and plugin new
[Apr 4, 2019 11:05:37 AM] (shashikant) HDDS-1349. Remove watchClient from XceiverClientRatis. Contributed by
[Apr 4, 2019 11:07:18 AM] (weichiu) HDFS-14389. getAclStatus returns incorrect permissions and owner when an
[Apr 4, 2019 1:48:32 PM] (nandakumar131) HDDS-1353 : Metrics scm_pipeline_metrics_num_pipeline_creation_failed
[Apr 4, 2019 3:15:57 PM] (stevel) HADOOP-16208. Do Not Log InterruptedException in Client.
[Apr 4, 2019 5:01:56 PM] (eyang) YARN-9396. Fixed duplicated RM Container created event to ATS.
[Apr 4, 2019 5:21:30 PM] (eyang) YARN-9441. Updated YARN app catalog name for consistency.
[Apr 4, 2019 8:14:18 PM] (stevel) HADOOP-16197 S3AUtils.translateException to map
[Apr 5, 2019 12:38:02 AM] (bharat) HDDS-1189. Build failing due to rat check failure introduced by
[Apr 5, 2019 3:29:11 AM] (aajisaka) HDFS-14407. Fix misuse of SLF4j logging API in




-1 overall


The following subsystems voted -1:
asflicense findbugs hadolint pathlen unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   module:hadoop-common-project/hadoop-kms 
   Null passed for non-null parameter of com.google.common.base.Strings.isNullOrEmpty(String) in org.apache.hadoop.crypto.key.kms.server.KMSAudit.op(KMSAuditLogger$OpStatus, Object, UserGroupInformation, String, String, String) Method invoked at KMSAudit.java:[line 195] 

FindBugs :

   module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager 
   Null passed for non-null parameter of com.google.common.base.Strings.emptyToNull(String) in org.apache.hadoop.yarn.server.nodemanager.NodeHealthCheckerService.getHealthReport() Method invoked at NodeHealthCheckerService.java:[line 66] 
   Null passed for non-null parameter of com.google.common.base.Strings.emptyToNull(String) in org.apache.hadoop.yarn.server.nodemanager.NodeHealthCheckerService.getHealthReport() Method invoked at NodeHealthCheckerService.java:[line 72] 

FindBugs :

   module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager 
   Null passed for non-null parameter of com.google.common.util.concurrent.SettableFuture.set(Object) in org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$UpdateAppTransition.transition(RMStateStore, RMStateStoreEvent) At RMStateStore.java:[line 291] 
   Null passed for non-null parameter of com.google.common.util.concurrent.SettableFuture.set(Object) in org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateApplicationPriority(Priority, ApplicationId, SettableFuture, UserGroupInformation) At CapacityScheduler.java:[line 2647] 

FindBugs :

   module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-documentstore 
   org.apache.hadoop.yarn.server.timelineservice.documentstore.collection.document.entity.TimelineEntityDocument.setEvents(Map) makes inefficient use of keySet iterator instead of entrySet iterator At 
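
For reference, the keySet-vs-entrySet pattern flagged above looks like this (a generic illustration, not the flagged TimelineEntityDocument code):

{code:java}
import java.util.Map;

class KeySetVsEntrySet {
  // Flagged pattern: iterating keySet() and calling get() per key does
  // one extra map lookup for every entry.
  static void inefficient(Map<String, Integer> m) {
    for (String key : m.keySet()) {
      Integer value = m.get(key); // redundant lookup
      System.out.println(key + "=" + value);
    }
  }

  // Preferred: entrySet() yields key and value together.
  static void efficient(Map<String, Integer> m) {
    for (Map.Entry<String, Integer> e : m.entrySet()) {
      System.out.println(e.getKey() + "=" + e.getValue());
    }
  }
}
{code}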

[jira] [Created] (YARN-9447) RM Crashes with NPE at TimelineServiceV2Publisher.putEntity

2019-04-05 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created YARN-9447:
---

 Summary: RM Crashes with NPE at TimelineServiceV2Publisher.putEntity
 Key: YARN-9447
 URL: https://issues.apache.org/jira/browse/YARN-9447
 Project: Hadoop YARN
  Issue Type: Bug
  Components: ATSv2
Affects Versions: 3.2.0
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph
 Attachments: rm.log

ResourceManager crashes with a NullPointerException when 
TimelineServiceV2Publisher calls putEntity after the timeline collector service 
for the application has been removed. This happened while killing a MapReduce job.
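
A hedged reconstruction of the race (heavily simplified; not the actual TimelineServiceV2Publisher source, and the guarded variant is an assumption rather than the committed fix):

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: the dispatcher thread handles a queued event after
// the app's collector has been removed, so the lookup returns null.
class PutEntityRaceSketch {
  private final Map<String, Object> collectors = new ConcurrentHashMap<>();

  void putEntity(String appId) {
    Object collector = collectors.get(appId); // null once removed
    collector.toString(); // NullPointerException kills the dispatcher
  }

  // One possible guard (an assumption, not the committed fix): skip
  // publishing once the app's collector is gone.
  void putEntityGuarded(String appId) {
    Object collector = collectors.get(appId);
    if (collector != null) {
      collector.toString();
    }
  }
}
{code}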

{code}
2019-04-05 14:53:24,728 INFO org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorManager: The collector service for application_1553788280931_0013 was removed
2019-04-05 14:53:24,734 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
java.lang.NullPointerException
at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher.putEntity(TimelineServiceV2Publisher.java:461)
at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher.access$100(TimelineServiceV2Publisher.java:73)
at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher$TimelineV2EventHandler.handle(TimelineServiceV2Publisher.java:496)
at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV2Publisher$TimelineV2EventHandler.handle(TimelineServiceV2Publisher.java:485)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
at java.lang.Thread.run(Thread.java:748)
2019-04-05 14:53:24,743 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..
2019-04-05 14:53:24,758 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler: Container container_e30_1553788280931_0013_01_01 completed with event FINISHED, but corresponding RMContainer doesn't exist.
{code}







[jira] [Created] (YARN-9446) TestMiniMRClientCluster.testRestart is flaky

2019-04-05 Thread Peter Bacsko (JIRA)
Peter Bacsko created YARN-9446:
--

 Summary: TestMiniMRClientCluster.testRestart is flaky
 Key: YARN-9446
 URL: https://issues.apache.org/jira/browse/YARN-9446
 Project: Hadoop YARN
  Issue Type: Bug
  Components: test
Reporter: Peter Bacsko
Assignee: Peter Bacsko


The testcase {{TestMiniMRClientCluster.testRestart}} sometimes fails with this error:

{noformat}
2019-04-04 11:21:31,896 INFO [main] service.AbstractService (AbstractService.java:noteFailure(273)) - Service org.apache.hadoop.yarn.server.resourcemanager.AdminService failed in state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.BindException: Problem binding to [test-host:35491] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.BindException: Problem binding to [test-host:35491] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException
 at org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:138)
 at org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:65)
 at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:54)
 at org.apache.hadoop.yarn.server.resourcemanager.AdminService.startServer(AdminService.java:178)
 at org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceStart(AdminService.java:165)
 at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
 at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
 at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1244)
 at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
 at org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:355)
 at org.apache.hadoop.yarn.server.MiniYARNCluster.access$300(MiniYARNCluster.java:127)
 at org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:493)
 at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
 at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
 at org.apache.hadoop.yarn.server.MiniYARNCluster.serviceStart(MiniYARNCluster.java:312)
 at org.apache.hadoop.mapreduce.v2.MiniMRYarnCluster.serviceStart(MiniMRYarnCluster.java:210)
 at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
 at org.apache.hadoop.mapred.MiniMRYarnClusterAdapter.restart(MiniMRYarnClusterAdapter.java:73)
 at org.apache.hadoop.mapred.TestMiniMRClientCluster.testRestart(TestMiniMRClientCluster.java:114)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62){noformat}

The solution is to set the socket option SO_REUSEADDR, which is implemented in 
HADOOP-16238.
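
A minimal sketch of the SO_REUSEADDR idea, using plain java.net rather than the Hadoop RPC server that HADOOP-16238 actually patches (the port is taken from the log above):

{code:java}
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class ReuseAddrSketch {
  public static void main(String[] args) throws Exception {
    ServerSocket server = new ServerSocket();
    // Must be set before bind(): allows rebinding the port while the
    // previous socket still lingers in TIME_WAIT after a restart,
    // avoiding "java.net.BindException: Address already in use".
    server.setReuseAddress(true);
    server.bind(new InetSocketAddress(35491));
    System.out.println("Bound to " + server.getLocalSocketAddress());
    server.close();
  }
}
{code}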





Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86

2019-04-05 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/282/

[Apr 5, 2019 3:32:07 AM] (aajisaka) HDFS-14407. Fix misuse of SLF4j logging API in




-1 overall


The following subsystems voted -1:
asflicense findbugs hadolint pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   hadoop-build-tools/src/main/resources/checkstyle/checkstyle.xml 
   hadoop-build-tools/src/main/resources/checkstyle/suppressions.xml 
   hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml 
   hadoop-tools/hadoop-azure/src/config/checkstyle.xml 
   hadoop-tools/hadoop-resourceestimator/src/config/checkstyle.xml 
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml 
   hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml 

FindBugs :

   module:hadoop-common-project/hadoop-common 
   Class org.apache.hadoop.fs.GlobalStorageStatistics defines non-transient non-serializable instance field map In GlobalStorageStatistics.java 

FindBugs :

   module:hadoop-hdfs-project/hadoop-hdfs 
   Dead store to state in org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Saver.save(OutputStream, INodeSymlink) At FSImageFormatPBINode.java:[line 623] 

FindBugs :

   module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbase-client 
   Boxed value is unboxed and then immediately reboxed in org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result, byte[], byte[], KeyConverter, ValueConverter, boolean) At ColumnRWHelper.java:[line 335] 

Failed junit tests :

   hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys 
   hadoop.hdfs.web.TestWebHdfsTimeouts 
   hadoop.hdfs.server.namenode.TestNameNodeHttpServerXFrame 
   hadoop.yarn.client.api.impl.TestAMRMProxy 
   hadoop.registry.secure.TestSecureLogins 
   hadoop.yarn.server.nodemanager.amrmproxy.TestFederationInterceptor 
   hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/282/artifact/out/diff-compile-cc-root-jdk1.7.0_95.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/282/artifact/out/diff-compile-javac-root-jdk1.7.0_95.txt
  [328K]

   cc:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/282/artifact/out/diff-compile-cc-root-jdk1.8.0_191.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/282/artifact/out/diff-compile-javac-root-jdk1.8.0_191.txt
  [308K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/282/artifact/out/diff-checkstyle-root.txt
  [16M]

   hadolint:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/282/artifact/out/diff-patch-hadolint.txt
  [4.0K]

   pathlen:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/282/artifact/out/pathlen.txt
  [12K]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/282/artifact/out/diff-patch-pylint.txt
  [24K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/282/artifact/out/diff-patch-shellcheck.txt
  [72K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/282/artifact/out/diff-patch-shelldocs.txt
  [8.0K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/282/artifact/out/whitespace-eol.txt
  [12M]
   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/282/artifact/out/whitespace-tabs.txt
  [1.2M]

   xml:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/282/artifact/out/xml.txt
  [20K]

   findbugs:

   
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/282/artifact/out/branch-findbugs-hadoop-common-project_hadoop-common-warnings.html
  [8.0K]
   

[jira] [Created] (YARN-9445) yarn.admin.acl is futile

2019-04-05 Thread Peter Simon (JIRA)
Peter Simon created YARN-9445:
-

 Summary: yarn.admin.acl is futile
 Key: YARN-9445
 URL: https://issues.apache.org/jira/browse/YARN-9445
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security
Affects Versions: 2.6.0
Reporter: Peter Simon


* Define a queue with restrictive administerApps settings (e.g. yarn)
 * Set yarn.admin.acl to "*".
 * Try to submit an application as user yarn; it is denied.

My expected behaviour would be that, since everyone is an admin, I can submit 
to any pool.





[jira] [Created] (YARN-9444) YARN Api Resource Utils's getRequestedResourcesFromConfig doesn't recognise yarn.io/gpu as a valid resource type

2019-04-05 Thread Gergely Pollak (JIRA)
Gergely Pollak created YARN-9444:


 Summary: YARN Api Resource Utils's getRequestedResourcesFromConfig doesn't recognise yarn.io/gpu as a valid resource type
 Key: YARN-9444
 URL: https://issues.apache.org/jira/browse/YARN-9444
 Project: Hadoop YARN
  Issue Type: Bug
  Components: api
Reporter: Gergely Pollak
Assignee: Gergely Pollak


The original issue was that the jobclient test did not send the requested 
resource type when it was specified on the command line, e.g.:

hadoop jar hadoop-mapreduce-client-jobclient-tests.jar sleep 
-Dmapreduce.reduce.resource.yarn.io/gpu=1  -m 10 -r 1 -mt 9

After some investigation, it turned out that this only affects resource types 
whose names contain '.' characters, and the root cause is the regexp in the 
getRequestedResourcesFromConfig method:

"^" + Pattern.quote(prefix) + "[^.]+$"

This regexp explicitly forbids any dots in the resource type name, which is 
inconsistent with the default resource types for GPU and FPGA, which are 
yarn.io/gpu and yarn.io/fpga respectively.
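
The effect of that regexp can be reproduced in a few lines (the prefix below is assumed for illustration):

{code:java}
import java.util.regex.Pattern;

public class ResourceTypeRegexDemo {
  public static void main(String[] args) {
    String prefix = "mapreduce.reduce.resource."; // assumed prefix
    Pattern p = Pattern.compile("^" + Pattern.quote(prefix) + "[^.]+$");

    // No dot after the prefix: accepted.
    System.out.println(p.matcher(prefix + "memory-mb").matches()); // true
    // "yarn.io/gpu" contains dots, so [^.]+ rejects it.
    System.out.println(p.matcher(prefix + "yarn.io/gpu").matches()); // false
  }
}
{code}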







[jira] [Resolved] (YARN-8647) Add a flag to disable move app between queues

2019-04-05 Thread Abhishek Modi (JIRA)


 [ https://issues.apache.org/jira/browse/YARN-8647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abhishek Modi resolved YARN-8647.
-
Resolution: Won't Fix

> Add a flag to disable move app between queues
> -
>
> Key: YARN-8647
> URL: https://issues.apache.org/jira/browse/YARN-8647
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.7.3
>Reporter: sarun singla
>Assignee: Abhishek Modi
>Priority: Critical
>
> For large clusters where many users submit applications, we can run into 
> scenarios where app developers move their applications between queues using 
> something like 
> {code:java}
> yarn application -movetoqueue <appId> -queue <queueName>{code}
> Today there is no way to disable this feature if one does not want 
> application developers to use it.
> *Solution:*
> We should probably add an option on the RM side to disable the move-to-queue 
> feature at the cluster level.


