[jira] [Created] (YARN-7889) Missing kerberos token when check for RM REST API availability

2018-02-02 Thread Eric Yang (JIRA)
Eric Yang created YARN-7889:
---

 Summary: Missing kerberos token when check for RM REST API 
availability
 Key: YARN-7889
 URL: https://issues.apache.org/jira/browse/YARN-7889
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn-native-services
Affects Versions: 3.1.0
Reporter: Eric Yang


When checking which ResourceManager can be used for REST API requests, the 
client side must send a Kerberos token to the REST API endpoint.  The checking 
mechanism currently omits the Kerberos token.
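For illustration only (not the actual patch; the endpoint and helper name are 
placeholders), the availability check could attach a Kerberos/SPNEGO token using 
the hadoop-auth client classes, e.g.:

{code:java}
import java.net.HttpURLConnection;
import java.net.URL;
import org.apache.hadoop.security.authentication.client.AuthenticatedURL;
import org.apache.hadoop.security.authentication.client.KerberosAuthenticator;

public class RmRestAvailabilityCheck {
  // Hypothetical sketch: probe an RM REST endpoint with a Kerberos/SPNEGO token
  // instead of an unauthenticated connection, so secure clusters accept the probe.
  public static boolean isRmRestAvailable(String rmWebAppUrl) {
    try {
      AuthenticatedURL.Token token = new AuthenticatedURL.Token();
      AuthenticatedURL authUrl = new AuthenticatedURL(new KerberosAuthenticator());
      HttpURLConnection conn =
          authUrl.openConnection(new URL(rmWebAppUrl + "/ws/v1/cluster/info"), token);
      return conn.getResponseCode() == HttpURLConnection.HTTP_OK;
    } catch (Exception e) {
      return false;
    }
  }
}
{code}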






[jira] [Created] (YARN-7888) container-log4j.properties is in hadoop-yarn-node-manager.jar

2018-02-02 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-7888:


 Summary: container-log4j.properties is in 
hadoop-yarn-node-manager.jar
 Key: YARN-7888
 URL: https://issues.apache.org/jira/browse/YARN-7888
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Haibo Chen


The NM sets up log4j for containers using the container-log4j.properties file in 
its own jar. Ideally, however, we should not expose server-side jars to containers.
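One possible direction, sketched below purely for illustration (the target path 
and the launch-time hook are assumptions, not an agreed fix): have the NM write 
the properties file into the container's working directory at launch, so the NM 
jar no longer needs to be on the container classpath.

{code:java}
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class ContainerLog4jSetup {
  // Hedged sketch: copy container-log4j.properties out of the NM's own classpath
  // into the container working directory, then point -Dlog4j.configuration at that copy.
  public static Path writeLog4jProperties(Path containerWorkDir) throws Exception {
    Path target = containerWorkDir.resolve("container-log4j.properties");
    try (InputStream in = ContainerLog4jSetup.class
        .getResourceAsStream("/container-log4j.properties")) {
      Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }
    return target;
  }
}
{code}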






Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2018-02-02 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/677/

[Feb 2, 2018 2:03:01 AM] (aengineer) HDFS-12942. Synchronization issue in 
FSDataSetImpl#moveBlock.
[Feb 2, 2018 3:25:41 AM] (yqlin) HDFS-13068. RBF: Add router admin option to 
manage safe mode.
[Feb 2, 2018 5:34:07 AM] (aajisaka) HDFS-13048. LowRedundancyReplicatedBlocks 
metric can be negative
[Feb 2, 2018 5:33:26 PM] (jlowe) HADOOP-15170. Add symlink support to 
FileUtil#unTarUsingJava.
[Feb 2, 2018 6:28:22 PM] (arun suresh) YARN-7839. Modify PlacementAlgorithm to 
Check node capacity before
[Feb 2, 2018 7:10:47 PM] (jianhe) YARN-7868. Provide improved error message 
when YARN service is disabled.
[Feb 2, 2018 7:37:51 PM] (arp) HADOOP-15198. Correct the spelling in 
CopyFilter.java. Contributed by
[Feb 2, 2018 8:51:27 PM] (hanishakoneru) HADOOP-15168. Add kdiag tool to hadoop 
command. Contributed by Bharat
[Feb 2, 2018 10:38:33 PM] (jianhe) YARN-7831. YARN Service CLI should use 
hadoop.http.authentication.type
[Feb 2, 2018 10:46:20 PM] (kkaranasos) YARN-7778. Merging of placement 
constraints defined at different levels.
[Feb 3, 2018 12:28:03 AM] (hanishakoneru) HDFS-13073. Cleanup code in 
InterQJournalProtocol.proto. Contributed by
[Feb 3, 2018 12:48:57 AM] (szegedim) YARN-7879. NM user is unable to access the 
application filecache due to
[Feb 3, 2018 1:18:42 AM] (weichiu) HDFS-11187. Optimize disk access for last 
partial chunk checksum of


[jira] [Created] (YARN-7887) SchedulingMonitor#PolicyInvoker catches all throwables

2018-02-02 Thread Young Chen (JIRA)
Young Chen created YARN-7887:


 Summary: SchedulingMonitor#PolicyInvoker catches all throwables 
 Key: YARN-7887
 URL: https://issues.apache.org/jira/browse/YARN-7887
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Young Chen


SchedulingMonitor catches all Throwables, which prevents InvariantsCheckers from 
failing simulations when their invariants are violated. There should be a way to 
selectively allow SchedulingEditPolicies to propagate exceptions out of the 
SchedulingMonitor.
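A minimal sketch of one possible shape for this, purely illustrative (the 
fail-fast flag and its wiring are hypothetical, not existing YARN configuration):

{code:java}
import org.apache.hadoop.yarn.exceptions.YarnRuntimeException;

// Hedged sketch of a change inside SchedulingMonitor: PolicyInvoker rethrows instead
// of swallowing when a (hypothetical) fail-fast flag is set, so an InvariantsChecker
// can abort an SLS simulation when an invariant is violated.
private class PolicyInvoker implements Runnable {
  @Override
  public void run() {
    try {
      invokePolicy();                       // existing SchedulingMonitor method
    } catch (Throwable t) {
      if (policyFailFast) {                 // hypothetical switch, e.g. set by the policy
        throw new YarnRuntimeException(t);  // propagate out of the monitor
      }
      LOG.error("Exception raised while executing preemption checker", t);
    }
  }
}
{code}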






[jira] [Created] (YARN-7886) [GQ] Compare resource allocation achieved by rebalancing algorithms with single-cluster capacity scheduler allocation

2018-02-02 Thread Konstantinos Karanasos (JIRA)
Konstantinos Karanasos created YARN-7886:


 Summary: [GQ] Compare resource allocation achieved by rebalancing 
algorithms with single-cluster capacity scheduler allocation
 Key: YARN-7886
 URL: https://issues.apache.org/jira/browse/YARN-7886
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Konstantinos Karanasos
Assignee: Konstantinos Karanasos


Given a federated cluster, this Jira will enable us to compare the allocation 
achieved by our rebalancing algorithms with the allocation that the Capacity 
Scheduler would achieve if it were operating over a single big cluster with the 
same total resources as the federated cluster.






[jira] [Created] (YARN-7885) [GQ] Generator for queue hierarchies over federated clusters

2018-02-02 Thread Konstantinos Karanasos (JIRA)
Konstantinos Karanasos created YARN-7885:


 Summary: [GQ] Generator for queue hierarchies over federated 
clusters
 Key: YARN-7885
 URL: https://issues.apache.org/jira/browse/YARN-7885
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Konstantinos Karanasos
Assignee: Konstantinos Karanasos


This Jira will focus on generating random queue hierarchies with different 
total/used/pending resources across the sub-clusters of a federated cluster.






[jira] [Created] (YARN-7884) Race condition in registering YARN service in ZooKeeper

2018-02-02 Thread Eric Yang (JIRA)
Eric Yang created YARN-7884:
---

 Summary: Race condition in registering YARN service in ZooKeeper
 Key: YARN-7884
 URL: https://issues.apache.org/jira/browse/YARN-7884
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn-native-services
Affects Versions: 3.1.0
Reporter: Eric Yang


In a Kerberos-enabled cluster, there seems to be a race condition when 
registering a YARN service.

The yarn-service znode creation seems to happen after the AM has started and is 
reporting back to update component information.  The YARN service should have 
access to create the znode, but ZooKeeper reported NoAuth.
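For illustration only (the API calls below are from the YARN registry client, 
but the call site and ordering are assumptions, not the committed fix), the 
service path has to be created as the submitting user before the AM starts 
updating component records under it:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.registry.client.api.RegistryOperations;
import org.apache.hadoop.registry.client.api.RegistryOperationsFactory;
import org.apache.hadoop.registry.client.binding.RegistryUtils;

public class ServicePathSetup {
  // Hedged sketch: create /registry/users/<user>/services/yarn-service/<name> with
  // the submitting user's SASL ACL before the AM reports component information.
  public static void ensureServicePath(Configuration conf) throws Exception {
    RegistryOperations registry = RegistryOperationsFactory.createInstance(conf);
    registry.start();
    String path = RegistryUtils.servicePath("hbase", "yarn-service", "abc");
    registry.mknode(path, true);   // must succeed before the AM-side registration updates
    registry.stop();
  }
}
{code}

The AM log below shows the NoAuth failure observed when the AM reaches the path first.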

{code}
2018-02-02 22:53:30,442 [main] INFO  service.ServiceScheduler - Set registry 
user accounts: sasl:hbase
2018-02-02 22:53:30,471 [main] INFO  zk.RegistrySecurity - Registry default 
system acls: 
[1,s{'world,'anyone}
, 31,s{'sasl,'yarn}
, 31,s{'sasl,'jhs}
, 31,s{'sasl,'hdfs-demo}
, 31,s{'sasl,'rm}
, 31,s{'sasl,'hive}
]
2018-02-02 22:53:30,472 [main] INFO  zk.RegistrySecurity - Registry User ACLs 
[31,s{'sasl,'hbase}
, 31,s{'sasl,'hbase}
]
2018-02-02 22:53:30,503 [main] INFO  event.AsyncDispatcher - Registering class 
org.apache.hadoop.yarn.service.component.ComponentEventType for class 
org.apache.hadoop.yarn.service.ServiceScheduler$ComponentEventHandler
2018-02-02 22:53:30,504 [main] INFO  event.AsyncDispatcher - Registering class 
org.apache.hadoop.yarn.service.component.instance.ComponentInstanceEventType 
for class 
org.apache.hadoop.yarn.service.ServiceScheduler$ComponentInstanceEventHandler
2018-02-02 22:53:30,528 [main] INFO  impl.NMClientAsyncImpl - Upper bound of 
the thread pool size is 500
2018-02-02 22:53:30,531 [main] INFO  service.ServiceMaster - Starting service 
as user hbase/eyang-5.openstacklo...@example.com (auth:KERBEROS)
2018-02-02 22:53:30,545 [main] INFO  ipc.CallQueueManager - Using callQueue: 
class java.util.concurrent.LinkedBlockingQueue queueCapacity: 100 scheduler: 
class org.apache.hadoop.ipc.DefaultRpcScheduler
2018-02-02 22:53:30,554 [Socket Reader #1 for port 56859] INFO  ipc.Server - 
Starting Socket Reader #1 for port 56859
2018-02-02 22:53:30,589 [main] INFO  pb.RpcServerFactoryPBImpl - Adding 
protocol org.apache.hadoop.yarn.service.impl.pb.service.ClientAMProtocolPB to 
the server
2018-02-02 22:53:30,606 [IPC Server Responder] INFO  ipc.Server - IPC Server 
Responder: starting
2018-02-02 22:53:30,607 [IPC Server listener on 56859] INFO  ipc.Server - IPC 
Server listener on 56859: starting
2018-02-02 22:53:30,607 [main] INFO  service.ClientAMService - Instantiated 
ClientAMService at eyang-5.openstacklocal/172.26.111.20:56859
2018-02-02 22:53:30,609 [main] INFO  zk.CuratorService - Creating 
CuratorService with connection fixed ZK quorum "eyang-1.openstacklocal:2181" 
2018-02-02 22:53:30,615 [main] INFO  zk.RegistrySecurity - Enabling ZK sasl 
client: jaasClientEntry = Client, principal = 
hbase/eyang-5.openstacklo...@example.com, keytab = 
/etc/security/keytabs/hbase.service.keytab
2018-02-02 22:53:30,752 [main] INFO  client.RMProxy - Connecting to 
ResourceManager at eyang-1.openstacklocal/172.26.111.17:8032
2018-02-02 22:53:30,909 [main] INFO  service.ServiceScheduler - Registering 
appattempt_1517611904996_0001_01, abc into registry
2018-02-02 22:53:30,911 [main] INFO  service.ServiceScheduler - Received 0 
containers from previous attempt.
2018-02-02 22:53:31,072 [main] INFO  service.ServiceScheduler - Could not read 
component paths: `/users/hbase/services/yarn-service/abc/components': No such 
file or directory: KeeperErrorCode = NoNode for 
/registry/users/hbase/services/yarn-service/abc/components
2018-02-02 22:53:31,074 [main] INFO  service.ServiceScheduler - Triggering 
initial evaluation of component sleeper
2018-02-02 22:53:31,075 [main] INFO  component.Component - [INIT COMPONENT 
sleeper]: 2 instances.
2018-02-02 22:53:31,094 [main] INFO  component.Component - [COMPONENT sleeper] 
Transitioned from INIT to FLEXING on FLEX event.
2018-02-02 22:53:31,215 [pool-5-thread-1] ERROR service.ServiceScheduler - 
Failed to register app abc in registry
org.apache.hadoop.registry.client.exceptions.NoPathPermissionsException: 
`/registry/users/hbase/services/yarn-service/abc': Not authorized to access 
path; ACLs: [
0x01: 'world,'anyone
 0x1f: 'sasl,'yarn
 0x1f: 'sasl,'jhs
 0x1f: 'sasl,'hdfs-demo
 0x1f: 'sasl,'rm
 0x1f: 'sasl,'hive
 0x1f: 'sasl,'hbase
 0x1f: 'sasl,'hbase
 ]: KeeperErrorCode = NoAuth for /registry/users/hbase/services/yarn-service/abc
at 
org.apache.hadoop.registry.client.impl.zk.CuratorService.operationFailure(CuratorService.java:412)
at 
org.apache.hadoop.registry.client.impl.zk.CuratorService.zkCreate(CuratorService.java:637)
at 
org.apache.hadoop.registry.client.impl.zk.CuratorService.zkSet(CuratorService.java:679)
        at 
{code}

[jira] [Resolved] (YARN-7883) Make HAR tool support IndexedLogAggregtionController

2018-02-02 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong resolved YARN-7883.
-
Resolution: Duplicate

> Make HAR tool support IndexedLogAggregtionController
> 
>
> Key: YARN-7883
> URL: https://issues.apache.org/jira/browse/YARN-7883
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>Priority: Major
>
> In https://issues.apache.org/jira/browse/MAPREDUCE-6415, we created a 
> tool to combine aggregated logs into HAR files, which currently only works 
> for TFileLogAggregationFileController. We should make it support 
> IndexedLogAggregtionController as well.






[jira] [Created] (YARN-7883) Make HAR tool support IndexedLogAggregtionController

2018-02-02 Thread Xuan Gong (JIRA)
Xuan Gong created YARN-7883:
---

 Summary: Make HAR tool support IndexedLogAggregtionController
 Key: YARN-7883
 URL: https://issues.apache.org/jira/browse/YARN-7883
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong


In https://issues.apache.org/jira/browse/MAPREDUCE-6415, we created a tool to 
combine aggregated logs into HAR files, which currently only works for 
TFileLogAggregationFileController. We should make it support 
IndexedLogAggregtionController as well.






[jira] [Created] (YARN-7882) Server side proxy for UI2 log viewer

2018-02-02 Thread Eric Yang (JIRA)
Eric Yang created YARN-7882:
---

 Summary: Server side proxy for UI2 log viewer
 Key: YARN-7882
 URL: https://issues.apache.org/jira/browse/YARN-7882
 Project: Hadoop YARN
  Issue Type: Bug
  Components: security, timelineserver, yarn-ui-v2
Affects Versions: 3.0.0
Reporter: Eric Yang


When viewing container logs in UI2, the log files are fetched directly from 
Timeline Server 2.  In simple security mode, Hadoop has no authenticator to 
make sure the user is authorized to view the logs.  The general practice is to 
use Knox or another security proxy to authenticate the user and reverse-proxy 
the request to the Hadoop UI, so that information does not leak to anonymous 
users.  The current UI2 log viewer implementation makes Ajax calls directly to 
Timeline Server 2, which could prevent Knox or other reverse-proxy software 
from working properly with the new design.  It would be good to proxy these 
requests on the server side so the browser cannot sidestep the authentication 
check.
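A rough, hypothetical sketch of the idea (the endpoint path, field names, and 
ATSv2 route are illustrative only, not an existing API): the web application 
fetches the log from Timeline Server 2 itself and streams it back, so the 
browser never talks to ATSv2 directly and a Knox-style proxy can enforce 
authentication in one place.

{code:java}
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

@Path("/proxy/containerlogs")
public class ContainerLogProxyResource {
  // Hypothetical server-side proxy resource; the address and route are placeholders.
  private final String timelineWebAddress = "http://timeline-host:8188";

  @GET
  @Path("/{containerId}/{fileName}")
  @Produces(MediaType.TEXT_PLAIN)
  public Response proxyLog(@PathParam("containerId") String containerId,
                           @PathParam("fileName") String fileName) throws IOException {
    URL url = new URL(timelineWebAddress + "/ws/v1/applicationhistory/containers/"
        + containerId + "/logs/" + fileName);   // illustrative log route
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    int status = conn.getResponseCode();
    InputStream body = status < 400 ? conn.getInputStream() : conn.getErrorStream();
    return Response.status(status).entity(body).build();
  }
}
{code}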






Re: Apache Hadoop 3.0.1 Release plan

2018-02-02 Thread Chris Douglas
On Fri, Feb 2, 2018 at 10:22 AM, Arpit Agarwal  wrote:
> Do you plan to roll an RC with an uncommitted fix? That isn't the right 
> approach.

The fix will be committed to the release branch. We'll vote on the
release, and if it receives a majority of +1 votes then it becomes
3.0.1. That's how the PMC decides how to move forward. In this case,
that will also resolve whether or not it can be committed to trunk.

If this logic is unpersuasive, then we can require a 2/3 majority to
replace the codebase. Either way, the PMC will vote to define the
consensus view when it is not emergent.

> This issue has good visibility and enough discussion.

Yes, it has. We always prefer consensus to voting, but when discussion
reveals that complete consensus is impossible, we still need a way
forward. This is rare, and usually reserved for significant changes
(like merging YARN). Frankly, it's embarrassing to resort to it here,
but here we are.

> If there is a binding veto in effect then the change must be abandoned. Else 
> you should be able to proceed with committing. However, 3.0.0 must be called 
> out as an abandoned release if we commit it.

This is not accurate. A binding veto from any committer halts
progress, but the PMC sets the direction of the project. That includes
making decisions that are not universally accepted. -C

> On 2/1/18, 3:01 PM, "Lei Xu"  wrote:
>
> Sounds good to me, ATM.
>
> On Thu, Feb 1, 2018 at 2:34 PM, Aaron T. Myers  wrote:
> > Hey Anu,
> >
> > My feeling on HDFS-12990 is that we've discussed it quite a bit already 
> and
> > it doesn't seem at this point like either side is going to budge. I'm
> > certainly happy to have a phone call about it, but I don't expect that 
> we'd
> > make much progress.
> >
> > My suggestion is that we simply include the patch posted to HDFS-12990 
> in
> > the 3.0.1 RC and call this issue out clearly in the subsequent VOTE 
> thread
> > for the 3.0.1 release. Eddy, are you up for that?
> >
> > Best,
> > Aaron
> >
> > On Thu, Feb 1, 2018 at 1:13 PM, Lei Xu  wrote:
> >>
> >> +Xiao
> >>
> >> My understanding is that we will have this for 3.0.1.   Xiao, could
> >> you give your inputs here?
> >>
> >> On Thu, Feb 1, 2018 at 11:55 AM, Anu Engineer 
> 
> >> wrote:
> >> > Hi Eddy,
> >> >
> >> > Thanks for driving this release. Just a quick question, do we have 
> time
> >> > to close this issue?
> >> > https://issues.apache.org/jira/browse/HDFS-12990
> >> >
> >> > or are we abandoning it? I believe that this is the last window for 
> us
> >> > to fix this issue.
> >> >
> >> > Should we have a call and get this resolved one way or another?
> >> >
> >> > Thanks
> >> > Anu
> >> >
> >> > On 2/1/18, 10:51 AM, "Lei Xu"  wrote:
> >> >
> >> > Hi, All
> >> >
> >> > I just cut branch-3.0.1 from branch-3.0.  Please make sure all
> >> > patches
> >> > targeted to 3.0.1 being checked in both branch-3.0 and 
> branch-3.0.1.
> >> >
> >> > Thanks!
> >> > Eddy
> >> >
> >> > On Tue, Jan 9, 2018 at 11:17 AM, Lei Xu  
> wrote:
> >> > > Hi, All
> >> > >
> >> > > We have released Apache Hadoop 3.0.0 in December [1]. To 
> further
> >> > > improve the quality of release, we plan to cut branch-3.0.1 
> branch
> >> > > tomorrow for the preparation of Apache Hadoop 3.0.1 release. 
> The
> >> > focus
> >> > > of 3.0.1 will be fixing blockers (3), critical bugs (1) and bug
> >> > fixes
> >> > > [2].  No new features and improvement should be included.
> >> > >
> >> > > We plan to cut branch-3.0.1 tomorrow (Jan 10th) and vote for 
> RC on
> >> > Feb
> >> > > 1st, targeting for Feb 9th release.
> >> > >
> >> > > Please feel free to share your insights.
> >> > >
> >> > > [1]
> >> > https://www.mail-archive.com/general@hadoop.apache.org/msg07757.html
> >> > > [2] https://issues.apache.org/jira/issues/?filter=12342842
> >> > >
> >> > > Best,
> >> > > --
> >> > > Lei (Eddy) Xu
> >> > > Software Engineer, Cloudera
> >> >
> >> >
> >> >
> >> > --
> >> > Lei (Eddy) Xu
> >> > Software Engineer, Cloudera
> >> >
> >> >
> >> > -
> >> > To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> >> > For additional commands, e-mail: 
> common-dev-h...@hadoop.apache.org
> >> >
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Lei (Eddy) Xu
> >> Software Engineer, Cloudera
> >>
> >> 

Re: Apache Hadoop 3.0.1 Release plan

2018-02-02 Thread Arpit Agarwal
Hi Aaron/Lei,

Do you plan to roll an RC with an uncommitted fix? That isn't the right 
approach.

This issue has good visibility and enough discussion. If there is a binding 
veto in effect then the change must be abandoned. Else you should be able to 
proceed with committing. However, 3.0.0 must be called out as an abandoned 
release if we commit it.

Regards,
Arpit


On 2/1/18, 3:01 PM, "Lei Xu"  wrote:

Sounds good to me, ATM.

On Thu, Feb 1, 2018 at 2:34 PM, Aaron T. Myers  wrote:
> Hey Anu,
>
> My feeling on HDFS-12990 is that we've discussed it quite a bit already 
and
> it doesn't seem at this point like either side is going to budge. I'm
> certainly happy to have a phone call about it, but I don't expect that 
we'd
> make much progress.
>
> My suggestion is that we simply include the patch posted to HDFS-12990 in
> the 3.0.1 RC and call this issue out clearly in the subsequent VOTE thread
> for the 3.0.1 release. Eddy, are you up for that?
>
> Best,
> Aaron
>
> On Thu, Feb 1, 2018 at 1:13 PM, Lei Xu  wrote:
>>
>> +Xiao
>>
>> My understanding is that we will have this for 3.0.1.   Xiao, could
>> you give your inputs here?
>>
>> On Thu, Feb 1, 2018 at 11:55 AM, Anu Engineer 
>> wrote:
>> > Hi Eddy,
>> >
>> > Thanks for driving this release. Just a quick question, do we have time
>> > to close this issue?
>> > https://issues.apache.org/jira/browse/HDFS-12990
>> >
>> > or are we abandoning it? I believe that this is the last window for us
>> > to fix this issue.
>> >
>> > Should we have a call and get this resolved one way or another?
>> >
>> > Thanks
>> > Anu
>> >
>> > On 2/1/18, 10:51 AM, "Lei Xu"  wrote:
>> >
>> > Hi, All
>> >
>> > I just cut branch-3.0.1 from branch-3.0.  Please make sure all
>> > patches
>> > targeted to 3.0.1 being checked in both branch-3.0 and 
branch-3.0.1.
>> >
>> > Thanks!
>> > Eddy
>> >
>> > On Tue, Jan 9, 2018 at 11:17 AM, Lei Xu  wrote:
>> > > Hi, All
>> > >
>> > > We have released Apache Hadoop 3.0.0 in December [1]. To further
>> > > improve the quality of release, we plan to cut branch-3.0.1 
branch
>> > > tomorrow for the preparation of Apache Hadoop 3.0.1 release. The
>> > focus
>> > > of 3.0.1 will be fixing blockers (3), critical bugs (1) and bug
>> > fixes
>> > > [2].  No new features and improvement should be included.
>> > >
>> > > We plan to cut branch-3.0.1 tomorrow (Jan 10th) and vote for RC 
on
>> > Feb
>> > > 1st, targeting for Feb 9th release.
>> > >
>> > > Please feel free to share your insights.
>> > >
>> > > [1]
>> > https://www.mail-archive.com/general@hadoop.apache.org/msg07757.html
>> > > [2] https://issues.apache.org/jira/issues/?filter=12342842
>> > >
>> > > Best,
>> > > --
>> > > Lei (Eddy) Xu
>> > > Software Engineer, Cloudera
>> >
>> >
>> >
>> > --
>> > Lei (Eddy) Xu
>> > Software Engineer, Cloudera
>> >
>> >
>> > -
>> > To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>> > For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>> >
>> >
>> >
>>
>>
>>
>> --
>> Lei (Eddy) Xu
>> Software Engineer, Cloudera
>>
>> -
>> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
>> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>>
>



-- 
Lei (Eddy) Xu
Software Engineer, Cloudera






[jira] [Created] (YARN-7881) Add Log Aggregation Status API to the RM Webservice

2018-02-02 Thread JIRA
Gergely Novák created YARN-7881:
---

 Summary: Add Log Aggregation Status API to the RM Webservice
 Key: YARN-7881
 URL: https://issues.apache.org/jira/browse/YARN-7881
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: yarn
Reporter: Gergely Novák
Assignee: Gergely Novák


The old YARN UI has a page, /cluster/logaggregationstatus/\{app_id}, which shows 
the log aggregation status for all the nodes that ran containers for the given 
application. This information is not yet available through the RM REST API.
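A possible shape for the endpoint, sketched with illustrative names (the 
resource method, DAO type, and path are assumptions, not a final API); it would 
surface the per-node LogAggregationReport data the RM already tracks for each 
application:

{code:java}
// Hypothetical addition to RMWebServices; LogAggregationStatusInfo is an illustrative DAO.
@GET
@Path("/apps/{appid}/logaggregationstatus")
@Produces({ MediaType.APPLICATION_JSON, MediaType.APPLICATION_XML })
public LogAggregationStatusInfo getLogAggregationStatus(
    @PathParam("appid") String appId) {
  RMApp app = rm.getRMContext().getRMApps()
      .get(ApplicationId.fromString(appId));
  // One report per NM that ran containers for the application.
  return new LogAggregationStatusInfo(app.getLogAggregationReportsForApp());
}
{code}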






[jira] [Resolved] (YARN-7832) Logs page does not work for Running applications

2018-02-02 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G resolved YARN-7832.
---
Resolution: Not A Problem

Thanks [~yeshavora] for confirming. This is working fine with the Combined 
System Metrics Publisher mode.

> Logs page does not work for Running applications
> 
>
> Key: YARN-7832
> URL: https://issues.apache.org/jira/browse/YARN-7832
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Affects Versions: 3.0.0
>Reporter: Yesha Vora
>Assignee: Sunil G
>Priority: Critical
> Attachments: Screen Shot 2018-01-26 at 3.28.40 PM.png, 
> YARN-7832.001.patch
>
>
> Scenario
>  * Run yarn service application
>  * When application is Running, go to log page
>  * Select AttemptId and Container Id
> Logs are not shown in the UI. It complains "No log data available!"
>  
> Here 
> [http://xxx:8188/ws/v1/applicationhistory/containers/container_e07_1516919074719_0004_01_01/logs?_=1517009230358]
>  API fails with 500 Internal Server Error.
> {"exception":"WebApplicationException","message":"java.io.IOException: 
> ","javaClassName":"javax.ws.rs.WebApplicationException"}
> {code:java}
> GET 
> http://xxx:8188/ws/v1/applicationhistory/containers/container_e07_1516919074719_0004_01_01/logs?_=1517009230358
>  500 (Internal Server Error)
> (anonymous) @ VM779:1
> send @ vendor.js:572
> ajax @ vendor.js:548
> (anonymous) @ vendor.js:5119
> initializePromise @ vendor.js:2941
> Promise @ vendor.js:3005
> ajax @ vendor.js:5117
> ajax @ yarn-ui.js:1
> superWrapper @ vendor.js:1591
> query @ vendor.js:5112
> ember$data$lib$system$store$finders$$_query @ vendor.js:5177
> query @ vendor.js:5334
> fetchLogFilesForContainerId @ yarn-ui.js:132
> showLogFilesForContainerId @ yarn-ui.js:126
> run @ vendor.js:648
> join @ vendor.js:648
> run.join @ vendor.js:1510
> closureAction @ vendor.js:1865
> trigger @ vendor.js:302
> (anonymous) @ vendor.js:339
> each @ vendor.js:61
> each @ vendor.js:51
> trigger @ vendor.js:339
> d.select @ vendor.js:5598
> (anonymous) @ vendor.js:5598
> d.invoke @ vendor.js:5598
> d.trigger @ vendor.js:5598
> e.trigger @ vendor.js:5598
> (anonymous) @ vendor.js:5598
> d.invoke @ vendor.js:5598
> d.trigger @ vendor.js:5598
> (anonymous) @ vendor.js:5598
> dispatch @ vendor.js:306
> elemData.handle @ vendor.js:281{code}






[jira] [Created] (YARN-7880) FiCaSchedulerApp.commonCheckContainerAllocation throws NPE when running sls

2018-02-02 Thread Jiandan Yang (JIRA)
Jiandan Yang  created YARN-7880:
---

 Summary: FiCaSchedulerApp.commonCheckContainerAllocation throws 
NPE when running sls
 Key: YARN-7880
 URL: https://issues.apache.org/jira/browse/YARN-7880
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jiandan Yang 


18/02/02 20:54:28 INFO rmcontainer.RMContainerImpl: container_1517575125794_5707_01_86 Container Transitioned from ACQUIRED to RUNNING

java.lang.NullPointerException
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.commonCheckContainerAllocation(FiCaSchedulerApp.java:324)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.accept(FiCaSchedulerApp.java:420)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.tryCommit(CapacityScheduler.java:2506)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$ResourceCommitterService.run(CapacityScheduler.java:541)





[jira] [Created] (YARN-7879) NM user is unable to access the application filecache due to permissions

2018-02-02 Thread Shane Kumpf (JIRA)
Shane Kumpf created YARN-7879:
-

 Summary: NM user is unable to access the application filecache due 
to permissions
 Key: YARN-7879
 URL: https://issues.apache.org/jira/browse/YARN-7879
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Shane Kumpf


I noticed the following log entries where localization was being retried on 
several MR AM files. 
{code}
2018-02-02 02:53:02,905 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
 Resource 
/hadoop-yarn/usercache/hadoopuser/appcache/application_1517539453610_0001/filecache/11/job.jar
 is missing, localizing it again
2018-02-02 02:53:42,908 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl:
 Resource 
/hadoop-yarn/usercache/hadoopuser/appcache/application_1517539453610_0001/filecache/13/job.xml
 is missing, localizing it again
{code}

The cluster is configured to use LCE and 
{{yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user}} is set 
to a user ({{hadoopuser}}) that is in the {{hadoop}} group. The user has a 
umask of {{0002}}. The cluster is configured with 
{{fs.permissions.umask-mode=022}}, coming from {{core-default}}. Setting the 
local-user to {{nobody}}, who is not a login user or in the {{hadoop}} group, 
produces the same results.
{code}
[hadoopuser@y7001 ~]$ umask
0002
[hadoopuser@y7001 ~]$ id
uid=1003(hadoopuser) gid=1004(hadoopuser) groups=1004(hadoopuser),1001(hadoop)
{code}

The cause of the log entry was tracked down to a simple !file.exists() call in 
{{LocalResourcesTrackerImpl#isResourcePresent}}.
{code}
  public boolean isResourcePresent(LocalizedResource rsrc) {
    boolean ret = true;
    if (rsrc.getState() == ResourceState.LOCALIZED) {
      File file = new File(rsrc.getLocalPath().toUri().getRawPath().
          toString());
      if (!file.exists()) {
        ret = false;
      } else if (dirsHandler != null) {
        ret = checkLocalResource(rsrc);
      }
    }
    return ret;
  }
{code}

The Resources Tracker runs as the NM user, in this case {{yarn}}. The files 
being retried are in the filecache. The directories in the filecache are all 
owned by the local-user and the local-user's primary group, with 700 
permissions, which makes them unreadable by the {{yarn}} user.
{code}
[root@y7001 ~]# ls -la 
/hadoop-yarn/usercache/hadoopuser/appcache/application_1517540536531_0001/filecache
total 0
drwx--x---. 6 hadoopuser hadoop 46 Feb  2 03:06 .
drwxr-s---. 4 hadoopuser hadoop 73 Feb  2 03:07 ..
drwx------. 2 hadoopuser hadoopuser 61 Feb  2 03:05 10
drwx------. 3 hadoopuser hadoopuser 21 Feb  2 03:05 11
drwx------. 2 hadoopuser hadoopuser 45 Feb  2 03:06 12
drwx------. 2 hadoopuser hadoopuser 41 Feb  2 03:06 13
{code}
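
To make the failure mode concrete, a small illustration (paths taken from the 
log entries above, run as the NM user): {{File.exists()}} cannot stat through 
the 0700 {{hadoopuser}} directory, so the tracker concludes the resource is 
gone and re-localizes it.

{code:java}
import java.io.File;

public class FilecachePermCheck {
  public static void main(String[] args) {
    // Path layout taken from the log entries above.
    File jobJar = new File("/hadoop-yarn/usercache/hadoopuser/appcache/"
        + "application_1517539453610_0001/filecache/11/job.jar");
    // Run as the yarn user: exists() returns false because the parent directory "11"
    // is mode 700 and owned by hadoopuser, even though job.jar is really there.
    System.out.println("exists as current user? " + jobJar.exists());
  }
}
{code}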

I saw YARN-5287, but that appears to be related to a restrictive umask and the 
usercache itself. I was unable to locate any other known issues that seemed 
relevant. Is the above already known? A configuration issue?


