[jira] [Commented] (YARN-9385) YARN Services with simple authentication doesn't respect current UGI

2019-03-15 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793902#comment-16793902
 ] 

Todd Lipcon commented on YARN-9385:
---

+1, lgtm, thanks Eric

> YARN Services with simple authentication doesn't respect current UGI
> 
>
> Key: YARN-9385
> URL: https://issues.apache.org/jira/browse/YARN-9385
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: security, yarn-native-services
>Reporter: Todd Lipcon
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-9385.001.patch, YARN-9385.002.patch, 
> YARN-9385.003.patch, YARN-9385.004.patch, YARN-9385.005.patch
>
>
> The ApiServiceClient implementation appends the current username to the 
> request URL for "simple" authentication. However, that username is derived 
> from the 'user.name' system property instead of the current UGI. That means 
> that username spoofing via the 'HADOOP_USER_NAME' variable doesn't take 
> effect for HTTP-based calls in the same manner that it does for RPC-based 
> calls.






[jira] [Created] (YARN-9385) YARN Services with simple authentication doesn't respect current UGI

2019-03-13 Thread Todd Lipcon (JIRA)
Todd Lipcon created YARN-9385:
-

 Summary: YARN Services with simple authentication doesn't respect 
current UGI
 Key: YARN-9385
 URL: https://issues.apache.org/jira/browse/YARN-9385
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: security, yarn-native-services
Reporter: Todd Lipcon


The ApiServiceClient implementation appends the current username to the request 
URL for "simple" authentication. However, that username is derived from the 
'user.name' system property instead of the current UGI. That means that 
username spoofing via the 'HADOOP_USER_NAME' variable doesn't take effect for 
HTTP-based calls in the same manner that it does for RPC-based calls.
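
To illustrate the mismatch, here is a minimal sketch (assuming hadoop-common on 
the classpath; this is not the ApiServiceClient code) of the two username sources 
involved: under simple auth the current UGI honors HADOOP_USER_NAME, while the 
system property does not.

{code}
import org.apache.hadoop.security.UserGroupInformation;

public class WhoAmI {
  public static void main(String[] args) throws Exception {
    // With "simple" auth, the current UGI honors the HADOOP_USER_NAME
    // environment variable, which is why spoofing works for RPC calls:
    System.out.println("UGI short name: "
        + UserGroupInformation.getCurrentUser().getShortUserName());
    // The system property is always the OS login user and ignores
    // HADOOP_USER_NAME; this is what the HTTP path currently uses:
    System.out.println("user.name prop: " + System.getProperty("user.name"));
  }
}
{code}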






[jira] [Commented] (YARN-9385) YARN Services with simple authentication doesn't respect current UGI

2019-03-13 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16791981#comment-16791981
 ] 

Todd Lipcon commented on YARN-9385:
---

I noticed that the 'user.name' request parameter setting is done in two 
different ways in this file. In the getRMWebAddress() function it's correctly 
using UserGroupInformation to get the username, whereas in 
appendUserNameIfRequired() it's using the Java system property. It seems that 
replacing the use of the system property with the UGI-based short name in 
appendUserNameIfRequired() is the way to fix this.

However, I noticed one other inconsistency: appendUserNameIfRequired() decides 
whether to append the username based on the HTTP authentication configuration 
setting, whereas getRMWebAddress() bases that decision on whether Kerberos is 
enabled. Which of the two is correct? It seems like the former (the 
HTTP-specific setting) is probably the more appropriate one.
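
To make that concrete, a minimal sketch of the suggested change (not the actual 
patch; the method name mirrors the description above, it assumes hadoop-common on 
the classpath, and it uses the Kerberos check for brevity since the 
HTTP-auth-setting question above is orthogonal):

{code}
import java.io.IOException;
import org.apache.hadoop.security.UserGroupInformation;

final class UserNameParam {
  // Sketch only: derive the user.name query parameter from the current UGI's
  // short name instead of the 'user.name' system property.
  static String appendUserNameIfRequired(String url) throws IOException {
    if (UserGroupInformation.isSecurityEnabled()) {
      return url; // Kerberos/SPNEGO path: no user.name parameter needed
    }
    String user = UserGroupInformation.getCurrentUser().getShortUserName();
    return url + (url.contains("?") ? "&" : "?") + "user.name=" + user;
  }
}
{code}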

> YARN Services with simple authentication doesn't respect current UGI
> 
>
> Key: YARN-9385
> URL: https://issues.apache.org/jira/browse/YARN-9385
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: security, yarn-native-services
>Reporter: Todd Lipcon
>Priority: Major
>
> The ApiServiceClient implementation appends the current username to the 
> request URL for "simple" authentication. However, that username is derived 
> from the 'user.name' system property instead of the current UGI. That means 
> that username spoofing via the 'HADOOP_USER_NAME' variable doesn't take 
> effect for HTTP-based calls in the same manner that it does for RPC-based 
> calls.






[jira] [Created] (YARN-2490) Bad links to jobhistory server

2014-09-02 Thread Todd Lipcon (JIRA)
Todd Lipcon created YARN-2490:
-

 Summary: Bad links to jobhistory server
 Key: YARN-2490
 URL: https://issues.apache.org/jira/browse/YARN-2490
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Todd Lipcon


If you run an MR/YARN cluster without configuring the jobhistory URL, you get 
some really bad usability:
- your jobs still produce JobHistory links
- the job history link goes to whichever NM the AM happened to run on

Even if you run the job history server on the same server as the RM, your links 
will be incorrect unless you've explicitly configured its hostname.

If JobHistory isn't running, we shouldn't produce URLs (or we should embed 
JobHistory inside the RM by default). If we require a hostname other than 0.0.0.0, 
we should refuse to start the JH server when it's configured with 0.0.0.0.
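
For reference, a hedged sketch of the explicit configuration that avoids the bad 
links (standard MR JobHistory property names; the hostname is a placeholder):

{code}
import org.apache.hadoop.conf.Configuration;

public class JobHistoryConf {
  // Point both the JHS IPC address and its web UI address at a real host so
  // that AM-generated history links resolve somewhere useful.
  public static Configuration withHistoryServer(Configuration conf) {
    conf.set("mapreduce.jobhistory.address", "jhs.example.com:10020");
    conf.set("mapreduce.jobhistory.webapp.address", "jhs.example.com:19888");
    return conf;
  }
}
{code}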





[jira] [Created] (YARN-2491) Speculative attempts should not run on the same node as their original attempt

2014-09-02 Thread Todd Lipcon (JIRA)
Todd Lipcon created YARN-2491:
-

 Summary: Speculative attempts should not run on the same node as 
their original attempt
 Key: YARN-2491
 URL: https://issues.apache.org/jira/browse/YARN-2491
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 3.0.0
Reporter: Todd Lipcon


I'm seeing a behavior on trunk with fair scheduler enabled where a speculative 
reduce attempt is getting run on the same node as its original attempt. This 
doesn't make sense -- the main reason for speculative execution is to deal with 
a slow node, so scheduling a second attempt on the same node would just make 
the problem worse if anything.
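
As a hedged sketch of one possible fix from the AM side (not the actual MR AM or 
FairScheduler change; 'originalHost' is a placeholder), the AM could blacklist the 
original attempt's node before requesting the speculative container:

{code}
import java.util.Collections;
import org.apache.hadoop.yarn.client.api.AMRMClient;

final class SpeculationPlacement {
  // Blacklist the host running the original attempt so the scheduler places
  // the speculative attempt's container on a different node.
  static void excludeOriginalHost(AMRMClient<?> client, String originalHost) {
    client.updateBlacklist(
        Collections.singletonList(originalHost),  // blacklist additions
        null);                                    // blacklist removals
  }
}
{code}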





[jira] [Commented] (YARN-1796) container-executor shouldn't require o-r permissions

2014-03-17 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938828#comment-13938828
 ] 

Todd Lipcon commented on YARN-1796:
---

Patch looks good to me. +1 pending Jenkins.

 container-executor shouldn't require o-r permissions
 

 Key: YARN-1796
 URL: https://issues.apache.org/jira/browse/YARN-1796
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.4.0
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
Priority: Minor
 Attachments: YARN-1796.patch


 The container-executor currently checks that other users don't have read 
 permissions. This is unnecessary and runs contrary to the Debian packaging 
 policy manual.
 This is the analogous fix for YARN that was done for MR1 in MAPREDUCE-2103.





[jira] [Commented] (YARN-1795) After YARN-713, using FairScheduler can cause an InvalidToken Exception for NMTokens

2014-03-14 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935937#comment-13935937
 ] 

Todd Lipcon commented on YARN-1795:
---

I'm seeing this on a real cluster, too, without running Oozie. Out of a job 
with 1000 tasks I typically see a few tasks early in the job's lifetime (first 
wave of task assignment) fail, all on the same host. EG:

{code}
14/03/14 19:15:38 INFO mapreduce.Job:  map 0% reduce 0%
14/03/14 19:15:42 INFO mapreduce.Job: Task Id : 
attempt_1394818402366_5229_m_66_0, Status : FAILED
Container launch failed for container_1394818402366_5229_01_74 : 
org.apache.hadoop.security.token.SecretManager$InvalidToken: No NMToken sent 
for d2208.halxg.cloudera.com:8041
at 
org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:206)
at 
org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.init(ContainerManagementProtocolProxy.java:196)
at 
org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117)
at 
org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)
at 
org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)
at 
org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)

14/03/14 19:15:42 INFO mapreduce.Job: Task Id : 
attempt_1394818402366_5229_m_000107_0, Status : FAILED
Container launch failed for container_1394818402366_5229_01_000118 : 
org.apache.hadoop.security.token.SecretManager$InvalidToken: No NMToken sent 
for d2208.halxg.cloudera.com:8041
at 
org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:206)
at 
org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.init(ContainerManagementProtocolProxy.java:196)
at 
org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117)
at 
org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)
at 
org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)
at 
org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)

14/03/14 19:15:51 INFO mapreduce.Job: Task Id : 
attempt_1394818402366_5229_m_66_1, Status : FAILED
Container launch failed for container_1394818402366_5229_01_000135 : 
org.apache.hadoop.security.token.SecretManager$InvalidToken: No NMToken sent 
for d2208.halxg.cloudera.com:8041
at 
org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:206)
at 
org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.init(ContainerManagementProtocolProxy.java:196)
at 
org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117)
at 
org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)
at 
org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)
at 
org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
{code}



 After YARN-713, using FairScheduler can cause an InvalidToken Exception for 
 NMTokens
 

 Key: YARN-1795
 URL: https://issues.apache.org/jira/browse/YARN-1795
 Project: Hadoop YARN
  Issue Type: Bug

[jira] [Commented] (YARN-1029) Allow embedding leader election into the RM

2013-12-13 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13848168#comment-13848168
 ] 

Todd Lipcon commented on YARN-1029:
---

I agree with Karthik here -- the main reasons to pursue a separate ZKFC in HDFS 
were:
- avoid failing over due to a GC pause (since the ZKFC has a very low heap 
requirement) while still failing over quickly on machine failure.
- avoid adding any dependency on ZK within the NN
- allow the option to use other resource managers
-- in practice no one has done this and I think the extra complexity all of our 
pluggability introduces is not worth it

In the case of RM HA, as I understand it (apologies if I got anything wrong - 
only tangentially followed this discussion):
- RM HA uses ZK itself for shared storage, so it already has a dependency on ZK.
- Given that the shared state is in ZK, we don't need fencing if the same ZK 
client does the election. The reason is that, if an RM loses its ZK lease, it will 
simultaneously trigger the failover _and_ be unable to make further changes in 
ZK. This is exactly the semantics we want.

Having a separate ZKFC actually complicates things, because we may have to 
reintroduce some kind of fencing. What does it mean if the ZKFC loses its ZK 
lease, but the RM itself continues to have access to ZK? It doubles the 
'state diagram' and doesn't seem to offer any particular advantages.

As for embedding the ZKFC (and refactoring it so it can (a) not do health checks, 
(b) control the RM directly rather than via RPC, and (c) re-use the same ZK 
session): that seems more complicated than it's worth. Given that we'd be throwing 
away all of the ZKFC features beyond the elector, why not just use the elector?

I'm also not sure why we want to preserve the external ZKFC option - per 
above it's a more complicated deployment scenario and seems to offer little 
tangible benefit.
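
To make "just use the elector" concrete, a rough sketch of the callback the RM 
would register with the elector (the transition bodies are placeholders, not the 
actual RM HA code):

{code}
import org.apache.hadoop.ha.ActiveStandbyElector;
import org.apache.hadoop.ha.ServiceFailedException;

class EmbeddedElectorCallback
    implements ActiveStandbyElector.ActiveStandbyElectorCallback {
  @Override
  public void becomeActive() throws ServiceFailedException {
    // transition the RM to Active and start serving
  }
  @Override
  public void becomeStandby() {
    // transition the RM to Standby
  }
  @Override
  public void enterNeutralMode() {
    // ZK connection lost: neither active nor standby until re-elected
  }
  @Override
  public void notifyFatalError(String errorMessage) {
    // unrecoverable elector error: abort or re-initialize
  }
  @Override
  public void fenceOldActive(byte[] oldActiveData) {
    // no external fencing needed: an RM that lost its ZK lease has also lost
    // the ability to write to the ZK-backed store
  }
}
{code}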

 Allow embedding leader election into the RM
 ---

 Key: YARN-1029
 URL: https://issues.apache.org/jira/browse/YARN-1029
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: embedded-zkfc-approach.patch, yarn-1029-0.patch, 
 yarn-1029-0.patch, yarn-1029-approach.patch


 It should be possible to embed common ActiveStandyElector into the RM such 
 that ZooKeeper based leader election and notification is in-built. In 
 conjunction with a ZK state store, this configuration will be a simple 
 deployment option.





[jira] [Commented] (YARN-1253) Changes to LinuxContainerExecutor to use cgroups in unsecure mode

2013-10-01 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783236#comment-13783236
 ] 

Todd Lipcon commented on YARN-1253:
---

On the security front, I see this as an improvement in compartmentalization. 
Sometimes people have an HDFS cluster with lax security concerns for the data 
within that cluster -- or already restrict access to submit jobs on the cluster 
to folks who are considered trusted to access all of the data. Given that, the 
concern of a malicious user masquerading as another on the cluster isn't a big 
one -- or else they'd set up Kerberos security as you mentioned above.

That said, these clusters may still be configured with automatic NFS mounts, 
and non-Kerberized NFS, in which case the ability to masquerade as another Unix 
user is a big problem.

Let me give an example from my past -- a university CS department where I 
helped out as a system administrator. In this environment, all users used 
non-Kerberized NFSv3 to access a big filer with home directories. Users on the 
sysadmin staff had a great deal of access on the filer, as well as general 
sudo-type access, sensitive SSH keys, etc. stored within their home directories. 
Obviously, students did not. In this same environment, we had shared compute 
clusters (typically traditional HPCC, but some early experiments with Hadoop as 
well back in ancient days). Different grad students shared these compute 
clusters to perform their research jobs on, but security would not have been an 
important consideration - within a trusted environment like a small university 
research department, convenience outweighed security. That said, resource 
allocation and isolation between users was important - I had many cases I had 
to handle where students or professors up against a paper deadline got pretty 
pissed off that some undergrad was monopolizing CPU cycles on shared machines.

In such an environment, the setup proposed by this JIRA would help. We could 
not have simply used LCE, because that would have opened an attack vector: any 
student could submit a job as toddlipcon and then use my NFS access to 
essentially gain department-wide root. Without using LCE, there would be poor 
resource isolation (lacking cgroups).

I'm certain that there are other environments as well where Unix user 
masquerading opens a lot of attack vectors, but where within-Hadoop strong auth 
is not a requirement.





 Changes to LinuxContainerExecutor to use cgroups in unsecure mode
 -

 Key: YARN-1253
 URL: https://issues.apache.org/jira/browse/YARN-1253
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur
Assignee: Roman Shaposhnik
Priority: Blocker

 When using cgroups, we require the LCE to be configured in the cluster to start 
 containers. 
 The LCE starts containers as the user that submitted the job. While this 
 works correctly in a secure setup, in an un-secure setup this presents a 
 couple of issues:
 * LCE requires all Hadoop users submitting jobs to be Unix users on all nodes
 * Because users can impersonate other users, any user would have access to 
 any local file of other users
 In particular, the second issue is undesirable, as a user could get access to 
 the SSH keys of other users on the nodes or, if there are NFS mounts, get to 
 other users' data outside of the cluster.





[jira] [Commented] (YARN-1253) Changes to LinuxContainerExecutor to run containers as a single dedicated user in non-secure mode

2013-10-01 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783318#comment-13783318
 ] 

Todd Lipcon commented on YARN-1253:
---

bq. We should refactor that code out to be able to use it as a standalone 
library/binary (which doesn't bring in the extra baggage of user-accounts etc.) 
- that's the correct fix IMO. Putting in a local-user is an easy short-term 
solution

I think separating the local run-as user from the daemon user has other 
benefits as well, separate from cgroups. This is a long-standing tradition in 
Unix services - e.g., Apache httpd typically runs CGI scripts as nobody unless 
suexec is configured. So this change still has value.
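
For reference, a hedged sketch of how I'd expect the dedicated-user mode to be 
configured once this lands (property names are my best understanding of this 
change; 'nobody' is just the conventional choice):

{code}
import org.apache.hadoop.conf.Configuration;

public class NonSecureLceConf {
  // Use the LCE for cgroups, but run every container as one dedicated local
  // user rather than as the (possibly spoofed) submitting user.
  public static Configuration withDedicatedUser(Configuration conf) {
    conf.set("yarn.nodemanager.container-executor.class",
        "org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor");
    conf.set("yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user",
        "nobody");
    return conf;
  }
}
{code}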

 Changes to LinuxContainerExecutor to run containers as a single dedicated 
 user in non-secure mode
 -

 Key: YARN-1253
 URL: https://issues.apache.org/jira/browse/YARN-1253
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur
Assignee: Roman Shaposhnik
Priority: Blocker
 Attachments: YARN-1253.patch.txt


 When using cgroups, we require the LCE to be configured in the cluster to start 
 containers. 
 The LCE starts containers as the user that submitted the job. While this 
 works correctly in a secure setup, in an un-secure setup this presents a 
 couple of issues:
 * LCE requires all Hadoop users submitting jobs to be Unix users on all nodes
 * Because users can impersonate other users, any user would have access to 
 any local file of other users
 In particular, the second issue is undesirable, as a user could get access to 
 the SSH keys of other users on the nodes or, if there are NFS mounts, get to 
 other users' data outside of the cluster.





[jira] [Commented] (YARN-311) Dynamic node resource configuration on RM with JMX interface

2013-01-04 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13544082#comment-13544082
 ] 

Todd Lipcon commented on YARN-311:
--

Per the discussion in HADOOP-9160, I really don't think we should add anything 
which is only available over JMX. The security model, for one, is wildly 
incompatible with the rest of Hadoop security.

If the main reason for wanting JMX is so that other software can call these 
RPCs without the Hadoop jar, I'll counter and say we should go even farther and 
allow other software to not even require a JVM. I see two ways of doing this:
1) Implement a simple client in C or Python which speaks Hadoop RPC via protobuf
2) Add REST interfaces.

I am in favor of doing option 1. A single-threaded blocking RPC client without 
any connection pooling, etc., is not very difficult to write, and for 
administrative purposes that would be sufficient, right? If we had such a thing 
available as a relatively small C or Python library, would that solve 
the issue just as well?

 Dynamic node resource configuration on RM with JMX interface
 

 Key: YARN-311
 URL: https://issues.apache.org/jira/browse/YARN-311
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager, scheduler
Reporter: Junping Du
Assignee: Junping Du

 As the first step, we go for resource changes on the RM side and expose a JMX API. 
 For design details, please refer to the proposal and discussion in the parent 
 JIRA: YARN-291.
