[jira] [Commented] (MAPREDUCE-7523) MapReduce Task-Level Security Enforcement

ASF GitHub Bot (Jira) Wed, 26 Nov 2025 03:19:08 -0800


    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18040795#comment-18040795
 ]


ASF GitHub Bot commented on MAPREDUCE-7523:
-------------------------------------------

K0K0V0K commented on PR #8100:
URL: https://github.com/apache/hadoop/pull/8100#issuecomment-3580839466

   First of all, thanks @steveloughran for the detailed review!
   
   > So this is a kerberized? non kerberized? cluster where you want to 
restrict the specific code an untrusted user may execute. Presumably you have 
to complete the lockdown the ability for them to add any new classes to the 
classpath, otherwise they would just add a new mapper or reducer class.
   
   I tested this on a non-kerberized cluster, but you’re right, I will also 
verify it on a kerberized cluster.
   This feature is not intended to prevent certain types of mapper/reducer code 
from being uploaded to the YARN cluster.
   For example, a user might have an old legacy MapReduce library that cannot 
be removed from processes, otherwise their business could fall apart, like a 
Banksy artwork in a paper shredder.
   The purpose of this feature is that if you must use a specific MapReduce 
library, at least you can restrict the user domain that has access to it, which 
reduces the potential attack surface.
   This feature is not designed to mitigate the risk of users uploading custom 
mapper/reducer code.
   
   > Is there any write-up on the security model, attacks it is intended to 
defend against etc?
   
   Not yet, but I will add one to the docs.
   
   > If you are trying to stop untrusted users from having their classes loaded 
in the cluster –I don't think it is sufficient. You would have to audit every 
single place where we instantiate a class through Configuration. A quick scan 
of Configuration.getClass use shows there are some distcp references 
(distcp.copy.listing.class,...) and shows up mapreduce.chain.mapper as another 
mechanism used to chain together tasks -it'll need lockdown too.
   
   Yes, you’re right, this cannot prevent users from uploading their custom 
code. I hope this can be achieved through other security mechanisms.
   
   > I think I will need to see the design goal of the security measure, the 
threat model it intends to mitigate and whether you want that mitigation to be 
absolute or best-effort. I'm curious about which tasks you want to lock down as 
well while still having them on the classpath.
   
   I will add the goal to the TaskLevelSecurityEnforcement.md file of this PR, 
if that is OK with you.
   
   > I really think it'll be hard to stop someone sufficiently motivated from 
executing code in the cluster.
   
   100% agree, this is hardly achievable from Hadoop code.




> MapReduce Task-Level Security Enforcement
> -----------------------------------------
>
>                 Key: MAPREDUCE-7523
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7523
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2
>            Reporter: Bence Kosztolnik
>            Priority: Major
>              Labels: pull-request-available
>
> h2. Overview
> The goal of this feature is to provide a configurable mechanism to control 
> which users are allowed to execute specific MapReduce jobs. 
> This feature aims to prevent unauthorized or potentially harmful 
> mapper/reducer implementations from running within the Hadoop cluster.
> In the standard Hadoop MapReduce execution flow:
> 1) A MapReduce job is submitted by a user.
> 2) The job is registered with the Resource Manager (RM).
> 3) The RM assigns the job to a Node Manager (NM), where the Application 
> Master (AM) for the job is launched.
> 4) The AM requests additional containers from the cluster, to be able to 
> start tasks.
> 5) The NM launches those containers, and the containers execute the 
> mapper/reducer tasks defined by the job.
> The proposed feature introduces a security filtering mechanism inside the 
> Application Master. 
> Before mapper or reducer tasks are launched, the AM will verify that the 
> user-submitted MapReduce code complies with a cluster-defined security 
> policy. 
> This ensures that only approved classes or packages can be executed inside 
> the containers.
> The goal is to protect the cluster from unwanted or unsafe task 
> implementations, such as custom code that may introduce performance, 
> stability, or security risks.
> Upon receiving job metadata, the Application Master will:
> 1) Check whether the feature is enabled.
> 2) Check whether the user who submitted the job is allowed to bypass the 
> security check.
> 3) Compare the classes in the job config against the denied task list.
> 4) If the job is not authorised, an exception is thrown and the AM fails.
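
The four Application Master steps above can be sketched roughly as follows. This is not the code from the PR: the class `TaskSecurityCheckSketch`, the use of a plain `Map` in place of Hadoop's `Configuration`, the choice of `SecurityException`, the abbreviated two-key default domain, and the matching rule (an entry matches the class itself or any class inside that package) are all assumptions made for illustration. Only the property names come from the description.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch; a plain Map stands in for the job configuration.
final class TaskSecurityCheckSketch {

    /** Splits a comma-separated config value into trimmed, non-empty entries. */
    static List<String> csv(String value) {
        if (value == null) {
            return List.of();
        }
        return Arrays.stream(value.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    /** Assumed rule: an entry matches that exact class or any class in that package. */
    static boolean denied(String className, List<String> deniedEntries) {
        for (String entry : deniedEntries) {
            if (className.equals(entry) || className.startsWith(entry + ".")) {
                return true;
            }
        }
        return false;
    }

    /** Throws SecurityException if the job references a denied class. */
    static void validate(Map<String, String> conf, String user) {
        // 1) Nothing to do when the feature is disabled (the default).
        if (!Boolean.parseBoolean(conf.getOrDefault("mapreduce.security.enabled", "false"))) {
            return;
        }
        // 2) Users on the allowed list bypass the check.
        if (csv(conf.get("mapreduce.security.allowed-users")).contains(user)) {
            return;
        }
        // 3) Look up every class named by a property-domain key.
        //    The fallback here is abbreviated; the description lists 15 default keys.
        List<String> domain = csv(conf.getOrDefault(
                "mapreduce.security.property-domain",
                "mapreduce.job.map.class,mapreduce.job.reduce.class"));
        List<String> deniedEntries = csv(conf.get("mapreduce.security.denied-tasks"));
        for (String key : domain) {
            String className = conf.get(key);
            if (className != null && denied(className.trim(), deniedEntries)) {
                // 4) Fail the job; in the real feature this makes the AM fail.
                throw new SecurityException(
                        "User " + user + " may not run " + className + " (set by " + key + ")");
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("mapreduce.security.enabled", "true");
        conf.put("mapreduce.security.denied-tasks", "org.apache.hadoop.streaming");
        conf.put("mapreduce.security.allowed-users", "alice,bob");
        conf.put("mapreduce.job.map.class", "org.apache.hadoop.streaming.PipeMapper");

        validate(conf, "alice"); // allowed user: returns normally
        try {
            validate(conf, "carol");
        } catch (SecurityException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Note that a check of this shape only inspects class names that appear under the listed configuration keys, which is why the discussion above treats it as a way to narrow who can use an existing library rather than a barrier against custom code.
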
> h2. New Configs
> h5. Enables MapReduce Task-Level Security Enforcement
> When enabled, the Application Master performs validation of user-submitted 
> mapper, reducer, and other task-related classes before launching containers.
> This mechanism protects the cluster from running disallowed or unsafe task 
> implementations as defined by administrator-controlled policies.
>  - Property name: mapreduce.security.enabled
>  - Property type: boolean
>  - Default: false (security disabled)
> h5. MapReduce Task-Level Security Enforcement: Property Domain
> Defines the set of MapReduce configuration keys that represent user-supplied 
> class names involved in task execution (e.g., mapper, reducer, partitioner).
> The Application Master examines the values of these properties and checks 
> whether any referenced class is listed in denied tasks.
> Administrators may override this list to expand or restrict the validation 
> domain.
>  - Property name: mapreduce.security.property-domain
>  - Property type: list of configuration keys
>  - Default:
>  * mapreduce.job.combine.class
>  * mapreduce.job.combiner.group.comparator.class
>  * mapreduce.job.end-notification.custom-notifier-class
>  * mapreduce.job.inputformat.class
>  * mapreduce.job.map.class
>  * mapreduce.job.map.output.collector.class
>  * mapreduce.job.output.group.comparator.class
>  * mapreduce.job.output.key.class
>  * mapreduce.job.output.key.comparator.class
>  * mapreduce.job.output.value.class
>  * mapreduce.job.outputformat.class
>  * mapreduce.job.partitioner.class
>  * mapreduce.job.reduce.class
>  * mapreduce.map.output.key.class
>  * mapreduce.map.output.value.class
> h5. MapReduce Task-Level Security Enforcement: Denied Tasks
> Specifies the list of disallowed task implementation classes or packages.
> If a user submits a job whose mapper, reducer, or other task-related classes 
> match any entry in this blacklist, the Application Master rejects the job.
>  - Property name: mapreduce.security.denied-tasks
>  - Property type: list of class name or package patterns
>  - Default: empty
>  - Example: 
> org.apache.hadoop.streaming,org.apache.hadoop.examples.QuasiMonteCarlo
> h5. MapReduce Task-Level Security Enforcement: Allowed Users
> Specifies users who may bypass the blacklist defined in denied tasks.
> This whitelist is intended for trusted or system-level workflows that may 
> legitimately require the use of restricted task implementations.
> If the submitting user is listed here, blacklist enforcement is skipped, 
> although standard Hadoop authentication and ACL checks still apply.
>  - Property name: mapreduce.security.allowed-users
>  - Property type: list of usernames
>  - Default: empty
>  - Example: alice,bob
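
As an illustration of how the properties described above might be set together, the sketch below builds an example policy from the property names and example values in the description and renders it in the standard Hadoop `<configuration>`/`<property>` XML layout. The class `SecurityPolicyConfigSketch` and its helper methods are hypothetical, and which configuration file an administrator would place these entries in is not stated in the description.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; not part of Hadoop or of the PR.
final class SecurityPolicyConfigSketch {

    /** Example policy using the property names and example values from the description. */
    static Map<String, String> examplePolicy() {
        Map<String, String> policy = new LinkedHashMap<>();
        policy.put("mapreduce.security.enabled", "true");
        policy.put("mapreduce.security.denied-tasks",
                "org.apache.hadoop.streaming,org.apache.hadoop.examples.QuasiMonteCarlo");
        policy.put("mapreduce.security.allowed-users", "alice,bob");
        return policy;
    }

    /**
     * Renders the entries as Hadoop-style configuration XML.
     * No XML escaping is done; the example values contain no special characters.
     */
    static String toHadoopXml(Map<String, String> props) {
        StringBuilder sb = new StringBuilder("<configuration>\n");
        for (Map.Entry<String, String> e : props.entrySet()) {
            sb.append("  <property>\n")
              .append("    <name>").append(e.getKey()).append("</name>\n")
              .append("    <value>").append(e.getValue()).append("</value>\n")
              .append("  </property>\n");
        }
        return sb.append("</configuration>\n").toString();
    }

    public static void main(String[] args) {
        System.out.print(toHadoopXml(examplePolicy()));
    }
}
```

With this policy, any user other than alice or bob who submits a streaming job or the QuasiMonteCarlo example would be rejected, while mapreduce.security.property-domain is left at its default.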



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

