K0K0V0K commented on PR #8100: URL: https://github.com/apache/hadoop/pull/8100#issuecomment-3580839466
First of all, thanks @steveloughran for the detailed review!

> So this is a kerberized? non kerberized? cluster where you want to restrict the specific code an untrusted user may execute. Presumably you have to complete the lockdown the ability for them to add any new classes to the classpath, otherwise they would just add a new mapper or reducer class.

I tested this on a non-kerberized cluster, but you’re right, I will also verify it on a kerberized cluster.

This feature is not intended to prevent certain types of mapper/reducer code from being uploaded to the YARN cluster. For example, a user might have an old legacy MapReduce library that cannot be removed from their processes, otherwise their business could fall apart, like a Banksy artwork in a paper shredder. The purpose of this feature is that if you must use a specific MapReduce library, at least you can restrict the set of users that has access to it, which reduces the potential attack surface. This feature is not designed to mitigate the risk of users uploading custom mapper/reducer code.

> Is there any write-up on the security model, attacks it is intended to defend against etc?

Not yet, but I will add one to the docs.

> If you are trying to stop untrusted users from having their classes loaded in the cluster –I don't think it is sufficient. You would have to audit every single place where we instantiate a class through Configuration. A quick scan of Configuration.getClass use shows there are some distcp references (distcp.copy.listing.class,...) and shows up mapreduce.chain.mapper as another mechanism used to chain together tasks -it'll need lockdown too.

Yes, you’re right, this cannot prevent users from uploading their custom code. I hope this can be achieved through other security mechanisms.

> I think I will need to see the design goal of the security measure, the threat model it intends to mitigate and whether you want that mitigation to be absolute or best-effort.
> I'm curious about which tasks you want to lock down as well while still having them on the classpath.

I will add the goal to the TaskLevelSecurityEnforcement.md file of this PR, if that is OK with you.

> I really think it'll be hard to stop someone sufficiently motivated from executing code in the cluster.

100% agree, this is hardly achievable from Hadoop code.
