Repository: spark
Updated Branches:
  refs/heads/branch-2.0 ddf6dd88a -> 068500a4a


[SPARK-20239][CORE][2.1-BACKPORT] Improve HistoryServer's ACL mechanism

Current SHS (Spark History Server) has two different ACLs:

* ACL of the base URL, controlled by "spark.acls.enable" or 
"spark.ui.acls.enable". With this enabled, only users configured in 
"spark.admin.acls" (or groups) or "spark.ui.view.acls" (or groups), or the user 
who started the SHS, can list the applications; no one else can see the list. 
This also affects the REST APIs that list the summary of all apps or of one 
app.
* Per-application ACL, controlled by "spark.history.ui.acls.enable". With this 
enabled, only history admin users and the user/group who ran an app can access 
the details of that app.
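As an illustrative model (plain Scala, not Spark's actual classes or API), the two independent checks can be sketched like this; `canListApps` stands for the base-URL ACL and `canViewApp` for the per-application ACL, and all names here are hypothetical:

```scala
// Illustrative model of the two independent ACL checks (not Spark's API).
// Base-URL ACL: gates the application list and the listing REST APIs.
def canListApps(user: String, baseAclsEnabled: Boolean, viewers: Set[String]): Boolean =
  !baseAclsEnabled || viewers.contains(user)

// Per-application ACL: gates the detail pages of a single application.
def canViewApp(user: String, historyAclsEnabled: Boolean, owner: String, admins: Set[String]): Boolean =
  !historyAclsEnabled || user == owner || admins.contains(user)
```

Because the two checks are evaluated independently, a user can fail one while passing the other, which is the source of the surprises listed next.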

With these two ACLs, we may encounter several unexpected behaviors:

1. If the base URL's ACL (`spark.acls.enable`) is enabled but user "A" has no 
view permission, user "A" cannot see the app list but can still access the 
details of their own app.
2. If the base URL's ACL (`spark.acls.enable`) is disabled, user "A" can 
download any application's event log, even for apps not run by user "A".
3. Changes to the Live UI's ACL affect the History UI's ACL, since they share 
the same conf file.

These unexpected behaviors arise mainly because we have two different ACLs; 
ideally we should have only one to manage everything.

So, to improve SHS's ACL mechanism, this PR proposes to:

1. Disable "spark.acls.enable" and only use "spark.history.ui.acls.enable" for 
the history server.
2. Check permissions for the event-log download REST API.
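A minimal sketch of point 1 in plain Scala, using an immutable `Map` in place of `SparkConf`; `normalizeHistoryAcls` is an illustrative name, not Spark's API:

```scala
// Sketch of the proposed normalization: if either base-URL ACL flag is set,
// clear both so that only spark.history.ui.acls.enable governs the history server.
def normalizeHistoryAcls(conf: Map[String, String]): Map[String, String] = {
  val baseAclsOn = conf.getOrElse("spark.acls.enable",
    conf.getOrElse("spark.ui.acls.enable", "false")).toBoolean
  if (baseAclsOn) {
    conf + ("spark.acls.enable" -> "false") + ("spark.ui.acls.enable" -> "false")
  } else {
    conf
  }
}
```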

With this PR:

1. Admin users can see and download the list of all applications, as well as 
application details.
2. Normal users can see the list of all applications, but can only download 
and check the details of applications accessible to them.
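The resulting download rule boils down to a single predicate, sketched here with the test suite's users `irashid` (owner) and `alice`; the name `canDownloadLogs` is illustrative, not Spark's API:

```scala
// Sketch: a user may download an app's event logs only if they are an admin
// or the user who ran the application.
def canDownloadLogs(user: String, owner: String, admins: Set[String]): Boolean =
  admins.contains(user) || user == owner
```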

New UTs are added, and the change was also verified in a real cluster.

CC tgravescs vanzin, please help to review; this PR changes the semantics you 
implemented previously. Thanks a lot.

Author: jerryshao <ss...@hortonworks.com>

Closes #17755 from jerryshao/SPARK-20239-2.1-backport.

(cherry picked from commit 359382c038d5836e95ee3ca871f3d1da5bc08148)
Signed-off-by: Marcelo Vanzin <van...@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/068500a4
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/068500a4
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/068500a4

Branch: refs/heads/branch-2.0
Commit: 068500a4a67d67112dcd012820388ca99df9a011
Parents: ddf6dd8
Author: jerryshao <ss...@hortonworks.com>
Authored: Tue Apr 25 15:21:12 2017 -0700
Committer: Marcelo Vanzin <van...@cloudera.com>
Committed: Tue Apr 25 15:21:24 2017 -0700

----------------------------------------------------------------------
 .../history/ApplicationHistoryProvider.scala    |  4 ++--
 .../spark/deploy/history/HistoryServer.scala    | 20 +++++++++++++++++++-
 .../spark/status/api/v1/ApiRootResource.scala   | 18 +++++++++++++++---
 .../deploy/history/HistoryServerSuite.scala     | 12 +++++++-----
 4 files changed, 43 insertions(+), 11 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/068500a4/core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala b/core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala
index f3ea541..bc9fa70 100644
--- a/core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala
@@ -84,7 +84,7 @@ private[history] abstract class ApplicationHistoryProvider {
    * @return Count of application event logs that are currently under process
    */
   def getEventLogsUnderProcess(): Int = {
-    return 0;
+    0
   }
 
   /**
@@ -93,7 +93,7 @@ private[history] abstract class ApplicationHistoryProvider {
   * @return 0 if this is undefined or unsupported, otherwise the last updated time in millis
    */
   def getLastUpdatedTime(): Long = {
-    return 0;
+    0
   }
 
   /**

http://git-wip-us.apache.org/repos/asf/spark/blob/068500a4/core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala b/core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala
index 996c19e..44ce495 100644
--- a/core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala
@@ -261,7 +261,7 @@ object HistoryServer extends Logging {
     Utils.initDaemon(log)
     new HistoryServerArguments(conf, argStrings)
     initSecurity()
-    val securityManager = new SecurityManager(conf)
+    val securityManager = createSecurityManager(conf)
 
     val providerName = conf.getOption("spark.history.provider")
       .getOrElse(classOf[FsHistoryProvider].getName())
@@ -281,6 +281,24 @@ object HistoryServer extends Logging {
     while(true) { Thread.sleep(Int.MaxValue) }
   }
 
+  /**
+   * Create a security manager.
+   * This turns off security in the SecurityManager, so that the History Server can start
+   * in a Spark cluster where security is enabled.
+   * @param config configuration for the SecurityManager constructor
+   * @return the security manager for use in constructing the History Server.
+   */
+  private[history] def createSecurityManager(config: SparkConf): SecurityManager = {
+    if (config.getBoolean("spark.acls.enable", config.getBoolean("spark.ui.acls.enable", false))) {
+      logInfo("Either spark.acls.enable or spark.ui.acls.enable is configured, clearing it and " +
+        "only using spark.history.ui.acl.enable")
+      config.set("spark.acls.enable", "false")
+      config.set("spark.ui.acls.enable", "false")
+    }
+
+    new SecurityManager(config)
+  }
+
   def initSecurity() {
    // If we are accessing HDFS and it has security enabled (Kerberos), we have to login
    // from a keytab file so that we can access HDFS beyond the kerberos ticket expiration.

http://git-wip-us.apache.org/repos/asf/spark/blob/068500a4/core/src/main/scala/org/apache/spark/status/api/v1/ApiRootResource.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/status/api/v1/ApiRootResource.scala b/core/src/main/scala/org/apache/spark/status/api/v1/ApiRootResource.scala
index 8dd792c..b97359d 100644
--- a/core/src/main/scala/org/apache/spark/status/api/v1/ApiRootResource.scala
+++ b/core/src/main/scala/org/apache/spark/status/api/v1/ApiRootResource.scala
@@ -168,14 +168,27 @@ private[v1] class ApiRootResource extends ApiRequestContext {
   @Path("applications/{appId}/logs")
   def getEventLogs(
       @PathParam("appId") appId: String): EventLogDownloadResource = {
-    new EventLogDownloadResource(uiRoot, appId, None)
+    try {
+      // withSparkUI will throw NotFoundException if attemptId exists for this application.
+      // So we need to try again with attempt id "1".
+      withSparkUI(appId, None) { _ =>
+        new EventLogDownloadResource(uiRoot, appId, None)
+      }
+    } catch {
+      case _: NotFoundException =>
+        withSparkUI(appId, Some("1")) { _ =>
+          new EventLogDownloadResource(uiRoot, appId, None)
+        }
+    }
   }
 
   @Path("applications/{appId}/{attemptId}/logs")
   def getEventLogs(
       @PathParam("appId") appId: String,
       @PathParam("attemptId") attemptId: String): EventLogDownloadResource = {
-    new EventLogDownloadResource(uiRoot, appId, Some(attemptId))
+    withSparkUI(appId, Some(attemptId)) { _ =>
+      new EventLogDownloadResource(uiRoot, appId, Some(attemptId))
+    }
   }
 
   @Path("version")
@@ -260,7 +273,6 @@ private[v1] trait ApiRequestContext {
       case None => throw new NotFoundException("no such app: " + appId)
     }
   }
-
 }
 
private[v1] class ForbiddenException(msg: String) extends WebApplicationException(

http://git-wip-us.apache.org/repos/asf/spark/blob/068500a4/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala
----------------------------------------------------------------------
diff --git a/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala b/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala
index 313f753..891d2c1 100644
--- a/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala
@@ -545,12 +545,11 @@ class HistoryServerSuite extends SparkFunSuite with BeforeAndAfter with Matchers
     assert(jobcount === getNumJobs("/jobs"))
 
     // no need to retain the test dir now the tests complete
-    logDir.deleteOnExit();
-
+    logDir.deleteOnExit()
   }
 
   test("ui and api authorization checks") {
-    val appId = "local-1422981759269"
+    val appId = "local-1430917381535"
     val owner = "irashid"
     val other = "alice"
 
@@ -567,8 +566,11 @@ class HistoryServerSuite extends SparkFunSuite with BeforeAndAfter with Matchers
 
     val port = server.boundPort
     val testUrls = Seq(
-      s"http://localhost:$port/api/v1/applications/$appId/jobs",
-      s"http://localhost:$port/history/$appId/jobs/")
+      s"http://localhost:$port/api/v1/applications/$appId/1/jobs",
+      s"http://localhost:$port/history/$appId/1/jobs/",
+      s"http://localhost:$port/api/v1/applications/$appId/logs",
+      s"http://localhost:$port/api/v1/applications/$appId/1/logs",
+      s"http://localhost:$port/api/v1/applications/$appId/2/logs")
 
     tests.foreach { case (user, expectedCode) =>
       testUrls.foreach { url =>

