This is an automated email from the ASF dual-hosted git repository.

liuxun pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/submarine.git

The following commit(s) were added to refs/heads/master by this push:
     new 58b0f6d  SUBMARINE-438. Add documentation for spark security
58b0f6d is described below

commit 58b0f6d5df423683682832b7bab9590dff9e7664
Author: Kent Yao <yaooq...@hotmail.com>
AuthorDate: Wed Mar 18 19:21:12 2020 +0800

    SUBMARINE-438. Add documentation for spark security

    ### What is this PR for?
    Add documentation for spark security plugin

    ### What type of PR is it?
    [Bug Fix | Improvement | Feature | Documentation | Hot Fix | Refactoring]
    doc

    ### Todos
    * [ ] - Task

    ### What is the Jira issue?
    * Open an issue on Jira https://issues.apache.org/jira/browse/SUBMARINE-438
    * Put link here, and add [SUBMARINE-*Jira number*] in PR title, eg. [SUBMARINE-23]

    ### How should this be tested?
    * First time? Setup Travis CI as described on https://submarine.apache.org/contribution/contributions.html#continuous-integration
    * Strongly recommended: add automated unit tests for any new or changed behavior
    * Outline any manual steps to test the PR here.

    no code based change

    ### Screenshots (if appropriate)

    ### Questions:
    * Does the licenses files need update? No
    * Is there breaking changes for older versions? No
    * Does this needs documentation? Yes

    Author: Kent Yao <yaooq...@hotmail.com>

    Closes #235 from yaooqinn/SUBMARINE-438 and squashes the following commits:

    fef0ab9 [Kent Yao] SUBMARINE-438. Add documentation for spark security
---
 README.md                                          |   4 +
 docs/submarine-security/spark-security/README.md   | 134 +++++++++++++++++++++
 .../build-submarine-spark-security-plugin.md       |  30 +++++
 3 files changed, 168 insertions(+)

diff --git a/README.md b/README.md
index 881fa36..190904b 100644
--- a/README.md
+++ b/README.md
@@ -12,8 +12,12 @@ limitations under the License. See accompanying LICENSE file.
 -->
+
+
+[](https://travis-ci.com/apache/submarine) [](https://www.apache.org/licenses/LICENSE-2.0.html) [](http://hits.dwyl.io/apache/submarine)
+
 # What is Apache Submarine?
 Apache Submarine is a unified AI platform which allows engineers and data scientists to run Machine Learning and Deep Learning workload in distributed cluster.
diff --git a/docs/submarine-security/spark-security/README.md b/docs/submarine-security/spark-security/README.md
new file mode 100644
index 0000000..33d905d
--- /dev/null
+++ b/docs/submarine-security/spark-security/README.md
@@ -0,0 +1,134 @@
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements. See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License. You may obtain a copy of the License at
+  http://www.apache.org/licenses/LICENSE-2.0
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+# Submarine Spark Security Plugin
+
+ACL Management for Apache Spark SQL with Apache Ranger, enabling:
+
+- Table/Column level authorization
+- Row level filtering
+- Data masking
+
+
+Security is one of the fundamental features for enterprise adoption. [Apache Ranger™](https://ranger.apache.org) offers many security plugins for many Hadoop ecosystem components,
+such as HDFS, Hive, HBase, Solr and Sqoop2. However, [Apache Spark™](http://spark.apache.org) is not yet among them.
+When a secured HDFS cluster is used as a data warehouse accessed by various users and groups via different applications written with Spark and Hive,
+it is very difficult to guarantee data management in a consistent way. Apache Spark users can access the data warehouse only
+with the storage-based access controls offered by HDFS. This library enables Spark with SQL Standard Based Authorization.
+
+## Build
+
+Please refer to the online documentation - [Building submarine spark security plugin](build-submarine-spark-security-plugin.md)
+
+## Quick Start
+
+Three steps to integrate Apache Spark and Apache Ranger.
+
+### Installation
+
+Place the submarine-spark-security-<version>.jar into `$SPARK_HOME/jars`.
+
+### Configurations
+
+#### Settings for Apache Ranger
+
+Create `ranger-spark-security.xml` in `$SPARK_HOME/conf` and add the following configurations
+for pointing to the right Apache Ranger admin server.
+
+
+```xml
+
+<configuration>
+
+  <property>
+    <name>ranger.plugin.spark.policy.rest.url</name>
+    <value>ranger admin address like http://ranger-admin.org:6080</value>
+  </property>
+
+  <property>
+    <name>ranger.plugin.spark.service.name</name>
+    <value>a ranger hive service name</value>
+  </property>
+
+  <property>
+    <name>ranger.plugin.spark.policy.cache.dir</name>
+    <value>./a ranger hive service name/policycache</value>
+  </property>
+
+  <property>
+    <name>ranger.plugin.spark.policy.pollIntervalMs</name>
+    <value>5000</value>
+  </property>
+
+  <property>
+    <name>ranger.plugin.spark.policy.source.impl</name>
+    <value>org.apache.ranger.admin.client.RangerAdminRESTClient</value>
+  </property>
+
+</configuration>
+```
+
+Create `ranger-spark-audit.xml` in `$SPARK_HOME/conf` and add the following configurations
+to enable/disable auditing.
+
+```xml
+<configuration>
+
+  <property>
+    <name>xasecure.audit.is.enabled</name>
+    <value>true</value>
+  </property>
+
+  <property>
+    <name>xasecure.audit.destination.db</name>
+    <value>false</value>
+  </property>
+
+  <property>
+    <name>xasecure.audit.destination.db.jdbc.driver</name>
+    <value>com.mysql.jdbc.Driver</value>
+  </property>
+
+  <property>
+    <name>xasecure.audit.destination.db.jdbc.url</name>
+    <value>jdbc:mysql://10.171.161.78/ranger</value>
+  </property>
+
+  <property>
+    <name>xasecure.audit.destination.db.password</name>
+    <value>rangeradmin</value>
+  </property>
+
+  <property>
+    <name>xasecure.audit.destination.db.user</name>
+    <value>rangeradmin</value>
+  </property>
+
+</configuration>
+
+```
+
+#### Settings for Apache Spark
+
+You can configure `spark.sql.extensions` with the `*Extension` we provided.
+For example, `spark.sql.extensions=org.apache.submarine.spark.security.api.RangerSparkAuthzExtension`
+
+Currently, you can set the following options to `spark.sql.extensions` to choose authorization w/ or w/o
+extra functions.
+
+| option | authorization | row filtering | data masking |
+|---|---|---|---|
+|org.apache.submarine.spark.security.api.RangerSparkAuthzExtension| √ | × | × |
+|org.apache.submarine.spark.security.api.RangerSparkSQLExtension| √ | √ | √ |
diff --git a/docs/submarine-security/spark-security/build-submarine-spark-security-plugin.md b/docs/submarine-security/spark-security/build-submarine-spark-security-plugin.md
new file mode 100644
index 0000000..c58bf4f
--- /dev/null
+++ b/docs/submarine-security/spark-security/build-submarine-spark-security-plugin.md
@@ -0,0 +1,30 @@
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements. See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License. You may obtain a copy of the License at
+  http://www.apache.org/licenses/LICENSE-2.0
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+# Building Submarine Spark Security Plugin
+
+Submarine Spark Security Plugin is built using [Apache Maven](http://maven.apache.org). To build it, `cd` to the root directory of the Submarine project and run:
+
+```bash
+mvn clean package -Dmaven.javadoc.skip=true -DskipTests -pl :submarine-spark-security
+```
+
+By default, Submarine Spark Security Plugin is built against Apache Spark 2.3.x and Apache Ranger 1.1.0, which may be incompatible with other Apache Spark or Apache Ranger releases.
+
+Currently, available profiles are:
+
+Spark: -Pspark-2.3, -Pspark-2.4
+
+Ranger: -Pranger-1.0, -Pranger-1.1, -Pranger-1.2, -Pranger-2.0

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@submarine.apache.org
For additional commands, e-mail: dev-h...@submarine.apache.org