Miscellaneous documentation fixes

This PR contains miscellaneous fixes & improvements for documentation:
- fixes for code snippets formatting, like, https://zeppelin.apache.org/docs/0.8.0-SNAPSHOT/setup/security/shiro_authentication.html#apply-multiple-roles-in-shiro-configuration
- fixes syntax highlighting (adding `scala`, `xml`, `java`, `bash`, ...)
- fixes for list of interpreters
- ...

Documentation

Author: Alex Ott <alex...@gmail.com>

Closes #2997 from alexott/doc-formatting-fixes and squashes the following commits:

10eed86ca [Alex Ott] Merge branch 'master' into doc-formatting-fixes
37a2bb778 [Alex Ott] miscellaneous fixes - wording, formatting, etc.
63ca2b0e2 [Alex Ott] fix usage of the ``` markup that lead to broken formatting
9d285a1b7 [Alex Ott] Fix list of interpreters
5a7950e79 [Alex Ott] add missing language spec for syntax highlighting
bb26a2954 [Alex Ott] use same formatting for parser name
c90b61f11 [Alex Ott] use same capitalization in all interpreter names
a994f4ecf [Alex Ott] improve formatting for Cassandra interpreter docs

(cherry picked from commit ebca7aed7c5f6fa634e3f2d4fc6188ee41f5770c)

Signed-off-by: Jeff Zhang <zjf...@apache.org>

Project: http://git-wip-us.apache.org/repos/asf/zeppelin/repo
Commit: http://git-wip-us.apache.org/repos/asf/zeppelin/commit/68cb6761
Tree: http://git-wip-us.apache.org/repos/asf/zeppelin/tree/68cb6761
Diff: http://git-wip-us.apache.org/repos/asf/zeppelin/diff/68cb6761

Branch: refs/heads/branch-0.8
Commit: 68cb67619edb344ec0588d0fb3f34ece73e9fed3
Parents: 23d79b7
Author: Alex Ott <alex...@gmail.com>
Authored: Wed Jun 20 09:43:52 2018 +0200
Committer: Jeff Zhang <zjf...@apache.org>
Committed: Sat Jun 23 20:55:10 2018 +0800

----------------------------------------------------------------------
docs/README.md | 4 +-
.../contribution/how_to_contribute_code.md | 39 +++--
.../contribution/how_to_contribute_website.md | 2 +-
.../contribution/useful_developer_tools.md | 17 +-
docs/development/helium/writing_application.md | 6 +-
.../helium/writing_visualization_basic.md | 4 +-
.../development/writing_zeppelin_interpreter.md | 24 +--
docs/index.md | 7 +-
docs/interpreter/cassandra.md | 161 +++++++++----------
docs/interpreter/groovy.md | 1 +
docs/interpreter/hbase.md | 4 +-
docs/interpreter/ignite.md | 8 +-
docs/interpreter/jdbc.md | 8 +-
docs/interpreter/kylin.md | 2 +-
docs/interpreter/lens.md | 20 ++-
docs/interpreter/livy.md | 9 +-
docs/interpreter/mahout.md | 20 +--
docs/interpreter/markdown.md | 3 +-
docs/interpreter/neo4j.md | 6 +-
docs/interpreter/python.md | 36 +++--
docs/interpreter/r.md | 30 +++-
docs/interpreter/sap.md | 4 +-
docs/interpreter/scalding.md | 19 ++-
docs/interpreter/shell.md | 9 +-
docs/interpreter/spark.md | 6 +-
docs/setup/basics/how_to_build.md | 15 +-
docs/setup/deployment/cdh.md | 6 +-
docs/setup/deployment/docker.md | 6 +-
.../setup/deployment/flink_and_spark_cluster.md | 50 +++---
docs/setup/deployment/spark_cluster_mode.md | 24 +--
docs/setup/deployment/virtual_machine.md | 18 +--
docs/setup/deployment/yarn_install.md | 4 +-
docs/setup/operation/configuration.md | 15 +-
docs/setup/operation/trouble_shooting.md | 2 +-
docs/setup/security/authentication_nginx.md | 14 +-
docs/setup/security/http_security_headers.md | 14 +-
docs/setup/security/notebook_authorization.md | 4 +-
docs/setup/security/shiro_authentication.md | 4 +-
docs/setup/storage/storage.md | 76 ++++-----
docs/usage/display_system/angular_backend.md | 1 +
docs/usage/display_system/basic.md | 6 +-
docs/usage/interpreter/dynamic_loading.md | 2 +-
docs/usage/interpreter/installation.md | 15 +-
docs/usage/interpreter/overview.md | 4 +-
docs/usage/interpreter/user_impersonation.md | 8 +-
.../other_features/customizing_homepage.md | 2 +-
docs/usage/other_features/zeppelin_context.md | 10 +-
47 files changed, 407 insertions(+), 342 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/README.md
---------------------------------------------------------------------- diff --git a/docs/README.md b/docs/README.md index 4dc810e..c1fee03 100644 --- a/docs/README.md +++ b/docs/README.md @@ -56,8 +56,10 @@ If you wish to help us and contribute to Zeppelin Documentation, please look at ``` 2. checkout ASF repo + ``` svn co https://svn.apache.org/repos/asf/zeppelin asf-zeppelin ``` + 3. copy `zeppelin/docs/_site` to `asf-zeppelin/site/docs/[VERSION]` - 4. ```svn commit``` + 4. `svn commit` http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/development/contribution/how_to_contribute_code.md ---------------------------------------------------------------------- diff --git a/docs/development/contribution/how_to_contribute_code.md b/docs/development/contribution/how_to_contribute_code.md index 92b69b5..290c8d1 100644 --- a/docs/development/contribution/how_to_contribute_code.md +++ b/docs/development/contribution/how_to_contribute_code.md @@ -51,13 +51,13 @@ First of all, you need Zeppelin source code. The official location of Zeppelin i Get the source code on your development machine using git. -``` +```bash git clone git://git.apache.org/zeppelin.git zeppelin ``` You may also want to develop against a specific branch. For example, for branch-0.5.6 -``` +```bash git clone -b branch-0.5.6 git://git.apache.org/zeppelin.git zeppelin ``` @@ -69,19 +69,19 @@ Before making a pull request, please take a look [Contribution Guidelines](http: ### Build -``` +```bash mvn install ``` To skip test -``` +```bash mvn install -DskipTests ``` To build with specific spark / hadoop version -``` +```bash mvn install -Dspark.version=x.x.x -Dhadoop.version=x.x.x ``` @@ -93,18 +93,26 @@ For the further 1. Copy the `conf/zeppelin-site.xml.template` to `zeppelin-server/src/main/resources/zeppelin-site.xml` and change the configurations in this file if required 2. 
Run the following command -``` + +```bash cd zeppelin-server -HADOOP_HOME=YOUR_HADOOP_HOME JAVA_HOME=YOUR_JAVA_HOME mvn exec:java -Dexec.mainClass="org.apache.zeppelin.server.ZeppelinServer" -Dexec.args="" +HADOOP_HOME=YOUR_HADOOP_HOME JAVA_HOME=YOUR_JAVA_HOME \ +mvn exec:java -Dexec.mainClass="org.apache.zeppelin.server.ZeppelinServer" -Dexec.args="" ``` #### Option 2 - Daemon Script -> **Note:** Make sure you first run ```mvn clean install -DskipTests``` on your zeppelin root directory, otherwise your server build will fail to find the required dependencies in the local repro. +> **Note:** Make sure you first run + +```bash +mvn clean install -DskipTests +``` + +in your zeppelin root directory, otherwise your server build will fail to find the required dependencies in the local repro. or use daemon script -``` +```bash bin/zeppelin-daemon start ``` @@ -122,8 +130,7 @@ Some portions of the Zeppelin code are generated by [Thrift](http://thrift.apach To regenerate the code, install **thrift-0.9.2** and then run the following command to generate thrift code. - -``` +```bash cd <zeppelin_home>/zeppelin-interpreter/src/main/thrift ./genthrift.sh ``` @@ -132,14 +139,16 @@ cd <zeppelin_home>/zeppelin-interpreter/src/main/thrift Zeppelin has [set of integration tests](https://github.com/apache/zeppelin/tree/master/zeppelin-server/src/test/java/org/apache/zeppelin/integration) using Selenium. To run these test, first build and run Zeppelin and make sure Zeppelin is running on port 8080. 
Then you can run test using following command -``` -TEST_SELENIUM=true mvn test -Dtest=[TEST_NAME] -DfailIfNoTests=false -pl 'zeppelin-interpreter,zeppelin-zengine,zeppelin-server' +```bash +TEST_SELENIUM=true mvn test -Dtest=[TEST_NAME] -DfailIfNoTests=false \ +-pl 'zeppelin-interpreter,zeppelin-zengine,zeppelin-server' ``` For example, to run [ParagraphActionIT](https://github.com/apache/zeppelin/blob/master/zeppelin-server/src/test/java/org/apache/zeppelin/integration/ParagraphActionsIT.java), -``` -TEST_SELENIUM=true mvn test -Dtest=ParagraphActionsIT -DfailIfNoTests=false -pl 'zeppelin-interpreter,zeppelin-zengine,zeppelin-server' +```bash +TEST_SELENIUM=true mvn test -Dtest=ParagraphActionsIT -DfailIfNoTests=false \ +-pl 'zeppelin-interpreter,zeppelin-zengine,zeppelin-server' ``` You'll need Firefox web browser installed in your development environment. While CI server uses [Firefox 31.0](https://ftp.mozilla.org/pub/firefox/releases/31.0/) to run selenium test, it is good idea to install the same version (disable auto update to keep the version). http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/development/contribution/how_to_contribute_website.md ---------------------------------------------------------------------- diff --git a/docs/development/contribution/how_to_contribute_website.md b/docs/development/contribution/how_to_contribute_website.md index d5d3b5a..1b7c2d9 100644 --- a/docs/development/contribution/how_to_contribute_website.md +++ b/docs/development/contribution/how_to_contribute_website.md @@ -39,7 +39,7 @@ Documentation website is hosted in 'master' branch under `/docs/` dir. First of all, you need the website source code. The official location of mirror for Zeppelin is [http://git.apache.org/zeppelin.git](http://git.apache.org/zeppelin.git). Get the source code on your development machine using git. 
-``` +```bash git clone git://git.apache.org/zeppelin.git cd docs ``` http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/development/contribution/useful_developer_tools.md ---------------------------------------------------------------------- diff --git a/docs/development/contribution/useful_developer_tools.md b/docs/development/contribution/useful_developer_tools.md index 326986a..17ca403 100644 --- a/docs/development/contribution/useful_developer_tools.md +++ b/docs/development/contribution/useful_developer_tools.md @@ -37,7 +37,7 @@ Check [zeppelin-web: Local Development](https://github.com/apache/zeppelin/tree/ this script would be helpful when changing JDK version frequently. -``` +```bash function setjdk() { if [ $# -ne 0 ]; then # written based on OSX. @@ -59,7 +59,7 @@ you can use this function like `setjdk 1.8` / `setjdk 1.7` ### Building Submodules Selectively -``` +```bash # build `zeppelin-web` only mvn clean -pl 'zeppelin-web' package -DskipTests; @@ -71,7 +71,8 @@ mvn clean package -pl 'spark,spark-dependencies,zeppelin-server' --am -DskipTest # build spark related modules with profiles: scala 2.11, spark 2.1 hadoop 2.7 ./dev/change_scala_version.sh 2.11 -mvn clean package -Pspark-2.1 -Phadoop-2.7 -Pscala-2.11 -pl 'spark,spark-dependencies,zeppelin-server' --am -DskipTests +mvn clean package -Pspark-2.1 -Phadoop-2.7 -Pscala-2.11 \ +-pl 'spark,spark-dependencies,zeppelin-server' --am -DskipTests # build `zeppelin-server` and `markdown` with dependencies mvn clean package -pl 'markdown,zeppelin-server' --am -DskipTests @@ -79,7 +80,7 @@ mvn clean package -pl 'markdown,zeppelin-server' --am -DskipTests ### Running Individual Tests -``` +```bash # run the `HeliumBundleFactoryTest` test class mvn test -pl 'zeppelin-server' --am -DfailIfNoTests=false -Dtest=HeliumBundleFactoryTest ``` @@ -88,13 +89,15 @@ mvn test -pl 'zeppelin-server' --am -DfailIfNoTests=false -Dtest=HeliumBundleFac Make sure that Zeppelin instance is started to execute 
integration tests (= selenium tests). -``` +```bash # run the `SparkParagraphIT` test class -TEST_SELENIUM="true" mvn test -pl 'zeppelin-server' --am -DfailIfNoTests=false -Dtest=SparkParagraphIT +TEST_SELENIUM="true" mvn test -pl 'zeppelin-server' --am \ +-DfailIfNoTests=false -Dtest=SparkParagraphIT # run the `testSqlSpark` test function only in the `SparkParagraphIT` class # but note that, some test might be dependent on the previous tests -TEST_SELENIUM="true" mvn test -pl 'zeppelin-server' --am -DfailIfNoTests=false -Dtest=SparkParagraphIT#testSqlSpark +TEST_SELENIUM="true" mvn test -pl 'zeppelin-server' --am \ +-DfailIfNoTests=false -Dtest=SparkParagraphIT#testSqlSpark ``` http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/development/helium/writing_application.md ---------------------------------------------------------------------- diff --git a/docs/development/helium/writing_application.md b/docs/development/helium/writing_application.md index 366d3e7..d128671 100644 --- a/docs/development/helium/writing_application.md +++ b/docs/development/helium/writing_application.md @@ -147,7 +147,7 @@ Resouce name is a string which will be compared with the name of objects in the Application may require two or more resources. Required resources can be listed inside of the json array. For example, if the application requires object "name1", "name2" and "className1" type of object to run, resources field can be -``` +```json resources: [ [ "name1", "name2", ":className1", ...] ] @@ -155,7 +155,7 @@ resources: [ If Application can handle alternative combination of required resources, alternative set can be listed as below. -``` +```json resources: [ [ "name", ":className"], [ "altName", ":altClassName1"], @@ -165,7 +165,7 @@ resources: [ Easier way to understand this scheme is -``` +```json resources: [ [ 'resource' AND 'resource' AND ... ] OR [ 'resource' AND 'resource' AND ... 
] OR http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/development/helium/writing_visualization_basic.md ---------------------------------------------------------------------- diff --git a/docs/development/helium/writing_visualization_basic.md b/docs/development/helium/writing_visualization_basic.md index 3f7c1dc..207e8b5 100644 --- a/docs/development/helium/writing_visualization_basic.md +++ b/docs/development/helium/writing_visualization_basic.md @@ -190,7 +190,7 @@ e.g. #### 4. Run in dev mode -Place your __Helium package file__ in local registry (ZEPPELIN_HOME/helium). +Place your __Helium package file__ in local registry (`ZEPPELIN_HOME/helium`). Run Zeppelin. And then run zeppelin-web in visualization dev mode. ```bash @@ -198,7 +198,7 @@ cd zeppelin-web yarn run dev:helium ``` -You can browse localhost:9000. Everytime refresh your browser, Zeppelin will rebuild your visualization and reload changes. +You can browse `localhost:9000`. Everytime refresh your browser, Zeppelin will rebuild your visualization and reload changes. #### 5. Publish your visualization http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/development/writing_zeppelin_interpreter.md ---------------------------------------------------------------------- diff --git a/docs/development/writing_zeppelin_interpreter.md b/docs/development/writing_zeppelin_interpreter.md index f62e97e..c4737ef 100644 --- a/docs/development/writing_zeppelin_interpreter.md +++ b/docs/development/writing_zeppelin_interpreter.md @@ -42,7 +42,7 @@ In 'Separate Interpreter(scoped / isolated) for each note' mode which you can se Creating a new interpreter is quite simple. Just extend [org.apache.zeppelin.interpreter](https://github.com/apache/zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/Interpreter.java) abstract class and implement some methods. 
For your interpreter project, you need to make `interpreter-parent` as your parent project and use plugin `maven-enforcer-plugin`, `maven-dependency-plugin` and `maven-resources-plugin`. Here's one sample pom.xml -``` +```xml <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd"> <modelVersion>4.0.0</modelVersion> @@ -128,7 +128,7 @@ Here is an example of `interpreter-setting.json` on your own interpreter. Finally, Zeppelin uses static initialization with the following: -``` +```java static { Interpreter.register("MyInterpreterName", MyClassName.class.getName()); } @@ -157,7 +157,7 @@ If you want to add a new set of syntax highlighting, 1. Add the `mode-*.js` file to <code>[zeppelin-web/bower.json](https://github.com/apache/zeppelin/blob/master/zeppelin-web/bower.json)</code> (when built, <code>[zeppelin-web/src/index.html](https://github.com/apache/zeppelin/blob/master/zeppelin-web/src/index.html)</code> will be changed automatically). 2. Add `language` field to `editor` object. Note that if you don't specify language field, your interpreter will use plain text mode for syntax highlighting. Let's say you want to set your language to `java`, then add: - ``` + ```json "editor": { "language": "java" } @@ -166,7 +166,7 @@ If you want to add a new set of syntax highlighting, ### Edit on double click If your interpreter uses mark-up language such as markdown or HTML, set `editOnDblClick` to `true` so that text editor opens on pargraph double click and closes on paragraph run. Otherwise set it to `false`. -``` +```json "editor": { "editOnDblClick": false } @@ -177,7 +177,7 @@ By default, `Ctrl+dot(.)` brings autocompletion list in the editor. Through `completionKey`, each interpreter can configure autocompletion key. Currently `TAB` is only available option. 
-``` +```json "editor": { "completionKey": "TAB" } @@ -201,7 +201,7 @@ To configure your interpreter you need to follow these steps: Property value is comma separated [INTERPRETER\_CLASS\_NAME]. For example, - ``` + ```xml <property> <name>zeppelin.interpreters</name> <value>org.apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.spark.SparkSqlInterpreter,org.apache.zeppelin.spark.DepInterpreter,org.apache.zeppelin.markdown.Markdown,org.apache.zeppelin.shell.ShellInterpreter,org.apache.zeppelin.hive.HiveInterpreter,com.me.MyNewInterpreter</value> @@ -225,7 +225,7 @@ Note that the first interpreter configuration in zeppelin.interpreters will be t For example, -``` +```scala %myintp val a = "My interpreter" @@ -235,11 +235,11 @@ println(a) ### 0.6.0 and later Inside of a note, `%[INTERPRETER_GROUP].[INTERPRETER_NAME]` directive will call your interpreter. -You can omit either [INTERPRETER\_GROUP] or [INTERPRETER\_NAME]. If you omit [INTERPRETER\_NAME], then first available interpreter will be selected in the [INTERPRETER\_GROUP]. -Likewise, if you skip [INTERPRETER\_GROUP], then [INTERPRETER\_NAME] will be chosen from default interpreter group. +You can omit either `[INTERPRETER\_GROUP]` or `[INTERPRETER\_NAME]`. If you omit `[INTERPRETER\_NAME]`, then first available interpreter will be selected in the `[INTERPRETER\_GROUP]`. +Likewise, if you skip `[INTERPRETER\_GROUP]`, then `[INTERPRETER\_NAME]` will be chosen from default interpreter group. 
-For example, if you have two interpreter myintp1 and myintp2 in group mygrp, you can call myintp1 like +For example, if you have two interpreter `myintp1` and `myintp2` in group `mygrp`, you can call myintp1 like ``` %mygrp.myintp1 @@ -247,7 +247,7 @@ For example, if you have two interpreter myintp1 and myintp2 in group mygrp, you codes for myintp1 ``` -and you can call myintp2 like +and you can call `myintp2` like ``` %mygrp.myintp2 @@ -255,7 +255,7 @@ and you can call myintp2 like codes for myintp2 ``` -If you omit your interpreter name, it'll select first available interpreter in the group ( myintp1 ). +If you omit your interpreter name, it'll select first available interpreter in the group ( `myintp1` ). ``` %mygrp http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/index.md ---------------------------------------------------------------------- diff --git a/docs/index.md b/docs/index.md index 788a979..6fe044a 100644 --- a/docs/index.md +++ b/docs/index.md @@ -134,7 +134,7 @@ limitations under the License. * [BigQuery](./interpreter/bigquery.html) * [Cassandra](./interpreter/cassandra.html) * [Elasticsearch](./interpreter/elasticsearch.html) - * [flink](./interpreter/flink.html) + * [Flink](./interpreter/flink.html) * [Geode](./interpreter/geode.html) * [Groovy](./interpreter/groovy.html) * [HBase](./interpreter/hbase.html) @@ -145,7 +145,8 @@ limitations under the License. * [Kylin](./interpreter/kylin.html) * [Lens](./interpreter/lens.html) * [Livy](./interpreter/livy.html) - * [markdown](./interpreter/markdown.html) + * [Mahout](./interpreter/mahout.html) + * [Markdown](./interpreter/markdown.html) * [Neo4j](./interpreter/neo4j.html) * [Pig](./interpreter/pig.html) * [Postgresql, HAWQ](./interpreter/postgresql.html) @@ -154,7 +155,7 @@ limitations under the License. 
* [SAP](./interpreter/sap.html) * [Scalding](./interpreter/scalding.html) * [Scio](./interpreter/scio.html) - * [Shell](./interpreter/Shell.html) + * [Shell](./interpreter/shell.html) * [Spark](./interpreter/spark.html) #### External Resources http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/interpreter/cassandra.md ---------------------------------------------------------------------- diff --git a/docs/interpreter/cassandra.md b/docs/interpreter/cassandra.md index e91d995..5a20d82 100644 --- a/docs/interpreter/cassandra.md +++ b/docs/interpreter/cassandra.md @@ -69,27 +69,27 @@ The **Cassandra** interpreter accepts the following commands </tr> <tr> <td nowrap>Help command</td> - <td>HELP</td> + <td>`HELP`</td> <td>Display the interactive help menu</td> </tr> <tr> <td nowrap>Schema commands</td> - <td>DESCRIBE KEYSPACE, DESCRIBE CLUSTER, DESCRIBE TABLES ...</td> + <td>`DESCRIBE KEYSPACE`, `DESCRIBE CLUSTER`, `DESCRIBE TABLES` ...</td> <td>Custom commands to describe the Cassandra schema</td> </tr> <tr> <td nowrap>Option commands</td> - <td>@consistency, @retryPolicy, @fetchSize ...</td> + <td>`@consistency`, `@retryPolicy`, `@fetchSize` ...</td> <td>Inject runtime options to all statements in the paragraph</td> </tr> <tr> <td nowrap>Prepared statement commands</td> - <td>@prepare, @bind, @remove_prepared</td> + <td>`@prepare`, `@bind`, `@remove_prepared`</td> <td>Let you register a prepared command and re-use it later by injecting bound values</td> </tr> <tr> <td nowrap>Native CQL statements</td> - <td>All CQL-compatible statements (SELECT, INSERT, CREATE ...)</td> + <td>All CQL-compatible statements (`SELECT`, `INSERT`, `CREATE`, ...)</td> <td>All CQL statements are executed directly against the Cassandra server</td> </tr> </table> @@ -107,15 +107,15 @@ SELECT * FROM users WHERE login='jdoe'; Each statement should be separated by a semi-colon ( **;** ) except the special commands below: -1. @prepare -2. @bind -3. @remove_prepare -4. 
@consistency -5. @serialConsistency -6. @timestamp -7. @retryPolicy -8. @fetchSize -9. @requestTimeOut +1. `@prepare` +2. `@bind` +3. `@remove_prepare` +4. `@consistency` +5. `@serialConsistency` +6. `@timestamp` +7. `@retryPolicy` +8. `@fetchSize` +9. `@requestTimeOut` Multi-line statements as well as multiple statements on the same line are also supported as long as they are separated by a semi-colon. Ex: @@ -130,7 +130,7 @@ FROM artists WHERE login='jlennon'; ``` -Batch statements are supported and can span multiple lines, as well as DDL(CREATE/ALTER/DROP) statements: +Batch statements are supported and can span multiple lines, as well as DDL (`CREATE`/`ALTER`/`DROP`) statements: ```sql @@ -429,7 +429,7 @@ Some remarks about query parameters: > 1. **many** query parameters can be set in the same paragraph > 2. if the **same** query parameter is set many time with different values, > the interpreter only take into account the first value -> 3. each query parameter applies to **all CQL statements** in the same paragraph, unless you override the option using plain CQL text (like forcing timestamp with the USING clause) +> 3. each query parameter applies to **all CQL statements** in the same paragraph, unless you override the option using plain CQL text (like forcing timestamp with the `USING` clause) > 4. the order of each query parameter with regard to CQL statement does not > matter ## Support for Prepared Statements @@ -463,7 +463,7 @@ saves the generated prepared statement in an **internal hash map**, using the pr > Please note that this internal prepared statement map is shared with **all > notebooks** and **all paragraphs** because there is only one instance of the interpreter for Cassandra -> If the interpreter encounters **many** @prepare for the **same _statement-name_ (key)**, only the **first** statement will be taken into account. 
+> If the interpreter encounters **many** `@prepare` for the **same _statement-name_ (key)**, only the **first** statement will be taken into account. Example: @@ -474,7 +474,7 @@ Example: ``` For the above example, the prepared statement is `SELECT * FROM spark_demo.albums LIMIT ?`. -`SELECT * FROM spark_demo.artists LIMIT ? is ignored because an entry already exists in the prepared statements map with the key select. +`SELECT * FROM spark_demo.artists LIMIT ?` is ignored because an entry already exists in the prepared statements map with the key _select_. In the context of **Zeppelin**, a notebook can be scheduled to be executed at regular interval, thus it is necessary to **avoid re-preparing many time the same statement (considered an anti-pattern)**. @@ -488,18 +488,18 @@ Once the statement is prepared (possibly in a separated notebook/paragraph). You Bound values are not mandatory for the **@bind** statement. However if you provide bound values, they need to comply to some syntax: -* String values should be enclosed between simple quotes ( ' ) -* Date values should be enclosed between simple quotes ( ' ) and respect the formats: +* String values should be enclosed between simple quotes (**'**) +* Date values should be enclosed between simple quotes (**'**) and respect the formats (full list is in the [documentation](https://docs.datastax.com/en/cql/3.3/cql/cql_reference/timestamp_type_r.html)): 1. yyyy-MM-dd HH:MM:ss 2. yyyy-MM-dd HH:MM:ss.SSS * **null** is parsed as-is -* **boolean** (true|false) are parsed as-is +* **boolean** (`true`|`false`) are parsed as-is * collection values must follow the **[standard CQL syntax]**: - * list: ['list_item1', 'list_item2', ...]
- * set: {'set_item1', 'set_item2', …} - * map: {'key1': 'val1', 'key2': 'val2', …} -* **tuple** values should be enclosed between parenthesis (see **[Tuple CQL syntax]**): ('text', 123, true) -* **udt** values should be enclosed between brackets (see **[UDT CQL syntax]**): {stree_name: 'Beverly Hills', number: 104, zip_code: 90020, state: 'California', …} + * list: ['list_item1', 'list_item2', ...] + * set: {'set_item1', 'set_item2', …} + * map: {'key1': 'val1', 'key2': 'val2', …} +* **tuple** values should be enclosed between parenthesis (see **[Tuple CQL syntax]**): ('text', 123, true) +* **udt** values should be enclosed between brackets (see **[UDT CQL syntax]**): {stree_name: 'Beverly Hills', number: 104, zip_code: 90020, state: 'California', …} > It is possible to use the @bind statement inside a batch: > @@ -540,8 +540,7 @@ Example: AND styles CONTAINS '${style=Rock}'; {% endraw %} - -In the above example, the first CQL query will be executed for _performer='Sheryl Crow' AND style='Rock'_. +In the above example, the first CQL query will be executed for `performer='Sheryl Crow' AND style='Rock'`. For subsequent queries, you can change the value directly using the form. > Please note that we enclosed the **$\{ \}** block between simple quotes ( > **'** ) because Cassandra expects a String here. @@ -550,14 +549,12 @@ For subsequent queries, you can change the value directly using the form. It is also possible to use dynamic forms for **prepared statements**: {% raw %} - @bind[select]=='${performer=Sheryl Crow|Doof|Fanfarlo|Los Paranoia}', '${style=Rock}' - {% endraw %} ## Shared states -It is possible to execute many paragraphs in parallel. However, at the back-end side, we're still using synchronous queries. +It is possible to execute many paragraphs in parallel. However, at the back-end side, we're still using synchronous queries. _Asynchronous execution_ is only possible when it is possible to return a `Future` value in the `InterpreterResult`.
It may be an interesting proposal for the **Zeppelin** project. @@ -570,7 +567,7 @@ Long story short, you have 3 available bindings: - **isolated**: _different JVM_ running a _single Interpreter instance_, one JVM for each note Using the **shared** binding, the same `com.datastax.driver.core.Session` object is used for **all** notes and paragraphs. -Consequently, if you use the **USE _keyspace name_;** statement to log into a keyspace, it will change the keyspace for +Consequently, if you use the `USE keyspace_name;` statement to log into a keyspace, it will change the keyspace for **all current users** of the **Cassandra** interpreter because we only create 1 `com.datastax.driver.core.Session` object per instance of **Cassandra** interpreter. @@ -588,7 +585,7 @@ To configure the **Cassandra** interpreter, go to the **Interpreter** menu and s The **Cassandra** interpreter is using the official **[Cassandra Java Driver]** and most of the parameters are used to configure the Java driver -Below are the configuration parameters and their default value. +Below are the configuration parameters and their default values. <table class="table-configuration"> <tr> @@ -597,41 +594,41 @@ Below are the configuration parameters and their default value. <th>Default Value</th> </tr> <tr> - <td>cassandra.cluster</td> + <td>`cassandra.cluster`</td> <td>Name of the Cassandra cluster to connect to</td> <td>Test Cluster</td> </tr> <tr> - <td>cassandra.compression.protocol</td> - <td>On wire compression. Possible values are: NONE, SNAPPY, LZ4</td> - <td>NONE</td> + <td>`cassandra.compression.protocol`</td> + <td>On wire compression. 
Possible values are: `NONE`, `SNAPPY`, `LZ4`</td> + <td>`NONE`</td> </tr> <tr> - <td>cassandra.credentials.username</td> + <td>`cassandra.credentials.username`</td> <td>If security is enable, provide the login</td> <td>none</td> </tr> <tr> - <td>cassandra.credentials.password</td> + <td>`cassandra.credentials.password`</td> <td>If security is enable, provide the password</td> <td>none</td> </tr> <tr> - <td>cassandra.hosts</td> + <td>`cassandra.hosts`</td> <td> Comma separated Cassandra hosts (DNS name or IP address). <br/> - Ex: '192.168.0.12,node2,node3' + Ex: `192.168.0.12,node2,node3` </td> - <td>localhost</td> + <td>`localhost`</td> </tr> <tr> - <td>cassandra.interpreter.parallelism</td> + <td>`cassandra.interpreter.parallelism`</td> <td>Number of concurrent paragraphs(queries block) that can be executed</td> <td>10</td> </tr> <tr> - <td>cassandra.keyspace</td> + <td>`cassandra.keyspace`</td> <td> Default keyspace to connect to. <strong> @@ -640,80 +637,80 @@ Below are the configuration parameters and their default value. in all of your queries </strong> </td> - <td>system</td> + <td>`system`</td> </tr> <tr> - <td>cassandra.load.balancing.policy</td> + <td>`cassandra.load.balancing.policy`</td> <td> - Load balancing policy. Default = <em>new TokenAwarePolicy(new DCAwareRoundRobinPolicy())</em> - To Specify your own policy, provide the <strong>fully qualify class name (FQCN)</strong> of your policy. + Load balancing policy. Default = `new TokenAwarePolicy(new DCAwareRoundRobinPolicy())` + To Specify your own policy, provide the <em>fully qualify class name (FQCN)</em> of your policy. 
At runtime the interpreter will instantiate the policy using <strong>Class.forName(FQCN)</strong> </td> <td>DEFAULT</td> </tr> <tr> - <td>cassandra.max.schema.agreement.wait.second</td> + <td>`cassandra.max.schema.agreement.wait.second`</td> <td>Cassandra max schema agreement wait in second</td> <td>10</td> </tr> <tr> - <td>cassandra.pooling.core.connection.per.host.local</td> + <td>`cassandra.pooling.core.connection.per.host.local`</td> <td>Protocol V2 and below default = 2. Protocol V3 and above default = 1</td> <td>2</td> </tr> <tr> - <td>cassandra.pooling.core.connection.per.host.remote</td> + <td>`cassandra.pooling.core.connection.per.host.remote`</td> <td>Protocol V2 and below default = 1. Protocol V3 and above default = 1</td> <td>1</td> </tr> <tr> - <td>cassandra.pooling.heartbeat.interval.seconds</td> + <td>`cassandra.pooling.heartbeat.interval.seconds`</td> <td>Cassandra pool heartbeat interval in secs</td> <td>30</td> </tr> <tr> - <td>cassandra.pooling.idle.timeout.seconds</td> + <td>`cassandra.pooling.idle.timeout.seconds`</td> <td>Cassandra idle time out in seconds</td> <td>120</td> </tr> <tr> - <td>cassandra.pooling.max.connection.per.host.local</td> + <td>`cassandra.pooling.max.connection.per.host.local`</td> <td>Protocol V2 and below default = 8. Protocol V3 and above default = 1</td> <td>8</td> </tr> <tr> - <td>cassandra.pooling.max.connection.per.host.remote</td> + <td>`cassandra.pooling.max.connection.per.host.remote`</td> <td>Protocol V2 and below default = 2. Protocol V3 and above default = 1</td> <td>2</td> </tr> <tr> - <td>cassandra.pooling.max.request.per.connection.local</td> + <td>`cassandra.pooling.max.request.per.connection.local`</td> <td>Protocol V2 and below default = 128. Protocol V3 and above default = 1024</td> <td>128</td> </tr> <tr> - <td>cassandra.pooling.max.request.per.connection.remote</td> + <td>`cassandra.pooling.max.request.per.connection.remote`</td> <td>Protocol V2 and below default = 128. 
Protocol V3 and above default = 256</td> <td>128</td> </tr> <tr> - <td>cassandra.pooling.new.connection.threshold.local</td> + <td>`cassandra.pooling.new.connection.threshold.local`</td> <td>Protocol V2 and below default = 100. Protocol V3 and above default = 800</td> <td>100</td> </tr> <tr> - <td>cassandra.pooling.new.connection.threshold.remote</td> + <td>`cassandra.pooling.new.connection.threshold.remote`</td> <td>Protocol V2 and below default = 100. Protocol V3 and above default = 200</td> <td>100</td> </tr> <tr> - <td>cassandra.pooling.pool.timeout.millisecs</td> + <td>`cassandra.pooling.pool.timeout.millisecs`</td> <td>Cassandra pool time out in millisecs</td> <td>5000</td> </tr> <tr> - <td>cassandra.protocol.version</td> + <td>`cassandra.protocol.version`</td> <td>Cassandra binary protocol version</td> <td>4</td> </tr> @@ -722,74 +719,74 @@ Below are the configuration parameters and their default value. <td> Cassandra query default consistency level <br/> - Available values: ONE, TWO, THREE, QUORUM, LOCAL_ONE, LOCAL_QUORUM, EACH_QUORUM, ALL + Available values: `ONE`, `TWO`, `THREE`, `QUORUM`, `LOCAL_ONE`, `LOCAL_QUORUM`, `EACH_QUORUM`, `ALL` </td> - <td>ONE</td> + <td>`ONE`</td> </tr> <tr> - <td>cassandra.query.default.fetchSize</td> + <td>`cassandra.query.default.fetchSize`</td> <td>Cassandra query default fetch size</td> <td>5000</td> </tr> <tr> - <td>cassandra.query.default.serial.consistency</td> + <td>`cassandra.query.default.serial.consistency`</td> <td> Cassandra query default serial consistency level <br/> - Available values: SERIAL, LOCAL_SERIAL + Available values: `SERIAL`, `LOCAL_SERIAL` </td> - <td>SERIAL</td> + <td>`SERIAL`</td> </tr> <tr> - <td>cassandra.reconnection.policy</td> + <td>`cassandra.reconnection.policy`</td> <td> Cassandra Reconnection Policy. - Default = new ExponentialReconnectionPolicy(1000, 10 * 60 * 1000) - To Specify your own policy, provide the <strong>fully qualify class name (FQCN)</strong> of your policy. 
+ Default = `new ExponentialReconnectionPolicy(1000, 10 * 60 * 1000)` + To specify your own policy, provide the <em>fully qualified class name (FQCN)</em> of your policy. At runtime the interpreter will instantiate the policy using <strong>Class.forName(FQCN)</strong> </td> <td>DEFAULT</td> </tr> <tr> - <td>cassandra.retry.policy</td> + <td>`cassandra.retry.policy`</td> <td> Cassandra Retry Policy. - Default = DefaultRetryPolicy.INSTANCE - To Specify your own policy, provide the <strong>fully qualify class name (FQCN)</strong> of your policy. + Default = `DefaultRetryPolicy.INSTANCE` + To specify your own policy, provide the <em>fully qualified class name (FQCN)</em> of your policy. At runtime the interpreter will instantiate the policy using <strong>Class.forName(FQCN)</strong> </td> <td>DEFAULT</td> </tr> <tr> - <td>cassandra.socket.connection.timeout.millisecs</td> + <td>`cassandra.socket.connection.timeout.millisecs`</td> <td>Cassandra socket default connection timeout in millisecs</td> <td>500</td> </tr> <tr> - <td>cassandra.socket.read.timeout.millisecs</td> + <td>`cassandra.socket.read.timeout.millisecs`</td> <td>Cassandra socket read timeout in millisecs</td> <td>12000</td> </tr> <tr> - <td>cassandra.socket.tcp.no_delay</td> + <td>`cassandra.socket.tcp.no_delay`</td> <td>Cassandra socket TCP no delay</td> <td>true</td> </tr> <tr> - <td>cassandra.speculative.execution.policy</td> + <td>`cassandra.speculative.execution.policy`</td> <td> Cassandra Speculative Execution Policy. - Default = NoSpeculativeExecutionPolicy.INSTANCE - To Specify your own policy, provide the <strong>fully qualify class name (FQCN)</strong> of your policy. + Default = `NoSpeculativeExecutionPolicy.INSTANCE` + To specify your own policy, provide the <em>fully qualified class name (FQCN)</em> of your policy. 
At runtime the interpreter will instantiate the policy using <strong>Class.forName(FQCN)</strong> </td> <td>DEFAULT</td> </tr> <tr> - <td>cassandra.ssl.enabled</td> + <td>`cassandra.ssl.enabled`</td> <td> Enable support for connecting to the Cassandra configured with SSL. To connect to Cassandra configured with SSL use <strong>true</strong> @@ -798,14 +795,14 @@ Below are the configuration parameters and their default value. <td>false</td> </tr> <tr> - <td>cassandra.ssl.truststore.path</td> + <td>`cassandra.ssl.truststore.path`</td> <td> Filepath for the truststore file to use for connection to Cassandra with SSL. </td> <td></td> </tr> <tr> - <td>cassandra.ssl.truststore.password</td> + <td>`cassandra.ssl.truststore.password`</td> <td> Password for the truststore file to use for connection to Cassandra with SSL. </td> http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/interpreter/groovy.md ---------------------------------------------------------------------- diff --git a/docs/interpreter/groovy.md b/docs/interpreter/groovy.md index f64cbde..679b5bc 100644 --- a/docs/interpreter/groovy.md +++ b/docs/interpreter/groovy.md @@ -91,6 +91,7 @@ g.table( * `String g.getProperty('PROPERTY_NAME')` + ```groovy g.PROPERTY_NAME g.'PROPERTY_NAME' http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/interpreter/hbase.md ---------------------------------------------------------------------- diff --git a/docs/interpreter/hbase.md b/docs/interpreter/hbase.md index 12e0517..fd6334a 100644 --- a/docs/interpreter/hbase.md +++ b/docs/interpreter/hbase.md @@ -70,9 +70,9 @@ mvn clean package -DskipTests -Phadoop-2.6 -Dhadoop.version=2.6.0 -P build-distr If you want to connect to HBase running on a cluster, you'll need to follow the next step. ### Export HBASE_HOME -In **conf/zeppelin-env.sh**, export `HBASE_HOME` environment variable with your HBase installation path. This ensures `hbase-site.xml` can be loaded. 
+In `conf/zeppelin-env.sh`, export `HBASE_HOME` environment variable with your HBase installation path. This ensures `hbase-site.xml` can be loaded. -for example +For example ```bash export HBASE_HOME=/usr/lib/hbase http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/interpreter/ignite.md ---------------------------------------------------------------------- diff --git a/docs/interpreter/ignite.md b/docs/interpreter/ignite.md index 0b4e27b..49e432f 100644 --- a/docs/interpreter/ignite.md +++ b/docs/interpreter/ignite.md @@ -42,8 +42,8 @@ In order to use Ignite interpreters, you may install Apache Ignite in some simpl > **Tip. If you want to run Ignite examples on the cli not IDE, you can export > executable Jar file from IDE. Then run it by using below command.** -``` -$ nohup java -jar </path/to/your Jar file name> +```bash +nohup java -jar </path/to/your Jar file name> ``` ## Configuring Ignite Interpreter @@ -96,7 +96,7 @@ In order to execute SQL query, use ` %ignite.ignitesql ` prefix. <br> Supposing you are running `org.apache.ignite.examples.streaming.wordcount.StreamWords`, then you can use "words" cache( Of course you have to specify this cache name to the Ignite interpreter setting section `ignite.jdbc.url` of Zeppelin ). For example, you can select top 10 words in the words cache using the following query -``` +```sql %ignite.ignitesql select _val, count(_val) as cnt from String group by _val order by cnt desc limit 10 ``` @@ -105,7 +105,7 @@ select _val, count(_val) as cnt from String group by _val order by cnt desc limi As long as your Ignite version and Zeppelin Ignite version is same, you can also use scala code. Please check the Zeppelin Ignite version before you download your own Ignite. 
-``` +```scala %ignite import org.apache.ignite._ import org.apache.ignite.cache.affinity._ http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/interpreter/jdbc.md ---------------------------------------------------------------------- diff --git a/docs/interpreter/jdbc.md b/docs/interpreter/jdbc.md index aee9c4e..5a8ffc9 100644 --- a/docs/interpreter/jdbc.md +++ b/docs/interpreter/jdbc.md @@ -738,16 +738,18 @@ The JDBC interpreter also supports interpolation of `ZeppelinContext` objects in The following example shows one use of this facility: ####In Scala cell: -``` + +```scala z.put("country_code", "KR") // ... ``` ####In later JDBC cell: + ```sql %jdbc_interpreter_name - select * from patents_list where - priority_country = '{country_code}' and filing_date like '2015-%' +select * from patents_list where +priority_country = '{country_code}' and filing_date like '2015-%' ``` Object interpolation is disabled by default, and can be enabled for all instances of the JDBC interpreter by http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/interpreter/kylin.md ---------------------------------------------------------------------- diff --git a/docs/interpreter/kylin.md b/docs/interpreter/kylin.md index e1d27d9..1f2b0f3 100644 --- a/docs/interpreter/kylin.md +++ b/docs/interpreter/kylin.md @@ -75,7 +75,7 @@ To get start with Apache Kylin, please see [Apache Kylin Quickstart](https://kyl ## Using the Apache Kylin Interpreter In a paragraph, use `%kylin(project_name)` to select the **kylin** interpreter, **project name** and then input **sql**. If no project name defined, will use the default project name from the above configuration. 
-``` +```sql %kylin(learn_project) select count(*) from kylin_sales group by part_dt ``` http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/interpreter/lens.md ---------------------------------------------------------------------- diff --git a/docs/interpreter/lens.md b/docs/interpreter/lens.md index 4f07c71..cd00d1c 100644 --- a/docs/interpreter/lens.md +++ b/docs/interpreter/lens.md @@ -35,8 +35,8 @@ In order to use Lens interpreters, you may install Apache Lens in some simple st 2. Before running Lens, you have to set HIVE_HOME and HADOOP_HOME. If you want to get more information about this, please refer to [here](http://lens.apache.org/lenshome/install-and-run.html#Installation). Lens also provides Pseudo Distributed mode. [Lens pseudo-distributed setup](http://lens.apache.org/lenshome/pseudo-distributed-setup.html) is done by using [docker](https://www.docker.com/). Hive server and hadoop daemons are run as separate processes in lens pseudo-distributed setup. 3. Now, you can start lens server (or stop). -``` -./bin/lens-ctl start (or stop) +```bash +./bin/lens-ctl start # (or stop) ``` ## Configuring Lens Interpreter @@ -102,11 +102,11 @@ For more interpreter binding information see [here](../usage/interpreter/overvie ### How to use You can analyze your data by using [OLAP Cube](http://lens.apache.org/user/olap-cube.html) [QL](http://lens.apache.org/user/cli.html) which is a high level SQL like language to query and describe data sets organized in data cubes. You may experience OLAP Cube like this [Video tutorial](https://cwiki.apache.org/confluence/display/LENS/2015/07/13/20+Minute+video+demo+of+Apache+Lens+through+examples). -As you can see in this video, they are using Lens Client Shell(./bin/lens-cli.sh). All of these functions also can be used on Zeppelin by using Lens interpreter. +As you can see in this video, they are using Lens Client Shell(`./bin/lens-cli.sh`). 
All of these functions also can be used on Zeppelin by using Lens interpreter. -<li> Create and Use(Switch) Databases. +<li> Create and Use (Switch) Databases. -``` +```sql create database newDb ``` @@ -161,17 +161,21 @@ create fact your/path/to/lens/client/examples/resources/sales-raw-fact.xml <li> Add partitions to Dimtable and Fact. ``` -dimtable add single-partition --dimtable_name customer_table --storage_name local --path your/path/to/lens/client/examples/resources/customer-local-part.xml +dimtable add single-partition --dimtable_name customer_table --storage_name local +--path your/path/to/lens/client/examples/resources/customer-local-part.xml ``` ``` -fact add partitions --fact_name sales_raw_fact --storage_name local --path your/path/to/lens/client/examples/resources/sales-raw-local-parts.xml +fact add partitions --fact_name sales_raw_fact --storage_name local +--path your/path/to/lens/client/examples/resources/sales-raw-local-parts.xml ``` <li> Now, you can run queries on cubes. 
``` -query execute cube select customer_city_name, product_details.description, product_details.category, product_details.color, store_sales from sales where time_range_in(delivery_time, '2015-04-11-00', '2015-04-13-00') +query execute cube select customer_city_name, product_details.description, +product_details.category, product_details.color, store_sales from sales +where time_range_in(delivery_time, '2015-04-11-00', '2015-04-13-00') ``` ![Lens Query Result]({{BASE_PATH}}/assets/themes/zeppelin/img/docs-img/lens-result.png) http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/interpreter/livy.md ---------------------------------------------------------------------- diff --git a/docs/interpreter/livy.md b/docs/interpreter/livy.md index e4784d4..954eb8c 100644 --- a/docs/interpreter/livy.md +++ b/docs/interpreter/livy.md @@ -177,7 +177,7 @@ Basically, you can use **spark** -``` +```scala %livy.spark sc.version ``` @@ -185,14 +185,14 @@ sc.version **pyspark** -``` +```python %livy.pyspark print "1" ``` **sparkR** -``` +```r %livy.sparkr hello <- function( name ) { sprintf( "Hello, %s", name ); @@ -209,7 +209,8 @@ This is particularly useful when multi users are sharing a Notebook server. ## Apply Zeppelin Dynamic Forms You can leverage [Zeppelin Dynamic Form](../usage/dynamic_form/intro.html). Form templates is only avalible for livy sql interpreter. 
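[Editor's note: besides the plain text-input dynamic form, Zeppelin's select-type form syntax (`${name=default,option1|option2}`) also works in `%livy.sql` paragraphs. A hedged sketch — the `products` table and its `type` values are assumptions, not part of the Livy docs:

```sql
%livy.sql
select * from products where type = "${type=fruit,fruit|vegetable}"
```

This renders a drop-down in the paragraph instead of a free-text field.]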
-``` + +```sql %livy.sql select * from products where ${product_id=1} ``` http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/interpreter/mahout.md ---------------------------------------------------------------------- diff --git a/docs/interpreter/mahout.md b/docs/interpreter/mahout.md index c3b4146..0b1d529 100644 --- a/docs/interpreter/mahout.md +++ b/docs/interpreter/mahout.md @@ -29,6 +29,7 @@ Apache Mahout is a collection of packages that enable machine learning and matri ### Easy Installation To quickly and easily get up and running using Apache Mahout, run the following command from the top-level directory of the Zeppelin install: + ```bash python scripts/mahout/add_mahout.py ``` @@ -39,34 +40,34 @@ This will create the `%sparkMahout` and `%flinkMahout` interpreters, and restart The `add_mahout.py` script contains several command line arguments for advanced users. -<table class="table-configuration"> + <table class="table-configuration"> <tr> <th>Argument</th> <th>Description</th> <th>Example</th> </tr> <tr> - <td>--zeppelin_home</td> + <td>`--zeppelin_home`</td> <td>This is the path to the Zeppelin installation. This flag is not needed if the script is run from the top-level installation directory or from the `zeppelin/scripts/mahout` directory.</td> - <td>/path/to/zeppelin</td> + <td>`/path/to/zeppelin`</td> </tr> <tr> - <td>--mahout_home</td> + <td>`--mahout_home`</td> <td>If the user has already installed Mahout, this flag can set the path to `MAHOUT_HOME`. If this is set, downloading Mahout will be skipped.</td> - <td>/path/to/mahout_home</td> + <td>`/path/to/mahout_home`</td> </tr> <tr> - <td>--restart_later</td> - <td>Restarting is necessary for updates to take effect. By default the script will restart Zeppelin for you- restart will be skipped if this flag is set.</td> + <td>`--restart_later`</td> + <td>Restarting is necessary for updates to take effect. By default the script will restart Zeppelin for you. 
Restart will be skipped if this flag is set.</td> <td>NA</td> </tr> <tr> - <td>--force_download</td> + <td>`--force_download`</td> <td>This flag will force the script to re-download the binary even if it already exists. This is useful for previously failed downloads.</td> <td>NA</td> </tr> <tr> - <td>--overwrite_existing</td> + <td>`--overwrite_existing`</td> <td>This flag will force the script to overwrite existing `%sparkMahout` and `%flinkMahout` interpreters. Useful when you want to just start over.</td> <td>NA</td> </tr> @@ -165,6 +166,7 @@ Resource Pools are a powerful Zeppelin feature that lets us share information be ### Setting up a Resource Pool in Flink In Spark based interpreters resource pools are accessed via the ZeppelinContext API. To put and get things from the resource pool one can be done simple + ```scala val myVal = 1 z.put("foo", myVal) http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/interpreter/markdown.md ---------------------------------------------------------------------- diff --git a/docs/interpreter/markdown.md b/docs/interpreter/markdown.md index d5581d9..609f204 100644 --- a/docs/interpreter/markdown.md +++ b/docs/interpreter/markdown.md @@ -71,7 +71,6 @@ For more information, please see [Mathematical Expression](../usage/display_syst ### Markdown4j Parser -Since pegdown parser is more accurate and provides much more markdown syntax -`markdown4j` option might be removed later. But keep this parser for the backward compatibility. +Since the `pegdown` parser is more accurate and supports much more markdown syntax, the `markdown4j` option might be removed later. It is kept for backward compatibility. 
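[Editor's note: for the Markdown parser section above, a reference paragraph may help; the content is purely illustrative:

```
%md
### Hello Zeppelin

Text in a `%md` paragraph is rendered to HTML, by default using the **pegdown** parser.
```
]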
http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/interpreter/neo4j.md ---------------------------------------------------------------------- diff --git a/docs/interpreter/neo4j.md b/docs/interpreter/neo4j.md index 37f1f8c..1b14127 100644 --- a/docs/interpreter/neo4j.md +++ b/docs/interpreter/neo4j.md @@ -75,7 +75,7 @@ In a notebook, to enable the **Neo4j** interpreter, click the **Gear** icon and In a paragraph, use `%neo4j` to select the Neo4j interpreter and then input the Cypher commands. For list of Cypher commands please refer to the official [Cyper Refcard](http://neo4j.com/docs/cypher-refcard/current/) -```bash +``` %neo4j //Sample the TrumpWorld dataset WITH @@ -92,7 +92,7 @@ The Neo4j interpreter leverages the [Network display system](../usage/display_sy This query: -```bash +``` %neo4j MATCH (vp:Person {name:"VLADIMIR PUTIN"}), (dt:Person {name:"DONALD J. TRUMP"}) MATCH path = allShortestPaths( (vp)-[*]-(dt) ) @@ -104,7 +104,7 @@ produces the following result_ ### Apply Zeppelin Dynamic Forms You can leverage [Zeppelin Dynamic Form](../usage/dynamic_form/intro.html) inside your queries. This query: -```bash +``` %neo4j MATCH (o:Organization)-[r]-() RETURN o.name, count(*), collect(distinct type(r)) AS types http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/interpreter/python.md ---------------------------------------------------------------------- diff --git a/docs/interpreter/python.md b/docs/interpreter/python.md index 1965fc9..94bffd0 100644 --- a/docs/interpreter/python.md +++ b/docs/interpreter/python.md @@ -70,34 +70,51 @@ The interpreter can use all modules already installed (with pip, easy_install... 
- get the Conda Infomation: - ```%python.conda info``` + ``` + %python.conda info + ``` - list the Conda environments: - ```%python.conda env list``` + ``` + %python.conda env list + ``` - create a conda enviornment: - ```%python.conda create --name [ENV NAME]``` + + ``` + %python.conda create --name [ENV NAME] + ``` - activate an environment (python interpreter will be restarted): - ```%python.conda activate [ENV NAME]``` + ``` + %python.conda activate [ENV NAME] + ``` - deactivate - ```%python.conda deactivate``` + ``` + %python.conda deactivate + ``` - get installed package list inside the current environment - ```%python.conda list``` + ``` + %python.conda list + ``` - install package - ```%python.conda install [PACKAGE NAME]``` + ``` + %python.conda install [PACKAGE NAME] + ``` - uninstall package - ```%python.conda uninstall [PACKAGE NAME]``` + ``` + %python.conda uninstall [PACKAGE NAME] + ``` ### Docker @@ -171,7 +188,8 @@ If Zeppelin cannot find the matplotlib backend files (which should usually be fo then the backend will automatically be set to agg, and the (otherwise deprecated) instructions below can be used for more limited inline plotting. 
If you are unable to load the inline backend, use `z.show(plt)`: - ```python + +```python %python import matplotlib.pyplot as plt plt.figure() http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/interpreter/r.md ---------------------------------------------------------------------- diff --git a/docs/interpreter/r.md b/docs/interpreter/r.md index 3c6a9a9..966dc1e 100644 --- a/docs/interpreter/r.md +++ b/docs/interpreter/r.md @@ -40,12 +40,30 @@ R -e "print(1+1)" To enjoy plots, install additional libraries with: -``` -+ devtools with `R -e "install.packages('devtools', repos = 'http://cran.us.r-project.org')"` -+ knitr with `R -e "install.packages('knitr', repos = 'http://cran.us.r-project.org')"` -+ ggplot2 with `R -e "install.packages('ggplot2', repos = 'http://cran.us.r-project.org')"` -+ Other vizualisation librairies: `R -e "install.packages(c('devtools','mplot', 'googleVis'), repos = 'http://cran.us.r-project.org'); require(devtools); install_github('ramnathv/rCharts')"` -``` ++ devtools with + + ```bash + R -e "install.packages('devtools', repos = 'http://cran.us.r-project.org')" + ``` + ++ knitr with + + ```bash + R -e "install.packages('knitr', repos = 'http://cran.us.r-project.org')" + ``` + ++ ggplot2 with + + ```bash + R -e "install.packages('ggplot2', repos = 'http://cran.us.r-project.org')" + ``` + ++ Other visualization libraries: + + ```bash + R -e "install.packages(c('devtools','mplot', 'googleVis'), repos = 'http://cran.us.r-project.org'); + require(devtools); install_github('ramnathv/rCharts')" + ``` We recommend you to also install the following optional R libraries for happy data analytics: http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/interpreter/sap.md ---------------------------------------------------------------------- diff --git a/docs/interpreter/sap.md b/docs/interpreter/sap.md index be05aee..0447958 100644 --- a/docs/interpreter/sap.md +++ b/docs/interpreter/sap.md @@ -98,7 +98,7 @@ If generated query 
contains promtps, then promtps will appear as dynamic form af Example query -``` +```sql %sap universe [Universe Name]; @@ -120,4 +120,4 @@ where and [Folder1].[Dimension4] is not null and [Folder1].[Dimension5] in ('Value1', 'Value2'); -``` \ No newline at end of file +``` http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/interpreter/scalding.md ---------------------------------------------------------------------- diff --git a/docs/interpreter/scalding.md b/docs/interpreter/scalding.md index f2e3461..02c5fb8 100644 --- a/docs/interpreter/scalding.md +++ b/docs/interpreter/scalding.md @@ -28,7 +28,7 @@ limitations under the License. ## Building the Scalding Interpreter You have to first build the Scalding interpreter by enable the **scalding** profile as follows: -``` +```bash mvn clean package -Pscalding -DskipTests ``` @@ -66,20 +66,19 @@ and directories with custom jar files you need for your scalding commands. **Set arguments to the scalding repl** -The default arguments are: "--local --repl" +The default arguments are: `--local --repl` -For hdfs mode you need to add: "--hdfs --repl" +For hdfs mode you need to add: `--hdfs --repl` -If you want to add custom jars, you need to add: -"-libjars directory/*:directory/*" +If you want to add custom jars, you need to add: `-libjars directory/*:directory/*` For reducer estimation, you need to add something like: -"-Dscalding.reducer.estimator.classes=com.twitter.scalding.reducer_estimation.InputSizeReducerEstimator" +`-Dscalding.reducer.estimator.classes=com.twitter.scalding.reducer_estimation.InputSizeReducerEstimator` **Set max.open.instances** If you want to control the maximum number of open interpreters, you have to select "scoped" interpreter for note -option and set max.open.instances argument. +option and set `max.open.instances` argument. ## Testing the Interpreter @@ -88,7 +87,7 @@ option and set max.open.instances argument. 
In example, by using the [Alice in Wonderland](https://gist.github.com/johnynek/a47699caa62f4f38a3e2) tutorial, we will count words (of course!), and plot a graph of the top 10 words in the book. -``` +```scala %scalding import scala.io.Source @@ -144,7 +143,7 @@ res4: com.twitter.scalding.Mode = Hdfs(true,Configuration: core-default.xml, cor **Test HDFS read** -``` +```scala val testfile = TypedPipe.from(TextLine("/user/x/testfile")) testfile.dump ``` @@ -153,7 +152,7 @@ This command should print the contents of the hdfs file /user/x/testfile. **Test map-reduce job** -``` +```scala val testfile = TypedPipe.from(TextLine("/user/x/testfile")) val a = testfile.groupAll.size.values a.toList http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/interpreter/shell.md ---------------------------------------------------------------------- diff --git a/docs/interpreter/shell.md b/docs/interpreter/shell.md index 9ab4036..d44a425 100644 --- a/docs/interpreter/shell.md +++ b/docs/interpreter/shell.md @@ -93,7 +93,8 @@ The shell interpreter also supports interpolation of `ZeppelinContext` objects i The following example shows one use of this facility: ####In Scala cell: -``` + +```scala z.put("dataFileName", "members-list-003.parquet") // ... val members = spark.read.parquet(z.get("dataFileName")) @@ -101,8 +102,10 @@ val members = spark.read.parquet(z.get("dataFileName")) ``` ####In later Shell cell: -``` -%sh rm -rf {dataFileName} + +```bash +%sh +rm -rf {dataFileName} ``` Object interpolation is disabled by default, and can be enabled (for the Shell interpreter) by http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/interpreter/spark.md ---------------------------------------------------------------------- diff --git a/docs/interpreter/spark.md b/docs/interpreter/spark.md index 51b7c9e..6775fbf 100644 --- a/docs/interpreter/spark.md +++ b/docs/interpreter/spark.md @@ -360,8 +360,10 @@ This is to make the server communicate with KDC. 3. 
Add the two properties below to Spark configuration (`[SPARK_HOME]/conf/spark-defaults.conf`): - spark.yarn.principal - spark.yarn.keytab + ``` + spark.yarn.principal + spark.yarn.keytab + ``` > **NOTE:** If you do not have permission to access for the above spark-defaults.conf file, optionally, you can add the above lines to the Spark Interpreter setting through the Interpreter tab in the Zeppelin UI. http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/setup/basics/how_to_build.md ---------------------------------------------------------------------- diff --git a/docs/setup/basics/how_to_build.md b/docs/setup/basics/how_to_build.md index f78c631..85a59ca 100644 --- a/docs/setup/basics/how_to_build.md +++ b/docs/setup/basics/how_to_build.md @@ -51,7 +51,7 @@ If you haven't installed Git and Maven yet, check the [Build requirements](#buil #### 1. Clone the Apache Zeppelin repository -``` +```bash git clone https://github.com/apache/zeppelin.git ``` @@ -60,7 +60,7 @@ git clone https://github.com/apache/zeppelin.git You can build Zeppelin with following maven command: -``` +```bash mvn clean package -DskipTests [Options] ``` @@ -248,7 +248,7 @@ plugin.frontend.yarnDownloadRoot # default https://github.com/yarnpkg/yarn/relea If you don't have requirements prepared, install it. (The installation method may vary according to your environment, example is for Ubuntu.) -``` +```bash sudo apt-get update sudo apt-get install git sudo apt-get install openjdk-7-jdk @@ -261,7 +261,8 @@ sudo apt-get install r-cran-evaluate ### Install maven -``` + +```bash wget http://www.eu.apache.org/dist/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz sudo tar -zxf apache-maven-3.3.9-bin.tar.gz -C /usr/local/ sudo ln -s /usr/local/apache-maven-3.3.9/bin/mvn /usr/local/bin/mvn @@ -280,7 +281,7 @@ If you're behind the proxy, you'll need to configure maven and npm to pass throu First of all, configure maven in your `~/.m2/settings.xml`. 
-``` +```xml <settings> <proxies> <proxy> @@ -309,7 +310,7 @@ First of all, configure maven in your `~/.m2/settings.xml`. Then, next commands will configure npm. -``` +```bash npm config set proxy http://localhost:3128 npm config set https-proxy http://localhost:3128 npm config set registry "http://registry.npmjs.org/" @@ -318,7 +319,7 @@ npm config set strict-ssl false Configure git as well -``` +```bash git config --global http.proxy http://localhost:3128 git config --global https.proxy http://localhost:3128 git config --global url."http://".insteadOf git:// http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/setup/deployment/cdh.md ---------------------------------------------------------------------- diff --git a/docs/setup/deployment/cdh.md b/docs/setup/deployment/cdh.md index 9fb508f..d35292e 100644 --- a/docs/setup/deployment/cdh.md +++ b/docs/setup/deployment/cdh.md @@ -29,14 +29,14 @@ limitations under the License. You can import the Docker image by pulling it from Cloudera Docker Hub. -``` +```bash docker pull cloudera/quickstart:latest ``` ### 2. Run docker -``` +```bash docker run -it \ -p 80:80 \ -p 4040:4040 \ @@ -75,7 +75,7 @@ To verify the application is running well, check the web UI for HDFS on `http:// ### 4. Configure Spark interpreter in Zeppelin Set following configurations to `conf/zeppelin-env.sh`. 
-``` +```bash export MASTER=yarn-client export HADOOP_CONF_DIR=[your_hadoop_conf_path] export SPARK_HOME=[your_spark_home_path] http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/setup/deployment/docker.md ---------------------------------------------------------------------- diff --git a/docs/setup/deployment/docker.md b/docs/setup/deployment/docker.md index c0cdb69..746986d 100644 --- a/docs/setup/deployment/docker.md +++ b/docs/setup/deployment/docker.md @@ -33,7 +33,7 @@ You need to [install docker](https://docs.docker.com/engine/installation/) on yo ### Running docker image -``` +```bash docker run -p 8080:8080 --rm --name zeppelin apache/zeppelin:<release-version> ``` @@ -41,7 +41,7 @@ docker run -p 8080:8080 --rm --name zeppelin apache/zeppelin:<release-version> If you want to specify `logs` and `notebook` dir, -``` +```bash docker run -p 8080:8080 --rm \ -v $PWD/logs:/logs \ -v $PWD/notebook:/notebook \ @@ -52,7 +52,7 @@ docker run -p 8080:8080 --rm \ ### Building dockerfile locally -``` +```bash cd $ZEPPELIN_HOME cd scripts/docker/zeppelin/bin http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/setup/deployment/flink_and_spark_cluster.md ---------------------------------------------------------------------- diff --git a/docs/setup/deployment/flink_and_spark_cluster.md b/docs/setup/deployment/flink_and_spark_cluster.md index 11188a4..5094840 100644 --- a/docs/setup/deployment/flink_and_spark_cluster.md +++ b/docs/setup/deployment/flink_and_spark_cluster.md @@ -20,7 +20,7 @@ limitations under the License. {% include JB/setup %} -# Install with flink and spark cluster +# Install with Flink and Spark cluster <div id="toc"></div> @@ -48,24 +48,24 @@ For git, openssh-server, and OpenJDK 7 we will be using the apt package manager. 
##### git From the command prompt: -``` +```bash sudo apt-get install git ``` ##### openssh-server -``` +```bash sudo apt-get install openssh-server ``` ##### OpenJDK 7 -``` +```bash sudo apt-get install openjdk-7-jdk openjdk-7-jre-lib ``` *A note for those using Ubuntu 16.04*: To install `openjdk-7` on Ubuntu 16.04, one must add a repository. [Source](http://askubuntu.com/questions/761127/ubuntu-16-04-and-openjdk-7) -``` bash +```bash sudo add-apt-repository ppa:openjdk-r/ppa sudo apt-get update sudo apt-get install openjdk-7-jdk openjdk-7-jre-lib @@ -76,26 +76,26 @@ Zeppelin requires maven version 3.x. The version available in the repositories Purge any existing versions of maven. -``` +```bash sudo apt-get purge maven maven2 ``` Download the maven 3.3.9 binary. -``` +```bash wget "http://www.us.apache.org/dist/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz" ``` Unarchive the binary and move to the `/usr/local` directory. -``` +```bash tar -zxvf apache-maven-3.3.9-bin.tar.gz sudo mv ./apache-maven-3.3.9 /usr/local ``` Create symbolic links in `/usr/bin`. -``` +```bash sudo ln -s /usr/local/apache-maven-3.3.9/bin/mvn /usr/bin/mvn ``` @@ -105,19 +105,19 @@ This provides a quick overview of Zeppelin installation from source, however the From the command prompt: Clone Zeppelin. -``` +```bash git clone https://github.com/apache/zeppelin.git ``` Enter the Zeppelin root directory. -``` +```bash cd zeppelin ``` Package Zeppelin. -``` +```bash mvn clean package -DskipTests -Pspark-1.6 -Dflink.version=1.1.3 -Pscala-2.10 ``` @@ -145,7 +145,7 @@ As long as you didn't edit any code, it is unlikely the build is failing because Start the Zeppelin daemon. -``` +```bash bin/zeppelin-daemon.sh start ``` @@ -158,9 +158,7 @@ See the [Zeppelin tutorial](../../quickstart/tutorial.html) for basic Zeppelin u ##### Flink Test Create a new notebook named "Flink Test" and copy and paste the following code. - ```scala - %flink // let Zeppelin know what interpreter to use. 
val text = benv.fromElements("In the time of chimpanzees, I was a monkey", // some lines of text to analyze @@ -238,7 +236,7 @@ Run the code to make sure the built-in Zeppelin Flink interpreter is working pro Finally, stop the Zeppelin daemon. From the command prompt run: -``` +```bash bin/zeppelin-daemon.sh stop ``` @@ -273,7 +271,7 @@ See the [Flink Installation guide](https://github.com/apache/flink/blob/master/R Return to the directory where you have been downloading, this tutorial assumes that is `$HOME`. Clone Flink, check out release-1.1.3-rc2, and build. -``` +```bash cd $HOME git clone https://github.com/apache/flink.git cd flink @@ -283,7 +281,7 @@ mvn clean install -DskipTests Start the Flink Cluster in stand-alone mode -``` +```bash build-target/bin/start-cluster.sh ``` @@ -297,14 +295,16 @@ In a browser, navigate to http://`yourip`:8082 to see the Flink Web-UI. Click o If no task managers are present, restart the Flink cluster with the following commands: (if binaries) -``` + +```bash flink-1.1.3/bin/stop-cluster.sh flink-1.1.3/bin/start-cluster.sh ``` (if built from source) -``` + +```bash build-target/bin/stop-cluster.sh build-target/bin/start-cluster.sh ``` @@ -339,13 +339,13 @@ Return to the directory where you have been downloading, this tutorial assumes t the time of writing. You are free to check out other version, just make sure you build Zeppelin against the correct version of Spark. However if you use Spark 2.0, the word count example will need to be changed as Spark 2.0 is not compatible with the following examples. -``` +```bash cd $HOME ``` Clone, check out, and build Spark version 1.6.x. -``` +```bash git clone https://github.com/apache/spark.git cd spark git checkout branch-1.6 @@ -362,7 +362,7 @@ cd $HOME Start the Spark cluster in stand alone mode, specifying the webui-port as some port other than 8080 (the webui-port of Zeppelin). -``` +```bash spark/sbin/start-master.sh --webui-port 8082 ``` **Note:** Why `--webui-port 8082`? 
There is a digression toward the end of this document that explains this. @@ -375,13 +375,13 @@ Toward the top of the page there will be a *URL*: spark://`yourhost`:7077. Note Start the slave using the URI from the Spark master WebUI: -``` +```bash spark/sbin/start-slave.sh spark://yourhostname:7077 ``` Return to the root directory and start the Zeppelin daemon. -``` +```bash cd $HOME zeppelin/bin/zeppelin-daemon.sh start http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/setup/deployment/spark_cluster_mode.md ---------------------------------------------------------------------- diff --git a/docs/setup/deployment/spark_cluster_mode.md b/docs/setup/deployment/spark_cluster_mode.md index 7abaecd..94102bf 100644 --- a/docs/setup/deployment/spark_cluster_mode.md +++ b/docs/setup/deployment/spark_cluster_mode.md @@ -38,14 +38,14 @@ You can simply set up Spark standalone environment with below steps. ### 1. Build Docker file You can find docker script files under `scripts/docker/spark-cluster-managers`. -``` +```bash cd $ZEPPELIN_HOME/scripts/docker/spark-cluster-managers/spark_standalone docker build -t "spark_standalone" . ``` ### 2. Run docker -``` +```bash docker run -it \ -p 8080:8080 \ -p 7077:7077 \ @@ -70,7 +70,7 @@ After running single paragraph with Spark interpreter in Zeppelin, browse `https You can also simply verify that Spark is running well in Docker with below command. -``` +```bash ps -ef | grep spark ``` @@ -83,14 +83,14 @@ You can simply set up [Spark on YARN](http://spark.apache.org/docs/latest/runnin ### 1. Build Docker file You can find docker script files under `scripts/docker/spark-cluster-managers`. -``` +```bash cd $ZEPPELIN_HOME/scripts/docker/spark-cluster-managers/spark_yarn_cluster docker build -t "spark_yarn" . ``` ### 2. 
Run docker -``` +```bash docker run -it \ -p 5000:5000 \ -p 9000:9000 \ @@ -120,7 +120,7 @@ Note that `sparkmaster` hostname used here to run docker container should be def You can simply verify the processes of Spark and YARN are running well in Docker with below command. -``` +```bash ps -ef ``` @@ -129,7 +129,7 @@ You can also check each application web UI for HDFS on `http://<hostname>:50070/ ### 4. Configure Spark interpreter in Zeppelin Set following configurations to `conf/zeppelin-env.sh`. -``` +```bash export MASTER=yarn-client export HADOOP_CONF_DIR=[your_hadoop_conf_path] export SPARK_HOME=[your_spark_home_path] @@ -154,7 +154,7 @@ You can simply set up [Spark on Mesos](http://spark.apache.org/docs/latest/runni ### 1. Build Docker file -``` +```bash cd $ZEPPELIN_HOME/scripts/docker/spark-cluster-managers/spark_mesos docker build -t "spark_mesos" . ``` @@ -162,7 +162,7 @@ docker build -t "spark_mesos" . ### 2. Run docker -``` +```bash docker run --net=host -it \ -p 8080:8080 \ -p 7077:7077 \ @@ -183,7 +183,7 @@ Note that `sparkmaster` hostname used here to run docker container should be def You can simply verify the processes of Spark and Mesos are running well in Docker with below command. -``` +```bash ps -ef ``` @@ -192,7 +192,7 @@ You can also check each application web UI for Mesos on `http://<hostname>:5050/ ### 4. 
Configure Spark interpreter in Zeppelin -``` +```bash export MASTER=mesos://127.0.1.1:5050 export MESOS_NATIVE_JAVA_LIBRARY=[PATH OF libmesos.so] export SPARK_HOME=[PATH OF SPARK HOME] @@ -234,4 +234,4 @@ W0103 20:17:24.040252 339 sched.cpp:736] Ignoring framework registered message W0103 20:17:26.150250 339 sched.cpp:736] Ignoring framework registered message because it was sentfrom 'master@127.0.0.1:5050' instead of the leading master 'master@127.0.1.1:5050' W0103 20:17:26.737604 339 sched.cpp:736] Ignoring framework registered message because it was sentfrom 'master@127.0.0.1:5050' instead of the leading master 'master@127.0.1.1:5050' W0103 20:17:35.241714 336 sched.cpp:736] Ignoring framework registered message because it was sentfrom 'master@127.0.0.1:5050' instead of the leading master 'master@127.0.1.1:5050' -``` \ No newline at end of file +``` http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/setup/deployment/virtual_machine.md ---------------------------------------------------------------------- diff --git a/docs/setup/deployment/virtual_machine.md b/docs/setup/deployment/virtual_machine.md index 21beba6..a50d1a2 100644 --- a/docs/setup/deployment/virtual_machine.md +++ b/docs/setup/deployment/virtual_machine.md @@ -25,9 +25,7 @@ limitations under the License. ## Overview -Apache Zeppelin distribution includes a script directory - - `scripts/vagrant/zeppelin-dev` +Apache Zeppelin distribution includes a script directory `scripts/vagrant/zeppelin-dev` This script creates a virtual machine that launches a repeatable, known set of core dependencies required for developing Zeppelin. It can also be used to run an existing Zeppelin build if you don't plan to build from source. For PySpark users, this script includes several helpful [Python Libraries](#python-extras). @@ -44,7 +42,7 @@ If you are running Windows and don't yet have python installed, [install Python 1. 
Download and Install Vagrant: [Vagrant Downloads](http://www.vagrantup.com/downloads.html) 2. Install Ansible: [Ansible Python pip install](http://docs.ansible.com/ansible/intro_installation.html#latest-releases-via-pip) - ``` + ```bash sudo easy_install pip sudo pip install ansible ansible --version @@ -58,7 +56,7 @@ That's it! You can now run `vagrant ssh` and this will place you into the guest If you don't wish to build Zeppelin from scratch, run the z-manager installer script while running in the guest VM: -``` +```bash curl -fsSL https://raw.githubusercontent.com/NFLabs/z-manager/master/zeppelin-installer.sh | bash ``` @@ -67,7 +65,7 @@ curl -fsSL https://raw.githubusercontent.com/NFLabs/z-manager/master/zeppelin-in You can now -``` +```bash git clone git://git.apache.org/zeppelin.git ``` @@ -87,8 +85,8 @@ By default, Vagrant will share your project directory (the directory with the Va Running the following commands in the guest machine should display these expected versions: -`node --version` should report *v0.12.7* -`mvn --version` should report *Apache Maven 3.3.9* and *Java version: 1.7.0_85* +* `node --version` should report *v0.12.7* +* `mvn --version` should report *Apache Maven 3.3.9* and *Java version: 1.7.0_85* The virtual machine consists of: @@ -108,7 +106,7 @@ The virtual machine consists of: This assumes you've already cloned the project either on the host machine in the zeppelin-dev directory (to be shared with the guest machine) or cloned directly into a directory while running inside the guest machine. The following build steps will also include Python and R support via PySpark and SparkR: -``` +```bash cd /zeppelin mvn clean package -Pspark-1.6 -Phadoop-2.4 -DskipTests ./bin/zeppelin-daemon.sh start @@ -189,4 +187,4 @@ show(plt) ### R Extras With Zeppelin running, an R Tutorial notebook will be available. The R packages required to run the examples and graphs in this tutorial notebook were installed by this virtual machine.
-The installed R Packages include: Knitr, devtools, repr, rCharts, ggplot2, googleVis, mplot, htmltools, base64enc, data.table +The installed R Packages include: `knitr`, `devtools`, `repr`, `rCharts`, `ggplot2`, `googleVis`, `mplot`, `htmltools`, `base64enc`, `data.table`. http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/setup/deployment/yarn_install.md ---------------------------------------------------------------------- diff --git a/docs/setup/deployment/yarn_install.md b/docs/setup/deployment/yarn_install.md index fc46bc2..b596799 100644 --- a/docs/setup/deployment/yarn_install.md +++ b/docs/setup/deployment/yarn_install.md @@ -105,7 +105,7 @@ hdp-select status hadoop-client | sed 's/hadoop-client - \(.*\)/\1/' ## Start/Stop ### Start Zeppelin -``` +```bash cd /home/zeppelin/zeppelin bin/zeppelin-daemon.sh start ``` @@ -113,7 +113,7 @@ After successful start, visit http://[zeppelin-server-host-name]:8080 with your ### Stop Zeppelin -``` +```bash bin/zeppelin-daemon.sh stop ``` http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/setup/operation/configuration.md ---------------------------------------------------------------------- diff --git a/docs/setup/operation/configuration.md b/docs/setup/operation/configuration.md index ed4e1f2..f2e356d 100644 --- a/docs/setup/operation/configuration.md +++ b/docs/setup/operation/configuration.md @@ -368,8 +368,9 @@ A condensed example can be found in the top answer to this [StackOverflow post]( The keystore holds the private key and certificate on the server end. The truststore holds the trusted client certificates. Be sure that the path and password for these two stores are correctly configured in the password fields below. They can be obfuscated using the Jetty password tool. After Maven pulls in all the dependencies to build Zeppelin, one of the Jetty jars contains the Password tool. Invoke this command from the Zeppelin home build directory with the appropriate version, user, and password.
-``` -java -cp ./zeppelin-server/target/lib/jetty-all-server-<version>.jar org.eclipse.jetty.util.security.Password <user> <password> +```bash +java -cp ./zeppelin-server/target/lib/jetty-all-server-<version>.jar \ +org.eclipse.jetty.util.security.Password <user> <password> ``` If you are using a self-signed certificate, a certificate signed by an untrusted CA, or if client authentication is enabled, then the client must have a browser create exceptions for both the normal HTTPS port and WebSocket port. This can be done by trying to establish an HTTPS connection to both ports in a browser (e.g. if the ports are 443 and 8443, then visit https://127.0.0.1:443 and https://127.0.0.1:8443). This step can be skipped if the server certificate is signed by a trusted CA and client auth is disabled. @@ -378,7 +379,7 @@ If you are using a self-signed, a certificate signed by an untrusted CA, or if c The following properties need to be updated in the `zeppelin-site.xml` in order to enable server side SSL. -``` +```xml <property> <name>zeppelin.server.ssl.port</name> <value>8443</value> @@ -421,7 +422,7 @@ The following properties needs to be updated in the `zeppelin-site.xml` in order The following properties need to be updated in the `zeppelin-site.xml` in order to enable client side certificate authentication. -``` +```xml <property> <name>zeppelin.server.ssl.port</name> <value>8443</value> @@ -461,7 +462,7 @@ Please notice that passwords will be stored in *plain text* by default.
To encry You can generate an appropriate encryption key any way you'd like - for instance, by using the openssl tool: -``` +```bash openssl enc -aes-128-cbc -k secret -P -md sha1 ``` @@ -476,7 +477,7 @@ The Password tool documentation can be found [here](http://www.eclipse.org/jetty After using the tool: -``` +```bash java -cp $ZEPPELIN_HOME/zeppelin-server/target/lib/jetty-util-9.2.15.v20160210.jar \ org.eclipse.jetty.util.security.Password \ password @@ -489,7 +490,7 @@ MD5:5f4dcc3b5aa765d61d8327deb882cf99 update your configuration with the obfuscated password : -``` +```xml <property> <name>zeppelin.ssl.keystore.password</name> <value>OBF:1v2j1uum1xtv1zej1zer1xtn1uvk1v1v</value> http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/setup/operation/trouble_shooting.md ---------------------------------------------------------------------- diff --git a/docs/setup/operation/trouble_shooting.md b/docs/setup/operation/trouble_shooting.md index f16dc8f..5857bd8 100644 --- a/docs/setup/operation/trouble_shooting.md +++ b/docs/setup/operation/trouble_shooting.md @@ -19,7 +19,7 @@ limitations under the License. --> {% include JB/setup %} -# Trouble Shooting +# Troubleshooting <div id="toc"></div> http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/setup/security/authentication_nginx.md ---------------------------------------------------------------------- diff --git a/docs/setup/security/authentication_nginx.md b/docs/setup/security/authentication_nginx.md index be4875a..705a21d 100644 --- a/docs/setup/security/authentication_nginx.md +++ b/docs/setup/security/authentication_nginx.md @@ -38,7 +38,7 @@ This instruction based on Ubuntu 14.04 LTS but may work with other OS with few c You can install the NGINX server on the same box where Zeppelin is installed, or on a separate box dedicated to serve as a proxy server. - ``` + ```bash $ apt-get install nginx ``` > **NOTE :** On pre 1.3.13 versions of NGINX, Proxy for Websocket may not fully work.
Please use the latest version of NGINX. See: [NGINX documentation](https://www.nginx.com/blog/websocket-nginx/). @@ -47,7 +47,7 @@ This instruction based on Ubuntu 14.04 LTS but may work with other OS with few c In most cases, the NGINX configuration is located under `/etc/nginx/sites-available`. Create your own configuration or add your existing configuration at `/etc/nginx/sites-available`. - ``` + ```bash $ cd /etc/nginx/sites-available $ touch my-zeppelin-auth-setting ``` @@ -95,7 +95,7 @@ This instruction based on Ubuntu 14.04 LTS but may work with other OS with few c Then make a symbolic link to this file from `/etc/nginx/sites-enabled/` to enable the configuration above when NGINX reloads. - ``` + ```bash $ ln -s /etc/nginx/sites-available/my-zeppelin-auth-setting /etc/nginx/sites-enabled/my-zeppelin-auth-setting ``` @@ -103,17 +103,17 @@ This instruction based on Ubuntu 14.04 LTS but may work with other OS with few c Now you need to set up a `.htpasswd` file to serve the list of authenticated user credentials for the NGINX server. - ``` + ```bash $ cd /etc/nginx $ htpasswd -c htpasswd [YOUR-ID] - $ NEW passwd: [YOUR-PASSWORD] - $ RE-type new passwd: [YOUR-PASSWORD-AGAIN] + NEW passwd: [YOUR-PASSWORD] + RE-type new passwd: [YOUR-PASSWORD-AGAIN] ``` Or you can use your own Apache `.htpasswd` files in another location by setting the `auth_basic_user_file` property. Restart the NGINX server. - ``` + ```bash $ service nginx restart ``` Then check that HTTP Basic Authentication works in the browser. If you see the regular basic auth popup and are able to log in with the credentials you entered into `.htpasswd`, you are good to go.
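If the `htpasswd` tool (from `apache2-utils`) is not installed, an entry in the same APR1 format can be produced with `openssl` instead. A sketch, assuming `openssl` is on the PATH; the user name and password below are placeholders:

```shell
# Generate an htpasswd-compatible (APR1-hashed) entry without apache2-utils.
USER=zeppelin          # placeholder user name
PASS=changeme          # placeholder password
printf '%s:%s\n' "$USER" "$(openssl passwd -apr1 "$PASS")" > htpasswd
# The stored line is "zeppelin:$apr1$..." — the format NGINX basic auth accepts.
cut -d: -f1 htpasswd
# → zeppelin
```

Point `auth_basic_user_file` at the generated file just as you would at one created by `htpasswd -c`.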
http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/setup/security/http_security_headers.md ---------------------------------------------------------------------- diff --git a/docs/setup/security/http_security_headers.md b/docs/setup/security/http_security_headers.md index 1c55d18..ad4aeef 100644 --- a/docs/setup/security/http_security_headers.md +++ b/docs/setup/security/http_security_headers.md @@ -32,7 +32,7 @@ It also prevents MITM attack by not allowing User to override the invalid certif The following property needs to be updated in the zeppelin-site.xml in order to enable HSTS. You can choose appropriate value for "max-age". -``` +```xml <property> <name>zeppelin.server.strict.transport</name> <value>max-age=631138519</value> @@ -55,7 +55,7 @@ The HTTP X-XSS-Protection response header is a feature of Internet Explorer, Chr The following property needs to be updated in the zeppelin-site.xml in order to set X-XSS-PROTECTION header. -``` +```xml <property> <name>zeppelin.server.xxss.protection</name> <value>1; mode=block</value> @@ -78,7 +78,7 @@ The X-Frame-Options HTTP response header can indicate browser to avoid clickjack The following property needs to be updated in the zeppelin-site.xml in order to set X-Frame-Options header. -``` +```xml <property> <name>zeppelin.server.xframe.options</name> <value>SAMEORIGIN</value> @@ -89,9 +89,9 @@ The following property needs to be updated in the zeppelin-site.xml in order to You can choose appropriate value from below. -* DENY -* SAMEORIGIN -* ALLOW-FROM _uri_ +* `DENY` +* `SAMEORIGIN` +* `ALLOW-FROM uri` ## Setting up Server Header @@ -99,7 +99,7 @@ Security conscious organisations does not want to reveal the Application Server The following property needs to be updated in the zeppelin-site.xml in order to set Server header. 
-``` +```xml <property> <name>zeppelin.server.jetty.name</name> <value>Jetty(7.6.0.v20120127)</value> http://git-wip-us.apache.org/repos/asf/zeppelin/blob/68cb6761/docs/setup/security/notebook_authorization.md ---------------------------------------------------------------------- diff --git a/docs/setup/security/notebook_authorization.md b/docs/setup/security/notebook_authorization.md index fe0e27a..6410fe9 100644 --- a/docs/setup/security/notebook_authorization.md +++ b/docs/setup/security/notebook_authorization.md @@ -53,13 +53,13 @@ By default, owners and writers have **write** permission, owners, writers and ru ## Separate notebook workspaces (public vs. private) By default, the authorization rights allow other users to see a newly created note, meaning the workspace is `public`. This behavior is controllable and can be set through either the `ZEPPELIN_NOTEBOOK_PUBLIC` variable in `conf/zeppelin-env.sh` or the `zeppelin.notebook.public` property in `conf/zeppelin-site.xml`. Thus, in order to make a newly created note appear only in your `private` workspace by default, you can either set `ZEPPELIN_NOTEBOOK_PUBLIC` to `false` in your `conf/zeppelin-env.sh` as follows: -``` +```bash export ZEPPELIN_NOTEBOOK_PUBLIC="false" ``` or set the `zeppelin.notebook.public` property to `false` in `conf/zeppelin-site.xml` as follows: -``` +```xml <property> <name>zeppelin.notebook.public</name> <value>false</value>
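To double-check which workspace mode a configuration ends up with, the property can simply be grepped out of the site file. A sketch against a throwaway file — the file written here is a stand-in for a real `conf/zeppelin-site.xml`:

```shell
# Write a stand-in zeppelin-site.xml fragment and read the property back.
cat > zeppelin-site-demo.xml <<'EOF'
<property>
  <name>zeppelin.notebook.public</name>
  <value>false</value>
</property>
EOF
# Print the configured value line for zeppelin.notebook.public.
grep -A1 '<name>zeppelin.notebook.public</name>' zeppelin-site-demo.xml \
  | grep -o '<value>[^<]*</value>'
# → <value>false</value>
```

`<value>false</value>` here means newly created notes default to the private workspace, matching the `ZEPPELIN_NOTEBOOK_PUBLIC="false"` variant above.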