This is an automated email from the ASF dual-hosted git repository.
zjffdu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/zeppelin.git
The following commit(s) were added to refs/heads/master by this push:
new e5c1cc5 [ZEPPELIN-4685]. No property description in the new created
interpreter setting
e5c1cc5 is described below
commit e5c1cc52d751eac2b7a571ff53a0f2920aa5c149
Author: Jeff Zhang <[email protected]>
AuthorDate: Tue Mar 17 17:18:53 2020 +0800
[ZEPPELIN-4685]. No property description in the new created interpreter
setting
### What is this PR for?
It is a minor PR which fixes the issue of missing property descriptions in the newly created interpreter setting. The root cause is in the frontend. Besides that I also improve the flink doc and `interpreter-setting.json`
### What type of PR is it?
[Bug Fix]
### Todos
* [ ] - Task
### What is the Jira issue?
* https://issues.apache.org/jira/browse/ZEPPELIN-4685
### How should this be tested?
* CI pass
### Screenshots (if appropriate)
### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No
Author: Jeff Zhang <[email protected]>
Closes #3690 from zjffdu/ZEPPELIN-4685 and squashes the following commits:
9f3c761d6 [Jeff Zhang] [ZEPPELIN-4685]. No property description in the new
created interpreter setting
---
docs/interpreter/flink.md | 79 ++++++++++++----------
flink/src/main/resources/interpreter-setting.json | 16 ++++-
.../src/app/interpreter/interpreter.controller.js | 1 +
3 files changed, 59 insertions(+), 37 deletions(-)
diff --git a/docs/interpreter/flink.md b/docs/interpreter/flink.md
index ffef1e7..6d628cf 100644
--- a/docs/interpreter/flink.md
+++ b/docs/interpreter/flink.md
@@ -62,6 +62,11 @@ Apache Flink is supported in Zeppelin with Flink interpreter
group which consist
</tr>
</table>
+## Prerequisites
+
+* Download Flink 1.10 for scala 2.11 (only scala-2.11 is supported; scala-2.12 is not supported yet in Zeppelin)
+* Download [flink-hadoop-shaded](https://repo1.maven.org/maven2/org/apache/flink/flink-shaded-hadoop-2/2.8.3-10.0/flink-shaded-hadoop-2-2.8.3-10.0.jar) and put it under the lib folder of flink (the flink interpreter needs it to support yarn mode)
+
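The prerequisite steps above can be sketched as a shell snippet. This is a minimal sketch; `FLINK_HOME` and the install path are assumptions for illustration, not part of the doc.

```shell
# Minimal sketch of the prerequisites above. FLINK_HOME is an assumed
# variable pointing at a Flink 1.10 / scala 2.11 installation.
FLINK_HOME=${FLINK_HOME:-/opt/flink-1.10.0}
SHADED_JAR=flink-shaded-hadoop-2-2.8.3-10.0.jar
SHADED_URL=https://repo1.maven.org/maven2/org/apache/flink/flink-shaded-hadoop-2/2.8.3-10.0/$SHADED_JAR

# Put the shaded hadoop jar under flink's lib folder; the flink
# interpreter needs it to support yarn mode.
curl -fsSL -o "$FLINK_HOME/lib/$SHADED_JAR" "$SHADED_URL"
```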
## Configuration
The Flink interpreter can be configured with properties provided by Zeppelin (as shown in the following table).
You can also set other flink properties which are not listed in the table. For
a list of additional properties, refer to [Flink Available
Properties](https://ci.apache.org/projects/flink/flink-docs-master/ops/config.html).
@@ -79,12 +84,12 @@ You can also set other flink properties which are not
listed in the table. For a
<tr>
<td>`HADOOP_CONF_DIR`</td>
<td></td>
- <td>location of hadoop conf, this is must be set if running in yarn
mode</td>
+ <td>Location of hadoop conf, must be set if running in yarn mode</td>
</tr>
<tr>
<td>`HIVE_CONF_DIR`</td>
<td></td>
- <td>location of hive conf, this is must be set if you want to connect to
hive metastore</td>
+ <td>Location of hive conf, must be set if you want to connect to the hive metastore</td>
</tr>
<tr>
<td>flink.execution.mode</td>
@@ -94,12 +99,12 @@ You can also set other flink properties which are not
listed in the table. For a
<tr>
<td>flink.execution.remote.host</td>
<td></td>
- <td>jobmanager hostname if it is remote mode</td>
+ <td>Host name of running JobManager. Only used for remote mode</td>
</tr>
<tr>
<td>flink.execution.remote.port</td>
<td></td>
- <td>jobmanager port if it is remote mode</td>
+ <td>Port of running JobManager. Only used for remote mode</td>
</tr>
<tr>
<td>flink.jm.memory</td>
@@ -119,7 +124,7 @@ You can also set other flink properties which are not
listed in the table. For a
<tr>
<td>local.number-taskmanager</td>
<td>4</td>
- <td>Total number of TaskManagers in Local mode</td>
+ <td>Total number of TaskManagers in local mode</td>
</tr>
<tr>
<td>flink.yarn.appName</td>
@@ -139,17 +144,17 @@ You can also set other flink properties which are not
listed in the table. For a
<tr>
<td>flink.udf.jars</td>
<td></td>
- <td>udf jars (comma separated), zeppelin will register udf in this jar
automatically for user. The udf name is the class name.</td>
+ <td>Flink udf jars (comma separated); Zeppelin will register the udfs in these jars automatically for the user. The udf name is the class name.</td>
</tr>
<tr>
<td>flink.execution.jars</td>
<td></td>
- <td>additional user jars (comma separated)</td>
+ <td>Additional user jars (comma separated)</td>
</tr>
<tr>
<td>flink.execution.packages</td>
<td></td>
- <td>additional user packages (comma separated), e.g.
org.apache.flink:flink-connector-kafka_2.11:1.10,org.apache.flink:flink-connector-kafka-base_2.11:1.10,org.apache.flink:flink-json:1.10</td>
+ <td>Additional user packages (comma separated), e.g. org.apache.flink:flink-connector-kafka_2.11:1.10.0,org.apache.flink:flink-connector-kafka-base_2.11:1.10.0,org.apache.flink:flink-json:1.10.0</td>
</tr>
<tr>
<td>zeppelin.flink.concurrentBatchSql.max</td>
@@ -164,7 +169,7 @@ You can also set other flink properties which are not
listed in the table. For a
<tr>
<td>zeppelin.pyflink.python</td>
<td>python</td>
- <td>python binary executable for PyFlink</td>
+ <td>Python binary executable for PyFlink</td>
</tr>
<tr>
<td>table.exec.resource.default-parallelism</td>
@@ -174,17 +179,22 @@ You can also set other flink properties which are not
listed in the table. For a
<tr>
<td>zeppelin.flink.scala.color</td>
<td>true</td>
- <td>whether display scala shell output in colorful format</td>
+ <td>Whether to display scala shell output in colorful format</td>
</tr>
<tr>
<td>zeppelin.flink.enableHive</td>
<td>false</td>
- <td>whether enable hive</td>
+ <td>Whether to enable hive</td>
</tr>
<tr>
- <td>zeppelin.flink.printREPLOutput</td>
- <td>true</td>
- <td>Print REPL output</td>
+ <td>zeppelin.flink.hive.version</td>
+ <td>2.3.4</td>
+ <td>Hive version that you would like to connect to</td>
</tr>
<tr>
<td>zeppelin.flink.maxResult</td>
@@ -192,12 +202,12 @@ You can also set other flink properties which are not
listed in the table. For a
<td>Max number of rows returned by sql interpreter</td>
</tr>
<tr>
- <td>flink.interpreter.close.shutdown_cluster</td>
+ <td>`flink.interpreter.close.shutdown_cluster`</td>
<td>true</td>
<td>Whether shutdown application when closing interpreter</td>
</tr>
<tr>
- <td>zeppelin.interpreter.close.cancel_job</td>
+ <td>`zeppelin.interpreter.close.cancel_job`</td>
<td>true</td>
<td>Whether cancel flink job when closing interpreter</td>
</tr>
@@ -240,18 +250,16 @@ because by default it is only 4 TM with 1 Slots which may
not be enough for some
### Run Flink in Remote Mode
-Running Flink in remote mode will connect to a flink standalone cluster,
besides specifying `flink.execution.mode` to be `remote`. You also need to
specify
-`flink.execution.remote.host` and `flink.execution.remote.port` to point to
Flink Job Manager.
+Running Flink in remote mode will connect to an existing flink cluster, which could be a standalone cluster or a yarn session cluster. Besides setting `flink.execution.mode` to `remote`, you also need to specify
+`flink.execution.remote.host` and `flink.execution.remote.port` to point to the flink job manager.
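As a sketch, the remote-mode settings could look like the following; the host and port values are illustrative assumptions, not defaults from the doc.

```shell
# Illustrative interpreter-setting values for remote mode:
#   flink.execution.mode        = remote
#   flink.execution.remote.host = jm-host.example.com
#   flink.execution.remote.port = 8081
# A quick way to confirm the job manager is reachable (assumed REST endpoint):
curl -s http://jm-host.example.com:8081/overview
```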
### Run Flink in Yarn Mode
-In order to run Flink in Yarn mode, you need to make the following settings:
+In order to run flink in Yarn mode, you need to make the following settings:
* Set `flink.execution.mode` to `yarn`
-* Set `HADOOP_CONF_DIR` in `zeppelin-env.sh` or flink's interpreter setting.
-* Copy necessary dependencies to flink's lib folder, check this
[link](https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/table/hive/#depedencies)
for more details
- * `flink-hadoop-compatibility_{scala_version}-{flink.version}.jar`
- * `flink-shaded-hadoop-2-uber-{hadoop.version}-{flink-shaded.version}.jar`
+* Set `HADOOP_CONF_DIR` in flink's interpreter setting.
+* Make sure the `hadoop` command is on your PATH, because internally flink will call the command `hadoop classpath` and load all the hadoop-related jars into the flink interpreter process
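A minimal sketch of the yarn-mode checklist; the conf path below is an assumption, adjust it to your cluster.

```shell
# flink.execution.mode = yarn   (set in the flink interpreter setting)
export HADOOP_CONF_DIR=/etc/hadoop/conf   # assumed location of hadoop conf

# The interpreter runs `hadoop classpath` internally, so the hadoop
# launcher must resolve on PATH; this prints the jars it will load.
command -v hadoop && hadoop classpath
```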
## Blink/Flink Planner
@@ -266,14 +274,13 @@ There're 2 planners supported by Flink's table api:
`flink` & `blink`.
In order to use Hive in Flink, you have to make the following setting.
-* Set `zeppelin.flink.enableHive` to `true`
-* Copy necessary dependencies to flink's lib folder, check this
[link](https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/table/hive/#depedencies)
for more details
- * flink-connector-hive_{scala_version}-{flink.version}.jar
- * flink-hadoop-compatibility_{scala_version}-{flink.version}.jar
- * flink-shaded-hadoop-2-uber-{hadoop.version}-{flink-shaded.version}.jar
- * hive-exec-2.x.jar (for Hive 1.x, you need to copy hive-exec-1.x.jar,
hive-metastore-1.x.jar, libfb303-0.9.2.jar and libthrift-0.9.2.jar)
-* Specify `HIVE_CONF_DIR` either in flink interpreter setting or
`zeppelin-env.sh`
-* Specify `zeppelin.flink.hive.version`, by default it is 2.3.4. If you are
using Hive 1.2.x, then you need to set it as `1.2.2`
+* Set `zeppelin.flink.enableHive` to be `true`
+* Set `zeppelin.flink.hive.version` to be the hive version you are using.
+* Set `HIVE_CONF_DIR` to be the location where `hive-site.xml` is located. Make sure the hive metastore is started and you have configured `hive.metastore.uris` in `hive-site.xml`
+* Copy the following dependencies to the lib folder of the flink installation.
+ * flink-connector-hive_2.11-1.10.0.jar
+ * flink-hadoop-compatibility_2.11-1.10.0.jar
+ * hive-exec-2.x.jar (for hive 1.x, you need to copy hive-exec-1.x.jar, hive-metastore-1.x.jar, libfb303-0.9.2.jar and libthrift-0.9.2.jar)
After these settings, you will be able to query hive tables via either the table api `%flink` or batch sql `%flink.bsql`
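The hive setup steps could be sketched as follows; `FLINK_HOME`, the hive conf path, and the jar source locations are assumptions for illustration.

```shell
# zeppelin.flink.enableHive   = true    (interpreter setting)
# zeppelin.flink.hive.version = 2.3.4   (interpreter setting)
export HIVE_CONF_DIR=/etc/hive/conf     # must contain hive-site.xml with hive.metastore.uris

# Copy the hive-related dependencies into flink's lib folder.
cp flink-connector-hive_2.11-1.10.0.jar       "$FLINK_HOME/lib/"
cp flink-hadoop-compatibility_2.11-1.10.0.jar "$FLINK_HOME/lib/"
cp hive-exec-2.3.4.jar                        "$FLINK_HOME/lib/"
```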
@@ -322,8 +329,8 @@ bt_env.register_function("python_upper", udf(PythonUpper(),
DataTypes.STRING(),
```
Besides defining udf in Zeppelin, you can also load udfs in jars via
`flink.udf.jars`. For example, you can create
-udfs in intellij and then build these udfs in one jar. After that you can
specify `flink.udf.jars` to this jar, and Flink
-interpreter will inspect this jar and register all the udfs in this jar to
TableEnvironment, the udf name is the class name.
+udfs in intellij and then build these udfs into one jar. After that you can point `flink.udf.jars` to this jar, and the flink
+interpreter will detect all the udfs in this jar and register them with the TableEnvironment; the udf name is the class name.
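As a hypothetical sketch of the `flink.udf.jars` flow: the jar name, class directory, and package below are made up for illustration.

```shell
# Package the compiled udf classes into a single jar; each class name
# becomes the udf name once the interpreter registers it.
jar cf my-udfs.jar -C target/classes com/example/udf

# Then point the interpreter at the jar:
#   flink.udf.jars = /path/to/my-udfs.jar
```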
## ZeppelinContext
Zeppelin automatically injects `ZeppelinContext` as variable `z` in your
Scala/Python environment. `ZeppelinContext` provides some additional functions
and utilities.
@@ -332,13 +339,13 @@ See
[Zeppelin-Context](../usage/other_features/zeppelin_context.html) for more d
## IPython Support
By default, zeppelin would use IPython in `%flink.pyflink` when IPython is available; otherwise it would fall back to the original python implementation.
-If you don't want to use IPython, then you can set
`zeppelin.pyflink.useIPython` as `false` in interpreter setting. For the
IPython features, you can refer doc
-[Python Interpreter](python.html)
+For the IPython features, you can refer to the doc [Python Interpreter](python.html)
## Tutorial Notes
-Zeppelin is shipped with several Flink tutorial notes which may be helpful for
you. Overall you can use Flink in Zeppelin for 4 main scenairos.
+Zeppelin is shipped with several Flink tutorial notes which may be helpful for you. Apart from the first one, the 4 notes below cover the 4 main scenarios of flink.
+* Flink Basic
* Batch ETL
* Exploratory Data Analytics
* Streaming ETL
diff --git a/flink/src/main/resources/interpreter-setting.json
b/flink/src/main/resources/interpreter-setting.json
index 5ea252d..acf1bf3 100644
--- a/flink/src/main/resources/interpreter-setting.json
+++ b/flink/src/main/resources/interpreter-setting.json
@@ -12,6 +12,20 @@
"description": "Location of flink distribution",
"type": "string"
},
+ "HADOOP_CONF_DIR": {
+ "envName": null,
+ "propertyName": null,
+ "defaultValue": "",
+ "description": "Location of hadoop conf (core-site.xml, hdfs-site.xml
and etc.)",
+ "type": "string"
+ },
+ "HIVE_CONF_DIR": {
+ "envName": null,
+ "propertyName": null,
+ "defaultValue": "",
+ "description": "Location of hive conf (hive-site.xml)",
+ "type": "string"
+ },
"flink.execution.mode": {
"envName": null,
"propertyName": null,
@@ -71,7 +85,7 @@
"flink.yarn.queue": {
"envName": null,
"propertyName": null,
- "defaultValue": "default",
+ "defaultValue": "",
"description": "Yarn queue name",
"type": "string"
},
diff --git a/zeppelin-web/src/app/interpreter/interpreter.controller.js
b/zeppelin-web/src/app/interpreter/interpreter.controller.js
index 1bbd1ee..dddae00 100644
--- a/zeppelin-web/src/app/interpreter/interpreter.controller.js
+++ b/zeppelin-web/src/app/interpreter/interpreter.controller.js
@@ -536,6 +536,7 @@ function InterpreterCtrl($rootScope, $scope, $http,
baseUrlSrv, ngToast, $timeou
newProperties[p] = {
value: newSetting.properties[p].value,
type: newSetting.properties[p].type,
+ description: newSetting.properties[p].description,
name: p,
};
}