[
https://issues.apache.org/jira/browse/FLINK-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14585704#comment-14585704
]
ASF GitHub Bot commented on FLINK-2209:
---------------------------------------
Github user tillrohrmann commented on a diff in the pull request:
https://github.com/apache/flink/pull/835#discussion_r32404022
--- Diff: docs/apis/cluster_execution.md ---
@@ -80,67 +80,73 @@ Note that the program contains custom user code and
hence requires a JAR file with the classes of the code attached. The constructor
of the remote environment takes the path(s) to the JAR file(s).
-## Remote Executor
+## Linking with modules not contained in the binary distribution
-Similar to the RemoteEnvironment, the RemoteExecutor lets you execute
-Flink programs on a cluster directly. The remote executor accepts a
-*Plan* object, which describes the program as a single executable unit.
+The binary distribution contains jar packages in the `lib` folder that are automatically
+provided to the classpath of your distributed programs. Almost all of Flink's classes
+are located there, with a few exceptions such as the streaming connectors and some
+freshly added modules. To run code depending on these modules you need to make them
+accessible at runtime, for which we suggest two options:
-### Maven Dependency
-
-If you are developing your program in a Maven project, you have to add the
-`flink-clients` module using this dependency:
-
-~~~xml
-<dependency>
- <groupId>org.apache.flink</groupId>
- <artifactId>flink-clients</artifactId>
- <version>{{ site.version }}</version>
-</dependency>
-~~~
-
-### Example
-
-The following illustrates the use of the `RemoteExecutor` with the Scala API:
-
-~~~scala
-def main(args: Array[String]) {
- val input = TextFile("hdfs://path/to/file")
+1. Either copy the required jar files to the `lib` folder on all of your TaskManagers.
+2. Or package them with your usercode.
- val words = input flatMap { _.toLowerCase().split("""\W+""") filter { _ != "" } }
- val counts = words groupBy { x => x } count()
+The latter option is recommended as it respects the classloader management in Flink.
- val output = counts.write(wordsOutput, CsvOutputFormat())
-
- val plan = new ScalaPlan(Seq(output), "Word Count")
- val executor = new RemoteExecutor("strato-master", 7881, "/path/to/jarfile.jar")
- executor.executePlan(p);
-}
-~~~
+### Packaging dependencies with your usercode with Maven
-The following illustrates the use of the `RemoteExecutor` with the Java API (as
-an alternative to the RemoteEnvironment):
+To provide these dependencies, which are not included in the Flink distribution,
+we suggest two options with Maven.
-~~~java
-public static void main(String[] args) throws Exception {
- ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
+1. The Maven assembly plugin builds a so-called fat jar containing all of your
+dependencies. It is easy to configure, but overkill in many cases. See the
+[usage](http://maven.apache.org/plugins/maven-assembly-plugin/usage.html) page.
+2. The unpack goal of the Maven dependency plugin, for unpacking the relevant
+parts of the dependencies and then packaging them with your code.
- DataSet<String> data = env.readTextFile("hdfs://path/to/file");
+To do the latter, for example for the streaming Kafka connector,
+`flink-connector-kafka`
--- End diff --
Wording of the first sentence. Maybe something like: "Using the latter
approach in order to bundle the Kafka connector..."
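For reference, a rough sketch of what the dependency-plugin approach could look like for `flink-connector-kafka`. This is an illustrative `pom.xml` fragment, not text from the PR; `${flink.version}` is assumed to be a property defined in the user's pom:

~~~xml
<!-- Illustrative pom.xml fragment: unpacks the Kafka connector classes into
     target/classes before the jar is built, so they end up inside the
     usercode jar. ${flink.version} is assumed to be a pom property. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <executions>
    <execution>
      <id>unpack</id>
      <!-- run before the package phase builds the jar -->
      <phase>prepare-package</phase>
      <goals>
        <goal>unpack</goal>
      </goals>
      <configuration>
        <artifactItems>
          <artifactItem>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-connector-kafka</artifactId>
            <version>${flink.version}</version>
            <type>jar</type>
            <overWrite>false</overWrite>
            <outputDirectory>${project.build.directory}/classes</outputDirectory>
          </artifactItem>
        </artifactItems>
      </configuration>
    </execution>
  </executions>
</plugin>
~~~

The unpacked classes are then picked up by the regular jar packaging, avoiding a full fat jar.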
> Document how to use TableAPI, Gelly and FlinkML, StreamingConnectors on a
> cluster
> ---------------------------------------------------------------------------------
>
> Key: FLINK-2209
> URL: https://issues.apache.org/jira/browse/FLINK-2209
> Project: Flink
> Issue Type: Improvement
> Reporter: Till Rohrmann
> Assignee: Márton Balassi
>
> Currently the TableAPI, Gelly, FlinkML and StreamingConnectors are not part
> of the Flink dist module. Therefore they are not included in the binary
> distribution. As a consequence, if you want to use one of these libraries the
> corresponding jar and all their dependencies have to be either manually put
> on the cluster or the user has to include them in the user code jar.
> Usually a fat jar is built if one uses the quickstart archetypes. However,
> if one sets the project up manually this is not necessarily the case.
> Therefore, it should be well documented how to run programs using one of
> these libraries.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)