yunfengzhou-hub commented on a change in pull request #71:
URL: https://github.com/apache/flink-ml/pull/71#discussion_r838146133



##########
File path: flink-ml-benchmark/README.md
##########
@@ -0,0 +1,261 @@
+# Flink ML Benchmark Guideline
+
+This document provides instructions on how to run benchmarks on Flink ML
+stages.
+
+## Write Benchmark Programs
+
+### Add Maven Dependencies
+
+To write Java benchmark programs for Flink ML, first make sure that the
+following dependencies have been added to your Maven project's `pom.xml`.
+
+```xml
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-ml-core_${scala.binary.version}</artifactId>
+  <version>${flink.ml.version}</version>
+</dependency>
+
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-ml-iteration_${scala.binary.version}</artifactId>
+  <version>${flink.ml.version}</version>
+</dependency>
+
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-ml-lib_${scala.binary.version}</artifactId>
+  <version>${flink.ml.version}</version>
+</dependency>
+
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-ml-benchmark_${scala.binary.version}</artifactId>
+  <version>${flink.ml.version}</version>
+</dependency>
+
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>statefun-flink-core</artifactId>
+  <version>3.1.0</version>
+  <exclusions>
+    <exclusion>
+      <groupId>org.apache.flink</groupId>
+      <artifactId>flink-streaming-java_2.12</artifactId>
+    </exclusion>
+  </exclusions>
+</dependency>
+
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-streaming-java_${scala.binary.version}</artifactId>
+  <version>${flink.version}</version>
+</dependency>
+
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-table-api-java-bridge_${scala.binary.version}</artifactId>
+  <version>${flink.version}</version>
+</dependency>
+
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-table-planner_${scala.binary.version}</artifactId>
+  <version>${flink.version}</version>
+</dependency>
+
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-clients_${scala.binary.version}</artifactId>
+  <version>${flink.version}</version>
+</dependency>
+```
+
+### Write Java Program
+
+Then you can write a program as follows to run a benchmark on Flink ML stages. The
+example code below tests the performance of Flink ML's KMeans algorithm using
+its default configuration parameters.
+
+```java
+public class Main {
+    public static void main(String[] args) throws Exception {
+        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
+        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);
+
+        KMeans kMeans = new KMeans();
+        KMeansInputsGenerator inputsGenerator = new KMeansInputsGenerator();
+
+        BenchmarkResult result =
+                BenchmarkUtils.runBenchmark("exampleBenchmark", tEnv, kMeans, inputsGenerator);
+
+        BenchmarkUtils.printResult(result);
+    }
+}
+```
+
+### Execute Benchmark Program
+
+After executing the `main()` method above, you will see benchmark results
+printed out in your terminal. An example of the printed content is as follows.
+
+```
+Benchmark Name: exampleBenchmark
+Total Execution Time(ms): 828.0
+```
+
+### Configure Benchmark Parameters
+
+If you want to run a benchmark with custom configuration parameters, you can set
+them with Flink ML's `WithParams` API as follows.
+
+```java
+KMeans kMeans = new KMeans()
+  .setK(5)
+  .setMaxIter(50);
+KMeansInputsGenerator inputsGenerator = new KMeansInputsGenerator()
+  .setDims(3)
+  .setDataSize(10000);
+```
+
+## Execute Benchmark through Command-Line Interface (CLI)
+
+You can also configure and execute benchmarks through the Command-Line Interface
+(CLI) without writing Java programs.
+
+### Prerequisites
+
+Before using Flink ML's CLI, make sure you have installed Flink 1.14 in your
+local environment, and that you have started a Flink cluster locally. If not,
+you can start a standalone session with the following command.
+
+```bash
+$ start-cluster

Review comment:
       OK. I'll add the instructions.
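
For context, here is a minimal sketch of what such instructions might look like. It assumes that `FLINK_HOME` points at an unpacked Flink 1.14 distribution and that the cluster is started with the `bin/start-cluster.sh` script shipped with Flink. The `FLINK_HOME` variable and the fallback path are illustrative assumptions, not something stated in the diff above.

```shell
# Assumption: FLINK_HOME points at an unpacked Flink 1.14 distribution.
# The fallback path below is purely illustrative.
FLINK_HOME="${FLINK_HOME:-$HOME/flink-1.14.0}"
START_SCRIPT="$FLINK_HOME/bin/start-cluster.sh"

if [ -x "$START_SCRIPT" ]; then
    # Starts a local standalone session cluster.
    "$START_SCRIPT"
else
    echo "start-cluster.sh not found under $FLINK_HOME/bin; set FLINK_HOME first" >&2
fi
```

Checking for the script before calling it gives a clearer message than a bare "command not found" when `FLINK_HOME` is unset or wrong.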



