[jira] [Commented] (FLINK-2021) Rework examples to use ParameterTool

ASF GitHub Bot (JIRA) Thu, 21 Jan 2016 19:22:14 -0800

    [ https://issues.apache.org/jira/browse/FLINK-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111848#comment-15111848 ]


ASF GitHub Bot commented on FLINK-2021:
---------------------------------------

Github user chiwanpark commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1536#discussion_r50498429
  
    --- Diff: flink-examples/flink-examples-batch/src/main/scala/org/apache/flink/examples/scala/clustering/KMeans.scala ---
    @@ -26,53 +27,84 @@ import org.apache.flink.examples.java.clustering.util.KMeansData
     import scala.collection.JavaConverters._
     
     /**
    - * This example implements a basic K-Means clustering algorithm.
    - *
    - * K-Means is an iterative clustering algorithm and works as follows:
    - * K-Means is given a set of data points to be clustered and an initial set of ''K'' cluster
    - * centers.
    - * In each iteration, the algorithm computes the distance of each data point to each cluster center.
    - * Each point is assigned to the cluster center which is closest to it.
    - * Subsequently, each cluster center is moved to the center (''mean'') of all points that have
    - * been assigned to it.
    - * The moved cluster centers are fed into the next iteration. 
    - * The algorithm terminates after a fixed number of iterations (as in this implementation)
    - * or if cluster centers do not (significantly) move in an iteration.
    - * This is the Wikipedia entry for the [[http://en.wikipedia
    - * .org/wiki/K-means_clustering K-Means Clustering algorithm]].
    - *
    - * This implementation works on two-dimensional data points.
    - * It computes an assignment of data points to cluster centers, i.e., 
    - * each data point is annotated with the id of the final cluster (center) it belongs to.
    - *
    - * Input files are plain text files and must be formatted as follows:
    - *
    - *  - Data points are represented as two double values separated by a blank character.
    - *    Data points are separated by newline characters.
    - *    For example `"1.2 2.3\n5.3 7.2\n"` gives two data points (x=1.2, y=2.3) and (x=5.3,
    - *    y=7.2).
    - *  - Cluster centers are represented by an integer id and a point value.
    - *    For example `"1 6.2 3.2\n2 2.9 5.7\n"` gives two centers (id=1, x=6.2,
    - *    y=3.2) and (id=2, x=2.9, y=5.7).
    - *
    - * Usage:
    - * {{{
    - *   KMeans <points path> <centers path> <result path> <num iterations>
    - * }}}
    - * If no parameters are provided, the program is run with default data from
    - * [[org.apache.flink.examples.java.clustering.util.KMeansData]]
    - * and 10 iterations.
    - *
    - * This example shows how to use:
    - *
    - *  - Bulk iterations
    - *  - Broadcast variables in bulk iterations
    - *  - Custom Java objects (PoJos)
    - */
    +  * This example implements a basic K-Means clustering algorithm.
    +  *
    +  * K-Means is an iterative clustering algorithm and works as follows:
    +  * K-Means is given a set of data points to be clustered and an initial set of ''K'' cluster
    +  * centers.
    +  * In each iteration, the algorithm computes the distance of each data point to each cluster center.
    +  * Each point is assigned to the cluster center which is closest to it.
    +  * Subsequently, each cluster center is moved to the center (''mean'') of all points that have
    +  * been assigned to it.
    +  * The moved cluster centers are fed into the next iteration.
    +  * The algorithm terminates after a fixed number of iterations (as in this implementation)
    +  * or if cluster centers do not (significantly) move in an iteration.
    +  * This is the Wikipedia entry for the [[http://en.wikipedia
    +  * .org/wiki/K-means_clustering K-Means Clustering algorithm]].
    +  *
    +  * This implementation works on two-dimensional data points.
    +  * It computes an assignment of data points to cluster centers, i.e.,
    +  * each data point is annotated with the id of the final cluster (center) it belongs to.
    +  *
    +  * Input files are plain text files and must be formatted as follows:
    +  *
    +  * - Data points are represented as two double values separated by a blank character.
    +  * Data points are separated by newline characters.
    +  * For example `"1.2 2.3\n5.3 7.2\n"` gives two data points (x=1.2, y=2.3) and (x=5.3,
    +  * y=7.2).
    +  * - Cluster centers are represented by an integer id and a point value.
    +  * For example `"1 6.2 3.2\n2 2.9 5.7\n"` gives two centers (id=1, x=6.2,
    +  * y=3.2) and (id=2, x=2.9, y=5.7).
    +  *
    +  * Usage:
    +  * {{{
    +  *   KMeans <points path> <centers path> <result path> <num iterations>
    +  * }}}
    +  * If no parameters are provided, the program is run with default data from
    +  * [[org.apache.flink.examples.java.clustering.util.KMeansData]]
    +  * and 10 iterations.
    +  *
    +  * This example shows how to use:
    +  *
    +  * - Bulk iterations
    +  * - Broadcast variables in bulk iterations
    +  * - Custom Java objects (PoJos)
    --- End diff --
    
    This Scala example uses Scala objects, not Java POJOs. Could you change the last bullet ("Custom Java objects (PoJos)") accordingly?
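
    For readers following the Scaladoc in the diff above, here is a rough
    sketch of the algorithm it describes, written in plain Java without any
    Flink APIs. It is not the code under review: the class name KMeansSketch
    and its methods are made up for illustration, and centers are identified
    by array index rather than by the integer ids used in the input format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch; not the Flink KMeans example itself.
class KMeansSketch {

    /** Parses points given as two doubles separated by a blank, one point per line. */
    static List<double[]> parsePoints(String text) {
        List<double[]> points = new ArrayList<>();
        for (String line : text.split("\n")) {
            if (line.trim().isEmpty()) {
                continue; // skip the empty remainder after a trailing newline
            }
            String[] fields = line.trim().split(" ");
            points.add(new double[] {
                Double.parseDouble(fields[0]), Double.parseDouble(fields[1])});
        }
        return points;
    }

    /** Returns the index of the center closest to p (squared Euclidean distance). */
    static int nearest(double[] p, double[][] centers) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centers.length; c++) {
            double dx = p[0] - centers[c][0];
            double dy = p[1] - centers[c][1];
            double dist = dx * dx + dy * dy;
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    /** Runs a fixed number of iterations and returns the moved centers. */
    static double[][] run(List<double[]> points, double[][] initial, int iterations) {
        double[][] centers = new double[initial.length][];
        for (int c = 0; c < initial.length; c++) {
            centers[c] = initial[c].clone();
        }
        for (int it = 0; it < iterations; it++) {
            double[][] sum = new double[centers.length][2];
            int[] count = new int[centers.length];
            // Assign each point to its closest center and accumulate coordinates.
            for (double[] p : points) {
                int c = nearest(p, centers);
                sum[c][0] += p[0];
                sum[c][1] += p[1];
                count[c]++;
            }
            // Move each center to the mean of its points; a center with no points stays put.
            for (int c = 0; c < centers.length; c++) {
                if (count[c] > 0) {
                    centers[c] = new double[] {sum[c][0] / count[c], sum[c][1] / count[c]};
                }
            }
        }
        return centers;
    }

    public static void main(String[] args) {
        // Sample data taken from the Scaladoc examples.
        List<double[]> points = parsePoints("1.2 2.3\n5.3 7.2\n");
        double[][] centers = run(points, new double[][] {{6.2, 3.2}, {2.9, 5.7}}, 10);
        for (double[] c : centers) {
            System.out.println(c[0] + " " + c[1]);
        }
    }
}
```

    The sketch stops after a fixed number of iterations, as the Scaladoc says
    the example does; it does not implement the alternative convergence check.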


> Rework examples to use ParameterTool
> ------------------------------------
>
>                 Key: FLINK-2021
>                 URL: https://issues.apache.org/jira/browse/FLINK-2021
>             Project: Flink
>          Issue Type: Improvement
>          Components: Examples
>    Affects Versions: 0.9
>            Reporter: Robert Metzger
>            Priority: Minor
>              Labels: starter
>
> In FLINK-1525, we introduced the {{ParameterTool}}.
> We should port the examples to use the tool.
> The examples could look like this (we should maybe discuss it first on the mailing lists):
> {code}
> public static void main(String[] args) throws Exception {
>     ParameterTool pt = ParameterTool.fromArgs(args);
>     boolean fileOutput = pt.getNumberOfParameters() == 2;
>     String textPath = null;
>     String outputPath = null;
>     if(fileOutput) {
>         textPath = pt.getRequired("input");
>         outputPath = pt.getRequired("output");
>     }
>     // set up the execution environment
>     final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
>     env.getConfig().setUserConfig(pt);
> {code}
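
To illustrate the argument handling proposed in the quoted description without
depending on Flink, here is a minimal sketch of a "--key value" parser. The
SimpleArgs class is a hypothetical stand-in, not Flink's ParameterTool; it only
mirrors the method names fromArgs, has, getRequired and getInt, and it accepts
only the "--key value" form.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not Flink's ParameterTool.
class SimpleArgs {
    private final Map<String, String> values = new HashMap<>();

    /** Parses arguments of the form: --key value [--key value ...]. */
    static SimpleArgs fromArgs(String[] args) {
        SimpleArgs parsed = new SimpleArgs();
        for (int i = 0; i < args.length; i++) {
            if (!args[i].startsWith("--")) {
                throw new IllegalArgumentException("Expected --key, got: " + args[i]);
            }
            String key = args[i].substring(2);
            // A key followed by another key (or by nothing) is stored as a flag with an empty value.
            if (i + 1 < args.length && !args[i + 1].startsWith("--")) {
                parsed.values.put(key, args[++i]);
            } else {
                parsed.values.put(key, "");
            }
        }
        return parsed;
    }

    boolean has(String key) {
        return values.containsKey(key);
    }

    /** Returns the value for key, or fails loudly if it was not supplied. */
    String getRequired(String key) {
        String value = values.get(key);
        if (value == null) {
            throw new IllegalArgumentException("No data for required key '" + key + "'");
        }
        return value;
    }

    int getInt(String key, int defaultValue) {
        String value = values.get(key);
        return value == null ? defaultValue : Integer.parseInt(value);
    }

    public static void main(String[] args) {
        SimpleArgs pt = SimpleArgs.fromArgs(args);
        // Decide on file output by key presence rather than by counting parameters.
        boolean fileOutput = pt.has("input") && pt.has("output");
        if (fileOutput) {
            System.out.println(pt.getRequired("input") + " -> " + pt.getRequired("output"));
        } else {
            System.out.println("Using default data, iterations=" + pt.getInt("iterations", 10));
        }
    }
}
```

Checking for the named keys, rather than testing the number of parameters as
the quoted snippet does, is one possible design choice: it keeps working when
an unrelated option is added. Separately, the snippet's setUserConfig call
appears to correspond to setGlobalJobParameters on Flink's ExecutionConfig.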



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
