[ 
https://issues.apache.org/jira/browse/FLINK-1017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093906#comment-14093906
 ] 

Fabian Hueske commented on FLINK-1017:
--------------------------------------

Sure, please use the example :-)
The classic example of a {{DataSource -> Map -> Reduce -> DataSink}} job is the 
infamous WordCount, which is of course included in Flink's examples: [WordCount 
example|http://flink.incubator.apache.org/docs/0.6-SNAPSHOT/java_api_examples.html#word-count]

The system does *not* automatically choose the parallelism of any operator or 
job. 
Instead, the parallelism can be configured on three levels:
- Operator level: {{dataSet.map(myMapFunction).setParallelism(4)}} runs the map 
function with parallelism 4.
- Job level: {{plan.setDefaultParallelism(2)}} runs all operators for which no 
parallelism was specified with a degree of parallelism (DOP) of 2.
- System level: Set the config parameter {{parallelization.degree.default}} in 
{{conf/flink-conf.yaml}} to run all jobs and operators for which no DOP was 
specified with the configured parallelism.
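
To make the fallback order concrete, here is a minimal sketch in plain Java. It does not use the Flink API: {{ParallelismLevels}} and {{resolve}} are hypothetical names made up for illustration, and the system default of 1 is an assumed value for {{parallelization.degree.default}}. It only models which of the three levels applies to a given operator.

```java
import java.util.OptionalInt;

/** Hypothetical helper, not part of Flink: models the three configuration levels. */
final class ParallelismLevels {

    private ParallelismLevels() {}

    /**
     * Returns the parallelism an operator runs with: the operator-level
     * setting if present, otherwise the job-level default if present,
     * otherwise the system-level default from the configuration file.
     */
    static int resolve(OptionalInt operatorLevel, OptionalInt jobLevel, int systemDefault) {
        if (operatorLevel.isPresent()) {
            return operatorLevel.getAsInt();
        }
        if (jobLevel.isPresent()) {
            return jobLevel.getAsInt();
        }
        return systemDefault;
    }

    public static void main(String[] args) {
        int systemDefault = 1; // assumed value of parallelization.degree.default

        // Map operator with setParallelism(4) in a job with default parallelism 2.
        System.out.println(resolve(OptionalInt.of(4), OptionalInt.of(2), systemDefault)); // prints 4

        // Reduce operator without its own setting in the same job.
        System.out.println(resolve(OptionalInt.empty(), OptionalInt.of(2), systemDefault)); // prints 2

        // Operator in a job where neither operator nor job level is set.
        System.out.println(resolve(OptionalInt.empty(), OptionalInt.empty(), systemDefault)); // prints 1
    }
}
```

The most specific setting wins: an operator-level value overrides the job default, which in turn overrides the value from {{conf/flink-conf.yaml}}.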


> Add setParallelism() to Java API documentation
> ----------------------------------------------
>
>                 Key: FLINK-1017
>                 URL: https://issues.apache.org/jira/browse/FLINK-1017
>             Project: Flink
>          Issue Type: Task
>          Components: Documentation
>    Affects Versions: 0.6-incubating, pre-apache-0.5
>            Reporter: Fabian Hueske
>            Assignee: Hung Chang
>            Priority: Minor
>              Labels: starter
>             Fix For: 0.6-incubating
>
>
> The Java API offers {{setParallelism()}} to control the degree of parallelism 
> for each operator. This feature is not documented and should be added.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
