[GitHub] flink pull request #3259: Documentation: Production readiness checklist

alpinegizmo Fri, 03 Feb 2017 08:55:43 -0800

Github user alpinegizmo commented on a diff in the pull request:

    https://github.com/apache/flink/pull/3259#discussion_r99375767
  
    --- Diff: docs/ops/production_ready.md ---
    @@ -0,0 +1,88 @@
    +---
    +title: "Production Readiness Checklist"
    +nav-parent_id: setup
    +nav-pos: 20
    +---
    +<!--
    +Licensed to the Apache Software Foundation (ASF) under one
    +or more contributor license agreements.  See the NOTICE file
    +distributed with this work for additional information
    +regarding copyright ownership.  The ASF licenses this file
    +to you under the Apache License, Version 2.0 (the
    +"License"); you may not use this file except in compliance
    +with the License.  You may obtain a copy of the License at
    +
    +  http://www.apache.org/licenses/LICENSE-2.0
    +
    +Unless required by applicable law or agreed to in writing,
    +software distributed under the License is distributed on an
    +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    +KIND, either express or implied.  See the License for the
    +specific language governing permissions and limitations
    +under the License.
    +-->
    +
    +* ToC
    +{:toc}
    +
    +## Production Readiness Checklist
    +
    +Purpose of this production readiness checklist is to provide a condensed overview of configuration options that are
    +important and need **careful considerations** if you plan to bring your Flink job into **production**. For most of these options
    +Flink provides out-of-the-box defaults to make usage and adoption of Flink easier. For many users and scenarios, those
    +defaults are good starting points for development and completely sufficient for "one-shot" jobs.
    +
    +However, once you are planning to bring a Flink appplication to production the requirements typically increase. For example,
    +you want your job to be (re-)scalable and to have a good upgrade story for your job and new Flink versions.
    +
    +In the following, we present a collection of configuration options that you should check before your job goes into production.
    +
    +### Set maximum parallelism for operators explicitly
    +
    +Maximum parallelism is a configuration parameter that is newly introduced in Flink 1.2 and has important implications
    +for the (re-)scalability of your Flink job. This parameter, which can be set on a per-job and/or per-operator granularity,
    +determines the maximum parallelism to which you can scale operators. It is important to understand that (as of now) there
    +is **now way to increase** this parameter after your job was initially started, except for restarting your job completely
    +from scratch (i.e. with a new state, and not from a previous checkpoint/savepoint). Even if Flink would provide some way
    +to change maximum parallelism for existing savepoints in the future, you can already assume that for large states this is
    +likely a long running operation that you want to avoid. At this point, you might wonder why not just to use a very high
    +value as default for this parameter. The reason behind this is that high maximum parallelism can have some impact on your
    +applications performance and even state sizes, because Flink has to maintain certain meta data for it's ability to rescale which
    +can increase with the maximum parallelism. In general, you should chose a max parallelism that is high enough to fit your
    --- End diff --
    
    Typo: "chose" should be "choose".



