rmetzger commented on a change in pull request #14222: URL: https://github.com/apache/flink/pull/14222#discussion_r530520370
##########
File path: docs/ops/deployment/overview.md
##########
@@ -0,0 +1,358 @@
+---
+title: "Clusters & Deployment"
+nav-id: deployment
+nav-parent_id: ops
+nav-pos: 1
+nav-show_overview: true
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+Flink is a versatile framework, supporting many different deployment scenarios in a mix and match fashion.
+
+Below, we briefly explain the building blocks of a Flink cluster, their purpose and available implementations.
+If you just want to start Flink locally, we recommend setting up a [Standalone Cluster]({% link ops/deployment/local.md %}).
+
+* This will be replaced by the TOC
+{:toc}
+
+
+## Overview and Reference Architecture
+
+The figure below shows the building blocks of every Flink cluster. There is always somewhere a client running. It takes the code of the Flink applications, transforms it into a job graph and submits it to the JobManager.
+
+The JobManager distributes the work onto the TaskManagers, where the actual operators (such as sources, transformations and sinks) are running.
+
+When deploying Flink, there are often multiple options available for each building block. We have listed them in the table below the figure.
+
+If you don't know where to start, we recommend using the Command Line Interface for submitting Flink applications to a Standalone Cluster.
+
+<!-- Image source: https://docs.google.com/drawings/d/1s_ZlXXvADqxWfTMNRVwQeg7HZ3hN1Xb7goxDPjTEPrI/edit?usp=sharing -->
+<img width="100%" src="{% link fig/deployment_overview.svg %}" alt="Figure for Overview and Reference Architecture" />
+
+
+<table class="table table-bordered">
+    <thead>
+        <tr>
+            <th class="text-left" style="width: 25%">Component</th>
+            <th class="text-left" style="width: 50%">Purpose</th>
+            <th class="text-left">Implementations</th>
+        </tr>
+    </thead>
+    <tbody>
+        <tr>
+            <td>Flink Client</td>
+            <td>
+                Flink batch or streaming applications are compiled into a dataflow graph, which is submitted to the JobManager.
+            </td>
+            <td>
+                <ul>
+                    <li><a href="">Command Line Interface</a></li>
+                    <li><a href="">REST Endpoint</a></li>
+                    <li><a href="">SQL Client</a></li>
+                    <li><a href="">Python REPL</a></li>
+                    <li><a href="">Scala REPL</a></li>
+                </ul>
+            </td>
+        </tr>
+        <tr>
+            <td>JobManager</td>
+            <td>
+                JobManager is the name of the central work coordination component of Flink. It has implementations for different resource providers, which differ on high-availability, resource allocation behavior and supported job submission modes. <br />
+                JobManager <a href="">modes for job submissions</a>:
+                <ul>
+                    <li><b>Session Mode</b>: one JobManager instance manages multiple jobs sharing the same cluster of TaskManagers</li>
+                    <li><b>Application Mode</b>: runs the cluster exclusively for one job. The job main method (or client) gets executed on the JobManager.</li>
+                    <li><b>Per-Job Mode</b>: runs the cluster exclusively for one job. The job main method (or client) runs only prior to the cluster creation.</li>
+                </ul>

Review comment:
I don't have a strong opinion on this one, as there is probably no right or wrong here.
I wanted to mention the three modes here already, because they are essential to understand for the resource provider detail pages.
