This is an automated email from the ASF dual-hosted git repository. guoyangze pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/flink.git
commit 817e3f2b964fa5d86e207d6aa3065f139ec84402 Author: Xiangyu Feng <xiangyu...@gmail.com> AuthorDate: Wed Nov 1 19:16:08 2023 +0800 [FLINK-33235][doc] Update OLAP Quickstart doc --- docs/content/docs/dev/table/olap_quickstart.md | 181 +++++++++++++------------ docs/content/docs/dev/table/overview.md | 2 +- docs/static/fig/olap-architecture.svg | 21 +++ 3 files changed, 117 insertions(+), 87 deletions(-) diff --git a/docs/content/docs/dev/table/olap_quickstart.md b/docs/content/docs/dev/table/olap_quickstart.md index e0b3afba2bc..e5084e065c2 100644 --- a/docs/content/docs/dev/table/olap_quickstart.md +++ b/docs/content/docs/dev/table/olap_quickstart.md @@ -1,5 +1,5 @@ --- -title: "Quickstart for Flink OLAP" +title: "OLAP Quickstart" weight: 91 type: docs aliases: @@ -24,32 +24,41 @@ specific language governing permissions and limitations under the License. --> -# Introduction +# OLAP Quickstart -Flink OLAP has already added to [Apache Flink Roadmap](https://flink.apache.org/roadmap/). It means Flink can not only support streaming and batch computing, but also support OLAP(On-Line Analytical Processing). This page will show how to quickly set up a Flink OLAP service, and will introduce some best practices. +OLAP (Online Analytical Processing) is a key technology in the field of data analysis. It is generally used to perform complex queries on large data sets with latencies in seconds. Flink now supports not only streaming and batch computing, but can also be deployed as an OLAP computing service. This page shows you how to quickly set up a local Flink OLAP service and introduces some best practices that help you deploy a Flink OLAP service in production. -## Architecture +## Architecture Introduction +This chapter introduces the overall architecture of a Flink OLAP service and the advantages of using it. -The Flink OLAP service consists of three parts: Client, Flink SQL Gateway, Flink Session Cluster. 
+### Architecture -* **Client**: Could be any client that can interact with Flink SQL Gateway, such as SQL client, Flink JDBC driver and so on. -* **Flink SQL Gateway**: The SQL Gateway provides an easy way to submit the Flink Job, look up the metadata, and analyze table stats. -* **Flink Session Cluster**: We choose session clusters to run OLAP queries, mainly to avoid the overhead of cluster startup. +A Flink OLAP service consists of three parts: Client, Flink SQL Gateway and Flink Session Cluster. -## Advantage +* **Client**: Can be any client that can interact with [Flink SQL Gateway]({{< ref "docs/dev/table/sql-gateway/overview" >}}), such as [SQL Client]({{< ref "docs/dev/table/sqlClient" >}}), [Flink JDBC Driver]({{< ref "docs/dev/table/jdbcDriver" >}}) and so on. +* **Flink SQL Gateway**: The SQL Gateway provides an easy way to parse the SQL query, look up the metadata, analyze table stats, optimize the plan and submit JobGraphs to the cluster. +* **Flink Session Cluster**: OLAP queries run on a [session cluster]({{< ref "/docs/deployment/resource-providers/native_kubernetes#starting-a-flink-session-on-kubernetes" >}}), mainly to avoid the overhead of cluster startup. + +{{< img src="/fig/olap-architecture.svg" alt="Illustration of Flink OLAP Architecture" width="85%" >}} + +### Advantages * **Massively Parallel Processing** - * Flink OLAP runs naturally as an MPP(Massively Parallel Processing) system, which supports low-latency ad-hoc queries + * Flink OLAP runs naturally as a massively parallel processing system, which enables planners to easily adjust the job parallelism to fulfill queries' latency requirements at different data sizes. +* **Elastic Resource Management** + * Flink's resource management supports min/max scaling, which means the session cluster can dynamically allocate resources according to the workload. * **Reuse Connectors** - * Flink OLAP can reuse rich connectors in Flink ecosystem. 
+ * Flink OLAP can reuse the rich [Connectors]({{< ref "docs/connectors/table/overview" >}}) in the Flink ecosystem. * **Unified Engine** * Unified computing engine for Streaming/Batch/OLAP. -# Deploying in Local Mode +## Deploying in Local Mode + +In this chapter, you will learn how to build a Flink OLAP service locally. -## Downloading Flink +### Downloading Flink -The same as [Local Installation]({{< ref "docs/try-flink/local_installation" >}}). Flink runs on all UNIX-like environments, i.e. Linux, Mac OS X, and Cygwin (for Windows). We need to have at least Java 11 installed, Java 17 is more recommended in OLAP scenario. To check the Java version installed, type in your terminal: +The same as [Local Installation]({{< ref "docs/try-flink/local_installation" >}}). Flink runs on all UNIX-like environments, i.e. Linux, Mac OS X, and Cygwin (for Windows). Users need to have at least __Java 11__ installed. To check the installed Java version, type in the terminal: ``` java -version ``` @@ -61,7 +70,7 @@ Next, [Download](https://flink.apache.org/downloads/) the latest binary release tar -xzf flink-*.tgz ``` -## Starting a local cluster +### Starting a local cluster To start a local cluster, run the bash script that comes with Flink: @@ -69,9 +78,9 @@ To start a local cluster, run the bash script that comes with Flink: ./bin/start-cluster.sh ``` -You should be able to navigate to the web UI at localhost:8081 to view the Flink dashboard and see that the cluster is up and running. +You should be able to navigate to the web UI at http://localhost:8081 to view the Flink dashboard and see that the cluster is up and running. -## Start a SQL Client CLI +### Start a SQL Client CLI You can start the CLI with an embedded gateway by calling: @@ -79,7 +88,7 @@ You can start the CLI with an embedded gateway by calling: ./bin/sql-client.sh ``` -## Running Queries +### Running Queries You can simply execute queries in the CLI and retrieve the results. 
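The example query is only partially visible in the hunk below, so as a self-contained illustration, an ad-hoc OLAP query against the bundled `datagen` connector might look like the following sketch. The table and column names here are hypothetical, chosen only to show the shape of such a query:

```sql
-- Hypothetical in-memory source table backed by the bundled datagen connector.
CREATE TABLE orders (
    buyer STRING,
    cost  DOUBLE
) WITH (
    'connector' = 'datagen',
    'number-of-rows' = '1000'
);

-- A typical ad-hoc OLAP aggregation: top three buyers by total cost.
SELECT buyer, SUM(cost) AS total_cost
FROM orders
GROUP BY buyer
ORDER BY total_cost DESC
LIMIT 3;
```

Short-lived aggregations of this kind are exactly the workload the session cluster is meant to serve without per-query cluster startup overhead.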
@@ -102,98 +111,98 @@ GROUP BY buyer ORDER BY total_cost LIMIT 3; ``` -And then you could find job detail information in web UI at localhost:8081. +Then you can find job detail information in the web UI at http://localhost:8081. -# Deploying in Production +## Deploying in Production This section guides you through setting up a production ready Flink OLAP service. -## Cluster Deployment +### Client -In production, we recommend to use Flink Session Cluster, Flink SQL Gateway and Flink JDBC Driver to build an OLAP service. +#### Flink JDBC Driver -### Session Cluster +You should use the Flink JDBC Driver when submitting queries to the SQL Gateway, since it provides low-level connection management. In production, take care to reuse JDBC connections, which avoids frequently creating and closing sessions in the Gateway and thus reduces end-to-end query latency. For detailed information, please refer to the [Flink JDBC Driver]({{< ref "docs/dev/table/jdbcDriver" >}}). -For Flink Session Cluster, we recommend to deploy Flink on native Kubernetes using session mode. Kubernetes is a popular container-orchestration system for automating computer application deployment, scaling, and management. By deploying on native Kubernetes, Flink Session Cluster is able to dynamically allocate and de-allocate TaskManagers. For more information, please refer to [Native Kubernetes]({{< ref "docs/deployment/resource-providers/native_kubernetes">}}). +### Cluster Deployment -### SQL Gateway +In production, you should use a Flink Session Cluster and Flink SQL Gateway to build an OLAP service. -For Flink SQL Gateway, we recommend deploying it as a stateless microservice and register this on the service discovery component. For more information, please refer to the [SQL Gateway Overview]({{< ref "docs/dev/table/sql-gateway/overview">}}). -### Flink JDBC Driver +#### Session Cluster +For the Flink Session Cluster, you can deploy it on Native Kubernetes using session mode. 
Kubernetes is a popular container-orchestration system for automating computer application deployment, scaling, and management. By deploying on Native Kubernetes, the Flink Session Cluster is able to dynamically allocate and de-allocate TaskManagers. For more information, please refer to [Native Kubernetes]({{< ref "docs/deployment/resource-providers/native_kubernetes" >}}). Furthermore, you can configure the [...] -When submitting queries to SQL Gateway, we recommend using Flink JDBC Driver since it provides low-level connection management. When used in production, we need to pay attention to reuse the JDBC connection to avoid frequently creating/closing sessions in the Gateway. For more information, please refer to the [Flink JDBC Driver]({{{<ref "docs/dev/table/jdbcDriver">}}}). +#### SQL Gateway -## Datasource Configurations +For Flink SQL Gateway, you should deploy it as a stateless microservice and register the instance on a service discovery component. In this way, clients can easily balance queries between instances. For more information, please refer to [SQL Gateway Overview]({{< ref "docs/dev/table/sql-gateway/overview">}}). -### Catalogs +### Datasource Configurations -In OLAP scenario, we recommend using FileCatalogStore in the catalog configuration introduced in [FLIP-295](https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations). As a long running service, Flink OLAP cluster's catalog information will not change frequently and can be re-used cross sessions. For more information, please refer to the [Catalog Store]({{< ref "docs/dev/table/catalogs#catalog-store">}}). +#### Catalogs -### Connectors +In OLAP scenarios, you should configure the `FileCatalogStore` provided by [Catalogs]({{< ref "docs/dev/table/catalogs">}}) as the catalog store used by the cluster. 
As a long-running service, a Flink OLAP cluster's catalog information will not change frequently and should be re-used across sessions to reduce the cold-start cost. For more information, please refer to the [Catalog Store]({{< ref "docs/dev/table/catalogs#catalog-store">}}). -Both Session Cluster and SQL Gateway rely on connectors to analyze table stats and read data from the configured data source. To add connectors, please refer to the [Connectors and Formats]({{< ref "docs/connectors/table/overview">}}). +#### Connectors -## Cluster Configurations +Both Session Cluster and SQL Gateway rely on connectors to analyze table stats and read data from the configured data source. To add connectors, please refer to the [Connectors]({{< ref "docs/connectors/table/overview">}}). -In OLAP scenario, we picked out a few configurations that can help improve user usability and query performance. +### Recommended Cluster Configurations -### SQL&Table Options +In OLAP scenarios, appropriate configurations can greatly improve overall usability and query performance. 
Here are some recommended production configurations: -| Parameters | Default | Recommended | -|:-------------------------------------|:--------|:------------| -| table.optimizer.join-reorder-enabled | false | true | -| pipeline.object-reuse | false | true | +#### SQL&Table Options -### Runtime Options +| Parameters | Default | Recommended | +|:---------------------------------------------------------------------------------------------------------------|:--------|:------------| +| [table.optimizer.join-reorder-enabled]({{<ref "docs/dev/table/config#table-optimizer-join-reorder-enabled">}}) | false | true | +| [pipeline.object-reuse]({{< ref "docs/deployment/config#pipeline-object-reuse" >}}) | false | true | -| Parameters | Default | Recommended | -|:-----------------------------|:-----------------------|:------------------------------------------------------------------------------------------------------------------------------------------| -| execution.runtime-mode | STREAMING | BATCH | -| execution.batch-shuffle-mode | ALL_EXCHANGES_BLOCKING | ALL_EXCHANGES_PIPELINED | -| env.java.opts.all | {default value} | {default value} -XX:PerMethodRecompilationCutoff=10000 -XX:PerBytecodeRecompilationCutoff=10000-XX:ReservedCodeCacheSize=512M -XX:+UseZGC | -| JDK Version | 11 | 17 | +#### Runtime Options -We strongly recommend using JDK17 with ZGC in OLAP scenario in order to provide zero gc stw and solve the issue described in [FLINK-32746](https://issues.apache.org/jira/browse/FLINK-32746). 
+| Parameters | Default | Recommended | +|:--------------------------------------------------------------------------------------------------|:-----------------------|:------------------------------------------------------------------------------------------------------------------------------------------| +| [execution.runtime-mode]({{< ref "docs/deployment/config#execution-runtime-mode" >}}) | STREAMING | BATCH | +| [execution.batch-shuffle-mode]({{< ref "docs/deployment/config#execution-batch-shuffle-mode" >}}) | ALL_EXCHANGES_BLOCKING | ALL_EXCHANGES_PIPELINED | +| [env.java.opts.all]({{< ref "docs/deployment/config#env-java-opts-all" >}}) | {default value} | {default value} -XX:PerMethodRecompilationCutoff=10000 -XX:PerBytecodeRecompilationCutoff=10000 -XX:ReservedCodeCacheSize=512M -XX:+UseZGC | +| JDK Version | 11 | 17 | -### Scheduling Options +Using JDK 17 with ZGC can greatly help mitigate the metaspace garbage collection issue; detailed information can be found in [FLINK-32746](https://issues.apache.org/jira/browse/FLINK-32746). Meanwhile, ZGC provides close-to-zero application pause times when collecting garbage objects in memory. Additionally, OLAP queries need to be executed in `BATCH` mode because both `Pipelined` and `Blocking` edges may appear in the execution plan of an OLAP query. The batch scheduler allows queries to [...] + +#### Scheduling Options | Parameters | Default | Recommended |
- -### Network Options - -| Parameters | Default | Recommended | -|:------------------------------------|:-----------|:---------------| -| rest.server.numThreads | 4 | 32 | -| web.refresh-interval | 3000 | 300000 | -| pekko.framesize | 10485760b | 104857600b | - -### ResourceManager Options - -| Parameters | Default | Recommended | -|:-------------------------------------|:----------|:---------------| -| kubernetes.jobmanager.replicas | 1 | 2 | -| kubernetes.jobmanager.cpu.amount | 1.0 | 16.0 | -| jobmanager.memory.process.size | (none) | 65536m | -| jobmanager.memory.jvm-overhead.max | 1g | 6144m | -| kubernetes.taskmanager.cpu.amount | (none) | 16 | -| taskmanager.numberOfTaskSlots | 1 | 32 | -| taskmanager.memory.process.size | (none) | 65536m | -| taskmanager.memory.managed.size | (none) | 65536m | - -We prefer to use large taskManager pods in OLAP since this can put more computation in local and reduce network/deserialization/serialization overhead. Meanwhile, since JobManager is a single point of calculation in OLAP scenario, we also prefer large pod. - -# Future Work -There is a big margin for improvement in Flink OLAP, both in usability and query performance, and we trace all of them in underlying tickets. 
+|:------------------------------------------------------------------------------------------------------------------------|:------------------|:--------| +| [jobmanager.scheduler]({{< ref "docs/deployment/config#jobmanager-scheduler" >}}) | Default | Default | +| [jobmanager.execution.failover-strategy]({{< ref "docs/deployment/config#jobmanager-execution-failover-strategy-1" >}}) | region | full | +| [restart-strategy.type]({{< ref "docs/deployment/config#restart-strategy-type" >}}) | (none) | disable | +| [jobstore.type]({{< ref "docs/deployment/config#jobstore-type" >}}) | File | Memory | +| [jobstore.max-capacity]({{< ref "docs/deployment/config#jobstore-max-capacity" >}}) | Integer.MAX_VALUE | 500 | + + +#### Network Options + +| Parameters | Default | Recommended | +|:--------------------------------------------------------------------------------------|:-----------|:---------------| +| [rest.server.numThreads]({{< ref "docs/deployment/config#rest-server-numthreads" >}}) | 4 | 32 | +| [web.refresh-interval]({{< ref "docs/deployment/config#web-refresh-interval" >}}) | 3000 | 300000 | +| [pekko.framesize]({{< ref "docs/deployment/config#pekko-framesize" >}}) | 10485760b | 104857600b | + +#### ResourceManager Options + +| Parameters | Default | Recommended | +|:-------------------------------------------------------------------|:--------|:----------------------------------------| +| [kubernetes.jobmanager.replicas]({{< ref "docs/deployment/config#kubernetes-jobmanager-replicas" >}}) | 1 | 2 | +| [kubernetes.jobmanager.cpu.amount]({{< ref "docs/deployment/config#kubernetes-jobmanager-cpu-amount" >}}) | 1.0 | 16.0 | +| [jobmanager.memory.process.size]({{< ref "docs/deployment/config#jobmanager-memory-process-size" >}}) | (none) | 32g | +| [jobmanager.memory.jvm-overhead.max]({{< ref "docs/deployment/config#jobmanager-memory-jvm-overhead-max" >}}) | 1g | 3g | +| [kubernetes.taskmanager.cpu.amount]({{< ref "docs/deployment/config#kubernetes-taskmanager-cpu-amount" 
>}}) | (none) | 16 | +| [taskmanager.numberOfTaskSlots]({{< ref "docs/deployment/config#taskmanager-numberoftaskslots" >}}) | 1 | 32 | +| [taskmanager.memory.process.size]({{< ref "docs/deployment/config#taskmanager-memory-process-size" >}}) | (none) | 65536m | +| [taskmanager.memory.managed.size]({{< ref "docs/deployment/config#taskmanager-memory-managed-size" >}}) | (none) | 16384m | +| [slotmanager.number-of-slots.min]({{< ref "docs/deployment/config#slotmanager-number-of-slots-min" >}}) | 0 | {taskManagerNumber * numberOfTaskSlots} | + +You can configure `slotmanager.number-of-slots.min` to a proper value to serve as the reserved resource pool for OLAP queries. TaskManagers should be configured with a large resource specification in OLAP scenarios, since this keeps more computation local and reduces network/deserialization/serialization overhead. Meanwhile, as the single point of computation in OLAP, the JobManager also benefits from a large resource specification. + +## Future Work +Flink OLAP is now part of the [Apache Flink Roadmap](https://flink.apache.org/what-is-flink/roadmap/), which means the community will keep putting effort into improving Flink OLAP, both in usability and query performance. Relevant work is tracked in the following tickets: - https://issues.apache.org/jira/browse/FLINK-25318 -- https://issues.apache.org/jira/browse/FLINK-32898 - -Furthermore, we are adding relevant OLAP benchmarks to the Flink repository such as [flink-benchmarks](https://github.com/apache/flink-benchmarks). \ No newline at end of file +- https://issues.apache.org/jira/browse/FLINK-32898 \ No newline at end of file diff --git a/docs/content/docs/dev/table/overview.md b/docs/content/docs/dev/table/overview.md index 6715d064c2f..69bc97a1ccd 100644 --- a/docs/content/docs/dev/table/overview.md +++ b/docs/content/docs/dev/table/overview.md @@ -60,7 +60,7 @@ Where to go next? * [Built-in Functions]({{< ref "docs/dev/table/functions/systemFunctions" >}}): Supported functions in Table API and SQL. 
* [SQL Client]({{< ref "docs/dev/table/sqlClient" >}}): Play around with Flink SQL and submit a table program to a cluster without programming knowledge. * [SQL Gateway]({{< ref "docs/dev/table/sql-gateway/overview" >}}): A service that enables the multiple clients to execute SQL from the remote in concurrency. -* [SQL JDBC Driver]({{< ref "docs/dev/table/jdbcDriver" >}}): A JDBC Driver that submits SQL statements to sql-gateway. * [OLAP Quickstart]({{< ref "docs/dev/table/olap_quickstart" >}}): A quickstart to show how to set up a Flink OLAP service. +* [SQL JDBC Driver]({{< ref "docs/dev/table/jdbcDriver" >}}): A JDBC Driver that submits SQL statements to sql-gateway. {{< top >}} diff --git a/docs/static/fig/olap-architecture.svg b/docs/static/fig/olap-architecture.svg new file mode 100644 index 00000000000..2b39666ab86 --- /dev/null +++ b/docs/static/fig/olap-architecture.svg @@ -0,0 +1,21 @@ +<?xml version="1.0" encoding="UTF-8" standalone="no"?> +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + +http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. 
+--> + +<svg version="1.1" viewBox="0.0 0.0 960.0 720.0" fill="none" stroke="none" stroke-linecap="square" stroke-miterlimit="10" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns="http://www.w3.org/2000/svg"><clipPath id="p.0"><path d="m0 0l960.0 0l0 720.0l-960.0 0l0 -720.0z" clip-rule="nonzero"/></clipPath><g clip-path="url(#p.0)"><path fill="#000000" fill-opacity="0.0" d="m0 0l960.0 0l0 720.0l-960.0 0z" fill-rule="evenodd"/><path fill="#000000" fill-opacity="0.0" d="m231.58206 85.755905l274.42 [...] \ No newline at end of file
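For convenience, the recommended production options scattered across the tables in the patch above can be collected into a single configuration fragment. This is only a sketch of a Flink configuration file, assuming a Kubernetes session cluster and JDK 17 with ZGC; the memory sizes and the slot pool size are examples taken from the tables and must be adapted to your own hardware:

```yaml
# SQL & Table options
table.optimizer.join-reorder-enabled: true
pipeline.object-reuse: true

# Runtime options (JVM flags assume JDK 17 with ZGC)
execution.runtime-mode: BATCH
execution.batch-shuffle-mode: ALL_EXCHANGES_PIPELINED
env.java.opts.all: -XX:PerMethodRecompilationCutoff=10000 -XX:PerBytecodeRecompilationCutoff=10000 -XX:ReservedCodeCacheSize=512M -XX:+UseZGC

# Scheduling options
jobmanager.scheduler: Default
jobmanager.execution.failover-strategy: full
restart-strategy.type: disable
jobstore.type: Memory
jobstore.max-capacity: 500

# Network options
rest.server.numThreads: 32
web.refresh-interval: 300000
pekko.framesize: 104857600b

# ResourceManager options (Kubernetes session cluster)
kubernetes.jobmanager.replicas: 2
kubernetes.jobmanager.cpu.amount: 16.0
jobmanager.memory.process.size: 32g
jobmanager.memory.jvm-overhead.max: 3g
kubernetes.taskmanager.cpu.amount: 16
taskmanager.numberOfTaskSlots: 32
taskmanager.memory.process.size: 65536m
taskmanager.memory.managed.size: 16384m
# Reserved slot pool: {taskManagerNumber * numberOfTaskSlots}, e.g. 4 TaskManagers * 32 slots
slotmanager.number-of-slots.min: 128
```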