swuferhong commented on code in PR #20513:
URL: https://github.com/apache/flink/pull/20513#discussion_r954896567


##########
docs/content/docs/dev/table/sql/queries/hints.md:
##########
@@ -84,4 +84,244 @@ insert into kafka_table1 /*+ 
OPTIONS('sink.partitioner'='round-robin') */ select
 
 ```
 
+## Query Hints
+
+`Query Hints` specify that the indicated hints are used in the scope of a 
query. Their effective scope is current `Query 
+block` ([What are query blocks ?](#what-are-query-blocks-)) which `Query 
Hints` specified in. Now, Flink `Query Hints` support `Join Hints`.
+
+### Syntax
+The `Query Hints` syntax in Flink follows the syntax of `Query Hints` in 
Apache Calcite:
+```sql
+# Query Hints:
+SELECT /*+ hint [, hint ] */ ...
+
+hint:
+        hintName
+    |   hintName '(' optionKey '=' optionVal [, optionKey '=' optionVal ]* ')'
+    |   hintName '(' hintOption [, hintOption ]* ')'
+
+optionKey:
+        simpleIdentifier
+    |   stringLiteral
+
+optionVal:
+        stringLiteral
+
+hintOption:
+        simpleIdentifier
+    |   numericLiteral
+    |   stringLiteral
+```
+
+### Join Hints
+
+`Join Hints` allow users to suggest the join strategy to optimizer in order to 
get a more optimal execution plan. 
+Now Flink `Join Hints` support `BROADCAST`, `SHUFFLE_HASH`, `SHUFFLE_MERGE` 
and `NEST_LOOP` Join Hints. Also, for some cases,
+`Join Hints` will conflict, we try to formulate the behavior of `Join Hints` 
for these cases.
+
+*NOTE:* For all types of `Join Hints`, Flink support specifying multiple 
tables in a single `Join Hint` or multiple `Join Hints` like
+`BROADCAST(t1, t2, ..., tn)` or `BROADCAST(t1), BROADCAST(t2), ..., 
BROADCAST(tn)`. During query optimization, planner will
+automatically select the optimal join strategies according to the specified 
`Join Hints`.
+
+#### BROADCAST
+
+{{< label Batch >}}
+
+For this type of `Join Hints`, suggests that join use `BroadCast join` 
strategy. It performs well when the data volume of
+one side of table is very small. The join side with the hint will be broadcast 
regardless of `table.optimizer.join.broadcast-threshold`.
+
+*NOTE:* `BROADCAST Join Hint` only supports join with equivalence join 
condition. And it doesn't support `Full Outer Join`.
+
+##### Examples
+
+```sql
+CREATE TABLE t1 (id BIGINT, name STRING, age INT) WITH (...);
+CREATE TABLE t2 (id BIGINT, name STRING, age INT) WITH (...);
+CREATE TABLE t3 (id BIGINT, name STRING, age INT) WITH (...);
+
+-- Select t1 as join side with this hint, t1 will be the broadcast table.
+SELECT /*+ BROADCAST(t1) */ * FROM t1 JOIN t2 ON t1.id = t2.id;
+
+-- Select t1 and t3 as join side with this hint.
+-- For this case, when join between t1 and t2, t1 will be the broadcast table.
+-- And when join between the result after t1 joins t2 and t3, t3 will be the 
broadcast table.
+SELECT /*+ BROADCAST(t1, t3) */ * FROM t1 JOIN t2 ON t1.id = t2.id JOIN t3 ON 
t1.id = t3.id;
+
+-- BROADCAST Join Hint don't support non-equivalent join conditions.
+-- For this case, default join strategy will be adopted instead of BroadCast 
join.
+SELECT /*+ BROADCAST(t1) */ * FROM t1 join t2 ON t1.id > t2.id;
+
+-- BROADCAST Join Hint strategy don't support full outer join.
+-- For this case, default join strategy will be adopted instead of BroadCast 
join.
+SELECT /*+ BROADCAST(t1) */ * FROM t1 FULL OUTER JOIN t2 ON t1.id = t2.id;
+```
+
+#### SHUFFLE_HASH
+
+{{< label Batch >}}
+
+For this type of `Join Hint`, suggests that join use `Shuffle Hash join` 
strategy. Compare of `BROADCAST`, This type of
+`Joint Hint` is suitable for the scenario of the data volume of both table is 
not small enough.
+
+*NOTE:* `SHUFFLE_HASH Join Hint` only supports join with equivalence join 
condition.
+
+##### Examples
+
+```sql
+CREATE TABLE t1 (id BIGINT, name STRING, age INT) WITH (...);
+CREATE TABLE t2 (id BIGINT, name STRING, age INT) WITH (...);
+CREATE TABLE t3 (id BIGINT, name STRING, age INT) WITH (...);
+
+-- Select t1 as join side with this hint, t1 will be the join build side.
+SELECT /*+ SHUFFLE_HASH(t1) */ * FROM t1 JOIN t2 ON t1.id = t2.id;
+
+-- Select t1 and t3 as join side with this hint.
+-- For this case, when join between t1 and t2, t1 will be the build side,
+-- And when join between the result after t1 joins t2 and t3, t3 will be the 
build side.
+SELECT /*+ SHUFFLE_HASH(t1, t3) */ * FROM t1 JOIN t2 ON t1.id = t2.id JOIN t3 
ON t1.id = t3.id;
+
+-- SHUFFLE_HASH Join Hint don't support non-equivalent join conditions.
+-- For this case, default join strategy will be adopted instead of Hash 
Shuffle join.
+SELECT /*+ SHUFFLE_HASH(t1) */ * FROM t1 join t2 ON t1.id > t2.id;
+```
+
+#### SHUFFLE_MERGE
+
+{{< label Batch >}}
+
+For this type of `Join Hint`, suggests that join use `Sort Merge join` 
strategy. This type of `Join Hint` is recommended for
+using in the scenario of joining between two large tables or the scenario that 
the data at both sides of the join is already in order.
+
+*NOTE:* `SHUFFLE_MERGE Join Hint` strategy only supports join with equivalence 
join condition.
+
+##### Examples
+
+```sql
+CREATE TABLE t1 (id BIGINT, name STRING, age INT) WITH (...);
+CREATE TABLE t2 (id BIGINT, name STRING, age INT) WITH (...);
+CREATE TABLE t3 (id BIGINT, name STRING, age INT) WITH (...);
+
+-- Sort Merge join strategy is adopted.
+SELECT /*+ SHUFFLE_MERGE(t1) */ * FROM t1 JOIN t2 ON t1.id = t2.id;
+
+-- Sort Merge join strategy is both adopted in these two joins.
+SELECT /*+ SHUFFLE_MERGE(t1, t3) */ * FROM t1 JOIN t2 ON t1.id = t2.id JOIN t3 
ON t1.id = t3.id;
+
+-- SHUFFLE_MERGE Join Hint don't support non-equivalent join conditions.
+-- For this case, default join strategy will be adopted instead of Sort Merge 
join.
+SELECT /*+ SHUFFLE_MERGE(t1) */ * FROM t1 join t2 ON t1.id > t2.id;
+```
+
+#### NEST_LOOP
+
+{{< label Batch >}}
+
+For this type of `Join Hint`, suggests that join use `Nested Loop join` 
strategy. This type of join hint is not recommended 
+without special scenario requirements.
+
+*NOTE:* `NEST_LOOP Join Hint` support both equivalent and non-equivalent join 
condition.
+
+##### Examples
+
+```sql
+CREATE TABLE t1 (id BIGINT, name STRING, age INT) WITH (...);
+CREATE TABLE t2 (id BIGINT, name STRING, age INT) WITH (...);
+CREATE TABLE t3 (id BIGINT, name STRING, age INT) WITH (...);
+
+-- Select t1 as join side with this hint, t1 will be the build side.
+SELECT /*+ NEST_LOOP(t1) */ * FROM t1 JOIN t2 ON t1.id = t2.id;
+
+-- Select t1 and t3 as join side with this hint.
+-- For this case, when join between t1 and t2, t1 will be the build side,
+-- after that, when join between the result after t1 joins t2 and t3, t3 will 
be the build side.
+SELECT /*+ NEST_LOOP(t1, t3) */ * FROM t1 JOIN t2 ON t1.id = t2.id JOIN t3 ON 
t1.id = t3.id;
+```
+
+#### Conflict Cases In Join Hints
+
+For some cases, `Join Hints` will conflict, we formulate the behavior of `Join 
Hints` in Flink for the following conditions:
+ - View

Review Comment:
   > Actually, this is not a conflict. the view and the sub-query are different 
query-block, the hint will not be propagated into the sub-queries.
   
   Done!



##########
docs/content/docs/dev/table/sql/queries/hints.md:
##########
@@ -84,4 +84,244 @@ insert into kafka_table1 /*+ 
OPTIONS('sink.partitioner'='round-robin') */ select
 
 ```
 
+## Query Hints
+
+`Query Hints` specify that the indicated hints are used in the scope of a 
query. Their effective scope is current `Query 
+block` ([What are query blocks ?](#what-are-query-blocks-)) which `Query 
Hints` specified in. Now, Flink `Query Hints` support `Join Hints`.
+
+### Syntax
+The `Query Hints` syntax in Flink follows the syntax of `Query Hints` in 
Apache Calcite:
+```sql
+# Query Hints:
+SELECT /*+ hint [, hint ] */ ...
+
+hint:
+        hintName
+    |   hintName '(' optionKey '=' optionVal [, optionKey '=' optionVal ]* ')'
+    |   hintName '(' hintOption [, hintOption ]* ')'
+
+optionKey:
+        simpleIdentifier
+    |   stringLiteral
+
+optionVal:
+        stringLiteral
+
+hintOption:
+        simpleIdentifier
+    |   numericLiteral
+    |   stringLiteral
+```
+
+### Join Hints
+
+`Join Hints` allow users to suggest the join strategy to optimizer in order to 
get a more optimal execution plan. 
+Now Flink `Join Hints` support `BROADCAST`, `SHUFFLE_HASH`, `SHUFFLE_MERGE` 
and `NEST_LOOP` Join Hints. Also, for some cases,
+`Join Hints` will conflict, we try to formulate the behavior of `Join Hints` 
for these cases.
+
+*NOTE:* For all types of `Join Hints`, Flink support specifying multiple 
tables in a single `Join Hint` or multiple `Join Hints` like
+`BROADCAST(t1, t2, ..., tn)` or `BROADCAST(t1), BROADCAST(t2), ..., 
BROADCAST(tn)`. During query optimization, planner will
+automatically select the optimal join strategies according to the specified 
`Join Hints`.
+
+#### BROADCAST
+
+{{< label Batch >}}
+
+For this type of `Join Hints`, suggests that join use `BroadCast join` 
strategy. It performs well when the data volume of
+one side of table is very small. The join side with the hint will be broadcast 
regardless of `table.optimizer.join.broadcast-threshold`.
+
+*NOTE:* `BROADCAST Join Hint` only supports join with equivalence join 
condition. And it doesn't support `Full Outer Join`.
+
+##### Examples
+
+```sql
+CREATE TABLE t1 (id BIGINT, name STRING, age INT) WITH (...);
+CREATE TABLE t2 (id BIGINT, name STRING, age INT) WITH (...);
+CREATE TABLE t3 (id BIGINT, name STRING, age INT) WITH (...);
+
+-- Select t1 as join side with this hint, t1 will be the broadcast table.
+SELECT /*+ BROADCAST(t1) */ * FROM t1 JOIN t2 ON t1.id = t2.id;
+
+-- Select t1 and t3 as join side with this hint.
+-- For this case, when join between t1 and t2, t1 will be the broadcast table.
+-- And when join between the result after t1 joins t2 and t3, t3 will be the 
broadcast table.
+SELECT /*+ BROADCAST(t1, t3) */ * FROM t1 JOIN t2 ON t1.id = t2.id JOIN t3 ON 
t1.id = t3.id;
+
+-- BROADCAST Join Hint don't support non-equivalent join conditions.
+-- For this case, default join strategy will be adopted instead of BroadCast 
join.
+SELECT /*+ BROADCAST(t1) */ * FROM t1 join t2 ON t1.id > t2.id;
+
+-- BROADCAST Join Hint strategy don't support full outer join.
+-- For this case, default join strategy will be adopted instead of BroadCast 
join.
+SELECT /*+ BROADCAST(t1) */ * FROM t1 FULL OUTER JOIN t2 ON t1.id = t2.id;
+```
+
+#### SHUFFLE_HASH
+
+{{< label Batch >}}
+
+For this type of `Join Hint`, suggests that join use `Shuffle Hash join` 
strategy. Compare of `BROADCAST`, This type of
+`Joint Hint` is suitable for the scenario of the data volume of both table is 
not small enough.
+
+*NOTE:* `SHUFFLE_HASH Join Hint` only supports join with equivalence join 
condition.
+
+##### Examples
+
+```sql
+CREATE TABLE t1 (id BIGINT, name STRING, age INT) WITH (...);
+CREATE TABLE t2 (id BIGINT, name STRING, age INT) WITH (...);
+CREATE TABLE t3 (id BIGINT, name STRING, age INT) WITH (...);
+
+-- Select t1 as join side with this hint, t1 will be the join build side.
+SELECT /*+ SHUFFLE_HASH(t1) */ * FROM t1 JOIN t2 ON t1.id = t2.id;
+
+-- Select t1 and t3 as join side with this hint.
+-- For this case, when join between t1 and t2, t1 will be the build side,
+-- And when join between the result after t1 joins t2 and t3, t3 will be the 
build side.
+SELECT /*+ SHUFFLE_HASH(t1, t3) */ * FROM t1 JOIN t2 ON t1.id = t2.id JOIN t3 
ON t1.id = t3.id;
+
+-- SHUFFLE_HASH Join Hint don't support non-equivalent join conditions.
+-- For this case, default join strategy will be adopted instead of Hash 
Shuffle join.
+SELECT /*+ SHUFFLE_HASH(t1) */ * FROM t1 join t2 ON t1.id > t2.id;
+```
+
+#### SHUFFLE_MERGE
+
+{{< label Batch >}}
+
+For this type of `Join Hint`, suggests that join use `Sort Merge join` 
strategy. This type of `Join Hint` is recommended for
+using in the scenario of joining between two large tables or the scenario that 
the data at both sides of the join is already in order.
+
+*NOTE:* `SHUFFLE_MERGE Join Hint` strategy only supports join with equivalence 
join condition.
+
+##### Examples
+
+```sql
+CREATE TABLE t1 (id BIGINT, name STRING, age INT) WITH (...);
+CREATE TABLE t2 (id BIGINT, name STRING, age INT) WITH (...);
+CREATE TABLE t3 (id BIGINT, name STRING, age INT) WITH (...);
+
+-- Sort Merge join strategy is adopted.
+SELECT /*+ SHUFFLE_MERGE(t1) */ * FROM t1 JOIN t2 ON t1.id = t2.id;
+
+-- Sort Merge join strategy is both adopted in these two joins.
+SELECT /*+ SHUFFLE_MERGE(t1, t3) */ * FROM t1 JOIN t2 ON t1.id = t2.id JOIN t3 
ON t1.id = t3.id;
+
+-- SHUFFLE_MERGE Join Hint don't support non-equivalent join conditions.
+-- For this case, default join strategy will be adopted instead of Sort Merge 
join.
+SELECT /*+ SHUFFLE_MERGE(t1) */ * FROM t1 join t2 ON t1.id > t2.id;
+```
+
+#### NEST_LOOP
+
+{{< label Batch >}}
+
+For this type of `Join Hint`, suggests that join use `Nested Loop join` 
strategy. This type of join hint is not recommended 
+without special scenario requirements.
+
+*NOTE:* `NEST_LOOP Join Hint` support both equivalent and non-equivalent join 
condition.
+
+##### Examples
+
+```sql
+CREATE TABLE t1 (id BIGINT, name STRING, age INT) WITH (...);
+CREATE TABLE t2 (id BIGINT, name STRING, age INT) WITH (...);
+CREATE TABLE t3 (id BIGINT, name STRING, age INT) WITH (...);
+
+-- Select t1 as join side with this hint, t1 will be the build side.
+SELECT /*+ NEST_LOOP(t1) */ * FROM t1 JOIN t2 ON t1.id = t2.id;
+
+-- Select t1 and t3 as join side with this hint.
+-- For this case, when join between t1 and t2, t1 will be the build side,
+-- after that, when join between the result after t1 joins t2 and t3, t3 will 
be the build side.
+SELECT /*+ NEST_LOOP(t1, t3) */ * FROM t1 JOIN t2 ON t1.id = t2.id JOIN t3 ON 
t1.id = t3.id;
+```
+
+#### Conflict Cases In Join Hints
+
+For some cases, `Join Hints` will conflict, we formulate the behavior of `Join 
Hints` in Flink for the following conditions:
+ - View
+ - Conflict in one same Join Hint strategy
+ - Conflict in different Join Hints strategies
+
+##### Examples
+
+```sql
+CREATE TABLE t1 (id BIGINT, name STRING, age INT) WITH (...);
+CREATE TABLE t2 (id BIGINT, name STRING, age INT) WITH (...);
+CREATE TABLE t3 (id BIGINT, name STRING, age INT) WITH (...);
+
+-- View Case
+
+CREATE VIEW view1 AS SELECT t1.* FROM t1 JOIN ON t2 ON t1.id = t2.id;
+
+-- For this view case, BROADCAST Join Hints will not be propagated into view1, 
and join inside view1 will use default join strategy.
+SELECT /*+ BROADCAST(t1) */ * FROM view1 JOIN t3 ON view1.id = t3.id;
+
+-- For this view case, the join inside view1 will use default join strategy, 
and join between view1 and t1 will use BroadCast join strategy.
+SELECT /*+ BROADCAST(t1) */ * FROM view1 JOIN t1 ON view1.id = t1.id;
+
+
+-- Conflict in One Same Join Hints Strategy Case
+
+-- For this conflict in one same Join Hints strategy case, the first hint will 
be chosen, t2 will be the broadcast table.
+SELECT /*+ BROADCAST(t2), BROADCAST(t1) */ * FROM t1 JOIN t2 ON t1.id = t2.id;
+
+-- For this conflict in one same Join Hints strategy case, t1 will be the 
broadcast table.
+SELECT /*+ BROADCAST(t1, t1) */ * FROM t1 JOIN t2 ON t1.id = t2.id;

Review Comment:
   > it's not a conflict
   
   Done!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to