Re: [DISCUSS]FLIP-113: Support SQL and planner hints

Timo Walther Tue, 17 Mar 2020 07:11:22 -0700

Hi Danny,

thanks for updating the FLIP. I think your current design is sufficient to separate hints from result-related properties.

One remark to the naming itself: I would vote for calling the hints around table scan `OPTIONS('k'='v')`. We used the term "properties" in the past but since we want to unify the Flink configuration experience, we should use consistent naming and classes around `ConfigOptions`.

It would be nice to use `Set<ConfigOption> supportedHintOptions();` to start using config options instead of pure string properties. This will also allow us to generate documentation in the future around supported data types, ranges, etc. for options. At some point we would also like to drop `DescriptorProperties` class. "Options" is also used in the documentation [1] and in the SQL/MED standard [2].

Furthermore, I would still vote for separating CatalogTable and hint options. Otherwise the planner would need to create a new CatalogTable instance which might not always be easy. We should offer them via:

org.apache.flink.table.factories.TableSourceFactory.Context#getHints: ReadableConfig


What do you think?

Regards,
Timo

[1] https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/create.html#create-table

[2] https://wiki.postgresql.org/wiki/SQL/MED
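
The separation suggested in the message above (hint options handed to the factory next to, not inside, the catalog table options) could look roughly like the following. This is only a sketch under stated assumptions: `HintSeparation`, `effectiveOptions`, and the option keys used with it are hypothetical names for illustration and are not part of Flink's API; plain string maps stand in for `ReadableConfig`.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Flink: hint options are kept apart from
// the statically defined table options and merged only on request.
final class HintSeparation {

    // Returns a new map in which only the hint keys that the factory declared
    // as supported override the catalog options. The catalog map itself is
    // never modified, so no new catalog table instance is needed.
    static Map<String, String> effectiveOptions(
            Map<String, String> catalogOptions,
            Map<String, String> hintOptions,
            Set<String> supportedHintOptions) {
        Map<String, String> effective = new LinkedHashMap<>(catalogOptions);
        for (Map.Entry<String, String> hint : hintOptions.entrySet()) {
            if (supportedHintOptions.contains(hint.getKey())) {
                effective.put(hint.getKey(), hint.getValue());
            }
        }
        return Collections.unmodifiableMap(effective);
    }
}
```

In this sketch unsupported hint keys are simply skipped; whether they should instead be rejected is a separate design question that the sketch leaves open.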


On 12.03.20 15:06, Stephan Ewen wrote:

@Danny sounds good.

Maybe it is worth listing all the classes of problems that you want to
address and then looking at each class to see whether hints are a good
default solution or a good optional way of simplifying things?
The discussion has grown a lot and it is starting to be hard to distinguish
the parts where everyone agrees from the parts where there are concerns.

On Thu, Mar 12, 2020 at 2:31 PM Danny Chan <[email protected]> wrote:

Thanks Stephan ~

We can remove the support for properties that may change the semantics of
a query if you think that is trouble.

How about we support the /*+ properties() */ hint only for optimization
parameters, such as the fetch size of a source or something like that? Does
that make sense?

On Thu, Mar 12, 2020 at 7:45 PM Stephan Ewen <[email protected]> wrote:

I think Bowen has actually put it very well.

(1) Hints that change semantics look like trouble waiting to happen. For
example, Kafka offset handling should be in filters. The Kafka source should
support predicate pushdown.

(2) Hints should not be a workaround for current shortcomings. A lot of
what is suggested above sounds exactly like that: working around
catalog/DDL shortcomings, missing exposure of metadata (offsets), missing
predicate pushdown in Kafka. Abusing a feature like hints now as a quick
fix for these issues, rather than fixing the root causes, will most likely
come back to bite us badly in the future.

Best,
Stephan


On Thu, Mar 12, 2020 at 10:43 AM Kurt Young <[email protected]> wrote:

It seems this FLIP's name is somewhat misleading. From my understanding,
this FLIP is trying to address the dynamic parameter issue, and table hints
are the way we want to choose. I think we should focus on "what's the right
way to solve dynamic property" instead of discussing "whether table hints
can affect query semantics".

For now, there are three proposed ways to achieve dynamic property:
1. FLIP-110: create temporary table xx like xx with (xxx)
2. use custom "from t with (xxx)" syntax
3. "Borrow" the table hints to have a special PROPERTIES hint.

The first one doesn't break anything; the only problem I see is that it is
a little more verbose than the table hint approach. I can imagine that when
someone uses the SQL CLI to have a SQL experience, quite often they will
modify the table properties. Some use cases I can think of:
1. The source contains some corrupted data, and I want to turn on the
"ignore-error" flag for certain formats.
2. I have a Kafka table and want to see some sample data from the
beginning, so I change the offset to "earliest", and then I want to observe
the latest data which keeps coming in. I would write another query to
select from the latest table.
3. I want my JDBC sink to flush data more eagerly so that I can observe the
data from the database side.

Most of such use cases are quite ad hoc. If every time I want to have a
different experience I need to create a temporary table and then also
modify my query, it doesn't feel smooth. Embedding such dynamic properties
into the query would give a better user experience.

Both #2 and #3 can make this happen. The con of #2 is that it breaks SQL
compliance, and #3 only breaks some unwritten rules, but we can have an
explanation for that. And I really doubt whether users would complain about
this when they actually have a flexible and good experience using it.

My tendency would be #3 > #1 > #2, what do you think?

Best,
Kurt


On Thu, Mar 12, 2020 at 1:11 PM Danny Chan <[email protected]> wrote:

Thanks Aljoscha ~

I agree that most query hints are optional optimizer instructions,
especially for the traditional RDBMS.

But, just like BenChao said, Flink as a computation engine has many
different kinds of data sources, thus dynamic parameters like start_offset
can only be bound to each table scope. We cannot set a session config like
KSQL does, because there the configs are all about Kafka:

SET 'auto.offset.reset'='earliest';

Thus the most flexible way to set up these dynamic params is to bind them
to the table scope in the query when we want to override something, so we
have these solutions above (with pros and cons from my side):

• 1. Select * from t(offset=123) (from Timo)

            Pros:
              - Easy to add
              - Parameters are part of the main query
            Cons:
              - Not SQL compliant


• 2. Select * from t /*+ PROPERTIES(offset=123) */ (from me)

            Pros:
            - Easy to add
            - SQL compliant because it is nested in the comments

            Cons:
            - Parameters are not part of the main query
            - Cryptic syntax for new users

The biggest problem for the hints way may be the question of "whether hints
must be optional". Actually we have thought about 1 for a while but aborted
it because it breaks the SQL standard too much. We replaced it with 2,
because the hints syntax does not break the SQL standard (it is nested in
comments).

What if we have the special /*+ PROPERTIES */ hint that allows overriding
some properties of a table dynamically? It does not break anything, at
least for current Flink use cases.

Planner hints are optional just because they are naturally enforcers of
the planner; most of them aim to instruct the optimizer. But table hints
are a little different: table hints can specify table metadata like an
index column, and they are a very convenient way to specify table
properties.

Or shall we not call /*+ PROPERTIES(offset=123) */ a table hint; we can
call it table dynamic parameters.

Best,
Danny Chan
On Mar 11, 2020, 9:20 PM +0800, Aljoscha Krettek <[email protected]> wrote:

Hi,

I don't understand this discussion. Hints, as I understand them, should
work like this:

- hints are *optional* advice for the optimizer to try and help it to
find a good execution strategy
- hints should not change query semantics, i.e. they should not change
connector properties. Executing a query with taking into account the
hints *must* produce the same result as executing the query without
taking into account the hints

From these simple requirements you can derive a solution that makes
sense. I don't have a strong preference for the syntax but we should
strive to be in line with prior work.

Best,
Aljoscha

On 11.03.20 11:53, Danny Chan wrote:

Thanks Timo for summarizing the 3 options ~

I agree with Kurt that option 2 is too complicated to use because:

• As a Kafka topic consumer, the user must define the virtual column for
the start offset and must also apply a special filter predicate after each
query

• And for the internal implementation, the metadata column pushdown is
another hard topic. Each kind of message queue may have its own offset
attribute, and we need to consider the expression type for the different
kinds; the source also needs to recognize the constant column as a config
option (which is weird because usually what we push down is a table column)


For option 1 and option 3, I think there is no difference. Option 1 is
also a hint syntax, which was introduced in Sybase and adopted, then
deprecated, by MS-SQL in the 1990s because of its ambiguity. Personally
I prefer the /*+ */ style table hint over the WITH keyword for these
reasons:

• We do not break standard SQL; the hints are nested in SQL comments

• We do not need to introduce an additional WITH keyword, which could
appear anywhere in a query because a table can be referenced in all kinds
of SQL contexts: INSERT/DELETE/FROM/JOIN …. That would make our SQL queries
diverge too much from the SQL standard

• We would have a uniform syntax for table hints and query hints; one
syntax fits all and is easier to use



And here is the reason why we chose a uniform Oracle-style query hint
syntax, as addressed by Julian Hyde when we designed the syntax in the
Calcite community:


I don’t much like the MSSQL-style syntax for table hints. It adds a
new use of the WITH keyword that is unrelated to the use of WITH for
common-table expressions.

A historical note. Microsoft SQL Server inherited its hint syntax from
Sybase a very long time ago. (See “Transact SQL Programming”[1], page 632,
“Optimizer hints”. The book was written in 1999, and covers Microsoft SQL
Server 6.5 / 7.0 and Sybase Adaptive Server 11.5, but the syntax very
likely predates Sybase 4.3, from which Microsoft SQL Server was forked in
1993.)

Microsoft later added the WITH keyword to make it less ambiguous, and
has now deprecated the syntax that does not use WITH.

They are forced to keep the syntax for backwards compatibility but
that doesn’t mean that we should shoulder their burden.

I think formatted comments are the right container for hints because
it allows us to change the hint syntax without changing the SQL parser, and
makes clear that we are at liberty to ignore hints entirely.


Julian

[1] https://www.amazon.com/s?k=9781565924017


Best,
Danny Chan
On Mar 11, 2020, 4:03 PM +0800, Timo Walther <[email protected]> wrote:

Hi Danny,

it is true that our DDL is not standard compliant by using the WITH
clause. Nevertheless, we aim for not diverging too much and the LIKE
clause is an example of that. It will solve things like overwriting
WATERMARKs, adding additional/modifying properties and inheriting the
schema.

Bowen is right that Flink's DDL is mixing 3 types of definitions together.
We are not the first ones that try to solve this. There is also the SQL/MED
standard [1] that tried to tackle this problem. I think it was not
considered when designing the current DDL.

Currently, I see 3 options for handling Kafka offsets. I will give some
examples and look forward to feedback here:

*Option 1* Runtime and semantic params as part of the query

`SELECT * FROM MyTable('offset'=123)`

Pros:
- Easy to add
- Parameters are part of the main query
- No complicated hinting syntax

Cons:
- Not SQL compliant

*Option 2* Use metadata in query

`CREATE TABLE MyTable (id INT, offset AS SYSTEM_METADATA('offset'))`

`SELECT * FROM MyTable WHERE offset > TIMESTAMP '2012-12-12 12:34:22'`


Pros:
- SQL compliant in the query
- Access of metadata in the DDL which is required anyway
- Regular pushdown rules apply

Cons:
- Users need to add an additional column in the DDL

*Option 3*: Use hints for properties

`
SELECT *
FROM MyTable /*+ PROPERTIES('offset'=123) */
`

Pros:
- Easy to add

Cons:
- Parameters are not part of the main query
- Cryptic syntax for new users
- Not standard compliant.

If we go with this option, I would suggest making the hints available in a
separate map and not mixing them with statically defined properties, such
that the factory can decide which properties have the right to be
overwritten by the hints:
TableSourceFactory.Context.getQueryHints(): ReadableConfig

Regards,
Timo

[1] https://en.wikipedia.org/wiki/SQL/MED



On 11.03.20 07:21, Danny Chan wrote:

Thanks Bowen ~

I agree we should somehow categorize our connector parameters.

For type 1, I'm already preparing a solution like the Confluent schema
registry + Avro schema inference thing, so this may not be a problem in
the near future.

For type 3, I have some questions:

"SELECT * FROM mykafka WHERE offset > 12pm yesterday"

Where does the offset column come from? Is it a virtual column from the
table schema? You said that

"They change almost every time a query starts and have nothing to do with
metadata, thus should not be part of table definition/DDL"

But then why can you reference it in the query? I'm confused about that,
can you elaborate a little?


Best,
Danny Chan
On Mar 11, 2020, 12:52 PM +0800, Bowen Li <[email protected]> wrote:

Thanks Danny for kicking off the effort

The root cause of too much manual work is that Flink DDL has mixed 3 types
of params together and doesn't handle each of them very well. Below is how
I categorize them and the corresponding solutions in my mind:

- type 1: Metadata of external data, like external endpoint/url,
username/pwd, schemas, formats.

Such metadata are mostly already accessible in external systems as long as
endpoints and credentials are provided. Flink can get it through catalogs,
but we haven't had many catalogs yet and thus Flink just hasn't been able
to leverage that. So the solution should be building more catalogs. Such
params should be part of a Flink table DDL/definition, and not overridable
by any means.


- type 2: Runtime params, like jdbc connector's fetch size, elasticsearch
connector's bulk flush size.

Such params don't affect query results, but affect how results are produced
(e.g. fast or slow, aka performance) - they are essentially execution and
implementation details. They change often in exploration or development
stages, but not quite frequently in well-defined long-running pipelines.
They should always have default values and can be missing in a query. They
can be part of a table DDL/definition, but should also be replaceable in a
query - *this is what table "hints" in FLIP-113 should cover*.



- type 3: Semantic params, like kafka connector's start offset.

Such params affect query results - the semantics. They'd better be filter
conditions in the WHERE clause that can be pushed down. They change almost
every time a query starts and have nothing to do with metadata, thus should
not be part of the table definition/DDL, nor be persisted in catalogs.
If they need to be, users should create views to keep such params around
(note this is different from variable substitution).


Take Flink-Kafka as an example. Once we get these params right, here are
the steps users need to do to develop and run a Flink job:
- configure a Flink ConfluentSchemaRegistry with url, username, and password
- run "SELECT * FROM mykafka WHERE offset > 12pm yesterday" (simplified
timestamp) in SQL CLI; Flink automatically retrieves all metadata of
schema, file format, etc. and starts the job
- users want to make the job read the Kafka topic faster, so it becomes
"SELECT * FROM mykafka /* faster_read_key=value*/ WHERE offset > 12pm
yesterday"
- done and satisfied, users submit it to production


Regarding "CREATE TABLE t LIKE with (k1=v1, k2=v2)", I think it's a
nice-to-have feature, but not a strategically critical, long-term solution,
because
1) It may seem promising at the current stage to solve the
too-much-manual-work problem, but that's only because Flink hasn't
leveraged catalogs well and handled the 3 types of params above properly.
Once we get the param types right, the LIKE syntax won't be that
important, and will be just an easier way to create tables without retyping
long fields like username and pwd.
2) Note that only some rare types of catalog can store k-v property pairs,
so tables created this way often cannot be persisted. In the foreseeable
future, such a catalog will only be HiveCatalog, and not everyone has a Hive
metastore. To be honest, without persistence, recreating tables every time
this way is still a lot of keyboard typing.

Cheers,
Bowen

On Tue, Mar 10, 2020 at 8:07 PM Kurt Young <[email protected]> wrote:

If a specific connector wants to have such a parameter and read it out of
the configuration, then that's fine.
If we are talking about a configuration for all kinds of sources, I would
be super careful about that.
It's true it can solve maybe 80% of cases, but it will also make the
remaining 20% feel weird.

Best,
Kurt


On Wed, Mar 11, 2020 at 11:00 AM Jark Wu <[email protected]> wrote:

Hi Kurt,

#3 Regarding the global offset:
I'm not saying to use the global configuration to override connector
properties by the planner.
But the connector should take this configuration and translate it into its
client API.
AFAIK, almost all the message queues support earliest and latest and a
timestamp value as start point.
So we can support 3 options for this configuration: "earliest", "latest"
and a timestamp string value.
Of course, this can't solve 100% of cases, but I guess it can solve 80% or
90% of cases.
And the remaining cases can be resolved by the LIKE syntax, which I guess
are not very common cases.
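
A connector-side reading of the three values proposed above ("earliest", "latest", or a timestamp string) might be sketched as follows. The class `StartOffset` is hypothetical and not part of Flink, and the thread does not say which timestamp format is meant; ISO-8601 instants are assumed here purely for illustration.

```java
import java.time.Instant;

// Hypothetical value type, not part of Flink: the parsed form of a
// session-level start offset setting.
final class StartOffset {
    enum Kind { EARLIEST, LATEST, TIMESTAMP }

    final Kind kind;
    final Instant timestamp; // non-null only when kind == TIMESTAMP

    private StartOffset(Kind kind, Instant timestamp) {
        this.kind = kind;
        this.timestamp = timestamp;
    }

    // Accepts "earliest", "latest" (case-insensitive) or an ISO-8601 instant.
    // Any other value makes Instant.parse throw a DateTimeParseException.
    static StartOffset parse(String value) {
        String v = value.trim();
        if (v.equalsIgnoreCase("earliest")) {
            return new StartOffset(Kind.EARLIEST, null);
        }
        if (v.equalsIgnoreCase("latest")) {
            return new StartOffset(Kind.LATEST, null);
        }
        return new StartOffset(Kind.TIMESTAMP, Instant.parse(v));
    }
}
```

Each connector would then translate the parsed value into its own client API, which is the part this sketch deliberately leaves out.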

Best,
Jark


On Wed, 11 Mar 2020 at 10:33, Kurt Young <[email protected]> wrote:

Good to have such lovely discussions. I also want to share some of my
opinions.

#1 Regarding error handling: I also think ignoring invalid hints would be
dangerous; maybe the simplest solution is to just throw an exception.

#2 Regarding property replacement: I don't think we should constrain
ourselves to the meaning of the word "hint", and forbid it from modifying
any properties which can affect query results. IMO `PROPERTIES` is one of
the table hints, and a powerful one. It can modify properties located in
the DDL's WITH block. But I also see the harm if we make it too flexible,
like changing the Kafka topic name with a hint. Such a use case is not
common and sounds very dangerous to me. I would propose we have a map of
hintable properties for each connector, and validate that all passed-in
properties are actually hintable. And combined with #1 error handling, we
can throw an exception once we receive an invalid property.
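
The validation proposed in #2, combined with the fail-fast handling from #1, can be sketched as below. `HintValidator` and the property keys passed to it are hypothetical; the thread does not define how a connector would actually publish its hintable properties, so a plain set of keys is assumed.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Flink: rejects hint properties that the
// connector has not declared as hintable.
final class HintValidator {

    // Throws if any passed-in hint key is missing from the connector's set
    // of hintable properties; returns normally otherwise.
    static void validate(Set<String> hintable, Map<String, String> hints) {
        Set<String> rejected = new TreeSet<>(hints.keySet());
        rejected.removeAll(hintable);
        if (!rejected.isEmpty()) {
            throw new IllegalArgumentException(
                    "Properties not hintable for this connector: " + rejected);
        }
    }
}
```

With a connector that lists only a fetch-size style key as hintable, a hint that tries to change the topic name would fail here instead of silently altering what the query reads.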

#3 Regarding global offset: I'm not sure it's feasible. Different
connectors will have totally different properties to represent offset; some
might be timestamps, some might be string literals like "earliest", and
others might be just integers.

Best,
Kurt


On Tue, Mar 10, 2020 at 11:46 PM Jark Wu <[email protected]> wrote:

Hi everyone,

I want to jump in the discussion about the "dynamic start offset" problem.
First of all, I share the same concern with Timo and Fabian, that the
"start offset" affects the query semantics, i.e. the query result.
But "hints" are just used for optimization, which should not affect the
result.


I think the "dynamic start offset" is a very important usability problem
which will be faced by many streaming platforms.
I also agree "CREATE TEMPORARY TABLE Temp (LIKE t) WITH
('connector.startup-timestamp-millis' = '1578538374471')" is verbose; what
if we have 10 tables to join?

However, what I want to propose (should be another thread) is a global
configuration to reset start offsets of all the source connectors in the
query session, e.g. "table.sources.start-offset". This is possible now
because `TableSourceFactory.Context` has a `getConfiguration` method to get
the session configuration, which can be used to create an adapted
TableSource.
Then we can also expose it to the SQL CLI via the SET command, e.g. `SET
'table.sources.start-offset'='earliest';`, which is pretty simple and
straightforward.

This is very similar to KSQL's `SET 'auto.offset.reset'='earliest'` which
is very helpful IMO.

Best,
Jark


On Tue, 10 Mar 2020 at 22:29, Timo Walther <[email protected]> wrote:

Hi Danny,

compared to the hints, FLIP-110 is fully compliant to the SQL standard.

I don't think that `CREATE TEMPORARY TABLE Temp (LIKE t) WITH (k=v)` is
too verbose or awkward for the power of basically changing the entire
connector. Usually, this statement would just precede the query in a
multiline file. So it can be changed "in-place" like the hints you
proposed.

Many companies have a well-defined set of tables that should be used. It
would be dangerous if users can change the path or topic in a hint. The
catalog/catalog manager should be the entity that controls which tables
exist and how they can be accessed.

what’s the problem there if we use the table hints to support “start
offset”?

IMHO it violates the meaning of a hint. According to the dictionary, a
hint is "a statement that expresses indirectly what one prefers not to
say explicitly". But offsets are a property that is very explicit.

If we go with the hint approach, it should be expressible in the
TableSourceFactory which properties are supported for hinting. Or do you
plan to offer those hints in a separate Map<String, String> that cannot
overwrite existing properties? I think this would be a different story...


Regards,
Timo


On 10.03.20 10:34, Danny Chan wrote:

Thanks Timo ~

Personally I would say that offset > 0 and start offset = 10 do not have
the same semantics, so from the SQL aspect, we cannot implement a
"starting offset" hint for a query with such a syntax.

And the CREATE TABLE LIKE syntax is a DDL which is just verbose for
defining such dynamic parameters even if it could do that. Shall we force
users to define a temporary table for each query with dynamic params? I
would say it's an awkward solution.

"Hints should give "hints" but not affect the actual produced result."
You mentioned that multiple times; could you give a reason? What's the
problem there if we use the table hints to support "start offset"? From
my side I saw some benefits for that:



• It's very convenient to set up these parameters; the syntax is very much
like the DDL definition

• Its scope is very clear, right on the table it is attached to

• It does not affect the table schema, which means that in order to specify
the offset, there is no need to define an offset column, which is weird
actually; offset should never be a column, it's more like metadata or a
start option.

So in total, FLIP-110 uses the offset more like a Hive partition prune.
We can do that if we have an offset column, but in most cases we do not
define one, so there is actually no conflict or overlap.


Best,
Danny Chan
On Mar 10, 2020, 4:28 PM +0800, Timo Walther <[email protected]> wrote:

Hi Danny,

shouldn't FLIP-110[1] solve most of the problems we have around defining
table properties more dynamically without manual schema work? Also
offset definition is easier with such a syntax. They do not have to be
defined in a catalog but could be temporary tables that extend from the
original table.

In general, we should aim to keep the syntax concise and not provide
too many ways of doing the same thing. Hints should give "hints" but not
affect the actual produced result.

Some connector properties might also change the plan or schema in the
future. E.g. they might also define whether a table source supports
certain push-downs (e.g. predicate push-down).

Dawid is currently working on a draft that might make it possible to
expose a Kafka offset via the schema such that `SELECT * FROM Topic
WHERE offset > 10` would become possible and could be pushed down. But
this is, of course, not planned initially.

Regards,
Timo


[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-110%3A+Support+LIKE+clause+in+CREATE+TABLE




On 10.03.20 08:34, Danny Chan wrote:

Thanks Wenlong ~

For PROPERTIES Hint Error handling

Actually we have no way to figure out whether an erroneous hint is a
PROPERTIES hint. For example, if a user writes a hint like 'PROPERTIAS', we
do not know if this hint is a PROPERTIES hint; what we know is that the
hint name was not registered in our Flink.

If the user writes the hint name correctly (i.e. PROPERTIES), we can
enforce the validation of the hint options through the pluggable
HintOptionChecker.


For PROPERTIES Hint Option Format

For a key-value style hint option, the key can be either a simple
identifier or a string literal, which means that it's compatible with our
DDL syntax. We support simple identifiers because many other hints do not
have complex composite keys like the table properties do, and we want to
unify the parse block.
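
To make the two accepted key forms concrete, here is a small regex-based sketch that reads the option list inside a hint such as PROPERTIES(k1='v1', 'connector.type'='kafka'). It is an illustration only and not how the Calcite or Flink parser is implemented: `HintOptionParser` is a hypothetical name, and values are assumed to always be single-quoted string literals without embedded quotes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the real parser: accepts keys written either as
// a simple identifier (k1) or as a string literal ('connector.type').
final class HintOptionParser {
    private static final Pattern PAIR = Pattern.compile(
            "\\s*(?:'([^']*)'|([A-Za-z_][A-Za-z0-9_]*))\\s*=\\s*'([^']*)'\\s*(,|$)");

    // Parses "k1='v1', 'a.b'='v2'" into an ordered key/value map and throws
    // IllegalArgumentException at the first malformed pair.
    static Map<String, String> parse(String options) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(options);
        int pos = 0;
        while (pos < options.length()) {
            // Each pair must start exactly where the previous one ended.
            if (!m.find(pos) || m.start() != pos) {
                throw new IllegalArgumentException(
                        "Malformed hint option at position " + pos);
            }
            String key = m.group(1) != null ? m.group(1) : m.group(2);
            result.put(key, m.group(3));
            pos = m.end();
        }
        return result;
    }
}
```

A dotted key such as connector.type is only accepted in its quoted form, which mirrors why string-literal keys are needed for table properties while simple identifiers suffice for most other hints.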


Best,
Danny Chan
On Mar 10, 2020, 3:19 PM +0800, wenlong.lwl <[email protected]> wrote:

Hi Danny, thanks for the proposal. +1 for adding table hints, it is really
a necessary feature for Flink SQL to integrate with a catalog.

For error handling, I think it would be more natural to throw an
exception when an erroneous table hint is provided, because the properties
in the hint will be merged and used to find the table factory, which would
cause an exception when erroneous properties are provided, right? On the
other hand, unlike other hints which just affect the way the query is
executed, the property table hint actually affects the result of the
query, so we should never ignore the given property hints.

For the format of property hints: currently, in the SQL client, we accept
properties in string format only in DDL: 'connector.type'='kafka'. I
think the format of properties in hints should be the same as the format
we defined in DDL. What do you think?

Bests,
Wenlong Lyu

On Tue, 10 Mar 2020 at 14:22, Danny Chan <[email protected]> wrote:

To Weike: About the Error Handling

To be consistent with other SQL vendors, the default is to log warnings,
and if there is any error (invalid hint name or options), the hint is just
ignored. I have already addressed this in the wiki.


To Timo: About the PROPERTIES Table Hint

• The properties hints are also optional; the user can pass in an option to
override the table properties, but this does not mean it is required.
• They should not include semantics: do the properties belong to
semantics? I don't think so, the plan does not change, right? The result
set may be affected, but there are already some hints that do so, for
example the MS-SQL MAXRECURSION and SNAPSHOT hints [1]
• `SELECT * FROM t(k=v, k=v)`: this grammar breaks the SQL standard
compared to the hints way (which is included in comments)
• I actually didn't find any vendors that support such grammar, and there
is no way to override table-level properties dynamically. For a normal
RDBMS, I think there are no requests for such dynamic parameters because
all the tables have the same storage and computation and they are almost
all batch tables.
• Flink as a computation engine, meanwhile, has many connectors. Especially
for some message queues like Kafka, we would have a start_offset which is
different each time we start the query. Such parameters cannot be
persisted to a catalog because they are not static; this is actually the
background for why we propose the table hints to indicate such properties
dynamically.

To Jark and Jingsong: I have removed the query hints part and changed the
title.

[1] https://docs.microsoft.com/en-us/sql/t-sql/queries/hints-transact-sql-query?view=sql-server-ver15


Best,
Danny Chan
On Mar 9, 2020, 5:46 PM +0800, Timo Walther <[email protected]> wrote:

Hi Danny,

thanks for the proposal. I agree with Jark and Jingsong. Planner hints
and table hints are orthogonal topics that should be discussed
separately.

I share Jingsong's opinion that we should not use planner hints for
passing connector properties. Planner hints should be optional at any
time. They should not include semantics but only affect execution time.
Connector properties are an important part of the query itself.

Have you thought about options such as `SELECT * FROM t(k=v, k=v)`? How
are other vendors dealing with this problem?


Regards,
Timo


On 09.03.20 10:37, Jingsong Li wrote:

Hi Danny, +1 for table hints, thanks for driving.

I took a look at the FLIP; most of the content is talking about query
hints. That makes it hard to discuss and vote on. So +1 to split it as
Jark said.

Another thing is which configuration is suitable to configure with table
hints: "connector.path" and "connector.topic", are they really suitable for
table hints? Looks weird to me, because I think these properties are the
core of the table.

Best,
Jingsong Lee

On Mon, Mar 9, 2020 at 5:30 PM Jark Wu <[email protected]> wrote:

Thanks Danny for starting the discussion.
+1 for this feature.

If we just focus on the table hints, not the query hints, in this release,
could you split the FLIP into two FLIPs?
Because it's hard to vote on part of a FLIP. You can keep the table
hints proposal in FLIP-113 and move the query hints into another FLIP,
so that we can focus on the table hints in this FLIP.


Thanks,
Jark



On Mon, 9 Mar 2020 at 17:14, DONG, Weike <[email protected]> wrote:

Hi Danny,

This is a nice feature, +1.

One thing I am interested in but is not mentioned in the proposal is the
error handling. It is quite common for users to write inappropriate
hints in SQL code; if illegal or "bad" hints are given, would the system
simply ignore them or throw exceptions?


Thanks : )

Best,
Weike

On Mon, Mar 9, 2020 at 5:02 PM Danny Chan <[email protected]> wrote:

Note:
we only plan to support table hints in Flink release 1.11, so please focus
mainly on the table hints part and just ignore the planner hints, sorry for
that mistake ~

Best,
Danny Chan
On Mar 9, 2020, 4:36 PM +0800, Danny Chan <[email protected]> wrote:

Hi, fellows ~

I would like to propose support for SQL hints in our Flink SQL.

We would support the hints syntax as follows:

select /*+ NO_HASH_JOIN, RESOURCE(mem='128mb', parallelism='24') */
from
emp /*+ INDEX(idx1, idx2) */
join
dept /*+ PROPERTIES(k1='v1', k2='v2') */
on
emp.deptno = dept.deptno

Basically we would support both query hints (after the SELECT keyword)
and table hints (after the referenced table name). For 1.11, we plan to
only support table hints, with a hint probably named PROPERTIES:

table_name /*+ PROPERTIES(k1='v1', k2='v2') */

I am looking forward to your comments.

You can access the FLIP here:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-113%3A+SQL+and+Planner+Hints


Best,
Danny Chan
