Re: [DISCUSS] FLIP-234: Support Retryable Lookup Join To Solve Delayed Updates Issue In External Systems

2022-08-10 Thread Rascal Wu
Thanks a lot, Lincoln, for your quick response and clear explanation.

It's clear now. So regarding "*For the retry support in FLIP-234, it only
provides a 'lookup_miss' retry predicate which is to retry if the record
can not be found*", let me double-confirm:
1) If the dim record still doesn't exist in the dim table after the retries
finish, the output will be +[order1, id1, null]; if the dim record arrives
in the dim table before the last retry, the output will be +[order1, id1,
xxx], right? (assuming a left join here)
2) And if I expect the latest event (+[order1, id1, 12]) to update the
previous event (+[order1, id1, 10]), I should use a regular join instead of
a lookup join, right?

Lincoln Lee wrote on Thu, Aug 11, 2022 at 12:01:

> @Rascal  Thanks for looking at this new feature! There is a lot of content
> in these two FLIPs, and we will prepare more detailed user documentation
> before the 1.16 release.
>
> First of all, unlike a regular join, a lookup join only triggers the
> lookup action (accessing the dimension table) through the records of the
> stream table; updates to the dimension table itself do not actively
> trigger updated results downstream.
>
> For your example, whether retry is enabled or not, only the first
> "+[order1, id1, 10]" will be sent downstream (because only one order
> record has come); the new update of the dimension table "10:02 -> (id1,
> 12)" will not trigger an updated result.
>
> For the retry support in FLIP-234, it only provides a 'lookup_miss' retry
> predicate, which retries if the record cannot be found (not always
> equivalent to the complete join condition [1]), and does not trigger a
> retry if a non-null value can be found in the dimension table. If more
> complex value-check logic is required, a viable way is to implement a
> custom `AsyncRetryPredicate` in the DataStream API (as FLIP-232
> introduced).
>
> [1]: For different connectors, the index-lookup capability may differ,
> e.g., HBase can look up on the rowkey only (by default, without a
> secondary index), while an RDBMS can provide more powerful index-lookup
> capabilities. Let's look at the lookup join example from the Flink
> documentation:
>
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/joins/#lookup-join
>
>
> Consider a query with the join condition "ON o.customer_id = c.id AND
> c.country = 'US'":
>
> -- enrich each order with customer information
> SELECT o.order_id, o.total, c.country, c.zip
> FROM Orders AS o
>   JOIN Customers FOR SYSTEM_TIME AS OF o.proc_time AS c
>     ON o.customer_id = c.id AND c.country = 'US';
>
> For the dimension table in MySQL, all of the columns (id & country) can
> be used as index-lookup conditions:
>
> -- Customers is backed by the JDBC connector and can be used for lookup joins
> CREATE TEMPORARY TABLE Customers (
>   id INT,
>   name STRING,
>   country STRING,
>   zip STRING
> ) WITH (
>   'connector' = 'jdbc',
>   'url' = 'jdbc:mysql://mysqlhost:3306/customerdb',
>   'table-name' = 'customers'
> );
>
>
>
> While if the same table is stored in HBase (with no secondary index),
> only the 'id' column (the rowkey in HBase) can be the index-lookup
> condition:
>
> -- Customers is backed by an HBase connector and can be used for lookup joins
> CREATE TEMPORARY TABLE Customers (
>   id INT,
>   name STRING,
>   country STRING,
>   zip STRING,
>   PRIMARY KEY (id) NOT ENFORCED
> ) WITH (
>   'connector' = 'hbase-xxx',
>   ...
> );
>
>
> So, the 'lookup_miss' retry predicate may behave differently with
> different connectors.
>
> I hope this helps.
>
>
> Best,
> Lincoln Lee
>
>
> Rascal Wu wrote on Thu, Aug 11, 2022 at 10:32:
>
> > Hi here,
> >
> > Sorry for digging up this old email.
> >
> > May I ask one question: what is the final behavior we expect after
> > enabling the retry mechanism? I viewed FLIP-232 & FLIP-234, but I
> > didn't get a clear answer, as I lack some background knowledge.
> >
> > Let me give an example: assume we have a dim table (id, price); the
> > record (id1) existed in the dim table before 10:00, was then updated in
> > the business system at 10:01, but the update only reached the dim table
> > at 10:02. The record timeline is as follows:
> > 10:00 -> (id1, 10)
> > 10:01 -> the id1 record was updated, but its persistence/sync to the
> > dim table was delayed.
> > 10:02 -> (id1, 12)
> >
> > A Flink application processes an order record that joins the dim table
> > at 10:01, so it will output an event +[order1, id1, 10].
> > If we enable the retry mechanism, no matter whether it's sync or async,
> > does it mean two events will be sent to the downstream sink:
> > 1. retract event: -[order1, id1, 10]
> > 2. new event: +[order1, id1, 12]
> >
> > Is the above behavior what we expect to solve the delayed update of the
> > dimension table?
> >
> >
> > Lincoln Lee wrote on Tue, Jun 7, 2022 at 12:19:
> >
> > > Hi everyone,
> > >
> > > I started a vote for this FLIP [1], please vote there or ask additional
> > > questions here. [2]
> > >
> > > [1] https://lists.apache.org/thread/bb0kqjs8co3hhmtklmwptws4fc4rz810
> > > [2] 

Re: [DISCUSS] FLIP-234: Support Retryable Lookup Join To Solve Delayed Updates Issue In External Systems

2022-08-10 Thread Lincoln Lee
@Rascal  Thanks for looking at this new feature! There is a lot of content
in these two FLIPs, and we will prepare more detailed user documentation
before the 1.16 release.

First of all, unlike a regular join, a lookup join only triggers the lookup
action (accessing the dimension table) through the records of the stream
table; updates to the dimension table itself do not actively trigger
updated results downstream.

For your example, whether retry is enabled or not, only the first
"+[order1, id1, 10]" will be sent downstream (because only one order record
has come); the new update of the dimension table "10:02 -> (id1, 12)" will
not trigger an updated result.

For the retry support in FLIP-234, it only provides a 'lookup_miss' retry
predicate, which retries if the record cannot be found (not always
equivalent to the complete join condition [1]), and does not trigger a
retry if a non-null value can be found in the dimension table. If more
complex value-check logic is required, a viable way is to implement a
custom `AsyncRetryPredicate` in the DataStream API (as FLIP-232 introduced).
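As a plain illustration (ordinary Python, not the Flink API; the table
contents and attempt counts here are invented), the 'lookup_miss' predicate
means a retry fires only when the lookup finds nothing, never when it finds
a value that is merely stale:

```python
def lookup_with_retry(dim_table, key, max_attempts):
    """Retry the lookup only on a miss (None), mimicking the
    'lookup_miss' retry predicate: a found-but-stale value does
    not trigger another attempt."""
    for _ in range(max_attempts):
        value = dim_table.get(key)
        if value is not None:   # found something -> stop retrying,
            return value        # even if it is an outdated value
        # miss -> retry (Flink would wait per the retry strategy here)
    return None                 # still missing after all attempts

# id1 holds the old price 10; the delayed update to 12 has not arrived,
# so the join emits the old value and no retry is triggered.
dim_table = {"id1": 10}
print(lookup_with_retry(dim_table, "id1", 3))   # -> 10 (stale, no retry)
print(lookup_with_retry(dim_table, "id2", 3))   # -> None (miss after retries)
```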

[1]: For different connectors, the index-lookup capability may differ,
e.g., HBase can look up on the rowkey only (by default, without a secondary
index), while an RDBMS can provide more powerful index-lookup capabilities.
Let's look at the lookup join example from the Flink documentation:
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/joins/#lookup-join


Consider a query with the join condition "ON o.customer_id = c.id AND
c.country = 'US'":

-- enrich each order with customer information
SELECT o.order_id, o.total, c.country, c.zip
FROM Orders AS o
  JOIN Customers FOR SYSTEM_TIME AS OF o.proc_time AS c
    ON o.customer_id = c.id AND c.country = 'US';

For the dimension table in MySQL, all of the columns (id & country) can be
used as index-lookup conditions:

-- Customers is backed by the JDBC connector and can be used for lookup joins
CREATE TEMPORARY TABLE Customers (
  id INT,
  name STRING,
  country STRING,
  zip STRING
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:mysql://mysqlhost:3306/customerdb',
  'table-name' = 'customers'
);



While if the same table is stored in HBase (with no secondary index), only
the 'id' column (the rowkey in HBase) can be the index-lookup condition:

-- Customers is backed by an HBase connector and can be used for lookup joins
CREATE TEMPORARY TABLE Customers (
  id INT,
  name STRING,
  country STRING,
  zip STRING,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'hbase-xxx',
  ...
);


So, the 'lookup_miss' retry predicate may behave differently with different
connectors.
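For reference, here is a sketch of how such a retry could be enabled on the
earlier lookup join via the LOOKUP query hint (option names follow
FLIP-234; treat the exact values as illustrative and check the released
documentation):

```sql
SELECT /*+ LOOKUP('table'='Customers',
                  'retry-predicate'='lookup_miss',
                  'retry-strategy'='fixed_delay',
                  'fixed-delay'='10s',
                  'max-attempts'='3') */
  o.order_id, o.total, c.country, c.zip
FROM Orders AS o
  JOIN Customers FOR SYSTEM_TIME AS OF o.proc_time AS c
    ON o.customer_id = c.id;
```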

I hope this helps.


Best,
Lincoln Lee


Rascal Wu wrote on Thu, Aug 11, 2022 at 10:32:

> Hi here,
>
> Sorry for digging up this old email.
>
> May I ask one question: what is the final behavior we expect after
> enabling the retry mechanism? I viewed FLIP-232 & FLIP-234, but I didn't
> get a clear answer, as I lack some background knowledge.
>
> Let me give an example: assume we have a dim table (id, price); the
> record (id1) existed in the dim table before 10:00, was then updated in
> the business system at 10:01, but the update only reached the dim table
> at 10:02. The record timeline is as follows:
> 10:00 -> (id1, 10)
> 10:01 -> the id1 record was updated, but its persistence/sync to the dim
> table was delayed.
> 10:02 -> (id1, 12)
>
> A Flink application processes an order record that joins the dim table
> at 10:01, so it will output an event +[order1, id1, 10].
> If we enable the retry mechanism, no matter whether it's sync or async,
> does it mean two events will be sent to the downstream sink:
> 1. retract event: -[order1, id1, 10]
> 2. new event: +[order1, id1, 12]
>
> Is the above behavior what we expect to solve the delayed update of the
> dimension table?
>
>
> Lincoln Lee wrote on Tue, Jun 7, 2022 at 12:19:
>
> > Hi everyone,
> >
> > I started a vote for this FLIP [1], please vote there or ask additional
> > questions here. [2]
> >
> > [1] https://lists.apache.org/thread/bb0kqjs8co3hhmtklmwptws4fc4rz810
> > [2] https://lists.apache.org/thread/9k1sl2519kh2n3yttwqc00p07xdfns3h
> >
> >
> > Best,
> > Lincoln Lee
> >
> >
> > Lincoln Lee wrote on Thu, Jun 2, 2022 at 15:51:
> >
> > > Hi everyone,
> > >
> > > When reviewing the name of the hint option 'miss-retry'='true|false',
> > > I feel the current name is not precise enough; it might be easier to
> > > understand by using the retry-predicate directly from FLIP-232,
> > > e.g. 'retry-predicate'='lookup-miss', which has the additional benefit
> > > of extensibility (maybe more retry conditions in the future).
> > >
> > > Jark & Jingsong, do you have any suggestions? If we agree on the name
> > > 'retry-predicate' or a better candidate, I'll update the FLIP.
> > >
> > > Best,
> > > Lincoln Lee
> > >
> > >
> > > Lincoln Lee wrote on Thu, Jun 2, 2022 at 11:23:
> > >
> > >> Hi everyone,
> > >>
> > >> I've updated the FLIP[1] based on this discussion thread that we agree
> > to
> > >> have a single unified 'LOOKUP' hint and also a 

[jira] [Created] (FLINK-28915) Flink Native k8s mode jar location support s3 scheme

2022-08-10 Thread hjw (Jira)
hjw created FLINK-28915:
---

 Summary: Flink Native k8s mode jar location support s3 scheme 
 Key: FLINK-28915
 URL: https://issues.apache.org/jira/browse/FLINK-28915
 Project: Flink
  Issue Type: Improvement
  Components: Deployment / Kubernetes, flink-contrib
Affects Versions: 1.15.1, 1.15.0
Reporter: hjw


As the Flink documentation shows, "local" is the only supported scheme in Native 
k8s deployment.
Is there a plan to support the s3 filesystem? Thanks.




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-28914) Could not find any factories that implement

2022-08-10 Thread Dongming WU (Jira)
Dongming WU created FLINK-28914:
---

 Summary: Could not find any factories that implement
 Key: FLINK-28914
 URL: https://issues.apache.org/jira/browse/FLINK-28914
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Gateway
Affects Versions: 1.16.0
Reporter: Dongming WU
 Fix For: 1.16.0


2022-08-11 11:09:53,135 ERROR org.apache.flink.table.gateway.SqlGateway         
           [] - Failed to start the endpoints.

org.apache.flink.table.api.ValidationException: Could not find any factories 
that implement 
'org.apache.flink.table.gateway.api.endpoint.SqlGatewayEndpointFactory' in the 
classpath.

-

I packaged Flink master and tried to start the sql-gateway, but some problems 
arose.

I found two problems with the Factory file under the resources of the 
flink-sql-gateway module.

META-INF.services should not be a folder name; it should be ... 
/META-INF/services/... 

The `org.apache.flink.table.gateway.rest.SqlGatewayRestEndpointFactory` entry 
in the org.apache.flink.table.factories.Factory file should be 
`org.apache.flink.table.gateway.rest.util.SqlGatewayRestEndpointFactory`.
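For reference, the ServiceLoader layout the report describes can be sketched
as follows (a generic illustration of the `META-INF/services` convention;
the file and class names follow the report, while the surrounding module
contents are an assumption):

```python
# ServiceLoader convention: declarations live under META-INF/services/
# (a real directory path, not a dotted folder name), one file per service
# interface, each line naming an implementation class.
import pathlib
import tempfile

root = pathlib.Path(tempfile.mkdtemp())
services = root / "META-INF" / "services"
services.mkdir(parents=True)

# File named after the service interface; its content lists implementations.
factory_file = services / "org.apache.flink.table.factories.Factory"
factory_file.write_text(
    "org.apache.flink.table.gateway.rest.util.SqlGatewayRestEndpointFactory\n"
)

print(sorted(p.name for p in services.iterdir()))
```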





Re: [DISCUSS] FLIP-234: Support Retryable Lookup Join To Solve Delayed Updates Issue In External Systems

2022-08-10 Thread Rascal Wu
Hi here,

Sorry for digging up this old email.

May I ask one question: what is the final behavior we expect after enabling
the retry mechanism? I viewed FLIP-232 & FLIP-234, but I didn't get a clear
answer, as I lack some background knowledge.

Let me give an example: assume we have a dim table (id, price); the record
(id1) existed in the dim table before 10:00, was then updated in the
business system at 10:01, but the update only reached the dim table at
10:02. The record timeline is as follows:
10:00 -> (id1, 10)
10:01 -> the id1 record was updated, but its persistence/sync to the dim
table was delayed.
10:02 -> (id1, 12)

A Flink application processes an order record that joins the dim table at
10:01, so it will output an event +[order1, id1, 10].
If we enable the retry mechanism, no matter whether it's sync or async,
does it mean two events will be sent to the downstream sink:
1. retract event: -[order1, id1, 10]
2. new event: +[order1, id1, 12]

Is the above behavior what we expect to solve the delayed update of the
dimension table?


Lincoln Lee wrote on Tue, Jun 7, 2022 at 12:19:

> Hi everyone,
>
> I started a vote for this FLIP [1], please vote there or ask additional
> questions here. [2]
>
> [1] https://lists.apache.org/thread/bb0kqjs8co3hhmtklmwptws4fc4rz810
> [2] https://lists.apache.org/thread/9k1sl2519kh2n3yttwqc00p07xdfns3h
>
>
> Best,
> Lincoln Lee
>
>
> Lincoln Lee wrote on Thu, Jun 2, 2022 at 15:51:
>
> > Hi everyone,
> >
> > When reviewing the name of the hint option 'miss-retry'='true|false', I
> > feel the current name is not precise enough; it might be easier to
> > understand by using the retry-predicate directly from FLIP-232,
> > e.g. 'retry-predicate'='lookup-miss', which has the additional benefit
> > of extensibility (maybe more retry conditions in the future).
> >
> > Jark & Jingsong, do you have any suggestions? If we agree on the name
> > 'retry-predicate' or a better candidate, I'll update the FLIP.
> >
> > Best,
> > Lincoln Lee
> >
> >
> > Lincoln Lee wrote on Thu, Jun 2, 2022 at 11:23:
> >
> >> Hi everyone,
> >>
> >> I've updated the FLIP [1] based on this discussion thread, in which we
> >> agreed to have a single unified 'LOOKUP' hint, and also a related part
> >> in FLIP-221 [2], which is mainly about the necessity of the common
> >> table option 'lookup.async'.
> >>
> >> The main updates are:
> >> 1. the new unified 'LOOKUP' hint, making retry supported on both sync
> >> and async lookup
> >> 2. clarifying the default choice of the planner for those connectors
> >> which have both sync and async lookup capabilities, and how to deal
> >> with the query hint
> >> 3. a follow-up issue will be added to discuss whether to remove the
> >> 'lookup.async' option in the HBase connector.
> >>
> >> [1]
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-234%3A+Support+Retryable+Lookup+Join+To+Solve+Delayed+Updates+Issue+In+External+Systems
> >> [2] https://lists.apache.org/thread/1vokqdnnt01yycl7y1p74g556cc8yvtq
> >>
> >> Best,
> >> Lincoln Lee
> >>
> >>
> >> Lincoln Lee wrote on Wed, Jun 1, 2022 at 16:03:
> >>
> >>> Hi Jingsong,
> >>>
> >>> There will be no change for connectors with only one capability
> >>> (sync or async).
> >>>
> >>> Query hints work in a best-effort manner, so if a user specifies a
> >>> hint with an invalid option, the query plan stays unchanged; e.g.,
> >>> with LOOKUP('table'='customer', 'async'='true'), if the backing lookup
> >>> source only implements the sync lookup function, then the async lookup
> >>> hint takes no effect.
> >>>
> >>> For those connectors which can have both async and sync lookup
> >>> capabilities, our advice for connector developers is to implement both
> >>> the sync and async interfaces if both capabilities have suitable use
> >>> cases; the planner can then decide which capability is preferable
> >>> based on the cost model or maybe another mechanism (another use case
> >>> is exactly what we're discussing here: users can give the query hint).
> >>> Otherwise, choose one interface to implement.
> >>>
> >>> Also, this should be clarified for the lookup function related APIs.
> >>>
> >>> Best,
> >>> Lincoln Lee
> >>>
> >>>
> >>> Jingsong Li wrote on Wed, Jun 1, 2022 at 15:18:
> >>>
>  Hi Lincoln,
> 
>  > It's better to make decisions at the query level when a connector has
>  > both capabilities.
> 
>  Can you clarify the mechanism?
>  - only sync connector: What connector developers should do
>  - only async connector: What connector developers should do
>  - both async and sync connector: What connector developers should do
> 
>  Best,
>  Jingsong
> 
>  On Wed, Jun 1, 2022 at 2:29 PM Lincoln Lee 
>  wrote:
> 
>  > Hi Jingsong,
>  >
>  > Thanks for your feedback!
>  >
>  > Yes, the existing HBase connector uses an option 'lookup.async' to
>  > control which lookup source implementation is exposed to the planner;
>  > however, it's a private option for the HBase connector only, so it 

[jira] [Created] (FLINK-28913) Fix fail to open HiveCatalog when it's for hive3

2022-08-10 Thread luoyuxia (Jira)
luoyuxia created FLINK-28913:


 Summary: Fix fail to open HiveCatalog when it's for hive3
 Key: FLINK-28913
 URL: https://issues.apache.org/jira/browse/FLINK-28913
 Project: Flink
  Issue Type: Bug
Reporter: luoyuxia








[jira] [Created] (FLINK-28912) Add Part of "Who Uses Flink" In ReadMe file and https://flink.apache.org/

2022-08-10 Thread Zhou Yao (Jira)
Zhou Yao created FLINK-28912:


 Summary: Add Part of "Who Uses Flink" In ReadMe file and 
https://flink.apache.org/
 Key: FLINK-28912
 URL: https://issues.apache.org/jira/browse/FLINK-28912
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Reporter: Zhou Yao
 Attachments: image-2022-08-10-20-15-10-418.png

Maybe we can learn from the website of [Apache Kylin|https://kylin.apache.org/] 
and add a "Who Uses Flink" section in the README or on the website. This can 
make Flink more friendly.

!image-2022-08-10-20-15-10-418.png|width=147,height=99!





[jira] [Created] (FLINK-28911) Elasticsearch connector fails build

2022-08-10 Thread Niels Basjes (Jira)
Niels Basjes created FLINK-28911:


 Summary: Elasticsearch connector fails build
 Key: FLINK-28911
 URL: https://issues.apache.org/jira/browse/FLINK-28911
 Project: Flink
  Issue Type: Bug
  Components: Connectors / ElasticSearch
Affects Versions: 1.15.1
Reporter: Niels Basjes
Assignee: Niels Basjes


When I run `mvn clean verify` on the ES connector, some of the integration 
tests fail.

Assessment so far: the SerializationSchema is not opened, triggering an NPE 
later on.





[jira] [Created] (FLINK-28910) CDC From Mysql To Hbase Bugs

2022-08-10 Thread TE (Jira)
TE created FLINK-28910:
--

 Summary: CDC From Mysql To Hbase Bugs
 Key: FLINK-28910
 URL: https://issues.apache.org/jira/browse/FLINK-28910
 Project: Flink
  Issue Type: Bug
  Components: Connectors / HBase
Reporter: TE


I use Flink for CDC from MySQL to HBase.

The problem I encountered is that a MySQL record is updated (not deleted), 
but the record in HBase is sometimes deleted.

I tried to analyze the problem and found the reason as follows:

The update action of MySQL is decomposed into delete + insert by Flink.
The HBase connector uses a mutator to handle this set of actions.

However, if the order of these actions is not actively set, the processing of 
the mutator will not guarantee the order of execution.

Therefore, when a MySQL update is triggered, it is possible that HBase 
actually performed the actions in the order of put + delete, resulting in the 
data being deleted.
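The failure mode can be sketched with a tiny key-value simulation (plain
Python, not the HBase client API; the row names and values are made up):
an update decomposed into delete + insert loses the row when the two
mutations are applied in the opposite order.

```python
def apply_mutations(store, mutations):
    """Apply (op, key, value) triples in the given order, simulating a
    buffered mutator that offers no ordering guarantee between them."""
    for op, key, value in mutations:
        if op == "delete":
            store.pop(key, None)
        else:  # "put"
            store[key] = value

# A MySQL UPDATE is decomposed by Flink into delete + insert.
update = [("delete", "row1", None), ("put", "row1", "new")]

intended = {"row1": "old"}
apply_mutations(intended, update)                  # delete, then put
print(intended)                                    # -> {'row1': 'new'}

reordered = {"row1": "old"}
apply_mutations(reordered, list(reversed(update))) # put, then delete
print(reordered)                                   # -> {} (row lost)
```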





[jira] [Created] (FLINK-28909) Add ribbon filter policy option in RocksDBConfiguredOptions

2022-08-10 Thread zlzhang0122 (Jira)
zlzhang0122 created FLINK-28909:
---

 Summary: Add ribbon filter policy option in 
RocksDBConfiguredOptions
 Key: FLINK-28909
 URL: https://issues.apache.org/jira/browse/FLINK-28909
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / State Backends
Affects Versions: 1.15.1, 1.14.2
Reporter: zlzhang0122
 Fix For: 1.16.0, 1.15.2


The ribbon filter can efficiently improve reads and reduce disk and memory 
usage in RocksDB; it has been supported by RocksDB since 6.15 (for more 
details, see http://rocksdb.org/blog/2021/12/29/ribbon-filter.html).





[jira] [Created] (FLINK-28908) Coder for LIST type is incorrectly chosen in PyFlink

2022-08-10 Thread Juntao Hu (Jira)
Juntao Hu created FLINK-28908:
-

 Summary: Coder for LIST type is incorrectly chosen in PyFlink
 Key: FLINK-28908
 URL: https://issues.apache.org/jira/browse/FLINK-28908
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.15.1, 1.14.5
Reporter: Juntao Hu
 Fix For: 1.16.0, 1.15.2, 1.14.6


Code to reproduce this bug (the printed result is `[None, None, None]`):
{code:python}
jvm = get_gateway().jvm
env = StreamExecutionEnvironment.get_execution_environment()
j_item = jvm.java.util.ArrayList()
j_item.add(1)
j_item.add(2)
j_item.add(3)
j_list = jvm.java.util.ArrayList()
j_list.add(j_item)
type_info = Types.LIST(Types.INT())
ds = DataStream(env._j_stream_execution_environment.fromCollection(
    j_list, type_info.get_java_type_info()))
ds.map(lambda e: print(e))
env.execute() {code}





[jira] [Created] (FLINK-28907) Flink docs does not compile

2022-08-10 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-28907:
---

 Summary: Flink docs does not compile
 Key: FLINK-28907
 URL: https://issues.apache.org/jira/browse/FLINK-28907
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.16.0
Reporter: Zhu Zhu
 Fix For: 1.16.0


Flink docs suddenly fail to compile in my local environment, with no new 
change or rebase. The error is as below:

go: github.com/apache/flink-connector-elasticsearch/docs upgrade => 
v0.0.0-20220715033920-cbeb08187b3a
hugo: collected modules in 1832 ms
Start building sites …
ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
"docs/connectors/table/elasticsearch": 
"/XXX/docs/content/docs/connectors/table/formats/overview.md:54:20": page not 
found
ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
"docs/connectors/datastream/elasticsearch": 
"/XXX/docs/content/docs/connectors/datastream/overview.md:44:20": page not found
ERROR 2022/08/10 17:48:29 [en] REF_NOT_FOUND: Ref 
"docs/connectors/table/elasticsearch": 
"/XXX/docs/content/docs/connectors/table/overview.md:58:20": page not found
WARN 2022/08/10 17:48:29 Expand shortcode is deprecated. Use 'details' instead.
ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
"docs/connectors/table/elasticsearch": 
"/XXX/docs/content.zh/docs/connectors/table/formats/overview.md:54:20": page 
not found
ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
"docs/connectors/datastream/elasticsearch": 
"/XXX/docs/content.zh/docs/connectors/datastream/overview.md:43:20": page not 
found
ERROR 2022/08/10 17:48:32 [zh] REF_NOT_FOUND: Ref 
"docs/connectors/table/elasticsearch": 
"/XXX/docs/content.zh/docs/connectors/table/overview.md:58:20": page not found
WARN 2022/08/10 17:48:32 Expand shortcode is deprecated. Use 'details' instead.
Built in 6415 ms
Error: Error building site: logged 6 error(s)





[jira] [Created] (FLINK-28906) Add AlgoOperator for AgglomerativeClustering

2022-08-10 Thread Zhipeng Zhang (Jira)
Zhipeng Zhang created FLINK-28906:
-

 Summary: Add AlgoOperator for AgglomerativeClustering
 Key: FLINK-28906
 URL: https://issues.apache.org/jira/browse/FLINK-28906
 Project: Flink
  Issue Type: New Feature
  Components: Library / Machine Learning
Reporter: Zhipeng Zhang
 Fix For: ml-2.2.0








[jira] [Created] (FLINK-28905) Fix HiveCatalog use wrong Flink type when get statistic for Hive timestamp partition column

2022-08-10 Thread luoyuxia (Jira)
luoyuxia created FLINK-28905:


 Summary: Fix HiveCatalog use wrong Flink type when get statistic 
for Hive timestamp partition column
 Key: FLINK-28905
 URL: https://issues.apache.org/jira/browse/FLINK-28905
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Hive
Reporter: luoyuxia


Hive's timestamp type will be mapped to Flink's `TIMESTAMP_WITHOUT_TIME_ZONE`, 
but in the method `HiveStatsUtil#getPartitionColumnStats`, 
`TIMESTAMP_WITH_LOCAL_TIME_ZONE` is mistakenly used instead of 
`TIMESTAMP_WITHOUT_TIME_ZONE`, which causes a failure to get the partition 
column's statistics when the partition column is of timestamp type.

So, we should change `TIMESTAMP_WITH_LOCAL_TIME_ZONE` to 
`TIMESTAMP_WITHOUT_TIME_ZONE`.





[jira] [Created] (FLINK-28904) Add missing connector/format doc for PyFlink

2022-08-10 Thread Juntao Hu (Jira)
Juntao Hu created FLINK-28904:
-

 Summary: Add missing connector/format doc for PyFlink
 Key: FLINK-28904
 URL: https://issues.apache.org/jira/browse/FLINK-28904
 Project: Flink
  Issue Type: Improvement
  Components: API / Python
Affects Versions: 1.15.1
Reporter: Juntao Hu
 Fix For: 1.16.0








[jira] [Created] (FLINK-28903) flink-table-store-hive-catalog could not shade hive-shims-0.23

2022-08-10 Thread Nicholas Jiang (Jira)
Nicholas Jiang created FLINK-28903:
--

 Summary: flink-table-store-hive-catalog could not shade 
hive-shims-0.23
 Key: FLINK-28903
 URL: https://issues.apache.org/jira/browse/FLINK-28903
 Project: Flink
  Issue Type: Bug
  Components: Table Store
Affects Versions: table-store-0.3.0
Reporter: Nicholas Jiang
 Fix For: table-store-0.3.0


flink-table-store-hive-catalog could not shade hive-shims-0.23 because the 
artifactSet doesn't include hive-shims-0.23 and minimizeJar is set to true. 
The exception is as follows:
{code:java}
Caused by: java.lang.RuntimeException: Unable to instantiate 
org.apache.flink.table.store.shaded.org.apache.hadoop.hive.metastore.HiveMetaStoreClient
    at 
org.apache.flink.table.store.shaded.org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1708)
 ~[flink-table-store-dist-0.3-SNAPSHOT.jar:0.3-SNAPSHOT]
    at 
org.apache.flink.table.store.shaded.org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:83)
 ~[flink-table-store-dist-0.3-SNAPSHOT.jar:0.3-SNAPSHOT]
    at 
org.apache.flink.table.store.shaded.org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:133)
 ~[flink-table-store-dist-0.3-SNAPSHOT.jar:0.3-SNAPSHOT]
    at 
org.apache.flink.table.store.shaded.org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
 ~[flink-table-store-dist-0.3-SNAPSHOT.jar:0.3-SNAPSHOT]
    at 
org.apache.flink.table.store.shaded.org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:97)
 ~[flink-table-store-dist-0.3-SNAPSHOT.jar:0.3-SNAPSHOT]
    at 
org.apache.flink.table.store.hive.HiveCatalog.createClient(HiveCatalog.java:380)
 ~[flink-table-store-dist-0.3-SNAPSHOT.jar:0.3-SNAPSHOT]
    at 
org.apache.flink.table.store.hive.HiveCatalog.(HiveCatalog.java:80) 
~[flink-table-store-dist-0.3-SNAPSHOT.jar:0.3-SNAPSHOT]
    at 
org.apache.flink.table.store.hive.HiveCatalogFactory.create(HiveCatalogFactory.java:51)
 ~[flink-table-store-dist-0.3-SNAPSHOT.jar:0.3-SNAPSHOT]
    at 
org.apache.flink.table.store.file.catalog.CatalogFactory.createCatalog(CatalogFactory.java:93)
 ~[flink-table-store-dist-0.3-SNAPSHOT.jar:0.3-SNAPSHOT]
    at 
org.apache.flink.table.store.connector.FlinkCatalogFactory.createCatalog(FlinkCatalogFactory.java:62)
 ~[flink-table-store-dist-0.3-SNAPSHOT.jar:0.3-SNAPSHOT]
    at 
org.apache.flink.table.store.connector.FlinkCatalogFactory.createCatalog(FlinkCatalogFactory.java:57)
 ~[flink-table-store-dist-0.3-SNAPSHOT.jar:0.3-SNAPSHOT]
    at 
org.apache.flink.table.store.connector.FlinkCatalogFactory.createCatalog(FlinkCatalogFactory.java:31)
 ~[flink-table-store-dist-0.3-SNAPSHOT.jar:0.3-SNAPSHOT]
    at 
org.apache.flink.table.factories.FactoryUtil.createCatalog(FactoryUtil.java:428)
 ~[flink-table-api-java-uber-1.15.1.jar:1.15.1]
    at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.createCatalog(TableEnvironmentImpl.java:1356)
 ~[flink-table-api-java-uber-1.15.1.jar:1.15.1]
    at 
org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:)
 ~[flink-table-api-java-uber-1.15.1.jar:1.15.1]
    at 
org.apache.flink.table.client.gateway.local.LocalExecutor.lambda$executeOperation$3(LocalExecutor.java:209)
 ~[flink-sql-client-1.15.1.jar:1.15.1]
    at 
org.apache.flink.table.client.gateway.context.ExecutionContext.wrapClassLoader(ExecutionContext.java:88)
 ~[flink-sql-client-1.15.1.jar:1.15.1]
    at 
org.apache.flink.table.client.gateway.local.LocalExecutor.executeOperation(LocalExecutor.java:209)
 ~[flink-sql-client-1.15.1.jar:1.15.1]
    ... 10 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) 
~[?:1.8.0_181]
    at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
 ~[?:1.8.0_181]
    at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 ~[?:1.8.0_181]
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423) 
~[?:1.8.0_181]
    at 
org.apache.flink.table.store.shaded.org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1706)
 ~[flink-table-store-dist-0.3-SNAPSHOT.jar:0.3-SNAPSHOT]
    at 
org.apache.flink.table.store.shaded.org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:83)
 ~[flink-table-store-dist-0.3-SNAPSHOT.jar:0.3-SNAPSHOT]
    at 
org.apache.flink.table.store.shaded.org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:133)
 ~[flink-table-store-dist-0.3-SNAPSHOT.jar:0.3-SNAPSHOT]
    at 

[jira] [Created] (FLINK-28902) FileSystemJobResultStoreTestInternal is a unit test that's not following the unit test naming conventions

2022-08-10 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-28902:
-

 Summary: FileSystemJobResultStoreTestInternal is a unit test 
that's not following the unit test naming conventions
 Key: FLINK-28902
 URL: https://issues.apache.org/jira/browse/FLINK-28902
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination, Test Infrastructure
Affects Versions: 1.15.1, 1.16.0
Reporter: Matthias Pohl


AzureCI still picks the test up as part of {{mvn verify}} (see an [example 
build|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39780=logs=0da23115-68bb-5dcd-192c-bd4c8adebde1=24c3384f-1bcb-57b3-224f-51bf973bbee8=8539]).

Anyway, we should drop the {{Internal}} suffix from the end and move it 
somewhere inside the test name.





[jira] [Created] (FLINK-28901) CassandraSinkBaseTest.testTimeoutExceptionOnInvoke failed with TestTimedOutException

2022-08-10 Thread Xingbo Huang (Jira)
Xingbo Huang created FLINK-28901:


 Summary: CassandraSinkBaseTest.testTimeoutExceptionOnInvoke failed 
with TestTimedOutException
 Key: FLINK-28901
 URL: https://issues.apache.org/jira/browse/FLINK-28901
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Cassandra
Affects Versions: 1.16.0
Reporter: Xingbo Huang
 Fix For: 1.16.0



{code:java}
2022-08-10T03:39:22.6587394Z Aug 10 03:39:22 [ERROR] 
org.apache.flink.streaming.connectors.cassandra.CassandraSinkBaseTest.testTimeoutExceptionOnInvoke
  Time elapsed: 5.113 s  <<< ERROR!
2022-08-10T03:39:22.6588579Z Aug 10 03:39:22 
org.junit.runners.model.TestTimedOutException: test timed out after 5000 
milliseconds
2022-08-10T03:39:22.6589463Z Aug 10 03:39:22at 
java.util.zip.ZipFile.read(Native Method)
2022-08-10T03:39:22.6590286Z Aug 10 03:39:22at 
java.util.zip.ZipFile.access$1400(ZipFile.java:60)
2022-08-10T03:39:22.6591287Z Aug 10 03:39:22at 
java.util.zip.ZipFile$ZipFileInputStream.read(ZipFile.java:734)
2022-08-10T03:39:22.6592323Z Aug 10 03:39:22at 
java.util.zip.ZipFile$ZipFileInflaterInputStream.fill(ZipFile.java:434)
2022-08-10T03:39:22.6593673Z Aug 10 03:39:22at 
java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
2022-08-10T03:39:22.6594638Z Aug 10 03:39:22at 
sun.misc.Resource.getBytes(Resource.java:124)
2022-08-10T03:39:22.6595535Z Aug 10 03:39:22at 
java.net.URLClassLoader.defineClass(URLClassLoader.java:463)
2022-08-10T03:39:22.6596506Z Aug 10 03:39:22at 
java.net.URLClassLoader.access$100(URLClassLoader.java:74)
2022-08-10T03:39:22.6597477Z Aug 10 03:39:22at 
java.net.URLClassLoader$1.run(URLClassLoader.java:369)
2022-08-10T03:39:22.6598393Z Aug 10 03:39:22at 
java.net.URLClassLoader$1.run(URLClassLoader.java:363)
2022-08-10T03:39:22.6599286Z Aug 10 03:39:22at 
java.security.AccessController.doPrivileged(Native Method)
2022-08-10T03:39:22.6600209Z Aug 10 03:39:22at 
java.net.URLClassLoader.findClass(URLClassLoader.java:362)
2022-08-10T03:39:22.6601141Z Aug 10 03:39:22at 
java.lang.ClassLoader.loadClass(ClassLoader.java:418)
2022-08-10T03:39:22.6602070Z Aug 10 03:39:22at 
sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
2022-08-10T03:39:22.6603090Z Aug 10 03:39:22at 
java.lang.ClassLoader.loadClass(ClassLoader.java:351)
2022-08-10T03:39:22.6604199Z Aug 10 03:39:22at 
net.bytebuddy.NamingStrategy$SuffixingRandom.<init>(NamingStrategy.java:153)
2022-08-10T03:39:22.6605188Z Aug 10 03:39:22at 
net.bytebuddy.ByteBuddy.<init>(ByteBuddy.java:196)
2022-08-10T03:39:22.6606063Z Aug 10 03:39:22at 
net.bytebuddy.ByteBuddy.<init>(ByteBuddy.java:187)
2022-08-10T03:39:22.6607141Z Aug 10 03:39:22at 
org.mockito.internal.creation.bytebuddy.InlineBytecodeGenerator.<init>(InlineBytecodeGenerator.java:81)
2022-08-10T03:39:22.6608423Z Aug 10 03:39:22at 
org.mockito.internal.creation.bytebuddy.InlineByteBuddyMockMaker.<init>(InlineByteBuddyMockMaker.java:224)
2022-08-10T03:39:22.6609844Z Aug 10 03:39:22at 
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
2022-08-10T03:39:22.6610990Z Aug 10 03:39:22at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
2022-08-10T03:39:22.6612246Z Aug 10 03:39:22at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
2022-08-10T03:39:22.6613551Z Aug 10 03:39:22at 
java.lang.reflect.Constructor.newInstance(Constructor.java:423)
2022-08-10T03:39:22.6614482Z Aug 10 03:39:22at 
java.lang.Class.newInstance(Class.java:442)
2022-08-10T03:39:22.6615527Z Aug 10 03:39:22at 
org.mockito.internal.configuration.plugins.PluginInitializer.loadImpl(PluginInitializer.java:50)
2022-08-10T03:39:22.6616718Z Aug 10 03:39:22at 
org.mockito.internal.configuration.plugins.PluginLoader.loadPlugin(PluginLoader.java:63)
2022-08-10T03:39:22.6617900Z Aug 10 03:39:22at 
org.mockito.internal.configuration.plugins.PluginLoader.loadPlugin(PluginLoader.java:48)
2022-08-10T03:39:22.6619079Z Aug 10 03:39:22at 
org.mockito.internal.configuration.plugins.PluginRegistry.<init>(PluginRegistry.java:23)
2022-08-10T03:39:22.6620206Z Aug 10 03:39:22at 
org.mockito.internal.configuration.plugins.Plugins.<clinit>(Plugins.java:19)
2022-08-10T03:39:22.6621435Z Aug 10 03:39:22at 
org.mockito.internal.util.MockUtil.<clinit>(MockUtil.java:24)
2022-08-10T03:39:22.6622523Z Aug 10 03:39:22at 
org.mockito.internal.util.MockCreationValidator.validateType(MockCreationValidator.java:22)
2022-08-10T03:39:22.6623890Z Aug 10 03:39:22at 
org.mockito.internal.creation.MockSettingsImpl.validatedSettings(MockSettingsImpl.java:250)
2022-08-10T03:39:22.6625051Z Aug 10 03:39:22at 
org.mockito.internal.creation.MockSettingsImpl.build(MockSettingsImpl.java:232)
2022-08-10T03:39:22.6626085Z Aug 10 03:39:22at 

[jira] [Created] (FLINK-28900) RecreateOnResetOperatorCoordinatorTest compile failed in jdk11

2022-08-10 Thread Xingbo Huang (Jira)
Xingbo Huang created FLINK-28900:


 Summary: RecreateOnResetOperatorCoordinatorTest compile failed in 
jdk11
 Key: FLINK-28900
 URL: https://issues.apache.org/jira/browse/FLINK-28900
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.16.0
Reporter: Xingbo Huang
 Fix For: 1.16.0



{code:java}
2022-08-10T00:19:25.3221073Z [ERROR] COMPILATION ERROR : 
2022-08-10T00:19:25.3221634Z [INFO] 
-
2022-08-10T00:19:25.3222878Z [ERROR] 
/__w/1/s/flink-runtime/src/test/java/org/apache/flink/runtime/operators/coordination/RecreateOnResetOperatorCoordinatorTest.java:[241,58]
 method containsExactly in class 
org.assertj.core.api.AbstractIterableAssert 
cannot be applied to given types;
2022-08-10T00:19:25.3223786Z   required: capture#1 of ? extends 
java.lang.Integer[]
2022-08-10T00:19:25.3224245Z   found: int
2022-08-10T00:19:25.3224684Z   reason: varargs mismatch; int cannot be 
converted to capture#1 of ? extends java.lang.Integer
2022-08-10T00:19:25.3225128Z [INFO] 1 error
{code}
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39795&view=logs&j=946871de-358d-5815-3994-8175615bc253&t=e0240c62-4570-5d1c-51af-dd63d2093da1
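
The varargs mismatch above arises because AssertJ's containsExactly fixes its varargs element type from the asserted collection's type; when that type is a captured wildcard, an int literal cannot convert to the capture. A minimal self-contained sketch of the same failure and the usual fix (the IterAssert/assertThat names below are simplified stand-ins for AssertJ's API, not the real classes):

```java
import java.util.List;

public class VarargsCaptureDemo {
    // Simplified stand-in for AssertJ's AbstractIterableAssert: the varargs
    // element type E is fixed by the type of the asserted collection.
    static class IterAssert<E> {
        final List<E> actual;
        IterAssert(List<E> actual) { this.actual = actual; }
        @SafeVarargs
        final boolean containsExactly(E... expected) {
            return actual.equals(List.of(expected));
        }
    }

    static <E> IterAssert<E> assertThat(List<E> actual) {
        return new IterAssert<>(actual);
    }

    public static void main(String[] args) {
        List<? extends Integer> xs = List.of(1, 2, 3);
        // Does not compile (the FLINK-28900 error): E is "capture of
        // ? extends Integer", and an int cannot convert to that capture type.
        // assertThat(xs).containsExactly(1, 2, 3);

        // Fix: hand the compiler a concretely typed list first.
        List<Integer> ys = List.copyOf(xs);
        System.out.println(assertThat(ys).containsExactly(1, 2, 3)); // prints "true"
    }
}
```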





[jira] [Created] (FLINK-28899) Fix LOOKUP hint with retry option on async lookup mode

2022-08-10 Thread lincoln lee (Jira)
lincoln lee created FLINK-28899:
---

 Summary: Fix LOOKUP hint with retry option on async lookup mode
 Key: FLINK-28899
 URL: https://issues.apache.org/jira/browse/FLINK-28899
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Reporter: lincoln lee


Fix the broken path for async lookup with retry options in  LOOKUP hint.

A `RetryableAsyncLookupFunctionDelegator` will be used instead of 
`AsyncWaitOperator`.
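
The delegation idea is to wrap the user's async lookup function and re-issue the lookup when it comes back empty. A rough sketch of that retry-on-miss pattern with plain `CompletableFuture`s (all names and types here are simplified stand-ins for illustration, not Flink's actual API):

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Illustrative retry-on-miss delegator around an async lookup; not Flink's API.
public class RetryOnMissDelegator<K, V> {
    private final Function<K, CompletableFuture<Optional<V>>> delegate;
    private final int maxAttempts;

    public RetryOnMissDelegator(
            Function<K, CompletableFuture<Optional<V>>> delegate, int maxAttempts) {
        this.delegate = delegate;
        this.maxAttempts = maxAttempts;
    }

    public CompletableFuture<Optional<V>> lookup(K key) {
        return attempt(key, 1);
    }

    private CompletableFuture<Optional<V>> attempt(K key, int attemptNo) {
        return delegate.apply(key).thenCompose(result -> {
            // The 'lookup_miss' predicate: retry only while nothing was found
            // and the attempt budget is not exhausted.
            if (result.isPresent() || attemptNo >= maxAttempts) {
                return CompletableFuture.completedFuture(result);
            }
            return attempt(key, attemptNo + 1);
        });
    }

    public static void main(String[] args) {
        // A lookup that misses on the first call and hits on the second.
        int[] calls = {0};
        RetryOnMissDelegator<String, Integer> d = new RetryOnMissDelegator<>(
                key -> CompletableFuture.completedFuture(
                        ++calls[0] < 2 ? Optional.empty() : Optional.of(42)),
                3);
        System.out.println(d.lookup("id1").join()); // prints "Optional[42]"
    }
}
```

A real delegator would also need a retry backoff and a total-timeout guard so retries cannot stall the async operator's in-flight queue.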





[ANNOUNCE] Flink 1.16 Feature Freeze

2022-08-10 Thread godfrey he
Hi everyone,

The deadline for merging new features for Flink 1.16 has passed.

* From now on, only bug-fixes and documentation fixes / improvements are
allowed to be merged into the master branch.

* New features merged after this point can be reverted. If you need an
exception to this rule, please open a discussion on dev@ list and reach out
to us.

We plan to wait for the master branch to get a bit more stabilized before
cutting the "release-1.16" branch, in order to reduce the overhead of
having to manage two branches. That also means potentially delaying merging
new features for Flink 1.17 into the master branch. If you are blocked on
this, please let us know and we can come up with a compromise for the
branch cutting time.

What you can do to help with the release testing phase:

* The first release testing sync will be on *the 16th of August at 9am
CEST / 3pm China Standard Time*.
Everyone is welcome to join. The link can be found on the release wiki page
[1].

* Please prepare for the release testing by creating Jira tickets for
documentation and testing tasks for the new features under the
umbrella issue[2].
Tickets should be opened with Priority Blocker, FixVersion 1.16.0 and Label
release-testing (testing tasks only). It is greatly appreciated if
you can help to verify the new features.

* There are currently 55 test-stability issues affecting the 1.16.0 release
[3]. It is also greatly appreciated if you can help address some of them.

Best,
Martijn, Chesnay, Xingbo & Godfrey

[1] https://cwiki.apache.org/confluence/display/FLINK/1.16+Release
[2] https://issues.apache.org/jira/browse/FLINK-28896
[3] https://issues.apache.org/jira/issues/?filter=12352149