Re: [ANNOUNCE] New Apache Flink Committer - Zhongqiang Gong

2024-06-17 Thread Jacky Lau
Congratulations Zhongqiang!

Best Regards
Jacky Lau

Yubin Li wrote on Tue, Jun 18, 2024 at 00:28:

> Congratulations Zhongqiang!
>
> Best Regards
> Yubin
>
> On Tue, Jun 18, 2024 at 12:21 AM Muhammet Orazov
>  wrote:
> >
> > Congratulations Zhongqiang! Well deserved!
> >
> > Best,
> > Muhammet
> >
> >
>


Re: [ANNOUNCE] New Apache Flink Committer - Hang Ruan

2024-06-17 Thread Jacky Lau
Congratulations Hang!

Best Regards
Jacky Lau

Yubin Li wrote on Tue, Jun 18, 2024 at 00:36:

> Congratulations Hang!
>
> Best Regards
> Yubin
>
> On Tue, Jun 18, 2024 at 12:23 AM Muhammet Orazov
>  wrote:
> >
> > Congratulations Hang! Well deserved!
> >
> > Best,
> > Muhammet
>


Re: [ANNOUNCE] New Apache Flink PMC Member - Fan Rui

2024-06-10 Thread Jacky Lau
Congratulations Rui, well deserved!

Regards,
Jacky Lau

Jeyhun Karimov wrote on Tue, Jun 11, 2024 at 03:49:

> Congratulations Rui, well deserved!
>
> Regards,
> Jeyhun
>
> On Mon, Jun 10, 2024, 10:21 Ahmed Hamdy  wrote:
>
> > Congratulations Rui!
> > Best Regards
> > Ahmed Hamdy
> >
> >
> > On Mon, 10 Jun 2024 at 09:10, David Radley 
> > wrote:
> >
> > > Congratulations, Rui!
> > >
> > > From: Sergey Nuyanzin 
> > > Date: Sunday, 9 June 2024 at 20:33
> > > To: dev@flink.apache.org 
> > > Subject: [EXTERNAL] Re: [ANNOUNCE] New Apache Flink PMC Member - Fan
> Rui
> > > Congratulations, Rui!
> > >
> > > On Fri, Jun 7, 2024 at 5:36 AM Xia Sun  wrote:
> > >
> > > > Congratulations, Rui!
> > > >
> > > > Best,
> > > > Xia
> > > >
> > > > Paul Lam wrote on Thu, Jun 6, 2024 at 11:59:
> > > >
> > > > > Congrats, Rui!
> > > > >
> > > > > Best,
> > > > > Paul Lam
> > > > >
> > > > > > On Jun 6, 2024, at 11:02, Junrui Lee wrote:
> > > > > >
> > > > > > Congratulations, Rui.
> > > > > >
> > > > > > Best,
> > > > > > Junrui
> > > > > >
> > > > > > Hang Ruan wrote on Thu, Jun 6, 2024 at 10:35:
> > > > > >
> > > > > >> Congratulations, Rui!
> > > > > >>
> > > > > >> Best,
> > > > > >> Hang
> > > > > >>
> > > > > >> Samrat Deb wrote on Thu, Jun 6, 2024 at 10:28:
> > > > > >>
> > > > > >>> Congratulations Rui
> > > > > >>>
> > > > > >>> Bests,
> > > > > >>> Samrat
> > > > > >>>
> > > > > >>> On Thu, 6 Jun 2024 at 7:45 AM, Yuxin Tan <
> tanyuxinw...@gmail.com
> > >
> > > > > wrote:
> > > > > >>>
> > > > > >>>> Congratulations, Rui!
> > > > > >>>>
> > > > > >>>> Best,
> > > > > >>>> Yuxin
> > > > > >>>>
> > > > > >>>>
> > > > > >>>> Xuannan Su wrote on Thu, Jun 6, 2024 at 09:58:
> > > > > >>>>
> > > > > >>>>> Congratulations!
> > > > > >>>>>
> > > > > >>>>> Best regards,
> > > > > >>>>> Xuannan
> > > > > >>>>>
> > > > > >>>>> On Thu, Jun 6, 2024 at 9:53 AM Hangxiang Yu <
> > master...@gmail.com
> > > >
> > > > > >>> wrote:
> > > > > >>>>>>
> > > > > >>>>>> Congratulations, Rui !
> > > > > >>>>>>
> > > > > >>>>>> On Thu, Jun 6, 2024 at 9:18 AM Lincoln Lee <
> > > > lincoln.8...@gmail.com
> > > > > >>>
> > > > > >>>>> wrote:
> > > > > >>>>>>
> > > > > >>>>>>> Congratulations, Rui!
> > > > > >>>>>>>
> > > > > >>>>>>> Best,
> > > > > >>>>>>> Lincoln Lee
> > > > > >>>>>>>
> > > > > >>>>>>>
> > > > > >>>>>>> Lijie Wang wrote on Thu, Jun 6, 2024 at 09:11:
> > > > > >>>>>>>
> > > > > >>>>>>>> Congratulations, Rui!
> > > > > >>>>>>>>
> > > > > >>>>>>>> Best,
> > > > > >>>>>>>> Lijie
> > > > > >>>>>>>>
> > > > > >>>>>>>> Rodrigo Meneses wrote on Wed, Jun 5, 2024 at 21:35:
> > > > > >>>>>>>>
> > > > > >>>>>>>>> All the best
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> On Wed, Jun 5, 2024 at 5:56 AM xiangyu feng <
> > > > > >>>> xiangyu...@gmail.com>
> > > > > >>>>>>>> wrote:
> > > > > >>

Re: [ANNOUNCE] New Apache Flink PMC Member - Lincoln Lee

2024-04-15 Thread Jacky Lau
Congratulations, Lincoln!

Best,
Jacky Lau

Jinzhong Li wrote on Mon, Apr 15, 2024 at 15:45:

> Congratulations, Lincoln!
>
> Best,
> Jinzhong Li
>
> On Mon, Apr 15, 2024 at 2:56 PM Hangxiang Yu  wrote:
>
> > Congratulations, Lincoln!
> >
> > On Mon, Apr 15, 2024 at 10:17 AM Zakelly Lan 
> > wrote:
> >
> > > Congratulations, Lincoln!
> > >
> > >
> > > Best,
> > > Zakelly
> > >
> > > On Sat, Apr 13, 2024 at 12:48 AM Ferenc Csaky
>  > >
> > > wrote:
> > >
> > > > Congratulations, Lincoln!
> > > >
> > > > Best,
> > > > Ferenc
> > > >
> > > >
> > > >
> > > >
> > > > On Friday, April 12th, 2024 at 15:54, lorenzo.affe...@ververica.com
> > > .INVALID
> > > >  wrote:
> > > >
> > > > >
> > > > >
> > > > > Huge congrats! Well done!
> > > > > On Apr 12, 2024 at 13:56 +0200, Ron liu ron9@gmail.com, wrote:
> > > > >
> > > > > > Congratulations, Lincoln!
> > > > > >
> > > > > > Best,
> > > > > > Ron
> > > > > >
> > > > > > Junrui Lee jrlee@gmail.com wrote on Fri, Apr 12, 2024 at 18:54:
> > > > > >
> > > > > > > Congratulations, Lincoln!
> > > > > > >
> > > > > > > Best,
> > > > > > > Junrui
> > > > > > >
> > > > > > > Aleksandr Pilipenko z3d...@gmail.com wrote on Fri, Apr 12, 2024 at 18:29:
> > > > > > >
> > > > > > > > > Congratulations, Lincoln!
> > > > > > > > >
> > > > > > > > > Best Regards
> > > > > > > > > Aleksandr
> > > >
> > >
> >
> >
> > --
> > Best,
> > Hangxiang.
> >
>


Re: [ANNOUNCE] New Apache Flink PMC Member - Jing Ge

2024-04-15 Thread Jacky Lau
Congratulations, Jing!

Best,
Jacky Lau


[jira] [Created] (FLINK-35075) Migrate TwoStageOptimizedAggregateRule

2024-04-10 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-35075:
-

 Summary: Migrate TwoStageOptimizedAggregateRule
 Key: FLINK-35075
 URL: https://issues.apache.org/jira/browse/FLINK-35075
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35062) Migrate RewriteMultiJoinConditionRule

2024-04-09 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-35062:
-

 Summary: Migrate RewriteMultiJoinConditionRule
 Key: FLINK-35062
 URL: https://issues.apache.org/jira/browse/FLINK-35062
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-35061) Migrate TemporalJoinUtil

2024-04-09 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-35061:
-

 Summary: Migrate TemporalJoinUtil
 Key: FLINK-35061
 URL: https://issues.apache.org/jira/browse/FLINK-35061
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-35054) Migrate TemporalJoinRewriteWithUniqueKeyRule

2024-04-08 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-35054:
-

 Summary: Migrate TemporalJoinRewriteWithUniqueKeyRule
 Key: FLINK-35054
 URL: https://issues.apache.org/jira/browse/FLINK-35054
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








Re: [ANNOUNCE] Apache Flink 1.19.0 released

2024-03-20 Thread Jacky Lau
Congratulations!

Best,
Jacky Lau

Hongshun Wang wrote on Thu, Mar 21, 2024 at 10:32:

> Congratulations!
>
> Best,
> Hongshun
>
> On Tue, Mar 19, 2024 at 3:12 PM Shawn Huang  wrote:
>
> > Congratulations!
> >
> > Best,
> > Shawn Huang
> >
> >
> > Xuannan Su wrote on Tue, Mar 19, 2024 at 14:40:
> >
> > > Congratulations! Thanks for all the great work!
> > >
> > > Best regards,
> > > Xuannan
> > >
> > > On Tue, Mar 19, 2024 at 1:31 PM Yu Li  wrote:
> > > >
> > > > Congrats and thanks all for the efforts!
> > > >
> > > > Best Regards,
> > > > Yu
> > > >
> > > > On Tue, 19 Mar 2024 at 11:51, gongzhongqiang <
> > gongzhongqi...@apache.org>
> > > wrote:
> > > > >
> > > > > Congrats! Thanks to everyone involved!
> > > > >
> > > > > Best,
> > > > > Zhongqiang Gong
> > > > >
> > > > > Lincoln Lee wrote on Mon, Mar 18, 2024 at 16:27:
> > > > >>
> > > > >> The Apache Flink community is very happy to announce the release
> of
> > > Apache
> > > > >> Flink 1.19.0, which is the first release for the Apache Flink 1.19
> > > series.
> > > > >>
> > > > >> Apache Flink® is an open-source stream processing framework for
> > > > >> distributed, high-performing, always-available, and accurate data
> > > streaming
> > > > >> applications.
> > > > >>
> > > > >> The release is available for download at:
> > > > >> https://flink.apache.org/downloads.html
> > > > >>
> > > > >> Please check out the release blog post for an overview of the
> > > improvements
> > > > >> for this release:
> > > > >>
> > >
> >
> https://flink.apache.org/2024/03/18/announcing-the-release-of-apache-flink-1.19/
> > > > >>
> > > > >> The full release notes are available in Jira:
> > > > >>
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353282
> > > > >>
> > > > >> We would like to thank all contributors of the Apache Flink
> > community
> > > who
> > > > >> made this release possible!
> > > > >>
> > > > >>
> > > > >> Best,
> > > > >> Yun, Jing, Martijn and Lincoln
> > >
> >
>


[jira] [Created] (FLINK-34702) Rank should not convert to StreamExecDeduplicate when the input is not insert only

2024-03-17 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34702:
-

 Summary: Rank should not convert to StreamExecDeduplicate when the 
input is not insert only
 Key: FLINK-34702
 URL: https://issues.apache.org/jira/browse/FLINK-34702
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0


{code:java}
@Test
def testSimpleFirstRowOnBuiltinProctime1(): Unit = {
  val sqlQuery =
"""
  |SELECT *
  |FROM (
  |  SELECT *,
  |ROW_NUMBER() OVER (PARTITION BY a ORDER BY PROCTIME() ASC) as rowNum
  |  FROM (select a, count(b) as b from MyTable group by a)
  |)
  |WHERE rowNum = 1
""".stripMargin

  util.verifyExecPlan(sqlQuery)
} {code}





[jira] [Created] (FLINK-34665) Add streaming rule for union to Rank so it converts to StreamExecDeduplicate finally

2024-03-14 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34665:
-

 Summary: Add streaming rule for union to Rank so it converts to 
StreamExecDeduplicate finally
 Key: FLINK-34665
 URL: https://issues.apache.org/jira/browse/FLINK-34665
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0


The semantics of UNION in SQL involve deduplication, and in Calcite, when 
converting a SQL node to a RelNode, a distinct Aggregate is inserted above the 
Union to achieve this deduplication. In Flink, the distinct Aggregate 
eventually gets converted into a StreamExecGroupAggregate operator. This 
operator accesses state multiple times, and from our observations of numerous 
jobs, it is frequently blocked on state access. This is because the key of the 
distinct aggregate consists of all the fields of the union, so the state key is 
relatively large, and repeatedly accessing and comparing against the state is 
time-consuming.

A potential optimization is to add a rule that converts the Union into a Rank 
over processing time, which ultimately gets converted into a 
StreamExecDeduplicate. Currently, we have users rewrite their SQL to use 
ROW_NUMBER for deduplication, and this approach works very well. It should 
therefore be possible to add a rule at the engine level to support this 
optimization.

However, this rewrite changes the resulting plan, which can break plan 
compatibility when users upgrade their Flink version. I therefore suggest 
adding a flag whose default value keeps the current behavior.
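
The manual rewrite users apply today can be sketched in Flink SQL. This is an 
illustrative example only, not the exact output of the proposed rule; the table 
names t1/t2 and columns a/b are hypothetical:
{code:sql}
-- Plain UNION deduplicates via a distinct aggregate over all columns,
-- which Flink plans as a StreamExecGroupAggregate keyed on (a, b).
SELECT a, b FROM t1
UNION
SELECT a, b FROM t2;

-- Manual rewrite: UNION ALL plus ROW_NUMBER() over processing time.
-- Flink recognizes this pattern and plans a StreamExecDeduplicate instead.
SELECT a, b
FROM (
  SELECT a, b,
         ROW_NUMBER() OVER (
           PARTITION BY a, b ORDER BY PROCTIME() ASC) AS row_num
  FROM (
    SELECT a, b FROM t1
    UNION ALL
    SELECT a, b FROM t2
  )
)
WHERE row_num = 1;
{code}
The proposed rule would perform this transformation inside the planner, guarded 
by the suggested flag.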





[jira] [Created] (FLINK-34650) Migrate PushProjectIntoLegacyTableSourceScanRule

2024-03-11 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34650:
-

 Summary: Migrate PushProjectIntoLegacyTableSourceScanRule
 Key: FLINK-34650
 URL: https://issues.apache.org/jira/browse/FLINK-34650
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-34649) Migrate PushFilterIntoLegacyTableSourceScanRule

2024-03-11 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34649:
-

 Summary: Migrate PushFilterIntoLegacyTableSourceScanRule
 Key: FLINK-34649
 URL: https://issues.apache.org/jira/browse/FLINK-34649
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-34616) python dist doesn't clean up when the open method constructs resources

2024-03-07 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34616:
-

 Summary: python dist doesn't clean up when the open method constructs 
resources
 Key: FLINK-34616
 URL: https://issues.apache.org/jira/browse/FLINK-34616
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-34612) Migrate StreamPhysicalValuesRule

2024-03-06 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34612:
-

 Summary: Migrate StreamPhysicalValuesRule
 Key: FLINK-34612
 URL: https://issues.apache.org/jira/browse/FLINK-34612
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-34611) Migrate StreamPhysicalUnionRule

2024-03-06 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34611:
-

 Summary: Migrate StreamPhysicalUnionRule
 Key: FLINK-34611
 URL: https://issues.apache.org/jira/browse/FLINK-34611
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-34609) Migrate BatchPhysicalScriptTransformRule

2024-03-06 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34609:
-

 Summary: Migrate BatchPhysicalScriptTransformRule
 Key: FLINK-34609
 URL: https://issues.apache.org/jira/browse/FLINK-34609
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-34610) Migrate StreamPhysicalLimitRule

2024-03-06 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34610:
-

 Summary: Migrate StreamPhysicalLimitRule
 Key: FLINK-34610
 URL: https://issues.apache.org/jira/browse/FLINK-34610
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-34608) Migrate BatchPhysicalValuesRule

2024-03-06 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34608:
-

 Summary: Migrate BatchPhysicalValuesRule
 Key: FLINK-34608
 URL: https://issues.apache.org/jira/browse/FLINK-34608
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-34607) Migrate BatchPhysicalUnionRule

2024-03-06 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34607:
-

 Summary: Migrate BatchPhysicalUnionRule
 Key: FLINK-34607
 URL: https://issues.apache.org/jira/browse/FLINK-34607
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-34606) Migrate BatchPhysicalExpandRule

2024-03-06 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34606:
-

 Summary: Migrate BatchPhysicalExpandRule
 Key: FLINK-34606
 URL: https://issues.apache.org/jira/browse/FLINK-34606
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-34605) Migrate BatchPhysicalDistributionRule

2024-03-06 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34605:
-

 Summary: Migrate BatchPhysicalDistributionRule
 Key: FLINK-34605
 URL: https://issues.apache.org/jira/browse/FLINK-34605
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-34603) Migrate BatchPhysicalBoundedStreamScanRule

2024-03-06 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34603:
-

 Summary: Migrate BatchPhysicalBoundedStreamScanRule
 Key: FLINK-34603
 URL: https://issues.apache.org/jira/browse/FLINK-34603
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-34604) Migrate BatchPhysicalCalcRule

2024-03-06 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34604:
-

 Summary: Migrate BatchPhysicalCalcRule
 Key: FLINK-34604
 URL: https://issues.apache.org/jira/browse/FLINK-34604
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-34556) Migrate EnumerableToLogicalTableScan

2024-02-29 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34556:
-

 Summary: Migrate EnumerableToLogicalTableScan
 Key: FLINK-34556
 URL: https://issues.apache.org/jira/browse/FLINK-34556
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-34555) Migrate JoinConditionTypeCoerceRule

2024-02-29 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34555:
-

 Summary: Migrate JoinConditionTypeCoerceRule
 Key: FLINK-34555
 URL: https://issues.apache.org/jira/browse/FLINK-34555
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-34505) Migrate WindowGroupReorderRule

2024-02-23 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34505:
-

 Summary: Migrate WindowGroupReorderRule
 Key: FLINK-34505
 URL: https://issues.apache.org/jira/browse/FLINK-34505
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-34503) Migrate JoinDeriveNullFilterRule

2024-02-22 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34503:
-

 Summary: Migrate JoinDeriveNullFilterRule
 Key: FLINK-34503
 URL: https://issues.apache.org/jira/browse/FLINK-34503
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-34494) Migrate ReplaceIntersectWithSemiJoinRule

2024-02-21 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34494:
-

 Summary: Migrate ReplaceIntersectWithSemiJoinRule
 Key: FLINK-34494
 URL: https://issues.apache.org/jira/browse/FLINK-34494
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-34493) Migrate ReplaceMinusWithAntiJoinRule

2024-02-21 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34493:
-

 Summary: Migrate ReplaceMinusWithAntiJoinRule
 Key: FLINK-34493
 URL: https://issues.apache.org/jira/browse/FLINK-34493
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-34492) fix Scala-style comment links when migrating Scala to Java

2024-02-21 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34492:
-

 Summary: fix Scala-style comment links when migrating Scala to Java
 Key: FLINK-34492
 URL: https://issues.apache.org/jira/browse/FLINK-34492
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-34481) Migrate SetOpRewriteUtil

2024-02-21 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34481:
-

 Summary: Migrate SetOpRewriteUtil
 Key: FLINK-34481
 URL: https://issues.apache.org/jira/browse/FLINK-34481
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0


We should migrate SetOpRewriteUtil, which is used by 
ReplaceMinusWithAntiJoinRule, 
RewriteIntersectAllRule, and 
RewriteMinusAllRule.





[jira] [Created] (FLINK-34473) Migrate FlinkPruneEmptyRules

2024-02-20 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34473:
-

 Summary: Migrate FlinkPruneEmptyRules
 Key: FLINK-34473
 URL: https://issues.apache.org/jira/browse/FLINK-34473
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.20.0
Reporter: Jacky Lau
 Fix For: 1.20.0








[jira] [Created] (FLINK-34033) flink json supports raw type

2024-01-08 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-34033:
-

 Summary: flink json supports raw type 
 Key: FLINK-34033
 URL: https://issues.apache.org/jira/browse/FLINK-34033
 Project: Flink
  Issue Type: Bug
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Affects Versions: 1.19.0
Reporter: Jacky Lau
 Fix For: 1.19.0


When users map the Elasticsearch nested/object types, they can use complex 
types like ROW/ARRAY/MAP.

However, this is not convenient when the object does not have a fixed shape 
like ROW; for example, a user may use a UDF to produce such data and insert it 
into Elasticsearch. We can support the RAW type:
{code:java}
CREATE TABLE es_sink (
 `string` VARCHAR, 
  nested RAW('java.lang.Object', 
'AEdvcmcuYXBhY2hlLmZsaW5rLmFwaS5qYXZhLnR5cGV1dGlscy5ydW50aW1lLmtyeW8uS3J5b1NlcmlhbGl6ZXJTbmFwc2hvdAIAEGphdmEubGFuZy5PYmplY3QAAATyxpo9cAIAEGphdmEubGFuZy5PYmplY3QBEgAQamF2YS5sYW5nLk9iamVjdAEWABBqYXZhLmxhbmcuT2JqZWN0AAApb3JnLmFwYWNoZS5hdnJvLmdlbmVyaWMuR2VuZXJpY0RhdGEkQXJyYXkBKwApb3JnLmFwYWNoZS5hdnJvLmdlbmVyaWMuR2VuZXJpY0RhdGEkQXJyYXkBtgBVb3JnLmFwYWNoZS5mbGluay5hcGkuamF2YS50eXBldXRpbHMucnVudGltZS5rcnlvLlNlcmlhbGl6ZXJzJER1bW15QXZyb1JlZ2lzdGVyZWRDbGFzcwEAWW9yZy5hcGFjaGUuZmxpbmsuYXBpLmphdmEudHlwZXV0aWxzLnJ1bnRpbWUua3J5by5TZXJpYWxpemVycyREdW1teUF2cm9LcnlvU2VyaWFsaXplckNsYXNzAAAE8saaPXAE8saaPXAA'),
  object RAW('java.lang.Object', 
'AEdvcmcuYXBhY2hlLmZsaW5rLmFwaS5qYXZhLnR5cGV1dGlscy5ydW50aW1lLmtyeW8uS3J5b1NlcmlhbGl6ZXJTbmFwc2hvdAIAEGphdmEubGFuZy5PYmplY3QAAATyxpo9cAIAEGphdmEubGFuZy5PYmplY3QBEgAQamF2YS5sYW5nLk9iamVjdAEWABBqYXZhLmxhbmcuT2JqZWN0AAApb3JnLmFwYWNoZS5hdnJvLmdlbmVyaWMuR2VuZXJpY0RhdGEkQXJyYXkBKwApb3JnLmFwYWNoZS5hdnJvLmdlbmVyaWMuR2VuZXJpY0RhdGEkQXJyYXkBtgBVb3JnLmFwYWNoZS5mbGluay5hcGkuamF2YS50eXBldXRpbHMucnVudGltZS5rcnlvLlNlcmlhbGl6ZXJzJER1bW15QXZyb1JlZ2lzdGVyZWRDbGFzcwEAWW9yZy5hcGFjaGUuZmxpbmsuYXBpLmphdmEudHlwZXV0aWxzLnJ1bnRpbWUua3J5by5TZXJpYWxpemVycyREdW1teUF2cm9LcnlvU2VyaWFsaXplckNsYXNzAAAE8saaPXAE8saaPXAA'),
  PRIMARY KEY (`string`) NOT ENFORCED
) WITH
('connector'='elasticsearch'); {code}
The Elasticsearch connector currently depends on flink-json, so we can make 
flink-json support the RAW type.





[jira] [Created] (FLINK-33951) should use aggCallNeedRetractions instead of needRetraction to check the retract method, or else "retract method not implemented" is thrown

2023-12-27 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-33951:
-

 Summary: should use aggCallNeedRetractions instead of 
needRetraction to check the retract method, or else "retract method not 
implemented" is thrown
 Key: FLINK-33951
 URL: https://issues.apache.org/jira/browse/FLINK-33951
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Jacky Lau
 Fix For: 1.19.0








[jira] [Created] (FLINK-33950) Update max aggregate functions to new type system

2023-12-27 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-33950:
-

 Summary: Update max aggregate functions to new type system
 Key: FLINK-33950
 URL: https://issues.apache.org/jira/browse/FLINK-33950
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Jacky Lau
 Fix For: 1.19.0








[jira] [Created] (FLINK-33948) Cleanup usage of org.apache.flink.table.api.dataview.MapView#MapView

2023-12-26 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-33948:
-

 Summary: Cleanup usage of 
org.apache.flink.table.api.dataview.MapView#MapView
 Key: FLINK-33948
 URL: https://issues.apache.org/jira/browse/FLINK-33948
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Jacky Lau
 Fix For: 1.19.0








[jira] [Created] (FLINK-33945) Cleanup usage of deprecated org.apache.flink.table.api.dataview.ListView#ListView

2023-12-26 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-33945:
-

 Summary: Cleanup usage of deprecated 
org.apache.flink.table.api.dataview.ListView#ListView
 Key: FLINK-33945
 URL: https://issues.apache.org/jira/browse/FLINK-33945
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Jacky Lau
 Fix For: 1.19.0








[jira] [Created] (FLINK-33896) Implement restore tests for Correlate node

2023-12-20 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-33896:
-

 Summary: Implement restore tests for Correlate node
 Key: FLINK-33896
 URL: https://issues.apache.org/jira/browse/FLINK-33896
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Jacky Lau
 Fix For: 1.19.0








[jira] [Created] (FLINK-33895) Implement restore tests for PythonGroupWindowAggregate node

2023-12-20 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-33895:
-

 Summary: Implement restore tests for PythonGroupWindowAggregate 
node
 Key: FLINK-33895
 URL: https://issues.apache.org/jira/browse/FLINK-33895
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Jacky Lau
 Fix For: 1.19.0








[jira] [Created] (FLINK-33894) Implement restore tests for PythonGroupAggregate node

2023-12-20 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-33894:
-

 Summary: Implement restore tests for PythonGroupAggregate node
 Key: FLINK-33894
 URL: https://issues.apache.org/jira/browse/FLINK-33894
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Jacky Lau
 Fix For: 1.19.0








[jira] [Created] (FLINK-33893) Implement restore tests for PythonCorrelate node

2023-12-20 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-33893:
-

 Summary: Implement restore tests for PythonCorrelate node
 Key: FLINK-33893
 URL: https://issues.apache.org/jira/browse/FLINK-33893
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Jacky Lau
 Fix For: 1.19.0








[jira] [Created] (FLINK-33781) Cleanup usage of deprecated org.apache.flink.table.api.TableConfig#ctor()

2023-12-08 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-33781:
-

 Summary: Cleanup usage of deprecated 
org.apache.flink.table.api.TableConfig#ctor()
 Key: FLINK-33781
 URL: https://issues.apache.org/jira/browse/FLINK-33781
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Jacky Lau
 Fix For: 1.19.0








[jira] [Created] (FLINK-33779) Cleanup usage of deprecated BaseExpressions#cast(TypeInformation)

2023-12-08 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-33779:
-

 Summary: Cleanup usage of deprecated 
BaseExpressions#cast(TypeInformation)
 Key: FLINK-33779
 URL: https://issues.apache.org/jira/browse/FLINK-33779
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Jacky Lau
 Fix For: 1.19.0








[jira] [Created] (FLINK-33778) Cleanup usage of deprecated TableConfig#setIdleStateRetentionTime(Time, Time)

2023-12-07 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-33778:
-

 Summary: Cleanup usage of deprecated 
TableConfig#setIdleStateRetentionTime(Time, Time)
 Key: FLINK-33778
 URL: https://issues.apache.org/jira/browse/FLINK-33778
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Jacky Lau
 Fix For: 1.19.0








[jira] [Created] (FLINK-33546) Implement type inference for order(asc/desc) function

2023-11-14 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-33546:
-

 Summary: Implement type inference for order(asc/desc) function
 Key: FLINK-33546
 URL: https://issues.apache.org/jira/browse/FLINK-33546
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Jacky Lau
 Fix For: 1.19.0








[jira] [Created] (FLINK-33518) Implement restore tests for WatermarkAssigner node

2023-11-10 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-33518:
-

 Summary: Implement restore tests for WatermarkAssigner node
 Key: FLINK-33518
 URL: https://issues.apache.org/jira/browse/FLINK-33518
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Jacky Lau
 Fix For: 1.19.0








[jira] [Created] (FLINK-33517) Implement restore tests for Value node

2023-11-10 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-33517:
-

 Summary: Implement restore tests for Value node
 Key: FLINK-33517
 URL: https://issues.apache.org/jira/browse/FLINK-33517
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Jacky Lau
 Fix For: 1.19.0








[jira] [Created] (FLINK-32721) agg supports char type

2023-07-31 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-32721:
-

 Summary: agg supports char type
 Key: FLINK-32721
 URL: https://issues.apache.org/jira/browse/FLINK-32721
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Affects Versions: 1.18.0
Reporter: Jacky Lau
 Fix For: 1.18.0


{code:java}
// flink
Flink SQL> CREATE TABLE Orders (
>     name char(10),
>     price        DECIMAL(32,2),
>     buyer        ROW,
>     order_time   TIMESTAMP(3)
> ) WITH (
>   'connector' = 'datagen'
> );
[INFO] Execute statement succeed.


Flink SQL> select max(name) from Orders;
[ERROR] Could not execute SQL statement. Reason:
org.apache.flink.table.api.TableException: Max aggregate function does not 
support type: ''CHAR''.
Please re-check the data type. {code}
{code:java}
// mysql

CREATE TABLE IF NOT EXISTS `docs` (
  `id` int(6) unsigned NOT NULL,
  `rev` int(3) unsigned NOT NULL,
  `content` char(200) NOT NULL,
  PRIMARY KEY (`id`,`rev`)
) DEFAULT CHARSET=utf8;
INSERT INTO `docs` (`id`, `rev`, `content`) VALUES
  ('1', '1', 'The earth is flat'),
  ('2', '1', 'One hundred angels can dance on the head of a pin'),
  ('1', '2', 'The earth is flat and rests on a bull\'s horn'),
  ('1', '3', 'The earth is like a ball.');

select max(content) from docs;

// result 
|max(content)|
The earth is like a ball.{code}





[jira] [Created] (FLINK-32699) select typeof(proctime()); throw exception in sql-client

2023-07-26 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-32699:
-

 Summary: select typeof(proctime()); throw exception in sql-client
 Key: FLINK-32699
 URL: https://issues.apache.org/jira/browse/FLINK-32699
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Affects Versions: 1.18.0
Reporter: Jacky Lau
 Fix For: 1.18.0


{code:java}
Flink SQL> select typeof(proctime()); 
 
[ERROR] Could not execute SQL statement. Reason: 
org.apache.flink.table.planner.codegen.CodeGenException: Mismatch of function's 
argument data type 'TIMESTAMP_LTZ(3) NOT NULL' and actual argument type 
'TIMESTAMP_LTZ(3)'
 {code}





[jira] [Created] (FLINK-32641) json format supports pojo type

2023-07-21 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-32641:
-

 Summary: json format supports pojo type
 Key: FLINK-32641
 URL: https://issues.apache.org/jira/browse/FLINK-32641
 Project: Flink
  Issue Type: Improvement
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Affects Versions: 1.18.0
Reporter: Jacky Lau
 Fix For: 1.18.0








[jira] [Created] (FLINK-32530) array_position semantic should align with array_contains instead of spark

2023-07-04 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-32530:
-

 Summary: array_position semantic should align with array_contains 
instead of spark
 Key: FLINK-32530
 URL: https://issues.apache.org/jira/browse/FLINK-32530
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Affects Versions: 1.18.0
Reporter: Jacky Lau
 Fix For: 1.18.0


When I contributed array_contains support to Calcite
(https://issues.apache.org/jira/browse/CALCITE-5707), I found that Spark's and
Flink's behavior differs.
{code:java}
spark: array_contains(array[1, null], null) -> null
flink: array_contains(array[1, null], null) -> true  {code}
array_remove also differs (array_remove was contributed by me and aligns with
Flink's behavior).
{code:java}
spark: array_remove(array[1, null], null) -> null
flink: array_remove(array[1, null], null) -> 1  {code}
array_position, however, is aligned with Spark, which I think is not correct.
{code:java}
spark: array_position(array[1, null], null) -> 2
flink: array_position(array[1, null], null) -> 2   {code}
I also tested on PostgreSQL, which likewise returns 2.
{code:java}
postgre:
select array_position(ARRAY[1, null], null); {code}
The semantics should follow a single convention for handling null elements, so
I think array_position should be changed.
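The divergence above can be sketched in a few lines of Python (an illustration of the semantics only, not Flink or Spark code; None stands in for SQL NULL):

```python
def array_contains_flink(arr, elem):
    # Flink semantics: a null needle matches a null element -> TRUE.
    return any(x == elem or (x is None and elem is None) for x in arr)

def array_contains_spark(arr, elem):
    # Spark semantics: a null needle propagates NULL (None).
    if elem is None:
        return None
    return elem in arr

def array_position(arr, elem):
    # Current behavior in Spark, Flink, and PostgreSQL: 1-based
    # position, and a null needle matches a null element.
    for i, x in enumerate(arr, start=1):
        if x == elem or (x is None and elem is None):
            return i
    return None

# Mirrors the examples quoted in the issue:
assert array_contains_flink([1, None], None) is True
assert array_contains_spark([1, None], None) is None
assert array_position([1, None], None) == 2
```

The point is that array_position currently follows the Spark-style convention while array_contains/array_remove follow the Flink-style one; whichever convention is chosen, it should be applied uniformly.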





[jira] [Created] (FLINK-32498) array_max return type should always nullable

2023-06-30 Thread Jacky Lau (Jira)
Jacky Lau created FLINK-32498:
-

 Summary: array_max return type should always nullable
 Key: FLINK-32498
 URL: https://issues.apache.org/jira/browse/FLINK-32498
 Project: Flink
  Issue Type: Improvement
Affects Versions: 1.18.0
Reporter: Jacky Lau
 Fix For: 1.18.0
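For illustration, one reason a nullable return type is required: the maximum of an empty array, or of an array containing only nulls, has no value to return. A minimal Python sketch (assuming null elements are skipped; that convention is an assumption for illustration, not confirmed Flink behavior):

```python
def array_max(arr):
    # Drop null (None) elements before comparing; if nothing remains,
    # the result is NULL (None) - hence the nullable return type, even
    # when the element type itself is declared NOT NULL.
    non_null = [x for x in arr if x is not None]
    return max(non_null) if non_null else None

assert array_max([1, 3, 2]) == 3
assert array_max([None, None]) is None
assert array_max([]) is None
```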








Re: Re: [VOTE] FLIP-302: Support TRUNCATE TABLE statement in batch mode

2023-04-17 Thread Jacky Lau
+1 (no-binding)

Best,
Jacky Lau

On Tue, Apr 18, 2023 at 11:57, Jingsong Li  wrote:

> +1
>
> On Tue, Apr 18, 2023 at 9:39 AM Aitozi  wrote:
> >
> > +1
> >
> > Best,
> > Aitozi
> >
> > > On Tue, Apr 18, 2023 at 09:18, ron  wrote:
> > >
> > > +1
> > >
> > >
> > > > -Original Message-
> > > > From: "Lincoln Lee" 
> > > > Sent: 2023-04-18 09:08:08 (Tuesday)
> > > > To: dev@flink.apache.org
> > > > Cc:
> > > > Subject: Re: [VOTE] FLIP-302: Support TRUNCATE TABLE statement in batch
> mode
> > > >
> > > > +1 (binding)
> > > >
> > > > Best,
> > > > Lincoln Lee
> > > >
> > > >
> > > > > On Mon, Apr 17, 2023 at 23:54, yuxia  wrote:
> > > >
> > > > > Hi all.
> > > > >
> > > > > Thanks for all the feedback on FLIP-302: Support TRUNCATE TABLE
> statement
> > > > > in batch mode [1].
> > > > > Based on the discussion [2], we have come to a consensus, so I
> would like
> > > > > to start a vote.
> > > > >
> > > > > The vote will last for at least 72 hours unless there is an
> objection or
> > > > > insufficient votes.
> > > > >
> > > > > [1]:
> > > > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-302%3A+Support+TRUNCATE+TABLE+statement+in+batch+mode
> > > > > [2]: [
> https://lists.apache.org/thread/m4r3wrd7p96wdst3nz3ncqzog6kf51cf |
> > > > > https://lists.apache.org/thread/m4r3wrd7p96wdst3nz3ncqzog6kf51cf ]
> > > > >
> > > > >
> > > > > Best regards,
> > > > > Yuxia
> > > > >
> > >
> > >
> > > --
> > > Best,
> > > Ron
>


Re: [DISCUSS] Add support for Apache Arrow format

2023-04-11 Thread Jacky Lau
Hi,
   I also think an Arrow format will be useful when reading from and writing to
message queues.
   Arrow defines a language-independent columnar memory format for flat and
hierarchical data, organized for efficient analytic operations on modern
hardware like CPUs and GPUs. The Arrow memory format also supports
zero-copy reads for lightning-fast data access without serialization
overhead, so it will bring a lot of benefit.
   We may also do some surveys of what other engines such as Spark, Hive, and
Presto support, how they support it, and how it is used.

   Best,
   Jacky.
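The columnar argument can be made concrete with a small dependency-free Python sketch (a stand-in for Arrow's real memory format, which additionally offers zero-copy buffers and cross-language metadata):

```python
# Row-oriented records, roughly how a row-based engine presents data.
rows = [
    {"id": 1, "price": 10},
    {"id": 2, "price": 12},
    {"id": 3, "price": 9},
]

def to_columnar(rows):
    # Pivot rows into one array per field - the essence of a columnar
    # layout: scanning a single column touches only that column's data,
    # which is what enables vectorized, SIMD-friendly analytics.
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns

cols = to_columnar(rows)
assert cols["price"] == [10, 12, 9]
assert sum(cols["price"]) == 31
```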

On Sun, Apr 2, 2023 at 22:22, Aitozi  wrote:

> Hi all,
> Thanks for your input.
>
> @Ran > However, as mentioned in the issue you listed, it may take a lot of
> work
> and the community's consideration for integrating Arrow.
>
> To clarify, this proposal solely aims to introduce flink-arrow as a new
> format,
> similar to flink-csv and flink-protobuf. It will not impact the internal
> data
> structure representation in Flink. For proof of concept, please refer to:
> https://github.com/Aitozi/flink/commits/arrow-format.
>
> @Martijn > I'm wondering if there's really much benefit for the Flink
> project to
> add another file format, over properly supporting the format that we
> already
> have in the project.
>
> Maintain the format we already have and introduce new formats should be
> orthogonal. The requirement of supporting arrow format originally observed
> in
> our internal usage to deserialize the data(VectorSchemaRoot) from other
> storage
> systems to flink internal RowData and serialize the flink internal RowData
> to
> VectorSchemaRoot out to the storage system.  And the requirement from the
> slack[1] is to support the arrow file format. Although, Arrow is not
> usually
> used as the final disk storage format.  But it has a tendency to be used
> as the
> inter-exchange format between different systems or temporary storage for
> analysis due to its columnar format and can be memory mapped to other
> analysis
> programs.
>
> So, I think it's meaningful to support arrow formats in Flink.
>
> @Jim >  If the Flink format interface is used there, then it may be useful
> to
> consider Arrow along with other columnar formats.
>
> I am not well-versed with the formats utilized in Paimon. Upon checking
> [2], it
> appears that Paimon does not directly employ flink formats. Instead, it
> utilizes
> FormatWriterFactory and FormatReaderFactory to handle data serialization
> and
> deserialization. Therefore, I believe that the current work may not be
> applicable for reuse in Paimon at this time.
>
> Best,
> Aitozi.
>
> [1]: https://apache-flink.slack.com/archives/C03GV7L3G2C/p1677915016551629
> [2]:
> https://github.com/apache/incubator-paimon/tree/master/paimon-format/src/main/java/org/apache/paimon/format
>
> On Fri, Mar 31, 2023 at 00:36, Jim Hughes  wrote:
> >
> > Hi all,
> >
> > How do Flink formats relate to or interact with Paimon (formerly
> > Flink-Table-Store)?  If the Flink format interface is used there, then it
> > may be useful to consider Arrow along with other columnar formats.
> >
> > Separately, from previous experience, I've seen the Arrow format be
> useful
> > as an output format for clients to read efficiently.  Arrow does support
> > returning batches of records, so there may be some options to use the
> > format in a streaming situation where a sufficient collection of records
> > can be gathered.
> >
> > Cheers,
> >
> > Jim
> >
> >
> >
> > On Thu, Mar 30, 2023 at 8:32 AM Martijn Visser  >
> > wrote:
> >
> > > Hi,
> > >
> > > To be honest, I haven't seen that much demand for supporting the Arrow
> > > format directly in Flink as a flink-format. I'm wondering if there's
> really
> > > much benefit for the Flink project to add another file format, over
> > > properly supporting the format that we already have in the project.
> > >
> > > Best regards,
> > >
> > > Martijn
> > >
> > > On Thu, Mar 30, 2023 at 2:21 PM Ran Tao  wrote:
> > >
> > > > It is a good point that flink integrates apache arrow as a format.
> > > > Arrow can take advantage of SIMD-specific or vectorized
> optimizations,
> > > > which should be of great benefit to batch tasks.
> > > > However, as mentioned in the issue you listed, it may take a lot of
> work
> > > > and the community's consideration for integrating Arrow.
> > > >
> > > > I think you can try to make a simple poc for verification and some
> > > specific
> > > > plans.
> > > >
> > > >
> > > > Best Regards,
> > > > Ran Tao
> > > >
> > > >
> > > > On Wed, Mar 29, 2023 at 19:12, Aitozi  wrote:
> > > >
> > > > > Hi guys
> > > > >  I'm opening this thread to discuss supporting the Apache Arrow
> > > > format
> > > > > in Flink.
> > > > >  Arrow is a language-independent columnar memory format that
> has
> > > > become
> > > > > widely used in different systems, and It can also serve as an
> > > > > inter-exchange format between other systems.
> > > > > So, using it directly in the Flink system will be nice. We also
> > > received
> > > > > some 

Re: [ANNOUNCE] Kafka Connector Code Removal from apache/flink:main branch and code freezing

2023-03-26 Thread Jacky Lau
Hi Gordon. https://issues.apache.org/jira/browse/FLINK-31006 is also a
critical bug in Kafka: the job will not exit after all partitions are consumed
when the JobManager fails over while running an unbounded source in pipeline
mode. I talked with @PatrickRen offline; we didn't have a suitable way to fix
it before, and we will solve it this week.

On Sat, Mar 25, 2023 at 13:13, Shammon FY  wrote:

> Thanks Jing and Gordon, I have closed the pr
> https://github.com/apache/flink/pull/21965 and will open a new one for
> kafka connector
>
>
> Best,
> shammon FY
>
>
> On Saturday, March 25, 2023, Ran Tao  wrote:
>
> > Thank you Gordon and all the people who have worked on the externalized
> > kafka implementation.
> > I have another pr related to Kafka[1]. I will be very appreciative if you
> > can help me review it in your free time.
> >
> > [1] https://github.com/apache/flink-connector-kafka/pull/10
> >
> > Best Regards,
> > Ran Tao
> >
> >
> > On Fri, Mar 24, 2023 at 23:21, Tzu-Li (Gordon) Tai  wrote:
> >
> > > Thanks Jing! I missed https://github.com/apache/flink/pull/21965
> indeed.
> > >
> > > Please let us know if anything else was overlooked.
> > >
> > > On Fri, Mar 24, 2023 at 8:13 AM Jing Ge 
> > > wrote:
> > >
> > > > Thanks Gordon for driving this! There is another PR related to Kafka
> > > > connector: https://github.com/apache/flink/pull/21965
> > > >
> > > > Best regards,
> > > > Jing
> > > >
> > > > On Fri, Mar 24, 2023 at 4:06 PM Tzu-Li (Gordon) Tai <
> > tzuli...@apache.org
> > > >
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > Now that Flink 1.17 has been released, and given that we've already
> > > > synced
> > > > > the latest Kafka connector code up to Flink 1.17 to the
> > > > > apache/flink-connector-kafka repo (thanks to Mason and Martijn for
> > most
> > > > of
> > > > > the effort!), we're now in the final step of completely removing
> the
> > > > Kafka
> > > > > connector code from apache/flink:main branch, tracked by
> FLINK-30859
> > > [1].
> > > > >
> > > > > As such, we'd like to ask that no more Kafka connector changes gets
> > > > merged
> > > > > to apache/flink:main, effective now. Going forward, all Kafka
> > connector
> > > > PRs
> > > > > should be opened directly against the apache/flink-connector-kafka:
> > main
> > > > > branch.
> > > > >
> > > > > Meanwhile, there's a couple of "dangling" Kafka connector PRs over
> > the
> > > > last
> > > > > 2 months that is opened against apache/flink:main:
> > > > >
> > > > >1. [FLINK-31305] Propagate producer exceptions outside of
> mailbox
> > > > >executor [2]
> > > > >2. [FLINK-31049] Add support for Kafka record headers to
> KafkaSink
> > > [3]
> > > > >3. [FLINK-31262] Move kafka sql connector fat jar test to
> > > > >SmokeKafkaITCase [4 ]
> > > > >4. [hotfix] Add writeTimestamp option to
> > > > >KafkaRecordSerializationSchemaBuilder [5]
> > > > >
> > > > > Apart from 1. [FLINK-31305] which is a critical bug and is already
> in
> > > > > review closed to being merged, for the rest we will be reaching out
> > on
> > > > the
> > > > > PRs to ask the authors to close the PR and reopen them against
> > > > > apache/flink-connector-kafka:main.
> > > > >
> > > > > Thanks,
> > > > > Gordon
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/FLINK-30859
> > > > > [2] https://github.com/apache/flink/pull/22150
> > > > > [3] https://github.com/apache/flink/pull/8
> > > > > [4] https://github.com/apache/flink/pull/22060
> > > > > [5] https://github.com/apache/flink/pull/22037
> > > > >
> > > >
> > >
> >
>


Re: [VOTE] FLIP-293: Introduce Flink Jdbc Driver For Sql Gateway

2023-03-14 Thread Jacky Lau
+1 (non-binding)
It will make integration with other tools, such as Beeline and Tableau, convenient.

Best Regards,
Jacky Lau

On Wed, Mar 15, 2023 at 13:11, Ran Tao  wrote:

> It's a very nice improvement.
>
> +1 (non-binding)
>
> Best Regards,
> Ran Tao
>
>
> Shammon FY  于2023年3月13日周一 13:47写道:
>
> > Hi Devs,
> >
> > I'd like to start the vote on FLIP-293: Introduce Flink Jdbc Driver For
> Sql
> > Gateway [1].
> >
> > The FLIP was discussed in thread [2], and it aims to introduce Flink Jdbc
> > Driver module in Flink.
> >
> > The vote will last for at least 72 hours (03/16, 15:00 UTC+8) unless
> there
> > is an objection or insufficient vote. Thank you all.
> >
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-293%3A+Introduce+Flink+Jdbc+Driver+For+Sql+Gateway
> > [2] https://lists.apache.org/thread/d1owrg8zh77v0xygcpb93fxt0jpjdkb3
> >
> >
> > Best,
> > Shammon.FY
> >
>


Re: [VOTE] FLIP-300: Add targetColumns to DynamicTableSink#Context to solve the null overwrite problem of partial-insert

2023-03-14 Thread Jacky Lau
+1 (non-binding)

Best,
Jacky

On Tue, Mar 14, 2023 at 17:17, Jingsong Li  wrote:

> +1 binding
>
> On Tue, Mar 14, 2023 at 10:54 AM Samrat Deb  wrote:
> >
> > +1 (non binding)
> > Thanks for driving it .
> >
> > Bests,
> > Samrat
> >
> > On Tue, 14 Mar 2023 at 7:41 AM, Jark Wu  wrote:
> >
> > > +1 (binding)
> > >
> > > Best,
> > > Jark
> > >
> > > > > On Mar 13, 2023, at 23:25, Aitozi  wrote:
> > > >
> > > > +1 (non-binding)
> > > >
> > > > Best,
> > > > Aitozi
> > > >
> > > > On Mon, Mar 13, 2023 at 22:10, Jing Ge  wrote:
> > > >
> > > >> +1 (binding)
> > > >>
> > > >> Best Regards,
> > > >> Jing
> > > >>
> > > >> On Mon, Mar 13, 2023 at 1:57 PM Hang Ruan 
> > > wrote:
> > > >>
> > > >>> +1 (non-binding)
> > > >>>
> > > >>> Best,
> > > >>> Hang
> > > >>>
> > > >>> On Mon, Mar 13, 2023 at 20:52, yuxia  wrote:
> > > >>>
> > >  +1 (binding)
> > >  Thanks Lincoln Lee for driving it.
> > > 
> > >  Best regards,
> > >  Yuxia
> > > 
> > >  - Original Message -
> > >  From: "Lincoln Lee" 
> > >  To: "dev" 
> > >  Sent: Monday, March 13, 2023, 8:17:52 PM
> > >  Subject: [VOTE] FLIP-300: Add targetColumns to
> DynamicTableSink#Context to
> > >  solve the null overwrite problem of partial-insert
> > > 
> > >  Dear Flink developers,
> > > 
> > >  Thanks for all your feedback for FLIP-300: Add targetColumns to
> > >  DynamicTableSink#Context to solve the null overwrite problem of
> > >  partial-insert[1] on the discussion thread[2].
> > > 
> > >  I'd like to start a vote for it. The vote will be open for at
> least 72
> > >  hours unless there is an objection or not enough votes.
> > > 
> > >  [1]
> > > 
> > > >>>
> > > >>
> > >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240885081
> > >  [2]
> https://lists.apache.org/thread/bk8x0nqg4oc62jqryj9ntzzlpj062wd9
> > > 
> > > 
> > >  Best,
> > >  Lincoln Lee
> > > 
> > > >>>
> > > >>
> > >
> > >
>


Re: [DISCUSS] FLIP-300: Add targetColumns to DynamicTableSink#Context to solve the null overwrite problem of partial-insert

2023-03-08 Thread Jacky Lau
Thanks for bringing this up; this is a good feature, but I have two
questions:
1. If two INSERT INTO statements write the same columns, the result may be
nondeterministic. Will the planner check this and throw an exception?
2. Some sink connectors cannot support it, e.g. queues such as Kafka
compacted topics. Will the planner also check this and throw an exception?
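For readers following along, a small Python sketch of the ambiguity the FLIP addresses (names are illustrative, not the proposed Flink API):

```python
TABLE_COLUMNS = ["a", "b", "c"]

def plan_partial_insert(target_columns, values):
    # FLINK-18726 behavior: columns absent from the INSERT column list
    # are padded with NULL (None) before the row reaches the sink.
    row = dict(zip(target_columns, values))
    return {col: row.get(col) for col in TABLE_COLUMNS}

# INSERT INTO t (a, c) VALUES (1, 3) -- column "b" is padded with NULL
padded = plan_partial_insert(["a", "c"], [1, 3])
assert padded == {"a": 1, "b": None, "c": 3}

def apply_to_sink(existing, row, target_columns=None):
    # With the FLIP's targetColumns, a sink can update only the columns
    # the user actually wrote instead of overwriting "b" with NULL.
    cols = target_columns if target_columns is not None else TABLE_COLUMNS
    return {**existing, **{c: row[c] for c in cols}}

existing = {"a": 0, "b": 42, "c": 0}
assert apply_to_sink(existing, padded)["b"] is None            # clobbered
assert apply_to_sink(existing, padded, ["a", "c"])["b"] == 42  # preserved
```

Without the target-column list, the sink cannot tell a user-supplied NULL from a planner-filled NULL, which is exactly the problem statement of the FLIP.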

On Tue, Mar 7, 2023 at 14:53, Lincoln Lee  wrote:

> Hi Aitozi,
>
> Thanks for your feedback!  Yes, including HBase and JDBC connector, they
> can be considered for support in the next step (JDBC as as a standard
> protocol supported not only in traditional databases, but also in more and
> more new types of storage). Considering the ongoing externalizing of
> connectors and the release cycles of the connectors are decoupled with the
> release cycle of Flink, we can initiate corresponding support issues for
> specific connectors to follow up on support after finalizing the API
> changes, WDYT?
>
> Best,
> Lincoln Lee
>
>
> On Tue, Mar 7, 2023 at 12:14, Hang Ruan  wrote:
>
> > Hi, Lincoln,
> >
> > Thanks for bringing this up. It looks good to me. I also agree with
> > Jingsong's suggestion.
> >
> > Best,
> > Hang
> >
> > On Tue, Mar 7, 2023 at 11:15, Jingsong Li  wrote:
> >
> > > Wow, we have 300 FLIPs...
> > >
> > > Thanks Lincoln,
> > >
> > > Have you considered returning an Optional?
> > >
> > > Empty array looks a little weird to me.
> > >
> > > Best,
> > > Jingsong
> > >
> > > On Tue, Mar 7, 2023 at 10:32 AM Aitozi  wrote:
> > > >
> > > > Hi Lincoln,
> > > > Thank you for sharing this FLIP. Overall, it looks good to me. I
> > have
> > > > one question: with the introduction of this interface,
> > > > will any existing Flink connectors need to be updated in order to
> take
> > > > advantage of its capabilities? For example, HBase.
> > > >
> > > > On Tue, Mar 7, 2023 at 10:01, yuxia  wrote:
> > > >
> > > > > Thanks. It makes sense to me.
> > > > >
> > > > > Best regards,
> > > > > Yuxia
> > > > >
> > > > > - Original Message -
> > > > > From: "Lincoln Lee" 
> > > > > To: "dev" 
> > > > > Sent: Monday, March 6, 2023, 10:26:26 PM
> > > > > Subject: Re: [DISCUSS] FLIP-300: Add targetColumns to
> > > DynamicTableSink#Context
> > > > > to solve the null overwrite problem of partial-insert
> > > > >
> > > > > hi yuxia,
> > > > >
> > > > > Thanks for your feedback and tracking the issue of update
> statement!
> > > I've
> > > > > updated the FLIP[1] and also the poc[2].
> > > > > Since the bug and flip are orthogonal, we can focus on finalizing
> the
> > > api
> > > > > changes first, and then work on the flip implementation and bugfix
> > > > > separately, WDYT?
> > > > >
> > > > > [1]
> > > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240885081
> > > > > [2] https://github.com/apache/flink/pull/22041
> > > > >
> > > > > Best,
> > > > > Lincoln Lee
> > > > >
> > > > >
> > > > > On Mon, Mar 6, 2023 at 21:21, yuxia  wrote:
> > > > >
> > > > > > Hi, Lincoln.
> > > > > > Thanks for bringing this up. +1 for this FLIP, it's helpful for
> > > external
> > > > > > storage system to implement partial update.
> > > > > > The FLIP looks good to me. I only want to add one comment, update
> > > > > > statement also doesn't support updating nested column, I have
> > created
> > > > > > FLINK-31344[1] to track it.
> > > > > > Maybe we also need to explain it in this FLIP.
> > > > > >
> > > > > > [1] https://issues.apache.org/jira/browse/FLINK-31344
> > > > > >
> > > > > > Best regards,
> > > > > > Yuxia
> > > > > >
> > > > > > - Original Message -
> > > > > > From: "Lincoln Lee" 
> > > > > > To: "dev" 
> > > > > > Sent: Friday, March 3, 2023, 12:22:19 PM
> > > > > > Subject: [DISCUSS] FLIP-300: Add targetColumns to
> > > DynamicTableSink#Context to
> > > > > > solve the null overwrite problem of partial-insert
> > > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > This FLIP[1] aims to support connectors in avoiding overwriting
> > > > > non-target
> > > > > > columns with null values when processing partial column updates,
> we
> > > > > propose
> > > > > > adding information on the target column list to
> > > DynamicTableSink#Context.
> > > > > >
> > > > > > FLINK-18726[2] supports inserting statements with specified
> column
> > > list,
> > > > > it
> > > > > > fills null values (or potentially declared default values in the
> > > future)
> > > > > > for columns not appearing in the column list of insert statement
> to
> > > the
> > > > > > target table.
> > > > > > But this behavior does not satisfy some partial column update
> > > > > requirements
> > > > > > of some storage systems which allow storing null values. The
> > problem
> > > is
> > > > > > that connectors cannot distinguish whether the null value of a
> > > column is
> > > > > > really from the user's data or whether it is a null value
> populated
> > > > > because
> > > > > > of partial insert behavior.
> > > > > >
> > > > > > Looking forward to your comments or feedback.
> > > > > >
> > > > > > [1]
> > > > > >
> > > > >
> > >
> >
> 

Re: [VOTE] FLIP-297: Improve Auxiliary Sql Statements

2023-03-06 Thread Jacky Lau
Thanks Ran.
+1 (non-binding)

Regards,
Jacky Lau

On Mon, Mar 6, 2023 at 19:32, Ran Tao  wrote:

> Hi Everyone,
>
>
> I want to start the vote on FLIP-297: Improve Auxiliary Sql Statements [1].
> The FLIP was discussed in this thread [2].
>
>
> The goal of the FLIP is to improve flink auxiliary sql statements(compared
> with sql standard or other mature engines).
>
> The vote will last for at least 72 hours (03/09, 19:30 UTC+8)
> unless there is an objection or insufficient votes. Thank you all.
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-297%3A+Improve+Auxiliary+Sql+Statements
> [2] https://lists.apache.org/thread/54fyd27m8on1cf3hn6dz564zqmkobjyd
>
> Best Regards,
> Ran Tao
> https://github.com/chucheng92
>


Re: [DISCUSS] FLIP-297: Improve Auxiliary Sql Statements

2023-02-22 Thread Jacky Lau
Hi Ran:
Thanks for driving the FLIP. The Google doc looks really good; improving the
user's interactive experience is important. +1 to support this
feature.

On Thu, Feb 23, 2023 at 00:51, Jing Ge  wrote:

> Hi Ran,
>
> Thanks for driving the FLIP.  It looks overall good. Would you like to add
> a description of useLike and notLike? I guess useLike true is for "LIKE"
> and notLike true is for "NOT LIKE" but I am not sure if I understood it
> correctly. Furthermore, does it make sense to support "ILIKE" too?
>
> Best regards,
> Jing
>
> On Wed, Feb 22, 2023 at 1:17 PM Ran Tao  wrote:
>
> > Currently flink sql auxiliary statements has supported some good features
> > such as catalog/databases/table support.
> >
> > But these features are not very complete compared with other popular
> > engines such as spark, presto, hive and commercial engines such as
> > snowflake.
> >
> > For example, many engines support show operation with filtering except
> > flink, and support describe other object(flink only support describe
> > table).
> >
> > I wonder can we add these useful features for flink?
> > You can find details in this doc.[1] or FLIP.[2]
> >
> > Also, please let me know if there is a mistake. Looking forward to your
> > reply.
> >
> > [1]
> >
> >
> https://docs.google.com/document/d/1hAiOfPx14VTBTOlpyxG7FA2mB1k5M31VnKYad2XpJ1I/
> > [2]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-297%3A+Improve+Auxiliary+Sql+Statements
> >
> > Best Regards,
> > Ran Tao
> >
>


RE: Re: [DISCUSS ] add --jars to support users dependencies jars.

2022-10-27 Thread Jacky Lau
Hi Martijn:
Some platform users may not package all the jars into a fat jar; Spark also
has --jars for dependencies:
https://stackoverflow.com/questions/29099115/spark-submit-add-multiple-jars-in-classpath

On 2022/10/27 06:48:52 Martijn Visser wrote:
> Hi Jacky Lau,
>
> Since you've sent the email to multiple mailing lists, I've decided to
> reply to the one that you've sent to both the Dev and User ML.
>
> > but it is not possible for platform users to create fat jars to package
> all their dependencies into the final jar package
>
> Can you elaborate on why that's not possible?
>
> Best regards,
>
> Martijn
>
> On Thu, Oct 27, 2022 at 6:59 AM Jacky Lau  wrote:
>
> > Hi guys:
> >
> > I'd like to initiate a discussion about adding command-line arguments to
> > support user-dependent jar packages.
> >
> > Currently flink supports user's main jars through -jarfile or without
> > setting this , the flink client will treat the first argument after
that as
> > the user master jar package when it encounters the first command line
> > argument that cannot be parsed. but it is not possible for platform
users
> > to create fat jars to package all their dependencies into the final jar
> > package. In the meantime, the configuration pipeline.jars is currently
> > exposed, and this value is overridden by command-line arguments such as
> > -jarfile.
> >
> > And If the user is using both the command-line argument and the
> > pipeline.jars argument, which can make the user werild and confused. In
> > addition, we should specify the priority "command line parameter > -D
> > dynamic parameter > flink-conf.yml configuration file parameter" in docs
> >
>


[DISCUSS ] add --jars to support users dependencies jars.

2022-10-26 Thread Jacky Lau
Hi guys:

I'd like to initiate a discussion about adding command-line arguments to
support user-dependent jar packages.

Currently Flink supports specifying the user's main jar through -jarfile;
without it, the Flink client treats the first command-line argument it cannot
parse as the user's main jar. However, it is not always possible for platform
users to create fat jars that package all their dependencies into the final
jar. In the meantime, the configuration pipeline.jars is currently exposed,
and this value is overridden by command-line arguments such as -jarfile.

If the user sets both the command-line argument and the pipeline.jars option,
the behavior can be weird and confusing. In addition, we should specify the
priority "command line parameter > -D dynamic parameter > flink-conf.yaml
configuration file parameter" in the docs.
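The proposed priority rule can be sketched as a layered lookup (illustrative only; not Flink's actual configuration code, and the option values are just examples):

```python
def resolve_config(cli_args, dynamic_props, conf_file):
    # Lower-priority layers are applied first, so higher-priority
    # layers overwrite them: CLI flag > -D dynamic property >
    # flink-conf.yaml configuration file.
    merged = dict(conf_file)
    merged.update(dynamic_props)
    merged.update(cli_args)
    return merged

conf = resolve_config(
    cli_args={"pipeline.jars": "a.jar;b.jar"},  # e.g. a --jars flag
    dynamic_props={"pipeline.jars": "c.jar"},   # -Dpipeline.jars=c.jar
    conf_file={"pipeline.jars": "d.jar", "parallelism.default": "2"},
)
assert conf["pipeline.jars"] == "a.jar;b.jar"   # CLI wins
assert conf["parallelism.default"] == "2"       # file value survives
```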




Re: [DISCUSS ] HybridSouce Table & Sql api timeline

2022-05-06 Thread Jacky Lau
I am interested in this. Can you share your DDL first? At the moment there is
no standard way to describe hybrid source DDL, so I also hope the community
can discuss it together and share ideas.

On Sat, May 7, 2022 at 11:59, Ran Tao  wrote:

> HybridSource is a good feature, but the current release does not support
> the Table & SQL API; I wonder when it will be ready for end-users.
>
> I have implemented an internal version at my company and it works well now.
> The Table & SQL API implementation may involve some core questions, e.g.
> bounded & unbounded source start and end offsets; a child source's schema
> may differ from the hybrid source DDL schema (the batch or streaming side
> may have more or fewer fields), so we need to handle the inconsistent-field
> problem or the default-field-value problem (when a child source lacks some
> fields), etc. So we may need a field mapping here.
>
> I have some ideas and implementations; if the Table & SQL API work is in
> progress, I'm glad to share or take part in developing it.
>
> thanks~
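The field-mapping idea mentioned above can be sketched in a few lines (an illustration only, not a proposed API; the field names are made up):

```python
def map_to_hybrid_schema(record, hybrid_fields, defaults=None):
    # Project a child source's record onto the hybrid source's DDL
    # schema: missing fields fall back to a default (or None), and
    # extra fields the child source carries are dropped.
    defaults = defaults or {}
    return {f: record.get(f, defaults.get(f)) for f in hybrid_fields}

hybrid_fields = ["id", "name", "ts"]

# Batch (historical) child source lacks the "ts" field:
batch_row = map_to_hybrid_schema({"id": 1, "name": "a"}, hybrid_fields)
assert batch_row == {"id": 1, "name": "a", "ts": None}

# Streaming child source carries an extra field, which is dropped:
stream_row = map_to_hybrid_schema(
    {"id": 2, "name": "b", "ts": 9, "extra": 1}, hybrid_fields)
assert stream_row == {"id": 2, "name": "b", "ts": 9}
```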
>


Re: Re: [ANNOUNCE] New Flink PMC member: Yang Wang

2022-05-06 Thread Jacky Lau
Congrats Yang and well Deserved!

Best,
Jacky Lau

On Sat, May 7, 2022 at 10:44, Yun Gao  wrote:

> Congratulations Yang!
>
> Best,
> Yun Gao
>
>
>
>  --Original Mail --
> Sender:David Morávek 
> Send Date:Sat May 7 01:05:41 2022
> Recipients:Dev 
> Subject:Re: [ANNOUNCE] New Flink PMC member: Yang Wang
> Nice! Congrats Yang, well deserved! ;)
>
> On Fri 6. 5. 2022 at 17:53, Peter Huang 
> wrote:
>
> > Congrats, Yang!
> >
> >
> >
> > Best Regards
> > Peter Huang
> >
> > On Fri, May 6, 2022 at 8:46 AM Yu Li  wrote:
> >
> > > Congrats and welcome, Yang!
> > >
> > > Best Regards,
> > > Yu
> > >
> > >
> > > On Fri, 6 May 2022 at 14:48, Paul Lam  wrote:
> > >
> > > > Congrats, Yang! Well Deserved!
> > > >
> > > > Best,
> > > > Paul Lam
> > > >
> > > > > > On May 6, 2022, at 14:38, Yun Tang  wrote:
> > > > >
> > > > > Congratulations, Yang!
> > > > >
> > > > > Best
> > > > > Yun Tang
> > > > > 
> > > > > From: Jing Ge 
> > > > > Sent: Friday, May 6, 2022 14:24
> > > > > To: dev 
> > > > > Subject: Re: [ANNOUNCE] New Flink PMC member: Yang Wang
> > > > >
> > > > > Congrats Yang and well Deserved!
> > > > >
> > > > > Best regards,
> > > > > Jing
> > > > >
> > > > > On Fri, May 6, 2022 at 7:38 AM Lincoln Lee  >
> > > > wrote:
> > > > >
> > > > >> Congratulations Yang!
> > > > >>
> > > > >> Best,
> > > > >> Lincoln Lee
> > > > >>
> > > > >>
> > > > >>> On Fri, May 6, 2022 at 12:46, Őrhidi Mátyás  wrote:
> > > > >>
> > > > >>> Congrats Yang! Well deserved!
> > > > >>> Best,
> > > > >>> Matyas
> > > > >>>
> > > > >>> On Fri, May 6, 2022 at 5:30 AM huweihua 
> > > > wrote:
> > > > >>>
> > > > >>>> Congratulations Yang!
> > > > >>>>
> > > > >>>> Best,
> > > > >>>> Weihua
> > > > >>>>
> > > > >>>>
> > > > >>>
> > > > >>
> > > >
> > > >
> > >
> >
>


RE: Re: [DISCUSS] FLIP-213: TaskManager's Flame Graphs

2022-02-11 Thread Jacky Lau
Our Flink applications run on Kubernetes. Yes, users can use async-profiler 
directly, but it is not convenient: they have to download the binaries and 
learn how to use the tool, and some users are not aware of it at all. If we 
integrate it, users will benefit a lot.

On 2022/01/26 18:56:17 David Morávek wrote:
> I'd second to Alex's concerns. Is there a reason why you can't use the
> async-profiler directly? In what kind of environment are your Flink
> clusters running (YARN / k8s / ...)?
> 
> Best,
> D.
> 
> On Wed, Jan 26, 2022 at 4:32 PM Alexander Fedulov 
> wrote:
> 
>> Hi Jacky,
>> 
>> Could you please clarify what kind of *problems* you experience with the
>> large parallelism? You referred to D3, is it something related to rendering
>> on the browser side or is it about the samples collection process? Were you
>> able to identify the bottleneck?
>> 
>> Fundamentally I have some concerns regarding the proposed approach:
>> 1. Calling shell scripts triggered via the web UI is a security concern and
>> it needs to be evaluated carefully if it could introduce any unexpected
>> attack vectors (depending on the implementation, passed parameters etc.)
>> 2. My understanding is that the async-profiler implementation is
>> system-dependent. How do you propose to handle multiple architectures?
>> Would you like to ship each available implementation within Flink? [1]
>> 3. Do you plan to make use of full async-profiler features including native
>> calls sampling with perf_events? If so, the issue I see is that some
>> environments restrict ptrace calls by default [2]
>> 
>> [1] https://github.com/jvm-profiling-tools/async-profiler#download
>> [2]
>> 
>> https://kubernetes.io/docs/concepts/policy/pod-security-policy/#host-namespaces
>> 
>> 
>> Best,
>> Alexander Fedulov
>> 
>> On Wed, Jan 26, 2022 at 1:59 PM 李森  wrote:
>> 
>>> This is an expected feature, as we also experienced browser crashes on
>>> existing operator-level flame graphs
>>> 
>>> Best,
>>> Echo Lee
>>> 
>>>> 在 2022年1月24日,下午6:16,David Morávek  写道:
>>>> 
>>>> Hi Jacky,
>>>> 
>>>> The link seems to be broken, here is the correct one [1].
>>>> 
>>>> [1]
>>>> 
>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-213%3A+TaskManager%27s+Flame+Graphs
>>>> 
>>>> Best,
>>>> D.
>>>> 
>>>>> On Mon, Jan 24, 2022 at 9:48 AM Jacky Lau <28...@qq.com.invalid>
>>> wrote:
>>>>> 
>>>>> Hi All,
>>>>>   I would like to start the discussion on FLIP-213 <
>>>>> 
>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-213%3A+TaskManager%27s+Flame+Graphs
>>>>> ;
>>>>> which aims to provide taskmanager level(process level) flame
>> graph
>>>>> by async profiler, which is most popular tool in java performance. and
>>> the
>>>>> arthas and intellij both use it.
>>>>> And we support it in our ant group company.
>>>>>  And Flink supports FLIP-165: Operator's Flame Graphs
>>>>> now, and it draws flame graphs with the front-end
>>>>> library d3-flame-graph, which has some problems in jobs
>>>>> of large parallelism.
>>>>>  Please be aware that the FLIP wiki area is not fully done
>>>>> since i don't know whether it will be accepted by the
>>>>> flink community.
>>>>>  Feel free to add your thoughts to make this feature
>>>>> better! i am looking forward to all your response.
>>>>> Thanks too much!
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> Best Jacky Lau
>>> 
>> 


RE: Re: [DISCUSS] FLIP-213: TaskManager's Flame Graphs

2022-02-10 Thread Jacky Lau
Hi Alexander:
   Sorry for the late response, due to the Chinese Spring Festival.
   The bottleneck is rendering on the browser side.
   For 1) we support a user-defined script capability, like YARN does. The flame 
graph script just encapsulates async-profiler, so we should make it secure.
   For 2) yeah, we use a different async-profiler package for each architecture.
   For 3) maybe not.

On 2022/01/26 15:24:51 Alexander Fedulov wrote:
> Hi Jacky,
> 
> Could you please clarify what kind of *problems* you experience with the
> large parallelism? You referred to D3, is it something related to rendering
> on the browser side or is it about the samples collection process? Were you
> able to identify the bottleneck?
> 
> Fundamentally I have some concerns regarding the proposed approach:
> 1. Calling shell scripts triggered via the web UI is a security concern and
> it needs to be evaluated carefully if it could introduce any unexpected
> attack vectors (depending on the implementation, passed parameters etc.)
> 2. My understanding is that the async-profiler implementation is
> system-dependent. How do you propose to handle multiple architectures?
> Would you like to ship each available implementation within Flink? [1]
> 3. Do you plan to make use of full async-profiler features including native
> calls sampling with perf_events? If so, the issue I see is that some
> environments restrict ptrace calls by default [2]
> 
> [1] https://github.com/jvm-profiling-tools/async-profiler#download
> [2]
> https://kubernetes.io/docs/concepts/policy/pod-security-policy/#host-namespaces
> 
> 
> Best,
> Alexander Fedulov
> 
> On Wed, Jan 26, 2022 at 1:59 PM 李森  wrote:
> 
> > This is an expected feature, as we also experienced browser crashes on
> > existing operator-level flame graphs
> >
> > Best,
> > Echo Lee
> >
> > > 在 2022年1月24日,下午6:16,David Morávek  写道:
> > >
> > > Hi Jacky,
> > >
> > > The link seems to be broken, here is the correct one [1].
> > >
> > > [1]
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-213%3A+TaskManager%27s+Flame+Graphs
> > >
> > > Best,
> > > D.
> > >
> > >> On Mon, Jan 24, 2022 at 9:48 AM Jacky Lau <28...@qq.com.invalid>
> > wrote:
> > >>
> > >> Hi All,
> > >>   I would like to start the discussion on FLIP-213 <
> > >>
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-213%3A+TaskManager%27s+Flame+Graphs
> > >> ;
> > >> which aims to provide taskmanager level(process level) flame graph
> > >> by async profiler, which is most popular tool in java performance. and
> > the
> > >> arthas and intellij both use it.
> > >> And we support it in our ant group company.
> > >>  And Flink supports FLIP-165: Operator's Flame Graphs
> > >> now, and it draws flame graphs with the front-end
> > >> library d3-flame-graph, which has some problems in jobs
> > >> of large parallelism.
> > >>  Please be aware that the FLIP wiki area is not fully done
> > >> since i don't know whether it will be accepted by the flink community.
> > >>  Feel free to add your thoughts to make this feature
> > better! i
> > >> am looking forward to all your response. Thanks too much!
> > >>
> > >>
> > >>
> > >>
> > >> Best Jacky Lau
> >
> 

[DISCUSS] FLIP-213: TaskManager's Flame Graphs

2022-01-24 Thread Jacky Lau
Hi All,
  I would like to start the discussion on FLIP-213 
<https://cwiki.apache.org/confluence/display/FLINK/FLIP-213%3A+TaskManager%27s+Flame+Graphs>,
which aims to provide TaskManager-level (process-level) flame graphs via 
async-profiler, the most popular tool for Java performance analysis; both Arthas 
and IntelliJ use it. And we support it in our company, Ant Group.
  Flink already supports FLIP-165: Operator's Flame Graphs, which draws flame 
graphs with the front-end library d3-flame-graph; this has some problems for 
jobs with large parallelism.
  Please be aware that the FLIP wiki page is not fully done, since I don't know 
yet whether the Flink community will accept it.
  Feel free to add your thoughts to make this feature better! I am looking 
forward to all your responses. Thanks so much!
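
One way to illustrate the large-parallelism problem: if identical stacks sampled across many subtasks are merged on the server side before being handed to d3-flame-graph, the browser has far fewer nodes to render. A minimal sketch, not part of the FLIP; the input lines use async-profiler's collapsed-stack text format (`frame1;frame2 count`):

```python
from collections import Counter

def merge_collapsed_stacks(lines):
    """Merge identical call stacks (collected across threads/subtasks)
    by summing their sample counts."""
    merged = Counter()
    for line in lines:
        stack, count = line.rsplit(" ", 1)
        merged[stack] += int(count)
    return merged

samples = [
    "main;run;compute 10",
    "main;run;compute 5",   # same stack sampled on another subtask
    "main;run;io 3",
]
merged = merge_collapsed_stacks(samples)
print(dict(merged))  # {'main;run;compute': 15, 'main;run;io': 3}
```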




Best Jacky Lau

[DISCUSS] flink authorization supports

2020-08-13 Thread Jacky Lau
Hi all!

I want to integrate Flink with Ranger in our project. I have done some
investigation of the Hive/Spark/HBase/Hadoop Ranger plugins and found that
HBase/Hive/Hadoop all expose an interface for authorization, while Spark SQL
does it by injecting Spark rules. You can find it here:
https://github.com/yaooqinn/spark-ranger.
This is my simple investigation:
https://www.yuque.com/jackylau-sc7w6/bve18l/kkhn9e, which is in Chinese. It also
contains some authentication info, which is not relevant and can be ignored.
So if Flink integrates with Ranger, we can do it with the following options:
1) Refer to the Spark Ranger code and inject rules without modifying Flink code.
But this has some problems: if we inject rules the way Spark Ranger does, we can
only authorize DML/DQL, not DDL, because Flink's plan pipeline looks like this:

SQL string -> SqlNode -> Operation -> DDL Operation -> run
                                   -> DML/DQL Operation -> RelNode (logical)
                                      -> optimizer rules -> ExecNode (physical)
                                      -> Transformation

Spark can do it because DDL is also a logical plan in Spark, which can be
matched and optimized by rules:

plan match {
  case _: ShowColumnsCommand => SHOWCOLUMNS
  case _: ShowDatabasesCommand => SHOWDATABASES
  case _: ShowFunctionsCommand => SHOWFUNCTIONS
  case _: ShowPartitionsCommand => SHOWPARTITIONS
  case _: ShowTablesCommand => SHOWTABLES
  case _ => QUERY
}
2) So I think Flink could do it as Hive does: add an interface for
authorization that users can implement themselves. Could the Flink community
add such interfaces, or plan some future work to support authorization?
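To make option 2 concrete, here is a small sketch (written in Python purely for illustration; the class and method names are hypothetical, not a proposed Flink interface) of an authorization hook that is consulted before each operation runs:

```python
class Authorizer:
    """Hypothetical hook: decide whether a user may run an operation on an object."""
    def check(self, user: str, op_type: str, obj: str) -> bool:
        raise NotImplementedError

class AclAuthorizer(Authorizer):
    """Trivial implementation backed by a set of granted (user, op, object) tuples;
    a real implementation could delegate the decision to Ranger policies instead."""
    def __init__(self, grants):
        self.grants = set(grants)

    def check(self, user, op_type, obj):
        return (user, op_type, obj) in self.grants

def execute(user, op_type, obj, authorizer):
    # The engine would call the hook here, before running any Operation.
    if not authorizer.check(user, op_type, obj):
        raise PermissionError(f"{user} is not allowed to {op_type} {obj}")
    return f"executed {op_type} on {obj}"

auth = AclAuthorizer({("alice", "SELECT", "db.t1")})
print(execute("alice", "SELECT", "db.t1", auth))  # executed SELECT on db.t1
```

Because the check happens at the Operation level rather than via optimizer rules, it covers DDL as well as DML/DQL.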
Cheers,
Jacky Lau



--
Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/


Re: request create flip permission for flink es bounded source/lookup source connector

2020-06-15 Thread Jacky Lau
Hi Robert:
 When I edit the FLIP and upload the images, it shows a prompt saying
"You'll need to ask permission to insert files here". Could you also give me
the permission to upload images to the FLIP wiki?

Robert Metzger wrote
> Hi,
> I gave you access to the Wiki!
> 
> On Fri, Jun 12, 2020 at 11:50 AM Jacky Lau 

> liuyongvs@

>  wrote:
> 
>> Hi Jack:
>>Thank you so much. My wiki name is jackylau
>>
>>
>> Jark Wu-2 wrote
>> > Hi Jacky,
>> >
>> > What's your username in wiki? So that I can give the permission to you.
>> >
>> > Best,
>> > Jark
>> >
>> > On Fri, 12 Jun 2020 at 11:38, Jacky Lau 
>>
>> > liuyongvs@
>>
>> >  wrote:
>> >
>> >> hi all:
>> >>After this simple discussion here
>> >>
>> >>
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Discussion-flink-elasticsearch-connector-supports-td42082.html#a42106
>> >> ,
>> >>and i should create i flip127 to  track this. But i don't have
>> create
>> >> flip permision.
>> >>
>> >>
>> >>
>> >> --
>> >> Sent from:
>> >> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
>> >>
>>
>>
>>
>>
>>
>> --
>> Sent from:
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
>>







Re: request create flip permission for flink es bounded source/lookup source connector

2020-06-14 Thread Jacky Lau
Hi Robert:
 Thank you so much.


Robert Metzger wrote
> Hi,
> I gave you access to the Wiki!
> 
> On Fri, Jun 12, 2020 at 11:50 AM Jacky Lau 

> liuyongvs@

>  wrote:
> 
>> Hi Jack:
>>Thank you so much. My wiki name is jackylau
>>
>>
>> Jark Wu-2 wrote
>> > Hi Jacky,
>> >
>> > What's your username in wiki? So that I can give the permission to you.
>> >
>> > Best,
>> > Jark
>> >
>> > On Fri, 12 Jun 2020 at 11:38, Jacky Lau 
>>
>> > liuyongvs@
>>
>> >  wrote:
>> >
>> >> hi all:
>> >>After this simple discussion here
>> >>
>> >>
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Discussion-flink-elasticsearch-connector-supports-td42082.html#a42106
>> >> ,
>> >>and i should create i flip127 to  track this. But i don't have
>> create
>> >> flip permision.
>> >>
>> >>
>> >>
>> >> --
>> >> Sent from:
>> >> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
>> >>
>>
>>
>> Jark Wu-2 wrote
>> > Hi Jacky,
>> >
>> > What's your username in wiki? So that I can give the permission to you.
>> >
>> > Best,
>> > Jark
>> >
>> > On Fri, 12 Jun 2020 at 11:38, Jacky Lau 
>>
>> > liuyongvs@
>>
>> >  wrote:
>> >
>> >> hi all:
>> >>After this simple discussion here
>> >>
>> >>
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Discussion-flink-elasticsearch-connector-supports-td42082.html#a42106
>> >> ,
>> >>and i should create i flip127 to  track this. But i don't have
>> create
>> >> flip permision.
>> >>
>> >>
>> >>
>> >> --
>> >> Sent from:
>> >> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
>> >>
>>
>>
>>
>>
>>
>> --
>> Sent from:
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
>>









Re: request create flip permission for flink es bounded source/lookup source connector

2020-06-12 Thread Jacky Lau
Hi Jark:
   Thank you so much. My wiki username is jackylau


Jark Wu-2 wrote
> Hi Jacky,
> 
> What's your username in wiki? So that I can give the permission to you.
> 
> Best,
> Jark
> 
> On Fri, 12 Jun 2020 at 11:38, Jacky Lau 

> liuyongvs@

>  wrote:
> 
>> hi all:
>>After this simple discussion here
>>
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Discussion-flink-elasticsearch-connector-supports-td42082.html#a42106
>> ,
>>and i should create i flip127 to  track this. But i don't have create
>> flip permision.
>>
>>
>>
>> --
>> Sent from:
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
>>









request create flip permission for flink es bounded source/lookup source connector

2020-06-11 Thread Jacky Lau
hi all:
   After the short discussion here
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Discussion-flink-elasticsearch-connector-supports-td42082.html#a42106,
   I should create FLIP-127 to track this, but I don't have permission to
create FLIPs.





Re: [Discussion] flink elasticsearch connector supports

2020-06-05 Thread Jacky Lau
hi Etienne Chauchot:
you can read https://www.jianshu.com/p/d32e17dab90c (which is in Chinese);
it shows that the slice API has poor performance in the es-hadoop project.

And I found that es-hadoop has since removed this and disables sliced scrolls
by default. You can see below, which I found in the latest es-hadoop release
notes:
 Configuration Changes
`es.input.use.sliced.partitions` is deprecated in 6.5.0, and will be removed
in 7.0.0. The default value for `es.input.max.docs.per.partition` (10)
will also be removed in 7.0.0, thus disabling sliced scrolls by default, and
switching them to be an explicitly opt-in feature.

added[5.0.0]
`es.input.max.docs.per.partition` ::
When reading from an {es} cluster that supports scroll slicing ({es} v5.0.0
and above), this parameter advises the
connector on what the maximum number of documents per input partition should
be. The connector will sample and estimate
the number of documents on each shard to be read and divides each shard into
input slices using the value supplied by
this property. This property is a suggestion, not a guarantee. The final
number of documents per partition is not
guaranteed to be below this number, but rather, they will be close to this
number. This property is ignored if you are
reading from an {es} cluster that does not support scroll slicing ({es} any
version below v5.0.0). By default, this
value is unset, and the input partitions are calculated based on the number
of shards in the indices being read.
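
The behavior described above can be illustrated with a small calculation — a sketch of the idea only, not es-hadoop's actual implementation:

```python
import math
from typing import Optional

def slices_per_shard(estimated_docs: int, max_docs_per_partition: Optional[int]) -> int:
    """If es.input.max.docs.per.partition is unset, use one partition per shard
    (slicing disabled); otherwise divide the shard into slices holding roughly
    at most that many documents."""
    if max_docs_per_partition is None:
        return 1
    return max(1, math.ceil(estimated_docs / max_docs_per_partition))

print(slices_per_shard(250_000, 100_000))  # 3 slices for this shard
print(slices_per_shard(250_000, None))     # 1: partitioning falls back to shards
```

This also shows why shards with very different document counts end up with different numbers of slices, which is the motivation for slicing in the first place.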



Jacky Lau wrote
> hi Etienne Chauchot:
> thanks for your discussion.
> for 1) we do not supprt es  unbouded source currently
> 
> for 2) RichParallelSourceFunction is used for streaming ,InputFormat is
> for
> batch
> 
> for 3)  i downloaded beam just now. and the beam es connector is also
> using
> es-hadoop. i have read the code of es-hadoop(inputsplit contains shard and
> slice. And i think it is better when diffirent shard has diffirent number
> of
> docs), which you can seed here
> .https://github.com/elastic/elasticsearch-hadoop. But the code is not
> good.
> so we do not want to reference . and you can see presto, there is also
> just
> using inputsplit with shard not contains slice
> 
> for 4) because flink es connectro has alreay using diffrent client (es 5
> for
> tranport client, es 6,7 for highlevelrest), we just  reuse it,which will
> not
> change too much code
> 
> 
> 
> --
> Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/







Re: [Discussion] flink elasticsearch connector supports

2020-06-04 Thread Jacky Lau
Thanks, Jark and Yangze.
1) we will reorganize the design doc and move it to the FLIP
2) we will support lookup and DynamicSource. The current DynamicSource
optimizations, such as SupportsLimitPushDown, are not implemented yet, so we
will do that after they are done.
Thanks for your response.


Jark Wu-2 wrote
> Thanks Jacky for starting this discussion.
> 
> The requirement of ES source has been proposed in the community many
> times. +1 for the feature from my side.
> 
> Here are my thoughts:
> 
> 1. streaming source
> As we only support bounded source for JDBC and HBase, so I think it's fine
> to have a bounded ES source.
> 
> 2. lookup source
> Have you ever thought about having ES as a lookup source just like an
> HBase
> table and lookup by index?
> I'm not sure whether it works. But my gut feeling tells me it is an
> interesting feature and there may be a need for it.
> 
> 3. DDL options
> I agree with Yangze, it is important to list what new options you want to
> add. It would be nice to organize your design
>  doc according to FLIP template (to have "Public Interface" and "Proposed
> Changes").
> 
> 4. Implement in new table source interface (FLIP-95)
> Since 1.11, we proposed a new set of table connector interfaces (FLIP-95)
> with more powerful features.
> Old table source interface will be removed in the future.
> 
> 5. DataStream source
> It would be nicer to expose a DataStream source too, and share
> implementations as much as possible.
> 
> 
> Best,
> Jark
> 
> 
> On Thu, 4 Jun 2020 at 22:07, Etienne Chauchot 

> echauchot@

>  wrote:
> 
>> Hi,
>>
>> I made the Elasticsearch connector of Apache Beam and I was thinking
>> about doing the same for Flink when I came by this discussion. I have
>> some comments regarding the design doc:
>>
>> 1. Streaming source:
>>
>> ES has data streams features but only for time series data; the aim of
>> this source is to read all kind of data. Apart from data streams,  ES
>> behaves like a database: you read the content of an index (similar to a
>> table) corresponding to the given query (similar to SQL). So, regarding
>> streaming changes, if there are changes between 2 read requests made by
>> the source, at the second the whole index (containing the change) will
>> be read another time. So, I see no way of having a regular flow of
>> documents updates (insertion, deletion, update) as we would need for a
>> streaming source. Regarding failover: I guess exactly once semantics
>> cannot be guaranteed, only at least once semantics can. Indeed there is
>> no ack mechanism on already read data. As a conclusion, IMO you are
>> right to target only batch source. Also this answers Yangze Guo's
>> question about streaming source. Question is: can a batch only source be
>> accepted as a built in flink source ?
>>
>> 2. hadoop ecosystem
>>
>> Why not use RichParallelSourceFunction ?
>>
>> 3. Splitting
>>
>> Splitting with one split = one ES shard could lead to sub-parallelism.
>> IMHO I think that what's important is the number of executors there are
>> in the Flink cluster: it is better to use
>> runtimeContext.getIndexOfThisSubtask() and
>> runtimeContext.getMaxNumberOfParallelSubtasks() to split the input data
>> using ES slice API.
>>
>> 4. Targeting ES 5, 6, 7
>>
>> In Beam I used low level REST client because it is compatible with all
>> ES versions so it allows to have the same code base for all versions.
>> But this client is very low level (String based requests). Now, high
>> level rest client exists (it was not available at the time), it is the
>> one I would use. It is also available for ES 5 so you should use it for
>> ES 5 instead of deprecated Transport client.
>>
>> Best
>>
>> Etienne Chauchot.
>>
>>
>> On 04/06/2020 08:47, Yangze Guo wrote:
>> > Hi, Jackey.
>> >
>> > Thanks for driving this discussion. I think this proposal should be a
>> > FLIP[1] since it impacts the public interface. However, as we have
>> > only some preliminary discussions atm, a design draft would be ok. But
>> > it would be better to organize your document according to [2].
>> >
>> > I've two basic questions:
>> > - Could your summarize all the public API and configurations (DDL) of
>> > the ElasticSearchTableSource?
>> > - If we want to implement ElasticSearch DataStream Source at the same
>> > time, do we need to do a lot of extra work apart from this?
>> >
>> > It also would be good if you cou

Re: [Discussion] flink elasticsearch connector supports

2020-06-04 Thread Jacky Lau
hi Etienne Chauchot:
thanks for your discussion.
for 1) we do not support an ES unbounded source currently

for 2) RichParallelSourceFunction is used for streaming; InputFormat is for
batch

for 3) i downloaded beam just now, and the beam es connector is also using
es-hadoop. i have read the code of es-hadoop (an input split contains a shard
and a slice, and i think that is better when different shards hold different
numbers of docs), which you can see here:
https://github.com/elastic/elasticsearch-hadoop. But the code is not good,
so we do not want to reference it. And if you look at presto, it also just
uses one input split per shard, without slices

for 4) because the flink es connector already uses a different client per
version (es 5 uses the transport client; es 6 and 7 use the high-level REST
client), we just reuse it, which will not change too much code
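
The shard-level splitting described in point 3 can be sketched like this (hypothetical names, not the actual connector classes): one input split per ES shard, distributed round-robin over the parallel source subtasks:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ShardSplit:
    index: str
    shard_id: int

def create_splits(index: str, num_shards: int):
    """One input split per ES shard, with no slice dimension (Presto-style)."""
    return [ShardSplit(index, s) for s in range(num_shards)]

def splits_for_subtask(splits, subtask_index: int, parallelism: int):
    """Round-robin assignment of splits to parallel source subtasks."""
    return [s for i, s in enumerate(splits) if i % parallelism == subtask_index]

splits = create_splits("logs", 5)
print(splits_for_subtask(splits, 0, 2))  # shards 0, 2, 4 go to subtask 0
```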





Re: [Discussion] flink elasticsearch connector supports

2020-06-02 Thread Jacky Lau
Hi Robert Metzger:
Thanks for your response. Could you please read this doc:
https://www.yuque.com/jackylau-sc7w6/bve18l/14a2ad5b7f86998433de83dd0f8ec067
? Is there any problem here? We are worried that we have not thought it
through. Thanks.




[Discussion] flink elasticsearch connector supports

2020-06-02 Thread Jacky Lau
Hi all!

We have started some preliminary work on the Flink Elasticsearch integration
(an ES connector for ES version 7) at Hikvision Research Institute.
It seems the integration should be thought through carefully, and we want to
contribute our code to the community.
So I think I should open a discussion thread with my initial ideas to get
some early feedback.

*Minimal background*
Some companies truly have the scenario of reading from Elasticsearch. You can
see our initial discussion here:
https://issues.apache.org/jira/browse/FLINK-16713 

*Design docs*
Here is our design doc for adapting to different ES versions such as 5, 6, 7.
Because we cannot access Google Docs, we just use a Yuque project:
https://www.yuque.com/jackylau-sc7w6/bve18l/14a2ad5b7f86998433de83dd0f8ec067

We are very much looking forward to your response.

Cheers,
Jacky Lau





Re: [Discussion] Job generation / submission hooks & Atlas integration

2020-03-29 Thread Jacky Lau
Thanks, Gyula Fóra. I have read it, and I think it lacks the Flink catalog
info; you can see the Spark Atlas project here:
https://github.com/hortonworks-spark/spark-atlas-connector



--
Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/