Re: [DISCUSS] Remove Eclipse-specific plugin configurations

2020-02-28 Thread Stephan Ewen
Flavio, do you load Flink source code into Eclipse, or develop Flink
applications in Eclipse (based on the quickstart archetypes)?

On Fri, Feb 28, 2020 at 4:10 PM Chesnay Schepler  wrote:

> What do you have to change it to?
>
> What happens if you just remove it completely?
>
> On 28/02/2020 16:08, Flavio Pompermaier wrote:
> > I use Eclipse, but the stuff added in the pom.xml to improve the
> > out-of-the-box experience is pretty useless; I always have to change it
> >
> > On Fri, Feb 28, 2020 at 4:01 PM Chesnay Schepler 
> wrote:
> >
> >> Hello,
> >>
> >> in various maven pom.xml we have some plugin definitions exclusively to
> >> increase support for the Eclipse IDE.
> >>
> >>   From what I have heard developing Flink is not really possible with
> >> Eclipse (we explicitly recommend IntelliJ in our documentation); I'm
> >> not aware of any committer using it at least.
> >>
> >> Hence I wanted to ask here to find out whether anyone is using Eclipse.
> >>
> >> If not, then I would like to remove this stuff from the poms in an effort
> >> to reduce noise, and to reduce issues when using other IDEs (like what
> >> happened for vscode-java in FLINK-16150).
> >>
>
>


[jira] [Created] (FLINK-16348) Add commas to large numeric accumulators

2020-02-28 Thread Kboh (Jira)
Kboh created FLINK-16348:


 Summary: Add commas to large numeric accumulators
 Key: FLINK-16348
 URL: https://issues.apache.org/jira/browse/FLINK-16348
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Reporter: Kboh


Make large numeric accumulator values easier to read.

 

E.g. 273232 -> 273,232

 

[https://github.com/apache/flink/pull/10988]
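
For reference, the requested grouping is plain locale-aware integer formatting. 
The actual change lives in the web frontend (see the PR above); the following Java 
sketch just illustrates the equivalent behaviour:

{code:java}
import java.text.NumberFormat;
import java.util.Locale;

public class AccumulatorFormatting {
    public static void main(String[] args) {
        long value = 273232;
        // Locale-aware digit grouping: prints 273,232 for Locale.US
        System.out.println(NumberFormat.getIntegerInstance(Locale.US).format(value));
        // Same result via a format string: %,d inserts grouping separators
        System.out.println(String.format(Locale.US, "%,d", value));
    }
}
{code}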



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] FLIP-103: Better TM/JM Log Display

2020-02-28 Thread Till Rohrmann
Hi Yadong,

Gary might be able to help with this effort.

Cheers,
Till

On Fri, Feb 28, 2020 at 11:08 AM Yadong Xie  wrote:

> Hi Till
>
> In addition to the front end, the FLIP-103 would also include some rest API
> changes.
> Could you assign one PMC or committer to help us review the pull request
> and drive this FLIP forward?
> We will start to create PR as soon as possible, thanks!
>
> Till Rohrmann  于2020年2月25日周二 下午6:03写道:
>
> > late +1 (binding)
> >
> > Cheers,
> > Till
> >
> > On Tue, Feb 25, 2020 at 4:46 AM Yadong Xie  wrote:
> >
> > > Thanks all for the votes.
> > >
> > > So far, we have
> > >
> > > - 4 binding +1 votes (Kurt, Jark, jincheng, Zhu Zhu)
> > > - 5 non-binding +1 votes (Yang Wang, Xintong Song, lining, Yangze,
> > zhenya)
> > > - No -1 votes
> > >
> > > The voting time has passed and there are enough +1 votes to consider the
> > > FLIP-103 approved.
> > > Thank you all.
> > >
> > > Zhu Zhu  于2020年2月24日周一 下午3:56写道:
> > >
> > > > +1 (binding)
> > > >
> > > > It can be very helpful to check GC state when GC logging is enabled.
> > > >
> > > > Thanks,
> > > > Zhu Zhu
> > > >
> > > > zhenya Sun  于2020年2月24日周一 下午3:51写道:
> > > >
> > > > > +1
> > > > >
> > > > >
> > > > > able-sun
> > > > >
> > > > >
> > > > > On 02/24/2020 15:45, jincheng sun wrote:
> > > > > +1(binding) Thank you for driving this Yadong!
> > > > >
> > > > > Best,
> > > > > Jincheng
> > > > >
> > > > > Yangze Guo  于2020年2月21日周五 下午3:08写道:
> > > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > It will make debugging more convenient and hence improve usability.
> > > > >
> > > > > Best,
> > > > > Yangze Guo
> > > > >
> > > > > On Fri, Feb 21, 2020 at 2:50 PM Yang Wang 
> > > wrote:
> > > > >
> > > > > Hi Jark,
> > > > >
> > > > > The GC log is not enabled by default. If users want it, they should
> > > > > configure it via the following config option.
> > > > > env.java.opts.jobmanager: -XX:+PrintGCDetails
> -XX:+PrintGCDateStamps
> > > > > -Xloggc:/opt/flink/log/jobmanager-gc.log -XX:+UseGCLogFileRotation
> > > > > -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=512M
> > > > >
> > > > > Best,
> > > > > Yang
> > > > >
> > > > > Jark Wu  于2020年2月21日周五 下午2:19写道:
> > > > >
> > > > > +1 (binding)
> > > > >
> > > > > A minor question: I didn't find the GC log in the POC web. Is it
> only
> > > > > generated when some configuration is enabled?
> > > > >
> > > > > Best,
> > > > > Jark
> > > > >
> > > > > On Fri, 21 Feb 2020 at 11:40, Kurt Young  wrote:
> > > > >
> > > > > +1 (binding)
> > > > >
> > > > >
> > > > > On Fri, Feb 21, 2020 at 11:25 AM lining jing <
> jinglini...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > +1 (non-binding)
> > > > > It lists all log files, so the user can see the GC log.
> > > > >
> > > > > Xintong Song  于2020年2月21日周五 上午10:44写道:
> > > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > I like the ideas of having a list of all log files, and make them
> > > > > downloadable.
> > > > >
> > > > > Thank you~
> > > > >
> > > > > Xintong Song
> > > > >
> > > > >
> > > > >
> > > > > On Fri, Feb 21, 2020 at 10:24 AM Yang Wang <
> > > > > danrtsey...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > +1 non-binding
> > > > >
> > > > > The log file list is shown in the new design. I like the idea,
> > > > > since the user-defined logs (e.g. my-log.file) can be found here now.
> > > > >
> > > > >
> > > > > Best,
> > > > > Yang
> > > > >
> > > > > Yadong Xie  于2020年2月20日周四 下午5:56写道:
> > > > >
> > > > > Hi all
> > > > >
> > > > > I want to start the vote for FLIP-103, which proposes to
> > > > > provide
> > > > > better
> > > > > user log display UI in task manager and job manager.
> > > > >
> > > > > To help everyone better understand the proposal, we spent
> > > > > some
> > > > > efforts
> > > > > on
> > > > > making an online POC
> > > > >
> > > > > Task manager:
> > > > >
> > > > > previous web:
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> http://101.132.122.69:8081/#/task-manager/6df6c5f37b2bff125dbc3a7388128559/logs
> > > > > POC web:
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> http://101.132.122.69:8081/web/#/task-manager/6df6c5f37b2bff125dbc3a7388128559/log
> > > > >
> > > > > Job manager:
> > > > >
> > > > > previous web: http://101.132.122.69:8081/#/job-manager/logs
> > > > > POC web: http://101.132.122.69:8081/web/#/job-manager/log
> > > > >
> > > > >
> > > > > The vote will last for at least 72 hours, following the
> > > > > consensus
> > > > > voting
> > > > > process.
> > > > >
> > > > > FLIP wiki:
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=147427143
> > > > >
> > > > > Discussion thread:
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> 

Re: [VOTE] FLIP-98: Better Back Pressure Detection

2020-02-28 Thread Till Rohrmann
I'm pulling in Gary who could help with this effort.

Cheers,
Till

On Fri, Feb 28, 2020 at 11:08 AM Yadong Xie  wrote:

> Hi Till
>
> In addition to the front end, the FLIP-98 would also include some rest API
> changes.
> Could you assign one PMC or committer to help us review the pull request
> and drive this FLIP forward?
> We will start to create PR as soon as possible, thanks!
>
> Till Rohrmann  于2020年2月25日周二 下午6:12写道:
>
> > late +1 (binding).
> >
> > Cheers,
> > Till
> >
> > On Tue, Feb 25, 2020 at 4:55 AM Yadong Xie  wrote:
> >
> > > Thanks all for the votes.
> > >
> > > So far, we have
> > >
> > > - 3 binding +1 votes (Kurt, Jark, zhijiang)
> > > - 4 non-binding +1 votes (Congxian, Wangyang, Benchao, Yun Gao)
> > > - No -1 votes
> > >
> > > The voting time has passed and there are enough +1 votes to consider the
> > > FLIP-98 approved.
> > > Thank you all.
> > >
> > > Yun Gao  于2020年2月24日周一 上午10:27写道:
> > >
> > > > +1 (non-binding)
> > > >   This should make tuning back pressure easier, which is one of the
> > > > most common problems users encounter.
> > > >
> > > > Best,
> > > >Yun
> > > >
> > > >
> > > > --
> > > > From:Benchao Li 
> > > > Send Time:2020 Feb. 24 (Mon.) 08:32
> > > > To:dev 
> > > > Cc:Zhijiang 
> > > > Subject:Re: [VOTE] FLIP-98: Better Back Pressure Detection
> > > >
> > > > +1 (non-binding)
> > > >
> > > > It's really helpful!
> > > >
> > > > Yang Wang  于2020年2月21日周五 下午2:44写道:
> > > >
> > > > > +1 non-binding
> > > > >
> > > > > It is really straightforward.
> > > > >
> > > > > Best,
> > > > > Yang
> > > > >
> > > > > Jark Wu  于2020年2月21日周五 下午1:52写道:
> > > > >
> > > > > > +1 (binding)
> > > > > >
> > > > > > On Fri, 21 Feb 2020 at 11:51, Kurt Young 
> wrote:
> > > > > >
> > > > > > > +1 (binding)
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Feb 21, 2020 at 1:09 AM Zhijiang <
> > > wangzhijiang...@aliyun.com
> > > > > > > .invalid>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > +1 (binding).
> > > > > > > > It seems clearer and more direct to highlight the back-pressured
> > > > > > > > vertex in the topology, which can raise the attention of users.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Zhijiang
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > --
> > > > > > > > From:Congxian Qiu 
> > > > > > > > Send Time:2020 Feb. 20 (Thu.) 21:49
> > > > > > > > To:dev@flink.apache.org 
> > > > > > > > Subject:Re: [VOTE] FLIP-98: Better Back Pressure Detection
> > > > > > > >
> > > > > > > > +1 for this improvement, it's very helpful to know whether a
> > task
> > > > is
> > > > > > > > backpressured or not.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Congxian
> > > > > > > >
> > > > > > > >
> > > > > > > > Yadong Xie  于2020年2月20日周四 下午5:34写道:
> > > > > > > >
> > > > > > > > > Hi all
> > > > > > > > >
> > > > > > > > > I want to start the vote for FLIP-98, which proposes to
> > provide
> > > > > > better
> > > > > > > > > backpressure detection experience in web UI.
> > > > > > > > >
> > > > > > > > > To help everyone better understand the proposal, we spent
> > some
> > > > > > efforts
> > > > > > > on
> > > > > > > > > making an online POC
> > > > > > > > >
> > > > > > > > > previous web:
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://101.132.122.69:8081/#/job/add9d3ea1006af15b9f872987eb3befd/overview
> > > > > > > > > POC web:
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://101.132.122.69:8081/web/#/job/add9d3ea1006af15b9f872987eb3befd/overview
> > > > > > > > >
> > > > > > > > > The vote will last for at least 72 hours, following the
> > > consensus
> > > > > > > voting
> > > > > > > > > process.
> > > > > > > > >
> > > > > > > > > FLIP wiki:
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-98%3A+Better+Back+Pressure+Detection
> > > > > > > > >
> > > > > > > > > Discussion thread:
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-75-Flink-Web-UI-Improvement-Proposal-td33540.html
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Yadong
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Benchao Li
> > > > School of Electronics Engineering and Computer Science, Peking
> > University
> > > > Tel:+86-15650713730
> > > > Email: libenc...@gmail.com; libenc...@pku.edu.cn
> > > >
> > > >
> > >
> >
>


[jira] [Created] (FLINK-16347) Remove distinction between "operatorIDs" and "userDefinedOperatorIds"

2020-02-28 Thread Stephan Ewen (Jira)
Stephan Ewen created FLINK-16347:


 Summary: Remove distinction between "operatorIDs" and 
"userDefinedOperatorIds"
 Key: FLINK-16347
 URL: https://issues.apache.org/jira/browse/FLINK-16347
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination
Reporter: Stephan Ewen


The distinction between the two (in the JobVertex / ExecutionJobVertex) still 
comes from the upgrade path when operator IDs were introduced.
We should be able to remove this by now.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16346) BlobsCleanupITCase.testBlobServerCleanupCancelledJob fails on Travis

2020-02-28 Thread Till Rohrmann (Jira)
Till Rohrmann created FLINK-16346:
-

 Summary: BlobsCleanupITCase.testBlobServerCleanupCancelledJob 
fails on Travis
 Key: FLINK-16346
 URL: https://issues.apache.org/jira/browse/FLINK-16346
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.11.0
Reporter: Till Rohrmann
 Fix For: 1.11.0


{{BlobsCleanupITCase.testBlobServerCleanupCancelledJob}} and 
{{BlobsCleanupITCase.testBlobServerCleanupFinishedJob}} fail on Travis with 

{code}
12:22:22.314 [ERROR] 
testBlobServerCleanupCancelledJob(org.apache.flink.runtime.jobmanager.BlobsCleanupITCase)
  Time elapsed: 30.037 s  <<< FAILURE!
java.lang.AssertionError: Timeout while waiting for 
/tmp/junit226171740428505862/junit6248655677631371057/blobStore-0eb47b50-41ae-4c3c-885e-4ee944b8ef38
 to become empty. Current contents: [job_bf8d556f932a0d225c1dcc60f1665a26]
at 
org.apache.flink.runtime.jobmanager.BlobsCleanupITCase.waitForEmptyBlobDir(BlobsCleanupITCase.java:276)
at 
org.apache.flink.runtime.jobmanager.BlobsCleanupITCase.testBlobServerCleanup(BlobsCleanupITCase.java:233)
at 
org.apache.flink.runtime.jobmanager.BlobsCleanupITCase.testBlobServerCleanupCancelledJob(BlobsCleanupITCase.java:141)

12:22:22.314 [ERROR] 
testBlobServerCleanupFinishedJob(org.apache.flink.runtime.jobmanager.BlobsCleanupITCase)
  Time elapsed: 30.098 s  <<< FAILURE!
java.lang.AssertionError: Timeout while waiting for 
/tmp/junit226171740428505862/junit6248655677631371057/blobStore-0eb47b50-41ae-4c3c-885e-4ee944b8ef38
 to become empty. Current contents: [job_bf8d556f932a0d225c1dcc60f1665a26]
at 
org.apache.flink.runtime.jobmanager.BlobsCleanupITCase.waitForEmptyBlobDir(BlobsCleanupITCase.java:276)
at 
org.apache.flink.runtime.jobmanager.BlobsCleanupITCase.testBlobServerCleanup(BlobsCleanupITCase.java:233)
at 
org.apache.flink.runtime.jobmanager.BlobsCleanupITCase.testBlobServerCleanupFinishedJob(BlobsCleanupITCase.java:133)
{code}

https://api.travis-ci.com/v3/job/292161105/log.txt



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16345) Computed column can not refer time attribute column

2020-02-28 Thread Leonard Xu (Jira)
Leonard Xu created FLINK-16345:
--

 Summary: Computed column can not refer time attribute column 
 Key: FLINK-16345
 URL: https://issues.apache.org/jira/browse/FLINK-16345
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.10.0
Reporter: Leonard Xu


If a computed column refers to a time attribute column, the computed column will 
lose the time attribute and cause validation to fail.
{code:java}
CREATE TABLE orders (
  order_id STRING,
  order_time TIMESTAMP(3),
  amount DOUBLE,
  amount_kg as amount * 1000,
  -- cannot select the computed column standard_ts, which is derived from column
  -- order_time that is used as WATERMARK
  standard_ts as order_time + INTERVAL '8' HOUR,
  WATERMARK FOR order_time AS order_time
) WITH (
  'connector.type' = 'kafka',
  'connector.version' = '0.10',
  'connector.topic' = 'flink_orders',
  'connector.properties.zookeeper.connect' = 'localhost:2181',
  'connector.properties.bootstrap.servers' = 'localhost:9092',
  'connector.properties.group.id' = 'testGroup',
  'connector.startup-mode' = 'earliest-offset',
  'format.type' = 'json',
  'format.derive-schema' = 'true'
);

{code}
The query `select amount_kg from orders` runs normally, but the query 
`select standard_ts from orders` throws a validation exception with the following 
message:
{noformat}
[ERROR] Could not execute SQL statement. Reason:
 java.lang.AssertionError: Conversion to relational algebra failed to preserve 
datatypes:
 validated type:
 RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" order_id, TIME 
ATTRIBUTE(ROWTIME) order_time, DOUBLE amount, DOUBLE amount_kg, TIMESTAMP(3) 
ts) NOT NULL
 converted type:
 RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" order_id, TIME 
ATTRIBUTE(ROWTIME) order_time, DOUBLE amount, DOUBLE amount_kg, TIME 
ATTRIBUTE(ROWTIME) ts) NOT NULL
 rel:
 LogicalProject(order_id=[$0], order_time=[$1], amount=[$2], amount_kg=[$3], 
ts=[$4])
 LogicalWatermarkAssigner(rowtime=[order_time], watermark=[$1])
 LogicalProject(order_id=[$0], order_time=[$1], amount=[$2], amount_kg=[*($2, 
1000)], ts=[+($1, 2880:INTERVAL HOUR)])
 LogicalTableScan(table=[[default_catalog, default_database, orders, source: 
[Kafka010TableSource(order_id, order_time, amount)]]])
 {noformat}
 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16344) Preserve nullability for nested types

2020-02-28 Thread Timo Walther (Jira)
Timo Walther created FLINK-16344:


 Summary: Preserve nullability for nested types
 Key: FLINK-16344
 URL: https://issues.apache.org/jira/browse/FLINK-16344
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Reporter: Timo Walther
Assignee: Timo Walther


Currently, FlinkTypeFactory does not always preserve nullability attributes 
when handling nested types.

E.g. a table function that returns {{ROW NOT NULL>}} 
loses information to {{ROW>}}.

Same for built-in functions such as {{COLLECT}} which results in mismatches as 
mentioned in FLINK-14042.
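
For reference, nullability is explicit in the Table API type system. A small 
sketch with the standard {{DataTypes}} factory (the field name is chosen 
arbitrarily) shows the two nested types that planning should keep distinct:

{code:java}
import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.types.DataType;

public class NestedNullability {
    public static void main(String[] args) {
        // A row whose field is declared NOT NULL...
        DataType strict = DataTypes.ROW(DataTypes.FIELD("f", DataTypes.INT().notNull()));
        // ...and the same row with a nullable field. This issue is about
        // FlinkTypeFactory collapsing the first into the second.
        DataType lenient = DataTypes.ROW(DataTypes.FIELD("f", DataTypes.INT()));
        System.out.println(strict);
        System.out.println(lenient);
    }
}
{code}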



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] FLIP-93: JDBC catalog and Postgres catalog

2020-02-28 Thread Benchao Li
+1 (non-binding)
It's a very useful feature, especially for batch jobs.

Bowen Li  于2020年2月28日周五 上午2:39写道:

> Hi all,
>
> I'd like to kick off the vote for FLIP-93 [1] to add JDBC catalog and
> Postgres catalog.
>
> The vote will last for at least 72 hours, following the consensus voting
> protocol.
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-93%3A+JDBC+catalog+and+Postgres+catalog
>
> Discussion thread:
>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-92-JDBC-catalog-and-Postgres-catalog-td36505.html
>


-- 

Benchao Li
School of Electronics Engineering and Computer Science, Peking University
Tel:+86-15650713730
Email: libenc...@gmail.com; libenc...@pku.edu.cn


Re: [DISCUSS] Remove Eclipse-specific plugin configurations

2020-02-28 Thread Chesnay Schepler

What do you have to change it to?

What happens if you just remove it completely?

On 28/02/2020 16:08, Flavio Pompermaier wrote:

I use Eclipse, but the stuff added in the pom.xml to improve the
out-of-the-box experience is pretty useless; I always have to change it

On Fri, Feb 28, 2020 at 4:01 PM Chesnay Schepler  wrote:


Hello,

in various maven pom.xml we have some plugin definitions exclusively to
increase support for the Eclipse IDE.

  From what I have heard developing Flink is not really possible with
Eclipse (we explicitly recommend IntelliJ in our documentation); I'm
not aware of any committer using it at least.

Hence I wanted to ask here to find out whether anyone is using Eclipse.

If not, then I would like to remove this stuff from the poms in an effort
to reduce noise, and to reduce issues when using other IDEs (like what
happened for vscode-java in FLINK-16150).





Re: [DISCUSS] Remove Eclipse-specific plugin configurations

2020-02-28 Thread Flavio Pompermaier
I use Eclipse, but the stuff added in the pom.xml to improve the
out-of-the-box experience is pretty useless; I always have to change it

On Fri, Feb 28, 2020 at 4:01 PM Chesnay Schepler  wrote:

> Hello,
>
> in various maven pom.xml we have some plugin definitions exclusively to
> increase support for the Eclipse IDE.
>
>  From what I have heard developing Flink is not really possible with
> Eclipse (we explicitly recommend IntelliJ in our documentation); I'm
> not aware of any committer using it at least.
>
> Hence I wanted to ask here to find out whether anyone is using Eclipse.
>
> If not, then I would like to remove this stuff from the poms in an effort
> to reduce noise, and to reduce issues when using other IDEs (like what
> happened for vscode-java in FLINK-16150).
>


[DISCUSS] Remove Eclipse-specific plugin configurations

2020-02-28 Thread Chesnay Schepler

Hello,

in various maven pom.xml we have some plugin definitions exclusively to 
increase support for the Eclipse IDE.


From what I have heard developing Flink is not really possible with 
Eclipse (we explicitly recommend IntelliJ in our documentation); I'm 
not aware of any committer using it at least.


Hence I wanted to ask here to find out whether anyone is using Eclipse.

If not, then I would like to remove this stuff from the poms in an effort 
to reduce noise, and to reduce issues when using other IDEs (like what 
happened for vscode-java in FLINK-16150).




[jira] [Created] (FLINK-16343) Failed to read a table with watermark in batch mode

2020-02-28 Thread Kurt Young (Jira)
Kurt Young created FLINK-16343:
--

 Summary: Failed to read a table with watermark in batch mode
 Key: FLINK-16343
 URL: https://issues.apache.org/jira/browse/FLINK-16343
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.10.0
Reporter: Kurt Young


We can just ignore the watermark in batch mode. 

cc [~jark]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-76: Unaligned checkpoints

2020-02-28 Thread Yu Li
Thanks for the further feedback Zhijiang and Piotr! I think this is a great
feature and will watch the progress. Please also feel free to involve me in
discussions/reviews on the state-related part. Thanks.

Best Regards,
Yu


On Thu, 27 Feb 2020 at 23:24, Piotr Nowojski  wrote:

> Hi Yu,
>
> Re 4.
>
> Dynamic switching between unaligned and aligned checkpoints based on some
> kind of thresholds (timeout, or checkpoint size) is definitely one of the
> very first improvement that we want to tackle after implementing the MVP.
> Depending on the time constraints, dynamic switching can make to 1.11 or
> not. It’s hard to tell for me at this point of time.
>
> Piotrek
>
> > On 26 Feb 2020, at 15:59, Zhijiang 
> wrote:
> >
> > Thanks for the further explanations, Yu!
> >
> > 1. The inflight buffer spilling process is indeed handled
> asynchronously. While a buffer has not finished spilling, it will not be
> recycled for reuse.
> > Your understanding is right. I guess I misunderstood your previous
> concern about additional memory consumption from the perspective of buffer
> usage.
> > My point about no additional memory consumption is from the perspective of
> the total network memory size, which would not increase as a result.
> >
> > 2. We treat the inflight buffers as input state, which is
> equivalent to existing operator state, and try to make use of all the
> existing mechanisms for
> > state handling and assignment during recovery. So I guess local
> recovery should be a similar case. I will think through whether it
> needs some special
> > work around local recovery, and then clarify it in the FLIP after
> we reach an agreement internally. BTW, this FLIP has not been finalized yet.
> >
> > 3. Yes, the previous proposal is for measuring how many inflight buffers
> are to be spilled, which refers to the data size if we really take that
> approach. I think the proposed options
> > in the FLIP are initial thoughts covering various possibilities. Which way
> we decide to take for the first version, I guess we still need to
> finalize before voting.
> >
> > 4. I think the requirements or scenarios you mentioned probably exist for
> users. Actually, we have not finalized the way of switching
> to unaligned checkpoints yet.
> > Anyway, we could provide an option for users to try out this feature at
> the beginning, although it might not be the most ideal one. Another input
> is that we know the motivation
> > of unaligned checkpoints comes from backpressure scenarios, but they
> might also perform well without backpressure, even shortening the
> checkpoint duration without
> > obvious performance regression in our previous POC testing. So
> backpressure might not be the only factor for switching to the unaligned way in
> practice, I guess. Anyway, your
> > inputs are valuable for us to make the final decision.
> >
> > Best,
> > Zhijiang
> >
> >
> >
> >
> > --
> > From:Yu Li 
> > Send Time:2020 Feb. 26 (Wed.) 15:59
> > To:dev ; Zhijiang 
> > Subject:Re: [DISCUSS] FLIP-76: Unaligned checkpoints
> >
> > Hi Zhijiang,
> >
> > Thanks for the quick reply!
> >
> > For the 1st question, please allow me to confirm, that when doing
> asynchronous checkpointing, disk spilling should happen in background in
> parallel with receiving/sending new data, or else it would become
> synchronous, right? Based on such assumption, some copy-on-write like
> mechanism would be necessary to make sure the new updates won't modify the
> to-be-checkpointed data, and this is where the additional memory
> consumption comes from.
> >
> > About point #2, I suggest we write it down in the FLIP document about
> local recovery support (if reach a consensus here), to make sure it won't
> be neglected in later implementation (I believe there're still some work to
> do following existing local recovery mechanism). What do you think?
> >
> > For the 3rd topic, do you mean UNALIGNED_WITH_MAX_INFLIGHT_DATA would
> set some kind of threshold about "how much in-flight data to checkpoint"?
> If so, could you further clarify the measurement (data size? record number?
> others?) since there seems to be no description in the current FLIP doc?
> This is somewhat different from my understanding after reading the FLIP...
> >
> > Regarding question #4, I have no doubt that the new unaligned checkpoint
> mechanism could make fast checkpoint possible, at the cost of more memory,
> network bandwidth and disk space consumption. However, (correct me if I'm
> wrong) for users who are satisfied with the existing aligned checkpoint
> interval, taking the constant cost to prevent delayed checkpoint during
> back pressure - a relatively low frequency event - may not be that
> pragmatic.
> >
> > Best Regards,
> > Yu
> >
> > On Wed, 26 Feb 2020 at 15:07, Zhijiang 
> wrote:
> > Hi Yu,
> >
> > Thanks for your attention to this FLIP and for sharing your thoughts! Let me try
> to answer some below 

[jira] [Created] (FLINK-16342) Remove mocking from DatadogHttpClientTest

2020-02-28 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-16342:


 Summary: Remove mocking from DatadogHttpClientTest
 Key: FLINK-16342
 URL: https://issues.apache.org/jira/browse/FLINK-16342
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Metrics, Tests
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.11.0


The {{DatadogHttpClientTest}} uses some nasty powermock features, like 
suppressing methods and mocking static methods.

There are simple workarounds for these that we should employ instead.
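
For illustration, the usual workaround of this kind is to make the static 
dependency injectable instead of mocking it. The {{Clock}} interface and the 
class below are hypothetical, not the actual reporter API:

{code:java}
// Sketch: hide a static call such as System.currentTimeMillis() behind an
// injectable interface so tests can pass a fixed value without PowerMock.
interface Clock {
    long currentTimeMillis();
}

class TimestampProvider {
    private final Clock clock;

    TimestampProvider(Clock clock) {
        this.clock = clock;
    }

    long timestampSeconds() {
        return clock.currentTimeMillis() / 1000;
    }
}

class Example {
    public static void main(String[] args) {
        // Production code uses the real time source:
        TimestampProvider prod = new TimestampProvider(System::currentTimeMillis);
        // Tests use a fixed clock instead of mocking a static method:
        TimestampProvider test = new TimestampProvider(() -> 1_582_887_600_000L);
        System.out.println(test.timestampSeconds()); // prints 1582887600
    }
}
{code}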



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] FLIP-84: Improve & Refactor API of TableEnvironment

2020-02-28 Thread Benchao Li
+1 (non-binding)

Jark Wu  于2020年2月28日周五 下午5:11写道:

> +1 from my side.
>
> Best,
> Jark
>
> On Fri, 28 Feb 2020 at 15:07, kant kodali  wrote:
>
> > Nice!!
> >
> > Sent from my iPhone
> >
> > > On Feb 27, 2020, at 9:03 PM, godfrey he  wrote:
> > >
> > > Hi kant, yes. We hope to deprecate the methods which confuse users
> ASAP.
> > >
> > > Bests,
> > > godfrey
> > >
> > > kant kodali  于2020年2月28日周五 上午11:17写道:
> > >
> > >> Is this targeted towards Flink 1.11?
> > >>
> > >>> On Thu, Feb 27, 2020 at 6:32 PM Kurt Young  wrote:
> > >>>
> > >>> +1 (binding)
> > >>>
> > >>> Best,
> > >>> Kurt
> > >>>
> > >>>
> >  On Fri, Feb 28, 2020 at 9:15 AM Terry Wang 
> > wrote:
> > >>>
> >  I look through the whole design and it’s a big improvement of
> > usability
> > >>> on
> >  TableEnvironment’s api.
> > 
> >  +1 (non-binding)
> > 
> >  Best,
> >  Terry Wang
> > 
> > 
> > 
> > > 2020年2月27日 14:59,godfrey he  写道:
> > >
> > > Hi everyone,
> > >
> > > I'd like to start the vote of FLIP-84[1], which proposes to
> deprecate
> >  some
> > > old APIs and introduce some new APIs in TableEnvironment. This FLIP
> > >> is
> > > discussed and reached consensus in the discussion thread[2].
> > >
> > > The vote will be open for at least 72 hours. Unless there is an
> >  objection,
> > > I will try to close it by Mar 1, 2020 07:00 UTC if we have received
> > > sufficient votes.
> > >
> > >
> > > [1]
> > >
> > 
> > >>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-84%3A+Improve+%26+Refactor+API+of+TableEnvironment
> > >
> > > [2]
> > >
> > 
> > >>>
> > >>
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-84-Improve-amp-Refactor-API-of-Table-Module-td34537.html
> > >
> > >
> > > Bests,
> > > Godfrey
> > 
> > 
> > >>>
> > >>
> >
>


-- 

Benchao Li
School of Electronics Engineering and Computer Science, Peking University
Tel:+86-15650713730
Email: libenc...@gmail.com; libenc...@pku.edu.cn


Re: [DISCUSS] FLIP-84: Improve & Refactor execute/sqlQuery/sqlUpdate APIS of TableEnvironment

2020-02-28 Thread Benchao Li
Hi godfrey,

Thanks for your explanation.

Do we need to clarify this in the FLIP? Maybe this confuses other users as
well.

godfrey he  于2020年2月28日周五 下午4:54写道:

> Hi Benchao,
>
> > I have one question about this FLIP:
> > executeStatement  accepts DML, what if it's a streaming DML ?
> >does it submit the job to cluster directly and blocks forever? what's
> > the behavior for the next statements?
> `executeStatement` is a synchronous method; it executes the statement when
> called and returns the result once the job is finished.
> We will introduce asynchronous method like `executeStatementAsync` in the
> future.
>
> > nit: there's a typo in "the table describing the result for each kind of
> > statement", "*Result Scheam" -> "Result Schema"*
> Thanks for the reminding, I will fix it now.
>
> Bests,
> Godfrey
>
> Benchao Li  于2020年2月28日周五 下午4:00写道:
>
> > Hi Terry,
> >
> > Thanks for the propose, and sorry for joining the party late.
> >
> > I have one question about this FLIP:
> > executeStatement  accepts DML, what if it's a streaming DML ?
> > does it submit the job to cluster directly and blocks forever? what's
> > the behavior for the next statements?
> >
> > nit: there's a typo in "the table describing the result for each kind of
> > statement", "*Result Scheam" -> "Result Schema"*
> >
> >
> > godfrey he  于2020年2月18日周二 下午4:41写道:
> >
> > > Thanks Kurt and Jark for the explanation, I now also think we should make
> the
> > > TableEnvironment interface more stable and should not change the
> "sqlQuery"
> > > method and "from" method.
> > >
> > > Hi Jingsong. Regarding the "DmlBatch", I totally agree with the
> advantages
> > > of the "addBatch" method. However, there are two more questions to
> > solve:
> > > one is how users write multi-sink programs in the Table API, and another
> > is
> > > how users explain multi-sink programs in both SQL and the Table API.
> > > Currently, the "DmlBatch" class can solve those questions. (The main
> > > disadvantage is inconsistency with the current interface.)
> > >
> > > Bests,
> > > godfrey
> > >
> > > Jingsong Li  于2020年2月15日周六 下午9:09写道:
> > >
> > > > Hi Kurt and Godfrey,
> > > >
> > > > Thank you for your explanation.
> > > >
> > > > Regarding to the "DmlBatch",
> > > > I see there are some description for JDBC Statement.addBatch in the
> > > > document.
> > > > What do you think about introducing "addBatch" to the TableEnv
> instead
> > of
> > > > introducing a new class?
> > > > The advantage is:
> > > > - Consistent with JDBC statement.
> > > > - Consistent with the current interface; what we need to do is just modify the
> > > method
> > > > name.
> > > >
> > > > Best,
> > > > Jingsong Lee
> > > >
> > > >
> > > > On Sat, Feb 15, 2020 at 4:48 PM Kurt Young  wrote:
> > > >
> > > > > I don't think we should change `from` to `fromCatalog`, especially since
> > > `from`
> > > > > was just
> > > > > introduced in 1.10.
> only
> > > > when
> > > > > necessary,
> > > > > e.g. the semantic is broken or confusing. So I'm +1 to keep
> > `sqlQuery`
> > > as
> > > > > it is.
> > > > >
> > > > > Best,
> > > > > Kurt
> > > > >
> > > > >
> > > > > On Sat, Feb 15, 2020 at 3:59 PM Jark Wu  wrote:
> > > > >
> > > > > > Thanks Kurt and Godfrey for the explanation,
> > > > > >
> > > > > > It makes sense to me that renaming `from(tableName)` to
> > > > > > `fromCatalog(tableName)`.
> > > > > > However, I still think `sqlQuery(query)` is clear and works well.
> > Is
> > > it
> > > > > > necessary to change it?
> > > > > >
> > > > > > We removed `sql(query)` and introduced `sqlQuery(query)`, we
> > removed
> > > > > > `scan(tableName)` and introduced `from(tableName)`,
> > > > > > and now we want to remove them again. Users will feel like the
> > > > interface
> > > > > is
> > > > > > very unstable, which really frustrates users.
> > > > > > I think we should be cautious to remove interface and only when
> it
> > is
> > > > > > necessary.
> > > > > >
> > > > > > Best,
> > > > > > Jark
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, 13 Feb 2020 at 20:58, godfrey he 
> > > wrote:
> > > > > >
> > > > > > > hi kurt,jark,jingsong
> > > > > > >
> > > > > > > Regarding to "fromQuery", I agree with kurt. In addition, I
> think
> > > > > `Table
> > > > > > > from(String tableName)` should be renamed to `Table
> > > > fromCatalog(String
> > > > > > > tableName)`.
> > > > > > >
> > > > > > > Regarding to the "DmlBatch", DML contains "INSERT", "UPDATE",
> > > > "DELETE",
> > > > > > and
> > > > > > > they can be executed in a same batch in the future. So we can
> add
> > > > > > > "addUpdate" method and "addDelete" method to support them.
> > > > > > >
> > > > > > > Regarding to the "Inserts addInsert", maybe we can add a
> > > > > > "DmlBatchBuilder".
> > > > > > >
> > > > > > > open to more discussion
> > > > > > >
> > > > > > > Best,
> > > > > > > godfrey
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Kurt Young  于2020年2月13日周四 下午4:56写道:
> > 
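To make the API shapes under discussion concrete, here is a rough Java sketch.
FLIP-84 was still being revised in this thread, so `executeStatement`, `DmlBatch`,
`createDmlBatch` and `addInsert` are hypothetical names taken from the discussion
above, not from a released Flink version:

    // Hypothetical usage, based only on the method names discussed in this thread.
    static void example(TableEnvironment tEnv) {
        // Synchronous execution: blocks and returns the result once the job finishes.
        tEnv.executeStatement("INSERT INTO sink SELECT id, name FROM src");

        // The JDBC-Statement-like multi-sink batch being debated:
        DmlBatch batch = tEnv.createDmlBatch();
        batch.addInsert("INSERT INTO sink1 SELECT * FROM src");
        batch.addInsert("INSERT INTO sink2 SELECT * FROM src");
        batch.execute();
    }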

[jira] [Created] (FLINK-16341) Remove DatadogReporter#DatadogHttpRequest

2020-02-28 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-16341:


 Summary: Remove DatadogReporter#DatadogHttpRequest
 Key: FLINK-16341
 URL: https://issues.apache.org/jira/browse/FLINK-16341
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Metrics
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.11.0


{{DatadogReporter#DatadogHttpRequest}} is just a wrapper around {{DSeries}} 
that is simply unnecessary.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16340) Expose stop with savepoint feature in the webUI

2020-02-28 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-16340:
--

 Summary: Expose stop with savepoint feature in the webUI
 Key: FLINK-16340
 URL: https://issues.apache.org/jira/browse/FLINK-16340
 Project: Flink
  Issue Type: New Feature
  Components: Runtime / Web Frontend
Affects Versions: 1.10.0, 1.9.0
Reporter: Piotr Nowojski


[As requested by 
user.|https://lists.apache.org/thread.html/r96c1dec19f0754959f19293306a6c42f296cd127cb5ff84a1a51aeaf%40%3Cuser.flink.apache.org%3E]

In Flink 1.9 the stop button was removed from the webUI, as the feature was not 
working properly. But since then Flink has provided a stop-with-savepoint feature. We 
could re-introduce this functionality in the webUI.
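
For context, stop-with-savepoint is already reachable through the REST API that a 
webUI button would call. A hedged sketch (host, job id, and savepoint path are made 
up; the exact request fields should be checked against the REST API docs):

{code:java}
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class StopWithSavepoint {
    public static void main(String[] args) throws Exception {
        // POST /jobs/:jobid/stop triggers stop-with-savepoint (values below are made up).
        URL url = new URL("http://localhost:8081/jobs/bf8d556f932a0d225c1dcc60f1665a26/stop");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        String body = "{\"targetDirectory\": \"file:///tmp/savepoints\", \"drain\": false}";
        try (OutputStream os = conn.getOutputStream()) {
            os.write(body.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("HTTP " + conn.getResponseCode()); // expect a 2xx on success
    }
}
{code}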



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16339) DatadogReporter isn't logging configuration options

2020-02-28 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-16339:


 Summary: DatadogReporter isn't logging configuration options
 Key: FLINK-16339
 URL: https://issues.apache.org/jira/browse/FLINK-16339
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Metrics
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.11.0


To better detect wrongly configured reporter options, it is customary to 
log the configuration options that are used.

The datadog reporter is not doing that at the moment, and this should be fixed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16338) Log series size in DatadogMetricReporter

2020-02-28 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-16338:


 Summary: Log series size in DatadogMetricReporter
 Key: FLINK-16338
 URL: https://issues.apache.org/jira/browse/FLINK-16338
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Metrics
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.11.0


For easier debugging of cases where metrics are missing it is useful to log the 
size of the series (=reported metrics) after reporting.

With this, it is easier to determine whether Flink is reporting fewer metrics 
than expected.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Migrate build infrastructure from Travis CI to Azure Pipelines

2020-02-28 Thread Robert Metzger
Quick update on this effort: Since yesterday, I'm experimenting with
running the end to end tests with each pull request and "master" push.
I hope that this helps to uncover issues earlier (without waiting for the
nightly test execution)
The tests run for almost 3 hours, so the overall build status will remain
"PENDING" for quite a while. You should still have the regular compile /
test results quicker (depending on the time of day).
We might run into capacity issues with the end to end test execution for
each PR. I'll be closely monitoring this and report back.

In general, please let me know if you have any problems with the new
CI setup.
For test failures, I'm happy to fix any issues caused by the build system;
just file a ticket for the new "Build System / Azure Pipelines" component.




On Mon, Feb 17, 2020 at 12:23 PM Robert Metzger  wrote:

> @Leonard: On Azure, I'm not splitting the execution of the end to end
> tests anymore. We won't have the overhead of compiling the same profile
> multiple times anymore.
>
>
> @all: We have recently merged a first version of the Azure configuration
> files to Flink [1]. This will allow us to build pull requests with all the
> additional checks we had in place for Travis as well.
> In the next few days, I'm going to build pushes and the nightly crons on
> Azure as well.
>
> From now on, you can set up Azure Pipelines for your own Flink fork as
> well, and execute end to end tests there quite easily [2].
> I'll be closely monitoring the new setup in the coming days. Expect some
> smaller issues while not all pull requests have my changes (at some point,
> I will change a configuration in Azure, which will break builds that do not
> have my changes)
> Once Azure is stable, and we have the same features as the Travis build,
> we'll stop processing builds on Travis.
>
>
> [1] https://github.com/apache/flink/pull/10976
> [2]
> https://cwiki.apache.org/confluence/display/FLINK/%5Bpreview%5D+Azure+Pipelines#id-[preview]AzurePipelines-Runningendtoendtests:
>
> On Mon, Dec 9, 2019 at 2:16 PM Leonard Xu  wrote:
>
>> +1 for the migration.
>> *10 parallel builds with 300-minute timeouts* is very useful for tasks
>> that take a long time, like e2e tests.
>> And in Travis, it looks like we compile the entire project for every cron task
>> even if they use the same profile, e.g.:
>>  `name: e2e - misc - hadoop 2.8
>>   name: e2e - ha - hadoop 2.8
>>   name: e2e - sticky - hadoop 2.8
>>   name: e2e - checkpoints - hadoop 2.8
>>   name: e2e - container - hadoop 2.8
>>   name: e2e - heavy - hadoop 2.8
>>   name: e2e - tpcds - hadoop 2.8`
>> We will compile the entire project with profile `hadoop 2.8` 7 times, and
>> every task will take about 25 minutes.
>> @robert @chesnay Should we consider to compile once for multi cron task
>> which have same profile in the new Azure Pipelines?
>>
>> Best,
>> Leonard Xu
>>
>> > On Dec 9, 2019, at 11:57, Congxian Qiu  wrote:
>> >
>> > +1 for migrating to Azure Pipelines, as this gives shorter build times
>> > and faster response.
>> >
>> > Best,
>> > Congxian
>> >
>> >
>> > Xiyuan Wang  于2019年12月9日周一 上午10:13写道:
>> >
>> >> Hi Robert,
>> >>  Thanks for bringing up this topic. The 2 ARM machines (16 cores) which I
>> >> donated are just for the POC test. We (Huawei) can donate more once we move to
>> >> the official Azure pipeline. :)
>> >>
>> >> Robert Metzger  于2019年12月6日周五 上午3:25写道:
>> >>
>> >>> Thanks for your comments Yun.
>> >>> If there's strong support for idea 2, it would actually make my
>> >>> life easier: the migration would be easier to do.
>> >>>
>> >>> I also noticed that the uploads to transfer.sh were broken, but this
>> >> should
>> >>> be fixed in the "rmetzger.flink" builds (coming from rmetzger/flink).
>> The
>> >>> builds in "flink-ci.flink" (coming from flink-ci/flink) might have
>> >> troubles
>> >>> with transfer.sh.
>> >>>
>> >>>
>> >>> On Thu, Dec 5, 2019 at 5:50 PM Yun Tang  wrote:
>> >>>
>>  Hi Robert
>> 
>>  Really exciting to see this new, more powerful CI tool to get rid of the
>>  50-minute limit of the Travis CI free account.
>> 
>>  After reading the wiki, I support idea 2 of AZP-setup version-2.
>> 
>>  However, after I dug into some failing builds at
>>  https://dev.azure.com/rmetzger/Flink/_build , I found we cannot view the
>>  logs of some IT cases which previously would be uploaded by travis_watchdog to
>>  transfer.sh.
>>  I think this feature is also easy to implement in AZP, right?
>> 
>>  Best
>>  Yun Tang
>> 
>>  On 12/6/19, 12:19 AM, "Robert Metzger"  wrote:
>> 
>> I've created a first draft of my plans in the wiki:
>> 
>> 
>> >>>
>> >>
>> https://cwiki.apache.org/confluence/display/FLINK/%5Bpreview%5D+Azure+Pipelines
>>  .
>> I'm looking forward to your comments.

[jira] [Created] (FLINK-16337) Add RelNodes and Rules for vectorized Python UDF execution

2020-02-28 Thread Dian Fu (Jira)
Dian Fu created FLINK-16337:
---

 Summary: Add RelNodes and Rules for vectorized Python UDF execution
 Key: FLINK-16337
 URL: https://issues.apache.org/jira/browse/FLINK-16337
 Project: Flink
  Issue Type: Sub-task
  Components: API / Python
Reporter: Dian Fu
Assignee: Dian Fu
 Fix For: 1.11.0


As the title describes, the aim of this JIRA is to add RelNodes and Rules for 
vectorized Python UDF execution. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16336) Support new type inference for temporal table functions

2020-02-28 Thread Timo Walther (Jira)
Timo Walther created FLINK-16336:


 Summary: Support new type inference for temporal table functions
 Key: FLINK-16336
 URL: https://issues.apache.org/jira/browse/FLINK-16336
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API, Table SQL / Planner
Reporter: Timo Walther
Assignee: Timo Walther


Temporal table functions have not been updated to the new type inference yet.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16335) vertex timeline should display all status other than the last one

2020-02-28 Thread Yadong Xie (Jira)
Yadong Xie created FLINK-16335:
--

 Summary: vertex timeline should display all status other than the 
last one
 Key: FLINK-16335
 URL: https://issues.apache.org/jira/browse/FLINK-16335
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Affects Versions: 1.11.0
Reporter: Yadong Xie


The current design of the vertex timeline in the web UI follows the old UI logic.

The vertex timeline only uses the state (the latest state), start-time, duration, 
and end-time to display the vertex status; all data between those timestamps is 
ignored.

 

Change the vertex timeline to display all statuses according to their timestamps.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16334) flink-sql kafka-connector support ignore the invalid data during parsing bytes to json row

2020-02-28 Thread roncenzhao (Jira)
roncenzhao created FLINK-16334:
--

 Summary: flink-sql kafka-connector support ignore the invalid data 
during parsing bytes to json row
 Key: FLINK-16334
 URL: https://issues.apache.org/jira/browse/FLINK-16334
 Project: Flink
  Issue Type: Wish
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Affects Versions: 1.10.0
 Environment: flink1.10+kafka+json
Reporter: roncenzhao


We found that if we create a table like this:

 
{code:java}
CREATE TABLE MyUserTable (
  id BIGINT,
  name STRING
) WITH (
  'connector.type' = 'kafka',
  'connector.version' = 'universal',
  'connector.topic' = 'test_topic',
  'connector.properties.bootstrap.servers' = 'xxx',
  'connector.properties.zookeeper.connect' = 'xxx',
  'connector.properties.group.id' = 'g_test',
  --'connector.startup-mode' = 'earliest-offset',
  --'connector.startup-mode' = 'latest-offset',
  'connector.startup-mode' = 'group-offsets',
  'format.type' = 'json',
  'format.fail-on-missing-field' = 'false'
);
{code}
If we execute `select * from MyUserTable` and the current row is not valid JSON, 
the job will fail and the offset of the consumer group will be reset to 
the latest offset.

I think we should add a configuration option similar to 'format.fail-on-missing-field', 
e.g. 'format.fail-on-invalid-json', to ignore the current invalid row.
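
For illustration, the requested behaviour could look roughly like the wrapper 
below. The class name is made up, and the real fix would live inside the JSON 
format itself; returning {{null}} is the documented way to make the Flink Kafka 
consumer silently skip a corrupted message:

{code:java}
import java.io.IOException;

import org.apache.flink.api.common.serialization.DeserializationSchema;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.types.Row;

/** Sketch: wraps a strict schema and drops rows that fail to deserialize. */
public class IgnoreParseErrorsDeserializationSchema implements DeserializationSchema<Row> {

    private final DeserializationSchema<Row> inner;

    public IgnoreParseErrorsDeserializationSchema(DeserializationSchema<Row> inner) {
        this.inner = inner;
    }

    @Override
    public Row deserialize(byte[] message) throws IOException {
        try {
            return inner.deserialize(message);
        } catch (Exception e) {
            return null; // skip the invalid record instead of failing the job
        }
    }

    @Override
    public boolean isEndOfStream(Row nextElement) {
        return false;
    }

    @Override
    public TypeInformation<Row> getProducedType() {
        return inner.getProducedType();
    }
}
{code}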

Looking forward to your reply!

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16333) After exiting from the command line client, the process still exists

2020-02-28 Thread jinxin (Jira)
jinxin created FLINK-16333:
--

 Summary: After exiting from the command line client, the process 
still exists
 Key: FLINK-16333
 URL: https://issues.apache.org/jira/browse/FLINK-16333
 Project: Flink
  Issue Type: Bug
  Components: Command Line Client
Affects Versions: 1.10.0
 Environment: JDK 8 

Flink 1.10 
Reporter: jinxin
 Attachments: 微信图片_20200228190743.png, 微信图片_20200228190935.png

After exiting from the Flink Scala shell, the process still exists. I then tried to 
kill it with {{kill -9 PID}}, but that didn't work either; the process still exists.

A similar problem also appears in the SQL client; fortunately, there {{kill -9}} takes 
effect.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16332) Support un-ordered mode for async lookup join

2020-02-28 Thread Jark Wu (Jira)
Jark Wu created FLINK-16332:
---

 Summary: Support un-ordered mode for async lookup join
 Key: FLINK-16332
 URL: https://issues.apache.org/jira/browse/FLINK-16332
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / API, Table SQL / Planner
Reporter: Jark Wu


Currently, we only support "ordered" mode for async lookup join. Because 
ordering in streaming SQL is very important, the accumulate and retract 
messages should stay in order. If messages are out of order, the result will be 
wrong. 

The "un-ordered" mode can be enabled by a job configuration, but it will only be a 
preferred option: it will be enabled only if it doesn't affect the order of 
acc/retract messages (e.g. the stream is append-only). 
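
For context, the DataStream API already exposes the analogous ordered/un-ordered 
choice for async I/O. A minimal sketch (timeout and capacity values are arbitrary; 
the lookup function itself is left abstract):

{code:java}
import java.util.concurrent.TimeUnit;

import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.async.AsyncFunction;

public class AsyncLookupModes {
    public static DataStream<String> lookup(
            DataStream<String> input,
            AsyncFunction<String, String> lookupFunction,
            boolean ordered) {
        return ordered
                // Preserves input order: what async lookup joins support today.
                ? AsyncDataStream.orderedWait(input, lookupFunction, 30, TimeUnit.SECONDS, 100)
                // Emits results as they complete: the mode this issue proposes
                // to allow when the stream is append-only.
                : AsyncDataStream.unorderedWait(input, lookupFunction, 30, TimeUnit.SECONDS, 100);
    }
}
{code}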



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-106: Support Python UDF in SQL Function DDL

2020-02-28 Thread Benchao Li
Big +1 for this feature.

We built our SQL platform on the Java Table API, and most common UDFs are
implemented in Java. However, some Python developers are not familiar with
Java/Scala, and it's very inconvenient for these users to use UDFs in SQL.
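
For readers who have not seen the FLIP yet, the DDL shape it proposes makes a
Python function usable from SQL and from the Java Table API. A minimal sketch
(function and module names are made up for illustration):

    // Register a Python UDF via DDL from the Java Table API (sketch only):
    tableEnv.sqlUpdate(
        "CREATE TEMPORARY FUNCTION py_upper AS 'my_module.py_upper' LANGUAGE PYTHON");
    Table result = tableEnv.sqlQuery("SELECT py_upper(name) FROM users");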

Wei Zhong  于2020年2月28日周五 下午6:58写道:

> Thanks for your reply, Dan!
>
> By the way, this FLIP is closely related to the SQL API.  @Jark Wu <
> imj...@gmail.com> @Timo  could you please take a look?
>
> Thanks,
> Wei
>
> > 在 2020年2月25日,16:25,zoudan  写道:
> >
> > +1 for supporting Python UDF in Java/Scala Table API.
> > This is a great feature and would be helpful for python users!
> >
> > Best,
> > Dan Zou
> >
> >
>
>

-- 

Benchao Li
School of Electronics Engineering and Computer Science, Peking University
Tel:+86-15650713730
Email: libenc...@gmail.com; libenc...@pku.edu.cn


[jira] [Created] (FLINK-16331) Remove source licenses for old WebUI

2020-02-28 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-16331:


 Summary: Remove source licenses for old WebUI
 Key: FLINK-16331
 URL: https://issues.apache.org/jira/browse/FLINK-16331
 Project: Flink
  Issue Type: Bug
  Components: Release System
Affects Versions: 1.10.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.10.1, 1.11.0


When we removed the old WebUI we only removed the licenses from 
flink-runtime-web, but missed the ones for the source release.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-106: Support Python UDF in SQL Function DDL

2020-02-28 Thread Wei Zhong
Thanks for your reply, Dan!

By the way, this FLIP is closely related to the SQL API.  @Jark Wu 
 @Timo  could you please take a look?

Thanks,
Wei

> 在 2020年2月25日,16:25,zoudan  写道:
> 
> +1 for supporting Python UDF in Java/Scala Table API.
> This is a great feature and would be helpful for python users!
> 
> Best,
> Dan Zou
> 
> 



Re: [VOTE] FLIP-100: Add Attempt Information

2020-02-28 Thread Yadong Xie
Hi everyone

A subtask filter and an attempt id filter have now been added to the attempt
timeline page.

check the POC here:
http://101.132.122.69:8081/web/#/job/9d651769488466d33e7a607e85203543/timeline

FLIP doc was also updated:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-100%3A+Add+Attempt+Information



Till Rohrmann  于2020年2月27日周四 下午9:49写道:

> I think your proposal is better because then we don't have to bump the REST
> API version.
>
> Cheers,
> Till
>
> On Thu, Feb 27, 2020 at 8:02 AM Yadong Xie  wrote:
>
> > Hi Till
> >
> > We keep responding with a flattened SubtaskTimeInfo and an array of
> > SubtaskTimeInfo to keep the REST API backward compatible, since Flink users
> > may still need the API('/jobs/{jobId}/vertices/{vertexId}/subtasktimes') to
> > get the flattened SubtaskTimeInfo.
> >
> > If we want to get an array of SubtaskTimeInfo in the API, a new URL needs
> > to be created rather than reusing the old one.
> >
> > Both solutions are ok for me. What do you think about it?
> >
> > Till Rohrmann  于2020年2月26日周三 下午10:53写道:
> >
> > > Fair enough. If this should become a problem we could introduce it
> later
> > as
> > > well.
> > >
> > > What about changing the SubtasksTimeInfo response type into an array of
> > > SubtaskTimeInfo? At the moment SubtasksTimeInfo contains a
> > > flattened SubtaskTimeInfo and an array of SubtaskTimeInfo for the
> > previous
> > > attempts.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Wed, Feb 26, 2020 at 1:16 PM Yadong Xie 
> wrote:
> > >
> > > > Hi Till
> > > >
> > > > Thanks for your comments.
> > > >
> > > > > I have a comment concerning the SubtasksTimesHandler
> > > >
> > > > It would be much easier for the frontend to handle a large amount of data
> > > > if we had a rest API filter parameter, but in my opinion, the attempt list
> > > > data is not large enough that we have to rely on rest API
> > > > paging; we can still handle it all in the frontend.
> > > >
> > > > Users can filter the attempt list by the status (scheduled/created/deploying
> > > > and so on) and other keywords (attempt_id and so on) directly in the
> > > > frontend, since all data are listed from the rest API.
> > > > If we move some of the filter parameters to the rest API path
> > parameter,
> > > > all the other filter parameters need to be moved too.
> > > >
> > > > I suggest adding an attempt id filter in the UI to help users filter the
> > > > desired attempt, with all the filtering running inside the
> > > > browser. What do you think about this?
> > > >
> > > >
> > > >
> > > >
> > > > Till Rohrmann  于2020年2月25日周二 下午11:40写道:
> > > >
> > > > > Hi Yadong,
> > > > >
> > > > > thanks for creating this FLIP. I like the idea to make the web-ui
> > > > > information richer wrt to subtask attempt information.
> > > > >
> > > > > I have a comment concerning the SubtasksTimesHandler: Should we
> > change
> > > > the
> > > > > response type SubtasksTimeInfo so that it simply contains an
> > > > > array of SubtaskTimeInfo? One could add an attempt range path
> > parameter
> > > > to
> > > > > the SubtasksTimesHandler to be able to control which attempts will
> be
> > > > > returned.
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > On Tue, Feb 25, 2020 at 9:57 AM Benchao Li 
> > > wrote:
> > > > >
> > > > > > Hi Yadong,
> > > > > >
> > > > > > Thanks for the updating.  LGTM now.
> > > > > >
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > Yadong Xie  于2020年2月25日周二 下午4:41写道:
> > > > > >
> > > > > > > Hi Kurt
> > > > > > >
> > > > > > > There will be no differences between batch jobs and stream jobs
> > > > > > > at the subtask-attempt level in the UI.
> > > > > > > The only differences are in the vertex timeline; I have added a
> > > > > > > screenshot of a batch job to FLIP-100, since a batch job will
> > > > > > > disappear from the list soon after it finishes.
> > > > > > > here is the link:
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-100%3A+Add+Attempt+Information
> > > > > > >
> > > > > > >
> > > > > > > Kurt Young  于2020年2月21日周五 上午11:51写道:
> > > > > > >
> > > > > > > > Hi Yadong,
> > > > > > > >
> > > > > > > > Thanks for the proposal, it's a useful feature, especially
> for
> > > > batch
> > > > > > > jobs.
> > > > > > > > But according
> > > > > > > > to the examples you gave, I can't tell whether I got the required
> > > > > > information
> > > > > > > > from that.
> > > > > > > > Can you replace the demo job to a more complex batch job and
> > then
> > > > we
> > > > > > can
> > > > > > > > see some
> > > > > > > > differences of start/stop time of different tasks and
> attempts?
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Kurt
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Feb 20, 2020 at 5:46 PM Yadong Xie <
> > vthink...@gmail.com>
> > > > > > wrote:
> > > > > > > >
> > > > > 

Re: [VOTE] FLIP-98: Better Back Pressure Detection

2020-02-28 Thread Yadong Xie
Hi Till

In addition to the front end, the FLIP-98 would also include some rest API
changes.
Could you assign one PMC or committer to help us review the pull request
and drive this FLIP forward?
We will start to create PR as soon as possible, thanks!

Till Rohrmann  于2020年2月25日周二 下午6:12写道:

> late +1 (binding).
>
> Cheers,
> Till
>
> On Tue, Feb 25, 2020 at 4:55 AM Yadong Xie  wrote:
>
> > Thanks all for the votes.
> >
> > So far, we have
> >
> > - 3 binding +1 votes (Kurt, Jark, zhijiang)
> > - 4 non-binding +1 votes (Congxian, Wangyang, Benchao, Yun Gao)
> > - No -1 votes
> >
> > The voting time has passed and there are enough +1 votes to consider the
> > FLIP-98 approved.
> > Thank you all.
> >
> > Yun Gao  于2020年2月24日周一 上午10:27写道:
> >
> > > +1 (non-binding)
> > >   This should make tuning back pressure easier, which is one of the
> > > most common problems users encounter.
> > >
> > > Best,
> > >Yun
> > >
> > >
> > > --
> > > From:Benchao Li 
> > > Send Time:2020 Feb. 24 (Mon.) 08:32
> > > To:dev 
> > > Cc:Zhijiang 
> > > Subject:Re: [VOTE] FLIP-98: Better Back Pressure Detection
> > >
> > > +1 (non-binding)
> > >
> > > It's really helpful!
> > >
> > > Yang Wang  于2020年2月21日周五 下午2:44写道:
> > >
> > > > +1 non-binding
> > > >
> > > > It is really straightforward.
> > > >
> > > > Best,
> > > > Yang
> > > >
> > > > Jark Wu  于2020年2月21日周五 下午1:52写道:
> > > >
> > > > > +1 (binding)
> > > > >
> > > > > On Fri, 21 Feb 2020 at 11:51, Kurt Young  wrote:
> > > > >
> > > > > > +1 (binding)
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Fri, Feb 21, 2020 at 1:09 AM Zhijiang <
> > wangzhijiang...@aliyun.com
> > > > > > .invalid>
> > > > > > wrote:
> > > > > >
> > > > > > > +1 (binding).
> > > > > > > It seems clearer and more direct to highlight the back-pressured
> > > > > > > vertex in the topology, which can raise the attention of users.
> > > > > > >
> > > > > > > Best,
> > > > > > > Zhijiang
> > > > > > >
> > > > > > >
> > > > > > >
> > --
> > > > > > > From:Congxian Qiu 
> > > > > > > Send Time:2020 Feb. 20 (Thu.) 21:49
> > > > > > > To:dev@flink.apache.org 
> > > > > > > Subject:Re: [VOTE] FLIP-98: Better Back Pressure Detection
> > > > > > >
> > > > > > > +1 for this improvement, it's very helpful to know whether a task
> task
> > > is
> > > > > > > backpressured or not.
> > > > > > >
> > > > > > > Best,
> > > > > > > Congxian
> > > > > > >
> > > > > > >
> > > > > > > Yadong Xie  于2020年2月20日周四 下午5:34写道:
> > > > > > >
> > > > > > > > Hi all
> > > > > > > >
> > > > > > > > I want to start the vote for FLIP-98, which proposes to provide
> > > > > > > > a better backpressure detection experience in the web UI.
> > > > > > > >
> > > > > > > > To help everyone better understand the proposal, we spent some
> > > > > > > > effort on making an online POC
> > > > > > > >
> > > > > > > > previous web:
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://101.132.122.69:8081/#/job/add9d3ea1006af15b9f872987eb3befd/overview
> > > > > > > > POC web:
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://101.132.122.69:8081/web/#/job/add9d3ea1006af15b9f872987eb3befd/overview
> > > > > > > >
> > > > > > > > The vote will last for at least 72 hours, following the
> > > > > > > > consensus voting process.
> > > > > > > >
> > > > > > > > FLIP wiki:
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-98%3A+Better+Back+Pressure+Detection
> > > > > > > >
> > > > > > > > Discussion thread:
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-75-Flink-Web-UI-Improvement-Proposal-td33540.html
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Yadong
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > >
> > > Benchao Li
> > > School of Electronics Engineering and Computer Science, Peking
> University
> > > Tel:+86-15650713730
> > > Email: libenc...@gmail.com; libenc...@pku.edu.cn
> > >
> > >
> >
>


Re: [VOTE] FLIP-103: Better TM/JM Log Display

2020-02-28 Thread Yadong Xie
Hi Till

In addition to the front end, FLIP-103 would also include some REST API
changes.
Could you assign one PMC or committer to help us review the pull request
and drive this FLIP forward?
We will start to create PR as soon as possible, thanks!

Till Rohrmann  于2020年2月25日周二 下午6:03写道:

> late +1 (binding)
>
> Cheers,
> Till
>
> On Tue, Feb 25, 2020 at 4:46 AM Yadong Xie  wrote:
>
> > Thanks all for the votes.
> >
> > So far, we have
> >
> > - 4 binding +1 votes (Kurt, Jark, jincheng, Zhu Zhu)
> > - 5 non-binding +1 votes (Yang Wang, Xintong Song, lining, Yangze,
> zhenya)
> > - No -1 votes
> >
> > The voting time has passed and there are enough +1 votes to consider the
> > FLIP-103 approved.
> > Thank you all.
> >
> > Zhu Zhu  于2020年2月24日周一 下午3:56写道:
> >
> > > +1 (binding)
> > >
> > > It can be very helpful to check GC state when GC logging is enabled.
> > >
> > > Thanks,
> > > Zhu Zhu
> > >
> > > zhenya Sun  于2020年2月24日周一 下午3:51写道:
> > >
> > > > +1
> > > >
> > > >
> > > > able-sun
> > > >
> > > >
> > > > On 02/24/2020 15:45,jincheng sun wrote:
> > > > +1(binding) Thank you for driving this Yadong!
> > > >
> > > > Best,
> > > > Jincheng
> > > >
> > > > Yangze Guo  于2020年2月21日周五 下午3:08写道:
> > > >
> > > > +1 (non-binding)
> > > >
> > > > It will make debugging more convenient and hence improve usability.
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> > > > On Fri, Feb 21, 2020 at 2:50 PM Yang Wang 
> > wrote:
> > > >
> > > > Hi Jark,
> > > >
> > > > The GC log is not enabled by default. If users want it, they should
> > > > configure it via the following config option:
> > > > env.java.opts.jobmanager: -XX:+PrintGCDetails -XX:+PrintGCDateStamps
> > > > -Xloggc:/opt/flink/log/jobmanager-gc.log -XX:+UseGCLogFileRotation
> > > > -XX:NumberOfGCLogFiles=2 -XX:GCLogFileSize=512M
> > > >
> > > > Best,
> > > > Yang
> > > >
> > > > Jark Wu  于2020年2月21日周五 下午2:19写道:
> > > >
> > > > +1 (binding)
> > > >
> > > > A minor question: I didn't find the GC log in the POC web. Is it only
> > > > generated when some configuration is enabled?
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > > On Fri, 21 Feb 2020 at 11:40, Kurt Young  wrote:
> > > >
> > > > +1 (binding)
> > > >
> > > >
> > > > On Fri, Feb 21, 2020 at 11:25 AM lining jing 
> > > > wrote:
> > > >
> > > > +1 (non-binding)
> > > > It lists all log files, so the user can see the GC log.
> > > >
> > > > Xintong Song  于2020年2月21日周五 上午10:44写道:
> > > >
> > > > +1 (non-binding)
> > > >
> > > > I like the idea of having a list of all log files and making them
> > > > downloadable.
> > > >
> > > > Thank you~
> > > >
> > > > Xintong Song
> > > >
> > > >
> > > >
> > > > On Fri, Feb 21, 2020 at 10:24 AM Yang Wang <
> > > > danrtsey...@gmail.com>
> > > > wrote:
> > > >
> > > > +1 non-binding
> > > >
> > > > The log file list is shown in the new design. I like the idea,
> > > > since the user-defined logs (e.g. my-log.file)
> > > > can be found here now.
> > > >
> > > >
> > > > Best,
> > > > Yang
> > > >
> > > > Yadong Xie  于2020年2月20日周四 下午5:56写道:
> > > >
> > > > Hi all
> > > >
> > > > I want to start the vote for FLIP-103, which proposes to provide
> > > > better user log display UI in the task manager and job manager.
> > > >
> > > > To help everyone better understand the proposal, we spent some
> > > > effort on making an online POC
> > > >
> > > > Task manager:
> > > >
> > > > previous web:
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> >
> http://101.132.122.69:8081/#/task-manager/6df6c5f37b2bff125dbc3a7388128559/logs
> > > > POC web:
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> >
> http://101.132.122.69:8081/web/#/task-manager/6df6c5f37b2bff125dbc3a7388128559/log
> > > >
> > > > Job manager:
> > > >
> > > > previous web: http://101.132.122.69:8081/#/job-manager/logs
> > > > POC web: http://101.132.122.69:8081/web/#/job-manager/log
> > > >
> > > >
> > > > The vote will last for at least 72 hours, following the
> > > > consensus
> > > > voting
> > > > process.
> > > >
> > > > FLIP wiki:
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=147427143
> > > >
> > > > Discussion thread:
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-75-Flink-Web-UI-Improvement-Proposal-td33540.html
> > > >
> > > > Thanks,
> > > >
> > > > Yadong
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> >
>


[jira] [Created] (FLINK-16330) Set up environment variables for the S3 tests in Azure

2020-02-28 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-16330:
--

 Summary: Set up environment variables for the S3 tests in Azure
 Key: FLINK-16330
 URL: https://issues.apache.org/jira/browse/FLINK-16330
 Project: Flink
  Issue Type: Sub-task
  Components: Build System / Azure Pipelines
Reporter: Robert Metzger


Some S3 tests on Azure are skipped because some required environment variables
have not been set.
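
A minimal sketch of how these could be wired up in the pipeline definition;
the variable names and the secret-variable mapping below are assumptions, not
the final configuration:
{code}
# Hypothetical azure-pipelines.yml excerpt (all names are illustrative):
variables:
  IT_CASE_S3_BUCKET: $(SECRET_S3_BUCKET)
  IT_CASE_S3_ACCESS_KEY: $(SECRET_S3_ACCESS_KEY)
  IT_CASE_S3_SECRET_KEY: $(SECRET_S3_SECRET_KEY)
{code}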



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16329) Set up build notifications from AZP to builds@f.a.o

2020-02-28 Thread Robert Metzger (Jira)
Robert Metzger created FLINK-16329:
--

 Summary: Set up build notifications from AZP to builds@f.a.o
 Key: FLINK-16329
 URL: https://issues.apache.org/jira/browse/FLINK-16329
 Project: Flink
  Issue Type: Sub-task
  Components: Build System / Azure Pipelines
Reporter: Robert Metzger
Assignee: Robert Metzger


We want failed nightly builds to be reported to builds@f.a.o.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] FLIP-84: Improve & Refactor API of TableEnvironment

2020-02-28 Thread Jark Wu
+1 from my side.

Best,
Jark

On Fri, 28 Feb 2020 at 15:07, kant kodali  wrote:

> Nice!!
>
> Sent from my iPhone
>
> > On Feb 27, 2020, at 9:03 PM, godfrey he  wrote:
> >
> > Hi kant, yes. We hope to deprecate the methods which confuse users ASAP.
> >
> > Bests,
> > godfrey
> >
> > kant kodali  于2020年2月28日周五 上午11:17写道:
> >
> >> Is this targeted towards Flink 1.11?
> >>
> >>> On Thu, Feb 27, 2020 at 6:32 PM Kurt Young  wrote:
> >>>
> >>> +1 (binding)
> >>>
> >>> Best,
> >>> Kurt
> >>>
> >>>
>  On Fri, Feb 28, 2020 at 9:15 AM Terry Wang 
> wrote:
> >>>
>  I looked through the whole design and it's a big improvement to the
>  usability of TableEnvironment's API.
> 
>  +1 (non-binding)
> 
>  Best,
>  Terry Wang
> 
> 
> 
> > 2020年2月27日 14:59,godfrey he  写道:
> >
> > Hi everyone,
> >
> > I'd like to start the vote of FLIP-84[1], which proposes to deprecate
>  some
> > old APIs and introduce some new APIs in TableEnvironment. This FLIP
> >> is
> > discussed and reached consensus in the discussion thread[2].
> >
> > The vote will be open for at least 72 hours. Unless there is an
>  objection,
> > I will try to close it by Mar 1, 2020 07:00 UTC if we have received
> > sufficient votes.
> >
> >
> > [1]
> >
> 
> >>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-84%3A+Improve+%26+Refactor+API+of+TableEnvironment
> >
> > [2]
> >
> 
> >>>
> >>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-84-Improve-amp-Refactor-API-of-Table-Module-td34537.html
> >
> >
> > Bests,
> > Godfrey
> 
> 
> >>>
> >>
>


Re: Flink 1.10 exception : Unable to instantiate java compiler

2020-02-28 Thread Till Rohrmann
Hi,

with Flink 1.10 we changed the behaviour on the client side so that it also
uses the child-first class loader [1]. Because of that, you might have some
conflicting dependencies bundled in your user-code jar which don't play well
with what you have on the system class path of your client. If the
problematic dependency originates from flink-table-planner-blink, then
setting it to provided makes sense (see the sketch below).

Please also take a look at this issue if you are using Hive [2].

[1] https://issues.apache.org/jira/browse/FLINK-13749
[2] https://issues.apache.org/jira/browse/FLINK-14849
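
For illustration, a minimal pom.xml sketch for marking the planner as
provided (artifact name and version assume a Scala 2.11 / Flink 1.10.0
setup):

    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-table-planner-blink_2.11</artifactId>
        <version>1.10.0</version>
        <scope>provided</scope>
    </dependency>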

Cheers,
Till

On Fri, Feb 28, 2020 at 10:01 AM LakeShen  wrote:

>  I have solved this problem. I set the flink-table-planner-blink Maven
> scope to provided.
>
> kant kodali  于2020年2月28日周五 下午3:32写道:
>
> > Same problem!
> >
> > On Thu, Feb 27, 2020 at 11:10 PM LakeShen 
> > wrote:
> >
> >> Hi community,
> >>   now I am using Flink 1.10 to run a Flink job; the cluster type is
> >> YARN. I use the command line to submit my Flink job; the command is
> >> just like this:
> >>
> >> flink run  -m yarn-cluster  --allowNonRestoredState  -c xxx.xxx.xx
> >>  flink-stream-xxx.jar
> >>
> >> But an exception is thrown; the exception info is:
> >>
> >> *org.apache.flink.client.program.ProgramInvocationException: The main
> >> method caused an error: Unable to instantiate java compiler*
> >> at
> >>
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:335)
> >> at
> >>
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:205)
> >> at
> >> org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:138)
> >> at
> >>
> org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:664)
> >> at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
> >> at
> >>
> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:895)
> >> at
> >>
> org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:968)
> >> at java.security.AccessController.doPrivileged(Native Method)
> >> at javax.security.auth.Subject.doAs(Subject.java:422)
> >> at
> >>
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
> >> at
> >>
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
> >> at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:968)
> >> Caused by: java.lang.IllegalStateException: Unable to instantiate java
> >> compiler
> >> at
> >>
> org.apache.calcite.rel.metadata.JaninoRelMetadataProvider.compile(JaninoRelMetadataProvider.java:434)
> >> at
> >>
> org.apache.calcite.rel.metadata.JaninoRelMetadataProvider.load3(JaninoRelMetadataProvider.java:375)
> >> at
> >>
> org.apache.calcite.rel.metadata.JaninoRelMetadataProvider.lambda$static$0(JaninoRelMetadataProvider.java:109)
> >> at
> >>
> org.apache.flink.calcite.shaded.com.google.common.cache.CacheLoader$FunctionToCacheLoader.load(CacheLoader.java:149)
> >> at
> >>
> org.apache.flink.calcite.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3542)
> >> at
> >>
> org.apache.flink.calcite.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2323)
> >> at
> >>
> org.apache.flink.calcite.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2286)
> >> at
> >>
> org.apache.flink.calcite.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2201)
> >> at
> >>
> org.apache.flink.calcite.shaded.com.google.common.cache.LocalCache.get(LocalCache.java:3953)
> >> at
> >>
> org.apache.flink.calcite.shaded.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3957)
> >> at
> >>
> org.apache.flink.calcite.shaded.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4875)
> >> at
> >>
> org.apache.calcite.rel.metadata.JaninoRelMetadataProvider.create(JaninoRelMetadataProvider.java:475)
> >> at
> >>
> org.apache.calcite.rel.metadata.JaninoRelMetadataProvider.revise(JaninoRelMetadataProvider.java:488)
> >> at
> >>
> org.apache.calcite.rel.metadata.RelMetadataQuery.revise(RelMetadataQuery.java:193)
> >> at
> >>
> org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:797)
> >> at
> >>
> org.apache.calcite.rel.rules.ReduceExpressionsRule$ProjectReduceExpressionsRule.onMatch(ReduceExpressionsRule.java:298)
> >> at
> >>
> org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:319)
> >> at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:560)
> >> at
> org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:419)
> >> at
> >>
> org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:256)
> >> at
> >>
> org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:127)
> >> at
> >>
> 

Re: Flink 1.10 exception : Unable to instantiate java compiler

2020-02-28 Thread LakeShen
 I have solved this problem. I set the flink-table-planner-blink Maven
scope to provided.

kant kodali  于2020年2月28日周五 下午3:32写道:

> Same problem!
>
> On Thu, Feb 27, 2020 at 11:10 PM LakeShen 
> wrote:
>
>> Hi community,
>>   now I am using Flink 1.10 to run a Flink job; the cluster type is
>> YARN. I use the command line to submit my Flink job; the command is
>> just like this:
>>
>> flink run  -m yarn-cluster  --allowNonRestoredState  -c xxx.xxx.xx
>>  flink-stream-xxx.jar
>>
>> But an exception is thrown; the exception info is:
>>
>> *org.apache.flink.client.program.ProgramInvocationException: The main
>> method caused an error: Unable to instantiate java compiler*
>> at
>> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:335)
>> at
>> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:205)
>> at
>> org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:138)
>> at
>> org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:664)
>> at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
>> at
>> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:895)
>> at
>> org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:968)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:422)
>> at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
>> at
>> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>> at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:968)
>> Caused by: java.lang.IllegalStateException: Unable to instantiate java
>> compiler
>> at
>> org.apache.calcite.rel.metadata.JaninoRelMetadataProvider.compile(JaninoRelMetadataProvider.java:434)
>> at
>> org.apache.calcite.rel.metadata.JaninoRelMetadataProvider.load3(JaninoRelMetadataProvider.java:375)
>> at
>> org.apache.calcite.rel.metadata.JaninoRelMetadataProvider.lambda$static$0(JaninoRelMetadataProvider.java:109)
>> at
>> org.apache.flink.calcite.shaded.com.google.common.cache.CacheLoader$FunctionToCacheLoader.load(CacheLoader.java:149)
>> at
>> org.apache.flink.calcite.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3542)
>> at
>> org.apache.flink.calcite.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2323)
>> at
>> org.apache.flink.calcite.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2286)
>> at
>> org.apache.flink.calcite.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2201)
>> at
>> org.apache.flink.calcite.shaded.com.google.common.cache.LocalCache.get(LocalCache.java:3953)
>> at
>> org.apache.flink.calcite.shaded.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3957)
>> at
>> org.apache.flink.calcite.shaded.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4875)
>> at
>> org.apache.calcite.rel.metadata.JaninoRelMetadataProvider.create(JaninoRelMetadataProvider.java:475)
>> at
>> org.apache.calcite.rel.metadata.JaninoRelMetadataProvider.revise(JaninoRelMetadataProvider.java:488)
>> at
>> org.apache.calcite.rel.metadata.RelMetadataQuery.revise(RelMetadataQuery.java:193)
>> at
>> org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:797)
>> at
>> org.apache.calcite.rel.rules.ReduceExpressionsRule$ProjectReduceExpressionsRule.onMatch(ReduceExpressionsRule.java:298)
>> at
>> org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:319)
>> at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:560)
>> at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:419)
>> at
>> org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:256)
>> at
>> org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:127)
>> at
>> org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:215)
>> at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:202)
>> at
>> org.apache.flink.table.planner.plan.optimize.program.FlinkHepProgram.optimize(FlinkHepProgram.scala:69)
>> at
>> org.apache.flink.table.planner.plan.optimize.program.FlinkHepRuleSetProgram.optimize(FlinkHepRuleSetProgram.scala:87)
>> at
>> org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram$$anonfun$optimize$1.apply(FlinkChainedProgram.scala:62)
>> at
>> org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram$$anonfun$optimize$1.apply(FlinkChainedProgram.scala:58)
>> at
>> scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
>> at
>> scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
>> at 

[jira] [Created] (FLINK-16328) Restart batch timer in JDBC sink

2020-02-28 Thread Roman Khachatryan (Jira)
Roman Khachatryan created FLINK-16328:
-

 Summary: Restart batch timer in JDBC sink
 Key: FLINK-16328
 URL: https://issues.apache.org/jira/browse/FLINK-16328
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / JDBC
Affects Versions: 1.10.0
Reporter: Roman Khachatryan


During the review of [https://github.com/apache/flink/pull/11061],
the following issue was discovered by @kkl0u:
{quote}Here shouldn't we restart the timer? It seems like we may be in the 
situation where we flushed this batch because we reached the max size, so for 
the new one we have to set a new timer (renew the BatchIntervalMs), right?
{quote}
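
A minimal sketch of the proposed behaviour (class, field, and method names
are illustrative, not the actual JDBC sink internals):
{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Illustrative batching sink: flushes either every batchIntervalMs or once
// batchSize records have accumulated. The point under discussion is the
// restartTimer() call after a size-triggered flush, so the next batch waits
// a full interval again instead of inheriting the old, almost-expired timer.
class BatchingSinkSketch {

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final long batchIntervalMs = 1000L;
    private final int batchSize = 100;
    private final List<String> batch = new ArrayList<>();
    private ScheduledFuture<?> timer;

    synchronized void open() {
        restartTimer();
    }

    synchronized void add(String record) {
        batch.add(record);
        if (batch.size() >= batchSize) {
            flush();
            restartTimer(); // the proposed fix: renew BatchIntervalMs here
        }
    }

    private synchronized void onTimer() {
        flush();
        restartTimer();
    }

    private void flush() {
        // a real sink would execute the JDBC batch here
        batch.clear();
    }

    private synchronized void restartTimer() {
        if (timer != null) {
            timer.cancel(false);
        }
        timer = scheduler.schedule(
                this::onTimer, batchIntervalMs, TimeUnit.MILLISECONDS);
    }
}
{code}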
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-84: Improve & Refactor execute/sqlQuery/sqlUpdate APIS of TableEnvironment

2020-02-28 Thread godfrey he
Hi Benchao,

> I have one question about this FLIP:
> executeStatement accepts DML; what if it's a streaming DML?
> Does it submit the job to the cluster directly and block forever? What's
> the behavior for the next statements?
`executeStatement` is a synchronous method: it executes the statement as soon
as it is called, and it does not return the result until the job is finished.
We will introduce an asynchronous method like `executeStatementAsync` in the
future; see the sketch below.
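
For example (a hypothetical usage sketch based on the names in this thread,
not the final FLIP-84 API):

// assuming: TableEnvironment tEnv = TableEnvironment.create(settings);
// blocks until the streaming INSERT job finishes (potentially forever)
tEnv.executeStatement("INSERT INTO sink_table SELECT * FROM source_table");

// the planned asynchronous variant would return immediately instead, e.g.:
// tEnv.executeStatementAsync("INSERT INTO sink_table SELECT * FROM source_table");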

> nit: there's a typo in "the table describing the result for each kind of
> statement", "*Result Scheam" -> "Result Schema"*
Thanks for the reminder, I will fix it now.

Bests,
Godfrey

Benchao Li  于2020年2月28日周五 下午4:00写道:

> Hi Terry,
>
> Thanks for the proposal, and sorry for joining the party late.
>
> I have one question about this FLIP:
> executeStatement accepts DML; what if it's a streaming DML?
> Does it submit the job to the cluster directly and block forever? What's
> the behavior for the next statements?
>
> nit: there's a typo in "the table describing the result for each kind of
> statement", "*Result Scheam" -> "Result Schema"*
>
>
> godfrey he  于2020年2月18日周二 下午4:41写道:
>
> > Thanks Kurt and Jark for the explanation, I now also think we should make
> > the TableEnvironment interface more stable and should not change "sqlQuery"
> > method and "from" method.
> >
> > Hi Jingsong. Regarding the "DmlBatch", I totally agree with the advantages
> > of the "addBatch" method. However, there are two more questions to solve:
> > one is how users write multi-sink programs in the Table API, and another is
> > how users explain a multi-sink program in both SQL and the Table API.
> > Currently, the "DmlBatch" class can solve those questions. (The main
> > disadvantage is inconsistency with the current interface.)
> >
> > Bests,
> > godfrey
> >
> > Jingsong Li  于2020年2月15日周六 下午9:09写道:
> >
> > > Hi Kurt and Godfrey,
> > >
> > > Thank you for your explanation.
> > >
> > > Regarding to the "DmlBatch",
> > > I see there is some description of JDBC Statement.addBatch in the
> > > document.
> > > What do you think about introducing "addBatch" to the TableEnv instead
> of
> > > introducing a new class?
> > > The advantage is:
> > > - Consistent with JDBC statement.
> > > - Consistent with current interface, what we need do is just modify
> > method
> > > name.
> > >
> > > Best,
> > > Jingsong Lee
> > >
> > >
> > > On Sat, Feb 15, 2020 at 4:48 PM Kurt Young  wrote:
> > >
> > > > I don't think we should change `from` to `fromCatalog`, especially
> > `from`
> > > > is just
> > > > introduced in 1.10. I agree with Jark we should change interface only
> > > when
> > > > necessary,
> > > > e.g. the semantic is broken or confusing. So I'm +1 to keep
> `sqlQuery`
> > as
> > > > it is.
> > > >
> > > > Best,
> > > > Kurt
> > > >
> > > >
> > > > On Sat, Feb 15, 2020 at 3:59 PM Jark Wu  wrote:
> > > >
> > > > > Thanks Kurt and Godfrey for the explanation,
> > > > >
> > > > > It makes sense to me that renaming `from(tableName)` to
> > > > > `fromCatalog(tableName)`.
> > > > > However, I still think `sqlQuery(query)` is clear and works well.
> Is
> > it
> > > > > necessary to change it?
> > > > >
> > > > > We removed `sql(query)` and introduced `sqlQuery(query)`, we
> removed
> > > > > `scan(tableName)` and introduced `from(tableName)`,
> > > > > and now we want to remove them again. Users will feel like the
> > > interface
> > > > is
> > > > > very unstable, that really frustrates users.
> > > > > I think we should be cautious to remove interface and only when it
> is
> > > > > necessary.
> > > > >
> > > > > Best,
> > > > > Jark
> > > > >
> > > > >
> > > > >
> > > > > On Thu, 13 Feb 2020 at 20:58, godfrey he 
> > wrote:
> > > > >
> > > > > > hi kurt,jark,jingsong
> > > > > >
> > > > > > Regarding to "fromQuery", I agree with kurt. In addition, I think
> > > > `Table
> > > > > > from(String tableName)` should be renamed to `Table
> > > fromCatalog(String
> > > > > > tableName)`.
> > > > > >
> > > > > > Regarding to the "DmlBatch", DML contains "INSERT", "UPDATE",
> > > "DELETE",
> > > > > and
> > > > > > they can be executed in a same batch in the future. So we can add
> > > > > > "addUpdate" method and "addDelete" method to support them.
> > > > > >
> > > > > > Regarding to the "Inserts addInsert", maybe we can add a
> > > > > "DmlBatchBuilder".
> > > > > >
> > > > > > open to more discussion
> > > > > >
> > > > > > Best,
> > > > > > godfrey
> > > > > >
> > > > > >
> > > > > >
> > > > > > Kurt Young  于2020年2月13日周四 下午4:56写道:
> > > > > >
> > > > > > > Regarding to "fromQuery" is confusing users with "Table
> > from(String
> > > > > > > tableName)", I have
> > > > > > > a just opposite opinion. I think this "fromXXX" pattern can
> make
> > > > users
> > > > > > > quite clear when they
> > > > > > > want to get a Table from TableEnvironment. Similar interfaces
> > will
> > > > also
> > > > > > > include like "fromElements".
> > > > > > >
> > > > > > > Regarding to the name of DmlBatch, 

[jira] [Created] (FLINK-16327) Add TableEnvironment.fromElements interfaces for usability

2020-02-28 Thread Zhenghua Gao (Jira)
Zhenghua Gao created FLINK-16327:


 Summary: Add TableEnvironment.fromElements interfaces for usability
 Key: FLINK-16327
 URL: https://issues.apache.org/jira/browse/FLINK-16327
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / API
Affects Versions: 1.11.0
Reporter: Zhenghua Gao


h1. Interface
{code:java}
/**
 * Creates a table from a group of objects (known as its elements). The schema
 * of the table would be inferred from the type of the elements.
 *
 * @param data a group of objects.
 */
Table fromElements(Collection<?> data);

/**
 * Creates a table from a group of objects (known as its elements). The schema
 * of the table would be inferred from the passed-in data type.
 *
 * @param data a group of objects
 * @param dataType the data type of the data
 */
Table fromElements(Collection<?> data, DataType dataType);
{code}
h1. Use Case
 * One potential use case for Table API

{code:java}
@Test 
def testUnregisteredCollectionSource1(): Unit = {
  val env = StreamExecutionEnvironment.getExecutionEnvironment
  val tEnv = StreamTableEnvironment.create(env)
  StreamITCase.testResults = mutable.MutableList()

  val data = Seq(
Row.of("Mike", new JInt(5), new JDouble(12.3), "Smith"))
  
  tEnv.fromElements(data.asJava)
.as('first, 'id, 'score, 'last)
.where('id > 4)
.select('last, 'score * 2)
.toAppendStream[Row]
.addSink(new StreamITCase.StringSink[Row])

  env.execute()
}

@Test 
def testUnregisteredCollectionSource2(): Unit = {
  val env = StreamExecutionEnvironment.getExecutionEnvironment
  val tEnv = StreamTableEnvironment.create(env)
  StreamITCase.testResults = mutable.MutableList()

  val data = Seq(
Row.of("Mike", new JInt(5), new JDouble(12.3), "Smith"))

  val dataType = DataTypes.ROW(
DataTypes.FIELD("first", DataTypes.STRING()),
DataTypes.FIELD("id", DataTypes.INT()),
DataTypes.FIELD("score", DataTypes.DOUBLE()),
DataTypes.FIELD("last", DataTypes.STRING()))

  tEnv.fromElements(data.asJava, dataType)
.where('id > 4)
.select('last, 'score * 2)
.toAppendStream[Row]
.addSink(new StreamITCase.StringSink[Row])

  env.execute()
}
{code}
 * One potential use case for SQL

{code:java}
@Test 
def testUnregisteredCollectionSource1(): Unit = {
  val env = StreamExecutionEnvironment.getExecutionEnvironment
  val tEnv = StreamTableEnvironment.create(env)
  StreamITCase.testResults = mutable.MutableList()

  val data = Seq(
Row.of("Mike", new JInt(5), new JDouble(12.3), "Smith"))
  
  val table = tEnv.fromElements(data.asJava).as('first, 'id, 'score, 'last)
  
  tEnv.createTemporaryView("T", table)

  tEnv.sqlQuery("SELECT last, score * 2 FROM T WHERE id > 4")
.toAppendStream[Row]
.addSink(new StreamITCase.StringSink[Row])

  env.execute()
}

@Test 
def testUnregisteredCollectionSource2(): Unit = {
  val env = StreamExecutionEnvironment.getExecutionEnvironment
  val tEnv = StreamTableEnvironment.create(env)
  StreamITCase.testResults = mutable.MutableList()

  val data = Seq(
Row.of("Mike", new JInt(5), new JDouble(12.3), "Smith"))

  val dataType = DataTypes.ROW(
DataTypes.FIELD("first", DataTypes.STRING()),
DataTypes.FIELD("id", DataTypes.INT()),
DataTypes.FIELD("score", DataTypes.DOUBLE()),
DataTypes.FIELD("last", DataTypes.STRING()))

  val table = tEnv.fromElements(data.asJava, dataType)
  tEnv.createTemporaryView("T", table)

  tEnv.sqlQuery("SELECT last, score * 2 FROM T WHERE id > 4")
.toAppendStream[Row]
.addSink(new StreamITCase.StringSink[Row])

  env.execute()
}
{code}
h1. The Proposal
 * data type inference

We need to infer the data type from the data for the first interface. A 
potential tool is the DataTypeExtractor, but it doesn't support Scala tuples, 
Row, etc. For the types most common in our test cases (Row and Scala tuples), we 
could enumerate them and use a recursive traversal to get the types of the 
underlying objects (see the sketch after this list). This can solve most of the 
cases and improve usability.
 * proposed changes
 ** A CollectionQueryOperation which implements QueryOperation to describe the 
relational operation
 ** The logical and physical RelNode for legacy planner. In the physical node, 
we can translate the data to DataStream
 ** The logical and physical RelNode for blink planner. In the physical node, 
we can translate the data to Transformation
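
A rough sketch of the recursive inference idea for Row elements (illustrative
only, not the proposed implementation):
{code:java}
import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.types.DataType;
import org.apache.flink.types.Row;

class DataTypeInferenceSketch {

    // Recursively derive a DataType from a sample element; only a few leaf
    // types are covered here, just enough to illustrate the traversal.
    static DataType inferDataType(Object value) {
        if (value instanceof Row) {
            Row row = (Row) value;
            DataTypes.Field[] fields = new DataTypes.Field[row.getArity()];
            for (int i = 0; i < row.getArity(); i++) {
                // field names f0, f1, ... are placeholders
                fields[i] = DataTypes.FIELD("f" + i, inferDataType(row.getField(i)));
            }
            return DataTypes.ROW(fields);
        } else if (value instanceof String) {
            return DataTypes.STRING();
        } else if (value instanceof Integer) {
            return DataTypes.INT();
        } else if (value instanceof Double) {
            return DataTypes.DOUBLE();
        }
        throw new UnsupportedOperationException(
                "Type not covered by this sketch: " + value.getClass());
    }
}
{code}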

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-84: Improve & Refactor execute/sqlQuery/sqlUpdate APIS of TableEnvironment

2020-02-28 Thread Benchao Li
Hi Terry,

Thanks for the proposal, and sorry for joining the party late.

I have one question about this FLIP:
executeStatement accepts DML; what if it's a streaming DML?
Does it submit the job to the cluster directly and block forever? What's
the behavior for the next statements?

nit: there's a typo in "the table describing the result for each kind of
statement", "*Result Scheam" -> "Result Schema"*


godfrey he  于2020年2月18日周二 下午4:41写道:

> Thanks Kurt and Jark for the explanation, I now also think we should make the
> TableEnvironment interface more stable and should not change the "sqlQuery"
> method and "from" method.
>
> Hi Jingsong. Regarding the "DmlBatch", I totally agree with the advantages
> of the "addBatch" method. However, there are two more questions to solve:
> one is how users write multi-sink programs in the Table API, and another is
> how users explain a multi-sink program in both SQL and the Table API.
> Currently, the "DmlBatch" class can solve those questions. (The main
> disadvantage is inconsistency with the current interface.)
>
> Bests,
> godfrey
>
> Jingsong Li  于2020年2月15日周六 下午9:09写道:
>
> > Hi Kurt and Godfrey,
> >
> > Thank you for your explanation.
> >
> > Regarding to the "DmlBatch",
> > I see there is some description of JDBC Statement.addBatch in the
> > document.
> > What do you think about introducing "addBatch" to the TableEnv instead of
> > introducing a new class?
> > The advantage is:
> > - Consistent with JDBC statement.
> > - Consistent with current interface, what we need do is just modify
> method
> > name.
> >
> > Best,
> > Jingsong Lee
> >
> >
> > On Sat, Feb 15, 2020 at 4:48 PM Kurt Young  wrote:
> >
> > > I don't think we should change `from` to `fromCatalog`, especially
> `from`
> > > is just
> > > introduced in 1.10. I agree with Jark we should change interface only
> > when
> > > necessary,
> > > e.g. the semantic is broken or confusing. So I'm +1 to keep `sqlQuery`
> as
> > > it is.
> > >
> > > Best,
> > > Kurt
> > >
> > >
> > > On Sat, Feb 15, 2020 at 3:59 PM Jark Wu  wrote:
> > >
> > > > Thanks Kurt and Godfrey for the explanation,
> > > >
> > > > It makes sense to me that renaming `from(tableName)` to
> > > > `fromCatalog(tableName)`.
> > > > However, I still think `sqlQuery(query)` is clear and works well. Is
> it
> > > > necessary to change it?
> > > >
> > > > We removed `sql(query)` and introduced `sqlQuery(query)`, we removed
> > > > `scan(tableName)` and introduced `from(tableName)`,
> > > > and now we want to remove them again. Users will feel like the
> > interface
> > > is
> > > > very unstable, that really frustrates users.
> > > > I think we should be cautious to remove interface and only when it is
> > > > necessary.
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > >
> > > >
> > > > On Thu, 13 Feb 2020 at 20:58, godfrey he 
> wrote:
> > > >
> > > > > hi kurt,jark,jingsong
> > > > >
> > > > > Regarding to "fromQuery", I agree with kurt. In addition, I think
> > > `Table
> > > > > from(String tableName)` should be renamed to `Table
> > fromCatalog(String
> > > > > tableName)`.
> > > > >
> > > > > Regarding to the "DmlBatch", DML contains "INSERT", "UPDATE",
> > "DELETE",
> > > > and
> > > > > they can be executed in a same batch in the future. So we can add
> > > > > "addUpdate" method and "addDelete" method to support them.
> > > > >
> > > > > Regarding to the "Inserts addInsert", maybe we can add a
> > > > "DmlBatchBuilder".
> > > > >
> > > > > open to more discussion
> > > > >
> > > > > Best,
> > > > > godfrey
> > > > >
> > > > >
> > > > >
> > > > > Kurt Young  于2020年2月13日周四 下午4:56写道:
> > > > >
> > > > > > Regarding to "fromQuery" is confusing users with "Table
> from(String
> > > > > > tableName)", I have
> > > > > > a just opposite opinion. I think this "fromXXX" pattern can make
> > > users
> > > > > > quite clear when they
> > > > > > want to get a Table from TableEnvironment. Similar interfaces
> will
> > > also
> > > > > > include like "fromElements".
> > > > > >
> > > > > > Regarding to the name of DmlBatch, I think it's mainly for
> > > > > > future flexibility, in case we can support
> > > > > > other statement in a single batch. If that happens, the name
> > > "Inserts"
> > > > > will
> > > > > > be weird.
> > > > > >
> > > > > > Best,
> > > > > > Kurt
> > > > > >
> > > > > >
> > > > > > On Thu, Feb 13, 2020 at 4:03 PM Jark Wu 
> wrote:
> > > > > >
> > > > > > > I agree with Jingsong.
> > > > > > >
> > > > > > > +1 to keep `sqlQuery`, it's clear from the method name and
> return
> > > > type
> > > > > > that
> > > > > > > it accepts a SELECT query and returns a logic representation
> > > `Table`.
> > > > > > > The `fromQuery` is a little confused users with the `Table
> > > > from(String
> > > > > > > tableName)` method.
> > > > > > >
> > > > > > > Regarding to the `DmlBatch`, I agree with Jingsong, AFAIK, the
> > > > purpose
> > > > > of
> > > > > > > `DmlBatch` is used to batching insert statements.
> > > > > > > Besides, DML terminology is