Re: [DISCUSS] Set default planner for SQL Client to Blink planner in 1.10 release

2020-01-02 Thread Terry Wang
Since what the Blink planner can do is a superset of what the Flink planner can
do, a big +1 from my side for changing the default planner to the Blink planner.

Best,
Terry Wang



> On 3 Jan 2020 at 15:00, Jark Wu wrote:
> 
> Hi everyone,
> 
> In 1.10 release, Flink SQL supports many awesome features and improvements,
> including:
> - support watermark statement and computed column in DDL
> - fully support all data types in Hive
> - Batch SQL performance improvements (TPC-DS: 7x faster than Hive MR)
> - support INSERT OVERWRITE and INSERT PARTITION
> 
> However, all these features and improvements are only available in the Blink
> planner, not in the old planner.
> Some other features are also limited to the Blink planner, e.g.
> Dimension Table Join [1],
> TopN [2], Deduplicate [3], streaming aggregates optimization [4], and so on.
> 
> But the old planner is still the default planner in Table API & SQL. It is
> frustrating for users to have to switch to the Blink planner manually
> every time they start the SQL CLI. And it is surprising to hit an
> "unsupported" exception when they try out the new features without
> switching planners.
> 
> The SQL CLI is a very important entry point for users to try out new
> features and do prototyping.
> To give the new planner more exposure, I would like to suggest setting the
> default planner for the SQL Client to the Blink planner before the 1.10
> release.
> 
> The approach is just to change the default SQL CLI yaml configuration [5].
> This way, existing environments remain compatible and unaffected.
> 
> Changing the default planner for the whole Table API & SQL is another topic
> and is out of scope of this discussion.
> 
> What do you think?
> 
> Best,
> Jark
> 
> [1]:
> https://ci.apache.org/projects/flink/flink-docs-master/dev/table/streaming/joins.html#join-with-a-temporal-table
> [2]:
> https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/queries.html#top-n
> [3]:
> https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/queries.html#deduplication
> [4]:
> https://ci.apache.org/projects/flink/flink-docs-master/dev/table/tuning/streaming_aggregation_optimization.html
> [5]:
> https://github.com/apache/flink/blob/master/flink-table/flink-sql-client/conf/sql-client-defaults.yaml#L100



[DISCUSS] Set default planner for SQL Client to Blink planner in 1.10 release

2020-01-02 Thread Jark Wu
Hi everyone,

In 1.10 release, Flink SQL supports many awesome features and improvements,
including:
- support watermark statement and computed column in DDL
- fully support all data types in Hive
- Batch SQL performance improvements (TPC-DS: 7x faster than Hive MR)
- support INSERT OVERWRITE and INSERT PARTITION

However, all these features and improvements are only available in the Blink
planner, not in the old planner.
Some other features are also limited to the Blink planner, e.g.
Dimension Table Join [1],
TopN [2], Deduplicate [3], streaming aggregates optimization [4], and so on.

But the old planner is still the default planner in Table API & SQL. It is
frustrating for users to have to switch to the Blink planner manually
every time they start the SQL CLI. And it is surprising to hit an
"unsupported" exception when they try out the new features without
switching planners.

The SQL CLI is a very important entry point for users to try out new
features and do prototyping.
To give the new planner more exposure, I would like to suggest setting the
default planner for the SQL Client to the Blink planner before the 1.10
release.

The approach is just to change the default SQL CLI yaml configuration [5].
This way, existing environments remain compatible and unaffected.
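
For reference, the change would amount to something like the following entry
in sql-client-defaults.yaml (a sketch based on the 1.10 file layout):

  execution:
    # planner: 'old' or 'blink'; the proposed new default:
    planner: blink
    type: streaming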

Changing the default planner for the whole Table API & SQL is another topic
and is out of scope of this discussion.

What do you think?

Best,
Jark

[1]:
https://ci.apache.org/projects/flink/flink-docs-master/dev/table/streaming/joins.html#join-with-a-temporal-table
[2]:
https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/queries.html#top-n
[3]:
https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/queries.html#deduplication
[4]:
https://ci.apache.org/projects/flink/flink-docs-master/dev/table/tuning/streaming_aggregation_optimization.html
[5]:
https://github.com/apache/flink/blob/master/flink-table/flink-sql-client/conf/sql-client-defaults.yaml#L100


[jira] [Created] (FLINK-15466) `FlinkAggregateExpandDistinctAggregatesRule` generates wrong plan for cases that have distinct aggs with filter.

2020-01-02 Thread Shuo Cheng (Jira)
Shuo Cheng created FLINK-15466:
--

 Summary: `FlinkAggregateExpandDistinctAggregatesRule` generates 
wrong plan for cases that have distinct aggs with filter.
 Key: FLINK-15466
 URL: https://issues.apache.org/jira/browse/FLINK-15466
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.9.1, 1.9.0, 1.8.3
Reporter: Shuo Cheng
 Fix For: 1.10.0
 Attachments: image-2020-01-03-14-20-54-887.png

For the following SQL in batch mode,

 
{code:java}
SELECT 
   a, COUNT(a), SUM(DISTINCT b) FILTER (WHERE a > 0)
FROM MyTable 
GROUP BY a{code}
 

The plan generated after the logical stage is as follows, which is incorrect:
the `Filter $4` should be `$f2 *and* $g_0`.

!image-2020-01-03-14-20-54-887.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-91 - Support SQL Client Gateway

2020-01-02 Thread Yangze Guo
Thanks for driving this, Xiaoling!

+1 for supporting SQL client gateway.

Best,
Yangze Guo


On Thu, Jan 2, 2020 at 9:58 AM 贺小令  wrote:
>
> Hey everyone,
> FLIP-24
> 
> proposes the overall concept and architecture of the SQL Client. The embedded
> mode has been supported since release 1.5 and is helpful for
> debugging/demo purposes.
> Many users ask how to submit a Flink job to a production environment without
> programming against the Flink API. To solve this, we created FLIP-91 [0], which
> supports a SQL Client gateway mode so that users can submit a job through the
> CLI client, the REST API, or JDBC.
>
> I'd be glad to get more feedback about FLIP-91.
>
> Best,
> godfreyhe
>
> [0]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-91%3A+Support+SQL+Client+Gateway


[jira] [Created] (FLINK-15465) Avoid failing when required memory calculation not accurate in BinaryHashTable

2020-01-02 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-15465:


 Summary: Avoid failing when required memory calculation not 
accurate in BinaryHashTable
 Key: FLINK-15465
 URL: https://issues.apache.org/jira/browse/FLINK-15465
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Runtime
Reporter: Jingsong Lee
 Fix For: 1.10.0


In BinaryHashBucketArea.insertToBucket:

When BinaryHashTable.buildTableFromSpilledPartition runs its "build in-memory
hash table" phase, it requires that the available memory can hold all records;
if it cannot, the build fails.

Because hash conflicts are resolved through linked lists, the required-memory
calculation is not accurate. In this case, we should acquire the missing memory
from the heap.

And we must be careful: the memory stolen from the heap must never be returned
to the table's memory pool.
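
A rough sketch of the proposed fallback (illustrative only; this is not the
actual BinaryHashTable API, and only MemorySegment/MemorySegmentFactory are
real Flink classes):
{code:java}
import java.util.ArrayList;
import java.util.List;

import org.apache.flink.core.memory.MemorySegment;
import org.apache.flink.core.memory.MemorySegmentFactory;

/** Illustrative fallback allocator; not the actual BinaryHashTable API. */
class FallbackSegmentPool {
    private final List<MemorySegment> managedPool;  // the table's managed memory
    private final List<MemorySegment> heapSegments = new ArrayList<>();
    private final int segmentSize;

    FallbackSegmentPool(List<MemorySegment> managedPool, int segmentSize) {
        this.managedPool = managedPool;
        this.segmentSize = segmentSize;
    }

    MemorySegment next() {
        if (!managedPool.isEmpty()) {
            return managedPool.remove(managedPool.size() - 1);
        }
        // The estimate was too low (linked conflict chains): steal from the heap.
        MemorySegment stolen = MemorySegmentFactory.allocateUnpooledSegment(segmentSize);
        heapSegments.add(stolen);  // must never be returned to the managed pool
        return stolen;
    }
}
{code}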



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Rename terminology "Time-windowed Join" to "Interval Join" in Table API & SQL

2020-01-02 Thread Jingsong Li
+1 for this documentation change.
Hopefully it will reduce confusion for users.

Best,
Jingsong Lee

On Fri, Jan 3, 2020 at 12:09 PM Benchao Li  wrote:

> +1
>
> It's good to align the terminology between Table API & SQL and DataStream.
>
> On Fri, 3 Jan 2020 at 12:04 PM, Jark Wu wrote:
>
> > Hi everyone,
> >
> > As we discussed in the mailing list [1], the current "Time-windowed Join" in
> > Table API & SQL is a little misleading: it is not the same as "Window
> > Join" in DataStream, but the same as "Interval Join" in DataStream.
> >
> > So I would like to start a vote to rename the terminology of
> "Time-windowed
> > Join" to "Interval Join" in Table API & SQL **before 1.10 release**.
> >
> > Note that this is a purely documentation change; no updates to the public
> > API or Javadocs. Updating the implementation code (e.g. renaming
> > DataStreamWindowJoin) is not targeted for 1.10.
> >
> > This vote will be open for at least 72 hours, unless there is an objection.
> > This vote requires consensus approval, the same as a FLIP vote.
> >
> > Best,
> > Jark
> >
> > [1]:
> >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Correct-the-terminology-of-quot-Time-windowed-Join-quot-to-quot-Interval-Join-quot-in-Table-L-td36202.html
> >
>
>
> --
>
> Benchao Li
> School of Electronics Engineering and Computer Science, Peking University
> Tel:+86-15650713730
> Email: libenc...@gmail.com; libenc...@pku.edu.cn
>


-- 
Best, Jingsong Lee


Re: [ANNOUNCE] Progress of Apache Flink 1.10 #3

2020-01-02 Thread Jingsong Li
Thanks Yu and Gary,

The table and Hive layers have resolved all blocker and critical issues (38
blockers and 33 critical issues).

Look forward to the first RC.

Best,
Jingsong Lee

On Fri, Jan 3, 2020 at 11:16 AM Danny Chan  wrote:

> Thanks for the great work ~
>
> Best,
> Danny Chan
> On 2 Jan 2020 at 8:48 PM +0800, Yu Li wrote:
> > Hi community,
> >
> > It has been three weeks since we cut the release-1.10 branch and kicked
> off
> > the release testing phase [1]. Thanks to all the efforts of the
> community,
> > 46 blocker and 11 critical issues [2] have been resolved during this
> > period, and now we are left with only 12 blocker issues [3].
> >
> > If everything goes smoothly, we could possibly resolve all blockers and
> > produce the first RC in the 3rd week of Jan. Please be prepared for
> testing
> > the RC and let us know if there are any questions.
> >
> > Best,
> > Gary and Yu
> >
> > [1] https://s.apache.org/1okuv
> > [2] https://s.apache.org/gbd0r
> > [3] https://s.apache.org/euzaj
>


-- 
Best, Jingsong Lee


[jira] [Created] (FLINK-15464) HiveTableSourceTest::testPartitionFilterDateTimestamp fails with Hive 1.x

2020-01-02 Thread Rui Li (Jira)
Rui Li created FLINK-15464:
--

 Summary: HiveTableSourceTest::testPartitionFilterDateTimestamp 
fails with Hive 1.x
 Key: FLINK-15464
 URL: https://issues.apache.org/jira/browse/FLINK-15464
 Project: Flink
  Issue Type: Test
Reporter: Rui Li






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15463) export table sql jar, meet error: java.lang.ClassNotFoundException: org.apache.flink.calcite.shaded.com.google.protobuf.MessageOrBuilder

2020-01-02 Thread xiexinyuan (Jira)
xiexinyuan created FLINK-15463:
--

 Summary: export table sql jar, meet error: 
java.lang.ClassNotFoundException: 
org.apache.flink.calcite.shaded.com.google.protobuf.MessageOrBuilder
 Key: FLINK-15463
 URL: https://issues.apache.org/jira/browse/FLINK-15463
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / API
Affects Versions: 1.9.1, 1.7.2
 Environment: Windows 10

Idea IDE 
Reporter: xiexinyuan
 Attachments: phoenix-test.zip

I can use JDBCInputFormat to connect to Phoenix correctly, but when I export a
jar using the flink-table API, the program crashes. The error is:

java.lang.ClassNotFoundException: 
org.apache.flink.calcite.shaded.com.google.protobuf.MessageOrBuilder

The program code is zipped in the attachment.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Rename terminology "Time-windowed Join" to "Interval Join" in Table API & SQL

2020-01-02 Thread Benchao Li
+1

It's good to align the terminology between Table API & SQL and DataStream.

On Fri, 3 Jan 2020 at 12:04 PM, Jark Wu wrote:

> Hi everyone,
>
> As we discussed in the mailing list [1], the current "Time-windowed Join" in
> Table API & SQL is a little misleading: it is not the same as "Window
> Join" in DataStream, but the same as "Interval Join" in DataStream.
>
> So I would like to start a vote to rename the terminology of "Time-windowed
> Join" to "Interval Join" in Table API & SQL **before 1.10 release**.
>
> Note that this is a purely documentation change; no updates to the public API
> or Javadocs. Updating the implementation code (e.g. renaming
> DataStreamWindowJoin) is not targeted for 1.10.
>
> This vote will be open for at least 72 hours, unless there is an objection.
> This vote requires consensus approval, the same as a FLIP vote.
>
> Best,
> Jark
>
> [1]:
>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Correct-the-terminology-of-quot-Time-windowed-Join-quot-to-quot-Interval-Join-quot-in-Table-L-td36202.html
>


-- 

Benchao Li
School of Electronics Engineering and Computer Science, Peking University
Tel:+86-15650713730
Email: libenc...@gmail.com; libenc...@pku.edu.cn


[VOTE] Rename terminology "Time-windowed Join" to "Interval Join" in Table API & SQL

2020-01-02 Thread Jark Wu
Hi everyone,

As we discussed in the mailing list [1], the current "Time-windowed Join" in
Table API & SQL is a little misleading: it is not the same as "Window
Join" in DataStream, but the same as "Interval Join" in DataStream.

So I would like to start a vote to rename the terminology of "Time-windowed
Join" to "Interval Join" in Table API & SQL **before 1.10 release**.

Note that this is a purely documentation change; no updates to the public API
or Javadocs. Updating the implementation code (e.g. renaming
DataStreamWindowJoin) is not targeted for 1.10.

This vote will be open for at least 72 hours, unless there is an objection.
This vote requires consensus approval, the same as a FLIP vote.

Best,
Jark

[1]:
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Correct-the-terminology-of-quot-Time-windowed-Join-quot-to-quot-Interval-Join-quot-in-Table-L-td36202.html


Re: [DISCUSS] Correct the terminology of "Time-windowed Join" to "Interval Join" in Table API & SQL

2020-01-02 Thread Jark Wu
Thanks everyone for the feedback. I will start a VOTE soon.

Best,
Jark

On Mon, 23 Dec 2019 at 15:45, Jark Wu  wrote:

> I agree with Jingsong: we are discussing aligning the "concepts", not
> the "implementations".
>
> For the "concepts", the "Time-windowed Join" in SQL and "Interval Join" in
> DataStream are the same thing.
>
> Best,
> Jark
>
> On Mon, 23 Dec 2019 at 15:16, Jingsong Li  wrote:
>
>> Hi Danny,
>>
>> > DataStream interval join and Table/SQL Time-windowed Join are
>> > not equivalent
>>
>> In my opinion, there is no difference between Table and DataStream except
>> that outer join is not implemented in DataStream.
>> KeyedStream already defines the equality conditions.
>> Other conditions can be applied in the subsequent
>> IntervalJoined.process.
>> And the interval join of DataStream was implemented according to the
>> corresponding SQL feature [1]. You can see the references in the description.
>>
>> > why not choose Time-windowed Join
>>
>> As Jark said, there is a "Window Join" in DataStream, and we can support it
>> in Table too in the future. It is very easy to confuse it with "Time-windowed
>> Join".
>> So, in my opinion, "Interval Join" or "Range Join" is the more complete term
>> to describe this kind of join, rather than "Time-windowed Join".
>>
>> [1] https://issues.apache.org/jira/browse/FLINK-8478
>>
>> Best,
>> Jingsong Lee
>>
>


[jira] [Created] (FLINK-15462) Support Prestosql JDBC source/sink

2020-01-02 Thread li yu (Jira)
li yu created FLINK-15462:
-

 Summary: Support Prestosql JDBC source/sink
 Key: FLINK-15462
 URL: https://issues.apache.org/jira/browse/FLINK-15462
 Project: Flink
  Issue Type: Wish
  Components: Connectors / JDBC
Reporter: li yu


flink-jdbc currently supports Derby, MySQL, and PostgreSQL:

[https://github.com/apache/flink/blob/master/flink-connectors/flink-jdbc/src/main/java/org/apache/flink/api/java/io/jdbc/dialect/JDBCDialects.java]

Could we add support for prestosql?

Link to the prestosql JDBC driver: [https://prestosql.io/download.html]

The advantage is that Presto supports a variety of data sources (i.e. we could
ingest/load data to or from those data sources just through the Presto JDBC
driver).
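
For illustration, a new dialect might be sketched roughly as follows against
the JDBCDialect interface linked above (the exact method set, URL prefix, and
driver class name are assumptions to be verified):
{code:java}
import java.util.Optional;

import org.apache.flink.api.java.io.jdbc.dialect.JDBCDialect;

/** Hypothetical Presto dialect sketch; details would need verification. */
public class PrestoDialect implements JDBCDialect {

    @Override
    public boolean canHandle(String url) {
        return url.startsWith("jdbc:presto:");
    }

    @Override
    public Optional<String> defaultDriverName() {
        return Optional.of("io.prestosql.jdbc.PrestoDriver");
    }

    @Override
    public String quoteIdentifier(String identifier) {
        return "\"" + identifier + "\"";
    }
}
{code}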



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [ANNOUNCE] Progress of Apache Flink 1.10 #3

2020-01-02 Thread Danny Chan
Thanks for the great work ~

Best,
Danny Chan
On 2 Jan 2020 at 8:48 PM +0800, Yu Li wrote:
> Hi community,
>
> It has been three weeks since we cut the release-1.10 branch and kicked off
> the release testing phase [1]. Thanks to all the efforts of the community,
> 46 blocker and 11 critical issues [2] have been resolved during this
> period, and now we are left with only 12 blocker issues [3].
>
> If everything goes smoothly, we could possibly resolve all blockers and
> produce the first RC in the 3rd week of Jan. Please be prepared for testing
> the RC and let us know if there are any questions.
>
> Best,
> Gary and Yu
>
> [1] https://s.apache.org/1okuv
> [2] https://s.apache.org/gbd0r
> [3] https://s.apache.org/euzaj


[jira] [Created] (FLINK-15461) Add Stream SQL end2end test to cover connecting to external systems

2020-01-02 Thread Leonard Xu (Jira)
Leonard Xu created FLINK-15461:
--

 Summary: Add Stream SQL end2end test to cover connecting to 
external systems
 Key: FLINK-15461
 URL: https://issues.apache.org/jira/browse/FLINK-15461
 Project: Flink
  Issue Type: Test
  Components: Table SQL / API, Tests
Affects Versions: 1.11.0
Reporter: Leonard Xu


We enhanced Flink SQL in release 1.10, but we lack tests/examples that cover
connecting to external systems, e.g. reading from Kafka, joining a MySQL/HBase
dimension table, and then sinking to Kafka/HBase via SQL (see the sketch below).
This issue aims to address the above scenarios.
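
For illustration, such a pipeline might look roughly like the following sketch
(table names are hypothetical and the connector WITH properties are elided):
{code:java}
-- hypothetical Kafka source, JDBC dimension table, and Kafka sink
CREATE TABLE orders (...) WITH ('connector.type' = 'kafka', ...);
CREATE TABLE user_dim (...) WITH ('connector.type' = 'jdbc', ...);
CREATE TABLE enriched_orders (...) WITH ('connector.type' = 'kafka', ...);

INSERT INTO enriched_orders
SELECT o.order_id, o.amount, u.user_name
FROM orders AS o
JOIN user_dim FOR SYSTEM_TIME AS OF o.proctime AS u
ON o.user_id = u.user_id;
{code}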



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15460) planner dependencies won't be necessary for JDBC connector

2020-01-02 Thread Zhenghua Gao (Jira)
Zhenghua Gao created FLINK-15460:


 Summary: planner dependencies won't be necessary for JDBC connector
 Key: FLINK-15460
 URL: https://issues.apache.org/jira/browse/FLINK-15460
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / HBase, Connectors / JDBC
Affects Versions: 1.10.0
Reporter: Zhenghua Gao
 Fix For: 1.11.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-01-02 Thread Peter Huang
Hi Dian,
Thanks for giving us valuable feedbacks.

1) It's better to have a whole design for this feature
Regarding the suggestion of enabling cluster mode also for session clusters, I
think Flink already supports it: WebSubmissionExtension already allows
users to start a job with a specified jar via the web UI.
But we need to enable the feature from the CLI for both local and remote jars.
I will align with Yang Wang first on the details and update the design
doc.

2) It's better to consider the convenience for users, such as debugging

I am wondering whether we can store the exception from job graph generation in
the application master. As no streaming graph can be scheduled in this case,
no more TMs will be requested from the Flink RM.
If the AM is still running, users can still query it from the CLI. As this
requires more changes, we can get some feedback from 
and @zjf...@gmail.com .

3) It's better to consider the impact to the stability of the cluster

I agree with Yang Wang's opinion.



Best Regards
Peter Huang


On Sun, Dec 29, 2019 at 9:44 PM Dian Fu  wrote:

> Hi all,
>
> Sorry to jump into this discussion. Thanks everyone for the discussion.
> I'm very interested in this topic although I'm not an expert in this part.
> So I'm glad to share my thoughts as following:
>
> 1) It's better to have a whole design for this feature
> As we know, there are two deployment modes: per-job mode and session mode.
> I'm wondering which mode really needs this feature. As the design doc
> mentioned, per-job mode is more used for streaming jobs and session mode is
> usually used for batch jobs (of course, the job types and the deployment
> modes are orthogonal). Usually a streaming job only needs to be submitted
> once and will run for days or weeks, while batch jobs are submitted
> more frequently than streaming jobs. This means that maybe session
> mode also needs this feature. However, if we support this feature in
> session mode, the application master will become the new centralized
> service (which should be addressed). So in this case, it's better to have a
> complete design for both per-job mode and session mode. Furthermore, even
> if we can do it phase by phase, we need to have a whole picture of how it
> works in both per-job mode and session mode.
>
> 2) It's better to consider the convenience for users, such as debugging
> After we finish this feature, the job graph will be compiled in the
> application master, which means that users cannot easily get the exception
> message synchronously in the job client if there are problems during
> job graph compilation (especially for platform users), such as an incorrect
> resource path, problems in the user program itself, etc. What I'm
> thinking is that maybe we should throw the exceptions as early as possible
> (during the job submission stage).
>
> 3) It's better to consider the impact to the stability of the cluster
> If we perform the compilation in the application master, we should consider
> the impact of compilation errors. Although YARN can resume the
> application master in case of failures, in some cases the compilation
> failure may waste cluster resources and may impact the stability of the
> cluster and the other jobs in it, e.g. when the resource path is
> incorrect or the user program itself has problems (in which case job
> failover cannot solve the problem), etc. In the current
> implementation, compilation errors are handled on the client side and there
> is no impact to the cluster at all.
>
> Regarding 1), it's clearly pointed out in the design doc that only per-job
> mode will be supported. However, I think it's better to also consider
> session mode in the design doc.
> Regarding 2) and 3), I have not seen related sections in the design
> doc. It would be good if we can cover them there.
>
> Feel free to correct me If there is anything I misunderstand.
>
> Regards,
> Dian
>
>
> > On 27 Dec 2019 at 3:13 AM, Peter Huang wrote:
> >
> > Hi Yang,
> >
> > I can't agree more. The effort definitely needs to align with the final
> > goal of FLIP-73.
> > I am thinking about whether we can achieve the goal with two phases.
> >
> > 1) Phase I
> > As the CliFrontend will not be deprecated soon, we can still use the
> > deployMode flag there,
> > pass the program info through the Flink configuration, and use the
> > ClassPathJobGraphRetriever
> > to generate the job graph in the ClusterEntrypoints of YARN and Kubernetes.
> >
> > 2) Phase II
> > In AbstractJobClusterExecutor, the job graph is generated in the execute
> > function. We can still
> > use the deployMode in it. With deployMode = cluster, the execute function
> > only starts the cluster.
> >
> > When {Yarn/Kubernetes}PerJobClusterEntrypoint starts, it will start the
> > dispatcher first, then we can use
> > a ClusterEnvironment similar to ContextEnvironment to submit the job with
> > jobName to the local
> > dispatcher. For the details, we need more investig

Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-01-02 Thread Peter Huang
Hi Yang,

I understand your point. As for a Kubernetes per-job cluster, users only have
the image path for starting the job; the user code is inaccessible.
I think it is a common question for containerized deployments (for example,
YARN with a Docker image) after FLIP-73. Let's get some feedback from <
aljos...@apache.org> and @zjf...@gmail.com .



Best Regards
Peter Huang




On Mon, Dec 30, 2019 at 1:48 AM Yang Wang  wrote:

> Hi Peter,
> Certainly, we could add an 'if-else' in `AbstractJobClusterExecutor` to
> handle different deploy modes. However, I
> think we need to avoid executing any user program code in cluster
> deploy-mode, including in the `ExecutionEnvironment`.
> Let's wait for some feedback from FLIP-73's author @Aljoscha Krettek
>  and @zjf...@gmail.com .
>
>
> Hi Dian Fu,
>
> Many thanks for jumping in and giving these very useful suggestions.
>
> >> 1) It's better to have a whole design for this feature
> You are right. We should not add a specific config option
> "execution.deploy-mode" just for per-job. In per-job
> mode, the job graph will be generated in the JobClusterEntrypoint when the
> deploy-mode is cluster. The standalone
> per-job case is already handled by `ClasspathJobGraphRetriever`. In session
> mode, it will be more complicated: we
> should not generate the job graph in the entrypoint. Instead, for each job, we
> need to do it separately and then submit
> the job via a local client. Peter and I will try to enrich this part of the
> design doc. As for the implementation, we could
> do it phase by phase.
>
> >> 2) It's better to consider the convenience for users, such as debugging
> Indeed, if the deploy-mode is cluster, it may not be convenient for user
> debugging. For different clusters, there
> are different ways to debug, for example using 'yarn logs' and 'kubectl
> logs' to get the jobmanager logs. Also,
> we could consider throwing the exception back to the client via REST; I'm not
> sure whether we can achieve this.
> Compared to the client deploy-mode, it is really a step back in user
> experience. We will try to add more description
> about the user experience to the document.
>
> >> 3) It's better to consider the impact to the stability of the cluster
> I do not think it will have too many negative impacts on the cluster. YARN,
> Kubernetes, and other resource management
> clusters provide good isolation between applications. One failed app
> should not affect others. If an error occurs in
> generating the job graph, the jobmanager process will fail very fast and the
> whole app will deregister after several attempts.
> We cannot entirely avoid this; even in client deploy-mode, it can also happen,
> e.g. when the user specifies a wrong checkpoint path.
>
>
>
> Best,
> Yang
>
> On Mon, 30 Dec 2019 at 1:44 PM, Dian Fu wrote:
>
> > Hi all,
> >
> > Sorry to jump into this discussion. Thanks everyone for the discussion.
> > I'm very interested in this topic although I'm not an expert in this
> part.
> > So I'm glad to share my thoughts as following:
> >
> > 1) It's better to have a whole design for this feature
> > As we know, there are two deployment modes: per-job mode and session mode.
> > I'm wondering which mode really needs this feature. As the design doc
> > mentioned, per-job mode is more used for streaming jobs and session mode is
> > usually used for batch jobs (of course, the job types and the deployment
> > modes are orthogonal). Usually a streaming job only needs to be submitted
> > once and will run for days or weeks, while batch jobs are submitted
> > more frequently than streaming jobs. This means that maybe session
> > mode also needs this feature. However, if we support this feature in
> > session mode, the application master will become the new centralized
> > service (which should be addressed). So in this case, it's better to have a
> > complete design for both per-job mode and session mode. Furthermore, even
> > if we can do it phase by phase, we need to have a whole picture of how it
> > works in both per-job mode and session mode.
> >
> > 2) It's better to consider the convenience for users, such as debugging
> > After we finish this feature, the job graph will be compiled in the
> > application master, which means that users cannot easily get the exception
> > message synchronously in the job client if there are problems during
> > job graph compilation (especially for platform users), such as an incorrect
> > resource path, problems in the user program itself, etc. What I'm
> > thinking is that maybe we should throw the exceptions as early as possible
> > (during the job submission stage).
> >
> > 3) It's better to consider the impact to the stability of the cluster
> > If we perform the compilation in the application master, we should consider
> > the impact of compilation errors. Although YARN can resume the
> > application master in case of failures, in some cases the compilation
> > failure may waste cluster resources and may impact the stab

[jira] [Created] (FLINK-15459) Drop vendor specific repositories from pom.xml

2020-01-02 Thread Till Rohrmann (Jira)
Till Rohrmann created FLINK-15459:
-

 Summary: Drop vendor specific repositories from pom.xml
 Key: FLINK-15459
 URL: https://issues.apache.org/jira/browse/FLINK-15459
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Affects Versions: 1.11.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 1.11.0


Since Flink no longer bundles Hadoop dependencies, we also don't need the
vendor-specific Hadoop repositories in Flink's {{pom.xml}}. Consequently, I
suggest removing them.

This idea has been discussed on Flink's dev ML: 
https://lists.apache.org/thread.html/be402a11bc986219eabd9dd8af507f36f49784d5400d0873e9ec0c2e%40%3Cdev.flink.apache.org%3E.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Drop vendor specific repositories from pom.xml

2020-01-02 Thread Till Rohrmann
Thanks for the feedback. As I haven't heard any objections, I'll open a
JIRA issue and a PR to remove the vendor specific repositories.

Cheers,
Till

On Thu, Jan 2, 2020 at 12:28 PM Aljoscha Krettek wrote:

> +1 to remove
>
> > On 20. Dec 2019, at 10:34, Robert Metzger  wrote:
> >
> > Okay, I understand. I'm okay with removing the profile.
> >
> > On Thu, Dec 19, 2019 at 11:34 AM Till Rohrmann wrote:
> >
> >> The profiles make bumping ZooKeeper's version a bit more cumbersome. I
> >> would be interested for this reason to get rid of them, too.
> >>
> >> Cheers,
> >> Till
> >>
> >> On Wed, Dec 18, 2019 at 5:35 PM Robert Metzger wrote:
> >>
> >>> I guess we are talking about this profile [1] in the pom.xml?
> >>>
> >>> +1 to remove.
> >>>
> >>> I'm not sure if we need to rush this for the 1.10 release. The profile
> is
> >>> not doing us any harm at the moment.
> >>>
> >>> [1]https://github.com/apache/flink/blob/master/pom.xml#L1035
> >>>
> >>> On Wed, Dec 18, 2019 at 4:51 PM Till Rohrmann wrote:
> >>>
>  Hi everyone,
> 
>  following the discussion started by Seth [1] I would like to discuss
>  dropping the vendor specific repositories from Flink's parent
> pom.xml. As
>  building Flink against a vendor specific Hadoop version is no longer
>  needed
>  (as it simply needs to be added to the classpath) and documented, I
>  believe
>  that the vendor specific repositories and the mapr profile have become
>  obsolete. Moreover, users can still use vendor specific Hadoop
> versions
>  if
>  they configure their local maven to point to the respective repository
>  [2].
>  Flink's sources would simply no longer be shipped with this option.
> 
>  Are there any concerns about dropping the vendor specific repositories
>  from
>  pom.xml? I would like to make this change for the upcoming Flink 1.10
>  release if possible.
> 
>  [1]
> 
> 
> https://lists.apache.org/thread.html/83afcf6c0d5d7a0a7179cbdac9593ebe7478b0dc548781bf9915a006%40%3Cdev.flink.apache.org%3E
>  [2]
>  https://maven.apache.org/guides/mini/guide-multiple-repositories.html
> 
>  Cheers,
>  Till
> 
> >>>
>
>


[jira] [Created] (FLINK-15458) Add whitelist for ambiguous config options

2020-01-02 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-15458:


 Summary: Add whitelist for ambiguous config options
 Key: FLINK-15458
 URL: https://issues.apache.org/jira/browse/FLINK-15458
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Configuration
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.11.0


The {{ConfigOptionsDocsCompletenessITCase}} verifies that all existing and
documented options are well-defined, i.e. that for any key there is exactly one
default and description present, globally.

There is one use case, however, where this check is too strict: reporters. These
only work with key suffixes (like "port") and hence fail this check. As
a result we don't use the generator mechanism for reporters, which is
unfortunate.

I propose to slightly extend this test so that this check is not performed for
specific, whitelisted keys.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[ANNOUNCE] Progress of Apache Flink 1.10 #3

2020-01-02 Thread Yu Li
Hi community,

It has been three weeks since we cut the release-1.10 branch and kicked off
the release testing phase [1]. Thanks to all the efforts of the community,
46 blocker and 11 critical issues [2] have been resolved during this
period, and now we are left with only 12 blocker issues [3].

If everything goes smoothly, we could possibly resolve all blockers and
produce the first RC in the 3rd week of Jan. Please be prepared for testing
the RC and let us know if there are any questions.

Best,
Gary and Yu

[1] https://s.apache.org/1okuv
[2] https://s.apache.org/gbd0r
[3] https://s.apache.org/euzaj


[jira] [Created] (FLINK-15457) Remove the finished TODO in the YarnEntrypointUtils

2020-01-02 Thread Yang Wang (Jira)
Yang Wang created FLINK-15457:
-

 Summary: Remove the finished TODO in the YarnEntrypointUtils
 Key: FLINK-15457
 URL: https://issues.apache.org/jira/browse/FLINK-15457
 Project: Flink
  Issue Type: Task
  Components: Deployment / YARN
Reporter: Yang Wang


We have the following TODO in {{YarnEntrypointUtils#loadConfiguration}}. It is
already covered by {{rest.bind-port}}, which accepts a list of ports
("50100,50101"), ranges ("50100-50200"), or a combination of both.
{code:java}
   // TODO: Support port ranges for the AM
//  final String portRange = configuration.getString(
//   ConfigConstants.YARN_APPLICATION_MASTER_PORT,
//   ConfigConstants.DEFAULT_YARN_JOB_MANAGER_PORT);
{code}
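
For reference, the replacement option can be set in flink-conf.yaml, e.g.
(illustrative value):
{code}
rest.bind-port: 50100-50200
{code}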



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Drop vendor specific repositories from pom.xml

2020-01-02 Thread Aljoscha Krettek
+1 to remove

> On 20. Dec 2019, at 10:34, Robert Metzger  wrote:
> 
> Okay, I understand. I'm okay with removing the profile.
> 
> On Thu, Dec 19, 2019 at 11:34 AM Till Rohrmann  wrote:
> 
>> The profiles make bumping ZooKeeper's version a bit more cumbersome. I
>> would be interested for this reason to get rid of them, too.
>> 
>> Cheers,
>> Till
>> 
>> On Wed, Dec 18, 2019 at 5:35 PM Robert Metzger wrote:
>> 
>>> I guess we are talking about this profile [1] in the pom.xml?
>>> 
>>> +1 to remove.
>>> 
>>> I'm not sure if we need to rush this for the 1.10 release. The profile is
>>> not doing us any harm at the moment.
>>> 
>>> [1]https://github.com/apache/flink/blob/master/pom.xml#L1035
>>> 
>>> On Wed, Dec 18, 2019 at 4:51 PM Till Rohrmann wrote:
>>> 
 Hi everyone,
 
 following the discussion started by Seth [1] I would like to discuss
 dropping the vendor specific repositories from Flink's parent pom.xml. As
 building Flink against a vendor specific Hadoop version is no longer needed
 (as it simply needs to be added to the classpath) and is documented, I believe
 that the vendor specific repositories and the mapr profile have become
 obsolete. Moreover, users can still use vendor specific Hadoop versions if
 they configure their local maven to point to the respective repository [2].
 Flink's sources would simply no longer be shipped with this option.
 
 Are there any concerns about dropping the vendor specific repositories from
 pom.xml? I would like to make this change for the upcoming Flink 1.10
 release if possible.
 
 [1]
 
 https://lists.apache.org/thread.html/83afcf6c0d5d7a0a7179cbdac9593ebe7478b0dc548781bf9915a006%40%3Cdev.flink.apache.org%3E
 [2]
 https://maven.apache.org/guides/mini/guide-multiple-repositories.html
 
 Cheers,
 Till
 
>>> 



Re: Error:java: invalid flag (无效的标记): --add-exports=java.base/sun.net.util=ALL-UNNAMED

2020-01-02 Thread Chesnay Schepler
For future reference, this issue (and the solution) are documented here .


On 30/12/2019 08:32, Jark Wu wrote:

I had the same problem when upgrading my IDEA to 2019.3.1,
and Biao's solution works for me too.

Best,
Jark

On Thu, 21 Nov 2019 at 22:23, 刘建刚  wrote:


Thank you very much. It works for me.


On 14 Nov 2019 at 1:06 PM, Biao Liu wrote:

Hi,

I encountered the same issue when setting up a dev environment.

It seems that my IntelliJ (2019.2.1) unexpectedly activates the java11
Maven profile, which doesn't match the Java compiler (JDK 8). I'm not sure
why this happened silently.

So for me, the solution was: "IntelliJ" -> "View" -> "Tool Windows" ->
"Maven" -> "Profiles" -> uncheck "java11" -> reimport the Maven project.

Thanks,
Biao /'bɪ.aʊ/



On Mon, 4 Nov 2019 at 18:01, OpenInx  wrote:


Hi,
I met the same problem before. After some digging, I found that IDEA
detects the JDK version and chooses whether to use the jdk11 option when
running the Flink Maven build: if you are in a JDK 11 environment, it
will add the --add-exports option when building with Maven in IDEA.

In my case, I was on IntelliJ IDEA 2019.2, which depends on JDK 11, and
once I re-imported the Flink modules, IDEA added the --add-exports flag
even though I had removed all the flags in .idea/compile.xml explicitly.
I noticed that IntelliJ's JDK affected the Flink Maven build, so I
switched to the IntelliJ build with JDK 8 bundled, and the problem was
gone.

You can verify it, and if it's really the same issue, just replace your
IDEA with the package whose suffix is "with bundled JBR 8" from [1].
Say you are using macOS: then you should download the package "2019.2.4
for macOS with bundled JBR 8 (dmg)".

Hope it works for you
Thanks.

[1]. https://www.jetbrains.com/idea/download/other.html


On Mon, Nov 4, 2019 at 5:44 PM Till Rohrmann wrote:

Try to reimport the Maven project. This should resolve the issue.

Cheers,
Till

On Mon, Nov 4, 2019 at 10:34 AM 刘建刚  wrote:


  Hi, I am using Flink 1.9 in IDEA, but when I run a unit test in IDEA,
it reports the following error: "Error:java: invalid flag (无效的标记):
--add-exports=java.base/sun.net.util=ALL-UNNAMED".
  Everything is OK when I use Flink 1.6. I am using JDK 1.8. Is it
related to the Java version?







Re: Watermark options for TwoInputStreamOperator

2020-01-02 Thread Victor Wong
Hi Jocean,

How about assigning the control data stream a watermark of Long.MAX_VALUE, as
sketched below?
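
A minimal sketch (the "Rule" element type and "controlStream" are hypothetical;
this uses the periodic-watermark assigner API):

  // The control stream always reports Long.MAX_VALUE, so it never holds
  // back the combined (minimum) watermark of a two-input operator.
  controlStream.assignTimestampsAndWatermarks(
      new AssignerWithPeriodicWatermarks<Rule>() {
          @Override
          public Watermark getCurrentWatermark() {
              return new Watermark(Long.MAX_VALUE);
          }

          @Override
          public long extractTimestamp(Rule element, long previousTimestamp) {
              return previousTimestamp; // control records carry no event time
          }
      });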

Best,
Victor

On Wed, 25 Dec 2019 at 9:53 AM, Jocean shi wrote:

> Hi Congxian,
>
> It is not the same as the Idling Source case.
>
> Best
> Ocean
>
> On Tue, 24 Dec 2019 at 4:39 PM, Congxian Qiu wrote:
> >
> > Hi
> >
> > Maybe the doc [1] about Idling Sources can help here.
> > [1]
> >
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/event_time.html#idling-sources
> >
> > Best,
> > Congxian
> >
> >
> > On Tue, 24 Dec 2019 at 3:05 PM, Jocean shi wrote:
> >
> > > Hi Jark,
> > >
> > > I get you. We discussed this question at Flink Forward 2019.
> > > I know that I can write a custom operator to resolve this problem,
> > > but there are also some other problems:
> > >
> > > First,
> > > this is a very common scenario that we often meet.
> > > I would have to rewrite "BroadcastConnectedStream", "ConnectedStreams" ...
> > > and "TwoInputStreamOperator", and rewrite "SingleOutputStreamOperator"
> > > because it is "protected".
> > > It's unfriendly for new users.
> > >
> > > Second,
> > > one stream may not even call the "assignTimestampsAndWatermarks" method,
> > > which means this stream has no time and no watermark.
> > > If it has no time and no watermark, why should it take part in the
> > > default watermark and time computation?
> > >
> > > Best
> > > Jark
> > >
> > > On Tue, 24 Dec 2019 at 12:08 PM, Jark Wu wrote:
> > > >
> > > > Hi Ocean,
> > > >
> > > > You can implement your custom operator via the
> > > > "TwoInputStreamOperator" interface.
> > > > The TwoInputStreamOperator interface provides "processWatermark1" and
> > > > "processWatermark2", which handle the
> > > > watermarks of the left and right streams. You can then ignore the
> > > > watermarks from the right stream and forward the
> > > > watermark from the left stream.
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > > On Mon, 23 Dec 2019 at 21:47, Jocean shi wrote:
> > > >
> > > > > Hi all:
> > > > > Currently, the "TwoInputStreamOperator" implementations such as
> > > > > "CoBroadcastWithKeyedOperator" and "KeyedCoProcessOperator", and the
> > > > > (Co)streams such as "ConnectedStreams" and "BroadcastConnectedStream",
> > > > > only support computing the watermark from both streams,
> > > > > but in some cases we need only one stream to compute the watermark.
> > > > > For example: one stream carries only rule data or control commands and
> > > > > doesn't have event time.
> > > > > I think the user should have the right to choose which stream
> > > > > computes the watermark.
> > > > >
> > > > > Best
> > > > > Ocean
> > > > >
> > >
>


-- 

Best,
Victor
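
For reference, Jark's suggestion could be sketched roughly as follows (a
minimal sketch against Flink's streaming operator API; element handling is
elided, and depending on the Flink version a few more interface methods may
need stubs):

  import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
  import org.apache.flink.streaming.api.operators.TwoInputStreamOperator;
  import org.apache.flink.streaming.api.watermark.Watermark;
  import org.apache.flink.streaming.runtime.streamrecord.LatencyMarker;
  import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;

  public class LeftWatermarkOnlyOperator<IN1, IN2, OUT>
          extends AbstractStreamOperator<OUT>
          implements TwoInputStreamOperator<IN1, IN2, OUT> {

      @Override
      public void processElement1(StreamRecord<IN1> element) throws Exception {
          // main (left) data stream: transform and emit as needed
      }

      @Override
      public void processElement2(StreamRecord<IN2> element) throws Exception {
          // control (right) stream: update local state, emit nothing
      }

      @Override
      public void processWatermark1(Watermark mark) throws Exception {
          processWatermark(mark); // forward only the left stream's watermark
      }

      @Override
      public void processWatermark2(Watermark mark) throws Exception {
          // ignore watermarks from the control stream
      }

      @Override
      public void processLatencyMarker1(LatencyMarker marker) throws Exception {
          // no-op in this sketch; real operators usually forward these
      }

      @Override
      public void processLatencyMarker2(LatencyMarker marker) throws Exception {
          // no-op in this sketch
      }
  }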


[jira] [Created] (FLINK-15456) Job keeps failing on slot allocation timeout due to RM not allocating new TMs for slot requests

2020-01-02 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-15456:
---

 Summary: Job keeps failing on slot allocation timeout due to RM 
not allocating new TMs for slot requests
 Key: FLINK-15456
 URL: https://issues.apache.org/jira/browse/FLINK-15456
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.10.0
Reporter: Zhu Zhu
 Fix For: 1.10.0
 Attachments: jm_part.log

As shown in the attached JM log, the job tried to start 30 TMs but only 29 were
registered, so the job failed because it could not acquire all 30 needed slots
in time.
When failover happens and tasks are re-scheduled, the RM will not ask for new
TMs even if it cannot fulfill the slot requests, so the job keeps failing with
slot allocation timeouts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15455) Enable TCP connection reuse across multiple jobs.

2020-01-02 Thread Yingjie Cao (Jira)
Yingjie Cao created FLINK-15455:
---

 Summary: Enable TCP connection reuse across multiple jobs.
 Key: FLINK-15455
 URL: https://issues.apache.org/jira/browse/FLINK-15455
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Network
Reporter: Yingjie Cao


Currently, TCP connections can only be reused by tasks residing in the same
TaskManager that consume the same IntermediateResult. After a job finishes or
fails over, the TCP connections are closed and new connections must be set up
later.

As an improvement, we can make TCP connections a cluster-level resource that
can be reused by multiple jobs. The advantages are as follows:
 # Reduce the number of TCP connections, so we can save some resources.
 # Reduce the overhead of connection setup and teardown, so that jobs restarted
after failover and later jobs submitted to the same session cluster can reuse
the previous connections.

We use a Flink session cluster as a service for ad-hoc queries, and users can
produce statistics or create statements and reports at any time. Most of the
queries finish in 2s, and we find that TCP connection reuse helps a lot to
reduce the average execution time, which means more queries can be processed
with the same resources and time, with an even better user experience.
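
For illustration, the cluster-level reuse could conceptually look like the
following sketch (generic Java; this is not Flink's actual network stack):
{code:java}
import java.net.SocketAddress;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Illustrative cluster-level connection cache; not Flink's actual network stack. */
class SharedConnections<C> {

    private final ConcurrentMap<SocketAddress, Entry<C>> pool = new ConcurrentHashMap<>();

    /** Reuse an existing connection to the address, or open a new one. */
    C acquire(SocketAddress address, Function<SocketAddress, C> connector) {
        Entry<C> entry = pool.computeIfAbsent(address, a -> new Entry<>(connector.apply(a)));
        entry.refs.incrementAndGet();
        return entry.connection;
    }

    /** Release one reference; idle connections stay cached for later jobs. */
    void release(SocketAddress address) {
        Entry<C> entry = pool.get(address);
        if (entry != null) {
            entry.refs.decrementAndGet();
            // Intentionally not closed here: the next job reuses the connection.
        }
    }

    private static final class Entry<C> {
        final C connection;
        final AtomicInteger refs = new AtomicInteger();

        Entry(C connection) {
            this.connection = connection;
        }
    }
}
{code}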



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15454) ClusterEntrypoint#installSecurityContext is now unique

2020-01-02 Thread Zili Chen (Jira)
Zili Chen created FLINK-15454:
-

 Summary: ClusterEntrypoint#installSecurityContext is now unique
 Key: FLINK-15454
 URL: https://issues.apache.org/jira/browse/FLINK-15454
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination
Affects Versions: 1.11.0
Reporter: Zili Chen
Assignee: Zili Chen
 Fix For: 1.11.0


So that we can remove some dead/duplicated code.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)