[RESULT] Make AppendingState#add refuse to add null element

2020-01-19 Thread Congxian Qiu
Hi everyone,

Thanks for the discussion and votes.

So far the proposal[1] to make AppendingState#add refuse null
elements has received 3 approving votes and no -1 votes:

* Aljoscha
* Yu Li
* Yun Tang

Therefore, I'm happy to announce that we'll apply this change and make
AppendingState#add refuse null elements.
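For illustration, the effect of the change can be sketched as follows (a minimal stand-in class, not Flink's actual AppendingState implementation): add(null) now fails fast instead of being silently accepted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Minimal sketch of an appending state that refuses null elements (hypothetical class, not Flink's API). */
class ListAppendingState<T> {
    private final List<T> elements = new ArrayList<>();

    /** After this change, adding null fails fast instead of corrupting the stored state. */
    public void add(T value) {
        Objects.requireNonNull(value, "You cannot add null to AppendingState.");
        elements.add(value);
    }

    public List<T> get() {
        return elements;
    }
}

public class AppendingStateDemo {
    public static void main(String[] args) {
        ListAppendingState<String> state = new ListAppendingState<>();
        state.add("a");
        try {
            state.add(null);
        } catch (NullPointerException e) {
            System.out.println("refused: " + e.getMessage());
        }
        System.out.println("size=" + state.get().size());
    }
}
```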

Thanks, everyone!

[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Make-AppendingState-add-refuse-to-add-null-element-td36493.html
Best,
Congxian


[jira] [Created] (FLINK-15679) Improve Flink's ID system

2020-01-19 Thread Yangze Guo (Jira)
Yangze Guo created FLINK-15679:
--

 Summary: Improve Flink's ID system
 Key: FLINK-15679
 URL: https://issues.apache.org/jira/browse/FLINK-15679
 Project: Flink
  Issue Type: Improvement
Reporter: Yangze Guo
 Fix For: 1.11.0


Flink uses a lot of IDs which are hard for the user to decipher:
 - The string representations of most of Flink's IDs are meaningless hash 
codes that do not help the user understand what happened.
 - The logs do not contain the lineage info of those IDs. Currently, it is 
impossible to track the end-to-end lifecycle of an Execution or a Task.
 - Some redundancy exists in Flink's ID system; we need to sort out the current 
ID system before introducing more complexity.

To help the user understand what Flink is doing, we aim to add more meaning to 
Flink's IDs and logs while cleaning up the redundancy and refactoring the ID 
system.

This is an umbrella ticket to track all the relevant issues.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15678) Optimize producing primary key without row number in special Top 1

2020-01-19 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-15678:


 Summary: Optimize producing primary key without row number in 
special Top 1
 Key: FLINK-15678
 URL: https://issues.apache.org/jira/browse/FLINK-15678
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Reporter: Jingsong Lee
 Fix For: 1.11.0


{code:java}
SELECT c1, c2, c3, c4
FROM (
  SELECT *,ROW_NUMBER() OVER (PARTITION BY c1, c2, c3 ORDER BY c4 DESC) AS 
rownum
  FROM t
) WHERE rownum <= 1
{code}
This SQL query is a Top 1 query. Top N queries produce a stream whose primary 
key contains the row number, but this query does not select the row number, so 
there is no primary key.

However, for Top 1 we can still produce a primary key, because the row number 
is always 1.
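To see why this works, here is a minimal sketch in plain Java (not the planner's actual rewrite): keeping only the top row per partition means each (c1, c2, c3) combination occurs at most once, so the partition columns form a valid primary key of the result.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Top1KeyDemo {
    // a row: partition columns c1..c3 plus the ordering column c4
    record Row(String c1, String c2, String c3, int c4) {}

    public static void main(String[] args) {
        List<Row> input = List.of(
            new Row("a", "b", "c", 1),
            new Row("a", "b", "c", 9),  // wins: larger c4
            new Row("x", "y", "z", 5));

        // Top 1 per (c1, c2, c3): keep the row with the largest c4
        Map<String, Row> top1 = new HashMap<>();
        for (Row r : input) {
            String key = r.c1() + "|" + r.c2() + "|" + r.c3();
            top1.merge(key, r, (old, cur) -> cur.c4() > old.c4() ? cur : old);
        }

        // each key occurs exactly once, so (c1, c2, c3) is a primary key of the result
        System.out.println(top1.size());            // 2
        System.out.println(top1.get("a|b|c").c4()); // 9
    }
}
```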



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Understanding watermark

2020-01-19 Thread Guowei Ma
>>What I understand from you, one operator has two watermarks? If so, one
operator's output watermark would be an input watermark of the next
operator? Does it sounds redundant?
There are not two watermarks for an operator. What I meant was the
"watermark metrics".

>>Or do you mean the Web UI only show the input watermarks of every
operator, but since the source doesn't have input watermark show it show
"No Watermark" ? And we should have output watermark for source?
Yes. But the web UI only shows the task-level watermark metrics, not the
operator-level ones. You can find more detailed information about metrics in
the link[1].

>>And, yes we want to understand when we should expect to see watermarks
for our "combined" sources (bounded and un-bounded) for our pipeline?
Did you try a topology with only the Kinesis source, and does the web UI show
the watermark of the source? Actually, I think it might not be related to the
"combined" source.

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.9/monitoring/metrics.html
Best,
Guowei
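As background for the metrics discussion above, here is a simplified sketch of how an output watermark relates to event timestamps (assumed bounded-out-of-orderness semantics, not Flink's internal metric code): before any event is seen, there is no watermark to report, which is what the UI renders as "No Watermark".

```java
/** Simplified sketch of bounded-out-of-orderness watermarking (assumed semantics, not Flink's code). */
public class WatermarkSketch {
    static final long NO_WATERMARK = Long.MIN_VALUE; // rendered as "No Watermark" in the UI

    static class WatermarkTracker {
        private final long maxOutOfOrderness;
        private long maxTimestamp = Long.MIN_VALUE;

        WatermarkTracker(long maxOutOfOrderness) {
            this.maxOutOfOrderness = maxOutOfOrderness;
        }

        void onEvent(long timestamp) {
            maxTimestamp = Math.max(maxTimestamp, timestamp);
        }

        /** The output watermark trails the largest seen timestamp by the allowed lateness. */
        long currentWatermark() {
            return maxTimestamp == Long.MIN_VALUE ? NO_WATERMARK : maxTimestamp - maxOutOfOrderness;
        }
    }

    public static void main(String[] args) {
        WatermarkTracker tracker = new WatermarkTracker(1000);
        System.out.println(tracker.currentWatermark() == NO_WATERMARK); // true: nothing seen yet
        tracker.onEvent(5_000);
        tracker.onEvent(4_200); // a late event does not move the watermark backwards
        System.out.println(tracker.currentWatermark()); // 4000
    }
}
```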


Cam Mach  于2020年1月15日周三 下午3:53写道:

> Hi Guowei,
>
> Thanks for your response.
>
> What I understand from you, one operator has two watermarks? If so, one
> operator's output watermark would be an input watermark of the next
> operator? Does it sounds redundant?
>
> Or do you mean the Web UI only show the input watermarks of every
> operator, but since the source doesn't have input watermark show it show
> "No Watermark" ? And we should have output watermark for source?
>
> And, yes we want to understand when we should expect to see watermarks for
> our "combined" sources (bounded and un-bounded) for our pipeline?
>
> If you can be more directly, it would be very helpful.
>
> Thanks,
>
> On Tue, Jan 14, 2020 at 5:42 PM Guowei Ma  wrote:
>
>> Hi, Cam,
>> I think you might want to know why the web page does not show the
>> watermark of the source.
>> Currently, the web only shows the "input" watermark. The source only
>> outputs the watermark so the web shows you that there is "No Watermark".
>>  Actually Flink has "output" watermark metrics. I think Flink should also
>> show this information on the web. Would you mind open a Jira to track this?
>>
>>
>> Best,
>> Guowei
>>
>>
>> Cam Mach  于2020年1月15日周三 上午4:05写道:
>>
>>> Hi Till,
>>>
>>> Thanks for your response.
>>>
>>> Our sources are S3 and Kinesis. We have run several tests, and we are
>>> able to take savepoint/checkpoint, but only when S3 complete reading. And
>>> at that point, our pipeline has watermarks for other operators, but not the
>>> source operator. We are not running `PROCESS_CONTINUOUSLY`, so we should
>>> have watermark for the source as well, right?
>>>
>>>  Attached is snapshot of our pipeline.
>>>
>>> [image: image.png]
>>>
>>> Thanks
>>>
>>>
>>>
>>> On Tue, Jan 14, 2020 at 10:43 AM Till Rohrmann 
>>> wrote:
>>>
 Hi Cam,

 could you share a bit more details about your job (e.g. which sources
 are you using, what are your settings, etc.). Ideally you can provide a
 minimal example in order to better understand the program.

 From a high level perspective, there might be different problems: First
 of all, Flink does not support checkpointing/taking a savepoint if some of
 the job's operator have already terminated iirc. But your description
 points rather into the direction that your bounded source does not
 terminate. So maybe you are reading a file via
 StreamExecutionEnvironment.createFileInput
 with FileProcessingMode.PROCESS_CONTINUOUSLY. But these things are hard to
 tell without a better understanding of your job.

 Cheers,
 Till

 On Mon, Jan 13, 2020 at 8:35 PM Cam Mach  wrote:

> Hello Flink expert,
>
> We have a pipeline that read both bounded and unbounded sources and
> our understanding is that when the bounded sources complete they should 
> get
> a watermark of +inf and then we should be able to take a savepoint and
> safely restart the pipeline. However, we have source that never get
> watermarks and we are confused as to what we are seeing and what we should
> expect
>
>
> Cam Mach
> Software Engineer
> E-mail: cammac...@gmail.com
> Tel: 206 972 2768
>
>


[jira] [Created] (FLINK-15677) TableSource#explainSource should return the field names from getProducedDataType instead of getTableSchema

2020-01-19 Thread hailong wang (Jira)
hailong wang created FLINK-15677:


 Summary: TableSource#explainSource should return the field names 
from getProducedDataType instead of getTableSchema
 Key: FLINK-15677
 URL: https://issues.apache.org/jira/browse/FLINK-15677
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Affects Versions: 1.10.0
Reporter: hailong wang
 Fix For: 1.11.0


For fields push-down, the final field names of the TableSource are not 
consistent with the field names from the TableSchema. So we should return the 
field names from getProducedDataType instead of getTableSchema.
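The mismatch can be illustrated with a minimal sketch (hypothetical field names, plain Java rather than the planner's code): after projection push-down, the source produces a subset of the schema, and explaining with the full schema would misrepresent what is actually emitted.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch of why explainSource should use the produced type's fields (hypothetical names, not Flink's API). */
public class ExplainSourceDemo {
    public static void main(String[] args) {
        // full table schema
        List<String> tableSchemaFields = Arrays.asList("id", "name", "price", "ts");
        // after projection push-down, the source only produces a subset
        List<String> producedFields = Arrays.asList("id", "price");

        // explaining with the schema fields would be misleading:
        System.out.println("wrong:   source(" + String.join(", ", tableSchemaFields) + ")");
        // explaining with the produced fields matches what the source actually emits:
        System.out.println("correct: source(" + String.join(", ", producedFields) + ")");
    }
}
```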



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15676) Improve test code of JDBCUpsertTableSinkITCase and JDBCLookupFunctionITCase

2020-01-19 Thread hailong wang (Jira)
hailong wang created FLINK-15676:


 Summary: Improve test code of JDBCUpsertTableSinkITCase and 
JDBCLookupFunctionITCase
 Key: FLINK-15676
 URL: https://issues.apache.org/jira/browse/FLINK-15676
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / JDBC
Affects Versions: 1.10.0
Reporter: hailong wang
 Fix For: 1.11.0


The JDBC connector test code has a base class, JDBCTestBase,

that creates tables, inserts data, and drops tables. We should extend it 
whenever we need to read data from or insert data into JDBC; this keeps the 
test code clean. I found that JDBCUpsertTableSinkITCase and 
JDBCLookupFunctionITCase create and drop tables by themselves, so I think they 
should be refactored to extend JDBCTestBase and reuse it.
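The suggested refactoring can be sketched like this (hypothetical class names, not the actual Flink test classes): the base class owns the table setup/teardown, and each ITCase only contributes its test body.

```java
/** Sketch of the suggested base-class pattern (hypothetical names, not the actual Flink test classes). */
public class JdbcTestBaseDemo {
    abstract static class TestBase {
        final StringBuilder log = new StringBuilder();
        void createTable() { log.append("CREATE;"); } // shared DDL lives in the base class
        void dropTable()   { log.append("DROP;"); }
        abstract void runCase();
        void run() { createTable(); runCase(); dropTable(); }
    }

    public static void main(String[] args) {
        TestBase upsertCase = new TestBase() {
            @Override
            void runCase() { log.append("UPSERT;"); } // test body only; no DDL duplication
        };
        upsertCase.run();
        System.out.println(upsertCase.log); // CREATE;UPSERT;DROP;
    }
}
```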



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: ApplyForContributorPermission

2020-01-19 Thread Benchao Li
Hi,

Welcome to the Flink community!

You no longer need contributor permissions. You can simply create a JIRA
ticket and ask to be assigned to it in order to start working. Please also
take a look at Flink's contribution guidelines [1] for more information.

[1] https://flink.apache.org/contributing/how-to-contribute.html

273339930 <273339...@qq.com> 于2020年1月19日周日 下午11:24写道:

> Dear My Worshippers, I want to contribute to Apache Flink. Would you
> please give me the permission as a contributor? My JIRA ID is forward.



-- 

Benchao Li
School of Electronics Engineering and Computer Science, Peking University
Tel:+86-15650713730
Email: libenc...@gmail.com; libenc...@pku.edu.cn


[jira] [Created] (FLINK-15675) Add documentation that Python UDF is not supported for Flink Planner under batch mode

2020-01-19 Thread Hequn Cheng (Jira)
Hequn Cheng created FLINK-15675:
---

 Summary: Add documentation that Python UDF is not supported for 
Flink Planner under batch mode
 Key: FLINK-15675
 URL: https://issues.apache.org/jira/browse/FLINK-15675
 Project: Flink
  Issue Type: Improvement
  Components: API / Python, Documentation
Reporter: Hequn Cheng
Assignee: Hequn Cheng
 Fix For: 1.10.0


We should add documentation to inform users that Python UDFs are not supported 
by the Flink planner in batch mode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15674) Let Java and Scala Type Extraction go through the same stack

2020-01-19 Thread Stephan Ewen (Jira)
Stephan Ewen created FLINK-15674:


 Summary: Let Java and Scala Type Extraction go through the same 
stack
 Key: FLINK-15674
 URL: https://issues.apache.org/jira/browse/FLINK-15674
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream
Reporter: Stephan Ewen


Currently, the Java and Scala Type Extraction stacks are completely different.
* Java uses the {{TypeExtractor}}
* Scala uses the type extraction macros.

As a result, the same class can be extracted as different types in the 
different stacks, which can lead to very confusing results. In particular, when 
you use the TypeExtractor on Scala Classes, you always get a {{GenericType}}.
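As a rough illustration of why the stacks disagree (a toy sketch, not Flink's TypeExtractor): the Java stack inspects public fields via reflection, which works for Java POJOs but sees nothing useful for a Scala case class, whose fields are not public Java fields, hence the fallback to a generic, field-less description.

```java
import java.lang.reflect.Field;

/** Toy illustration of reflection-based type extraction (a rough sketch, not Flink's TypeExtractor). */
public class ExtractionDemo {
    static class WordCount {
        public String word;
        public int count;
    }

    static String describe(Class<?> clazz) {
        StringBuilder sb = new StringBuilder(clazz.getSimpleName() + "(");
        for (Field f : clazz.getFields()) {
            sb.append(f.getName()).append(":").append(f.getType().getSimpleName()).append(" ");
        }
        return sb.toString().trim() + ")";
    }

    public static void main(String[] args) {
        // a reflection-based extractor can see the public fields of a Java POJO ...
        System.out.println(describe(WordCount.class));
        // ... but a Scala case class's fields are not public Java fields, so the
        // same approach would fall back to a generic, field-less description.
    }
}
```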




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15673) Shepherd FLIP-75 (Various Web UI fixes)

2020-01-19 Thread Stephan Ewen (Jira)
Stephan Ewen created FLINK-15673:


 Summary: Shepherd FLIP-75 (Various Web UI fixes)
 Key: FLINK-15673
 URL: https://issues.apache.org/jira/browse/FLINK-15673
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Reporter: Stephan Ewen


FLIP-75 suggests various good web UI fixes, including

* Better support for larger log files
* Better linking and correlation between exceptions, failed tasks, and log 
messages

The current state of the discussion, including a live demo, is here: 
https://lists.apache.org/thread.html/76e646afe3ddff7b4bec029dec720512a6fcb3033c6e42b0e44b2b6b%40%3Cdev.flink.apache.org%3E



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15672) Switch to Log4j 2 by default

2020-01-19 Thread Stephan Ewen (Jira)
Stephan Ewen created FLINK-15672:


 Summary: Switch to Log4j 2 by default
 Key: FLINK-15672
 URL: https://issues.apache.org/jira/browse/FLINK-15672
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Reporter: Stephan Ewen


Log4j 1.x is outdated and has multiple problems:

* No dynamic adjustments of Log Level
* Problems with newer Java Versions
* Not actively developed any more

Switching to Log4j 2 by default would solve these problems.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15671) Provide one place for Docker Images with all supported cluster modes

2020-01-19 Thread Stephan Ewen (Jira)
Stephan Ewen created FLINK-15671:


 Summary: Provide one place for Docker Images with all supported 
cluster modes
 Key: FLINK-15671
 URL: https://issues.apache.org/jira/browse/FLINK-15671
 Project: Flink
  Issue Type: Improvement
  Components: Deployment / Docker
Reporter: Stephan Ewen
 Fix For: 1.11.0


Currently, there are different places where Docker Images for Flink exist:

* https://github.com/docker-flink
* https://github.com/apache/flink/tree/master/flink-container/docker
* https://github.com/apache/flink/tree/master/flink-contrib/docker-flink

The different Dockerfiles in the different places support different modes; for 
example, some support session clusters while others support the "self-contained 
application mode" (per-job mode).

There should be one single place to define images that cover both session and 
application mode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15670) Provide a Kafka Source/Sink pair that aligns Kafka's Partitions and Flink's KeyGroups

2020-01-19 Thread Stephan Ewen (Jira)
Stephan Ewen created FLINK-15670:


 Summary: Provide a Kafka Source/Sink pair that aligns Kafka's 
Partitions and Flink's KeyGroups
 Key: FLINK-15670
 URL: https://issues.apache.org/jira/browse/FLINK-15670
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream, Connectors / Kafka
Reporter: Stephan Ewen
 Fix For: 1.11.0


This Source/Sink pair would serve two purposes:

1. You can read topics that are already partitioned by key and process them 
without partitioning them again (avoiding shuffles).

2. You can use this to shuffle through Kafka, thereby decomposing the job into 
smaller jobs and independent pipelined regions that fail over independently.
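The first point can be sketched as follows (simplified: plain hashCode instead of Flink's murmur-based key-group assignment, and the partition count chosen equal to the max parallelism so partitions map 1:1 to key groups):

```java
/** Sketch of aligning Kafka partitions with key groups (simplified, not Flink's actual assignment). */
public class AlignmentDemo {
    static int keyGroup(String key, int maxParallelism) {
        return Math.floorMod(key.hashCode(), maxParallelism);
    }

    public static void main(String[] args) {
        int partitions = 4;       // Kafka topic partitions
        int maxParallelism = 4;   // chosen equal so partition i maps 1:1 to key group i

        // if the producer partitions by the same function, records for one key land in
        // exactly the partition whose key group the consuming subtask owns -> no shuffle
        String key = "user-42";
        int producerPartition = keyGroup(key, partitions);
        int consumerKeyGroup = keyGroup(key, maxParallelism);
        System.out.println(producerPartition == consumerKeyGroup); // true by construction
    }
}
```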



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Flink 1.6, increment Checkpoint, the shared dir stored the last year checkpoint state

2020-01-19 Thread Yun Tang
Hi Lake

A more suitable place for this mail would be the user mailing list.

There are three reasons why this could happen:

  1.  The file is an orphan, e.g. it was uploaded during a checkpoint but the 
task manager exited unexpectedly, leaving that checkpoint incomplete.
  2.  The file should have been removed by the checkpoint coordinator, but the 
removal took too long to complete before the job shut down.
  3.  The file is still useful. This is possible in theory because a specific 
RocksDB SST file might not be selected for compaction for a long 
time.
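Reason 3 can be sketched with a toy reference-counting example (simplified, not Flink's actual SharedStateRegistry): a shared file stays on disk as long as any retained checkpoint still references it, which can keep a very old SST file alive.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy sketch of shared-state reference counting (illustrating reason 3; not Flink's SharedStateRegistry). */
public class SharedStateDemo {
    public static void main(String[] args) {
        Map<String, Integer> refCounts = new HashMap<>();

        // checkpoint 1 uploads sst-1; checkpoint 2 reuses it instead of re-uploading
        refCounts.merge("sst-1", 1, Integer::sum); // chk-1
        refCounts.merge("sst-1", 1, Integer::sum); // chk-2

        // discarding chk-1 only decrements; the file survives while any checkpoint references it
        refCounts.merge("sst-1", -1, Integer::sum);
        System.out.println(refCounts.get("sst-1") > 0); // true -> file stays, possibly for a long time
    }
}
```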

Best
Yun Tang

From: LakeShen 
Sent: Sunday, January 19, 2020 18:55
To: user ; user...@flink.apache.org 
; dev 
Subject: Flink 1.6, increment Checkpoint, the shared dir stored the last year 
checkpoint state

Hi community,
I have a Flink SQL job and I have set the Flink SQL state retention 
time. There are three directories in the Flink checkpoint dir:
1. chk-xx dir
2. shared dir
3. taskowned dir

I find that the shared dir stores checkpoint state from last year. The only 
reason I can think of is that the latest
checkpoint retains references to last year's checkpoint state files.
Is there any other reason that could lead to this? Or is it a bug?

Thanks for your reply.

Best wishes,
Lake Shen



ApplyForContributorPermission

2020-01-19 Thread 273339930
Dear My Worshippers, I want to contribute to Apache Flink. Would you please 
give me the permission as a contributor? My JIRA ID is forward.

Re: [jira] [Created] (FLINK-15644) Add support for SQL query validation

2020-01-19 Thread Flavio Pompermaier
Ok, thanks for the pointer, I wasn't aware of that!

Il Dom 19 Gen 2020, 03:00 godfrey he  ha scritto:

> Hi Flavio, TableEnvironment.getCompletionHints may already meet the
> requirement.
>
> Flavio Pompermaier  于2020年1月18日周六 下午3:39写道:
>
> > Why not also add a suggest() method (also unimplemented initially) that
> > would return the list of suitable completions/tokens for the current
> > query? How complex would it be to implement, in your opinion?
> >
> > Il Ven 17 Gen 2020, 18:32 Fabian Hueske (Jira)  ha
> > scritto:
> >
> > > Fabian Hueske created FLINK-15644:
> > > -
> > >
> > >  Summary: Add support for SQL query validation
> > >  Key: FLINK-15644
> > >  URL:
> https://issues.apache.org/jira/browse/FLINK-15644
> > >  Project: Flink
> > >   Issue Type: New Feature
> > >   Components: Table SQL / API
> > > Reporter: Fabian Hueske
> > >
> > >
> > > It would be good if the {{TableEnvironment}} would offer methods to
> check
> > > the validity of SQL queries. Such a method could be used by services
> (CLI
> > > query shells, notebooks, SQL UIs) that are backed by Flink and execute
> > > their queries on Flink.
> > >
> > > Validation should be available in two levels:
> > >  # Validation of syntax and semantics: This includes parsing the query,
> > > checking the catalog for dbs, tables, fields, type checks for
> expressions
> > > and functions, etc. This will check if the query is a valid SQL query.
> > >  # Validation that query is supported: Checks if Flink can execute the
> > > given query. Some syntactically and semantically valid SQL queries are
> > not
> > > supported, esp. in a streaming context. This requires running the
> > > optimizer. If the optimizer generates an execution plan, the query can
> be
> > > executed. This check includes the first step and is more expensive.
> > >
> > > The reason for this separation is that the first check can be done much
> > > faster as it does not involve calling the optimizer. Hence, it would be
> > > suitable for fast checks in an interactive query editor. The second
> check
> > > might take more time (depending on the complexity of the query) and
> might
> > > not be suitable for rapid checks but only on explicit user request.
> > >
> > > Requirements:
> > >  * validation does not modify the state of the {{TableEnvironment}},
> i.e.
> > > it does not add plan operators
> > >  * validation does not require connector dependencies
> > >  * validation can identify the update mode of a continuous query result
> > > (append-only, upsert, retraction).
> > >
> > > Out of scope for this issue:
> > >  * better error messages for unsupported features as suggested by
> > > FLINK-7217
> > >
> > >
> > >
> > > --
> > > This message was sent by Atlassian Jira
> > > (v8.3.4#803005)
> > >
> >
>
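The two validation levels described in the quoted ticket could be sketched like this (entirely hypothetical API, with a stand-in parser and optimizer check, not TableEnvironment's actual methods): the cheap level only parses and type-checks, while the expensive level also asks whether an execution plan exists.

```java
/** Sketch of the two validation levels from FLINK-15644 (hypothetical API, not TableEnvironment's). */
public class ValidationSketch {
    enum Level { SYNTAX_AND_SEMANTICS, EXECUTABILITY }

    static boolean validate(String query, Level level) {
        // stand-in for a real parser plus catalog/type checks
        boolean parses = query.trim().toUpperCase().startsWith("SELECT");
        if (level == Level.SYNTAX_AND_SEMANTICS) {
            return parses; // cheap: no optimizer involved
        }
        // expensive: additionally "run the optimizer" to see whether a plan exists
        return parses && !query.contains("UNSUPPORTED_FEATURE");
    }

    public static void main(String[] args) {
        System.out.println(validate("SELECT * FROM t", Level.SYNTAX_AND_SEMANTICS));        // true
        System.out.println(validate("SELECT UNSUPPORTED_FEATURE FROM t", Level.EXECUTABILITY)); // false
    }
}
```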


[ANNOUNCE] Weekly Community Update 2020/03

2020-01-19 Thread Konstantin Knauf
Dear community,

happy to share this week's community digest with a release candidate
for Flink 1.10, a Pulsar Catalog for Flink, a 50% discount code for Flink
Forward SF, and a bit more.

Flink Development
==

* [releases] The first (preview) *release candidate for Flink 1.10* has
been created. Any help in testing the release candidate is highly
appreciated. [1]

* [sql] I believe I have never covered the contribution of a *Pulsar
Catalog* to Flink's SQL ecosystem in the past, so here it is. Apache Pulsar
includes a schema registry out-of-the-box. With Flink's Pulsar catalog,
Pulsar topics are automatically available as tables in Flink, and Pulsar
namespaces are mapped to databases in Flink. [2,3]

* [deployment] End of last year Patrick Lucas had started a discussion on
integrating the *Flink Docker images *into the Apache Flink project. The
discussion has stalled a bit by now, but there seems to be a consensus that
a) the Flink Docker images should move to Apache Flink and b) the
Dockerfiles in apache/flink need to be consolidated. [4]

* [statebackends] When using the *RocksDBStateBackend*, *timer* state is
currently - by default - still stored on the Java heap. Stephan proposes to
change this behaviour to store timers in RocksDB by default and has
received a lot of positive feedback. [5]

[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Apache-Flink-1-10-0-release-candidate-0-tp36770.html
[2]
https://docs.google.com/document/d/1LMnABtXn-wQedsmWv8hopvx-B-jbdr8-jHbIiDhdsoE/edit
[3]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-72-Introduce-Pulsar-Connector-tp33283.html
[4]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Integrate-Flink-Docker-image-publication-into-Flink-release-process-tp36139.html
[5]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Change-default-for-RocksDB-timers-Java-Heap-in-RocksDB-tp36720.html

Notable Bugs
==

* [FLINK-15577] [1.9.1] When using two different windows within one SQL
query, Flink might generate an invalid physical execution plan, as it
incorrectly considers the two window transformations equivalent. More
details in the ticket. [6]

[6] https://issues.apache.org/jira/browse/FLINK-15577

Events, Blog Posts, Misc
===

* *Dian Fu* is now an Apache Flink Committer. Congratulations! [7]

* *Bird *has published an in-depth blog post on how they use Apache Flink
to detect offline scooters. [8]

* On the Flink Blog, Alexander Fedulov has published the first blog post of a
series on how to implement a *financial fraud detection use case* with
Apache Flink. [9]

* The extended *Call for Presentations for Flink Forward San Francisco*
ends today. Make sure to submit your talk in time. [10,11]

* Still on the fence whether to attend *Flink Forward San Francisco*? Let
me help you with that: when registering use discount code
*FFSF20-MailingList* to get a * 50% discount* on your conference ticket.

* Upcoming Meetups
   * On January 22 my colleague *Alexander Fedulov* will talk about Fraud
Detection with Apache Flink at the Apache Flink Meetup in Madrid. [12]
   * On February 6 *Alexander Fedulov* will talk about Stateful Stream
Processing with Apache Flink at the R-Ladies meetup in Kyiv. [13]

[7]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Dian-Fu-becomes-a-Flink-committer-tp36696p36760.html
[8]
https://medium.com/bird-engineering/replayable-process-functions-in-flink-time-ordering-and-timers-28007a0210e1
[9] https://flink.apache.org/news/2020/01/15/demo-fraud-detection.html
[10]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Flink-Forward-San-Francisco-2020-Call-for-Presentation-extended-tp36595.html
[11] https://www.flink-forward.org/sf-2020
[12]
https://www.meetup.com/Meetup-de-Apache-Flink-en-Madrid/events/267744681/
[13] https://www.meetup.com/rladies-kyiv/events/267948988/

Cheers,

Konstantin (@snntrable)

-- 

Konstantin Knauf | Solutions Architect

+49 160 91394525


Follow us @VervericaData Ververica 


--

Join Flink Forward  - The Apache Flink
Conference

Stream Processing | Event Driven | Real Time

--

Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany

--
Ververica GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji
(Tony) Cheng


[jira] [Created] (FLINK-15669) SQL client can't cancel flink job

2020-01-19 Thread godfrey he (Jira)
godfrey he created FLINK-15669:
--

 Summary: SQL client can't cancel flink job
 Key: FLINK-15669
 URL: https://issues.apache.org/jira/browse/FLINK-15669
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Client
Affects Versions: 1.10.0
Reporter: godfrey he
 Fix For: 1.10.0


In the SQL client, the CLI client cancels a query through the {{void 
cancelQuery(String sessionId, String resultId)}} method in {{Executor}}. 
However, the {{resultId}} is a random UUID, not the job ID, so the CLI client 
can't cancel a running job.


{code:java}
private <C> ResultDescriptor executeQueryInternal(String sessionId, ExecutionContext<C> context, String query) {
	..

	// store the result with a unique id
	final String resultId = UUID.randomUUID().toString();
	resultStore.storeResult(resultId, result);

	..

	// create execution
	final ProgramDeployer deployer = new ProgramDeployer(
		configuration, jobName, pipeline);

	// start result retrieval
	result.startRetrieval(deployer);

	return new ResultDescriptor(
		resultId,
		removeTimeAttributes(table.getSchema()),
		result.isMaterialized());
}

private <C> void cancelQueryInternal(ExecutionContext<C> context, String resultId) {
	..

	// stop Flink job
	try (final ClusterDescriptor<C> clusterDescriptor = context.createClusterDescriptor()) {
		ClusterClient<C> clusterClient = null;
		try {
			// retrieve existing cluster
			clusterClient = clusterDescriptor.retrieve(context.getClusterId()).getClusterClient();
			try {
				clusterClient.cancel(new JobID(StringUtils.hexStringToByte(resultId))).get();
			} catch (Throwable t) {
				// the job might have finished earlier
			}
		} catch (Exception e) {
			throw new SqlExecutionException("Could not retrieve or create a cluster.", e);
		} finally {
			try {
				if (clusterClient != null) {
					clusterClient.close();
				}
			} catch (Exception e) {
				// ignore
			}
		}
	} catch (SqlExecutionException e) {
		throw e;
	} catch (Exception e) {
		throw new SqlExecutionException("Could not locate a cluster.", e);
	}
}
{code}
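One possible direction (a sketch only, with hypothetical names, not the actual fix): remember the real JobID alongside the generated resultId, so that cancellation can look up the job instead of parsing the UUID as a job ID.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Sketch: map the random resultId to the real job id so cancel can find the job (illustrative only). */
public class ResultStoreSketch {
    private final Map<String, String> resultIdToJobId = new HashMap<>();

    String storeResult(String jobId) {
        String resultId = UUID.randomUUID().toString();
        resultIdToJobId.put(resultId, jobId); // remember the real job id, not just the UUID
        return resultId;
    }

    String jobIdFor(String resultId) {
        return resultIdToJobId.get(resultId);
    }

    public static void main(String[] args) {
        ResultStoreSketch store = new ResultStoreSketch();
        String resultId = store.storeResult("a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4");
        // cancellation would now use the stored job id instead of parsing the UUID as one
        System.out.println(store.jobIdFor(resultId));
    }
}
```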







--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Flink 1.6, increment Checkpoint, the shared dir stored the last year checkpoint state

2020-01-19 Thread LakeShen
Hi community,
I have a Flink SQL job and I have set the Flink SQL state retention
time. There are three directories in the Flink checkpoint dir:
1. chk-xx dir
2. shared dir
3. taskowned dir

I find that the shared dir stores checkpoint state from last year. The only
reason I can think of is that the latest
checkpoint retains references to last year's checkpoint state files.
Is there any other reason that could lead to this? Or is it a bug?

Thanks for your reply.

Best wishes,
Lake Shen


Re: [DISCUSS] Active Kubernetes integration phase2

2020-01-19 Thread felixzheng zheng
Hi Yang Wang,

Thanks for your effort on this topic and for inviting me. I am glad to join
the future work on native Kubernetes integration and will try to join the
Slack channel later.

Yang Wang  于2020年1月19日周日 下午6:04写道:

> Hi Canbin Zheng,
>
> I have found that you created some tickets about Flink on Kubernetes. We
> just have the same requirements.
> Maybe we could have more discussion about the use cases and implementation
> details. I have created
> a slack channel[1], please join in if you want.
>
> Any dev users if you have some ideas about the new features. Join in us,
> let's make Flink on Kubernetes
> moving forward together.
>
>
> [1].
>
> https://slack.com/share/ISUKQ2WG5/cXwlw5HZgp3AE5ElE8m580Pk/enQtOTEyNjcwMDk4NTQ5LWZiNTU0ZmJiZTU4MWU1OTk2YjllNmE0OTg2YjIxYjA2YmE1MWFlOTE4NWFhMDBkMzE4NDQzYjk1YmQwMDI2MzU
>
>
> Yang Wang  于2020年1月19日周日 下午5:07写道:
>
> > Hi everyone,
> >
> >
> > Currently Flink supports the resource management system YARN and Mesos.
> > However, they were not
> > designed for fast moving cloud native architectures, and they could not
> > support mixed workloads (e.g. batch,
> > streaming, deep learning, web services, etc.) relatively well. At the
> same
> > time, Kubernetes is evolving very
> > fast to fill those gaps and become the de-facto orchestration framework.
> > So running Flink on Kubernetes is
> > a very basic requirement for many users.
> >
> >
> > At least, we have the following advantages when natively running Flink on
> > Kubernetes.
> > * Flink KubernetesResourceManager will allocate TaskManager pods
> > dynamically based on the resource
> > requirement of the jobs.
> > * Using Flink bundled scripts to start/stop session cluster on
> > Kuberenetes. Do not need external tools anymore.
> > * Compared with Yarn deployment, different Flink clusters could get
> better
> > isolation by leveraging the ability
> > of Kubernetes.
> >
> >
> > Recently, i also find more and more uses are very interested in running
> > Flink on Kubernetes natively.
> > The community has already made some efforts[1] and will be released in
> > 1.10. Welcome to have a taste and
> > give us your feedback. However, it is a basic requirement and we still
> > need many features before production.
> > So i want to start this discussion to collect the requirements that you
> > have came across. Feel free to share
> > your valuable thoughts. We will try to conclude and create sub tasks in
> > the umbrella ticket[2]. Also i will move
> > some existing tickets there for easier tracking them.
> >
> >
> > Best,
> > Yang
> >
> > [1].
> >
> https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/native_kubernetes.html
> > [2]. https://issues.apache.org/jira/browse/FLINK-14460
> >
>


Re: [DISCUSS] Active Kubernetes integration phase2

2020-01-19 Thread Yang Wang
Hi Canbin Zheng,

I have found that you created some tickets about Flink on Kubernetes. We
have the same requirements, so maybe we could discuss the use cases and
implementation details further. I have created a Slack channel[1]; please
join if you want.

To any dev users who have ideas about the new features: join us, and let's
move Flink on Kubernetes forward together.


[1].
https://slack.com/share/ISUKQ2WG5/cXwlw5HZgp3AE5ElE8m580Pk/enQtOTEyNjcwMDk4NTQ5LWZiNTU0ZmJiZTU4MWU1OTk2YjllNmE0OTg2YjIxYjA2YmE1MWFlOTE4NWFhMDBkMzE4NDQzYjk1YmQwMDI2MzU


Yang Wang  于2020年1月19日周日 下午5:07写道:

> Hi everyone,
>
>
> Currently Flink supports the resource management system YARN and Mesos.
> However, they were not
> designed for fast moving cloud native architectures, and they could not
> support mixed workloads (e.g. batch,
> streaming, deep learning, web services, etc.) relatively well. At the same
> time, Kubernetes is evolving very
> fast to fill those gaps and become the de-facto orchestration framework.
> So running Flink on Kubernetes is
> a very basic requirement for many users.
>
>
> At least, we have the following advantages when natively running Flink on
> Kubernetes.
> * Flink KubernetesResourceManager will allocate TaskManager pods
> dynamically based on the resource
> requirement of the jobs.
> * Using Flink bundled scripts to start/stop session cluster on
> Kuberenetes. Do not need external tools anymore.
> * Compared with Yarn deployment, different Flink clusters could get better
> isolation by leveraging the ability
> of Kubernetes.
>
>
> Recently, i also find more and more uses are very interested in running
> Flink on Kubernetes natively.
> The community has already made some efforts[1] and will be released in
> 1.10. Welcome to have a taste and
> give us your feedback. However, it is a basic requirement and we still
> need many features before production.
> So i want to start this discussion to collect the requirements that you
> have came across. Feel free to share
> your valuable thoughts. We will try to conclude and create sub tasks in
> the umbrella ticket[2]. Also i will move
> some existing tickets there for easier tracking them.
>
>
> Best,
> Yang
>
> [1].
> https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/native_kubernetes.html
> [2]. https://issues.apache.org/jira/browse/FLINK-14460
>


[jira] [Created] (FLINK-15668) Remove AlsoRunWithLegacyScheduler

2020-01-19 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-15668:
---

 Summary: Remove AlsoRunWithLegacyScheduler
 Key: FLINK-15668
 URL: https://issues.apache.org/jira/browse/FLINK-15668
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination, Tests
Affects Versions: 1.11.0
Reporter: Zhu Zhu
Assignee: Zhu Zhu
 Fix For: 1.11.0


AlsoRunWithLegacyScheduler is used to enable IT cases for the legacy scheduler.
This task is to remove it, as well as the related test stages, from Travis.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15667) Hadoop Configurations Mount Support

2020-01-19 Thread Canbin Zheng (Jira)
Canbin Zheng created FLINK-15667:


 Summary: Hadoop Configurations Mount Support
 Key: FLINK-15667
 URL: https://issues.apache.org/jira/browse/FLINK-15667
 Project: Flink
  Issue Type: New Feature
  Components: Deployment / Kubernetes
Reporter: Canbin Zheng


Mount the Hadoop configuration, either as a pre-defined ConfigMap or as a 
local configuration directory on the Pod.
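A minimal sketch of the ConfigMap-based variant, assuming the Hadoop configuration lives in a local directory; the function and ConfigMap names here are illustrative assumptions, not Flink's actual implementation:

```python
import os

def build_hadoop_configmap(conf_dir, name="hadoop-config"):
    """Build a Kubernetes ConfigMap manifest (as a plain dict) from a local
    Hadoop conf directory. Each file (e.g. core-site.xml, hdfs-site.xml)
    becomes one data entry, so the ConfigMap can later be mounted as a
    configuration directory inside the pod."""
    data = {}
    for fname in sorted(os.listdir(conf_dir)):
        path = os.path.join(conf_dir, fname)
        if os.path.isfile(path):
            with open(path) as f:
                data[fname] = f.read()
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": data,
    }
```

The resulting dict could be serialized to YAML and applied with kubectl, or submitted through a Kubernetes client library.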



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[DISCUSS] Active Kubernetes integration phase2

2020-01-19 Thread Yang Wang
 Hi everyone,


Currently, Flink supports the resource management systems YARN and Mesos.
However, they were not
designed for fast-moving cloud-native architectures, and they do not
support mixed workloads (e.g. batch,
streaming, deep learning, web services, etc.) particularly well. At the same
time, Kubernetes is evolving very
fast to fill those gaps and has become the de-facto orchestration framework. So
running Flink on Kubernetes is
a very basic requirement for many users.


At a minimum, we gain the following advantages when natively running Flink on
Kubernetes:
* The Flink KubernetesResourceManager allocates TaskManager pods
dynamically based on the resource
requirements of the jobs.
* Flink's bundled scripts can be used to start/stop a session cluster on Kubernetes,
with no need for external tools anymore.
* Compared with a YARN deployment, different Flink clusters get better
isolation by leveraging the capabilities
of Kubernetes.
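The first advantage above — requesting only as many TaskManager pods as the jobs' resource needs imply — can be sketched roughly as follows; the function name and the slots-per-TaskManager parameter are illustrative assumptions, not the actual KubernetesResourceManager logic:

```python
import math

def required_taskmanager_pods(required_slots, slots_per_taskmanager, running_pods=0):
    """Return how many additional TaskManager pods to request so that the
    cluster offers at least `required_slots` task slots.

    Captures the idea of on-demand allocation: only the shortfall between
    the slots the jobs need and the slots already provided is requested.
    """
    if slots_per_taskmanager <= 0:
        raise ValueError("slots_per_taskmanager must be positive")
    needed_pods = math.ceil(required_slots / slots_per_taskmanager)
    # Never scale below zero; excess pods are released separately.
    return max(0, needed_pods - running_pods)
```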


Recently, I have also found that more and more users are very interested in
running Flink on Kubernetes natively.
The community has already made some efforts[1], which will be released in
1.10. You are welcome to try it out and
give us your feedback. However, this covers only the basic requirements, and we
still need many features before it is ready for production.
So I want to start this discussion to collect the requirements that you
have come across. Feel free to share
your valuable thoughts. We will try to draw conclusions and create sub-tasks in the
umbrella ticket[2]. I will also move
some existing tickets there to make them easier to track.


Best,
Yang

[1].
https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/native_kubernetes.html
[2]. https://issues.apache.org/jira/browse/FLINK-14460


[jira] [Created] (FLINK-15666) GPU scheduling support in Kubernetes mode

2020-01-19 Thread Canbin Zheng (Jira)
Canbin Zheng created FLINK-15666:


 Summary: GPU scheduling support in Kubernetes mode
 Key: FLINK-15666
 URL: https://issues.apache.org/jira/browse/FLINK-15666
 Project: Flink
  Issue Type: New Feature
  Components: Deployment / Kubernetes
Reporter: Canbin Zheng


This is an umbrella ticket for work on GPU scheduling in Kubernetes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15665) External shuffle service support in Kubernetes mode

2020-01-19 Thread Canbin Zheng (Jira)
Canbin Zheng created FLINK-15665:


 Summary: External shuffle service support in Kubernetes mode
 Key: FLINK-15665
 URL: https://issues.apache.org/jira/browse/FLINK-15665
 Project: Flink
  Issue Type: New Feature
  Components: Deployment / Kubernetes
Reporter: Canbin Zheng


This is an umbrella ticket for work on a Kubernetes-specific external shuffle 
service.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15664) Flink plan visualizer does not show operator information completely when lines wrap

2020-01-19 Thread Yun Gao (Jira)
Yun Gao created FLINK-15664:
---

 Summary: Flink plan visualizer does not show operator information 
completely when lines wrap
 Key: FLINK-15664
 URL: https://issues.apache.org/jira/browse/FLINK-15664
 Project: Flink
  Issue Type: Bug
  Components: Project Website
Affects Versions: 1.9.1, 1.9.0
Reporter: Yun Gao
 Attachments: bad_display.png

When lines wrap in a diagram created by the [Flink plan 
visualizer|https://flink.apache.org/visualizer/], the operator information 
is not shown completely, as shown in the attached image.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15663) Kerberos Support in Kubernetes Deploy Mode

2020-01-19 Thread Canbin Zheng (Jira)
Canbin Zheng created FLINK-15663:


 Summary: Kerberos Support in Kubernetes Deploy Mode
 Key: FLINK-15663
 URL: https://issues.apache.org/jira/browse/FLINK-15663
 Project: Flink
  Issue Type: New Feature
  Components: Deployment / Kubernetes
Reporter: Canbin Zheng


This is the umbrella issue for all Kerberos-related tasks relating to 
Flink on Kubernetes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15662) Make K8s client timeouts configurable

2020-01-19 Thread Canbin Zheng (Jira)
Canbin Zheng created FLINK-15662:


 Summary: Make K8s client timeouts configurable
 Key: FLINK-15662
 URL: https://issues.apache.org/jira/browse/FLINK-15662
 Project: Flink
  Issue Type: Improvement
  Components: Deployment / Kubernetes
Reporter: Canbin Zheng


The Kubernetes clients used for client-side submission and for requesting 
worker pods should have configurable read and connect timeouts.
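A sketch of how such configurable timeouts might be resolved, falling back to hard-coded defaults only when the user sets nothing; the option keys and default values below are hypothetical, not actual Flink configuration options:

```python
# Hypothetical option keys and defaults; Flink's real configuration keys may differ.
DEFAULT_CONNECT_TIMEOUT_MS = 10_000
DEFAULT_READ_TIMEOUT_MS = 30_000

def resolve_client_timeouts(config):
    """Resolve (connect_timeout_ms, read_timeout_ms) for the Kubernetes
    client from a flat config dict, using defaults for unset options."""
    connect = int(config.get("kubernetes.client.connect-timeout-ms",
                             DEFAULT_CONNECT_TIMEOUT_MS))
    read = int(config.get("kubernetes.client.read-timeout-ms",
                          DEFAULT_READ_TIMEOUT_MS))
    if connect <= 0 or read <= 0:
        raise ValueError("timeouts must be positive")
    return connect, read
```

The resolved pair would then be passed to whatever HTTP client the Kubernetes client implementation uses.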



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15661) JobManagerHAProcessFailureRecoveryITCase.testDispatcherProcessFailure failed because of Could not find Flink job

2020-01-19 Thread Congxian Qiu(klion26) (Jira)
Congxian Qiu(klion26) created FLINK-15661:
-

 Summary: 
JobManagerHAProcessFailureRecoveryITCase.testDispatcherProcessFailure failed 
because of Could not find Flink job 
 Key: FLINK-15661
 URL: https://issues.apache.org/jira/browse/FLINK-15661
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.11.0
Reporter: Congxian Qiu(klion26)


2020-01-19T06:25:02.3856954Z [ERROR] 
JobManagerHAProcessFailureRecoveryITCase.testDispatcherProcessFailure:347 The 
program encountered a ExecutionException : 
org.apache.flink.runtime.rest.util.RestClientException: 
[org.apache.flink.runtime.rest.handler.RestHandlerException: 
org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find 
Flink job (47fe3e8df0e59994938485f683d1410e)
 2020-01-19T06:25:02.3857171Z at 
org.apache.flink.runtime.rest.handler.job.JobExecutionResultHandler.propagateException(JobExecutionResultHandler.java:91)
 2020-01-19T06:25:02.3857571Z at 
org.apache.flink.runtime.rest.handler.job.JobExecutionResultHandler.lambda$handleRequest$1(JobExecutionResultHandler.java:82)
 2020-01-19T06:25:02.3857866Z at 
java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
 2020-01-19T06:25:02.3857982Z at 
java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
 2020-01-19T06:25:02.3859852Z at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
 2020-01-19T06:25:02.3860440Z at 
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
 2020-01-19T06:25:02.3860732Z at 
org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:872)
 2020-01-19T06:25:02.3860960Z at 
akka.dispatch.OnComplete.internal(Future.scala:263)
 2020-01-19T06:25:02.3861099Z at 
akka.dispatch.OnComplete.internal(Future.scala:261)
 2020-01-19T06:25:02.3861232Z at 
akka.dispatch.japi$CallbackBridge.apply(Future.scala:191)
 2020-01-19T06:25:02.3861391Z at 
akka.dispatch.japi$CallbackBridge.apply(Future.scala:188)
 2020-01-19T06:25:02.3861546Z at 
scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
 2020-01-19T06:25:02.3861712Z at 
org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:74)
 2020-01-19T06:25:02.3861809Z at 
scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
 2020-01-19T06:25:02.3861916Z at 
scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
 2020-01-19T06:25:02.3862221Z at 
akka.pattern.PromiseActorRef.$bang(AskSupport.scala:572)
 2020-01-19T06:25:02.3862475Z at 
akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:23)
 2020-01-19T06:25:02.3862626Z at 
akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:21)
 2020-01-19T06:25:02.3862736Z at 
scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:436)
 2020-01-19T06:25:02.3862820Z at 
scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:435)
 2020-01-19T06:25:02.3867146Z at 
scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
 2020-01-19T06:25:02.3867318Z at 
akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
 2020-01-19T06:25:02.3867441Z at 
akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
 2020-01-19T06:25:02.3867552Z at 
akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
 2020-01-19T06:25:02.3867664Z at 
akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
 2020-01-19T06:25:02.3867763Z at 
scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
 2020-01-19T06:25:02.3867843Z at 
akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)
 2020-01-19T06:25:02.3867936Z at 
akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
 2020-01-19T06:25:02.3868036Z at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:44)
 2020-01-19T06:25:02.3868145Z at 
akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
 2020-01-19T06:25:02.3868223Z at 
akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
 2020-01-19T06:25:02.3868313Z at 
akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
 2020-01-19T06:25:02.3868390Z at 
akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
 2020-01-19T06:25:02.3868520Z Caused by: 
java.util.concurrent.CompletionException: 
org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find 
Flink job (47fe3e8df0e59994938485f683d1410e)
 2020-01-19T06:25:02.3868625Z at 
org.apache.flink.runtime.dispatcher.Dispatcher.lambda$requestJobStatus$17(Dispatcher.java:516)