Re: Flink Query Optimizer

2018-07-13 Thread Xingcan Cui
Hi Albert,

Calcite provides a rule-based optimizer (as a framework), which means users can 
customize it by adding rules. That’s exactly what Flink does. From the logical 
plan to the physical plan, the translations are triggered by different sets of 
rules, according to which the relational expressions are replaced, reordered, or 
optimized.

However, IMO, the current optimization rules in the Flink Table API are quite 
primitive. Some SQL statements (e.g., multiple joins) are just translated to 
feasible execution plans rather than optimized ones, since it’s much more 
difficult to perform query optimization on large datasets or dynamic streams. 
You could start by studying the Calcite query optimizer, and then try writing 
your own rules.
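To make the rule-based approach concrete, here is a self-contained sketch in Python (the plan encoding and the single filter-pushdown rule are invented for illustration; Calcite's real rules operate on RelNode trees): the optimizer repeatedly tries each rule against the plan tree and rewrites until no rule matches.

```python
# Conceptual sketch of a rule-based optimizer (NOT Calcite's or Flink's API).
# Plans are tuples: ("scan", name), ("filter", pred, child), ("join", l, r).

def push_filter_below_join(node):
    """Rewrite Filter(Join(l, r)) -> Join(Filter(l), r) when the predicate
    only references the left input (the check is simplified to a flag)."""
    if node[0] == "filter" and node[2][0] == "join":
        pred = node[1]
        _, left, right = node[2]
        if pred["side"] == "left":  # simplified applicability check
            return ("join", ("filter", pred, left), right)
    return None  # rule does not match here

def apply_topdown(node, rule):
    """Try the rule at this node, then recurse into children.
    Returns the rewritten tree, or None if nothing matched."""
    rewritten = rule(node)
    if rewritten is not None:
        return rewritten
    if node[0] == "filter":
        child = apply_topdown(node[2], rule)
        if child is not None:
            return ("filter", node[1], child)
    elif node[0] == "join":
        for i in (1, 2):
            child = apply_topdown(node[i], rule)
            if child is not None:
                return node[:i] + (child,) + node[i + 1:]
    return None

def optimize(plan, rules):
    """Apply rules until a fixed point: no rule matches anywhere."""
    changed = True
    while changed:
        changed = False
        for rule in rules:
            result = apply_topdown(plan, rule)
            if result is not None:
                plan, changed = result, True
    return plan
```

Customizing the optimizer then just means extending the rule list, which is essentially what Flink's FlinkRuleSets does for Calcite.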

Best,
Xingcan

> On Jul 14, 2018, at 11:55 AM, vino yang  wrote:
> 
> Hi Albert,
> 
> First, I guess the query optimizer you mentioned is the one for the Flink Table & SQL
> API (the batch API has a separate optimizer implemented by Flink itself).
> 
> Yes, for Table & SQL, Flink uses Apache Calcite's query optimizer: queries are
> translated into a Calcite plan,
> which is then optimized according to Calcite's optimization rules.
> 
> The following rules are applied so far:
> https://github.com/apache/flink/blob/master/flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/rules/FlinkRuleSets.scala
> 
> Since Flink depends on Calcite for the optimization, I think enhancing both
> Flink and Calcite would be the right direction.
> 
> We hope you can provide more ideas and details. The Flink community welcomes
> your ideas and contributions.
> 
> Thanks.
> Vino.
> 
> 
> 2018-07-13 23:39 GMT+08:00 Albert Jonathan :
> 
>> Hello,
>> 
>> I am just wondering, does Flink use Apache Calcite's query optimizer to
>> generate an optimal logical plan, or does it have its own query optimizer?
>> From what I observed so far, Flink's query optimizer only groups
>> operators together without changing the order of aggregation operators
>> (e.g., join). Did I miss anything?
>> 
>> I am thinking of extending Flink to apply query optimization as in the
>> RDBMS by either integrating it with Calcite or implementing it as a new
>> module.
>> Any feedback or guidelines will be highly appreciated.
>> 
>> Thank you,
>> Albert
>> 



Re: Flink Query Optimizer

2018-07-13 Thread vino yang
Hi Albert,

First, I guess the query optimizer you mentioned is the one for the Flink Table & SQL
API (the batch API has a separate optimizer implemented by Flink itself).

Yes, for Table & SQL, Flink uses Apache Calcite's query optimizer: queries are
translated into a Calcite plan,
which is then optimized according to Calcite's optimization rules.

The following rules are applied so far:
https://github.com/apache/flink/blob/master/flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/rules/FlinkRuleSets.scala

Since Flink depends on Calcite for the optimization, I think enhancing both
Flink and Calcite would be the right direction.

We hope you can provide more ideas and details. The Flink community welcomes
your ideas and contributions.

Thanks.
Vino.


2018-07-13 23:39 GMT+08:00 Albert Jonathan :

> Hello,
>
> I am just wondering, does Flink use Apache Calcite's query optimizer to
> generate an optimal logical plan, or does it have its own query optimizer?
> From what I observed so far, Flink's query optimizer only groups
> operators together without changing the order of aggregation operators
> (e.g., join). Did I miss anything?
>
> I am thinking of extending Flink to apply query optimization as in the
> RDBMS by either integrating it with Calcite or implementing it as a new
> module.
> Any feedback or guidelines will be highly appreciated.
>
> Thank you,
> Albert
>


[jira] [Created] (FLINK-9849) Upgrade hbase version to 2.0.1 for hbase connector

2018-07-13 Thread Ted Yu (JIRA)
Ted Yu created FLINK-9849:
-

 Summary: Upgrade hbase version to 2.0.1 for hbase connector
 Key: FLINK-9849
 URL: https://issues.apache.org/jira/browse/FLINK-9849
 Project: Flink
  Issue Type: Improvement
Reporter: Ted Yu


Currently, hbase 1.4.3 is used for the hbase connector.

We should upgrade to 2.0.1, which was recently released.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [ANNOUNCE] Apache Flink 1.5.1 released

2018-07-13 Thread Yaz Sh
Excellent!
Thanks to Chesnay, great work!

Yaz

> On Jul 13, 2018, at 8:37 AM, Dawid Wysakowicz  wrote:
> 
> Good job everyone and Chesnay for being the release manager!
> 
> 
> On 13/07/18 14:34, Hequn Cheng wrote:
>> Cool, thanks to Chesnay!
>> 
>> Best, Hequn
>> 
>> On Fri, Jul 13, 2018 at 8:25 PM, vino yang wrote:
>> 
>>Thanks Chesnay, great job!
>> 
>>Thanks,
>>Vino
>> 
>>    2018-07-13 20:20 GMT+08:00 Till Rohrmann:
>> 
>>Great to hear. Big thank you to the community for the hard
>>work and to Chesnay for being our release manager.
>> 
>>Cheers,
>>Till
>> 
>>        On Fri, Jul 13, 2018 at 12:05 PM Chesnay Schepler wrote:
>> 
>>The Apache Flink community is very happy to announce the
>>release of Apache Flink 1.5.1, which is the first bugfix
>>release for the Apache Flink 1.5 series.
>> 
>>Apache Flink® is an open-source stream processing
>>framework for distributed, high-performing,
>>always-available, and accurate data streaming applications.
>> 
>>The release is available for download at:
>>https://flink.apache.org/downloads.html
>> 
>> 
>>Please check out the release blog post for an overview of
>>the improvements for this bugfix release:
>>http://flink.apache.org/news/2018/07/12/release-1.5.1.html
>>
>> 
>>The full release notes are available in Jira:
>>
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12343053
>>
>> 
>> 
>>We would like to thank all contributors of the Apache
>>Flink community who made this release possible!
>> 
>>PS: A regression was identified where the CLI job
>>submission fails when SSL is enabled. (FLINK-9842)
>> 
>>Regards,
>>Chesnay
>> 
>> 
>> 
> 
> 



Re: [VOTE] Release 1.5.1, release candidate #3

2018-07-13 Thread Yaz Sh
Hi Till,

Thanks for reply!

For version 1.4.x, when Parallelism > available task slots has been selected, 
Flink throws the below error immediately, as you said:

NoResourceAvailableException: Not enough free slots available to run the job

But for version 1.5.x there are two different behaviors: sometimes the job ran 
successfully, and sometimes it threw an error after 5 minutes with this message:

org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: 
Could not allocate all requires slots within timeout of 30 ms. Slots 
required: 5, slots allocated: 2


I think it might be a bug, so I created this ticket to keep track of it:
https://issues.apache.org/jira/browse/FLINK-9848


Please let me know if you think otherwise.


Cheers,
Yazdan 

> On Jul 12, 2018, at 3:54 AM, Till Rohrmann  wrote:
> 
> Hi Yazdan,
> 
> that is the expected behaviour. If the system cannot allocate enough slots
> it will fail with a NoResourceAvailableException after
> the slot.request.timeout has been exceeded.
> 
> Cheers,
> Till
> 
> On Wed, Jul 11, 2018 at 7:44 PM Yaz Sh wrote:
> 
>> +1
>> 
>> - Verified the signatures for all binary artifacts
>> - Verified checksums for all binary packages
>> - Ran local cluster with no errors in logs and empty *.out
>> - Stopped local cluster with no errors in logs
>> - Ran multiple batch and streaming examples via WebUI
>> - Ran multiple batch and streaming examples via CLI
>> - Increased number of task managers and ran examples with Parallelism > 1
>> - Ran WebUI on multiple browsers
>> - Checked Examples folder for all binary packages
>> 
>> Just an observation:
>> When running a job with Parallelism > available task slots, intermittently the
>> job stays in “Running" status for a very long time and neither finishes nor
>> throws any errors.
>> 
>> Please check if someone else can reproduce it.
>> 
>> Cheers,
>> Yazdan
>> 
>> 
>>> On Jul 11, 2018, at 11:21 AM, Till Rohrmann 
>> wrote:
>>> 
>>> +1 (binding)
>>> 
>>> - Verified the signatures of all binary artifacts
>>> - Verified that no new dependencies were added for which the LICENSE and
>> NOTICE files need to be adapted.
>>> - Build 1.5.1 from the source artifact
>>> - Run flink-end-to-end tests for 12 hours for the 1.5.1 Hadoop 2.7
>> binary artifact
>>> - Run Jepsen tests for 12 hours for the 1.5.1 Hadoop 2.8 binary artifact
>>> 
>>> Cheers,
>>> Till
>>> 
>>> 
>>> On Wed, Jul 11, 2018 at 9:49 AM Chesnay Schepler wrote:
>>> Correction on my part, it does affect all packages.
>>> 
>>> I've also found the cause. To speed up the process I only built modules
>>> that flink-dist depends on (see FLINK-9768). However flink-dist depends
>>> on neither flink-examples-batch nor flink-examples-streaming, yet
>>> happily accesses their target directory. The existing build process only
>>> worked since _by chance_ these 2 modules are built before flink-dist
>>> when doing a complete build.
>>> 
>>> I will rebuild the binaries (I don't think we have to cancel the RC for
>>> this) and open a JIRA to fix the dependencies.
>>> 
>>> On 11.07.2018 09:27, Chesnay Schepler wrote:
 oh, the packages that include hadoop are really missing it...
 
 On 11.07.2018 09:25, Chesnay Schepler wrote:
> @Yaz which binary package did you check? I looked into the
> hadoop-free package and the folder is there.
> 
> Did you maybe encounter an error when extracting the package?
> 
> On 11.07.2018 05:44, Yaz Sh wrote:
>> -1
>> 
>> ./examples/streaming folder is missing in binary packages
>> 
>> 
>> Cheers,
>> Yazdan
>> 
>>> On Jul 10, 2018, at 9:57 PM, vino yang wrote:
>>> 
>>> +1
>>> reviewed [1], [4] and [6]
>>> 
>>> 2018-07-11 3:10 GMT+08:00 Chesnay Schepler:
>>> 
 Hi everyone,
 Please review and vote on the release candidate #3 for the version
 1.5.1,
 as follows:
 [ ] +1, Approve the release
 [ ] -1, Do not approve the release (please provide specific
>> comments)
 
 
 The complete staging area is available for your review, which
 includes:
 * JIRA release notes [1],
 * the official Apache source release and binary convenience
 releases to be
 deployed to dist.apache.org [2], which
>> are signed with the key with
 fingerprint 11D464BA [3],
 * all artifacts to be deployed to the Maven Central Repository [4],
 * source code tag "release-1.5.1-rc3" [5],
 * website pull request listing the new release and adding
 announcement

[jira] [Created] (FLINK-9848) When Parallelism more than available task slots Flink stays idle

2018-07-13 Thread Yazdan Shirvany (JIRA)
Yazdan Shirvany created FLINK-9848:
--

 Summary: When Parallelism more than available task slots Flink 
stays idle
 Key: FLINK-9848
 URL: https://issues.apache.org/jira/browse/FLINK-9848
 Project: Flink
  Issue Type: Bug
  Components: Cluster Management
Affects Versions: 1.5.1, 1.5.0
Reporter: Yazdan Shirvany
Assignee: Yazdan Shirvany
 Attachments: idleJob.jpg

For version 1.4.x, when selecting Parallelism > available task slots, Flink throws 
the below error:
NoResourceAvailableException: Not enough free slots available to run the job
 
But for version 1.5.x there are two different behaviors: sometimes the job ran 
successfully, and sometimes it stayed in "Running" status for a long time (I waited 
more than 3 minutes) and never threw an error.
 
In the below image, available task slots is 1 and the job has been submitted with 
Parallelism = 4.
Flink Version 1.5.1 !idleJob.jpg!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Flink Query Optimizer

2018-07-13 Thread Albert Jonathan
Hello,

I am just wondering, does Flink use Apache Calcite's query optimizer to
generate an optimal logical plan, or does it have its own query optimizer?
From what I observed so far, Flink's query optimizer only groups
operators together without changing the order of aggregation operators
(e.g., join). Did I miss anything?

I am thinking of extending Flink to apply query optimization as in the
RDBMS by either integrating it with Calcite or implementing it as a new
module.
Any feedback or guidelines will be highly appreciated.

Thank you,
Albert


[jira] [Created] (FLINK-9847) OneInputStreamTaskTest.testWatermarksNotForwardedWithinChainWhenIdle unstable

2018-07-13 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-9847:


 Summary: 
OneInputStreamTaskTest.testWatermarksNotForwardedWithinChainWhenIdle unstable
 Key: FLINK-9847
 URL: https://issues.apache.org/jira/browse/FLINK-9847
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.6.0
Reporter: Till Rohrmann
 Fix For: 1.6.0


The test 
{{OneInputStreamTaskTest.testWatermarksNotForwardedWithinChainWhenIdle}} is 
unstable. When executed repeatedly, the test fails from time to time with 
{code}
java.lang.Exception: error in task

at 
org.apache.flink.streaming.runtime.tasks.StreamTaskTestHarness.waitForTaskCompletion(StreamTaskTestHarness.java:250)
at 
org.apache.flink.streaming.runtime.tasks.StreamTaskTestHarness.waitForTaskCompletion(StreamTaskTestHarness.java:233)
at 
org.apache.flink.streaming.runtime.tasks.OneInputStreamTaskTest.testWatermarksNotForwardedWithinChainWhenIdle(OneInputStreamTaskTest.java:348)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
at 
com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:67)
at 
com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
at 
com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
Caused by: org.mockito.exceptions.misusing.WrongTypeOfReturnValue: 
Boolean cannot be returned by getChannelIndex()
getChannelIndex() should return int
***
If you're unsure why you're getting above error read on.
Due to the nature of the syntax above problem might occur because:
1. This exception *might* occur in wrongly written multi-threaded tests.
   Please refer to Mockito FAQ on limitations of concurrency testing.
2. A spy is stubbed using when(spy.foo()).then() syntax. It is safer to stub 
spies - 
   - with doReturn|Throw() family of methods. More in javadocs for 
Mockito.spy() method.

at 
org.apache.flink.streaming.runtime.io.BarrierBuffer.getNextNonBlocked(BarrierBuffer.java:165)
at 
org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:209)
at 
org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:105)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
at 
org.apache.flink.streaming.runtime.tasks.StreamTaskTestHarness$TaskThread.run(StreamTaskTestHarness.java:437)
{code}

Given the exception I suspect that there is a problem with mocking in the 
{{OneInputStreamTaskTestHarness}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-9846) Add a Kafka table sink factory

2018-07-13 Thread Timo Walther (JIRA)
Timo Walther created FLINK-9846:
---

 Summary: Add a Kafka table sink factory
 Key: FLINK-9846
 URL: https://issues.apache.org/jira/browse/FLINK-9846
 Project: Flink
  Issue Type: New Feature
  Components: Table API & SQL
Reporter: Timo Walther
Assignee: Timo Walther


FLINK-8866 implements a unified way of creating sinks and uses format 
discovery to search for formats (FLINK-8858). It is now possible to add a 
Kafka table sink factory for the streaming environment that uses the new interfaces.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-9845) Make InternalTimerService's timer processing interruptible/abortable

2018-07-13 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-9845:


 Summary: Make InternalTimerService's timer processing 
interruptible/abortable
 Key: FLINK-9845
 URL: https://issues.apache.org/jira/browse/FLINK-9845
 Project: Flink
  Issue Type: Improvement
  Components: State Backends, Checkpointing
Affects Versions: 1.5.1, 1.6.0
Reporter: Till Rohrmann
 Fix For: 1.7.0


When cancelling a {{Task}}, the task thread might currently be processing the timers 
registered at the {{InternalTimerService}}. Depending on the timer action, this 
might take a while and, thus, block the cancellation of the {{Task}}. In the 
most extreme case, the {{TaskCancelerWatchDog}} kicks in and kills the whole 
{{TaskManager}} process.

In order to alleviate the problem (speed up the cancellation reaction), we 
should make the processing of the timers interruptible/abortable. This means 
that instead of processing all timers we should check in between timers whether 
the {{Task}} is currently being cancelled or not. If this is the case, then we 
should directly stop processing the remaining timers and return.
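The proposed check-between-timers behavior can be sketched like this (a conceptual Python sketch with invented names, not Flink's actual InternalTimerService API): instead of draining the whole timer queue, the service checks a cancellation flag before firing each timer and aborts early.

```python
import threading

class TimerService:
    """Minimal sketch (NOT Flink's InternalTimerService): process registered
    timer callbacks, but check a cancellation flag between timers so that a
    long timer queue cannot block task cancellation."""

    def __init__(self):
        self.cancelled = threading.Event()  # set when the task is cancelled
        self.timers = []                    # queue of timer callbacks

    def register(self, callback):
        self.timers.append(callback)

    def process_timers(self):
        """Fire timers in order, aborting between timers on cancellation.
        Returns how many timers actually fired."""
        fired = 0
        for timer in self.timers:
            # Abort between timers instead of draining the whole queue.
            if self.cancelled.is_set():
                break
            timer()
            fired += 1
        return fired
```

The key design point is that the check happens between timer invocations, so a single long-running timer action is still not interrupted, but the remaining queue is skipped as soon as cancellation is observed.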



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [ANNOUNCE] Apache Flink 1.5.1 released

2018-07-13 Thread Dawid Wysakowicz
Good job everyone and Chesnay for being the release manager!


On 13/07/18 14:34, Hequn Cheng wrote:
> Cool, thanks to Chesnay!
>
> Best, Hequn
>
> On Fri, Jul 13, 2018 at 8:25 PM, vino yang wrote:
>
> Thanks Chesnay, great job!
>
> Thanks,
> Vino
>
> 2018-07-13 20:20 GMT+08:00 Till Rohrmann:
>
> Great to hear. Big thank you to the community for the hard
> work and to Chesnay for being our release manager.
>
> Cheers,
> Till
>
> On Fri, Jul 13, 2018 at 12:05 PM Chesnay Schepler wrote:
>
> The Apache Flink community is very happy to announce the
> release of Apache Flink 1.5.1, which is the first bugfix
> release for the Apache Flink 1.5 series.
>
> Apache Flink® is an open-source stream processing
> framework for distributed, high-performing,
> always-available, and accurate data streaming applications.
>
> The release is available for download at:
> https://flink.apache.org/downloads.html
>  
>
> Please check out the release blog post for an overview of
> the improvements for this bugfix release:
> http://flink.apache.org/news/2018/07/12/release-1.5.1.html
> 
>
> The full release notes are available in Jira:
> 
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12343053
> 
> 
>
> We would like to thank all contributors of the Apache
> Flink community who made this release possible!
>
> PS: A regression was identified where the CLI job
> submission fails when SSL is enabled. (FLINK-9842)
>
> Regards,
> Chesnay
>
>
>




Re: [ANNOUNCE] Apache Flink 1.5.1 released

2018-07-13 Thread Hequn Cheng
Cool, thanks to Chesnay!

Best, Hequn

On Fri, Jul 13, 2018 at 8:25 PM, vino yang  wrote:

> Thanks Chesnay, great job!
>
> Thanks,
> Vino
>
> 2018-07-13 20:20 GMT+08:00 Till Rohrmann :
>
>> Great to hear. Big thank you to the community for the hard work and to
>> Chesnay for being our release manager.
>>
>> Cheers,
>> Till
>>
>> On Fri, Jul 13, 2018 at 12:05 PM Chesnay Schepler 
>> wrote:
>>
>>> The Apache Flink community is very happy to announce the release of
>>> Apache Flink 1.5.1, which is the first bugfix release for the Apache Flink
>>> 1.5 series.
>>>
>>> Apache Flink® is an open-source stream processing framework for
>>> distributed, high-performing, always-available, and accurate data streaming
>>> applications.
>>>
>>> The release is available for download at:
>>> https://flink.apache.org/downloads.html
>>>
>>> Please check out the release blog post for an overview of the
>>> improvements for this bugfix release:
>>> http://flink.apache.org/news/2018/07/12/release-1.5.1.html
>>>
>>> The full release notes are available in Jira:
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?proje
>>> ctId=12315522&version=12343053
>>>
>>> We would like to thank all contributors of the Apache Flink community
>>> who made this release possible!
>>>
>>> PS: A regression was identified where the CLI job submission fails when
>>> SSL is enabled. (FLINK-9842)
>>>
>>> Regards,
>>> Chesnay
>>>
>>>
>


Re: [ANNOUNCE] Apache Flink 1.5.1 released

2018-07-13 Thread vino yang
Thanks Chesnay, great job!

Thanks,
Vino

2018-07-13 20:20 GMT+08:00 Till Rohrmann :

> Great to hear. Big thank you to the community for the hard work and to
> Chesnay for being our release manager.
>
> Cheers,
> Till
>
> On Fri, Jul 13, 2018 at 12:05 PM Chesnay Schepler 
> wrote:
>
>> The Apache Flink community is very happy to announce the release of
>> Apache Flink 1.5.1, which is the first bugfix release for the Apache Flink
>> 1.5 series.
>>
>> Apache Flink® is an open-source stream processing framework for
>> distributed, high-performing, always-available, and accurate data streaming
>> applications.
>>
>> The release is available for download at:
>> https://flink.apache.org/downloads.html
>>
>> Please check out the release blog post for an overview of the
>> improvements for this bugfix release:
>> http://flink.apache.org/news/2018/07/12/release-1.5.1.html
>>
>> The full release notes are available in Jira:
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?
>> projectId=12315522&version=12343053
>>
>> We would like to thank all contributors of the Apache Flink community who
>> made this release possible!
>>
>> PS: A regression was identified where the CLI job submission fails when
>> SSL is enabled. (FLINK-9842)
>>
>> Regards,
>> Chesnay
>>
>>


Re: [ANNOUNCE] Apache Flink 1.5.1 released

2018-07-13 Thread Till Rohrmann
Great to hear. Big thank you to the community for the hard work and to
Chesnay for being our release manager.

Cheers,
Till

On Fri, Jul 13, 2018 at 12:05 PM Chesnay Schepler 
wrote:

> The Apache Flink community is very happy to announce the release of Apache
> Flink 1.5.1, which is the first bugfix release for the Apache Flink 1.5
> series.
>
> Apache Flink® is an open-source stream processing framework for
> distributed, high-performing, always-available, and accurate data streaming
> applications.
>
> The release is available for download at:
> https://flink.apache.org/downloads.html
>
> Please check out the release blog post for an overview of the improvements
> for this bugfix release:
> http://flink.apache.org/news/2018/07/12/release-1.5.1.html
>
> The full release notes are available in Jira:
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12343053
>
> We would like to thank all contributors of the Apache Flink community who
> made this release possible!
>
> PS: A regression was identified where the CLI job submission fails when
> SSL is enabled. (FLINK-9842)
>
> Regards,
> Chesnay
>
>


[ANNOUNCE] Apache Flink 1.5.1 released

2018-07-13 Thread Chesnay Schepler

The Apache Flink community is very happy to announce the release of Apache 
Flink 1.5.1, which is the first bugfix release for the Apache Flink 1.5 series.

Apache Flink® is an open-source stream processing framework for distributed, 
high-performing, always-available, and accurate data streaming applications.

The release is available for download at:
https://flink.apache.org/downloads.html  


Please check out the release blog post for an overview of the improvements for 
this bugfix release:
http://flink.apache.org/news/2018/07/12/release-1.5.1.html

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12343053

We would like to thank all contributors of the Apache Flink community who made 
this release possible!

PS: A regression was identified where the CLI job submission fails when SSL is 
enabled. (FLINK-9842)

Regards,
Chesnay



[jira] [Created] (FLINK-9844) PackagedProgram does not close URLClassLoader

2018-07-13 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-9844:
---

 Summary: PackagedProgram does not close URLClassLoader
 Key: FLINK-9844
 URL: https://issues.apache.org/jira/browse/FLINK-9844
 Project: Flink
  Issue Type: Improvement
  Components: Core, Job-Submission
Reporter: Chesnay Schepler


The {{PackagedProgram}} class creates a user-code classloader to execute the 
program's {{main}} method.
This classloader is a {{URLClassLoader}} (except in the case of some tests), 
which contains opened {{JarFiles}} for all accessed jars.

The {{URLClassLoader}} class implements {{Closeable}} for the purpose of 
cleaning up closeable resources, however we never actually call this method. 
The {{PackagedProgram}} only works against the {{ClassLoader}} class, which 
doesn't implement {{Closeable}}.

As a result, deleting a jar after submitting it with the {{JarRunHandler}} 
currently fails on Windows, since the jar is still in use.
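The core of the fix is to check whether the classloader actually exposes a close method before discarding it. A minimal sketch of that pattern (conceptual Python; Python has no URLClassLoader, so io.StringIO stands in for any resource-holding object, and close_if_closeable is a hypothetical helper, not Flink's API):

```python
import io

def close_if_closeable(obj):
    """Hypothetical helper mirroring the Java fix of casting to Closeable:
    if the object exposes close(), call it so held resources (e.g. open jar
    files) are released; otherwise leave it untouched."""
    close = getattr(obj, "close", None)
    if callable(close):
        close()
        return True
    return False
```

In the Java case, the point is that code holding only a `ClassLoader` reference never sees `close()`, even though the concrete `URLClassLoader` implements `Closeable`; the caller has to check and close explicitly.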



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-9843) Build website automatically

2018-07-13 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-9843:
---

 Summary: Build website automatically
 Key: FLINK-9843
 URL: https://issues.apache.org/jira/browse/FLINK-9843
 Project: Flink
  Issue Type: Improvement
  Components: Project Website
Reporter: Chesnay Schepler


The project website currently has to be built manually for every change, in 
contrast to the documentation which is built nightly with buildbot.
We should look into building the website automatically as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-9842) Job submission fails via CLI with SSL enabled

2018-07-13 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-9842:
--

 Summary: Job submission fails via CLI with SSL enabled
 Key: FLINK-9842
 URL: https://issues.apache.org/jira/browse/FLINK-9842
 Project: Flink
  Issue Type: Bug
  Components: Client, Job-Submission
Affects Versions: 1.5.1
Reporter: Nico Kruber
 Fix For: 1.5.2, 1.6.0


There's a regression in Flink 1.5.1 which leads to the job submission via CLI 
failing with SSL enabled (1.5.0 works). Tried with the {{WordCount}} example:

Client log:
{code}
2018-07-13 09:51:45,088 INFO  org.apache.flink.client.cli.CliFrontend   
- 

2018-07-13 09:51:45,091 INFO  org.apache.flink.client.cli.CliFrontend   
-  Starting Command Line Client (Version: 1.5.1, Rev:3488f8b, 
Date:10.07.2018 @ 11:51:27 GMT)
2018-07-13 09:51:45,091 INFO  org.apache.flink.client.cli.CliFrontend   
-  OS current user: nico
2018-07-13 09:51:45,092 INFO  org.apache.flink.client.cli.CliFrontend   
-  Current Hadoop/Kerberos user: 
2018-07-13 09:51:45,092 INFO  org.apache.flink.client.cli.CliFrontend   
-  JVM: OpenJDK 64-Bit Server VM - Oracle Corporation - 
1.8/25.171-b11
2018-07-13 09:51:45,092 INFO  org.apache.flink.client.cli.CliFrontend   
-  Maximum heap size: 3534 MiBytes
2018-07-13 09:51:45,092 INFO  org.apache.flink.client.cli.CliFrontend   
-  JAVA_HOME: /usr/lib64/jvm/java
2018-07-13 09:51:45,093 INFO  org.apache.flink.client.cli.CliFrontend   
-  No Hadoop Dependency available
2018-07-13 09:51:45,093 INFO  org.apache.flink.client.cli.CliFrontend   
-  JVM Options:
2018-07-13 09:51:45,093 INFO  org.apache.flink.client.cli.CliFrontend   
- 
-Dlog.file=/home/nico/Downloads/flink-1.5.1/log/flink-nico-client-nico-work.fritz.box.log
2018-07-13 09:51:45,093 INFO  org.apache.flink.client.cli.CliFrontend   
- 
-Dlog4j.configuration=file:/home/nico/Downloads/flink-1.5.1/conf/log4j-cli.properties
2018-07-13 09:51:45,094 INFO  org.apache.flink.client.cli.CliFrontend   
- 
-Dlogback.configurationFile=file:/home/nico/Downloads/flink-1.5.1/conf/logback.xml
2018-07-13 09:51:45,094 INFO  org.apache.flink.client.cli.CliFrontend   
-  Program Arguments:
2018-07-13 09:51:45,094 INFO  org.apache.flink.client.cli.CliFrontend   
- run
2018-07-13 09:51:45,094 INFO  org.apache.flink.client.cli.CliFrontend   
- ./examples/streaming/WordCount.jar
2018-07-13 09:51:45,094 INFO  org.apache.flink.client.cli.CliFrontend   
- --input
2018-07-13 09:51:45,095 INFO  org.apache.flink.client.cli.CliFrontend   
- LICENSE
2018-07-13 09:51:45,095 INFO  org.apache.flink.client.cli.CliFrontend   
- --output
2018-07-13 09:51:45,095 INFO  org.apache.flink.client.cli.CliFrontend   
- /home/nico/flink/build-target/test.txt
2018-07-13 09:51:45,095 INFO  org.apache.flink.client.cli.CliFrontend   
-  Classpath: 
/home/nico/Downloads/flink-1.5.1/lib/flink-python_2.11-1.5.1.jar:/home/nico/Downloads/flink-1.5.1/lib/log4j-1.2.17.jar:/home/nico/Downloads/flink-1.5.1/lib/slf4j-log4j12-1.7.7.jar:/home/nico/Downloads/flink-1.5.1/lib/flink-dist_2.11-1.5.1.jar:::
2018-07-13 09:51:45,095 INFO  org.apache.flink.client.cli.CliFrontend   
- 

2018-07-13 09:51:45,105 INFO  
org.apache.flink.configuration.GlobalConfiguration- Loading 
configuration property: jobmanager.rpc.address, localhost
2018-07-13 09:51:45,105 INFO  
org.apache.flink.configuration.GlobalConfiguration- Loading 
configuration property: jobmanager.rpc.port, 6123
2018-07-13 09:51:45,105 INFO  
org.apache.flink.configuration.GlobalConfiguration- Loading 
configuration property: jobmanager.heap.mb, 1024
2018-07-13 09:51:45,106 INFO  
org.apache.flink.configuration.GlobalConfiguration- Loading 
configuration property: taskmanager.heap.mb, 1024
2018-07-13 09:51:45,106 INFO  
org.apache.flink.configuration.GlobalConfiguration- Loading 
configuration property: taskmanager.numberOfTaskSlots, 1
2018-07-13 09:51:45,106 INFO  
org.apache.flink.configuration.GlobalConfiguration- Loading 
configuration property: parallelism.default, 1
2018-07-13 09:51:45,108 INFO  
org.apache.flink.configuration.GlobalConfiguration- Loading 
configuration property: rest.port, 8081
2018-07-13 09:51:45,110 INFO  
org.apache.flink.configuration.GlobalConfiguration- Loading 
configuration property: security.

[jira] [Created] (FLINK-9841) Web UI only show partial taskmanager log

2018-07-13 Thread vinoyang (JIRA)
vinoyang created FLINK-9841:
---

 Summary: Web UI only show partial taskmanager log 
 Key: FLINK-9841
 URL: https://issues.apache.org/jira/browse/FLINK-9841
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.5.0
 Environment: env : Flink on YARN

version : 1.5.0
Reporter: vinoyang
Assignee: vinoyang


 

In the web UI, we select a task manager and click the "log" tab, but the UI 
only shows a partial log (the first part), and it never updates even if we click the 
"refresh" button.

However, the job manager log is always OK.

The reason is that the resource is closed twice.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[DISCUSS] Improve broadcast serialization

2018-07-13 Thread Zhijiang(wangzhijiang999)
Hi all,

In the current implementation, a RecordSerializer is created separately for each 
subpartition in the RecordWriter, which means the number of serializers equals 
the number of subpartitions.
For the broadcast partitioner, every record is serialized many times, once per 
subpartition, and this may hurt performance to some extent.
In theory, every record could be serialized only once and referenced by all the 
subpartitions in broadcast mode.

To do so, I propose the following changes:
1. Create and maintain only one serializer in the RecordWriter, and have it 
serialize the record for all the subpartitions. This makes sense for any 
partitioner, and the memory overhead can also be decreased, because every 
serializer maintains some separate byte buffers internally.
2. Maybe we can abstract the RecordWriter into a base class used for the other 
partitioner modes and implement a BroadcastRecordWriter for the 
BroadcastPartitioner. This new implementation would add buffer references 
based on the number of subpartitions before adding the buffer to the subpartition queues.
3. Maybe we can remove StreamRecordWriter by migrating its flusher into 
RecordWriter, so that the unified RecordWriter can be used for both streaming and 
batch. The above BroadcastRecordWriter could also be unified for both streaming and 
batch.
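The serialize-once idea in point 2 can be sketched as follows (a conceptual Python sketch under invented names, not Flink's RecordWriter/buffer-pool API): the record is serialized a single time, and a reference-counted buffer is enqueued to every subpartition; the buffer is recycled only when the last subpartition releases it.

```python
import pickle

class RefCountedBuffer:
    """A serialized record shared by several subpartition queues. Reference
    counting stands in for Flink's buffer pool: the bytes are logically
    recycled once the last reference is released."""

    def __init__(self, data, refs):
        self.data = data
        self.refs = refs
        self.recycled = False

    def release(self):
        self.refs -= 1
        if self.refs == 0:
            self.recycled = True  # would return the buffer to the pool

def broadcast(record, subpartitions, serialize=pickle.dumps):
    """Serialize ONCE instead of once per subpartition, then enqueue the
    same buffer everywhere, with one reference per subpartition."""
    buf = RefCountedBuffer(serialize(record), refs=len(subpartitions))
    for queue in subpartitions:
        queue.append(buf)
    return buf
```

Compared with serializing per subpartition, the CPU cost of serialization drops from O(subpartitions) to O(1) per record, at the price of the reference-counting bookkeeping.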

I am not sure whether this improvement has been proposed before. What do you think 
of it?
If necessary, I can create JIRAs to contribute it, and I may need one committer 
to cooperate with me.

Best,

Zhijiang

[jira] [Created] (FLINK-9840) Remove Scala dependency from Flink's Mesos integration

2018-07-13 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-9840:


 Summary: Remove Scala dependency from Flink's Mesos integration
 Key: FLINK-9840
 URL: https://issues.apache.org/jira/browse/FLINK-9840
 Project: Flink
  Issue Type: Improvement
  Components: Mesos
Affects Versions: 1.5.0, 1.4.0, 1.6.0
Reporter: Till Rohrmann


As a long term goal we could think about removing the Scala dependency from 
Flink's Mesos integration. In order to do this we need to rework some of the 
{{MesosResourceManager's}} components.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-9839) End-to-end test: Streaming job with SSL

2018-07-13 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-9839:
--

 Summary: End-to-end test: Streaming job with SSL
 Key: FLINK-9839
 URL: https://issues.apache.org/jira/browse/FLINK-9839
 Project: Flink
  Issue Type: Sub-task
  Components: Tests
Reporter: Nico Kruber
Assignee: Nico Kruber


None of the existing e2e tests run with an SSL configuration but there should 
be such a test as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)