Re: [DISCUSS] Release Flink 1.5.3

2018-08-16 Thread Chesnay Schepler
Generally, for bugfix releases I only regard issues as blockers that 
were introduced in that version, which doesn't appear to be the case here.


There's nothing speaking against another bugfix release soon after 1.5.3 
once the issue is resolved.


On 16.08.2018 18:46, Elias Levy wrote:

If I may, think FLINK-10011 should be considered a blocker and included in
1.5.3 and a 1.4.3 release.

On Thu, Aug 16, 2018 at 3:09 AM Dominik Wosiński  wrote:


+1, I agree that frequent releases are good for users.
Dominik


Wysłane z aplikacji Poczta dla Windows 10

Od: Stefan Richter
Wysłano: czwartek, 16 sierpnia 2018 10:57
Do: dev
Temat: Re: [DISCUSS] Release Flink 1.5.3

+1 sounds good.

Stefan


Am 16.08.2018 um 09:55 schrieb Piotr Nowojski :

+1 from me

Piotrek


On 16 Aug 2018, at 09:40, Timo Walther  wrote:

Thank you for starting this discussion.

+1 for this

Regards,
Timo

Am 16.08.18 um 09:27 schrieb vino yang:

Agree! This sounds very good.

Till Rohrmann  于2018年8月16日周四 下午3:14写道:


+1 for starting the release process 1.5.3 immediately. We can always
create another bug fix release afterwards. I think the more often we
release the better it is for our users, because they receive

incremental

improvements faster.

Cheers,
Till

On Thu, Aug 16, 2018 at 8:52 AM Chesnay Schepler 
wrote:


I would actually start with the release process today unless anyone
objects.

On 16.08.2018 08:46, vino yang wrote:

Hi Chesnay,

+1
  I want to know when you plan to cut the branch.

Thanks, vino.

Chesnay Schepler  于2018年8月16日周四 下午2:29写道:


Hello everyone,

it has been a little over 2 weeks since 1.5.2 was released, and

since

then a number of fixes were added to the release-1.5 that I think

we

should push to users.

Notable fixes, (among others) are FLINK-9969 which fixes a memory

leak

when using batch, FLINK-10066 that reduces the memory print on the

JM

for long-running jobs or FLINK-10070 that reverses a regression
introduced in 1.5.2 due to which Flink could not be compiled with
certain maven versions.

I would of course volunteer as Release Manager.









Re: [DISCUSS] Release Flink 1.5.3

2018-08-16 Thread Elias Levy
If I may, think FLINK-10011 should be considered a blocker and included in
1.5.3 and a 1.4.3 release.

On Thu, Aug 16, 2018 at 3:09 AM Dominik Wosiński  wrote:

> +1, I agree that frequent releases are good for users.
> Dominik
>
>
> Wysłane z aplikacji Poczta dla Windows 10
>
> Od: Stefan Richter
> Wysłano: czwartek, 16 sierpnia 2018 10:57
> Do: dev
> Temat: Re: [DISCUSS] Release Flink 1.5.3
>
> +1 sounds good.
>
> Stefan
>
> > Am 16.08.2018 um 09:55 schrieb Piotr Nowojski :
> >
> > +1 from me
> >
> > Piotrek
> >
> >> On 16 Aug 2018, at 09:40, Timo Walther  wrote:
> >>
> >> Thank you for starting this discussion.
> >>
> >> +1 for this
> >>
> >> Regards,
> >> Timo
> >>
> >> Am 16.08.18 um 09:27 schrieb vino yang:
> >>> Agree! This sounds very good.
> >>>
> >>> Till Rohrmann  于2018年8月16日周四 下午3:14写道:
> >>>
>  +1 for starting the release process 1.5.3 immediately. We can always
>  create another bug fix release afterwards. I think the more often we
>  release the better it is for our users, because they receive
> incremental
>  improvements faster.
> 
>  Cheers,
>  Till
> 
>  On Thu, Aug 16, 2018 at 8:52 AM Chesnay Schepler 
>  wrote:
> 
> > I would actually start with the release process today unless anyone
> > objects.
> >
> > On 16.08.2018 08:46, vino yang wrote:
> >> Hi Chesnay,
> >>
> >> +1
> >>  I want to know when you plan to cut the branch.
> >>
> >> Thanks, vino.
> >>
> >> Chesnay Schepler  于2018年8月16日周四 下午2:29写道:
> >>
> >>> Hello everyone,
> >>>
> >>> it has been a little over 2 weeks since 1.5.2 was released, and
> since
> >>> then a number of fixes were added to the release-1.5 that I think
> we
> >>> should push to users.
> >>>
> >>> Notable fixes, (among others) are FLINK-9969 which fixes a memory
> leak
> >>> when using batch, FLINK-10066 that reduces the memory print on the
> JM
> >>> for long-running jobs or FLINK-10070 that reverses a regression
> >>> introduced in 1.5.2 due to which Flink could not be compiled with
> >>> certain maven versions.
> >>>
> >>> I would of course volunteer as Release Manager.
> >>>
> >>>
> >
> >>
> >
>
>
>


[jira] [Created] (FLINK-10164) Add support for resuming from savepoints to StandaloneJobClusterEntrypoint

2018-08-16 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-10164:
-

 Summary: Add support for resuming from savepoints to 
StandaloneJobClusterEntrypoint
 Key: FLINK-10164
 URL: https://issues.apache.org/jira/browse/FLINK-10164
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Coordination
Affects Versions: 1.6.0, 1.7.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 1.6.1, 1.7.0


The {{StandaloneJobClusterEntrypoint}} should support to resume from a 
savepoint/checkpoint. I suggest to introduce an optional command line parameter 
for specifying the savepoint/checkpoint path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: How to handle tests that need docker?

2018-08-16 Thread Till Rohrmann
I agree with Dominik. We could add this test to
`flink-end-to-end-test/run-pre-commit-tests.sh` or
`flink-end-to-end-test/run-nightly-tests.sh` like we did for the
`flink-end-to-end-test/test-scripts/test_docker_embedded_job.sh` and
`flinke-end-to-end-test/test-scripts/test_yarn_kerberos_docker.sh`.

Cheers,
Till

On Thu, Aug 16, 2018 at 2:10 PM Dominik Wosiński  wrote:

> Imho,
> Such tests should be off by default as they impose some requirements on
> the people trying to build Flink from source. And one would have to install
> docker only for the tests to pass. But of course they should be enabled for
> CI builds.
>
> Best Regards,
> Dominik.
>
> Wysłane z aplikacji Poczta dla Windows 10
>
> Od: Niels Basjes
> Wysłano: czwartek, 16 sierpnia 2018 14:05
> Do: dev@flink.apache.org
> Temat: How to handle tests that need docker?
>
> Hi,
>
> I'm working together with Richard Deurwaarder on the PubSub Source and Sink
> https://issues.apache.org/jira/browse/FLINK-9311
>
> We currently have most of the code done and we also have several tests that
> verify the behavior by actually reading and writing messages to and from
> PubSub.
> These tests use a docker image provided by Google that includes the pubsub
> emulator.
>
> Starting and stopping this thing is really fast.
> Starting up, creating topic, running a few tests and cleaning up takes
> about 5 seconds.
>
> Although the complete set of tests run on my machine in about 20-25 seconds
> it does mean that anyone doing that MUST have docker installed locally.
>
> So my question to the committers: Do we simply leave these tests "always
> on" or do you want them to be "off by default" or something else.
> I case you do not want the "always on" then please indicate how we should
> implement the "on"/"off" switch for these tests.
>
> Thanks.
>
> --
> Best regards / Met vriendelijke groeten,
>
> Niels Basjes
>
>


[jira] [Created] (FLINK-10163) Support CREATE VIEW in SQL Client

2018-08-16 Thread Timo Walther (JIRA)
Timo Walther created FLINK-10163:


 Summary: Support CREATE VIEW in SQL Client
 Key: FLINK-10163
 URL: https://issues.apache.org/jira/browse/FLINK-10163
 Project: Flink
  Issue Type: New Feature
  Components: Table API  SQL
Reporter: Timo Walther
Assignee: Timo Walther


The possibility to define a name for a subquery would improve the usability of 
the SQL Client. The SQL standard defines \{{CREATE VIEW}} for defining a 
virtual table.

 

Example:

{code}
CREATE VIEW view_name AS SELECT 
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10162) Add host in checkpoint detail

2018-08-16 Thread xymaqingxiang (JIRA)
xymaqingxiang created FLINK-10162:
-

 Summary: Add host in checkpoint detail
 Key: FLINK-10162
 URL: https://issues.apache.org/jira/browse/FLINK-10162
 Project: Flink
  Issue Type: Improvement
Reporter: xymaqingxiang
Assignee: xymaqingxiang


Add the host information to the checkpoint so that if the checkpoint fails, you 
can specifically login to the appropriate machine to see the cause of the 
failure.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10161) Add host in checkpoint detail

2018-08-16 Thread xymaqingxiang (JIRA)
xymaqingxiang created FLINK-10161:
-

 Summary: Add host in checkpoint detail
 Key: FLINK-10161
 URL: https://issues.apache.org/jira/browse/FLINK-10161
 Project: Flink
  Issue Type: New Feature
Reporter: xymaqingxiang


Add the host information to the checkpoint so that if the checkpoint fails, you 
can specifically login to the appropriate machine to see the cause of the 
failure.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10160) Index out of bound when serializing operator state

2018-08-16 Thread Zdenek (JIRA)
Zdenek created FLINK-10160:
--

 Summary: Index out of bound when serializing operator state
 Key: FLINK-10160
 URL: https://issues.apache.org/jira/browse/FLINK-10160
 Project: Flink
  Issue Type: Bug
  Components: Core
Affects Versions: 1.4.2
 Environment: Cloudera Hadoop 2.6.5

JVM: Java HotSpot(TM) 64-Bit Server VM - Oracle Corporation - 1.8/25.121-b13

Scala 2.11
Reporter: Zdenek
 Attachments: log.txt

I am getting unexpected randomly happend error when checkpoint state is 
serialized to state backend (In-Memory). 

Source code usage:
{code:java}

env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)

env.registerType(classOf[ImpressionMetric])
env.registerType(classOf[RecommendationMetric])
env.registerType(classOf[AdRequestMetric])
env.registerType(classOf[AdRequestStatusMetric])

val impressionsStream = env
.addSource(impsSource)
.map(imp => ImpressionMetric(imp))
.map(imp => imp.asInstanceOf[DomainMetric])
.name("Impressions")


val recommendationsStream = env
.addSource(recommendationsSource)
.map(rcmd => RecommendationMetric(rcmd))
.map(rcmd => rcmd.asInstanceOf[DomainMetric])
.name("Recommendations")


val adRequestsStream = env
.addSource(adRequestsSource)
.flatMap(new MapToAdRequestDomainMetrics())
.name("Ad requests")

{code}
Log with error, longer version in [^log.txt]
{code:java}
2018-08-16 12:49:37,663 INFO 
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering 
checkpoint 1 @ 1534416577551
2018-08-16 12:49:38,100 INFO 
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed 
checkpoint 1 (1404248 bytes in 545 ms).
2018-08-16 12:57:54,217 INFO 
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering 
checkpoint 2 @ 1534417074216
2018-08-16 12:57:54,368 INFO 
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed 
checkpoint 2 (2199656 bytes in 152 ms).
2018-08-16 13:07:54,217 INFO 
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering 
checkpoint 3 @ 1534417674216
2018-08-16 13:17:54,218 INFO 
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Checkpoint 3 
expired before completing.
2018-08-16 13:17:54,220 INFO 
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering 
checkpoint 4 @ 1534418274220
2018-08-16 13:27:54,220 INFO 
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Checkpoint 4 
expired before completing.
2018-08-16 13:27:54,222 INFO 
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering 
checkpoint 5 @ 1534418874221
2018-08-16 13:29:36,640 WARN 
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Received late 
message for now expired checkpoint attempt 3 from 
e4ce857f2ef5dd4cc75a48b6fdb7b694 of job a98b29dcef6ad8eac28f8290034a0590.
2018-08-16 13:29:36,642 WARN 
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Received late 
message for now expired checkpoint attempt 3 from 
1db1e9c995bae21d6c81e828e11f50a6 of job a98b29dcef6ad8eac28f8290034a0590.
2018-08-16 13:29:36,661 WARN 
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Received late 
message for now expired checkpoint attempt 4 from 
0f62f47cdd04acf58cf8aa6c11288950 of job a98b29dcef6ad8eac28f8290034a0590.
2018-08-16 13:29:36,662 WARN 
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Received late 
message for now expired checkpoint attempt 4 from 
e4ce857f2ef5dd4cc75a48b6fdb7b694 of job a98b29dcef6ad8eac28f8290034a0590.
2018-08-16 13:29:36,663 WARN 
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Received late 
message for now expired checkpoint attempt 3 from 
0f62f47cdd04acf58cf8aa6c11288950 of job a98b29dcef6ad8eac28f8290034a0590.
2018-08-16 13:29:36,967 INFO 
org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom Source 
-> Ad requests (1/2) (1db1e9c995bae21d6c81e828e11f50a6) switched from RUNNING 
to FAILED.
java.lang.Exception: Error while triggering checkpoint 4 for Source: Custom 
Source -> Ad requests (1/2)
at org.apache.flink.runtime.taskmanager.Task$2.run(Task.java:1210)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.Exception: Could not perform checkpoint 4 for operator 
Source: Custom Source -> Ad requests (1/2).
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpoint(StreamTask.java:544)
at 
org.apache.flink.streaming.runtime.tasks.SourceStreamTask.triggerCheckpoint(SourceStreamTask.java:111)
at org.apache.flink.runtime.taskmanager.Task$2.run(Task.java:1199)
... 5 more
Caused by: java.lang.Exception: 

ODP: How to handle tests that need docker?

2018-08-16 Thread Dominik Wosiński
Imho, 
Such tests should be off by default as they impose some requirements on the 
people trying to build Flink from source. And one would have to install docker 
only for the tests to pass. But of course they should be enabled for CI builds.

Best Regards,
Dominik.

Wysłane z aplikacji Poczta dla Windows 10

Od: Niels Basjes
Wysłano: czwartek, 16 sierpnia 2018 14:05
Do: dev@flink.apache.org
Temat: How to handle tests that need docker?

Hi,

I'm working together with Richard Deurwaarder on the PubSub Source and Sink
https://issues.apache.org/jira/browse/FLINK-9311

We currently have most of the code done and we also have several tests that
verify the behavior by actually reading and writing messages to and from
PubSub.
These tests use a docker image provided by Google that includes the pubsub
emulator.

Starting and stopping this thing is really fast.
Starting up, creating topic, running a few tests and cleaning up takes
about 5 seconds.

Although the complete set of tests run on my machine in about 20-25 seconds
it does mean that anyone doing that MUST have docker installed locally.

So my question to the committers: Do we simply leave these tests "always
on" or do you want them to be "off by default" or something else.
I case you do not want the "always on" then please indicate how we should
implement the "on"/"off" switch for these tests.

Thanks.

-- 
Best regards / Met vriendelijke groeten,

Niels Basjes



How to handle tests that need docker?

2018-08-16 Thread Niels Basjes
Hi,

I'm working together with Richard Deurwaarder on the PubSub Source and Sink
https://issues.apache.org/jira/browse/FLINK-9311

We currently have most of the code done and we also have several tests that
verify the behavior by actually reading and writing messages to and from
PubSub.
These tests use a docker image provided by Google that includes the pubsub
emulator.

Starting and stopping this thing is really fast.
Starting up, creating topic, running a few tests and cleaning up takes
about 5 seconds.

Although the complete set of tests run on my machine in about 20-25 seconds
it does mean that anyone doing that MUST have docker installed locally.

So my question to the committers: Do we simply leave these tests "always
on" or do you want them to be "off by default" or something else.
I case you do not want the "always on" then please indicate how we should
implement the "on"/"off" switch for these tests.

Thanks.

-- 
Best regards / Met vriendelijke groeten,

Niels Basjes


[jira] [Created] (FLINK-10159) TestHarness#initializeState(xyz) calls after TestHarness#open() are being silently ignored

2018-08-16 Thread Piotr Nowojski (JIRA)
Piotr Nowojski created FLINK-10159:
--

 Summary: TestHarness#initializeState(xyz) calls after 
TestHarness#open() are being silently ignored
 Key: FLINK-10159
 URL: https://issues.apache.org/jira/browse/FLINK-10159
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.6.0
Reporter: Piotr Nowojski


This is an old issue. Incorrect order of initializeState and open result to 
initializeState being ignored. For example in this code:
{code:java}
testHarness = createTestHarness(topic);
testHarness.setup();
testHarness.open();
testHarness.initializeState(snapshot1);
{code}
Which is miss-leading both for Flink developers and for users (since we 
recommend using test harness for unit tests).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[VOTE] Release 1.5.3, release candidate #1

2018-08-16 Thread Chesnay Schepler

Hi everyone,
Please review and vote on the release candidate #1 for the version 
1.5.3, as follows:

[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)


The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release and binary convenience releases to 
be deployed to dist.apache.org [2], which are signed with the key with 
fingerprint 11D464BA [3],

* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "release-1.5.3-rc1" [5],
* website pull request listing the new release and adding announcement 
blog post [6].


The vote will be open for at least 72 hours. It is adopted by majority 
approval, with at least 3 PMC affirmative votes.


Thanks,
Chesnay

[1] 
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12343777

[2] https://dist.apache.org/repos/dist/dev/flink/flink-1.5.3/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1179
[5] 
https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=refs/tags/release-1.5.3-rc1

[6] https://github.com/apache/flink-web/pull/119







ODP: [DISCUSS] Release Flink 1.5.3

2018-08-16 Thread Dominik Wosiński
+1, I agree that frequent releases are good for users.
Dominik


Wysłane z aplikacji Poczta dla Windows 10

Od: Stefan Richter
Wysłano: czwartek, 16 sierpnia 2018 10:57
Do: dev
Temat: Re: [DISCUSS] Release Flink 1.5.3

+1 sounds good.

Stefan

> Am 16.08.2018 um 09:55 schrieb Piotr Nowojski :
> 
> +1 from me
> 
> Piotrek
> 
>> On 16 Aug 2018, at 09:40, Timo Walther  wrote:
>> 
>> Thank you for starting this discussion.
>> 
>> +1 for this
>> 
>> Regards,
>> Timo
>> 
>> Am 16.08.18 um 09:27 schrieb vino yang:
>>> Agree! This sounds very good.
>>> 
>>> Till Rohrmann  于2018年8月16日周四 下午3:14写道:
>>> 
 +1 for starting the release process 1.5.3 immediately. We can always
 create another bug fix release afterwards. I think the more often we
 release the better it is for our users, because they receive incremental
 improvements faster.
 
 Cheers,
 Till
 
 On Thu, Aug 16, 2018 at 8:52 AM Chesnay Schepler 
 wrote:
 
> I would actually start with the release process today unless anyone
> objects.
> 
> On 16.08.2018 08:46, vino yang wrote:
>> Hi Chesnay,
>> 
>> +1
>>  I want to know when you plan to cut the branch.
>> 
>> Thanks, vino.
>> 
>> Chesnay Schepler  于2018年8月16日周四 下午2:29写道:
>> 
>>> Hello everyone,
>>> 
>>> it has been a little over 2 weeks since 1.5.2 was released, and since
>>> then a number of fixes were added to the release-1.5 that I think we
>>> should push to users.
>>> 
>>> Notable fixes, (among others) are FLINK-9969 which fixes a memory leak
>>> when using batch, FLINK-10066 that reduces the memory print on the JM
>>> for long-running jobs or FLINK-10070 that reverses a regression
>>> introduced in 1.5.2 due to which Flink could not be compiled with
>>> certain maven versions.
>>> 
>>> I would of course volunteer as Release Manager.
>>> 
>>> 
> 
>> 
> 




Re: [DISCUSS] Release Flink 1.5.3

2018-08-16 Thread Stefan Richter
+1 sounds good.

Stefan

> Am 16.08.2018 um 09:55 schrieb Piotr Nowojski :
> 
> +1 from me
> 
> Piotrek
> 
>> On 16 Aug 2018, at 09:40, Timo Walther  wrote:
>> 
>> Thank you for starting this discussion.
>> 
>> +1 for this
>> 
>> Regards,
>> Timo
>> 
>> Am 16.08.18 um 09:27 schrieb vino yang:
>>> Agree! This sounds very good.
>>> 
>>> Till Rohrmann  于2018年8月16日周四 下午3:14写道:
>>> 
 +1 for starting the release process 1.5.3 immediately. We can always
 create another bug fix release afterwards. I think the more often we
 release the better it is for our users, because they receive incremental
 improvements faster.
 
 Cheers,
 Till
 
 On Thu, Aug 16, 2018 at 8:52 AM Chesnay Schepler 
 wrote:
 
> I would actually start with the release process today unless anyone
> objects.
> 
> On 16.08.2018 08:46, vino yang wrote:
>> Hi Chesnay,
>> 
>> +1
>>  I want to know when you plan to cut the branch.
>> 
>> Thanks, vino.
>> 
>> Chesnay Schepler  于2018年8月16日周四 下午2:29写道:
>> 
>>> Hello everyone,
>>> 
>>> it has been a little over 2 weeks since 1.5.2 was released, and since
>>> then a number of fixes were added to the release-1.5 that I think we
>>> should push to users.
>>> 
>>> Notable fixes, (among others) are FLINK-9969 which fixes a memory leak
>>> when using batch, FLINK-10066 that reduces the memory print on the JM
>>> for long-running jobs or FLINK-10070 that reverses a regression
>>> introduced in 1.5.2 due to which Flink could not be compiled with
>>> certain maven versions.
>>> 
>>> I would of course volunteer as Release Manager.
>>> 
>>> 
> 
>> 
> 



[jira] [Created] (FLINK-10158) The DataOutputSerializer may consume excessive memory

2018-08-16 Thread aitozi (JIRA)
aitozi created FLINK-10158:
--

 Summary: The DataOutputSerializer may consume excessive memory
 Key: FLINK-10158
 URL: https://issues.apache.org/jira/browse/FLINK-10158
 Project: Flink
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.6.0, 1.5.2, 1.4.2
Reporter: aitozi
Assignee: aitozi


I found that the dataOutputSerializer clear the intermediate buffer when the 
buffer exceed the 5M (as a fixed configuration), But when we encountered the 
rebalance or keyByPartition and the downstream has a large parallel it will 
also consume a lot memory, we can do two things :

1. make this config configurable
2. like the https://issues.apache.org/jira/projects/FLINK/issues/FLINK-1326 
mentioned, we can make the serializer one for the output

What's your idea ? [~StephanEwen]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10157) The TTL state not allow `null` state value

2018-08-16 Thread chengjie.wu (JIRA)
chengjie.wu created FLINK-10157:
---

 Summary: The TTL state not allow `null` state value
 Key: FLINK-10157
 URL: https://issues.apache.org/jira/browse/FLINK-10157
 Project: Flink
  Issue Type: Bug
  Components: State Backends, Checkpointing
Affects Versions: 1.6.0
 Environment: Flink:1.6.0

Scala:2.11

JDK:1.8
Reporter: chengjie.wu
 Attachments: StateWithOutTtlTest.scala, StateWithTtlTest.scala

In the previous version or when StateTtl is not enabled,MapState allows `null` 
value,that means after
{code:java}
mapState.put("key", null){code}
, then
{code:java}
mapState.contains("key"){code}
will return {color:#FF}*true*{color}, but when StateTtl is enabled,
{code:java}
mapState.contains("key"){code}
will return {color:#FF}*false*{color}(*the key has not expired*).
So I think the field `userValue` in 
`org.apache.flink.runtime.state.ttl.TtlValue` should allow `null` value. User 
state is null may not means the TtlValue should be null.

 
{code:java}
/**
 * This class wraps user value of state with TTL.
 *
 * @param  Type of the user value of state with TTL
 */
class TtlValue implements Serializable {
 private final T userValue;
 private final long lastAccessTimestamp;
TtlValue(T userValue, long lastAccessTimestamp) {
 Preconditions.checkNotNull(userValue);
 this.userValue = userValue;
 this.lastAccessTimestamp = lastAccessTimestamp;
 }
T getUserValue() {
 return userValue;
 }
long getLastAccessTimestamp() {
 return lastAccessTimestamp;
 }
}
{code}
Am I understanding right?

This is my test class.

[^StateWithTtlTest.scala] [^StateWithOutTtlTest.scala]

^Thanks!:)^



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] Release Flink 1.5.3

2018-08-16 Thread Piotr Nowojski
+1 from me

Piotrek

> On 16 Aug 2018, at 09:40, Timo Walther  wrote:
> 
> Thank you for starting this discussion.
> 
> +1 for this
> 
> Regards,
> Timo
> 
> Am 16.08.18 um 09:27 schrieb vino yang:
>> Agree! This sounds very good.
>> 
>> Till Rohrmann  于2018年8月16日周四 下午3:14写道:
>> 
>>> +1 for starting the release process 1.5.3 immediately. We can always
>>> create another bug fix release afterwards. I think the more often we
>>> release the better it is for our users, because they receive incremental
>>> improvements faster.
>>> 
>>> Cheers,
>>> Till
>>> 
>>> On Thu, Aug 16, 2018 at 8:52 AM Chesnay Schepler 
>>> wrote:
>>> 
 I would actually start with the release process today unless anyone
 objects.
 
 On 16.08.2018 08:46, vino yang wrote:
> Hi Chesnay,
> 
> +1
>   I want to know when you plan to cut the branch.
> 
> Thanks, vino.
> 
> Chesnay Schepler  于2018年8月16日周四 下午2:29写道:
> 
>> Hello everyone,
>> 
>> it has been a little over 2 weeks since 1.5.2 was released, and since
>> then a number of fixes were added to the release-1.5 that I think we
>> should push to users.
>> 
>> Notable fixes, (among others) are FLINK-9969 which fixes a memory leak
>> when using batch, FLINK-10066 that reduces the memory print on the JM
>> for long-running jobs or FLINK-10070 that reverses a regression
>> introduced in 1.5.2 due to which Flink could not be compiled with
>> certain maven versions.
>> 
>> I would of course volunteer as Release Manager.
>> 
>> 
 
> 



Re: [DISCUSS] Release Flink 1.5.3

2018-08-16 Thread Timo Walther

Thank you for starting this discussion.

+1 for this

Regards,
Timo

Am 16.08.18 um 09:27 schrieb vino yang:

Agree! This sounds very good.

Till Rohrmann  于2018年8月16日周四 下午3:14写道:


+1 for starting the release process 1.5.3 immediately. We can always
create another bug fix release afterwards. I think the more often we
release the better it is for our users, because they receive incremental
improvements faster.

Cheers,
Till

On Thu, Aug 16, 2018 at 8:52 AM Chesnay Schepler 
wrote:


I would actually start with the release process today unless anyone
objects.

On 16.08.2018 08:46, vino yang wrote:

Hi Chesnay,

+1
   I want to know when you plan to cut the branch.

Thanks, vino.

Chesnay Schepler  于2018年8月16日周四 下午2:29写道:


Hello everyone,

it has been a little over 2 weeks since 1.5.2 was released, and since
then a number of fixes were added to the release-1.5 that I think we
should push to users.

Notable fixes, (among others) are FLINK-9969 which fixes a memory leak
when using batch, FLINK-10066 that reduces the memory print on the JM
for long-running jobs or FLINK-10070 that reverses a regression
introduced in 1.5.2 due to which Flink could not be compiled with
certain maven versions.

I would of course volunteer as Release Manager.








Re: [DISCUSS] Release Flink 1.5.3

2018-08-16 Thread vino yang
Agree! This sounds very good.

Till Rohrmann  于2018年8月16日周四 下午3:14写道:

> +1 for starting the release process 1.5.3 immediately. We can always
> create another bug fix release afterwards. I think the more often we
> release the better it is for our users, because they receive incremental
> improvements faster.
>
> Cheers,
> Till
>
> On Thu, Aug 16, 2018 at 8:52 AM Chesnay Schepler 
> wrote:
>
>> I would actually start with the release process today unless anyone
>> objects.
>>
>> On 16.08.2018 08:46, vino yang wrote:
>> > Hi Chesnay,
>> >
>> > +1
>> >   I want to know when you plan to cut the branch.
>> >
>> > Thanks, vino.
>> >
>> > Chesnay Schepler  于2018年8月16日周四 下午2:29写道:
>> >
>> >> Hello everyone,
>> >>
>> >> it has been a little over 2 weeks since 1.5.2 was released, and since
>> >> then a number of fixes were added to the release-1.5 that I think we
>> >> should push to users.
>> >>
>> >> Notable fixes, (among others) are FLINK-9969 which fixes a memory leak
>> >> when using batch, FLINK-10066 that reduces the memory print on the JM
>> >> for long-running jobs or FLINK-10070 that reverses a regression
>> >> introduced in 1.5.2 due to which Flink could not be compiled with
>> >> certain maven versions.
>> >>
>> >> I would of course volunteer as Release Manager.
>> >>
>> >>
>>
>>


Re: [DISCUSS] Release Flink 1.5.3

2018-08-16 Thread Till Rohrmann
+1 for starting the release process 1.5.3 immediately. We can always create
another bug fix release afterwards. I think the more often we release the
better it is for our users, because they receive incremental improvements
faster.

Cheers,
Till

On Thu, Aug 16, 2018 at 8:52 AM Chesnay Schepler  wrote:

> I would actually start with the release process today unless anyone
> objects.
>
> On 16.08.2018 08:46, vino yang wrote:
> > Hi Chesnay,
> >
> > +1
> >   I want to know when you plan to cut the branch.
> >
> > Thanks, vino.
> >
> > Chesnay Schepler  于2018年8月16日周四 下午2:29写道:
> >
> >> Hello everyone,
> >>
> >> it has been a little over 2 weeks since 1.5.2 was released, and since
> >> then a number of fixes were added to the release-1.5 that I think we
> >> should push to users.
> >>
> >> Notable fixes, (among others) are FLINK-9969 which fixes a memory leak
> >> when using batch, FLINK-10066 that reduces the memory print on the JM
> >> for long-running jobs or FLINK-10070 that reverses a regression
> >> introduced in 1.5.2 due to which Flink could not be compiled with
> >> certain maven versions.
> >>
> >> I would of course volunteer as Release Manager.
> >>
> >>
>
>


Re: [DISCUSS] Release Flink 1.5.3

2018-08-16 Thread Chesnay Schepler

I would actually start with the release process today unless anyone objects.

On 16.08.2018 08:46, vino yang wrote:

Hi Chesnay,

+1
  I want to know when you plan to cut the branch.

Thanks, vino.

Chesnay Schepler  于2018年8月16日周四 下午2:29写道:


Hello everyone,

it has been a little over 2 weeks since 1.5.2 was released, and since
then a number of fixes were added to the release-1.5 that I think we
should push to users.

Notable fixes, (among others) are FLINK-9969 which fixes a memory leak
when using batch, FLINK-10066 that reduces the memory print on the JM
for long-running jobs or FLINK-10070 that reverses a regression
introduced in 1.5.2 due to which Flink could not be compiled with
certain maven versions.

I would of course volunteer as Release Manager.






Re: [DISCUSS] Release Flink 1.5.3

2018-08-16 Thread vino yang
Hi Chesnay,

+1
 I want to know when you plan to cut the branch.

Thanks, vino.

Chesnay Schepler  于2018年8月16日周四 下午2:29写道:

> Hello everyone,
>
> it has been a little over 2 weeks since 1.5.2 was released, and since
> then a number of fixes were added to the release-1.5 that I think we
> should push to users.
>
> Notable fixes, (among others) are FLINK-9969 which fixes a memory leak
> when using batch, FLINK-10066 that reduces the memory print on the JM
> for long-running jobs or FLINK-10070 that reverses a regression
> introduced in 1.5.2 due to which Flink could not be compiled with
> certain maven versions.
>
> I would of course volunteer as Release Manager.
>
>


[DISCUSS] Release Flink 1.5.3

2018-08-16 Thread Chesnay Schepler

Hello everyone,

it has been a little over 2 weeks since 1.5.2 was released, and since 
then a number of fixes were added to the release-1.5 that I think we 
should push to users.


Notable fixes, (among others) are FLINK-9969 which fixes a memory leak 
when using batch, FLINK-10066 that reduces the memory print on the JM 
for long-running jobs or FLINK-10070 that reverses a regression 
introduced in 1.5.2 due to which Flink could not be compiled with 
certain maven versions.


I would of course volunteer as Release Manager.