[jira] [Created] (FLINK-11860) Remove all the usage of deprecated unit-provided memory options in docs and scripts

2019-03-07 Thread Yun Tang (JIRA)
Yun Tang created FLINK-11860:


 Summary: Remove all the usage of deprecated unit-provided memory 
options in docs and scripts
 Key: FLINK-11860
 URL: https://issues.apache.org/jira/browse/FLINK-11860
 Project: Flink
  Issue Type: Improvement
  Components: Deployment / Scripts, Documentation
Reporter: Yun Tang
 Fix For: 1.9.0


Currently, options with the unit baked into the key, e.g. {{jobmanager.heap.mb}}
and {{taskmanager.heap.mb}}, have already been deprecated. However, these
options still show up in the documentation and deployment scripts. We should
remove them to avoid confusing users.
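
For reference, a sketch of the migration in flink-conf.yaml — the replacement
keys are the unit-aware options; the values here are only examples:

# deprecated, implicitly megabytes:
# jobmanager.heap.mb: 1024
# taskmanager.heap.mb: 2048

# replacements with an explicit unit:
jobmanager.heap.size: 1024m
taskmanager.heap.size: 2048m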



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11859) Improve SpanningRecordSerializer performance by serializing record length to serialization buffer directly

2019-03-07 Thread Yingjie Cao (JIRA)
Yingjie Cao created FLINK-11859:
---

 Summary: Improve SpanningRecordSerializer performance by 
serializing record length to serialization buffer directly
 Key: FLINK-11859
 URL: https://issues.apache.org/jira/browse/FLINK-11859
 Project: Flink
  Issue Type: Improvement
Reporter: Yingjie Cao
Assignee: Yingjie Cao


In the current implementation of SpanningRecordSerializer, the length of a
record is serialized to an intermediate length buffer and then copied to the
target buffer. Actually, the length field can be serialized directly into the
data buffer (serializationBuffer), which avoids the copy of the length buffer.
Though the total number of bytes copied remains unchanged, this removes one
copy of a small piece of data per record, which incurs high overhead. The
flink-benchmarks results show that it can improve performance; the test
results are as follows.
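
As a rough illustration of the change — the names below are illustrative and
do not match Flink's internals:

import java.nio.ByteBuffer;

// Sketch of the before/after; not the actual SpanningRecordSerializer code.
class LengthPrefixSketch {

    // Before: the length is serialized into a small intermediate buffer
    // first, then that buffer is copied into the target buffer.
    static void writeViaLengthBuffer(ByteBuffer target, byte[] record) {
        ByteBuffer lengthBuffer = ByteBuffer.allocate(4);
        lengthBuffer.putInt(record.length);
        lengthBuffer.flip();
        target.put(lengthBuffer); // the extra copy this issue removes
        target.put(record);
    }

    // After: the length is written directly into the serialization buffer.
    static void writeDirect(ByteBuffer target, byte[] record) {
        target.putInt(record.length);
        target.put(record);
    }
}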

Result with the optimization:
|Benchmark|Mode|Threads|Samples|Score|Score Error (99.9%)|Unit|Param: channelsFlushTimeout|Param: stateBackend|
|KeyByBenchmarks.arrayKeyBy|thrpt|1|30|2228.049605|77.631804|ops/ms| | |
|KeyByBenchmarks.tupleKeyBy|thrpt|1|30|3968.361739|193.501755|ops/ms| | |
|MemoryStateBackendBenchmark.stateBackends|thrpt|1|30|3030.016702|29.272713|ops/ms| |MEMORY|
|MemoryStateBackendBenchmark.stateBackends|thrpt|1|30|2754.77678|26.215395|ops/ms| |FS|
|MemoryStateBackendBenchmark.stateBackends|thrpt|1|30|3001.957606|29.288019|ops/ms| |FS_ASYNC|
|RocksStateBackendBenchmark.stateBackends|thrpt|1|30|123.698984|3.339233|ops/ms| |ROCKS|
|RocksStateBackendBenchmark.stateBackends|thrpt|1|30|126.252137|1.137735|ops/ms| |ROCKS_INC|
|SerializationFrameworkMiniBenchmarks.serializerAvro|thrpt|1|30|323.658098|5.855697|ops/ms| | |
|SerializationFrameworkMiniBenchmarks.serializerKryo|thrpt|1|30|183.34423|3.710787|ops/ms| | |
|SerializationFrameworkMiniBenchmarks.serializerPojo|thrpt|1|30|404.380233|5.131744|ops/ms| | |
|SerializationFrameworkMiniBenchmarks.serializerRow|thrpt|1|30|527.193369|10.176726|ops/ms| | |
|SerializationFrameworkMiniBenchmarks.serializerTuple|thrpt|1|30|550.073024|11.724412|ops/ms| | |
|StreamNetworkBroadcastThroughputBenchmarkExecutor.networkBroadcastThroughput|thrpt|1|30|564.690627|13.766809|ops/ms| | |
|StreamNetworkThroughputBenchmarkExecutor.networkThroughput|thrpt|1|30|49918.11806|2324.234776|ops/ms|100,100ms| |
|StreamNetworkThroughputBenchmarkExecutor.networkThroughput|thrpt|1|30|10443.63491|315.835962|ops/ms|100,100ms,SSL| |
|StreamNetworkThroughputBenchmarkExecutor.networkThroughput|thrpt|1|30|21387.47608|2779.832704|ops/ms|1000,1ms| |
|StreamNetworkThroughputBenchmarkExecutor.networkThroughput|thrpt|1|30|26585.85453|860.243347|ops/ms|1000,100ms| |
|StreamNetworkThroughputBenchmarkExecutor.networkThroughput|thrpt|1|30|8252.563405|947.129028|ops/ms|1000,100ms,SSL| |
|SumLongsBenchmark.benchmarkCount|thrpt|1|30|8806.021402|263.995836|ops/ms| | |
|WindowBenchmarks.globalWindow|thrpt|1|30|4573.620126|112.099391|ops/ms| | |
|WindowBenchmarks.sessionWindow|thrpt|1|30|585.246412|7.026569|ops/ms| | |
|WindowBenchmarks.slidingWindow|thrpt|1|30|449.302134|4.123669|ops/ms| | |
|WindowBenchmarks.tumblingWindow|thrpt|1|30|2979.806858|33.818909|ops/ms| | |
|StreamNetworkLatencyBenchmarkExecutor.networkLatency1to1|avgt|1|30|12.842865|0.13796|ms/op| | |

Result without the optimization:

|Benchmark|Mode|Threads|Samples|Score|Score Error (99.9%)|Unit|Param: channelsFlushTimeout|Param: stateBackend|
|KeyByBenchmarks.arrayKeyBy|thrpt|1|30|2060.241715|59.898485|ops/ms| | |
|KeyByBenchmarks.tupleKeyBy|thrpt|1|30|3645.306819|223.821719|ops/ms| | |
|MemoryStateBackendBenchmark.stateBackends|thrpt|1|30|2992.698822|36.978115|ops/ms| |MEMORY|
|MemoryStateBackendBenchmark.stateBackends|thrpt|1|30|2756.10949|27.798937|ops/ms| |FS|
|MemoryStateBackendBenchmark.stateBackends|thrpt|1|30|2965.969876|44.159793|ops/ms| |FS_ASYNC|
|RocksStateBackendBenchmark.stateBackends|thrpt|1|30|125.506942|1.245978|ops/ms| |ROCKS|
|RocksStateBackendBenchmark.stateBackends|thrpt|1|30|127.258737|1.190588|ops/ms| |ROCKS_INC|
|SerializationFrameworkMiniBenchmarks.serializerAvro|thrpt|1|30|316.497954|8.309241|ops/ms| | |
|SerializationFrameworkMiniBenchmarks.serializerKryo|thrpt|1|30|189.065149|6.302073|ops/ms| | |
|SerializationFrameworkMiniBenchmarks.serializerPojo|thrpt|1|30|391.51305|7.750728|ops/ms| | |
|SerializationFrameworkMiniBenchmarks.serializerRow|thrpt|1|30|513.611151|10.640899|ops/ms| | |
|SerializationFrameworkMiniBenchmarks.serializerTuple|thrpt|1|30|534.184947|14.370082|ops/ms| | |
|StreamNetworkBroadcastThroughputBenchmarkExecutor.networkBroadcastThroughput|thrpt|1|30|483.388618|19.506723|ops/ms| | |
|StreamNetworkThroughputBenchmarkExecutor.networkThroughput|thrpt|1|30|42777.70615|4981.87539|ops/ms|100,100ms| |
|StreamNetworkThroughputBenchmarkExecutor.networkThroughput|thrpt|1|30|10201.48525|286.248845|ops/ms|100,100ms,SSL| |

apply for contributor permission

2019-03-07 Thread 何春平
Hi,

I want to contribute to Apache Flink.
Would you please give me contributor permission?
My JIRA ID is moxian.

Thanks


[jira] [Created] (FLINK-11858) Introduce block compressor/decompressor for batch table runtime

2019-03-07 Thread Kurt Young (JIRA)
Kurt Young created FLINK-11858:
--

 Summary: Introduce block compressor/decompressor for batch table 
runtime
 Key: FLINK-11858
 URL: https://issues.apache.org/jira/browse/FLINK-11858
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Operators
Reporter: Kurt Young
Assignee: Kurt Young






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11857) Introduce BinaryExternalSorter to batch table runtime

2019-03-07 Thread Jingsong Lee (JIRA)
Jingsong Lee created FLINK-11857:


 Summary: Introduce BinaryExternalSorter to batch table runtime
 Key: FLINK-11857
 URL: https://issues.apache.org/jira/browse/FLINK-11857
 Project: Flink
  Issue Type: New Feature
  Components: Runtime / Operators
 Environment: We need a sorter to take full advantage of the high 
performance of the Binary format.
Reporter: Jingsong Lee
Assignee: Jingsong Lee






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11856) Introduce BinaryHashTable and LongHashTable to batch table runtime

2019-03-07 Thread Kurt Young (JIRA)
Kurt Young created FLINK-11856:
--

 Summary: Introduce BinaryHashTable and LongHashTable to batch 
table runtime
 Key: FLINK-11856
 URL: https://issues.apache.org/jira/browse/FLINK-11856
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Operators
Reporter: Kurt Young
Assignee: Kurt Young






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


I want to be contributor

2019-03-07 Thread leocook
Hi Guys,

I want to contribute to Apache Flink.
Would you please give me the permission as a contributor?
My JIRA ID is leocook, thanks!

[jira] [Created] (FLINK-11855) Race condition in EmbeddedLeaderService

2019-03-07 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-11855:
-

 Summary: Race condition in EmbeddedLeaderService
 Key: FLINK-11855
 URL: https://issues.apache.org/jira/browse/FLINK-11855
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.7.2, 1.8.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 1.7.3, 1.8.0


There is a race condition in the {{EmbeddedLeaderService}} which can occur if
the {{EmbeddedLeaderService}} is shut down before the {{GrantLeadershipCall}}
has been executed. In this case, the {{contender}} is nulled, which leads to
an NPE.
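
A minimal sketch of the kind of guard that avoids the NPE — illustrative
names only, not the actual fix:

// Sketch; LeaderContender is simplified here and the names do not
// match Flink's internals.
interface LeaderContender {
    void grantLeadership(java.util.UUID leaderSessionId);
}

final class GrantLeadershipCall implements Runnable {
    private final Object lock = new Object();
    private LeaderContender contender; // nulled when the service shuts down

    GrantLeadershipCall(LeaderContender contender) {
        this.contender = contender;
    }

    @Override
    public void run() {
        synchronized (lock) {
            if (contender == null) {
                return; // already shut down: skip instead of dereferencing null
            }
            contender.grantLeadership(java.util.UUID.randomUUID());
        }
    }

    void shutDown() {
        synchronized (lock) {
            contender = null;
        }
    }
}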



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [ANNOUNCE] Update on Flink 1.8 Release Progress

2019-03-07 Thread Aljoscha Krettek
Hi,

Another update: things are looking mostly good! There is only one release 
blocker remaining [1] as of the time of this writing. Unfortunately, tomorrow 
is a public holiday in Germany so I might not get around to creating a first 
RC, sorry for that!

Regarding https://issues.apache.org/jira/browse/FLINK-11501, I decided not to
include it on the release-1.8 branch because I had some concerns about adding
shaded-guava to flink-core as a dependency. I started a thread about this [2].

I’ll create a first RC as soon as the blockers are resolved and I find the time.

Best,
Aljoscha

[1] https://issues.apache.org/jira/issues/?filter=12334772
[2] https://lists.apache.org/thread.html/d74081ac65c19db790f421f28a70db2a8e0f0c804f08fd0344274515@%3Cdev.flink.apache.org%3E
> On 2. Mar 2019, at 04:47, Thomas Weise  wrote:
> 
> Hi Aljoscha,
> 
> Thanks for driving the release.
> 
> I think it is a good idea for others to take a look at the blog post and
> make sure we capture all goodies worth mentioning.
> 
> I would also like to ask to cherry pick the following into the release
> branch:
> 
> https://issues.apache.org/jira/browse/FLINK-11501 
> 
> 
> Thanks,
> Thomas
> 
> 
> On Fri, Mar 1, 2019 at 7:43 AM Aljoscha Krettek  wrote:
> 
>> Hi Everyone,
>> 
>> We are now about a week after cutting the release-1.8 branch and things
>> are looking quite good! The community has worked hard on fixing bugs and
>> test instabilities. There are now only two issues that are marked as
>> “blocker” in our Jira: [1]. The first of which is about updating the
>> release notes, for which there is an open PR [2], the second one is a bug
>> in TraversableSerializer but Dawid has already found the solution [3].
>> 
>> There is also already a PR for updating our Downloads page for Flink 1.8
>> [4]. This change is a bit bigger than usual because it also restructures
>> the Download page, feel free to have a look in there.
>> 
>> Lastly, the release blog post is also available as a PR [5] and the
>> community is working on finalising that one.
>> 
>> Please comment here if you think there are any issues that should be
>> considered blockers for Flink 1.8.0. I’ll get the release process rolling
>> as soon as those last blockers are out of the way.
>> 
>> Best,
>> Aljoscha
>> 
>> [1] https://issues.apache.org/jira/issues/?filter=12334772
>> [2] https://github.com/apache/flink/pull/7876
>> [3] https://issues.apache.org/jira/browse/FLINK-11420
>> [4] https://github.com/apache/flink-web/pull/180
>> [5] https://github.com/apache/flink-web/pull/179


Re: Sharing state between subtasks

2019-03-07 Thread Thomas Weise
The state sharing support will be part of the upcoming 1.8 release.

We have also completed most of the synchronization work for the Kinesis
consumer and will contribute those changes to Flink soon.

Most of the code will be reusable for the Kafka consumer.

We will need the same support in the Kafka consumer but have not yet started
the integration work.

Thomas

On Thu, Mar 7, 2019 at 6:53 AM Gerard Garcia  wrote:

> Any advance related to synchronizing ingestion by event/ingestion-time
> between kafka partitions?
>
> On Thu, Nov 15, 2018 at 1:27 AM Jamie Grier 
> wrote:
>
> > Hey all,
> >
> > I think all we need for this on the state sharing side is pretty
> simple.  I
> > opened a JIRA to track this work and submitted a PR for the state sharing
> > bit.
> >
> > https://issues.apache.org/jira/browse/FLINK-10886
> > https://github.com/apache/flink/pull/7099
> >
> > Please provide feedback :)
> >
> > -Jamie
> >
> >
> > On Thu, Nov 1, 2018 at 3:33 AM Till Rohrmann 
> wrote:
> >
> > > Hi Thomas,
> > >
> > > using Akka directly would further manifest our dependency on Scala in
> > > flink-runtime. This is something we are currently trying to get rid of.
> > For
> > > that purpose we have added the RpcService abstraction which
> encapsulates
> > > all Akka specific logic. We hope that we can soon get rid of the Scala
> > > dependency in flink-runtime by using a special class loader only for
> > > loading the AkkaRpcService implementation.
> > >
> > > I think the easiest way to sync the task information is actually going
> > > through the JobMaster because the subtasks don't know on which other
> TMs
> > > the other subtasks run. Otherwise, we would need to have some TM
> > detection
> > > mechanism between TMs. If you choose this way, then you should be able
> to
> > > use the RpcService by extending the JobMasterGateway by additional
> RPCs.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Wed, Oct 31, 2018 at 6:52 PM Thomas Weise  wrote:
> > >
> > > > Hi,
> > > >
> > > > We are planning to work on the Kinesis consumer in the following
> order:
> > > >
> > > > 1. Add per shard watermarking:
> > > > https://issues.apache.org/jira/browse/FLINK-5697 - this will be code
> > we
> > > > already use internally; I will open a PR to add it to the Flink
> Kinesis
> > > > consumer
> > > > 2. Exchange of per subtask watermarks between all subtasks of one or
> > > > multiple sources
> > > > 3. Implement queue approach described in Jamie's document in to
> utilize
> > > 1.)
> > > > and 2.) to align the shard consumers WRT event time
> > > >
> > > > There was some discussion regarding the mechanism to share the
> > watermarks
> > > > between subtasks. If there is something that can be re-used it would
> be
> > > > great. Otherwise I'm going to further investigate the Akka or JGroups
> > > > route. Regarding Akka, since it is used within Flink already, is
> there
> > an
> > > > abstraction that you would recommend to consider to avoid direct
> > > > dependency?
> > > >
> > > > Thanks,
> > > > Thomas
> > > >
> > > >
> > > >
> > > > On Thu, Oct 18, 2018 at 8:34 PM Zhijiang(wangzhijiang999)
> > > >  wrote:
> > > >
> > > > > Not yet. We only have some initial thoughts and have not worked on
> it
> > > > yet.
> > > > > We will update the progress in this discussion if have.
> > > > >
> > > > > Best,
> > > > > Zhijiang
> > > > > --
> > > > > 发件人:Aljoscha Krettek 
> > > > > 发送时间:2018年10月18日(星期四) 17:53
> > > > > 收件人:dev ; Zhijiang(wangzhijiang999) <
> > > > > wangzhijiang...@aliyun.com>
> > > > > 抄 送:Till Rohrmann 
> > > > > 主 题:Re: Sharing state between subtasks
> > > > >
> > > > > Hi Zhijiang,
> > > > >
> > > > > do you already have working code or a design doc for the second
> > > approach?
> > > > >
> > > > > Best,
> > > > > Aljoscha
> > > > >
> > > > > > On 18. Oct 2018, at 08:42, Zhijiang(wangzhijiang999) <
> > > > > wangzhijiang...@aliyun.com.INVALID> wrote:
> > > > > >
> > > > > > Just noticed this discussion from @Till Rohrmann's weekly
> community
> > > > > update and I want to share some thoughts from our experiences.
> > > > > >
> > > > > > We also encountered the source consumption skew issue before, and
> we
> > > are
> > > > > focused on improving this by two possible ways.
> > > > > >
> > > > > > 1. Control the read strategy by the downstream side. In detail,
> > every
> > > > > input channel in downstream task corresponds to the consumption of
> > one
> > > > > upstream source task, and we will tag each input channel with
> > watermark
> > > > to
> > > > > find the lowest channel to read in high priority. In essence, we
> > > actually
> > > > > rely on the mechanism of backpressure. If the channel with highest
> > > > > timestamp is not read by downstream task for a while, it will block
> > the
> > > > > corresponding source task to read when the buffers are exhausted.
> It
> > is
> > > > no
> > > > > need to change the source interface in this way, but 
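
For reference, the kind of JobMasterGateway extension Till describes above
could look roughly like this — a hypothetical sketch, not an actual Flink
interface:

import java.util.concurrent.CompletableFuture;

// Hypothetical RPC extension; names and signatures are assumptions.
interface WatermarkAggregationGateway {

    /** Each source subtask reports its current watermark to the JobMaster. */
    CompletableFuture<Void> reportSubtaskWatermark(int subtaskIndex, long watermark);

    /** Subtasks poll the global minimum to align shard/partition consumption. */
    CompletableFuture<Long> getGlobalMinimumWatermark();
}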

Re: Sharing state between subtasks

2019-03-07 Thread Gerard Garcia
Any advance related to synchronizing ingestion by event/ingestion-time
between kafka partitions?

On Thu, Nov 15, 2018 at 1:27 AM Jamie Grier  wrote:

> Hey all,
>
> I think all we need for this on the state sharing side is pretty simple.  I
> opened a JIRA to track this work and submitted a PR for the state sharing
> bit.
>
> https://issues.apache.org/jira/browse/FLINK-10886
> https://github.com/apache/flink/pull/7099
>
> Please provide feedback :)
>
> -Jamie
>
>
> On Thu, Nov 1, 2018 at 3:33 AM Till Rohrmann  wrote:
>
> > Hi Thomas,
> >
> > using Akka directly would further manifest our dependency on Scala in
> > flink-runtime. This is something we are currently trying to get rid of.
> For
> > that purpose we have added the RpcService abstraction which encapsulates
> > all Akka specific logic. We hope that we can soon get rid of the Scala
> > dependency in flink-runtime by using a special class loader only for
> > loading the AkkaRpcService implementation.
> >
> > I think the easiest way to sync the task information is actually going
> > through the JobMaster because the subtasks don't know on which other TMs
> > the other subtasks run. Otherwise, we would need to have some TM
> detection
> > mechanism between TMs. If you choose this way, then you should be able to
> > use the RpcService by extending the JobMasterGateway by additional RPCs.
> >
> > Cheers,
> > Till
> >
> > On Wed, Oct 31, 2018 at 6:52 PM Thomas Weise  wrote:
> >
> > > Hi,
> > >
> > > We are planning to work on the Kinesis consumer in the following order:
> > >
> > > 1. Add per shard watermarking:
> > > https://issues.apache.org/jira/browse/FLINK-5697 - this will be code
> we
> > > already use internally; I will open a PR to add it to the Flink Kinesis
> > > consumer
> > > 2. Exchange of per subtask watermarks between all subtasks of one or
> > > multiple sources
> > > 3. Implement queue approach described in Jamie's document in to utilize
> > 1.)
> > > and 2.) to align the shard consumers WRT event time
> > >
> > > There was some discussion regarding the mechanism to share the
> watermarks
> > > between subtasks. If there is something that can be re-used it would be
> > > great. Otherwise I'm going to further investigate the Akka or JGroups
> > > route. Regarding Akka, since it is used within Flink already, is there
> an
> > > abstraction that you would recommend to consider to avoid direct
> > > dependency?
> > >
> > > Thanks,
> > > Thomas
> > >
> > >
> > >
> > > On Thu, Oct 18, 2018 at 8:34 PM Zhijiang(wangzhijiang999)
> > >  wrote:
> > >
> > > > Not yet. We only have some initial thoughts and have not worked on it
> > > yet.
> > > > We will update the progress in this discussion if have.
> > > >
> > > > Best,
> > > > Zhijiang
> > > > --
> > > > 发件人:Aljoscha Krettek 
> > > > 发送时间:2018年10月18日(星期四) 17:53
> > > > 收件人:dev ; Zhijiang(wangzhijiang999) <
> > > > wangzhijiang...@aliyun.com>
> > > > 抄 送:Till Rohrmann 
> > > > 主 题:Re: Sharing state between subtasks
> > > >
> > > > Hi Zhijiang,
> > > >
> > > > do you already have working code or a design doc for the second
> > approach?
> > > >
> > > > Best,
> > > > Aljoscha
> > > >
> > > > > On 18. Oct 2018, at 08:42, Zhijiang(wangzhijiang999) <
> > > > wangzhijiang...@aliyun.com.INVALID> wrote:
> > > > >
> > > > > Just noticed this discussion from @Till Rohrmann's weekly community
> > > > update and I want to share some thoughts from our experiences.
> > > > >
> > > > > We also encountered the source consumption skew issue before, and we
> > are
> > > > focused on improving this by two possible ways.
> > > > >
> > > > > 1. Control the read strategy by the downstream side. In detail,
> every
> > > > input channel in downstream task corresponds to the consumption of
> one
> > > > upstream source task, and we will tag each input channel with
> watermark
> > > to
> > > > find the lowest channel to read in high priority. In essence, we
> > actually
> > > > rely on the mechanism of backpressure. If the channel with highest
> > > > timestamp is not read by downstream task for a while, it will block
> the
> > > > corresponding source task to read when the buffers are exhausted.
> > > > There is no need to change the source interface in this way, but there
> > > > are two major concerns: first, it will affect the barrier alignment,
> > > > resulting in checkpoints being delayed or expired. Second, it cannot
> > > > confirm source consumption alignment very precisely; it is just a
> > > > best-effort way. So we gave up this way finally.
> > > > >
> > > > > 2. Add the new component of SourceCoordinator to coordinate the
> > source
> > > > consumption distributedly. For example we can start this componnet in
> > the
> > > > JobManager like the current role of CheckpointCoordinator. Then every
> > > > source task would commnicate with JobManager via current RPC
> mechanism,
> > > > maybe we can rely on the heartbeat message 

Re: JobManager scale limitation - Slow S3 checkpoint deletes

2019-03-07 Thread Jamie Grier
All of the suggestions in this thread are good ones. I had also considered
farming the actual cleanup work out to all the TaskManagers as well, but
didn't realize how easy the fix for this might be. Thanks, Till.

With Till's change https://github.com/apache/flink/pull/7924 the problem
we're actually experiencing should be fixed, so I'm not going to pursue this
any further right now.
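
The general pattern behind that fix is to take the blocking checkpoint
disposal off the RPC main thread and run it on a dedicated I/O executor —
roughly something like this sketch (made-up names, not the actual patch):

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;

// Sketch only; names are illustrative and do not match Flink's classes.
class CheckpointDisposer {
    // Dedicated pool for blocking file system I/O (S3/HDFS deletes).
    private final Executor ioExecutor = Executors.newFixedThreadPool(4);

    // The actor/main thread only schedules the discard and returns
    // immediately, so heartbeats and RPCs never stall behind slow deletes.
    CompletableFuture<Void> discardAsync(Runnable discardStateHandles) {
        return CompletableFuture.runAsync(discardStateHandles, ioExecutor);
    }
}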

-Jamie


On Thu, Mar 7, 2019 at 2:29 AM Till Rohrmann  wrote:

> I think part of the problem is that we currently use the executor of the
> common RpcService to run the I/O operations as Stephan suspected [1]. I
> will be fixing this problem for 1.8.0 and 1.7.3.
>
> This should resolve the problem but supporting different means of clean up
> might still be interesting to add.
>
> [1] https://issues.apache.org/jira/browse/FLINK-11851
>
> Cheers,
> Till
>
> On Thu, Mar 7, 2019 at 8:56 AM Yun Tang  wrote:
>
> > Sharing the communication pressure of a single node across multiple task
> > managers would be a good idea. From my point of view, letting task
> > managers know that some specific checkpoint has already been aborted
> > could benefit a lot of things:
> >
> >   *   Let task managers clean up the files, which is the topic of this
> > thread.
> >   *   Let `StreamTask` cancel an aborted running checkpoint on the task
> > side, just as https://issues.apache.org/jira/browse/FLINK-8871 wants
> > to achieve.
> >   *   Let the local state store prune local checkpoints as soon as
> > possible without waiting for the next `notifyCheckpointComplete` to come.
> >   *   Let the state backend on the task manager side do something on its
> > side, which would be really helpful for state backends that disaggregate
> > computation and storage.
> >
> > Best
> > Yun Tang
> > 
> > From: Thomas Weise 
> > Sent: Thursday, March 7, 2019 12:06
> > To: dev@flink.apache.org
> > Subject: Re: JobManager scale limitation - Slow S3 checkpoint deletes
> >
> > Nice!
> >
> > Perhaps for file systems without TTL/expiration support (AFAIK includes
> > HDFS), cleanup could be performed in the task managers?
> >
> >
> > On Wed, Mar 6, 2019 at 6:01 PM Jamie Grier 
> > wrote:
> >
> > > Yup, it looks like the actor threads are spending all of their time
> > > communicating with S3.  I've attached a picture of a typical stack
> trace
> > > for one of the actor threads [1].  At the end of that call stack what
> > > you'll see is the thread blocking on synchronous communication with the
> > S3
> > > service.  This is for one of the flink-akka.actor.default-dispatcher
> > > threads.
> > >
> > > I've also attached a link to a YourKit snapshot if you'd like to
> explore
> > > the profiling data in more detail [2]
> > >
> > > [1]
> > >
> > >
> >
> https://drive.google.com/open?id=0BzMcf4IvnGWNNjEzMmRJbkFiWkZpZDFtQWo4LXByUXpPSFpR
> > > [2] https://drive.google.com/open?id=1iHCKJT-PTQUcDzFuIiacJ1MgAkfxza3W
> > >
> > >
> > >
> > > On Wed, Mar 6, 2019 at 7:41 AM Stephan Ewen  wrote:
> > >
> > > > I think having an option to not actively delete checkpoints (but
> rather
> > > > have the TTL feature of the file system take care of it) sounds like
> a
> > > good
> > > > idea.
> > > >
> > > > I am curious why you get heartbeat misses and akka timeouts during
> > > deletes.
> > > > Are some parts of the deletes happening synchronously in the actor
> > thread?
> > > >
> > > > On Wed, Mar 6, 2019 at 3:40 PM Jamie Grier 
> > > > wrote:
> > > >
> > > > > We've run into an issue that limits the max parallelism of jobs we
> > can
> > > > run
> > > > > and what it seems to boil down to is that the JobManager becomes
> > > > > unresponsive while essentially spending all of its time discarding
> > > > > checkpoints from S3.  This results in sluggish UI, sporadic
> > > > > AkkaAskTimeouts, heartbeat misses, etc.
> > > > >
> > > > > Since S3 (and I assume HDFS) have policy that can be used to
> discard
> > > old
> > > > > objects without Flink actively deleting them I think it would be a
> > > useful
> > > > > feature to add the option to Flink to not ever discard checkpoints.
> > I
> > > > > believe this will solve the problem.
> > > > >
> > > > > Any objections or other known solutions to this problem?
> > > > >
> > > > > -Jamie
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] Create a Flink ecosystem website

2019-03-07 Thread Becket Qin
Absolutely! Thanks for the pointer. I'll submit a PR to update the
ecosystem page and the navigation.

Thanks,

Jiangjie (Becket) Qin

On Thu, Mar 7, 2019 at 8:47 PM Robert Metzger  wrote:

> Okay. I will reach out to spark-packages.org and see if they are willing
> to share.
>
> Do you want to raise a PR to update the ecosystem page (maybe sync with
> the "Software Projects" listed here:
> https://cwiki.apache.org/confluence/display/FLINK/Powered+by+Flink) and
> link it in the navigation?
>
> Best,
> Robert
>
>
> On Thu, Mar 7, 2019 at 10:13 AM Becket Qin  wrote:
>
>> Hi Robert,
>>
>> I think it is at least worth checking whether the spark-packages.org owners are
>> willing to share. Thanks for volunteering to write the requirement
>> descriptions! In any case, that will be very helpful.
>>
>> Since a static page has almost no cost, and we will need it to redirect
>> to the dynamic site anyways, how about we first do that while working on
>> the dynamic website?
>>
>> Thanks,
>>
>> Jiangjie (Becket) Qin
>>
>> On Thu, Mar 7, 2019 at 4:59 AM Ufuk Celebi  wrote:
>>
>>> I like Shaoxuan's idea to keep this a static site first. We could then
>>> iterate on this and make it a dynamic thing. Of course, if we have the
>>> resources in the community to quickly start with a dynamic site, I'm
>>> not opposed.
>>>
>>> – Ufuk
>>>
>>> On Wed, Mar 6, 2019 at 2:31 PM Robert Metzger 
>>> wrote:
>>> >
>>> > Awesome! Thanks a lot for looking into this Becket! The VMs hosted by
>>> Infra
>>> > look suitable.
>>> >
>>> > @Shaoxuan: There is actually already a static page. It used to be
>>> linked,
>>> > but has been removed from the navigation bar for some reason. This is
>>> the
>>> > page: https://flink.apache.org/ecosystem.html
>>> > We could update the page and add it back to the navigation bar for the
>>> > coming weeks. What do you think?
>>> >
>>> > I would actually like to push for a dynamic page right away.
>>> >
>>> > I know it's kind of a bold move, but how do you feel about sending the
>>> > owners of spark-packages.org a short note, if they are interested in
>>> > sharing the source? We could maintain the code together in a public
>>> repo.
>>> > If they are not interested in sharing, or we decide not to ask in the
>>> first
>>> > place, I'm happy to write down a short description of the requirements,
>>> > maybe some mockups. We could then see if we find somebody here in the
>>> > community who's willing to implement it.
>>> > Given the number of people who are eager to contribute, I believe we
>>> will
>>> > be able to find somebody pretty soon.
>>> >
>>> >
>>> > On Wed, Mar 6, 2019 at 3:49 AM Becket Qin 
>>> wrote:
>>> >
>>> > > Forgot to provide the link...
>>> > >
>>> > > [1] https://www.apache.org/dev/services.html#blogs (Apache infra
>>> services)
>>> > > [2] https://www.apache.org/dev/freebsd-jails (FreeBSD Jail provided
>>> by
>>> > > Apache Infra)
>>> > >
>>> > > On Wed, Mar 6, 2019 at 10:46 AM Becket Qin 
>>> wrote:
>>> > >
>>> > >> Hi Robert,
>>> > >>
>>> > >> Thanks for the feedback. These are good points. We should absolutely
>>> > >> shoot for a dynamic website to support more interactions in the
>>> community.
>>> > >> There might be a few things to solve:
>>> > >> 1. The website code itself. An open source solution would be great.
>>> TBH,
>>> > >> I do not have much experience on building a website. It'll be great
>>> if
>>> > >> someone could help comment on the solution here.
>>> > >> 2. The hardware to host the website. Apache Infra provides a few
>>> > >> services[1] that Apache projects can leverage. I did not see
>>> database
>>> > >> service, but maybe we can run a simple MySQL db in FreeBSD jail[2].
>>> > >>
>>> > >> @Bowen & vino, thanks for the positive feedback!
>>> > >>
>>> > >> @Shaoxuan Wang 
>>> > >> Thanks for the suggestion. That sounds reasonable to me. We
>>> probably need
>>> > >> a page on the Flink official site anyway, even if it just provides
>>> links to
>>> > >> the ecosystem website. So listing the connectors in that static
>>> page seems
>>> > >> something we could start with while we are working on the dynamic
>>> pages.
>>> > >>
>>> > >> Thanks,
>>> > >>
>>> > >> Jiangjie (Becket) Qin
>>> > >>
>>> > >> On Wed, Mar 6, 2019 at 10:40 AM Shaoxuan Wang 
>>> > >> wrote:
>>> > >>
>>> > >>> Hi Becket and Robert,
>>> > >>>
>>> > >>> I like this idea!  Let us roll this out with Flink connectors at
>>> the
>>> > >>> first beginning. We can start with a static page, and upgrade it
>>> when we
>>> > >>> find a better solution for dynamic one with rich functions.
>>> > >>>
>>> > >>> Regards,
>>> > >>> Shaoxuan
>>> > >>>
>>> > >>>
>>> > >>> On Wed, Mar 6, 2019 at 1:36 AM Robert Metzger >> >
>>> > >>> wrote:
>>> > >>>
>>> >  Hey Becket,
>>> > 
>>> >  This is a great idea!
>>> >  For this to be successful, we need to make sure the page is placed
>>> >  prominently so that the people submitting something will get
>>> attention for
>>> >  their contributions.
>>> >  I 

Re: [DISCUSS] Improve the flinkbot

2019-03-07 Thread Robert Metzger
Each Jira ticket has a "last updated" field, and in a JIRA search, you can
sort results by that field.

So I will regularly check all Jira tickets which have been updated since
the last time my tool checked. For all changed Jira tickets, I'll update
the PR if the component has changed.

The implementation will be a bit different, to avoid running into rate limits
with the JIRA or GitHub APIs.
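
For illustration, the poll could be a JQL filter along these lines — the
project and timestamp are placeholders, not the actual tool's query:

project = FLINK AND updated >= "2019-03-07 14:00" ORDER BY updated ASC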

On Thu, Mar 7, 2019 at 2:40 PM Chesnay Schepler  wrote:

> How do you intend to keep the label up-to-date with whatever
> modifications are made in JIRA?
>
> On 07.03.2019 13:40, Robert Metzger wrote:
> > I will automatically assign the Jira component as a label to the PR, yes.
> > You won't have to manually update the label on the PR, this will be done
> > automatically.
> >
> > So JIRA will stay the ground truth for setting the component correctly.
> >
> > On Thu, Mar 7, 2019 at 10:11 AM Chesnay Schepler 
> wrote:
> >
> >> Component labels seem a bit redundant. Every JIRA with an open PR
> >> already has a "pull-request-available" tag.
> >> So this information already exists.
> >>
> >> I assume you'll base the labels on the component tags at the time the PR
> >> is opened, but this also implies that they may be set incorrectly (or
> >> not at all) by the contributor. In this case we now have to update the
> >> component both in JIRA and on GitHub, and I'm most certainly not looking
> >> forward to that.
> >>
> >> On 06.03.2019 13:51, Robert Metzger wrote:
> >>> This is the picture:
> >>>
> >>
> https://user-images.githubusercontent.com/89049/53882383-7fda9380-4016-11e9-877d-10cdc00bdfbd.png
> >>> Speaking about feature requests, priorities and time-spend: My plan was
> >> to
> >>> now work on introducing a new label category for the components.
> >>> This should get us a lot better overview over the per-component
> >>> status/health of pull requests.
> >>>
> >>>
> >>> On Wed, Mar 6, 2019 at 12:58 PM Chesnay Schepler 
> >> wrote:
>  The image didn't go through.
> 
>  I would keep it as is; imo there are significantly more important
> things
>  that I'd like Robert to spend time on. (literally everything in the
>  Feature requests section)
> 
>  If we want to better distinguish new PRs I would suggest to either a)
>  introduce a dedicated "New" label or b) not attach any label by
> default,
>  and only attach the description label if someone has
>  approved/disapproved it.
> 
>  On 06.03.2019 12:37, Robert Metzger wrote:
> > Hey Kurt,
> > thanks a lot for this idea.
> >
> > My reasoning behind using just one color is the following: I wanted
> to
> > use one color per category of labels.
> > So when we are introducing labels for components, that it'll look
> like
> > this:
> >
> > image.png
> >
> > But we could of course also go with color families per category. So
> > "review" is green colors, "component" is red colors and so on.
> >
> > If nobody objects (or agrees) with me, I'll change the colors soon.
> >
> >
> > On Wed, Mar 6, 2019 at 7:51 AM Kurt Young  > > wrote:
> >
> >   Hi Dev,
> >
> >   I've been using the flinkbot and the label for a couple days,
> it
> >   worked
> >   really well! I have a minor suggestion, can we
> >   use different colors for different labels? We don't need to
> have
> >   different
> >   colors for every label, but only to distinguish whether
> >   someone has reviewed the PR.
> >   For example, "review=description?" is the initial default
> label,
> >   and it may
> >   indicate that no reviewer has tried to review it yet.
> >
> >   For "review=architecture?", "review=consensus?",
> >   "review=quality?", they
> >   indicate that at least someone has tried to review it and
> >   approved something. It sounds like the review is in progress.
> >
> >   For "review=approved ✅", it indicates the review is finished.
> >
> >   So I think 3 colors are enough; they tell committers whether the
> >   review has
> >   not started yet, is in progress, or is finished.
> >
> >   What do you think?
> >
> >   Best,
> >   Kurt
> >
> >
> >   On Mon, Mar 4, 2019 at 6:50 PM Robert Metzger <
> >> rmetz...@apache.org
> >   > wrote:
> >
> >   > GitHub has two methods for authentication with the APIs:
> >   > a) using an account's oauth token
> >   > b) using the GitHub Apps API
> >   >
> >   > Most of the libraries for the GH API use a), so does
> Flinkbot.
> >   The problem
> >   > with a) is that it does not allow for fine-grained access
> >   control, and
> >   > Infra does not want to give Flinkbot "write" access to
> >   "apache/flink".
> 

Re: [DISCUSS] Improve the flinkbot

2019-03-07 Thread Chesnay Schepler
How do you intend to keep the label up-to-date with whatever 
modifications are made in JIRA?


On 07.03.2019 13:40, Robert Metzger wrote:

I will automatically assign the Jira component as a label to the PR, yes.
You won't have to manually update the label on the PR, this will be done
automatically.

So JIRA will stay the ground truth for setting the component correctly.

On Thu, Mar 7, 2019 at 10:11 AM Chesnay Schepler  wrote:


Component labels seem a bit redundant. Every JIRA with an open PR
already has a "pull-request-available" tag.
So this information already exists.

I assume you'll base the labels on the component tags at the time the PR
is opened, but this also implies that they may be set incorrectly (or
not at all) by the contributor. In this case we now have to update the
component both in JIRA and on GitHub, and I'm most certainly not looking
forward to that.

On 06.03.2019 13:51, Robert Metzger wrote:

This is the picture:


https://user-images.githubusercontent.com/89049/53882383-7fda9380-4016-11e9-877d-10cdc00bdfbd.png

Speaking about feature requests, priorities and time-spend: My plan was

to

now work on introducing a new label category for the components.
This should get us a lot better overview over the per-component
status/health of pull requests.


On Wed, Mar 6, 2019 at 12:58 PM Chesnay Schepler 

wrote:

The image didn't go through.

I would keep it as is; imo there are significantly more important things
that I'd like Robert to spend time on. (literally everything in the
Feature requests section)

If we want to better distinguish new PRs I would suggest to either a)
introduce a dedicated "New" label or b) not attach any label by default,
and only attach the description label if someone has
approved/disapproved it.

On 06.03.2019 12:37, Robert Metzger wrote:

Hey Kurt,
thanks a lot for this idea.

My reasoning behind using just one color is the following: I wanted to
use one color per category of labels.
So when we are introducing labels for components, that it'll look like
this:

image.png

But we could of course also go with color families per category. So
"review" is green colors, "component" is red colors and so on.

If nobody objects (or agrees) with me, I'll change the colors soon.


On Wed, Mar 6, 2019 at 7:51 AM Kurt Young mailto:ykt...@gmail.com>> wrote:

  Hi Dev,

  I've been using the flinkbot and the label for a couple days, it
  worked
  really well! I have a minor suggestion, can we
  use different colors for different labels? We don't need to have
  different
  colors for every label, but only to distinguish whether
  someone has reviewed the PR.
  For example, "review=description?" is the initial default label,
  and it may
  indicate that no reviewer has tried to review it yet.

  For "review=architecture?", "review=consensus?",
  "review=quality?", they
  indicate that at least someone has tried to review it and
  approved something. It sounds like the review is in progress.

  For "review=approved ✅", it indicates the review is finished.

  So I think 3 colors are enough; they tell committers whether the
  review has
  not started yet, is in progress, or is finished.

  What do you think?

  Best,
  Kurt


  On Mon, Mar 4, 2019 at 6:50 PM Robert Metzger <

rmetz...@apache.org

  > wrote:

  > GitHub has two methods for authentication with the APIs:
  > a) using an account's oauth token
  > b) using the GitHub Apps API
  >
  > Most of the libraries for the GH API use a), so does Flinkbot.
  The problem
  > with a) is that it does not allow for fine-grained access
  control, and
  > Infra does not want to give Flinkbot "write" access to
  "apache/flink".
  > That's why I need to rewrite parts of the bot to support b),
  which allows
  > to give access only a repo's metadata, but not the code itself.
  >
  >
  >
  >
  > On Sat, Mar 2, 2019 at 12:42 AM Thomas Weise mailto:t...@apache.org>> wrote:
  >
  > > It would be good to encourage participation of non-committers
  in the
  > review
  > > process, so +1 for allowing everyone to operate the bot.
  > >
  > > Github approval will show a green checkmark for committer

approval

  > > (assuming accounts were linked via gitbox) - that should

provide

  > sufficient
  > > orientation?
  > >
  > > I just noticed that flinkbot seems to act as Robert when it
  comes to
  > label
  > > management? I think that is confusing (besides earning Robert
  a lot of
  > > extra github notification mail thanks to participation on
  every PR :)
  > >
  > > Overall flinkbot is very useful, thanks for all the work on
  it! I heard
  > > positive feedback from other contributors, I think they see

their

  > > contributions are better received now.
  > >
 

[jira] [Created] (FLINK-11854) Introduce batch physical nodes

2019-03-07 Thread godfrey he (JIRA)
godfrey he created FLINK-11854:
--

 Summary: Introduce batch physical nodes
 Key: FLINK-11854
 URL: https://issues.apache.org/jira/browse/FLINK-11854
 Project: Flink
  Issue Type: New Feature
  Components: API / Table SQL
Reporter: godfrey he
Assignee: godfrey he






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] Create a Flink ecosystem website

2019-03-07 Thread Robert Metzger
Okay. I will reach out to spark-packages.org and see if they are willing to
share.

Do you want to raise a PR to update the ecosystem page (maybe sync with the
"Software Projects" listed here:
https://cwiki.apache.org/confluence/display/FLINK/Powered+by+Flink) and
link it in the navigation?

Best,
Robert


On Thu, Mar 7, 2019 at 10:13 AM Becket Qin  wrote:

> Hi Robert,
>
> I think it is at least worth checking whether the spark-packages.org owners are
> willing to share. Thanks for volunteering to write the requirement
> descriptions! In any case, that will be very helpful.
>
> Since a static page has almost no cost, and we will need it to redirect to
> the dynamic site anyways, how about we first do that while working on the
> dynamic website?
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Thu, Mar 7, 2019 at 4:59 AM Ufuk Celebi  wrote:
>
>> I like Shaoxuan's idea to keep this a static site first. We could then
>> iterate on this and make it a dynamic thing. Of course, if we have the
>> resources in the community to quickly start with a dynamic site, I'm
>> not opposed.
>>
>> – Ufuk
>>
>> On Wed, Mar 6, 2019 at 2:31 PM Robert Metzger 
>> wrote:
>> >
>> > Awesome! Thanks a lot for looking into this Becket! The VMs hosted by
>> Infra
>> > look suitable.
>> >
>> > @Shaoxuan: There is actually already a static page. It used to be
>> linked,
>> > but has been removed from the navigation bar for some reason. This is
>> the
>> > page: https://flink.apache.org/ecosystem.html
>> > We could update the page and add it back to the navigation bar for the
>> > coming weeks. What do you think?
>> >
>> > I would actually like to push for a dynamic page right away.
>> >
>> > I know it's kind of a bold move, but how do you feel about sending the
>> > owners of spark-packages.org a short note, if they are interested in
>> > sharing the source? We could maintain the code together in a public
>> repo.
>> > If they are not interested in sharing, or we decide not to ask in the
>> first
>> > place, I'm happy to write down a short description of the requirements,
>> > maybe some mockups. We could then see if we find somebody here in the
>> > community who's willing to implement it.
>> > Given the number of people who are eager to contribute, I believe we
>> will
>> > be able to find somebody pretty soon.
>> >
>> >
>> > On Wed, Mar 6, 2019 at 3:49 AM Becket Qin  wrote:
>> >
>> > > Forgot to provide the link...
>> > >
>> > > [1] https://www.apache.org/dev/services.html#blogs (Apache infra
>> services)
>> > > [2] https://www.apache.org/dev/freebsd-jails (FreeBSD Jail provided
>> by
>> > > Apache Infra)
>> > >
>> > > On Wed, Mar 6, 2019 at 10:46 AM Becket Qin 
>> wrote:
>> > >
>> > >> Hi Robert,
>> > >>
>> > >> Thanks for the feedback. These are good points. We should absolutely
>> > >> shoot for a dynamic website to support more interactions in the
>> community.
>> > >> There might be a few things to solve:
>> > >> 1. The website code itself. An open source solution would be great.
>> TBH,
>> > >> I do not have much experience on building a website. It'll be great
>> if
>> > >> someone could help comment on the solution here.
>> > >> 2. The hardware to host the website. Apache Infra provides a few
>> > >> services[1] that Apache projects can leverage. I did not see database
>> > >> service, but maybe we can run a simple MySQL db in FreeBSD jail[2].
>> > >>
>> > >> @Bowen & vino, thanks for the positive feedback!
>> > >>
>> > >> @Shaoxuan Wang 
>> > >> Thanks for the suggestion. That sounds reasonable to me. We probably
>> need
>> > >> a page in the Flink official site anyways, even just provide links
>> it to
>> > >> the ecosystem website. So listing the connectors in that static page
>> seems
>> > >> something we could start with while we are working on the dynamic
>> pages.
>> > >>
>> > >> Thanks,
>> > >>
>> > >> Jiangjie (Becket) Qin
>> > >>
>> > >> On Wed, Mar 6, 2019 at 10:40 AM Shaoxuan Wang 
>> > >> wrote:
>> > >>
>> > >>> Hi Becket and Robert,
>> > >>>
>> > >>> I like this idea!  Let us roll this out with Flink connectors at the
>> > >>> first beginning. We can start with a static page, and upgrade it
>> when we
>> > >>> find a better solution for dynamic one with rich functions.
>> > >>>
>> > >>> Regards,
>> > >>> Shaoxuan
>> > >>>
>> > >>>
>> > >>> On Wed, Mar 6, 2019 at 1:36 AM Robert Metzger 
>> > >>> wrote:
>> > >>>
>> >  Hey Becket,
>> > 
>> >  This is a great idea!
>> >  For this to be successful, we need to make sure the page is placed
>> >  prominently so that the people submitting something will get
>> attention for
>> >  their contributions.
>> >  I think a dynamic site would probably be better, if we want
>> features
>> >  such as up and downvoting or comments.
>> >  I would also like this to be hosted on Apache infra, and endorsed
>> by
>> >  the community.
>> > 
>> >  Does anybody here know any existing software that we could use?
>> >  The only think I was able to find 

Re: Fwd: DataStream EventTime last data cannot be output?

2019-03-07 Thread ??????
A watermark greater than the end timestamp of the window should trigger the window.
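
One way to make the last window fire even when no newer records arrive is to
let the watermark keep advancing from wall-clock time while the source is
idle. A rough sketch against the Flink 1.7 API — the idle threshold and the
timestamp parsing are assumptions to adapt to your data (the out-of-orderness
matches the 2 seconds used in the quoted program):

import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
import org.apache.flink.streaming.api.watermark.Watermark;

public class IdleAwareWatermarkAssigner implements AssignerWithPeriodicWatermarks<String> {

    private static final long MAX_OUT_OF_ORDERNESS = 2_000L; // as in the quoted program
    private static final long IDLE_TIMEOUT = 5_000L;         // assumed idle threshold

    private long currentMaxTimestamp = Long.MIN_VALUE + MAX_OUT_OF_ORDERNESS;
    private long lastElementWallClock = System.currentTimeMillis();

    @Override
    public Watermark getCurrentWatermark() {
        long base = currentMaxTimestamp - MAX_OUT_OF_ORDERNESS;
        long idleFor = System.currentTimeMillis() - lastElementWallClock;
        // If no records arrived for a while, push the watermark forward anyway
        // so the last window can still be triggered.
        return new Watermark(idleFor > IDLE_TIMEOUT ? base + idleFor : base);
    }

    @Override
    public long extractTimestamp(String element, long previousElementTimestamp) {
        long timestamp = parseEventTime(element);
        currentMaxTimestamp = Math.max(timestamp, currentMaxTimestamp);
        lastElementWallClock = System.currentTimeMillis();
        return timestamp;
    }

    private long parseEventTime(String element) {
        // placeholder for extracting "extract_data_time" from the JSON element
        return Long.parseLong(element.trim());
    }
}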
--
From: "?? ??"
Date: 2019-03-07 (Thursday) 8:35
To: "dev"
Subject: Fwd: DataStream EventTime last data cannot be output?





> 
> 
> From: ?? ??
> Subject: DataStream EventTime last data cannot be output?
> Date: 2019-03-06 GMT+8 10:51:14
> To: u...@flink.apache.org
> 
> DataStream EventTime last data cannot be output?
> 
> 
> In verifying EventTime processing with watermarks, I found that the last
> data sent to the socket cannot be output in time, or cannot be output at all.
> The verification found that the end of the last window is only triggered when
> the timestamp of the currently sent data makes getCurrentWatermark() > window
> end timestamp + maxOutOfOrderness.
> But the latest record cannot be processed in time, or cannot be processed at
> all. How can I deal with this problem?
> 
> 
> 
> The following is the Flink program (Flink 1.7.2):
> ---
> 
> 
> 
> package 
> com.opensourceteams.module.bigdata.flink.example.stream.worldcount.nc.eventtime
> 
> import java.util.{Date, Properties}
> 
> import com.alibaba.fastjson.JSON
> import com.opensourceteams.module.bigdata.flink.common.ConfigurationUtil
> import org.apache.flink.api.common.serialization.SimpleStringSchema
> import org.apache.flink.configuration.Configuration
> import org.apache.flink.streaming.api.TimeCharacteristic
> import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks
> import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
> import org.apache.flink.streaming.api.scala.function.ProcessAllWindowFunction
> import org.apache.flink.streaming.api.watermark.Watermark
> import org.apache.flink.streaming.api.windowing.time.Time
> import org.apache.flink.streaming.api.windowing.windows.TimeWindow
> import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer
> import org.apache.flink.util.Collector
> 
> 
> object SockWordCountRun {
> 
> 
> 
>   def main(args: Array[String]): Unit = {
> 
> 
> // get the execution environment
>// val env: StreamExecutionEnvironment = 
> StreamExecutionEnvironment.getExecutionEnvironment
> 
> 
> val configuration : Configuration = 
> ConfigurationUtil.getConfiguration(true)
> 
> val env:StreamExecutionEnvironment = 
> StreamExecutionEnvironment.createLocalEnvironment(1,configuration)
> 
> 
> env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
> 
> 
> 
> import org.apache.flink.streaming.api.scala._
> val dataStream = env.socketTextStream("localhost", 1234, '\n')
> 
>  // .setParallelism(3)
> 
> 
> dataStream.assignTimestampsAndWatermarks(new 
> AssignerWithPeriodicWatermarks[String] {
> 
> val maxOutOfOrderness =  2 * 1000L // 2 seconds
> var currentMaxTimestamp: Long = _
> var currentTimestamp: Long = _
> 
> override def getCurrentWatermark: Watermark =  new 
> Watermark(currentMaxTimestamp - maxOutOfOrderness)
> 
> override def extractTimestamp(element: String, 
> previousElementTimestamp: Long): Long = {
>   val jsonObject = JSON.parseObject(element)
> 
>   val timestamp = jsonObject.getLongValue("extract_data_time")
>   currentMaxTimestamp = Math.max(timestamp, currentMaxTimestamp)
>   currentTimestamp = timestamp
> 
> /*  println("===watermark begin===")
>   println()
>   println(new Date(currentMaxTimestamp - 20 * 1000))
>   println(jsonObject)
>   println("===watermark end===")
>   println()*/
>   timestamp
> }
> 
>   })
>   .timeWindowAll(Time.seconds(3))
> 
>   .process(new ProcessAllWindowFunction[String,String,TimeWindow]() {
>   override def process(context: Context, elements: Iterable[String], out: 
> Collector[String]): Unit = {
> 
> 
> println()
> println("window")
> println(new Date())
> for(e <- elements) out.collect(e)
> println("window")
> println(new Date())
> println()
>   }
> })
> 
>   .print()
>   //.setParallelism(3)
> 
> 
> 
> 
> 
> 
> println("== execution plan ==")
> println("plan visualizer (firefox): https://flink.apache.org/visualizer")
> //
> println(env.getStreamGraph.getStreamingPlanAsJSON)
> println("== execution plan JSON ==\n")
> 
> 
> env.execute("Socket ")
> 
> 
> 
> 
> 
> 
> println("")
> 
>   }
> 
> 
>   // Data type for words with count
>   case class WordWithCount(word: String, count: Long){
> //override def toString: String = Thread.currentThread().getName + word + 
> " : " + count
>   

Re: [DISCUSS] Improve the flinkbot

2019-03-07 Thread Robert Metzger
I will automatically assign the Jira component as a label to the PR, yes.
You won't have to manually update the label on the PR, this will be done
automatically.

So JIRA will stay the ground truth for setting the component correctly.

On Thu, Mar 7, 2019 at 10:11 AM Chesnay Schepler  wrote:

> Component labels seem a bit redundant. Every JIRA with an open PR
> already has a "pull-request-available" tag.
> So this information already exists.
>
> I assume you'll base the labels on the component tags at the time the PR
> is opened, but this also implies that they may be set incorrectly (or
> not at all) by the contributor. In this case we now have to update the
> component both in JIRA and on GitHub, and I'm most certainly not looking
> forward to that.
>
> On 06.03.2019 13:51, Robert Metzger wrote:
> > This is the picture:
> >
> https://user-images.githubusercontent.com/89049/53882383-7fda9380-4016-11e9-877d-10cdc00bdfbd.png
> >
> > Speaking about feature requests, priorities and time-spend: My plan was
> to
> > now work on introducing a new label category for the components.
> > This should get us a lot better overview over the per-component
> > status/health of pull requests.
> >
> >
> > On Wed, Mar 6, 2019 at 12:58 PM Chesnay Schepler 
> wrote:
> >
> >> The image didn't go through.
> >>
> >> I would keep it as is; imo there are significantly more important things
> >> that I'd like Robert to spend time on. (literally everything in the
> >> Feature requests section)
> >>
> >> If we want to better distinguish new PRs I would suggest to either a)
> >> introduce a dedicated "New" label or b) not attach any label by default,
> >> and only attach the description label if someone has
> >> approved/disapproved it.
> >>
> >> On 06.03.2019 12:37, Robert Metzger wrote:
> >>> Hey Kurt,
> >>> thanks a lot for this idea.
> >>>
> >>> My reasoning behind using just one color is the following: I wanted to
> >>> use one color per category of labels.
> >>> So when we are introducing labels for components, that it'll look like
> >>> this:
> >>>
> >>> image.png
> >>>
> >>> But we could of course also go with color families per category. So
> >>> "review" is green colors, "component" is red colors and so on.
> >>>
> >>> If nobody objects (or agrees) with me, I'll change the colors soon.
> >>>
> >>>
> >>> On Wed, Mar 6, 2019 at 7:51 AM Kurt Young  >>> > wrote:
> >>>
> >>>  Hi Dev,
> >>>
> >>>  I've been using the flinkbot and the label for a couple days, it
> >>>  worked
> >>>  really well! I have a minor suggestion, can we
> >>>  use different colors for different labels? We don't need to have
> >>>  different
> >>>  colors for every label, but only to distinguish whether
> >>>  someone has reviewed the PR.
> >>>  For example, "review=description?" is the initial default label,
> >>>  and it may
> >>>  indicate that no reviewer has tried to review it yet.
> >>>
> >>>  For "review=architecture?", "review=consensus?",
> >>>  "review=quality?", they
> >>>  indicate that at least someone has tried to review it and
> >>>  approved something. It sounds like the review is in progress.
> >>>
> >>>  For "review=approved ✅", it indicates the review is finished.
> >>>
> >>>  So I think 3 colors are enough; they tell committers whether the
> >>>  review has
> >>>  not started yet, is in progress, or is finished.
> >>>
> >>>  What do you think?
> >>>
> >>>  Best,
> >>>  Kurt
> >>>
> >>>
> >>>  On Mon, Mar 4, 2019 at 6:50 PM Robert Metzger <
> rmetz...@apache.org
> >>>  > wrote:
> >>>
> >>>  > GitHub has two methods for authentication with the APIs:
> >>>  > a) using an account's oauth token
> >>>  > b) using the GitHub Apps API
> >>>  >
> >>>  > Most of the libraries for the GH API use a), so does Flinkbot.
> >>>  The problem
> >>>  > with a) is that it does not allow for fine-grained access
> >>>  control, and
> >>>  > Infra does not want to give Flinkbot "write" access to
> >>>  "apache/flink".
> >>>  > That's why I need to rewrite parts of the bot to support b),
> >>>  which allows
> >>>  > to give access only a repo's metadata, but not the code itself.
> >>>  >
> >>>  >
> >>>  >
> >>>  >
> >>>  > On Sat, Mar 2, 2019 at 12:42 AM Thomas Weise  >>>  > wrote:
> >>>  >
> >>>  > > It would be good to encourage participation of non-committers
> >>>  in the
> >>>  > review
> >>>  > > process, so +1 for allowing everyone to operate the bot.
> >>>  > >
> >>>  > > Github approval will show a green checkmark for committer
> >> approval
> >>>  > > (assuming accounts were linked via gitbox) - that should
> provide
> >>>  > sufficient
> >>>  > > orientation?
> >>>  > >
> >>>  > > I just noticed that flinkbot seems to act as Robert when it
> >>>  comes to
> >>>  

[jira] [Created] (FLINK-11853) Allow /jars/:jarid/plan as a POST request too

2019-03-07 Thread Stephen Connolly (JIRA)
Stephen Connolly created FLINK-11853:


 Summary: Allow  /jars/:jarid/plan as a POST request too
 Key: FLINK-11853
 URL: https://issues.apache.org/jira/browse/FLINK-11853
 Project: Flink
  Issue Type: Bug
  Components: Runtime / REST
Reporter: Stephen Connolly


Lots of HTTP clients (including the Java native client as well as OkHttp)
enforce that a GET request cannot include a request body, or silently change
the request to a POST request if an attempt is made to send a body.

While one answer would be to just request that people use a different HTTP 
client library, in some cases the potential for classloader conflicts may make 
this difficult. The easier solution would be to just provide a second end-point 
that responds to POST requests as well.
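
For illustration, a hedged sketch of the client-side constraint with OkHttp
3.x — the jar id and payload below are placeholders:

import okhttp3.MediaType;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.RequestBody;

public class JarPlanRequest {
    public static void main(String[] args) throws Exception {
        RequestBody body = RequestBody.create(
                MediaType.parse("application/json"),
                "{\"entryClass\": \"org.example.MyJob\"}"); // illustrative payload

        // OkHttp rejects a GET carrying a body when the request is built:
        // new Request.Builder().method("GET", body) throws
        // IllegalArgumentException: "method GET must not have a request body".

        // With a POST endpoint, the same request is straightforward:
        Request request = new Request.Builder()
                .url("http://localhost:8081/jars/myjob.jar/plan") // placeholder jar id
                .post(body)
                .build();
        new OkHttpClient().newCall(request).execute().close();
    }
}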



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Error on importing flink source code into Intellij

2019-03-07 Thread Felipe Gutierrez
Thank you for your suggestions.
I did everything that you said, but it did not work. In the end, I deleted
the /home/myuser/.m2 directory and generated the sources again. I also
closed my Eclipse IDE because the machine was running out of memory (I am
not sure whether that second action mattered). When I restarted IntelliJ, I
could build the project.
Thanks, Felipe
--
-- Felipe Gutierrez
-- skype: felipe.o.gutierrez
-- https://felipeogutierrez.blogspot.com
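
For reference, a lighter-weight variant of the cleanup described above —
removing only the Flink artifacts instead of the whole ~/.m2 (paths assume a
default local repository; adjust as needed):

rm -rf ~/.m2/repository/org/apache/flink
mvn clean package -DskipTests
# then in IntelliJ: File -> Invalidate Caches / Restart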


On Wed, Mar 6, 2019 at 1:43 PM Kurt Young  wrote:

> And sometimes just reimporting the Maven project will work.
>
> Right click pom.xml located in Flink's root dir -> Maven -> Reimport
>
> Best,
> Kurt
>
>
> On Wed, Mar 6, 2019 at 8:02 PM Chesnay Schepler 
> wrote:
>
>> Usually when I run into this i use "File -> Invalidate Caches /
>> Restart... -> Invalidate and restart"
>>
>> On 05.03.2019 16:15, Felipe Gutierrez wrote:
>> > Hello,
>> >
>> > I imported the Flink source code into my IntelliJ IDE following the steps
>> > described here
>> >
>> https://ci.apache.org/projects/flink/flink-docs-stable/flinkDev/ide_setup.html#importing-flink
>> > and afterwards I included the CheckStyle described there. When I use
>> > "mvn clean package -DskipTests" everything compiles smoothly. For some
>> > reason I am getting these errors on IntelliJ:
>> >
>> >
>> /home/felipe/idea-workspace/flink/flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/test/util/MiniClusterWithClientResource.java
>> > Error:(23, 42) java: package org.apache.flink.runtime.testutils does not
>> > exist
>> > Error:(24, 42) java: package org.apache.flink.runtime.testutils does not
>> > exist
>> > Error:(33, 34) java: modifier private not allowed here
>> > Error:(35, 33) java: modifier private not allowed here
>> >
>> /home/felipe/idea-workspace/flink/flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/test/util/MiniClusterResource.java
>> > Error:(29, 8) java: cyclic inheritance involving
>> > org.apache.flink.test.util.MiniClusterResource
>> > Error:(28, 1) java: annotation type not applicable to this kind of
>> > declaration
>> >
>> /home/felipe/idea-workspace/flink/flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/test/util/MiniClusterResourceConfiguration.java
>> > Error:(33, 89) java: package org.apache.flink.runtime.testutils does not
>> > exist
>> >
>> /home/felipe/idea-workspace/flink/flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/test/util/AbstractTestBase.java
>> > Error:(21, 42) java: package org.apache.flink.runtime.testutils does not
>> > exist
>> > Warning:(63, 21) java:
>> > org.apache.flink.test.util.MiniClusterResourceConfiguration in
>> > org.apache.flink.test.util has been deprecated
>> >
>> /home/felipe/idea-workspace/flink/flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/test/util/TestProcessBuilder.java
>> > Error:(23, 42) java: package org.apache.flink.runtime.testutils does not
>> > exist
>> > Error:(24, 58) java: package
>> > org.apache.flink.runtime.testutils.CommonTestUtils does not exist
>> > Error:(32, 49) java: package org.apache.flink.runtime.testutils does not
>> > exist
>> > Error:(32, 1) java: static import only from classes and interfaces
>> > Error:(33, 49) java: package org.apache.flink.runtime.testutils does not
>> > exist
>> > Error:(33, 1) java: static import only from classes and interfaces
>> > Error:(40, 57) java: cannot find symbol
>> >symbol:   method getJavaCommandPath()
>> >location: class org.apache.flink.test.util.TestProcessBuilder
>> > Error:(52, 17) java: cannot find symbol
>> >symbol:   variable CommonTestUtils
>> >location: class org.apache.flink.test.util.TestProcessBuilder
>> > Error:(57, 29) java: cannot find symbol
>> >symbol:   method getCurrentClasspath()
>> >location: class org.apache.flink.test.util.TestProcessBuilder
>> > Error:(74, 21) java: cannot find symbol
>> >symbol:   class PipeForwarder
>> >location: class org.apache.flink.test.util.TestProcessBuilder
>> >
>> > How can I fix it on Intellij?
>> >
>> > Thanks, Felipe
>> >
>> > --
>> > Felipe Gutierrez
>> >
>> > skype: felipe.o.gutierrez
>> > https://felipeogutierrez.blogspot.com
>> >
>>
>>


[jira] [Created] (FLINK-11852) Improve Processing function example

2019-03-07 Thread Flavio Pompermaier (JIRA)
Flavio Pompermaier created FLINK-11852:
--

 Summary: Improve Processing function example
 Key: FLINK-11852
 URL: https://issues.apache.org/jira/browse/FLINK-11852
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.7.2
Reporter: Flavio Pompermaier


In the processing function documentation 
([https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/process_function.html]) 
there's an "abusive" usage of timers, since a new timer is registered for 
every incoming tuple. This could cause problems in terms of allocated 
objects and could burden the overall application.

It could be worth mentioning this problem and showing how to remove stale 
timers, e.g.:

 
{code:java}
CountWithTimestamp current = state.value();
if (current == null) {
    // First element for this key: initialize the state.
    current = new CountWithTimestamp();
    current.key = value.f0;
} else {
    // A timer was registered for the previous element; delete it so that
    // at most one timer per key is pending at any time.
    ctx.timerService().deleteEventTimeTimer(current.lastModified + timeout);
}{code}
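
For context, in the documentation's example the branch above would be followed 
by updating the state and re-registering a single timer, roughly like this 
(a sketch based on the documented {{processElement}}; {{timeout}} is assumed 
to be defined as in the snippet above):

{code:java}
// Update the state for the incoming element.
current.count++;
current.lastModified = ctx.timestamp();
state.update(current);
// Re-register exactly one pending timer per key.
ctx.timerService().registerEventTimeTimer(current.lastModified + timeout);
{code}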
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: JobManager scale limitation - Slow S3 checkpoint deletes

2019-03-07 Thread Till Rohrmann
I think part of the problem is that we currently use the executor of the
common RpcService to run the I/O operations, as Stephan suspected [1]. I
will be fixing this problem for 1.8.0 and 1.7.3.

This should resolve the problem, but supporting different means of cleanup
might still be interesting to add.

[1] https://issues.apache.org/jira/browse/FLINK-11851

Cheers,
Till

On Thu, Mar 7, 2019 at 8:56 AM Yun Tang  wrote:

> Sharing the communication pressure of a single node across multiple task
> managers would be a good idea. From my point of view, letting task managers
> know that some specific checkpoint has already been aborted could benefit a
> lot of things:
>
>   *   Letting the task manager clean up the files, which is the topic of
> this thread.
>   *   Letting a `StreamTask` cancel an aborted running checkpoint on the
> task side, just as https://issues.apache.org/jira/browse/FLINK-8871 wants
> to achieve.
>   *   Letting the local state store prune local checkpoints as soon as
> possible, without waiting for the next `notifyCheckpointComplete` to
> arrive.
>   *   Letting the state backend on the task manager side do something on
> its own, which would be really helpful for state backends that disaggregate
> computation and storage.
>
> Best
> Yun Tang
> 
> From: Thomas Weise 
> Sent: Thursday, March 7, 2019 12:06
> To: dev@flink.apache.org
> Subject: Re: JobManager scale limitation - Slow S3 checkpoint deletes
>
> Nice!
>
> Perhaps for file systems without TTL/expiration support (AFAIK this includes
> HDFS), cleanup could be performed in the task managers?
>
>
> On Wed, Mar 6, 2019 at 6:01 PM Jamie Grier 
> wrote:
>
> > Yup, it looks like the actor threads are spending all of their time
> > communicating with S3.  I've attached a picture of a typical stack trace
> > for one of the actor threads [1].  At the end of that call stack what
> > you'll see is the thread blocking on synchronous communication with the
> S3
> > service.  This is for one of the flink-akka.actor.default-dispatcher
> > threads.
> >
> > I've also attached a link to a YourKit snapshot if you'd like to explore
> > the profiling data in more detail [2]
> >
> > [1]
> >
> >
> https://drive.google.com/open?id=0BzMcf4IvnGWNNjEzMmRJbkFiWkZpZDFtQWo4LXByUXpPSFpR
> > [2] https://drive.google.com/open?id=1iHCKJT-PTQUcDzFuIiacJ1MgAkfxza3W
> >
> >
> >
> > On Wed, Mar 6, 2019 at 7:41 AM Stephan Ewen  wrote:
> >
> > > I think having an option to not actively delete checkpoints (but rather
> > > have the TTL feature of the file system take care of it) sounds like a
> > > good idea.
> > >
> > > I am curious why you get heartbeat misses and akka timeouts during
> > > deletes. Are some parts of the deletes happening synchronously in the
> > > actor thread?
> > >
> > > On Wed, Mar 6, 2019 at 3:40 PM Jamie Grier 
> > > wrote:
> > >
> > > > We've run into an issue that limits the max parallelism of jobs we
> > > > can run, and what it seems to boil down to is that the JobManager
> > > > becomes unresponsive while essentially spending all of its time
> > > > discarding checkpoints from S3. This results in a sluggish UI,
> > > > sporadic AkkaAskTimeouts, heartbeat misses, etc.
> > > >
> > > > Since S3 (and I assume HDFS) has policies that can be used to discard
> > > > old objects without Flink actively deleting them, I think it would be
> > > > a useful feature to add an option for Flink to never discard
> > > > checkpoints itself. I believe this will solve the problem.
> > > >
> > > > Any objections or other known solutions to this problem?
> > > >
> > > > -Jamie
> > > >
> > >
> >
>
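
To make the file-system-TTL idea above concrete, here is a minimal sketch of a
bucket lifecycle rule using the AWS SDK for Java v1 (bucket name and checkpoint
prefix are made up, and the exact SDK calls should be double-checked against
the SDK docs):

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.BucketLifecycleConfiguration;
import com.amazonaws.services.s3.model.lifecycle.LifecycleFilter;
import com.amazonaws.services.s3.model.lifecycle.LifecyclePrefixPredicate;
import java.util.Collections;

public class CheckpointExpiryRule {
    public static void main(String[] args) {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        // Expire objects under the (made-up) checkpoint prefix after 7 days,
        // so the bucket discards old checkpoints instead of Flink deleting
        // them one by one.
        BucketLifecycleConfiguration.Rule rule =
                new BucketLifecycleConfiguration.Rule()
                        .withId("expire-old-flink-checkpoints")
                        .withFilter(new LifecycleFilter(
                                new LifecyclePrefixPredicate("flink/checkpoints/")))
                        .withExpirationInDays(7)
                        .withStatus(BucketLifecycleConfiguration.ENABLED);
        s3.setBucketLifecycleConfiguration("my-checkpoint-bucket",
                new BucketLifecycleConfiguration()
                        .withRules(Collections.singletonList(rule)));
    }
}

Note that a blanket rule like this would also expire retained checkpoints one
still wants to restore from, so the prefix would have to be chosen carefully.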


[jira] [Created] (FLINK-11851) ClusterEntrypoint provides wrong executor to HaServices

2019-03-07 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-11851:
-

 Summary: ClusterEntrypoint provides wrong executor to HaServices
 Key: FLINK-11851
 URL: https://issues.apache.org/jira/browse/FLINK-11851
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.7.2, 1.8.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 1.7.3, 1.8.0


The {{ClusterEntrypoint}} provides the executor of the common {{RpcService}} to 
the {{HighAvailabilityServices}}, which uses the executor to run I/O operations. 
In I/O-heavy cases, this can block all {{RpcService}} threads and make the 
{{RpcEndpoints}} running in the respective {{RpcService}} unresponsive.

I suggest introducing a dedicated I/O executor which is used for I/O-heavy 
operations.
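
For illustration only (this is not the actual fix; the names and wiring below 
are hypothetical), the idea is roughly:

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class IoExecutorSketch {
    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger();
        // Dedicated pool for blocking I/O (e.g. HA metadata reads/writes),
        // sized independently of the RpcService's threads so that slow
        // file system calls cannot starve the RpcEndpoints.
        ExecutorService ioExecutor = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors(),
                r -> new Thread(r, "cluster-io-" + counter.incrementAndGet()));
        // Hypothetical wiring: pass ioExecutor instead of
        // rpcService.getExecutor() when creating the HA services, e.g.
        // haServices = createHaServices(configuration, ioExecutor);
        ioExecutor.submit(() -> System.out.println("blocking I/O runs here"));
        ioExecutor.shutdown();
    }
}
{code}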



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11850) ZooKeeperHaServicesTest#testSimpleCloseAndCleanupAllData fails on Travis

2019-03-07 Thread Congxian Qiu(klion26) (JIRA)
Congxian Qiu(klion26) created FLINK-11850:
-

 Summary: ZooKeeperHaServicesTest#testSimpleCloseAndCleanupAllData 
fails on Travis
 Key: FLINK-11850
 URL: https://issues.apache.org/jira/browse/FLINK-11850
 Project: Flink
  Issue Type: Test
  Components: Tests
Reporter: Congxian Qiu(klion26)


org.apache.flink.runtime.highavailability.zookeeper.ZooKeeperHaServicesTest
08:20:01.694 [ERROR] 
testSimpleCloseAndCleanupAllData(org.apache.flink.runtime.highavailability.zookeeper.ZooKeeperHaServicesTest)
 Time elapsed: 0.076 s <<< ERROR!
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode 
for 
/foo/bar/flink/default/leaderlatch/resource_manager_lock/_c_477d0124-92f3-4069-98aa-a71b8243250c-latch-00
 at 
org.apache.flink.runtime.highavailability.zookeeper.ZooKeeperHaServicesTest.runCleanupTest(ZooKeeperHaServicesTest.java:203)
 at 
org.apache.flink.runtime.highavailability.zookeeper.ZooKeeperHaServicesTest.testSimpleCloseAndCleanupAllData(ZooKeeperHaServicesTest.java:128)
 
Travis link: https://travis-ci.org/apache/flink/jobs/502960186



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] FLIP-33: Terminate/Suspend Job with Savepoint

2019-03-07 Thread Kostas Kloudas
Hi,

Thanks for the comments.
I agree with Ufuk's and Elias' proposal.

- "cancel" remains the good old "cancel"
- "terminate" becomes "stop --drain-with-savepoint"
- "suspend" becomes "stop --with-savepoint"
- "cancel-with-savepoint" is subsumed by "stop --with-savepoint"

As you can see from the above, I would also have "terminate" and "suspend"
keep a savepoint by default.

As for Ufuk's remarks:

1) You are correct that properly preventing elements from being fed into the
pipeline after the checkpoint barrier requires support from the sources. This
is more the responsibility of FLIP-27:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface

2) I would lean more towards replacing the old "stop" command with the new
one. But, as you said, I have no view of how many users (if any) rely on the
old "stop" command for their use cases.

Cheers,
Kostas



On Wed, Mar 6, 2019 at 9:52 PM Ufuk Celebi  wrote:

> I really like this effort. I think the original plan for
> "cancel-with-savepoint" was always to just be a workaround until we
> arrived at a better solution as proposed here.
>
> Regarding the FLIP, I agree with Elias' comments. I think the number of
> termination modes the FLIP introduces can be overwhelming and I would
> personally rather follow Elias' proposal. In context of the proposal,
> this would result in the following:
> - "terminate" becomes "stop --drain"
> - "suspend" becomes "stop --with-savepoint"
> - "cancel-with-savepoint" is superseded by "stop --with-savepoint"
>
> I have two remaining questions:
>
> 1) @Kostas: Elias suggests for stop that "a job should process no
> messages after the checkpoint barrier". This is something that needs
> support from the sources. Is this in the scope of your proposal (I
> think not)? If not, is there a future plan for this?
>
> 2) Would we need to introduce a new command/name for "stop" as we
> already have a "stop" command? Assuming that there are no users that
> actually use the existing "stop" command as no major sources are
> stoppable (I think), I would personally suggest to upgrade the
> existing "stop" command to the proposed one. If on the other hand, if
> we know of users that rely on the current "stop" command, we'd need to
> find another name for it.
>
> Best,
>
> Ufuk
>
> On Wed, Mar 6, 2019 at 12:27 AM Elias Levy 
> wrote:
> >
> > Apologies for the late reply.
> >
> > I think this is badly needed, but I fear we are adding complexity by
> > introducing yet two more stop commands.  We'll have: cancel, stop,
> > terminate, and suspend.  We basically want to do two things: terminate a
> > job with prejudice or stop a job safely.
> >
> > For the former, "cancel" is the appropriate term, and should have no need
> > for a cancel with checkpoint option.  If the job was configured to use
> > externalized checkpoints and it ran long enough, a checkpoint will be
> > available for it.
> >
> > For the latter, "stop" is the appropriate term, and it means that a job
> > should process no messages after the checkpoint barrier and that it
> should
> > ensure that exactly-once sinks complete their two-phase commits
> > successfully.  If a savepoint was requested, one should be created.
> >
> > So in my mind there are two commands, cancel and stop, with appropriate
> > semantics.  Emitting MAX_WATERMARK before the checkpoint barrier during
> > stop is merely an optional behavior, like creation of a savepoint.  But
> if
> > a specific command for it is desired, then "drain" seems appropriate.
> >
> > On Tue, Feb 12, 2019 at 9:50 AM Stephan Ewen  wrote:
> >
> > > Hi Elias!
> > >
> > > I remember you brought this missing feature up in the past. Do you
> think
> > > the proposed enhancement would work for your use case?
> > >
> > > Best,
> > > Stephan
> > >
> > > -- Forwarded message -
> > > From: Kostas Kloudas 
> > > Date: Tue, Feb 12, 2019 at 5:28 PM
> > > Subject: [DISCUSS] FLIP-33: Terminate/Suspend Job with Savepoint
> > > To: 
> > >
> > >
> > > Hi everyone,
> > >
> > >  A commonly used functionality offered by Flink is the
> > > "cancel-with-savepoint" operation. When applied to the current
> > > exactly-once sinks, the current implementation of the feature can be
> > > problematic, as it does not guarantee that side-effects will be
> > > committed by Flink to the 3rd party storage system.
> > >
> > >  This discussion targets fixing this issue and proposes the addition
> > > of two termination modes, namely:
> > > 1) SUSPEND, for temporarily stopping the job, e.g. for a Flink
> > > version upgrade in your cluster
> > > 2) TERMINATE, for terminal shut down, which ends the stream, sends
> > > MAX_WATERMARK, and flushes any state associated with (event time)
> > > timers
> > >
> > > A google doc with the FLIP proposal can be found here:
> > >
> > >
> https://docs.google.com/document/d/1EZf6pJMvqh_HeBCaUOnhLUr9JmkhfPgn6Mre_z6tgp8/edit?usp=sharing
> > >
> > > And 

Re: [DISCUSS] Create a Flink ecosystem website

2019-03-07 Thread Becket Qin
Hi Robert,

I think it is at least worth checking whether the spark-packages.org owners
are willing to share. Thanks for volunteering to write the requirement
descriptions! In any case, that will be very helpful.

Since a static page has almost no cost, and we will need it to redirect to
the dynamic site anyway, how about we first do that while working on the
dynamic website?

Thanks,

Jiangjie (Becket) Qin

On Thu, Mar 7, 2019 at 4:59 AM Ufuk Celebi  wrote:

> I like Shaoxuan's idea to keep this a static site first. We could then
> iterate on this and make it a dynamic thing. Of course, if we have the
> resources in the community to quickly start with a dynamic site, I'm
> not opposed.
>
> – Ufuk
>
> On Wed, Mar 6, 2019 at 2:31 PM Robert Metzger  wrote:
> >
> > Awesome! Thanks a lot for looking into this, Becket! The VMs hosted by
> > Infra look suitable.
> >
> > @Shaoxuan: There is actually already a static page. It used to be linked,
> > but has been removed from the navigation bar for some reason. This is the
> > page: https://flink.apache.org/ecosystem.html
> > We could update the page and add it back to the navigation bar for the
> > coming weeks. What do you think?
> >
> > I would actually like to push for a dynamic page right away.
> >
> > I know it's kind of a bold move, but how do you feel about sending the
> > owners of spark-packages.org a short note asking whether they are
> > interested in sharing the source? We could maintain the code together in
> > a public repo. If they are not interested in sharing, or we decide not
> > to ask in the first place, I'm happy to write down a short description
> > of the requirements,
> > maybe some mockups. We could then see if we find somebody here in the
> > community who's willing to implement it.
> > Given the number of people who are eager to contribute, I believe we will
> > be able to find somebody pretty soon.
> >
> >
> > On Wed, Mar 6, 2019 at 3:49 AM Becket Qin  wrote:
> >
> > > Forgot to provide the link...
> > >
> > > [1] https://www.apache.org/dev/services.html#blogs (Apache infra
> services)
> > > [2] https://www.apache.org/dev/freebsd-jails (FreeBSD Jail provided by
> > > Apache Infra)
> > >
> > > On Wed, Mar 6, 2019 at 10:46 AM Becket Qin 
> wrote:
> > >
> > >> Hi Robert,
> > >>
> > >> Thanks for the feedback. These are good points. We should absolutely
> > >> shoot for a dynamic website to support more interactions in the
> > >> community. There might be a few things to solve:
> > >> 1. The website code itself. An open source solution would be great.
> > >> TBH, I do not have much experience in building a website. It'll be
> > >> great if someone could help comment on the solution here.
> > >> 2. The hardware to host the website. Apache Infra provides a few
> > >> services[1] that Apache projects can leverage. I did not see a
> > >> database service, but maybe we can run a simple MySQL db in a
> > >> FreeBSD jail[2].
> > >>
> > >> @Bowen & vino, thanks for the positive feedback!
> > >>
> > >> @Shaoxuan Wang 
> > >> Thanks for the suggestion. That sounds reasonable to me. We probably
> > >> need a page on the official Flink site anyway, even if it just links
> > >> to the ecosystem website. So listing the connectors on that static
> > >> page seems like something we could start with while we are working on
> > >> the dynamic pages.
> > >>
> > >> Thanks,
> > >>
> > >> Jiangjie (Becket) Qin
> > >>
> > >> On Wed, Mar 6, 2019 at 10:40 AM Shaoxuan Wang 
> > >> wrote:
> > >>
> > >>> Hi Becket and Robert,
> > >>>
> > >>> I like this idea!  Let us roll this out with Flink connectors at the
> > >>> very beginning. We can start with a static page and upgrade it when
> > >>> we find a better solution for a dynamic one with richer functionality.
> > >>>
> > >>> Regards,
> > >>> Shaoxuan
> > >>>
> > >>>
> > >>> On Wed, Mar 6, 2019 at 1:36 AM Robert Metzger 
> > >>> wrote:
> > >>>
> >  Hey Becket,
> > 
> >  This is a great idea!
> >  For this to be successful, we need to make sure the page is placed
> >  prominently so that the people submitting something will get attention
> >  for their contributions.
> >  I think a dynamic site would probably be better, if we want features
> >  such as up and downvoting or comments.
> >  I would also like this to be hosted on Apache infra, and endorsed by
> >  the community.
> > 
> >  Does anybody here know any existing software that we could use?
> >  The only thing I was able to find is AUR:
> >  https://aur.archlinux.org/
> >  (a community package site for Arch Linux; the source code of this
> >  portal is open source, but the layout and structure are not an ideal
> >  fit for our requirements)
> > 
> >  Best,
> >  Robert
> > 
> > 
> > 
> >  On Tue, Mar 5, 2019 at 12:03 PM Becket Qin 
> >  wrote:
> > 
> > > Hi folks,
> > >
> > > I would like to start a discussion thread about creating a Flink
> > > ecosystem website. 

[jira] [Created] (FLINK-11849) The flink document about checkpoint and savepoint may be wrong

2019-03-07 Thread xulongfetion (JIRA)
xulongfetion created FLINK-11849:


 Summary: The flink document about checkpoint and savepoint may be 
wrong
 Key: FLINK-11849
 URL: https://issues.apache.org/jira/browse/FLINK-11849
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.6.3
Reporter: xulongfetion


The documentation at "ops/state/checkpoints.html#difference-to-savepoints" 
says that checkpoints "do not support Flink specific features like rescaling". 
But when I try to change the parallelism of a stateful operator and restore 
from an externalized checkpoint, it recovers without any exception. So is this 
sentence wrong, or do I misunderstand it?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] Improve the flinkbot

2019-03-07 Thread Chesnay Schepler
Component labels seem a bit redundant. Every JIRA with an open PR 
already has a "pull-request-available" tag.

So this information already exists.

I assume you'll base the labels on the component tags at the time the PR 
is opened, but this also implies that they may be set incorrectly (or 
not at all) by the contributor. In this case we now have to update the 
component both in JIRA and on GitHub, and I'm most certainly not looking 
forward to that.


On 06.03.2019 13:51, Robert Metzger wrote:

This is the picture:
https://user-images.githubusercontent.com/89049/53882383-7fda9380-4016-11e9-877d-10cdc00bdfbd.png

Speaking about feature requests, priorities and time spent: My plan was to
now work on introducing a new label category for the components.
This should give us a much better overview of the per-component
status/health of pull requests.


On Wed, Mar 6, 2019 at 12:58 PM Chesnay Schepler  wrote:


The image didn't go through.

I would keep it as is; imo there are significantly more important things
that I'd like Robert to spend time on. (literally everything in the
Feature requests section)

If we want to better distinguish new PRs, I would suggest either a)
introducing a dedicated "New" label or b) not attaching any label by default,
and only attaching the description label once someone has
approved/disapproved it.

On 06.03.2019 12:37, Robert Metzger wrote:

Hey Kurt,
thanks a lot for this idea.

My reasoning behind using just one color is the following: I wanted to
use one color per category of labels.
So when we introduce labels for components, it'll look like this:

image.png

But we could of course also go with color families per category. So
"review" is green colors, "component" is red colors and so on.

If nobody objects (or agrees) with me, I'll change the colors soon.


 On Wed, Mar 6, 2019 at 7:51 AM Kurt Young <ykt...@gmail.com> wrote:

 Hi Dev,

 I've been using the flinkbot and the label for a couple of days; it
 worked really well! I have a minor suggestion: can we use different
 colors for different labels? We don't need to have different colors
 for every label, but only enough to distinguish whether someone has
 reviewed the PR.
 For example, "review=description?" is the initial default label, and it
 may indicate that no reviewer has tried to review the PR yet.

 For "review=architecture?", "review=consensus?",
 "review=quality?", they
 indicate that at least someone has try to review it and
 approved something. It sounds like the review is in progress.

 For "review=approved ✅", it indicates the review is finished.

 So I think 3 colors are enough; they tell committers whether the review
 has not started yet, is in progress, or is finished.

 What do you think?

 Best,
 Kurt


 > On Mon, Mar 4, 2019 at 6:50 PM Robert Metzger <rmetz...@apache.org> wrote:

 > GitHub has two methods for authentication with the APIs:
 > a) using an account's oauth token
 > b) using the GitHub Apps API
 >
 > Most of the libraries for the GH API use a), so does Flinkbot. The
 > problem with a) is that it does not allow for fine-grained access
 > control, and Infra does not want to give Flinkbot "write" access to
 > "apache/flink". That's why I need to rewrite parts of the bot to
 > support b), which allows giving access only to a repo's metadata, but
 > not the code itself.
 >
 >
 >
 >
 > On Sat, Mar 2, 2019 at 12:42 AM Thomas Weise <t...@apache.org> wrote:
 >
 > > It would be good to encourage participation of non-committers in the
 > > review process, so +1 for allowing everyone to operate the bot.
 > >
 > > Github approval will show a green checkmark for committer approval
 > > (assuming accounts were linked via gitbox) - that should provide
 > > sufficient orientation?
 > >
 > > I just noticed that flinkbot seems to act as Robert when it comes to
 > > label management? I think that is confusing (besides earning Robert a
 > > lot of extra github notification mail thanks to participation on
 > > every PR :)
 > >
 > > Overall flinkbot is very useful, thanks for all the work on it! I
 > > heard positive feedback from other contributors; I think they see
 > > their contributions are better received now.
 > >
 > > Thomas
 > >
 > >
 > >
 > > On Fri, Mar 1, 2019 at 8:38 AM Robert Metzger <rmetz...@apache.org>
 > > wrote:
 > >
 > > > I will update labels only based on committers' approvals (for
 > > > everything), I think that's cleaner.
 > > >
 > > > We can always revisit this.
 > > >
 > > > On Wed, Feb 27, 2019 at 4:31 PM Chesnay Schepler <ches...@apache.org>
 > > > wrote:
 > > >
 > > > > For code-quality/description I agree, but