Re: Weekly Community Update 2019/33, Personal Chinese Version

2019-08-19 Thread Zili Chen
Hi Paul & Jark,

Thanks for your feedback!

I also thought of putting the content in the email but hesitated over
where it should be sent (user-zh only, IMO), what kind of thread
it should be ([ANNOUNCE] or just a normal thread), and how
to format it to fit the email form.

It is reasonable to keep this content in sync between the user-zh list
and the web page, so that we follow the Apache way and also make it
convenient for (potential) Chinese users to reach.

I'm glad to put the content in the email but decided to collect some
feedback on the idea first :-) If there are no other suggestions, I am
going to start a separate normal thread, i.e., without the [ANNOUNCE]
header, on the user-zh list later today or tomorrow.

Best,
tison.


On Tue, 20 Aug 2019 at 11:28, Jark Wu  wrote:

> Hi Zili,
>
> +1 for the Chinese Weekly Community Update.
> I think this will categorical attract more Chinese users.
> Btw, could you also put the content of Chinese Weekly Updates in the
> email? I think this will be more align with the Apache Way.
> So that we can help to response users who have interesting/questions on
> some items.
>
> Thanks,
> Jark
>
>
> On Mon, 19 Aug 2019 at 13:27, Paul Lam  wrote:
>
>> Hi Tison,
>>
>> Big +1 for the Chinese Weekly Community Update. The content is
>> well-organized, and I believe it would be very helpful for Chinese users to
>> get an overview of what’s going on in the community.
>>
>> Best,
>> Paul Lam
>>
>> > 在 2019年8月19日,12:27,Zili Chen  写道:
>> >
>> > Hi community,
>> >
>> > Inspired by weekly community updates thread, regards the growth
>> > of Chinese community and kind of different concerns for community
>> > members I'd like to start a personally maintained Chinese version of
>> > Weekly Community Update.
>> >
>> > Right now I posted these weeks' updates on this page[1] where Chinese
>> > users as well as potential ones could easily reach.
>> >
>> > Any feedbacks are welcome and I am looking for a collaborator who is
>> > familiar with TableAPI/SQL topics to enrich the content.
>> >
>> > Best,
>> > tison.
>> >
>> > [1] https://zhuanlan.zhihu.com/p/78753149 <
>> https://zhuanlan.zhihu.com/p/78753149>
>>
>


Re: Configuring logback for my flink job

2019-08-19 Thread Yang Wang
Hi Vishwas,

If you mean to have your application log to its configured appenders on the
client side, I think you could use your own FLINK_CONF_DIR environment
variable. Otherwise, there is no way to update the log4j/logback
configuration. [1]


[1]
https://github.com/apache/flink/blob/master/flink-dist/src/main/flink-bin/bin/flink#L52
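
A minimal sketch of that approach (paths are illustrative; the directory
should also contain a copy of flink-conf.yaml so the client still picks up
its normal configuration):

export FLINK_CONF_DIR=/path/to/my-conf   # holds flink-conf.yaml plus your own logback.xml
/data/flink-1.7.2/bin/flink run my-job.jar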

On Mon, 19 Aug 2019 at 23:02, Vishwas Siravara  wrote:

> Hi,
> I have a logback for my flink application which is packaged with the
> application fat jar. However when I submit my job from flink command line
> tool, I see that logback is set to
> -Dlogback.configurationFile=file:/data/flink-1.7.2/conf/logback.xml from
> the client log.
>
> As a result my application log ends up going to the client log. How can I
> change this behavior . I know a dirty fix is to add my logback config to
> the logback in /data/flink-1.7.2/conf/logback.xml .
>
> Any other suggestions on how I can have my application log to its
> configured appenders.
>
> Thanks,
> Vishwas
>


Re: Weekly Community Update 2019/33, Personal Chinese Version

2019-08-19 Thread Jark Wu
Hi Zili,

+1 for the Chinese Weekly Community Update.
I think this will definitely attract more Chinese users.
Btw, could you also put the content of the Chinese Weekly Updates in the
email? I think this would be more aligned with the Apache Way,
so that we can help respond to users who have interest in or questions
about some items.

Thanks,
Jark


On Mon, 19 Aug 2019 at 13:27, Paul Lam  wrote:

> Hi Tison,
>
> Big +1 for the Chinese Weekly Community Update. The content is
> well-organized, and I believe it would be very helpful for Chinese users to
> get an overview of what’s going on in the community.
>
> Best,
> Paul Lam
>
> > 在 2019年8月19日,12:27,Zili Chen  写道:
> >
> > Hi community,
> >
> > Inspired by weekly community updates thread, regards the growth
> > of Chinese community and kind of different concerns for community
> > members I'd like to start a personally maintained Chinese version of
> > Weekly Community Update.
> >
> > Right now I posted these weeks' updates on this page[1] where Chinese
> > users as well as potential ones could easily reach.
> >
> > Any feedbacks are welcome and I am looking for a collaborator who is
> > familiar with TableAPI/SQL topics to enrich the content.
> >
> > Best,
> > tison.
> >
> > [1] https://zhuanlan.zhihu.com/p/78753149 <
> https://zhuanlan.zhihu.com/p/78753149>
>


Re: Recovery from job manager crash using check points

2019-08-19 Thread Zili Chen
Hi Min,

I guess you are running without a high-availability backend, so when a TM
fails, the JM can recover the job from its in-memory checkpoint store.

However, when the JM fails, since you don't persist metadata in an HA backend
such as ZooKeeper, even if the JM is relaunched by the YARN RM or superseded
by a standby, the new one knows nothing about the previous jobs.

In short, you need to set up ZooKeeper, as you mentioned yourself.
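
For reference, a minimal flink-conf.yaml sketch of a ZooKeeper-backed setup
(host names and paths below are illustrative):

high-availability: zookeeper
high-availability.zookeeper.quorum: zk1:2181,zk2:2181,zk3:2181
high-availability.storageDir: hdfs:///flink/ha/
state.backend: filesystem
state.checkpoints.dir: hdfs:///flink/checkpoints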

Best,
tison.


On Mon, 19 Aug 2019 at 23:49, Biao Liu  wrote:

> Hi Min,
>
> > Do I need to set up zookeepers to keep the states when a job manager
> crashes?
>
> I guess you need to set up the HA [1] properly. Besides that, I would
> suggest you should also check the state backend.
>
> 1.
> https://ci.apache.org/projects/flink/flink-docs-master/ops/jobmanager_high_availability.html
> 2.
> https://ci.apache.org/projects/flink/flink-docs-master/ops/state/state_backends.html
>
> Thanks,
> Biao /'bɪ.aʊ/
>
>
>
> On Mon, 19 Aug 2019 at 23:28,  wrote:
>
>> Hi,
>>
>>
>>
>> I can use check points to recover Flink states when a task manger crashes.
>>
>>
>>
>> I can not use check points to recover Flink states when a job manger
>> crashes.
>>
>>
>>
>> Do I need to set up zookeepers to keep the states when a job manager
>> crashes?
>>
>>
>>
>> Regards
>>
>>
>>
>> Min
>>
>>
>>
>


Re: [VOTE] Apache Flink 1.9.0, release candidate #3

2019-08-19 Thread Oytun Tez
Thanks, Gordon, will do tomorrow.

---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


On Mon, Aug 19, 2019 at 7:21 PM Tzu-Li (Gordon) Tai 
wrote:

> Hi!
>
> Voting on RC3 for Apache Flink 1.9.0 has started:
>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Apache-Flink-1-9-0-release-candidate-3-td31988.html
>
> Please check this out if you want to verify your applications against this
> new Flink release.
>
> Cheers,
> Gordon
>
> -- Forwarded message -
> From: Tzu-Li (Gordon) Tai 
> Date: Tue, Aug 20, 2019 at 1:16 AM
> Subject: [VOTE] Apache Flink 1.9.0, release candidate #3
> To: dev 
>
>
> Hi all,
>
> Release candidate #3 for Apache Flink 1.9.0 is now ready for your review.
>
> Please review and vote on release candidate #3 for version 1.9.0, as
> follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release and binary convenience releases to be
> deployed to dist.apache.org [2], which are signed with the key with
> fingerprint 1C1E2394D3194E1944613488F320986D35C33D6A [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag “release-1.9.0-rc3” [5].
> * pull requests for the release note documentation [6] and announcement
> blog post [7].
>
> As proposed in the RC2 vote thread [8], for RC3 we are only cherry-picking
> minimal specific changes on top of RC2 to be able to reasonably carry over
> previous testing efforts and effectively require a shorter voting time.
>
> The only extra commits in this RC, compared to RC2, are the following:
> - c2d9aeac [FLINK-13231] [pubsub] Replace Max outstanding acknowledgement
> ids limit with a FlinkConnectorRateLimiter
> - d8941711 [FLINK-13699][table-api] Fix TableFactory doesn’t work with DDL
> when containing TIMESTAMP/DATE/TIME types
> - 04e95278 [FLINK-13752] Only references necessary variables when
> bookkeeping result partitions on TM
>
> Due to the minimal set of changes, the vote for RC3 will be *open for
> only 48 hours*.
> Please cast your votes before *Aug. 21st (Wed.) 2019, 17:00 PM CET*.
>
> It is adopted by majority approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> Gordon
>
> [1]
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12344601
> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.9.0-rc3/
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4] https://repository.apache.org/content/repositories/orgapacheflink-1236
> [5]
> https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=refs/tags/release-1.9.0-rc3
> [6] https://github.com/apache/flink/pull/9438
> [7] https://github.com/apache/flink-web/pull/244
> [8]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Apache-Flink-Release-1-9-0-release-candidate-2-tp31542p31933.html
>


Fwd: [VOTE] Apache Flink 1.9.0, release candidate #3

2019-08-19 Thread Tzu-Li (Gordon) Tai
Hi!

Voting on RC3 for Apache Flink 1.9.0 has started:
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Apache-Flink-1-9-0-release-candidate-3-td31988.html

Please check this out if you want to verify your applications against this
new Flink release.

Cheers,
Gordon

-- Forwarded message -
From: Tzu-Li (Gordon) Tai 
Date: Tue, Aug 20, 2019 at 1:16 AM
Subject: [VOTE] Apache Flink 1.9.0, release candidate #3
To: dev 


Hi all,

Release candidate #3 for Apache Flink 1.9.0 is now ready for your review.

Please review and vote on release candidate #3 for version 1.9.0, as
follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release and binary convenience releases to be
deployed to dist.apache.org [2], which are signed with the key with
fingerprint 1C1E2394D3194E1944613488F320986D35C33D6A [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag “release-1.9.0-rc3” [5].
* pull requests for the release note documentation [6] and announcement
blog post [7].

As proposed in the RC2 vote thread [8], for RC3 we are only cherry-picking
minimal specific changes on top of RC2 to be able to reasonably carry over
previous testing efforts and effectively require a shorter voting time.

The only extra commits in this RC, compared to RC2, are the following:
- c2d9aeac [FLINK-13231] [pubsub] Replace Max outstanding acknowledgement
ids limit with a FlinkConnectorRateLimiter
- d8941711 [FLINK-13699][table-api] Fix TableFactory doesn’t work with DDL
when containing TIMESTAMP/DATE/TIME types
- 04e95278 [FLINK-13752] Only references necessary variables when
bookkeeping result partitions on TM

Due to the minimal set of changes, the vote for RC3 will be *open for only
48 hours*.
Please cast your votes before *Aug. 21st (Wed.) 2019, 17:00 CET*.

It is adopted by majority approval, with at least 3 PMC affirmative votes.

Thanks,
Gordon

[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344601
[2] https://dist.apache.org/repos/dist/dev/flink/flink-1.9.0-rc3/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1236
[5]
https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=refs/tags/release-1.9.0-rc3
[6] https://github.com/apache/flink/pull/9438
[7] https://github.com/apache/flink-web/pull/244
[8]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Apache-Flink-Release-1-9-0-release-candidate-2-tp31542p31933.html


Re: Issue with FilterableTableSource and the logical optimizer rules

2019-08-19 Thread Rong Rong
Hi Itamar,

The problem you describe sounds similar to this ticket [1].
Can you check whether following the solution listed there resolves your issue?

--
Rong

[1] https://issues.apache.org/jira/browse/FLINK-12399
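
If it does turn out to be that issue, a hedged sketch of the workaround
(assuming, as in FLINK-12399, that the planner merges the filtered and
unfiltered scans because their digests are identical) is to override
explainSource() on the custom table source:

// 'pushedDownPredicates' is an illustrative field holding whatever filters the
// source accepted during predicate push-down; including it in the digest keeps
// the planner from discarding the filtered scan as a duplicate of the original.
@Override
public String explainSource() {
    return "MyTableSource(filters: " + pushedDownPredicates + ")";
}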

On Mon, Aug 19, 2019 at 8:56 AM Itamar Ravid  wrote:

> Hi, I’m facing a strange issue with Flink 1.8.1. I’ve implemented a
> StreamTableSource that implements FilterableTableSource and
> ProjectableTableSource. However, I’m seeing that during the logical plan
> optimization (TableEnvironment.scala:288), the applyPredicates method is
> called but the resulting plan does NOT contain the source with the filter
> pushed.
>
> It appears that the problem is in the VolcanoPlanner.findBestExp method;
> when it reaches “root.buildCheapestPlan”, the resulting RelNode does not
> contain the filtered source.
>
> Additionally, I added a breakpoint in
> FlinkLogicalTableSourceScan#computeSelfCost, and the tableSource never has
> the predicates pushed. I verified that in the
> PushFilterIntoTableSourceScanRule, the resulting source always has the
> predicates pushed.
>
> Amusingly, this issue causes queries like “SELECT a FROM src WHERE a =
> 123” to be rewritten to “SELECT 123 FROM src” :-)
>
> Does anyone have any advice on debugging/working around this without
> disabling predicate pushdown on the source?
>


Re: Update Checkpoint and/or Savepoint Timeout of Running Job without restart?

2019-08-19 Thread Aaron Levin
Thanks for the answer, Congxian!

On Sun, Aug 18, 2019 at 10:43 PM Congxian Qiu 
wrote:

> Hi
>
> Currently, we can't change a running job's checkpoint timeout, but there
> is an issue[1] which wants to set a separate timeout for savepoint.
>
> [1] https://issues.apache.org/jira/browse/FLINK-9465
> Best,
> Congxian
>
>
> Aaron Levin  于2019年8月17日周六 上午12:37写道:
>
>> Hello,
>>
>> Question: Is it possible to update the checkpoint and/or savepoint
>> timeout of a running job without restarting it? If not, is this something
>> that would be a welcomed contribution (not sure how easy this would be)?
>>
>> Context: sometimes we have jobs who are making progress but get into a
>> state where checkpoints are timing out, though we believe they would be
>> successful if we could increase the checkpoint timeout. Unfortunately we
>> currently need to restart the job to change this, and we would like to
>> avoid this if possible. Ideally we could make this change temporarily,
>> allow a checkpoint or savepoint to succeed, and then change the settings
>> back.
>>
>> Best,
>>
>> Aaron Levin
>>
>


Re: Stale watermark due to unconsumed Kafka partitions

2019-08-19 Thread Stephan Ewen
You can use the Timestamp Assigner / Watermark Generator in two different
ways: Per Kafka Partition or per parallel source.

I would usually recommend per Kafka Partition, because if the read position
in the partitions drifts apart (for example some partitions are read at the
tail, some are read a few minutes behind) then your watermarks easily get
messed up if you do not track them per partition.
There is one instance of the assigner per partition, because its state
might be different for each partition (like "highest seen timestamp so far"
or "millis since last activity").

Why this behaves differently with a JDBC sink versus simply printing is
strange. Does the JDBC sink alter the parallelism or block some parts of
the pipeline?
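
For reference, a minimal sketch of the per-partition variant (Flink 1.8-era
API, using FlinkKafkaConsumer from flink-connector-kafka and
AscendingTimestampExtractor; the event type, topic name, deserialization
schema, timestamp accessor, and kafkaProps Properties are all illustrative):

// Assigning the extractor on the consumer itself makes Flink track timestamps
// and watermarks per Kafka partition instead of per parallel source instance.
FlinkKafkaConsumer<MyEvent> consumer =
        new FlinkKafkaConsumer<>("my-topic", new MyEventDeserializationSchema(), kafkaProps);
consumer.assignTimestampsAndWatermarks(new AscendingTimestampExtractor<MyEvent>() {
    @Override
    public long extractAscendingTimestamp(MyEvent event) {
        return event.getTimestamp(); // illustrative accessor
    }
});
DataStream<MyEvent> stream = env.addSource(consumer);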


On Sat, Aug 17, 2019 at 2:42 AM Eduardo Winpenny Tejedor <
eduardo.winpe...@gmail.com> wrote:

> Hi all,
>
> It was a bit tricky to figure out what was going wrong here, hopefully
> someone can add the missing piece to the puzzle.
>
> I have a Kafka source with a custom AssignerWithPeriodicWatermarks
> timestamp assigner. It's a copy of the AscendingTimestampExtractor with a
> log statement printing each timestamp and watermark produced (along with
> the hash of the assigner instance so I know exactly how far each substream
> has progressed). Attached to that there is a JDBCSinkFunction. I have set
> the whole plan's parallelism to 1 and the max also to 1.
>
> My first surprise was to see there are 16 instances of my assigner
> created, despite there being only one thread using all 16.
>
> My second surprise was to see there were only 4 assigner instances that
> were extracting timestamps.
>
> This meant the whole job's watermark wasn't advancing (and while that's
> not important in this simplified example it is in my real life use case).
>
> If I replace my JDBC sink for a print sink though all 16 assigners get
> fully used (i.e. they all receive messages from which they have to extract
> a timestamp).
>
> What is happening here? I don't want to ignore the unattended Kafka
> partitions or mark them as idle - because I know from using the print sink
> that they do have messages in them. I'm also surprised that there are 16
> instances of the assigner (one per Kafka partition) even though the
> parallelism of the job is one - is that a conscious decision and if so
> what's the reason?
>
> Finally I'd also like to know why only 4 assigners are effectively been
> used, I suspect it's a JDBC default I can override somehow.
>
> Thanks for getting to the bottom of this!
>


RabbitMQ - ConsumerCancelledException

2019-08-19 Thread Oytun Tez
Hi there,

We've started to witness ConsumerCancelledException errors from our
RabbitMQ source. We've dug into it everywhere, yet couldn't come up with a
good explanation.

This is the exception:

com.rabbitmq.client.ConsumerCancelledException
at 
com.rabbitmq.client.QueueingConsumer.handle(QueueingConsumer.java:208)
at 
com.rabbitmq.client.QueueingConsumer.nextDelivery(QueueingConsumer.java:223)
at 
org.apache.flink.streaming.connectors.rabbitmq.RMQSource.run(RMQSource.java:193)
at 
org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:93)
at 
org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:57)
at 
org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:97)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.rabbitmq.client.ConsumerCancelledException
at 
com.rabbitmq.client.QueueingConsumer.handleCancel(QueueingConsumer.java:122)
at 
com.rabbitmq.client.impl.ConsumerDispatcher$3.run(ConsumerDispatcher.java:115)
at 
com.rabbitmq.client.impl.ConsumerWorkService$WorkPoolRunnable.run(ConsumerWorkService.java:100)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
... 1 more


We've tried limiting the prefetch count to 100 and 500; nothing changed. We
could try 1 by 1, but that doesn't really sound efficient.


Is anyone familiar with possible causes?





---
Oytun Tez

*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com


Re: Using S3 as a sink (StreamingFileSink)

2019-08-19 Thread Swapnil Kumar
We are on 1.8 as of now; we will give "stop with savepoint" a try once we
upgrade. I am trying to cancel the job with a savepoint and restore it again.

I think there is an issue with how our S3 lifecycle is configured. Thank you
for your help.

On Sun, Aug 18, 2019 at 8:10 AM Stephan Ewen  wrote:

> My first guess would also be the same as Rafi's: The lifetime of the MPU
> part files is so too low for that use case.
>
> Maybe this can help:
>
>   - If you want to  stop a job with a savepoint and plan to restore later
> from it (possible much later, so that the MPU Part lifetime might be
> exceeded), then I would recommend to use Flink 1.9's new "stop with
> savepoint" feature. That should finalize in-flight uploads and make sure no
> lingering part files exist.
>
>   - If you take a savepoint out of a running job to start a new job, you
> probably need to configure the sink differently anyways, to not interfere
> with the running job. In that case, I would suggest to change the name of
> the sink (the operator uid) such that the new job's sink doesn't try to
> resume (and interfere with) the running job's sink.
>
> Best,
> Stephan
>
>
>
> On Sat, Aug 17, 2019 at 11:23 PM Rafi Aroch  wrote:
>
>> Hi,
>>
>> S3 would delete files only if you have 'lifecycle rules' [1] defined on
>> the bucket. Could that be the case? If so, make sure to disable / extend
>> the object expiration period.
>>
>> [1]
>> https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lifecycle-mgmt.html
>> 
>>
>> Thanks,
>> Rafi
>>
>>
>> On Sat, Aug 17, 2019 at 1:48 AM Oytun Tez  wrote:
>>
>>> Hi Swapnil,
>>>
>>> I am not familiar with the StreamingFileSink, however, this sounds like
>>> a checkpointing issue to me FileSink should keep its sink state, and remove
>>> from the state the files that it *really successfully* sinks (perhaps
>>> you may want to add a validation here with S3 to check file integrity).
>>> This leaves us in the state with the failed files, partial files etc.
>>>
>>>
>>>
>>> ---
>>> Oytun Tez
>>>
>>> *M O T A W O R D*
>>> The World's Fastest Human Translation Platform.
>>> oy...@motaword.com — www.motaword.com
>>> 
>>>
>>>
>>> On Fri, Aug 16, 2019 at 6:02 PM Swapnil Kumar 
>>> wrote:
>>>
 Hello, We are using Flink to process input events and aggregate and
 write o/p of our streaming job to S3 using StreamingFileSink but whenever
 we try to restore the job from a savepoint, the restoration fails with
 missing part files error. As per my understanding, s3 deletes those
 part(intermittent) files and can no longer be found on s3. Is there a
 workaround for this, so that we can use s3 as a sink?

 --
 Thanks,
 Swapnil Kumar

>>>

-- 
Thanks,
Swapnil Kumar


Re: Using S3 as a sink (StreamingFileSink)

2019-08-19 Thread Swapnil Kumar
Thank you Taher. We are not on EMR, but it's great to know that S3 and the
streaming sink should work fine based on your explanation.

On Sun, Aug 18, 2019 at 8:23 AM taher koitawala  wrote:

> Hi Swapnil,
>We faced this problem once, I think changing checkpoint dir to hdfs
> and keeping sink dir to s3 with EMRFS s3 consistency enabled solves this
> problem. If you are not using emr then I don't know how else it can be
> solved. But in a nutshell because EMRFS s3 consistency uses Dynamo DB in
> the back end to check for all files being written to s3. It kind of makes
> s3 consistent and Streaming file sink works just fine.
>
>
>
> On Sat, Aug 17, 2019, 3:32 AM Swapnil Kumar  wrote:
>
>> Hello, We are using Flink to process input events and aggregate and write
>> o/p of our streaming job to S3 using StreamingFileSink but whenever we try
>> to restore the job from a savepoint, the restoration fails with missing
>> part files error. As per my understanding, s3 deletes those
>> part(intermittent) files and can no longer be found on s3. Is there a
>> workaround for this, so that we can use s3 as a sink?
>>
>> --
>> Thanks,
>> Swapnil Kumar
>>
>

-- 
Thanks,
Swapnil Kumar


Re: Using S3 as a sink (StreamingFileSink)

2019-08-19 Thread Swapnil Kumar
Hello Rafi,

Thank you for getting back. We have a lifecycle rule set up for the sink
bucket, not for the S3 bucket holding the savepoints. This was my initial
hunch too, but we tried restarting the job immediately after canceling it
and it failed.

Best,
Swapnil Kumar

On Sat, Aug 17, 2019 at 2:23 PM Rafi Aroch  wrote:

> Hi,
>
> S3 would delete files only if you have 'lifecycle rules' [1] defined on
> the bucket. Could that be the case? If so, make sure to disable / extend
> the object expiration period.
>
> [1]
> https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lifecycle-mgmt.html
> 
>
> Thanks,
> Rafi
>
>
> On Sat, Aug 17, 2019 at 1:48 AM Oytun Tez  wrote:
>
>> Hi Swapnil,
>>
>> I am not familiar with the StreamingFileSink, however, this sounds like a
>> checkpointing issue to me FileSink should keep its sink state, and remove
>> from the state the files that it *really successfully* sinks (perhaps
>> you may want to add a validation here with S3 to check file integrity).
>> This leaves us in the state with the failed files, partial files etc.
>>
>>
>>
>> ---
>> Oytun Tez
>>
>> *M O T A W O R D*
>> The World's Fastest Human Translation Platform.
>> oy...@motaword.com — www.motaword.com
>> 
>>
>>
>> On Fri, Aug 16, 2019 at 6:02 PM Swapnil Kumar 
>> wrote:
>>
>>> Hello, We are using Flink to process input events and aggregate and
>>> write o/p of our streaming job to S3 using StreamingFileSink but whenever
>>> we try to restore the job from a savepoint, the restoration fails with
>>> missing part files error. As per my understanding, s3 deletes those
>>> part(intermittent) files and can no longer be found on s3. Is there a
>>> workaround for this, so that we can use s3 as a sink?
>>>
>>> --
>>> Thanks,
>>> Swapnil Kumar
>>>
>>

-- 
Thanks,
Swapnil Kumar


Issue with FilterableTableSource and the logical optimizer rules

2019-08-19 Thread Itamar Ravid
Hi, I’m facing a strange issue with Flink 1.8.1. I’ve implemented a 
StreamTableSource that implements FilterableTableSource and 
ProjectableTableSource. However, I’m seeing that during the logical plan 
optimization (TableEnvironment.scala:288), the applyPredicates method is called 
but the resulting plan does NOT contain the source with the filter pushed.

It appears that the problem is in the VolcanoPlanner.findBestExp method; when 
it reaches “root.buildCheapestPlan”, the resulting RelNode does not contain the 
filtered source.

Additionally, I added a breakpoint in 
FlinkLogicalTableSourceScan#computeSelfCost, and the tableSource never has the 
predicates pushed. I verified that in the PushFilterIntoTableSourceScanRule, 
the resulting source always has the predicates pushed.

Amusingly, this issue causes queries like “SELECT a FROM src WHERE a = 123” to 
be rewritten to “SELECT 123 FROM src” :-)

Does anyone have any advice on debugging/working around this without disabling 
predicate pushdown on the source?


Re: Recovery from job manager crash using check points

2019-08-19 Thread Biao Liu
Hi Min,

> Do I need to set up zookeepers to keep the states when a job manager
crashes?

I guess you need to set up HA [1] properly. Besides that, I would
suggest you also check the state backend.

1.
https://ci.apache.org/projects/flink/flink-docs-master/ops/jobmanager_high_availability.html
2.
https://ci.apache.org/projects/flink/flink-docs-master/ops/state/state_backends.html

Thanks,
Biao /'bɪ.aʊ/



On Mon, 19 Aug 2019 at 23:28,  wrote:

> Hi,
>
>
>
> I can use check points to recover Flink states when a task manger crashes.
>
>
>
> I can not use check points to recover Flink states when a job manger
> crashes.
>
>
>
> Do I need to set up zookeepers to keep the states when a job manager
> crashes?
>
>
>
> Regards
>
>
>
> Min
>
>
>


Re: Recovery from job manager crash using check points

2019-08-19 Thread miki haiat
Which kind of deployment are you using:
standalone, YARN, or something else?

On Mon, Aug 19, 2019, 18:28  wrote:

> Hi,
>
>
>
> I can use check points to recover Flink states when a task manger crashes.
>
>
>
> I can not use check points to recover Flink states when a job manger
> crashes.
>
>
>
> Do I need to set up zookeepers to keep the states when a job manager
> crashes?
>
>
>
> Regards
>
>
>
> Min
>
>
>


Re: How to implement Multi-tenancy in Flink

2019-08-19 Thread Ahmad Hassan
Flink's sliding window didn't work well for our use case at SAP, as
checkpointing freezes with 288 sliding windows per tenant. Implementing the
sliding window through a tumbling window plus a process function reduces the
checkpointing time to a few seconds. We will see how that scales with 1000 or
more tenants to get a better idea of the scalability.
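
For reference, a minimal sketch of that tumbling-window-plus-ProcessFunction
pattern (names are illustrative, and it assumes the 5-minute pre-aggregate is
a simple count so it can be added and subtracted; a production version would,
as Fabian suggests below, register timers and do the update in onTimer() so
the result also advances when no events arrive):

import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Keyed by tenant; the input is the 5-minute pre-aggregated count, the output
// is the rolling 24-hour count maintained in a 288-slot ring buffer.
public class RollingDayAggregate extends KeyedProcessFunction<String, Long, Long> {

    private static final int SLOTS = 288; // 24 h / 5 min

    private transient MapState<Integer, Long> buffer; // ring buffer of 5-minute counts
    private transient ValueState<Integer> pointer;    // next slot to overwrite
    private transient ValueState<Long> result;        // current 24-hour aggregate

    @Override
    public void open(Configuration parameters) {
        buffer = getRuntimeContext().getMapState(
                new MapStateDescriptor<>("buffer", Integer.class, Long.class));
        pointer = getRuntimeContext().getState(
                new ValueStateDescriptor<>("pointer", Integer.class));
        result = getRuntimeContext().getState(
                new ValueStateDescriptor<>("result", Long.class));
    }

    @Override
    public void processElement(Long intermediate, Context ctx, Collector<Long> out)
            throws Exception {
        int p = pointer.value() == null ? 0 : pointer.value();
        long oldest = buffer.contains(p) ? buffer.get(p) : 0L;
        long current = result.value() == null ? 0L : result.value();

        buffer.put(p, intermediate);          // overwrite the oldest slot
        pointer.update((p + 1) % SLOTS);      // advance the ring pointer
        long updated = current - oldest + intermediate;
        result.update(updated);
        out.collect(updated);                 // emit the rolling 24-hour result
    }
}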

Best Regards,

On Mon, 19 Aug 2019 at 16:16, Fabian Hueske  wrote:

> Great!
>
> Thanks for the feedback.
>
> Cheers, Fabian
>
> Am Mo., 19. Aug. 2019 um 17:12 Uhr schrieb Ahmad Hassan <
> ahmad.has...@gmail.com>:
>
>>
>> Thank you Fabian. This works really well.
>>
>> Best Regards,
>>
>> On Fri, 16 Aug 2019 at 09:22, Fabian Hueske  wrote:
>>
>>> Hi Ahmad,
>>>
>>> The ProcessFunction should not rely on new records to come (i..e, do the
>>> processsing in the onElement() method) but rather register a timer every 5
>>> minutes and perform the processing when the timer fires in onTimer().
>>> Essentially, you'd only collect data the data in `processElement()` and
>>> process in `onTimer()`.
>>> You need to make sure that you have timers registered, as long as
>>> there's data in the ring buffer.
>>>
>>> Best, Fabian
>>>
>>> Am Do., 15. Aug. 2019 um 19:20 Uhr schrieb Ahmad Hassan <
>>> ahmad.has...@gmail.com>:
>>>
 Hi Fabian,

 In this case, how do we emit tumbling window when there are no events?
 Otherwise it’s not possible to emulate a sliding window in process function
 and move the buffer ring every 5 mins when there are no events.

 Yes I can create a periodic source function but how can it be
 associated with all the keyed windows.

 Thanks.

 Best,

 On 2 Aug 2019, at 12:49, Fabian Hueske  wrote:

 Ok, I won't go into the implementation detail.

 The idea is to track all products that were observed in the last five
 minutes (i.e., unique product ids) in a five minute tumbling window.
 Every five minutes, the observed products are send to a process
 function that collects the data of the last 24 hours and updates the
 current result by adding the data of the latest 5 minutes and removing the
 data of the 5 minutes that fell out of the 24 hour window.

 I don't know your exact business logic, but this is the rough scheme
 that I would follow.

 Cheers, Fabian

 Am Fr., 2. Aug. 2019 um 12:25 Uhr schrieb Ahmad Hassan <
 ahmad.has...@gmail.com>:

> Hi Fabian,
>
> Thanks for this detail. However, our pipeline is keeping track of list
> of products seen in 24 hour with 5 min slide (288 windows).
>
> inStream
>
> .filter(Objects::*nonNull*)
>
> .keyBy(*TENANT*)
>
> .window(SlidingProcessingTimeWindows.*of*(Time.*minutes*(24), Time.
> *minutes*(5)))
>
> .trigger(TimeTrigger.*create*())
>
> .evictor(CountEvictor.*of*(1))
>
> .process(*new* MetricProcessWindowFunction());
>
>
> Trigger just fires for onElement and MetricProcessWindowFunction just
> store stats for each product within MapState
>
> and emit only if it reaches expiry. Evictor just empty the window as
> all products state is within MapState. Flink 1.7.0 checkpointing just 
> hangs
> and expires while processing our pipeline.
>
>
> However, with your proposed solution, how would we be able to achieve
> this sliding window mechanism of emitting 24 hour window every 5 minute
> using processfunction ?
>
>
> Best,
>
>
> On Fri, 2 Aug 2019 at 09:48, Fabian Hueske  wrote:
>
>> Hi Ahmad,
>>
>> First of all, you need to preaggregate the data in a 5 minute
>> tumbling window. For example, if your aggregation function is count or 
>> sum,
>> this is simple.
>> You have a 5 min tumbling window that just emits a count or sum every
>> 5 minutes.
>>
>> The ProcessFunction then has a MapState
>> (called buffer). IntermediateAgg is the result type of the tumbling 
>> window
>> and the MapState is used like an array with the Integer key being the
>> position pointer to the value. You will only use the pointers 0 to 287 to
>> store the 288 intermediate aggregation values and use the MapState as a
>> ring buffer. For that you need a ValueState (called pointer) 
>> that
>> is a pointer to the position that is overwritten next. Finally, you have 
>> a
>> ValueState (called result) that stores the result of the last
>> window.
>>
>> When the ProcessFunction receives a new intermediate result, it will
>> perform the following steps:
>>
>> 1) get the oldest intermediate result: buffer.get(pointer)
>> 2) override the oldest intermediate result by the newly received
>> intermediate result: buffer.put(pointer, new-intermediate-result)
>> 3) increment the pointer by 1 and reset it to 0 if it became 288
>> 4) subtract the oldest 

Recovery from job manager crash using check points

2019-08-19 Thread min.tan
Hi,

I can use checkpoints to recover Flink state when a task manager crashes.

I cannot use checkpoints to recover Flink state when a job manager crashes.

Do I need to set up ZooKeeper to keep the state when a job manager crashes?

Regards

Min


Re: How to implement Multi-tenancy in Flink

2019-08-19 Thread Fabian Hueske
Great!

Thanks for the feedback.

Cheers, Fabian

On Mon, 19 Aug 2019 at 17:12, Ahmad Hassan <ahmad.has...@gmail.com> wrote:

>
> Thank you Fabian. This works really well.
>
> Best Regards,
>
> On Fri, 16 Aug 2019 at 09:22, Fabian Hueske  wrote:
>
>> Hi Ahmad,
>>
>> The ProcessFunction should not rely on new records to come (i..e, do the
>> processsing in the onElement() method) but rather register a timer every 5
>> minutes and perform the processing when the timer fires in onTimer().
>> Essentially, you'd only collect data the data in `processElement()` and
>> process in `onTimer()`.
>> You need to make sure that you have timers registered, as long as there's
>> data in the ring buffer.
>>
>> Best, Fabian
>>
>> Am Do., 15. Aug. 2019 um 19:20 Uhr schrieb Ahmad Hassan <
>> ahmad.has...@gmail.com>:
>>
>>> Hi Fabian,
>>>
>>> In this case, how do we emit tumbling window when there are no events?
>>> Otherwise it’s not possible to emulate a sliding window in process function
>>> and move the buffer ring every 5 mins when there are no events.
>>>
>>> Yes I can create a periodic source function but how can it be associated
>>> with all the keyed windows.
>>>
>>> Thanks.
>>>
>>> Best,
>>>
>>> On 2 Aug 2019, at 12:49, Fabian Hueske  wrote:
>>>
>>> Ok, I won't go into the implementation detail.
>>>
>>> The idea is to track all products that were observed in the last five
>>> minutes (i.e., unique product ids) in a five minute tumbling window.
>>> Every five minutes, the observed products are send to a process function
>>> that collects the data of the last 24 hours and updates the current result
>>> by adding the data of the latest 5 minutes and removing the data of the 5
>>> minutes that fell out of the 24 hour window.
>>>
>>> I don't know your exact business logic, but this is the rough scheme
>>> that I would follow.
>>>
>>> Cheers, Fabian
>>>
>>> Am Fr., 2. Aug. 2019 um 12:25 Uhr schrieb Ahmad Hassan <
>>> ahmad.has...@gmail.com>:
>>>
 Hi Fabian,

 Thanks for this detail. However, our pipeline is keeping track of list
 of products seen in 24 hour with 5 min slide (288 windows).

 inStream

 .filter(Objects::*nonNull*)

 .keyBy(*TENANT*)

 .window(SlidingProcessingTimeWindows.*of*(Time.*minutes*(24), Time.
 *minutes*(5)))

 .trigger(TimeTrigger.*create*())

 .evictor(CountEvictor.*of*(1))

 .process(*new* MetricProcessWindowFunction());


 Trigger just fires for onElement and MetricProcessWindowFunction just
 store stats for each product within MapState

 and emit only if it reaches expiry. Evictor just empty the window as
 all products state is within MapState. Flink 1.7.0 checkpointing just hangs
 and expires while processing our pipeline.


 However, with your proposed solution, how would we be able to achieve
 this sliding window mechanism of emitting 24 hour window every 5 minute
 using processfunction ?


 Best,


 On Fri, 2 Aug 2019 at 09:48, Fabian Hueske  wrote:

> Hi Ahmad,
>
> First of all, you need to preaggregate the data in a 5 minute tumbling
> window. For example, if your aggregation function is count or sum, this is
> simple.
> You have a 5 min tumbling window that just emits a count or sum every
> 5 minutes.
>
> The ProcessFunction then has a MapState
> (called buffer). IntermediateAgg is the result type of the tumbling window
> and the MapState is used like an array with the Integer key being the
> position pointer to the value. You will only use the pointers 0 to 287 to
> store the 288 intermediate aggregation values and use the MapState as a
> ring buffer. For that you need a ValueState (called pointer) that
> is a pointer to the position that is overwritten next. Finally, you have a
> ValueState (called result) that stores the result of the last
> window.
>
> When the ProcessFunction receives a new intermediate result, it will
> perform the following steps:
>
> 1) get the oldest intermediate result: buffer.get(pointer)
> 2) override the oldest intermediate result by the newly received
> intermediate result: buffer.put(pointer, new-intermediate-result)
> 3) increment the pointer by 1 and reset it to 0 if it became 288
> 4) subtract the oldest intermediate result from the result
> 5) add the newly received intermediate result to the result. Update
> the result state and emit the result
>
> Note, this only works for certain aggregation functions. Depending on
> the function, you cannot pre-aggregate which is a hard requirement for 
> this
> approach.
>
> Best, Fabian
>
> Am Do., 1. Aug. 2019 um 20:00 Uhr schrieb Ahmad Hassan <
> ahmad.has...@gmail.com>:
>
>>
>> Hi Fabian,
>>
>> > On 4 Jul 2018, at 11:39, Fabian Hueske  wrote:
>> >
>> > - Pre-aggregate 

Re: How to implement Multi-tenancy in Flink

2019-08-19 Thread Ahmad Hassan
Thank you Fabian. This works really well.

Best Regards,

On Fri, 16 Aug 2019 at 09:22, Fabian Hueske  wrote:

> Hi Ahmad,
>
> The ProcessFunction should not rely on new records to come (i..e, do the
> processsing in the onElement() method) but rather register a timer every 5
> minutes and perform the processing when the timer fires in onTimer().
> Essentially, you'd only collect data the data in `processElement()` and
> process in `onTimer()`.
> You need to make sure that you have timers registered, as long as there's
> data in the ring buffer.
>
> Best, Fabian
>
> Am Do., 15. Aug. 2019 um 19:20 Uhr schrieb Ahmad Hassan <
> ahmad.has...@gmail.com>:
>
>> Hi Fabian,
>>
>> In this case, how do we emit tumbling window when there are no events?
>> Otherwise it’s not possible to emulate a sliding window in process function
>> and move the buffer ring every 5 mins when there are no events.
>>
>> Yes I can create a periodic source function but how can it be associated
>> with all the keyed windows.
>>
>> Thanks.
>>
>> Best,
>>
>> On 2 Aug 2019, at 12:49, Fabian Hueske  wrote:
>>
>> Ok, I won't go into the implementation detail.
>>
>> The idea is to track all products that were observed in the last five
>> minutes (i.e., unique product ids) in a five minute tumbling window.
>> Every five minutes, the observed products are send to a process function
>> that collects the data of the last 24 hours and updates the current result
>> by adding the data of the latest 5 minutes and removing the data of the 5
>> minutes that fell out of the 24 hour window.
>>
>> I don't know your exact business logic, but this is the rough scheme that
>> I would follow.
>>
>> Cheers, Fabian
>>
>> Am Fr., 2. Aug. 2019 um 12:25 Uhr schrieb Ahmad Hassan <
>> ahmad.has...@gmail.com>:
>>
>>> Hi Fabian,
>>>
>>> Thanks for this detail. However, our pipeline is keeping track of list
>>> of products seen in 24 hour with 5 min slide (288 windows).
>>>
>>> inStream
>>>
>>> .filter(Objects::*nonNull*)
>>>
>>> .keyBy(*TENANT*)
>>>
>>> .window(SlidingProcessingTimeWindows.*of*(Time.*minutes*(24), Time.
>>> *minutes*(5)))
>>>
>>> .trigger(TimeTrigger.*create*())
>>>
>>> .evictor(CountEvictor.*of*(1))
>>>
>>> .process(*new* MetricProcessWindowFunction());
>>>
>>>
>>> Trigger just fires for onElement and MetricProcessWindowFunction just
>>> store stats for each product within MapState
>>>
>>> and emit only if it reaches expiry. Evictor just empty the window as all
>>> products state is within MapState. Flink 1.7.0 checkpointing just hangs and
>>> expires while processing our pipeline.
>>>
>>>
>>> However, with your proposed solution, how would we be able to achieve
>>> this sliding window mechanism of emitting 24 hour window every 5 minute
>>> using processfunction ?
>>>
>>>
>>> Best,
>>>
>>>
>>> On Fri, 2 Aug 2019 at 09:48, Fabian Hueske  wrote:
>>>
 Hi Ahmad,

 First of all, you need to preaggregate the data in a 5 minute tumbling
 window. For example, if your aggregation function is count or sum, this is
 simple.
 You have a 5 min tumbling window that just emits a count or sum every 5
 minutes.

 The ProcessFunction then has a MapState
 (called buffer). IntermediateAgg is the result type of the tumbling window
 and the MapState is used like an array with the Integer key being the
 position pointer to the value. You will only use the pointers 0 to 287 to
 store the 288 intermediate aggregation values and use the MapState as a
 ring buffer. For that you need a ValueState (called pointer) that
 is a pointer to the position that is overwritten next. Finally, you have a
 ValueState (called result) that stores the result of the last
 window.

 When the ProcessFunction receives a new intermediate result, it will
 perform the following steps:

 1) get the oldest intermediate result: buffer.get(pointer)
 2) override the oldest intermediate result by the newly received
 intermediate result: buffer.put(pointer, new-intermediate-result)
 3) increment the pointer by 1 and reset it to 0 if it became 288
 4) subtract the oldest intermediate result from the result
 5) add the newly received intermediate result to the result. Update the
 result state and emit the result

 Note, this only works for certain aggregation functions. Depending on
 the function, you cannot pre-aggregate which is a hard requirement for this
 approach.

 Best, Fabian

 Am Do., 1. Aug. 2019 um 20:00 Uhr schrieb Ahmad Hassan <
 ahmad.has...@gmail.com>:

>
> Hi Fabian,
>
> > On 4 Jul 2018, at 11:39, Fabian Hueske  wrote:
> >
> > - Pre-aggregate records in a 5 minute Tumbling window. However,
> pre-aggregation does not work for FoldFunctions.
> > - Implement the window as a custom ProcessFunction that maintains a
> state of 288 events and aggregates and retracts the pre-aggregated 
> records.
> >
> > 

Configuring logback for my flink job

2019-08-19 Thread Vishwas Siravara
Hi,
I have a logback configuration for my Flink application, packaged with the
application fat jar. However, when I submit my job from the Flink command line
tool, I see from the client log that logback is set to
-Dlogback.configurationFile=file:/data/flink-1.7.2/conf/logback.xml.

As a result, my application log ends up going to the client log. How can I
change this behavior? I know a dirty fix is to add my logback config to
/data/flink-1.7.2/conf/logback.xml.

Any other suggestions on how I can have my application log to its
configured appenders?

Thanks,
Vishwas


Re: processing avro data source using DataSet API and output to parquet

2019-08-19 Thread Zhenghua Gao
The DataStream API should fully subsume the DataSet API (through bounded
streams) in the long run [1], and you can consider using the Table/SQL API
in your project.

[1]
https://flink.apache.org/roadmap.html#analytics-applications-and-the-roles-of-datastream-dataset-and-table-api
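
For the original question, a minimal sketch of the StreamingFileSink +
Parquet route mentioned below (it assumes the flink-parquet dependency, an
Avro SpecificRecord class MyRecord, an existing DataStream<MyRecord> named
stream, and an illustrative output path):

// ParquetAvroWriters builds a bulk writer that encodes Avro records as
// Parquet; with bulk formats the StreamingFileSink rolls files on checkpoints.
StreamingFileSink<MyRecord> sink = StreamingFileSink
        .forBulkFormat(new Path("hdfs:///output/parquet"),
                       ParquetAvroWriters.forSpecificRecord(MyRecord.class))
        .build();
stream.addSink(sink);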

*Best Regards,*
*Zhenghua Gao*


On Fri, Aug 16, 2019 at 11:52 PM Lian Jiang  wrote:

> Thanks. Which api (dataset or datastream) is recommended for file handling
> (no window operation required)?
>
> We have similar scenario for real-time processing. May it make sense to
> use datastream api for both batch and real-time for uniformity?
>
> Sent from my iPhone
>
> On Aug 16, 2019, at 00:38, Zhenghua Gao  wrote:
>
> Flink allows hadoop (mapreduce) OutputFormats in Flink jobs[1]. You can
> have a try with Parquet OutputFormat[2].
>
> And if you can turn to DataStream API,
> StreamingFileSink + ParquetBulkWriter meets your requirement[3][4].
>
> [1]
> https://github.com/apache/flink/blob/master/flink-connectors/flink-hadoop-compatibility/src/test/java/org/apache/flink/test/hadoopcompatibility/mapreduce/example/WordCount.java
> [2]
> https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetOutputFormat.java
> [3]
> https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/sink/filesystem/StreamingFileSink.java
> [4]
> https://github.com/apache/flink/blob/master/flink-formats/flink-parquet/src/main/java/org/apache/flink/formats/parquet/ParquetBulkWriter.java
>
>
> *Best Regards,*
> *Zhenghua Gao*
>
>
> On Fri, Aug 16, 2019 at 1:04 PM Lian Jiang  wrote:
>
>> Hi,
>>
>> I am using Flink 1.8.1 DataSet for a batch processing. The data source is
>> avro files and I want to output the result into parquet.
>> https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/batch/
>> only has no related information. What's the recommended way for doing this?
>> Do I need to write adapters? Appreciate your help!
>>
>>
>>


Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-19 Thread Ufuk Celebi
I'm late to the party... Welcome and congrats! :-)

– Ufuk


On Mon, Aug 19, 2019 at 9:26 AM Andrey Zagrebin 
wrote:

> Hi Everybody!
>
> Thanks a lot for the warn welcome!
> I am really happy about joining Flink committer team and hope to help the
> project to grow more.
>
> Cheers,
> Andrey
>
> On Fri, Aug 16, 2019 at 11:10 AM Terry Wang  wrote:
>
> > Congratulations Andrey!
> >
> > Best,
> > Terry Wang
> >
> >
> >
> > 在 2019年8月15日,下午9:27,Hequn Cheng  写道:
> >
> > Congratulations Andrey!
> >
> > On Thu, Aug 15, 2019 at 3:30 PM Fabian Hueske  wrote:
> >
> >> Congrats Andrey!
> >>
> >> Am Do., 15. Aug. 2019 um 07:58 Uhr schrieb Gary Yao  >:
> >>
> >> > Congratulations Andrey, well deserved!
> >> >
> >> > Best,
> >> > Gary
> >> >
> >> > On Thu, Aug 15, 2019 at 7:50 AM Bowen Li  wrote:
> >> >
> >> > > Congratulations Andrey!
> >> > >
> >> > > On Wed, Aug 14, 2019 at 10:18 PM Rong Rong 
> >> wrote:
> >> > >
> >> > >> Congratulations Andrey!
> >> > >>
> >> > >> On Wed, Aug 14, 2019 at 10:14 PM chaojianok 
> >> wrote:
> >> > >>
> >> > >> > Congratulations Andrey!
> >> > >> > At 2019-08-14 21:26:37, "Till Rohrmann" 
> >> wrote:
> >> > >> > >Hi everyone,
> >> > >> > >
> >> > >> > >I'm very happy to announce that Andrey Zagrebin accepted the
> >> offer of
> >> > >> the
> >> > >> > >Flink PMC to become a committer of the Flink project.
> >> > >> > >
> >> > >> > >Andrey has been an active community member for more than 15
> >> months.
> >> > He
> >> > >> has
> >> > >> > >helped shaping numerous features such as State TTL, FRocksDB
> >> release,
> >> > >> > >Shuffle service abstraction, FLIP-1, result partition management
> >> and
> >> > >> > >various fixes/improvements. He's also frequently helping out on
> >> the
> >> > >> > >user@f.a.o mailing lists.
> >> > >> > >
> >> > >> > >Congratulations Andrey!
> >> > >> > >
> >> > >> > >Best, Till
> >> > >> > >(on behalf of the Flink PMC)
> >> > >> >
> >> > >>
> >> > >
> >> >
> >>
> >
> >
>


Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-19 Thread Andrey Zagrebin
Hi Everybody!

Thanks a lot for the warm welcome!
I am really happy to join the Flink committer team and hope to help the
project grow further.

Cheers,
Andrey

On Fri, Aug 16, 2019 at 11:10 AM Terry Wang  wrote:

> Congratulations Andrey!
>
> Best,
> Terry Wang
>
>
>
> 在 2019年8月15日,下午9:27,Hequn Cheng  写道:
>
> Congratulations Andrey!
>
> On Thu, Aug 15, 2019 at 3:30 PM Fabian Hueske  wrote:
>
>> Congrats Andrey!
>>
>> Am Do., 15. Aug. 2019 um 07:58 Uhr schrieb Gary Yao :
>>
>> > Congratulations Andrey, well deserved!
>> >
>> > Best,
>> > Gary
>> >
>> > On Thu, Aug 15, 2019 at 7:50 AM Bowen Li  wrote:
>> >
>> > > Congratulations Andrey!
>> > >
>> > > On Wed, Aug 14, 2019 at 10:18 PM Rong Rong 
>> wrote:
>> > >
>> > >> Congratulations Andrey!
>> > >>
>> > >> On Wed, Aug 14, 2019 at 10:14 PM chaojianok 
>> wrote:
>> > >>
>> > >> > Congratulations Andrey!
>> > >> > At 2019-08-14 21:26:37, "Till Rohrmann" 
>> wrote:
>> > >> > >Hi everyone,
>> > >> > >
>> > >> > >I'm very happy to announce that Andrey Zagrebin accepted the
>> offer of
>> > >> the
>> > >> > >Flink PMC to become a committer of the Flink project.
>> > >> > >
>> > >> > >Andrey has been an active community member for more than 15
>> months.
>> > He
>> > >> has
>> > >> > >helped shaping numerous features such as State TTL, FRocksDB
>> release,
>> > >> > >Shuffle service abstraction, FLIP-1, result partition management
>> and
>> > >> > >various fixes/improvements. He's also frequently helping out on
>> the
>> > >> > >user@f.a.o mailing lists.
>> > >> > >
>> > >> > >Congratulations Andrey!
>> > >> > >
>> > >> > >Best, Till
>> > >> > >(on behalf of the Flink PMC)
>> > >> >
>> > >>
>> > >
>> >
>>
>
>