Re: [PROPOSAL] [community] A more structured approach to reviews and contributions

2018-09-20 Thread
Hi Fabian,

I see a bot, "project-bot", active on pull requests. Is this an outcome of
this thread?

Best,
tison.


Thomas Weise wrote on Wed, Sep 19, 2018 at 10:02 PM:

> Follow-up regarding the PR template that pops up when opening a PR:
>
> I think what we have now is a fairly big blob of text that jumps up a bit
> unexpectedly for a first time contributor and is also cumbersome to deal
> with in the small PR description window. Perhaps we can improve it a bit:
>
> * Instead of putting all that text into the description, add it to
> website/wiki and just have a pointer in the PR, asking the contributor to
> review the guidelines before opening a PR.
> * If the questions further down can be made relevant to the context of the
> contribution, that would probably help both the contributor and the
> reviewer. For example, the questions would be different for a documentation
> change, connector change or work deep in core. Not sure if that can be
> automated, but if moved to a separate page, it could be structured better.
>
> Thanks,
> Thomas
>
>
>
>
>
>
> On Tue, Sep 18, 2018 at 8:13 AM 陈梓立  wrote:
>
> > Put some good cases here might be helpful.
> >
> > See how a contribution of runtime module be proposed, discussed,
> > implemented and merged from  https://github.com/apache/flink/pull/5931
> to
> > https://github.com/apache/flink/pull/6132.
> >
> > 1. #5931 fix a bug, but remains points could be improved. Here sihuazhou
> > and shuai-xu share their considerations and require review(of the
> proposal)
> > by Stephan, Till and Gary, our committers.
> > 2. After discussion, all people involved reach a consensus. So the rest
> > work is to implement it.
> > 3. sihuazhou gives out an implementation #6132, Till reviews it and find
> it
> > is somewhat out of the "architectural" aspect, so suggests
> > implementation-level changes.
> > 4. Addressing those implementation-level comments, the PR gets merged.
> >
> > I think this is quite a good example how we think our review process
> should
> > go.
> >
> > Best,
> > tison.
> >
> >
> > 陈梓立 wrote on Tue, Sep 18, 2018 at 10:53 PM:
> >
> > > Maybe a little rearrange to the process would help.
> > >
> > > (1). Does the contributor describe itself well?
> > >   (1.1) By whom this contribution should be given attention. This often
> > > shows by its title, "[FLINK-XXX] [module]", the module part infer.
> > >   (1.2) What the purpose of this contribution is. Done by the PR
> > template.
> > > Even on JIRA an issue should cover these points.
> > >
> > > (2). Is there consensus on the contribution?
> > > This follows (1), because we need to clear what the purpose of the
> > > contribution first. At this stage reviewers could cc to module
> maintainer
> > > as a supplement to (1.1). Also reviewers might ask the contributor to
> > > clarify his purpose to sharp(1.2)
> > >
> > > (3). Is the implement architectural and fit code style?
> > > This follows (2). And only after a consensus we talk about concrete
> > > implement, which prevent spend time and put effort in vain.
> > >
> > > In addition, ideally a "+1" comment or approval means the purpose of
> > > contribution is supported by the reviewer and implement(if there is)
> > > quality is fine, so the reviewer vote for a consensus.
> > >
> > > Best,
> > > tison.
> > >
> > >
> > > Stephan Ewen wrote on Tue, Sep 18, 2018 at 6:44 PM:
> > >
> > >> On the template discussion, some thoughts
> > >>
> > >> *PR Template*
> > >>
> > >> I think the PR template went well. We can rethink the "checklist" at
> the
> > >> bottom, but all other parts turned out helpful in my opinion.
> > >>
> > >> With the amount of contributions, it helps to ask the contributor to
> > take
> > >> a
> > >> little more work in order for the reviewer to be more efficient.
> > >> I would suggest to keep that mindset: Whenever we find a way that the
> > >> contributor can prepare stuff in such a way that reviews become
> > >> more efficient, we should do that. In my experience, most contributors
> > are
> > >> willing to put in some extra minutes if it helps that their
> > >> PR gets merged faster.
> > >>
> > >> *Review Template*
> > >>
> > >> I think it would be helpful to have this checklist. It does not matter
> > 

Re: [PROPOSAL] [community] A more structured approach to reviews and contributions

2018-09-19 Thread
Hi,

I would prefer the combined approach and could help make it concrete.

Post a review template as soon as a pull request is created. It serves as
the formal process for reviews to follow. Post it in the form of a checklist
and make it editable by maintainers.

[ ] Consensus reached
[ ] Implementation is architecturally sound
[ ] Code quality is acceptable
( more check items and descriptions can be added if needed )

As the process moves ahead, maintainers, i.e., committers, SHOULD update the
checklist, and it is SUGGESTED that they add a label to mark the stage.
Ideally, when the stage changes, the maintainer/committer adds a corresponding
comment to explain it. Of course, when things are obvious, some of this work
can be omitted.
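
A minimal sketch of how such a checklist comment could be rendered and read
back. The item wording paraphrases the checklist above; the function names and
the "- [ ]" checkbox syntax are my assumptions for illustration, not an
existing bot:

```python
import re

# Checklist items paraphrased from the proposal above (illustrative wording).
CHECKLIST_ITEMS = [
    "Consensus reached",
    "Implementation is architecturally sound",
    "Code quality is acceptable",
]

# One checklist line: "- [ ] item" (open) or "- [x] item" (ticked).
_BOX = re.compile(r"^- \[( |x|X)\] (.+)$")


def render_checklist(items=CHECKLIST_ITEMS):
    """Return the comment body that could be posted on a new pull request."""
    return "\n".join(f"- [ ] {item}" for item in items)


def parse_checklist(comment_body):
    """Map each checklist item to True (ticked) or False (still open)."""
    state = {}
    for line in comment_body.splitlines():
        match = _BOX.match(line.strip())
        if match:
            state[match.group(2)] = match.group(1).lower() == "x"
    return state


def next_open_item(comment_body):
    """Return the first unticked item, or None when every box is ticked."""
    for item, done in parse_checklist(comment_body).items():
        if not done:
            return item
    return None
```

The first open item could then be what a stage label mirrors, so the comment
and the label stay in sync.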

Best,
tison.


Fabian Hueske wrote on Wed, Sep 19, 2018 at 8:14 PM:

> Hi,
>
> I'd like to suggest that we keep this thread focused on discussing
> Stephan's proposal, i.e., introducing a structured PR review process.
> Tison and Piotr raised some good points related to PR reviews that are
> definitely worth discussing, but I think we should do that on different
> threads and move forward with the original discussion.
>
> Stephan's proposal has received only positive feedback until now.
> So, I think we should go ahead and adopt the process.
>
> IMO, there are two things to be done here:
>
> * Add the proposed process to the website (trivial)
> * Implement the process in practice which is still an open question to me.
>
> Two methods have been proposed to implement the process, and I see the
> following advantages and disadvantages:
>
> 1. Automatically posting the review checklist as the first comment to a PR
> and track the progress by ticking off boxes in the checklist.
>
>   + First time contributors / reviewers learn about the process from the
> comment, which reduces the chance of detailed reviews without consensus on
> the motivation and approach.
>   - Comments are not visible when looking at the PR list and cannot be used
> for filtering.
>   - Needs some kind of service that automatically comments on PRs. Service
> does not need special permissions as every GH user can comment.
>
> 2. Tracking the review status of a PR with labels.
>
>   + Labels are visible on the PR overview and can be used to filter PRs.
>   - The review process is not spelled out. Contributors and reviewers have
> to learn about the process somewhere else (link could be added to PR
> template).
>   - If we want to tick off labels, we need some kind of service to
> automatically assign labels. The service needs committer permissions or must
> be set up by ASF Infra.
>   - We need to check whether assignment/removal of labels is mirrored to a
> mailing list, which is important in the ASF for tracking decisions. This
> shouldn't be hard to figure out, but if labels are not tracked, they cannot
> be the sole solution.
>
> We can of course also do both.
> Have the review checklist posted and tracked as a comment and ask reviewers
> to add the right label when ticking a box off.
>
> What do you think?
>
> Best,
> Fabian
>
> 2018-09-19 11:24 GMT+02:00 陈梓立 :
>
> > Hi Piotr,
> >
> > I strongly agree with your idea. Since we work on a huge project, diff in
> > the PR becomes hard to review if there is too much noise.
> >
> > Just to clarify the process. Do you mean that a pull request should go
> into
> > the way below?
> >
> > Separated commits during the implement, mainly distinguish feature/bug
> fix
> > with clean up/rework.
> >
> > [FLINK-XXX] [module] future/bug fix...
> > [FLINK-XXX] [module] more on future/bug fix...
> > [hotfix] clean up/rework...
> >
> > and so on.
> >
> > And finally, when get merged, put all stuff into one commit and comments
> > close #PRID
> > so coming ones could see the detail by jump to #PRID.
> >
> > One thing to trade off is that if we merge by one commit, we cannot
> revert
> > part of it automated; if we merge by PR commits(one by one), the commit
> log
> > might mess.
> >
> > Best,
> > tison.
> >
> >
> > Piotr Nowojski wrote on Wed, Sep 19, 2018 at 5:10 PM:
> >
> > > Hi,
> > >
> > > I would like to rise one more issue. Often contributions are very heavy
> > > and difficult to review. They have one big commit that changes multiple
> > > things which is difficult to review. Instead I would be in favour of
> > > implementing a rule, that a single commit should never do more then one
> > > thing. For example never mixing refactoring/renames/clean ups/code
> > > deduplications with functional changes. If implementing something
> > requires
> > > two independent changes in two separate components, those also should
> be
> > > two i

Re: [PROPOSAL] [community] A more structured approach to reviews and contributions

2018-09-19 Thread
Hi Piotr,

I strongly agree with your idea. Since we work on a huge project, the diff in
a PR becomes hard to review if there is too much noise.

Just to clarify the process: do you mean that a pull request should proceed
as follows?

Separate commits during implementation, mainly to distinguish feature/bug fix
commits from clean-up/rework commits.

[FLINK-XXX] [module] feature/bug fix...
[FLINK-XXX] [module] more on feature/bug fix...
[hotfix] clean up/rework...

and so on.
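
A minimal sketch of how the commit subjects above could be checked
mechanically. The regular expressions and function names are my own
assumptions for illustration, not an existing Flink tool:

```python
import re
from collections import defaultdict

# Assumed subject shapes, based on the examples above:
#   "[FLINK-1234] [module] Summary"  -> feature / bug fix commit
#   "[hotfix] Summary"               -> clean-up / rework commit
_TICKET = re.compile(r"^\[FLINK-(\d+)\]\s*\[([^\]]+)\]\s+\S")
_HOTFIX = re.compile(r"^\[hotfix\]\s+\S")


def classify_commit(subject):
    """Return "ticket", "hotfix" or "unknown" for one commit subject line."""
    if _TICKET.match(subject):
        return "ticket"
    if _HOTFIX.match(subject):
        return "hotfix"
    return "unknown"


def group_commits(subjects):
    """Group the commit subjects of a PR by kind, keeping their order."""
    groups = defaultdict(list)
    for subject in subjects:
        groups[classify_commit(subject)].append(subject)
    return dict(groups)
```

A reviewer could use the "hotfix" group to read the clean-up commits first and
the "ticket" group afterwards.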

And finally, when the PR gets merged, squash everything into one commit with
the comment "close #PRID", so that later readers can see the details by
jumping to #PRID.

One trade-off is that if we merge as a single commit, we cannot automatically
revert part of it; if we merge the PR commits one by one, the commit log
might become messy.

Best,
tison.


Piotr Nowojski wrote on Wed, Sep 19, 2018 at 5:10 PM:

> Hi,
>
> I would like to raise one more issue. Often contributions are very heavy
> and difficult to review. They have one big commit that changes multiple
> things, which is difficult to review. Instead I would be in favour of
> implementing a rule that a single commit should never do more than one
> thing. For example, never mix refactoring/renames/clean-ups/code
> deduplication with functional changes. If implementing something requires
> two independent changes in two separate components, those also should be
> two independent commits. In other words, if there are two changed lines in
> a single commit, they should interact somehow and strictly depend
> on one another. This has a couple of important advantages:
>
> 1. Obviously faster reviews, especially if a reviewer does not have to find
> a 2-line bug fix among 200 lines of renames/refactoring.
> 2. Provides good “cut out” points for the reviewer. For example, he can easily
> interrupt reviewing in the middle and continue later, or even merge the PR in
> stages.
> 3. A better reference for “why something was done this way and not the other”
> in the future. This is the same argument as the first point, but with the
> benefit coming not during review but after merging, when someone is trying to
> understand the code.
> 4. The commit message becomes a much better place to write down the reasons
> why something was done and what the effects are (not that this should replace
> comments/documentation, only complement them).
> 5. If we need to revert/drop some part of the contribution, we are
> not losing all of it. If we have to revert some small feature, it would be
> easier to keep the refactoring and clean-ups.
>
>
> Some examples of PRs that were more or less following this rule:
> https://github.com/apache/flink/pull/6692/commits
> https://github.com/apache/flink/pull/5423/commits (a bit extreme)
>
> If someone is not convinced, I encourage them to open those PRs and browse
> through the first couple of commits (which are refactoring/clean-up commits)
> one by one (GitHub has a next/prev commit button). Then imagine if they were
> squashed with some functional/performance improvement changes.
>
> Piotrek
>
> > On 18 Sep 2018, at 17:12, 陈梓立  wrote:
> >
> > Put some good cases here might be helpful.
> >
> > See how a contribution of runtime module be proposed, discussed,
> > implemented and merged from  https://github.com/apache/flink/pull/5931
> to
> > https://github.com/apache/flink/pull/6132.
> >
> > 1. #5931 fix a bug, but remains points could be improved. Here sihuazhou
> > and shuai-xu share their considerations and require review(of the
> proposal)
> > by Stephan, Till and Gary, our committers.
> > 2. After discussion, all people involved reach a consensus. So the rest
> > work is to implement it.
> > 3. sihuazhou gives out an implementation #6132, Till reviews it and find
> it
> > is somewhat out of the "architectural" aspect, so suggests
> > implementation-level changes.
> > 4. Addressing those implementation-level comments, the PR gets merged.
> >
> > I think this is quite a good example how we think our review process
> should
> > go.
> >
> > Best,
> > tison.
> >
> >
> > 陈梓立 wrote on Tue, Sep 18, 2018 at 10:53 PM:
> >
> >> Maybe a little rearrange to the process would help.
> >>
> >> (1). Does the contributor describe itself well?
> >>  (1.1) By whom this contribution should be given attention. This often
> >> shows by its title, "[FLINK-XXX] [module]", the module part infer.
> >>  (1.2) What the purpose of this contribution is. Done by the PR
> template.
> >> Even on JIRA an issue should cover these points.
> >>
> >> (2). Is there consensus on the contribution?
> >> This follows (1), because we need to clear wh

Re: [PROPOSAL] [community] A more structured approach to reviews and contributions

2018-09-18 Thread
Putting some good cases here might be helpful.

See how a contribution to the runtime module was proposed, discussed,
implemented and merged, from https://github.com/apache/flink/pull/5931 to
https://github.com/apache/flink/pull/6132.

1. #5931 fixes a bug, but some points remain that could be improved. Here
sihuazhou and shuai-xu share their considerations and request a review (of the
proposal) by Stephan, Till and Gary, our committers.
2. After discussion, everyone involved reaches a consensus, so the remaining
work is to implement it.
3. sihuazhou provides an implementation, #6132. Till reviews it and finds that
the remaining issues are somewhat outside the "architectural" aspect, so he
suggests implementation-level changes.
4. After those implementation-level comments are addressed, the PR gets merged.

I think this is quite a good example of how we think our review process should
go.

Best,
tison.


陈梓立 wrote on Tue, Sep 18, 2018 at 10:53 PM:

> Maybe a little rearrange to the process would help.
>
> (1). Does the contributor describe itself well?
>   (1.1) By whom this contribution should be given attention. This often
> shows by its title, "[FLINK-XXX] [module]", the module part infer.
>   (1.2) What the purpose of this contribution is. Done by the PR template.
> Even on JIRA an issue should cover these points.
>
> (2). Is there consensus on the contribution?
> This follows (1), because we need to clear what the purpose of the
> contribution first. At this stage reviewers could cc to module maintainer
> as a supplement to (1.1). Also reviewers might ask the contributor to
> clarify his purpose to sharp(1.2)
>
> (3). Is the implement architectural and fit code style?
> This follows (2). And only after a consensus we talk about concrete
> implement, which prevent spend time and put effort in vain.
>
> In addition, ideally a "+1" comment or approval means the purpose of
> contribution is supported by the reviewer and implement(if there is)
> quality is fine, so the reviewer vote for a consensus.
>
> Best,
> tison.
>
>
> Stephan Ewen wrote on Tue, Sep 18, 2018 at 6:44 PM:
>
>> On the template discussion, some thoughts
>>
>> *PR Template*
>>
>> I think the PR template went well. We can rethink the "checklist" at the
>> bottom, but all other parts turned out helpful in my opinion.
>>
>> With the amount of contributions, it helps to ask the contributor to take
>> a
>> little more work in order for the reviewer to be more efficient.
>> I would suggest to keep that mindset: Whenever we find a way that the
>> contributor can prepare stuff in such a way that reviews become
>> more efficient, we should do that. In my experience, most contributors are
>> willing to put in some extra minutes if it helps that their
>> PR gets merged faster.
>>
>> *Review Template*
>>
>> I think it would be helpful to have this checklist. It does not matter in
>> which form, be that as a text template, be that as labels.
>>
>> The most important thing is to make explicit which questions have been
>> answered in the review.
>> Currently there is a lot of "+1" on pull requests which means "code
>> quality
>> is fine", but all other questions are unanswered.
>> The contributors then rightfully wonder why this does not get merged.
>>
>>
>>
>> On Tue, Sep 18, 2018 at 7:26 AM, 陈梓立  wrote:
>>
>> > Hi all interested,
>> >
>> > Within the document there is a heated discussion about how the PR
>> > template/review template should be.
>> >
>> > Here share my opinion:
>> >
>> > 1. For the review template, actually we don't need comment a review
>> > template at all. GitHub has a tag system and only committer could add
>> tags,
>> > which we can make use of it. That is, tagging this PR is
>> > waiting-for-proposal-approved, waiting-for-code-review,
>> > waiting-for-benchmark or block-by-author and so on. Asfbot could pick
>> > GitHub tag state to the corresponding JIRA and we always regard JIRA as
>> the
>> > main discussion borad.
>> >
>> > 2. For the PR template, the greeting message is redundant. Just
>> emphasize a
>> > JIRA associated is important and how to format the title is enough.
>> > Besides, the "Does this pull request potentially affect one of the
>> > following parts" part and "Documentation" should be coved from "What is
>> the
>> > purpose of the change" and "Brief change log". These two parts, users
>> > always answer no and would be aware if they really make changes on it.
>> As
>> > example, even pull request requires do

Re: [PROPOSAL] [community] A more structured approach to reviews and contributions

2018-09-18 Thread
Maybe a little rearrangement of the process would help.

(1). Does the contribution describe itself well?
  (1.1) Who should pay attention to this contribution. This is often shown by
its title, "[FLINK-XXX] [module]", where the module part indicates it.
  (1.2) What the purpose of this contribution is. This is covered by the PR
template. Even on JIRA, an issue should cover these points.

(2). Is there consensus on the contribution?
This follows (1), because we need to be clear about the purpose of the
contribution first. At this stage reviewers could cc the module maintainer
as a supplement to (1.1). Reviewers might also ask the contributor to
clarify the purpose, to sharpen (1.2).

(3). Is the implementation architecturally sound, and does it fit the code
style?
This follows (2). Only after a consensus do we talk about the concrete
implementation, which prevents spending time and effort in vain.

In addition, ideally a "+1" comment or approval means that the purpose of the
contribution is supported by the reviewer and that the implementation (if
there is one) is of fine quality, so the reviewer votes for a consensus.

Best,
tison.


Stephan Ewen wrote on Tue, Sep 18, 2018 at 6:44 PM:

> On the template discussion, some thoughts
>
> *PR Template*
>
> I think the PR template went well. We can rethink the "checklist" at the
> bottom, but all other parts turned out helpful in my opinion.
>
> With the amount of contributions, it helps to ask the contributor to take a
> little more work in order for the reviewer to be more efficient.
> I would suggest to keep that mindset: Whenever we find a way that the
> contributor can prepare stuff in such a way that reviews become
> more efficient, we should do that. In my experience, most contributors are
> willing to put in some extra minutes if it helps that their
> PR gets merged faster.
>
> *Review Template*
>
> I think it would be helpful to have this checklist. It does not matter in
> which form, be that as a text template, be that as labels.
>
> The most important thing is to make explicit which questions have been
> answered in the review.
> Currently there is a lot of "+1" on pull requests which means "code quality
> is fine", but all other questions are unanswered.
> The contributors then rightfully wonder why this does not get merged.
>
>
>
> On Tue, Sep 18, 2018 at 7:26 AM, 陈梓立  wrote:
>
> > Hi all interested,
> >
> > Within the document there is a heated discussion about how the PR
> > template/review template should be.
> >
> > Here share my opinion:
> >
> > 1. For the review template, actually we don't need comment a review
> > template at all. GitHub has a tag system and only committer could add
> tags,
> > which we can make use of it. That is, tagging this PR is
> > waiting-for-proposal-approved, waiting-for-code-review,
> > waiting-for-benchmark or block-by-author and so on. Asfbot could pick
> > GitHub tag state to the corresponding JIRA and we always regard JIRA as
> the
> > main discussion borad.
> >
> > 2. For the PR template, the greeting message is redundant. Just
> emphasize a
> > JIRA associated is important and how to format the title is enough.
> > Besides, the "Does this pull request potentially affect one of the
> > following parts" part and "Documentation" should be coved from "What is
> the
> > purpose of the change" and "Brief change log". These two parts, users
> > always answer no and would be aware if they really make changes on it. As
> > example, even pull request requires document, its owner might no add it
> at
> > first. The PR template is a guide but not which one have to learn.
> >
> > To sum up, (1) take advantage of GitHub's tag system to tag review
> progress
> > (2) make the template more concise to avoid burden mature contributors
> and
> > force new comer to learn too much.
> >
> > Best,
> > tison.
> >
> >
> > Rong Rong wrote on Tue, Sep 18, 2018 at 7:05 AM:
> >
> > > Thanks for putting the review contribution doc together, Stephan! This
> > will
> > > definitely help the community to make the review process better.
> > >
> > > From my experience this will benefit on both contributors and reviewers
> > > side! Thus +1 for putting into practice as well.
> > >
> > > --
> > > Rong
> > >
> > > On Mon, Sep 17, 2018 at 10:18 AM Stephan Ewen 
> wrote:
> > >
> > > > Hi!
> > > >
> > > > Thanks you for the encouraging feedback so far.
> > > >
> > > > The overall goal is definitely to make the contribution process
> better
> > > and
> > > > get fewer pull 

Re: [PROPOSAL] [community] A more structured approach to reviews and contributions

2018-09-17 Thread
Hi all interested,

Within the document there is a heated discussion about what the PR
template/review template should look like.

Here is my opinion:

1. For the review template, we actually don't need to post a review
template as a comment at all. GitHub has a tag system, and only committers
can add tags, which we can make use of. That is, tag a PR as
waiting-for-proposal-approved, waiting-for-code-review,
waiting-for-benchmark, block-by-author and so on. Asfbot could sync the
GitHub tag state to the corresponding JIRA issue, and we always regard JIRA
as the main discussion board.

2. For the PR template, the greeting message is redundant. Just emphasizing
that an associated JIRA issue is important and how to format the title is
enough. Besides, the "Does this pull request potentially affect one of the
following parts" part and the "Documentation" part should be covered by "What
is the purpose of the change" and "Brief change log". For these two parts,
users almost always answer no, and they would be aware if they really made
changes there. As an example, even if a pull request requires documentation,
its owner might not add it at first. The PR template is a guide, not
something one has to learn.

To sum up: (1) take advantage of GitHub's tag system to tag review progress;
(2) make the template more concise, to avoid burdening mature contributors
and forcing newcomers to learn too much.
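
A minimal sketch of the tagging idea. The tag names are the ones listed above;
the PullRequest class is a stand-in rather than the GitHub API, and keeping
only one review tag per PR at a time is my own assumption:

```python
from dataclasses import dataclass, field

# Review-progress tags named in point 1 above.
REVIEW_LABELS = (
    "waiting-for-proposal-approved",
    "waiting-for-code-review",
    "waiting-for-benchmark",
    "block-by-author",
)


@dataclass
class PullRequest:
    """Stand-in for a pull request; not a GitHub API object."""
    number: int
    labels: set = field(default_factory=set)


def set_review_label(pr, label):
    """Move a PR to one review tag, dropping any previous review tag."""
    if label not in REVIEW_LABELS:
        raise ValueError(f"unknown review label: {label}")
    pr.labels.difference_update(REVIEW_LABELS)  # other tags are kept
    pr.labels.add(label)


def filter_by_label(prs, label):
    """Return the numbers of all PRs that currently carry the given tag."""
    return [pr.number for pr in prs if label in pr.labels]
```

Filtering by tag is what makes the review state visible on the PR list
without opening each pull request.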

Best,
tison.


Rong Rong wrote on Tue, Sep 18, 2018 at 7:05 AM:

> Thanks for putting the review contribution doc together, Stephan! This will
> definitely help the community to make the review process better.
>
> From my experience this will benefit on both contributors and reviewers
> side! Thus +1 for putting into practice as well.
>
> --
> Rong
>
> On Mon, Sep 17, 2018 at 10:18 AM Stephan Ewen  wrote:
>
> > Hi!
> >
> > Thank you for the encouraging feedback so far.
> >
> > The overall goal is definitely to make the contribution process better
> and
> > get fewer pull requests that are disregarded.
> >
> > There are various reasons for the disregarded pull requests, one being
> that
> > fewer committers really participate in reviews beyond
> > the component they are currently very involved with. This is a separate
> > issue and I am thinking on how to encourage more
> > activity there.
> >
> > The other reason was lack of structure and lack of decision making, which
> > is what I am first trying to fix here.
> > A follow-up to this will definitely be to improve the contribution guide
> as
> > well.
> >
> > Best,
> > Stephan
> >
> >
> > On Mon, Sep 17, 2018 at 12:05 PM, Zhijiang(wangzhijiang999) <
> > wangzhijiang...@aliyun.com.invalid> wrote:
> >
> > > From my personal experience as a contributor for three years, I feel
> > > the experience of contributing and reviewing is better than before,
> > > although there is still room for further progress.
> > >
> > > I reviewed the proposal doc, and it gives very constructive and
> > > meaningful guidance which could help both contributors and reviewers.
> > > I agree with the above suggestions and hope they will be put into
> > > practice well!
> > >
> > > Best,
> > > Zhijiang
> > > --
> > > From: Till Rohrmann
> > > Sent: Monday, September 17, 2018, 16:27
> > > To: dev
> > > Subject: Re: [PROPOSAL] [community] A more structured approach to reviews and
> > > contributions
> > >
> > > Thanks for writing this up Stephan. I like the steps and hope that it
> > will
> > > help the community to make the review process better. Thus, +1 for
> > putting
> > > your proposal to practice.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Mon, Sep 17, 2018 at 10:00 AM Stephan Ewen 
> wrote:
> > >
> > > > Hi Flink community members!
> > > >
> > > > As many of you will have noticed, the Flink project activity has gone
> > up
> > > > again quite a bit.
> > > > There are many more contributions, which is an absolutely great thing
> > to
> > > > have :-)
> > > >
> > > > However, we see a continuously growing backlog of pull requests and
> > JIRA
> > > > issues.
> > > > To make sure the community will be able to handle the increased
> > volume, I
> > > > think we need to revisit some
> > > > approaches and processes. I believe there are a few opportunities to
> > > > structure things a bit better, which
> > > > should help to scale the development.
> > > >
> > > > The first thing I would like to bring up are *Pull Request Reviews*.
> > Even
> > > > though more community members being
> > > > active in reviews (which is a really great thing!) the Pull Request
> > > backlog
> > > > is increasing quite a bit.
> > > >
> > > > Why are pull requests still not merged faster? Looking at the
> reviews,
> > > one
> > > > thing I noticed is that most reviews deal
> > > > immediately with detailed code issues, and leave out most of the core
> > > > questions that need to be answered
> > > > before a Pull Request can be merged, like "is this a desired
> feature?"
> > or
> > > > "does this align well with other developments?".
> > > > I think that we even make things

Re: [PROPOSAL] [community] A more structured approach to reviews and contributions

2018-09-17 Thread
Hi Stephan,

Thanks for raising this discussion and for the previous work! I
strongly support the effort to improve the process of contributions and
reviews.

As you mentioned above, the Flink community responds to contributions
somewhat inefficiently, in terms of both latency and quality. In my opinion
the most important things are to make sure that reviews of contributions are
supervised, and to push a contribution forward stage by stage. The latter is
similar to what you mentioned in the document.

Here are my observations and suggestions on these themes.

1. The document says that some contributions need attention from some
specific committers. I definitely agree with that. In fact, under our
current process, every pull request is ideally named in the format
"[FLINK-XXX] [module] Title". The module part indicates that in some ways.
However, there have been times when I made a contribution and had no idea
whom I could cc. Take a look at the Scala repo
( https://github.com/scala/scala/ ). It lists which committers maintain each
module, which I think is really helpful for new contributors.

2. I find that quite a number of Flink pull requests are inactive because no
one comments on or takes care of them. Whatever the reason, it is quite a bad
experience. We surely reject some contributions, but we should never
disregard them. Hence the idea that we always initially assign a committer to
a new contribution ( ideally, since we might not have that many committers
yet ). They are initially in charge of the contribution and take the
responsibility to nudge it. Of course one can reassign it to another person
who is a better fit, since the main point here is to ensure that
contributions are supervised.

3. The document introduces five stages of a review. Taking point 2 into
consideration, I'd like to add two more stages: "waiting-for-review", which
means no one has commented yet, and "stall", which means the pull request
cannot go ahead for some reason, such as the contributor not responding. Note
that a "stall" pull request is not rejected.
With these in place, as Fabian Hueske comments in the document, we can
implement a bot to do such work. GitHub supports tagging pull requests,
which meets our requirement of marking the stage of a PR. The tags we would
support include the five stages in the document and the two additional stages
above. Since a PR is closed by being rejected or merged, and must come with a
conclusion in the comments, we don't need a tag for that.
The Rust-lang team implements a bot (
https://github.com/rust-lang-nursery/highfive ) which does these things
amazingly well. See how https://github.com/rust-lang/rust/pulls works.

To sum up, I suggest that we (1) introduce our committers and the modules
they maintain/are familiar with somewhere easy to find, (2) make sure that
every contribution is supervised, and (3) tag pull requests so that the
process goes more smoothly.
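
A minimal sketch of suggestions (1) and (2): look up who should initially
supervise a pull request from the module part of its title. The maintainer
table, the fallback entry and the function names are hypothetical
placeholders, not real assignments or an existing bot:

```python
import re

# Title format discussed above: "[FLINK-XXX] [module] Title".
_TITLE = re.compile(r"^\[FLINK-[^\]]+\]\s*\[([^\]]+)\]")

# Hypothetical module -> committer table; the names are placeholders.
MAINTAINERS = {
    "runtime": ["committer-a", "committer-b"],
    "docs": ["committer-c"],
}
# Hypothetical fallback so that no contribution is left unsupervised.
FALLBACK = ["committer-on-duty"]


def module_of(title):
    """Return the lower-cased module part of a PR title, or None."""
    match = _TITLE.match(title)
    return match.group(1).strip().lower() if match else None


def initial_assignees(title, maintainers=MAINTAINERS, fallback=FALLBACK):
    """Pick who initially supervises a new PR; they may reassign it later."""
    return maintainers.get(module_of(title), fallback)
```

Publishing a table like MAINTAINERS somewhere easy to find would also answer
the "whom could I cc" question for new contributors.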

Best,
tison.


Till Rohrmann wrote on Mon, Sep 17, 2018 at 4:27 PM:

> Thanks for writing this up Stephan. I like the steps and hope that it will
> help the community to make the review process better. Thus, +1 for putting
> your proposal to practice.
>
> Cheers,
> Till
>
> On Mon, Sep 17, 2018 at 10:00 AM Stephan Ewen  wrote:
>
> > Hi Flink community members!
> >
> > As many of you will have noticed, the Flink project activity has gone up
> > again quite a bit.
> > There are many more contributions, which is an absolutely great thing to
> > have :-)
> >
> > However, we see a continuously growing backlog of pull requests and JIRA
> > issues.
> > To make sure the community will be able to handle the increased volume, I
> > think we need to revisit some
> > approaches and processes. I believe there are a few opportunities to
> > structure things a bit better, which
> > should help to scale the development.
> >
> > The first thing I would like to bring up are *Pull Request Reviews*. Even
> > though more community members being
> > active in reviews (which is a really great thing!) the Pull Request
> backlog
> > is increasing quite a bit.
> >
> > Why are pull requests still not merged faster? Looking at the reviews, one
> > thing I noticed is that most reviews deal immediately with detailed code
> > issues, and leave out most of the core questions that need to be answered
> > before a Pull Request can be merged, like "is this a desired feature?" or
> > "does this align well with other developments?".
> > I think that we even make things slightly worse that way: From my personal
> > experience, I have often thought "oh, this PR has a review already" and
> > rather looked at another PR, only to find later that the first review never
> > decided whether this PR is actually a good fit for Flink.
> >
> > There has never been proper documentation of how to answer these
> > questions, what to evaluate in reviews, or guidelines for how to evaluate
> > pull requests beyond code quality. I suspect that this is why so many
> > reviewers do not address the "is this a good contribution" questions,
> > making pull requests linger until another committer joins
> > th

Re: [ANNOUNCE] New committer Gary Yao

2018-09-07 Thread
Congratulations, Gary!

Best,
tison.


On Fri, Sep 7, 2018 at 9:15 PM Till Rohrmann wrote:

> Hi everybody,
>
> On behalf of the PMC I am delighted to announce Gary Yao as a new Flink
> committer!
>
> Gary started contributing to the project in June 2017. He helped with the
> Flip-6 implementation, implemented many of the new REST handlers, fixed
> Mesos issues and initiated the Jepsen-based distributed test suite which
> uncovered several serious issues. Moreover, he actively helps community
> members on the mailing list and with PR reviews.
>
> Please join me in congratulating Gary for becoming a Flink committer!
>
> Cheers,
> Till
>


Re: [Proposal] Utilities for reading, transforming and creating Streaming savepoints

2018-08-17 Thread
Hi,

also +1.
As vino said, savepoints are not compatible with every version of Flink.
I have heard a lot of users say "my previous program does not work any
more!".
If these utilities provide such migration functions, it would be perfect!

Best,
tison.


On Sat, Aug 18, 2018 at 11:04 AM vino yang wrote:

> Hi,
>
> +1 from my side, considering that savepoints are not compatible with every
> version of Flink.
> It would be very useful if it could do some conversion between savepoints
> of different Flink versions.
>
> Thanks, vino.
>
> On Sat, Aug 18, 2018 at 2:55 AM bupt_ljy wrote:
>
> > Hi,
> > +1, I think it will be a great tool for Flink, especially the part about
> > creating new state. In production, we’re really worried about the
> > availability of the savepoints, because the generating logic is inside
> > Flink and we don’t have a good way to validate it. But with this tool, we
> > can construct new state for our programs very quickly even if the
> > savepoint data is broken.
> > It’s great, thanks!
> >
> >
> > Original Message
> > Sender: Jamie grier jgr...@lyft.com
> > Recipient: dev...@flink.apache.org
> > Date: Saturday, Aug 18, 2018 02:32
> > Subject: Re: [Proposal] Utilities for reading, transforming and
> > creating Streaming savepoints
> >
> >
> > This is great, Gyula! A colleague here at Lyft has also done some work
> > around bootstrapping DataStream programs and we've also talked a bit about
> > doing this by running DataSet programs.
> >
> > On Fri, Aug 17, 2018 at 3:28 AM, Gyula Fóra gyula.f...@gmail.com wrote:
> >
> > > Hi All!
> > >
> > > I want to share with you a little project we have been working on at
> > > King (with some help from some dataArtisans folks). I think this would
> > > be a valuable addition to Flink and solve a bunch of outstanding
> > > production use-cases and headaches around state bootstrapping and state
> > > analytics.
> > >
> > > We have built a quick and dirty POC implementation on top of Flink 1.6,
> > > please check the README for some nice examples to get a quick idea:
> > >
> > > https://github.com/king/bravo
> > >
> > > *Short story*
> > > Bravo is a convenient state reader and writer library leveraging the
> > > Flink’s batch processing capabilities. It supports processing and
> > > writing Flink streaming savepoints. At the moment it only supports
> > > processing RocksDB savepoints but this can be extended in the future
> > > for other state backends and checkpoint types.
> > >
> > > Our goal is to cover a few basic features:
> > >
> > > - Converting keyed states to Flink DataSets for processing and analytics
> > > - Reading/Writing non-keyed operators states
> > > - Bootstrap keyed states from Flink DataSets and create new valid
> > >   savepoints
> > > - Transform existing savepoints by replacing/changing some states
> > >
> > > Some example use-cases:
> > >
> > > - Point-in-time state analytics across all operators and keys
> > > - Bootstrap state of a streaming job from external resources such as
> > >   reading from database/filesystem
> > > - Validate and potentially repair corrupted state of a streaming job
> > > - Change max parallelism of a job
> > >
> > > Our main goal is to start working together with other Flink production
> > > users and make this something useful that can be part of Flink. So if
> > > you have use-cases please talk to us :)
> > >
> > > I have also started a google doc which contains a little bit more info
> > > than the readme and could be a starting place for discussions:
> > >
> > > https://docs.google.com/document/d/103k6wPX20kMu5H3SOOXSg5PZIaYpwdhqBMr-ppkFL5E/edit?usp=sharing
> > >
> > > I know there are a bunch of rough edges and bugs (and no tests) but our
> > > motto is: If you are not embarrassed, you released too late :)
> > >
> > > Please let me know what you think!
> > >
> > > Cheers,
> > > Gyula
>


Re: [DISCUSS] Improve Tutorials section of documentation

2018-08-09 Thread
Hi Fabian,

+1 for improving the tutorials. It's a nice idea to distinguish users by
their goals.
One thing I suggest is that we list the newly planned content, which
would make the discussion clearer.

FYI, the current structure is (from where I stand):

```
> HOME

- Concepts
  - Programming Model
  - Distributed Runtime
- *Quickstart*
- *Examples*
  - Overview
  - Monitoring the Wikipedia Edit
  - Batch Example

- *Project Setup*
  - Project Template for Java
  - Project Template for Scala
  - Configuring Dependencies, Connectors, Libraries
  - IDE Setup
  - Scala REPL
  - Running Flink on Windows
  - Building Flink from Source
- Application Development
  - ...
- Deployment & Operations
  - ...
- Debugging & Monitoring
  - ...

- Internals
  - ...
```

On Thu, Aug 9, 2018 at 11:29 PM Aljoscha Krettek wrote:

> +1
>
> I think this moves us in the direction of having a more hands-on
> tutorials section where we don't explain all the details and a reference
> section where we provide details but don't necessarily spell out a full
> step-by-step case.
>
> > On 9. Aug 2018, at 14:44, Fabian Hueske  wrote:
> >
> > Hi everyone,
> >
> > I'd like to discuss a proposal to improve the tutorials / quickstart guides
> > of Flink's documentation.
> > I think the current tutorials have a few issues that should be fixed in
> > order to help our (future) users get started with Flink.
> >
> > I propose to add a single "Tutorials" section to the documentation where
> > users find step-by-step guides. The tutorials section helps users with
> > different goals:
> >
> >  * Get a quick idea of the overall system
> >  * Implement a DataStream/DataSet/Table API/SQL job
> >  * Set up Flink on a local machine (or run a Docker container)
> >
> > For some of these goals, we do not offer tutorials yet. Our existing
> > tutorials are mixed with instructions for how to set up an environment to
> > develop Flink itself ("IDE setup", "Building Flink from Source"), and
> > reference information that is required to implement applications
> > ("Configuring Dependencies, Connectors, Libraries", Project Templates
> > Java/Scala).
> >
> > As a first step, I would like to reorganize the content of the
> > "Quickstart", "Examples", and "Project Setup" sections of the documentation
> > depending on the goals of the users (getting started, reference lookup,
> > developing Flink). So, this would be mostly moving content around.
> >
> > In a second step, I would improve existing tutorials (Implementing
> > DataStream applications, Local Setup) and add missing tutorials (Local
> > Docker setup, Implementing DataSet / Table API / SQL applications, etc.).
> >
> > What do you think?
> >
> > Cheers, Fabian
>
>


Re: [HELP] Bisect

2018-08-02 Thread
I think I found it at
https://github.com/apache/flink/commit/4e616a8362744c15a71e0e57ad68fbb52266c837
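
For anyone repeating this kind of search: on a linear history, the idea behind
bisecting is a binary search between a known-good and a known-bad commit. The
sketch below only illustrates that idea; the commit names and the `is_bad`
check (for example "build, run the job, and see whether the log is noisy") are
hypothetical, and it assumes that every commit after the first bad one is bad
as well.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Find the first bad commit in a history ordered oldest to newest.

    Assumes commits[0] is known good, commits[-1] is known bad, and the
    history switches from good to bad exactly once.
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the first bad commit is at mid or earlier
        else:
            lo = mid  # the first bad commit is after mid
    return commits[hi]


if __name__ == "__main__":
    # Hypothetical history c0..c7 where the noise starts at c5.
    history = [f"c{i}" for i in range(8)]
    print(first_bad(history, lambda c: int(c[1:]) >= 5))
```

Each check halves the remaining range, so even a few hundred commits between
two tags need fewer than ten builds.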

Thanks,
tison

On Thu, Aug 2, 2018 at 3:14 PM 陈梓立 wrote:

> Hi devs,
>
> At tag release-1.5.1, I find that the log contains noise like the following.
>
> ```
> 13:58:40,752 INFO
> org.apache.flink.runtime.dispatcher.StandaloneDispatcher  - Job
> a18a72a9c335b5cc964f43075418dedc reached globally terminal state FINISHED.
> 13:58:40,753 INFO  org.apache.flink.runtime.minicluster.MiniCluster
>   - Shutting down Flink Mini Cluster
> 13:58:40,753 INFO
> org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint- Shutting
> down rest endpoint.
> 13:58:40,754 INFO  org.apache.flink.runtime.jobmaster.JobMaster
>   - Stopping the JobMaster for job Flink Java Job at Thu Aug 02
> 13:58:29 CST 2018(a18a72a9c335b5cc964f43075418dedc).
> 13:58:40,754 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor
>   - Stopping TaskExecutor akka://flink/user/taskmanager_0.
> 13:58:40,755 INFO
> org.apache.flink.runtime.state.TaskExecutorLocalStateStoresManager  -
> Shutting down TaskExecutorLocalStateStoresManager.
> 13:58:40,755 INFO  org.apache.flink.runtime.jobmaster.JobMaster
>   - Close ResourceManager connection 58178cf2968f728837df335f4a8a63a5:
> JobManager is shutting down..
> 13:58:40,755 INFO
> org.apache.flink.runtime.dispatcher.StandaloneDispatcher  - Stopping
> dispatcher akka://flink/user/dispatcher65cb755a-d441-48b5-8942-52465b77bae2.
> 13:58:40,755 INFO
> org.apache.flink.runtime.dispatcher.StandaloneDispatcher  - Stopping
> all currently running jobs of dispatcher
> akka://flink/user/dispatcher65cb755a-d441-48b5-8942-52465b77bae2.
> 13:58:40,756 INFO  org.apache.flink.runtime.jobmaster.slotpool.SlotPool
>   - Suspending SlotPool.
> 13:58:40,756 INFO  org.apache.flink.runtime.jobmaster.slotpool.SlotPool
>   - Stopping SlotPool.
> 13:58:40,758 INFO
> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager  -
> Disconnect job manager 
> 93c37c8311c692f1a5b0c7caaa074a3a@akka://flink/user/jobmanager_1
> for job a18a72a9c335b5cc964f43075418dedc from the resource manager.
> 13:58:40,758 INFO
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Closing
> the SlotManager.
> 13:58:40,758 INFO
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  -
> Suspending the SlotManager.
> 13:58:40,759 INFO
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  -
> Unregister TaskManager 8c28415e2960d2f7a49a12cf329e2444 from the
> SlotManager.
> 13:58:40,759 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor
>   - JobManager for job a18a72a9c335b5cc964f43075418dedc with leader id
> 93c37c8311c692f1a5b0c7caaa074a3a lost leadership.
> 13:58:40,759 INFO
> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager  - Slot
> request with allocation id a3f91d383ba2e647c07894a0974af2fb for job
> a18a72a9c335b5cc964f43075418dedc failed.
> org.apache.flink.util.FlinkException: The assigned slot
> 2cd8d254-283d-472c-8ae2-6499e4ba04bf_7 was removed.
> at
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager.removeSlot(SlotManager.java:895)
> ...
> 13:58:40,761 INFO
> org.apache.flink.runtime.rest.handler.legacy.backpressure.StackTraceSampleCoordinator
> - Shutting down stack trace sample coordinator.
> 13:58:40,763 INFO
> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager  - Slot
> request with allocation id 25f5cd03e1bd5654c54c20143fd05a7e for job
> a18a72a9c335b5cc964f43075418dedc failed.
> org.apache.flink.util.FlinkException: The assigned slot
> 2cd8d254-283d-472c-8ae2-6499e4ba04bf_6 was removed.
> at
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager.removeSlot(SlotManager.java:895)
> ...
> 13:58:40,766 INFO
> org.apache.flink.runtime.dispatcher.StandaloneDispatcher  - Stopped
> dispatcher akka://flink/user/dispatcher65cb755a-d441-48b5-8942-52465b77bae2.
> 13:58:40,766 INFO
> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager  - Slot
> request with allocation id 32047761d6066628b196ac13b7b2993e for job
> a18a72a9c335b5cc964f43075418dedc failed.
> org.apache.flink.util.FlinkException: The assigned slot
> 2cd8d254-283d-472c-8ae2-6499e4ba04bf_5 was removed.
> at
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager.removeSlot(SlotManager.java:895)
> ...
>
> (repeat likely)
> 13:58:40,768 INFO
> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager  - Slot
> request with allocation id befc46831f1f0ec5ef54cd660d12aa88 for job
> a18a72a9c335b5cc964f43075418dedc failed.
> org.apache.flink.util.FlinkException: The assigned slot
> 2cd8d254-283d-472c-8ae2-6499e4ba04bf_1 was removed.
> at
> ```

[HELP] Bisect

2018-08-02 Thread
Hi devs,

At tag release-1.5.1, I find that the log contains noise like the following.

```
13:58:40,752 INFO
org.apache.flink.runtime.dispatcher.StandaloneDispatcher  - Job
a18a72a9c335b5cc964f43075418dedc reached globally terminal state FINISHED.
13:58:40,753 INFO  org.apache.flink.runtime.minicluster.MiniCluster
  - Shutting down Flink Mini Cluster
13:58:40,753 INFO
org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint- Shutting
down rest endpoint.
13:58:40,754 INFO  org.apache.flink.runtime.jobmaster.JobMaster
  - Stopping the JobMaster for job Flink Java Job at Thu Aug 02
13:58:29 CST 2018(a18a72a9c335b5cc964f43075418dedc).
13:58:40,754 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor
  - Stopping TaskExecutor akka://flink/user/taskmanager_0.
13:58:40,755 INFO
org.apache.flink.runtime.state.TaskExecutorLocalStateStoresManager  -
Shutting down TaskExecutorLocalStateStoresManager.
13:58:40,755 INFO  org.apache.flink.runtime.jobmaster.JobMaster
  - Close ResourceManager connection 58178cf2968f728837df335f4a8a63a5:
JobManager is shutting down..
13:58:40,755 INFO
org.apache.flink.runtime.dispatcher.StandaloneDispatcher  - Stopping
dispatcher akka://flink/user/dispatcher65cb755a-d441-48b5-8942-52465b77bae2.
13:58:40,755 INFO
org.apache.flink.runtime.dispatcher.StandaloneDispatcher  - Stopping
all currently running jobs of dispatcher
akka://flink/user/dispatcher65cb755a-d441-48b5-8942-52465b77bae2.
13:58:40,756 INFO  org.apache.flink.runtime.jobmaster.slotpool.SlotPool
  - Suspending SlotPool.
13:58:40,756 INFO  org.apache.flink.runtime.jobmaster.slotpool.SlotPool
  - Stopping SlotPool.
13:58:40,758 INFO
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager  -
Disconnect job manager
93c37c8311c692f1a5b0c7caaa074a3a@akka://flink/user/jobmanager_1
for job a18a72a9c335b5cc964f43075418dedc from the resource manager.
13:58:40,758 INFO
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Closing
the SlotManager.
13:58:40,758 INFO
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  -
Suspending the SlotManager.
13:58:40,759 INFO
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  -
Unregister TaskManager 8c28415e2960d2f7a49a12cf329e2444 from the
SlotManager.
13:58:40,759 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor
  - JobManager for job a18a72a9c335b5cc964f43075418dedc with leader id
93c37c8311c692f1a5b0c7caaa074a3a lost leadership.
13:58:40,759 INFO
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager  - Slot
request with allocation id a3f91d383ba2e647c07894a0974af2fb for job
a18a72a9c335b5cc964f43075418dedc failed.
org.apache.flink.util.FlinkException: The assigned slot
2cd8d254-283d-472c-8ae2-6499e4ba04bf_7 was removed.
at
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager.removeSlot(SlotManager.java:895)
...
13:58:40,761 INFO
org.apache.flink.runtime.rest.handler.legacy.backpressure.StackTraceSampleCoordinator
- Shutting down stack trace sample coordinator.
13:58:40,763 INFO
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager  - Slot
request with allocation id 25f5cd03e1bd5654c54c20143fd05a7e for job
a18a72a9c335b5cc964f43075418dedc failed.
org.apache.flink.util.FlinkException: The assigned slot
2cd8d254-283d-472c-8ae2-6499e4ba04bf_6 was removed.
at
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager.removeSlot(SlotManager.java:895)
...
13:58:40,766 INFO
org.apache.flink.runtime.dispatcher.StandaloneDispatcher  - Stopped
dispatcher akka://flink/user/dispatcher65cb755a-d441-48b5-8942-52465b77bae2.
13:58:40,766 INFO
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager  - Slot
request with allocation id 32047761d6066628b196ac13b7b2993e for job
a18a72a9c335b5cc964f43075418dedc failed.
org.apache.flink.util.FlinkException: The assigned slot
2cd8d254-283d-472c-8ae2-6499e4ba04bf_5 was removed.
at
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager.removeSlot(SlotManager.java:895)
...

(repeat likely)
13:58:40,768 INFO
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager  - Slot
request with allocation id befc46831f1f0ec5ef54cd660d12aa88 for job
a18a72a9c335b5cc964f43075418dedc failed.
org.apache.flink.util.FlinkException: The assigned slot
2cd8d254-283d-472c-8ae2-6499e4ba04bf_1 was removed.
at
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager.removeSlot(SlotManager.java:895)
...
13:58:40,784 INFO
org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint- Removing
cache directory
/var/folders/ff/92vml6qd147dncsvvw28qlrcgn/T/flink-web-ui
13:58:40,785 INFO  org.apache.flink.runtime.io.disk.iomanager.IOManager
  - I/O manager removed spill file directory
/var/folders/ff/92vml6qd147dncsvvw28qlrcgn/T/flink-io-22be945e-daeb-49e9-ab3b-22521e9d2280
13:58:40,785 INFO  org.apache.flink.runtime.io.network.NetworkEnvironment
  - Shutting down the network environment and its components.
13:
```

Re: Intro

2018-07-24 Thread
Hi Kevin,

Welcome!
To add to vino's reply, Flink has a user list, u...@flink.apache.org, which
you can subscribe to by sending an e-mail to user-subscr...@flink.apache.org

Best,
tison.

On Wed, Jul 25, 2018 at 10:01 AM vino yang wrote:

> Hi Kevin,
>
> Welcome, I think you should read the official documentation to learn more
> about the Flink community.[1]
>
> And you can read this documentation to learn how to contribute.[2]
>
> [1]: http://flink.apache.org/community.html
> [2]: http://flink.apache.org/how-to-contribute.html
>
> Thanks, vino.
>
> 2018-07-25 9:55 GMT+08:00 Kevin Waterson :
>
> > Hi Dev,
> >
> > New to Flink, coming from Metron use.
> > Looking forward to using, and getting involved in the documentation of
> > Flink.
> > Who heads up the docs?
> > Is there another list or use the user or community lists?
> >
> > Thanks,
> > Kev
> >
>


Re: What is the way to prove performance improvement?

2018-07-18 Thread
Hi Piotrek,

Thanks for your reply! These improvements are inspired by some batch table
tasks outside flink/master. Let me take a look at what you linked to and see
whether I can define a proper benchmark.
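
To make the idea concrete, here is a minimal sketch of what such a comparison
measures. It is not jmh and not part of flink-benchmarks; it only uses
Python's standard `timeit` module, and `baseline` / `optimized` are
hypothetical stand-ins for the code before and after a change. The harness
first checks that both versions agree on the result, then reports the median
time per call.

```python
import timeit
from statistics import median
from typing import Callable, Dict


def baseline(n: int) -> int:
    """Hypothetical 'before' version: sum 0..n-1 with a loop."""
    total = 0
    for i in range(n):
        total += i
    return total


def optimized(n: int) -> int:
    """Hypothetical 'after' version: closed-form sum."""
    return n * (n - 1) // 2


def compare(
    candidates: Dict[str, Callable[[int], int]],
    arg: int,
    repeat: int = 5,
    number: int = 200,
) -> Dict[str, float]:
    """Return the median seconds per call for each candidate."""
    expected = None
    timings: Dict[str, float] = {}
    for name, fn in candidates.items():
        result = fn(arg)
        if expected is None:
            expected = result
        elif result != expected:
            # A faster version that returns something else is not an improvement.
            raise AssertionError(f"{name} disagrees with the first candidate")
        runs = timeit.repeat(lambda: fn(arg), repeat=repeat, number=number)
        timings[name] = median(runs) / number
    return timings


if __name__ == "__main__":
    for name, seconds in compare(
        {"baseline": baseline, "optimized": optimized}, 10_000
    ).items():
        print(f"{name}: {seconds * 1e6:.2f} us/call")
```

Reporting the setup (input size, repeat count, machine) together with the
numbers lets others rerun the same comparison and check the claim.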

Best,
tison.

On Wed, Jul 18, 2018 at 4:45 PM Piotr Nowojski wrote:

> Hi,
>
> There is a small set of benchmarks defined in
> https://github.com/dataArtisans/flink-benchmarks , however their scope
> is limited and after briefly looking at your PRs, I wouldn’t expect them to
> cover your cases. However, if you could define a jmh micro benchmark there
> to cover your cases, that would be nice. It would be a shame if someone
> accidentally reverted or broke your improvements in the future.
>
> Piotrek
>
> > On 18 Jul 2018, at 08:52, 陈梓立  wrote:
> >
> > Hi Till,
> >
> > Thanks for your reply! I will try to add some later.
> >
> > Best,
> > tison.
> >
> > On Wed, Jul 18, 2018 at 2:48 PM Till Rohrmann wrote:
> >
> >> Hi Tison,
> >>
> >> at the moment there is no formal way to verify performance improvements.
> >> What you can do is to provide your measurements by adding the graphs to
> >> the PR thread and specifying the setup. Then others could try to verify
> >> these numbers by running their own benchmark.
> >>
> >> Cheers,
> >> Till
> >>
> >> On Wed, Jul 18, 2018 at 1:34 AM 陈梓立  wrote:
> >>
> >>> Hi all,
> >>>
> >>> Recently I opened 3 PRs about performance improvements[1][2][3]. Unit
> >>> tests will verify their correctness, and in the real scenario, we have a
> >>> benchmark report to confirm that they do help performance.
> >>>
> >>> I wonder what the formal way to verify a performance improvement is. Is
> >>> it to give out a benchmark report, or run a standard benchmark, or add a
> >>> performance test (I don't know how to do it), or anything else?
> >>>
> >>> Looking forward to your reply.
> >>>
> >>> Best,
> >>> tison.
> >>>
> >>> [1] https://github.com/apache/flink/pull/6339
> >>> [2] https://github.com/apache/flink/pull/6345
> >>> [3] https://github.com/apache/flink/pull/6353
> >>>
> >>
>
>


Re: What is the way to prove performance improvement?

2018-07-17 Thread
Hi Till,

Thanks for your reply! I will try to add some later.

Best,
tison.

On Wed, Jul 18, 2018 at 2:48 PM Till Rohrmann wrote:

> Hi Tison,
>
> at the moment there is no formal way to verify performance improvements. What
> you can do is to provide your measurements by adding the graphs to the PR
> thread and specifying the setup. Then others could try to verify these numbers
> by running their own benchmark.
>
> Cheers,
> Till
>
> On Wed, Jul 18, 2018 at 1:34 AM 陈梓立  wrote:
>
> > Hi all,
> >
> > Recently I opened 3 PRs about performance improvements[1][2][3]. Unit tests
> > will verify their correctness, and in the real scenario, we have a
> > benchmark report to confirm that they do help performance.
> >
> > I wonder what the formal way to verify a performance improvement is. Is
> > it to give out a benchmark report, or run a standard benchmark, or add a
> > performance test (I don't know how to do it), or anything else?
> >
> > Looking forward to your reply.
> >
> > Best,
> > tison.
> >
> > [1] https://github.com/apache/flink/pull/6339
> > [2] https://github.com/apache/flink/pull/6345
> > [3] https://github.com/apache/flink/pull/6353
> >
>


What is the way to prove performance improvement?

2018-07-17 Thread
Hi all,

Recently I opened 3 PRs about performance improvements[1][2][3]. Unit tests
will verify their correctness, and in the real scenario, we have a benchmark
report to confirm that they do help performance.

I wonder what the formal way to verify a performance improvement is. Is it
to give out a benchmark report, or run a standard benchmark, or add a
performance test (I don't know how to do it), or anything else?

Looking forward to your reply.

Best,
tison.

[1] https://github.com/apache/flink/pull/6339
[2] https://github.com/apache/flink/pull/6345
[3] https://github.com/apache/flink/pull/6353


Register

2018-07-11 Thread
Register