Towards a spec for robust streaming SQL, Part 2

2017-07-24 Thread Tyler Akidau
Hello Flink, Calcite, and Beam dev lists!

Linked below is the second document I promised way back in April regarding
a collaborative spec for streaming SQL in Beam/Calcite/Flink (& apologies
for the delay; I thought I was nearly done a while back and then temporal
joins expanded to something much larger than expected).

To repeat what it says in the doc, my hope is that it can serve various
purposes over its lifetime:

   - A discussion ground for ironing out any remaining features necessary
   for supporting robust streaming semantics in Calcite SQL.

   - A rough, high-level source of truth for tracking efforts underway in
   support of this, currently spanning the Calcite, Flink, and Beam projects.

   - A written specification of the changes that were made, for the sake of
   understanding the delta after the fact.

The first and third points are, IMO, the most important. AFAIK, there are a
few features still missing that need to be defined (e.g., trigger
equivalents via EMIT, robust temporal join support). I'm also proposing a
clear distinction of streams and tables, which I think is important, but
which I believe is not the approach most folks have been taking in this
area. Sorting out these open issues and then having a concise record of the
solutions adopted will be important for providing a solid streaming
experience and teaching folks how to use it.

At any rate, I would much appreciate it if anyone with an interest in this
stuff could please take a look and add comments/suggestions/references to
related work in flight/etc as appropriate. For now please use
comments/suggestions, but if you really want to dive in with edit access,
let me know.

The doc: http://s.apache.org/streaming-sql-spec

-Tyler


Connect more than two streams

2017-07-24 Thread Govindarajan Srinivasaraghavan
Hi,


I have two streams reading from Kafka, one for data and the other for control.
The data stream is split by type, and there are around six types. Each type
has its own processing logic, and finally everything has to be merged to get
the collective state per device. I was thinking I could connect multiple
streams, then process and maintain state, but connect only supports two streams.
Is there some way to achieve my desired functionality?


By the way, after the split and some processing, all of them are keyed streams.


Re: [DISCUSS] A more thorough Pull Request check list and template

2017-07-24 Thread Eron Wright
I think a combination of techniques would be effective:
- identifying focus areas for the next release (e.g. see Robert's thread
about 1.4)
- emphasizing design discussion in advance of a PR
- assigning reviewers and a steward in a structured way
- using a label or assignment field to 'pass the baton'
- closing rejected PRs decisively

I don't mean to co-opt this thread for a broader process question, but
figured that the PR template could provide additional process guidance.

On Mon, Jul 24, 2017 at 11:55 AM, Stephan Ewen  wrote:

> @Eron Review timeliness would be great to improve.
>
> An observation from the past year:
> There were periods where some components in Flink were making slow progress
> because all committers knowledgeable in those components were busy handling
> pull requests that were opened against those components, but were not in
> good shape, were adding designs that had not been discussed, etc.
>
> I think the only way to ensure timely handling of pull requests is to be
> very strict in the handling. For example any non-trivial change needs prior
> discussion, agreement that this should be fixed now, and an agreed upon
> design doc. Otherwise the PR is not considered and simply rejected. Same
> for presence of docs, proper tests, ...
>
> But, I fear that introducing such strictness will scare off many in the
> community. So I would be very reluctant to do this.
> After all, many pull requests do bring in a good piece of perspective, at
> least, even if the code is not immediately suited for contribution...

Re: [DISCUSS] A more thorough Pull Request check list and template

2017-07-24 Thread Stephan Ewen
@Eron Review timeliness would be great to improve.

An observation from the past year:
There were periods where some components in Flink were making slow progress
because all committers knowledgeable in those components were busy handling
pull requests that were opened against those components, but were not in
good shape, were adding designs that had not been discussed, etc.

I think the only way to ensure timely handling of pull requests is to be
very strict in the handling. For example any non-trivial change needs prior
discussion, agreement that this should be fixed now, and an agreed upon
design doc. Otherwise the PR is not considered and simply rejected. Same
for presence of docs, proper tests, ...

But, I fear that introducing such strictness will scare off many in the
community. So I would be very reluctant to do this.
After all, many pull requests do bring in a good piece of perspective, at
least, even if the code is not immediately suited for contribution...


On Mon, Jul 24, 2017 at 8:18 PM, Eron Wright  wrote:

> This seems like a good step in establishing a better PR process.  I believe
> the process could be improved to ensure timely and targeted review by
> component experts and committers.

Re: [DISCUSS] A more thorough Pull Request check list and template

2017-07-24 Thread Eron Wright
This seems like a good step in establishing a better PR process.  I believe
the process could be improved to ensure timely and targeted review by
component experts and committers.

On Mon, Jul 17, 2017 at 9:36 AM, Stephan Ewen  wrote:

> Hi all!
>
> I have reflected a bit on the pull requests and on some of the recent
> changes to Flink and some of the introduced bugs / regressions that we have
> fixed.
>
> One thing that I think would have helped is to have more explicit
> information about what the pull request does and how the contributor would
> suggest verifying it. I have seen this when contributing to some other
> project and really liked the approach.
>
> It requires that a contributor takes a minute to reflect on what was
> touched, and what would be ways to verify that the changes work properly.
> Besides being a help to the reviewer, it also makes contributors aware of
> what is important during the review process.
>
>
> I suggest a new pull request template, as attached below, with a preview
> here:
> https://github.com/StephanEwen/incubator-flink/
> blob/pr_template/.github/PULL_REQUEST_TEMPLATE.md
>
> Don't be scared, it looks long, but a big part is the introductory text
> (only relevant for new contributors) and the example contents for the
> description.
>
> Filling this out for code that is in shape should be a quick thing: Remove
> the intro and checklist, write a few sentences on what the PR does (one
> should do that anyways) and then pick some yes/no in the classification
> section.
>
> Curious to hear what you think!
>
> Best,
> Stephan
>
>
> 
>
> Full suggested pull request template:
>
>
>
> *Thank you very much for contributing to Apache Flink - we are happy that
> you want to help us improve Flink. To help the community review your
> contribution in the best possible way, please go through the checklist
> below, which will get the contribution into a shape in which it can be best
> reviewed.*
>
> *Please understand that we do not do this to make contributions to Flink a
> hassle. In order to uphold a high standard of quality for code
> contributions, while at the same time managing a large number of
> contributions, we need contributors to prepare the contributions well, and
> give reviewers enough contextual information for the review. Please also
> understand that contributions that do not follow this guide will take
> longer to review and thus typically be picked up with lower priority by the
> community.*
>
> ## Contribution Checklist
>
>   - Make sure that the pull request corresponds to a [JIRA issue](
> https://issues.apache.org/jira/projects/FLINK/issues). Exceptions are made
> for typos in JavaDoc or documentation files, which need no JIRA issue.
>
>   - Name the pull request in the form "[FLINK-1234] [component] Title of
> the pull request", where *FLINK-1234* should be replaced by the actual
> issue number. Skip *component* if you are unsure about which is the best
> component.
>   Typo fixes that have no associated JIRA issue should be named following
> this pattern: `[hotfix] [docs] Fix typo in event time introduction` or
> `[hotfix] [javadocs] Expand JavaDoc for PuncuatedWatermarkGenerator`.
>
>   - Fill out the template below to describe the changes contributed by the
> pull request. That will give reviewers the context they need to do the
> review.
>
>   - Make sure that the change passes the automated tests, i.e., `mvn clean
> verify`
>
>   - Each pull request should address only one issue, not mix up code from
> multiple issues.
>
>   - Each commit in the pull request has a meaningful commit message
> (including the JIRA id)
>
>   - Once all items of the checklist are addressed, remove the above text
> and this checklist, leaving only the filled out template below.
>
>
> **(The sections below can be removed for hotfixes of typos)**
>
> ## What is the purpose of the change
>
> *(For example: This pull request makes task deployment go through the blob
> server, rather than through RPC. That way we avoid re-transferring them on
> each deployment (during recovery).)*
>
>
> ## Brief change log
>
> *(for example:)*
>   - *The TaskInfo is stored in the blob store at job creation time as a
> persistent artifact*
>   - *Deployment RPC transmits only the blob storage reference*
>   - *TaskManagers retrieve the TaskInfo from the blob cache*
>
>
> ## Verifying this change
>
> *(Please pick either of the following options)*
>
> This change is a trivial rework / code cleanup without any test coverage.
>
> *(or)*
>
> This change is already covered by existing tests, such as *(please describe
> tests)*.
>
> *(or)*
>
> This change added tests and can be verified as follows:
>
> *(example:)*
>   - *Added integration tests for end-to-end deployment with large payloads
> (100MB)*
>   - *Extended integration test for recovery after master (JobManager)
> failure*
>   - *Added test that validates that TaskInfo is transferred only once
> 

Re: [DISCUSS] A more thorough Pull Request check list and template

2017-07-24 Thread Ufuk Celebi
What's the conclusion of last week's discussion here?

Fabian and Chesnay raised concerns about the introductory text. Are
you still concerned?

On Wed, Jul 19, 2017 at 10:04 AM, Stephan Ewen  wrote:
> @Chesnay:
>
> Put text into template => contributor will have to read it
> Put link to text into template => most contributors will ignore the link
>
> Yes, that is pretty much what my observation from the past is.
>
>
>
> On Tue, Jul 18, 2017 at 11:03 PM, Chesnay Schepler 
> wrote:
>
>> I'm sorry but I can't follow your logic.
>>
>> Put text into template => contributor will definitely read it
>> Put link to text into template => contributor will completely ignore the
>> link
>>
>> The advantage of the link is we don't duplicate the contribution guide in
>> the docs and in the template.
>> Furthermore, you don't even need to remove something from the template,
>> since it's just a single line.
>>
>>
>> On 18.07.2017 19:25, Stephan Ewen wrote:
>>
>>> Concerning moving text to the contributors guide:
>>>
>>> I can only say it again: I believe the contribution guide is almost dead
>>> text. Very few people read it.
>>> Before the current template was introduced, new contributors rarely gave
>>> the pull request a name with Jira number. That is a good indicator about
>>> how many read this guide.
>>> Putting the text in the template is a way to ensure that everyone reads it.
>>>
>>>
>>> I am also wondering what the concern is.
>>> A new contributor should clearly read through a bit of text, to learn what
>>> we look for in contributions.
>>> A recurring contributor will not have to read it again, simply remove the
>>> text from the pull request message and go on.
>>>
>>> Where is the disadvantage?
>>>
>>>
>>> On Tue, Jul 18, 2017 at 5:35 PM, Nico Kruber 
>>> wrote:
>>>
>>>> I like the new template but also agree with the text being too long and
>>>> would move the intro to the contributors guide with a link in the PR
>>>> template.
>>>>
>>>> Regarding the questions to fill out - I'd like the headings to be short
>>>> and have the affected components last so that documentation is not lost
>>>> (although being more important than this checklist), e.g.:
>>>>
>>>> * Purpose of the change
>>>> * Brief change log
>>>> * Verifying the change
>>>> * Documentation
>>>> * Affected components
>>>>
>>>> The verification options in the original template look a bit too large
>>>> but it stresses what tests should be added, especially for bigger
>>>> changes. Can't think of a way to make it shorter though.
>>>>
>>>>
>>>> Nico
>>>>
>>>> On Tuesday, 18 July 2017 11:20:41 CEST Chesnay Schepler wrote:
>>>>
>>>>> I fully agree with Fabian.
>>>>>
>>>>> Multiple-choice questions provide little value to the reviewer, since
>>>>> the validity has to be verified in any case. While text answers have to
>>>>> be validated as well, they give some hint to the reviewer as to how it
>>>>> can be verified and which steps the contributor did to do so.
>>>>>
>>>>> I also agree that it is too long; IMO this is really intimidating to new
>>>>> contributors to be greeted with this.
>>>>>
>>>>> Ideally we only link to the contributors guide and ask 3 questions:
>>>>>
>>>>>    * What is the problem?
>>>>>    * How was it fixed?
>>>>>    * How can the fix be verified?
>>>>>
>>>>> On 18.07.2017 10:47, Fabian Hueske wrote:
>>>>>
>>>>>> I like the sections about purpose, change log, and verification of the
>>>>>> changes.
>>>>>>
>>>>>> However, I think the proposed template is too much text. This is
>>>>>> probably the reason why the first attempt to establish a PR template
>>>>>> failed.
>>>>>> I would move most of the introduction and explanations incl. examples
>>>>>> to the "Contribution Guidelines" and only pass a link.
>>>>>> IMO, the template should be rather shorter than the current one and
>>>>>> only have the link, the sections to fill out, and checkboxes.
>>>>>>
>>>>>> I'm also not sure how much the detailed questions will help.
>>>>>> For example even if the question about changed dependencies is answered
>>>>>> with "no", the reviewer still has to check that.
>>>>>>
>>>>>> I think the questions of the current template work differently.
>>>>>> A question "Does the PR include tests?" suggests to the contributor
>>>>>> that those should be included. Same for documentation.
>>>>>>
>>>>>> Cheers,
>>>>>> Fabian
>>>>>>
>>>>>> 2017-07-18 10:05 GMT+02:00 Tzu-Li (Gordon) Tai :
>>>>>>
>>>>>>> +1, I like this a lot.
>>>>>>> With the previous template, it doesn’t really resonate with what we
>>>>>>> should care about, and therefore most of the time I think contributors
>>>>>>> just delete that template and write down something on their own.
>>>>>>>
>>>>>>> I would also like to add: “Savepoint / checkpoint binary formats” to
>>>>>>> the potential 

[VOTE] Release Apache Flink-shaded 1.0 (RC2)

2017-07-24 Thread Chesnay Schepler

Dear Flink community,

Please vote on releasing the following candidate as Apache Flink-shaded 
version 1.0.


The commit to be voted on:
https://gitbox.apache.org/repos/asf/flink-shaded/commit/290526022bb99a276220afd73d0c5ffb0eb2cc59

Branch:
release-1.0-rc2

The release artifacts to be voted on can be found at: 
http://home.apache.org/~chesnay/flink-shaded-1.0-rc2/ 



The release artifacts are signed with the key with fingerprint 
19F2195E1B4816D765A2C324C2EED7B111D464BA:

http://www.apache.org/dist/flink/KEYS

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapacheflink-1131

-


The vote ends on Thursday (5pm CEST), July 27th, 2017.

[ ] +1 Release this package as Apache Flink-shaded 1.0
[ ] -1 Do not release this package, because ...

-


The flink-shaded project contains a number of shaded dependencies for 
Apache Flink.


This release includes asm-all:5.0.4, guava:18.0, netty-all:4.0.27-FINAL 
and netty-router:1.10 . Note that netty-all and netty-router are bundled 
as a single dependency.


The purpose of these dependencies is to provide a single instance of a 
shaded dependency in the Apache Flink distribution, instead of each 
individual module shading the dependency.


For more information, see
https://issues.apache.org/jira/browse/FLINK-6529.


[jira] [Created] (FLINK-7258) IllegalArgumentException in Netty bootstrap with large memory state segment size

2017-07-24 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-7258:
--

 Summary: IllegalArgumentException in Netty bootstrap with large 
memory state segment size
 Key: FLINK-7258
 URL: https://issues.apache.org/jira/browse/FLINK-7258
 Project: Flink
  Issue Type: Bug
  Components: Network
Affects Versions: 1.3.1
Reporter: Ufuk Celebi
Assignee: Ufuk Celebi


In NettyBootstrap we configure the low and high watermarks in the following 
order:
{code}
bootstrap.childOption(ChannelOption.WRITE_BUFFER_LOW_WATER_MARK, config.getMemorySegmentSize() + 1);
bootstrap.childOption(ChannelOption.WRITE_BUFFER_HIGH_WATER_MARK, 2 * config.getMemorySegmentSize());
{code}

When the memory segment size is higher than the default high water mark, this 
throws an `IllegalArgumentException` when a client tries to connect. Hence, 
this unfortunately only fails at runtime when an intermediate result is 
requested.

A simple fix is to first configure the high water mark and only then configure 
the low water mark.
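
To illustrate why the order matters, here is a minimal, self-contained sketch in plain Java. It does not use Netty. The `WaterMarks` class is a hypothetical stand-in, and both its default values (32 KiB low, 64 KiB high) and its validation checks are assumptions made for illustration. The point it tries to show is that each setter validates against the other mark as it stands at that moment, so setting a large low mark first is checked against the still-default high mark.

```java
public class WaterMarkOrderDemo {

    /**
     * Simplified stand-in for a channel configuration. This is NOT Netty's
     * class; the default values and the checks are assumptions for illustration.
     */
    static final class WaterMarks {
        private int low = 32 * 1024;   // assumed default low water mark
        private int high = 64 * 1024;  // assumed default high water mark

        void setLow(int newLow) {
            // The new low mark is compared with the high mark in effect right now.
            if (newLow > high) {
                throw new IllegalArgumentException(
                        "low water mark " + newLow + " exceeds high water mark " + high);
            }
            low = newLow;
        }

        void setHigh(int newHigh) {
            // The new high mark is compared with the low mark in effect right now.
            if (newHigh < low) {
                throw new IllegalArgumentException(
                        "high water mark " + newHigh + " is below low water mark " + low);
            }
            high = newHigh;
        }
    }

    /** Order from the report: low mark first, then high mark. */
    static boolean lowThenHighSucceeds(int segmentSize) {
        WaterMarks marks = new WaterMarks();
        try {
            marks.setLow(segmentSize + 1);
            marks.setHigh(2 * segmentSize);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    /** Proposed order: high mark first, then low mark. */
    static boolean highThenLowSucceeds(int segmentSize) {
        WaterMarks marks = new WaterMarks();
        try {
            marks.setHigh(2 * segmentSize);
            marks.setLow(segmentSize + 1);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int regular = 32 * 1024;  // hypothetical segment size below the assumed default high mark
        int large = 128 * 1024;   // hypothetical segment size above the assumed default high mark

        System.out.println("low-then-high, 32 KiB:  " + lowThenHighSucceeds(regular));
        System.out.println("low-then-high, 128 KiB: " + lowThenHighSucceeds(large));
        System.out.println("high-then-low, 128 KiB: " + highThenLowSucceeds(large));
    }
}
```

In this model, the high-first order would itself be rejected if 2 * segmentSize were below the assumed default low mark; the sketch does not explore such small sizes.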




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7256) End-to-end tests should only be run after successful compilation

2017-07-24 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-7256:
---

 Summary: End-to-end tests should only be run after successful 
compilation
 Key: FLINK-7256
 URL: https://issues.apache.org/jira/browse/FLINK-7256
 Project: Flink
  Issue Type: Improvement
  Components: Tests, Travis
Affects Versions: 1.4.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.4.0


If the compilation fails (for example due to checkstyle) the end-to-end tests 
are currently still run, even though flink-dist most likely wasn't even built.

Similar to FLINK-7176.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7254) java8 module pom disables checkstyle

2017-07-24 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-7254:
---

 Summary: java8 module pom disables checkstyle
 Key: FLINK-7254
 URL: https://issues.apache.org/jira/browse/FLINK-7254
 Project: Flink
  Issue Type: Bug
  Components: Checkstyle
Affects Versions: 1.4.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.4.0


The java8 pom file contains this:
{code}

true

{code}

Thus the checkstyle is not actually enforced.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [flink-shaded] Some suggestions for improvements

2017-07-24 Thread Chesnay Schepler
I've opened a PR to adjust the artifact names: 
https://github.com/apache/flink-shaded/pull/17


Under the new scheme the asm artifact would be named 
"flink-shaded-asm-5.0.4-1.0"


By default maven concatenates the artifactId and version with a dash, 
hence no underscore.
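
As a rough illustration of that default, the sketch below joins an artifactId and a version with a dash, the way Maven's default final name does. The split into artifactId `flink-shaded-asm` and version `5.0.4-1.0` is an assumption made only to reproduce the name above, and the helper `defaultFileName` is hypothetical, not part of Maven.

```java
public class ArtifactFileName {

    /** Mirrors the default "artifactId-version" pattern and appends the file extension. */
    static String defaultFileName(String artifactId, String version, String extension) {
        return artifactId + "-" + version + "." + extension;
    }

    public static void main(String[] args) {
        // Hypothetical coordinates, chosen only to reproduce the name from this mail.
        System.out.println(defaultFileName("flink-shaded-asm", "5.0.4-1.0", "jar"));
        // prints "flink-shaded-asm-5.0.4-1.0.jar"
    }
}
```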


On 23.07.2017 21:37, Stephan Ewen wrote:

For the SNAPSHOT part, I do not feel too strongly about that either, just a
tendency to keep it in sync with how core Flink works.

For the "tools" directory, we can keep it as it is. It seems too complicated
and really is not a big deal...

On Sun, Jul 23, 2017 at 8:31 PM, Chesnay Schepler 
wrote:


I agree that the version scheme in the artifact isn't ideal.

We can keep the tools out of the release, but not in a nice way.
We either

1. remove it in a separate commit before each release
2. just omit it during the release process.

1) has the odd downside that the release branch cannot release itself, as
the script is now missing
2) has the odd downside that no branch in the repository would actually
match the release

As for the SNAPSHOT suffix, given that we currently don't do any snapshot
deployments for flink-shaded, and that there is no reason to do so in the
first place since the only consumer of the dependencies (aka Apache Flink)
would never rely on SNAPSHOT versions (I guess), I don't see the need for it.
But I don't feel strongly about this, and don't mind either way.


On 23.07.2017 15:42, Stephan Ewen wrote:


A few comments on what we can improve in future releases:

- I agree with Robert's comment to change the versioning to the same
model as Flink where the master branch is on a SNAPSHOT version always and
the releases are branches/tags with stable versions.

- The version names of the artifacts read a bit
strange: flink-shaded-asm-5-1.0-5.0.4
- I would suggest renaming them to something like
flink-shaded-asm_5.0.4-1.0
- The version of the artifact with an underscore, to separate artifact
version from release version. Think of it as similar to the Scala version
specific release artifacts.

- I would suggest also removing the "tools" directory from the source
release, if that is not too much work.






Re: [DISCUSS] Release 1.3.2 planning

2017-07-24 Thread Aljoscha Krettek
@Greg: I merged that.

It seems like all blockers are now either resolved or determined not to be 
blocking for this release. I will try and cut the first RC now.

Best,
Aljoscha

> On 21. Jul 2017, at 19:07, Greg Hogan  wrote:
> 
> FLINK-7211 is a trivial change for excluding the gelly examples javadoc from 
> the release assembly and would be good to have fixed for 1.3.2.
> 
> 
>> On Jul 13, 2017, at 3:34 AM, Tzu-Li (Gordon) Tai  wrote:
>> 
>> I agree that FLINK-6951 should also be a blocker for 1.3.2. I’ll update its 
>> priority.
>> 
>> On 13 July 2017 at 4:06:06 PM, Bowen Li (bowen...@offerupnow.com) wrote:
>> 
>> Hi Aljoscha,  
>> I'd like to see https://issues.apache.org/jira/browse/FLINK-6951 fixed  
>> in 1.3.2, if it makes sense.  
>> 
>> Thanks,  
>> Bowen  
>> 
>> On Wed, Jul 12, 2017 at 3:06 AM, Aljoscha Krettek   
>> wrote:  
>> 
>>> Short update, we resolved some blockers and discovered some new ones.  
>>> There’s this nifty Jira page if you want to keep track:  
>>> https://issues.apache.org/jira/projects/FLINK/versions/12340984
>>> 
>>> Once again, could everyone please update the Jira issues that they think  
>>> should be release blocking. I would like to start building release  
>>> candidates at the end of this week, if possible.  
>>> 
>>> And yes, I’m volunteering to be the release manager on this release. ;-)  
>>> 
>>> Best,  
>>> Aljoscha  
>>> 
 On 7. Jul 2017, at 16:03, Aljoscha Krettek  wrote:  
 
 I think we might have another blocker: https://issues.apache.org/  
>>> jira/browse/FLINK-7133   
 
> On 7. Jul 2017, at 09:18, Haohui Mai  wrote:  
> 
> I think we are pretty close now -- Jira shows that we're down to two  
> blockers: FLINK-7069 and FLINK-6965.  
> 
> FLINK-7069 is being merged and we have a PR for FLINK-6965.  
> 
> ~Haohui  
> 
> On Thu, Jul 6, 2017 at 1:44 AM Aljoscha Krettek   
>>> wrote:  
> 
>> I’m seeing these remaining blockers:  
>> https://issues.apache.org/jira/browse/FLINK-7069?filter=  
>>> 12334772=project%20%3D%20FLINK%20AND%20priority%20%3D%20Blocker%20AND%  
>>> 20resolution%20%3D%20Unresolved  
>> <  
>> https://issues.apache.org/jira/browse/FLINK-7069?filter=  
>>> 12334772=project%20=%20FLINK%20AND%20priority%20=%  
>>> 20Blocker%20AND%20resolution%20=%20Unresolved  
>>> 
>> 
>> Could everyone please correctly mark as “blocking” those issues that  
>>> they  
>> consider blocking for 1.3.2 so that we get an accurate overview of  
>>> where we  
>> are.  
>> 
>> @Chesnay, could you maybe check if this one should in fact be  
>>> considered a  
>> blocker: https://issues.apache.org/jira/browse/FLINK-7034?
>> 
>> Best,  
>> Aljoscha  
>>> On 6. Jul 2017, at 07:19, Tzu-Li (Gordon) Tai   
>> wrote:  
>>> 
>>> FLINK-7041 has been merged.  
>>> I’d also like to raise another blocker for 1.3.2:  
>> https://issues.apache.org/jira/browse/FLINK-6996.  
>>> 
>>> Cheers,  
>>> Gordon  
>>> On 30 June 2017 at 12:46:07 AM, Aljoscha Krettek (aljos...@apache.org  
>>> )  
>> wrote:  
>>> 
>>> Gordon and I found this (in my opinion) blocking issue:  
>> https://issues.apache.org/jira/browse/FLINK-7041
>>> 
>>> I’m trying to quickly provide a fix.  
>>> 
 On 26. Jun 2017, at 15:30, Timo Walther  wrote:  
 
 I just opened a PR which should be included in the next bug fix  
>>> release  
>> for the Table API:  
 https://issues.apache.org/jira/browse/FLINK-7005  
 
 Timo  
 
 Am 23.06.17 um 14:09 schrieb Robert Metzger:  
> Thanks Haohui.  
> 
> The first main task for the release management is to come up with a  
> timeline :)  
> Let's just wait and see which issues get reported. There are currently no
> blockers set for 1.3.1 in JIRA.
> 
> On Thu, Jun 22, 2017 at 6:47 PM, Haohui Mai   
>> wrote:  
> 
>> Hi,  
>> 
>> Release management is tough, I'm happy to help. Are there any timelines
>> you have in mind?
>> 
>> Haohui  
>> On Fri, Jun 23, 2017 at 12:01 AM Robert Metzger <  
>>> rmetz...@apache.org>  
>> wrote:  
>> 
>>> Hi all,  
>>> 
>>> with the 1.3.1 release on the way, we can start thinking about the 1.3.2
>>> release.
>>> 
>>> We have 

[CANCEL][VOTE] Release Apache Flink-shaded 1.0 (RC1)

2017-07-24 Thread Chesnay Schepler

I'm canceling the RC to fix the artifact versioning.

Greg raises a good point, and the required changes are minimal.

I will create another RC later today.

On 24.07.2017 10:22, Stephan Ewen wrote:

There is no real pressure, true.

I was simply checking the mandatory constraints for the release, and I leave
it to Chesnay to make the call whether to release or adjust the naming
scheme of the artifacts.

The other two points raised in the "improvements mail" are not really
important in any way.



On Mon, Jul 24, 2017 at 1:58 AM, Greg Hogan  wrote:


Is there a pressing need to get the release out quickly? This being the
first release, would it be better to change the versioning now to prevent
future confusion? Even if Flink is the only intended consumer we’ll still
be publishing the jars.



On Jul 23, 2017, at 9:41 AM, Stephan Ewen  wrote:

The release is technically correct, so
+1 for the release

  - LICENSE and NOTICE are good
  - Shaded artifacts add their licenses to the artifact where needed
  - no binaries in the release


I will send another mail with suggestions for improving things for future
releases


On Fri, Jul 21, 2017 at 11:39 AM, Robert Metzger 
wrote:


Thanks a lot for preparing the release artifacts.
While checking the source repo / release commit, I realized that you are
not following the same versioning scheme as Flink:
the current master has an "x.y-SNAPSHOT" version, and release candidates
(and releases) get an x.y.z version. I wonder if it makes sense to use the
same model in the flink-shaded.git repo. I think this is the default
assumption in Maven, and some modules behave differently based on the
version: for example, "mvn deploy" sends "-SNAPSHOT" artifacts to a snapshot
server, and release artifacts to a staging repository.

I don't think we need to cancel the release because of this; I just wanted
to raise this point to see what others are thinking.
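
To make the naming convention above concrete, here is a minimal sketch that classifies a version string. It only illustrates the "x.y-SNAPSHOT" versus "x.y.z" scheme described in this mail; it is not Maven's own logic, and the function name and the "nonconforming" label are made up for illustration.

```python
import re

# Assumption: development versions look like "x.y-SNAPSHOT" and
# releases / release candidates look like "x.y.z", as described above.
SNAPSHOT_PATTERN = re.compile(r"\d+\.\d+-SNAPSHOT")
RELEASE_PATTERN = re.compile(r"\d+\.\d+\.\d+")


def classify_version(version: str) -> str:
    """Return 'snapshot', 'release', or 'nonconforming' (hypothetical labels)."""
    if SNAPSHOT_PATTERN.fullmatch(version):
        return "snapshot"
    if RELEASE_PATTERN.fullmatch(version):
        return "release"
    # Anything else does not follow the scheme sketched in this mail.
    return "nonconforming"
```

Under this sketch a plain "1.0" counts as nonconforming, which is the kind of mismatch being raised here.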


I've checked the following
- The netty shaded jar contains the MIT license from netty router:
https://repository.apache.org/content/repositories/orgapacheflink-1130/org/apache/flink/flink-shaded-netty-4/1.0-4.0.27.Final/flink-shaded-netty-4-1.0-4.0.27.Final.jar
- In the staging repo, I didn't see any dependencies exposed.
- I checked some of the md5 sums in the staging repo and they were correct.
- I used a mvn plugin to check the signatures in the staging repo and they
were okay.
- clean install in the source repo worked (this includes a license header
check)
- LICENSE and NOTICE files are there

==> +1 to release.

On Fri, Jul 21, 2017 at 9:45 AM, Chesnay Schepler 
wrote:


Here's a list of things we need to check:

* correct License/Notice files
* licenses of shaded dependencies are included in the jar
* the versions of shaded dependencies match those used in Flink 1.4
* compilation with maven works
* the assembled jars only contain the shaded dependency and no
   non-shaded classes
* no transitive dependencies should be exposed
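
The "no non-shaded classes" item can be checked mechanically. The following is a rough sketch, assuming that all relocated classes live under the package prefix org/apache/flink/shaded/ (an assumption for illustration, as are the function names); it lists every .class entry of a jar that falls outside that prefix.

```python
import zipfile

# Assumption: shaded classes are relocated under this prefix.
SHADED_PREFIX = "org/apache/flink/shaded/"


def unshaded_entries(names, shaded_prefix=SHADED_PREFIX):
    """Return the sorted .class entries that are not under the shaded prefix."""
    return sorted(
        name
        for name in names
        if name.endswith(".class") and not name.startswith(shaded_prefix)
    )


def find_unshaded_classes(jar_path, shaded_prefix=SHADED_PREFIX):
    """Open a jar (a zip archive) and report the classes outside the prefix."""
    with zipfile.ZipFile(jar_path) as jar:
        return unshaded_entries(jar.namelist(), shaded_prefix)
```

An empty result for each assembled jar would be consistent with the checklist item; license files and other non-class entries are ignored on purpose.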


On 19.07.2017 15:59, Chesnay Schepler wrote:


Dear Flink community,

Please vote on releasing the following candidate as Apache Flink-shaded
version 1.0.

The commit to be voted on:
https://gitbox.apache.org/repos/asf/flink-shaded/commit/fd3033ba9ead310478963bf43e09cd50d1e36d71

Branch:
release-1.0-rc1

The release artifacts to be voted on can be found at:
http://home.apache.org/~chesnay/flink-shaded-1.0-rc1/ <
http://home.apache.org/%7Echesnay/flink-shaded-1.0-rc1/>

The release artifacts are signed with the key with fingerprint
19F2195E1B4816D765A2C324C2EED7B111D464BA:
http://www.apache.org/dist/flink/KEYS

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapacheflink-1130

-


The vote ends on Monday (5pm CEST), July 24th, 2017.

[ ] +1 Release this package as Apache Flink-shaded 1.0
[ ] -1 Do not release this package, because ...

-


The flink-shaded project contains a number of shaded dependencies for
Apache Flink.

This release includes asm-all:5.0.4, guava:18.0, netty-all:4.0.27-FINAL
and netty-router:1.10. Note that netty-all and netty-router are bundled as
a single dependency.

The purpose of these dependencies is to provide a single instance of a
shaded dependency in the Apache Flink distribution, instead of each
individual module shading the dependency.

For more information, see
https://issues.apache.org/jira/browse/FLINK-6529.





[jira] [Created] (FLINK-7253) Remove all 'assume Java 8' code in tests

2017-07-24 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-7253:
---

 Summary: Remove all 'assume Java 8' code in tests
 Key: FLINK-7253
 URL: https://issues.apache.org/jira/browse/FLINK-7253
 Project: Flink
  Issue Type: Sub-task
Reporter: Stephan Ewen






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7251) Merge the flink-java8 project into flink-core

2017-07-24 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-7251:
---

 Summary: Merge the flink-java8 project into flink-core
 Key: FLINK-7251
 URL: https://issues.apache.org/jira/browse/FLINK-7251
 Project: Flink
  Issue Type: Sub-task
Reporter: Stephan Ewen






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7252) Remove Flink Futures or back them by CompletableFutures

2017-07-24 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-7252:
---

 Summary: Remove Flink Futures or back them by CompletableFutures
 Key: FLINK-7252
 URL: https://issues.apache.org/jira/browse/FLINK-7252
 Project: Flink
  Issue Type: Sub-task
Reporter: Stephan Ewen






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7249) Bump Java version in build plugins

2017-07-24 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-7249:
---

 Summary: Bump Java version in build plugins
 Key: FLINK-7249
 URL: https://issues.apache.org/jira/browse/FLINK-7249
 Project: Flink
  Issue Type: Sub-task
Reporter: Stephan Ewen






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7250) Drop the jdk8 build profile

2017-07-24 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-7250:
---

 Summary: Drop the jdk8 build profile
 Key: FLINK-7250
 URL: https://issues.apache.org/jira/browse/FLINK-7250
 Project: Flink
  Issue Type: Sub-task
Reporter: Stephan Ewen






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7248) Invalid checkRestoredNullCheckpointWhenFetcherNotReady test in FlinkKafkaConsumerBaseTest

2017-07-24 Thread Tzu-Li (Gordon) Tai (JIRA)
Tzu-Li (Gordon) Tai created FLINK-7248:
--

 Summary: Invalid checkRestoredNullCheckpointWhenFetcherNotReady 
test in FlinkKafkaConsumerBaseTest
 Key: FLINK-7248
 URL: https://issues.apache.org/jira/browse/FLINK-7248
 Project: Flink
  Issue Type: Bug
  Components: Kafka Connector, Tests
Affects Versions: 1.4.0
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai
 Fix For: 1.4.0


The {{checkRestoredNullCheckpointWhenFetcherNotReady}} test is a no-longer-valid
remnant from some of the major refactorings of the Flink Kafka Consumer, and can
be safely removed. The test was passing only because the
{{AbstractPartitionDiscoverer}} implementation used in the test base was an
empty mock.

The expected behaviour for checkpoints when the fetcher is not ready is
now verified in {{checkRestoredCheckpointWhenFetcherNotReady}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7247) Replace travis java 7 profiles

2017-07-24 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-7247:
---

 Summary: Replace travis java 7 profiles
 Key: FLINK-7247
 URL: https://issues.apache.org/jira/browse/FLINK-7247
 Project: Flink
  Issue Type: Sub-task
  Components: Travis
Affects Versions: 1.4.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
Priority: Blocker
 Fix For: 1.4.0






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [VOTE] Release Apache Flink-shaded 1.0 (RC1)

2017-07-24 Thread Stephan Ewen
There is no real pressure, true.

I was simply checking the mandatory constraints for the release, and I leave
it to Chesnay to make the call whether to release or adjust the naming
scheme of the artifacts.

The other two points raised in the "improvements mail" are not really
important in any way.



On Mon, Jul 24, 2017 at 1:58 AM, Greg Hogan  wrote:

> Is there a pressing need to get the release out quickly? This being the
> first release, would it be better to change the versioning now to prevent
> future confusion? Even if Flink is the only intended consumer we’ll still
> be publishing the jars.
>
>
> > On Jul 23, 2017, at 9:41 AM, Stephan Ewen  wrote:
> >
> > The release is technically correct, so
> > +1 for the release
> >
> >  - LICENSE and NOTICE are good
> >  - Shaded artifacts add their licenses to the artifact where needed
> >  - no binaries in the release
> >
> >
> > I will send another mail with suggestions for improving things for future
> > releases
> >
> >
> > On Fri, Jul 21, 2017 at 11:39 AM, Robert Metzger 
> > wrote:
> >
> >> Thanks a lot for preparing the release artifacts.
> >> While checking the source repo / release commit, I realized that you are
> >> not following the same versioning scheme as Flink:
> >> the current master has an "x.y-SNAPSHOT" version, and release candidates
> >> (and releases) get an x.y.z version. I wonder if it makes sense to use the
> >> same model in the flink-shaded.git repo. I think this is the default
> >> assumption in Maven, and some modules behave differently based on the
> >> version: for example, "mvn deploy" sends "-SNAPSHOT" artifacts to a snapshot
> >> server, and release artifacts to a staging repository.
> >>
> >> I don't think we need to cancel the release because of this; I just wanted
> >> to raise this point to see what others are thinking.
> >>
> >>
> >> I've checked the following
> >> - The netty shaded jar contains the MIT license from netty router:
> >> https://repository.apache.org/content/repositories/orgapacheflink-1130/org/apache/flink/flink-shaded-netty-4/1.0-4.0.27.Final/flink-shaded-netty-4-1.0-4.0.27.Final.jar
> >> - In the staging repo, I didn't see any dependencies exposed.
> >> - I checked some of the md5 sums in the staging repo and they were correct.
> >> - I used a mvn plugin to check the signatures in the staging repo and they
> >> were okay.
> >> - clean install in the source repo worked (this includes a license header
> >> check)
> >> - LICENSE and NOTICE files are there
> >>
> >> ==> +1 to release.
> >>
> >> On Fri, Jul 21, 2017 at 9:45 AM, Chesnay Schepler 
> >> wrote:
> >>
> >>> Here's a list of things we need to check:
> >>>
> >>> * correct License/Notice files
> >>> * licenses of shaded dependencies are included in the jar
> >>> * the versions of shaded dependencies match those used in Flink 1.4
> >>> * compilation with maven works
> >>> * the assembled jars only contain the shaded dependency and no
> >>>   non-shaded classes
> >>> * no transitive dependencies should be exposed
> >>>
> >>>
> >>> On 19.07.2017 15:59, Chesnay Schepler wrote:
> >>>
>  Dear Flink community,
> 
>  Please vote on releasing the following candidate as Apache Flink-shaded
>  version 1.0.
> 
>  The commit to be voted on:
>  https://gitbox.apache.org/repos/asf/flink-shaded/commit/fd3033ba9ead310478963bf43e09cd50d1e36d71
> 
>  Branch:
>  release-1.0-rc1
> 
>  The release artifacts to be voted on can be found at:
>  http://home.apache.org/~chesnay/flink-shaded-1.0-rc1/ <
>  http://home.apache.org/%7Echesnay/flink-shaded-1.0-rc1/>
> 
>  The release artifacts are signed with the key with fingerprint
>  19F2195E1B4816D765A2C324C2EED7B111D464BA:
>  http://www.apache.org/dist/flink/KEYS
> 
>  The staging repository for this release can be found at:
>  https://repository.apache.org/content/repositories/orgapacheflink-1130
> 
>  -
> 
> 
>  The vote ends on Monday (5pm CEST), July 24th, 2017.
> 
>  [ ] +1 Release this package as Apache Flink-shaded 1.0
>  [ ] -1 Do not release this package, because ...
> 
>  -
> 
> 
>  The flink-shaded project contains a number of shaded dependencies for
>  Apache Flink.
> 
>  This release includes asm-all:5.0.4, guava:18.0, netty-all:4.0.27-FINAL
>  and netty-router:1.10. Note that netty-all and netty-router are bundled as
>  a single dependency.
> 
>  The purpose of these dependencies is to provide a single instance of a
>  shaded dependency in the Apache Flink distribution, instead of each
>  individual module shading the dependency.
> 
>  For more information, see
>