Re: [RESULT] [VOTE] Release Apache Flink 1.2.1 (RC2)

2017-04-26 Thread Henry Saputra
Thanks for being an awesome RE, as always, Robert!

- Henry

On Tue, Apr 25, 2017 at 2:09 AM, Robert Metzger  wrote:

> I've uploaded the artifacts to the apache mirrors and released the maven
> stuff to central.
>
> While the artifacts are syncing, please review the release announcement:
> https://github.com/apache/flink-web/pull/54
>
> On Tue, Apr 25, 2017 at 10:29 AM, Robert Metzger 
> wrote:
>
> > Okay, I'll then put out the release.
> >
> >
> > The vote to release this package as Flink 1.2.1 has passed with:
> >
> > +1 votes:
> > - Andrew (non-binding)
> > - Gyula (binding)
> > - Till (binding)
> > - Greg (binding)
> > - Henry (binding)
> >
> > No 0 or -1 votes.
> >
> >
> > On Mon, Apr 24, 2017 at 2:11 PM, Aljoscha Krettek 
> > wrote:
> >
> >> Agreed, as I said above:
> >>
> >>  I have the fix ready but we can do that in Flink 1.2.2. Very quickly,
> >>  though.
> >>
> >> Best,
> >> Aljoscha
> >>
> >> > On 24. Apr 2017, at 13:19, Ufuk Celebi  wrote:
> >> >
> >> > I agree with Till and would NOT cancel this release. It has been
> >> > delayed quite a bit already and the feature freeze for 1.3.0
> >> > is coming up (i.e. most contributors will be busy and not spend a lot
> >> > of time on 1.2.1).
> >> >
> >> > – Ufuk
> >> >
> >> >
> >> > On Mon, Apr 24, 2017 at 9:31 AM, Till Rohrmann 
> >> wrote:
> >> >> If this bug was already present in 1.2.0, then I guess not many users have
> >> >> used this feature. Otherwise we would have seen complaints on the mailing
> >> >> list.
> >> >>
> >> >> From the JIRA issue description, it looks as if we have to fix it for 1.3.0
> >> >> anyway. What about fixing it this week and then backporting it to the 1.2.1
> >> >> branch?
> >> >>
> >> >> Cheers,
> >> >> Till
> >> >>
> >> >> On Mon, Apr 24, 2017 at 8:12 AM, Aljoscha Krettek <aljos...@apache.org>
> >> >> wrote:
> >> >>
> >> >>> It means that users cannot restore from 1.2.0 to 1.2.0, 1.2.0 to 1.2.1, or
> >> >>> 1.2.1 to 1.2.1. However, this only happens when using the
> >> >>> CheckpointedRestoring interface, which you have to do when you want to
> >> >>> migrate away from the Checkpointed interface.
> >> >>>
> >> >>> tl;dr It’s not a new bug but one that was present in 1.2.0 already.
> >> >>>
> >>  On 23. Apr 2017, at 21:16, Robert Metzger 
> >> wrote:
> >> 
> >>  @all: I'm sorry for being a bad release manager this time. I'm not
> >>  spending much time online these days. I hope to increase my dev@ list
> >>  activity a little bit next week.
> >> 
> >>  @Aljoscha:
> >>  Does this mean that users cannot upgrade from 1.2.0 to 1.2.1?
> >> 
> >>  Can we make the minor versions easily compatible?
> >>  If so, I would prefer to cancel this release as well and do another one.
> >> 
> >> 
> >>  On Fri, Apr 21, 2017 at 12:04 PM, Aljoscha Krettek <aljos...@apache.org>
> >>  wrote:
> >> 
> >> > There is this (somewhat pesky) issue:
> >> > - https://issues.apache.org/jira/browse/FLINK-6353: Restoring using
> >> > CheckpointedRestoring does not work from 1.2 to 1.2
> >> >
> >> > I have the fix ready but we can do that in Flink 1.2.2. Very quickly,
> >> > though.
> >> >
> >> >> On 20. Apr 2017, at 17:20, Henry Saputra <henry.sapu...@gmail.com>
> >> >> wrote:
> >> >>
> >> >> LICENSE file exists
> >> >> NOTICE file looks good
> >> >> Signature files look good
> >> >> Hash files look good
> >> >> No 3rd party exes in source artifact
> >> >> Source compiled and pass tests
> >> >> Local run work
> >> >> Run simple job on YARN
> >> >>
> >> >> +1
> >> >>
> >> >> - Henry
> >> >>
> >> >> On Wed, Apr 12, 2017 at 4:06 PM, Robert Metzger <rmetz...@apache.org>
> >> >> wrote:
> >> >>
> >> >>> Dear Flink community,
> >> >>>
> >> >>> Please vote on releasing the following candidate as Apache Flink
> >> >>> version 1.2.1.
> >> >>>
> >> >>> The commit to be voted on:
> >> >>> 76eba4e0 (http://git-wip-us.apache.org/repos/asf/flink/commit/76eba4e0)
> >> >>>
> >> >>> Branch:
> >> >>> release-1.2.1-rc2
> >> >>>
> >> >>> The release artifacts to be voted on can be found at:
> >> >>> http://people.apache.org/~rmetzger/flink-1.2.1-rc2/
> >> >>>
> >> >>>
> >> >>> The release artifacts are signed with the key with fingerprint
> >> >>> D9839159:
> >> >>> http://www.apache.org/dist/flink/KEYS
> >> >>>
> >> >>> The staging repository for this release can be found at:
> >> >>> 

Re: [DISCUSS] Code style / checkstyle

2017-04-26 Thread Henry Saputra
Cool! So, it begins =)

- Henry

On Wed, Apr 26, 2017 at 7:30 AM, Aljoscha Krettek 
wrote:

> I merged the stricter checkstyle for flink-streaming-java today. (Sans
> checking for right curly braces)
>
> > On 18. Apr 2017, at 22:17, Chesnay Schepler  wrote:
> >
> > +1 to use earth rotation as the new standard time unit. Maybe more
> > importantly, I'm absolutely in favor of merging it.
> >
> > On 18.04.2017 20:39, Aljoscha Krettek wrote:
> >> I rebased the PR [1] on current master. Is there any strong objection
> >> against merging this (minus the two last commits which introduce
> >> curly-brace-style checking)? If not, I would like to merge this after two
> >> earth rotations, i.e. after all the time zones have had some time to react.
> >>
> >> The complete set of checks has been listed by Chesnay (via Greg) before
> >> but the gist of it is that we only add common-sense checks that most people
> >> should be able to agree upon so that we avoid edit wars (especially when it
> >> comes to whitespace, import order and Javadoc paragraph styling).
> >>
> >> [1] https://github.com/apache/flink/pull/3567
> >>> On 5. Apr 2017, at 23:54, Chesnay Schepler  wrote:
> >>>
> >>> I agree to just allow both. While I definitely prefer 1), I can see why
> >>> someone might prefer 2).
> >>>
> >>> Wouldn't want to delay this anymore; can't wait to add this to
> >>> flink-metrics and flink-python...
> >>>
> >>> On 03.04.2017 18:33, Aljoscha Krettek wrote:
>  I think enough people did already look at the checkstyle rules
> proposed in the PR.
> 
>  On most of the rules reaching consensus is easy so I propose to
> enable all rules except those regarding placement of curly braces and
> control flow formatting. The two styles in the Flink code base are:
> 
>  1)
>  if () {
>  } else {
>  }
> 
>  try {
>  } catch () {
>  }
> 
>  and
> 
>  2)
> 
>  if () {
>  }
>  else {
>  }
> 
>  try {
>  }
>  catch () {
>  }
> 
>  I think it’s hard to reach consensus on these so I suggest we keep
> allowing both styles.
> 
>  Any comments very welcome! :-)
> 
>  Best,
>  Aljoscha
> > On 19. Mar 2017, at 17:09, Aljoscha Krettek 
> wrote:
> >
> > I played around with this over the weekend and it turns out that
> the required changes in flink-streaming-java are not that big. I created a
> PR with a proposed checkstyle.xml and the required code changes:
> https://github.com/apache/flink/pull/3567. There’s a longer description of the style in the PR.
> The commits add progressively more invasive changes so we can start with the
> more lightweight changes if we want to.
> >
> > If we want to go forward with this I would also encourage other
> people to use this for different modules and see how it turns out.
> >
> > Best,
> > Aljoscha
> >
> >> On 18 Mar 2017, at 08:00, Aljoscha Krettek wrote:
> >>
> >> I added an issue for adding a custom checkstyle.xml for
> flink-streaming-java so that we can gradually add checks:
> https://issues.apache.org/jira/browse/FLINK-6107. I outlined the
> procedure in the Jira. We can use this as a pilot project and see how it
> goes and then gradually also apply to other modules.
> >>
> >> What do you think?
> >>
> >>> On 6 Mar 2017, at 12:42, Stephan Ewen wrote:
> >>>
> >>> A singular "all reformat in one instant" will cause immense damage
> to the
> >>> project, in my opinion.
> >>>
> >>> - There are so many pull requests that we are having a hard time keeping
> >>> up, and merging will become a lot more time-intensive.
> >>> - I personally have many forked branches with WIP features that will
> >>> probably never go in if the branches become unmergeable. I expect that to
> >>> be true for many other committers and contributors.
> >>> - Some companies have Flink forks and are rebasing patches onto master
> >>> regularly. They will be completely screwed by a full reformat.
> >>>
> >>> If we do something, the only thing that really is possible is:
> >>>
> >>> (1) Define a style. Ideally not too far away from Flink's style.
> >>> (2) Apply it to new projects/modules
> >>> (3) Coordinate carefully to pull it into existing modules, module by
> >>> module. Leaving time to adopt pull requests bit by bit, and allowing forks
> >>> to go through minor merges, rather than total conflict.
> >>>
> >>>
> >>>
> >>> On Wed, Mar 1, 2017 at 5:57 PM, Henry Saputra <henry.sapu...@gmail.com>
> >>> wrote:
> >>>
>  It is actually the Databricks Scala guide to help 

[jira] [Created] (FLINK-6392) Change the alias of Window from optional to essential.

2017-04-26 Thread sunjincheng (JIRA)
sunjincheng created FLINK-6392:
--

 Summary: Change the alias of Window from optional to essential.
 Key: FLINK-6392
 URL: https://issues.apache.org/jira/browse/FLINK-6392
 Project: Flink
  Issue Type: Improvement
  Components: Table API & SQL
Affects Versions: 1.3.0
Reporter: sunjincheng
Assignee: sunjincheng
 Fix For: 1.3.0


Currently, the window clause use case looks like:
{code}
tab //Table('a,'b,'c)
   .window(Slide over 10.milli every 5.milli as 'w)
   .groupBy('w,'a,'b) // WindowGroupedTable
   .select('a, 'b, 'c.sum, 'w.start, 'w.end)
{code}
As we can see, the alias of the window is essential. But the current implementation of 
the Table API does not enforce this constraint on the alias, so we should refactor 
the API definition to use the type system to enforce the alias.
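
A minimal sketch of the idea (hypothetical names; the real Table API types will 
differ): make window() accept only a value that as() can produce, so an un-aliased 
window no longer compiles.
{code}
// Hypothetical sketch, not the actual Table API types.
final class AliasByTypeSketch {
    interface Window { }                                  // only as() produces this
    interface WindowBuilder { Window as(String alias); }  // aliasing is the only path to a Window
    interface WindowedTable { }
    interface Table {
        WindowedTable window(Window w);                   // un-aliased builders are rejected at compile time
    }
}
{code}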



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6391) fix build for scala 2.11 (gelly-examples)

2017-04-26 Thread Eron Wright (JIRA)
Eron Wright  created FLINK-6391:
---

 Summary: fix build for scala 2.11 (gelly-examples)
 Key: FLINK-6391
 URL: https://issues.apache.org/jira/browse/FLINK-6391
 Project: Flink
  Issue Type: Bug
  Components: Build System
Reporter: Eron Wright 
Assignee: Eron Wright 


After switching the build to Scala 2.11 (using 
`tools/change-scala-version.sh`), the build fails in the flink-dist module.

{code}
...
[INFO] flink-dist . FAILURE [ 19.337 s]
[INFO] flink-fs-tests . SKIPPED
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 31:16 min
[INFO] Finished at: 2017-04-26T15:17:43-07:00
[INFO] Final Memory: 380M/1172M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-assembly-plugin:2.6:single (bin) on project 
flink-dist_2.11: Failed to create assembly: Error adding file to archive: 
/Users/wrighe3/Projects/flink/flink-dist/../flink-libraries/flink-gelly-examples/target/flink-gelly-examples_2.10-1.3-SNAPSHOT.jar
 -> [Help 1]
{code}

The root cause appears to be that the change-scala-version tool does not update 
flink-dist/.../assemblies/bin.xml, which still references the Scala 2.10 version of 
flink-gelly-examples.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6390) Add Trigger Hooks to the Checkpoint Coordinator

2017-04-26 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-6390:
---

 Summary: Add Trigger Hooks to the Checkpoint Coordinator
 Key: FLINK-6390
 URL: https://issues.apache.org/jira/browse/FLINK-6390
 Project: Flink
  Issue Type: New Feature
  Components: State Backends, Checkpointing
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 1.3.0


Some source systems need to be notified prior to starting a checkpoint, in 
order to do preparatory work for the checkpoint.

I propose to add an interface to allow sources to register hooks that are 
called by the checkpoint coordinator when triggering / restoring a checkpoint.
These hooks may produce state that is stored with the checkpoint metadata.

Envisioned interface for the hooks
{code}
/**
 * The interface for hooks that can be called by the checkpoint coordinator when triggering or
 * restoring a checkpoint. Such a hook is useful for example when preparing external systems for
 * taking or restoring checkpoints.
 *
 * <p>The {@link #triggerCheckpoint(long, long, Executor)} method (called when triggering a
 * checkpoint) can return a result (via a future) that will be stored as part of the checkpoint
 * metadata. When restoring a checkpoint, that stored result will be given to the
 * {@link #restoreCheckpoint(long, Object)} method. The hook's {@link #getIdentifier() identifier}
 * is used to map data to hook in the presence of multiple hooks, and when resuming a savepoint
 * that was potentially created by a different job. The identifier has a similar role as for
 * example the operator UID in the streaming API.
 *
 * <p>The MasterTriggerRestoreHook is defined when creating the streaming dataflow graph. It is
 * attached to the job graph, which gets sent to the cluster for execution. To avoid having to
 * make the hook itself serializable, these hooks are attached to the job graph via a
 * {@link MasterTriggerRestoreHook.Factory}.
 *
 * @param <T> The type of the data produced by the hook and stored as part of the checkpoint
 *            metadata. If the hook never stores any data, this can be typed to {@code Void}.
 */
public interface MasterTriggerRestoreHook<T> {

    /**
     * Gets the identifier of this hook. The identifier is used to identify a specific hook in
     * the presence of multiple hooks and to give it the correct checkpointed data upon
     * checkpoint restoration.
     *
     * <p>The identifier should be unique between different hooks of a job, but
     * deterministic/constant so that upon resuming a savepoint, the hook will get the correct
     * data. For example, if the hook calls into another storage system and persists
     * namespace/schema-specific information, then the name of the storage system, together with
     * the namespace/schema name, could be an appropriate identifier.
     *
     * <p>When multiple hooks of the same name are created and attached to a job graph, only
     * the first one is actually used. This can be exploited to deduplicate hooks that would do
     * the same thing.
     *
     * @return The identifier of the hook.
     */
    String getIdentifier();

    /**
     * This method is called by the checkpoint coordinator when triggering a checkpoint, prior
     * to sending the "trigger checkpoint" messages to the source tasks.
     *
     * <p>If the hook implementation wants to store data as part of the checkpoint, it may
     * return that data via a future, otherwise it should return null. The data is stored as
     * part of the checkpoint metadata under the hook's identifier (see {@link #getIdentifier()}).
     *
     * <p>If the action by this hook needs to be executed synchronously, then this method
     * should directly execute the action synchronously and block until it is complete. The
     * returned future (if any) would typically be a completed future.
     *
     * <p>If the action should be executed asynchronously and only needs to complete before
     * the checkpoint is considered completed, then the method may use the given executor to
     * execute the actual action and would signal its completion by completing the future. For
     * hooks that do not need to store data, the future would be completed with null.
     *
     * @param checkpointId The ID (logical timestamp, monotonically increasing) of the checkpoint
     * @param timestamp The wall clock timestamp when the checkpoint was triggered, for
     *                  info/logging purposes.
     * @param executor The executor for asynchronous actions
     *
     * @return Optionally, a future that signals when the hook has completed and that contains
     *         data to be stored with the checkpoint.
     *
     * @throws Exception 
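
For illustration, here is a minimal sketch of what a hook implementation could look 
like. Class, method signatures, and system names are inferred from the Javadoc above 
and should be treated as assumptions, not as the final interface.
{code}
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

// Hypothetical sketch; signatures inferred from the proposal's Javadoc.
public class StorageSystemSnapshotHook /* implements MasterTriggerRestoreHook<String> */ {

    // Deterministic identifier: storage system plus namespace, as suggested above.
    public String getIdentifier() {
        return "my-storage-system:orders-namespace";
    }

    // Called before the "trigger checkpoint" messages go out to the sources;
    // the returned future carries the data stored with the checkpoint metadata.
    public CompletableFuture<String> triggerCheckpoint(
            long checkpointId, long timestamp, Executor executor) {
        return CompletableFuture.supplyAsync(
                () -> takeExternalSnapshot(checkpointId), executor);
    }

    // On restore, receives back whatever triggerCheckpoint stored.
    public void restoreCheckpoint(long checkpointId, String snapshotHandle) {
        reattachExternalSnapshot(snapshotHandle);
    }

    private String takeExternalSnapshot(long checkpointId) {
        // take the snapshot in the external system and return its handle
        return "external-snapshot-" + checkpointId;
    }

    private void reattachExternalSnapshot(String handle) {
        // roll the external system back to the given snapshot
    }
}
{code}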

[jira] [Created] (FLINK-6388) Add support for DISTINCT into Code Generated Aggregations

2017-04-26 Thread Stefano Bortoli (JIRA)
Stefano Bortoli created FLINK-6388:
--

 Summary: Add support for DISTINCT into Code Generated Aggregations
 Key: FLINK-6388
 URL: https://issues.apache.org/jira/browse/FLINK-6388
 Project: Flink
  Issue Type: Sub-task
  Components: DataStream API
Affects Versions: 1.3.0
Reporter: Stefano Bortoli
Assignee: Stefano Bortoli
 Fix For: 1.3.0


We should support DISTINCT in code-generated aggregation functions.
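
For context, this is the kind of query the generated aggregations would then have to 
handle (a sketch; whether the Table API of this version exposes exactly this entry 
point is an assumption):
{code}
// Hypothetical usage sketch; tableEnv and the "Orders" table are assumed to exist.
Table result = tableEnv.sql(
    "SELECT user, COUNT(DISTINCT product) AS cnt " +  // DISTINCT inside an aggregate is
    "FROM Orders GROUP BY user");                     // what the generated code must support
{code}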



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Question on window ProcessFunction

2017-04-26 Thread 魏偉哲
Hi Aljoscha,

I see. Thanks for your reply.

Best,
Tony Wei

On Wed, Apr 26, 2017 at 10:29 PM, Aljoscha Krettek wrote:

> Hi,
> Both implementations work so no one bothered to change the older
> implementations yet. I don’t think it’s a problem but if you want you can
> adapt reduce/fold to the newer implementation.
>
> Best,
> Aljoscha
> > On 26. Apr 2017, at 14:51, 魏偉哲  wrote:
> >
> > Hi Aljoscha,
> >
> > I know the aggregate code is newer. I am confused because the
> > implementations are not consistent.
> > Does it mean that the reduce/fold implementation would need to be
> > refactored for the purpose of having fewer layers?
> > Or is it better to keep the current implementations for some reason?
> >
> > Many thanks,
> > Tony Wei
> >
> > 2017-04-26 20:24 GMT+08:00 Aljoscha Krettek :
> >
> >> Hi Tony,
> >> The reason for this is that the aggregate code is newer. The new code has
> >> fewer layers, compared to the reduce/fold implementation where it is
> >> InternalFunction(ReduceApplyFunction(Reduce)) instead of
> >> InternalAggregateFunction(Aggregate).
> >>
> >> Best,
> >> Aljoscha
> >>> On 26. Apr 2017, at 06:39, 魏偉哲  wrote:
> >>>
> >>> Hi all,
> >>>
> >>> Recently, I was tracing the source code in the streaming API and I was
> >>> confused about some implementations.
> >>>
> >>> When using a reduce function with an evictor, the *WindowedStream* will wrap
> >>> the *ReduceFunction* and *ProcessWindowFunction* into a
> >>> *ReduceApplyProcessWindowFunction* and put it in an
> >>> *InternalIterableProcessWindowFunction*. So does the fold function.
> >>>
> >>> However, when using aggregate, the *InternalIterableProcessWindowFunction*
> >>> was changed to an *InternalAggregateProcessWindowFunction*, which applies
> >>> the aggregation in the process() method.
> >>>
> >>> My question is: why not implement an *AggregateApplyProcessWindowFunction*
> >>> and use *InternalIterableProcessWindowFunction* instead, just like the
> >>> reduce and fold functions do. Is there any concern?
> >>>
> >>> Many thanks,
> >>> Tony Wei
> >>
> >>
>
>


Re: Towards a spec for robust streaming SQL, Part 1

2017-04-26 Thread Tyler Akidau
No worries, thanks for the heads up. Good luck wrapping all that stuff up.

-Tyler

On Tue, Apr 25, 2017 at 12:07 AM Fabian Hueske  wrote:

> Hi Tyler,
>
> thanks for pushing this effort and including the Flink list.
> I haven't managed to read the doc yet, but just wanted to thank you for the
> write-up and let you know that I'm very interested in this discussion.
>
> We are very close to the feature freeze of Flink 1.3 and I'm quite busy
> getting as many contributions merged before the release is forked off.
> When that happened, I'll have more time to read and comment.
>
> Thanks,
> Fabian
>
>
> 2017-04-22 0:16 GMT+02:00 Tyler Akidau :
>
> > Good point, when you start talking about anything less than a full join,
> > triggers get involved to describe how one actually achieves the desired
> > semantics, and they may end up being tied to just one of the inputs (e.g.,
> > you may only care about the watermark for one side of the join). Am
> > expecting us to address these sorts of details more precisely in doc #2.
> >
> > -Tyler
> >
> > On Fri, Apr 21, 2017 at 2:26 PM Kenneth Knowles 
> > wrote:
> >
> > > There's something to be said about having different triggering depending on
> > > which side of a join data comes from, perhaps?
> > >
> > > (delightful doc, as usual)
> > >
> > > Kenn
> > >
> > > On Fri, Apr 21, 2017 at 1:33 PM, Tyler Akidau wrote:
> > >
> > > > Thanks for reading, Luke. The simple answer is that CoGBK is
> basically
> > > > flatten + GBK. Flatten is a non-grouping operation that merges the
> > input
> > > > streams into a single output stream. GBK then groups the data within
> > that
> > > > single union stream as you might otherwise expect, yielding a single
> > > table.
> > > > So I think it doesn't really impact things much. Grouping,
> aggregation,
> > > > window merging etc all just act upon the merged stream and generate
> > what
> > > is
> > > > effectively a merged table.
> > > >
> > > > -Tyler
> > > >
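
For anyone tracing the "flatten + GBK" point above, a rough sketch in the Beam Java
SDK (assuming the CoGroupByKey API; tags and collection names are illustrative):

import org.apache.beam.sdk.transforms.join.CoGbkResult;
import org.apache.beam.sdk.transforms.join.CoGroupByKey;
import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.TupleTag;

class CoGbkSketch {
    static PCollection<KV<String, CoGbkResult>> coGroup(
            PCollection<KV<String, Long>> clicks,
            PCollection<KV<String, Long>> views) {
        TupleTag<Long> clicksTag = new TupleTag<>();
        TupleTag<Long> viewsTag = new TupleTag<>();
        // Conceptually: flatten both keyed streams into one union stream,
        // then group by key, yielding a single merged table.
        return KeyedPCollectionTuple.of(clicksTag, clicks)
                .and(viewsTag, views)
                .apply(CoGroupByKey.create());
    }
}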
> > > > On Fri, Apr 21, 2017 at 12:36 PM Lukasz Cwik wrote:
> > > >
> > > > > The doc is a good read.
> > > > >
> > > > > I think you do a great job of explaining table -> stream, stream -> stream,
> > > > > and stream -> table when there is only one stream.
> > > > > But when there are multiple streams reading/writing to a table, how does
> > > > > that impact what occurs?
> > > > > For example, with CoGBK you have multiple streams writing to a table, how
> > > > > does that impact window merging?
> > > > >
> > > > > On Thu, Apr 20, 2017 at 5:57 PM, Tyler Akidau wrote:
> > > > >
> > > > > > Hello Beam, Calcite, and Flink dev lists!
> > > > > >
> > > > > > Apologies for the big cross post, but I thought this might be something
> > > > > > all three communities would find relevant.
> > > > > >
> > > > > > Beam is finally making progress on a SQL DSL utilizing Calcite, thanks
> > > > > > to Mingmin Xu. As you can imagine, we need to come to some conclusion
> > > > > > about how to elegantly support the full suite of streaming functionality
> > > > > > in the Beam model via Calcite SQL. You folks in the Flink community have
> > > > > > been pushing on this (e.g., adding windowing constructs, amongst others,
> > > > > > thank you! :-), but from my understanding we still don't have a full spec
> > > > > > for how to support robust streaming in SQL (including but not limited to,
> > > > > > e.g., a triggers analogue such as EMIT).
> > > > > >
> > > > > > I've been spending a lot of time thinking about this and have some
> > > > > > opinions about how I think it should look that I've already written down,
> > > > > > so I volunteered to try to drive forward agreement on a general
> > > > > > streaming SQL spec between our three communities (well, technically I
> > > > > > volunteered to do that w/ Beam and Calcite, but I figured you Flink folks
> > > > > > might want to join in since you're going that direction already anyway
> > > > > > and will have useful insights :-).
> > > > > >
> > > > > > My plan was to do this by sharing two docs:
> > > > > >
> > > > > >    1. The Beam Model : Streams & Tables - This one is for context, and
> > > > > >    really only mentions SQL in passing. But it describes the relationship
> > > > > >    between the Beam Model and the "streams & tables" way of thinking,
> > > > > >    which turns out to be useful in understanding what robust streaming
> > > > > >    in SQL might look like. Many of you probably already know some or all
> > > > > >    of what's in here, but I felt it 

Re: [DISCUSS] Code style / checkstyle

2017-04-26 Thread Aljoscha Krettek
I merged the stricter checkstyle for flink-streaming-java today. (Sans checking 
for right curly braces)

> On 18. Apr 2017, at 22:17, Chesnay Schepler  wrote:
> 
> +1 to use earth rotation as the new standard time unit. Maybe more 
> importantly, I'm absolutely in favor of merging it.
> 
> On 18.04.2017 20:39, Aljoscha Krettek wrote:
>> I rebased the PR [1] on current master. Is there any strong objection 
>> against merging this (minus the two last commits which introduce 
>> curly-brace-style checking). If not, I would like to merge this after two 
>> earth rotations, i.e. after all the time zones have had some time to react.
>> 
>> The complete set of checks has been listed by Chesnay (via Greg) before but 
>> the gist of it is that we only add common-sense checks that most people 
>> should be able to agree upon so that we avoid edit wars (especially when it 
>> comes to whitespace, import order and Javadoc paragraph styling).
>> 
>> [1] https://github.com/apache/flink/pull/3567
>>> On 5. Apr 2017, at 23:54, Chesnay Schepler  wrote:
>>> 
>>> I agree to just allow both. While I definitely prefer 1), I can see why 
>>> someone might prefer 2).
>>> 
>>> Wouldn't want to delay this anymore; can't wait to add this to 
>>> flink-metrics and flink-python...
>>> 
>>> On 03.04.2017 18:33, Aljoscha Krettek wrote:
 I think enough people did already look at the checkstyle rules proposed in 
 the PR.
 
 On most of the rules reaching consensus is easy so I propose to enable all 
 rules except those regarding placement of curly braces and control flow 
 formatting. The two styles in the Flink code base are:
 
 1)
 if () {
 } else {
 }
 
 try {
 } catch () {
 }
 
 and
 
 2)
 
 if () {
 }
 else {
 }
 
 try {
 }
 catch () {
 }
 
 I think it’s hard to reach consensus on these so I suggest we keep 
 allowing both styles.
 
 Any comments very welcome! :-)
 
 Best,
 Aljoscha
> On 19. Mar 2017, at 17:09, Aljoscha Krettek  wrote:
> 
> I played around with this over the weekend and it turns out that the 
> required changes in flink-streaming-java are not that big. I created a PR 
> with a proposed checkstyle.xml and the required code changes: 
> https://github.com/apache/flink/pull/3567. There’s a longer description 
> of the style in the PR. The commits add progressively more invasive 
> changes so we can start with the more lightweight changes if we want to.
> 
> If we want to go forward with this I would also encourage other people to 
> use this for different modules and see how it turns out.
> 
> Best,
> Aljoscha
> 
>> On 18 Mar 2017, at 08:00, Aljoscha Krettek wrote:
>> 
>> I added an issue for adding a custom checkstyle.xml for 
>> flink-streaming-java so that we can gradually add checks: 
>> https://issues.apache.org/jira/browse/FLINK-6107. I outlined the 
>> procedure in the Jira. We can use this as a pilot project and see how it 
>> goes and then gradually also apply to other modules.
>> 
>> What do you think?
>> 
>>> On 6 Mar 2017, at 12:42, Stephan Ewen wrote:
>>> 
>>> A singular "all reformat in one instant" will cause immense damage to 
>>> the
>>> project, in my opinion.
>>> 
>>> - There are so many pull requests that we are having a hard time keeping
>>> up, and merging will become a lot more time-intensive.
>>> - I personally have many forked branches with WIP features that will
>>> probably never go in if the branches become unmergeable. I expect that to
>>> be true for many other committers and contributors.
>>> - Some companies have Flink forks and are rebasing patches onto master
>>> regularly. They will be completely screwed by a full reformat.
>>> 
>>> If we do something, the only thing that really is possible is:
>>> 
>>> (1) Define a style. Ideally not too far away from Flink's style.
>>> (2) Apply it to new projects/modules
>>> (3) Coordinate carefully to pull it into existing modules, module by
>>> module. Leaving time to adopt pull requests bit by bit, and allowing forks
>>> to go through minor merges, rather than total conflict.
>>> 
>>> 
>>> 
>>> On Wed, Mar 1, 2017 at 5:57 PM, Henry Saputra wrote:
>>> 
 It is actually the Databricks Scala guide to help contributions to Apache Spark,
 so not really the official Spark Scala guide.
 The style guide feels a bit more 

Re: Question on window ProcessFunction

2017-04-26 Thread Aljoscha Krettek
Hi,
Both implementations work so no one bothered to change the older 
implementations yet. I don’t think it’s a problem but if you want you can adapt 
reduce/fold to the newer implementation.

Best,
Aljoscha
> On 26. Apr 2017, at 14:51, 魏偉哲  wrote:
> 
> Hi Aljoscha,
> 
> I know the aggregate code is newer. I am confused because the
> implementations are not consistent.
> Does it mean that the reduce/fold implementation would need to be
> refactored for the purpose of having fewer layers?
> Or is it better to keep the current implementations for some reason?
> 
> Many thanks,
> Tony Wei
> 
> 2017-04-26 20:24 GMT+08:00 Aljoscha Krettek :
> 
>> Hi Tony,
>> The reason for this is that the aggregate code is newer. The new code has
>> fewer layers, compared to the reduce/fold implementation where it is 
>> InternalFunction(ReduceApplyFunction(Reduce)) instead of
>> InternalAggregateFunction(Aggregate).
>> 
>> Best,
>> Aljoscha
>>> On 26. Apr 2017, at 06:39, 魏偉哲  wrote:
>>> 
>>> Hi all,
>>> 
>>> Recently, I was tracing the source code in the streaming API and I was
>>> confused about some implementations.
>>> 
>>> When using a reduce function with an evictor, the *WindowedStream* will wrap the
>>> *ReduceFunction* and *ProcessWindowFunction* into a
>>> *ReduceApplyProcessWindowFunction* and put it in an
>>> *InternalIterableProcessWindowFunction*. So does the fold function.
>>> 
>>> However, when using aggregate, the *InternalIterableProcessWindowFunction*
>>> was changed to an *InternalAggregateProcessWindowFunction*, which applies
>>> the aggregation in the process() method.
>>> 
>>> My question is: why not implement an *AggregateApplyProcessWindowFunction*
>>> and use *InternalIterableProcessWindowFunction* instead, just like the
>>> reduce and fold functions do. Is there any concern?
>>> 
>>> Many thanks,
>>> Tony Wei
>> 
>> 



Re: Question on window ProcessFunction

2017-04-26 Thread 魏偉哲
Hi Aljoscha,

I know the aggregate code is newer. I am confused because the
implementations are not consistent.
Does it mean that the reduce/fold implementation would need to be
refactored for the purpose of having fewer layers?
Or is it better to keep the current implementations for some reason?

Many thanks,
Tony Wei

2017-04-26 20:24 GMT+08:00 Aljoscha Krettek :

> Hi Tony,
> The reason for this is that the aggregate code is newer. The new code has
> fewer layers, compared to the reduce/fold implementation where it is
> InternalFunction(ReduceApplyFunction(Reduce)) instead of
> InternalAggregateFunction(Aggregate).
>
> Best,
> Aljoscha
> > On 26. Apr 2017, at 06:39, 魏偉哲  wrote:
> >
> > Hi all,
> >
> > Recently, I was tracing the source code in the streaming API and I was
> > confused about some implementations.
> >
> > When using a reduce function with an evictor, the *WindowedStream* will wrap the
> > *ReduceFunction* and *ProcessWindowFunction* into a
> > *ReduceApplyProcessWindowFunction* and put it in an
> > *InternalIterableProcessWindowFunction*. So does the fold function.
> >
> > However, when using aggregate, the *InternalIterableProcessWindowFunction*
> > was changed to an *InternalAggregateProcessWindowFunction*, which applies
> > the aggregation in the process() method.
> >
> > My question is: why not implement an *AggregateApplyProcessWindowFunction*
> > and use *InternalIterableProcessWindowFunction* instead, just like the
> > reduce and fold functions do. Is there any concern?
> >
> > Many thanks,
> > Tony Wei
>
>


Re: Question on window ProcessFunction

2017-04-26 Thread Aljoscha Krettek
Hi Tony,
The reason for this is that the aggregate code is newer. The new code has fewer 
layers, compared to the reduce/fold implementation where it is 
InternalFunction(ReduceApplyFunction(Reduce)) instead of 
InternalAggregateFunction(Aggregate).

Best,
Aljoscha  
> On 26. Apr 2017, at 06:39, 魏偉哲  wrote:
> 
> Hi all,
> 
> Recently, I was tracing the source code in the streaming API and I was confused
> about some implementations.
> 
> When using a reduce function with an evictor, the *WindowedStream* will wrap the
> *ReduceFunction* and *ProcessWindowFunction* into a
> *ReduceApplyProcessWindowFunction* and put it in an
> *InternalIterableProcessWindowFunction*. So does the fold function.
> 
> However, when using aggregate, the *InternalIterableProcessWindowFunction*
> was changed to an *InternalAggregateProcessWindowFunction*, which applies
> the aggregation in the process() method.
> 
> My question is: why not implement an *AggregateApplyProcessWindowFunction*
> and use *InternalIterableProcessWindowFunction* instead, just like the
> reduce and fold functions do. Is there any concern?
> 
> Many thanks,
> Tony Wei



[jira] [Created] (FLINK-6387) Flink UI support access log

2017-04-26 Thread shijinkui (JIRA)
shijinkui created FLINK-6387:


 Summary: Flink UI support access log
 Key: FLINK-6387
 URL: https://issues.apache.org/jira/browse/FLINK-6387
 Project: Flink
  Issue Type: Improvement
  Components: Webfrontend
Reporter: shijinkui
Assignee: shijinkui


Record user requests in an access log, appending each access to the log file.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)