date:20170320

On this guava, it looks fine when i rebuild the flink using maven 3.2.5.
Doesn't get a chance to test the httpclient, but i did try to use a
decompiler to check the flink-dist_2.11-1.2.0.jar, there is no package
org.apache.http exposed. So i would think it's fine.

So the 6125 is possibly the same issue like mine: incorrect maven version
used to build flink distribution...



On Mon, Mar 20, 2017 at 10:54 PM, Ufuk Celebi  wrote:

> I don't think so:
>
> https://issues.apache.org/jira/browse/FLINK-6125
>
> From a related issue brought to the mailing list by another user.
>
>
> On Mon, Mar 20, 2017 at 2:36 PM, Aljoscha Krettek 
> wrote:
> > So it is resolved now? Thanks for letting us know!
> >> On 20 Mar 2017, at 12:55, Liangfei Su  wrote:
> >>
> >> Well...Comments added.
> >>
> >> This looks due to incorrect local build of flink.
> >>
> >> Documentation at
> >> https://ci.apache.org/projects/flink/flink-docs-
> release-1.2/setup/building.html
> >> shows
> >> clearly that
> >>
> >> """
> >> NOTE: Maven 3.3.x can build Flink, but will not properly shade away
> certain
> >> dependencies. Maven 3.0.3 creates the libraries properly. To build unit
> >> tests with Java 8, use Java 8u51 or above to prevent failures in unit
> tests
> >> that use the PowerMock runner.
> >>
> >> """
> >> My local flink jar was build by maven3.3.3, which the shade of guava is
> not
> >> well set. Cause the conflict of elastic search(guava 18) and the
> packaged
> >> ones(should be the ones from hadoop related)
> >>
> >> On Mon, Mar 20, 2017 at 7:04 PM, Liangfei Su 
> wrote:
> >>
> >>> Another finding that it looks most of these kinds of conflict comes
> from
> >>> the dependency to hadoop/hbase (by using mvn dependency to check).
> >>>
> >>> It looks to me the correct way is to shade more packages from the
> hadoop
> >>> dependencies...  Thoughts here?
> >>>
> >>> Thanks
> >>> Ralph
> >>>
> >>>
> >>>
> >>> On Mon, Mar 20, 2017 at 6:59 PM, Liangfei Su 
> wrote:
> >>>
>  Hi,
> 
>  I try to write some user function to write elastic search, my project
>  comes with elastic search 2.3.5 (which is the same to flink connectors
>  elastic search). But i'm seeing be exception that recorded in the
> below
>  JIRA, by enabling the java -version:class, it show it's the guava
>  dependency issue:
>  Elastic search 2.3.5 transport client is using guava 18.0, and called
> one
>  method starting from guava 18.0, MoreExecutors.directExecutor
> 
>  While the class loading log show the active runtime MoreExecutors
> class
>  loaded from flink distribution, which cause the
> java.lang.NoSuchMethodErro
>  r.
> 
>  Based on above finding, it looks es 2.3.5 is not able to be used with
>  flink1.2.0 (and then the connectors-elasticsearch is broken)? Can
> someone
>  help clarify?
> 
>  Also, it looks some of the fink-core actually use the shade way to
> rename
>  the class like from com.google.guava to org.apache.flink.***.com.
> google.guava
>  which is actually a fix of this kind of issue. Etc.
>  https://issues.apache.org/jira/browse/FLINK-4587/https:
>  //issues.apache.org/jira/browse/FLINK-3373.
> 
> 
>  My flink cluster is v1.2.0, running in docker.
> 
> 
>  Thanks,
>  Ralph
> 
>  -- Forwarded message --
>  From: Su Ralph (JIRA) 
>  Date: Mon, Mar 20, 2017 at 6:41 PM
>  Subject: [jira] [Updated] (FLINK-6126) Yet another conflict : guava
>  To: suliang...@gmail.com
> 
> 
> 
>  [ https://issues.apache.org/jira/browse/FLINK-6126?page=com.at
>  lassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
> 
>  Su Ralph updated FLINK-6126:
>  
> Description:
>  When write a user function try to write to elastic search (depend on
>  elasticsearch 2.3.5)
> 
>  Stack like:
>  java.lang.NoSuchMethodError: com.google.common.util.concurr
>  ent.MoreExecutors.directExecutor()Ljava/util/concurrent/Executor;
> at org.elasticsearch.threadpool.ThreadPool.(
> ThreadPool.
>  java:190)
> at org.elasticsearch.client.transport.TransportClient$
> Builder.b
>  uild(TransportClient.java:131)
> at io.sherlock.capabilities.es.AbstractEsSink.open(
> AbstractEsSi
>  nk.java:98)
> 
>  When enable env.java.opts.taskmanager to -version:class, we can see
> the
>  class load log like:
>  [Loaded com.google.common.util.concurrent.MoreExecutors from
>  file:/opt/flink/lib/flink-dist_2.11-1.2.0.jar]
> 
>  The user code is using guva of 18.0.
> 
>   was:
>  For some reason I need to use org.apache.httpcomponents:
> httpasyncclient:4.1.2
>  in flink.
>  The source file is:
>  {code}
>  import

[jira] [Created] (FLINK-6137) Add custom checkstyle for flink-cep

2017-03-20 Thread Dawid Wysakowicz (JIRA)

Dawid Wysakowicz created FLINK-6137:
---

 Summary: Add custom checkstyle for flink-cep
 Key: FLINK-6137
 URL: https://issues.apache.org/jira/browse/FLINK-6137
 Project: Flink
  Issue Type: Improvement
  Components: CEP
Reporter: Dawid Wysakowicz
Priority: Minor


Add a custom checkstyle.xml for `flink-cep` library as in [FLINK-6107]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Re: Machine Learning on Flink - Next steps

2017-03-20 Thread Theodore Vasiloudis

Hello all,

I've updated the original Gdoc [1] to include a table with coordinators
and people interested in contributing to the specific projects. With
this latest additions we have many people willing to contribute to
the online learning library, and 2 people who have shown interested
to at least one of the other projects. Feel free to reassign yourself
if you feel like it, these are all indicative of intention anyway, not
commitments (except for the coordinators).

I don't think I'll have the time to set up the online learning doc this
week,
if anyone would like to jump ahead and do that feel free.
Gabor has already started it for the "fast-batch" project, and Stavros has
started with the model serving project as well :)

@Ventura: I would love to see the design principles and abstractions you
have
created for that project, let us know if there is anything you can share
now.

Regards,
Theodore


[1]
https://docs.google.com/document/d/1afQbvZBTV15qF3vobVWUjxQc49h3Ud06MIRhahtJ6dw/edit?usp=sharing

On Mon, Mar 20, 2017 at 3:56 PM, Gábor Hermann 
wrote:

> Hi all,
>
> @Theodore:
> +1 for the CTR use-case. Thanks for the suggestion!
>
> @Katherin:
> +1 for reflecting the choices made here and contributor commitment in Gdoc.
>
> @Tao, @Ventura:
> It's great to here you have been working on ML on Flink :)
> I hope we can all aggregate our efforts somehow. It would be best if you
> could contribute some of your work.
>
>
> I've started putting together a Gdoc specifically for *Offline/incremental
> learning on Streaming API*:
> https://docs.google.com/document/d/18BqoFTQ0dPkbyO-PWBMMpW5N
> l0pjobSubnWpW0_r8yA/
> Right now you can comment/give suggestions there. I'd like to start a
> separate mailing list discussion as soon as there are enough contributors
> volunteering for this direction. For now, I'm trying to reflect the
> relevant part of the discussion here and the initial Gdoc [1].
>
>
> [1] https://docs.google.com/document/d/1afQbvZBTV15qF3vobVWUjxQc
> 49h3Ud06MIRhahtJ6dw/
>
> Cheers,
> Gabor
>
>
> On 2017-03-20 14:27, Ventura Del Monte wrote:
>
> Hello everyone,
>>
>> Here at DFKI, we are currently working on project that involves developing
>> open-source Online Machine Learning algorithms on top of Flink.
>> So far, we have simple moments, sampling (e.g.: simple reservoir sampling)
>> and sketches (e.g., Frequent Directions) built on top of scikit-like
>> abstractions and Flink's DataStream/KeyedStream.
>> Moreover, we have few industrial use cases and we are gonna validate our
>> implementation using real industrial data.
>> We plan to implement more advanced algorithms in the future as well as to
>> share our results with you and contribute, in case you are interested.
>>
>> Best,
>> Ventura
>>
>>
>>
>>
>> This message, for the D. Lgs n. 196/2003 (Privacy Code), may contain
>> confidential and/or privileged information. If you are not the addressee
>> or
>> authorized to receive this for the addressee, you must not use, copy,
>> disclose or take any action based on this message or any information
>> herein. If you have received this message in error, please advise the
>> sender immediately by reply e-mail and delete this message. Thank you for
>> your cooperation.
>>
>> On Mon, Mar 20, 2017 at 12:26 PM, Tao Meng  wrote:
>>
>> Hi All,
>>>
>>> Sorry for joining this discussion late.
>>> My graduation thesis is about online learning system. I would build it on
>>> flink in the next three months.
>>>
>>> I'd like to contribute on:
>>>   - Online learning
>>>
>>>
>>>
>>>
>>> On Mon, Mar 20, 2017 at 6:51 PM Katherin Eri 
>>> wrote:
>>>
>>> Hello, Theodore
>>>
>>> Could you please move vectors of development and their prioritized
>>> positions from *## Executive summary* to the google doc?
>>>
>>>
>>>
>>> Could you please also create some table in google doc, that is
>>> representing
>>> the selected directions and persons, who would like to drive or
>>> participate
>>> in the particular topic, in order to make this process transparent for
>>> community and sum up current state of commitment of contributors?
>>>
>>> There we could simply inscribe us to some topic.
>>>
>>>
>>>
>>> And 1+ for CTR prediction case.
>>>
>>> вс, 19 мар. 2017 г. в 16:49, Theodore Vasiloudis <
>>> theodoros.vasilou...@gmail.com>:
>>>
>>> Hello Stavros,

 The way I thought we'd do it is that each shepherd would be responsible

>>> for
>>>
 organizing the project: that includes setting up a Google doc, sending
 an
 email to the dev list to inform the wider community, and if possible,
 personally contacting the people who expressed interest in the project.

 Would you be willing to lead that effort for the model serving project?

 Regards,
 Theodore

 --
 Sent from a mobile device. May contain autocorrect errors.

 On Mar 19, 2017 3:49 AM, "Stavros Kontopoulos" <
 st.kontopou...@gmail.com

[jira] [Created] (FLINK-6136) Separate EmbeddedNonHaServices and NonHaServices

2017-03-20 Thread Till Rohrmann (JIRA)

Till Rohrmann created FLINK-6136:


 Summary: Separate EmbeddedNonHaServices and NonHaServices
 Key: FLINK-6136
 URL: https://issues.apache.org/jira/browse/FLINK-6136
 Project: Flink
  Issue Type: Sub-task
  Components: Distributed Coordination
Affects Versions: 1.3.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 1.3.0


Currently, the {{NonHaServices}} and the {{EmbeddedNonHaServices}} share the 
same code base for the {{JobManager}} leader election and retrieval. This works 
if the respective components run in the same process. However, in case of a 
distributed execution, we have to instantiate 
{{StandaloneLeaderRetrieval/ElectionServices}} to make the leader retrieval 
possible.

I propose to instantiate the {{StandaloneLeaderRetrieval/ElectionServices}} in 
the {{NonHaServices}} implementation and to use them in case of a distributed 
non-ha execution setting.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Created] (FLINK-6135) Allowing adding inputs to StreamOperator

2017-03-20 Thread Aljoscha Krettek (JIRA)

Aljoscha Krettek created FLINK-6135:
---

 Summary: Allowing adding inputs to StreamOperator
 Key: FLINK-6135
 URL: https://issues.apache.org/jira/browse/FLINK-6135
 Project: Flink
  Issue Type: Sub-task
  Components: DataStream API
Reporter: Aljoscha Krettek
Assignee: Aljoscha Krettek


As mentioned in 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-17+Side+Inputs+for+DataStream+API
 we need to be able to add additional inputs to a {{StreamOperator}.

There is a (somewhat old) design document here: 
https://docs.google.com/document/d/1ZFzL_0xGuUEnBsFyEiHwWcmCcjhd9ArWsmhrgnt05RI/edit#heading=h.pqg5z6g0mjm7
 and some proof-of-concept code here: 
https://github.com/aljoscha/flink/tree/operator-ng-side-input-wrapper



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Created] (FLINK-6134) Set UUID(0L, 0L) as default leader session id

2017-03-20 Thread Till Rohrmann (JIRA)

Till Rohrmann created FLINK-6134:


 Summary: Set UUID(0L, 0L) as default leader session id
 Key: FLINK-6134
 URL: https://issues.apache.org/jira/browse/FLINK-6134
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Coordination
Affects Versions: 1.3.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
Priority: Minor
 Fix For: 1.3.0


The leader election/retrieval services use the {{null}} value as the default 
leader id in the standalone case but also as no active leader in the ZooKeeper 
case. This is ambiguous and therefore I propose to change the default leader id 
to {{UUID(0L, 0L)}}. Consequently, a {{null}} leader id value can then indicate 
that there is no active leader available.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Created] (FLINK-6133) fix build status in README.md

2017-03-20 Thread Bowen Li (JIRA)

Bowen Li created FLINK-6133:
---

 Summary: fix build status in README.md
 Key: FLINK-6133
 URL: https://issues.apache.org/jira/browse/FLINK-6133
 Project: Flink
  Issue Type: Bug
  Components: Build System
Reporter: Bowen Li
Assignee: Bowen Li
Priority: Minor
 Fix For: 1.3.0, 1.2.1


I think I made a mistake in https://issues.apache.org/jira/browse/FLINK-6122.

I tried to copy what's in the homepage of 
https://github.com/apache/incubator-airflow , but it's broken now and looks 
like they have some different config in Travis. Need to fix this problem.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Created] (FLINK-6132) Remove redundant code in CliFrontend.java

2017-03-20 Thread canbinzheng (JIRA)

canbinzheng created FLINK-6132:
--

 Summary: Remove redundant code in CliFrontend.java
 Key: FLINK-6132
 URL: https://issues.apache.org/jira/browse/FLINK-6132
 Project: Flink
  Issue Type: Improvement
  Components: Client
Reporter: canbinzheng
Priority: Minor


Look at the switch case block in method parseParameters of CliFrontend.java. 

// do action
switch (action) {
case ACTION_RUN:
return CliFrontend.this.run(params);
case ACTION_LIST:
return list(params);
case ACTION_INFO:
return info(params);
case ACTION_CANCEL:
return cancel(params);
case ACTION_STOP:
return stop(params);
case ACTION_SAVEPOINT:
return savepoint(params);
It's better to change the first case return clause 
from：
  return CliFrontend.this.run(params);
to：
  return run(params);




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Re: [Disuss]Permission of checkpoint directory

2017-03-20 Thread Greg Hogan

Prior discussion at https://github.com/apache/flink/pull/3335 



> On Mar 19, 2017, at 11:34 PM, Wangtao (WangTao)  wrote:
> 
> Hi All,
> 
> Checkpoint directory will store user data and it is better to keep it with 
> minimum access right, like 700, in case of information leak.
> 
> Current situation:
> 
> User can specify a checkpoint directory(we called "root chp dir" for 
> convenience) through configuration system, either single mode or session mode.
> 
> In single mode, the job running will create a sub directory under root chp 
> dir using its job id and write checkpoint files into it.
> 
> In session mode, every job running in that session will create their own sub 
> directory under root chp dir using their own job id and write checkpoint 
> files into it.
> 
> Now in session mode, we don't isolate users who submit jobs to this session, 
> so it is to say no matter who submit jobs to this session, we think jobs ran 
> by same user(who start this session) as long as it passed authentication(not 
> supported now, will do in future).
> 
> Information Leak Threat:
> 
> Assume user1 submits a job(no matter single or session mode) and writes 
> checkpoint files into file systems, using default permission, let's say 755, 
> in most cases.
> 
> Then user2 can directly read checkpoint files written by user1 and can get 
> user1's data.
> 
> Solution:
> 
> The simplest way to fix this is to set permission of sub directories for each 
> job to minimum(like 700, we can also make it configurable) in creation.
> 
> What do you think guys?
> 
> 
> Wang Tao
> wangtao...@huawei.com
> Huawei Technologies Co., Ltd.

[jira] [Created] (FLINK-6131) Add side inputs for DataStream API

2017-03-20 Thread Aljoscha Krettek (JIRA)

Aljoscha Krettek created FLINK-6131:
---

 Summary: Add side inputs for DataStream API
 Key: FLINK-6131
 URL: https://issues.apache.org/jira/browse/FLINK-6131
 Project: Flink
  Issue Type: Improvement
  Components: DataStream API
Reporter: Aljoscha Krettek
Assignee: Aljoscha Krettek


This is an umbrella issue for tracking the implementation of FLIP-17: 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-17+Side+Inputs+for+DataStream+API.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Created] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-03-20 Thread Ted Yu (JIRA)

Ted Yu created FLINK-6130:
-

 Summary: Consider calling resourceManager#getTerminationFuture() 
with lock held
 Key: FLINK-6130
 URL: https://issues.apache.org/jira/browse/FLINK-6130
 Project: Flink
  Issue Type: Bug
Reporter: Ted Yu
Priority: Minor


In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
{code}
  synchronized (lock) {
LOG.info("Starting High Availability Services");
...
  }

  // wait for resource manager to finish
  resourceManager.getTerminationFuture().get();
{code}
resourceManager#getTerminationFuture() is called without holding lock.

We should store the value returned from resourceManager#getTerminationFuture() 
inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Re: Machine Learning on Flink - Next steps

2017-03-20 Thread Gábor Hermann


Hi all,

@Theodore:
+1 for the CTR use-case. Thanks for the suggestion!

@Katherin:
+1 for reflecting the choices made here and contributor commitment in Gdoc.

@Tao, @Ventura:
It's great to here you have been working on ML on Flink :)
I hope we can all aggregate our efforts somehow. It would be best if you 
could contribute some of your work.



I've started putting together a Gdoc specifically for 
*Offline/incremental learning on Streaming API*:

https://docs.google.com/document/d/18BqoFTQ0dPkbyO-PWBMMpW5Nl0pjobSubnWpW0_r8yA/
Right now you can comment/give suggestions there. I'd like to start a 
separate mailing list discussion as soon as there are enough 
contributors volunteering for this direction. For now, I'm trying to 
reflect the relevant part of the discussion here and the initial Gdoc [1].



[1] 
https://docs.google.com/document/d/1afQbvZBTV15qF3vobVWUjxQc49h3Ud06MIRhahtJ6dw/


Cheers,
Gabor

On 2017-03-20 14:27, Ventura Del Monte wrote:


Hello everyone,

Here at DFKI, we are currently working on project that involves developing
open-source Online Machine Learning algorithms on top of Flink.
So far, we have simple moments, sampling (e.g.: simple reservoir sampling)
and sketches (e.g., Frequent Directions) built on top of scikit-like
abstractions and Flink's DataStream/KeyedStream.
Moreover, we have few industrial use cases and we are gonna validate our
implementation using real industrial data.
We plan to implement more advanced algorithms in the future as well as to
share our results with you and contribute, in case you are interested.

Best,
Ventura




This message, for the D. Lgs n. 196/2003 (Privacy Code), may contain
confidential and/or privileged information. If you are not the addressee or
authorized to receive this for the addressee, you must not use, copy,
disclose or take any action based on this message or any information
herein. If you have received this message in error, please advise the
sender immediately by reply e-mail and delete this message. Thank you for
your cooperation.

On Mon, Mar 20, 2017 at 12:26 PM, Tao Meng  wrote:


Hi All,

Sorry for joining this discussion late.
My graduation thesis is about online learning system. I would build it on
flink in the next three months.

I'd like to contribute on:
  - Online learning




On Mon, Mar 20, 2017 at 6:51 PM Katherin Eri 
wrote:

Hello, Theodore

Could you please move vectors of development and their prioritized
positions from *## Executive summary* to the google doc?



Could you please also create some table in google doc, that is representing
the selected directions and persons, who would like to drive or participate
in the particular topic, in order to make this process transparent for
community and sum up current state of commitment of contributors?

There we could simply inscribe us to some topic.



And 1+ for CTR prediction case.

вс, 19 мар. 2017 г. в 16:49, Theodore Vasiloudis <
theodoros.vasilou...@gmail.com>:


Hello Stavros,

The way I thought we'd do it is that each shepherd would be responsible

for

organizing the project: that includes setting up a Google doc, sending an
email to the dev list to inform the wider community, and if possible,
personally contacting the people who expressed interest in the project.

Would you be willing to lead that effort for the model serving project?

Regards,
Theodore

--
Sent from a mobile device. May contain autocorrect errors.

On Mar 19, 2017 3:49 AM, "Stavros Kontopoulos"

Re: [jira] [Updated] (FLINK-6126) Yet another conflict : guava

I don't think so:

https://issues.apache.org/jira/browse/FLINK-6125

>From a related issue brought to the mailing list by another user.

On Mon, Mar 20, 2017 at 2:36 PM, Aljoscha Krettek  wrote:
> So it is resolved now? Thanks for letting us know!
>> On 20 Mar 2017, at 12:55, Liangfei Su  wrote:
>>
>> Well...Comments added.
>>
>> This looks due to incorrect local build of flink.
>>
>> Documentation at
>> https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/building.html
>> shows
>> clearly that
>>
>> """
>> NOTE: Maven 3.3.x can build Flink, but will not properly shade away certain
>> dependencies. Maven 3.0.3 creates the libraries properly. To build unit
>> tests with Java 8, use Java 8u51 or above to prevent failures in unit tests
>> that use the PowerMock runner.
>>
>> """
>> My local flink jar was build by maven3.3.3, which the shade of guava is not
>> well set. Cause the conflict of elastic search(guava 18) and the packaged
>> ones(should be the ones from hadoop related)
>>
>> On Mon, Mar 20, 2017 at 7:04 PM, Liangfei Su  wrote:
>>
>>> Another finding that it looks most of these kinds of conflict comes from
>>> the dependency to hadoop/hbase (by using mvn dependency to check).
>>>
>>> It looks to me the correct way is to shade more packages from the hadoop
>>> dependencies...  Thoughts here?
>>>
>>> Thanks
>>> Ralph
>>>
>>>
>>>
>>> On Mon, Mar 20, 2017 at 6:59 PM, Liangfei Su  wrote:
>>>
 Hi,

 I try to write some user function to write elastic search, my project
 comes with elastic search 2.3.5 (which is the same to flink connectors
 elastic search). But i'm seeing be exception that recorded in the below
 JIRA, by enabling the java -version:class, it show it's the guava
 dependency issue:
 Elastic search 2.3.5 transport client is using guava 18.0, and called one
 method starting from guava 18.0, MoreExecutors.directExecutor

 While the class loading log show the active runtime MoreExecutors class
 loaded from flink distribution, which cause the java.lang.NoSuchMethodErro
 r.

 Based on above finding, it looks es 2.3.5 is not able to be used with
 flink1.2.0 (and then the connectors-elasticsearch is broken)? Can someone
 help clarify?

 Also, it looks some of the fink-core actually use the shade way to rename
 the class like from com.google.guava to 
 org.apache.flink.***.com.google.guava
 which is actually a fix of this kind of issue. Etc.
 https://issues.apache.org/jira/browse/FLINK-4587/https:
 //issues.apache.org/jira/browse/FLINK-3373.

 My flink cluster is v1.2.0, running in docker.

 Thanks,
 Ralph

 -- Forwarded message --
 From: Su Ralph (JIRA) 
 Date: Mon, Mar 20, 2017 at 6:41 PM
 Subject: [jira] [Updated] (FLINK-6126) Yet another conflict : guava
 To: suliang...@gmail.com

 [ https://issues.apache.org/jira/browse/FLINK-6126?page=com.at
 lassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

 Su Ralph updated FLINK-6126:

Description:
 When write a user function try to write to elastic search (depend on
 elasticsearch 2.3.5)

 Stack like:
 java.lang.NoSuchMethodError: com.google.common.util.concurr
 ent.MoreExecutors.directExecutor()Ljava/util/concurrent/Executor;
at org.elasticsearch.threadpool.ThreadPool.(ThreadPool.
 java:190)
at org.elasticsearch.client.transport.TransportClient$Builder.b
 uild(TransportClient.java:131)
at io.sherlock.capabilities.es.AbstractEsSink.open(AbstractEsSi
 nk.java:98)

 When enable env.java.opts.taskmanager to -version:class, we can see the
 class load log like:
 [Loaded com.google.common.util.concurrent.MoreExecutors from
 file:/opt/flink/lib/flink-dist_2.11-1.2.0.jar]

 The user code is using guva of 18.0.

  was:
 For some reason I need to use 
 org.apache.httpcomponents:httpasyncclient:4.1.2
 in flink.
 The source file is:
 {code}
 import org.apache.flink.streaming.api.scala._
 import org.apache.http.impl.nio.conn.ManagedNHttpClientConnectionFactory

 /**
  * Created by renkai on 16/9/7.
  */
 object Main {
  def main(args: Array[String]): Unit = {
val instance = ManagedNHttpClientConnectionFactory.INSTANCE
println("instance = " + instance)

val env = StreamExecutionEnvironment.getExecutionEnvironment
val stream = env.fromCollection(1 to 100)
val result = stream.map { x =>
  x * 2
}
result.print()
env.execute("xixi")
  }
 }

 {code}

 and
 {code}
 name := "flink-explore"

 version := "1.0"

 scalaVersion := "2.11.8"

Re: Machine Learning on Flink - Next steps

2017-03-20 Thread Stavros Kontopoulos

Hello Theodore,

I could lead the effort for the model serving thing, I think I have the
cycles for it.
I would also contact Enron and see if he wants to help in that direction.

Best,
Stavros

On Sun, Mar 19, 2017 at 3:43 PM, Theodore Vasiloudis <
theodoros.vasilou...@gmail.com> wrote:

> Hello Stavros,
>
> The way I thought we'd do it is that each shepherd would be responsible for
> organizing the project: that includes setting up a Google doc, sending an
> email to the dev list to inform the wider community, and if possible,
> personally contacting the people who expressed interest in the project.
>
> Would you be willing to lead that effort for the model serving project?
>
> Regards,
> Theodore
>
> --
> Sent from a mobile device. May contain autocorrect errors.
>
> On Mar 19, 2017 3:49 AM, "Stavros Kontopoulos" 
> wrote:
>
> > Hi all...
> >
> > I agree about the tensorflow integration it seems to be important from
> what
> > I hear.
> > Should we sign up somewhere for the working groups (gdcos)?
> > I would like to start helping with the model serving feature.
> >
> > Best Regards,
> > Stavros
> >
> > On Fri, Mar 17, 2017 at 10:34 PM, Gábor Hermann 
> > wrote:
> >
> > > Hi Chen,
> > >
> > > Thanks for the input! :)
> > >
> > > There is already a project [1] for using TensorFlow models in Flink,
> and
> > > Theodore has suggested
> > > to contact the author, Eron Wright for the model serving direction.
> > >
> > >
> > > [1] http://sf.flink-forward.org/kb_sessions/introducing-flink-
> > tensorflow/
> > >
> > > Cheers,
> > > Gabor
> > >
> > >
> > > On 2017-03-17 19:41, Chen Qin wrote:
> > >
> > >> [1]http://sf.flink-forward.org/kb_sessions/introducing-flink-te
> > >> nsorflow/
> > >>
> > >
> > >
> >
>

Re: Future of Queryable State Feature

On Mon, Mar 20, 2017 at 3:25 PM, Aljoscha Krettek  wrote:
> I think adding this could be as simple as adding a 
> makeWindowContentsQueryable(String) call to WindowedStream. In there, we 
> create the StateDescriptor for keeping the window contents so we can call 
> setQueryable() on the StateDescriptor and set the name under which it should 
> be accessible. If you want you can create an issue and change this. I can 
> help you if you have any questions.

+1

I would only allow this for incrementally aggregating windows in the
first version (reduce, fold, etc.). Does this make sense for your use
case Joe? Otherwise, the state might grow quite large and for session
windows identifying the session is quite tricky. We only need to watch
out for the fast path implementation (using
AbstractAlignedProcessingTimeWindowOperator) that cannot be queried.

If you would like to work on this, I would also be happy to work with
you and create the initial JIRA writing down things that we should
watch out for.

– Ufuk

Re: Future of Queryable State Feature

I think adding this could be as simple as adding a 
makeWindowContentsQueryable(String) call to WindowedStream. In there, we create 
the StateDescriptor for keeping the window contents so we can call 
setQueryable() on the StateDescriptor and set the name under which it should be 
accessible. If you want you can create an issue and change this. I can help you 
if you have any questions.

For custom Triggers, I think you can also call setQueryable() on a 
StateDescriptor.

The tricky thing when it comes to windows is that state is scoped to a window, 
so the querying logic has to take that into account.

I also cc’ed Ufuk and Nico, who probably know best what’s going on with 
queryable state.
> On 16 Mar 2017, at 15:33, Joe Olson  wrote:
> 
> I have a question regarding the future direction of the queryable state 
> feature.
> 
> 
> We are current using this feature in production implemented in a RichFlatMap. 
> It is doing what we need it to do at the scale we need it done in, with the 
> resources we have assigned to it. Win.
> 
> 
> However, we'd also like to use this feature in conjunction with Flink's 
> windowing. The "Rich" execution environment is not exposed in any of the 
> windows / triggers / apply hierarchy, so we cannot expose any of the state 
> managed within the windows outside of Flink. Many of our use cases require us 
> to have access to values as they are being accumulated, as well as the 
> aggregated result.
> 
> 
> We can get by with the RichFlatMap for now. I'd like some clarification as to 
> whether or not the queryable state feature is going to be extended to the 
> windowing components for the next milestone release. This will determine our 
> Flink development milestones for the next few months. From consulting the 
> open items in JIRA, it does not look like it is on the docket.
> 
> 
> I'd be more than willing to help out implementing this feature, but I don't 
> think I have the experience to submit this change on my own.

RE: [DISCUSS] Could we Improve tests time and stability?

2017-03-20 Thread Dmytro Shkvyra

Sorry Greg,
I have given you wrong reference to tests of different GCs.
This is correct - https://travis-ci.org/kenmy/flink/builds

-Original Message-
From: Greg Hogan [mailto:c...@greghogan.com] 
Sent: Friday, March 17, 2017 8:06 PM
To: dev@flink.apache.org
Subject: Re: [DISCUSS] Could we Improve tests time and stability?

Dmytro,

This is a good idea and a nice speedup, though I notice that nearly half of the 
speedup (1104s of 2461s) is from job 7 which appears to have hung and timed out 
in the initial run.

Could you test the two changes in isolation (increased maximum memory and 
garbage collector)? If the speedup is due to one or the other then we can make 
only that change.

Greg


> On Mar 17, 2017, at 9:16 AM, Dmytro Shkvyra  wrote:
> 
> Hi Robert,
> 
> I have tried my proposal on my travis and accelerated my build in 9.4% 
> (see results below)
> 
> # Test|Present JVM options|"-Xms256m -Xmx1536m -XX:+UseSerialGC"
> - 
> 
> 1 35 min  53 sec  |35 min 35 sec
> 2 38 min  49 sec  |34 min  18 sec
> 3 35 min  34 sec  |29 min  38 sec
> 4 34 min  38 sec  |31 min  14 sec
> 5 35 min  41 sec  |35 min  11 sec
> 6 36 min  41 sec  |33 min  52 sec
> 7 49 min  59 sec  |31 min  35 sec
> 8 37 min  0 sec   |36 min  20 sec
> 9 32 min  28 sec  |31 min  48 sec
> 1038 min  25 sec  |33 min  28 sec
> 1136 min  19 sec  |38 min  24 sec
> 1225 min  0 sec   |24 min  3 sec
> --
> Total 26187 sec   |23726  sec
> --
> Acceleration  |9.40%  
> 
> I think almost 10% is good enough.
> 
> -Original Message-
> From: Robert Metzger [mailto:rmetz...@apache.org]
> Sent: Thursday, March 16, 2017 6:26 PM
> To: dev@flink.apache.org
> Subject: Re: [DISCUSS] Could we Improve tests time and stability?
> 
> Hi Dmytro,
> 
> I'm happy to hear that you are trying to help us improving the test time 
> situation :) We have another discussion here on the dev@ list to split the 
> project into two git repositories to resolve the problem.
> 
> I agree that your proposed changes could improve the build times, but I'm not 
> sure if they are enough to resolve them forever. Some tests just waste time 
> by waiting on stuff to happen :) If you want, you can enable travis for your 
> own Flink fork on GitHub, add your proposed changes to the travis / maven 
> files and see how much they improve the build time.
> 
> 
> On Thu, Mar 16, 2017 at 5:06 PM, Dmytro Shkvyra 
> 
> wrote:
> 
>> Hi everyone,
>> May be we should remove -XX:-UseGCOverheadLimit option from 
>> maven-surefire-plugin args and increase -Xmx to 1536m for forks?
>> We have about 4 GB RAM and 2 cores at test VMs. I think we can make 
>> test faster than now. When I tried testing flink-runtime some tests 
>> work too slow due to GC overhead.
>> May be you also faced to problem when Travis build was fallen by timeout?
>> Also we can use GC algorithms explicitly for forks execution.
>> BTW, we run tests with java 7 and 8 and these versions use by default 
>> different GC algorithms (GC1 for 8 and Parallel GC for 7). IMHO when 
>> we have strict limitations of RAM and time of build we should avoid 
>> any ambiguity.
>> In case when some tests can generate very big datasets very fast, 
>> paralel GC can do not have time to clean up. I do not know how G1 
>> work in this case exactly, but may be would better use old good 
>> -XX:+UseSerialGC. We have only 1 core per fork so we anyway cant use 
>> all advantages of G1 and ParralelGC. If we use SerialGC (use stop the
>> world) we can be sure that GC collect almost all garbage before test 
>> continue.
>> What do you think about my idea?
>> May be someone has another ideas how to improve tests time and stability?
>> 
>> 
>> Dmytro Shkvyra
>> Senior Software Engineer
>> 
>> Office: +380 44 390 5457 x 65346
>> Cell: +380 50 357 6828   Email:
>> dmytro_shkv...@epam.com
>> Kyiv, Ukraine (GMT+3)   epam.com
>> 
>> CONFIDENTIALITY CAUTION AND DISCLAIMER This message is intended only 
>> for the use of the individual(s) or
>> entity(ies) to which it is addressed and contains information that is 
>> legally privileged and confidential. If you are not the intended 
>> recipient, or the person responsible for delivering the message to 
>> the intended recipient, you are hereby notified that any 
>> dissemination, distribution or copying of this communication is strictly 
>> prohibited.
>> All unintended recipients are obliged to delete this message and 
>> destroy any printed copies.
>> 
>>

Re: [DISCUSS] Project build time and possible restructuring

The Beam Jenkins jobs are configured inside the Beam src repo itself. For 
example: 
https://github.com/apache/beam/blob/master/.jenkins/job_beam_PostCommit_Java_RunnableOnService_Flink.groovy

For initial setup of the seed job you need admin rights on Jenkins, as 
described here: https://cwiki.apache.org/confluence/display/INFRA/Jenkins.

The somewhat annoying thing is setting up our own “flink” build slaves and 
maintaining them. There are some general purpose build slaves but 
high-throughput projects usually have their own build slaves to ensure speedy 
processing of Jenkins jobs: 
https://cwiki.apache.org/confluence/display/INFRA/Jenkins+node+labels

> On 20 Mar 2017, at 14:40, Timo Walther  wrote:
> 
> Another solution would be to make the Travis builds more efficient. For 
> example, we could write a script that determines the modified Maven module 
> and only run the test for this module (and maybe transitive dependencies). 
> PRs for libraries such as Gelly, Table, CEP or connectors would not trigger a 
> compilation of the entire stack anymore. Of course this would not solve all 
> problems but many of it.
> 
> What do you think about this?
> 
> 
> 
> Am 20/03/17 um 14:02 schrieb Robert Metzger:
>> Aljoscha, do you know how to configure jenkins?
>> Is Apache INFRA doing that, or are the beam people doing that themselves?
>> 
>> One downside of Jenkins is that we probably need some machines that execute
>> the tests. A Travis container has 2 CPU cores and 4 GB main memory. We
>> currently have 10 such containers available on travis concurrently. I think
>> we would need at least the same amount on Jenkins.
>> 
>> 
>> On Mon, Mar 20, 2017 at 1:48 PM, Timo Walther  wrote:
>> 
>>> I agress with Aljoscha that we might consider moving from Jenkins to
>>> Travis. Is there any disadvantage in using Jenkins?
>>> 
>>> I think we should structure the project according to release management
>>> (e.g. more frequent releases of libraries) or other criteria (e.g. core and
>>> non-core) instead of build time. What would happen if the built of another
>>> submodule would become too long, would we split/restructure again and
>>> again? If Jenkins solves all our problems we should use it.
>>> 
>>> Regards,
>>> Timo
>>> 
>>> 
>>> 
>>> Am 20/03/17 um 12:21 schrieb Aljoscha Krettek:
>>> 
 I prefer Jenkins to Travis by far. Working on Beam, where we have good
 Jenkins integration, has opened my eyes to what is possible with good CI
 integration.

 For example, look at this recent Beam PR: https://github.com/apache/beam
 /pull/2263 . The
 Jenkins-Github integration will tell you exactly which tests failed and if
 you click on the links you can look at the log output/std out of the tests
 in question.

 This is the overview page of one of the Jenkins Jobs that we have in
 Beam: https://builds.apache.org/job/beam_PostCommit_Java_RunnableO
 nService_Flink/ . This is an example of a
 stable build: https://builds.apache.org/job/
 beam_PostCommit_Java_RunnableOnService_Flink/lastStableBuild/ <
 https://builds.apache.org/job/beam_PostCommit_Java_Runnable
 OnService_Flink/lastStableBuild/>. Notice how it gives you fine grained
 information about the Maven run. This is an unstable run:
 https://builds.apache.org/job/beam_PostCommit_Java_RunnableO
 nService_Flink/lastUnstableBuild/ . There
 you can see which tests failed and you can easily drill down.

 Best,
 Aljoscha

 On 20 Mar 2017, at 11:46, Robert Metzger  wrote:
> Thank you for looking into the build times.
> 
> I didn't know that the build time situation is so bad. Even with yarn,
> mesos, connectors and libraries removed, we are still running into the
> build timeout :(
> 
> Aljoscha told me that the Beam community is using Jenkins for running
> the tests, and they are planning to completely move away from Travis. I
> wonder whether we should do the same, as having our own Jenkins servers
> would allow us to run tests for more than 50 minutes.
> 
> I agree with Stephan that we should keep the yarn and mesos tests in the
> core for stability / testing quality purposes.
> 
> 
> On Mon, Mar 20, 2017 at 11:27 AM, Stephan Ewen  > wrote:
> @Greg
> 
> I am personally in favor of splitting "connectors" and "contrib" out as
> well. I know that @rmetzger has some reservations about the connectors,
> but
> we may be able to convince him.
> 
> For the cluster tests (yarn / mesos) - in the past there were many cases
> where these tests caught cases

Re: [DISCUSS] Project build time and possible restructuring

2017-03-20 Thread Greg Hogan

We can add cluster tests using the distribution jar, and will need to do so to 
remove Flink’s dependency on Hadoop. The YARN and Mesos tests would still run 
nightly and running cluster tests should be much faster. As troublesome as 
TravisCI has been, a major driver for this change has been local build time.

I agree with splitting off one repo at a time, but we’ll first need to 
reorganize the core repo if using git submodules as flink-python and 
flink-table would need to first be moved. So I think planning this out first is 
a healthy idea, with the understanding that the plan will be reevaluated.

Any changes to the project structure need a scheduled period, perhaps a week, 
for existing pull requests to be reviewed and accepted or closed and later 
migrated.


> On Mar 20, 2017, at 6:27 AM, Stephan Ewen  wrote:
> 
> @Greg
> 
> I am personally in favor of splitting "connectors" and "contrib" out as
> well. I know that @rmetzger has some reservations about the connectors, but
> we may be able to convince him.
> 
> For the cluster tests (yarn / mesos) - in the past there were many cases
> where these tests caught cases that other tests did not, because they are
> the only tests that actually use the "flink-dist.jar" and thus discover
> many dependency and configuration issues. For that reason, my feeling would
> be that they are valuable in the core repository.
> 
> I would actually suggest to do only the library split initially, to see
> what the challenges are in setting up the multi-repo build and release
> tooling. Once we gathered experience there, we can probably easily see what
> else we can split out.
> 
> Stephan
> 
> 
> On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan  wrote:
> 
>> I’d like to use this refactoring opportunity to unspilt the Travis tests.
>> With 51 builds queued up for the weekend (some of which may fail or have
>> been force pushed) we are at the limit of the number of contributions we
>> can process. Fixing this requires 1) splitting the project, 2)
>> investigating speedups for long-running tests, and 3) staying cognizant of
>> test performance when accepting new code.
>> 
>> I’d like to add one to Stephan’s list of module group. I like that the
>> modules are generic (“libraries”) so that no one module is alone and
>> independent.
>> 
>> Flink has three “libraries”: cep, ml, and gelly.
>> 
>> “connectors” is a hotspot due to the long-running Kafka tests (and
>> connectors for three Kafka versions).
>> 
>> Both flink-storm and flink-python have a modest number of number of tests
>> and could live with the miscellaneous modules in “contrib”.
>> 
>> The YARN tests are long-running and problematic (I am unable to
>> successfully run these locally). A “cluster” module could host flink-mesos,
>> flink-yarn, and flink-yarn-tests.
>> 
>> That gets us close to running all tests in a single Travis build.
>>  https://travis-ci.org/greghogan/flink/builds/212122590 <
>> https://travis-ci.org/greghogan/flink/builds/212122590>
>> 
>> I also tested (https://github.com/greghogan/flink/commits/core_build <
>> https://github.com/greghogan/flink/commits/core_build>) with a maven
>> parallelism of 2 and 4, with the latter a 6.4% drop in build time.
>>  https://travis-ci.org/greghogan/flink/builds/212137659 <
>> https://travis-ci.org/greghogan/flink/builds/212137659>
>>  https://travis-ci.org/greghogan/flink/builds/212154470 <
>> https://travis-ci.org/greghogan/flink/builds/212154470>
>> 
>> We can run Travis CI builds nightly to guard against breaking changes.
>> 
>> I also wanted to get an idea of how disruptive it would be to developers
>> to divide the project into multiple git repos. I wrote a simple python
>> script and configured it with the module partitions listed above. The usage
>> string from the top of the file lists commits with files from multiple
>> partitions and well as the modified files.
>>  https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897 <
>> https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897>
>> 
>> Accounting for the merging of the batch and streaming connector modules,
>> and assuming that the project structure has not changed much over the past
>> 15 months, for the following date ranges the listed number of commits would
>> have been split across repositories.
>> 
>> since "2017-01-01"
>> 36 of 571 commits were mixed
>> 
>> since "2016-07-01"
>> 155 of 1607 commits were mixed
>> 
>> since "2016-01-01"
>> 272 of 2561 commits were mixed
>> 
>> Greg
>> 
>> 
>>> On Mar 15, 2017, at 1:13 PM, Stephan Ewen  wrote:
>>> 
>>> @Robert - I think once we know that a separate git repo works well, and
>>> that it actually solves problems, I see no reason to not create a
>>> connectors repository later. The infrastructure changes should be
>> identical
>>> for two or more repositories.
>>> 
>>> On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann 
>> wrote:
>>> 
 I think it

Re: Machine Learning on Flink - Next steps

2017-03-20 Thread Ventura Del Monte

Hello everyone,

Here at DFKI, we are currently working on project that involves developing
open-source Online Machine Learning algorithms on top of Flink.
So far, we have simple moments, sampling (e.g.: simple reservoir sampling)
and sketches (e.g., Frequent Directions) built on top of scikit-like
abstractions and Flink's DataStream/KeyedStream.
Moreover, we have few industrial use cases and we are gonna validate our
implementation using real industrial data.
We plan to implement more advanced algorithms in the future as well as to
share our results with you and contribute, in case you are interested.

Best,
Ventura

This message, for the D. Lgs n. 196/2003 (Privacy Code), may contain
confidential and/or privileged information. If you are not the addressee or
authorized to receive this for the addressee, you must not use, copy,
disclose or take any action based on this message or any information
herein. If you have received this message in error, please advise the
sender immediately by reply e-mail and delete this message. Thank you for
your cooperation.

On Mon, Mar 20, 2017 at 12:26 PM, Tao Meng  wrote:

> Hi All,
>
> Sorry for joining this discussion late.
> My graduation thesis is about online learning system. I would build it on
> flink in the next three months.
>
> I'd like to contribute on:
>  - Online learning
>
>
>
>
> On Mon, Mar 20, 2017 at 6:51 PM Katherin Eri 
> wrote:
>
> Hello, Theodore
>
> Could you please move vectors of development and their prioritized
> positions from *## Executive summary* to the google doc?
>
>
>
> Could you please also create some table in google doc, that is representing
> the selected directions and persons, who would like to drive or participate
> in the particular topic, in order to make this process transparent for
> community and sum up current state of commitment of contributors?
>
> There we could simply inscribe us to some topic.
>
>
>
> And 1+ for CTR prediction case.
>
> вс, 19 мар. 2017 г. в 16:49, Theodore Vasiloudis <
> theodoros.vasilou...@gmail.com>:
>
> > Hello Stavros,
> >
> > The way I thought we'd do it is that each shepherd would be responsible
> for
> > organizing the project: that includes setting up a Google doc, sending an
> > email to the dev list to inform the wider community, and if possible,
> > personally contacting the people who expressed interest in the project.
> >
> > Would you be willing to lead that effort for the model serving project?
> >
> > Regards,
> > Theodore
> >
> > --
> > Sent from a mobile device. May contain autocorrect errors.
> >
> > On Mar 19, 2017 3:49 AM, "Stavros Kontopoulos"  >
> > wrote:
> >
> > > Hi all...
> > >
> > > I agree about the tensorflow integration it seems to be important from
> > what
> > > I hear.
> > > Should we sign up somewhere for the working groups (gdcos)?
> > > I would like to start helping with the model serving feature.
> > >
> > > Best Regards,
> > > Stavros
> > >
> > > On Fri, Mar 17, 2017 at 10:34 PM, Gábor Hermann  >
> > > wrote:
> > >
> > > > Hi Chen,
> > > >
> > > > Thanks for the input! :)
> > > >
> > > > There is already a project [1] for using TensorFlow models in Flink,
> > and
> > > > Theodore has suggested
> > > > to contact the author, Eron Wright for the model serving direction.
> > > >
> > > >
> > > > [1] http://sf.flink-forward.org/kb_sessions/introducing-flink-
> > > tensorflow/
> > > >
> > > > Cheers,
> > > > Gabor
> > > >
> > > >
> > > > On 2017-03-17 19:41, Chen Qin wrote:
> > > >
> > > >> [1]http://sf.flink-forward.org/kb_sessions/introducing-flink-te
> > > >> nsorflow/
> > > >>
> > > >
> > > >
> > >
> >
> --
>
> *Yours faithfully, *
>
> *Kate Eri.*
>

Re: [DISCUSS] Project build time and possible restructuring

2017-03-20 Thread Robert Metzger

It looks like Jetbrains TeamCity supports something in that direction:
https://blog.jetbrains.com/teamcity/2012/03/incremental-building-with-maven-and-teamcity/


On Mon, Mar 20, 2017 at 2:40 PM, Timo Walther  wrote:

> Another solution would be to make the Travis builds more efficient. For
> example, we could write a script that determines the modified Maven module
> and only run the test for this module (and maybe transitive dependencies).
> PRs for libraries such as Gelly, Table, CEP or connectors would not trigger
> a compilation of the entire stack anymore. Of course this would not solve
> all problems but many of it.
>
> What do you think about this?
>
>
>
> Am 20/03/17 um 14:02 schrieb Robert Metzger:
>
> Aljoscha, do you know how to configure jenkins?
>> Is Apache INFRA doing that, or are the beam people doing that themselves?
>>
>> One downside of Jenkins is that we probably need some machines that
>> execute
>> the tests. A Travis container has 2 CPU cores and 4 GB main memory. We
>> currently have 10 such containers available on travis concurrently. I
>> think
>> we would need at least the same amount on Jenkins.
>>
>>
>> On Mon, Mar 20, 2017 at 1:48 PM, Timo Walther  wrote:
>>
>> I agress with Aljoscha that we might consider moving from Jenkins to
>>> Travis. Is there any disadvantage in using Jenkins?
>>>
>>> I think we should structure the project according to release management
>>> (e.g. more frequent releases of libraries) or other criteria (e.g. core
>>> and
>>> non-core) instead of build time. What would happen if the built of
>>> another
>>> submodule would become too long, would we split/restructure again and
>>> again? If Jenkins solves all our problems we should use it.
>>>
>>> Regards,
>>> Timo
>>>
>>>
>>>
>>> Am 20/03/17 um 12:21 schrieb Aljoscha Krettek:
>>>
>>> I prefer Jenkins to Travis by far. Working on Beam, where we have good
 Jenkins integration, has opened my eyes to what is possible with good CI
 integration.

 For example, look at this recent Beam PR:
 https://github.com/apache/beam
 /pull/2263 . The
 Jenkins-Github integration will tell you exactly which tests failed and
 if
 you click on the links you can look at the log output/std out of the
 tests
 in question.

 This is the overview page of one of the Jenkins Jobs that we have in
 Beam: https://builds.apache.org/job/beam_PostCommit_Java_RunnableO
 nService_Flink/ . This is an example of
 a
 stable build: https://builds.apache.org/job/
 beam_PostCommit_Java_RunnableOnService_Flink/lastStableBuild/ <
 https://builds.apache.org/job/beam_PostCommit_Java_Runnable
 OnService_Flink/lastStableBuild/>. Notice how it gives you fine grained
 information about the Maven run. This is an unstable run:
 https://builds.apache.org/job/beam_PostCommit_Java_RunnableO
 nService_Flink/lastUnstableBuild/ .
 There
 you can see which tests failed and you can easily drill down.

 Best,
 Aljoscha

 On 20 Mar 2017, at 11:46, Robert Metzger  wrote:

> Thank you for looking into the build times.
>
> I didn't know that the build time situation is so bad. Even with yarn,
> mesos, connectors and libraries removed, we are still running into the
> build timeout :(
>
> Aljoscha told me that the Beam community is using Jenkins for running
> the tests, and they are planning to completely move away from Travis. I
> wonder whether we should do the same, as having our own Jenkins servers
> would allow us to run tests for more than 50 minutes.
>
> I agree with Stephan that we should keep the yarn and mesos tests in
> the
> core for stability / testing quality purposes.
>
>
> On Mon, Mar 20, 2017 at 11:27 AM, Stephan Ewen  > wrote:
> @Greg
>
> I am personally in favor of splitting "connectors" and "contrib" out as
> well. I know that @rmetzger has some reservations about the connectors,
> but
> we may be able to convince him.
>
> For the cluster tests (yarn / mesos) - in the past there were many
> cases
> where these tests caught cases that other tests did not, because they
> are
> the only tests that actually use the "flink-dist.jar" and thus discover
> many dependency and configuration issues. For that reason, my feeling
> would
> be that they are valuable in the core repository.
>
> I would actually suggest to do only the library split initially, to see
> what the challenges are in setting up the multi-repo build and release
> tooling. Once we gathered experience

Re: [DISCUSS] Project build time and possible restructuring

Another solution would be to make the Travis builds more efficient. For 
example, we could write a script that determines the modified Maven 
module and only run the test for this module (and maybe transitive 
dependencies). PRs for libraries such as Gelly, Table, CEP or connectors 
would not trigger a compilation of the entire stack anymore. Of course 
this would not solve all problems but many of it.


What do you think about this?



Am 20/03/17 um 14:02 schrieb Robert Metzger:

Aljoscha, do you know how to configure jenkins?
Is Apache INFRA doing that, or are the beam people doing that themselves?

One downside of Jenkins is that we probably need some machines that execute
the tests. A Travis container has 2 CPU cores and 4 GB main memory. We
currently have 10 such containers available on travis concurrently. I think
we would need at least the same amount on Jenkins.


On Mon, Mar 20, 2017 at 1:48 PM, Timo Walther  wrote:


I agress with Aljoscha that we might consider moving from Jenkins to
Travis. Is there any disadvantage in using Jenkins?

I think we should structure the project according to release management
(e.g. more frequent releases of libraries) or other criteria (e.g. core and
non-core) instead of build time. What would happen if the built of another
submodule would become too long, would we split/restructure again and
again? If Jenkins solves all our problems we should use it.

Regards,
Timo



Am 20/03/17 um 12:21 schrieb Aljoscha Krettek:


I prefer Jenkins to Travis by far. Working on Beam, where we have good
Jenkins integration, has opened my eyes to what is possible with good CI
integration.

For example, look at this recent Beam PR: https://github.com/apache/beam
/pull/2263 . The
Jenkins-Github integration will tell you exactly which tests failed and if
you click on the links you can look at the log output/std out of the tests
in question.

This is the overview page of one of the Jenkins Jobs that we have in
Beam: https://builds.apache.org/job/beam_PostCommit_Java_RunnableO
nService_Flink/ . This is an example of a
stable build: https://builds.apache.org/job/
beam_PostCommit_Java_RunnableOnService_Flink/lastStableBuild/ <
https://builds.apache.org/job/beam_PostCommit_Java_Runnable
OnService_Flink/lastStableBuild/>. Notice how it gives you fine grained
information about the Maven run. This is an unstable run:
https://builds.apache.org/job/beam_PostCommit_Java_RunnableO
nService_Flink/lastUnstableBuild/ . There
you can see which tests failed and you can easily drill down.

Best,
Aljoscha

On 20 Mar 2017, at 11:46, Robert Metzger  wrote:

Thank you for looking into the build times.

I didn't know that the build time situation is so bad. Even with yarn,
mesos, connectors and libraries removed, we are still running into the
build timeout :(

Aljoscha told me that the Beam community is using Jenkins for running
the tests, and they are planning to completely move away from Travis. I
wonder whether we should do the same, as having our own Jenkins servers
would allow us to run tests for more than 50 minutes.

I agree with Stephan that we should keep the yarn and mesos tests in the
core for stability / testing quality purposes.


On Mon, Mar 20, 2017 at 11:27 AM, Stephan Ewen > wrote:
@Greg

I am personally in favor of splitting "connectors" and "contrib" out as
well. I know that @rmetzger has some reservations about the connectors,
but
we may be able to convince him.

For the cluster tests (yarn / mesos) - in the past there were many cases
where these tests caught cases that other tests did not, because they are
the only tests that actually use the "flink-dist.jar" and thus discover
many dependency and configuration issues. For that reason, my feeling
would
be that they are valuable in the core repository.

I would actually suggest to do only the library split initially, to see
what the challenges are in setting up the multi-repo build and release
tooling. Once we gathered experience there, we can probably easily see
what
else we can split out.

Stephan


On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan  wrote:

I’d like to use this refactoring opportunity to unspilt the Travis tests.

With 51 builds queued up for the weekend (some of which may fail or have
been force pushed) we are at the limit of the number of contributions we
can process. Fixing this requires 1) splitting the project, 2)
investigating speedups for long-running tests, and 3) staying cognizant
of
test performance when accepting new code.

I’d like to add one to Stephan’s list of module group. I like that the
modules are generic (“libraries”) so that no one module is alone and
independent.

Flink has three

Re: [jira] [Updated] (FLINK-6126) Yet another conflict : guava

So it is resolved now? Thanks for letting us know!
> On 20 Mar 2017, at 12:55, Liangfei Su  wrote:
> 
> Well...Comments added.
> 
> This looks due to incorrect local build of flink.
> 
> Documentation at
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/building.html
> shows
> clearly that
> 
> """
> NOTE: Maven 3.3.x can build Flink, but will not properly shade away certain
> dependencies. Maven 3.0.3 creates the libraries properly. To build unit
> tests with Java 8, use Java 8u51 or above to prevent failures in unit tests
> that use the PowerMock runner.
> 
> """
> My local flink jar was build by maven3.3.3, which the shade of guava is not
> well set. Cause the conflict of elastic search(guava 18) and the packaged
> ones(should be the ones from hadoop related)
> 
> On Mon, Mar 20, 2017 at 7:04 PM, Liangfei Su  wrote:
> 
>> Another finding that it looks most of these kinds of conflict comes from
>> the dependency to hadoop/hbase (by using mvn dependency to check).
>> 
>> It looks to me the correct way is to shade more packages from the hadoop
>> dependencies...  Thoughts here?
>> 
>> Thanks
>> Ralph
>> 
>> 
>> 
>> On Mon, Mar 20, 2017 at 6:59 PM, Liangfei Su  wrote:
>> 
>>> Hi,
>>> 
>>> I try to write some user function to write elastic search, my project
>>> comes with elastic search 2.3.5 (which is the same to flink connectors
>>> elastic search). But i'm seeing be exception that recorded in the below
>>> JIRA, by enabling the java -version:class, it show it's the guava
>>> dependency issue:
>>> Elastic search 2.3.5 transport client is using guava 18.0, and called one
>>> method starting from guava 18.0, MoreExecutors.directExecutor
>>> 
>>> While the class loading log show the active runtime MoreExecutors class
>>> loaded from flink distribution, which cause the java.lang.NoSuchMethodErro
>>> r.
>>> 
>>> Based on above finding, it looks es 2.3.5 is not able to be used with
>>> flink1.2.0 (and then the connectors-elasticsearch is broken)? Can someone
>>> help clarify?
>>> 
>>> Also, it looks some of the fink-core actually use the shade way to rename
>>> the class like from com.google.guava to 
>>> org.apache.flink.***.com.google.guava
>>> which is actually a fix of this kind of issue. Etc.
>>> https://issues.apache.org/jira/browse/FLINK-4587/https:
>>> //issues.apache.org/jira/browse/FLINK-3373.
>>> 
>>> 
>>> My flink cluster is v1.2.0, running in docker.
>>> 
>>> 
>>> Thanks,
>>> Ralph
>>> 
>>> -- Forwarded message --
>>> From: Su Ralph (JIRA) 
>>> Date: Mon, Mar 20, 2017 at 6:41 PM
>>> Subject: [jira] [Updated] (FLINK-6126) Yet another conflict : guava
>>> To: suliang...@gmail.com
>>> 
>>> 
>>> 
>>> [ https://issues.apache.org/jira/browse/FLINK-6126?page=com.at
>>> lassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>>> 
>>> Su Ralph updated FLINK-6126:
>>> 
>>>Description:
>>> When write a user function try to write to elastic search (depend on
>>> elasticsearch 2.3.5)
>>> 
>>> Stack like:
>>> java.lang.NoSuchMethodError: com.google.common.util.concurr
>>> ent.MoreExecutors.directExecutor()Ljava/util/concurrent/Executor;
>>>at org.elasticsearch.threadpool.ThreadPool.(ThreadPool.
>>> java:190)
>>>at org.elasticsearch.client.transport.TransportClient$Builder.b
>>> uild(TransportClient.java:131)
>>>at io.sherlock.capabilities.es.AbstractEsSink.open(AbstractEsSi
>>> nk.java:98)
>>> 
>>> When enable env.java.opts.taskmanager to -version:class, we can see the
>>> class load log like:
>>> [Loaded com.google.common.util.concurrent.MoreExecutors from
>>> file:/opt/flink/lib/flink-dist_2.11-1.2.0.jar]
>>> 
>>> The user code is using guva of 18.0.
>>> 
>>>  was:
>>> For some reason I need to use 
>>> org.apache.httpcomponents:httpasyncclient:4.1.2
>>> in flink.
>>> The source file is:
>>> {code}
>>> import org.apache.flink.streaming.api.scala._
>>> import org.apache.http.impl.nio.conn.ManagedNHttpClientConnectionFactory
>>> 
>>> /**
>>>  * Created by renkai on 16/9/7.
>>>  */
>>> object Main {
>>>  def main(args: Array[String]): Unit = {
>>>val instance = ManagedNHttpClientConnectionFactory.INSTANCE
>>>println("instance = " + instance)
>>> 
>>>val env = StreamExecutionEnvironment.getExecutionEnvironment
>>>val stream = env.fromCollection(1 to 100)
>>>val result = stream.map { x =>
>>>  x * 2
>>>}
>>>result.print()
>>>env.execute("xixi")
>>>  }
>>> }
>>> 
>>> {code}
>>> 
>>> and
>>> {code}
>>> name := "flink-explore"
>>> 
>>> version := "1.0"
>>> 
>>> scalaVersion := "2.11.8"
>>> 
>>> crossPaths := false
>>> 
>>> libraryDependencies ++= Seq(
>>>  "org.apache.flink" %% "flink-scala" % "1.2-SNAPSHOT"
>>>exclude("com.google.code.findbugs", "jsr305"),
>>>  "org.apache.flink" %% "flink-connector-kafka-0.8" % "1.2-SNAPSHOT"
>>>exclude("com.google.code.findbugs", "jsr305"),
>>>

[jira] [Created] (FLINK-6129) MetricQueryService does not stop query actor

2017-03-20 Thread Till Rohrmann (JIRA)

Till Rohrmann created FLINK-6129:


 Summary: MetricQueryService does not stop query actor
 Key: FLINK-6129
 URL: https://issues.apache.org/jira/browse/FLINK-6129
 Project: Flink
  Issue Type: Bug
  Components: Metrics
Affects Versions: 1.3.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 1.3.0


The {{MetricQueryService}} does not properly close the started query actor upon 
shutdown. This can be a resource leak.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Re: [DISCUSS] TravisCI status on GitHub Page

I merged the PR and therefore obviously think it's fine. ;-) Didn't
see Robert's comment in the issue though ("We once had the travis
build status badge in our readme, but decided to remove it, because it
often shows "Build failed" due to travis issues etc.
This gives people the impression that our builds are very unstable").

It's actually not just an impression, but actually true that the
builds are unstable (even if recently it's "mostly" caused by
timeouts). Since we are actively working on improving this situation
with the repository split, I think it does not hurt having it there.
If others disagree, we can revert it.

On Mon, Mar 20, 2017 at 2:12 PM, Greg Hogan  wrote:
> We are now showing the TravisCI build status on Flink’s GitHub page. I think 
> Robert’s comment in Jira may have gone unnoticed when the PR was committed.
>   https://issues.apache.org/jira/browse/FLINK-6122 
> 
>
> If not yet seeing the benefit even if builds were typically passing.
>
> Greg

RE: FW: [DISCUSS] Table API / SQL indicators for event and processing time

2017-03-20 Thread Radu Tudoran

Hi,

I am not sure if it is not about setting the timestamp within the query but you 
can imagine that there examples where you have different timestamps as 
mentioned. Take for example the case when we do a purchase online. You have:
-time of purchase  (when the payment was input/triggered)
-time of executing the transaction at bank (when the payment is processed from 
account)
-time of settlement (when the payment is executed at merchant bank - when money 
are received by the seller)

In such a scenario you can imagine that over the same stream of online payments 
you might want to run different queries, each that might be driven by one of 
these times. Supporting such a scenario would mean that we have one input 
stream that enters flink engine via a table source and then in the query we can 
start running different queries:
e.g. SELECT SUM(amount) ORDER BY rowtime(time_purchase) LIMIT 10   //you want 
the amount over your last 10 orders
e.g. SELECT SUM(amount) ORDER BY rowtime(time_settlement) LIMIT 10 //you want 
the amount over your last 10 income

Best regards,


-Original Message-
From: Timo Walther [mailto:twal...@apache.org] 
Sent: Monday, March 20, 2017 2:05 PM
To: dev@flink.apache.org
Subject: Re: FW: [DISCUSS] Table API / SQL indicators for event and processing 
time

Yes, you are right. In the current design the user cannot assign 
timestamp and watermarks in a table program. Operators (such as windows) 
might adapt the metatimestamp, if this is the case this adaption might 
need to be expressed in the query itself too.

E.g. for a tumbling windows we could limit the select part to 
table.select('rowtime.ceil(DAY) as 'newRowtime) (so that logical rowtime 
matches the physical metatimestamp)

Do you have a good example use case that needs the assignment of rowtime 
within a query?

Am 20/03/17 um 13:39 schrieb Radu Tudoran:
> Hi,
>
> As suggested by Timo - I am forwarding this to the mailing list. Sorry for 
> not having the conversation directly here - I initially thought it might not 
> be of interest...
>
> @Timo - thanks for the clarification. I get the main point now which is that 
> the rowtime is encoded within the  metadata of the record. I think this is 
> key. My view on the matter was maybe a bit updated in the sense that I saw 
> the processing pipeline as an input source (as you exemplify - a table scan) 
> and from there you have a timestamp and water mark assigner before the 
> processing actually starts. So by overriding the timestamp extractor you 
> match the field that carries the eventtime/rowtime with the mechanism from 
> flink. But as far as I understand this would not be the case anymore...am I 
> right? In case the assignment of the rowtime to the metadata of the record is 
> done differently - what would be the way to do it?
>
>
> Dr. Radu Tudoran
> Senior Research Engineer - Big Data Expert
> IT R Division
>
>
> HUAWEI TECHNOLOGIES Duesseldorf GmbH
> European Research Center
> Riesstrasse 25, 80992 München
>
> E-mail: radu.tudo...@huawei.com
> Mobile: +49 15209084330
> Telephone: +49 891588344173
>
> HUAWEI TECHNOLOGIES Duesseldorf GmbH
> Hansaallee 205, 40549 Düsseldorf, Germany, www.huawei.com
> Registered Office: Düsseldorf, Register Court Düsseldorf, HRB 56063,
> Managing Director: Bo PENG, Wanzhou MENG, Lifang CHEN
> Sitz der Gesellschaft: Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
> Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN
> This e-mail and its attachments contain confidential information from HUAWEI, 
> which is intended only for the person or entity whose address is listed 
> above. Any use of the information contained herein in any way (including, but 
> not limited to, total or partial disclosure, reproduction, or dissemination) 
> by persons other than the intended recipient(s) is prohibited. If you receive 
> this e-mail in error, please notify the sender by phone or email immediately 
> and delete it!
>
>
> -Original Message-
> From: Timo Walther [mailto:twal...@apache.org]
> Sent: Monday, March 20, 2017 12:29 PM
> To: Radu Tudoran
> Subject: Re: [DISCUSS] Table API / SQL indicators for event and processing 
> time
>
> You are not bothering me, it is very interesting to compare the design
> with real world use cases.
>
> In your use case we would create table like: tEnv.toTable('date, 'time1,
> 'time2, 'data, 'myrowtime.rowtime)
>
> We would not "overwrite" an actual attribute of the record but only add
> logical "myrowtime". In general, just to make it clear again, the
> rowtime must be in the metatimestamp of the record (by using a timestamp
> extractor before). The Table API assumes that records that enter the
> Table API are timestamped correctly. So in your use case, you would
> create your own TableSource extract the timestamp based on your 3 time
> fields and define an attribute that represents the rowtime logically. In
> the current design we want that the Table API relies on Flink's time
> handling, because time

[DISCUSS] TravisCI status on GitHub Page

2017-03-20 Thread Greg Hogan

We are now showing the TravisCI build status on Flink’s GitHub page. I think 
Robert’s comment in Jira may have gone unnoticed when the PR was committed.
  https://issues.apache.org/jira/browse/FLINK-6122 


If not yet seeing the benefit even if builds were typically passing.

Greg

Re: FW: [DISCUSS] Table API / SQL indicators for event and processing time

Yes, you are right. In the current design the user cannot assign 
timestamp and watermarks in a table program. Operators (such as windows) 
might adapt the metatimestamp, if this is the case this adaption might 
need to be expressed in the query itself too.


E.g. for a tumbling windows we could limit the select part to 
table.select('rowtime.ceil(DAY) as 'newRowtime) (so that logical rowtime 
matches the physical metatimestamp)


Do you have a good example use case that needs the assignment of rowtime 
within a query?


Am 20/03/17 um 13:39 schrieb Radu Tudoran:

Hi,

As suggested by Timo - I am forwarding this to the mailing list. Sorry for not 
having the conversation directly here - I initially thought it might not be of 
interest...

@Timo - thanks for the clarification. I get the main point now which is that 
the rowtime is encoded within the  metadata of the record. I think this is key. 
My view on the matter was maybe a bit updated in the sense that I saw the 
processing pipeline as an input source (as you exemplify - a table scan) and 
from there you have a timestamp and water mark assigner before the processing 
actually starts. So by overriding the timestamp extractor you match the field 
that carries the eventtime/rowtime with the mechanism from flink. But as far as 
I understand this would not be the case anymore...am I right? In case the 
assignment of the rowtime to the metadata of the record is done differently - 
what would be the way to do it?


Dr. Radu Tudoran
Senior Research Engineer - Big Data Expert
IT R Division


HUAWEI TECHNOLOGIES Duesseldorf GmbH
European Research Center
Riesstrasse 25, 80992 München

E-mail: radu.tudo...@huawei.com
Mobile: +49 15209084330
Telephone: +49 891588344173

HUAWEI TECHNOLOGIES Duesseldorf GmbH
Hansaallee 205, 40549 Düsseldorf, Germany, www.huawei.com
Registered Office: Düsseldorf, Register Court Düsseldorf, HRB 56063,
Managing Director: Bo PENG, Wanzhou MENG, Lifang CHEN
Sitz der Gesellschaft: Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN
This e-mail and its attachments contain confidential information from HUAWEI, 
which is intended only for the person or entity whose address is listed above. 
Any use of the information contained herein in any way (including, but not 
limited to, total or partial disclosure, reproduction, or dissemination) by 
persons other than the intended recipient(s) is prohibited. If you receive this 
e-mail in error, please notify the sender by phone or email immediately and 
delete it!


-Original Message-
From: Timo Walther [mailto:twal...@apache.org]
Sent: Monday, March 20, 2017 12:29 PM
To: Radu Tudoran
Subject: Re: [DISCUSS] Table API / SQL indicators for event and processing time

You are not bothering me, it is very interesting to compare the design
with real world use cases.

In your use case we would create table like: tEnv.toTable('date, 'time1,
'time2, 'data, 'myrowtime.rowtime)

We would not "overwrite" an actual attribute of the record but only add
logical "myrowtime". In general, just to make it clear again, the
rowtime must be in the metatimestamp of the record (by using a timestamp
extractor before). The Table API assumes that records that enter the
Table API are timestamped correctly. So in your use case, you would
create your own TableSource extract the timestamp based on your 3 time
fields and define an attribute that represents the rowtime logically. In
the current design we want that the Table API relies on Flink's time
handling, because time handling can be very tricky.So we only support
one event-time time field.

But would it be possible to post our discussion on the ML? It might be
interesting for others as well. If yes, can you forward our conversion
to the ML?

Timo



Am 20/03/17 um 12:11 schrieb Radu Tudoran:

Thanks for the replies.

Regarding the ""It might be sometimes that this is not explicit to be guessed" 
That is
why I added the RelTimeConverter. After this conversion step it should
be as explicit as possible (by using the special types). And we can add
special handling of functions (i.e. ceil) that preserve the monotonicity."

..maybe I am missing something so sorry if I just bother you for nothing (it is 
just to make sure we think of all cases before hand). I saw examples of 
applications where you have multiple fields of the same type. For example an 
event can have 3 time fields of TIMESTAMP, 1 of DATE and 2 of TIME (this is 
actually from a real application with some sort fo standard communication 
schema). I was referring to such cases that it is unclear to me how the code 
will identify the exact field to use as rowtime for example. This is what I 
meant about how are we passing indicators to spot the row time field as well as 
what would happen with the code in such a situation as it can identify multiple 
time fields.

Dr. Radu Tudoran
Senior Research Engineer - Big Data Expert
IT R Division


HUAWEI TECHNOLOGIES

Re: [DISCUSS] Project build time and possible restructuring

2017-03-20 Thread Robert Metzger

Aljoscha, do you know how to configure jenkins?
Is Apache INFRA doing that, or are the beam people doing that themselves?

One downside of Jenkins is that we probably need some machines that execute
the tests. A Travis container has 2 CPU cores and 4 GB main memory. We
currently have 10 such containers available on travis concurrently. I think
we would need at least the same amount on Jenkins.


On Mon, Mar 20, 2017 at 1:48 PM, Timo Walther  wrote:

> I agress with Aljoscha that we might consider moving from Jenkins to
> Travis. Is there any disadvantage in using Jenkins?
>
> I think we should structure the project according to release management
> (e.g. more frequent releases of libraries) or other criteria (e.g. core and
> non-core) instead of build time. What would happen if the built of another
> submodule would become too long, would we split/restructure again and
> again? If Jenkins solves all our problems we should use it.
>
> Regards,
> Timo
>
>
>
> Am 20/03/17 um 12:21 schrieb Aljoscha Krettek:
>
>> I prefer Jenkins to Travis by far. Working on Beam, where we have good
>> Jenkins integration, has opened my eyes to what is possible with good CI
>> integration.
>>
>> For example, look at this recent Beam PR: https://github.com/apache/beam
>> /pull/2263 . The
>> Jenkins-Github integration will tell you exactly which tests failed and if
>> you click on the links you can look at the log output/std out of the tests
>> in question.
>>
>> This is the overview page of one of the Jenkins Jobs that we have in
>> Beam: https://builds.apache.org/job/beam_PostCommit_Java_RunnableO
>> nService_Flink/ > /beam_PostCommit_Java_RunnableOnService_Flink/>. This is an example of a
>> stable build: https://builds.apache.org/job/
>> beam_PostCommit_Java_RunnableOnService_Flink/lastStableBuild/ <
>> https://builds.apache.org/job/beam_PostCommit_Java_Runnable
>> OnService_Flink/lastStableBuild/>. Notice how it gives you fine grained
>> information about the Maven run. This is an unstable run:
>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableO
>> nService_Flink/lastUnstableBuild/ > /beam_PostCommit_Java_RunnableOnService_Flink/lastUnstableBuild/>. There
>> you can see which tests failed and you can easily drill down.
>>
>> Best,
>> Aljoscha
>>
>> On 20 Mar 2017, at 11:46, Robert Metzger  wrote:
>>>
>>> Thank you for looking into the build times.
>>>
>>> I didn't know that the build time situation is so bad. Even with yarn,
>>> mesos, connectors and libraries removed, we are still running into the
>>> build timeout :(
>>>
>>> Aljoscha told me that the Beam community is using Jenkins for running
>>> the tests, and they are planning to completely move away from Travis. I
>>> wonder whether we should do the same, as having our own Jenkins servers
>>> would allow us to run tests for more than 50 minutes.
>>>
>>> I agree with Stephan that we should keep the yarn and mesos tests in the
>>> core for stability / testing quality purposes.
>>>
>>>
>>> On Mon, Mar 20, 2017 at 11:27 AM, Stephan Ewen >> > wrote:
>>> @Greg
>>>
>>> I am personally in favor of splitting "connectors" and "contrib" out as
>>> well. I know that @rmetzger has some reservations about the connectors,
>>> but
>>> we may be able to convince him.
>>>
>>> For the cluster tests (yarn / mesos) - in the past there were many cases
>>> where these tests caught cases that other tests did not, because they are
>>> the only tests that actually use the "flink-dist.jar" and thus discover
>>> many dependency and configuration issues. For that reason, my feeling
>>> would
>>> be that they are valuable in the core repository.
>>>
>>> I would actually suggest to do only the library split initially, to see
>>> what the challenges are in setting up the multi-repo build and release
>>> tooling. Once we gathered experience there, we can probably easily see
>>> what
>>> else we can split out.
>>>
>>> Stephan
>>>
>>>
>>> On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan > c...@greghogan.com>> wrote:
>>>
>>> I’d like to use this refactoring opportunity to unspilt the Travis tests.
 With 51 builds queued up for the weekend (some of which may fail or have
 been force pushed) we are at the limit of the number of contributions we
 can process. Fixing this requires 1) splitting the project, 2)
 investigating speedups for long-running tests, and 3) staying cognizant
 of
 test performance when accepting new code.

 I’d like to add one to Stephan’s list of module group. I like that the
 modules are generic (“libraries”) so that no one module is alone and
 independent.

 Flink has three “libraries”: cep, ml, and gelly.

 “connectors” is a hotspot due to the long-running Kafka tests (and
 connectors for three Kafka

Re: [DISCUSS] Project build time and possible restructuring

I agress with Aljoscha that we might consider moving from Jenkins to
Travis. Is there any disadvantage in using Jenkins?

I think we should structure the project according to release management
(e.g. more frequent releases of libraries) or other criteria (e.g. core
and non-core) instead of build time. What would happen if the built of
another submodule would become too long, would we split/restructure
again and again? If Jenkins solves all our problems we should use it.

Regards,
Timo

Am 20/03/17 um 12:21 schrieb Aljoscha Krettek:

I prefer Jenkins to Travis by far. Working on Beam, where we have good Jenkins
integration, has opened my eyes to what is possible with good CI integration.

For example, look at this recent Beam PR: https://github.com/apache/beam/pull/2263
. The Jenkins-Github integration will
tell you exactly which tests failed and if you click on the links you can look at the
log output/std out of the tests in question.

This is the overview page of one of the Jenkins Jobs that we have in Beam:
https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/
. This is an
example of a stable build:
https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastStableBuild/
.
Notice how it gives you fine grained information about the Maven run. This is an unstable run:
https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastUnstableBuild/
.
There you can see which tests failed and you can easily drill down.

Best,
Aljoscha

On 20 Mar 2017, at 11:46, Robert Metzger wrote:

Thank you for looking into the build times.

I didn't know that the build time situation is so bad. Even with yarn, mesos,
connectors and libraries removed, we are still running into the build timeout :(

Aljoscha told me that the Beam community is using Jenkins for running the
tests, and they are planning to completely move away from Travis. I wonder
whether we should do the same, as having our own Jenkins servers would allow us
to run tests for more than 50 minutes.

I agree with Stephan that we should keep the yarn and mesos tests in the core
for stability / testing quality purposes.

On Mon, Mar 20, 2017 at 11:27 AM, Stephan Ewen > wrote:
@Greg

I am personally in favor of splitting "connectors" and "contrib" out as
well. I know that @rmetzger has some reservations about the connectors, but
we may be able to convince him.

For the cluster tests (yarn / mesos) - in the past there were many cases
where these tests caught cases that other tests did not, because they are
the only tests that actually use the "flink-dist.jar" and thus discover
many dependency and configuration issues. For that reason, my feeling would
be that they are valuable in the core repository.

I would actually suggest to do only the library split initially, to see
what the challenges are in setting up the multi-repo build and release
tooling. Once we gathered experience there, we can probably easily see what
else we can split out.

Stephan

On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan > wrote:

I’d like to use this refactoring opportunity to unspilt the Travis tests.
With 51 builds queued up for the weekend (some of which may fail or have
been force pushed) we are at the limit of the number of contributions we
can process. Fixing this requires 1) splitting the project, 2)
investigating speedups for long-running tests, and 3) staying cognizant of
test performance when accepting new code.

I’d like to add one to Stephan’s list of module group. I like that the
modules are generic (“libraries”) so that no one module is alone and
independent.

Flink has three “libraries”: cep, ml, and gelly.

“connectors” is a hotspot due to the long-running Kafka tests (and
connectors for three Kafka versions).

Both flink-storm and flink-python have a modest number of number of tests
and could live with the miscellaneous modules in “contrib”.

The YARN tests are long-running and problematic (I am unable to
successfully run these locally). A “cluster” module could host flink-mesos,
flink-yarn, and flink-yarn-tests.

That gets us close to running all tests in a single Travis build.
https://travis-ci.org/greghogan/flink/builds/212122590
<
https://travis-ci.org/greghogan/flink/builds/212122590
>

I also tested (https://github.com/greghogan/flink/commits/core_build
<

FW: [DISCUSS] Table API / SQL indicators for event and processing time

2017-03-20 Thread Radu Tudoran

Hi,

As suggested by Timo - I am forwarding this to the mailing list. Sorry for not 
having the conversation directly here - I initially thought it might not be of 
interest...

@Timo - thanks for the clarification. I get the main point now which is that 
the rowtime is encoded within the  metadata of the record. I think this is key. 
My view on the matter was maybe a bit updated in the sense that I saw the 
processing pipeline as an input source (as you exemplify - a table scan) and 
from there you have a timestamp and water mark assigner before the processing 
actually starts. So by overriding the timestamp extractor you match the field 
that carries the eventtime/rowtime with the mechanism from flink. But as far as 
I understand this would not be the case anymore...am I right? In case the 
assignment of the rowtime to the metadata of the record is done differently - 
what would be the way to do it?


Dr. Radu Tudoran
Senior Research Engineer - Big Data Expert
IT R Division


HUAWEI TECHNOLOGIES Duesseldorf GmbH
European Research Center
Riesstrasse 25, 80992 München

E-mail: radu.tudo...@huawei.com
Mobile: +49 15209084330
Telephone: +49 891588344173

HUAWEI TECHNOLOGIES Duesseldorf GmbH
Hansaallee 205, 40549 Düsseldorf, Germany, www.huawei.com
Registered Office: Düsseldorf, Register Court Düsseldorf, HRB 56063,
Managing Director: Bo PENG, Wanzhou MENG, Lifang CHEN
Sitz der Gesellschaft: Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang CHEN
This e-mail and its attachments contain confidential information from HUAWEI, 
which is intended only for the person or entity whose address is listed above. 
Any use of the information contained herein in any way (including, but not 
limited to, total or partial disclosure, reproduction, or dissemination) by 
persons other than the intended recipient(s) is prohibited. If you receive this 
e-mail in error, please notify the sender by phone or email immediately and 
delete it!


-Original Message-
From: Timo Walther [mailto:twal...@apache.org] 
Sent: Monday, March 20, 2017 12:29 PM
To: Radu Tudoran
Subject: Re: [DISCUSS] Table API / SQL indicators for event and processing time

You are not bothering me, it is very interesting to compare the design 
with real world use cases.

In your use case we would create table like: tEnv.toTable('date, 'time1, 
'time2, 'data, 'myrowtime.rowtime)

We would not "overwrite" an actual attribute of the record but only add 
logical "myrowtime". In general, just to make it clear again, the 
rowtime must be in the metatimestamp of the record (by using a timestamp 
extractor before). The Table API assumes that records that enter the 
Table API are timestamped correctly. So in your use case, you would 
create your own TableSource extract the timestamp based on your 3 time 
fields and define an attribute that represents the rowtime logically. In 
the current design we want that the Table API relies on Flink's time 
handling, because time handling can be very tricky.So we only support 
one event-time time field.

But would it be possible to post our discussion on the ML? It might be 
interesting for others as well. If yes, can you forward our conversion 
to the ML?

Timo



Am 20/03/17 um 12:11 schrieb Radu Tudoran:
> Thanks for the replies.
>
> Regarding the ""It might be sometimes that this is not explicit to be 
> guessed" That is
> why I added the RelTimeConverter. After this conversion step it should
> be as explicit as possible (by using the special types). And we can add
> special handling of functions (i.e. ceil) that preserve the monotonicity."
>
> ..maybe I am missing something so sorry if I just bother you for nothing (it 
> is just to make sure we think of all cases before hand). I saw examples of 
> applications where you have multiple fields of the same type. For example an 
> event can have 3 time fields of TIMESTAMP, 1 of DATE and 2 of TIME (this is 
> actually from a real application with some sort fo standard communication 
> schema). I was referring to such cases that it is unclear to me how the code 
> will identify the exact field to use as rowtime for example. This is what I 
> meant about how are we passing indicators to spot the row time field as well 
> as what would happen with the code in such a situation as it can identify 
> multiple time fields.
>
> Dr. Radu Tudoran
> Senior Research Engineer - Big Data Expert
> IT R Division
>
>
> HUAWEI TECHNOLOGIES Duesseldorf GmbH
> European Research Center
> Riesstrasse 25, 80992 München
>
> E-mail: radu.tudo...@huawei.com
> Mobile: +49 15209084330
> Telephone: +49 891588344173
>
> HUAWEI TECHNOLOGIES Duesseldorf GmbH
> Hansaallee 205, 40549 Düsseldorf, Germany, www.huawei.com
> Registered Office: Düsseldorf, Register Court Düsseldorf, HRB 56063,
> Managing Director: Bo PENG, Wanzhou MENG, Lifang CHEN
> Sitz der Gesellschaft: Düsseldorf, Amtsgericht Düsseldorf, HRB 56063,
> Geschäftsführer: Bo PENG, Wanzhou MENG, Lifang

Re: [jira] [Updated] (FLINK-6126) Yet another conflict : guava

Well...Comments added.

This looks due to incorrect local build of flink.

Documentation at
https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/building.html
shows
clearly that

"""
NOTE: Maven 3.3.x can build Flink, but will not properly shade away certain
dependencies. Maven 3.0.3 creates the libraries properly. To build unit
tests with Java 8, use Java 8u51 or above to prevent failures in unit tests
that use the PowerMock runner.

"""
My local flink jar was build by maven3.3.3, which the shade of guava is not
well set. Cause the conflict of elastic search(guava 18) and the packaged
ones(should be the ones from hadoop related)

On Mon, Mar 20, 2017 at 7:04 PM, Liangfei Su  wrote:

> Another finding that it looks most of these kinds of conflict comes from
> the dependency to hadoop/hbase (by using mvn dependency to check).
>
> It looks to me the correct way is to shade more packages from the hadoop
> dependencies...  Thoughts here?
>
> Thanks
> Ralph
>
>
>
> On Mon, Mar 20, 2017 at 6:59 PM, Liangfei Su  wrote:
>
>> Hi,
>>
>> I try to write some user function to write elastic search, my project
>> comes with elastic search 2.3.5 (which is the same to flink connectors
>> elastic search). But i'm seeing be exception that recorded in the below
>> JIRA, by enabling the java -version:class, it show it's the guava
>> dependency issue:
>> Elastic search 2.3.5 transport client is using guava 18.0, and called one
>> method starting from guava 18.0, MoreExecutors.directExecutor
>>
>> While the class loading log show the active runtime MoreExecutors class
>> loaded from flink distribution, which cause the java.lang.NoSuchMethodErro
>> r.
>>
>> Based on above finding, it looks es 2.3.5 is not able to be used with
>> flink1.2.0 (and then the connectors-elasticsearch is broken)? Can someone
>> help clarify?
>>
>> Also, it looks some of the fink-core actually use the shade way to rename
>> the class like from com.google.guava to org.apache.flink.***.com.google.guava
>> which is actually a fix of this kind of issue. Etc.
>> https://issues.apache.org/jira/browse/FLINK-4587/https:
>> //issues.apache.org/jira/browse/FLINK-3373.
>>
>>
>> My flink cluster is v1.2.0, running in docker.
>>
>>
>> Thanks,
>> Ralph
>>
>> -- Forwarded message --
>> From: Su Ralph (JIRA) 
>> Date: Mon, Mar 20, 2017 at 6:41 PM
>> Subject: [jira] [Updated] (FLINK-6126) Yet another conflict : guava
>> To: suliang...@gmail.com
>>
>>
>>
>>  [ https://issues.apache.org/jira/browse/FLINK-6126?page=com.at
>> lassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>>
>> Su Ralph updated FLINK-6126:
>> 
>> Description:
>> When write a user function try to write to elastic search (depend on
>> elasticsearch 2.3.5)
>>
>> Stack like:
>> java.lang.NoSuchMethodError: com.google.common.util.concurr
>> ent.MoreExecutors.directExecutor()Ljava/util/concurrent/Executor;
>> at org.elasticsearch.threadpool.ThreadPool.(ThreadPool.
>> java:190)
>> at org.elasticsearch.client.transport.TransportClient$Builder.b
>> uild(TransportClient.java:131)
>> at io.sherlock.capabilities.es.AbstractEsSink.open(AbstractEsSi
>> nk.java:98)
>>
>> When enable env.java.opts.taskmanager to -version:class, we can see the
>> class load log like:
>> [Loaded com.google.common.util.concurrent.MoreExecutors from
>> file:/opt/flink/lib/flink-dist_2.11-1.2.0.jar]
>>
>> The user code is using guva of 18.0.
>>
>>   was:
>> For some reason I need to use org.apache.httpcomponents:httpasyncclient:4.1.2
>> in flink.
>> The source file is:
>> {code}
>> import org.apache.flink.streaming.api.scala._
>> import org.apache.http.impl.nio.conn.ManagedNHttpClientConnectionFactory
>>
>> /**
>>   * Created by renkai on 16/9/7.
>>   */
>> object Main {
>>   def main(args: Array[String]): Unit = {
>> val instance = ManagedNHttpClientConnectionFactory.INSTANCE
>> println("instance = " + instance)
>>
>> val env = StreamExecutionEnvironment.getExecutionEnvironment
>> val stream = env.fromCollection(1 to 100)
>> val result = stream.map { x =>
>>   x * 2
>> }
>> result.print()
>> env.execute("xixi")
>>   }
>> }
>>
>> {code}
>>
>> and
>> {code}
>> name := "flink-explore"
>>
>> version := "1.0"
>>
>> scalaVersion := "2.11.8"
>>
>> crossPaths := false
>>
>> libraryDependencies ++= Seq(
>>   "org.apache.flink" %% "flink-scala" % "1.2-SNAPSHOT"
>> exclude("com.google.code.findbugs", "jsr305"),
>>   "org.apache.flink" %% "flink-connector-kafka-0.8" % "1.2-SNAPSHOT"
>> exclude("com.google.code.findbugs", "jsr305"),
>>   "org.apache.flink" %% "flink-streaming-scala" % "1.2-SNAPSHOT"
>> exclude("com.google.code.findbugs", "jsr305"),
>>   "org.apache.flink" %% "flink-clients" % "1.2-SNAPSHOT"
>> exclude("com.google.code.findbugs", "jsr305"),
>>   "org.apache.httpcomponents" % "httpasyncclient" % "4.1.2"
>> )
>> {code}
>> I use

Re: Machine Learning on Flink - Next steps

2017-03-20 Thread Tao Meng

Hi All,

Sorry for joining this discussion late.
My graduation thesis is about online learning system. I would build it on
flink in the next three months.

I'd like to contribute on:
 - Online learning




On Mon, Mar 20, 2017 at 6:51 PM Katherin Eri  wrote:

Hello, Theodore

Could you please move vectors of development and their prioritized
positions from *## Executive summary* to the google doc?



Could you please also create some table in google doc, that is representing
the selected directions and persons, who would like to drive or participate
in the particular topic, in order to make this process transparent for
community and sum up current state of commitment of contributors?

There we could simply inscribe us to some topic.



And 1+ for CTR prediction case.

вс, 19 мар. 2017 г. в 16:49, Theodore Vasiloudis <
theodoros.vasilou...@gmail.com>:

> Hello Stavros,
>
> The way I thought we'd do it is that each shepherd would be responsible
for
> organizing the project: that includes setting up a Google doc, sending an
> email to the dev list to inform the wider community, and if possible,
> personally contacting the people who expressed interest in the project.
>
> Would you be willing to lead that effort for the model serving project?
>
> Regards,
> Theodore
>
> --
> Sent from a mobile device. May contain autocorrect errors.
>
> On Mar 19, 2017 3:49 AM, "Stavros Kontopoulos" 
> wrote:
>
> > Hi all...
> >
> > I agree about the tensorflow integration it seems to be important from
> what
> > I hear.
> > Should we sign up somewhere for the working groups (gdcos)?
> > I would like to start helping with the model serving feature.
> >
> > Best Regards,
> > Stavros
> >
> > On Fri, Mar 17, 2017 at 10:34 PM, Gábor Hermann 
> > wrote:
> >
> > > Hi Chen,
> > >
> > > Thanks for the input! :)
> > >
> > > There is already a project [1] for using TensorFlow models in Flink,
> and
> > > Theodore has suggested
> > > to contact the author, Eron Wright for the model serving direction.
> > >
> > >
> > > [1] http://sf.flink-forward.org/kb_sessions/introducing-flink-
> > tensorflow/
> > >
> > > Cheers,
> > > Gabor
> > >
> > >
> > > On 2017-03-17 19:41, Chen Qin wrote:
> > >
> > >> [1]http://sf.flink-forward.org/kb_sessions/introducing-flink-te
> > >> nsorflow/
> > >>
> > >
> > >
> >
>
--

*Yours faithfully, *

*Kate Eri.*

Re: [DISCUSS] Project build time and possible restructuring

I prefer Jenkins to Travis by far. Working on Beam, where we have good Jenkins 
integration, has opened my eyes to what is possible with good CI integration.

For example, look at this recent Beam PR: 
https://github.com/apache/beam/pull/2263 
. The Jenkins-Github integration will 
tell you exactly which tests failed and if you click on the links you can look 
at the log output/std out of the tests in question.

This is the overview page of one of the Jenkins Jobs that we have in Beam: 
https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/ 
. 
This is an example of a stable build: 
https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastStableBuild/
 
.
 Notice how it gives you fine grained information about the Maven run. This is 
an unstable run: 
https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastUnstableBuild/
 
.
 There you can see which tests failed and you can easily drill down.

Best,
Aljoscha

> On 20 Mar 2017, at 11:46, Robert Metzger  wrote:
> 
> Thank you for looking into the build times.
> 
> I didn't know that the build time situation is so bad. Even with yarn, mesos, 
> connectors and libraries removed, we are still running into the build timeout 
> :(
> 
> Aljoscha told me that the Beam community is using Jenkins for running the 
> tests, and they are planning to completely move away from Travis. I wonder 
> whether we should do the same, as having our own Jenkins servers would allow 
> us to run tests for more than 50 minutes.
> 
> I agree with Stephan that we should keep the yarn and mesos tests in the core 
> for stability / testing quality purposes.
> 
> 
> On Mon, Mar 20, 2017 at 11:27 AM, Stephan Ewen  > wrote:
> @Greg
> 
> I am personally in favor of splitting "connectors" and "contrib" out as
> well. I know that @rmetzger has some reservations about the connectors, but
> we may be able to convince him.
> 
> For the cluster tests (yarn / mesos) - in the past there were many cases
> where these tests caught cases that other tests did not, because they are
> the only tests that actually use the "flink-dist.jar" and thus discover
> many dependency and configuration issues. For that reason, my feeling would
> be that they are valuable in the core repository.
> 
> I would actually suggest to do only the library split initially, to see
> what the challenges are in setting up the multi-repo build and release
> tooling. Once we gathered experience there, we can probably easily see what
> else we can split out.
> 
> Stephan
> 
> 
> On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan  > wrote:
> 
> > I’d like to use this refactoring opportunity to unspilt the Travis tests.
> > With 51 builds queued up for the weekend (some of which may fail or have
> > been force pushed) we are at the limit of the number of contributions we
> > can process. Fixing this requires 1) splitting the project, 2)
> > investigating speedups for long-running tests, and 3) staying cognizant of
> > test performance when accepting new code.
> >
> > I’d like to add one to Stephan’s list of module group. I like that the
> > modules are generic (“libraries”) so that no one module is alone and
> > independent.
> >
> > Flink has three “libraries”: cep, ml, and gelly.
> >
> > “connectors” is a hotspot due to the long-running Kafka tests (and
> > connectors for three Kafka versions).
> >
> > Both flink-storm and flink-python have a modest number of number of tests
> > and could live with the miscellaneous modules in “contrib”.
> >
> > The YARN tests are long-running and problematic (I am unable to
> > successfully run these locally). A “cluster” module could host flink-mesos,
> > flink-yarn, and flink-yarn-tests.
> >
> > That gets us close to running all tests in a single Travis build.
> >   https://travis-ci.org/greghogan/flink/builds/212122590 
> >  <
> > https://travis-ci.org/greghogan/flink/builds/212122590 
> > >
> >
> > I also tested (https://github.com/greghogan/flink/commits/core_build 
> >  <
> > https://github.com/greghogan/flink/commits/core_build 
> > >) with a maven
> > parallelism of 2 and 4, with the latter a 6.4% drop in build time.
> >   https://travis-ci.org/greghogan/flink/builds/212137659 
> >  <
> > https://travis-ci.org/greghogan/flink/builds/212137659 
>

Re: [jira] [Updated] (FLINK-6126) Yet another conflict : guava

Another finding that it looks most of these kinds of conflict comes from
the dependency to hadoop/hbase (by using mvn dependency to check).

It looks to me the correct way is to shade more packages from the hadoop
dependencies...  Thoughts here?

Thanks
Ralph


On Mon, Mar 20, 2017 at 6:59 PM, Liangfei Su  wrote:

> Hi,
>
> I try to write some user function to write elastic search, my project
> comes with elastic search 2.3.5 (which is the same to flink connectors
> elastic search). But i'm seeing be exception that recorded in the below
> JIRA, by enabling the java -version:class, it show it's the guava
> dependency issue:
> Elastic search 2.3.5 transport client is using guava 18.0, and called one
> method starting from guava 18.0, MoreExecutors.directExecutor
>
> While the class loading log show the active runtime MoreExecutors class
> loaded from flink distribution, which cause the java.lang.NoSuchMethodErro
> r.
>
> Based on above finding, it looks es 2.3.5 is not able to be used with
> flink1.2.0 (and then the connectors-elasticsearch is broken)? Can someone
> help clarify?
>
> Also, it looks some of the fink-core actually use the shade way to rename
> the class like from com.google.guava to org.apache.flink.***.com.google.guava
> which is actually a fix of this kind of issue. Etc.
> https://issues.apache.org/jira/browse/FLINK-4587/https:
> //issues.apache.org/jira/browse/FLINK-3373.
>
>
> My flink cluster is v1.2.0, running in docker.
>
>
> Thanks,
> Ralph
>
> -- Forwarded message --
> From: Su Ralph (JIRA) 
> Date: Mon, Mar 20, 2017 at 6:41 PM
> Subject: [jira] [Updated] (FLINK-6126) Yet another conflict : guava
> To: suliang...@gmail.com
>
>
>
>  [ https://issues.apache.org/jira/browse/FLINK-6126?page=com.at
> lassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> Su Ralph updated FLINK-6126:
> 
> Description:
> When write a user function try to write to elastic search (depend on
> elasticsearch 2.3.5)
>
> Stack like:
> java.lang.NoSuchMethodError: com.google.common.util.concurr
> ent.MoreExecutors.directExecutor()Ljava/util/concurrent/Executor;
> at org.elasticsearch.threadpool.ThreadPool.(ThreadPool.
> java:190)
> at org.elasticsearch.client.transport.TransportClient$Builder.b
> uild(TransportClient.java:131)
> at io.sherlock.capabilities.es.AbstractEsSink.open(AbstractEsSi
> nk.java:98)
>
> When enable env.java.opts.taskmanager to -version:class, we can see the
> class load log like:
> [Loaded com.google.common.util.concurrent.MoreExecutors from
> file:/opt/flink/lib/flink-dist_2.11-1.2.0.jar]
>
> The user code is using guva of 18.0.
>
>   was:
> For some reason I need to use org.apache.httpcomponents:httpasyncclient:4.1.2
> in flink.
> The source file is:
> {code}
> import org.apache.flink.streaming.api.scala._
> import org.apache.http.impl.nio.conn.ManagedNHttpClientConnectionFactory
>
> /**
>   * Created by renkai on 16/9/7.
>   */
> object Main {
>   def main(args: Array[String]): Unit = {
> val instance = ManagedNHttpClientConnectionFactory.INSTANCE
> println("instance = " + instance)
>
> val env = StreamExecutionEnvironment.getExecutionEnvironment
> val stream = env.fromCollection(1 to 100)
> val result = stream.map { x =>
>   x * 2
> }
> result.print()
> env.execute("xixi")
>   }
> }
>
> {code}
>
> and
> {code}
> name := "flink-explore"
>
> version := "1.0"
>
> scalaVersion := "2.11.8"
>
> crossPaths := false
>
> libraryDependencies ++= Seq(
>   "org.apache.flink" %% "flink-scala" % "1.2-SNAPSHOT"
> exclude("com.google.code.findbugs", "jsr305"),
>   "org.apache.flink" %% "flink-connector-kafka-0.8" % "1.2-SNAPSHOT"
> exclude("com.google.code.findbugs", "jsr305"),
>   "org.apache.flink" %% "flink-streaming-scala" % "1.2-SNAPSHOT"
> exclude("com.google.code.findbugs", "jsr305"),
>   "org.apache.flink" %% "flink-clients" % "1.2-SNAPSHOT"
> exclude("com.google.code.findbugs", "jsr305"),
>   "org.apache.httpcomponents" % "httpasyncclient" % "4.1.2"
> )
> {code}
> I use `sbt assembly` to get a fat jar.
>
> If I run the command
> {code}
>  java -cp flink-explore-assembly-1.0.jar Main
> {code}
> I get the result
>
> {code}
> instance = org.apache.http.impl.nio.conn.ManagedNHttpClientConnectionFa
> ctory@4909b8da
> log4j:WARN No appenders could be found for logger (
> org.apache.flink.api.scala.ClosureCleaner$).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
> more info.
> Connected to JobManager at Actor[akka://flink/user/jobman
> ager_1#-1177584915]
> 09/07/2016 12:05:26 Job execution switched to status RUNNING.
> 09/07/2016 12:05:26 Source: Collection Source(1/1) switched to
> SCHEDULED
> 09/07/2016 12:05:26 Source: Collection Source(1/1) switched to
> DEPLOYING
> ...
> 09/07/2016 12:05:26 Map -> Sink: Unnamed(20/24)

[jira] [Created] (FLINK-6128) Optimize JVM options for improve test performance

2017-03-20 Thread Dmytro Shkvyra (JIRA)

Dmytro Shkvyra created FLINK-6128:
-

 Summary: Optimize JVM options for improve test performance
 Key: FLINK-6128
 URL: https://issues.apache.org/jira/browse/FLINK-6128
 Project: Flink
  Issue Type: Improvement
  Components: Tests
 Environment: maven-surefire-plugin
Reporter: Dmytro Shkvyra
Assignee: Dmytro Shkvyra


Tune JVM options for run tests by maven-surefire-plugin at travis



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Fwd: [jira] [Updated] (FLINK-6126) Yet another conflict : guava

Hi,

I try to write some user function to write elastic search, my project comes
with elastic search 2.3.5 (which is the same to flink connectors elastic
search). But i'm seeing be exception that recorded in the below JIRA, by
enabling the java -version:class, it show it's the guava dependency issue:
Elastic search 2.3.5 transport client is using guava 18.0, and called one
method starting from guava 18.0, MoreExecutors.directExecutor

While the class loading log show the active runtime MoreExecutors class
loaded from flink distribution, which cause the java.lang.NoSuchMethodError.

Based on above finding, it looks es 2.3.5 is not able to be used with
flink1.2.0 (and then the connectors-elasticsearch is broken)? Can someone
help clarify?

Also, it looks some of the fink-core actually use the shade way to rename
the class like from com.google.guava to org.apache.flink.***.com.google.guava
which is actually a fix of this kind of issue. Etc. https://issues.apache.
org/jira/browse/FLINK-4587/https://issues.apache.org/jira/browse/FLINK-3373
.


My flink cluster is v1.2.0, running in docker.


Thanks,
Ralph

-- Forwarded message --
From: Su Ralph (JIRA) 
Date: Mon, Mar 20, 2017 at 6:41 PM
Subject: [jira] [Updated] (FLINK-6126) Yet another conflict : guava
To: suliang...@gmail.com



 [ https://issues.apache.org/jira/browse/FLINK-6126?page=com.
atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Su Ralph updated FLINK-6126:

Description:
When write a user function try to write to elastic search (depend on
elasticsearch 2.3.5)

Stack like:
java.lang.NoSuchMethodError: com.google.common.util.concurr
ent.MoreExecutors.directExecutor()Ljava/util/concurrent/Executor;
at org.elasticsearch.threadpool.ThreadPool.(ThreadPool.
java:190)
at org.elasticsearch.client.transport.TransportClient$Builder.
build(TransportClient.java:131)
at io.sherlock.capabilities.es.AbstractEsSink.open(AbstractEsSi
nk.java:98)

When enable env.java.opts.taskmanager to -version:class, we can see the
class load log like:
[Loaded com.google.common.util.concurrent.MoreExecutors from
file:/opt/flink/lib/flink-dist_2.11-1.2.0.jar]

The user code is using guva of 18.0.

  was:
For some reason I need to use org.apache.httpcomponents:httpasyncclient:4.1.2
in flink.
The source file is:
{code}
import org.apache.flink.streaming.api.scala._
import org.apache.http.impl.nio.conn.ManagedNHttpClientConnectionFactory

/**
  * Created by renkai on 16/9/7.
  */
object Main {
  def main(args: Array[String]): Unit = {
val instance = ManagedNHttpClientConnectionFactory.INSTANCE
println("instance = " + instance)

val env = StreamExecutionEnvironment.getExecutionEnvironment
val stream = env.fromCollection(1 to 100)
val result = stream.map { x =>
  x * 2
}
result.print()
env.execute("xixi")
  }
}

{code}

and
{code}
name := "flink-explore"

version := "1.0"

scalaVersion := "2.11.8"

crossPaths := false

libraryDependencies ++= Seq(
  "org.apache.flink" %% "flink-scala" % "1.2-SNAPSHOT"
exclude("com.google.code.findbugs", "jsr305"),
  "org.apache.flink" %% "flink-connector-kafka-0.8" % "1.2-SNAPSHOT"
exclude("com.google.code.findbugs", "jsr305"),
  "org.apache.flink" %% "flink-streaming-scala" % "1.2-SNAPSHOT"
exclude("com.google.code.findbugs", "jsr305"),
  "org.apache.flink" %% "flink-clients" % "1.2-SNAPSHOT"
exclude("com.google.code.findbugs", "jsr305"),
  "org.apache.httpcomponents" % "httpasyncclient" % "4.1.2"
)
{code}
I use `sbt assembly` to get a fat jar.

If I run the command
{code}
 java -cp flink-explore-assembly-1.0.jar Main
{code}
I get the result

{code}
instance = org.apache.http.impl.nio.conn.ManagedNHttpClientConnectionFa
ctory@4909b8da
log4j:WARN No appenders could be found for logger (
org.apache.flink.api.scala.ClosureCleaner$).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
more info.
Connected to JobManager at Actor[akka://flink/user/jobmanager_1#-1177584915]
09/07/2016 12:05:26 Job execution switched to status RUNNING.
09/07/2016 12:05:26 Source: Collection Source(1/1) switched to SCHEDULED
09/07/2016 12:05:26 Source: Collection Source(1/1) switched to DEPLOYING
...
09/07/2016 12:05:26 Map -> Sink: Unnamed(20/24) switched to RUNNING
09/07/2016 12:05:26 Map -> Sink: Unnamed(19/24) switched to RUNNING
15> 30
20> 184
...
19> 182
1> 194
8> 160
09/07/2016 12:05:26 Source: Collection Source(1/1) switched to FINISHED
...
09/07/2016 12:05:26 Map -> Sink: Unnamed(1/24) switched to FINISHED
09/07/2016 12:05:26 Job execution switched to status FINISHED.
{code}

Nothing special.

But if I run the jar by
{code}
./bin/flink run shop-monitor-flink-assembly-1.0.jar
{code}

I will get an error

{code}
$ ./bin/flink run flink-explore-assembly-1.0.jar
Cluster configuration: Standalone cluster with

Re: Machine Learning on Flink - Next steps

2017-03-20 Thread Katherin Eri

Hello, Theodore

Could you please move vectors of development and their prioritized
positions from *## Executive summary* to the google doc?



Could you please also create some table in google doc, that is representing
the selected directions and persons, who would like to drive or participate
in the particular topic, in order to make this process transparent for
community and sum up current state of commitment of contributors?

There we could simply inscribe us to some topic.



And 1+ for CTR prediction case.

вс, 19 мар. 2017 г. в 16:49, Theodore Vasiloudis <
theodoros.vasilou...@gmail.com>:

> Hello Stavros,
>
> The way I thought we'd do it is that each shepherd would be responsible for
> organizing the project: that includes setting up a Google doc, sending an
> email to the dev list to inform the wider community, and if possible,
> personally contacting the people who expressed interest in the project.
>
> Would you be willing to lead that effort for the model serving project?
>
> Regards,
> Theodore
>
> --
> Sent from a mobile device. May contain autocorrect errors.
>
> On Mar 19, 2017 3:49 AM, "Stavros Kontopoulos" 
> wrote:
>
> > Hi all...
> >
> > I agree about the tensorflow integration it seems to be important from
> what
> > I hear.
> > Should we sign up somewhere for the working groups (gdcos)?
> > I would like to start helping with the model serving feature.
> >
> > Best Regards,
> > Stavros
> >
> > On Fri, Mar 17, 2017 at 10:34 PM, Gábor Hermann 
> > wrote:
> >
> > > Hi Chen,
> > >
> > > Thanks for the input! :)
> > >
> > > There is already a project [1] for using TensorFlow models in Flink,
> and
> > > Theodore has suggested
> > > to contact the author, Eron Wright for the model serving direction.
> > >
> > >
> > > [1] http://sf.flink-forward.org/kb_sessions/introducing-flink-
> > tensorflow/
> > >
> > > Cheers,
> > > Gabor
> > >
> > >
> > > On 2017-03-17 19:41, Chen Qin wrote:
> > >
> > >> [1]http://sf.flink-forward.org/kb_sessions/introducing-flink-te
> > >> nsorflow/
> > >>
> > >
> > >
> >
>
-- 

*Yours faithfully, *

*Kate Eri.*

Re: [DISCUSS] Project build time and possible restructuring

2017-03-20 Thread Robert Metzger

Thank you for looking into the build times.

I didn't know that the build time situation is so bad. Even with yarn,
mesos, connectors and libraries removed, we are still running into the
build timeout :(

Aljoscha told me that the Beam community is using Jenkins for running the
tests, and they are planning to completely move away from Travis. I wonder
whether we should do the same, as having our own Jenkins servers would
allow us to run tests for more than 50 minutes.

I agree with Stephan that we should keep the yarn and mesos tests in the
core for stability / testing quality purposes.


On Mon, Mar 20, 2017 at 11:27 AM, Stephan Ewen  wrote:

> @Greg
>
> I am personally in favor of splitting "connectors" and "contrib" out as
> well. I know that @rmetzger has some reservations about the connectors, but
> we may be able to convince him.
>
> For the cluster tests (yarn / mesos) - in the past there were many cases
> where these tests caught cases that other tests did not, because they are
> the only tests that actually use the "flink-dist.jar" and thus discover
> many dependency and configuration issues. For that reason, my feeling would
> be that they are valuable in the core repository.
>
> I would actually suggest to do only the library split initially, to see
> what the challenges are in setting up the multi-repo build and release
> tooling. Once we gathered experience there, we can probably easily see what
> else we can split out.
>
> Stephan
>
>
> On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan  wrote:
>
> > I’d like to use this refactoring opportunity to unspilt the Travis tests.
> > With 51 builds queued up for the weekend (some of which may fail or have
> > been force pushed) we are at the limit of the number of contributions we
> > can process. Fixing this requires 1) splitting the project, 2)
> > investigating speedups for long-running tests, and 3) staying cognizant
> of
> > test performance when accepting new code.
> >
> > I’d like to add one to Stephan’s list of module group. I like that the
> > modules are generic (“libraries”) so that no one module is alone and
> > independent.
> >
> > Flink has three “libraries”: cep, ml, and gelly.
> >
> > “connectors” is a hotspot due to the long-running Kafka tests (and
> > connectors for three Kafka versions).
> >
> > Both flink-storm and flink-python have a modest number of number of tests
> > and could live with the miscellaneous modules in “contrib”.
> >
> > The YARN tests are long-running and problematic (I am unable to
> > successfully run these locally). A “cluster” module could host
> flink-mesos,
> > flink-yarn, and flink-yarn-tests.
> >
> > That gets us close to running all tests in a single Travis build.
> >   https://travis-ci.org/greghogan/flink/builds/212122590 <
> > https://travis-ci.org/greghogan/flink/builds/212122590>
> >
> > I also tested (https://github.com/greghogan/flink/commits/core_build <
> > https://github.com/greghogan/flink/commits/core_build>) with a maven
> > parallelism of 2 and 4, with the latter a 6.4% drop in build time.
> >   https://travis-ci.org/greghogan/flink/builds/212137659 <
> > https://travis-ci.org/greghogan/flink/builds/212137659>
> >   https://travis-ci.org/greghogan/flink/builds/212154470 <
> > https://travis-ci.org/greghogan/flink/builds/212154470>
> >
> > We can run Travis CI builds nightly to guard against breaking changes.
> >
> > I also wanted to get an idea of how disruptive it would be to developers
> > to divide the project into multiple git repos. I wrote a simple python
> > script and configured it with the module partitions listed above. The
> usage
> > string from the top of the file lists commits with files from multiple
> > partitions and well as the modified files.
> >   https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897 <
> > https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897>
> >
> > Accounting for the merging of the batch and streaming connector modules,
> > and assuming that the project structure has not changed much over the
> past
> > 15 months, for the following date ranges the listed number of commits
> would
> > have been split across repositories.
> >
> > since "2017-01-01"
> > 36 of 571 commits were mixed
> >
> > since "2016-07-01"
> > 155 of 1607 commits were mixed
> >
> > since "2016-01-01"
> > 272 of 2561 commits were mixed
> >
> > Greg
> >
> >
> > > On Mar 15, 2017, at 1:13 PM, Stephan Ewen  wrote:
> > >
> > > @Robert - I think once we know that a separate git repo works well, and
> > > that it actually solves problems, I see no reason to not create a
> > > connectors repository later. The infrastructure changes should be
> > identical
> > > for two or more repositories.
> > >
> > > On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann 
> > wrote:
> > >
> > >> I think it should not be at least the flink-dist but exactly the
> > remaining
> > >> flink-dist module. Otherwise we do

Re: [DISCUSS] deprecated function need more detail

There is a related checkstyle rule:
http://checkstyle.sourceforge.net/apidocs/com/puppycrawl/tools/checkstyle/checks/annotation/MissingDeprecatedCheck.html

Added a JIRA for adding it here:
https://issues.apache.org/jira/browse/FLINK-6127

We actually wrote this down in our hidden Wiki at
https://cwiki.apache.org/confluence/display/FLINK/Best+Practices+and+Lessons+Learned

> - Always add comments when you deprecate something
> * Add a @Deprecated annotation to the JavaDocs explaining why it was 
> deprecated (removed, replaced, etc.)
> * Create issues with target release version for deleting deprecated interfaces
> * 
> https://docs.oracle.com/javase/1.5.0/docs/guide/javadoc/deprecation/deprecation.html


On Mon, Mar 20, 2017 at 11:33 AM, Stephan Ewen  wrote:
> +1
>
> I think we actually had the same discussion already a while back. Let's
> bring it back to everyone's awareness!
>
>
>
> On Wed, Nov 23, 2016 at 12:09 PM, Paris Carbone  wrote:
>
>> +1
>>
>> This should always be the norm, especially for user-facing code.
>>
>> While we are at it, perhaps when someone deprecates functionality the new
>> alternative should also be replaced right away.
>> E.g. Checkpointed is deprecated but all state management tests are
>> actually using this alternative.
>>
>> cheers
>> Paris
>>
>>
>> > On 23 Nov 2016, at 11:21, Kostas Kloudas 
>> wrote:
>> >
>> > +1 and we should apply the same to all deprecated interfaces/abstract
>> classes.
>> >
>> >> On Nov 23, 2016, at 11:13 AM, Aljoscha Krettek 
>> wrote:
>> >>
>> >> +1 That sounds excellent.
>> >>
>> >> On Wed, 23 Nov 2016 at 11:04 Till Rohrmann 
>> wrote:
>> >>
>> >>> +1 for your proposal.
>> >>>
>> >>> Cheers,
>> >>> Till
>> >>>
>> >>> On Wed, Nov 23, 2016 at 9:33 AM, Fabian Hueske 
>> wrote:
>> >>>
>>  I agree on this one.
>>  Whenever we deprecate a method or a feature we should add a comment
>> that
>>  explains the new API or why the feature was removed without
>> replacement.
>> 
>>  Enforcing this information through checkstyle makes sense as well,
>> IMO.
>> 
>>  Cheers, Fabian
>> 
>>  2016-11-23 4:42 GMT+01:00 sjk :
>> 
>> > Hi, all
>> >
>> > Let’s have look at Checkpointed interface below. It declared
>> deprecated
>> > but have no detail for why, when and how replace this function. It’s
>> a
>>  big
>> > trouble for the users.
>> >
>> > @Deprecated
>> > @PublicEvolving
>> > public interface Checkpointed extends
>> > CheckpointedRestoring {
>> >
>> >
>> > I think we should have more detail: when give up, who replace it, why
>> > deprecated.
>> >
>> > For Java code, add detail  deprecated reason in code annotations.
>> > For Scala code, replace Java annotation  @Deprecated(,,) with Scala
>> > annotation @deprecated, such as
>> > @deprecated(message = "the reason", since = "when fully give up”)
>> >
>> > Add this rule to customized checkstyle plugin of maven and SBT.
>> >
>> > Best regard
>> > -Jinkui Shi
>> 
>> >>>
>> >
>>
>>

[jira] [Created] (FLINK-6127) Add MissingDeprecatedCheck to checkstyle

2017-03-20 Thread Ufuk Celebi (JIRA)

Ufuk Celebi created FLINK-6127:
--

 Summary: Add MissingDeprecatedCheck to checkstyle
 Key: FLINK-6127
 URL: https://issues.apache.org/jira/browse/FLINK-6127
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Reporter: Ufuk Celebi
Priority: Minor


We should add the MissingDeprecatedCheck to our checkstyle rules to help 
avoiding deprecations without JavaDocs mentioning why the deprecation happened.

http://checkstyle.sourceforge.net/apidocs/com/puppycrawl/tools/checkstyle/checks/annotation/MissingDeprecatedCheck.html




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Created] (FLINK-6126) Yet another conflict

2017-03-20 Thread Su Ralph (JIRA)

Su Ralph created FLINK-6126:
---

 Summary: Yet another conflict
 Key: FLINK-6126
 URL: https://issues.apache.org/jira/browse/FLINK-6126
 Project: Flink
  Issue Type: Bug
  Components: Build System, Local Runtime
Affects Versions: 1.2.0
 Environment: Latest SNAPSHOT
Reporter: Su Ralph


For some reason I need to use org.apache.httpcomponents:httpasyncclient:4.1.2 
in flink.
The source file is:
{code}
import org.apache.flink.streaming.api.scala._
import org.apache.http.impl.nio.conn.ManagedNHttpClientConnectionFactory

/**
  * Created by renkai on 16/9/7.
  */
object Main {
  def main(args: Array[String]): Unit = {
val instance = ManagedNHttpClientConnectionFactory.INSTANCE
println("instance = " + instance)

val env = StreamExecutionEnvironment.getExecutionEnvironment
val stream = env.fromCollection(1 to 100)
val result = stream.map { x =>
  x * 2
}
result.print()
env.execute("xixi")
  }
}

{code}

and 
{code}
name := "flink-explore"

version := "1.0"

scalaVersion := "2.11.8"

crossPaths := false

libraryDependencies ++= Seq(
  "org.apache.flink" %% "flink-scala" % "1.2-SNAPSHOT"
exclude("com.google.code.findbugs", "jsr305"),
  "org.apache.flink" %% "flink-connector-kafka-0.8" % "1.2-SNAPSHOT"
exclude("com.google.code.findbugs", "jsr305"),
  "org.apache.flink" %% "flink-streaming-scala" % "1.2-SNAPSHOT"
exclude("com.google.code.findbugs", "jsr305"),
  "org.apache.flink" %% "flink-clients" % "1.2-SNAPSHOT"
exclude("com.google.code.findbugs", "jsr305"),
  "org.apache.httpcomponents" % "httpasyncclient" % "4.1.2"
)
{code}
I use `sbt assembly` to get a fat jar.

If I run the command
{code}
 java -cp flink-explore-assembly-1.0.jar Main
{code}
I get the result 

{code}
instance = 
org.apache.http.impl.nio.conn.ManagedNHttpClientConnectionFactory@4909b8da
log4j:WARN No appenders could be found for logger 
(org.apache.flink.api.scala.ClosureCleaner$).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
info.
Connected to JobManager at Actor[akka://flink/user/jobmanager_1#-1177584915]
09/07/2016 12:05:26 Job execution switched to status RUNNING.
09/07/2016 12:05:26 Source: Collection Source(1/1) switched to SCHEDULED
09/07/2016 12:05:26 Source: Collection Source(1/1) switched to DEPLOYING
...
09/07/2016 12:05:26 Map -> Sink: Unnamed(20/24) switched to RUNNING
09/07/2016 12:05:26 Map -> Sink: Unnamed(19/24) switched to RUNNING
15> 30
20> 184
...
19> 182
1> 194
8> 160
09/07/2016 12:05:26 Source: Collection Source(1/1) switched to FINISHED
...
09/07/2016 12:05:26 Map -> Sink: Unnamed(1/24) switched to FINISHED
09/07/2016 12:05:26 Job execution switched to status FINISHED.
{code}

Nothing special.

But if I run the jar by
{code}
./bin/flink run shop-monitor-flink-assembly-1.0.jar
{code}

I will get an error

{code}
$ ./bin/flink run flink-explore-assembly-1.0.jar
Cluster configuration: Standalone cluster with JobManager at /127.0.0.1:6123
Using address 127.0.0.1:6123 to connect to JobManager.
JobManager web interface address http://127.0.0.1:8081
Starting execution of program


 The program finished with the following exception:

java.lang.NoSuchFieldError: INSTANCE
at 
org.apache.http.impl.nio.codecs.DefaultHttpRequestWriterFactory.(DefaultHttpRequestWriterFactory.java:53)
at 
org.apache.http.impl.nio.codecs.DefaultHttpRequestWriterFactory.(DefaultHttpRequestWriterFactory.java:57)
at 
org.apache.http.impl.nio.codecs.DefaultHttpRequestWriterFactory.(DefaultHttpRequestWriterFactory.java:47)
at 
org.apache.http.impl.nio.conn.ManagedNHttpClientConnectionFactory.(ManagedNHttpClientConnectionFactory.java:75)
at 
org.apache.http.impl.nio.conn.ManagedNHttpClientConnectionFactory.(ManagedNHttpClientConnectionFactory.java:83)
at 
org.apache.http.impl.nio.conn.ManagedNHttpClientConnectionFactory.(ManagedNHttpClientConnectionFactory.java:64)
at Main$.main(Main.scala:9)
at Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:509)
at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:403)
at 
org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:322)
at 
org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:774)
at

Re: [DISCUSS] deprecated function need more detail

2017-03-20 Thread Stephan Ewen

+1

I think we actually had the same discussion already a while back. Let's
bring it back to everyone's awareness!



On Wed, Nov 23, 2016 at 12:09 PM, Paris Carbone  wrote:

> +1
>
> This should always be the norm, especially for user-facing code.
>
> While we are at it, perhaps when someone deprecates functionality the new
> alternative should also be replaced right away.
> E.g. Checkpointed is deprecated but all state management tests are
> actually using this alternative.
>
> cheers
> Paris
>
>
> > On 23 Nov 2016, at 11:21, Kostas Kloudas 
> wrote:
> >
> > +1 and we should apply the same to all deprecated interfaces/abstract
> classes.
> >
> >> On Nov 23, 2016, at 11:13 AM, Aljoscha Krettek 
> wrote:
> >>
> >> +1 That sounds excellent.
> >>
> >> On Wed, 23 Nov 2016 at 11:04 Till Rohrmann 
> wrote:
> >>
> >>> +1 for your proposal.
> >>>
> >>> Cheers,
> >>> Till
> >>>
> >>> On Wed, Nov 23, 2016 at 9:33 AM, Fabian Hueske 
> wrote:
> >>>
>  I agree on this one.
>  Whenever we deprecate a method or a feature we should add a comment
> that
>  explains the new API or why the feature was removed without
> replacement.
> 
>  Enforcing this information through checkstyle makes sense as well,
> IMO.
> 
>  Cheers, Fabian
> 
>  2016-11-23 4:42 GMT+01:00 sjk :
> 
> > Hi, all
> >
> > Let’s have look at Checkpointed interface below. It declared
> deprecated
> > but have no detail for why, when and how replace this function. It’s
> a
>  big
> > trouble for the users.
> >
> > @Deprecated
> > @PublicEvolving
> > public interface Checkpointed extends
> > CheckpointedRestoring {
> >
> >
> > I think we should have more detail: when give up, who replace it, why
> > deprecated.
> >
> > For Java code, add detail  deprecated reason in code annotations.
> > For Scala code, replace Java annotation  @Deprecated(,,) with Scala
> > annotation @deprecated, such as
> > @deprecated(message = "the reason", since = "when fully give up”)
> >
> > Add this rule to customized checkstyle plugin of maven and SBT.
> >
> > Best regard
> > -Jinkui Shi
> 
> >>>
> >
>
>

Re: [DISCUSS] Project build time and possible restructuring

2017-03-20 Thread Stephan Ewen

@Greg

I am personally in favor of splitting "connectors" and "contrib" out as
well. I know that @rmetzger has some reservations about the connectors, but
we may be able to convince him.

For the cluster tests (yarn / mesos) - in the past there were many cases
where these tests caught cases that other tests did not, because they are
the only tests that actually use the "flink-dist.jar" and thus discover
many dependency and configuration issues. For that reason, my feeling would
be that they are valuable in the core repository.

I would actually suggest to do only the library split initially, to see
what the challenges are in setting up the multi-repo build and release
tooling. Once we gathered experience there, we can probably easily see what
else we can split out.

Stephan


On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan  wrote:

> I’d like to use this refactoring opportunity to unspilt the Travis tests.
> With 51 builds queued up for the weekend (some of which may fail or have
> been force pushed) we are at the limit of the number of contributions we
> can process. Fixing this requires 1) splitting the project, 2)
> investigating speedups for long-running tests, and 3) staying cognizant of
> test performance when accepting new code.
>
> I’d like to add one to Stephan’s list of module group. I like that the
> modules are generic (“libraries”) so that no one module is alone and
> independent.
>
> Flink has three “libraries”: cep, ml, and gelly.
>
> “connectors” is a hotspot due to the long-running Kafka tests (and
> connectors for three Kafka versions).
>
> Both flink-storm and flink-python have a modest number of number of tests
> and could live with the miscellaneous modules in “contrib”.
>
> The YARN tests are long-running and problematic (I am unable to
> successfully run these locally). A “cluster” module could host flink-mesos,
> flink-yarn, and flink-yarn-tests.
>
> That gets us close to running all tests in a single Travis build.
>   https://travis-ci.org/greghogan/flink/builds/212122590 <
> https://travis-ci.org/greghogan/flink/builds/212122590>
>
> I also tested (https://github.com/greghogan/flink/commits/core_build <
> https://github.com/greghogan/flink/commits/core_build>) with a maven
> parallelism of 2 and 4, with the latter a 6.4% drop in build time.
>   https://travis-ci.org/greghogan/flink/builds/212137659 <
> https://travis-ci.org/greghogan/flink/builds/212137659>
>   https://travis-ci.org/greghogan/flink/builds/212154470 <
> https://travis-ci.org/greghogan/flink/builds/212154470>
>
> We can run Travis CI builds nightly to guard against breaking changes.
>
> I also wanted to get an idea of how disruptive it would be to developers
> to divide the project into multiple git repos. I wrote a simple python
> script and configured it with the module partitions listed above. The usage
> string from the top of the file lists commits with files from multiple
> partitions and well as the modified files.
>   https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897 <
> https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897>
>
> Accounting for the merging of the batch and streaming connector modules,
> and assuming that the project structure has not changed much over the past
> 15 months, for the following date ranges the listed number of commits would
> have been split across repositories.
>
> since "2017-01-01"
> 36 of 571 commits were mixed
>
> since "2016-07-01"
> 155 of 1607 commits were mixed
>
> since "2016-01-01"
> 272 of 2561 commits were mixed
>
> Greg
>
>
> > On Mar 15, 2017, at 1:13 PM, Stephan Ewen  wrote:
> >
> > @Robert - I think once we know that a separate git repo works well, and
> > that it actually solves problems, I see no reason to not create a
> > connectors repository later. The infrastructure changes should be
> identical
> > for two or more repositories.
> >
> > On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann 
> wrote:
> >
> >> I think it should not be at least the flink-dist but exactly the
> remaining
> >> flink-dist module. Otherwise we do redundant work.
> >>
> >> On Wed, Mar 15, 2017 at 5:03 PM, Robert Metzger 
> >> wrote:
> >>
> >>> "flink-core" means the main repository, not the "flink-core" module.
> >>>
> >>> When doing a release, we need to build the flink main code first,
> because
> >>> the flink-libraries depend on that.
> >>> Once the "flink-libraries" are build, we need to run the main build
> again
> >>> (at least the flink-dist module), so that it is pulling the artifacts
> >> from
> >>> the flink-libraries to put them into the opt/ folder of the final
> >> artifact.
> >>>
> >>>
> >>>
> >>>
> >>> On Wed, Mar 15, 2017 at 4:44 PM, Till Rohrmann 
> >>> wrote:
> >>>
>  I'm ok with point 3.
> 
>  Concerning point 8: Why do we have to build flink-core twice after
> >> having
>  it built as a dependency for flink-libraries? This

RE: [DISCUSS] Could we Improve tests time and stability?

2017-03-20 Thread Dmytro Shkvyra

Greg, 
The tests falling with time out randomly.
If you explore build history of apache/flink travis you can see this
Оption - UseGCOverheadLimit in args of test args can hide problem with OOM, so 
we need remove it and add heap size in any case.
If talk about GC algorithms - Evgeny Kincharov already done tests when he have 
compared GC algorithms -  https://travis-ci.org/apache/flink/builds/212070460 
His conclusion same to my - Using UseSerialGC  for tests is the best solution.

-Original Message-
From: Greg Hogan [mailto:c...@greghogan.com] 
Sent: Friday, March 17, 2017 8:06 PM
To: dev@flink.apache.org
Subject: Re: [DISCUSS] Could we Improve tests time and stability?

Dmytro,

This is a good idea and a nice speedup, though I notice that nearly half of the 
speedup (1104s of 2461s) is from job 7 which appears to have hung and timed out 
in the initial run.

Could you test the two changes in isolation (increased maximum memory and 
garbage collector)? If the speedup is due to one or the other then we can make 
only that change.

Greg


> On Mar 17, 2017, at 9:16 AM, Dmytro Shkvyra  wrote:
> 
> Hi Robert,
> 
> I have tried my proposal on my travis and accelerated my build in 9.4% 
> (see results below)
> 
> # Test|Present JVM options|"-Xms256m -Xmx1536m -XX:+UseSerialGC"
> - 
> 
> 1 35 min  53 sec  |35 min 35 sec
> 2 38 min  49 sec  |34 min  18 sec
> 3 35 min  34 sec  |29 min  38 sec
> 4 34 min  38 sec  |31 min  14 sec
> 5 35 min  41 sec  |35 min  11 sec
> 6 36 min  41 sec  |33 min  52 sec
> 7 49 min  59 sec  |31 min  35 sec
> 8 37 min  0 sec   |36 min  20 sec
> 9 32 min  28 sec  |31 min  48 sec
> 1038 min  25 sec  |33 min  28 sec
> 1136 min  19 sec  |38 min  24 sec
> 1225 min  0 sec   |24 min  3 sec
> --
> Total 26187 sec   |23726  sec
> --
> Acceleration  |9.40%  
> 
> I think almost 10% is good enough.
> 
> -Original Message-
> From: Robert Metzger [mailto:rmetz...@apache.org]
> Sent: Thursday, March 16, 2017 6:26 PM
> To: dev@flink.apache.org
> Subject: Re: [DISCUSS] Could we Improve tests time and stability?
> 
> Hi Dmytro,
> 
> I'm happy to hear that you are trying to help us improving the test time 
> situation :) We have another discussion here on the dev@ list to split the 
> project into two git repositories to resolve the problem.
> 
> I agree that your proposed changes could improve the build times, but I'm not 
> sure if they are enough to resolve them forever. Some tests just waste time 
> by waiting on stuff to happen :) If you want, you can enable travis for your 
> own Flink fork on GitHub, add your proposed changes to the travis / maven 
> files and see how much they improve the build time.
> 
> 
> On Thu, Mar 16, 2017 at 5:06 PM, Dmytro Shkvyra 
> 
> wrote:
> 
>> Hi everyone,
>> May be we should remove -XX:-UseGCOverheadLimit option from 
>> maven-surefire-plugin args and increase -Xmx to 1536m for forks?
>> We have about 4 GB RAM and 2 cores at test VMs. I think we can make 
>> test faster than now. When I tried testing flink-runtime some tests 
>> work too slow due to GC overhead.
>> May be you also faced to problem when Travis build was fallen by timeout?
>> Also we can use GC algorithms explicitly for forks execution.
>> BTW, we run tests with java 7 and 8 and these versions use by default 
>> different GC algorithms (GC1 for 8 and Parallel GC for 7). IMHO when 
>> we have strict limitations of RAM and time of build we should avoid 
>> any ambiguity.
>> In case when some tests can generate very big datasets very fast, 
>> paralel GC can do not have time to clean up. I do not know how G1 
>> work in this case exactly, but may be would better use old good 
>> -XX:+UseSerialGC. We have only 1 core per fork so we anyway cant use 
>> all advantages of G1 and ParralelGC. If we use SerialGC (use stop the
>> world) we can be sure that GC collect almost all garbage before test 
>> continue.
>> What do you think about my idea?
>> May be someone has another ideas how to improve tests time and stability?
>> 
>> 
>> Dmytro Shkvyra
>> Senior Software Engineer
>> 
>> Office: +380 44 390 5457 x 65346
>> Cell: +380 50 357 6828   Email:
>> dmytro_shkv...@epam.com
>> Kyiv, Ukraine (GMT+3)   epam.com
>> 
>> CONFIDENTIALITY CAUTION AND DISCLAIMER This message is intended only 
>> for the use of the individual(s) or
>> entity(ies) to which it is addressed and contains information that is 
>> legally privileged and confidential. If you are not the

Re: [DISCUSS] Table API / SQL indicators for event and processing time