Re: refactor StreamConfig

2017-07-04 Thread Xu Pingyong
Hi Aljoscha:


Let me sum up my thoughts now:
1. Rename StreamConfig to StreamTaskConfig.
2. OperatorConfig can be changed to be serialisable. If StreamTaskConfig is 
also serialisable, it cannot be deserialized when it is passed to the 
jobManager, which does not depend on "flink-streaming-java".
3. The call getChainIndex() is used only in OperatorConfig.toString(), so it 
can be removed. However, isChainStart() and isChainEnd() are used in 
AbstractStreamOperator.setup(...).

However, I am not sure whether to put some properties in StreamTaskConfig or 
OperatorConfig. For example, the input serializer is used not only in the 
operator but also in the OperatorChain. Likewise, output edges and serialisers 
are only used in the OperatorChain now, but might the operator need to see and 
use them later? The properties in question are:
2)  streamOperator
4)  output edges and serializers.

   What do you think?


Best Regards!
Xu Pingyong
   








At 2017-07-05 11:02:56, "Xu Pingyong"  wrote:
>Hi Aljoscha:
>
>
>Yes, I agree with you that an operator should not see output edges and 
>serialisers. The call getChainIndex() is used only in 
>OperatorConfig.toString(), so it can be removed. However, isChainStart() and 
>isChainEnd() are used in AbstractStreamOperator.setup(...).
>
>
>But I think what Stephan meant was only changing OperatorConfig to be 
>serialisable. If StreamConfig is also made serialisable, it needs to be 
>serialized into the Configuration, which underlies it today and flows across modules.
>
>
>Do you agree with my understanding?
>
>
>Best Regards!
>
>
>Xu Pingyong
>  
>
>
>
>
>
>
>At 2017-07-05 00:01:34, "Aljoscha Krettek"  wrote:
>>Hi,
>>
>>Yes, but I think what Stephan was hinting at was to change both of them to be 
>>serialisable when already working on this.
>>
>>I think input serialiser is fine to have in OperatorConfig, you’re right! I 
>>don’t see getChainIndex() used anywhere in the code, though. And the output 
>>edges and serialisers also look like they should not be visible to the 
>>operator.
>>
>>What do you think?
>>
>>Best,
>>Aljoscha
>>
>>> On 4. Jul 2017, at 17:52, xu  wrote:
>>> 
>>> Hi Aljoscha:
>>>Thanks a lot for your advice.
>>> 
>>> 
>>>I think I do not need separate steps, because all I am doing is 
>>> introducing OperatorConfig and moving the fields. StreamConfig still 
>>> relies on an underlying Configuration which flows from the client to the 
>>> jobmanager and then to the task.
>>> 
>>> 
>>>The following configs are used in an operator now:
>>>2) input serializer is used in AsyncWaitOperator.class
>>>5) chain.index is used in AbstractStreamOperator.setup(...)
>>> 
>>> 
>>>However, what I put in the OperatorConfig is all the configs that belong 
>>> to the operator: not only those the operator uses now, but also those the 
>>> streamTask uses to build an operator. With OperatorConfig, an operator 
>>> cannot see configs that belong to others.
>>> 
>>> 
>>>   Best Regards!
>>>   JiPing
>>


Re: refactor StreamConfig

2017-07-04 Thread Xu Pingyong
Hi Aljoscha:


Yes, I agree with you that an operator should not see output edges and 
serialisers. The call getChainIndex() is used only in 
OperatorConfig.toString(), so it can be removed. However, isChainStart() and 
isChainEnd() are used in AbstractStreamOperator.setup(...).


But I think what Stephan meant was only changing OperatorConfig to be 
serialisable. If StreamConfig is also made serialisable, it needs to be 
serialized into the Configuration, which underlies it today and flows across modules.


Do you agree with my understanding?


Best Regards!


Xu Pingyong
  






At 2017-07-05 00:01:34, "Aljoscha Krettek"  wrote:
>Hi,
>
>Yes, but I think what Stephan was hinting at was to change both of them to be 
>serialisable when already working on this.
>
>I think input serialiser is fine to have in OperatorConfig, you’re right! I 
>don’t see getChainIndex() used anywhere in the code, though. And the output 
>edges and serialisers also look like they should not be visible to the 
>operator.
>
>What do you think?
>
>Best,
>Aljoscha
>
>> On 4. Jul 2017, at 17:52, xu  wrote:
>> 
>> Hi Aljoscha:
>>Thanks a lot for your advice.
>> 
>> 
>>I think I do not need separate steps, because all I am doing is 
>> introducing OperatorConfig and moving the fields. StreamConfig still relies 
>> on an underlying Configuration which flows from the client to the jobmanager 
>> and then to the task.
>> 
>> 
>>The following configs are used in an operator now:
>>2) input serializer is used in AsyncWaitOperator.class
>>5) chain.index is used in AbstractStreamOperator.setup(...)
>> 
>> 
>>However, what I put in the OperatorConfig is all the configs that belong to 
>> the operator: not only those the operator uses now, but also those the 
>> streamTask uses to build an operator. With OperatorConfig, an operator 
>> cannot see configs that belong to others.
>> 
>> 
>>   Best Regards!
>>   JiPing
>


Re: [DISCUSS] FLIP-22: Eager State Declaration

2017-07-04 Thread SHI Xiaogang
Hi Tzu-Li,

Thanks for the proposal. The changes are great. I have several questions
about some details.

First, do you have any plan to provide a method to remove states? Right now
states can only be created (either lazily or eagerly), but cannot be
removed. We cannot remove states that are not registered, because they may be
accessed later (via the deprecated methods).

Second, what about exposing namespaces to users? Now namespaces are only
used in window streams, and all user states are in the void namespace. But
some users may come across scenarios similar to window streams, where states
are closely related to arriving records and cannot be known beforehand.
Since namespaces are not exposed, they have to create new states when new
records arrive. MapState is another choice, but it will be less efficient in
some cases. If we can expose namespaces to users, these users may benefit
from eagerly declared states. I think the change would not break existing
interfaces.
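
For illustration, here is a minimal sketch of the MapState alternative
mentioned above, where one eagerly declared MapState emulates many dynamic
sub-states by using the map key as a runtime-discovered "namespace"; all
names are illustrative only, not part of the proposal:

import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

public class DynamicCounts extends RichFlatMapFunction<String, String> {

    private transient MapState<String, Long> countsPerCategory;

    @Override
    public void open(Configuration parameters) {
        // One eager declaration covers all categories seen at runtime.
        countsPerCategory = getRuntimeContext().getMapState(
                new MapStateDescriptor<>("counts", String.class, Long.class));
    }

    @Override
    public void flatMap(String category, Collector<String> out) throws Exception {
        // The map key plays the role of a per-record "namespace".
        Long current = countsPerCategory.get(category);
        long updated = (current == null ? 0L : current) + 1;
        countsPerCategory.put(category, updated);
        out.collect(category + ": " + updated);
    }
}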

Looking forward to your comments.

Regards,
Xiaogang





2017-07-05 3:53 GMT+08:00 Chesnay Schepler :

> Could you add an example to the FLIP for how a user can register a state
> with the methods in the RichFunction interface?
> Currently it only contains an example for the annotation option.
>
> These methods look like they are called by the user, but that doesn't
> really make sense to me as after all the user has to
> implement them.
>
> To me a more intuitive signature would be
>
> |void registerKeyedState(StateDescriptorRegistry registry);|
>
> that is called by the system when a UDF is provided by a user who then
> registers all the state descriptors he has.
>
>
> On 04.07.2017 20:00, Tzu-Li (Gordon) Tai wrote:
>
>> Hi Flink devs!
>>
>> I would like to propose the following FLIP - Eager State Declaration for
>> Flink managed state: https://cwiki.apache.org/confl
>> uence/display/FLINK/FLIP-22%3A+Eager+State+Declaration.
>> The proposal is a result of some offline discussions with Aljoscha
>> Krettek, Stephan Ewen, and Stefan Richter.
>>
>> With how the current managed state declaration interfaces work, users may
>> declare state lazily while jobs are running.
>> This behavior is a direct blocker for several state management features
>> we wish to make a reality in the future.
>> I also see it as an opportunity to make the interfaces for keyed /
>> operator managed state declarations more unified at the API level, as well
>> as to improve the user experience for general use cases.
>>
>> The most important part of the required changes is the deprecation of
>> existing APIs and introducing new state declaration interfaces.
>> Since this would be a rework of the state interfaces, it would be great
>> to hear thoughts on this and make sure that the proposal is what we want in
>> the long run!
>>
>> Happy to hear feedback on this :)
>>
>> Cheers,
>> Gordon
>>
>
>
>


Re: [DISCUSS] FLIP-22: Eager State Declaration

2017-07-04 Thread wenlong.lwl
Hi all, we have jobs which create state according to the type of the key and a
dynamic configuration:

e.g.: key_type_1's aggregation function is average, while key_type_2's is sum.

We need to create state dynamically because the aggregation function may
change at runtime, and different aggregation functions may need different
state to persist their data. It is really hard to declare such state eagerly.
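
As a concrete sketch of this pattern (with an assumed lookupAggregation
placeholder standing in for the dynamic configuration, and all other names
illustrative), the state descriptor is only chosen once records arrive:

import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

// Applied on a keyed stream of (keyType, value) records.
public class DynamicAgg extends RichFlatMapFunction<Tuple2<String, Integer>, String> {

    private transient ValueState<Integer> sumAcc;
    private transient ValueState<Tuple2<Integer, Integer>> avgAcc; // (sum, count)

    @Override
    public void flatMap(Tuple2<String, Integer> record, Collector<String> out) throws Exception {
        if ("sum".equals(lookupAggregation(record.f0))) {
            if (sumAcc == null) {
                // Lazy declaration: only now do we know this key type needs a sum.
                sumAcc = getRuntimeContext().getState(
                        new ValueStateDescriptor<>("sum", Integer.class));
            }
            int updated = (sumAcc.value() == null ? 0 : sumAcc.value()) + record.f1;
            sumAcc.update(updated);
            out.collect(record.f0 + " sum=" + updated);
        } else {
            if (avgAcc == null) {
                avgAcc = getRuntimeContext().getState(new ValueStateDescriptor<>(
                        "avg", TypeInformation.of(new TypeHint<Tuple2<Integer, Integer>>() {})));
            }
            Tuple2<Integer, Integer> acc = avgAcc.value();
            acc = acc == null ? Tuple2.of(record.f1, 1) : Tuple2.of(acc.f0 + record.f1, acc.f1 + 1);
            avgAcc.update(acc);
            out.collect(record.f0 + " avg=" + ((double) acc.f0 / acc.f1));
        }
    }

    // Placeholder: in the real job this comes from a dynamic config
    // and may change while the job is running.
    private String lookupAggregation(String keyType) {
        return keyType.startsWith("key_type_1") ? "avg" : "sum";
    }
}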

In the FLIP, I think the main motivation for the eager declaration of
state is to make sure that all states are registered when restoring.
How about just persisting the state descriptor in the state handle and
automatically registering states when restoring?

On 5 July 2017 at 03:53, Chesnay Schepler  wrote:

> Could you add an example to the FLIP for how a user can register a state
> with the methods in the RichFunction interface?
> Currently it only contains an example for the annotation option.
>
> These methods look like they are called by the user, but that doesn't
> really make sense to me as after all the user has to
> implement them.
>
> To me a more intuitive signature would be
>
> |void registerKeyedState(StateDescriptorRegistry registry);|
>
> that is called by the system when a UDF is provided by a user who then
> registers all the state descriptors he has.
>
>
> On 04.07.2017 20:00, Tzu-Li (Gordon) Tai wrote:
>
>> Hi Flink devs!
>>
>> I would like to propose the following FLIP - Eager State Declaration for
>> Flink managed state: https://cwiki.apache.org/confl
>> uence/display/FLINK/FLIP-22%3A+Eager+State+Declaration.
>> The proposal is a result of some offline discussions with Aljoscha
>> Krettek, Stephan Ewen, and Stefan Richter.
>>
>> With how the current managed state declaration interfaces work, users may
>> declare state lazily while jobs are running.
>> This behavior is a direct blocker for several state management features
>> we wish to make a reality in the future.
>> I also see it as an opportunity to make the interfaces for keyed /
>> operator managed state declarations more unified at the API level, as well
>> as to improve the user experience for general use cases.
>>
>> The most important part of the required changes is the deprecation of
>> existing APIs and introducing new state declaration interfaces.
>> Since this would be a rework of the state interfaces, it would be great
>> to hear thoughts on this and make sure that the proposal is what we want in
>> the long run!
>>
>> Happy to hear feedback on this :)
>>
>> Cheers,
>> Gordon
>>
>
>
>


[jira] [Created] (FLINK-7104) AccessDeniedException get when FileUtilsTest#testDeleteDirectoryConcurrently call

2017-07-04 Thread mingleizhang (JIRA)
mingleizhang created FLINK-7104:
---

 Summary: AccessDeniedException get when 
FileUtilsTest#testDeleteDirectoryConcurrently call
 Key: FLINK-7104
 URL: https://issues.apache.org/jira/browse/FLINK-7104
 Project: Flink
  Issue Type: Bug
  Components: Core
Reporter: mingleizhang
Priority: Critical






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [DISCUSS] FLIP-22: Eager State Declaration

2017-07-04 Thread Chesnay Schepler
Could you add an example to the FLIP for how a user can register a state 
with the methods in the RichFunction interface?

Currently it only contains an example for the annotation option.

These methods look like they are called by the user, but that doesn't 
really make sense to me, as after all the user has to implement them.

To me a more intuitive signature would be

|void registerKeyedState(StateDescriptorRegistry registry);|

that is called by the system when a UDF is provided by a user who then 
registers all the state descriptors he has.
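
To make this concrete, here is a minimal sketch of such a system-invoked 
registration hook; StateDescriptorRegistry and registerKeyedState are 
hypothetical names from this discussion, not an existing Flink API:

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.common.state.StateDescriptor;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;

// Hypothetical registry handed to the UDF by the system before the job
// starts, so that all state is declared eagerly.
interface StateDescriptorRegistry {
    void register(StateDescriptor<?, ?> descriptor);
}

public class CountingMapper extends RichMapFunction<String, String> {

    private final ValueStateDescriptor<Long> countDescriptor =
            new ValueStateDescriptor<>("count", Long.class);

    // Called by the system, not by user code: the UDF only declares
    // which state it is going to use.
    public void registerKeyedState(StateDescriptorRegistry registry) {
        registry.register(countDescriptor);
    }

    @Override
    public String map(String value) throws Exception {
        ValueState<Long> count = getRuntimeContext().getState(countDescriptor);
        Long current = count.value();
        count.update(current == null ? 1L : current + 1);
        return value;
    }
}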


On 04.07.2017 20:00, Tzu-Li (Gordon) Tai wrote:

Hi Flink devs!

I would like to propose the following FLIP - Eager State Declaration for Flink 
managed state: 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-22%3A+Eager+State+Declaration.
The proposal is a result of some offline discussions with Aljoscha Krettek, 
Stephan Ewen, and Stefan Richter.

With how the current managed state declaration interfaces work, users may 
declare state lazily while jobs are running.
This behavior is a direct blocker for several state management features we wish 
to make a reality in the future.
I also see it as an opportunity to make the interfaces for keyed / operator 
managed state declarations more unified at the API level, as well as to improve 
the user experience for general use cases.

The most important part of the required changes is the deprecation of existing 
APIs and introducing new state declaration interfaces.
Since this would be a rework of the state interfaces, it would be great to hear 
thoughts on this and make sure that the proposal is what we want in the long 
run!

Happy to hear feedback on this :)

Cheers,
Gordon





[DISCUSS] FLIP-22: Eager State Declaration

2017-07-04 Thread Tzu-Li (Gordon) Tai
Hi Flink devs!

I would like to propose the following FLIP - Eager State Declaration for Flink 
managed state: 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-22%3A+Eager+State+Declaration.
The proposal is a result of some offline discussions with Aljoscha Krettek, 
Stephan Ewen, and Stefan Richter.

With how the current managed state declaration interfaces work, users may 
declare state lazily while jobs are running.
This behavior is a direct blocker for several state management features we wish 
to make a reality in the future.
I also see it as an opportunity to make the interfaces for keyed / operator 
managed state declarations more unified at the API level, as well as to improve 
the user experience for general use cases.

The most important part of the required changes is the deprecation of existing 
APIs and introducing new state declaration interfaces.
Since this would be a rework of the state interfaces, it would be great to hear 
thoughts on this and make sure that the proposal is what we want in the long 
run!

Happy to hear feedback on this :)

Cheers,
Gordon

Re: [DISCUSS] FLIP proposal for Model Serving over Flink

2017-07-04 Thread Andrea Spina
Hi all,
yes, we did too. We - from Radicalbit - have submitted a talk focused
on the recently released flink-jpmml library for model serving.
Lately, it has become part of the FlinkML project.
Cheers, Andrea

2017-07-04 16:14 GMT+02:00 Boris Lublinsky :
> Yes,
> I submitted a talk with Stavros on model serving
>
>
> Boris Lublinsky
> FDP Architect
> boris.lublin...@lightbend.com
> https://www.lightbend.com/
>
> On Jul 3, 2017, at 1:18 PM, Robert Metzger  wrote:
>
> Big +1 from my side on getting this effort started.
>
> Users have asked for this and I would like to see some progress there.
> Did anybody submit a talk about the ML efforts to Flink Forward Berlin this
> year?
>
> On Fri, Jun 30, 2017 at 6:04 PM, Fabian Hueske  wrote:
>>
>> Yes, I know that Theo is engaged in the ML efforts but wasn't sure how
>> much
>> he is involved in the model serving part (thought he was more into the
>> online learning part).
>> It would be great if Theo could help here!
>>
>> I just wanted to make sure that we find somebody to help bootstrapping.
>>
>> Cheers, Fabian
>>
>>
>> 2017-06-30 17:52 GMT+02:00 Stavros Kontopoulos :
>>
>> > Hi Fabian,
>> >
>> > However, we should keep in mind that we need a committer to bootstrap
>> > the
>> > > new module.
>> >
>> >
>> > Absolutely. I thought Theodore Vasiloudis could help, as an initial
>> > committer.
>> > Is this known? He is part of the effort, btw.
>> >
>> > Best,
>> > Stavros
>> >
>> > On Fri, Jun 30, 2017 at 6:42 PM, Fabian Hueske 
>> > wrote:
>> >
>> > > Thanks Stavros (and everybody else involved) for starting this effort
>> > > and
>> > > bringing the discussion back to the mailing list.
>> > >
>> > > As I said before, a model serving module/component would be a great
>> > feature
>> > > for Flink.
>> > > I see the biggest advantage for such a module in the integration with
>> > > the
>> > > other APIs and libraries, such as DataStream, CEP, SQL.
>> > >
>> > > A FLIP would be a great way to continue your efforts and work on a
>> > > design
>> > > for the component.
>> > >
>> > > However, we should keep in mind that we need a committer to bootstrap
>> > > the
>> > > new module.
>> > > As people are contributing to the model serving module, the number of
>> > > committers should hopefully grow after some time.
>> > >
>> > > Best, Fabian
>> > >
>> > > 2017-06-30 10:58 GMT+02:00 Stavros Kontopoulos
>> > > > > >:
>> > >
>> > > > Hi all,
>> > > >
>> > > > After coordinating with Theodore Vasiloudis and the guys behind the
>> > Flink
>> > > > Model Serving effort (Eron, Radicalbit people, Boris, Bas (ING)), we
>> > > > propose to start working on the model serving over Flink in a more
>> > > official
>> > > > way.
>> > > > That translates to capturing design details in a FLIP document.
>> > > >
>> > > > Please let's discuss and vote whether you think this FLIP would be
>> > > viable.
>> > > >
>> > > > Model Serving as a Flink component might involve a lot of work and
>> > > > we
>> > > need
>> > > > to commit to support it in future Flink releases.
>> > > >
>> > > > In the meantime a lot of people have joined the Flink ML slack channel
>> > > > (
>> > > > https://flinkml.slack.com, https://flinkml-invites.herokuapp.com/)
>> > and I
>> > > > think it's time to try to get them gradually on board.
>> > > >
>> > > > So far we have several efforts hosted here:
>> > > > https://github.com/FlinkML
>> > > >
>> > > > Related documents for what we are doing:
>> > > >
>> > > > Flink ML roadmap
>> > > >
>> > > > https://docs.google.com/document/d/1afQbvZBTV15qF3vobVWUjxQc49h3U
>> > > > d06MIRhahtJ6dw/edit
>> > > >
>> > > > Flink MS
>> > > >
>> > > > https://docs.google.com/document/d/1CjWL9aLxPrKytKxUF5c3ohs0ickp0
>> > > > fdEXPsPYPEywsE/edit#
>> > > >
>> > > > PS. I will work on the last document in the next few days to
>> > > > consolidate
>> > > > the effort results to some extent and break the work down.
>> > > > Our target is to provide a generic API based on some plugin
>> > architecture
>> > > to
>> > > > serve different popular models/pipelines along with custom ones over
>> > > flink.
>> > > >
>> > > > Best,
>> > > > Stavros
>> > > >
>> > >
>> >
>
>
>


Re: connect data stream with parameter stream

2017-07-04 Thread Aljoscha Krettek
Hi Lei,

I’m afraid there is currently no API for doing this in one operation. I see two 
options right now:

1. Build a custom operator that implements windowing and also has a second 
input for the parameter stream. This would be a subclass of 
TwoInputStreamOperator. As an example, you can look at KeyedCoProcessOperator 
which is the operator implementation for a two-input process function 
(CoProcessFunction). This variant gives you most flexibility but it’s a bit 
involved.

2. Use two separate steps, i.e. first do the windowed operation and then have a 
second operation that combines the window results with the parameter stream. 
Something like this:

DataStream input = …;
DataStream parameterStream = …;
input
  .keyBy(…)
  .window(…)
  .reduce()/process()/apply() // the operation that you want to perform
  .connect(parameterStream)
  .process(new MyCoProcessFunction())

Here, MyCoProcessFunction would receive the results of the windowed operation 
on input 1 and the parameter stream on input 2. The function would keep state 
based on the parameter stream (you should checkpoint this; see 
CheckpointedFunction, and especially OperatorStateStore.getUnionListState()) 
and process elements that come in on input 1 based on this state.

Union ListState works like this: each parallel operator instance can put a 
bunch of things in state. When checkpointing, the state of all parallel 
instances is collected and checkpointed. When restoring (after a failure, for 
example) all state is sent to each parallel operator instance. In your case 
(I’m assuming that the parameter stream should be broadcast so that all 
parallel operator instances get the same input and therefore have the same 
state) you would only checkpoint the state of parallel operator instance 0. 
When restoring, this would be distributed to all operators and they would 
therefore all have the same state again.
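
As a rough illustration of option 2, here is a minimal sketch of such a 
CoProcessFunction, assuming window results and outputs are plain Strings and 
parameters arrive as (key, value) Tuple2s; the types and the way parameters 
are applied are placeholders, not a definitive implementation:

import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
import org.apache.flink.streaming.api.functions.co.CoProcessFunction;
import org.apache.flink.util.Collector;

public class MyCoProcessFunction
        extends CoProcessFunction<String, Tuple2<String, String>, String>
        implements CheckpointedFunction {

    private transient Map<String, String> parameters;
    private transient ListState<Map<String, String>> checkpointedParams;

    @Override
    public void processElement1(String windowResult, Context ctx, Collector<String> out) {
        // Apply the current parameters to a window result (placeholder logic).
        out.collect(windowResult + " / params=" + parameters);
    }

    @Override
    public void processElement2(Tuple2<String, String> param, Context ctx, Collector<String> out) {
        // Parameter updates only modify local state; they emit nothing.
        parameters.put(param.f0, param.f1);
    }

    @Override
    public void snapshotState(FunctionSnapshotContext context) throws Exception {
        checkpointedParams.clear();
        // Checkpoint only subtask 0; with union list state every subtask
        // gets the full state back on restore, as described above.
        if (getRuntimeContext().getIndexOfThisSubtask() == 0) {
            checkpointedParams.add(parameters);
        }
    }

    @Override
    public void initializeState(FunctionInitializationContext context) throws Exception {
        parameters = new HashMap<>();
        checkpointedParams = context.getOperatorStateStore().getUnionListState(
                new ListStateDescriptor<>(
                        "parameters",
                        TypeInformation.of(new TypeHint<Map<String, String>>() {})));
        if (context.isRestored()) {
            for (Map<String, String> m : checkpointedParams.get()) {
                parameters.putAll(m);
            }
        }
    }
}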

Does that help?

Best,
Aljoscha

> On 30. Jun 2017, at 21:22, Lei Chen  wrote:
> 
> Hi,
> 
> In my scenario I have 2 streams. DS1 is the main data stream reading logs from
> kafka, and DS2 is a parameter stream which is used to maintain state
> about all processing parameters (including filters) that need to be applied at
> runtime by DS1. The processing parameters can be changed at any time while the
> job is running.
> 
> DS1 is a windowed stream, DS2 is just a non-keyed normal stream. How to
> connect these 2 streams together so DS1 can apply those parameters in its
> window function by reading up-to-date parameter state maintained by DS2?
> 
> 
> thanks
> Lei



Re: Streaming Graph processing

2017-07-04 Thread Paris Carbone
I cannot answer that for sure since graph streams are still a research topic.
It depends on the demand and how fast graph stream representations and 
operations will become adopted.

If there is high demand on the Flink side we can definitely start a FLIP at some point, 
but for now it makes sense to see how side inputs and other related features 
will evolve, to allow a proper integration.
In the meantime feel free to try gelly-streaming and let us know of your 
impressions so far!

cheers
Paris

On 30 Jun 2017, at 19:03, Ameet BD 
> wrote:

Hi Paris,

Thanks for the reply. Any idea when Gelly-Stream will become part of the
official Flink distribution?
Regards,
Ameet

On Fri, Jun 30, 2017 at 8:20 PM, Paris Carbone 
> wrote:

Hi Ameet,

Flink’s Gelly currently operates on the DataSet model.
However, we have an experimental project with Vasia (Gelly-Stream) that
does exactly that.

You can check it out and let us know directly what you think:
https://github.com/vasia/gelly-streaming

Paris

On 30 Jun 2017, at 13:17, Ameet BD  wrote:

Hi ,

Can anyone please point me to examples on streaming graph processing based
on Gelly.

Regards,
Ameet





[jira] [Created] (FLINK-7103) Implement dispatcher component

2017-07-04 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-7103:


 Summary: Implement dispatcher component
 Key: FLINK-7103
 URL: https://issues.apache.org/jira/browse/FLINK-7103
 Project: Flink
  Issue Type: Sub-task
  Components: Cluster Management
Reporter: Till Rohrmann
Assignee: Till Rohrmann


Implement the skeletal structure of the {{Dispatcher}} component. The initial 
functionality will support job submissions and listing of jobs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Re: refactor StreamConfig

2017-07-04 Thread xu
Hi Aljoscha:
Thanks a lot for your advice.


I think I do not need separate steps, because all I am doing is 
introducing OperatorConfig and moving the fields. StreamConfig still relies on 
an underlying Configuration which flows from the client to the jobmanager and then 
to the task.


The following configs are used in an operator now:
2) input serializer is used in AsyncWaitOperator.class
5) chain.index is used in AbstractStreamOperator.setup(...)


However, what I put in the OperatorConfig is all the configs that belong to the 
operator: not only those the operator uses now, but also those the streamTask uses 
to build an operator. With OperatorConfig, an operator cannot see configs that belong 
to others.


   Best Regards!
   JiPing

[jira] [Created] (FLINK-7102) improve ClassLoaderITCase

2017-07-04 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-7102:
--

 Summary: improve ClassLoaderITCase
 Key: FLINK-7102
 URL: https://issues.apache.org/jira/browse/FLINK-7102
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.4.0
Reporter: Nico Kruber
Assignee: Nico Kruber


{{ClassLoaderITCase}}...
* unnecessarily runs multiple tests in a single test case
* {{#testDisposeSavepointWithCustomKvState()}} does not cancel its job (thus 
the order of execution of test cases defines the outcome)
* uses {{e.getCause().getCause()}} which may cause {{NullPointerException}}s 
hiding the original error



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: refactor StreamConfig (Appending a picture)

2017-07-04 Thread Ted Yu
The picture didn't go through.

Please use a third-party site.

On Tue, Jul 4, 2017 at 7:09 AM, xu  wrote:

> Hi All:
>  I am sorry about working on StreamConfig (https://github.
> com/apache/flink/pull/4241), which may conflict with others' work, before
> discussing it.
>
>  Motivation:
>  A Task contains one or more operators via chaining; however,
> configs of the operator and the task are all put in StreamConfig. For example, when
> an operator is set up with the StreamConfig, it can see the interfaces for
> physicalEdges or chained.task.configs, which is confusing. Similarly a
> streamTask should not see the interface for chain.index.
>  So we need to separate OperatorConfig from StreamConfig. A
> streamTask inits with the streamConfig, and then extracts operatorConfigs
> from it, building streamOperators with each operatorConfig.
> OperatorConfig: for the streamOperator to be set up with, it contains
> information that belongs only to the streamOperator. It contains:
>1)  operator information: name, id
>2)  streamOperator
>3)  input serializer.
>4)  output edges and serializers.
>5)  chain.index
>6)  state.key.serializer
>
>  StreamConfig: for the streamTask to use:
>1) in.physical.edges
>2) out.physical.edges
>3) chained OperatorConfigs
>4) execution environment: checkpoint, state.backend and so on...
>
> Proposed Change
>   I propose two overall changes:
>1) Build the jobGraph from the streamGraph
>2) StreamOperator is set up with an operatorConfig, so the setup
> interface needs to change
>
> (1) Build the jobGraph from the streamGraph
>When building, we first get every operatorConfig from the
> streamNode, and then put the operatorConfigs of the streamNodes into a streamConfig
> when we chain them into a jobVertex.
>
> (2) StreamOperator setup with OperatorConfig
>An OperatorConfig is provided instead of a streamConfig when the
> streamOperator is set up. Thanks to the advice of Stephan Ewen, OperatorConfig
> does not need to have a Map of "configKey" to values; it is just a serializable
> class with the respective fields. And StreamConfig still relies on an
> underlying Configuration, because the streamConfig flows via its underlying
> configuration.
>
>   There may be people who have already thought about this; maybe someone
> has been working on it. I need your advice.
>
>   Thanks a lot for replying and Best Regards.
>
>   JiPing
>
>
>
>
>
>
>
>


Re: refactor StreamConfig

2017-07-04 Thread Aljoscha Krettek
I think the proposed changes are good; I just wanted to make sure that they 
don’t interfere with what other people are doing.

I also proposed these steps on the GitHub PR:
Also, for actually doing the changes I suggest separate steps, i.e. separate 
commits, possibly with separate PRs, to make reviewing easier and to make the 
changes more isolated:

 - Rename StreamConfig to StreamTaskConfig and make it serialisable, instead of 
relying on an underlying Configuration. This means that the StreamTaskConfig 
itself has fields for storing settings.
 - Introduce OperatorConfig and move only those fields that the operator should 
see from StreamTaskConfig to OperatorConfig. Initialize the operator with an 
OperatorConfig.

Regarding what to put in the OperatorConfig and what in the StreamTaskConfig: 
why are these still in the OperatorConfig?
   2)  streamOperator
   3)  input serializer.
   4)  output edges and serializers.
   5)  chain.index

I think only the StreamTask, that is responsible for building the OperatorChain 
needs to have that information.

Best,
Aljoscha


> On 4. Jul 2017, at 15:56, xu  wrote:
> 
> Hi All:
>  I am sorry about working on 
> StreamConfig (https://github.com/apache/flink/pull/4241), which may conflict 
> with others' work, before discussing it.
> 
>  Motivation:
>  A Task contains one or more operators via chaining; however, 
> configs of the operator and the task are all put in StreamConfig. For example, when 
> an operator is set up with the StreamConfig, it can see the interfaces for 
> physicalEdges or chained.task.configs, which is confusing. Similarly a 
> streamTask should not see the interface for chain.index.
>  So we need to separate OperatorConfig from StreamConfig. A 
> streamTask inits with the streamConfig, and then extracts operatorConfigs 
> from it, building streamOperators with each operatorConfig. 
> 
> OperatorConfig: for the streamOperator to be set up with, it contains 
> information that belongs only to the streamOperator. It contains:
>1)  operator information: name, id
>2)  streamOperator
>3)  input serializer.
>4)  output edges and serializers.
>5)  chain.index
>6)  state.key.serializer
> 
>  StreamConfig: for the streamTask to use:
>1) in.physical.edges
>2) out.physical.edges
>3) chained OperatorConfigs
>4) execution environment: checkpoint, state.backend and so on... 
>  
> Proposed Change
>   I propose two overall changes:
>1) Build the jobGraph from the streamGraph
>2) StreamOperator is set up with an operatorConfig, so the setup 
> interface needs to change
> 
> (1) Build the jobGraph from the streamGraph
>When building, we first get every operatorConfig from the streamNode, 
> and then put the operatorConfigs of the streamNodes into a streamConfig when we chain 
> them into a jobVertex.
> 
> (2) StreamOperator setup with OperatorConfig
>An OperatorConfig is provided instead of a streamConfig when the 
> streamOperator is set up. Thanks to the advice of Stephan Ewen, OperatorConfig 
> does not need to have a Map of "configKey" to values; it is just a serializable 
> class with the respective fields. And StreamConfig still relies on an 
> underlying Configuration, because the streamConfig flows via its underlying 
> configuration.
>   
>   There may be people who have already thought about this; maybe someone has 
> been working on it. I need your advice.
> 
>   Thanks a lot for replying and Best Regards.
> 
>   JiPing
> 
> 
>  



Re: refactor StreamConfig (Appending a picture)

2017-07-04 Thread xu
Hi All:
 I am sorry about working on 
StreamConfig (https://github.com/apache/flink/pull/4241), which may conflict 
with others' work, before discussing it.


 Motivation:
 A Task contains one or more operators via chaining; however, configs 
of the operator and the task are all put in StreamConfig. For example, when an operator 
is set up with the StreamConfig, it can see the interfaces for physicalEdges or 
chained.task.configs, which is confusing. Similarly a streamTask should not 
see the interface for chain.index.
 So we need to separate OperatorConfig from StreamConfig. A streamTask 
inits with the streamConfig, and then extracts operatorConfigs from it, building 
streamOperators with each operatorConfig. 
OperatorConfig: for the streamOperator to be set up with, it contains 
information that belongs only to the streamOperator. It contains:
   1)  operator information: name, id
   2)  streamOperator
   3)  input serializer.
   4)  output edges and serializers.
   5)  chain.index
   6)  state.key.serializer


 StreamConfig: for the streamTask to use:
   1) in.physical.edges
   2) out.physical.edges
   3) chained OperatorConfigs
   4) execution environment: checkpoint, state.backend and so on... 
 
Proposed Change
  I propose two overall changes:
   1) Build the jobGraph from the streamGraph
   2) StreamOperator is set up with an operatorConfig, so the setup interface 
needs to change


(1) Build the jobGraph from the streamGraph
   When building, we first get every operatorConfig from the streamNode, 
and then put the operatorConfigs of the streamNodes into a streamConfig when we chain 
them into a jobVertex.


(2) StreamOperator setup with OperatorConfig
   An OperatorConfig is provided instead of a streamConfig when the 
streamOperator is set up. Thanks to the advice of Stephan Ewen, OperatorConfig does 
not need to have a Map of "configKey" to values; it is just a serializable class 
with the respective fields. And StreamConfig still relies on an underlying 
Configuration, because the streamConfig flows via its underlying configuration.
  
  There may be people who have already thought about this; maybe someone has 
been working on it. I need your advice.


  Thanks a lot for replying and Best Regards.


  JiPing




 

refactor StreamConfig

2017-07-04 Thread xu
Hi All:
 I am sorry about working on 
StreamConfig (https://github.com/apache/flink/pull/4241), which may conflict 
with others' work, before discussing it.


 Motivation:
 A Task contains one or more operators via chaining; however, configs 
of the operator and the task are all put in StreamConfig. For example, when an operator 
is set up with the StreamConfig, it can see the interfaces for physicalEdges or 
chained.task.configs, which is confusing. Similarly a streamTask should not 
see the interface for chain.index.
 So we need to separate OperatorConfig from StreamConfig. A streamTask 
inits with the streamConfig, and then extracts operatorConfigs from it, building 
streamOperators with each operatorConfig. 
OperatorConfig: for the streamOperator to be set up with, it contains 
information that belongs only to the streamOperator. It contains:
   1)  operator information: name, id
   2)  streamOperator
   3)  input serializer.
   4)  output edges and serializers.
   5)  chain.index
   6)  state.key.serializer


 StreamConfig: for the streamTask to use:
   1) in.physical.edges
   2) out.physical.edges
   3) chained OperatorConfigs
   4) execution environment: checkpoint, state.backend and so on... 
 
Proposed Change
  I propose two overall changes:
   1) Build the jobGraph from the streamGraph
   2) StreamOperator is set up with an operatorConfig, so the setup interface 
needs to change


(1) Build the jobGraph from the streamGraph
   When building, we first get every operatorConfig from the streamNode, 
and then put the operatorConfigs of the streamNodes into a streamConfig when we chain 
them into a jobVertex.


(2) StreamOperator setup with OperatorConfig
   An OperatorConfig is provided instead of a streamConfig when the 
streamOperator is set up. Thanks to the advice of Stephan Ewen, OperatorConfig is 
not required to have a Map of "configKey" to values; it is just a serializable class 
with the respective fields. And StreamConfig still relies on an underlying 
Configuration, because the streamConfig flows via its underlying configuration.
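
To make this concrete, here is a minimal sketch of what such a serializable 
OperatorConfig could look like; the fields follow the list above, and all 
names are tentative suggestions rather than a definitive design:

import java.io.Serializable;
import java.util.List;

import org.apache.flink.api.common.typeutils.TypeSerializer;
import org.apache.flink.streaming.api.graph.StreamEdge;
import org.apache.flink.streaming.api.operators.StreamOperator;

// A plain serializable holder with one field per setting, instead of a
// "configKey" -> value Map inside an underlying Configuration.
public class OperatorConfig implements Serializable {

    private static final long serialVersionUID = 1L;

    // 1) operator information
    private String operatorName;
    private Integer operatorId;

    // 2) the operator itself
    private StreamOperator<?> streamOperator;

    // 3) input serializer
    private TypeSerializer<?> inputSerializer;

    // 4) output edges and serializers
    private List<StreamEdge> outputEdges;
    private List<TypeSerializer<?>> outputSerializers;

    // 5) position of the operator in its chain
    private int chainIndex;

    // 6) key serializer for keyed state
    private TypeSerializer<?> stateKeySerializer;

    // getters and setters omitted for brevity
}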
  
  There may be people who have already thought about this; maybe someone has 
been working on it. I need your advice.


  Thanks a lot for replying and Best Regards.


  JiPing

[jira] [Created] (FLINK-7101) Fix Non-windowed group-aggregate error when using `minIdleStateRetentionTime` config and retract agg

2017-07-04 Thread sunjincheng (JIRA)
sunjincheng created FLINK-7101:
--

 Summary: Fix Non-windowed group-aggregate error when using 
`minIdleStateRetentionTime` config and retract agg
 Key: FLINK-7101
 URL: https://issues.apache.org/jira/browse/FLINK-7101
 Project: Flink
  Issue Type: Bug
  Components: Table API & SQL
Affects Versions: 1.3.1, 1.3.0
Reporter: sunjincheng
Assignee: sunjincheng
 Fix For: 1.4.0


When a non-windowed group-aggregate uses the {{minIdleStateRetentionTime}} config 
and a retract AGG, it will emit a "NULL" agg value, which we do not expect. 
For example ({{IntSumWithRetractAggFunction}}):
1. Receive: CRow(Row.of(6L: JLong, 5: JInt, "aaa"), true) 
2. Cleanup state
3. Receive: CRow(Row.of(6L: JLong, 5: JInt, "aaa"), false)  // acc.f1 = -1, 
getValue = null 

So, we must change the logic of {{GroupAggProcessFunction}} as follows:
{code}
if (inputCnt != 0) {
  ...
} else {
  ...
}
{code}
TO
{code}
if (inputCnt > 0) {
  ...
} else {
  if (null != prevRow.row) {
    ...
  }
}
{code}
What do you think? [~fhueske] [~hequn8128]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7100) TaskManager metrics are registered twice

2017-07-04 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-7100:
---

 Summary: TaskManager metrics are registered twice
 Key: FLINK-7100
 URL: https://issues.apache.org/jira/browse/FLINK-7100
 Project: Flink
  Issue Type: Bug
  Components: Local Runtime, Metrics
Affects Versions: 1.3.0, 1.4.0
Reporter: Chesnay Schepler
 Fix For: 1.4.0, 1.3.2


TaskManager metrics are currently registered twice, once when the TaskManager 
is started and once when the TaskManager associates with a JobManager.

Originally the metrics were registered when the TM associates with the JM and 
unregistered upon disassociation.

9e9776f17ed18b12af177e31ab0bc266236f85ef modified the {{TaskManager}} to use 
the {{TaskManagerServices}}, which when loaded _also_ register the metrics.

I suggest to remove the registrations that happen upon (dis-)association.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7099) Replace usages of deprecated JOB_MANAGER_IPC_PORT_KEY

2017-07-04 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-7099:
---

 Summary: Replace usages of deprecated JOB_MANAGER_IPC_PORT_KEY
 Key: FLINK-7099
 URL: https://issues.apache.org/jira/browse/FLINK-7099
 Project: Flink
  Issue Type: Improvement
  Components: Configuration, Local Runtime
Affects Versions: 1.4.0
Reporter: Chesnay Schepler
 Fix For: 1.4.0


The deprecated {{ConfigConstants#JOB_MANAGER_IPC_PORT_KEY}} is still used a lot.

We should replace these usages with {{JobManagerOptions#PORT}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7096) Adapt start-up scripts to start Flip-6 cluster

2017-07-04 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-7096:


 Summary: Adapt start-up scripts to start Flip-6 cluster
 Key: FLINK-7096
 URL: https://issues.apache.org/jira/browse/FLINK-7096
 Project: Flink
  Issue Type: Bug
  Components: Startup Shell Scripts
Reporter: Till Rohrmann
Priority: Minor


We have to adapt the start-up scripts to be able to start the Flip-6 components.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7095) Add proper command line parsing tool to TaskManagerRunner.main

2017-07-04 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-7095:


 Summary: Add proper command line parsing tool to 
TaskManagerRunner.main
 Key: FLINK-7095
 URL: https://issues.apache.org/jira/browse/FLINK-7095
 Project: Flink
  Issue Type: Sub-task
  Components: Cluster Management
Reporter: Till Rohrmann


We need to add a proper command line parsing tool to the entry point of the 
{{TaskManagerRunner#main}}. At the moment, we are simply using the 
{{ParameterTool}} as a temporary solution.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Re: Tips to fix IDEA strange problem after updating master code

2017-07-04 Thread 郭健
After deactivating scala-2.10 profile in IntelliJ, this issue is gone.
Thank you all.

On 7/4/17, 17:11, "Piotr Nowojski"  wrote:

Maybe try 

$ mvn clean 

Before reimporting and restarting/invalidating caches in IntelliJ? Did you 
deactivate scala-2.10 profile in the IntelliJ?

Piotrek



> On Jul 4, 2017, at 11:05 AM, 郭健  wrote:
> 
> I have done all of these but still get some issues in IDEA, especially in the 
flink-connector project.
> 
> 
> On 7/4/17, 15:57, "Piotr Nowojski"  wrote:
> 
>Besides deactivating “scala-2.10” profile in the Intellij it might be 
necessary to:
>- reimport maven project:
>   1. Right click on root module: “flink-parent”
>   2. Maven
>   3. reimport
>- invalidate caches and restart: File -> Invalidate caches and restart 
-> invalidate /restart
>- rebuild whole project
> 
>I suspect that either the activation of scala-2.10 by default comes from 
flink-scala and flink-scala-shell poms, or it’s an artifact because you 
created/imported the IntelliJ project when 2.10 was the default. If the first 
option is true, this PR: https://github.com/apache/flink/pull/4240 
 might fix this issue.
> 
> 
>Another quirk that I encountered is the compile error about the 
ILoopCompat class being defined twice in IntelliJ (it works fine from the console). 
This comes from flink-scala-shell/pom.xml, which defines two different source 
paths depending on the Scala version:

src/main/scala-${scala.binary.version}

>Such a thing is not supported by IntelliJ, and one has to manually 
remove one of the source directories (either 2.11 or 2.10) from the project 
settings.
> 
>Piotrek
> 
>> On Jul 4, 2017, at 9:46 AM, Aljoscha Krettek  wrote:
>> 
>> Thanks for the hint!
>> 
>>> On 4. Jul 2017, at 06:03, Ted Yu  wrote:
>>> 
>>> Looks like the picture didn't go thru.
>>> 
>>> Mind using third party site ?
>>> 
>>> Thanks
>>> 
>>> On Mon, Jul 3, 2017 at 8:56 PM, Jark Wu  wrote:
>>> 
 Hi devs,
 
 Yesterday, I updated the master code, which includes [FLINK-7030]: Build
 with scala-2.11 by default. After that, I ran into a strange problem with
 IDEA where many classes can't be found and the project can't be
 built/compiled (in IDEA), but maven install worked fine.
 
 After a series of attempts, I found that IDEA activates the scala-2.10
 profile by default, which results in this problem. After deactivating the
 scala-2.10 profile via the sidebar (Maven Projects -> Profiles -> deactivate
 the "scala-2.10" profile), everything works fine again.
 
 [image: embedded image 1]
 
 I share this tip on the dev list because a lot of my colleagues have the
 same issue, and maybe many other Flink devs have the same problem too.
 
 BTW, I don't know why IDEA activates scala-2.10 by default; I'm not sure
 whether it's an IDEA bug or a wrong profile setting somewhere.
 
 
 Regards,
 Jark Wu
 
>> 
> 
> 
> 





Re: Tips to fix IDEA strange problem after updating master code

2017-07-04 Thread Chesnay Schepler
You could also try the nuclear option of actually re-importing the 
project into IntelliJ (as if you had just downloaded it); that worked for me.


On 04.07.2017 11:11, Piotr Nowojski wrote:

Maybe try

$ mvn clean

Before reimporting and restarting/invalidating caches in IntelliJ? Did you 
deactivate scala-2.10 profile in the IntelliJ?

Piotrek




On Jul 4, 2017, at 11:05 AM, 郭健  wrote:

I have done all of these but still get some issues in IDEA, especially in the 
flink-connector project.


On 7/4/17, 15:57, "Piotr Nowojski"  wrote:

Besides deactivating “scala-2.10” profile in the Intellij it might be 
necessary to:
- reimport maven project:
1. Right click on root module: “flink-parent”
2. Maven
3. reimport
- invalidate caches and restart: File -> Invalidate caches and restart -> 
invalidate /restart
- rebuild whole project

I suspect that either the activation of scala-2.10 by default comes from flink-scala 
and flink-scala-shell poms, or it’s an artifact because you created/imported the IntelliJ 
project when 2.10 was the default. If the first option is true, this PR: 
https://github.com/apache/flink/pull/4240 
 might fix this issue.


Another quirk that I encountered is the compile error about the ILoopCompat 
class being defined twice in IntelliJ (it works fine from the console). This comes 
from flink-scala-shell/pom.xml, which defines two different source paths 
depending on the Scala version:

src/main/scala-${scala.binary.version}

Such a thing is not supported by IntelliJ, and one has to manually remove one 
of the source directories (either 2.11 or 2.10) from the project settings.

Piotrek


On Jul 4, 2017, at 9:46 AM, Aljoscha Krettek  wrote:

Thanks for the hint!


On 4. Jul 2017, at 06:03, Ted Yu  wrote:

Looks like the picture didn't go thru.

Mind using third party site ?

Thanks

On Mon, Jul 3, 2017 at 8:56 PM, Jark Wu  wrote:


Hi devs,

Yesterday, I updated the master code, which includes [FLINK-7030]: Build
with scala-2.11 by default. After that, I ran into a strange problem with
IDEA where many classes can't be found and the project can't be
built/compiled (in IDEA), but maven install worked fine.

After a series of attempts, I found that IDEA activates the scala-2.10
profile by default, which results in this problem. After deactivating the
scala-2.10 profile via the sidebar (Maven Projects -> Profiles -> deactivate
the "scala-2.10" profile), everything works fine again.

[image: embedded image 1]

I share this tip on the dev list because a lot of my colleagues have the
same issue, and maybe many other Flink devs have the same problem too.

BTW, I don't know why IDEA activates scala-2.10 by default; I'm not sure
whether it's an IDEA bug or a wrong profile setting somewhere.


Regards,
Jark Wu










Re: Tips to fix IDEA strange problem after updating master code

2017-07-04 Thread Piotr Nowojski
Maybe try 

$ mvn clean 

Before reimporting and restarting/invalidating caches in IntelliJ? Did you 
deactivate scala-2.10 profile in the IntelliJ?

Piotrek



> On Jul 4, 2017, at 11:05 AM, 郭健  wrote:
> 
> I have done all of these but still get some issues in IDEA, especially in the 
> flink-connector project.
> 
> 
> On 7/4/17, 15:57, "Piotr Nowojski"  wrote:
> 
>Besides deactivating “scala-2.10” profile in the Intellij it might be 
> necessary to:
>- reimport maven project:
>   1. Right click on root module: “flink-parent”
>   2. Maven
>   3. reimport
>- invalidate caches and restart: File -> Invalidate caches and restart -> 
> invalidate /restart
>- rebuild whole project
> 
>I suspect that either the activation of scala-2.10 by default comes from 
> flink-scala and flink-scala-shell poms, or it’s an artifact because you 
> created/imported the IntelliJ project when 2.10 was the default. If the first 
> option is true, this PR: https://github.com/apache/flink/pull/4240 
>  might fix this issue.
> 
> 
>Another quirk that I encountered is the compile error about the ILoopCompat 
> class being defined twice in IntelliJ (it works fine from the console). This comes 
> from flink-scala-shell/pom.xml, which defines two different source paths 
> depending on the Scala version:
> 
>src/main/scala-${scala.binary.version}
> 
>Such a thing is not supported by IntelliJ, and one has to manually remove 
> one of the source directories (either 2.11 or 2.10) from the project settings.
> 
>Piotrek
> 
>> On Jul 4, 2017, at 9:46 AM, Aljoscha Krettek  wrote:
>> 
>> Thanks for the hint!
>> 
>>> On 4. Jul 2017, at 06:03, Ted Yu  wrote:
>>> 
>>> Looks like the picture didn't go thru.
>>> 
>>> Mind using third party site ?
>>> 
>>> Thanks
>>> 
>>> On Mon, Jul 3, 2017 at 8:56 PM, Jark Wu  wrote:
>>> 
 Hi devs,
 
 Yesterday, I updated the master code, which includes [FLINK-7030]: Build
 with scala-2.11 by default. After that, I ran into a strange problem with
 IDEA where many classes can't be found and the project can't be
 built/compiled (in IDEA), but maven install worked fine.
 
 After a series of attempts, I found that IDEA activates the scala-2.10
 profile by default, which results in this problem. After deactivating the
 scala-2.10 profile via the sidebar (Maven Projects -> Profiles -> deactivate
 the "scala-2.10" profile), everything works fine again.
 
 [image: embedded image 1]
 
 I share this tip on the dev list because a lot of my colleagues have the
 same issue, and maybe many other Flink devs have the same problem too.
 
 BTW, I don't know why IDEA activates scala-2.10 by default; I'm not sure
 whether it's an IDEA bug or a wrong profile setting somewhere.
 
 
 Regards,
 Jark Wu
 
>> 
> 
> 
> 



Re: Re: Tips to fix IDEA strange problem after updating master code

2017-07-04 Thread 郭健
I have done all of these but still get some issues in IDEA, especially in the 
flink-connector project.


On 7/4/17, 15:57, "Piotr Nowojski"  wrote:

Besides deactivating “scala-2.10” profile in the Intellij it might be 
necessary to:
- reimport maven project:
1. Right click on root module: “flink-parent”
2. Maven
3. reimport
- invalidate caches and restart: File -> Invalidate caches and restart -> 
invalidate /restart
- rebuild whole project

I suspect that either the activation of scala-2.10 by default comes from 
flink-scala and flink-scala-shell poms, or it’s an artifact because you 
created/imported the IntelliJ project when 2.10 was the default. If the first 
option is true, this PR: https://github.com/apache/flink/pull/4240 
 might fix this issue.


Another quirk that I encountered is the compile error about the ILoopCompat 
class being defined twice in IntelliJ (it works fine from the console). This comes 
from flink-scala-shell/pom.xml, which defines two different source paths 
depending on the Scala version:

src/main/scala-${scala.binary.version}

Such a thing is not supported by IntelliJ, and one has to manually remove one 
of the source directories (either 2.11 or 2.10) from the project settings.

Piotrek

> On Jul 4, 2017, at 9:46 AM, Aljoscha Krettek  wrote:
> 
> Thanks for the hint!
> 
>> On 4. Jul 2017, at 06:03, Ted Yu  wrote:
>> 
>> Looks like the picture didn't go thru.
>> 
>> Mind using third party site ?
>> 
>> Thanks
>> 
>> On Mon, Jul 3, 2017 at 8:56 PM, Jark Wu  wrote:
>> 
>>> Hi devs,
>>> 
>>> Yesterday, I updated the master code, which includes [FLINK-7030]: Build
>>> with scala-2.11 by default. After that, I ran into a strange problem with
>>> IDEA where many classes can't be found and the project can't be
>>> built/compiled (in IDEA), but maven install worked fine.
>>> 
>>> After a series of attempts, I found that IDEA activates the scala-2.10
>>> profile by default, which results in this problem. After deactivating the
>>> scala-2.10 profile via the sidebar (Maven Projects -> Profiles -> deactivate
>>> the "scala-2.10" profile), everything works fine again.
>>> 
>>> [image: embedded image 1]
>>> 
>>> I share this tip on the dev list because a lot of my colleagues have the
>>> same issue, and maybe many other Flink devs have the same problem too.
>>> 
>>> BTW, I don't know why IDEA activates scala-2.10 by default; I'm not sure
>>> whether it's an IDEA bug or a wrong profile setting somewhere.
>>> 
>>> 
>>> Regards,
>>> Jark Wu
>>> 
> 





Re: Problem to deploy of flink program in cluster

2017-07-04 Thread Ted Yu
I only see pom.xml in the attachment. 
Consider using pastebin for the other files. 
 Original message From: simone sguazza 
 Date: 7/4/17  12:34 AM  (GMT-08:00) To: 
dev@flink.apache.org Subject: Problem to deploy of flink program in cluster 


Hi, below you can find my mail and my problem from before I subscribed to the 
mailing list.

simone


From: simone sguazza 
Sent: Monday, 3 July 2017 11:57
To: dev@flink.apache.org
Subject: Problem to deploy of flink program in cluster


Hi, I'm a student and I'm using Flink v1.2.1 to develop a workflow to analyze 
data coming from Kafka v0.10.

At the moment everything works fine: from Eclipse I can export the runnable jar 
with dependencies, and with "sudo java -jar flinkApp.jar" I can run the Flink 
application on an Ubuntu server.
Up to here, all is OK.
I configured Flink to be in High Availability mode, and it starts correctly.
When I export with Maven, it creates a fat jar, which I launch with "sudo 
bin/flink run flinkApp.jar". It works, and on the dashboard I can see that the 
job is running. The problem is that it does not receive a single record from 
Kafka. The curious thing is that I only have this problem when I export the jar 
to run in cluster mode; when I export a runnable jar, everything works fine.

I put three files in the attachment:
- webUI_producer: it shows the Flink dashboard with my Kafka producer
- problemFlinkCluster: it shows the Ubuntu server where my Flink application 
runs. The top-left square is ZooKeeper (from Kafka), the bottom-left is the 
Flink program running with "sudo java -jar flinkapp.jar", and the rightmost 
column is the execution of the Flink app with "sudo bin/flink run appFlink.jar". 
You can see the Flink application start and then do nothing more.
- the pom.xml

I think the problem is in the pom.xml, but I don't know where.

Best regards, simone


Re: Tips to fix IDEA strange problem after updating master code

2017-07-04 Thread Piotr Nowojski
Besides deactivating “scala-2.10” profile in the Intellij it might be necessary 
to:
- reimport maven project:
1. Right click on root module: “flink-parent”
2. Maven
3. reimport
- invalidate caches and restart: File -> Invalidate caches and restart -> 
invalidate /restart
- rebuild whole project

I suspect that either the activation of scala-2.10 by default comes from 
flink-scala and flink-scala-shell poms, or it’s an artifact because you 
created/imported the IntelliJ project when 2.10 was the default. If the first 
option is true, this PR: https://github.com/apache/flink/pull/4240 
 might fix this issue.


Another quirk that I encountered is the compile error about the ILoopCompat class 
being defined twice in IntelliJ (it works fine from the console). This comes from 
flink-scala-shell/pom.xml, which defines two different source paths depending 
on the Scala version:

src/main/scala-${scala.binary.version}

Such a thing is not supported by IntelliJ, and one has to manually remove one of 
the source directories (either 2.11 or 2.10) from the project settings.

Piotrek

> On Jul 4, 2017, at 9:46 AM, Aljoscha Krettek  wrote:
> 
> Thanks for the hint!
> 
>> On 4. Jul 2017, at 06:03, Ted Yu  wrote:
>> 
>> Looks like the picture didn't go thru.
>> 
>> Mind using third party site ?
>> 
>> Thanks
>> 
>> On Mon, Jul 3, 2017 at 8:56 PM, Jark Wu  wrote:
>> 
>>> Hi devs,
>>> 
>>> Yesterday, I updated the master code, which includes [FLINK-7030]: Build
>>> with scala-2.11 by default. After that, I ran into a strange problem with
>>> IDEA where many classes can't be found and the project can't be
>>> built/compiled (in IDEA), but maven install worked fine.
>>> 
>>> After a series of attempts, I found that IDEA activates the scala-2.10
>>> profile by default, which results in this problem. After deactivating the
>>> scala-2.10 profile via the sidebar (Maven Projects -> Profiles -> deactivate
>>> the "scala-2.10" profile), everything works fine again.
>>> 
>>> [image: embedded image 1]
>>> 
>>> I share this tip on the dev list because a lot of my colleagues have the
>>> same issue, and maybe many other Flink devs have the same problem too.
>>> 
>>> BTW, I don't know why IDEA activates scala-2.10 by default; I'm not sure
>>> whether it's an IDEA bug or a wrong profile setting somewhere.
>>> 
>>> 
>>> Regards,
>>> Jark Wu
>>> 
> 



Re: Problem to deploy of flink program in cluster

2017-07-04 Thread Tzu-Li (Gordon) Tai
Hi Simone,

Is the job actually running? Or did it fail with some exception? If it failed, 
there should be some error message in the JM logs.
Could you take a look at that and see if it hints at anything?

Cheers,
Gordon


On 4 July 2017 at 3:35:07 PM, simone sguazza (simone-sgua...@hotmail.com) wrote:

Hi, below you can find my mail and my problem from before I subscribed to the 
mailing list.


simone

From: simone sguazza 
Sent: Monday, 3 July 2017 11:57
To: dev@flink.apache.org
Subject: Problem to deploy of flink program in cluster
 
Hi, I'm a student and I'm using Flink v1.2.1 to develop a workflow to analyze 
data coming from Kafka v0.10.

At the moment everything works fine: from Eclipse I can export the runnable jar 
with dependencies, and with "sudo java -jar flinkApp.jar" I can run the Flink 
application on an Ubuntu server.

Up to here, all is OK.

I configured Flink to be in High Availability mode, and it starts correctly.

When I export with Maven, it creates a fat jar, which I launch with "sudo 
bin/flink run flinkApp.jar". It works, and on the dashboard I can see that the 
job is running. The problem is that it does not receive a single record from 
Kafka. The curious thing is that I only have this problem when I export the jar 
to run in cluster mode; when I export a runnable jar, everything works fine.

I put three files in the attachment:

- webUI_producer: it shows the Flink dashboard with my Kafka producer

- problemFlinkCluster: it shows the Ubuntu server where my Flink application 
runs. The top-left square is ZooKeeper (from Kafka), the bottom-left is the 
Flink program running with "sudo java -jar flinkapp.jar", and the rightmost 
column is the execution of the Flink app with "sudo bin/flink run appFlink.jar". 
You can see the Flink application start and then do nothing more.

- the pom.xml

I think the problem is in the pom.xml, but I don't know where.

Best regards, simone

[jira] [Created] (FLINK-7094) Update outdated doc release 1.2 to latest stable release 1.3

2017-07-04 Thread mingleizhang (JIRA)
mingleizhang created FLINK-7094:
---

 Summary: Update outdated doc release 1.2 to latest stable release 
1.3
 Key: FLINK-7094
 URL: https://issues.apache.org/jira/browse/FLINK-7094
 Project: Flink
  Issue Type: Bug
Reporter: mingleizhang
Assignee: mingleizhang






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [External Mail] Tips to fix IDEA strange problem after updating master code

2017-07-04 Thread 郭健
+1
I met the same problem. I have tried reimporting the maven project, but it had no 
effect.


From: Jark Wu 
Reply-To: "dev@flink.apache.org" 
Date: Tuesday, July 4, 2017 at 11:56
To: "dev@flink.apache.org" 
Subject: [External Mail] Tips to fix IDEA strange problem after updating master code

Hi devs,

Yesterday, I updated the master code, which includes [FLINK-7030]: Build with 
scala-2.11 by default. After that, I ran into a strange problem with IDEA where 
many classes can't be found and the project can't be built/compiled (in IDEA), 
but maven install worked fine.

After a series of attempts, I found that IDEA activates the scala-2.10 profile 
by default, which results in this problem. After deactivating the scala-2.10 profile 
via the sidebar (Maven Projects -> Profiles -> deactivate the "scala-2.10" profile), 
everything works fine again.

[image: embedded image 1]

I share this tip on the dev list because a lot of my colleagues have the same 
issue, and maybe many other Flink devs have the same problem too.

BTW, I don't know why IDEA activates scala-2.10 by default; I'm not sure whether 
it's an IDEA bug or a wrong profile setting somewhere.


Regards,
Jark Wu


Re: Tips to fix a strange IDEA problem after updating the master code

2017-07-04 Thread Aljoscha Krettek
Thanks for the hint!

> On 4. Jul 2017, at 06:03, Ted Yu  wrote:
> 
> Looks like the picture didn't go through.
> 
> Mind using a third-party site?
> 
> Thanks
> 
> On Mon, Jul 3, 2017 at 8:56 PM, Jark Wu  wrote:
> 
>> Hi devs,
>> 
>> Yesterday, I updated the master code, which includes [FLINK-7030]: Build
>> with scala-2.11 by default. After that, I ran into a strange problem with
>> IDEA: many classes couldn't be found, and the project couldn't be
>> built/compiled (in IDEA), although maven install worked fine.
>> 
>> After a series of attempts, I found that IDEA activates the scala-2.10
>> profile by default, which results in this problem. After deactivating the
>> scala-2.10 profile via the Maven Projects sidebar -> Profiles -> deactivate
>> "scala-2.10", everything works fine again.
>> 
>> [image: inline image 1]
>> 
>> I'm sharing this tip on the dev list because a lot of my colleagues have hit
>> the same issue, and many other Flink devs may have the same problem too.
>> 
>> BTW, I don't know why IDEA activates scala-2.10 by default; I'm not sure
>> whether it's an IDEA bug or a wrong profile setting somewhere.
>> 
>> 
>> Regards,
>> Jark Wu
>> 



Problem deploying a Flink program on a cluster

2017-07-04 Thread simone sguazza
Hi, below you can find the mail describing my problem, sent before I subscribed 
to the mailing list.

simone


From: simone sguazza 
Sent: Monday, 3 July 2017 11:57
To: dev@flink.apache.org
Subject: Problem deploying a Flink program on a cluster


Hi, I'm a student and I'm using Flink v1.2.1 to develop a workflow that 
analyzes data coming from Kafka v0.10.


At the moment everything works fine: from Eclipse I can export a runnable jar 
with dependencies, and with "sudo java -jar flinkApp.jar" I can run the Flink 
application on an Ubuntu server.

Up to here, all is OK.

I configured Flink to run in High Availability mode, and it starts correctly.

When I export with Maven, it creates a fat jar, which I launch with "sudo 
bin/flink run flinkApp.jar". It works, and on the dashboard I can see that the 
job is running. The problem is that it does not receive a single record from 
Kafka. The curious thing is that I have this problem only when I export the 
jar to run in cluster mode; when I export a runnable jar, everything works 
fine.


I have attached three files:

- webUI_producer: shows the Flink dashboard with my Kafka producer

- problemFlinkCluster: shows the Ubuntu server where my Flink application 
runs. The top-left square is ZooKeeper (from Kafka), the bottom-left is the 
Flink program running with "sudo java -jar flinkapp.jar", and the rightmost 
column is the execution of the Flink app with "sudo bin/flink run 
appFlink.jar". You can see that the Flink application starts and then does 
nothing more.

- the pom.xml


I think the problem is in the pom.xml, but I don't know where.


Best regards, simone
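
For reference, a minimal Flink 1.2 job that consumes from Kafka 0.10 looks 
roughly like the sketch below. The broker address, group id and topic name are 
placeholders, not taken from Simone's setup. If a job like this receives data 
when started with "java -jar" but not with "bin/flink run", the difference 
most likely lies in how the fat jar is assembled.

import java.util.Properties;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class KafkaSmokeTest {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder connection settings: adjust broker and group id to the real setup.
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "flink-smoke-test");

        // Read strings from Kafka and print them; if nothing is printed while the
        // producer is writing, the consumer is not reaching the broker at all.
        env.addSource(new FlinkKafkaConsumer010<>("my-topic", new SimpleStringSchema(), props))
           .print();

        env.execute("Kafka 0.10 smoke test");
    }
}

(The pom.xml below lost its XML tags in the mailing-list archive and is cut 
off partway through; the tags in the listing were reconstructed from the 
remaining values, so some element names, e.g. the property holding the main 
class, are best guesses.)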
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>ch.supsi.dti</groupId>
  <artifactId>LogAnalyzerCluster</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>

  <name>LogAnalyzerCluster</name>
  <url>http://maven.apache.org</url>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <maven.compiler.source>1.8</maven.compiler.source>
    <maven.compiler.target>1.8</maven.compiler.target>
    <flink.version>1.2.1</flink.version>
    <main.class>ch.supsi.dti.WordCountFlink.App</main.class>
  </properties>

  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-nop</artifactId>
      <version>1.7.25</version>
      <exclusions>
        <exclusion>
          <groupId>org.slf4j</groupId>
          <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>com.esotericsoftware.kryo</groupId>
      <artifactId>kryo</artifactId>
      <version>2.24.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-core</artifactId>
      <version>${flink.version}</version>
      <exclusions>
        <exclusion>
          <groupId>org.slf4j</groupId>
          <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-streaming-java_2.11</artifactId>
      <version>${flink.version}</version>
      <exclusions>
        <exclusion>
          <groupId>org.slf4j</groupId>
          <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-connector-kafka-0.10_2.11</artifactId>
      <version>${flink.version}</version>
      <exclusions>
        <exclusion>
          <groupId>org.slf4j</groupId>
          <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-cep_2.11</artifactId>
      <version>${flink.version}</version>
    </dependency>
    <dependency>
      <groupId>net.sf.opencsv</groupId>
      <artifactId>opencsv</artifactId>
      <version>2.3</version>
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-connector-cassandra_2.11</artifactId>
      <version>${flink.version}</version>
      <exclusions>
        <exclusion>
          <groupId>org.slf4j</groupId>
          <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-clients_2.11</artifactId>
      <version>${flink.version}</version>
    </dependency>
  </dependencies>

  <profiles>
    <profile>
      <id>build-jar</id>
      <activation>
        <activeByDefault>false</activeByDefault>
      </activation>
      <dependencies>
        <dependency>
          <groupId>org.apache.flink</groupId>
          <artifactId>flink-java</artifactId>
          <version>${flink.version}</version>
          <scope>provided</scope>
        </dependency>
        <dependency>
          <groupId>org.apache.flink</groupId>
          <artifactId>flink-streaming-java_2.11</artifactId>
          <version>${flink.version}</version>
          <scope>provided</scope>
        </dependency>
        <dependency>
          <groupId>org.apache.flink</groupId>
          <artifactId>flink-clients_2.11</artifactId>
          <version>${flink.version}</version>
          <scope>provided</scope>
        </dependency>
      </dependencies>
      <build>
        <plugins>
          <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>2.4.1</version>
            <executions>
              <execution>
                <phase>package</phase>
                <goals>
                  <goal>shade</goal>
                </goals>
                <configuration>
                  <artifactSet>
                    <excludes combine.self="override"/>
                  </artifactSet>
                </configuration>
              </execution>
            </executions>
          </plugin>
        </plugins>
      </build>
    </profile>
  </profiles>

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>2.4.1</version>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
            <configuration>
              <artifactSet>
                <excludes>
                  <exclude>org.apache.flink:flink-annotations</exclude>
                  <exclude>org.apache.flink:flink-shaded-hadoop2</exclude>
                  <exclude>org.apache.flink:flink-shaded-curator-recipes</exclude>
                  <exclude>org.apache.flink:flink-core</exclude>
                  <exclude>org.apache.flink:flink-java</exclude>
                  <exclude>org.apache.flink:flink-scala_2.11</exclude>
                  <exclude>org.apache.flink:flink-runtime_2.11</exclude>
                  <exclude>org.apache.flink:flink-optimizer_2.11</exclude>
                  <exclude>org.apache.flink:flink-clients_2.11</exclude>
                  <exclude>org.apache.flink:flink-avro_2.11</exclude>
                  <exclude>org.apache.flink:flink-examples-batch_2.11</exclude>
                  <exclude>org.apache.flink:flink-examples-streaming_2.11</exclude>
                  <exclude>org.apache.flink:flink-streaming-java_2.11</exclude>
                  <exclude>org.apache.flink:flink-streaming-scala_2.11</exclude>
                  <exclude>org.apache.flink:flink-scala-shell_2.11</exclude>
                  <exclude>org.apache.flink:flink-python</exclude>
                  <exclude>org.apache.flink:flink-metrics-core</exclude>
                  <exclude>org.apache.flink:flink-metrics-jmx</exclude>
                  <exclude>org.apache.flink:flink-statebackend-rocksdb_2.11</exclude>
                  <exclude>log4j:log4j</exclude>
                  <exclude>org.scala-lang:scala-library</exclude>
                  <exclude>org.scala-lang:scala-compiler</exclude>
                  <exclude>org.scala-lang:scala-reflect</exclude>
                  <exclude>com.data-artisans:flakka-actor_*</exclude>
                  <exclude>com.data-artisans:flakka-remote_*</exclude>
[attachment truncated in the archive]
[jira] [Created] (FLINK-7092) Shutdown ResourceManager components properly

2017-07-04 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-7092:


 Summary: Shutdown ResourceManager components properly
 Key: FLINK-7092
 URL: https://issues.apache.org/jira/browse/FLINK-7092
 Project: Flink
  Issue Type: Sub-task
  Components: Mesos
Reporter: Till Rohrmann
Priority: Minor


The {{MesosResourceManager}} internally starts a {{TaskMonitor}}, a 
{{LaunchCoordinator}}, a {{ConnectionMonitor}} and a 
{{ReconciliationCoordinator}}. These components have to be properly shut down 
when the {{MesosResourceManager}} closes.
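
A hedged illustration of the pattern this issue asks for (the classes below 
are plain stand-in stubs, not Flink's or Mesos' actual components): whoever 
starts child components should stop them symmetrically, in reverse start 
order, when it closes.

import java.util.ArrayDeque;
import java.util.Deque;

public final class ShutdownSketch {

    /** Stand-in for a started child component such as a task monitor. */
    interface Component extends AutoCloseable {
        @Override void close();
    }

    public static void main(String[] args) {
        // Start the children in order; remember them so they can be stopped later.
        Deque<Component> started = new ArrayDeque<>();
        started.push(named("TaskMonitor"));
        started.push(named("LaunchCoordinator"));
        started.push(named("ConnectionMonitor"));
        started.push(named("ReconciliationCoordinator"));

        // On shutdown, close every child in reverse start order.
        while (!started.isEmpty()) {
            started.pop().close();
        }
    }

    static Component named(String name) {
        System.out.println(name + " started");
        return () -> System.out.println(name + " stopped");
    }
}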



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7090) Rework ConnectionMonitor to not depend on Akka

2017-07-04 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-7090:


 Summary: Rework ConnectionMonitor to not depend on Akka
 Key: FLINK-7090
 URL: https://issues.apache.org/jira/browse/FLINK-7090
 Project: Flink
  Issue Type: Sub-task
  Components: Mesos
Reporter: Till Rohrmann


In order to get rid of Flink's Akka dependency (in the long term), we should 
rework the {{ConnectionMonitor}} to not depend on Akka.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7089) Rework LaunchCoordinator to not depend on Akka

2017-07-04 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-7089:


 Summary: Rework LaunchCoordinator to not depend on Akka
 Key: FLINK-7089
 URL: https://issues.apache.org/jira/browse/FLINK-7089
 Project: Flink
  Issue Type: Sub-task
  Components: Mesos
Reporter: Till Rohrmann


In order to get rid of Flink's Akka dependency (in the long term), we should 
rework the {{LaunchCoordinator}} to not depend on Akka.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7088) Rework TaskMonitor to not depend on Akka

2017-07-04 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-7088:


 Summary: Rework TaskMonitor to not depend on Akka
 Key: FLINK-7088
 URL: https://issues.apache.org/jira/browse/FLINK-7088
 Project: Flink
  Issue Type: Sub-task
  Components: Mesos
Reporter: Till Rohrmann


In order to get rid of Flink's Akka dependency (in the long term), we should 
rework the {{TaskMonitor}} to not depend on Akka.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7087) Implement Flip-6 container entry point

2017-07-04 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-7087:


 Summary: Implement Flip-6 container entry point
 Key: FLINK-7087
 URL: https://issues.apache.org/jira/browse/FLINK-7087
 Project: Flink
  Issue Type: Bug
  Components: Cluster Management
Reporter: Till Rohrmann


In order to support Docker and Kubernetes (K8s), we have to provide a 
container entry point. In a first version, the container entry point could be 
similar to the standalone session mode, just with the difference that we don't 
submit a job to the cluster. The job has to be contained in the container 
image or at least be retrievable by the entry point. In that sense, the 
container entry point acts like a per-job standalone mode.
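
A purely hypothetical sketch of that idea follows (none of the classes below 
exist in Flink; the names are invented for illustration): the entry point 
brings up session-style cluster services and then runs the job shipped with 
the image, instead of waiting for an external submission.

public final class ContainerEntryPointSketch {

    /** Stand-in for the job graph baked into the container image. */
    interface PackagedJob {
        void run() throws Exception;
    }

    /** Stand-in for the session-mode services (dispatcher, resource manager, ...). */
    static final class ClusterServices implements AutoCloseable {
        void start() { System.out.println("cluster services started"); }
        @Override public void close() { System.out.println("cluster services stopped"); }
    }

    public static void main(String[] args) throws Exception {
        // The job is retrieved from the image, not submitted from the outside.
        PackagedJob job = () -> System.out.println("running the packaged job");
        try (ClusterServices services = new ClusterServices()) {
            services.start();
            job.run(); // per-job semantics: run exactly this job, then shut down
        }
    }
}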



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7086) Create entry point for standalone session mode

2017-07-04 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-7086:


 Summary: Create entry point for standalone session mode
 Key: FLINK-7086
 URL: https://issues.apache.org/jira/browse/FLINK-7086
 Project: Flink
  Issue Type: Sub-task
  Components: Cluster Management
Reporter: Till Rohrmann


Create a cluster entry point for the standalone session mode which can be 
called by the start-up script to launch a Flink session.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7084) Flip-6 testing

2017-07-04 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-7084:


 Summary: Flip-6 testing
 Key: FLINK-7084
 URL: https://issues.apache.org/jira/browse/FLINK-7084
 Project: Flink
  Issue Type: Test
  Components: Tests
Reporter: Till Rohrmann


This is an umbrella issue for all Flip-6-related tests that have to be 
implemented.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)