[jira] [Created] (FLINK-2827) Potential resource leak in TwitterSource#loadAuthenticationProperties()

2015-10-06 Thread Ted Yu (JIRA)
Ted Yu created FLINK-2827:
-

 Summary: Potential resource leak in 
TwitterSource#loadAuthenticationProperties()
 Key: FLINK-2827
 URL: https://issues.apache.org/jira/browse/FLINK-2827
 Project: Flink
  Issue Type: Bug
Reporter: Ted Yu


Here is related code:
{code}
Properties properties = new Properties();
try {
    InputStream input = new FileInputStream(authPath);
    properties.load(input);
    input.close();
} catch (Exception e) {
    throw new RuntimeException("Cannot open .properties file: " + authPath, e);
}
{code}
If an exception comes out of the properties.load() call, input is left open.
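A leak-free variant would close the stream in all cases, e.g. with try-with-resources (Java 7+). The sketch below is illustrative, not the actual Flink code; the class and method names are assumptions:

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.Properties;

public class AuthProps {
    // try-with-resources closes the stream automatically,
    // even when properties.load() throws.
    public static Properties load(String authPath) {
        Properties properties = new Properties();
        try (InputStream input = new FileInputStream(authPath)) {
            properties.load(input);
        } catch (Exception e) {
            throw new RuntimeException("Cannot open .properties file: " + authPath, e);
        }
        return properties;
    }
}
```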



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2826) transformed is modified in BroadcastVariableMaterialization#decrementReferenceInternal without proper locking

2015-10-06 Thread Ted Yu (JIRA)
Ted Yu created FLINK-2826:
-

 Summary: transformed is modified in 
BroadcastVariableMaterialization#decrementReferenceInternal without proper 
locking
 Key: FLINK-2826
 URL: https://issues.apache.org/jira/browse/FLINK-2826
 Project: Flink
  Issue Type: Bug
Reporter: Ted Yu


Here is related code:
{code}
if (references.isEmpty()) {
disposed = true;
data = null;
transformed = null;
{code}
Elsewhere, transformed is modified while holding the lock on 
BroadcastVariableMaterialization.this.
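A consistent fix would take the same monitor around this write as the other code paths. The following is a simplified sketch only; the class shape and method names mirror the snippet above but are not the actual Flink class:

```java
import java.util.HashSet;
import java.util.Set;

public class Materialization {
    private final Set<Object> references = new HashSet<>();
    private boolean disposed;
    private Object data;
    private Object transformed;

    public void addReference(Object ref) {
        synchronized (this) {
            references.add(ref);
        }
    }

    // All writers of 'transformed' now take the same monitor that the
    // other code paths already use, so no update happens unguarded.
    public void decrementReferenceInternal(Object ref) {
        synchronized (this) {
            references.remove(ref);
            if (references.isEmpty()) {
                disposed = true;
                data = null;
                transformed = null;
            }
        }
    }

    public synchronized boolean isDisposed() {
        return disposed;
    }
}
```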



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2825) FlinkClient.killTopology fails due to missing leader session ID

2015-10-06 Thread Matthias J. Sax (JIRA)
Matthias J. Sax created FLINK-2825:
--

 Summary: FlinkClient.killTopology fails due to missing leader 
session ID
 Key: FLINK-2825
 URL: https://issues.apache.org/jira/browse/FLINK-2825
 Project: Flink
  Issue Type: Bug
  Components: Storm Compatibility
Reporter: Matthias J. Sax
Assignee: Matthias J. Sax


Calling FlinkClient.killTopology(...) fails with exception:
{noformat}
java.lang.Exception: Received a message 
CancelJob(e56e680ab6781ce3a16bdac7bcdb4dd7) without a leader session ID, even 
though the message requires a leader session ID.
at 
org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:41)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at 
org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
at 
org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
at 
org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at 
org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:107)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
at akka.dispatch.Mailbox.run(Mailbox.scala:221)
at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2824) Iteration feedback partitioning does not work as expected

2015-10-06 Thread Gyula Fora (JIRA)
Gyula Fora created FLINK-2824:
-

 Summary: Iteration feedback partitioning does not work as expected
 Key: FLINK-2824
 URL: https://issues.apache.org/jira/browse/FLINK-2824
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Reporter: Gyula Fora
Priority: Blocker


Iteration feedback partitioning is not handled transparently and can cause 
serious issues if the user does not know the specific implementation details of 
streaming iterations (which is not a realistic expectation).

Example:

IterativeStream it = ... (parallelism 1)
DataStream mapped = it.map(...) (parallelism 2)
// this does not work as the feedback has parallelism 2 != 1
// it.closeWith(mapped.partitionByHash(someField))
// so we need to rebalance the data
it.closeWith(mapped.map(NoOpMap).setParallelism(1).partitionByHash(someField))

This program will execute but the feedback will not be partitioned by hash to 
the mapper instances:
The partitioning will be set from the noOpMap to the iteration sink which has 
parallelism different from the mapper (1 vs 2) and then the iteration source 
forwards the element to the mapper (always to 0).

So the problem is basically that the iteration source/sink pair gets the 
parallelism of the input stream (p=1) not the head operator (p = 2) which leads 
to incorrect partitioning.

Workaround:
Set input parallelism to the same as the head operator

Suggested solution:
The iteration construction should be reworked to set the parallelism of the 
source/sink to the parallelism of the head operator (and validate that all 
heads have the same parallelism)
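The suggested validation could look roughly like this; a sketch under assumed names, not the actual Flink iteration internals:

```java
import java.util.List;

public class IterationValidation {
    // Resolve the parallelism for the iteration source/sink pair from the
    // head operators (rather than from the input stream), and reject
    // iterations whose heads disagree on parallelism.
    public static int resolveIterationParallelism(List<Integer> headParallelisms) {
        if (headParallelisms.isEmpty()) {
            throw new IllegalArgumentException("Iteration has no head operators");
        }
        int p = headParallelisms.get(0);
        for (int other : headParallelisms) {
            if (other != p) {
                throw new IllegalArgumentException(
                    "All iteration heads must have the same parallelism: "
                    + p + " vs " + other);
            }
        }
        // the source/sink pair gets the heads' parallelism, not the input's
        return p;
    }
}
```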



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2823) YARN client should report a proper exception if Hadoop Env variables are not set

2015-10-06 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-2823:
---

 Summary: YARN client should report a proper exception if Hadoop 
Env variables are not set
 Key: FLINK-2823
 URL: https://issues.apache.org/jira/browse/FLINK-2823
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Affects Versions: 0.10
Reporter: Stephan Ewen
 Fix For: 0.10


The important variables are {{HADOOP_HOME}} and {{HADOOP_CONF_DIR}}.

If neither of the two environment variables is set, the YARN client does not 
pick up the correct settings / file system to successfully deploy a job.

The YARN client should throw a descriptive exception. Right now, finding the 
cause of the problem requires some tricky debugging.
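Such an early check could be as simple as the following sketch; the class name and message text are illustrative, and only the two environment variable names come from the issue:

```java
import java.util.Map;

public class HadoopEnvCheck {
    // Fail fast with an explicit message when neither Hadoop
    // environment variable is set, instead of failing later with
    // a hard-to-diagnose deployment error.
    public static void checkHadoopEnv(Map<String, String> env) {
        String home = env.get("HADOOP_HOME");
        String confDir = env.get("HADOOP_CONF_DIR");
        if ((home == null || home.isEmpty()) && (confDir == null || confDir.isEmpty())) {
            throw new RuntimeException(
                "Neither HADOOP_HOME nor HADOOP_CONF_DIR is set. The YARN client "
                + "needs the Hadoop configuration to locate the ResourceManager "
                + "and the correct default file system.");
        }
    }
}
```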



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Failing test

2015-10-06 Thread Till Rohrmann
If there is none yet, then we do. Label it with "test-stability". I think
the consensus was also to mark it as critical.

Otherwise, just add the log to the JIRA.

On Tue, Oct 6, 2015 at 2:57 PM, Matthias J. Sax  wrote:

> Hi,
>
> One test just failed on current master:
> https://travis-ci.org/apache/flink/jobs/83871008
>
> Do we need a JIRA?
>
> >   LeaderChangeStateCleanupTest.testReelectionOfSameJobManager:245 »
> Timeout Futu...
>
>
> -Matthias
>
>


Failing test

2015-10-06 Thread Matthias J. Sax
Hi,

One test just failed on current master:
https://travis-ci.org/apache/flink/jobs/83871008

Do we need a JIRA?

>   LeaderChangeStateCleanupTest.testReelectionOfSameJobManager:245 » Timeout 
> Futu...


-Matthias





[jira] [Created] (FLINK-2822) Windowing classes incorrectly import scala.Serializable

2015-10-06 Thread Gyula Fora (JIRA)
Gyula Fora created FLINK-2822:
-

 Summary: Windowing classes incorrectly import scala.Serializable
 Key: FLINK-2822
 URL: https://issues.apache.org/jira/browse/FLINK-2822
 Project: Flink
  Issue Type: Bug
Reporter: Gyula Fora
Assignee: Gyula Fora
Priority: Blocker


Windowing classes incorrectly pull in the Scala serializable interface.

grep -lr scala.Serializable *
flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/windowing/assigners/WindowAssigner.java
flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/windowing/evictors/Evictor.java
flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/windowing/triggers/Trigger.java
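The fix is a one-line import change per affected class: scala.Serializable is a Scala-library trait (it extends java.io.Serializable), so implementing it ties Java API classes to the Scala library. A simplified before/after sketch; the class body is elided and this is not the actual Flink source:

```java
// Before (wrong): pulled in the Scala trait.
// import scala.Serializable;

// After (correct): the plain JDK marker interface.
import java.io.Serializable;

// Simplified stand-in for the affected classes
// (WindowAssigner, Evictor, Trigger).
public abstract class WindowAssigner implements Serializable {
}
```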



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Towards Flink 0.10

2015-10-06 Thread Maximilian Michels
@Stephan: There is an issue for implementing a detached job submission:
https://issues.apache.org/jira/browse/FLINK-2797. That has the advantage
that the execution mode and control flow are not hard-coded in the user job.
I think it makes more sense to integrate methods like
(cancel/getAccumulators/scale) in the Client class and then also make them
accessible via the command-line client. IMHO we could think about a special
executeWithControl() method for these things but I would not break the
existing execute().

On Tue, Oct 6, 2015 at 12:18 PM, Alexey Sapozhnikov 
wrote:

> Hello everyone.
>
> Stephan, will 0.10 include the cache issues for createRemoteEnvironment?
> That not every addressing of environment will retransmit all the jars?
>
> On Tue, Oct 6, 2015 at 1:16 PM, Stephan Ewen  wrote:
>
> > How about making a quick effort for this:
> > https://issues.apache.org/jira/browse/FLINK-2313
> >
> > It introduces an execution control handle (returned by
> > StreamExecutionEnvironment.execute()) for operations like cancel(),
> > getAccumulators(), scaleIn/Out(), ...
> > It would be big time API breaking, so would be good to have it before the
> > release.
> >
> > As a first version we could return a control handle that has only a
> single
> > method "waitForCompletion()" to emulate the current behavior.
> > Or we postpone it and later add a method "executeWithControl()" or so,
> that
> > returns the control handle.
> >
> >
> > On Mon, Oct 5, 2015 at 6:34 PM, Vasiliki Kalavri <
> > vasilikikala...@gmail.com>
> > wrote:
> >
> > > Yes, FLINK-2785 that's right!
> > > Alright, thanks a lot!
> > >
> > > On 5 October 2015 at 18:31, Fabian Hueske  wrote:
> > >
> > > > Hi Vasia,
> > > >
> > > > I guess you are referring to FLINK-2785. Should be fine, as there is
> a
> > PR
> > > > already.
> > > > I'll add it to the list.
> > > >
> > > > Would be nice if you could take care of FLINK-2786 (remove Spargel).
> > > >
> > > > Cheers, Fabian
> > > >
> > > > 2015-10-05 18:25 GMT+02:00 Vasiliki Kalavri <
> vasilikikala...@gmail.com
> > >:
> > > >
> > > > > Thank you Max for putting the list together and to whomever added
> > > > > FLINK-2561 to the list :)
> > > > > I would also add FLINK-2561 (pending PR #1205). It's a sub-task of
> > > > > FLINK-2561, so maybe it's covered as is.
> > > > >
> > > > > If we go for Gelly graduation, I can take care of FLINK-2786
> "Remove
> > > > > Spargel from source code and update documentation in favor of
> Gelly",
> > > but
> > > > > maybe it makes sense to wait for the restructuring, since we're
> > getting
> > > > rid
> > > > > of staging altogether?
> > > > >
> > > > > -V.
> > > > >
> > > > > On 5 October 2015 at 17:51, Fabian Hueske 
> wrote:
> > > > >
> > > > > > Thanks Max.
> > > > > > I extended the list of issues to fix for the release.
> > > > > >
> > > > > > 2015-10-05 17:10 GMT+02:00 Maximilian Michels :
> > > > > >
> > > > > > > Thanks Greg, we have added that to the list of API breaking
> > > changes.
> > > > > > >
> > > > > > > On Mon, Oct 5, 2015 at 4:36 PM, Greg Hogan  >
> > > > wrote:
> > > > > > >
> > > > > > > > Max,
> > > > > > > >
> > > > > > > > Stephan noted that FLINK-2723 is an API breaking change. The
> > > > > > > CopyableValue
> > > > > > > > interface has a new method "T copy()". Commit
> > > > > > > > e727355e42bd0ad7d403aee703aaf33a68a839d2
> > > > > > > >
> > > > > > > > Greg
> > > > > > > >
> > > > > > > > On Mon, Oct 5, 2015 at 10:20 AM, Maximilian Michels <
> > > > m...@apache.org>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Flinksters,
> > > > > > > > >
> > > > > > > > > After a lot of development effort in the past months, it is
> > > about
> > > > > > time
> > > > > > > > > to move towards the next major release. We decided to move
> > > > towards
> > > > > > > > > 0.10 instead of a milestone release. This release will
> > probably
> > > > be
> > > > > > the
> > > > > > > > > last release before 1.0.
> > > > > > > > >
> > > > > > > > > For 0.10 we most noticeably have the new Streaming API
> which
> > > > comes
> > > > > > > > > with an improved runtime including exactly-once sources and
> > > > sinks.
> > > > > > > > > Additionally, we have a new web interface with support for
> > > > > > > > > live-monitoring. Not to mention the countless fixes and
> > > > > improvements.
> > > > > > > > >
> > > > > > > > > I've been ransacking the JIRA issues to find out what
> issues
> > we
> > > > > have
> > > > > > > > > to fix before we can release. I've put these issues on the
> > 0.10
> > > > > wiki
> > > > > > > > > page:
> > > > > https://cwiki.apache.org/confluence/display/FLINK/0.10+Release
> > > > > > > > >
> > > > > > > > > It would be great to fix all of these issues. However, I
> > think
> > > we
> > > > > > need
> > > > > > > > > to be pragmatic and pick the most pressing ones. Let's do
> > that
> > > on
> > > > > the
> > > > > > > > > wiki page on the "To Be Fixed" section.
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > 

Re: Towards Flink 0.10

2015-10-06 Thread Alexey Sapozhnikov
Hello everyone.

Stephan, will 0.10 include the cache issues for createRemoteEnvironment?
So that not every access to the environment retransmits all the jars?

On Tue, Oct 6, 2015 at 1:16 PM, Stephan Ewen  wrote:

> How about making a quick effort for this:
> https://issues.apache.org/jira/browse/FLINK-2313
>
> It introduces an execution control handle (returned by
> StreamExecutionEnvironment.execute()) for operations like cancel(),
> getAccumulators(), scaleIn/Out(), ...
> It would be big time API breaking, so would be good to have it before the
> release.
>
> As a first version we could return a control handle that has only a single
> method "waitForCompletion()" to emulate the current behavior.
> Or we postpone it and later add a method "executeWithControl()" or so, that
> returns the control handle.
>
>
> On Mon, Oct 5, 2015 at 6:34 PM, Vasiliki Kalavri <
> vasilikikala...@gmail.com>
> wrote:
>
> > Yes, FLINK-2785 that's right!
> > Alright, thanks a lot!
> >
> > On 5 October 2015 at 18:31, Fabian Hueske  wrote:
> >
> > > Hi Vasia,
> > >
> > > I guess you are referring to FLINK-2785. Should be fine, as there is a
> PR
> > > already.
> > > I'll add it to the list.
> > >
> > > Would be nice if you could take care of FLINK-2786 (remove Spargel).
> > >
> > > Cheers, Fabian
> > >
> > > 2015-10-05 18:25 GMT+02:00 Vasiliki Kalavri  >:
> > >
> > > > Thank you Max for putting the list together and to whomever added
> > > > FLINK-2561 to the list :)
> > > > I would also add FLINK-2561 (pending PR #1205). It's a sub-task of
> > > > FLINK-2561, so maybe it's covered as is.
> > > >
> > > > If we go for Gelly graduation, I can take care of FLINK-2786 "Remove
> > > > Spargel from source code and update documentation in favor of Gelly",
> > but
> > > > maybe it makes sense to wait for the restructuring, since we're
> getting
> > > rid
> > > > of staging altogether?
> > > >
> > > > -V.
> > > >
> > > > On 5 October 2015 at 17:51, Fabian Hueske  wrote:
> > > >
> > > > > Thanks Max.
> > > > > I extended the list of issues to fix for the release.
> > > > >
> > > > > 2015-10-05 17:10 GMT+02:00 Maximilian Michels :
> > > > >
> > > > > > Thanks Greg, we have added that to the list of API breaking
> > changes.
> > > > > >
> > > > > > On Mon, Oct 5, 2015 at 4:36 PM, Greg Hogan 
> > > wrote:
> > > > > >
> > > > > > > Max,
> > > > > > >
> > > > > > > Stephan noted that FLINK-2723 is an API breaking change. The
> > > > > > CopyableValue
> > > > > > > interface has a new method "T copy()". Commit
> > > > > > > e727355e42bd0ad7d403aee703aaf33a68a839d2
> > > > > > >
> > > > > > > Greg
> > > > > > >
> > > > > > > On Mon, Oct 5, 2015 at 10:20 AM, Maximilian Michels <
> > > m...@apache.org>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Flinksters,
> > > > > > > >
> > > > > > > > After a lot of development effort in the past months, it is
> > about
> > > > > time
> > > > > > > > to move towards the next major release. We decided to move
> > > towards
> > > > > > > > 0.10 instead of a milestone release. This release will
> probably
> > > be
> > > > > the
> > > > > > > > last release before 1.0.
> > > > > > > >
> > > > > > > > For 0.10 we most noticeably have the new Streaming API which
> > > comes
> > > > > > > > with an improved runtime including exactly-once sources and
> > > sinks.
> > > > > > > > Additionally, we have a new web interface with support for
> > > > > > > > live-monitoring. Not to mention the countless fixes and
> > > > improvements.
> > > > > > > >
> > > > > > > > I've been ransacking the JIRA issues to find out what issues
> we
> > > > have
> > > > > > > > to fix before we can release. I've put these issues on the
> 0.10
> > > > wiki
> > > > > > > > page:
> > > > https://cwiki.apache.org/confluence/display/FLINK/0.10+Release
> > > > > > > >
> > > > > > > > It would be great to fix all of these issues. However, I
> think
> > we
> > > > > need
> > > > > > > > to be pragmatic and pick the most pressing ones. Let's do
> that
> > on
> > > > the
> > > > > > > > wiki page on the "To Be Fixed" section.
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Max
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>



-- 

*Regards*

*Alexey Sapozhnikov*
CTO & Co-Founder
Scalabillit Inc
Aba Even 10-C, Herzelia, Israel
M : +972-52-2363823
E : ale...@scalabill.it
W : http://www.scalabill.it
YT - https://youtu.be/9Rj309PTOFA
Map:http://mapta.gs/Scalabillit
Revolutionizing Proof-of-Concept


Re: Towards Flink 0.10

2015-10-06 Thread Stephan Ewen
How about making a quick effort for this:
https://issues.apache.org/jira/browse/FLINK-2313

It introduces an execution control handle (returned by
StreamExecutionEnvironment.execute()) for operations like cancel(),
getAccumulators(), scaleIn/Out(), ...
It would be big time API breaking, so would be good to have it before the
release.

As a first version we could return a control handle that has only a single
method "waitForCompletion()" to emulate the current behavior.
Or we postpone it and later add a method "executeWithControl()" or so, that
returns the control handle.


On Mon, Oct 5, 2015 at 6:34 PM, Vasiliki Kalavri 
wrote:

> Yes, FLINK-2785 that's right!
> Alright, thanks a lot!
>
> On 5 October 2015 at 18:31, Fabian Hueske  wrote:
>
> > Hi Vasia,
> >
> > I guess you are referring to FLINK-2785. Should be fine, as there is a PR
> > already.
> > I'll add it to the list.
> >
> > Would be nice if you could take care of FLINK-2786 (remove Spargel).
> >
> > Cheers, Fabian
> >
> > 2015-10-05 18:25 GMT+02:00 Vasiliki Kalavri :
> >
> > > Thank you Max for putting the list together and to whomever added
> > > FLINK-2561 to the list :)
> > > I would also add FLINK-2561 (pending PR #1205). It's a sub-task of
> > > FLINK-2561, so maybe it's covered as is.
> > >
> > > If we go for Gelly graduation, I can take care of FLINK-2786 "Remove
> > > Spargel from source code and update documentation in favor of Gelly",
> but
> > > maybe it makes sense to wait for the restructuring, since we're getting
> > rid
> > > of staging altogether?
> > >
> > > -V.
> > >
> > > On 5 October 2015 at 17:51, Fabian Hueske  wrote:
> > >
> > > > Thanks Max.
> > > > I extended the list of issues to fix for the release.
> > > >
> > > > 2015-10-05 17:10 GMT+02:00 Maximilian Michels :
> > > >
> > > > > Thanks Greg, we have added that to the list of API breaking
> changes.
> > > > >
> > > > > On Mon, Oct 5, 2015 at 4:36 PM, Greg Hogan 
> > wrote:
> > > > >
> > > > > > Max,
> > > > > >
> > > > > > Stephan noted that FLINK-2723 is an API breaking change. The
> > > > > CopyableValue
> > > > > > interface has a new method "T copy()". Commit
> > > > > > e727355e42bd0ad7d403aee703aaf33a68a839d2
> > > > > >
> > > > > > Greg
> > > > > >
> > > > > > On Mon, Oct 5, 2015 at 10:20 AM, Maximilian Michels <
> > m...@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Flinksters,
> > > > > > >
> > > > > > > After a lot of development effort in the past months, it is
> about
> > > > time
> > > > > > > to move towards the next major release. We decided to move
> > towards
> > > > > > > 0.10 instead of a milestone release. This release will probably
> > be
> > > > the
> > > > > > > last release before 1.0.
> > > > > > >
> > > > > > > For 0.10 we most noticeably have the new Streaming API which
> > comes
> > > > > > > with an improved runtime including exactly-once sources and
> > sinks.
> > > > > > > Additionally, we have a new web interface with support for
> > > > > > > live-monitoring. Not to mention the countless fixes and
> > > improvements.
> > > > > > >
> > > > > > > I've been ransacking the JIRA issues to find out what issues we
> > > have
> > > > > > > to fix before we can release. I've put these issues on the 0.10
> > > wiki
> > > > > > > page:
> > > https://cwiki.apache.org/confluence/display/FLINK/0.10+Release
> > > > > > >
> > > > > > > It would be great to fix all of these issues. However, I think
> we
> > > > need
> > > > > > > to be pragmatic and pick the most pressing ones. Let's do that
> on
> > > the
> > > > > > > wiki page on the "To Be Fixed" section.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Max
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: Iteration feedback partitioning does not work properly

2015-10-06 Thread Gyula Fóra
Hi,

This is just a workaround, which actually breaks input order from my
source. I think the iteration construction should be reworked to set the
parallelism of the source/sink to the parallelism of the head operator (and
validate that all heads have the same parallelism).

I thought this was the solution that you described with Stephan in some
older discussion before the rewrite.

Cheers,
Gyula

Aljoscha Krettek  ezt írta (időpont: 2015. okt. 6., K,
9:15):

> Hi,
> I think what you would like to do can be achieved by:
>
> IterativeStream it = in.map(IdentityMap).setParallelism(2).iterate()
> DataStream mapped = it.map(...)
>  it.closeWith(mapped.partitionByHash(someField))
>
> The input is rebalanced to the map inside the iteration as in your example
> and the feedback should be partitioned by hash.
>
> Cheers,
> Aljoscha
>
>
> On Tue, 6 Oct 2015 at 00:11 Gyula Fóra  wrote:
>
> > Hey,
> >
> > This question is mainly targeted towards Aljoscha but maybe someone can
> > help me out here:
> >
> > I think the way feedback partitioning is handled does not work, let me
> > illustrate with a simple example:
> >
> > IterativeStream it = ... (parallelism 1)
> > DataStream mapped = it.map(...) (parallelism 2)
> > // this does not work as the feedback has parallelism 2 != 1
> > // it.closeWith(mapped.partitionByHash(someField))
> > // so we need rebalance the data
> >
> > it.closeWith(mapped.map(NoOpMap).setParallelism(1).partitionByHash(someField))
> >
> > This program will execute but the feedback will not be partitioned by
> hash
> > to the mapper instances:
> > The partitioning will be set from the noOpMap to the iteration sink which
> > has parallelism different from the mapper (1 vs 2) and then the iteration
> > source forwards the element to the mapper (always to 0).
> >
> > So the problem is basically that the iteration source/sink pair gets the
> > parallelism of the input stream (p=1) not the head operator (p = 2) which
> > leads to incorrect partitioning.
> >
> > Did I miss something here?
> >
> > Cheers,
> > Gyula
> >
> >
>


Re: streaming GroupBy + Fold

2015-10-06 Thread Martin Neumann
The window is actually part of the workaround we are currently using (I should
have commented it out), where we use a window and a MapFunction instead of a
Fold. Originally I was running fold without a window and facing the same
problems.

The workaround works for now, so there is no urgency on that one. I just
wanted to make sure I was not doing something stupid and that it was a bug
you guys were aware of.

cheers Martin

On Tue, Oct 6, 2015 at 8:09 AM, Aljoscha Krettek 
wrote:

> Hi,
> If you are using a fold you are using none of the new code paths. I will
> add support for Fold to the new windowing implementation today, though.
>
> Cheers,
> Aljoscha
>
> On Mon, 5 Oct 2015 at 23:49 Márton Balassi 
> wrote:
>
> > Martin, I have looked at your code and you are running a fold in a
> window,
> > that is a very important distinction - the code paths are separate.
> > Those code paths have been recently touched by Aljoscha if I am not
> > mistaken.
> >
> > I have mocked up a simple example and could not reproduce your problem
> > unfortunately. [1] Could you maybe produce a minimalistic example that we
> > can actually execute? :)
> >
> > [1]
> >
> >
> https://github.com/mbalassi/flink/commit/9f1f02d05e2bc2043a8f514d39fbf7753ea7058d
> >
> > On Mon, Oct 5, 2015 at 10:06 PM, Márton Balassi <
> balassi.mar...@gmail.com>
> > wrote:
> >
> > > Thanks, I am checking it out tomorrow morning.
> > >
> > > On Mon, Oct 5, 2015 at 9:59 PM, Martin Neumann 
> wrote:
> > >
> > >> Hej,
> > >>
> > >> Sorry it took so long to respond I needed to check if I was actually
> > >> allowed to share the code since it uses internal datasets.
> > >>
> > >> In the appendix of this email you will find the main class of this job
> > >> without the supporting classes or the actual dataset. If you want to
> > run it
> > >> you need to replace the dataset by something else but that should be
> > >> trivial.
> > >> If you just want to see the problem itself, have a look at the
> appended
> > >> log in conjunction with the code. Each ERROR printout in the log
> > relates to
> > >> an accumulator receiving wrong values.
> > >>
> > >> cheers Martin
> > >>
> > >> On Sat, Oct 3, 2015 at 11:29 AM, Márton Balassi <
> > balassi.mar...@gmail.com
> > >> > wrote:
> > >>
> > >>> Hey,
> > >>>
> > >>> Thanks for reporting the problem, Martin. I have not merged the PR
> > >>> Stephan
> > >>> is referring to yet. [1] There I am cleaning up some of the internals
> > >>> too.
> > >>> Just out of curiosity, could you share the code for the failing test
> > >>> please?
> > >>>
> > >>> [1] https://github.com/apache/flink/pull/1155
> > >>>
> > >>> On Fri, Oct 2, 2015 at 8:26 PM, Martin Neumann 
> > wrote:
> > >>>
> > >>> > One of my colleagues found it today when we were hunting bugs today.
> > >>> > We were using the latest 0.10 version pulled from maven this morning.
> > >>> > The program we were testing is new code so I can't tell you if the
> > >>> > behavior has changed or if it was always like this.
> > >>> >
> > >>> > On Fri, Oct 2, 2015 at 7:46 PM, Stephan Ewen 
> > wrote:
> > >>> >
> > >>> > > I think these operations were recently moved to the internal
> state
> > >>> > > interface. Did the behavior change then?
> > >>> > >
> > >>> > > @Marton or Gyula, can you comment? Is it per chance not mapped to
> > the
> > >>> > > partitioned state?
> > >>> > >
> > >>> > > On Fri, Oct 2, 2015 at 6:37 PM, Martin Neumann  >
> > >>> wrote:
> > >>> > >
> > >>> > > > Hej,
> > >>> > > >
> > >>> > > > In one of my Programs I run a Fold on a GroupedDataStream. The
> > aim
> > >>> is
> > >>> > to
> > >>> > > > aggregate the values in each group.
> > >>> > > > It seems the aggregator in the Fold function is shared on
> > operator
> > >>> > level,
> > >>> > > > so all groups that end up on the same operator get mashed
> > together.
> > >>> > > >
> > >>> > > > Is this the wanted behavior? If so, what do I have to do to
> > >>> separate
> > >>> > > them?
> > >>> > > >
> > >>> > > >
> > >>> > > > cheers Martin
> > >>> > > >
> > >>> > >
> > >>> >
> > >>>
> > >>
> > >>
> > >
> >
>


Re: [DISCUSS] Introducing a review process for pull requests

2015-10-06 Thread Theodore Vasiloudis
One problem that we are seeing with FlinkML PRs is that there are simply
not enough committers to "shepherd" all of them.

While I think this process would help generally, I don't think it would
solve this kind of problem.

Regards,
Theodore

On Mon, Oct 5, 2015 at 3:28 PM, Matthias J. Sax  wrote:

> One comment:
> We should ensure that contributors follow discussions on the dev mailing
> list. Otherwise, they might miss important discussions regarding their
> PR (what happened already). Thus, the contributor was waiting for
> feedback on the PR, while the reviewer(s) waited for the PR to be
> updated according to the discussion consensus, resulting in unnecessary
> delays.
>
> -Matthias
>
>
>
> On 10/05/2015 02:18 PM, Fabian Hueske wrote:
> > Hi everybody,
> >
> > Along with our efforts to improve the “How to contribute” guide, I would
> > like to start a discussion about a setting up a review process for pull
> > requests.
> >
> > Right now, I feel that our PR review efforts are often a bit unorganized.
> > This leads to situations such as:
> >
> > - PRs which are lingering around without review or feedback
> > - PRs which got a +1 for merging but which did not get merged
> > - PRs which have been rejected after a long time
> > - PRs which became irrelevant because some component was rewritten
> > - PRs which are lingering around and have been abandoned by their
> > contributors
> >
> > To address these issues I propose to define a pull request review process
> > as follows:
> >
> > 1. [Get a Shepherd] Each pull request is taken care of by a shepherd.
> > Shepherds are committers that voluntarily sign up and *feel responsible*
> > for helping the PR through the process until it is merged (or discarded).
> > The shepherd is also the person to contact for the author of the PR. A
> > committer becomes the shepherd of a PR by dropping a comment on Github
> like
> > “I would like to shepherd this PR”. A PR can be reassigned with lazy
> > consensus.
> >
> > 2. [Accept Decision] The first decision for a PR should be whether it is
> > accepted or not. This depends on a) whether it is a desired feature or
> > improvement for Flink and b) whether the higher-level solution design is
> > appropriate. In many cases such as bug fixes or discussed features or
> > improvements, this should be an easy decision. In case of a more complex
> > feature, the discussion should have been started when the mandatory JIRA
> > was created. If it is still not clear whether the PR should be accepted
> or
> > not, a discussion should be started in JIRA (a JIRA issue needs to be
> > created if none exists). The acceptance decision should be recorded by a
> > “+1 to accept” message in Github. If the PR is not accepted, it should be
> > closed in a timely manner.
> >
> > 3. [Merge Decision] Once a PR has been “accepted”, it should be brought
> > into a mergeable state. This means the community should quickly react on
> > contributor questions or PR updates. Everybody is encouraged to review
> pull
> > requests and give feedback. Ideally, the PR author does not have to wait
> > for a long time to get feedback. The shepherd of a PR should feel
> > responsible to drive this process forward. Once the PR is considered to
> be
> > mergeable, this should be recorded by a “+1 to merge” message in Github.
> If
> > the pull request becomes abandoned at some point in time, it should be
> > either taken over by somebody else or be closed after a reasonable time.
> > Again, this can be done by anybody, but the shepherd should feel
> > responsible to resolve the issue.
> >
> > 4. Once, the PR is in a mergeable state, it should be merged. This can be
> > done by anybody, however the shepherd should make sure that it happens
> in a
> > timely manner.
> >
> > IMPORTANT: Everybody is encouraged to discuss pull requests, give
> feedback,
> > and merge pull requests which are in a good shape. However, the shepherd
> > should feel responsible to drive a PR forward if nothing happens.
> >
> > By defining a review process for pull requests, I hope we can
> >
> > - Improve the satisfaction of and interaction with contributors
> > - Improve and speed-up the review process of pull requests.
> > - Close irrelevant and stale PRs more timely
> > - Reduce the effort for code reviewing
> >
> > The review process can be supported by some tooling:
> >
> > - A QA bot that checks the quality of pull requests such as increase of
> > compiler warnings, code style, API changes, etc.
> > - A PR management tool such as Spark's PR dashboard (see:
> > https://spark-prs.appspot.com/)
> >
> > I would like to start discussions about using such tools later as
> separate
> > discussions.
> >
> > Looking forward to your feedback,
> > Fabian
> >
>
>


Re: Iteration feedback partitioning does not work properly

2015-10-06 Thread Aljoscha Krettek
Hi,
I think what you would like to do can be achieved by:

IterativeStream it = in.map(IdentityMap).setParallelism(2).iterate()
DataStream mapped = it.map(...)
 it.closeWith(mapped.partitionByHash(someField))

The input is rebalanced to the map inside the iteration as in your example
and the feedback should be partitioned by hash.

Cheers,
Aljoscha


On Tue, 6 Oct 2015 at 00:11 Gyula Fóra  wrote:

> Hey,
>
> This question is mainly targeted towards Aljoscha but maybe someone can
> help me out here:
>
> I think the way feedback partitioning is handled does not work, let me
> illustrate with a simple example:
>
> IterativeStream it = ... (parallelism 1)
> DataStream mapped = it.map(...) (parallelism 2)
> // this does not work as the feedback has parallelism 2 != 1
> // it.closeWith(mapped.partitionByHash(someField))
> // so we need rebalance the data
>
> it.closeWith(mapped.map(NoOpMap).setParallelism(1).partitionByHash(someField))
>
> This program will execute but the feedback will not be partitioned by hash
> to the mapper instances:
> The partitioning will be set from the noOpMap to the iteration sink which
> has parallelism different from the mapper (1 vs 2) and then the iteration
> source forwards the element to the mapper (always to 0).
>
> So the problem is basically that the iteration source/sink pair gets the
> parallelism of the input stream (p=1) not the head operator (p = 2) which
> leads to incorrect partitioning.
>
> Did I miss something here?
>
> Cheers,
> Gyula
>
>