RE: User Feedback

2016-02-09 Thread Ken Krugler

> From: Vasiliki Kalavri
> Sent: February 9, 2016 10:54:51am PST
> To: dev@flink.apache.org
> Cc: Martin Junghanns
> Subject: Re: User Feedback
> 
> Hi Martin,
> 
> thank you for the feedback. Let me try to answer some of your concerns.
> 
> 
> On 9 February 2016 at 15:35, Martin Neumann  wrote:
> 
>> During this year's FOSDEM, Martin Junghans and I sat together and gathered
>> some feedback for the Flink project. It is based on our personal experience
>> as well as the feedback and questions from people we taught the system.
>> This is going to be a longer email, so I have split things into
>> categories:
>> 
>> 
>> *Website and Documentation:*
>> 
>>   1. *Out-dated Google Search results*: Google searches lead to outdated
>>   web site versions (e.g. “flink transformations” or “flink iterations”
>>   return the 0.7 version of the corresponding pages).
>> 
> 
> I'm not sure we can do much about this. I would suggest searching in the
> documentation instead of relying on Google.

This issue (Google finding out-of-date documentation) impacts many open source 
projects.

And everyone does in fact use Google :)

Wouldn't adding a sitemap help here?

Regards,

-- Ken

[snip]

--
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr


Re: User Feedback

2016-02-09 Thread Vasiliki Kalavri
Hi Martin,

thank you for the feedback. Let me try to answer some of your concerns.


On 9 February 2016 at 15:35, Martin Neumann  wrote:

> During this year's FOSDEM Martin Junghans and I set together and gathered
> some feedback for the Flink project. It is based on our personal experience
> as well as the feedback and questions from People we taught the system.
> This is going to be a longer email therefore I have split things into
> categories:
>
>
> *Website and Documentation:*
>
>1. *Out-dated Google Search results*: Google searches lead to outdated
>web site versions (e.g. “flink transformations” or “flink iterations”
>return the 0.7 version of the corresponding pages).
>

I'm not sure we can do much about this. I would suggest searching in the
documentation instead of relying on Google.
There is a search box at the top of all documentation pages.



>2. *Invalid Links on Website: *Links are confusing / broken (e.g. the
>Gelly/ML links on the start page lead to the top of the feature page,
>which starts with streaming) *-> maybe this can be validated
>automatically?*
>
>
That was a bug that was recently reported and fixed (see FLINK-3316). If you
find more of those, please report them by opening a JIRA issue or a pull request.



>
> *Batch API:*
>
>1. *.reduceGroup(GroupReduceFunction) and
>.groupCombine(CombineGroupFunction): *In other functions such as
>.flatMap(FlatMapFunction) the function call matches the naming of the
>operator. This structure is quite convenient for new users, since they
>can make use of the autocompletion features of the IDE: basically, start
>typing the function call and you get the correct class. This does not
>work for .reduceGroup() and .groupCombine(), since the names are switched
>around. *-> maybe the function can be renamed*
>

I agree this might be strange for new users, but I think it will be much
more annoying for existing users if we change this. In my view, it's not an
important enough case to justify breaking the API.
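
For illustration, a minimal sketch of the naming mismatch (not from the
thread; assumes the 1.0-era Scala API with the Java function classes):

{code}
import org.apache.flink.api.common.functions.{FlatMapFunction, GroupReduceFunction}
import org.apache.flink.api.scala._
import org.apache.flink.util.Collector

object NamingMismatch {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val words = env.fromElements("a", "b", "a")

    // Matching names: the .flatMap operator takes a FlatMapFunction, so
    // typing the class name in an IDE leads straight to the operator.
    val upper = words.flatMap(new FlatMapFunction[String, String] {
      override def flatMap(w: String, out: Collector[String]): Unit =
        out.collect(w.toUpperCase)
    })

    // Switched names: the .reduceGroup operator takes a GroupReduceFunction,
    // so autocompletion on "GroupReduce..." does not lead to the operator.
    val counts = upper.groupBy(w => w).reduceGroup(
      new GroupReduceFunction[String, (String, Int)] {
        override def reduce(values: java.lang.Iterable[String],
                            out: Collector[(String, Int)]): Unit = {
          var key: String = null
          var n = 0
          val it = values.iterator()
          while (it.hasNext) { key = it.next(); n += 1 }
          out.collect((key, n))
        }
      })

    counts.print()
  }
}
{code}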



>2. *.print() and env.execute(): *Often .print() is used for debugging
>and developing programs, replacing regular data sinks. Such a project
>will not run until the env.execute() is removed. It's very easy to forget
>to add it back in once you change the .print() back to a proper sink. The
>project will now compile fine but will not produce any output, since
>.execute() is missing. This is a very difficult bug to find, especially
>since there is no warning or error when running the job. It’s common that
>people use more than one .print() statement during debugging and
>development. This can lead to confusion, since each .print() forces the
>program to execute, so the execution behavior is different than without
>the print. This is especially important if the program contains
>non-deterministic data generation (like generating IDs). In the Stream
>API, .print() does not require removing .execute(); as a result, the
>behavior of the two interfaces is inconsistent.
>

This is indeed an issue that many users find hard to get used to. We have
changed the behavior of print() a couple of times before and I'm not sure
it would be wise to do so again. Actually, once a user understands the
difference between eager and lazy sinks, I think it's quite easy to avoid
mistakes.
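
To make the eager/lazy distinction concrete, a minimal sketch (illustrative
only; the output path is a placeholder):

{code}
import org.apache.flink.api.scala._

object EagerVsLazySinks {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val data = env.fromElements(1, 2, 3)

    // print() is an eager sink: it triggers execution by itself, so no
    // env.execute() is needed -- and calling env.execute() with no further
    // sinks defined fails at runtime.
    data.print()

    // writeAsText() is a lazy sink: nothing happens until env.execute().
    // Swapping print() for this and forgetting to re-add env.execute()
    // yields a program that compiles but silently produces no output.
    data.writeAsText("/tmp/out") // placeholder path
    env.execute()
  }
}
{code}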



>3. *calling new when applying an operator, e.g. .reduceGroup(new
>GroupReduceFunction()): *Some of the people I taught the APIs to were
>confused by this. They knew it was a distributed system and they were
>wondering where the constructor would actually be called. They expected
>to hand a class to the function that would be initialized on each of the
>worker nodes. *-> maybe have a section about this in the documentation*
>

I'm not sure I understand the confusion with this one. The goal of
high-level APIs is to relieve the users from having to think about
distribution. The only thing they need to understand is the
DataSet/DataStream abstractions and how to create transformations on them.
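
Such a documentation section could build on a sketch like this (illustrative,
not from the thread): the function object is constructed once in the client
program, serialized, and shipped to the workers, while per-worker setup
belongs in open() of a rich function:

{code}
import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.api.scala._
import org.apache.flink.configuration.Configuration

// Constructed once on the client when the plan is built; the instance is
// then serialized and deserialized on every parallel task.
class LookupMapper extends RichMapFunction[String, String] {
  @transient private var cache: Map[String, String] = _

  // Runs once per parallel task on the worker -- the place for setup that
  // should not (or cannot) travel through serialization.
  override def open(parameters: Configuration): Unit = {
    cache = Map("a" -> "alpha", "b" -> "beta")
  }

  override def map(value: String): String = cache.getOrElse(value, value)
}

object WhereNewRuns {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    env.fromElements("a", "b", "c").map(new LookupMapper).print()
  }
}
{code}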


>4. *.project() loses type information / does not support .returns(..):
>*The project transformation currently loses type information, which
>affects chained calls with other transformations. One workaround is the
>definition of an intermediate dataset. However, to be consistent with
>other operators, project should support .returns() to define the type
>information if needed.
>
>
I'm not sure _why_ this is the case. Maybe someone who knows more can
clarify this one.



>
> *Stream API:*
>
>1. *.keyBy(): *Currently .keyBy() creates a KeyedDataStream, but every
>operator that consumes a KeyedDataStream produces a DataStream. This
>means it is not possible to create a program that uses a keyBy() followed
>by a sequence of transformations for each key without having to reapply
>keyBy() after each of 

Re: Case style anonymous functions not supported by Scala API

2016-02-09 Thread Till Rohrmann
Not the DataSet but the JoinDataSet and the CoGroupDataSet do in the form
of an apply function.
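
For reference, a minimal sketch of that apply method (illustrative, assuming
the current Scala batch API; parameter types are written out because apply is
itself overloaded):

{code}
import org.apache.flink.api.scala._

object JoinApply {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val names  = env.fromElements((1, "a"), (2, "b"))
    val scores = env.fromElements((1, 10.0), (2, 20.0))

    // join(...).where(...).equalTo(...) yields a JoinDataSet; its apply
    // takes a plain Scala function of the two joined elements, so an
    // AcceptPartialFunctions-style extension would need to cover it too.
    val joined = names.join(scores).where(0).equalTo(0)
      .apply((n: (Int, String), s: (Int, Double)) => (n._1, n._2, s._2))

    joined.print()
  }
}
{code}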

On Tue, Feb 9, 2016 at 11:09 AM, Stefano Baghino <
stefano.bagh...@radicalbit.io> wrote:

> Sure, it was just a draft. I agree that filter and mapPartition make sense,
> but coGroup and join don't look like they take a function.
>
> On Tue, Feb 9, 2016 at 10:08 AM, Till Rohrmann 
> wrote:
>
> > This looks like a good design to me :-) The only thing is that it is not
> > complete. For example, the filter, mapPartition, coGroup and join
> functions
> > are missing.
> >
> > Cheers,
> > Till
> > ​
> >
> > On Tue, Feb 9, 2016 at 1:18 AM, Stefano Baghino <
> > stefano.bagh...@radicalbit.io> wrote:
> >
> > > What do you think of something like this?
> > >
> > >
> > >
> >
> https://github.com/radicalbit/flink/commit/21a889a437875c88921c93e87d88a378c6b4299e
> > >
> > > In this way, several extensions can be collected in this package object
> > and
> > > picked all together or à la carte (e.g. import
> > > org.apache.flink.api.scala.extensions.AcceptPartialFunctions).
> > >
> > > On Mon, Feb 8, 2016 at 2:51 PM, Till Rohrmann 
> > > wrote:
> > >
> > > > I like the idea to support partial functions with Flink’s Scala API.
> > > > However, I think that breaking the API and making it inconsistent
> with
> > > > respect to the Java API is not the best option. I would rather be in
> > > favour
> > > > of the first proposal where we add a new method xxxWith via implicit
> > > > conversions.
> > > >
> > > > Cheers,
> > > > Till
> > > > ​
> > > >
> > > > On Sun, Feb 7, 2016 at 12:44 PM, Stefano Baghino <
> > > > stefano.bagh...@radicalbit.io> wrote:
> > > >
> > > > > It took me a little time but I was able to put together some code.
> > > > >
> > > > > In this commit I just added a few methods renamed to prevent
> > > overloading,
> > > > > thus usable with PartialFunction instead of functions:
> > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/radicalbit/flink/commit/aacd59e0ce98cccb66d48a30d07990ac8f345748
> > > > >
> > > > > In this other commit I coded the original proposal, renaming the
> > > methods
> > > > to
> > > > > obtain the same effect as before, but with lower friction for Scala
> > > > > developers (and provided some usage examples):
> > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/radicalbit/flink/commit/33403878eebba70def42f73a1cb671d13b1521b5
> > > > >
> > > > > On Thu, Jan 28, 2016 at 6:16 PM, Stefano Baghino <
> > > > > stefano.bagh...@radicalbit.io> wrote:
> > > > >
> > > > > > Hi Stephan,
> > > > > >
> > > > > > thank you for the quick reply and for your feedback; I agree with
> > you
> > > > > that
> > > > > > breaking changes have to be taken very seriously.
> > > > > >
> > > > > > The rationale behind my proposal is that Scala users are already
> > > > > > accustomed to higher-order functions that manipulate collections
> > and
> > > it
> > > > > > would be beneficial for them to have an API that tries to adhere as
> > much
> > > > as
> > > > > > possible to the interface provided by the Scala Collections API.
> > IMHO
> > > > > being
> > > > > > able to manipulate a DataSet or DataStream like a Scala
> collection
> > > > > > idiomatically would appeal to developers and reduce the friction
> > for
> > > > them
> > > > > > to learn Flink.
> > > > > >
> > > > > > If we want to pursue the renaming path, I think these changes
> (and
> > > > > porting
> > > > > > the rest of the codebase, like `flink-ml` and `flink-contrib`, to
> > the
> > > > new
> > > > > > method names) can be done in relatively little time. Since Flink
> is
> > > > > > approaching a major release, I think it's a good time to consider
> > > this
> > > > > > change, if the community deems it relevant.
> > > > > >
> > > > > > While we await feedback on the proposal, I can start working
> on
> > > > both
> > > > > > paths to see how it would affect the codebase, what do you think?
> > > > > >
> > > > > > On Thu, Jan 28, 2016 at 2:14 PM, Stephan Ewen 
> > > > wrote:
> > > > > >
> > > > > >> Hi!
> > > > > >>
> > > > > >> Would be nice to support that, agreed.
> > > > > >>
> > > > > >> Such a fundamental break in the API worries me a bit, though - I
> > > would
> > > > > opt
> > > > > >> for a non-breaking addition.
> > > > > >> Wrapping the RichFunctions into Scala functions (which are
> > actually
> > > > > >> wrapped
> > > > > >> as rich functions) with implicits seems like a workaround for
> > > > something
> > > > > >> that should be very simple. Would probably also cost a bit of
> > > > > performance.
> > > > > >>
> > > > > >>
> > > > > >> I like the idea of "mapWith(...)" - if that were a simple non
> > > > overloaded
> > > > > >> function accepting a Scala function, it should accept case-style
> > > > > >> functions,
> > > > > >> right?
> > > > > >> Simply adding that would probably solve things, but add a second
> > > > variant
> > > > > >> of
> > > > > >> 

Re: Case style anonymous functions not supported by Scala API

2016-02-09 Thread Till Rohrmann
This looks like a good design to me :-) The only thing is that it is not
complete. For example, the filter, mapPartition, coGroup and join functions
are missing.

Cheers,
Till
​

On Tue, Feb 9, 2016 at 1:18 AM, Stefano Baghino <
stefano.bagh...@radicalbit.io> wrote:

> What do you think of something like this?
>
>
> https://github.com/radicalbit/flink/commit/21a889a437875c88921c93e87d88a378c6b4299e
>
> In this way, several extensions can be collected in this package object and
> picked all together or à la carte (e.g. import
> org.apache.flink.api.scala.extensions.AcceptPartialFunctions).
>
> On Mon, Feb 8, 2016 at 2:51 PM, Till Rohrmann 
> wrote:
>
> > I like the idea to support partial functions with Flink’s Scala API.
> > However, I think that breaking the API and making it inconsistent with
> > respect to the Java API is not the best option. I would rather be in
> favour
> > of the first proposal where we add a new method xxxWith via implicit
> > conversions.
> >
> > Cheers,
> > Till
> > ​
> >
> > On Sun, Feb 7, 2016 at 12:44 PM, Stefano Baghino <
> > stefano.bagh...@radicalbit.io> wrote:
> >
> > > It took me a little time but I was able to put together some code.
> > >
> > > In this commit I just added a few methods renamed to prevent
> overloading,
> > > thus usable with PartialFunction instead of functions:
> > >
> > >
> >
> https://github.com/radicalbit/flink/commit/aacd59e0ce98cccb66d48a30d07990ac8f345748
> > >
> > > In this other commit I coded the original proposal, renaming the
> methods
> > to
> > > obtain the same effect as before, but with lower friction for Scala
> > > developers (and provided some usage examples):
> > >
> > >
> >
> https://github.com/radicalbit/flink/commit/33403878eebba70def42f73a1cb671d13b1521b5
> > >
> > > On Thu, Jan 28, 2016 at 6:16 PM, Stefano Baghino <
> > > stefano.bagh...@radicalbit.io> wrote:
> > >
> > > > Hi Stephan,
> > > >
> > > > thank you for the quick reply and for your feedback; I agree with you
> > > that
> > > > breaking changes have to be taken very seriously.
> > > >
> > > > The rationale behind my proposal is that Scala users are already
> > > > accustomed to higher-order functions that manipulate collections and
> it
> > > > would be beneficial for them to have an API that tries to adhere as much
> > as
> > > > possible to the interface provided by the Scala Collections API. IMHO
> > > being
> > > > able to manipulate a DataSet or DataStream like a Scala collection
> > > > idiomatically would appeal to developers and reduce the friction for
> > them
> > > > to learn Flink.
> > > >
> > > > If we want to pursue the renaming path, I think these changes (and
> > > porting
> > > > the rest of the codebase, like `flink-ml` and `flink-contrib`, to the
> > new
> > > > method names) can be done in relatively little time. Since Flink is
> > > > approaching a major release, I think it's a good time to consider
> this
> > > > change, if the community deems it relevant.
> > > >
> > > > While we await feedback on the proposal, I can start working on
> > both
> > > > paths to see how it would affect the codebase, what do you think?
> > > >
> > > > On Thu, Jan 28, 2016 at 2:14 PM, Stephan Ewen 
> > wrote:
> > > >
> > > >> Hi!
> > > >>
> > > >> Would be nice to support that, agreed.
> > > >>
> > > >> Such a fundamental break in the API worries me a bit, though - I
> would
> > > opt
> > > >> for a non-breaking addition.
> > > >> Wrapping the RichFunctions into Scala functions (which are actually
> > > >> wrapped
> > > >> as rich functions) with implicits seems like a workaround for
> > something
> > > >> that should be very simple. Would probably also cost a bit of
> > > performance.
> > > >>
> > > >>
> > > >> I like the idea of "mapWith(...)" - if that were a simple non
> > overloaded
> > > >> function accepting a Scala function, it should accept case-style
> > > >> functions,
> > > >> right?
> > > >> Simply adding that would probably solve things, but add a second
> > variant
> > > >> of
> > > >> each function to the DataSet. An implicit conversion from DataSet to
> > > >> DataSetExtended (which implements the mapWith, reduceWith, ...)
> > methods
> > > >> could help there...
> > > >>
> > > >> What do you think?
> > > >>
> > > >> Greetings,
> > > >> Stephan
> > > >>
> > > >>
> > > >> On Thu, Jan 28, 2016 at 2:05 PM, Stefano Baghino <
> > > >> stefano.bagh...@radicalbit.io> wrote:
> > > >>
> > > >> > Hello everybody,
> > > >> >
> > > >> > as I'm getting familiar with Flink I've found a possible
> improvement
> > > to
> > > >> the
> > > >> > Scala APIs: in Scala it's a common pattern to perform tuple
> > extraction
> > > >> > using pattern matching, making functions working on tuples more
> > > >> readable,
> > > >> > like this:
> > > >> >
> > > >> > // referring to the mail count example in the training
> > > >> > // assuming `mails` is a DataSet[(String, String)]
> > > >> > // a pair of date and a string with username and email
> 

Re: Case style anonymous functions not supported by Scala API

2016-02-09 Thread Stefano Baghino
Sure, it was just a draft. I agree that filter and mapPartition make sense,
but coGroup and join don't look like they take a function.

On Tue, Feb 9, 2016 at 10:08 AM, Till Rohrmann  wrote:

> This looks like a good design to me :-) The only thing is that it is not
> complete. For example, the filter, mapPartition, coGroup and join functions
> are missing.
>
> Cheers,
> Till
> ​
>
> On Tue, Feb 9, 2016 at 1:18 AM, Stefano Baghino <
> stefano.bagh...@radicalbit.io> wrote:
>
> > What do you think of something like this?
> >
> >
> >
> https://github.com/radicalbit/flink/commit/21a889a437875c88921c93e87d88a378c6b4299e
> >
> > In this way, several extensions can be collected in this package object
> and
> > picked all together or à la carte (e.g. import
> > org.apache.flink.api.scala.extensions.AcceptPartialFunctions).
> >
> > On Mon, Feb 8, 2016 at 2:51 PM, Till Rohrmann 
> > wrote:
> >
> > > I like the idea to support partial functions with Flink’s Scala API.
> > > However, I think that breaking the API and making it inconsistent with
> > > respect to the Java API is not the best option. I would rather be in
> > favour
> > > of the first proposal where we add a new method xxxWith via implicit
> > > conversions.
> > >
> > > Cheers,
> > > Till
> > > ​
> > >
> > > On Sun, Feb 7, 2016 at 12:44 PM, Stefano Baghino <
> > > stefano.bagh...@radicalbit.io> wrote:
> > >
> > > > It took me a little time but I was able to put together some code.
> > > >
> > > > In this commit I just added a few methods renamed to prevent
> > overloading,
> > > > thus usable with PartialFunction instead of functions:
> > > >
> > > >
> > >
> >
> https://github.com/radicalbit/flink/commit/aacd59e0ce98cccb66d48a30d07990ac8f345748
> > > >
> > > > In this other commit I coded the original proposal, renaming the
> > methods
> > > to
> > > > obtain the same effect as before, but with lower friction for Scala
> > > > developers (and provided some usage examples):
> > > >
> > > >
> > >
> >
> https://github.com/radicalbit/flink/commit/33403878eebba70def42f73a1cb671d13b1521b5
> > > >
> > > > On Thu, Jan 28, 2016 at 6:16 PM, Stefano Baghino <
> > > > stefano.bagh...@radicalbit.io> wrote:
> > > >
> > > > > Hi Stephan,
> > > > >
> > > > > thank you for the quick reply and for your feedback; I agree with
> you
> > > > that
> > > > > breaking changes have to be taken very seriously.
> > > > >
> > > > > The rationale behind my proposal is that Scala users are already
> > > > > accustomed to higher-order functions that manipulate collections
> and
> > it
> > > > > would be beneficial for them to have an API that tries to adhere as
> much
> > > as
> > > > > possible to the interface provided by the Scala Collections API.
> IMHO
> > > > being
> > > > > able to manipulate a DataSet or DataStream like a Scala collection
> > > > > idiomatically would appeal to developers and reduce the friction
> for
> > > them
> > > > > to learn Flink.
> > > > >
> > > > > If we want to pursue the renaming path, I think these changes (and
> > > > porting
> > > > > the rest of the codebase, like `flink-ml` and `flink-contrib`, to
> the
> > > new
> > > > > method names) can be done in relatively little time. Since Flink is
> > > > > approaching a major release, I think it's a good time to consider
> > this
> > > > > change, if the community deems it relevant.
> > > > >
> > > > > While we await feedback on the proposal, I can start working on
> > > both
> > > > > paths to see how it would affect the codebase, what do you think?
> > > > >
> > > > > On Thu, Jan 28, 2016 at 2:14 PM, Stephan Ewen 
> > > wrote:
> > > > >
> > > > >> Hi!
> > > > >>
> > > > >> Would be nice to support that, agreed.
> > > > >>
> > > > >> Such a fundamental break in the API worries me a bit, though - I
> > would
> > > > opt
> > > > >> for a non-breaking addition.
> > > > >> Wrapping the RichFunctions into Scala functions (which are
> actually
> > > > >> wrapped
> > > > >> as rich functions) with implicits seems like a workaround for
> > > something
> > > > >> that should be very simple. Would probably also cost a bit of
> > > > performance.
> > > > >>
> > > > >>
> > > > >> I like the idea of "mapWith(...)" - if that were a simple non
> > > overloaded
> > > > >> function accepting a Scala function, it should accept case-style
> > > > >> functions,
> > > > >> right?
> > > > >> Simply adding that would probably solve things, but add a second
> > > variant
> > > > >> of
> > > > >> each function to the DataSet. An implicit conversion from DataSet
> to
> > > > >> DataSetExtended (which implements the mapWith, reduceWith, ...)
> > > methods
> > > > >> could help there...
> > > > >>
> > > > >> What do you think?
> > > > >>
> > > > >> Greetings,
> > > > >> Stephan
> > > > >>
> > > > >>
> > > > >> On Thu, Jan 28, 2016 at 2:05 PM, Stefano Baghino <
> > > > >> stefano.bagh...@radicalbit.io> wrote:
> > > > >>
> > > > >> > Hello everybody,
> > > > >> >
> > > > 

[jira] [Created] (FLINK-3376) Add an illustration of Event Time and Watermarks to the docs

2016-02-09 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-3376:
---

 Summary: Add an illustration of Event Time and Watermarks to the 
docs
 Key: FLINK-3376
 URL: https://issues.apache.org/jira/browse/FLINK-3376
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 0.10.1
Reporter: Stephan Ewen
Priority: Critical
 Fix For: 1.0.0


Users seem to get confused about how event time and watermarks work.
We need to add documentation with two sections:


1. Event time and watermark progress in general
  - Watermarks are generated at the sources
  - How Watermarks progress through the streaming data flow

2. Ways that users can generate watermarks
  - EventTimeSourceFunctions
  - AscendingTimestampExtractor
  - TimestampExtractor general case
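
A minimal user-facing sketch such documentation could start from (assuming
the Scala DataStream API's assignAscendingTimestamps helper):

{code}
import org.apache.flink.streaming.api.TimeCharacteristic
import org.apache.flink.streaming.api.scala._

object EventTimeSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)

    // (timestamp, payload) records with strictly ascending timestamps
    val events = env.fromElements((1L, "a"), (2L, "b"), (3L, "c"))

    // With ascending timestamps, the watermark can simply trail the latest
    // seen timestamp; out-of-order sources need a general timestamp
    // extractor with an explicit watermark lag instead.
    val withTs = events.assignAscendingTimestamps(_._1)

    withTs.print()
    env.execute("event time sketch")
  }
}
{code}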




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3375) Allow Watermark Generation in the Kafka Source

2016-02-09 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-3375:
---

 Summary: Allow Watermark Generation in the Kafka Source
 Key: FLINK-3375
 URL: https://issues.apache.org/jira/browse/FLINK-3375
 Project: Flink
  Issue Type: Improvement
  Components: Kafka Connector
Affects Versions: 1.0.0
Reporter: Stephan Ewen
 Fix For: 1.0.0


It is a common case that event timestamps are ascending inside one Kafka
partition. Ascending timestamps are easy for users, because they are handled by
ascending timestamp extraction.

If the Kafka source has multiple partitions per source task, then the records 
become out of order before timestamps can be extracted and watermarks can be 
generated.

If we make the FlinkKafkaConsumer an event time source function, it can 
generate watermarks itself. It would internally implement the same logic as the 
regular operators that merge streams, keeping track of event time progress per 
partition and generating watermarks based on the current guaranteed event time 
progress.
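
The merging logic can be sketched roughly as follows (an illustration of the
idea only, not connector code):

{code}
// Tracks event time progress per Kafka partition; the watermark the source
// may emit is the minimum over all partitions -- the guaranteed progress.
class PartitionWatermarkTracker(numPartitions: Int) {
  private val progress = Array.fill(numPartitions)(Long.MinValue)

  /** Record a timestamp seen on a partition and return the safe watermark. */
  def onEvent(partition: Int, timestamp: Long): Long = {
    if (timestamp > progress(partition)) progress(partition) = timestamp
    progress.min
  }
}
{code}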



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3374) CEPITCase testSimplePatternEventTime fails

2016-02-09 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-3374:
--

 Summary: CEPITCase testSimplePatternEventTime fails
 Key: FLINK-3374
 URL: https://issues.apache.org/jira/browse/FLINK-3374
 Project: Flink
  Issue Type: Test
  Components: Tests
Reporter: Ufuk Celebi
Priority: Minor


{code}
testSimplePatternEventTime(org.apache.flink.cep.CEPITCase)  Time elapsed: 1.68 
sec  <<< ERROR!
org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
at 
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:659)
at 
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:605)
at 
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:605)
at 
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at 
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.io.FileNotFoundException: 
/tmp/junit8551910958461988945/junit1329117229388301975.tmp/2 (No such file or 
directory)
at java.io.FileOutputStream.open(Native Method)
at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
at 
org.apache.flink.core.fs.local.LocalDataOutputStream.<init>(LocalDataOutputStream.java:56)
at 
org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:256)
at 
org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:263)
at 
org.apache.flink.api.common.io.FileOutputFormat.open(FileOutputFormat.java:248)
at 
org.apache.flink.api.java.io.TextOutputFormat.open(TextOutputFormat.java:76)
at 
org.apache.flink.streaming.api.functions.sink.FileSinkFunction.open(FileSinkFunction.java:66)
at 
org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
at 
org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:89)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:308)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:208)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
at java.lang.Thread.run(Thread.java:745)

testSimplePatternEventTime(org.apache.flink.cep.CEPITCase)  Time elapsed: 1.68 
sec  <<< FAILURE!
java.lang.AssertionError: Different number of lines in expected and obtained 
result. expected:<1> but was:<0>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at 
org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:306)
at 
org.apache.flink.test.util.TestBaseUtils.compareResultsByLinesInMemory(TestBaseUtils.java:292)
at org.apache.flink.cep.CEPITCase.after(CEPITCase.java:56)
{code}

https://s3.amazonaws.com/flink-logs-us/travis-artifacts/uce/flink/894/894.2.tar.gz

{code}
04:53:46,840 INFO  org.apache.flink.runtime.client.JobClientActor   
 - 02/09/2016 04:53:46  Map -> Sink: Unnamed(2/4) switched to FAILED 
java.io.FileNotFoundException: 
/tmp/junit8551910958461988945/junit1329117229388301975.tmp/2 (No such file or 
directory)
at java.io.FileOutputStream.open(Native Method)
at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
at 
org.apache.flink.core.fs.local.LocalDataOutputStream.<init>(LocalDataOutputStream.java:56)
at 
org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:256)
at 
org.apache.flink.core.fs.local.LocalFileSystem.create(LocalFileSystem.java:263)
at 
org.apache.flink.api.common.io.FileOutputFormat.open(FileOutputFormat.java:248)
at 
org.apache.flink.api.java.io.TextOutputFormat.open(TextOutputFormat.java:76)
at 

[jira] [Created] (FLINK-3373) Using a newer library of Apache HttpClient than 4.2.6 will get class loading problems

2016-02-09 Thread Jakob Sultan Ericsson (JIRA)
Jakob Sultan Ericsson created FLINK-3373:


 Summary: Using a newer library of Apache HttpClient than 4.2.6 
will get class loading problems
 Key: FLINK-3373
 URL: https://issues.apache.org/jira/browse/FLINK-3373
 Project: Flink
  Issue Type: Bug
 Environment: Latest Flink snapshot 1.0
Reporter: Jakob Sultan Ericsson


When I try to use Apache HTTP client 4.5.1 in my Flink job, it crashes
with NoClassDefFound.
This is because it loads some classes from the provided httpclient 4.2.5/6 in
core Flink.

{noformat}
17:05:56,193 INFO  org.apache.flink.runtime.taskmanager.Task
 - DuplicateFilter -> InstallKeyLookup (11/16) switched to FAILED with 
exception.
java.lang.NoSuchFieldError: INSTANCE
at 
org.apache.http.conn.ssl.SSLConnectionSocketFactory.<init>(SSLConnectionSocketFactory.java:144)
at 
org.apache.http.impl.conn.PoolingHttpClientConnectionManager.getDefaultRegistry(PoolingHttpClientConnectionManager.java:109)
at 
org.apache.http.impl.conn.PoolingHttpClientConnectionManager.<init>(PoolingHttpClientConnectionManager.java:116)
...
at 
org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
at 
org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:89)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:305)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:227)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
at java.lang.Thread.run(Thread.java:745)
{noformat}

SSLConnectionSocketFactory finds an earlier version of the
AllowAllHostnameVerifier that does not have the INSTANCE variable (the INSTANCE
variable was probably added in 4.3).

{noformat}
jar tvf lib/flink-dist-1.0-SNAPSHOT.jar |grep AllowAllHostnameVerifier  
   791 Thu Dec 17 09:55:46 CET 2015 
org/apache/http/conn/ssl/AllowAllHostnameVerifier.class
{noformat}

Solutions would be:
- Fix the classloader so that my custom job does not conflict with internal 
flink-core classes... pretty hard
- Remove the dependency somehow.
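
A quick way to confirm which jar a conflicting class was actually loaded from
(a generic JVM diagnostic, not Flink-specific):

{code}
// Prints the code source of the class that actually got loaded; run inside
// the job, it points at flink-dist rather than the user jar.
object WhoLoadedIt {
  def main(args: Array[String]): Unit = {
    val location = Class
      .forName("org.apache.http.conn.ssl.AllowAllHostnameVerifier")
      .getProtectionDomain.getCodeSource.getLocation
    println(location)
  }
}
{code}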



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3378) Consolidate TestingCluster and ForkableFlinkMiniCluster

2016-02-09 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3378:
-

 Summary: Consolidate TestingCluster and ForkableFlinkMiniCluster
 Key: FLINK-3378
 URL: https://issues.apache.org/jira/browse/FLINK-3378
 Project: Flink
  Issue Type: Improvement
  Components: Tests
Affects Versions: 1.0.0
Reporter: Maximilian Michels
 Fix For: 1.0.0


{{TestingCluster}} appears to be outdated and should be replaced by or
consolidated with the {{ForkableFlinkMiniCluster}}. Both clusters start the
testing actors. Additionally, the ForkableFlinkMiniCluster has support for
forking, HA, and restarting actors.

As of now it looks like the use of both is arbitrary. The TestingCluster may 
produce test failures because multiple forked test instances could be trying to 
bind to the same free port.

It looks like the ForkableFlinkMiniCluster should also inherit from
FlinkMiniCluster instead of LocalFlinkMiniCluster, because it overrides all
inherited implementations of LocalFlinkMiniCluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] Release Apache Flink 0.10.2 (RC2)

2016-02-09 Thread Robert Metzger
I would also like to keep this RC. 0.10-SNAPSHOT will be equally stable.

On Tue, Feb 9, 2016 at 2:57 PM, Stephan Ewen  wrote:

> I would be in favor of continuing the release candidate.
> The change is more a feature improvement than a bug fix, and these should come
> before release candidates (otherwise we never get one done).
>
> The 1.0 release is also very close, so this improvement will be available
> anyways pretty soon.
>
> I personally also invested again quite a bit of time into testing this
> release candidate already.
>
>
>
> On Tue, Feb 9, 2016 at 1:51 PM, Fabian Hueske  wrote:
>
> > Thanks Ufuk,
> > +1 for a new RC
> >
> > 2016-02-09 13:49 GMT+01:00 Ufuk Celebi :
> >
> > > Hey Nick,
> > >
> > > I agree that this can be problematic when running multiple jobs on
> > > YARN. Since there is a chance that this might be the last 0.10
> > > release, I would be OK with cancelling the vote for your fix.
> > >
> > > Still, let's hear the opinion of others before doing this. What do you
> > > think?
> > >
> > > – Ufuk
> > >
> > >
> > > On Mon, Feb 8, 2016 at 8:05 PM, Nick Dimiduk 
> > wrote:
> > > > Perhaps too late for the RC, but I've backported FLINK-3293 to this
> > > branch
> > > > via FLINK-3372. Would be nice for those wanting to monitor YARN
> > > > application submissions.
> > > >
> > > > On Mon, Feb 8, 2016 at 9:37 AM, Ufuk Celebi  wrote:
> > > >
> > > >> Dear Flink community,
> > > >>
> > > >> Please vote on releasing the following candidate as Apache Flink
> > version
> > > >> 0.10.2.
> > > >>
> > > >> Please note that this vote has a slightly shorter voting period of
> 48
> > > >> hours. Only a single change has been made since the last release
> > > >> candidate. Since the community has already done extensive testing of
> > the
> > > >> previous release candidate, I'm assuming 48 hours will suffice to
> vote
> > > on
> > > >> this one.
> > > >>
> > > >> The commit to be voted on:
> > > >> e525eb2f1413df238e994d01c909d2b90f1b7709
> > > >>
> > > >> Branch:
> > > >> release-0.10.2-rc2 (see
> > > >>
> > > >>
> > >
> >
> https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git;a=shortlog;h=refs/heads/release-0.10.2-rc2
> > > >> )
> > > >>
> > > >> The release artifacts to be voted on can be found at:
> > > >> http://people.apache.org/~uce/flink-0.10.2-rc2/
> > > >>
> > > >> The release artifacts are signed with the key with fingerprint
> > 9D403309:
> > > >> http://www.apache.org/dist/flink/KEYS
> > > >>
> > > >> The staging repository for this release can be found at:
> > > >>
> > https://repository.apache.org/content/repositories/orgapacheflink-1061
> > > >>
> > > >> -
> > > >>
> > > >> The vote is open for the next 48 hours and passes if a majority of
> at
> > > least
> > > >> three +1 PMC votes are cast.
> > > >>
> > > >> The vote ends on Wednesday February 10, 2016, 18:45 CET.
> > > >>
> > > >> [ ] +1 Release this package as Apache Flink 0.10.2
> > > >> [ ] -1 Do not release this package because ...
> > > >>
> > > >> ===
> > > >>
> > > >> The following commits have been added since the 0.10.2 RC 1:
> > > >>
> > > >> * 2cd0618 - [tools] Properly update all POM versions in release
> script
> > > >> (3 hours ago) 
> > > >>
> > >
> >
>


User Feedback

2016-02-09 Thread Martin Neumann
During this year's FOSDEM, Martin Junghans and I sat together and gathered
some feedback for the Flink project. It is based on our personal experience
as well as the feedback and questions from people we taught the system.
This is going to be a longer email, so I have split things into
categories:


*Website and Documentation:*

   1. *Out-dated Google Search results*: Google searches lead to outdated
   web site versions (e.g. “flink transformations” or “flink iterations”
   return the 0.7 version of the corresponding pages).
   2. *Invalid Links on Website: *Links are confusing / broken (e.g. the
   Gelly/ML links on the start page lead to the top of the feature page,
   which starts with streaming) *-> maybe this can be validated
   automatically?*


*Batch API:*

   1. *.reduceGroup(GroupReduceFunction) and
   .groupCombine(CombineGroupFunction): *In other functions such as
   .flatMap(FlatMapFunction) the function call matches the naming of the
   operator. This structure is quite convenient for new users, since they can
   make use of the autocompletion features of the IDE: basically, start typing
   the function call and you get the correct class. This does not work for
   .reduceGroup() and .groupCombine(), since the names are switched around. *->
   maybe the function can be renamed*
   2. *.print() and env.execute(): *Often .print() is used for debugging
   and developing programs, replacing regular data sinks. Such a project will
   not run until the env.execute() is removed. It's very easy to forget to add
   it back in once you change the .print() back to a proper sink. The project
   will now compile fine but will not produce any output, since .execute() is
   missing. This is a very difficult bug to find, especially since there is no
   warning or error when running the job. It’s common that people use more
   than one .print() statement during debugging and development. This can lead
   to confusion, since each .print() forces the program to execute, so the
   execution behavior is different than without the print. This is especially
   important if the program contains non-deterministic data generation (like
   generating IDs). In the Stream API, .print() does not require removing
   .execute(); as a result, the behavior of the two interfaces is
   inconsistent.
   3. *calling new when applying an operator, e.g. .reduceGroup(new
   GroupReduceFunction()): *Some of the people I taught the APIs to were
   confused by this. They knew it was a distributed system and they were
   wondering where the constructor would actually be called. They expected to
   hand a class to the function that would be initialized on each of the
   worker nodes. *-> maybe have a section about this in the documentation*
   4. *.project() loses type information / does not support .returns(..): *The
   project transformation currently loses type information, which affects
   chained calls with other transformations. One workaround is the definition
   of an intermediate dataset. However, to be consistent with other operators,
   project should support .returns() to define the type information if needed.


*Stream API:*

   1. *.keyBy(): *Currently .keyBy() creates a KeyedDataStream, but every
   operator that consumes a KeyedDataStream produces a DataStream. This means
   it is not possible to create a program that uses a keyBy() followed by a
   sequence of transformations for each key without having to reapply keyBy()
   after each of those operators; see the sketch after this list. (This was a
   common problem in my work for Ericsson and Spotify.)
   2. *split() operator with multiple output types: *It's common to have to
   split a single stream into different streams. For example, a stream
   containing different system events might need to be broken into a stream
   for each type. The current split() operator requires all outputs to have
   the same data type. In cases where there are no direct type hierarchies, the
   user needs to implement a wrapper type to make use of this function. An
   operator similar to split that allows output streams to have different
   types would greatly simplify those use cases.
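
A minimal sketch of the re-keying issue from item 1 (illustrative, assuming
the current Scala DataStream API):

{code}
import org.apache.flink.streaming.api.scala._

object ReKeyExample {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val events = env.fromElements(("a", 1), ("a", 2), ("b", 3))

    // keyBy yields a keyed stream, but map on it returns a plain DataStream:
    val doubled = events.keyBy(0).map(e => (e._1, e._2 * 2))

    // so the next per-key operation has to re-apply keyBy:
    val summed = doubled.keyBy(0).sum(1)

    summed.print()
    env.execute("re-keying example")
  }
}
{code}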


cheers Martin


Re: [VOTE] Release Apache Flink 0.10.2 (RC2)

2016-02-09 Thread Stephan Ewen
I would be in favor of continuing the release candidate.
The change is more a feature improvement than a bug fix, and these should come
before release candidates (otherwise we never get one done).

The 1.0 release is also very close, so this improvement will be available
anyways pretty soon.

I personally also invested again quite a bit of time into testing this
release candidate already.



On Tue, Feb 9, 2016 at 1:51 PM, Fabian Hueske  wrote:

> Thanks Ufuk,
> +1 for a new RC
>
> 2016-02-09 13:49 GMT+01:00 Ufuk Celebi :
>
> > Hey Nick,
> >
> > I agree that this can be problematic when running multiple jobs on
> > YARN. Since there is a chance that this might be the last 0.10
> > release, I would be OK with cancelling the vote for your fix.
> >
> > Still, let's hear the opinion of others before doing this. What do you
> > think?
> >
> > – Ufuk
> >
> >
> > On Mon, Feb 8, 2016 at 8:05 PM, Nick Dimiduk 
> wrote:
> > > Perhaps too late for the RC, but I've backported FLINK-3293 to this
> > branch
> > > via FLINK-3372. Would be nice for those wanting to monitor YARN
> > > application submissions.
> > >
> > > On Mon, Feb 8, 2016 at 9:37 AM, Ufuk Celebi  wrote:
> > >
> > >> Dear Flink community,
> > >>
> > >> Please vote on releasing the following candidate as Apache Flink
> version
> > >> 0.10.2.
> > >>
> > >> Please note that this vote has a slightly shorter voting period of 48
> > >> hours. Only a single change has been made since the last release
> > >> candidate. Since the community has already done extensive testing of
> the
> > >> previous release candidate, I'm assuming 48 hours will suffice to vote
> > on
> > >> this one.
> > >>
> > >> The commit to be voted on:
> > >> e525eb2f1413df238e994d01c909d2b90f1b7709
> > >>
> > >> Branch:
> > >> release-0.10.2-rc2 (see
> > >>
> > >>
> >
> https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git;a=shortlog;h=refs/heads/release-0.10.2-rc2
> > >> )
> > >>
> > >> The release artifacts to be voted on can be found at:
> > >> http://people.apache.org/~uce/flink-0.10.2-rc2/
> > >>
> > >> The release artifacts are signed with the key with fingerprint
> 9D403309:
> > >> http://www.apache.org/dist/flink/KEYS
> > >>
> > >> The staging repository for this release can be found at:
> > >>
> https://repository.apache.org/content/repositories/orgapacheflink-1061
> > >>
> > >> -
> > >>
> > >> The vote is open for the next 48 hours and passes if a majority of at
> > least
> > >> three +1 PMC votes are cast.
> > >>
> > >> The vote ends on Wednesday February 10, 2016, 18:45 CET.
> > >>
> > >> [ ] +1 Release this package as Apache Flink 0.10.2
> > >> [ ] -1 Do not release this package because ...
> > >>
> > >> ===
> > >>
> > >> The following commits have been added since the 0.10.2 RC 1:
> > >>
> > >> * 2cd0618 - [tools] Properly update all POM versions in release script
> > >> (3 hours ago) 
> > >>
> >
>


Re: Case style anonymous functions not supported by Scala API

2016-02-09 Thread Stefano Baghino
I see, thanks for the tip! I'll work on it; meanwhile, I've added some
functions and Scaladoc:
https://github.com/radicalbit/flink/blob/1159-implicit/flink-scala/src/main/scala/org/apache/flink/api/scala/extensions/package.scala
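
For readers following along, the shape of the extension is roughly this (a
condensed sketch of the idea; the actual code is in the branch linked above):

{code}
import org.apache.flink.api.common.typeinfo.TypeInformation
import org.apache.flink.api.scala._
import scala.reflect.ClassTag

object AcceptPartialFunctionsSketch {
  // Non-overloaded aliases added via an implicit wrapper, so that
  // case-style (pattern-matching) anonymous functions type-check.
  implicit class OnDataSet[T](val ds: DataSet[T]) extends AnyVal {
    def mapWith[R: TypeInformation: ClassTag](fun: T => R): DataSet[R] =
      ds.map(fun)
    def filterWith(fun: T => Boolean): DataSet[T] =
      ds.filter(fun)
  }

  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val pairs = env.fromElements((1, "a"), (2, "b"))

    // The plain, overloaded map rejects this case-style function;
    // the single-signature mapWith accepts it.
    val names = pairs.mapWith { case (_, name) => name }
    names.print()
  }
}
{code}

Whether such extensions should require an explicit import or be the default
behavior is the open question in the rest of the thread.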

On Tue, Feb 9, 2016 at 12:01 PM, Till Rohrmann  wrote:

> Not the DataSet but the JoinDataSet and the CoGroupDataSet do in the form
> of an apply function.
> ​
>
> On Tue, Feb 9, 2016 at 11:09 AM, Stefano Baghino <
> stefano.bagh...@radicalbit.io> wrote:
>
> > Sure, it was just a draft. I agree that filter and mapPartition make
> sense,
> > but coGroup and join don't look like they take a function.
> >
> > On Tue, Feb 9, 2016 at 10:08 AM, Till Rohrmann 
> > wrote:
> >
> > > This looks like a good design to me :-) The only thing is that it is
> not
> > > complete. For example, the filter, mapPartition, coGroup and join
> > functions
> > > are missing.
> > >
> > > Cheers,
> > > Till
> > > ​
> > >
> > > On Tue, Feb 9, 2016 at 1:18 AM, Stefano Baghino <
> > > stefano.bagh...@radicalbit.io> wrote:
> > >
> > > > What do you think of something like this?
> > > >
> > > >
> > > >
> > >
> >
> https://github.com/radicalbit/flink/commit/21a889a437875c88921c93e87d88a378c6b4299e
> > > >
> > > > In this way, several extensions can be collected in this package
> object
> > > and
> > > > picked all together or à la carte (e.g. import
> > > > org.apache.flink.api.scala.extensions.AcceptPartialFunctions).
> > > >
> > > > On Mon, Feb 8, 2016 at 2:51 PM, Till Rohrmann 
> > > > wrote:
> > > >
> > > > > I like the idea to support partial functions with Flink’s Scala
> API.
> > > > > However, I think that breaking the API and making it inconsistent
> > with
> > > > > respect to the Java API is not the best option. I would rather be
> in
> > > > favour
> > > > > of the first proposal where we add a new method xxxWith via
> implicit
> > > > > conversions.
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > > ​
> > > > >
> > > > > On Sun, Feb 7, 2016 at 12:44 PM, Stefano Baghino <
> > > > > stefano.bagh...@radicalbit.io> wrote:
> > > > >
> > > > > > It took me a little time but I was able to put together some
> code.
> > > > > >
> > > > > > In this commit I just added a few methods renamed to prevent
> > > > overloading,
> > > > > > thus usable with PartialFunction instead of functions:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/radicalbit/flink/commit/aacd59e0ce98cccb66d48a30d07990ac8f345748
> > > > > >
> > > > > > In this other commit I coded the original proposal, renaming the
> > > > methods
> > > > > to
> > > > > > obtain the same effect as before, but with lower friction for
> Scala
> > > > > > developers (and provided some usage examples):
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/radicalbit/flink/commit/33403878eebba70def42f73a1cb671d13b1521b5
> > > > > >
> > > > > > On Thu, Jan 28, 2016 at 6:16 PM, Stefano Baghino <
> > > > > > stefano.bagh...@radicalbit.io> wrote:
> > > > > >
> > > > > > > Hi Stephan,
> > > > > > >
> > > > > > > thank you for the quick reply and for your feedback; I agree
> with
> > > you
> > > > > > that
> > > > > > > > breaking changes have to be taken very seriously.
> > > > > > >
> > > > > > > The rationale behind my proposal is that Scala users are
> already
> > > > > > > accustomed to higher-order functions that manipulate
> collections
> > > and
> > > > it
> > > > > > > would be beneficial for them to have an API that tries to adhere
> as
> > > much
> > > > > as
> > > > > > > possible to the interface provided by the Scala Collections
> API.
> > > IMHO
> > > > > > being
> > > > > > > able to manipulate a DataSet or DataStream like a Scala
> > collection
> > > > > > > idiomatically would appeal to developers and reduce the
> friction
> > > for
> > > > > them
> > > > > > > to learn Flink.
> > > > > > >
> > > > > > > If we want to pursue the renaming path, I think these changes
> > (and
> > > > > > porting
> > > > > > > the rest of the codebase, like `flink-ml` and `flink-contrib`,
> to
> > > the
> > > > > new
> > > > > > > method names) can be done in relatively little time. Since
> Flink
> > is
> > > > > > > approaching a major release, I think it's a good time to
> consider
> > > > this
> > > > > > > change, if the community deems it relevant.
> > > > > > >
> > > > > > > While we await feedback on the proposal, I can start
> working
> > on
> > > > > both
> > > > > > > paths to see how it would affect the codebase, what do you
> think?
> > > > > > >
> > > > > > > On Thu, Jan 28, 2016 at 2:14 PM, Stephan Ewen <
> se...@apache.org>
> > > > > wrote:
> > > > > > >
> > > > > > >> Hi!
> > > > > > >>
> > > > > > >> Would be nice to support that, agreed.
> > > > > > >>
> > > > > > >> Such a fundamental break in the API worries me a bit, though
> - I
> > > > would
> > > > > > opt
> > > > > > >> for a non-breaking addition.
> > > > > > >> Wrapping the RichFunctions 

Re: [VOTE] Release Apache Flink 0.10.2 (RC2)

2016-02-09 Thread Ufuk Celebi
Hey Nick,

I agree that this can be problematic when running multiple jobs on
YARN. Since there is a chance that this might be the last 0.10
release, I would be OK with cancelling the vote for your fix.

Still, let's hear the opinion of others before doing this. What do you think?

– Ufuk


On Mon, Feb 8, 2016 at 8:05 PM, Nick Dimiduk  wrote:
> Perhaps too late for the RC, but I've backported FLINK-3293 to this branch
> via FLINK-3372. Would be nice for those wanting to monitor YARN
> application submissions.
>
> On Mon, Feb 8, 2016 at 9:37 AM, Ufuk Celebi  wrote:
>
>> Dear Flink community,
>>
>> Please vote on releasing the following candidate as Apache Flink version
>> 0.10.2.
>>
>> Please note that this vote has a slightly shorter voting period of 48
>> hours. Only a single change has been made since the last release
>> candidate. Since the community has already done extensive testing of the
>> previous release candidate, I'm assuming 48 hours will suffice to vote on
>> this one.
>>
>> The commit to be voted on:
>> e525eb2f1413df238e994d01c909d2b90f1b7709
>>
>> Branch:
>> release-0.10.2-rc2 (see
>>
>> https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git;a=shortlog;h=refs/heads/release-0.10.2-rc2
>> )
>>
>> The release artifacts to be voted on can be found at:
>> http://people.apache.org/~uce/flink-0.10.2-rc2/
>>
>> The release artifacts are signed with the key with fingerprint 9D403309:
>> http://www.apache.org/dist/flink/KEYS
>>
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapacheflink-1061
>>
>> -
>>
>> The vote is open for the next 48 hours and passes if a majority of at least
>> three +1 PMC votes are cast.
>>
>> The vote ends on Wednesday February 10, 2016, 18:45 CET.
>>
>> [ ] +1 Release this package as Apache Flink 0.10.2
>> [ ] -1 Do not release this package because ...
>>
>> ===
>>
>> The following commits have been added since the 0.10.2 RC 1:
>>
>> * 2cd0618 - [tools] Properly update all POM versions in release script
>> (3 hours ago) 
>>


Re: Case style anonymous functions not supported by Scala API

2016-02-09 Thread Stefano Baghino
I agree with you, but I acknowledge that there may be concerns regarding
the stability of the API. Perhaps the rationale behind the proposal of
Stephan and Till is to provide it as an extension to test how
developers feel about it. It would be ideal to have broader feedback from
the community. However, I have to admit I like the approach.

On Tue, Feb 9, 2016 at 2:43 PM, Theodore Vasiloudis <
theodoros.vasilou...@gmail.com> wrote:

> Thanks for bringing this up, Stefano; it would be a very welcome addition
> indeed.
>
> I like the approach of having extensions through implicits as well. IMHO
> though this should be the default
> behavior, without the need to add another import.
>
> On Tue, Feb 9, 2016 at 1:29 PM, Stefano Baghino <
> stefano.bagh...@radicalbit.io> wrote:
>
> > I see, thanks for the tip! I'll work on it; meanwhile, I've added some
> > functions and Scaladoc:
> >
> >
> https://github.com/radicalbit/flink/blob/1159-implicit/flink-scala/src/main/scala/org/apache/flink/api/scala/extensions/package.scala
> >
> > On Tue, Feb 9, 2016 at 12:01 PM, Till Rohrmann 
> > wrote:
> >
> > > Not the DataSet but the JoinDataSet and the CoGroupDataSet do in the
> form
> > > of an apply function.
> > > ​
> > >
> > > On Tue, Feb 9, 2016 at 11:09 AM, Stefano Baghino <
> > > stefano.bagh...@radicalbit.io> wrote:
> > >
> > > > Sure, it was just a draft. I agree that filter and mapPartition make
> > > sense,
> > > > but coGroup and join don't look like they take a function.
> > > >
> > > > On Tue, Feb 9, 2016 at 10:08 AM, Till Rohrmann  >
> > > > wrote:
> > > >
> > > > > This looks like a good design to me :-) The only thing is that it
> is
> > > not
> > > > > complete. For example, the filter, mapPartition, coGroup and join
> > > > functions
> > > > > are missing.
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > > ​
> > > > >
> > > > > On Tue, Feb 9, 2016 at 1:18 AM, Stefano Baghino <
> > > > > stefano.bagh...@radicalbit.io> wrote:
> > > > >
> > > > > > What do you think of something like this?
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/radicalbit/flink/commit/21a889a437875c88921c93e87d88a378c6b4299e
> > > > > >
> > > > > > In this way, several extensions can be collected in this package
> > > object
> > > > > and
> > > > > > picked all together or à la carte (e.g. import
> > > > > > org.apache.flink.api.scala.extensions.AcceptPartialFunctions).
> > > > > >
> > > > > > On Mon, Feb 8, 2016 at 2:51 PM, Till Rohrmann <
> > trohrm...@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > I like the idea to support partial functions with Flink’s Scala
> > > API.
> > > > > > > However, I think that breaking the API and making it
> inconsistent
> > > > with
> > > > > > > respect to the Java API is not the best option. I would rather
> be
> > > in
> > > > > > favour
> > > > > > > of the first proposal where we add a new method xxxWith via
> > > implicit
> > > > > > > conversions.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Till
> > > > > > > ​
> > > > > > >
> > > > > > > On Sun, Feb 7, 2016 at 12:44 PM, Stefano Baghino <
> > > > > > > stefano.bagh...@radicalbit.io> wrote:
> > > > > > >
> > > > > > > > It took me a little time but I was able to put together some
> > > code.
> > > > > > > >
> > > > > > > > In this commit I just added a few methods renamed to prevent
> > > > > > overloading,
> > > > > > > > thus usable with PartialFunction instead of functions:
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/radicalbit/flink/commit/aacd59e0ce98cccb66d48a30d07990ac8f345748
> > > > > > > >
> > > > > > > > In this other commit I coded the original proposal, renaming
> > the
> > > > > > methods
> > > > > > > to
> > > > > > > > obtain the same effect as before, but with lower friction for
> > > Scala
> > > > > > > > developers (and provided some usage examples):
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/radicalbit/flink/commit/33403878eebba70def42f73a1cb671d13b1521b5
> > > > > > > >
> > > > > > > > On Thu, Jan 28, 2016 at 6:16 PM, Stefano Baghino <
> > > > > > > > stefano.bagh...@radicalbit.io> wrote:
> > > > > > > >
> > > > > > > > > Hi Stephan,
> > > > > > > > >
> > > > > > > > > thank you for the quick reply and for your feedback; I
> agree
> > > with
> > > > > you
> > > > > > > > that
> > > > > > > > > breaking changes have to be taken very seriously.
> > > > > > > > >
> > > > > > > > > The rationale behind my proposal is that Scala users are
> > > already
> > > > > > > > > accustomed to higher-order functions that manipulate
> > > collections
> > > > > and
> > > > > > it
> > > > > > > > > would be beneficial for them to have an API that tries to
> adhere
> > > as
> > > > > much
> > > > > > > as
> > > > > > > > > possible to the interface provided by the Scala Collections
> > > API.
> > > > > IMHO
> > > > > > > > 

Re: [VOTE] Release Apache Flink 0.10.2 (RC2)

2016-02-09 Thread Fabian Hueske
Thanks Ufuk,
+1 for a new RC

2016-02-09 13:49 GMT+01:00 Ufuk Celebi :

> Hey Nick,
>
> I agree that this can be problematic when running multiple jobs on
> YARN. Since there is a chance that this might be the last 0.10
> release, I would be OK with cancelling the vote for your fix.
>
> Still, let's hear the opinion of others before doing this. What do you
> think?
>
> – Ufuk
>
>
> On Mon, Feb 8, 2016 at 8:05 PM, Nick Dimiduk  wrote:
> > Perhaps too late for the RC, but I've backported FLINK-3293 to this
> branch
> > via FLINK-3372. Would be nice for those wanting to monitor YARN
> > application submissions.
> >
> > On Mon, Feb 8, 2016 at 9:37 AM, Ufuk Celebi  wrote:
> >
> >> Dear Flink community,
> >>
> >> Please vote on releasing the following candidate as Apache Flink version
> >> 0.10.2.
> >>
> >> Please note that this vote has a slightly shorter voting period of 48
> >> hours. Only a single change has been made since the last release
> >> candidate. Since the community has already done extensive testing of the
> >> previous release candidate, I'm assuming 48 hours will suffice to vote
> on
> >> this one.
> >>
> >> The commit to be voted on:
> >> e525eb2f1413df238e994d01c909d2b90f1b7709
> >>
> >> Branch:
> >> release-0.10.2-rc2 (see
> >>
> >>
> https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git;a=shortlog;h=refs/heads/release-0.10.2-rc2
> >> )
> >>
> >> The release artifacts to be voted on can be found at:
> >> http://people.apache.org/~uce/flink-0.10.2-rc2/
> >>
> >> The release artifacts are signed with the key with fingerprint 9D403309:
> >> http://www.apache.org/dist/flink/KEYS
> >>
> >> The staging repository for this release can be found at:
> >> https://repository.apache.org/content/repositories/orgapacheflink-1061
> >>
> >> -
> >>
> >> The vote is open for the next 48 hours and passes if a majority of at
> least
> >> three +1 PMC votes are cast.
> >>
> >> The vote ends on Wednesday February 10, 2016, 18:45 CET.
> >>
> >> [ ] +1 Release this package as Apache Flink 0.10.2
> >> [ ] -1 Do not release this package because ...
> >>
> >> ===
> >>
> >> The following commits have been added since the 0.10.2 RC 1:
> >>
> >> * 2cd0618 - [tools] Properly update all POM versions in release script
> >> (3 hours ago) 
> >>
>


Re: Case style anonymous functions not supported by Scala API

2016-02-09 Thread Theodore Vasiloudis
Thanks for bringing this up, Stefano; it would be a very welcome addition
indeed.

I like the approach of having extensions through implicits as well. IMHO
though this should be the default
behavior, without the need to add another import.

On Tue, Feb 9, 2016 at 1:29 PM, Stefano Baghino <
stefano.bagh...@radicalbit.io> wrote:

> I see, thanks for the tip! I'll work on it; meanwhile, I've added some
> functions and Scaladoc:
>
> https://github.com/radicalbit/flink/blob/1159-implicit/flink-scala/src/main/scala/org/apache/flink/api/scala/extensions/package.scala
>
> On Tue, Feb 9, 2016 at 12:01 PM, Till Rohrmann 
> wrote:
>
> > Not the DataSet but the JoinDataSet and the CoGroupDataSet do in the form
> > of an apply function.
> > ​
> >
> > On Tue, Feb 9, 2016 at 11:09 AM, Stefano Baghino <
> > stefano.bagh...@radicalbit.io> wrote:
> >
> > > Sure, it was just a draft. I agree that filter and mapPartition make
> > sense,
> > > but coGroup and join don't look like they take a function.
> > >
> > > On Tue, Feb 9, 2016 at 10:08 AM, Till Rohrmann 
> > > wrote:
> > >
> > > > This looks like a good design to me :-) The only thing is that it is
> > not
> > > > complete. For example, the filter, mapPartition, coGroup and join
> > > functions
> > > > are missing.
> > > >
> > > > Cheers,
> > > > Till
> > > > ​
> > > >
> > > > On Tue, Feb 9, 2016 at 1:18 AM, Stefano Baghino <
> > > > stefano.bagh...@radicalbit.io> wrote:
> > > >
> > > > > What do you think of something like this?
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/radicalbit/flink/commit/21a889a437875c88921c93e87d88a378c6b4299e
> > > > >
> > > > > In this way, several extensions can be collected in this package
> > object
> > > > and
> > > > > picked altogether or à la carte (e.g. import
> > > > > org.apache.flink.api.scala.extensions.AcceptPartialFunctions).
> > > > >
> > > > > On Mon, Feb 8, 2016 at 2:51 PM, Till Rohrmann <
> trohrm...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > I like the idea to support partial functions with Flink’s Scala
> > API.
> > > > > > However, I think that breaking the API and making it inconsistent
> > > with
> > > > > > respect to the Java API is not the best option. I would rather be
> > in
> > > > > favour
> > > > > > of the first proposal where we add a new method xxxWith via
> > implicit
> > > > > > conversions.
> > > > > >
> > > > > > Cheers,
> > > > > > Till
> > > > > > ​
> > > > > >
> > > > > > On Sun, Feb 7, 2016 at 12:44 PM, Stefano Baghino <
> > > > > > stefano.bagh...@radicalbit.io> wrote:
> > > > > >
> > > > > > > It took me a little time but I was able to put together some
> > code.
> > > > > > >
> > > > > > > In this commit I just added a few methods renamed to prevent
> > > > > overloading,
> > > > > > > thus usable with PartialFunction instead of functions:
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/radicalbit/flink/commit/aacd59e0ce98cccb66d48a30d07990ac8f345748
> > > > > > >
> > > > > > > In this other commit I coded the original proposal, renaming
> the
> > > > > methods
> > > > > > to
> > > > > > > obtain the same effect as before, but with lower friction for
> > Scala
> > > > > > > developers (and provided some usage examples):
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/radicalbit/flink/commit/33403878eebba70def42f73a1cb671d13b1521b5
> > > > > > >
> > > > > > > On Thu, Jan 28, 2016 at 6:16 PM, Stefano Baghino <
> > > > > > > stefano.bagh...@radicalbit.io> wrote:
> > > > > > >
> > > > > > > > Hi Stephan,
> > > > > > > >
> > > > > > > > thank you for the quick reply and for your feedback; I agree
> > with
> > > > you
> > > > > > > that
> > > > > > > > breaking changes have to be taken very seriously.
> > > > > > > >
> > > > > > > > The rationale behind my proposal is that Scala users are
> > already
> > > > > > > > accustomed to higher-order functions that manipulate
> > collections
> > > > and
> > > > > it
> > > > > > > > would be beneficial for them to have an API that tries to adhere
> > as
> > > > much
> > > > > > as
> > > > > > > > possible to the interface provided by the Scala Collections
> > API.
> > > > IMHO
> > > > > > > being
> > > > > > > > able to manipulate a DataSet or DataStream like a Scala
> > > collection
> > > > > > > > idiomatically would appeal to developers and reduce the
> > friction
> > > > for
> > > > > > them
> > > > > > > > to learn Flink.
> > > > > > > >
> > > > > > > > If we want to pursue the renaming path, I think these changes
> > > (and
> > > > > > > porting
> > > > > > > > the rest of the codebase, like `flink-ml` and
> `flink-contrib`,
> > to
> > > > the
> > > > > > new
> > > > > > > > method names) can be done in relatively little time. Since
> > Flink
> > > is
> > > > > > > > approaching a major release, I think it's a good time to
> > consider
> > > > > this
> > > > > > > > change, if the community deems it relevant.

Re: [VOTE] Release Apache Flink 0.10.2 (RC2)

2016-02-09 Thread Henry Saputra
Hi Ufuk,

This is nice to have but not a blocker.

So unless we find a blocker for the current RC, I prefer to continue
evaluating and VOTE on the current RC.

- Henry

On Tuesday, February 9, 2016, Ufuk Celebi  wrote:

> Hey Nick,
>
> I agree that this can be problematic when running multiple jobs on
> YARN. Since there is a chance that this might be the last release 0.10
> release, I would be OK to cancel the vote for your fix.
>
> Still, let's hear the opinion of others before doing this. What do you
> think?
>
> – Ufuk
>
>
> On Mon, Feb 8, 2016 at 8:05 PM, Nick Dimiduk  > wrote:
> > Perhaps too late for the RC, but I've backported FLINK-3293 to this
> branch
> > via FLINK-3372. Would be nice for those wanting to monitor YARN
> > application submissions.
> >
> > On Mon, Feb 8, 2016 at 9:37 AM, Ufuk Celebi  > wrote:
> >
> >> Dear Flink community,
> >>
> >> Please vote on releasing the following candidate as Apache Flink version
> >> 0.10.2.
> >>
> >> Please note that this vote has a slightly shorter voting period of 48
> >> hours. Only a single change has been made since the last release
> >> candidate. Since the community has already done extensive testing of the
> >> previous release candidate, I'm assuming 48 hours will suffice to vote
> on
> >> this one.
> >>
> >> The commit to be voted on:
> >> e525eb2f1413df238e994d01c909d2b90f1b7709
> >>
> >> Branch:
> >> release-0.10.2-rc2 (see
> >>
> >>
> https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git;a=shortlog;h=refs/heads/release-0.10.2-rc2
> >> )
> >>
> >> The release artifacts to be voted on can be found at:
> >> http://people.apache.org/~uce/flink-0.10.2-rc2/
> >>
> >> The release artifacts are signed with the key with fingerprint 9D403309:
> >> http://www.apache.org/dist/flink/KEYS
> >>
> >> The staging repository for this release can be found at:
> >> https://repository.apache.org/content/repositories/orgapacheflink-1061
> >>
> >> -
> >>
> >> The vote is open for the next 48 hours and passes if a majority of at
> least
> >> three +1 PMC votes are cast.
> >>
> >> The vote ends on Wednesday February 10, 2016, 18:45 CET.
> >>
> >> [ ] +1 Release this package as Apache Flink 0.10.2
> >> [ ] -1 Do not release this package because ...
> >>
> >> ===
> >>
> >> The following commits have been added since the 0.10.2 RC 1:
> >>
> >> * 2cd0618 - [tools] Properly update all POM versions in release script
> >> (3 hours ago) 
> >>
>


[jira] [Created] (FLINK-3379) Refactor TimestampExtractor

2016-02-09 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-3379:
---

 Summary: Refactor TimestampExtractor
 Key: FLINK-3379
 URL: https://issues.apache.org/jira/browse/FLINK-3379
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Affects Versions: 0.10.1
Reporter: Stephan Ewen
Assignee: Stephan Ewen
Priority: Critical
 Fix For: 1.0.0


Based on a lot of user feedback, the current {{TimestampExtractor}} seems very 
confusing. It simultaneously implements two modes of generating watermarks:

  - Each record that passes through can decide to cause a watermark.

  - The timestamp extractor can define a certain watermark timestamp which is 
periodically picked up by the system and triggers a watermark (if larger than 
the previous watermark).

Figuring out how these modes interplay, and how to define the methods to only 
use one mode has been quite an obstacle for several users. We should break this 
class into two different classes, one per mode of generating watermarks, to 
make it easier to understand.
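
A rough sketch of the proposed split, with placeholder names (an
illustration of the two modes, not the final API):

// Mode 1 ("punctuated"): any record may trigger a watermark.
trait PunctuatedTimestampAssigner[T] extends Serializable {
  def extractTimestamp(element: T, previousTimestamp: Long): Long
  // Some(watermarkTimestamp) emits a watermark right after this element.
  def checkAndGetNextWatermark(element: T, extractedTimestamp: Long): Option[Long]
}

// Mode 2 ("periodic"): the system polls for the current watermark at a
// configured interval and emits it if it has advanced.
trait PeriodicTimestampAssigner[T] extends Serializable {
  def extractTimestamp(element: T, previousTimestamp: Long): Long
  def currentWatermark(): Long
}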



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] Release Apache Flink 0.10.2 (RC2)

2016-02-09 Thread Stephan Ewen
+1 (binding)

Re-ran the tests from the previous RC with the new RC

  - Built the code (Linux x64, Scala 2.10, Hadoop 2.3.0)
  - Ran all tests
  - Checked that there are no cross-version dependencies
  - Checked that LICENSE and NOTICE are up to date
  - Checked planVisualizer util
  - Ran examples from command line
  - Tested web frontend on Firefox and Chrome
  - Ran program on multiple taskmanagers, killed one during execution;
recovery worked properly




On Tue, Feb 9, 2016 at 4:34 PM, Ufuk Celebi  wrote:

> Given the initial feedback, I would like to keep the RC going as well. The
> upcoming 1.0 release will include the fix and if users complain about it
> for 0.10 we can do another release in the future.
>
> —
>
> +1 for this release.
>
> Forwarding my tests from the last vote.
>
> Additionally, I’ve verified that the quickstart poms point to the correct
> version now.
>
> – Ufuk
>
> > On 09 Feb 2016, at 15:59, Robert Metzger  wrote:
> >
> > I would also like to keep this RC. 0.10-SNAPSHOT will be equally stable.
> >
> > On Tue, Feb 9, 2016 at 2:57 PM, Stephan Ewen  wrote:
> >
> >> I would be in favor of continuing the release candidate.
> >> The change is more a feature improvement than a bug, and these should
> come
> >> before release candidates (otherwise we never get one done).
> >>
> >> The 1.0 release is also very close, so this improvement will be
> available
> >> anyways pretty soon.
> >>
> >> I personally also invested again quite a bit of time into testing this
> >> release candidate already.
> >>
> >>
> >>
> >> On Tue, Feb 9, 2016 at 1:51 PM, Fabian Hueske 
> wrote:
> >>
> >>> Thanks Ufuk,
> >>> +1 for a new RC
> >>>
> >>> 2016-02-09 13:49 GMT+01:00 Ufuk Celebi :
> >>>
>  Hey Nick,
> 
>  I agree that this can be problematic when running multiple jobs on
>  YARN. Since there is a chance that this might be the last release 0.10
>  release, I would be OK to cancel the vote for your fix.
> 
>  Still, let's hear the opinion of others before doing this. What do you
>  think?
> 
>  – Ufuk
> 
> 
>  On Mon, Feb 8, 2016 at 8:05 PM, Nick Dimiduk 
> >>> wrote:
> > Perhaps too late for the RC, but I've backported FLINK-3293 to this
>  branch
> > via FLINK-3372. Would be nice for those wanting to monitor YARN
> > application submissions.
> >
> > On Mon, Feb 8, 2016 at 9:37 AM, Ufuk Celebi  wrote:
> >
> >> Dear Flink community,
> >>
> >> Please vote on releasing the following candidate as Apache Flink
> >>> version
> >> 0.10.2.
> >>
> >> Please note that this vote has a slightly shorter voting period of
> >> 48
> >> hours. Only a single change has been made since the last release
> >> candidate. Since the community has already done extensive testing of
> >>> the
> >> previous release candidate, I'm assuming 48 hours will suffice to
> >> vote
>  on
> >> this one.
> >>
> >> The commit to be voted on:
> >> e525eb2f1413df238e994d01c909d2b90f1b7709
> >>
> >> Branch:
> >> release-0.10.2-rc2 (see
> >>
> >>
> 
> >>>
> >>
> https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git;a=shortlog;h=refs/heads/release-0.10.2-rc2
> >> )
> >>
> >> The release artifacts to be voted on can be found at:
> >> http://people.apache.org/~uce/flink-0.10.2-rc2/
> >>
> >> The release artifacts are signed with the key with fingerprint
> >>> 9D403309:
> >> http://www.apache.org/dist/flink/KEYS
> >>
> >> The staging repository for this release can be found at:
> >>
> >>> https://repository.apache.org/content/repositories/orgapacheflink-1061
> >>
> >> -
> >>
> >> The vote is open for the next 48 hours and passes if a majority of
> >> at
>  least
> >> three +1 PMC votes are cast.
> >>
> >> The vote ends on Wednesday February 10, 2016, 18:45 CET.
> >>
> >> [ ] +1 Release this package as Apache Flink 0.10.2
> >> [ ] -1 Do not release this package because ...
> >>
> >> ===
> >>
> >> The following commits have been added since the 0.10.2 RC 1:
> >>
> >> * 2cd0618 - [tools] Properly update all POM versions in release
> >> script
> >> (3 hours ago) 
> >>
> 
> >>>
> >>
>
>


Re: [VOTE] Release Apache Flink 0.10.2 (RC2)

2016-02-09 Thread Ufuk Celebi
Given the initial feedback, I would like to keep the RC going as well. The 
upcoming 1.0 release will include the fix and if users complain about it for 
0.10 we can do another release in the future.

—

+1 for this release.

Forwarding my tests from the last vote.

Additionally, I’ve verified that the quickstart poms point to the correct 
version now.

– Ufuk

> On 09 Feb 2016, at 15:59, Robert Metzger  wrote:
> 
> I would also like to keep this RC. 0.10-SNAPSHOT will be equally stable.
> 
> On Tue, Feb 9, 2016 at 2:57 PM, Stephan Ewen  wrote:
> 
>> I would be in favor of continuing the release candidate.
>> The change is more a feature improvement than a bug, and these should come
>> before release candidates (otherwise we never get one done).
>> 
>> The 1.0 release is also very close, so this improvement will be available
>> anyways pretty soon.
>> 
>> I personally also invested again quite a bit of time into testing this
>> release candidate already.
>> 
>> 
>> 
>> On Tue, Feb 9, 2016 at 1:51 PM, Fabian Hueske  wrote:
>> 
>>> Thanks Ufuk,
>>> +1 for a new RC
>>> 
>>> 2016-02-09 13:49 GMT+01:00 Ufuk Celebi :
>>> 
 Hey Nick,
 
 I agree that this can be problematic when running multiple jobs on
 YARN. Since there is a chance that this might be the last release 0.10
 release, I would be OK to cancel the vote for your fix.
 
 Still, let's hear the opinion of others before doing this. What do you
 think?
 
 – Ufuk
 
 
 On Mon, Feb 8, 2016 at 8:05 PM, Nick Dimiduk 
>>> wrote:
> Perhaps too late for the RC, but I've backported FLINK-3293 to this
 branch
> via FLINK-3372. Would be nice for those wanting to monitor YARN
> application submissions.
> 
> On Mon, Feb 8, 2016 at 9:37 AM, Ufuk Celebi  wrote:
> 
>> Dear Flink community,
>> 
>> Please vote on releasing the following candidate as Apache Flink
>>> version
>> 0.10.2.
>> 
>> Please note that this vote has a slightly shorter voting period of
>> 48
>> hours. Only a single change has been made since the last release
>> candidate. Since the community has already done extensive testing of
>>> the
>> previous release candidate, I'm assuming 48 hours will suffice to
>> vote
 on
>> this one.
>> 
>> The commit to be voted on:
>> e525eb2f1413df238e994d01c909d2b90f1b7709
>> 
>> Branch:
>> release-0.10.2-rc2 (see
>> 
>> 
 
>>> 
>> https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git;a=shortlog;h=refs/heads/release-0.10.2-rc2
>> )
>> 
>> The release artifacts to be voted on can be found at:
>> http://people.apache.org/~uce/flink-0.10.2-rc2/
>> 
>> The release artifacts are signed with the key with fingerprint
>>> 9D403309:
>> http://www.apache.org/dist/flink/KEYS
>> 
>> The staging repository for this release can be found at:
>> 
>>> https://repository.apache.org/content/repositories/orgapacheflink-1061
>> 
>> -
>> 
>> The vote is open for the next 48 hours and passes if a majority of
>> at
 least
>> three +1 PMC votes are cast.
>> 
>> The vote ends on Wednesday February 10, 2016, 18:45 CET.
>> 
>> [ ] +1 Release this package as Apache Flink 0.10.2
>> [ ] -1 Do not release this package because ...
>> 
>> ===
>> 
>> The following commits have been added since the 0.10.2 RC 1:
>> 
>> * 2cd0618 - [tools] Properly update all POM versions in release
>> script
>> (3 hours ago) 
>> 
 
>>> 
>> 



Breaking changes in the Streaming API

2016-02-09 Thread Stephan Ewen
Hi everyone!

There are two remaining issues right now pending for 1.0 that will cause
breaking API changes in the Streaming API.


1)
[FLINK-3379]

The Timestamp Extractor needs to be changed. That really seems necessary,
based on the user feedback, because a lot of people mentioned that they are
getting confused about the TimestampExtractor's mixed two-way system of
generating watermarks.

The issue suggests pulling the two different modes of generating watermarks
into two different classes.


2)

[FLINK-3371] makes the "Trigger" an abstract class (currently interface)
and moves the "TriggerResult" to a dedicated class. This is necessary for
avoiding breaking changes in the future, after the release.

The reason for these changes is "aligned windows", which have one
Trigger for the entire window across all keys (
https://issues.apache.org/jira/browse/FLINK-3370)

Aligned windows are for example most sliding/tumbling time windows, while
unaligned windows (with a trigger per key) are for example session and
count windows. For aligned windows, we can implement an optimized
representation that uses less memory and is more lightweight to checkpoint.

Also, the Trigger class may evolve a bit, and with an abstract class we
can add methods without breaking user-defined Triggers in the future.


Greetings,
Stephan
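
As a side note on 2), the compatibility argument can be sketched as follows
(illustrative Scala; the real Trigger signatures differ): adding a method to
an interface breaks every existing implementer, while an abstract class can
ship a new hook with a default body so old user triggers keep compiling.

sealed trait TriggerResult
object TriggerResult {
  case object CONTINUE extends TriggerResult
  case object FIRE extends TriggerResult
  case object PURGE extends TriggerResult
}

// Illustrative only, not the actual API.
abstract class Trigger[T, W] extends Serializable {
  def onElement(element: T, timestamp: Long, window: W): TriggerResult

  // A hook added in a later release with a default no-op body keeps
  // existing subclasses compiling; a (pre-Java-8) interface could not
  // gain this method without breaking them.
  def onMerge(window: W): TriggerResult = TriggerResult.CONTINUE
}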


[jira] [Created] (FLINK-3380) Unstable Test: JobSubmissionFailsITCase

2016-02-09 Thread Klou (JIRA)
Klou created FLINK-3380:
---

 Summary: Unstable Test: JobSubmissionFailsITCase
 Key: FLINK-3380
 URL: https://issues.apache.org/jira/browse/FLINK-3380
 Project: Flink
  Issue Type: Bug
Reporter: Klou
Priority: Critical


{quote}
Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 16.881 sec <<< 
FAILURE! - in org.apache.flink.test.failingPrograms.JobSubmissionFailsITCase
org.apache.flink.test.failingPrograms.JobSubmissionFailsITCase  Time elapsed: 
13.04 sec  <<< FAILURE!
java.lang.AssertionError: Futures timed out after [1 milliseconds]
at org.junit.Assert.fail(Assert.java:88)
at 
org.apache.flink.test.failingPrograms.JobSubmissionFailsITCase.teardown(JobSubmissionFailsITCase.java:82)
{quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3382) Improve clarity of object reuse in ReusingMutableToRegularIteratorWrapper

2016-02-09 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-3382:
-

 Summary: Improve clarity of object reuse in 
ReusingMutableToRegularIteratorWrapper
 Key: FLINK-3382
 URL: https://issues.apache.org/jira/browse/FLINK-3382
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Runtime
Affects Versions: 1.0.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Minor


Object reuse in {{ReusingMutableToRegularIteratorWrapper.hasNext()}} can be 
clarified by creating a single object and storing the iterator's next value 
into the second reference.
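
A generic sketch of the pattern being clarified (not the actual Flink code;
it assumes the {{MutableObjectIterator}} contract of writing into a reuse
object and returning null when exhausted):

import org.apache.flink.util.MutableObjectIterator

// One allocated record, two references: "ahead" is the buffer the source
// writes into during the look-ahead in hasNext(); "current" is what the
// caller got from next(). Both point to the same reusable object, so
// callers must not hold on to returned elements across calls.
class ReusingWrapper[T <: AnyRef](source: MutableObjectIterator[T], reuse: T)
    extends Iterator[T] {
  private var ahead: T = reuse
  private var current: T = _
  private var hasAhead: Boolean = false

  override def hasNext: Boolean = {
    if (!hasAhead) {
      val read = source.next(ahead)  // source may overwrite `ahead` in place
      hasAhead = read != null
      if (hasAhead) ahead = read
    }
    hasAhead
  }

  override def next(): T = {
    if (!hasNext) throw new NoSuchElementException
    hasAhead = false
    current = ahead
    current
  }
}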



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Case style anonymous functions not supported by Scala API

2016-02-09 Thread Stefano Baghino
Hi Till,

I do agree with your point, so much so that for the time being I'd
suggest keeping these additions optional, leaving it up to the end user to
opt in.
Adding them by default would effectively be an addition to the DataSet API
(despite being separated at a source file level).
I think your solution works best right now. We can proceed on this path
unless we see more interest around support for case-style functions.

On Tue, Feb 9, 2016 at 7:31 PM, Till Rohrmann  wrote:

> What we could do is to add the implicit class to the package object of
> org.apache.flink.api.scala. Since one always has to import this package in
> order to have the proper TypeInformations, you wouldn’t have to import the
> extension explicitly.
>
> The thing with the API is that you don’t want to break it too often. Once
> people are used to it and have code implemented it always entails rewriting
> the code if you change the API in a breaking manner. This can be really
> annoying for users.
>
> Cheers,
> Till
> ​
>
> On Tue, Feb 9, 2016 at 2:51 PM, Stefano Baghino <
> stefano.bagh...@radicalbit.io> wrote:
>
> > I agree with you, but I acknowledge that there may be concerns regarding
> > the stability of the API. Perhaps the rationale behind the proposal of
> > Stephan and Till is to provide it as an extension to test how the
> > developers feel about it. It would be ideal to have a larger feedback
> from
> > the community. However I have to admit I like the approach.
> >
> > On Tue, Feb 9, 2016 at 2:43 PM, Theodore Vasiloudis <
> > theodoros.vasilou...@gmail.com> wrote:
> >
> > > Thanks for bringing this up Stefano, it would be a very welcome addition
> > > indeed.
> > >
> > > I like the approach of having extensions through implicits as well.
> IMHO
> > > though this should be the default
> > > behavior, without the need to add another import.
> > >
> > > On Tue, Feb 9, 2016 at 1:29 PM, Stefano Baghino <
> > > stefano.bagh...@radicalbit.io> wrote:
> > >
> > > > I see, thanks for the tip! I'll work on it; meanwhile, I've added
> some
> > > > functions and Scaladoc:
> > > >
> > > >
> > >
> >
> https://github.com/radicalbit/flink/blob/1159-implicit/flink-scala/src/main/scala/org/apache/flink/api/scala/extensions/package.scala
> > > >
> > > > On Tue, Feb 9, 2016 at 12:01 PM, Till Rohrmann  >
> > > > wrote:
> > > >
> > > > > Not the DataSet but the JoinDataSet and the CoGroupDataSet do in
> the
> > > form
> > > > > of an apply function.
> > > > > ​
> > > > >
> > > > > On Tue, Feb 9, 2016 at 11:09 AM, Stefano Baghino <
> > > > > stefano.bagh...@radicalbit.io> wrote:
> > > > >
> > > > > > Sure, it was just a draft. I agree that filter and mapPartition
> > make
> > > > > sense,
> > > > > > but coGroup and join don't look like they take a function.
> > > > > >
> > > > > > On Tue, Feb 9, 2016 at 10:08 AM, Till Rohrmann <
> > trohrm...@apache.org
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > This looks like a good design to me :-) The only thing is that
> it
> > > is
> > > > > not
> > > > > > > complete. For example, the filter, mapPartition, coGroup and
> join
> > > > > > functions
> > > > > > > are missing.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Till
> > > > > > > ​
> > > > > > >
> > > > > > > On Tue, Feb 9, 2016 at 1:18 AM, Stefano Baghino <
> > > > > > > stefano.bagh...@radicalbit.io> wrote:
> > > > > > >
> > > > > > > > What do you think of something like this?
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/radicalbit/flink/commit/21a889a437875c88921c93e87d88a378c6b4299e
> > > > > > > >
> > > > > > > > In this way, several extensions can be collected in this
> > package
> > > > > object
> > > > > > > and
> > > > > > > > picked altogether or à la carte (e.g. import
> > > > > > > >
> org.apache.flink.api.scala.extensions.AcceptPartialFunctions).
> > > > > > > >
> > > > > > > > On Mon, Feb 8, 2016 at 2:51 PM, Till Rohrmann <
> > > > trohrm...@apache.org>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > I like the idea to support partial functions with Flink’s
> > Scala
> > > > > API.
> > > > > > > > > However, I think that breaking the API and making it
> > > inconsistent
> > > > > > with
> > > > > > > > > respect to the Java API is not the best option. I would
> > rather
> > > be
> > > > > in
> > > > > > > > favour
> > > > > > > > > of the first proposal where we add a new method xxxWith via
> > > > > implicit
> > > > > > > > > conversions.
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > > Till
> > > > > > > > > ​
> > > > > > > > >
> > > > > > > > > On Sun, Feb 7, 2016 at 12:44 PM, Stefano Baghino <
> > > > > > > > > stefano.bagh...@radicalbit.io> wrote:
> > > > > > > > >
> > > > > > > > > > It took me a little time but I was able to put together
> > some
> > > > > code.
> > > > > > > > > >
> > > > > > > > > > In this commit I just added a few methods renamed to
> > prevent
> > > > 

Re: Case style anonymous functions not supported by Scala API

2016-02-09 Thread Stephan Ewen
I think that if we make the class part of the default ".api.scala" package
object, we effectively add these methods to DataSet.scala, because they
will always be available on the data set.

If we want to retain the liberty to not commit to this change now, then we
should probably ask users to explicitly decide to use this extension (via a
separate import).

On Tue, Feb 9, 2016 at 10:53 PM, Stefano Baghino <
stefano.bagh...@radicalbit.io> wrote:

> Hi Till,
>
> I do agree with your point, so much so that for the time being I'd
> suggest keeping these additions optional, leaving it up to the end user to
> opt in.
> Adding them by default would effectively be an addition to the DataSet API
> (despite being separated at a source file level).
> I think your solution works best right now. We can proceed on this path
> unless we see more interest around support for case-style functions.
>
> On Tue, Feb 9, 2016 at 7:31 PM, Till Rohrmann 
> wrote:
>
> > What we could do is to add the implicit class to the package object of
> > org.apache.flink.api.scala. Since one always has to import this package
> in
> > order to have the proper TypeInformations, you wouldn’t have to import
> the
> > extension explicitly.
> >
> > The thing with the API is that you don’t want to break it too often. Once
> > people are used to it and have code implemented it always entails
> rewriting
> > the code if you change the API in a breaking manner. This can be really
> > annoying for users.
> >
> > Cheers,
> > Till
> > ​
> >
> > On Tue, Feb 9, 2016 at 2:51 PM, Stefano Baghino <
> > stefano.bagh...@radicalbit.io> wrote:
> >
> > > I agree with you, but I acknowledge that there may be concerns
> regarding
> > > the stability of the API. Perhaps the rationale behind the proposal of
> > > Stephan and Till is to provide it as an extension to test how the
> > > developers feel about it. It would be ideal to have a larger feedback
> > from
> > > the community. However I have to admit I like the approach.
> > >
> > > On Tue, Feb 9, 2016 at 2:43 PM, Theodore Vasiloudis <
> > > theodoros.vasilou...@gmail.com> wrote:
> > >
> > > > Thanks for bringing this up Stefano, it would be a very welcome addition
> > > > indeed.
> > > >
> > > > I like the approach of having extensions through implicits as well.
> > IMHO
> > > > though this should be the default
> > > > behavior, without the need to add another import.
> > > >
> > > > On Tue, Feb 9, 2016 at 1:29 PM, Stefano Baghino <
> > > > stefano.bagh...@radicalbit.io> wrote:
> > > >
> > > > > I see, thanks for the tip! I'll work on it; meanwhile, I've added
> > some
> > > > > functions and Scaladoc:
> > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/radicalbit/flink/blob/1159-implicit/flink-scala/src/main/scala/org/apache/flink/api/scala/extensions/package.scala
> > > > >
> > > > > On Tue, Feb 9, 2016 at 12:01 PM, Till Rohrmann <
> trohrm...@apache.org
> > >
> > > > > wrote:
> > > > >
> > > > > > Not the DataSet but the JoinDataSet and the CoGroupDataSet do in
> > the
> > > > form
> > > > > > of an apply function.
> > > > > > ​
> > > > > >
> > > > > > On Tue, Feb 9, 2016 at 11:09 AM, Stefano Baghino <
> > > > > > stefano.bagh...@radicalbit.io> wrote:
> > > > > >
> > > > > > > Sure, it was just a draft. I agree that filter and mapPartition
> > > make
> > > > > > sense,
> > > > > > > but coGroup and join don't look like they take a function.
> > > > > > >
> > > > > > > On Tue, Feb 9, 2016 at 10:08 AM, Till Rohrmann <
> > > trohrm...@apache.org
> > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > This looks like a good design to me :-) The only thing is
> that
> > it
> > > > is
> > > > > > not
> > > > > > > > complete. For example, the filter, mapPartition, coGroup and
> > join
> > > > > > > functions
> > > > > > > > are missing.
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Till
> > > > > > > > ​
> > > > > > > >
> > > > > > > > On Tue, Feb 9, 2016 at 1:18 AM, Stefano Baghino <
> > > > > > > > stefano.bagh...@radicalbit.io> wrote:
> > > > > > > >
> > > > > > > > > What do you think of something like this?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/radicalbit/flink/commit/21a889a437875c88921c93e87d88a378c6b4299e
> > > > > > > > >
> > > > > > > > > In this way, several extensions can be collected in this
> > > package
> > > > > > object
> > > > > > > > and
> > > > > > > > > picked altogether or à la carte (e.g. import
> > > > > > > > >
> > org.apache.flink.api.scala.extensions.AcceptPartialFunctions).
> > > > > > > > >
> > > > > > > > > On Mon, Feb 8, 2016 at 2:51 PM, Till Rohrmann <
> > > > > trohrm...@apache.org>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > I like the idea to support partial functions with Flink’s
> > > Scala
> > > > > > API.
> > > > > > > > > > However, I think that breaking the API and making it
> > > > inconsistent
> > > > > > > 

Re: User Feedback

2016-02-09 Thread Stephan Ewen
I can elaborate on the project(...) method:

".returns()" is there to supply TypeInformation in cases where the system
cannot determine it. In the case of "project()", the system can perfectly
determine the output type info from the input and the projection.

For just getting a typed result, I would use Java's generic method syntax,
then you can get around defining an intermediate variable:

DataSet<Tuple3<String, Long, Integer>> input = ...;

Tuple2<String, Integer> aTuple = input.<Tuple2<String, Integer>>project(0,2).collect().get(0);


Greetings,
Stephan


On Tue, Feb 9, 2016 at 7:54 PM, Vasiliki Kalavri 
wrote:

> Hi Martin,
>
> thank you for the feedback. Let me try to answer some of your concerns.
>
>
> On 9 February 2016 at 15:35, Martin Neumann  wrote:
>
> > During this year's FOSDEM Martin Junghans and I sat together and gathered
> > some feedback for the Flink project. It is based on our personal
> experience
> > as well as the feedback and questions from people we taught the system.
> > This is going to be a longer email therefore I have split things into
> > categories:
> >
> >
> > *Website and Documentation:*
> >
> >1. *Out-dated Google Search results*: Google searches lead to outdated
> >web site versions (e.g. “flink transformations” or “flink iterations”
> >return the 0.7 version of the corresponding pages).
> >
>
> ​I'm not sure we can do much about this. I would suggest searching in the
> documentation instead of relying on Google.
> There is a search box on the top of all documentation pages.
>
>
>
> >2. *Invalid Links on Website: *Links are confusing / broken (e.g. the
> >Gelly /ML Links on the start page lead to the top of the feature page
> >(which start with streaming) *-> maybe this can be validated
> >automatically?*
> >
> >
> ​That was bug recently reported and fixed (see FLINK-3316). If you find
> ​ more of those, please report by opening a JIRA or Pull Request​.
>
>
>
> >
> > *Batch API:*
> >
> >1. *.reduceGroup(GroupReduceFunction) and
> >.groupCombine(CombineGroupFunction): *In other functions such as
> >.flatMap(FlatMapFunction) the function call matches the naming of the
> >    operator. This structure is quite convenient for new users since they
> can
> >make use of the autocompletion features of the IDE, basically start
> > typing
> >the function call and you get the correct class. This does not work
> for
> >.reduceGroup() and .groupCombine() since the names are switched
> around.
> > *->
> >maybe the function can be renamed*
> >
>
> ​I agree this might be strange for new users, but I think it will be much
> more annoying for existing users if we change this. In my view, it's not an
> important case to justify breaking the API.
>
>
>
> >2. *.print() and env.execute(): *Often .print() is used for debugging
> >and developing programs replacing regular data sinks. Such a project
> > will
> >not run until the env.execute() is removed. It's very easy to forget
> to
> > add
> >it back in, once you change the .print() back to a proper sink. The
> > project
> >now will compile fine but will not produce any output since .execute()
> > is
> >missing. This is a very difficult bug to find especially since there
> is
> > no
> >warning or error when running the job. It’s common that people use
> more
> >than one .print() statement during debugging and development. This can
> > lead
> >to confusion since each .print() forces the program to execute so the
> >execution behavior is different than without the print. This is
> > especially
> >important, if the program contains non-deterministic data generation
> > (like
> >generating IDs). In the stream API .print() would not require to
> >remove .execute() as a result the behavior of the two interfaces is
> >inconsistent.
> >
>
> ​This is indeed an issue that many users find hard to get used to. We have
> changed the behavior of print() a couple of times before and I'm not sure
> it would be wise to do so again. Actually, once a user understands the
> difference between eager and lazy sinks, I think it's quite easy​ to avoid
> mistakes.
>
>
>
> >3. *calling new when applying an operator eg: .reduceGroup(new
> >GroupReduceFunction()): *Some of the people I taught the API’s to
> where
> >confused by this. They knew it was a distributed system and they were
> >wondering where the constructor would be actually called. They
> expected
> > to
> >hand a class to the function that would be initialized on each of the
> >worker nodes. *-> maybe have a section about this in the
> documentation*
> >
>
> ​I'm not sure I understand the confusion with this one. The goal of
> high-level APIs is to relieve the users from having to think about
> distribution. The only thing they need to understand is the
> DataSet/DataStream abstractions and how to create transformations on them.
>
>
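
To make the quoted print()/execute() pitfall concrete, here is a minimal
batch sketch (Scala API; the output path is just an example):

import org.apache.flink.api.scala._

object PrintVsExecute {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val data = env.fromElements(1, 2, 3).map(_ * 2)

    // While debugging: print() is an eager sink and triggers execution by
    // itself, so no env.execute() is needed for it.
    // data.print()

    // Back to a proper sink: writeAsText() is lazy, so if the env.execute()
    // below is forgotten the program finishes without running the job and
    // produces no output at all.
    data.writeAsText("/tmp/out")
    env.execute("print-vs-execute")  // easy to forget after removing print()
  }
}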

Re: Case style anonymous functions not supported by Scala API

2016-02-09 Thread Till Rohrmann
What we could do is to add the implicit class to the package object of
org.apache.flink.api.scala. Since one always has to import this package in
order to have the proper TypeInformations, you wouldn’t have to import the
extension explicitly.

The thing with the API is that you don’t want to break it too often. Once
people are used to it and have code implemented it always entails rewriting
the code if you change the API in a breaking manner. This can be really
annoying for users.

Cheers,
Till
​

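A minimal sketch of the mechanism described above, with illustrative names
(not final API). If the implicit class lived in the org.apache.flink.api.scala
package object it would always be in scope; keeping it in a separate
extensions object makes it an explicit opt-in import:

package org.apache.flink.api.scala

import org.apache.flink.api.common.typeinfo.TypeInformation
import scala.reflect.ClassTag

object extensions {
  // A single, non-overloaded parameter list lets the compiler infer the
  // expected type of a case-style (pattern-matching) anonymous function.
  implicit class AcceptPartialFunctions[T](val ds: DataSet[T]) extends AnyVal {
    def mapWith[R: TypeInformation: ClassTag](fun: T => R): DataSet[R] =
      ds.map(fun)
    def filterWith(fun: T => Boolean): DataSet[T] =
      ds.filter(fun)
  }
}
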
On Tue, Feb 9, 2016 at 2:51 PM, Stefano Baghino <
stefano.bagh...@radicalbit.io> wrote:

> I agree with you, but I acknowledge that there may be concerns regarding
> the stability of the API. Perhaps the rationale behind the proposal of
> Stephan and Till is to provide it as an extension to test how the
> developers feel about it. It would be ideal to have a larger feedback from
> the community. However I have to admit I like the approach.
>
> On Tue, Feb 9, 2016 at 2:43 PM, Theodore Vasiloudis <
> theodoros.vasilou...@gmail.com> wrote:
>
> > Thanks for bringing this up Stefano, it would be a very welcome addition
> > indeed.
> >
> > I like the approach of having extensions through implicits as well. IMHO
> > though this should be the default
> > behavior, without the need to add another import.
> >
> > On Tue, Feb 9, 2016 at 1:29 PM, Stefano Baghino <
> > stefano.bagh...@radicalbit.io> wrote:
> >
> > > I see, thanks for the tip! I'll work on it; meanwhile, I've added some
> > > functions and Scaladoc:
> > >
> > >
> >
> https://github.com/radicalbit/flink/blob/1159-implicit/flink-scala/src/main/scala/org/apache/flink/api/scala/extensions/package.scala
> > >
> > > On Tue, Feb 9, 2016 at 12:01 PM, Till Rohrmann 
> > > wrote:
> > >
> > > > Not the DataSet but the JoinDataSet and the CoGroupDataSet do in the
> > form
> > > > of an apply function.
> > > > ​
> > > >
> > > > On Tue, Feb 9, 2016 at 11:09 AM, Stefano Baghino <
> > > > stefano.bagh...@radicalbit.io> wrote:
> > > >
> > > > > Sure, it was just a draft. I agree that filter and mapPartition
> make
> > > > sense,
> > > > > but coGroup and join don't look like they take a function.
> > > > >
> > > > > On Tue, Feb 9, 2016 at 10:08 AM, Till Rohrmann <
> trohrm...@apache.org
> > >
> > > > > wrote:
> > > > >
> > > > > > This looks like a good design to me :-) The only thing is that it
> > is
> > > > not
> > > > > > complete. For example, the filter, mapPartition, coGroup and join
> > > > > functions
> > > > > > are missing.
> > > > > >
> > > > > > Cheers,
> > > > > > Till
> > > > > > ​
> > > > > >
> > > > > > On Tue, Feb 9, 2016 at 1:18 AM, Stefano Baghino <
> > > > > > stefano.bagh...@radicalbit.io> wrote:
> > > > > >
> > > > > > > What do you think of something like this?
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/radicalbit/flink/commit/21a889a437875c88921c93e87d88a378c6b4299e
> > > > > > >
> > > > > > > In this way, several extensions can be collected in this
> package
> > > > object
> > > > > > and
> > > > > > > picked altogether or à la carte (e.g. import
> > > > > > > org.apache.flink.api.scala.extensions.AcceptPartialFunctions).
> > > > > > >
> > > > > > > On Mon, Feb 8, 2016 at 2:51 PM, Till Rohrmann <
> > > trohrm...@apache.org>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > I like the idea to support partial functions with Flink’s
> Scala
> > > > API.
> > > > > > > > However, I think that breaking the API and making it
> > inconsistent
> > > > > with
> > > > > > > > respect to the Java API is not the best option. I would
> rather
> > be
> > > > in
> > > > > > > favour
> > > > > > > > of the first proposal where we add a new method xxxWith via
> > > > implicit
> > > > > > > > conversions.
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Till
> > > > > > > > ​
> > > > > > > >
> > > > > > > > On Sun, Feb 7, 2016 at 12:44 PM, Stefano Baghino <
> > > > > > > > stefano.bagh...@radicalbit.io> wrote:
> > > > > > > >
> > > > > > > > > It took me a little time but I was able to put together
> some
> > > > code.
> > > > > > > > >
> > > > > > > > > In this commit I just added a few methods renamed to
> prevent
> > > > > > > overloading,
> > > > > > > > > thus usable with PartialFunction instead of functions:
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/radicalbit/flink/commit/aacd59e0ce98cccb66d48a30d07990ac8f345748
> > > > > > > > >
> > > > > > > > > In this other commit I coded the original proposal,
> renaming
> > > the
> > > > > > > methods
> > > > > > > > to
> > > > > > > > > obtain the same effect as before, but with lower friction
> for
> > > > Scala
> > > > > > > > > developers (and provided some usage examples):
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> 

Re: [VOTE] Release Apache Flink 0.10.2 (RC2)

2016-02-09 Thread Henry Saputra
Hi Nick,

Thanks for bringing up the issue. I believe some people (including me) will
try to run Apache Flink on YARN =)

If more 0.10.x releases are needed before 1.0, I would definitely love to
include this.

- Henry

On Tue, Feb 9, 2016 at 9:33 AM, Nick Dimiduk  wrote:

> Agree it's not a blocker. I can manage my own build with this patch if
> necessary. Just thought I might not be the only one running streaming on
> yarn.
>
> On Tuesday, February 9, 2016, Henry Saputra 
> wrote:
>
> > Hi Ufuk,
> >
> > This is nice to have but not a blocker.
> >
> > So unless we find a blocker for the current RC, I prefer to continue
> > evaluating and VOTE on the current RC.
> >
> > - Henry
> >
> > On Tuesday, February 9, 2016, Ufuk Celebi  >
> > wrote:
> >
> > > Hey Nick,
> > >
> > > I agree that this can be problematic when running multiple jobs on
> > > YARN. Since there is a chance that this might be the last release 0.10
> > > release, I would be OK to cancel the vote for your fix.
> > >
> > > Still, let's hear the opinion of others before doing this. What do you
> > > think?
> > >
> > > – Ufuk
> > >
> > >
> > > On Mon, Feb 8, 2016 at 8:05 PM, Nick Dimiduk  > 
> > > > wrote:
> > > > Perhaps too late for the RC, but I've backported FLINK-3293 to this
> > > branch
> > > > via FLINK-3372. Would be nice for those wanting to monitor YARN
> > > > application submissions.
> > > >
> > > > On Mon, Feb 8, 2016 at 9:37 AM, Ufuk Celebi  > 
> > > > wrote:
> > > >
> > > >> Dear Flink community,
> > > >>
> > > >> Please vote on releasing the following candidate as Apache Flink
> > version
> > > >> 0.10.2.
> > > >>
> > > >> Please note that this vote has a slightly shorter voting period of
> 48
> > > >> hours. Only a single change has been made since the last release
> > > >> candidate. Since the community has already done extensive testing of
> > the
> > > >> previous release candidate, I'm assuming 48 hours will suffice to
> vote
> > > on
> > > >> this one.
> > > >>
> > > >> The commit to be voted on:
> > > >> e525eb2f1413df238e994d01c909d2b90f1b7709
> > > >>
> > > >> Branch:
> > > >> release-0.10.2-rc2 (see
> > > >>
> > > >>
> > >
> >
> https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git;a=shortlog;h=refs/heads/release-0.10.2-rc2
> > > >> )
> > > >>
> > > >> The release artifacts to be voted on can be found at:
> > > >> http://people.apache.org/~uce/flink-0.10.2-rc2/
> > > >>
> > > >> The release artifacts are signed with the key with fingerprint
> > 9D403309:
> > > >> http://www.apache.org/dist/flink/KEYS
> > > >>
> > > >> The staging repository for this release can be found at:
> > > >>
> > https://repository.apache.org/content/repositories/orgapacheflink-1061
> > > >>
> > > >> -
> > > >>
> > > >> The vote is open for the next 48 hours and passes if a majority of
> at
> > > least
> > > >> three +1 PMC votes are cast.
> > > >>
> > > >> The vote ends on Wednesday February 10, 2016, 18:45 CET.
> > > >>
> > > >> [ ] +1 Release this package as Apache Flink 0.10.2
> > > >> [ ] -1 Do not release this package because ...
> > > >>
> > > >> ===
> > > >>
> > > >> The following commits have been added since the 0.10.2 RC 1:
> > > >>
> > > >> * 2cd0618 - [tools] Properly update all POM versions in release
> script
> > > >> (3 hours ago) 
> > > >>
> > >
> >
>


[jira] [Created] (FLINK-3383) Separate Maven deployment from CI testing

2016-02-09 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3383:
-

 Summary: Separate Maven deployment from CI testing
 Key: FLINK-3383
 URL: https://issues.apache.org/jira/browse/FLINK-3383
 Project: Flink
  Issue Type: Improvement
  Components: Build System, Tests
Affects Versions: 1.0.0
Reporter: Maximilian Michels
Assignee: Maximilian Michels
Priority: Critical


We currently handle our tests and the deployment of the Maven artifacts via 
Travis CI. Travis has a maximum allowed build time of two hours which we reach 
nearly every time. By that time, the tests have already been run but the 
deployment is still undergoing.

I propose to remove the Maven deployment from Travis. Instead, we could use 
Apache's Jenkins service or Apache's Buildbot service to trigger a deployment 
once a day.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] Release Apache Flink 0.10.2 (RC2)

2016-02-09 Thread Nick Dimiduk
Agree it's not a blocker. I can manage my own build with this patch if
necessary. Just thought I might not be the only one running streaming on
yarn.

On Tuesday, February 9, 2016, Henry Saputra  wrote:

> Hi Ufuk,
>
> This is nice to have but not a blocker.
>
> So unless we find a blocker for the current RC, I prefer to continue
> evaluating and VOTE on the current RC.
>
> - Henry
>
> On Tuesday, February 9, 2016, Ufuk Celebi >
> wrote:
>
> > Hey Nick,
> >
> > I agree that this can be problematic when running multiple jobs on
> > YARN. Since there is a chance that this might be the last release 0.10
> > release, I would be OK to cancel the vote for your fix.
> >
> > Still, let's hear the opinion of others before doing this. What do you
> > think?
> >
> > – Ufuk
> >
> >
> > On Mon, Feb 8, 2016 at 8:05 PM, Nick Dimiduk  
> > > wrote:
> > > Perhaps too late for the RC, but I've backported FLINK-3293 to this
> > branch
> > > via FLINK-3372. Would be nice for those wanting to monitor YARN
> > > application submissions.
> > >
> > > On Mon, Feb 8, 2016 at 9:37 AM, Ufuk Celebi  
> > > wrote:
> > >
> > >> Dear Flink community,
> > >>
> > >> Please vote on releasing the following candidate as Apache Flink
> version
> > >> 0.10.2.
> > >>
> > >> Please note that this vote has a slightly shorter voting period of 48
> > >> hours. Only a single change has been made since the last release
> > >> candidate. Since the community has already done extensive testing of
> the
> > >> previous release candidate, I'm assuming 48 hours will suffice to vote
> > on
> > >> this one.
> > >>
> > >> The commit to be voted on:
> > >> e525eb2f1413df238e994d01c909d2b90f1b7709
> > >>
> > >> Branch:
> > >> release-0.10.2-rc2 (see
> > >>
> > >>
> >
> https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git;a=shortlog;h=refs/heads/release-0.10.2-rc2
> > >> )
> > >>
> > >> The release artifacts to be voted on can be found at:
> > >> http://people.apache.org/~uce/flink-0.10.2-rc2/
> > >>
> > >> The release artifacts are signed with the key with fingerprint
> 9D403309:
> > >> http://www.apache.org/dist/flink/KEYS
> > >>
> > >> The staging repository for this release can be found at:
> > >>
> https://repository.apache.org/content/repositories/orgapacheflink-1061
> > >>
> > >> -
> > >>
> > >> The vote is open for the next 48 hours and passes if a majority of at
> > least
> > >> three +1 PMC votes are cast.
> > >>
> > >> The vote ends on Wednesday February 10, 2016, 18:45 CET.
> > >>
> > >> [ ] +1 Release this package as Apache Flink 0.10.2
> > >> [ ] -1 Do not release this package because ...
> > >>
> > >> ===
> > >>
> > >> The following commits have been added since the 0.10.2 RC 1:
> > >>
> > >> * 2cd0618 - [tools] Properly update all POM versions in release script
> > >> (3 hours ago) 
> > >>
> >
>