[jira] [Created] (FLINK-3711) Scala fold() example syntax incorrect

2016-04-06 Thread Shannon Carey (JIRA)
Shannon Carey created FLINK-3711:


 Summary: Scala fold() example syntax incorrect
 Key: FLINK-3711
 URL: https://issues.apache.org/jira/browse/FLINK-3711
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.0.1, 1.0.0
Reporter: Shannon Carey
Priority: Minor


Scala's KeyedStream#fold, which accepts a scala.Function2, is defined as a 
partially applicable (curried) function. The documentation, however, is 
written as if it were a non-curried function.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3710) ScalaDocs for org.apache.flink.streaming.scala are missing from the web site

2016-04-06 Thread Elias Levy (JIRA)
Elias Levy created FLINK-3710:

 Summary: ScalaDocs for org.apache.flink.streaming.scala are 
missing from the web site
 Key: FLINK-3710
 URL: https://issues.apache.org/jira/browse/FLINK-3710
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.0.1
Reporter: Elias Levy


The ScalaDocs only include docs for org.apache.flink.scala and sub-packages.





[DISCUSS] Macro-benchmarking for performance tuning and regression detection

2016-04-06 Thread Greg Hogan
I'd like to discuss the creation of a macro-benchmarking module for Flink.
This could be run during pre-release testing to detect performance
regressions and during development when refactoring or performance tuning
code on the hot path.

Many users have published benchmarks and the Flink libraries already
contain a modest selection of algorithms. Some benefits of creating a
consolidated collection of macro-benchmarks include:

- comprehensive code coverage: a diverse set of algorithms can stress every
aspect of Flink (streaming, batch, sorts, joins, spilling, cluster, ...)

- codify best practices: benchmarks should be relatively stable and
repeatable

- efficient: an automated system can run many more tests and generate more
accurate results

Macro-benchmarks would be useful in analyzing improved performance with the
proposed specialized serializers and comparators [FLINK-3599] or making
Flink NUMA-aware [FLINK-3163].

I've also been looking recently at some of the hot code and see a total
improvement of roughly 12-14% when modifying NormalizedKeySorter.compare/swap
to bitshift and bitmask rather than divide and modulo. The trade-off is
that aligning on a power of 2 leaves holes in the MemoryBuffers, so
additional MemoryBuffers are required. I'm also testing on a single data
type, IntValue, and there may be different results for LongValue,
StringValue, custom types, or different algorithms. Replacing multiply with
a left shift, on the other hand, reduces performance, which demonstrates the
need to test changes in isolation.
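
To illustrate the trade-off described above, here is a minimal sketch of the
two addressing schemes. It is not the actual NormalizedKeySorter code: the
class and method names, the 32 KiB segment size, and the 12-byte record size
are all assumptions chosen only for illustration.

```java
// Hypothetical sketch of record addressing inside fixed-size memory segments.
// Names and sizes are illustrative assumptions, not Flink's implementation.
final class IndexMath {

    // General case: works for any positive number of records per segment.
    static int segmentByDivide(int index, int recordsPerSegment) {
        return index / recordsPerSegment;
    }

    static int offsetByModulo(int index, int recordsPerSegment, int recordSize) {
        return (index % recordsPerSegment) * recordSize;
    }

    // Power-of-two case: recordsPerSegment == 1 << shift, mask == recordsPerSegment - 1.
    static int segmentByShift(int index, int shift) {
        return index >>> shift;
    }

    static int offsetByMask(int index, int mask, int recordSize) {
        return (index & mask) * recordSize;
    }

    // Largest power of two that is <= n (n must be positive).
    static int roundDownToPowerOfTwo(int n) {
        return Integer.highestOneBit(n);
    }

    // Bytes left unused in each segment after rounding the record count down.
    static int unusedBytesPerSegment(int segmentSize, int recordSize) {
        int aligned = roundDownToPowerOfTwo(segmentSize / recordSize);
        return segmentSize - aligned * recordSize;
    }

    public static void main(String[] args) {
        int segmentSize = 32 * 1024; // assumed segment size in bytes
        int recordSize = 12;         // assumed bytes per sort-index record
        int aligned = roundDownToPowerOfTwo(segmentSize / recordSize);
        int shift = Integer.numberOfTrailingZeros(aligned);
        int mask = aligned - 1;
        int index = 5000;
        System.out.println("records per segment (aligned): " + aligned);
        System.out.println("segment: " + segmentByShift(index, shift)
                + ", offset: " + offsetByMask(index, mask, recordSize));
        System.out.println("unused bytes per segment: "
                + unusedBytesPerSegment(segmentSize, recordSize));
    }
}
```

With these assumed sizes, alignment cuts the records per segment from 2730
to 2048, so a quarter of each segment (8192 of 32768 bytes) is left unused.
That is the memory cost paid for the cheaper index arithmetic.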

There are many more ideas, e.g. NormalizedKeySorter writing keys before the
pointer so that the offset computation is performed outside of the compare
and sort methods. Also, SpanningRecordSerializer could skip to the next
buffer rather than writing length across buffers. These changes might each
be worth a few percent. Other changes might be less than a 1% speedup, but
taken in aggregate will yield a noticeable performance increase.

I like the idea of profile first, measure second, then create and discuss
the pull request.

As for the actual macro-benchmarking framework, it would be nice if the
algorithms would also verify correctness alongside performance. The
algorithm interface would be warmup (run only once) and execute, which
would be run multiple times in an interleaved manner. The benchmarking
duration should be tunable.
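
A minimal sketch of what such an interface and an interleaved runner might
look like follows. Everything here is hypothetical: the Benchmark interface,
its verify() hook for the correctness check, and the InterleavedRunner class
are names invented for illustration, and the sketch leaves out cluster
management entirely.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical algorithm interface: warmup runs once, execute runs repeatedly.
interface Benchmark {
    String name();
    void warmup();
    void execute();
    boolean verify(); // correctness check for the most recent execute()
}

final class InterleavedRunner {

    // Runs every benchmark in round-robin order until at least
    // minDurationNanos have elapsed, recording one timing per execution.
    // Every benchmark executes at least once, even for a zero duration.
    static Map<String, List<Long>> run(List<Benchmark> benchmarks, long minDurationNanos) {
        Map<String, List<Long>> timings = new LinkedHashMap<>();
        for (Benchmark b : benchmarks) {
            b.warmup(); // run only once, not timed
            timings.put(b.name(), new ArrayList<>());
        }
        long start = System.nanoTime();
        do {
            for (Benchmark b : benchmarks) {
                long t0 = System.nanoTime();
                b.execute();
                long elapsed = System.nanoTime() - t0;
                if (!b.verify()) {
                    throw new IllegalStateException("Incorrect result from " + b.name());
                }
                timings.get(b.name()).add(elapsed);
            }
        } while (System.nanoTime() - start < minDurationNanos);
        return timings;
    }
}
```

Interleaving the executions, rather than running each algorithm in one long
batch, spreads transient machine noise across all algorithms instead of
letting it distort a single one.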

The framework would be responsible for configuring, starting, and stopping
the cluster; executing algorithms and recording performance; and comparing
and analyzing results.

Greg


[jira] [Created] (FLINK-3709) [streaming] Graph event rates over time

2016-04-06 Thread Nick Dimiduk (JIRA)
Nick Dimiduk created FLINK-3709:

 Summary: [streaming] Graph event rates over time
 Key: FLINK-3709
 URL: https://issues.apache.org/jira/browse/FLINK-3709
 Project: Flink
  Issue Type: Improvement
  Components: Webfrontend
Reporter: Nick Dimiduk


The streaming server job page displays bytes and records sent and received, 
which answers the question "is data moving?" The next obvious question is "is 
data moving over time?" That could be answered by a chart displaying 
bytes/events rates. This would be a great chart to add to this display.
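
As a rough sketch of the data such a chart would need, per-second rates can
be derived from periodic samples of the cumulative counters. The RateSeries
class and the sampling format below are assumptions for illustration, not
part of the web frontend.

```java
// Hypothetical helper: turns cumulative counter samples into rates.
final class RateSeries {

    // Returns the per-second rate between each pair of consecutive samples.
    // timestampsMillis[i] is the time at which cumulativeCounts[i] was read.
    static double[] ratesPerSecond(long[] timestampsMillis, long[] cumulativeCounts) {
        if (timestampsMillis.length != cumulativeCounts.length) {
            throw new IllegalArgumentException("sample arrays must have equal length");
        }
        int n = Math.max(0, timestampsMillis.length - 1);
        double[] rates = new double[n];
        for (int i = 0; i < n; i++) {
            long dt = timestampsMillis[i + 1] - timestampsMillis[i];
            long dc = cumulativeCounts[i + 1] - cumulativeCounts[i];
            // Guard against duplicate or out-of-order timestamps.
            rates[i] = dt > 0 ? dc * 1000.0 / dt : 0.0;
        }
        return rates;
    }
}
```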





Re: CEP blog post

2016-04-06 Thread Till Rohrmann
That is a good point Ufuk. Will add the note.

On Wed, Apr 6, 2016 at 2:03 PM, Ufuk Celebi  wrote:

> The website has been updated for 1.0.1. :-)
>
> @Till: If you don't mention it in the post, it makes sense to have a
> note at the end of the post saying that the code examples only work
> with 1.0.1.
>
> On Mon, Apr 4, 2016 at 3:35 PM, Till Rohrmann wrote:
> > Thanks a lot to all for the valuable feedback. I've incorporated your
> > suggestions and will publish the article, once Flink 1.0.1 has been released
> > (we need 1.0.1 to run the example code).
> >
> > Cheers,
> > Till
> >
> > On Mon, Apr 4, 2016 at 10:29 AM, gen tang  wrote:
> >>
> >> It is really a good article. Please put it on Flink Blog
> >>
> >> Cheers
> >> Gen
> >>
> >>
> >> On Fri, Apr 1, 2016 at 9:56 PM, Till Rohrmann wrote:
> >>>
> >>> Hi Flink community,
> >>>
> >>> I've written a short blog [1] post about Flink's new CEP library which
> >>> basically showcases its functionality using a monitoring example. I would
> >>> like to publish the post on the flink.apache.org blog next week, if nobody
> >>> objects. Feedback is highly appreciated :-)
> >>>
> >>> [1]
> >>> https://docs.google.com/document/d/1rF2zVjitdTcooIwzJKNCIvAOi85j-wDXf1goXWXHHbk/edit?usp=sharing
> >>>
> >>> Cheers,
> >>> Till
> >>
> >>
> >
>


Re: CEP blog post

2016-04-06 Thread Ufuk Celebi
The website has been updated for 1.0.1. :-)

@Till: If you don't mention it in the post, it makes sense to have a
note at the end of the post saying that the code examples only work
with 1.0.1.

On Mon, Apr 4, 2016 at 3:35 PM, Till Rohrmann  wrote:
> Thanks a lot to all for the valuable feedback. I've incorporated your
> suggestions and will publish the article, once Flink 1.0.1 has been released
> (we need 1.0.1 to run the example code).
>
> Cheers,
> Till
>
> On Mon, Apr 4, 2016 at 10:29 AM, gen tang  wrote:
>>
>> It is really a good article. Please put it on Flink Blog
>>
>> Cheers
>> Gen
>>
>>
>> On Fri, Apr 1, 2016 at 9:56 PM, Till Rohrmann wrote:
>>>
>>> Hi Flink community,
>>>
>>> I've written a short blog [1] post about Flink's new CEP library which
>>> basically showcases its functionality using a monitoring example. I would
>>> like to publish the post on the flink.apache.org blog next week, if nobody
>>> objects. Feedback is highly appreciated :-)
>>>
>>> [1]
>>> https://docs.google.com/document/d/1rF2zVjitdTcooIwzJKNCIvAOi85j-wDXf1goXWXHHbk/edit?usp=sharing
>>>
>>> Cheers,
>>> Till
>>
>>
>


[ANNOUNCE] Flink 1.0.1 Released

2016-04-06 Thread Ufuk Celebi
The Flink PMC is pleased to announce the availability of Flink 1.0.1.

The official release announcement:
http://flink.apache.org/news/2016/04/06/release-1.0.1.html

Release binaries:
http://apache.openmirror.de/flink/flink-1.0.1/

Please update your Maven dependencies to the new 1.0.1 version and
update your binaries.

On behalf of the Flink PMC, I would like to thank everybody who
contributed to the release.


[jira] [Created] (FLINK-3708) Scala API for CEP

2016-04-06 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-3708:


 Summary: Scala API for CEP
 Key: FLINK-3708
 URL: https://issues.apache.org/jira/browse/FLINK-3708
 Project: Flink
  Issue Type: Improvement
  Components: CEP
Affects Versions: 1.1.0
Reporter: Till Rohrmann


Currently, the CEP library does not support Scala case classes, because the 
{{TypeExtractor}} cannot handle them. In order to support them, it would be 
necessary to offer a Scala API for the CEP library.





[jira] [Created] (FLINK-3707) Add missing Rich(Flat)JoinCoGroupFunction

2016-04-06 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-3707:


 Summary: Add missing Rich(Flat)JoinCoGroupFunction
 Key: FLINK-3707
 URL: https://issues.apache.org/jira/browse/FLINK-3707
 Project: Flink
  Issue Type: Improvement
  Components: DataStream API
Affects Versions: 1.0.0, 1.1.0
Reporter: Till Rohrmann


Flink's {{DataStream}} API supports a {{JoinCoGroupFunction}} and 
{{FlatJoinCoGroupFunction}}. However, it is missing the {{RichFunction}} 
equivalents. I think we should add a {{RichJoinCoGroupFunction}} and 
{{RichFlatJoinCoGroupFunction}} to make the API consistent.





Re: Powered by Flink

2016-04-06 Thread Henry Saputra
Thanks, Slim. I have just updated the wiki page with these entries.

On Tue, Apr 5, 2016 at 10:20 AM, Slim Baltagi  wrote:

> Hi
>
> The following are missing in the ‘Powered by Flink’ list:
>
>    - king.com
>      https://blogs.apache.org/foundation/entry/the_apache_software_foundation_announces88
>    - Otto Group
>      http://data-artisans.com/how-we-selected-apache-flink-at-otto-group/
>    - Eura Nova
>      https://research.euranova.eu/flink-forward-2015-talk/
>    - Big Data Europe
>      http://www.big-data-europe.eu
>
> Thanks
>
> Slim Baltagi
>
>
> On Apr 5, 2016, at 10:08 AM, Robert Metzger  wrote:
>
> Hi everyone,
>
> I would like to bring the "Powered by Flink" wiki page [1] to the
> attention of Flink users who recently joined the Flink community. The list
> tracks which organizations are using Flink.
> If your company / university / research institute / ... is using Flink but
> the name is not yet listed there, let me know and I'll add the name.
>
> Regards,
> Robert
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/Powered+by+Flink
>
>
> On Mon, Oct 19, 2015 at 4:10 PM, Matthias J. Sax  wrote:
>
>> +1
>>
>> On 10/19/2015 04:05 PM, Maximilian Michels wrote:
>> > +1 Let's collect in the Wiki for now. At some point in time, we might
>> > want to have a dedicated page on the Flink homepage.
>> >
>> > On Mon, Oct 19, 2015 at 3:31 PM, Timo Walther wrote:
>> >> Ah ok, sorry. I think linking to the wiki is also ok.
>> >>
>> >>
>> >> On 19.10.2015 15:18, Fabian Hueske wrote:
>> >>>
>> >>> @Timo: The proposal was to keep the list in the wiki (can be easily
>> >>> extended) but link from the main website to the wiki page.
>> >>>
>> >>> 2015-10-19 15:16 GMT+02:00 Timo Walther :
>> >>>
>>  +1 for adding it to the website instead of wiki.
>>  "Who is using Flink?" is always a question difficult to answer to
>>  interested users.
>> 
>> 
>>  On 19.10.2015 15:08, Suneel Marthi wrote:
>> 
>>  +1 to this.
>> 
>>  On Mon, Oct 19, 2015 at 3:00 PM, Fabian Hueske wrote:
>> 
>> > Sounds good +1
>> >
>> > 2015-10-19 14:57 GMT+02:00 Márton Balassi <balassi.mar...@gmail.com>:
>> >
>> >> Thanks for starting and big +1 for making it more prominent.
>> >>
>> >> On Mon, Oct 19, 2015 at 2:53 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>> >>>
>> >>> Thanks for starting this Kostas.
>> >>>
>> >>> I think the list is quite hidden in the wiki. Should we link from
>> >>> flink.apache.org to that page?
>> >>>
>> >>> Cheers, Fabian
>> >>>
>> >>> 2015-10-19 14:50 GMT+02:00 Kostas Tzoumas <ktzou...@apache.org>:
>> 
>>  Hi everyone,
>> 
>>  I started a "Powered by Flink" wiki page, listing some of the
>>  organizations that are using Flink:
>> 
>> 
>> https://cwiki.apache.org/confluence/display/FLINK/Powered+by+Flink
>> 
>>  If you would like to be added to the list, just send me a short email
>>  with your organization's name and a description and I will add you to
>>  the wiki page.
>> 
>>  Best,
>>  Kostas
>> 
>> >>>
>> 
>> 
>> >>
>>
>>
>
>