Reduced Availability from 17.6. to 24.6.

2017-06-16 Thread Aljoscha Krettek
Hi,

I’ll be on vacation next week, just in case anyone is wondering why I’m not 
responding. :-)

Best,
Aljoscha

Re: FlinkML on slack

2017-06-16 Thread Andrew Psaltis
Hi Stavros,
Could you please invite me to the FlinkML Slack channel as well?

Thanks,
Andrew

On Thu, Jun 15, 2017 at 3:08 PM, Lokesh Amarnath 
wrote:

> Hi Stavros,
>
> Could you also please add me to the Slack channel? My email id is:
> lokesh.amarn...@gmail.com.
>
> Thanks,
> Lokesh
>
>
>
> On Thu, Jun 15, 2017 at 6:27 PM, Stavros Kontopoulos <
> st.kontopou...@gmail.com> wrote:
>
> > Ziyad added.
> >
> > Stavros
> >
> > On Sun, Jun 11, 2017 at 4:45 PM, Ziyad Muhammed 
> wrote:
> >
> > > Hi Stavros
> > >
> > > Could you please send me an invite to the slack channel?
> > >
> > > Best
> > > Ziyad
> > >
> > >
> > > On Sun, Jun 11, 2017 at 1:53 AM, Stavros Kontopoulos <
> > > st.kontopou...@gmail.com> wrote:
> > >
> > > > @Henry @Tao @Martin invitations sent... Thnx @Theo for handling the
> > > Apache
> > > > compliance issues.
> > > >
> > > > Best,
> > > > Stavros
> > > >
> > > > On Sat, Jun 10, 2017 at 10:27 PM, Henry Saputra <
> > henry.sapu...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > Hi Stavros,
> > > > >
> > > > > Could you also send me invite to the Slack?
> > > > >
> > > > > My email is hsapu...@apache.org
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Henry
> > > > >
> > > > >
> > > > > On Thu, Jun 8, 2017 at 2:21 AM, Stavros Kontopoulos <
> > > > > st.kontopou...@gmail.com> wrote:
> > > > >
> > > > > > Hi Aljoscha,
> > > > > >
> > > > > > Slack is invite only to the best of my knowledge, I just sent you
> > an
> > > > > > invitation.
> > > > > >
> > > > > > Best,
> > > > > > Stavros
> > > > > >
> > > > > >
> > > > > > On Thu, Jun 8, 2017 at 11:31 AM, Aljoscha Krettek <
> > > aljos...@apache.org
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > Is the slack invite based? If yes, could you please send me
> one?
> > > > > > >
> > > > > > > Best,
> > > > > > > Aljoscha
> > > > > > >
> > > > > > > > On 7. Jun 2017, at 21:56, Stavros Kontopoulos <
> > > > > > st.kontopou...@gmail.com>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > We took the initiative to create the organization for FlinkML
> > on
> > > > > slack
> > > > > > > > (thnx Eron).
> > > > > > > > There is now a channel for model-serving
> > > > > > > > <https://docs.google.com/document/d/1CjWL9aLxPrKytKxUF5c3ohs0ickp0fdEXPsPYPEywsE/edit#>.
> > > > > > > > Another is coming for flink-jpmml.
> > > > > > > > You are invited to join the channels and the efforts. @Gabor
> > > @Theo
> > > > > > please
> > > > > > > > consider adding channels for the other efforts there as well.
> > > > > > > >
> > > > > > > > FlinkML on Slack (https://flinkml.slack.com/)
> > > > > > > >
> > > > > > > > Details for the efforts here: Flink Roadmap doc
> > > > > > > > <https://docs.google.com/document/d/1afQbvZBTV15qF3vobVWUjxQc49h3Ud06MIRhahtJ6dw/edit#>
> > > > > > > >
> > > > > > > > GitHub (https://github.com/FlinkML)
> > > > > > > >
> > > > > > > >
> > > > > > > > Stavros
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>



-- 
Thanks,
Andrew

Subscribe to my book: Streaming Data 

twitter: @itmdata


[jira] [Created] (FLINK-6939) Not store IterativeCondition with NFA state

2017-06-16 Thread Jark Wu (JIRA)
Jark Wu created FLINK-6939:
--

 Summary: Not store IterativeCondition with NFA state
 Key: FLINK-6939
 URL: https://issues.apache.org/jira/browse/FLINK-6939
 Project: Flink
  Issue Type: Sub-task
Reporter: Jark Wu
Assignee: Jark Wu


Currently, the IterativeCondition is stored as part of the NFA state and is 
de/serialized every time the NFA state is updated or read. This is a heavy 
operation and is not necessary. In addition, separating the condition from the 
state is required for FLINK-6938.
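A minimal, self-contained sketch of the idea (the class and method names below are hypothetical illustrations, not Flink's actual NFA classes): the serialized state only carries a reference to its condition by name, and the condition objects live in a transient registry, so they never round-trip through serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical illustration (not Flink's actual NFA classes): the state
// references its condition by name, so serializing the state never
// touches the condition object.
public class NfaStateSketch implements Serializable {
    private final String currentState;
    private final String conditionName; // reference, not the condition itself

    // Registry held outside the serialized state: conditions are registered
    // once per operator instance and are never serialized with the state.
    private static final Map<String, Predicate<String>> CONDITIONS = new HashMap<>();

    public NfaStateSketch(String currentState, String conditionName) {
        this.currentState = currentState;
        this.conditionName = conditionName;
    }

    public static void registerCondition(String name, Predicate<String> condition) {
        CONDITIONS.put(name, condition);
    }

    public boolean accept(String event) {
        return CONDITIONS.get(conditionName).test(event);
    }

    // The state round-trips through serialization without the condition.
    public static NfaStateSketch roundTrip(NfaStateSketch state) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(state);
        }
        try (ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            return (NfaStateSketch) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        registerCondition("startsWithA", s -> s.startsWith("a"));
        NfaStateSketch restored = roundTrip(new NfaStateSketch("start", "startsWithA"));
        System.out.println(restored.accept("apple"));  // prints true
        System.out.println(restored.accept("banana")); // prints false
    }
}
```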




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-6938) IterativeCondition should support RichFunction interface

2017-06-16 Thread Jark Wu (JIRA)
Jark Wu created FLINK-6938:
--

 Summary: IterativeCondition should support RichFunction interface
 Key: FLINK-6938
 URL: https://issues.apache.org/jira/browse/FLINK-6938
 Project: Flink
  Issue Type: Sub-task
  Components: CEP, Table API & SQL
Reporter: Jark Wu
Assignee: Jark Wu


In FLIP-20, we need IterativeCondition to support an {{open()}} method so that 
the generated code is compiled only once. We do not want to insert an if 
condition in the {{filter()}} method. So I suggest making IterativeCondition 
support the {{RichFunction}} interface.
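A self-contained sketch of the pattern being proposed (the interfaces below are a hypothetical mini version, not Flink's actual CEP API): expensive setup happens once in open(), so filter() stays free of lazy-initialization branches.

```java
// Hypothetical mini version of a "rich" condition: a one-time open()
// hook replaces per-record lazy initialization inside filter().
import java.util.regex.Pattern;

public class RichConditionSketch {
    interface RichCondition<T> {
        void open();             // called once before any filtering
        boolean filter(T value); // hot path: no init checks needed
    }

    // Example: compile a regex once in open() instead of guarding
    // filter() with an "if (pattern == null)" branch.
    static class RegexCondition implements RichCondition<String> {
        private final String regex;
        private Pattern compiled;

        RegexCondition(String regex) { this.regex = regex; }

        @Override public void open() { compiled = Pattern.compile(regex); }

        @Override public boolean filter(String value) {
            return compiled.matcher(value).matches();
        }
    }

    public static void main(String[] args) {
        RichCondition<String> cond = new RegexCondition("a+b");
        cond.open(); // the runtime would call this once per task
        System.out.println(cond.filter("aaab")); // prints true
        System.out.println(cond.filter("abc"));  // prints false
    }
}
```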





Re: [DISCUSS] Table API / SQL features for Flink 1.4.0

2017-06-16 Thread jincheng sun
Hi Fabian,
Thanks for bringing up this discussion.
In order to enrich Flink's built-in scalar functions and provide a friendly user
experience, I recommend adding as many scalar functions as possible in the
1.4 release. I have filed the JIRAs (
https://issues.apache.org/jira/browse/FLINK-6810) and will try my best to work
on them.

Of course, anybody is welcome to add sub-tasks or take the JIRAs.

Cheers,
SunJincheng

2017-06-16 16:07 GMT+08:00 Fabian Hueske :

> Thanks for your response Shaoxuan,
>
> My "Table-table join with retraction" is probably the same as your
> "unbounded stream-stream join with retraction".
> Basically, a join between two dynamic tables with unique keys (either
> because of an upsert stream->table conversion or an unbounded aggregation).
>
> Best, Fabian
>
> 2017-06-16 0:56 GMT+02:00 Shaoxuan Wang :
>
> > Nice timing, Fabian!
> >
> > Your checklist aligns with our plans very well. Here are the things we are
> > working on & planning to contribute to release 1.4:
> > 1. DDL (with property waterMark config for source-table, and emit config on
> > result-table)
> > 2. unbounded stream-stream joins (with retraction supported)
> > 3. backend state user interface for UDAGG
> > 4. UDOP (as opposed to UDF (scalars to scalar)/UDTF (scalar to
> > table)/UDAGG (table to scalar), this allows users to define table-to-table
> > conversion business logic)
> >
> > Some of them already have a PR/JIRA, while some do not. We will send out the
> > design doc for the missing ones very soon. Looking forward to the 1.4
> > release.
> >
> > Btw, what is "Table-Table (with retraction)" you have mentioned in your
> > plan?
> >
> > Regards,
> > Shaoxuan
> >
> >
> >
> > On Thu, Jun 15, 2017 at 10:29 PM, Fabian Hueske 
> wrote:
> >
> > > Hi everybody,
> > >
> > > I would like to start a discussion about the targeted feature set of
> the
> > > Table API / SQL for Flink 1.4.0.
> > > Flink 1.3.0 was released about 2 weeks ago and we have 2.5 months (~11
> > > weeks, until begin of September) left until the feature freeze for
> Flink
> > > 1.4.0.
> > >
> > > I think it makes sense to start with a collection of desired features.
> > Once
> > > we have a list of requested features, we might want to prioritize and
> > maybe
> > > also assign responsibilities.
> > >
> > > When we prioritize, we should keep in mind that:
> > > - we want to have a consistent API. Larger features should be developed
> > in
> > > a feature branch first.
> > > - the next months are typical time for vacations
> > > - we have been bottlenecked by committer resources in the last release.
> > >
> > > I think the following features would be a nice addition to the current
> > > state:
> > >
> > > - Conversion of a stream into an upsert table (with retraction,
> updating
> > to
> > > the last row per key)
> > > - Joins for streaming tables
> > >   - Stream-Stream (time-range predicate); there is already a PR for
> > > processing-time joins
> > >   - Table-Table (with retraction)
> > > - Support for late arriving records in group window aggregations
> > > - Exposing a keyed result table as queryable state
> > >
> > > Which features are others looking for?
> > >
> > > Cheers,
> > > Fabian
> > >
> >
>


[jira] [Created] (FLINK-6937) Fix link markdown in Production Readiness Checklist doc

2017-06-16 Thread Juan Paulo Gutierrez (JIRA)
Juan Paulo Gutierrez created FLINK-6937:
---

 Summary: Fix link markdown in Production Readiness Checklist doc
 Key: FLINK-6937
 URL: https://issues.apache.org/jira/browse/FLINK-6937
 Project: Flink
  Issue Type: Improvement
Reporter: Juan Paulo Gutierrez
Priority: Minor








[jira] [Created] (FLINK-6936) Add multiple targets support for custom partitioner

2017-06-16 Thread Xingcan Cui (JIRA)
Xingcan Cui created FLINK-6936:
--

 Summary: Add multiple targets support for custom partitioner
 Key: FLINK-6936
 URL: https://issues.apache.org/jira/browse/FLINK-6936
 Project: Flink
  Issue Type: Improvement
  Components: DataStream API
Reporter: Xingcan Cui
Assignee: Xingcan Cui
Priority: Minor


The current user-facing Partitioner only allows returning one target.
{code:java}
@Public
public interface Partitioner<K> extends java.io.Serializable, Function {

    /**
     * Computes the partition for the given key.
     *
     * @param key The key.
     * @param numPartitions The number of partitions to partition into.
     * @return The partition index.
     */
    int partition(K key, int numPartitions);
}
{code}
Actually, this function should be able to return multiple partitions; 
restricting it to a single target may be a historical legacy.
There are at least three approaches to solve this:
# Make the `protected DataStream<T> setConnectionType(StreamPartitioner<T> 
partitioner)` method in DataStream public, which allows users to define a 
StreamPartitioner directly.
# Change the `partition` method in the Partitioner interface to return an int 
array instead of a single int value.
# Add a new `multicast` method to DataStream and provide a MultiPartitioner 
interface which returns an int array.

Considering API consistency, the 3rd approach seems to be an acceptable 
choice. [~aljoscha], what do you think?
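A self-contained sketch of what the 3rd approach's interface could look like (the MultiPartitioner name comes from the proposal above; everything else, including the sample partitioning logic, is a hypothetical illustration, not Flink code):

```java
import java.util.Arrays;

public class MultiPartitionerSketch {
    // Proposed shape: like Partitioner, but returns every target
    // partition for the key instead of exactly one.
    interface MultiPartitioner<K> extends java.io.Serializable {
        int[] partitions(K key, int numPartitions);
    }

    // Example: send each record to its hash partition plus partition 0
    // (purely illustrative multicast logic).
    static class HashPlusZero implements MultiPartitioner<String> {
        @Override
        public int[] partitions(String key, int numPartitions) {
            int hash = Math.floorMod(key.hashCode(), numPartitions);
            return hash == 0 ? new int[]{0} : new int[]{0, hash};
        }
    }

    public static void main(String[] args) {
        MultiPartitioner<String> p = new HashPlusZero();
        // Prints the target partitions for one key, e.g. "[0, 3]".
        System.out.println(Arrays.toString(p.partitions("some-key", 4)));
    }
}
```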





Re: [DISCUSS]: Integrating Flink Table API & SQL with CEP

2017-06-16 Thread Jark Wu
Thanks Fabian for the standard Row Pattern Recognition in SQL!

I just created a FLIP for this feature.

FLIP-20:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-20%3A+Integration+of+SQL+and+CEP

Cheers,
Jark Wu

2017-06-16 0:49 GMT+08:00 Fabian Hueske :

> Hi everybody,
>
> I just stumbled over this blog post [1] which discusses new features in SQL
> 2016.
> Apparently the match_recognize clause is part of that. The blogpost also
> contains a slide set that presents the pattern matching feature and a link
> to a 90 page technical report.
>
> I thought this might be helpful as a reference.
>
> Cheers, Fabian
>
> [1] http://modern-sql.com/blog/2017-06/whats-new-in-sql-2016
>
> 2017-06-15 11:53 GMT+02:00 Till Rohrmann :
>
> > @Jark: You should now have the permissions to create pages in the Flink
> > wiki.
> >
> > Cheers,
> > Till
> >
> > On Thu, Jun 15, 2017 at 5:11 AM, Jark Wu  wrote:
> >
> > > Hi Till,
> > >
> > > Could you grant me the edit permission of Flink WIKI? My id is imjark.
> > >
> > > Thanks,
> > > Jark Wu
> > >
> > > 2017-06-15 0:07 GMT+08:00 Till Rohrmann :
> > >
> > > > I think that the integration of SQL and CEP would make a good FLIP.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Wed, Jun 14, 2017 at 2:40 PM, Jark Wu  wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > Do you think whether we should create a FLIP for this proposal to
> > track
> > > > > progress?
> > > > >
> > > > > Regards,
> > > > > Jark
> > > > >
> > > > > 2017-06-13 16:59 GMT+08:00 Dian Fu :
> > > > >
> > > > > > Hi Fabian,
> > > > > >
> > > > > > Thanks a lot. Agree that we can start working by adding the
> missing
> > > > > > features of the CEP library.
> > > > > >
> > > > > > Best regards,
> > > > > > Dian
> > > > > >
> > > > > > On Tue, Jun 13, 2017 at 4:26 PM, Fabian Hueske <
> fhue...@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > @Dian Fu: I gave you contributor permissions. :-)
> > > > > > >
> > > > > > > I don't think we have to wait for Calcite 1.13 to start working
> > on
> > > > the
> > > > > > > missing features of the CEP library and extending the
> prototype.
> > > > > > > We might want to wait with the integration into flink-table
> until
> > > > > Calcite
> > > > > > > 1.13 is out and we updated the dependency though.
> > > > > > >
> > > > > > > Best, Fabian
> > > > > > >
> > > > > > > 2017-06-13 9:45 GMT+02:00 jincheng sun <
> sunjincheng...@gmail.com
> > >:
> > > > > > >
> > > > > > > > Hi Jark, Dian,
> > > > > > > >
> > > > > > > > Thanks for bring up this discuss and share the prototype.
> > > > > > > >
> > > > > > > > +1 to push this great feature forward!
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > SunJincheng
> > > > > > > >
> > > > > > > > 2017-06-13 15:34 GMT+08:00 Jark Wu :
> > > > > > > >
> > > > > > > > > Thank you Yueting for pointing out the mistake in the
> > > > > > > > > prototype. I accidentally introduced it when merging the code.
> > > > > > > > >
> > > > > > > > > I'm so glad to see so many people are interested in the
> > > feature.
> > > > > > Let's
> > > > > > > > work
> > > > > > > > > out together to push it forward!
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > > Jark Wu
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 2017-06-13 15:27 GMT+08:00 Liangfei Su <
> suliang...@gmail.com
> > >:
> > > > > > > > >
> > > > > > > > > > +1 for the feature. I was myself a user of Siddhi; this is a
> > > > > > > > > > pretty user-friendly feature to provide to users.
> > > > > > > > > >
> > > > > > > > > > On Tue, Jun 13, 2017 at 3:09 PM, Dian Fu <
> > > > dian0511...@gmail.com>
> > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Yueting & Dawid Wysakowicz,
> > > > > > > > > > >
> > > > > > > > > > > Very glad that you're interested in this feature and
> > you're
> > > > > > > > definitely
> > > > > > > > > > > welcome to join this work and also anyone else.:)
> > > > > > > > > > >
> > > > > > > > > > > Best regards,
> > > > > > > > > > > Dian
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Jun 13, 2017 at 2:35 PM, Dawid Wysakowicz <
> > > > > > > > > > > wysakowicz.da...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi all,
> > > > > > > > > > > >
> > > > > > > > > > > > Integrating SQL with CEP seems like a really nice
> idea.
> > > > > > > > > Unfortunately I
> > > > > > > > > > > had
> > > > > > > > > > > > time just for a brief look at the design doc, but it
> > > looks
> > > > > > really
> > > > > > > > > cool
> > > > > > > > > > > and
> > > > > > > > > > > > thorough. Also will have a second run tomorrow and
> will
> > > try
> > > > > to
> > > > > > > > > provide
> > > > > > > > > > > more
> > > > > > > > > > > > comments. Anyway will be glad to help pushing the
> > > > initiative
> > > > > > > > forward.
> 

Re: [DISCUSS] FLIP-19: Improved BLOB storage architecture

2017-06-16 Thread Biao Liu
Hi Till

I agree with you about Flink's DC; it is another topic indeed. I just
thought that we could think more about it before refactoring the BLOB service,
to make sure that it is easy to implement the DC on the refactored architecture.

I have another question about the BLOB service. Can we abstract the BLOB
service behind some high-level interfaces? Maybe just some put/get methods in
the interfaces. Being easy to extend would be useful in some scenarios.

For example, in Yarn mode, there are some cool features that interest us.
1. Yarn can localize files only once in one slave machine, all TMs in the
same job can share these files. That may save lots of bandwidth for large
scale jobs or jobs which have large BLOBs.
2. We can skip uploading files if they are already on DFS. That's a common
scenario in distributed cache.
3. Even more, we actually don't need a BlobServer component in Yarn mode.
We can rely on DFS to distribute files; there is always a DFS available in a
Yarn cluster.

If we do so, the network-based BLOB service can remain the default
implementation, which works in any situation. It is also clear that it
does not depend on Hadoop explicitly, and we can do some optimization in
different kinds of clusters without any hacking.

Those are just some rough ideas. But I think well-abstracted
interfaces will be very helpful.
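A self-contained sketch of what such a put/get abstraction could look like (all names here are hypothetical illustrations, not a concrete proposal for the actual interface), with a trivial in-memory implementation standing in for the network-based default:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class BlobServiceSketch {
    // Minimal high-level BLOB interface: implementations could be the
    // current network-based server, a DFS-backed store on Yarn, etc.
    interface BlobStore {
        String put(byte[] content); // returns a blob key
        byte[] get(String key);     // fetches the blob by key
    }

    // Stand-in implementation; a DFS-backed one would write files
    // instead and let Yarn localize them once per machine.
    static class InMemoryBlobStore implements BlobStore {
        private final Map<String, byte[]> blobs = new HashMap<>();

        @Override public String put(byte[] content) {
            String key = UUID.randomUUID().toString();
            blobs.put(key, content.clone());
            return key;
        }

        @Override public byte[] get(String key) {
            byte[] content = blobs.get(key);
            if (content == null) {
                throw new IllegalArgumentException("unknown blob: " + key);
            }
            return content.clone();
        }
    }

    public static void main(String[] args) {
        BlobStore store = new InMemoryBlobStore();
        String key = store.put("job.jar bytes".getBytes());
        System.out.println(new String(store.get(key))); // prints "job.jar bytes"
    }
}
```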


Re: [DISCUSS] Table API / SQL features for Flink 1.4.0

2017-06-16 Thread Fabian Hueske
Thanks for your response Shaoxuan,

My "Table-table join with retraction" is probably the same as your
"unbounded stream-stream join with retraction".
Basically, a join between two dynamic tables with unique keys (either
because of an upsert stream->table conversion or an unbounded aggregation).

Best, Fabian

2017-06-16 0:56 GMT+02:00 Shaoxuan Wang :

> Nice timing, Fabian!
>
> Your checklist aligns with our plans very well. Here are the things we are
> working on & planning to contribute to release 1.4:
> 1. DDL (with property waterMark config for source-table, and emit config on
> result-table)
> 2. unbounded stream-stream joins (with retraction supported)
> 3. backend state user interface for UDAGG
> 4. UDOP (as opposed to UDF (scalars to scalar)/UDTF (scalar to
> table)/UDAGG (table to scalar), this allows users to define table-to-table
> conversion business logic)
>
> Some of them already have a PR/JIRA, while some do not. We will send out the
> design doc for the missing ones very soon. Looking forward to the 1.4
> release.
>
> Btw, what is "Table-Table (with retraction)" you have mentioned in your
> plan?
>
> Regards,
> Shaoxuan
>
>
>
> On Thu, Jun 15, 2017 at 10:29 PM, Fabian Hueske  wrote:
>
> > Hi everybody,
> >
> > I would like to start a discussion about the targeted feature set of the
> > Table API / SQL for Flink 1.4.0.
> > Flink 1.3.0 was released about 2 weeks ago and we have 2.5 months (~11
> > weeks, until begin of September) left until the feature freeze for Flink
> > 1.4.0.
> >
> > I think it makes sense to start with a collection of desired features.
> Once
> > we have a list of requested features, we might want to prioritize and
> maybe
> > also assign responsibilities.
> >
> > When we prioritize, we should keep in mind that:
> > - we want to have a consistent API. Larger features should be developed
> in
> > a feature branch first.
> > - the next months are typical time for vacations
> > - we have been bottlenecked by committer resources in the last release.
> >
> > I think the following features would be a nice addition to the current
> > state:
> >
> > - Conversion of a stream into an upsert table (with retraction, updating
> to
> > the last row per key)
> > - Joins for streaming tables
> >   - Stream-Stream (time-range predicate); there is already a PR for
> > processing-time joins
> >   - Table-Table (with retraction)
> > - Support for late arriving records in group window aggregations
> > - Exposing a keyed result table as queryable state
> >
> > Which features are others looking for?
> >
> > Cheers,
> > Fabian
> >
>
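The "Table-Table (with retraction)" join discussed above can be illustrated with a self-contained toy sketch (hypothetical class names, not the Table API design): when one side of a keyed join is updated, the previously emitted joined row is retracted before the new row is emitted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy sketch of a key-equi join between two upsert "tables" with
// retraction: an update on one side retracts the stale joined row first.
public class RetractionJoinSketch {
    static class Change {
        final int flag; // +1 = add, -1 = retract
        final String key, left, right;
        Change(int flag, String key, String left, String right) {
            this.flag = flag; this.key = key; this.left = left; this.right = right;
        }
    }

    // Latest value per unique key on each side (the "dynamic tables").
    private final Map<String, String> leftTable = new HashMap<>();
    private final Map<String, String> rightTable = new HashMap<>();
    private final List<Change> output = new ArrayList<>();

    void upsertLeft(String key, String value) {
        String right = rightTable.get(key);
        String old = leftTable.put(key, value);
        if (right != null) {
            if (old != null) output.add(new Change(-1, key, old, right)); // retract stale row
            output.add(new Change(+1, key, value, right));
        }
    }

    void upsertRight(String key, String value) {
        String left = leftTable.get(key);
        String old = rightTable.put(key, value);
        if (left != null) {
            if (old != null) output.add(new Change(-1, key, left, old));
            output.add(new Change(+1, key, left, value));
        }
    }

    List<Change> output() { return output; }

    public static void main(String[] args) {
        RetractionJoinSketch join = new RetractionJoinSketch();
        join.upsertLeft("k", "L1");  // no match yet, nothing emitted
        join.upsertRight("k", "R1"); // emits +(L1,R1)
        join.upsertLeft("k", "L2");  // emits -(L1,R1), then +(L2,R1)
        for (Change c : join.output()) {
            System.out.println((c.flag > 0 ? "+" : "-")
                    + "(" + c.left + "," + c.right + ")");
        }
    }
}
```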


[jira] [Created] (FLINK-6935) Integration of SQL and CEP

2017-06-16 Thread Jark Wu (JIRA)
Jark Wu created FLINK-6935:
--

 Summary: Integration of SQL and CEP
 Key: FLINK-6935
 URL: https://issues.apache.org/jira/browse/FLINK-6935
 Project: Flink
  Issue Type: New Feature
  Components: CEP, Table API & SQL
Reporter: Jark Wu


Flink's CEP library is a great library for complex event processing, and more 
and more customers are expressing their interest in it. But it also has some 
limitations: users usually have to write a lot of code even for a very simple 
pattern-match use case, as it currently only supports the Java API.

CEP DSLs and SQL strongly resemble each other; CEP's additional features 
compared to SQL boil down to pattern detection. So it would be awesome to 
consolidate CEP and SQL. It makes SQL more powerful by supporting more usage 
scenarios, and it gives users the ability to easily and quickly build CEP 
applications.

The FLIP can be found here:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-20%3A+Integration+of+SQL+and+CEP

This is an umbrella issue for the FLIP. We should wait for Calcite 1.13 to 
start this work.


