[jira] [Created] (FLINK-6906) TumblingProcessingTimeWindows can't set offset parameter to a negative value

2017-06-12 Thread bmnouwq (JIRA)
bmnouwq created FLINK-6906:
--

 Summary: TumblingProcessingTimeWindows can't set offset 
parameter to a negative value
 Key: FLINK-6906
 URL: https://issues.apache.org/jira/browse/FLINK-6906
 Project: Flink
  Issue Type: Bug
  Components: DataStream API
Affects Versions: 1.2.1, 1.3.0, 1.2.0, 1.2.2
Reporter: bmnouwq


Excuse my English.

In the DataStream API, I give
{code:java}
org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows
{code}
an offset with a negative value such as Time.hours(-8). When I try to submit 
the job, it throws an IllegalArgumentException: "TumblingProcessingTimeWindows 
parameters must satisfy 0 <= offset < size".

This is problematic for users who live in a time zone other than UTC+00:00.

{code:java}
org.apache.flink.streaming.api.windowing.assigners.SlidingEventTimeWindows
{code},
{code:java}
org.apache.flink.streaming.api.windowing.assigners.SlidingProcessingTimeWindows
{code},
{code:java}
org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows
{code} have the same problem.

We should compare the absolute value of the offset with the size in the 
constructor.
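As a hedged sketch of the proposed fix (illustrative names, not the actual Flink constructor code): rather than rejecting negative offsets outright, the constructor could accept any offset with |offset| < size and normalize it into [0, size):

```java
public class OffsetCheck {
    /**
     * Illustrative normalization: maps any offset with |offset| < size
     * into the equivalent value in [0, size), so Time.hours(-8) with a
     * 24h window behaves like an offset of +16h.
     */
    static long normalizeOffset(long offset, long size) {
        if (Math.abs(offset) >= size) {
            throw new IllegalArgumentException(
                "parameters must satisfy -size < offset < size");
        }
        // Java's % keeps the sign of the dividend, so add size and reduce again.
        return ((offset % size) + size) % size;
    }

    public static void main(String[] args) {
        long hour = 3_600_000L;
        System.out.println(normalizeOffset(-8 * hour, 24 * hour) / hour); // 16
        System.out.println(normalizeOffset(2 * hour, 24 * hour) / hour);  // 2
    }
}
```

With this, Time.hours(-8) on a one-day window behaves like an offset of +16 hours, which covers the time-zone use case described above.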



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-6905) Support for zeroOrMore to CEP's pattern API

2017-06-12 Thread Dian Fu (JIRA)
Dian Fu created FLINK-6905:
--

 Summary: Support for zeroOrMore to CEP's pattern API
 Key: FLINK-6905
 URL: https://issues.apache.org/jira/browse/FLINK-6905
 Project: Flink
  Issue Type: Bug
  Components: CEP
Reporter: Dian Fu


Currently the quantifier supports oneOrMore, times(int times), and one(); we 
should also support an API such as zeroOrMore.
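The CEP library cannot be run standalone here, but the requested quantifier maps onto a familiar regex quantifier; a plain-Java analogy of the intended semantics (java.util.regex, not the CEP API):

```java
import java.util.regex.Pattern;

public class QuantifierAnalogy {
    public static void main(String[] args) {
        // oneOrMore corresponds to regex '+': at least one 'b' is required.
        System.out.println(Pattern.matches("ab+c", "ac"));  // false
        // The proposed zeroOrMore corresponds to '*': 'b' may be entirely absent.
        System.out.println(Pattern.matches("ab*c", "ac"));  // true
    }
}
```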





Re: [DISCUSS]: Integrating Flink Table API & SQL with CEP

2017-06-12 Thread Dian Fu
Hi Fabian,

We have evaluated the missing features of Flink CEP roughly; it should not
be too difficult to support them. Kostas, Dawid, what are your thoughts?

For supporting MATCH_RECOGNIZE, do you think we could create the JIRAs and
start working right now, or should we wait until the release of Calcite 1.13?

Btw, could you help add me (dian.fu) to the contributor list, so that I can
assign the JIRAs to myself? Thanks a lot.

Best regards,
Dian

On Tue, Jun 13, 2017 at 3:59 AM, Fabian Hueske  wrote:

> Hi Jark,
>
> Thanks for updating the design doc and sharing your prototype!
> I didn't look at the code in detail, but the fact that it is less than 1k
> LOC is very promising. It seems that most of the complex CEP logic can be
> reused :-)
> Adding a dependency on flink-cep should be fine, IMO. It is a very slim
> library with almost none external dependencies.
>
> Regarding the missing features of Flink CEP that you listed in the design
> doc, it would be good to get some in put from Kostas and Dawid which are
> the main contributors to CEP.
> Do you have already plans regarding some of the missing features or can you
> assess how hard it would be to integrate them?
>
> Cheers, Fabian
>
> Btw. The Calcite community started a discussion about releasing Calcite
> 1.13. So, the missing features might soon be available.
>
> 2017-06-12 14:25 GMT+02:00 Jark Wu :
>
> > Hi guys,
> >
> > Good news! We have made a prototype for integrating CEP and SQL. See this
> > link
> > https://github.com/apache/flink/compare/master...
> > wuchong:cep-on-sql?expand=1
> >
> >
> > You can check CepITCase to try the simple CQL example.
> >
> > Meanwhile, we updated our design doc with additional implementation
> detail,
> > including how
> > to translate MATCH_RECOGNIZE into CEP API, and the features needed to add
> > to Flink CEP,
> > and the implementation plan. See the document
> > https://docs.google.com/document/d/1HaaO5eYI1VZjyhtVPZOi3jVzikU7i
> > K15H0YbniTnN30/edit#heading=h.4oas4koy8qu3
> >
> > In the prototype, we make flink-table dependency on flink-cep. Do you
> think
> > that is fine?
> >
> > What do you think about the prototype and the design doc ?
> >
> > Any feedbacks are welcome!
> >
> > Cheers,
> > Jark Wu
> >
> >
> > 2017-06-08 17:54 GMT+08:00 Till Rohrmann :
> >
> > > Thanks for sharing your ideas with the community. I really like the
> > design
> > > document and think that it's a good approach to follow Oracle's SQL
> > > extension for pattern matching. Looking forward to having support for
> SQL
> > > with CEP capabilities :-)
> > >
> > > Cheers,
> > > Till
> > >
> > > On Thu, Jun 8, 2017 at 8:57 AM, Jark Wu  wrote:
> > >
> > > > Hi  @Kostas, @Fabian, thank you for your support.
> > > >
> > > > @Fabian, I totally agree with you that we should focus on SQL first.
> > > Let's
> > > > keep Table API in mind and discuss that later.
> > > >
> > > > Regarding to the orderBy() clause, I'm not sure about that. I think
> it
> > > > makes sense to make it required in streaming mode(either order by
> > rowtime
> > > > or order by proctime). But CEP also works in batch mode, and not
> > > necessary
> > > > to order by some column. Nevertheless, we can support CEP on batch
> SQL
> > > > later.
> > > >
> > > > We are estimating how to implement MATCH_RECOGNIZE with CEP library
> > (with
> > > > NFA, CEP operator). And we will output a detailed doc and a prototype
> > in
> > > > the next days.
> > > >
> > > > Regards,
> > > > Jark Wu
> > > >
> > > >
> > > > 2017-06-07 21:40 GMT+08:00 Fabian Hueske :
> > > >
> > > >> Thanks Dian and Jark for this proposal!
> > > >>
> > > >> As you wrote, Till and I (and Kostas) have been thinking about this
> > for
> > > >> some time but haven't had time to work on this feature.
> > > >> I think it would be a great addition and value add for Flink's SQL
> > > >> support and Table API.
> > > >>
> > > >> I read the proposal and think it is very good. We might need to add
> a
> > > bit
> > > >> more details, esp. when planning the concrete steps of the
> > > implementation.
> > > >>
> > > >> A few comments to the proposal:
> > > >> - IMO, the development should start focusing on SQL and its
> semantics.
> > > >> Pattern support for the Table API should be added later. We followed
> > > that
> > > >> approach for the OVER windows and I think it worked quiet well.
> > > >> - We probably want to reuse as much as possible from the CEP
> library.
> > > >> That means we need to check if the semantics of the CEP library and
> > > >> Oracle's PATTERN syntax are aligned (or how we can express the
> PATTERN
> > > >> semantics with the CEP library). This should be one of the first
> > steps,
> > > IMO.
> > > >> - I would make the orderBy() clause required. In regular SQL rows
> have
> > > no
> > > >> order, so we need to make that explicit (this would also be
> consistent
> > > with
> > > >> the OVER windows).

[jira] [Created] (FLINK-6904) Support for quantifier range to CEP's pattern API

2017-06-12 Thread Dian Fu (JIRA)
Dian Fu created FLINK-6904:
--

 Summary: Support for quantifier range to CEP's pattern API
 Key: FLINK-6904
 URL: https://issues.apache.org/jira/browse/FLINK-6904
 Project: Flink
  Issue Type: Bug
  Components: CEP
Reporter: Dian Fu


Currently the quantifier supports oneOrMore, times(int times), and one(); we 
should also support an API such as times(int from, int to) to specify a 
quantifier range.
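As with the other quantifiers, times(int from, int to) corresponds to a familiar regex range quantifier; a plain-Java sketch of the intended semantics (java.util.regex, not the CEP API):

```java
import java.util.regex.Pattern;

public class RangeQuantifier {
    public static void main(String[] args) {
        // The proposed times(from, to) corresponds to the regex range '{from,to}'.
        System.out.println(Pattern.matches("ab{2,4}c", "abbbc")); // true: 3 is in [2,4]
        System.out.println(Pattern.matches("ab{2,4}c", "abc"));   // false: only 1 'b'
    }
}
```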





Re: [DISCUSS]: Integrating Flink Table API & SQL with CEP

2017-06-12 Thread Fabian Hueske
Hi Jark,

Thanks for updating the design doc and sharing your prototype!
I didn't look at the code in detail, but the fact that it is less than 1k
LOC is very promising. It seems that most of the complex CEP logic can be
reused :-)
Adding a dependency on flink-cep should be fine, IMO. It is a very slim
library with almost none external dependencies.

Regarding the missing features of Flink CEP that you listed in the design
doc, it would be good to get some input from Kostas and Dawid, who are
the main contributors to CEP.
Do you already have plans regarding some of the missing features, or can you
assess how hard it would be to integrate them?

Cheers, Fabian

Btw. The Calcite community started a discussion about releasing Calcite
1.13. So, the missing features might soon be available.

2017-06-12 14:25 GMT+02:00 Jark Wu :

> Hi guys,
>
> Good news! We have made a prototype for integrating CEP and SQL. See this
> link
> https://github.com/apache/flink/compare/master...
> wuchong:cep-on-sql?expand=1
>
>
> You can check CepITCase to try the simple CQL example.
>
> Meanwhile, we updated our design doc with additional implementation detail,
> including how
> to translate MATCH_RECOGNIZE into CEP API, and the features needed to add
> to Flink CEP,
> and the implementation plan. See the document
> https://docs.google.com/document/d/1HaaO5eYI1VZjyhtVPZOi3jVzikU7i
> K15H0YbniTnN30/edit#heading=h.4oas4koy8qu3
>
> In the prototype, we make flink-table dependency on flink-cep. Do you think
> that is fine?
>
> What do you think about the prototype and the design doc ?
>
> Any feedbacks are welcome!
>
> Cheers,
> Jark Wu
>
>
> 2017-06-08 17:54 GMT+08:00 Till Rohrmann :
>
> > Thanks for sharing your ideas with the community. I really like the
> design
> > document and think that it's a good approach to follow Oracle's SQL
> > extension for pattern matching. Looking forward to having support for SQL
> > with CEP capabilities :-)
> >
> > Cheers,
> > Till
> >
> > On Thu, Jun 8, 2017 at 8:57 AM, Jark Wu  wrote:
> >
> > > Hi  @Kostas, @Fabian, thank you for your support.
> > >
> > > @Fabian, I totally agree with you that we should focus on SQL first.
> > Let's
> > > keep Table API in mind and discuss that later.
> > >
> > > Regarding to the orderBy() clause, I'm not sure about that. I think it
> > > makes sense to make it required in streaming mode(either order by
> rowtime
> > > or order by proctime). But CEP also works in batch mode, and not
> > necessary
> > > to order by some column. Nevertheless, we can support CEP on batch SQL
> > > later.
> > >
> > > We are estimating how to implement MATCH_RECOGNIZE with CEP library
> (with
> > > NFA, CEP operator). And we will output a detailed doc and a prototype
> in
> > > the next days.
> > >
> > > Regards,
> > > Jark Wu
> > >
> > >
> > > 2017-06-07 21:40 GMT+08:00 Fabian Hueske :
> > >
> > >> Thanks Dian and Jark for this proposal!
> > >>
> > >> As you wrote, Till and I (and Kostas) have been thinking about this
> for
> > >> some time but haven't had time to work on this feature.
> > >> I think it would be a great addition and value add for Flink's SQL
> > >> support and Table API.
> > >>
> > >> I read the proposal and think it is very good. We might need to add a
> > bit
> > >> more details, esp. when planning the concrete steps of the
> > implementation.
> > >>
> > >> A few comments to the proposal:
> > >> - IMO, the development should start focusing on SQL and its semantics.
> > >> Pattern support for the Table API should be added later. We followed
> > that
> > >> approach for the OVER windows and I think it worked quiet well.
> > >> - We probably want to reuse as much as possible from the CEP library.
> > >> That means we need to check if the semantics of the CEP library and
> > >> Oracle's PATTERN syntax are aligned (or how we can express the PATTERN
> > >> semantics with the CEP library). This should be one of the first
> steps,
> > IMO.
> > >> - I would make the orderBy() clause required. In regular SQL rows have
> > no
> > >> order, so we need to make that explicit (this would also be consistent
> > with
> > >> the OVER windows).
> > >>
> > >> Let me know what you think.
> > >>
> > >> Best, Fabian
> > >>
> > >> 2017-06-07 11:41 GMT+02:00 Kostas Kloudas <
> k.klou...@data-artisans.com
> > >:
> > >>
> > >>> Thanks a lot for opening the discussion!
> > >>>
> > >>> This is a really interesting idea that has been in our heads
> > >>> since the first implementation of the CEP library.
> > >>>
> > >>> A big +1 for moving forward with this.
> > >>>
> > >>> And as for the design document, I will definitely have a look
> > >>> and comment there.
> > >>>
> > >>> Kostas
> > >>>
> > >>> On Jun 7, 2017, at 10:05 AM, Jark Wu  wrote:
> > >>>
> > >>> Sorry, I forgot to cc you guys @Fabian, @Timo, @Till, @Kostas
> > >>>
> > >>> 2017-06-07 15:42 GMT+08:00 Jark Wu :
> > >>>
> > 

[jira] [Created] (FLINK-6903) Activate checkstyle for runtime/akka

2017-06-12 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-6903:
-

 Summary: Activate checkstyle for runtime/akka
 Key: FLINK-6903
 URL: https://issues.apache.org/jira/browse/FLINK-6903
 Project: Flink
  Issue Type: Improvement
  Components: Local Runtime
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial
 Fix For: 1.4.0








[jira] [Created] (FLINK-6902) Activate strict checkstyle for flink-streaming-scala

2017-06-12 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-6902:
---

 Summary: Activate strict checkstyle for flink-streaming-scala
 Key: FLINK-6902
 URL: https://issues.apache.org/jira/browse/FLINK-6902
 Project: Flink
  Issue Type: Sub-task
  Components: Scala API
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
Priority: Minor
 Fix For: 1.4.0








Re: [DISCUSS] Removal of twitter-inputformat

2017-06-12 Thread sblackmon
Hello,

Apache Streams (incubating) maintains and publishes json-schemas and 
jackson-compatible POJOs for Twitter and other popular third-party APIs.

http://streams.apache.org/site/0.5.1-incubating-SNAPSHOT/streams-project/streams-contrib/streams-provider-twitter/index.html

We also have a repository of public examples, one of which demonstrates how to 
embed various twitter data collectors into Flink.

http://streams.apache.org/site/0.5.1-incubating-SNAPSHOT/streams-examples/streams-examples-flink/flink-twitter-collection/index.html

We’d welcome support from anyone in the Flink project to help us maintain and 
improve these examples. Flink could retain the benefit of having useful, 
ready-to-run examples for new Flink users, while getting the boring code out of 
its code base. Also, our examples have integration tests that actually connect 
to Twitter and check that everything continues to work :)

if anyone wants to know more about this, feel free to reach out to the team on 
d...@streams.incubator.apache.org

Steve
sblack...@apache.org
On June 12, 2017 at 7:18:08 AM, Aljoscha Krettek (aljos...@apache.org) wrote:

Bumpety-bump.  

I would be in favour or removing this:  
- It can be implemented as a MapFunction parser after a TextInputFormat  
- Additions, changes, fixes that happen on TextInputFormat are not reflected to 
SimpleTweetInputFormat  
- SimpleTweetInput format overrides nextRecord(), which is not something 
DelimitedInputFormats are normally supposed to do, I think  
- The Tweet POJO has a very strange naming scheme  

Best,  
Aljoscha  

> On 7. Jun 2017, at 11:15, Chesnay Schepler  wrote:  
>  
> Hello,  
>  
> I'm proposing to remove the Twitter-InputFormat in FLINK-6710 
> , with an open PR you can 
> find here .  
> The PR currently has a +1 from Robert, but Timo raised some concerns saying 
> that it is useful for prototyping and  
> advised me to start a discussion on the ML.  
>  
> This format is a DelimitedInputFormat that reads JSON objects and turns them 
> into a custom tweet class.  
> I believe this format doesn't provide much value to Flink; there's nothing 
> interesting about it as an InputFormat,  
> as it is purely an exercise in manually converting a JSON object into a POJO. 
>  
> This is apparent since you could just as well use 
> ExecutionEnvironment#readTextFile(...) and throw the parsing logic  
> into a subsequent MapFunction.  
>  
> In the PR i suggested to replace this with a JsonInputFormat, but this was a 
> misguided attempt at getting Timo to agree  
> to the removal. This format has the same problem outlined above, as it could 
> be effectively implemented with a one-liner map function.  
>  
> So the question now is whether we want to keep it, remove it, or replace it 
> with something more general.  
>  
> Regards,  
> Chesnay  



[jira] [Created] (FLINK-6901) Flip checkstyle configuration files

2017-06-12 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-6901:
---

 Summary: Flip checkstyle configuration files
 Key: FLINK-6901
 URL: https://issues.apache.org/jira/browse/FLINK-6901
 Project: Flink
  Issue Type: Improvement
  Components: Checkstyle
Affects Versions: 1.4.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler


We currently have 2 checkstyle files, {{checkstyle.xml}} as the basic version, 
and {{strict-checkstyle.xml}} as a heavily expanded version that is applied to 
most existing modules (see FLINK-6698).

[~greghogan] suggested flipping the checkstyle files while reviewing the PR 
https://github.com/apache/flink/pull/4086, given that the strict checkstyle is 
supposed to subsume the existing checkstyle in the long term.





[jira] [Created] (FLINK-6900) Limit size of individual components in DropwizardReporter

2017-06-12 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-6900:
---

 Summary: Limit size of individual components in DropwizardReporter
 Key: FLINK-6900
 URL: https://issues.apache.org/jira/browse/FLINK-6900
 Project: Flink
  Issue Type: Improvement
  Components: Metrics
Affects Versions: 1.3.0, 1.4.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.3.1, 1.4.0








[jira] [Created] (FLINK-6899) Wrong state array size in NestedMapsStateTable

2017-06-12 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-6899:


 Summary: Wrong state array size in NestedMapsStateTable
 Key: FLINK-6899
 URL: https://issues.apache.org/jira/browse/FLINK-6899
 Project: Flink
  Issue Type: Bug
  Components: State Backends, Checkpointing
Affects Versions: 1.3.0, 1.4.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
Priority: Minor
 Fix For: 1.4.0


The {{NestedMapsStateTable}} initializes the internal state array with a wrong 
size, which leads to wasted memory. We should initialize the array with the 
number of key groups in the assigned range instead of the max parallelism.
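A simplified illustration of the sizing difference (the numbers and names here are hypothetical, not the actual NestedMapsStateTable code):

```java
public class StateArraySizing {
    /** Number of state slots actually needed for an assigned key-group range. */
    static int sizeForRange(int rangeStart, int rangeEnd) {
        return rangeEnd - rangeStart + 1;
    }

    public static void main(String[] args) {
        int maxParallelism = 128;            // total number of key groups
        int rangeStart = 32, rangeEnd = 63;  // key groups assigned to this task

        // Sizing by max parallelism allocates slots for key groups this
        // task never owns; sizing by the assigned range does not.
        Object[] oversized = new Object[maxParallelism];
        Object[] rightSized = new Object[sizeForRange(rangeStart, rangeEnd)];

        System.out.println(oversized.length);   // 128
        System.out.println(rightSized.length);  // 32
    }
}
```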





[jira] [Created] (FLINK-6898) Limit size of operator component in metric name

2017-06-12 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-6898:
---

 Summary: Limit size of operator component in metric name
 Key: FLINK-6898
 URL: https://issues.apache.org/jira/browse/FLINK-6898
 Project: Flink
  Issue Type: Improvement
  Components: Metrics
Affects Versions: 1.3.0, 1.4.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
Priority: Critical
 Fix For: 1.3.1, 1.4.0


The operator name for some operators (specifically windows) can be very long 
(250+ characters).

I propose to limit the total space that the operator component can take up in a 
metric name to 60 characters.
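A minimal sketch of the proposed truncation (the class, method, and constant names here are illustrative, not existing Flink code):

```java
public class MetricNameTruncation {
    static final int MAX_OPERATOR_NAME_LENGTH = 60; // proposed limit

    /** Truncates the operator component of a metric name to the proposed limit. */
    static String truncateOperatorName(String name) {
        return name.length() <= MAX_OPERATOR_NAME_LENGTH
                ? name
                : name.substring(0, MAX_OPERATOR_NAME_LENGTH);
    }

    public static void main(String[] args) {
        // Window operator names routinely exceed 250 characters.
        String windowName = "TriggerWindow(TumblingEventTimeWindows(3600000), "
                + "ReducingStateDescriptor{serializer=...}, EventTimeTrigger())";
        System.out.println(truncateOperatorName(windowName).length()); // 60
    }
}
```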





Re: [DISCUSS]: Integrating Flink Table API & SQL with CEP

2017-06-12 Thread Jark Wu
Hi guys,

Good news! We have made a prototype for integrating CEP and SQL. See this
link
https://github.com/apache/flink/compare/master...wuchong:cep-on-sql?expand=1


You can check CepITCase to try the simple CQL example.

Meanwhile, we updated our design doc with additional implementation details,
including how to translate MATCH_RECOGNIZE into the CEP API, the features that
need to be added to Flink CEP, and the implementation plan. See the document
https://docs.google.com/document/d/1HaaO5eYI1VZjyhtVPZOi3jVzikU7iK15H0YbniTnN30/edit#heading=h.4oas4koy8qu3

In the prototype, we make flink-table depend on flink-cep. Do you think
that is fine?

What do you think about the prototype and the design doc?

Any feedback is welcome!

Cheers,
Jark Wu


2017-06-08 17:54 GMT+08:00 Till Rohrmann :

> Thanks for sharing your ideas with the community. I really like the design
> document and think that it's a good approach to follow Oracle's SQL
> extension for pattern matching. Looking forward to having support for SQL
> with CEP capabilities :-)
>
> Cheers,
> Till
>
> On Thu, Jun 8, 2017 at 8:57 AM, Jark Wu  wrote:
>
> > Hi  @Kostas, @Fabian, thank you for your support.
> >
> > @Fabian, I totally agree with you that we should focus on SQL first.
> Let's
> > keep Table API in mind and discuss that later.
> >
> > Regarding to the orderBy() clause, I'm not sure about that. I think it
> > makes sense to make it required in streaming mode(either order by rowtime
> > or order by proctime). But CEP also works in batch mode, and not
> necessary
> > to order by some column. Nevertheless, we can support CEP on batch SQL
> > later.
> >
> > We are estimating how to implement MATCH_RECOGNIZE with CEP library (with
> > NFA, CEP operator). And we will output a detailed doc and a prototype in
> > the next days.
> >
> > Regards,
> > Jark Wu
> >
> >
> > 2017-06-07 21:40 GMT+08:00 Fabian Hueske :
> >
> >> Thanks Dian and Jark for this proposal!
> >>
> >> As you wrote, Till and I (and Kostas) have been thinking about this for
> >> some time but haven't had time to work on this feature.
> >> I think it would be a great addition and value add for Flink's SQL
> >> support and Table API.
> >>
> >> I read the proposal and think it is very good. We might need to add a
> bit
> >> more details, esp. when planning the concrete steps of the
> implementation.
> >>
> >> A few comments to the proposal:
> >> - IMO, the development should start focusing on SQL and its semantics.
> >> Pattern support for the Table API should be added later. We followed
> that
> >> approach for the OVER windows and I think it worked quiet well.
> >> - We probably want to reuse as much as possible from the CEP library.
> >> That means we need to check if the semantics of the CEP library and
> >> Oracle's PATTERN syntax are aligned (or how we can express the PATTERN
> >> semantics with the CEP library). This should be one of the first steps,
> IMO.
> >> - I would make the orderBy() clause required. In regular SQL rows have
> no
> >> order, so we need to make that explicit (this would also be consistent
> with
> >> the OVER windows).
> >>
> >> Let me know what you think.
> >>
> >> Best, Fabian
> >>
> >> 2017-06-07 11:41 GMT+02:00 Kostas Kloudas  >:
> >>
> >>> Thanks a lot for opening the discussion!
> >>>
> >>> This is a really interesting idea that has been in our heads
> >>> since the first implementation of the CEP library.
> >>>
> >>> A big +1 for moving forward with this.
> >>>
> >>> And as for the design document, I will definitely have a look
> >>> and comment there.
> >>>
> >>> Kostas
> >>>
> >>> On Jun 7, 2017, at 10:05 AM, Jark Wu  wrote:
> >>>
> >>> Sorry, I forgot to cc you guys @Fabian, @Timo, @Till, @Kostas
> >>>
> >>> 2017-06-07 15:42 GMT+08:00 Jark Wu :
> >>>
>  Hi devs,
> 
>  Dian and me and our teammates have investigated this for a long time.
>  We think consolidating Flink SQL and CEP is an exciting thing for
> Flink.
>  It'll make SQL more powerful and give users the ability to easily and
>  quickly build CEP applications.  And I find Flink community has also
> talked
>  about this idea before, such as the mailing list [1] and [2] and
> Fabian &
>  Till's talk in Flink Forward 2016 [3].
> 
>  I think THIS IS THE POINT to bring up this topic again. Because we
>  already have pattern matching foundation in Flink CEP library, and
> Stream
>  SQL is ready now and Calcite has partially supported pattern matching
>  syntax!  We also drafted a design doc about how to integrate SQL and
> CEP,
>  and how to support CEP on Table API. https://docs.google.com/docume
>  nt/d/1HaaO5eYI1VZjyhtVPZOi3jVzikU7iK15H0YbniTnN30/edit?usp=sharing
> 
> 
>  @Fabian, @Timo, @Till, @Kostas I include you into this discussion, it
>  would be great to hear your response.
> 
> 
> 

Re: [DISCUSS] Removal of twitter-inputformat

2017-06-12 Thread Aljoscha Krettek
Bumpety-bump.

I would be in favour or removing this:
 - It can be implemented as a MapFunction parser after a TextInputFormat
 - Additions, changes, fixes that happen on TextInputFormat are not reflected 
to SimpleTweetInputFormat
 - SimpleTweetInput format overrides nextRecord(), which is not something 
DelimitedInputFormats are normally supposed to do, I think
 - The Tweet POJO has a very strange naming scheme

Best,
Aljoscha

> On 7. Jun 2017, at 11:15, Chesnay Schepler  wrote:
> 
> Hello,
> 
> I'm proposing to remove the Twitter-InputFormat in FLINK-6710 
> , with an open PR you can 
> find here .
> The PR currently has a +1 from Robert, but Timo raised some concerns saying 
> that it is useful for prototyping and
> advised me to start a discussion on the ML.
> 
> This format is a DelimitedInputFormat that reads JSON objects and turns them 
> into a custom tweet class.
> I believe this format doesn't provide much value to Flink; there's nothing 
> interesting about it as an InputFormat,
> as it is purely an exercise in manually converting a JSON object into a POJO.
> This is apparent since you could just as well use 
> ExecutionEnvironment#readTextFile(...) and throw the parsing logic
> into a subsequent MapFunction.
> 
> In the PR i suggested to replace this with a JsonInputFormat, but this was a 
> misguided attempt at getting Timo to agree
> to the removal. This format has the same problem outlined above, as it could 
> be effectively implemented with a one-liner map function.
> 
> So the question now is whether we want to keep it, remove it, or replace it 
> with something more general.
> 
> Regards,
> Chesnay



[jira] [Created] (FLINK-6897) Re-add support for Java 8 lambdas.

2017-06-12 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-6897:
-

 Summary: Re-add support for Java 8 lambdas.
 Key: FLINK-6897
 URL: https://issues.apache.org/jira/browse/FLINK-6897
 Project: Flink
  Issue Type: Bug
  Components: CEP
Affects Versions: 1.3.0
Reporter: Kostas Kloudas








[jira] [Created] (FLINK-6896) Creating a table from a POJO and using a table sink to output fails

2017-06-12 Thread Mark You (JIRA)
Mark You created FLINK-6896:
---

 Summary: Creating a table from a POJO and using a table sink to output 
fails
 Key: FLINK-6896
 URL: https://issues.apache.org/jira/browse/FLINK-6896
 Project: Flink
  Issue Type: Bug
  Components: Table API & SQL
Affects Versions: 1.3.0
Reporter: Mark You


The following example fails at the sink. Debugging suggests that the 
ArrayIndexOutOfBoundsException is caused by the input type being a POJO type 
rather than Row.

Sample:
{code:title=TumblingWindow.java|borderStyle=solid}
public class TumblingWindow {

public static void main(String[] args) throws Exception {
List<Content> data = new ArrayList<>();
data.add(new Content(1L, "Hi"));
data.add(new Content(2L, "Hallo"));
data.add(new Content(3L, "Hello"));
data.add(new Content(4L, "Hello"));
data.add(new Content(7L, "Hello"));
data.add(new Content(8L, "Hello world"));
data.add(new Content(16L, "Hello world"));

final StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

DataStream<Content> stream = env.fromCollection(data);

DataStream<Content> stream2 = stream.assignTimestampsAndWatermarks(
new BoundedOutOfOrdernessTimestampExtractor<Content>(Time.milliseconds(1)) {

/**
 * 
 */
private static final long serialVersionUID = 
410512296011057717L;

@Override
public long extractTimestamp(Content element) {
return element.getRecordTime();
}

});

final StreamTableEnvironment tableEnv = 
TableEnvironment.getTableEnvironment(env);

Table table = tableEnv.fromDataStream(stream2, 
"urlKey,uplink,downlink,httpGetMessageCount,httpPostMessageCount,statusCode,rowtime.rowtime");

Table windowTable = 
table.window(Tumble.over("1.hours").on("rowtime").as("w")).groupBy("w, urlKey")

.select("w.start,urlKey,uplink.sum,downlink.sum,httpGetMessageCount.sum,httpPostMessageCount.sum
 ");

//table.printSchema();

TableSink<Row> windowSink = new 
CsvTableSink("/Users/mark/Documents/specific-website-code.csv", ",", 1,
WriteMode.OVERWRITE);
windowTable.writeToSink(windowSink);

// tableEnv.toDataStream(windowTable, Row.class).print();
env.execute();
}

public static class Content implements Serializable {

/**
 * 
 */
private static final long serialVersionUID = 1429246948772430441L;

private String urlKey;

private long recordTime;
// private String recordTimeStr;

private long httpGetMessageCount;
private long httpPostMessageCount;
private long uplink;
private long downlink;
private long statusCode;
private long statusCodeCount;

public Content() {
super();
}

public Content(long recordTime, String urlKey) {
super();
this.recordTime = recordTime;
this.urlKey = urlKey;
}

public String getUrlKey() {
return urlKey;
}

public void setUrlKey(String urlKey) {
this.urlKey = urlKey;
}

public long getRecordTime() {
return recordTime;
}

public void setRecordTime(long recordTime) {
this.recordTime = recordTime;
}

public long getHttpGetMessageCount() {
return httpGetMessageCount;
}

public void setHttpGetMessageCount(long httpGetMessageCount) {
this.httpGetMessageCount = httpGetMessageCount;
}

public long getHttpPostMessageCount() {
return httpPostMessageCount;
}

public void setHttpPostMessageCount(long httpPostMessageCount) {
this.httpPostMessageCount = httpPostMessageCount;
}

public long getUplink() {
return uplink;
}

public void setUplink(long uplink) {
this.uplink = uplink;
}

public long getDownlink() {
return downlink;
}

public void setDownlink(long downlink) {
this.downlink = downlink;
}

public long getStatusCode() {
return statusCode;
}

public void setStatusCode(long statusCode) {
this.statusCode = statusCode;
}

public long getStatusCodeCount() {
return statusCodeCount;
}

public void setStatusCodeCount(long statusCodeCount) {
this.statusCodeCount = statusCodeCount;
}

}

private class TimestampWithEqualWatermark implements 

[jira] [Created] (FLINK-6895) Add STR_TO_DATE support in SQL

2017-06-12 Thread sunjincheng (JIRA)
sunjincheng created FLINK-6895:
--

 Summary: Add STR_TO_DATE support in SQL
 Key: FLINK-6895
 URL: https://issues.apache.org/jira/browse/FLINK-6895
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Affects Versions: 1.4.0
Reporter: sunjincheng


STR_TO_DATE(str,format) This is the inverse of the DATE_FORMAT() function. It 
takes a string str and a format string format. STR_TO_DATE() returns a DATETIME 
value if the format string contains both date and time parts, or a DATE or TIME 
value if the string contains only date or time parts. If the date, time, or 
datetime value extracted from str is illegal, STR_TO_DATE() returns NULL and 
produces a warning.

* Syntax:
STR_TO_DATE(str,format) 

* Arguments
**str: -
**format: -

* Return Types
  DATETIME/DATE/TIME

* Example:
  STR_TO_DATE('01,5,2013','%d,%m,%Y') -> '2013-05-01'
  SELECT STR_TO_DATE('a09:30:17','a%h:%i:%s') -> '09:30:17'

* See more:
** [MySQL| 
https://dev.mysql.com/doc/refman/5.7/en/date-and-time-functions.html#function_str-to-date]
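A plain-Java illustration of the parse/format inverse relationship described above (java.time, not the proposed SQL function; the pattern 'dd,MM,yyyy' roughly corresponds to MySQL's '%d,%m,%Y'):

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class StrToDateSketch {
    public static void main(String[] args) {
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern("dd,MM,yyyy");
        // STR_TO_DATE analogue: parse a string into a date.
        LocalDate date = LocalDate.parse("01,05,2013", fmt);
        System.out.println(date);             // 2013-05-01
        // DATE_FORMAT analogue: format the date back into a string.
        System.out.println(fmt.format(date)); // 01,05,2013
    }
}
```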







[jira] [Created] (FLINK-6894) Add DATE_FORMAT support in SQL

2017-06-12 Thread sunjincheng (JIRA)
sunjincheng created FLINK-6894:
--

 Summary: Add DATE_FORMAT support in SQL
 Key: FLINK-6894
 URL: https://issues.apache.org/jira/browse/FLINK-6894
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Affects Versions: 1.4.0
Reporter: sunjincheng


DATE_FORMAT(date,format) Formats the date value according to the format string.

* Syntax:
DATE_FORMAT(date,format)

* Arguments
**date: -
**format: -

* Return Types
  String

* Example:
  DATE_FORMAT('2009-10-04 22:23:00', '%W %M %Y') -> 'Sunday October 2009'
  DATE_FORMAT('1900-10-04 22:23:00','%D %y %a %d %m %b %j') -> '4th 00 Thu 04 
10 Oct 277'

* See more:
** [MySQL| 
https://dev.mysql.com/doc/refman/5.7/en/date-and-time-functions.html#function_date-format]
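The formatting direction can be sketched the same way with java.time. Again the class name and the handled specifiers (%W, %M, %Y, %d only) are illustrative, not an actual Flink implementation:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class DateFormatSketch {

    // Translate a handful of MySQL DATE_FORMAT specifiers to java.time patterns.
    static String toJavaPattern(String mysqlFormat) {
        return mysqlFormat
                .replace("%W", "EEEE")  // weekday name, e.g. Sunday
                .replace("%M", "MMMM")  // month name, e.g. October
                .replace("%Y", "uuuu")  // four-digit year
                .replace("%d", "dd");   // day of month, two digits
    }

    public static String dateFormat(LocalDateTime date, String format) {
        // English locale so weekday/month names match the MySQL examples.
        return DateTimeFormatter.ofPattern(toJavaPattern(format), Locale.ENGLISH).format(date);
    }

    public static void main(String[] args) {
        LocalDateTime d = LocalDateTime.of(2009, 10, 4, 22, 23, 0);
        System.out.println(dateFormat(d, "%W %M %Y")); // Sunday October 2009
    }
}
```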





Re: Elasticsearch 5 connector maven artifact

2017-06-12 Thread Nico Kruber
This is among the fixes for 1.3.1

see https://issues.apache.org/jira/browse/FLINK-6812


Nico

On Friday, 9 June 2017 08:57:39 CEST Dawid Wysakowicz wrote:
> Hi devs,
> 
> I tried using flink-connector-elasticsearch5_2.10 dependency:
> > 
> > <dependency>
> >   <groupId>org.apache.flink</groupId>
> >   <artifactId>flink-connector-elasticsearch5_2.10</artifactId>
> >   <version>1.3.0</version>
> > </dependency>
> > 
> 
> But it seems it was not published with 1.3.0 release. Was it intended? I
> filed a JIRA for it: https://issues.apache.org/jira/browse/FLINK-6854
> 
> 
> Z pozdrowieniami! / Cheers!
> 
> Dawid Wysakowicz
> 
> *Data/Software Engineer*
> 
> Skype: dawid_wys | Twitter: @OneMoreCoder
> 
> 





[jira] [Created] (FLINK-6893) Add BIN supported in SQL

2017-06-12 Thread sunjincheng (JIRA)
sunjincheng created FLINK-6893:
--

 Summary: Add BIN supported in SQL
 Key: FLINK-6893
 URL: https://issues.apache.org/jira/browse/FLINK-6893
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Affects Versions: 1.4.0
Reporter: sunjincheng


BIN(N) Returns a string representation of the binary value of N, where N is a 
longlong (BIGINT) number. This is equivalent to CONV(N,10,2). Returns NULL if N 
is NULL.

* Syntax:
BIN(num)

* Arguments
**num: a long/bigint value

* Return Types
  String

* Example:
  BIN(12) -> '1100'

* See more:
** [MySQL| 
https://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_bin]
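These semantics map almost directly onto the JDK; a minimal Java sketch (the class and method names are mine, not part of any Flink API):

```java
public class BinSketch {

    // BIN(N): binary string of N as a 64-bit value; null (MySQL: NULL) for NULL input.
    public static String bin(Long n) {
        if (n == null) {
            return null; // MySQL returns NULL when N is NULL
        }
        // Long.toBinaryString treats n as an unsigned 64-bit pattern, which
        // matches MySQL's behavior for negative BIGINT values (64 binary digits).
        return Long.toBinaryString(n);
    }

    public static void main(String[] args) {
        System.out.println(bin(12L)); // 1100
    }
}
```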





[jira] [Created] (FLINK-6892) Add LPAD supported in SQL

2017-06-12 Thread sunjincheng (JIRA)
sunjincheng created FLINK-6892:
--

 Summary: Add LPAD supported in SQL
 Key: FLINK-6892
 URL: https://issues.apache.org/jira/browse/FLINK-6892
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Affects Versions: 1.4.0
Reporter: sunjincheng


LPAD(str,len,padstr) Returns the string str, left-padded with the string padstr 
to a length of len characters. If str is longer than len, the return value is 
shortened to len characters.

* Syntax:
LPAD(str,len,padstr) 

* Arguments
**str: -
**len: -
**padstr: -

* Return Types
  String

* Example:
  LPAD('hi',4,'??') -> '??hi'
  LPAD('hi',1,'??') -> 'h'

* See more:
** [MySQL| 
https://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_lpad]
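The padding-and-truncation behavior described above can be sketched in plain Java; class and method names are illustrative, and the NULL handling for edge cases (null arguments, empty pad string) is a choice made for this sketch rather than a statement about MySQL's exact edge-case behavior:

```java
public class LpadSketch {

    // LPAD(str, len, padstr): left-pad str with padstr up to len characters;
    // truncate str to len characters if it is already longer.
    public static String lpad(String str, int len, String padstr) {
        if (str == null || padstr == null || len < 0) {
            return null; // treat NULL arguments as NULL result
        }
        if (str.length() >= len) {
            return str.substring(0, len); // shorten to len characters
        }
        if (padstr.isEmpty()) {
            return null; // cannot pad with an empty string (edge case, sketch's choice)
        }
        StringBuilder pad = new StringBuilder();
        while (pad.length() < len - str.length()) {
            pad.append(padstr);
        }
        pad.setLength(len - str.length()); // trim overshoot from the last padstr copy
        return pad.append(str).toString();
    }

    public static void main(String[] args) {
        System.out.println(lpad("hi", 4, "??")); // ??hi
        System.out.println(lpad("hi", 1, "??")); // h
    }
}
```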





[jira] [Created] (FLINK-6891) Add LOG supported in SQL

2017-06-12 Thread sunjincheng (JIRA)
sunjincheng created FLINK-6891:
--

 Summary: Add LOG supported in SQL
 Key: FLINK-6891
 URL: https://issues.apache.org/jira/browse/FLINK-6891
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Affects Versions: 1.4.0
Reporter: sunjincheng


LOG(N) With a single argument, the function returns the natural logarithm of 
N; with two arguments, it returns the logarithm of N to the given base.

* Syntax:
LOG ( float_expression [, base ] )  

* Arguments
**float_expression:  Is an expression of type float or of a type that can be 
implicitly converted to float.
**base: Optional integer argument that sets the base for the logarithm.

* Return Types
  float

* Example:
  LOG(10) -> 2.30

* See more:
** [SQL Server|https://docs.microsoft.com/en-us/sql/t-sql/functions/log-transact-sql]
 **[MySQL| 
https://dev.mysql.com/doc/refman/5.7/en/mathematical-functions.html#function_log]
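Both arities reduce to `Math.log` via the change-of-base formula; a minimal Java sketch (names are mine, and the argument order follows the SQL Server form cited above, whereas MySQL uses LOG(base, N)):

```java
public class LogSketch {

    // LOG(x): natural logarithm; null (SQL: NULL) for non-positive input.
    public static Double log(double x) {
        return x > 0 ? Math.log(x) : null;
    }

    // LOG(x, base): logarithm of x to the given base.
    public static Double log(double x, double base) {
        if (x <= 0 || base <= 0 || base == 1) {
            return null; // undefined: no real logarithm for these inputs
        }
        return Math.log(x) / Math.log(base); // change-of-base formula
    }

    public static void main(String[] args) {
        System.out.printf("%.2f%n", log(10.0));     // 2.30
        System.out.printf("%.2f%n", log(8.0, 2.0)); // 3.00
    }
}
```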







[jira] [Created] (FLINK-6890) flink-dist Jar contains non-shaded Guava dependencies (built with Maven 3.0.5)

2017-06-12 Thread Tzu-Li (Gordon) Tai (JIRA)
Tzu-Li (Gordon) Tai created FLINK-6890:
--

 Summary: flink-dist Jar contains non-shaded Guava dependencies 
(built with Maven 3.0.5)
 Key: FLINK-6890
 URL: https://issues.apache.org/jira/browse/FLINK-6890
 Project: Flink
  Issue Type: Bug
  Components: Build System
Affects Versions: 1.2.1, 1.3.0
Reporter: Tzu-Li (Gordon) Tai


See original discussion on ML: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Guava-version-conflict-td13561.html.

Running {{mvn dependency:tree}} for {{flink-dist}} did not reveal any Guava 
dependencies.
This was tested with Maven 3.0.5.





[jira] [Created] (FLINK-6889) Improve Path parsing for Windows paths with scheme

2017-06-12 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-6889:
---

 Summary: Improve Path parsing for Windows paths with scheme
 Key: FLINK-6889
 URL: https://issues.apache.org/jira/browse/FLINK-6889
 Project: Flink
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.4.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
Priority: Minor
 Fix For: 1.4.0


The {{Path}} parsing for Windows paths with a scheme (for example 
{{file:\\\C:\text.txt}}) could be improved a bit.

For example, if a scheme is present the path is never identified as a Windows 
path because that is determined by whether it starts with the drive.
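To illustrate the improvement the ticket asks for, here is a hypothetical helper that detects a Windows drive spec even behind a scheme prefix. This is not Flink's actual {{Path}} implementation, just a sketch of the detection rule:

```java
public class WindowsPathCheck {

    // Detect a Windows-style path, i.e. one whose (scheme-stripped) body
    // starts with a drive letter followed by ':'.
    static boolean looksLikeWindowsPath(String path) {
        int idx = path.indexOf(':');
        String rest = path;
        // More than one character before ':' means it is a scheme ("file:"),
        // not a one-letter drive; strip the scheme and any leading slashes.
        if (idx > 1) {
            rest = path.substring(idx + 1).replaceFirst("^[/\\\\]+", "");
        }
        return rest.length() >= 2
                && Character.isLetter(rest.charAt(0))
                && rest.charAt(1) == ':';
    }

    public static void main(String[] args) {
        System.out.println(looksLikeWindowsPath("C:\\text.txt"));        // true
        System.out.println(looksLikeWindowsPath("file:///C:/text.txt")); // true
        System.out.println(looksLikeWindowsPath("/home/user/text.txt")); // false
    }
}
```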


