Re: [VOTE] FLIP-90: Support SQL 2016-2017 JSON functions in Flink SQL

2020-01-06 Thread Jingsong Li
+1 non-binding. Thanks Forward for driving this.

Considering that it consists of independent, well-defined pieces from the
SQL standard and Calcite, I think work can start as soon as possible.

Best,
Jingsong Lee

On Tue, Dec 31, 2019 at 5:09 PM Forward Xu  wrote:

> Hi all,
>
> I'd like to start the vote for FLIP-90 [1], since we have reached an
> agreement on the design in the discussion thread [2].
>
> This vote will be open for at least 72 hours. Unless there is an objection,
> I will try to close it by January 3, 2020 08:00 UTC if we have received
> sufficient votes.
>
> Best,
> ForwardXu
>
> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=141724550
> [2]
>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-JSON-functions-in-Flink-SQL-td32674.html
>


-- 
Best, Jingsong Lee


Re: [VOTE] Rename terminology "Time-windowed Join" to "Interval Join" in Table API & SQL

2020-01-06 Thread Jark Wu
Hi Timo,

Yes, I have checked that: they are the same concept according to the
documentation [1][2].
From the description of the umbrella issue FLINK-8478 [3] for the DataStream
interval join, we can also see that they are the same:

> The goal is to implement a DataStream API to join together two streams
> based on a key and user-defined time boundaries.
> This feature should mirror what is already possible in Flink SQL:
> SELECT s1.value, s2.value FROM Table s1 JOIN Table s2 ON s1.key = s2.key
> AND s1.rowtime BETWEEN s2.rowtime AND s2.rowtime + INTERVAL '1' HOUR;
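
For illustration, a minimal self-contained sketch of the equivalent DataStream
interval join; the tuple schema, keys, timestamps and element values are made
up, and only the between() bounds mirror the SQL above:

{code:java}
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.co.ProcessJoinFunction;
import org.apache.flink.streaming.api.functions.timestamps.AscendingTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.Collector;

public class IntervalJoinSketch {

    // Event timestamp is carried in field f1 (milliseconds).
    private static class TsExtractor extends AscendingTimestampExtractor<Tuple2<String, Long>> {
        @Override
        public long extractAscendingTimestamp(Tuple2<String, Long> element) {
            return element.f1;
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

        DataStream<Tuple2<String, Long>> s1 = env.fromElements(Tuple2.of("k", 3_600_000L))
                .assignTimestampsAndWatermarks(new TsExtractor());
        DataStream<Tuple2<String, Long>> s2 = env.fromElements(Tuple2.of("k", 3_000_000L))
                .assignTimestampsAndWatermarks(new TsExtractor());

        // "s1.rowtime BETWEEN s2.rowtime AND s2.rowtime + INTERVAL '1' HOUR"
        // means s2's timestamp lies in [s1.ts - 1h, s1.ts], i.e. between(-1h, 0).
        s1.keyBy(t -> t.f0)
          .intervalJoin(s2.keyBy(t -> t.f0))
          .between(Time.hours(-1), Time.hours(0))
          .process(new ProcessJoinFunction<Tuple2<String, Long>, Tuple2<String, Long>, String>() {
              @Override
              public void processElement(Tuple2<String, Long> left, Tuple2<String, Long> right,
                                         Context ctx, Collector<String> out) {
                  out.collect(left + " joined " + right);
              }
          })
          .print();

        env.execute("interval-join-sketch");
    }
}
{code}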

Best,
Jark

[1]:
https://ci.apache.org/projects/flink/flink-docs-master/dev/stream/operators/joining.html#interval-join
[2]:
https://ci.apache.org/projects/flink/flink-docs-master/dev/table/streaming/joins.html#time-windowed-joins
[3]: https://issues.apache.org/jira/browse/FLINK-8478

On Mon, 6 Jan 2020 at 18:33, Timo Walther  wrote:

> If the semantics of both APIs are 100% equivalent, I'm fine with the
> renaming. Have you checked that?
>
> Regards,
> Timo
>
> On 06.01.20 10:46, Zhenghua Gao wrote:
> > +1 to align the terminology.
> >
> > *Best Regards,*
> > *Zhenghua Gao*
> >
> >
> > On Fri, Jan 3, 2020 at 12:59 PM Jingsong Li 
> wrote:
> >
> >> +1 for this documentation change.
> >> Hopefully this will be less confusing to users.
> >>
> >> Best,
> >> Jingsong Lee
> >>
> >> On Fri, Jan 3, 2020 at 12:09 PM Benchao Li  wrote:
> >>
> >>> +1
> >>>
> >>> It's good to align the terminology between Table API & SQL and
> >>> DataStream.
> >>>
> >>> Jark Wu  于2020年1月3日周五 下午12:04写道:
> >>>
>  Hi everyone,
> 
>  As we discussed in the mailing list[1], the current "Time-windowed Join"
>  in Table API & SQL is a little misleading: it is not the same as "Window
>  Join" in DataStream, but the same as "Interval Join" in DataStream.
> 
>  So I would like to start a vote to rename the terminology of
>  "Time-windowed Join" to "Interval Join" in Table API & SQL **before the
>  1.10 release**.
> 
>  Note that this is a purely documentation change, no updates for public
>  API or Javadocs. Updates for implementation code (e.g. renaming
>  DataStreamWindowJoin) are not targeted to 1.10.
> 
>  This vote will be open for at least 72 hours, unless there is an
>  objection. This vote requires Consensus Approval, which is the same as a
>  FLIP vote.
> 
>  Best,
>  Jark
> 
>  [1]:
> 
> 
> >>>
> >>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Correct-the-terminology-of-quot-Time-windowed-Join-quot-to-quot-Interval-Join-quot-in-Table-L-td36202.html
> 
> >>>
> >>>
> >>> --
> >>>
> >>> Benchao Li
> >>> School of Electronics Engineering and Computer Science, Peking
> University
> >>> Tel:+86-15650713730
> >>> Email: libenc...@gmail.com; libenc...@pku.edu.cn
> >>>
> >>
> >>
> >> --
> >> Best, Jingsong Lee
> >>
> >
>
>


Re: [DISCUSS] Set default planner for SQL Client to Blink planner in 1.10 release

2020-01-06 Thread Jark Wu
Thanks everyone for the feedback!

We have reached a clear consensus that the proposal is accepted.

There are 3 binding votes from:
- Timo
- Hequn
- Jark

I will make sure that this change is applied to the Flink code base; I have
created FLINK-15495 to track this.

Best,
Jark

On Mon, 6 Jan 2020 at 13:54, 贺小令  wrote:

> +1 for making the blink planner the default planner; the blink planner
> has become more stable since 1.10
>
> Dian Fu  于2020年1月6日周一 上午11:51写道:
>
>> +1 to set blink planner as the default planner for SQL client considering
>> that so many features added since 1.10 are only available in the blink
>> planner.
>>
>> 在 2020年1月6日,上午11:04,Rui Li  写道:
>>
>> +1. I think it improves user experience.
>>
>> On Mon, Jan 6, 2020 at 10:18 AM Zhenghua Gao  wrote:
>>
>>> +1 for making the blink planner the default planner for the SQL Client,
>>> since we have made huge improvements in 1.10.
>>>
>>> *Best Regards,*
>>> *Zhenghua Gao*
>>>
>>>
>>> On Sun, Jan 5, 2020 at 2:42 PM Benchao Li  wrote:
>>>
 +1

 We have used the blink planner since the 1.9.0 release in our production
 environment, and it is really impressive.

 Hequn Cheng  于2020年1月5日周日 下午1:58写道:

> +1 to make the blink planner the default planner for the SQL Client, so
> that we can give the blink planner a bit more exposure.
>
> Best, Hequn
>
> On Fri, Jan 3, 2020 at 6:32 PM Jark Wu  wrote:
>
>> Hi Benoît,
>>
>> Thanks for the reminder. I will look into the issue and hopefully we
>> can target it into 1.9.2 and 1.10.
>>
>> Cheers,
>> Jark
>>
>> On Fri, 3 Jan 2020 at 18:21, Benoît Paris <
>> benoit.pa...@centraliens-lille.org> wrote:
>>
>>> >  If anyone finds that blink planner has any significant defects
>>> and has a larger regression than the old planner, please let us know.
>>>
> >>> Overall, the Blink-exclusive features are a must (TopN, deduplicate,
> >>> LAST_VALUE, plan reuse, etc.)! But not all production use cases of the
> >>> legacy planner are covered:
>>> An edge case of Temporal Table Functions does not allow computed
>>> Tables (as opposed to TableSources) to be used on the query side in 
>>> Blink (
>>> https://issues.apache.org/jira/browse/FLINK-14200)
>>>
>>> Cheers
>>> Ben
>>>
>>>
>>> On Fri, Jan 3, 2020 at 10:00 AM Jeff Zhang  wrote:
>>>
 +1, I have already made blink as the default planner of flink
 interpreter in Zeppelin


 Jingsong Li  于2020年1月3日周五 下午4:37写道:

> Hi Jark,
>
> +1 for default blink planner in SQL-CLI.
> I believe this new planner can be put into practice in production.
> We've worked hard for nearly a year, but the old planner didn't
> move on.
>
> And I'd like to cc to u...@flink.apache.org.
> If anyone finds that blink planner has any significant defects and
> has a larger regression than the old planner, please let us know. We 
> will
> be very grateful.
>
> Best,
> Jingsong Lee
>
> On Fri, Jan 3, 2020 at 4:14 PM Leonard Xu 
> wrote:
>
>> +1 for this.
>> We brought many SQL/API features and stability enhancements in the 1.10
>> release, and almost all of them happened in the Blink planner.
>> The SQL CLI is the most convenient entry point for me; I believe many
>> users will have a better experience if we set the Blink planner as the
>> default planner.
>>
>> Best,
>> Leonard
>>
>> > 在 2020年1月3日,15:16,Terry Wang  写道:
>> >
>> > Since what blink planner can do is a superset of flink planner,
>> big +1 for changing the default planner to Blink planner from my 
>> side.
>> >
>> > Best,
>> > Terry Wang
>> >
>> >
>> >
>> >> 2020年1月3日 15:00,Jark Wu  写道:
>> >>
>> >> Hi everyone,
>> >>
>> >> In 1.10 release, Flink SQL supports many awesome features and
>> improvements,
>> >> including:
>> >> - support watermark statement and computed column in DDL
>> >> - fully support all data types in Hive
>> >> - Batch SQL performance improvements (TPC-DS 7x than Hive MR)
>> >> - support INSERT OVERWRITE and INSERT PARTITION
>> >>
>> >> However, all the features and improvements are only available in the
>> >> Blink planner, not in the old planner.
>> >> There are also some other features that are only available in the
>> >> Blink planner, e.g.
>> >> Dimension Table Join [1],
>> >> TopN [2], Deduplicate [3], streaming aggregates optimization
>> [4], and so on.
>> >>
>> >> But Old planner is still the default planner in Table API &
>> SQL. It is
>> >> frustrating for users to set
>

[jira] [Created] (FLINK-15497) Topn

2020-01-06 Thread Kurt Young (Jira)
Kurt Young created FLINK-15497:
--

 Summary: Topn
 Key: FLINK-15497
 URL: https://issues.apache.org/jira/browse/FLINK-15497
 Project: Flink
  Issue Type: Bug
Reporter: Kurt Young






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] JDBC exactly-once sink

2020-01-06 Thread Jingsong Li
Hi Piotr,

We already have "JDBCUpsertOutputFormat". It was mainly proposed for the
Table/SQL API, but DataStream programs can use it to get upsert semantics.
However, its guarantees are weaker than exactly-once: it only provides
eventual consistency.

Best,
Jingsong Lee

On Mon, Jan 6, 2020 at 8:07 PM Piotr Nowojski  wrote:

> Hi,
>
> Also +1 for using the XA. There might be scenarios where WAL could be a
> better option, but I think XA should be the default/first choice. If there
> will be a bigger demand for WAL, we can always provide it as an alternative.
>
> As Jingsong mentioned, with WAL I would be worried about batch-like
> workloads. Spamming the external database with all of the accumulated
> records once per checkpoint can easily lead to DDoS-like scenarios and
> external system crashes.
>
> Jingsong do we already have an upsert JDBC sink? I guess it's just in
> Table API, right?
>
> Piotrek
>
> > On 6 Jan 2020, at 10:43, Jingsong Li  wrote:
> >
> > Thanks Roman for driving this.
> >
> > Although Upsert sink can bring some consistency guarantee to JDBC sink,
> > users have strong requirements to append exactly-once sink too.
> >
> > +1 to use XA distribution transaction.
> >
> > For WAL, as far as my experience goes, writing to JDBC in large batches
> > when a checkpoint completes will easily put too much pressure on the
> > database and cause some instability.
> >
> > For testing, we may need to add some XA tests to the e2e suite.
> >
> > But I'm worried about the performance of XA. Sometimes using MySQL's XA
> > will lead to poor performance (of course, it may be related to usage).
> >
> > Best,
> > Jingsong Lee
> >
> > On Mon, Jan 6, 2020 at 4:41 PM Roman Khachatryan <
> ro...@data-artisans.com>
> > wrote:
> >
> >> Hi everyone,
> >>
> >> I'm currently working on exactly-once JDBC sink implementation for
> Flink.
> >> Any ideas and/or feedback are welcome.
> >>
> >> I've considered the following options:
> >> 1. Two-phase commit. This is similar to Kafka sink.
> >> XA or database-specific API can be used. In case of XA, each sink
> subtask
> >> acts as a transaction manager, and each checkpoint-subtask pair
> corresponds
> >> to an XA transaction (with a single branch)
> >> 2. Write-ahead log. This is similar to Cassandra sink.
> >> Transactions metadata needs to be stored in the database along with
> data to
> >> avoid adding duplicates after recovery.
> >>
> >> For some scenarios, WAL might be better, but in general, XA seems to be
> a
> >> better option.
> >>
> >> ==
> >> XA vs WAL comparison
> >> ==
> >>
> >> 1. Consistency: XA preferable
> >> WAL: longer inconsistency windows when writing from several sink
> subtasks
> >>
> >> 2. Performance and efficiency: XA preferable (depends on the use case)
> >> XA:
> >> - long-running transactions may delay GC and may hold locks (depends on
> the
> >> use case)
> >> - databases/drivers may have XA implementation issues
> >> WAL:
> >> - double (de)serialization and IO (first to flink state, then to
> database)
> >> - read-from-state and write-to-database spikes on checkpoint completion
> >> both may have read spikes in consumer
> >>
> >> 3. Database support: XA preferable
> >> XA: most popular RDBMS do support it (at least mysql, pgsql, mssql,
> oracle,
> >> db2, sybase)
> >> WAL: meta table DDL may differ
> >>
> >> 4. Operability: depends on the use case
> >> XA:
> >> - increased undo segment (db may need to maintain a view from the
> >> transaction start)
> >> - abandoned transactions cleanup (abandoned tx may cause starvation if
> for
> >> example database blocks inserts of duplicates in different transactions)
> >> - (jars aren't an issue - most drivers ship XA implementation)
> >> WAL:
> >> - increased intermediate flink state
> >> - need to maintain meta table
> >>
> >> 5. Simplicity: about the same
> >> XA: more corner cases
> >> WAL: state and meta table management
> >> Both wrap writes into transactions
> >>
> >> 6. Testing - WAL preferable
> >> XA requires MVCC and proper XA support (no jars needed for derby)
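
To make option 1 concrete, a rough sketch of one XA transaction per
checkpoint-subtask pair using the standard javax.transaction.xa API; the table
name is made up, the Xid is assumed to be derived from (job, subtask,
checkpoint), and all Flink integration and error handling are omitted:

{code:java}
import java.sql.Connection;
import java.sql.PreparedStatement;
import javax.sql.XAConnection;
import javax.sql.XADataSource;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Sketch: one XA transaction per checkpoint-subtask pair. 'xaDataSource' is
// any driver-provided XADataSource; the Xid is assumed to encode
// (jobId, subtaskIndex, checkpointId); the table/column are made up.
public final class XaSinkSketch {

    void writeAndCommitCheckpoint(XADataSource xaDataSource, Xid xid,
                                  Iterable<String> bufferedRecords) throws Exception {
        XAConnection xaConn = xaDataSource.getXAConnection();
        XAResource xa = xaConn.getXAResource();
        Connection conn = xaConn.getConnection();

        xa.start(xid, XAResource.TMNOFLAGS); // open the (single-branch) XA branch
        try (PreparedStatement ps =
                     conn.prepareStatement("INSERT INTO sink_table (v) VALUES (?)")) {
            for (String record : bufferedRecords) {
                ps.setString(1, record);
                ps.addBatch();
            }
            ps.executeBatch();
        }
        xa.end(xid, XAResource.TMSUCCESS);

        // Phase 1, on the checkpoint barrier: the database persists the
        // prepared transaction so it survives failures.
        xa.prepare(xid);

        // Phase 2, only after the checkpoint is confirmed complete. After a
        // recovery, committing an xid that was already committed must be
        // treated as success (the XAER_NOTA error code signals exactly that).
        xa.commit(xid, false);

        xaConn.close();
    }
}
{code}

The operationally tricky part is exactly the window between prepare and commit:
a crash there leaves a prepared transaction that must be found and committed
(or rolled back) on restart, which is where the abandoned-transaction cleanup
mentioned under point 4 comes in.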
> >>
> >> --
> >> Regards,
> >> Roman
> >>
> >
> >
> > --
> > Best, Jingsong Lee
>
>

-- 
Best, Jingsong Lee


[jira] [Created] (FLINK-15496) Remove RegisterApplicationMasterResponseReflector

2020-01-06 Thread Zili Chen (Jira)
Zili Chen created FLINK-15496:
-

 Summary: Remove RegisterApplicationMasterResponseReflector
 Key: FLINK-15496
 URL: https://issues.apache.org/jira/browse/FLINK-15496
 Project: Flink
  Issue Type: Improvement
  Components: Deployment / YARN
Reporter: Zili Chen
Assignee: Zili Chen
 Fix For: 1.11.0


{{RegisterApplicationMasterResponseReflector}} is no longer needed. Directly 
calling {{registerApplicationMasterResponse.getContainersFromPreviousAttempts()}} 
is viable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-90: Support SQL 2016-2017 JSON functions in Flink SQL

2020-01-06 Thread Forward Xu
Thanks Jark for checking the doc. Hi Timo, could you please check whether
there is anything else to add?
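
For reference, a sketch of how the proposed functions would be used once
merged; the syntax follows the SQL:2016 forms implemented by Calcite, the
table and column names are made up, and these functions are not yet available
in a released Flink:

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;

// Assumes a registered table `events` with a STRING column `payload`;
// return types follow the design doc: JSON_EXISTS -> BOOLEAN,
// JSON_VALUE / JSON_QUERY -> STRING.
public class JsonFunctionsSketch {

    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());

        Table result = tEnv.sqlQuery(
                "SELECT "
                + "  JSON_EXISTS(payload, 'lax $.user.id') AS has_id, "
                + "  JSON_VALUE(payload, 'lax $.user.name' DEFAULT 'unknown' ON EMPTY) AS user_name, "
                + "  JSON_QUERY(payload, 'lax $.user.tags') AS tags "
                + "FROM events");
    }
}
{code}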

Best,
Forward

Jark Wu  于2020年1月6日周一 下午2:58写道:

> Thanks Forward for the update. The return types are much clearer now,
> especially for JSON_VALUE.
> The design doc looks good to me now.
>
> Best,
> Jark
>
> On Fri, 3 Jan 2020 at 21:42, Forward Xu  wrote:
>
> > Hi Timo, Jark,
> > I added an additional column describing the return type of each function
> > and updated the Google doc.
> >
> > Best,
> > Forward
> >
> > Jark Wu  于2020年1月3日周五 下午5:48写道:
> >
> > > Hi Timo,
> > >
> > > That's a good point.
> > > We didn't introduce any new types. We will use the function definition
> > > defined by Calcite[1].
> > > So all the functions return STRING/BOOLEAN.
> > >
> > > Hi Forward,
> > > I think we may need an additional column to describe the return type of
> > > each function.
> > >
> > > Best,
> > > Jark
> > >
> > > [1]:
> > >
> > >
> >
> https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql/fun/SqlJsonQueryFunction.java
> > >
> > > On Fri, 3 Jan 2020 at 17:30, Timo Walther  wrote:
> > >
> > > > Hi,
> > > >
> > > > sorry for jumping into the discussion so late. I had a quick look at
> > the
> > > > FLIP. It looks very nice and detailed. I have one question that I
> could
> > > > not find in the FLIP itself. Maybe it is hidden in the long
> discussion
> > > > thread.
> > > >
> > > > What are the return types of all functions? Do we introduce new types
> > > > with this FLIP? Also the RAW types should be avoided. Do all
> functions
> > > > return STRING/BOOLEAN?
> > > >
> > > > Thanks,
> > > > Timo
> > > >
> > > >
> > > > On 31.12.19 09:39, Hequn Cheng wrote:
> > > > > Thanks a lot for the update. +1 to start a vote.
> > > > >
> > > > > On Tue, Dec 31, 2019 at 2:38 PM Forward Xu  >
> > > > wrote:
> > > > >
> > > > >> Hi Jark, Hequn,
> > > > >>
> > > > >> I have updated the documentation.
> > > > >>
> > > > >> Best,
> > > > >>
> > > > >> Forward
> > > > >>
> > > > >> Forward Xu  于2019年12月29日周日 下午4:01写道:
> > > > >>
> > > > >>> Hi Jark, Hequn,
> > > > >>>
> > > > >>> Thank you very much, Introducing new `TableSymbol`s sounds like a
> > > good
> > > > >>> idea. +1 for the proposal.
> > > > >>>
> > > > >>> I think this idea is good, I will add this in the documentation.
> > > > >>>
> > > > >>>
> > > > >>> Best, Forward
> > > > >>>
> > > > >>> Hequn Cheng  于2019年12月29日周日 下午3:41写道:
> > > > >>>
> > > >  Hi Jark,
> > > > 
> > > >  Introducing new `TableSymbol`s sounds like a good idea. +1 for
> the
> > > >  proposal.
> > > >  @ForwardXu what do you think? Would be great if the document can
> > be
> > > >  updated
> > > >  accordingly.
> > > > 
> > > >  Best, Hequn
> > > > 
> > > > 
> > > >  On Fri, Dec 27, 2019 at 4:14 PM Jark Wu 
> wrote:
> > > > 
> > > > > Thanks for looking into the design Hequn. I agree it would be
> > great
> > > > to
> > > > > have a full story design.
> > > > >
> > > > > For the ON ERROR and ON EMPTY clause in Table API, some initial
> > > > > thoughts in my mind is that
> > > > > we can introduce some new `TableSymbol`s as the second
> parameter
> > of
> > > > >> json
> > > > > function, e.g. `JsonErrorStrategy`.
> > > > >
> > > > > For example,
> > > > >
> > > > > JSON_VALUE(v, 'lax $.b' ERROR ON ERROR)
> > > > > == is equal to Table API ==>
> > > > > v.jsonValue("lax $.b", JsonErrorStrategy.ERROR)
> > > > >
> > > > > Best,
> > > > > Jark
> > > > >
> > > > >
> > > > > On Thu, 26 Dec 2019 at 23:37, Hequn Cheng <
> chenghe...@gmail.com>
> > > > >> wrote:
> > > > >
> > > > >> Hi Jark & ForwardXu,
> > > > >>
> > > > >> The design doc looks very nice! Only some minor feedback from
> my
> > > > >> side.
> > > > >>
> > > > >> As calcite has already implemented the JSON functions, I would
> > > > >> suppose
> > > > >> the semantics and implementation are right for SQL.
> > > > >>
> > > > >> For TableAPI, I think the most important thing is to keep aligned
> > > > >> with SQL (which has also been mentioned by Jark in the previous
> > > > >> discussion). Have an equivalent feature set for all APIs and
> > > > >> maintain it, otherwise confusion increases, especially when more
> > > > >> and more functions are added. The document has described how to
> > > > >> support the Table API. I think this is very good! And it would be
> > > > >> better to also include ON ERROR and ON EMPTY for the Table API. We
> > > > >> can implement these features step by step, but maybe we should
> > > > >> design all of them once and for all to avoid API changes later.
> > > > >> Meanwhile, these features are also commonly required by users.

[jira] [Created] (FLINK-15495) Set default planner for SQL Client to Blink planner

2020-01-06 Thread Jark Wu (Jira)
Jark Wu created FLINK-15495:
---

 Summary: Set default planner for SQL Client to Blink planner
 Key: FLINK-15495
 URL: https://issues.apache.org/jira/browse/FLINK-15495
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Client
Reporter: Jark Wu
Assignee: Jark Wu
 Fix For: 1.10.0


As discussed in the mailing list [1], we will change the default planner to 
Blink planner for SQL CLI. 

[1]: 
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Set-default-planner-for-SQL-Client-to-Blink-planner-in-1-10-release-td36379.html
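
The change itself boils down to one line in the shipped SQL Client defaults; a
sketch of the relevant excerpt of conf/sql-client-defaults.yaml (surrounding
keys omitted, values illustrative):

{code:yaml}
execution:
  planner: blink   # previously: old
  type: streaming
{code}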




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15494) time field index wrong in LogicalWindowAggregateRuleBase

2020-01-06 Thread Benchao Li (Jira)
Benchao Li created FLINK-15494:
--

 Summary: time field index wrong in LogicalWindowAggregateRuleBase
 Key: FLINK-15494
 URL: https://issues.apache.org/jira/browse/FLINK-15494
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.9.1, 1.10.0
Reporter: Benchao Li


When we use two time windows in the blink planner, the time field index will be 
calculated incorrectly.

This has two symptoms as far as I know:
 # The wrong index may carry the correct field name; it will pass the build but 
produce a wrong rowtime at runtime.
 # The wrong index has an incorrect field name and will not pass the build.

 

How to reproduce this problem:

I added a unit test in `WindowAggregateITCase`:
{code:java}
@Test
def testDoubleTumbleWindow(): Unit = {
  val stream = failingDataSource(data)
    .assignTimestampsAndWatermarks(
      new TimestampAndWatermarkWithOffset[(Long, Int, Double, Float, BigDecimal, String, String)](10L))
  val table = stream.toTable(tEnv,
    'rowtime.rowtime, 'int, 'double, 'float, 'bigdec, 'string, 'name)
  tEnv.registerTable("T1", table)

  val sql =
    """
      |SELECT SUM(cnt)
      |FROM (
      |  SELECT COUNT(1) AS cnt, TUMBLE_ROWTIME(rowtime, INTERVAL '10' SECOND) AS ts
      |  FROM T1
      |  GROUP BY `int`, `string`, TUMBLE(rowtime, INTERVAL '10' SECOND)
      |)
      |GROUP BY TUMBLE(ts, INTERVAL '10' SECOND)
      |""".stripMargin

  val sink = new TestingAppendSink
  tEnv.sqlQuery(sql).toAppendStream[Row].addSink(sink)
  env.execute()

  val expected = Seq("9")
  assertEquals(expected.sorted, sink.getAppendResults.sorted)
}
{code}
which will result in exception:
{code:java}
java.lang.RuntimeException: Error while applying rule StreamExecGroupWindowAggregateRule(in:LOGICAL,out:STREAM_PHYSICAL), args [rel#308:FlinkLogicalWindowAggregate.LOGICAL.any.None: 0.false.UNKNOWN(input=RelSubset#307,group={},EXPR$0=SUM($1),window=TumblingGroupWindow('w$, int, 1),properties=)]
 at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:235)
 at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:631)
 at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:327)
 at org.apache.flink.table.planner.plan.optimize.program.FlinkVolcanoProgram.optimize(FlinkVolcanoProgram.scala:64)
 at org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram$$anonfun$optimize$1.apply(FlinkChainedProgram.scala:62)
 at org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram$$anonfun$optimize$1.apply(FlinkChainedProgram.scala:58)
 at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
 at scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
 at scala.collection.Iterator$class.foreach(Iterator.scala:891)
 at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
 at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
 at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
 at scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)
 at scala.collection.AbstractTraversable.foldLeft(Traversable.scala:104)
 at org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram.optimize(FlinkChainedProgram.scala:57)
 at org.apache.flink.table.planner.plan.optimize.StreamCommonSubGraphBasedOptimizer.optimizeTree(StreamCommonSubGraphBasedOptimizer.scala:167)
 at org.apache.flink.table.planner.plan.optimize.StreamCommonSubGraphBasedOptimizer.doOptimize(StreamCommonSubGraphBasedOptimizer.scala:89)
 at org.apache.flink.table.planner.plan.optimize.CommonSubGraphBasedOptimizer.optimize(CommonSubGraphBasedOptimizer.scala:77)
 at org.apache.flink.table.planner.delegation.PlannerBase.optimize(PlannerBase.scala:248)
 at org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:151)
 at org.apache.flink.table.api.scala.internal.StreamTableEnvironmentImpl.toDataStream(StreamTableEnvironmentImpl.scala:210)
 at org.apache.flink.table.api.scala.internal.StreamTableEnvironmentImpl.toAppendStream(StreamTableEnvironmentImpl.scala:107)
 at org.apache.flink.table.api.scala.TableConversions.toAppendStream(TableConversions.scala:101)
 at org.apache.flink.table.planner.runtime.stream.sql.WindowAggregateITCase.testDoubleTumbleWindow(WindowAggregateITCase.scala:125)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.

Re: Hi Guys

2020-01-06 Thread Xintong Song
Hi,
Welcome to the Apache Flink community. You do not need contributor
permission to open Jira tickets. Once a ticket is opened, discussed, and
consensus is reached, you can ask a committer to assign it to you and
start working on it.
Please refer to the community contribution guidelines [1].

Thank you~

Xintong Song


[1] https://flink.apache.org/contributing/how-to-contribute.html

On Tue, Jan 7, 2020 at 11:54 AM 阿穆 <305343...@qq.com> wrote:

> Hi Guys, I want to contribute to Apache Flink. Would you please give me
> the permission as a contributor? My JIRA ID is mackson.


Re: Why aren't Range Scans Supported in MapState?

2020-01-06 Thread Yun Tang
Hi Aaron

We would usually compare MapState to the Map interface in Java, and what you 
ask for is not part of the typical Map contract but of SortedMap.

I think it's reasonable to introduce a SortedMapState interface in Flink. 
However, there are several challenges on the RocksDB side (see also the sketch 
below):

  *   RocksDB only supports a byte-wise comparator by default, which means 
that non-fixed-length serializers (e.g. StringSerializer) generate bytes whose 
lexicographic order differs from the order of the original Java strings. Blink 
tried to implement this feature as OrderedBytes [1], which still has many 
limitations and is only applied in the Table API.
  *   RocksDB actually supports Java comparators, and we have also implemented 
this privately on RocksDB 4.2. However, the performance is bad.
  *   RocksDB later added support for native comparators for Java users [2], 
and we are still looking into this feature, although it would be a bit complex 
for Flink users if they want their classes to be comparable in a custom order.

Hope this helps.

[1] 
https://github.com/apache/flink/blob/blink/flink-libraries/flink-table/src/main/java/org/apache/flink/table/typeutils/ordered/OrderedBytes.java
[2] https://github.com/facebook/rocksdb/pull/3334

Best
Yun Tang
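
To illustrate the first bullet: a byte-wise comparator orders keys correctly
only if serialization is order-preserving. A minimal sketch for signed ints
(flip the sign bit, write big-endian); this shows the general idea behind
OrderedBytes and is not Flink's actual serializer:

{code:java}
import java.nio.ByteBuffer;

// Order-preserving encoding for signed ints: flip the sign bit and write
// big-endian, so that unsigned byte-wise comparison (RocksDB's default)
// matches the natural int order.
public final class OrderedIntEncoding {

    static byte[] encode(int value) {
        return ByteBuffer.allocate(4).putInt(value ^ Integer.MIN_VALUE).array();
    }

    // Unsigned lexicographic comparison of encoded bytes == Integer.compare.
    static int compareEncoded(byte[] a, byte[] b) {
        for (int i = 0; i < 4; i++) {
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0) {
                return cmp;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // -5 < 3 holds both for the ints and for their encodings.
        System.out.println(Integer.compare(-5, 3));                 // negative
        System.out.println(compareEncoded(encode(-5), encode(3)));  // negative
    }
}
{code}

With such an encoding, a RocksDB range or prefix scan over the raw key bytes
matches the logical key order, which is exactly what a SortedMapState would
need; variable-length types like String are the hard part.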


From: Aaron Langford 
Sent: Tuesday, January 7, 2020 1:36
To: dev@flink.apache.org 
Subject: Why aren't Range Scans Supported in MapState?

Hey Flink Community,

RocksDB natively supports range scans for keys. Is there a good story why
we don't have a way to ask for keys by range or prefix currently in
MapState?

Aaron


Hi Guys

2020-01-06 Thread 阿穆
Hi Guys, I want to contribute to Apache Flink. Would you please give me the 
permission as a contributor? My JIRA ID is mackson.

[jira] [Created] (FLINK-15493) FlinkKafkaInternalProducerITCase.testProducerWhenCommitEmptyPartitionsToOutdatedTxnCoordinator failed on travis

2020-01-06 Thread Dian Fu (Jira)
Dian Fu created FLINK-15493:
---

 Summary: 
FlinkKafkaInternalProducerITCase.testProducerWhenCommitEmptyPartitionsToOutdatedTxnCoordinator
 failed on travis
 Key: FLINK-15493
 URL: https://issues.apache.org/jira/browse/FLINK-15493
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.10.0
Reporter: Dian Fu
 Fix For: 1.10.0


FlinkKafkaInternalProducerITCase.testProducerWhenCommitEmptyPartitionsToOutdatedTxnCoordinator
 failed on travis with the following exception:
{code:java}
Test testProducerWhenCommitEmptyPartitionsToOutdatedTxnCoordinator(org.apache.flink.streaming.connectors.kafka.FlinkKafkaInternalProducerITCase) failed with:
org.junit.runners.model.TestTimedOutException: test timed out after 3 milliseconds
 at java.lang.Object.wait(Native Method)
 at java.lang.Object.wait(Object.java:502)
 at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:92)
 at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
 at org.apache.flink.streaming.connectors.kafka.KafkaTestEnvironmentImpl.createTestTopic(KafkaTestEnvironmentImpl.java:177)
 at org.apache.flink.streaming.connectors.kafka.KafkaTestEnvironment.createTestTopic(KafkaTestEnvironment.java:115)
 at org.apache.flink.streaming.connectors.kafka.KafkaTestBase.createTestTopic(KafkaTestBase.java:197)
 at org.apache.flink.streaming.connectors.kafka.FlinkKafkaInternalProducerITCase.testProducerWhenCommitEmptyPartitionsToOutdatedTxnCoordinator(FlinkKafkaInternalProducerITCase.java:176)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
 at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
 at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
 at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.lang.Thread.run(Thread.java:748)
{code}
instance: [https://api.travis-ci.org/v3/job/633307060/log.txt]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2020-01-06 Thread Peter Huang
Dear All,

Happy new year! Based on the feedback from the community so far, we have
revised the doc to cover session cluster support, the concrete interface
changes needed, and the execution plan. Please take one more round of review
at your convenience.

https://docs.google.com/document/d/1aAwVjdZByA-0CHbgv16Me-vjaaDMCfhX7TzVVTuifYM/edit#


Best Regards
Peter Huang





On Thu, Jan 2, 2020 at 11:29 AM Peter Huang 
wrote:

> Hi Dian,
> Thanks for giving us valuable feedbacks.
>
> 1) It's better to have a whole design for this feature
> For the suggestion of enabling this mode also for session clusters, I
> think Flink already supports it: WebSubmissionExtension already allows
> users to start a job with a specified jar via the web UI.
> But we need to enable the feature from CLI for both local jar, remote jar.
> I will align with Yang Wang first about the details and update the design
> doc.
>
> 2) It's better to consider the convenience for users, such as debugging
>
> I am wondering whether we can store the exception from job graph
> generation in the application master. As no stream graph can be scheduled
> in this case, no more TMs will be requested from the Flink RM.
> If the AM is still running, users can still query it from CLI. As it
> requires more change, we can get some feedback from 
> and @zjf...@gmail.com .
>
> 3) It's better to consider the impact to the stability of the cluster
>
> I agree with Yang Wang's opinion.
>
>
>
> Best Regards
> Peter Huang
>
>
> On Sun, Dec 29, 2019 at 9:44 PM Dian Fu  wrote:
>
>> Hi all,
>>
>> Sorry to jump into this discussion. Thanks everyone for the discussion.
>> I'm very interested in this topic although I'm not an expert in this part.
>> So I'm glad to share my thoughts as following:
>>
>> 1) It's better to have a whole design for this feature
>> As we know, there are two deployment modes: per-job mode and session
>> mode. I'm wondering which mode really needs this feature. As the design doc
>> mentioned, per-job mode is more used for streaming jobs and session mode is
>> usually used for batch jobs(Of course, the job types and the deployment
>> modes are orthogonal). Usually streaming job is only needed to be submitted
>> once and it will run for days or weeks, while batch jobs will be submitted
>> more frequently compared with streaming jobs. This means that maybe session
>> mode also needs this feature. However, if we support this feature in
>> session mode, the application master will become the new centralized
>> service(which should be solved). So in this case, it's better to have a
>> complete design for both per-job mode and session mode. Furthermore, even
>> if we can do it phase by phase, we need to have a whole picture of how it
>> works in both per-job mode and session mode.
>>
>> 2) It's better to consider the convenience for users, such as debugging
>> After we finish this feature, the job graph will be compiled in the
>> application master, which means that users cannot easily get the exception
>> message synchronously in the job client if there are problems during the
>> job graph compiling (especially for platform users), such as the resource
>> path is incorrect, the user program itself has some problems, etc. What I'm
>> thinking is that maybe we should throw the exceptions as early as possible
>> (during job submission stage).
>>
>> 3) It's better to consider the impact to the stability of the cluster
>> If we perform the compiling in the application master, we should consider
>> the impact of the compiling errors. Although YARN could resume the
>> application master in case of failures, but in some case the compiling
>> failure may be a waste of cluster resource and may impact the stability the
>> cluster and the other jobs in the cluster, such as the resource path is
>> incorrect, the user program itself has some problems(in this case, job
>> failover cannot solve this kind of problems) etc. In the current
>> implemention, the compiling errors are handled in the client side and there
>> is no impact to the cluster at all.
>>
>> Regarding to 1), it's clearly pointed in the design doc that only per-job
>> mode will be supported. However, I think it's better to also consider the
>> session mode in the design doc.
>> Regarding to 2) and 3), I have not seen related sections in the design
>> doc. It will be good if we can cover them in the design doc.
>>
>> Feel free to correct me If there is anything I misunderstand.
>>
>> Regards,
>> Dian
>>
>>
>> > 在 2019年12月27日,上午3:13,Peter Huang  写道:
>> >
>> > Hi Yang,
>> >
>> > I can't agree more. The effort definitely needs to align with the final
>> > goal of FLIP-73.
>> > I am thinking about whether we can achieve the goal with two phases.
>> >
>> > 1) Phase I
>> > As the CLiFrontend will not be depreciated soon. We can still use the
>> > deployMode flag there,
>> > pass the program info through Flink configuration,  use the
>> > ClassPathJobGraphRetriever
>> > to generat

[jira] [Created] (FLINK-15492) flink-clients test failing on Red Hat 7.6 PowerPC Linux

2020-01-06 Thread Ronald O. Edmark (Jira)
Ronald O. Edmark created FLINK-15492:


 Summary: flink-clients test failing on Red Hat 7.6 PowerPC Linux
 Key: FLINK-15492
 URL: https://issues.apache.org/jira/browse/FLINK-15492
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.8.3
 Environment: Red Hat 7.6 Linux ppc64le

Java 1.8.0.232

Maven 3.2.5

 

 
Reporter: Ronald O. Edmark


Steps to reproduce, running as non-root user.

 

git clone https://github.com/apache/flink.git
cd flink
git checkout release-1.8.3
mvn clean package

 

[ERROR] Tests run: 10, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 3.586 s <<< FAILURE! - in org.apache.flink.client.program.rest.RestClusterClientTest
[ERROR] testJobSubmitCancelStop(org.apache.flink.client.program.rest.RestClusterClientTest)  Time elapsed: 1.972 s  <<< ERROR!
org.apache.flink.client.program.ProgramInvocationException: Could not retrieve the execution result. (JobID: 090e041e68eb44341e6b43e02407034f)
 at org.apache.flink.client.program.rest.RestClusterClientTest.testJobSubmitCancelStop(RestClusterClientTest.java:236)
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
Caused by: org.apache.flink.runtime.rest.util.RestClientException: [File upload failed.]

[ERROR] testDetachedJobSubmission(org.apache.flink.client.program.rest.RestClusterClientTest)  Time elapsed: 0.066 s  <<< ERROR!
org.apache.flink.client.program.ProgramInvocationException: Could not submit job (JobID: 2812b449f37dae9391da9ce8c8655839)
 at org.apache.flink.client.program.rest.RestClusterClientTest.testDetachedJobSubmission(RestClusterClientTest.java:266)
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
Caused by: org.apache.flink.runtime.rest.util.RestClientException: [File upload failed.]

[ERROR] testSubmitJobAndWaitForExecutionResult(org.apache.flink.client.program.rest.RestClusterClientTest)  Time elapsed: 0.028 s  <<< ERROR!
org.apache.flink.client.program.ProgramInvocationException: Could not retrieve the execution result. (JobID: cbdaff25d30441a18877efefc031cd9a)
 at org.apache.flink.client.program.rest.RestClusterClientTest.testSubmitJobAndWaitForExecutionResult(RestClusterClientTest.java:381)
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
Caused by: org.apache.flink.runtime.rest.util.RestClientException: [File upload failed.]

[ERROR] Errors:
[ERROR]   RestClusterClientTest.testDetachedJobSubmission:266 » ProgramInvocation Could ...
[ERROR]   RestClusterClientTest.testJobSubmitCancelStop:236 » ProgramInvocation Could no...
[ERROR]   RestClusterClientTest.testSubmitJobAndWaitForExecutionResult:381 » ProgramInvocation
[INFO]
[ERROR] Tests run: 81, Failures: 0, Errors: 3, Skipped: 0
[INFO]
[INFO] Reactor Summary:
[INFO]
[INFO] force-shading ...................................... SUCCESS [  1.946 s]
[INFO] flink .............................................. SUCCESS [  3.498 s]
[INFO] flink-annotations .................................. SUCCESS [  2.634 s]
[INFO] flink-shaded-hadoop ................................ SUCCESS [  0.279 s]
[INFO] flink-shaded-hadoop2 ............................... SUCCESS [ 16.085 s]
[INFO] flink-shaded-hadoop2-uber .......................... SUCCESS [ 17.725 s]
[INFO] flink-shaded-yarn-tests ............................ SUCCESS [ 20.892 s]
[INFO] flink-shaded-curator ............................... SUCCESS [  1.113 s]
[INFO] flink-metrics ...................................... SUCCESS [  0.199 s]
[INFO] flink-metrics-core ................................. SUCCESS [  2.411 s]
[INFO] flink-test-utils-parent ............................ SUCCESS [  0.208 s]
[INFO] flink-test-utils-junit ............................. SUCCESS [  2.290 s]
[INFO] flink-core ......................................... SUCCESS [01:48 min]
[INFO] flink-java ......................................... SUCCESS [ 31.331 s]
[INFO] flink-queryable-state .............................. SUCCESS [  0.188 s]
[INFO] flink-queryable-state-client-java .................. SUCCESS [  2.741 s]
[INFO] flink-filesystems .................................. SUCCESS [  0.239 s]
[INFO] flink-hadoop-fs .................................... SUCCESS [ 10.429 s]
[INFO] flink-runtime ...................................... SUCCESS [10:44 min]
[INFO] flink-scala ........................................ SUCCESS [01:43 min]
[INFO] flink-mapr-fs ...................................... SUCCESS [  2.217 s]
[INFO] flink-filesystems :: flink-fs-hadoop-shaded ........ SUCCESS [  5.939 s]
[INFO] flink-s3-fs-base ................................... SUCCESS [ 10.766 s]
[INFO] flink-s3-fs-hadoop ................................. SUCCESS [  8.244 s]
[INFO] flink-s3-fs-presto .

Why aren't Range Scans Supported in MapState?

2020-01-06 Thread Aaron Langford
Hey Flink Community,

RocksDB natively supports range scans for keys. Is there a good story why
we don't have a way to ask for keys by range or prefix currently in
MapState?

Aaron


Re: [DISCUSS] Some feedback after trying out the new FLIP-49 memory configurations

2020-01-06 Thread Andrey Zagrebin
Thank you for more explanation Stephan and feedback Jingsong,

I see the point now.
‘taskmanager.memory.flink.size’ is indeed simpler for newcomers to understand
in the default config because it includes fewer types of memory to consider.
Most of the time beginners will think about heap size and maybe state size
to tweak in the try-out jobs.
‘taskmanager.memory.flink.size’ is better scoped for those types of memory.

The process memory option forces users to understand the total memory
consumption, which is more important for the next steps and will probably
require reading the docs in more depth anyway.
I agree that if we decide for ‘flink.size’, it is worth mentioning a pointer
to ‘process.size' in its comment as an alternative.

I am ok with ‘flink.size’ in the default config.
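
For concreteness, the two alternatives for the shipped flink-conf.yaml would
look roughly like this; the keys are the FLIP-49 options discussed above, and
the values are only illustrative, not the agreed defaults:

{code:yaml}
# Option A: configure Flink's own memory; JVM overhead and metaspace are
# added on top of this value by the startup memory calculation.
taskmanager.memory.flink.size: 1280m

# Option B: configure the total process size; everything, including JVM
# overhead and metaspace, must fit inside this budget.
# taskmanager.memory.process.size: 1568m
{code}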

Best,
Andrey

On Mon, Dec 30, 2019 at 5:13 AM Jingsong Li  wrote:

> Thank you for your wonderful discussion.
>
> +1 for set "taskmanager.memory.flink.size" in the default config.
> Maybe we can write and explain "taskmanager.memory.process.size" in the
> comments.
>
> Most of the time,
> - "trying out" users is less in-depth users and use standalone mode.
> - Production users use active setups.
>
> The default config is very important for "trying out".
> When trying out Flink, a novice Java user wants to configure how much
> memory the standalone process should use. In fact, it's hard for them to
> be aware of the JVM overhead, and there is no need to force them to know
> about it.
>
> > I don't think it is realistic that users set the process memory to full
> > machine memory. There is a lot else running on the machine in most cases.
>
> +1, We often run Flink in a less clean environment, such as the environment
> have HDFS. Whether we are testing or producing, we will not use all the
> memory of the machine, such as always leaving some memory for PageCache.
>
> Best,
> Jingsong Lee
>
> On Sat, Dec 28, 2019 at 7:02 PM Stephan Ewen  wrote:
>
> > "taskmanager.memory.flink.size" in the default config has a few
> advantages.
> >
> >  - The value in the default config needs to be suitable for "trying out"
> > Flink, for a good "getting started" experience.
> >
> >   - For trying out Flink, standalone is the most common entry point
> (except
> > running in IDE).
> >
> >   - In standalone setup, from total process memory, we subtract quite a
> bit
> > before we arrive at the usable memory. We also subtract managed memory
> from
> > the heap now. I fear we might end up at a heap that becomes so small that
> > it makes for a bad "getting started" experience.
> >
> >   - I don't think it is realistic that users set the process memory to
> > full machine memory. There is a lot else running on the machine in most
> > cases.
> >
> >   - In the JVM world, users are used to configuring the heap size and
> know
> > that there is additional memory overhead. The
> > "taskmanager.memory.flink.size" option fits well with that mindset.
> >
> >   - Once you start to think about the total process memory of Yarn
> > containers, you are already past the getting-started phase and on the
> > tuning phase.
> >
> >
> > On Tue, Dec 24, 2019, 10:25 Andrey Zagrebin 
> wrote:
> >
> > > Thanks for the summary, Xintong! It makes sense to me.
> > >
> > > How about putting "taskmanager.memory.flink.size" in the configuration?
> > > > Then new downloaded Flink behaves similar to the previous Standalone
> > > setups.
> > > > If someone upgrades the binaries, but re-uses their old
> configuration,
> > > > then they get the compatibility as discussed previously.
> > > > We used that approach previously with the fine-grained failover
> > recovery.
> > >
> > >
> > >
> > > > I'm trying to understand why "taskmanager.memory.flink.size" rather
> > than
> > > > "taskmanager.memory.process.size" in the default flink-conf.yaml. Or
> > put
> > > it
> > > > another way, why do we want the new downloaded Flink behaves similar
> to
> > > > previous Standalone setups rather than previous active mode setups?
> Is
> > > > there any special reason that I overlooked, which makes backwards
> > > > compatibility for standalone setups more important than for active
> > > setups?
> > > > IMO, "taskmanager.memory.process.size" is easier for the new comers.
> > For
> > > > standalone setups, users can simply configure it to their machines'
> > > > available memory size, without needing to worry about leaving enough
> > > > space for JVM overhead / metaspace. For containerized setups, it's more
> > > > predictable how much memory the containers / Flink could use, which
> > > > is more
> > > > friendly for users to manage their resource quota.
> > > > Therefore, unless there is anything I overlooked, I'm in favor of
> > putting
> > > > "taskmanager.memory.process.size" rather than
> > > > "taskmanager.memory.flink.size" in the default configuration.
> > >
> > >
> > > I agree that having "taskmanager.memory.process.size" in default config
> > > should be easier to understand and tweak for the new users because it
> is
>

[jira] [Created] (FLINK-15491) Fix incorrect path while missing trailing slash of high-availability.zookeeper.path.running-registry parameter.

2020-01-06 Thread Qingxin Wu (Jira)
Qingxin Wu created FLINK-15491:
--

 Summary: Fix incorrect path when the trailing slash is missing from the 
high-availability.zookeeper.path.running-registry parameter.
 Key: FLINK-15491
 URL: https://issues.apache.org/jira/browse/FLINK-15491
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Configuration
Affects Versions: 1.9.0
Reporter: Qingxin Wu
 Attachments: 03a8857d50e911f9d09c68df0.png

By default, the value of configuration 
*high-availability.zookeeper.path.running-registry* must end with a slash. If 
we set this parameter as following:
{code:java}
high-availability.zookeeper.path.running-registry: /running_job_registry{code}


an incorrect path will be created, as shown in the following picture

!03a8857d50e911f9d09c68df0.png!
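
A sketch of the kind of normalization that would avoid the problem; joinZkPath
is a hypothetical helper, not the actual Flink code:

{code:java}
public final class ZkPaths {

    /**
     * Joins a ZooKeeper root path and a child path with exactly one slash,
     * regardless of whether the configured root ends with '/'.
     */
    static String joinZkPath(String root, String child) {
        String r = root.endsWith("/") ? root.substring(0, root.length() - 1) : root;
        String c = child.startsWith("/") ? child : "/" + child;
        return r + c;
    }

    public static void main(String[] args) {
        // Both configurations produce the same, correct path:
        System.out.println(joinZkPath("/flink/ha", "/running_job_registry"));
        System.out.println(joinZkPath("/flink/ha/", "running_job_registry"));
    }
}
{code}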

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15490) KafkaITCase.testOneSourceMultiplePartitions fails on Travis

2020-01-06 Thread Till Rohrmann (Jira)
Till Rohrmann created FLINK-15490:
-

 Summary: KafkaITCase.testOneSourceMultiplePartitions fails on 
Travis
 Key: FLINK-15490
 URL: https://issues.apache.org/jira/browse/FLINK-15490
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka, Tests
Affects Versions: 1.10.0
Reporter: Till Rohrmann


The test {{KafkaITCase.testOneSourceMultiplePartitions}} failed on Travis 
because it received a duplicate value:

{code}
13:10:49,276 INFO  org.apache.flink.streaming.connectors.kafka.testutils.FailingIdentityMapper  - > Failing mapper  1: count=2802, totalCount=3167
org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
at 
org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)
at 
org.apache.flink.runtime.minicluster.MiniCluster.executeJobBlocking(MiniCluster.java:648)
at 
org.apache.flink.streaming.util.TestStreamEnvironment.execute(TestStreamEnvironment.java:77)
at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1628)
at org.apache.flink.test.util.TestUtils.tryExecute(TestUtils.java:35)
at 
org.apache.flink.streaming.connectors.kafka.KafkaConsumerTestBase.runOneSourceMultiplePartitionsExactlyOnceTest(KafkaConsumerTestBase.java:912)
at 
org.apache.flink.streaming.connectors.kafka.KafkaITCase.testOneSourceMultiplePartitions(KafkaITCase.java:102)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.runtime.JobException: Recovery is suppressed by 
FixedDelayRestartBackoffTimeStrategy(maxNumberRestartAttempts=1, 
backoffTimeMS=0)
at 
org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:110)
at 
org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:76)
at 
org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:186)
at 
org.apache.flink.runtime.scheduler.DefaultScheduler.maybeHandleTaskFailure(DefaultScheduler.java:181)
at 
org.apache.flink.runtime.scheduler.DefaultScheduler.updateTaskExecutionStateInternal(DefaultScheduler.java:175)
at 
org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:476)
at 
org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:380)
at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:279)
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:194)
at 
org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
at akka.actor.ActorCell.receiv

[ANNOUNCE] Flink Forward SF Call for Presentation closing soon!

2020-01-06 Thread Fabian Hueske
Hi all,

First of all, Happy New Year to everyone!

Many of you probably didn't spend the holidays thinking a lot about Flink.
Now, however, is the right time to focus again and decide which talk(s) to
submit for Flink Forward San Francisco because the Call for Presentations
is closing this Sunday, January 12th.

Flink Forward SF will take place on March 23-25, 2020.
For the first time, the conference will feature two days of talks (one more
than before), so we are offering twice as many speaking slots as last year.

If you'd like to share your Flink experience with an international audience
of enthusiastic Flink users and committers, you should definitely submit a
talk proposal.

We are looking for talks on the following topics:
* Use Cases
* Operations
* Technology Deep Dive
* Ecosystem
* Community

You can find more detailed track descriptions and the form to submit a
proposal at

--> https://www.flink-forward.org/sf-2020/call-for-presentations

Best regards,
Fabian
(PC Chair for Flink Forward SF 2020)


[jira] [Created] (FLINK-15489) Cannot update jobmanager/taskmanager logs

2020-01-06 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-15489:


 Summary: Cannot update jobmanager/taskmanager logs
 Key: FLINK-15489
 URL: https://issues.apache.org/jira/browse/FLINK-15489
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Web Frontend
Affects Versions: 1.10.0
Reporter: Dawid Wysakowicz


There is no way to query the latest state of logs of jobmanager/taskmanager.

The Web UI shows only the first version that was ever displayed.

How to reproduce:
* (not sure if necessary) configure logback as described here: 
https://ci.apache.org/projects/flink/flink-docs-stable/dev/best_practices.html#use-logback-when-running-flink-on-a-cluster
* start a cluster
* show jobmanager logs in the Web UI
* run example job
* check again the jobmanager logs, there is no trace of the job. Clicking the 
refresh button does not help



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15488) Cannot start a taskmanager if using logback

2020-01-06 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-15488:


 Summary: Cannot start a taskmanager if using logback
 Key: FLINK-15488
 URL: https://issues.apache.org/jira/browse/FLINK-15488
 Project: Flink
  Issue Type: Bug
  Components: API / Core, Deployment / Scripts
Affects Versions: 1.10.0
Reporter: Dawid Wysakowicz
 Fix For: 1.10.0


When using logback it is not possible to start the taskmanager using the 
{{taskmanager.sh}} script. The same problem (probably) occurs when using slf4j 
bindings that log to the console.

The problem is that when calculating the memory configuration with the 
{{BashJavaUtils}} class, the result is returned through standard output. If 
something is logged to the console, it may result in undefined behavior such 
as:
{code}
Error: Could not find or load main class 13:51:23.961
{code}
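
A minimal, JDK-only illustration of the failure mode (not Flink code): the
launching script captures standard output expecting a single value, but a
console appender writes to the same stream:

{code:java}
public class StdoutResultSketch {

    public static void main(String[] args) {
        // Imagine a console appender configured to write to System.out:
        System.out.println("13:51:23.961 INFO  SomeLogger - starting up");
        // ...and the actual result the calling script wants to capture:
        System.out.println("-Xmx536870902 -Xms536870902");
        // A script doing JVM_ARGS=$(java ... StdoutResultSketch) captures BOTH
        // lines, so the subsequent "java $JVM_ARGS MainClass" treats the log
        // line as arguments, yielding errors like
        // "Error: Could not find or load main class 13:51:23.961".
        // Logging to System.err (or a file) avoids the clash.
    }
}
{code}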



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15487) Expose the new type inference for scalar functions

2020-01-06 Thread Timo Walther (Jira)
Timo Walther created FLINK-15487:


 Summary: Expose the new type inference for scalar functions
 Key: FLINK-15487
 URL: https://issues.apache.org/jira/browse/FLINK-15487
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API, Table SQL / Planner
Reporter: Timo Walther
Assignee: Timo Walther


After FLINK-15281 is merged, we are able to introduce functions with the new 
type inference. We start by enabling scalar functions through the stack.

This includes the following changes:

- Introduce a {{UserDefinedFunction.getTypeInference}} method
- Expose {{TableEnvironment.createTemporarySystemFunction}} and 
{{TableEnvironment.createTemporaryFunction}}
- Update FunctionCatalog
- Update the code generation
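
A sketch of the user-facing registration path once this is exposed; the method
name createTemporarySystemFunction follows the list above, while the function
and the query are made up:

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.functions.ScalarFunction;

public class NewTypeInferenceSketch {

    /** A scalar function whose argument/result types go through the new type inference. */
    public static class ParseIntFunction extends ScalarFunction {
        public Integer eval(String s) {
            return Integer.valueOf(s);
        }
    }

    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());

        tEnv.createTemporarySystemFunction("PARSE_INT", ParseIntFunction.class);
        tEnv.sqlQuery("SELECT PARSE_INT(s) FROM (VALUES ('42')) AS t(s)");
    }
}
{code}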



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] JDBC exactly-once sink

2020-01-06 Thread Piotr Nowojski
Hi,

Also +1 for using the XA. There might be scenarios where WAL could be a better 
option, but I think XA should be the default/first choice. If there will be a 
bigger demand for WAL, we can always provide it as an alternative.

As Jingsong mentioned, with WAL I would be worried about batch-like workloads. 
Spamming the external database with all of the accumulated records once per 
checkpoint can easily lead to DDoS-like scenarios and external system crashes.

Jingsong do we already have an upsert JDBC sink? I guess it's just in Table 
API, right?

Piotrek

> On 6 Jan 2020, at 10:43, Jingsong Li  wrote:
> 
> Thanks Roman for driving this.
> 
> Although Upsert sink can bring some consistency guarantee to JDBC sink,
> users have strong requirements to append exactly-once sink too.
> 
> +1 to use XA distribution transaction.
> 
> For WAL, as far as my experience goes, writing to JDBC in large batches
> when a checkpoint completes will easily put too much pressure on the
> database and cause some instability.
> 
> For testing, we may need to add some XA tests to the e2e suite.
> 
> But I'm worried about the performance of XA. Sometimes using MySQL's XA
> will lead to poor performance (of course, it may be related to usage).
> 
> Best,
> Jingsong Lee
> 
> On Mon, Jan 6, 2020 at 4:41 PM Roman Khachatryan 
> wrote:
> 
>> Hi everyone,
>> 
>> I'm currently working on exactly-once JDBC sink implementation for Flink.
>> Any ideas and/or feedback are welcome.
>> 
>> I've considered the following options:
>> 1. Two-phase commit. This is similar to Kafka sink.
>> XA or database-specific API can be used. In case of XA, each sink subtask
>> acts as a transaction manager, and each checkpoint-subtask pair corresponds
>> to an XA transaction (with a single branch)
>> 2. Write-ahead log. This is similar to Cassandra sink.
>> Transactions metadata needs to be stored in the database along with data to
>> avoid adding duplicates after recovery.
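
For option 2, the transaction metadata could be a small table like the
following; this is a hypothetical DDL, and as noted under point 3, the actual
meta table layout may differ per database:

{code:sql}
-- Records which (job, subtask, checkpoint) batches were already committed,
-- so that batches replayed after recovery can be skipped idempotently.
CREATE TABLE flink_wal_commits (
    job_id        VARCHAR(64) NOT NULL,
    subtask_index INT         NOT NULL,
    checkpoint_id BIGINT      NOT NULL,
    PRIMARY KEY (job_id, subtask_index, checkpoint_id)
);
{code}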
>> 
>> For some scenarios, WAL might be better, but in general, XA seems to be a
>> better option.
>> 
>> ==
>> XA vs WAL comparison
>> ==
>> 
>> 1. Consistency: XA preferable
>> WAL: longer inconsistency windows when writing from several sink subtasks
>> 
>> 2. Performance and efficiency: XA preferable (depends on the use case)
>> XA:
>> - long-running transactions may delay GC and may hold locks (depends on the
>> use case)
>> - databases/drivers may have XA implementation issues
>> WAL:
>> - double (de)serialization and IO (first to flink state, then to database)
>> - read-from-state and write-to-database spikes on checkpoint completion
>> both may have read spikes in consumer
>> 
>> 3. Database support: XA preferable
>> XA: most popular RDBMS do support it (at least mysql, pgsql, mssql, oracle,
>> db2, sybase)
>> WAL: meta table DDL may differ
>> 
>> 4. Operability: depends on the use case
>> XA:
>> - increased undo segment (db may need to maintain a view from the
>> transaction start)
>> - abandoned transactions cleanup (abandoned tx may cause starvation if for
>> example database blocks inserts of duplicates in different transactions)
>> - (jars aren't an issue - most drivers ship XA implementation)
>> WAL:
>> - increased intermediate flink state
>> - need to maintain meta table
>> 
>> 5. Simplicity: about the same
>> XA: more corner cases
>> WAL: state and meta table management
>> Both wrap writes into transactions
>> 
>> 6. Testing - WAL preferable
>> XA requires MVCC and proper XA support (no extra jars needed for Derby)
>> 
>> --
>> Regards,
>> Roman
>> 
> 
> 
> -- 
> Best, Jingsong Lee
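
For readers wondering what the upsert approach looks like at the JDBC level: a
minimal hand-written sketch (the table, columns, and MySQL-specific ON
DUPLICATE KEY UPDATE syntax are illustrative assumptions; other databases use
different upsert dialects, e.g. PostgreSQL's INSERT ... ON CONFLICT):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class UpsertSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder URL and credentials, for illustration only.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/test", "user", "pwd")) {
            String sql = "INSERT INTO word_count (word, cnt) VALUES (?, ?) "
                       + "ON DUPLICATE KEY UPDATE cnt = VALUES(cnt)";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setString(1, "flink");
                ps.setLong(2, 42L);
                // Idempotent per key: replaying this write after recovery
                // updates the row instead of inserting a duplicate.
                ps.executeUpdate();
            }
        }
    }
}

Because upserts are idempotent per key, replaying a batch after recovery does
not create duplicates, which is why an upsert sink gives some consistency
guarantee without XA or a WAL, but only for keyed (non-append) output.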



[jira] [Created] (FLINK-15486) LocalInputChannelTest.testConcurrentConsumeMultiplePartitions failing on ppc64le platform.

2020-01-06 Thread Siddhesh Ghadi (Jira)
Siddhesh Ghadi created FLINK-15486:
--

 Summary: 
LocalInputChannelTest.testConcurrentConsumeMultiplePartitions failing on 
ppc64le platform.
 Key: FLINK-15486
 URL: https://issues.apache.org/jira/browse/FLINK-15486
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Network
 Environment: arch: ppc64le
os: rhel7.6
jdk: 8
mvn: 3.6.2
Reporter: Siddhesh Ghadi
 Attachments: surefire-report.txt

LocalInputChannelTest.testConcurrentConsumeMultiplePartitions fails due to a 
timeout; however, when the timeout is increased, the test passes. Any pointers 
to a possible solution or an explanation for this behavior would be helpful.
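
For reference, a generic JUnit 4 way to raise a per-test timeout while
investigating (purely illustrative; the actual timeout mechanism and value
used by this test may differ):

import org.junit.Test;

public class TimeoutSketch {

    // Illustrative: raising the per-test timeout to 30s.
    @Test(timeout = 30_000L)
    public void testConcurrentConsumeMultiplePartitions() throws Exception {
        // ... original test body ...
    }
}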



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Rename terminology "Time-windowed Join" to "Interval Join" in Table API & SQL

2020-01-06 Thread Timo Walther
If the semantics of both APIs are 100% equivalent, I'm fine with the 
renaming. Have you checked that?


Regards,
Timo

On 06.01.20 10:46, Zhenghua Gao wrote:
> +1 to align the terminology.
>
> *Best Regards,*
> *Zhenghua Gao*
>
>
> On Fri, Jan 3, 2020 at 12:59 PM Jingsong Li  wrote:
>
> > +1 for this documentation change.
> > Hopefully this will be less confusing to users.
> >
> > Best,
> > Jingsong Lee
> >
> > On Fri, Jan 3, 2020 at 12:09 PM Benchao Li  wrote:
> >
> > > +1
> > >
> > > It's good to align the terminology between Table API & SQL and
> > > DataStream.
> > >
> > > Jark Wu  于2020年1月3日周五 下午12:04写道:
> > >
> > > > Hi everyone,
> > > >
> > > > As we discussed in the mailing list[1], the current "Time-windowed Join"
> > > > in Table API & SQL is a little misleading: it is not the same as "Window
> > > > Join" in DataStream, but rather the same as "Interval Join" in
> > > > DataStream.
> > > >
> > > > So I would like to start a vote to rename the terminology of
> > > > "Time-windowed Join" to "Interval Join" in Table API & SQL **before
> > > > 1.10 release**.
> > > >
> > > > Note that this is a purely documentation change, no updates for public
> > > > API or Javadocs. Updates for implementation code (e.g. renaming
> > > > DataStreamWindowJoin) are not targeted to 1.10.
> > > >
> > > > This vote will be open for at least 72 hours, unless there is an
> > > > objection. This vote requires Consensus Approval, the same as a FLIP
> > > > vote.
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > > [1]:
> > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Correct-the-terminology-of-quot-Time-windowed-Join-quot-to-quot-Interval-Join-quot-in-Table-L-td36202.html
> > >
> > >
> > > --
> > >
> > > Benchao Li
> > > School of Electronics Engineering and Computer Science, Peking University
> > > Tel:+86-15650713730
> > > Email: libenc...@gmail.com; libenc...@pku.edu.cn
> >
> >
> > --
> > Best, Jingsong Lee







[jira] [Created] (FLINK-15485) Re-enable ignored tests whose blocking issues ('TODO: reopen when FLINK-xx is closed/merged') have been resolved

2020-01-06 Thread Leonard Xu (Jira)
Leonard Xu created FLINK-15485:
--

 Summary: Re-enable ignored tests whose blocking issues ('TODO: 
reopen when FLINK-xx is closed/merged') have been resolved
 Key: FLINK-15485
 URL: https://issues.apache.org/jira/browse/FLINK-15485
 Project: Flink
  Issue Type: Bug
  Components: Tests
Reporter: Leonard Xu


Sometimes we disable a test and leave a comment like 'TODO: reopen when 
FLINK-xx is closed/when FLINK-xx is merged' for various reasons, intending to 
re-enable it once the blocking issue is really fixed.

Unfortunately, we have missed some of them. This issue aims to re-enable the 
tests that were disabled because of

* FLINK-12088  
* FLINK-13740 
* CALCITE-1860  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Rename terminology "Time-windowed Join" to "Interval Join" in Table API & SQL

2020-01-06 Thread Zhenghua Gao
+1 to align the terminology.

*Best Regards,*
*Zhenghua Gao*


On Fri, Jan 3, 2020 at 12:59 PM Jingsong Li  wrote:

> +1 for this documentation change.
> Hopefully this will be less confusing to users.
>
> Best,
> Jingsong Lee
>
> On Fri, Jan 3, 2020 at 12:09 PM Benchao Li  wrote:
>
> > +1
> >
> > It's good to align the terminology between Table API & SQL and
> DataStream.
> >
> > Jark Wu  于2020年1月3日周五 下午12:04写道:
> >
> > > Hi everyone,
> > >
> > > As we discussed in the mailing list[1], the current "Time-windowed Join"
> > > in Table API & SQL is a little misleading: it is not the same as "Window
> > > Join" in DataStream, but rather the same as "Interval Join" in DataStream.
> > >
> > > So I would like to start a vote to rename the terminology of
> > > "Time-windowed Join" to "Interval Join" in Table API & SQL **before
> > > 1.10 release**.
> > >
> > > Note that this is a purely documentation change, no updates for public
> > > API or Javadocs. Updates for implementation code (e.g. renaming
> > > DataStreamWindowJoin) are not targeted to 1.10.
> > >
> > > This vote will be open for at least 72 hours, unless there is an
> > > objection. This vote requires Consensus Approval, the same as a FLIP
> > > vote.
> > >
> > > Best,
> > > Jark
> > >
> > > [1]:
> > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Correct-the-terminology-of-quot-Time-windowed-Join-quot-to-quot-Interval-Join-quot-in-Table-L-td36202.html
> >
> >
> > --
> >
> > Benchao Li
> > School of Electronics Engineering and Computer Science, Peking University
> > Tel:+86-15650713730
> > Email: libenc...@gmail.com; libenc...@pku.edu.cn
> >
>
>
> --
> Best, Jingsong Lee
>


Re: [DISCUSS] JDBC exactly-once sink

2020-01-06 Thread Jingsong Li
Thanks Roman for driving this.

Although an upsert sink can bring some consistency guarantees to the JDBC sink,
users also have strong requirements for an exactly-once append sink.

+1 to use XA distributed transactions.

As for WAL, in my experience, writing to JDBC in large batches when a
checkpoint completes can easily put too much pressure on the database and
cause instability.

For testing, we may need to add some XA tests to the e2e suite.

But I'm worried about the performance of XA. Sometimes using MySQL's XA
can lead to poor performance (of course, this may depend on usage).

Best,
Jingsong Lee

On Mon, Jan 6, 2020 at 4:41 PM Roman Khachatryan 
wrote:

> Hi everyone,
>
> I'm currently working on exactly-once JDBC sink implementation for Flink.
> Any ideas and/or feedback are welcome.
>
> I've considered the following options:
> 1. Two-phase commit. This is similar to Kafka sink.
> XA or database-specific API can be used. In case of XA, each sink subtask
> acts as a transaction manager, and each checkpoint-subtask pair corresponds
> to an XA transaction (with a single branch)
> 2. Write-ahead log. This is similar to Cassandra sink.
> Transaction metadata needs to be stored in the database along with data to
> avoid adding duplicates after recovery.
>
> For some scenarios, WAL might be better, but in general, XA seems to be a
> better option.
>
> ==
> XA vs WAL comparison
> ==
>
> 1. Consistency: XA preferable
> WAL: longer inconsistency windows when writing from several sink subtasks
>
> 2. Performance and efficiency: XA preferable (depends on the use case)
> XA:
> - long-running transactions may delay GC and may hold locks (depends on the
> use case)
> - databases/drivers may have XA implementation issues
> WAL:
> - double (de)serialization and IO (first to flink state, then to database)
> - read-from-state and write-to-database spikes on checkpoint completion
> (both options may additionally cause read spikes in the consumer)
>
> 3. Database support: XA preferable
> XA: the most popular RDBMS support it (at least MySQL, PostgreSQL, MSSQL,
> Oracle, DB2, Sybase)
> WAL: meta table DDL may differ
>
> 4. Operability: depends on the use case
> XA:
> - increased undo segment (db may need to maintain a view from the
> transaction start)
> - abandoned-transaction cleanup (abandoned transactions may cause starvation
> if, for example, the database blocks duplicate inserts across transactions)
> - (jars aren't an issue - most drivers ship XA implementation)
> WAL:
> - increased intermediate flink state
> - need to maintain meta table
>
> 5. Simplicity: about the same
> XA: more corner cases
> WAL: state and meta table management
> Both wrap writes into transactions
>
> 6. Testing - WAL preferable
> XA requires MVCC and proper XA support (no extra jars needed for Derby)
>
> --
> Regards,
> Roman
>


-- 
Best, Jingsong Lee
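
To make the pressure concern above concrete: one common mitigation is to flush
in capped batches while processing, instead of writing everything when the
checkpoint completes. A rough sketch (batch size, table, and SQL are
illustrative; this is not the proposed sink implementation):

import java.sql.Connection;
import java.sql.PreparedStatement;

public class BatchedWriterSketch {

    private static final int MAX_BATCH_SIZE = 500; // illustrative cap

    private final PreparedStatement statement;
    private int pending = 0;

    public BatchedWriterSketch(Connection connection) throws Exception {
        // Placeholder DML; a real sink would generate this per table.
        statement = connection.prepareStatement("INSERT INTO t (v) VALUES (?)");
    }

    public void write(String value) throws Exception {
        statement.setString(1, value);
        statement.addBatch();
        if (++pending >= MAX_BATCH_SIZE) {
            flush(); // spreads the load instead of one spike per checkpoint
        }
    }

    public void flush() throws Exception {
        if (pending > 0) {
            statement.executeBatch();
            pending = 0;
        }
    }
}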


[jira] [Created] (FLINK-15484) How does Flink CEP compare the current data with yesterday's data as a reference? Is there a better way to fetch the reference data continuously from ES

2020-01-06 Thread smallyin (Jira)
smallyin created FLINK-15484:


 Summary: How does Flink CEP compare the current data with 
yesterday's data as a reference? Is there a better way to fetch the reference 
data continuously from ES
 Key: FLINK-15484
 URL: https://issues.apache.org/jira/browse/FLINK-15484
 Project: Flink
  Issue Type: Improvement
  Components: API / Core
Affects Versions: 1.9.1
Reporter: smallyin






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[DISCUSS] JDBC exactly-once sink

2020-01-06 Thread Roman Khachatryan
Hi everyone,

I'm currently working on exactly-once JDBC sink implementation for Flink.
Any ideas and/or feedback are welcome.

I've considered the following options:
1. Two-phase commit. This is similar to Kafka sink.
XA or database-specific API can be used. In case of XA, each sink subtask
acts as a transaction manager, and each checkpoint-subtask pair corresponds
to an XA transaction (with a single branch)
2. Write-ahead log. This is similar to Cassandra sink.
Transaction metadata needs to be stored in the database along with data to
avoid adding duplicates after recovery.

For some scenarios, WAL might be better, but in general, XA seems to be a
better option.

==
XA vs WAL comparison
==

1. Consistency: XA preferable
WAL: longer inconsistency windows when writing from several sink subtasks

2. Performance and efficiency: XA preferable (depends on the use case)
XA:
- long-running transactions may delay GC and may hold locks (depends on the
use case)
- databases/drivers may have XA implementation issues
WAL:
- double (de)serialization and IO (first to flink state, then to database)
- read-from-state and write-to-database spikes on checkpoint completion
(both options may additionally cause read spikes in the consumer)

3. Database support: XA preferable
XA: the most popular RDBMS support it (at least MySQL, PostgreSQL, MSSQL,
Oracle, DB2, Sybase)
WAL: meta table DDL may differ

4. Operability: depends on the use case
XA:
- increased undo segment (db may need to maintain a view from the
transaction start)
- abandoned-transaction cleanup (abandoned transactions may cause starvation
if, for example, the database blocks duplicate inserts across transactions)
- (jars aren't an issue - most drivers ship XA implementation)
WAL:
- increased intermediate flink state
- need to maintain meta table

5. Simplicity: about the same
XA: more corner cases
WAL: state and meta table management
Both wrap writes into transactions

6. Testing - WAL preferable
XA requires MVCC and proper XA support (no extra jars needed for Derby)

--
Regards,
Roman
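
To make option 1 concrete, here is a minimal, simplified sketch of the XA
lifecycle described above, where each (checkpointId, subtask) pair maps to one
XA transaction: prepare happens when the checkpoint barrier arrives, commit on
the checkpoint-complete notification. All class, table, and method names are
illustrative, not the actual implementation; only the standard
javax.transaction.xa API is used:

import java.nio.ByteBuffer;
import java.sql.PreparedStatement;
import javax.sql.XAConnection;
import javax.sql.XADataSource;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

public class XaLifecycleSketch {

    // Hypothetical minimal Xid: checkpoint id as the global transaction id,
    // subtask index as the branch qualifier.
    static final class CheckpointXid implements Xid {
        private final byte[] gtrid;
        private final byte[] bqual;

        CheckpointXid(long checkpointId, int subtask) {
            gtrid = ByteBuffer.allocate(8).putLong(checkpointId).array();
            bqual = ByteBuffer.allocate(4).putInt(subtask).array();
        }

        @Override
        public int getFormatId() { return 4242; } // arbitrary illustrative id

        @Override
        public byte[] getGlobalTransactionId() { return gtrid; }

        @Override
        public byte[] getBranchQualifier() { return bqual; }
    }

    private final XAConnection xaConnection;
    private Xid currentXid;

    public XaLifecycleSketch(XADataSource dataSource) throws Exception {
        // The XADataSource is driver-provided, e.g. MysqlXADataSource.
        xaConnection = dataSource.getXAConnection();
    }

    // Called when a new checkpoint interval starts.
    public void begin(long checkpointId, int subtask) throws Exception {
        currentXid = new CheckpointXid(checkpointId, subtask);
        xaConnection.getXAResource().start(currentXid, XAResource.TMNOFLAGS);
    }

    // Regular record writes go into the open XA transaction.
    public void write(String value) throws Exception {
        try (PreparedStatement ps = xaConnection.getConnection()
                .prepareStatement("INSERT INTO t (v) VALUES (?)")) {
            ps.setString(1, value);
            ps.executeUpdate();
        }
    }

    // On the checkpoint barrier: end the branch and prepare (phase one).
    public void prepare() throws Exception {
        XAResource xa = xaConnection.getXAResource();
        xa.end(currentXid, XAResource.TMSUCCESS);
        xa.prepare(currentXid);
    }

    // On the checkpoint-complete notification: commit (phase two).
    public void commit() throws Exception {
        xaConnection.getXAResource().commit(currentXid, false /* two-phase */);
    }
}

A prepared-but-uncommitted branch is durable in the database, so after a
failure the sink can rebuild the Xid from the restored checkpoint id and
either commit or roll back, which is what makes the prepare/commit split line
up with Flink's checkpointing.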


[jira] [Created] (FLINK-15483) Support setting environment variables for the jobmanager pod

2020-01-06 Thread Yang Wang (Jira)
Yang Wang created FLINK-15483:
-

 Summary: Support setting environment variables for the jobmanager pod
 Key: FLINK-15483
 URL: https://issues.apache.org/jira/browse/FLINK-15483
 Project: Flink
  Issue Type: Sub-task
  Components: Deployment / Kubernetes
Reporter: Yang Wang


Currently, we can use the `containerized.master.env.` prefix to set 
user-defined environment variables. For YARN, this works correctly. However, it 
does not take effect on Kubernetes deployments.

Some users have tried the Flink native Kubernetes integration and have run into 
this problem. This is nice to have in 1.10 but not a blocker, since the 
environment variables can be set when building the image instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)