[jira] [Created] (FLINK-14849) Can not submit job when use hive connector
Jingsong Lee created FLINK-14849: Summary: Can not submit job when use hive connector Key: FLINK-14849 URL: https://issues.apache.org/jira/browse/FLINK-14849 Project: Flink Issue Type: Bug Components: Connectors / Hive Reporter: Jingsong Lee
{code:java}
With:
org.apache.hive
hive-exec
3.1.1

Caused by: java.lang.ClassCastException: org.codehaus.janino.CompilerFactory cannot be cast to org.codehaus.commons.compiler.ICompilerFactory
	at org.codehaus.commons.compiler.CompilerFactoryFactory.getCompilerFactory(CompilerFactoryFactory.java:129)
	at org.codehaus.commons.compiler.CompilerFactoryFactory.getDefaultCompilerFactory(CompilerFactoryFactory.java:79)
	at org.apache.calcite.rel.metadata.JaninoRelMetadataProvider.compile(JaninoRelMetadataProvider.java:432)
	... 68 more
{code}
After https://issues.apache.org/jira/browse/FLINK-13749 , flink-client uses the default child-first resolve order. If the user jar contains conflicting dependencies, errors like the one above can occur. -- This message was sent by Atlassian Jira (v8.3.4#803005)
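A common workaround for this kind of conflict is to fall back to parent-first class loading on the client, so that the janino/commons-compiler classes shipped with Flink win over the copies bundled in hive-exec. This is a sketch of the relevant flink-conf.yaml setting, not necessarily the final fix for this ticket:

```yaml
# flink-conf.yaml: revert the child-first default introduced by FLINK-13749
classloader.resolve-order: parent-first
```

Relocating (shading) janino inside the user jar would avoid the conflict as well.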
[jira] [Created] (FLINK-14848) BaseRowSerializer.toBinaryRow wrongly process null for objects with variable-length part
Zhenghua Gao created FLINK-14848: Summary: BaseRowSerializer.toBinaryRow wrongly process null for objects with variable-length part Key: FLINK-14848 URL: https://issues.apache.org/jira/browse/FLINK-14848 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.10.0 Reporter: Zhenghua Gao For fixed-length objects, the writer calls setNullAt() to update the fixed-length part (which sets the null bit and initializes the fixed-length slot with 0). For variable-length objects, calling setNullAt() only updates the fixed-length part; the writer also needs to assign and initialize the variable-length part.
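The difference can be sketched with a simplified binary layout: a null bitmap followed by one 8-byte slot per field, where variable-length fields store an (offset, length) pointer in their slot. The layout and names below are hypothetical (this is NOT Flink's actual BinaryRowWriter), but they illustrate why setting only the null bit is not enough for variable-length fields:

```java
import java.nio.ByteBuffer;

// Sketch of a binary row's fixed-length part: [8-byte null bitmap][8-byte slot per field].
// Fixed-length values live directly in the slot; variable-length values store
// (offset << 32 | length) pointing into a separate variable-length region.
// If setNullAt() only set the null bit, a reused row could keep a stale
// (offset, length) pointer, so equal rows would not be binary-equal.
public class NullSlotSketch {
    final byte[] fixedPart;

    NullSlotSketch(int fieldCount) {
        this.fixedPart = new byte[8 + 8 * fieldCount];
    }

    void writeVarLenPointer(int pos, int offset, int length) {
        ByteBuffer.wrap(fixedPart).putLong(8 + 8 * pos, ((long) offset << 32) | length);
    }

    void setNullAt(int pos) {
        fixedPart[pos >>> 3] |= (byte) (1 << (pos & 7));      // set the null bit
        ByteBuffer.wrap(fixedPart).putLong(8 + 8 * pos, 0L);  // and zero the slot
    }

    boolean isNullAt(int pos) {
        return (fixedPart[pos >>> 3] & (1 << (pos & 7))) != 0;
    }

    long slot(int pos) {
        return ByteBuffer.wrap(fixedPart).getLong(8 + 8 * pos);
    }
}
```

Zeroing the slot in setNullAt() guarantees that a field which previously held a variable-length value and is then set to null leaves no stale pointer behind.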
[jira] [Created] (FLINK-14847) Support retrieving Hive PK constraints
Rui Li created FLINK-14847: -- Summary: Support retrieving Hive PK constraints Key: FLINK-14847 URL: https://issues.apache.org/jira/browse/FLINK-14847 Project: Flink Issue Type: Task Components: Connectors / Hive Reporter: Rui Li
[jira] [Created] (FLINK-14846) Correct the default writerbuffer size documentation of RocksDB
Yun Tang created FLINK-14846: Summary: Correct the default writerbuffer size documentation of RocksDB Key: FLINK-14846 URL: https://issues.apache.org/jira/browse/FLINK-14846 Project: Flink Issue Type: Bug Components: Runtime / State Backends Reporter: Yun Tang Fix For: 1.10.0 When {{RocksDBConfigurableOptions}} was introduced, the default write buffer size was taken from RocksDB's javadoc. Unfortunately, RocksDB's official javadoc stated it incorrectly as {{4MB}} for a long time, until I created a [PR|https://github.com/facebook/rocksdb/pull/5670] to correct it. As a result, [our description|https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html#state-backend-rocksdb-writebuffer-size] of the default write-buffer size is also incorrect; we should fix it to avoid misleading users.
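For reference, RocksDB's actual default write buffer size is 64 MB, not the 4 MB the javadoc used to claim. Until the documentation is fixed, users can make the intended size explicit via the option from {{RocksDBConfigurableOptions}} (a flink-conf.yaml sketch):

```yaml
# Set the per-column-family write buffer size explicitly rather than relying
# on the (mis-documented) default; RocksDB's real default is 64 MB.
state.backend.rocksdb.writebuffer.size: 64mb
```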
Re: [DISCUSS] Support configure remote flink jar
There is a related use case (not specific to HDFS) that I came across: It would be nice if the jar upload endpoint could accept the URL of a jar file as alternative to the jar file itself. Such URL could point to an artifactory or distributed file system. Thomas On Mon, Nov 18, 2019 at 7:40 PM Yang Wang wrote: > Hi tison, > > Thanks for your starting this discussion. > * For user customized flink-dist jar, it is an useful feature. Since it > could avoid to upload the flink-dist jar > every time. Especially in production environment, it could accelerate the > submission process. > * For the standard flink-dist jar, FLINK-13938[1] could solve > the problem.Upload a official flink release > binary to distributed storage(hdfs) first, and then all the submission > could benefit from it. Users could > also upload the customized flink-dist jar to accelerate their submission. > > If the flink-dist jar could be specified to a remote path, maybe the user > jar have the same situation. > > [1]. https://issues.apache.org/jira/browse/FLINK-13938 > > tison 于2019年11月19日周二 上午11:17写道: > > > Hi forks, > > > > Recently, our customers ask for a feature configuring remote flink jar. > > I'd like to reach to you guys > > to see whether or not it is a general need. > > > > ATM Flink only supports configures local file as flink jar via `-yj` > > option. If we pass a HDFS file > > path, due to implementation detail it will fail with > > IllegalArgumentException. In the story we support > > configure remote flink jar, this limitation is eliminated. We also make > > use of YARN locality so that > > reducing uploading overhead, instead, asking YARN to localize the jar on > > AM container started. > > > > Besides, it possibly has overlap with FLINK-13938. I'd like to put the > > discussion on our > > mailing list first. > > > > Are you looking forward to such a feature? 
> > > > @Yang Wang: this feature is different from that we discussed offline, it > > only focuses on flink jar, not > > all ship files. > > > > Best, > > tison. > > >
Re: [DISCUSS] Support configure remote flink jar
Hi tison, Thanks for starting this discussion. * For a user-customized flink-dist jar, it is a useful feature, since it avoids uploading the flink-dist jar every time; especially in a production environment, it could accelerate the submission process. * For the standard flink-dist jar, FLINK-13938[1] could solve the problem. Upload an official Flink release binary to distributed storage (HDFS) first, and then all submissions can benefit from it. Users could also upload a customized flink-dist jar to accelerate their submissions. If the flink-dist jar can be specified as a remote path, the same may apply to the user jar. [1]. https://issues.apache.org/jira/browse/FLINK-13938 tison wrote on Tue, Nov 19, 2019 at 11:17 AM: > Hi forks, > > Recently, our customers ask for a feature configuring remote flink jar. > I'd like to reach to you guys > to see whether or not it is a general need. > > ATM Flink only supports configures local file as flink jar via `-yj` > option. If we pass a HDFS file > path, due to implementation detail it will fail with > IllegalArgumentException. In the story we support > configure remote flink jar, this limitation is eliminated. We also make > use of YARN locality so that > reducing uploading overhead, instead, asking YARN to localize the jar on > AM container started. > > Besides, it possibly has overlap with FLINK-13938. I'd like to put the > discussion on our > mailing list first. > > Are you looking forward to such a feature? > > @Yang Wang: this feature is different from that we discussed offline, it > only focuses on flink jar, not > all ship files. > > Best, > tison. >
Re: [VOTE] FLIP-83: Flink End-to-end Performance Testing Framework
+1 (non-binding) It is great to have a new end-to-end test framework, even if it is only for performance tests now. Best, Yang Jingsong Li wrote on Tue, Nov 19, 2019 at 9:54 AM: > +1 (non-binding) > > Best, > Jingsong Lee > > On Mon, Nov 18, 2019 at 7:59 PM Becket Qin wrote: > > > +1 (binding) on having the test suite. > > > > BTW, it would be good to have a few more details about the performance > > tests. For example: > > 1. How do the testing records look like? The size and key distributions. > > 2. The resources for each task. > > 3. The intended configuration for the jobs. > > 4. What exact source and sink it would use. > > > > Thanks, > > > > Jiangjie (Becket) Qin > > > > On Mon, Nov 18, 2019 at 7:25 PM Zhijiang > .invalid> > > wrote: > > > > > +1 (binding)! > > > > > > It is a good thing to enhance our testing work. > > > > > > Best, > > > Zhijiang > > > > > > > > > -- > > > From:Hequn Cheng > > > Send Time:2019 Nov. 18 (Mon.) 18:22 > > > To:dev > > > Subject:Re: [VOTE] FLIP-83: Flink End-to-end Performance Testing > > Framework > > > > > > +1 (binding)! > > > I think this would be very helpful to detect regression problems. > > > > > > Best, Hequn > > > > > > On Mon, Nov 18, 2019 at 4:28 PM vino yang > wrote: > > > > > > > +1 (non-binding) > > > > > > > > Best, > > > > Vino > > > > > > > > jincheng sun wrote on Mon, Nov 18, 2019 at 2:31 PM: > > > > > > > > > +1 (binding) > > > > > > > > > > OpenInx wrote on Mon, Nov 18, 2019 at 12:09 PM: > > > > > > > > > > > +1 (non-binding) > > > > > > > > > > > > On Mon, Nov 18, 2019 at 11:54 AM aihua li > > > > > wrote: > > > > > > > > > > > > > +1 (non-binding) > > > > > > > > > > > > > > Thanks Yu Li for driving on this. > > > > > > > > > > > > > > > On Nov 15, 2019, at 8:10 PM, Yu Li wrote: > > > > > > > > > > > > > > > > Hi All, > > > > > > > > > > > > > > > > I would like to start the vote for FLIP-83 [1] which is > > discussed > > > > and > > > > > > > > reached consensus in the discussion thread [2]. 
> > > > > > > > > > > > > > > > The vote will be open for at least 72 hours (excluding > > weekend). > > > > I'll > > > > > > try > > > > > > > > to close it by 2019-11-20 21:00 CST, unless there is an > > objection > > > > or > > > > > > not > > > > > > > > enough votes. > > > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework > > > > > > > > [2] https://s.apache.org/7fqrz > > > > > > > > > > > > > > > > Best Regards, > > > > > > > > Yu > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > Best, Jingsong Lee >
[jira] [Created] (FLINK-14845) Introduce data compression to blocking shuffle.
Yingjie Cao created FLINK-14845: --- Summary: Introduce data compression to blocking shuffle. Key: FLINK-14845 URL: https://issues.apache.org/jira/browse/FLINK-14845 Project: Flink Issue Type: Improvement Components: Runtime / Network Reporter: Yingjie Cao Currently, the blocking shuffle writer writes raw output data to disk without compression. For IO-bound scenarios, this can be optimized by compressing the output data. It is better to introduce a compression mechanism and offer a config option to let the user decide whether to compress the shuffle data. Actually, we have implemented compression in our internal Flink version, and here are some key points:
1. Where to compress/decompress? Compress at the upstream side and decompress at the downstream side.
2. Which threads do the compression/decompression? The task threads.
3. Data compression granularity? Per buffer.
4. How to handle data that becomes even bigger after compression? Give up compression in that case and introduce an extra flag to identify whether the data was compressed, i.e. the output may be a mixture of compressed and uncompressed data.
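The per-buffer framing with a give-up-on-expansion flag can be sketched as follows. This is a simplified, hypothetical illustration using java.util.zip, not the actual code from the internal version mentioned above — the first byte of each serialized buffer tells the reader whether the payload is compressed:

```java
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Per-buffer compression with a one-byte flag. If the compressed form would be
// as large as (or larger than) the original, we give up and store the raw bytes,
// so the output stream may mix compressed and uncompressed buffers.
public class BufferCompressor {
    static final byte UNCOMPRESSED = 0;
    static final byte COMPRESSED = 1;

    static byte[] compress(byte[] buffer) {
        Deflater deflater = new Deflater();
        deflater.setInput(buffer);
        deflater.finish();
        byte[] out = new byte[buffer.length];   // never allow the output to grow
        int n = deflater.deflate(out);
        boolean gaveUp = !deflater.finished();  // compressed size >= original size
        deflater.end();
        byte[] payload = gaveUp ? buffer : Arrays.copyOf(out, n);
        byte[] framed = new byte[payload.length + 1];
        framed[0] = gaveUp ? UNCOMPRESSED : COMPRESSED;
        System.arraycopy(payload, 0, framed, 1, payload.length);
        return framed;
    }

    static byte[] decompress(byte[] framed, int originalLength) {
        if (framed[0] == UNCOMPRESSED) {
            return Arrays.copyOfRange(framed, 1, framed.length);
        }
        try {
            Inflater inflater = new Inflater();
            inflater.setInput(framed, 1, framed.length - 1);
            byte[] out = new byte[originalLength];
            inflater.inflate(out);
            inflater.end();
            return out;
        } catch (DataFormatException e) {
            throw new RuntimeException("corrupt compressed buffer", e);
        }
    }
}
```

In a real shuffle the original buffer length would come from the buffer header; it is passed explicitly here for brevity.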
[DISCUSS] Support configure remote flink jar
Hi folks, Recently, our customers asked for a feature to configure a remote flink jar. I'd like to reach out to you to see whether it is a general need. At the moment, Flink only supports configuring a local file as the flink jar via the `-yj` option. If we pass an HDFS file path, it fails with an IllegalArgumentException due to an implementation detail. In the story where we support configuring a remote flink jar, this limitation is eliminated. We also make use of YARN locality to reduce uploading overhead: instead of uploading, we ask YARN to localize the jar when the AM container starts. Besides, this possibly overlaps with FLINK-13938, so I'd like to put the discussion on our mailing list first. Are you looking forward to such a feature? @Yang Wang: this feature is different from what we discussed offline; it only focuses on the flink jar, not all ship files. Best, tison.
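To make the limitation concrete, here is what the two cases could look like on the CLI (the paths and jar names below are hypothetical; today the second command fails with IllegalArgumentException):

```
# Today: the flink jar passed via -yj must be a local file
flink run -m yarn-cluster -yj /opt/flink/lib/flink-dist.jar WordCount.jar

# With this proposal: a remote path is accepted and YARN localizes the jar
# on the AM container instead of the client uploading it
flink run -m yarn-cluster -yj hdfs:///flink/flink-dist.jar WordCount.jar
```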
[jira] [Created] (FLINK-14844) ConnectorCatalogTable should not patch the row type with TimeAttributes for batch mode
Danny Chen created FLINK-14844: -- Summary: ConnectorCatalogTable should not patch the row type with TimeAttributes for batch mode Key: FLINK-14844 URL: https://issues.apache.org/jira/browse/FLINK-14844 Project: Flink Issue Type: Improvement Components: Table SQL / API Affects Versions: 1.9.1 Reporter: Danny Chen Fix For: 1.10.0 In the current code, ConnectorCatalogTable patches up the row type with time attributes for both batch and stream mode. This is wrong, because batch mode does not need that.
Re: [DISCUSS] FLIP-72: Introduce Pulsar Connector
Hi everyone, I've put the catalog part design in a separate doc with more details for easier communication. https://docs.google.com/document/d/1LMnABtXn-wQedsmWv8hopvx-B-jbdr8-jHbIiDhdsoE/edit?usp=sharing I would love to hear your thoughts on this. Best, Yijie On Mon, Oct 21, 2019 at 11:15 AM Yijie Shen wrote: > Hi everyone, > > Glad to receive your valuable feedbacks. > > I'd first separate the Pulsar catalog as another doc and show more design > and implementation details there. > > For the current FLIP-72, I would separate it into the sink part for > current work and keep the source part as future works until we reach > FLIP-27 finals. > > I also reply to some of the comments in the design doc. I will rewrite the > catalog part in regarding to Bowen's advice in both email and comments. > > Thanks for the help again. > > Best, > Yijie > > On Fri, Oct 18, 2019 at 12:40 AM Rong Rong wrote: > >> Hi Yijie, >> >> I also agree with Jark on separating the Catalog part into another FLIP. >> >> With FLIP-27[1] also in the air, it is also probably great to split and >> unblock the sink implementation contribution. >> I would suggest either putting in a detail implementation plan section in >> the doc, or (maybe too much separation?) splitting them into different >> FLIPs. What do you guys think? >> >> -- >> Rong >> >> [1] >> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface >> >> On Wed, Oct 16, 2019 at 9:00 PM Jark Wu wrote: >> >> > Hi Yijie, >> > >> > Thanks for the design document. I agree with Bowen that the catalog part >> > needs more details. >> > And I would suggest to separate Pulsar Catalog as another FLIP. IMO, it >> has >> > little to do with source/sink. >> > Having a separate FLIP can unblock the contribution for sink (or source) >> > and keep the discussion more focus. >> > I also left some comments in the documentation. 
>> > >> > Thanks, >> > Jark >> > >> > On Thu, 17 Oct 2019 at 11:24, Yijie Shen >> > wrote: >> > >> > > Hi Bowen, >> > > >> > > Thanks for your comments. I'll add catalog details as you suggested. >> > > >> > > One more question: since we decide to not implement source part of the >> > > connector at the moment. >> > > What can users do with a Pulsar catalog? >> > > Create a table backed by Pulsar and check existing pulsar tables to >> see >> > > their schemas? Drop tables maybe? >> > > >> > > Best, >> > > Yijie >> > > >> > > On Thu, Oct 17, 2019 at 1:04 AM Bowen Li wrote: >> > > >> > > > Hi Yijie, >> > > > >> > > > Per the discussion, maybe you can move pulsar source to 'future >> work' >> > > > section in the FLIP for now? >> > > > >> > > > Besides, the FLIP seems to be quite rough at the moment, and I'd >> > > recommend >> > > > to add more details . >> > > > >> > > > A few questions mainly regarding the proposed pulsar catalog. >> > > > >> > > >- Can you provide some background of pulsar schema registry and >> how >> > it >> > > >works? >> > > >- The proposed design of pulsar catalog is very vague now, can >> you >> > > >share some details of how a pulsar catalog would work internally? >> > E.g. >> > > > - which APIs does it support exactly? E.g. I see from your >> > > > prototype that table creation is supported but not alteration. >> > > > - is it going to connect to a pulsar schema registry via a >> http >> > > > client or a pulsar client, etc >> > > > - will it be able to handle multiple versions of pulsar, or >> just >> > > > one? How is compatibility handles between different >> Flink-Pulsar >> > > versions? >> > > > - will it support only reading from pulsar schema registry , >> or >> > > > both read/write? Will it work end-to-end in Flink SQL for >> users >> > to >> > > create >> > > > and manipulate a pulsar table such as "CREATE TABLE t WITH >> > > > PROPERTIES(type=pulsar)" and "DROP TABLE t"? 
>> > > > - Is a pulsar topic always gonna be a non-partitioned table? >> How >> > is >> > > > a partitioned topic mapped to a Flink table? >> > > >- How to map Flink's catalog/database namespace to pulsar's >> > > >multi-tenant namespaces? I'm not very familiar with how multi >> > tenancy >> > > works >> > > >in pulsar, and some background context/use cases may help here >> too. >> > > E.g. >> > > > - can a pulsar client/consumer/producer be multiple-tenant at >> the >> > > > same time? >> > > > - how does authentication work in pulsar's multi-tenancy and >> the >> > > > catalog? asking since I didn't see the proposed pulsar catalog >> > has >> > > > username/password configs >> > > > - the FLIP seems propose mapping a pulsar cluster and >> > > > 'tenant/namespace' respectively to Flink's 'catalog' and >> > > 'database'. I >> > > > wonder whether it totally makes sense, or should we actually >> map >> > > "tenant" >> > > > to "catalog", and "namespace" to "database"? >> > > > >>
Re: [VOTE] FLIP-83: Flink End-to-end Performance Testing Framework
+1 (non-binding) Best, Jingsong Lee On Mon, Nov 18, 2019 at 7:59 PM Becket Qin wrote: > +1 (binding) on having the test suite. > > BTW, it would be good to have a few more details about the performance > tests. For example: > 1. How do the testing records look like? The size and key distributions. > 2. The resources for each task. > 3. The intended configuration for the jobs. > 4. What exact source and sink it would use. > > Thanks, > > Jiangjie (Becket) Qin > > On Mon, Nov 18, 2019 at 7:25 PM Zhijiang .invalid> > wrote: > > > +1 (binding)! > > > > It is a good thing to enhance our testing work. > > > > Best, > > Zhijiang > > > > > > -- > > From:Hequn Cheng > > Send Time:2019 Nov. 18 (Mon.) 18:22 > > To:dev > > Subject:Re: [VOTE] FLIP-83: Flink End-to-end Performance Testing > Framework > > > > +1 (binding)! > > I think this would be very helpful to detect regression problems. > > > > Best, Hequn > > > > On Mon, Nov 18, 2019 at 4:28 PM vino yang wrote: > > > > > +1 (non-binding) > > > > > > Best, > > > Vino > > > > > > jincheng sun 于2019年11月18日周一 下午2:31写道: > > > > > > > +1 (binding) > > > > > > > > OpenInx 于2019年11月18日周一 下午12:09写道: > > > > > > > > > +1 (non-binding) > > > > > > > > > > On Mon, Nov 18, 2019 at 11:54 AM aihua li > > > wrote: > > > > > > > > > > > +1 (non-binding) > > > > > > > > > > > > Thanks Yu Li for driving on this. > > > > > > > > > > > > > 在 2019年11月15日,下午8:10,Yu Li 写道: > > > > > > > > > > > > > > Hi All, > > > > > > > > > > > > > > I would like to start the vote for FLIP-83 [1] which is > discussed > > > and > > > > > > > reached consensus in the discussion thread [2]. > > > > > > > > > > > > > > The vote will be open for at least 72 hours (excluding > weekend). > > > I'll > > > > > try > > > > > > > to close it by 2019-11-20 21:00 CST, unless there is an > objection > > > or > > > > > not > > > > > > > enough votes. 
> > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework > > > > > > > [2] https://s.apache.org/7fqrz > > > > > > > > > > > > > > Best Regards, > > > > > > > Yu > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- Best, Jingsong Lee
Re: [VOTE] FLIP-79: Flink Function DDL Support (1.10 Release Feature Only)
Hi Dawid, As the title of this thread says, only features related to the 1.10 release (MVP) are accepted; the other parts are still under discussion. I updated the FLIP's status field to reflect that. Also added it to https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals I will help to shepherd this effort in the 1.10 release. Thanks, Bowen On Mon, Nov 18, 2019 at 8:59 AM Peter Huang wrote: > Hi David, > > Sorry for my misbehavior. FLINK-7151 should be the parent tickets of the > function DDL effort. I recently created several tickets of it, currently > including > https://issues.apache.org/jira/browse/FLINK-14055 > https://issues.apache.org/jira/browse/FLINK-14711 > https://issues.apache.org/jira/browse/FLINK-14837 > https://issues.apache.org/jira/browse/FLINK-14841 > > For the FLIP status, I think probably Bowen can add more context here. > > Best Regards > Peter Huang > > > > > > > > > On Mon, Nov 18, 2019 at 5:04 AM Dawid Wysakowicz > wrote: > >> Hey, >> >> I've just realized this FLIP was accepted but it was not reflected in >> the wiki page. >> >> Moreover I could not find a single jira issue that would link to the >> FLIP. I've seen though a few other issues and PRs being opened regarding >> this FLIP (FLINK-7151?, FLINK-14837, FLINK-14055). I was also a bit >> confused with the relationship between these issues and e.g. >> FLINK-10232. I tried to clarify a bit the way I understood it, but maybe >> somebody more familiar with the whole story could improve it further. >> >> Could some of the committers that voted on this proposal help with a bit >> of shepherding? >> >> Best, >> >> Dawid >> >> On 12/11/2019 07:59, Yu Li wrote: >> > Thanks Bowen and Peter! Great to know. >> > >> > Best Regards, >> > Yu >> > >> > >> > On Tue, 12 Nov 2019 at 13:38, Peter Huang >> > wrote: >> > >> >> Thank you in advance. Once the PR is done from my side, I will @ you >> and >> >> yu for review. 
>> >> >> >> On Mon, Nov 11, 2019 at 9:15 PM Bowen Li wrote: >> >> >> >>> Thanks Peter! I assigned the ticket to you, and I can help with >> reviewing >> >>> and merging PRs of this FLIP. >> >>> >> >>> Ccing 1.10 release manager Yu that we are on track of FLIP-79. >> >>> >> >>> Cheers, >> >>> Bowen >> >>> >> >>> On Mon, Nov 11, 2019 at 9:03 PM Peter Huang < >> huangzhenqiu0...@gmail.com> >> >>> wrote: >> >>> >> Thanks, everyone for joining the discussion and giving feedback!. The >> voting time for FLIP-79 has passed. I'm closing the vote now. >> >> There were seven +1 votes, 3 of which are binding: >> - Bowen Li (binding) >> - Kurt Yong (binding) >> - Shuyi Chen (binding) >> >> - Terry Wang (non-binding) >> - Xuefu Zhang (non-binding) >> - Vino Yang (non-binding) >> - Jingsong Lee (non-binding) >> >> There were no disapproving votes. Thus, FLIP-79 has been accepted. >> Now, >> >>> we >> agreed on the function DDL syntax as listed on the FLIP >> and also the initial execution plan for release 1.10. Currently, I >> will >> mainly work on these two Jira tickets. 
>> >> 1) https://issues.apache.org/jira/browse/FLINK-7151 >> 2) https://issues.apache.org/jira/browse/FLINK-14711 >> >> >> >> Best Regards >> Peter Huang >> >> >> Best Regards >> Peter Huang >> >> On Mon, Nov 11, 2019 at 7:42 PM Jingsong Li >> wrote: >> >> > +1 (non-binding) >> > >> > Best, >> > Jingsong Lee >> > >> > On Tue, Nov 12, 2019 at 9:49 AM vino yang >> >>> wrote: >> >> +1 (non-binding) >> >> >> >> Best, >> >> Vino >> >> >> >> Xuefu Z 于2019年11月12日周二 上午3:27写道: >> >> >> >>> +1 (non-binding) >> >>> >> >>> On Mon, Nov 11, 2019 at 9:54 AM Shuyi Chen >> wrote: >> +1 (binding) >> >> On Sat, Nov 9, 2019 at 11:17 PM Kurt Young >> wrote: >> > +1 (binding) >> > >> > Best, >> > Kurt >> > >> > >> > On Sun, Nov 10, 2019 at 12:25 PM Peter Huang < >> >>> huangzhenqiu0...@gmail.com >> > wrote: >> > >> >> Hi Yu, >> >> >> >> Thanks for your reminder about the timeline of delivering >> >>> the >> > basic >> >> function DDL in release 1.10. >> >> As I replied to Xuefu, the "CREATE FUNCTION" and "DROP >> FUNCTION" >> >> can >> >> relatively easy achieve by revising the existing PR. >> >> Definitely, I probably need to start to work on a basic >> >>> version >> > of >> >> PR >> for >> >> "ALTER FUNCTION" and "SHOW FUNCTIONS". >> >> Please let me know if you have any suggestion to better >> >>> align >> the >> > timeline >> >> of the ongoing catalog related efforts. >> >> >> >> Best Regards >> >> Peter Huang >> >> >> >> >> >> On Sat, Nov 9, 2019 at 7:26 PM
[jira] [Created] (FLINK-14843) Streaming bucketing end-to-end test can fail with Output hash mismatch
Gary Yao created FLINK-14843: Summary: Streaming bucketing end-to-end test can fail with Output hash mismatch Key: FLINK-14843 URL: https://issues.apache.org/jira/browse/FLINK-14843 Project: Flink Issue Type: Bug Components: Connectors / FileSystem, Tests Affects Versions: 1.10.0 Environment: rev: dcc1330375826b779e4902176bb2473704dabb11 Reporter: Gary Yao *Description* Streaming bucketing end-to-end test ({{test_streaming_bucketing.sh}}) can fail with Output hash mismatch. {noformat} Number of running task managers has reached 4. Job (67212178694f8b2a9bc9d9572567a53f) is running. Waiting until all values have been produced Truncating buckets Number of produced values 26325/6 Truncating buckets Number of produced values 31315/6 Truncating buckets Number of produced values 36735/6 Truncating buckets Number of produced values 40705/6 Truncating buckets Number of produced values 46125/6 Truncating buckets Number of produced values 51135/6 Truncating buckets Number of produced values 56555/6 Truncating buckets Number of produced values 61935/6 Cancelling job 67212178694f8b2a9bc9d9572567a53f. Cancelled job 67212178694f8b2a9bc9d9572567a53f. Waiting for job (67212178694f8b2a9bc9d9572567a53f) to reach terminal state CANCELED ... Job (67212178694f8b2a9bc9d9572567a53f) reached terminal state CANCELED Job 67212178694f8b2a9bc9d9572567a53f was cancelled, time to verify FAIL Bucketing Sink: Output hash mismatch. Got 4e2d1859e41184a38e5bc95090fe9941, expected 01aba5ff77a0ef5e5cf6a727c248bdc3. head hexdump of actual: 000 ( 2 , 1 0 , 0 , S o m e p a y 010 l o a d . . . ) \n ( 2 , 1 0 , 1 020 , S o m e p a y l o a d . . . 030 ) \n ( 2 , 1 0 , 2 , S o m e p 040 a y l o a d . . . ) \n ( 2 , 1 0 050 , 3 , S o m e p a y l o a d . 060 . . ) \n ( 2 , 1 0 , 4 , S o m e 070 p a y l o a d . . . ) \n ( 2 , 080 1 0 , 5 , S o m e p a y l o a 090 d . . . ) \n ( 2 , 1 0 , 6 , S o 0a0 m e p a y l o a d . . . ) \n ( 0b0 2 , 1 0 , 7 , S o m e p a y l 0c0 o a d . . . 
) \n ( 2 , 1 0 , 8 , 0d0 S o m e p a y l o a d . . . ) 0e0 \n ( 2 , 1 0 , 9 , S o m e p a 0f0 y l o a d . . . ) \n 0fa
Stopping taskexecutor daemon (pid: 654547) on host gyao-desktop.
Stopping standalonesession daemon (pid: 650368) on host gyao-desktop.
Stopping taskexecutor daemon (pid: 650812) on host gyao-desktop.
Skipping taskexecutor daemon (pid: 651347), because it is not running anymore on gyao-desktop.
Skipping taskexecutor daemon (pid: 651795), because it is not running anymore on gyao-desktop.
Skipping taskexecutor daemon (pid: 652249), because it is not running anymore on gyao-desktop.
Stopping taskexecutor daemon (pid: 653481) on host gyao-desktop.
Stopping taskexecutor daemon (pid: 654099) on host gyao-desktop.
[FAIL] Test script contains errors.
Checking of logs skipped.
[FAIL] 'flink-end-to-end-tests/test-scripts/test_streaming_bucketing.sh' failed after 2 minutes and 3 seconds! Test exited with exit code 1
{noformat}
*How to reproduce*
Comment out the delay of 10s after the 1st TM is restarted to provoke the issue:
{code:bash}
echo "Restarting 1 TM"
$FLINK_DIR/bin/taskmanager.sh start
wait_for_number_of_running_tms 4
#sleep 10

echo "Killing 2 TMs"
kill_random_taskmanager
kill_random_taskmanager
wait_for_number_of_running_tms 2
{code}
Command to run the test:
{noformat}
FLINK_DIR=build-target/ flink-end-to-end-tests/run-single-test.sh skip flink-end-to-end-tests/test-scripts/test_streaming_bucketing.sh
{noformat}
RE: SQL for Avro GenericRecords on Parquet
Hi Peter. Thanks. This is my code. I used one of the parquet / avro tests as a reference. The code will fail on:

Test testScan(ParquetTestCase) failed with:
java.lang.UnsupportedOperationException
	at org.apache.parquet.filter2.recordlevel.IncrementallyUpdatedFilterPredicate$ValueInspector.update(IncrementallyUpdatedFilterPredicate.java:71)
	at org.apache.parquet.filter2.recordlevel.FilteringPrimitiveConverter.addLong(FilteringPrimitiveConverter.java:105)
	at org.apache.parquet.column.impl.ColumnReaderImpl$2$4.writeValue(ColumnReaderImpl.java:268)

CODE:

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.generic.GenericRecordBuilder;
import org.apache.avro.specific.SpecificRecord;
import org.apache.avro.specific.SpecificRecordBuilderBase;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.io.ParallelIteratorInputFormat;
import org.apache.flink.api.java.io.TupleCsvInputFormat;
import org.apache.flink.api.java.tuple.Tuple;
import org.apache.flink.core.fs.FileSystem;
import org.apache.flink.core.fs.Path;
import org.apache.flink.formats.parquet.ParquetTableSource;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.PrintSinkFunction;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.java.BatchTableEnvironment;
import org.apache.flink.table.api.java.StreamTableEnvironment;
import org.apache.flink.table.sinks.CsvTableSink;
import org.apache.flink.table.sinks.TableSink;
import org.apache.flink.test.util.MultipleProgramsTestBase;
import org.apache.flink.types.Row;
import org.apache.avro.generic.IndexedRecord;
import org.apache.parquet.avro.AvroSchemaConverter;
import org.apache.parquet.schema.MessageType;
import org.junit.BeforeClass;
import org.junit.ClassRule;
import org.junit.Test;
import org.junit.rules.TemporaryFolder;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

import static org.junit.Assert.assertEquals;

import org.apache.parquet.avro.AvroParquetWriter;
import org.apache.parquet.hadoop.ParquetWriter;

public class ParquetTestCase extends MultipleProgramsTestBase {

    private static String avroSchema = "{\n" +
            " \"name\": \"SimpleRecord\",\n" +
            " \"type\": \"record\",\n" +
            " \"fields\": [\n" +
            "{ \"default\": null, \"name\": \"timestamp_edr\", \"type\": [ \"null\", \"long\" ]},\n" +
            "{ \"default\": null, \"name\": \"id\", \"type\": [ \"null\", \"long\" ]},\n" +
            "{ \"default\": null, \"name\": \"recordType_\", \"type\": [ \"null\", \"string\"]}\n" +
            " ],\n" +
            " \"schema_id\": 1,\n" +
            " \"type\": \"record\"\n" +
            "}";

    private static final AvroSchemaConverter SCHEMA_CONVERTER = new AvroSchemaConverter();
    private static Schema schm = new Schema.Parser().parse(avroSchema);
    private static Path testPath;

    public ParquetTestCase() {
        super(TestExecutionMode.COLLECTION);
    }

    @BeforeClass
    public static void setup() throws Exception {
        GenericRecordBuilder genericRecordBuilder = new GenericRecordBuilder(schm);
        List recs = new ArrayList<>();
        for (int i = 0; i < 6; i++) {
            GenericRecord gr = genericRecordBuilder.set("timestamp_edr", System.currentTimeMillis() / 1000).set("id", 333L).set("recordType_", "Type1").build();
            recs.add(gr);
            GenericRecord gr2 = genericRecordBuilder.set("timestamp_edr", System.currentTimeMillis() / 1000).set("id", 22L).set("recordType_", "Type2").build();
            recs.add(gr2);
        }
        testPath = new Path("/tmp", UUID.randomUUID().toString());
        ParquetWriter writer = AvroParquetWriter.builder(
                new org.apache.hadoop.fs.Path(testPath.toUri())).withSchema(schm).build();
        for (IndexedRecord record : recs) {
            writer.write(record);
        }
        writer.close();
    }

    private ParquetTableSource createParquetTableSource(Path path) throws IOException {
        MessageType nestedSchema = SCHEMA_CONVERTER.convert(schm);
        ParquetTableSource parquetTableSource = ParquetTableSource.builder()
                .path(path.getPath())
                .forParquetSchema(nestedSchema)
                .build();
        return parquetTableSource;
    }

    @Test
    public void testScan() throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        BatchTableEnvironment batchTableEnvironment = BatchTableEnvironment.create(env);
Re: [DISCUSSION] Kafka Metrics Reporter
Hi all, Glad to see this topic in the community. We at Alibaba also implemented a Kafka metrics reporter about half a year ago and extended it to other message queues, such as Alibaba Cloud Log Service [1]. The reason we did not start a similar discussion is that we previously thought we would only be providing a way to report metrics to Kafka. Unlike the currently supported metrics reporters, e.g. InfluxDB and Graphite, which all have an easy-to-use Grafana data source for visualizing metrics, a Kafka metrics reporter still needs another component to consume the data and act as a data source for an observability platform, and that component would differ from company to company. I think this is the main concern about including it in a popular open-source main repo, and I largely agree with Becket's suggestion to contribute this as a flink-package, so that we could offer an end-to-end solution including how to visualize the metrics data. [1] https://www.alibabacloud.com/help/doc-detail/29003.htm Best Yun Tang On 11/18/19, 8:19 AM, "Becket Qin" wrote: Hi Gyula, Thanks for bringing this up. It is a useful addition to have a Kafka metrics reporter. I understand that we already have Prometheus and DataDog reporters in the Flink main repo. However, personally speaking, I would slightly prefer to have the Kafka metrics reporter as an ecosystem project instead of in the main repo, for the following reasons: 1. To keep core Flink more focused. In general, if a component is more relevant to an external system than to Flink itself, it is good to keep it as an ecosystem project, and a metrics reporter seems a good example of that. 2. It helps encourage more contributions to the Flink ecosystem, instead of giving the impression that anything in the Flink ecosystem must be in the Flink main repo. 3. To support our ecosystem project authors, we have launched a website [1] to help the community keep track of and advertise ecosystem projects. It looks like a good place to put the Kafka metrics reporter. 
Regarding the message format, while I think using JSON by default is fine as it does not introduce many external dependencies, I wonder if we should make the message format pluggable. Many companies probably already have their own serde format for all their Kafka messages. For example, maybe they would like to just use an Avro record for their metrics instead of introducing a new JSON format. Also, in many cases there could be a lot of metric messages sent by the Flink jobs; JSON is less efficient and might have too much overhead in that case. Thanks, Jiangjie (Becket) Qin [1] https://flink-packages.org/ On Mon, Nov 18, 2019 at 3:30 AM Konstantin Knauf wrote: > Hi Gyula, > > thank you for proposing this. +1 for adding a KafkaMetricsReporter. In > terms of the dependency we could go a similar route as for the "universal" > Flink Kafka Connector which to my knowledge always tracks the latest Kafka > version as of the Flink release and relies on compatibility of the > underlying KafkaClient. JSON sounds good to me. > > Cheers, > > Konstantin > > > > > > On Sun, Nov 17, 2019 at 1:46 PM Gyula Fóra wrote: > > > Hi all! > > > > Several users have asked in the past about a Kafka based metrics reporter > > which can serve as a natural connector between arbitrary metric storage > > systems and a straightforward way to process Flink metrics downstream. > > > > I think this would be an extremely useful addition but I would like to > hear > > what others in the dev community think about it before submitting a > proper > > proposal. > > > > There are at least 3 questions to discuss here: > > > > > > *1. Do we want the Kafka metrics reporter in the Flink repo?* As it is > > much more generic than other metrics reporters already included, I would > > say yes. Also as almost everyone uses Flink with Kafka it would be a > > natural reporter choice for a lot of users. > > *2. 
How should we handle the Kafka dependency of the connector?* > > I think it would be overkill to add different Kafka versions here, > > so I would use Kafka 2.+, which has the best compatibility and is > > future-proof. > > *3. What message format should we use?* > > I would go with JSON for readability and compatibility. > > > > There is a relevant JIRA open for this already: > > https://issues.apache.org/jira/browse/FLINK-14531 > > > > We at Cloudera also promote this as a scalable way of pushing metrics to > > other systems so we are very happy to contribute an implementation or > > cooperate with others on building it. > > > > Please let me know what you think! > > > > Cheers, > > Gyula > > > > >
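[Editorial note] Gyula's proposal of JSON for readability and Becket's point about pluggable serde can be made concrete with a plain-Java sketch of what a JSON-encoded metric message might look like. All field names below (host, jobName, metricName, value, timestamp) are illustrative assumptions; no actual schema was agreed in this thread.

```java
// A minimal sketch of a JSON-encoded metric message for a hypothetical
// Kafka metrics reporter. The payload layout here is an assumption for
// illustration only -- the discussion left the exact schema open.
public class MetricMessageSketch {

    // Builds one metric message as a JSON string using only the JDK.
    static String toJson(String host, String jobName, String metricName,
                         double value, long timestamp) {
        return String.format(
            "{\"host\":\"%s\",\"jobName\":\"%s\",\"metricName\":\"%s\","
                + "\"value\":%s,\"timestamp\":%d}",
            host, jobName, metricName, value, timestamp);
    }

    public static void main(String[] args) {
        // Example: a gauge reading serialized for one reporting interval.
        String msg = toJson("host-1", "my-job", "numRecordsIn", 42.0, 1574064000000L);
        System.out.println(msg);
    }
}
```

A pluggable format, as Becket suggests, would replace `toJson` with an interface that an Avro or Protobuf implementation could also satisfy.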
Re: SQL for Avro GenericRecords on Parquet
Hi Hanan, Thanks for reporting the issue. Would you please attach your test code here? I may help to investigate. Best Regards Peter Huang On Mon, Nov 18, 2019 at 2:51 AM Hanan Yehudai wrote: > I have tried to persist Generic Avro records in a parquet file and then > read it via ParquetTablesource – using SQL. > Seems that the SQL I not executed properly ! > > The persisted records are : > Id , type > 333,Type1 > 22,Type2 > 333,Type1 > 22,Type2 > 333,Type1 > 22,Type2 > 333,Type1 > 22,Type2 > 333,Type1 > 22,Type2 > 333,Type1 > 22,Type2 > > While SQL of SELECT id ,recordType_ FROM ParquetTable - return the > above ( which is correct) > Running : "SELECT id ,recordType_ FROM ParquetTable where > recordType_='Type1' " > Will result in : > 333,Type1 > 22,Type1 > 333,Type1 > 22,Type1 > 333,Type1 > 22,Type1 > 333,Type1 > 22,Type1 > 333,Type1 > 22,Type1 > 333,Type1 > 22,Type1 > > As if the equal sign is assignment and not equal … > > am I doing something wrong ? is it an issue of Generic record vs > SpecificRecords ? > > >
[jira] [Created] (FLINK-14842) add logging for functions loaded from modules
Bowen Li created FLINK-14842: Summary: add logging for functions loaded from modules Key: FLINK-14842 URL: https://issues.apache.org/jira/browse/FLINK-14842 Project: Flink Issue Type: Improvement Components: Table SQL / API Reporter: Bowen Li Assignee: Bowen Li Fix For: 1.10.0 to help users identify which module provides that function -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [VOTE] FLIP-79: Flink Function DDL Support (1.10 Release Feature Only)
Hi Dawid, Sorry for the oversight. FLINK-7151 should be the parent ticket of the function DDL effort. I recently created several sub-tickets under it, currently including https://issues.apache.org/jira/browse/FLINK-14055 https://issues.apache.org/jira/browse/FLINK-14711 https://issues.apache.org/jira/browse/FLINK-14837 https://issues.apache.org/jira/browse/FLINK-14841 For the FLIP status, I think Bowen can probably add more context here. Best Regards Peter Huang On Mon, Nov 18, 2019 at 5:04 AM Dawid Wysakowicz wrote: > Hey, > > I've just realized this FLIP was accepted but it was not reflected in > the wiki page. > > Moreover I could not find a single jira issue that would link to the > FLIP. I've seen though a few other issues and PRs being opened regarding > this FLIP (FLINK-7151?, FLINK-14837, FLINK-14055). I was also a bit > confused with the relationship between these issues and e.g. > FLINK-10232. I tried to clarify a bit the way I understood it, but maybe > somebody more familiar with the whole story could improve it further. > > Could some of the committers that voted on this proposal help with a bit > of shepherding? > > Best, > > Dawid > > On 12/11/2019 07:59, Yu Li wrote: > > Thanks Bowen and Peter! Great to know. > > > > Best Regards, > > Yu > > > > > > On Tue, 12 Nov 2019 at 13:38, Peter Huang > > wrote: > > > >> Thank you in advance. Once the PR is done from my side, I will @ you and > >> yu for review. > >> > >> On Mon, Nov 11, 2019 at 9:15 PM Bowen Li wrote: > >> > >>> Thanks Peter! I assigned the ticket to you, and I can help with > reviewing > >>> and merging PRs of this FLIP. > >>> > >>> Ccing 1.10 release manager Yu that we are on track of FLIP-79. > >>> > >>> Cheers, > >>> Bowen > >>> > >>> On Mon, Nov 11, 2019 at 9:03 PM Peter Huang < > huangzhenqiu0...@gmail.com> > >>> wrote: > >>> > Thanks, everyone for joining the discussion and giving feedback!. The > voting time for FLIP-79 has passed. I'm closing the vote now. 
> > There were seven +1 votes, 3 of which are binding: > - Bowen Li (binding) > - Kurt Yong (binding) > - Shuyi Chen (binding) > > - Terry Wang (non-binding) > - Xuefu Zhang (non-binding) > - Vino Yang (non-binding) > - Jingsong Lee (non-binding) > > There were no disapproving votes. Thus, FLIP-79 has been accepted. > Now, > >>> we > agreed on the function DDL syntax as listed on the FLIP > and also the initial execution plan for release 1.10. Currently, I > will > mainly work on these two Jira tickets. > > 1) https://issues.apache.org/jira/browse/FLINK-7151 > 2) https://issues.apache.org/jira/browse/FLINK-14711 > > > > Best Regards > Peter Huang > > > Best Regards > Peter Huang > > On Mon, Nov 11, 2019 at 7:42 PM Jingsong Li > wrote: > > > +1 (non-binding) > > > > Best, > > Jingsong Lee > > > > On Tue, Nov 12, 2019 at 9:49 AM vino yang > >>> wrote: > >> +1 (non-binding) > >> > >> Best, > >> Vino > >> > >> Xuefu Z 于2019年11月12日周二 上午3:27写道: > >> > >>> +1 (non-binding) > >>> > >>> On Mon, Nov 11, 2019 at 9:54 AM Shuyi Chen > wrote: > +1 (binding) > > On Sat, Nov 9, 2019 at 11:17 PM Kurt Young > wrote: > > +1 (binding) > > > > Best, > > Kurt > > > > > > On Sun, Nov 10, 2019 at 12:25 PM Peter Huang < > >>> huangzhenqiu0...@gmail.com > > wrote: > > > >> Hi Yu, > >> > >> Thanks for your reminder about the timeline of delivering > >>> the > > basic > >> function DDL in release 1.10. > >> As I replied to Xuefu, the "CREATE FUNCTION" and "DROP > FUNCTION" > >> can > >> relatively easy achieve by revising the existing PR. > >> Definitely, I probably need to start to work on a basic > >>> version > > of > >> PR > for > >> "ALTER FUNCTION" and "SHOW FUNCTIONS". > >> Please let me know if you have any suggestion to better > >>> align > the > > timeline > >> of the ongoing catalog related efforts. > >> > >> Best Regards > >> Peter Huang > >> > >> > >> On Sat, Nov 9, 2019 at 7:26 PM Yu Li > >>> wrote: > >>> Thanks for driving this Peter! 
> >>> > >>> I agree it would be great if we could include this > >>> feature in > >> 1.10. > >>> However, FWIW, since we are following the time-based > >>> release > >> policy > [1] > >> and > >>> 1.10 release is approaching its feature freeze (planned > >>> to be > > at > >>> the > > end > >> of > >>> November) [2], I'm a little bit concerned about the > >>> schedule. > >>> [1] >
[jira] [Created] (FLINK-14841) Add create and drop function DDL in parser
Zhenqiu Huang created FLINK-14841: - Summary: Add create and drop function DDL in parser Key: FLINK-14841 URL: https://issues.apache.org/jira/browse/FLINK-14841 Project: Flink Issue Type: Sub-task Components: Table SQL / API Affects Versions: 1.10.0 Reporter: Zhenqiu Huang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[ANNOUNCE] Launch of flink-packages.org: A website to foster the Flink Ecosystem
Hi all, I would like to announce that Ververica, with the permission of the Flink PMC, is launching a website called flink-packages.org. This goes back to an effort proposed earlier in 2019 [1]. The idea of the site is to help developers building extensions / connectors / APIs etc. for Flink to get attention for their projects. At the same time, we want to help Flink users find those ecosystem projects, so that they can benefit from the work. A voting and commenting functionality allows users to rate and discuss individual packages. You can find the website here: https://flink-packages.org/ The full announcement is available here: https://www.ververica.com/blog/announcing-flink-community-packages I'm happy to hear any feedback about the site. Best, Robert [1] https://lists.apache.org/thread.html/c306b8b6d5d2ca071071b634d647f47769760e1e91cd758f52a62c93@%3Cdev.flink.apache.org%3E
Re: [VOTE] FLIP-79: Flink Function DDL Support (1.10 Release Feature Only)
Hey, I've just realized this FLIP was accepted but it was not reflected in the wiki page. Moreover I could not find a single jira issue that would link to the FLIP. I've seen though a few other issues and PRs being opened regarding this FLIP (FLINK-7151?, FLINK-14837, FLINK-14055). I was also a bit confused with the relationship between these issues and e.g. FLINK-10232. I tried to clarify a bit the way I understood it, but maybe somebody more familiar with the whole story could improve it further. Could some of the committers that voted on this proposal help with a bit of shepherding? Best, Dawid On 12/11/2019 07:59, Yu Li wrote: > Thanks Bowen and Peter! Great to know. > > Best Regards, > Yu > > > On Tue, 12 Nov 2019 at 13:38, Peter Huang > wrote: > >> Thank you in advance. Once the PR is done from my side, I will @ you and >> yu for review. >> >> On Mon, Nov 11, 2019 at 9:15 PM Bowen Li wrote: >> >>> Thanks Peter! I assigned the ticket to you, and I can help with reviewing >>> and merging PRs of this FLIP. >>> >>> Ccing 1.10 release manager Yu that we are on track of FLIP-79. >>> >>> Cheers, >>> Bowen >>> >>> On Mon, Nov 11, 2019 at 9:03 PM Peter Huang >>> wrote: >>> Thanks, everyone for joining the discussion and giving feedback!. The voting time for FLIP-79 has passed. I'm closing the vote now. There were seven +1 votes, 3 of which are binding: - Bowen Li (binding) - Kurt Yong (binding) - Shuyi Chen (binding) - Terry Wang (non-binding) - Xuefu Zhang (non-binding) - Vino Yang (non-binding) - Jingsong Lee (non-binding) There were no disapproving votes. Thus, FLIP-79 has been accepted. Now, >>> we agreed on the function DDL syntax as listed on the FLIP and also the initial execution plan for release 1.10. Currently, I will mainly work on these two Jira tickets. 
1) https://issues.apache.org/jira/browse/FLINK-7151 2) https://issues.apache.org/jira/browse/FLINK-14711 Best Regards Peter Huang Best Regards Peter Huang On Mon, Nov 11, 2019 at 7:42 PM Jingsong Li wrote: > +1 (non-binding) > > Best, > Jingsong Lee > > On Tue, Nov 12, 2019 at 9:49 AM vino yang >>> wrote: >> +1 (non-binding) >> >> Best, >> Vino >> >> Xuefu Z 于2019年11月12日周二 上午3:27写道: >> >>> +1 (non-binding) >>> >>> On Mon, Nov 11, 2019 at 9:54 AM Shuyi Chen wrote: +1 (binding) On Sat, Nov 9, 2019 at 11:17 PM Kurt Young wrote: > +1 (binding) > > Best, > Kurt > > > On Sun, Nov 10, 2019 at 12:25 PM Peter Huang < >>> huangzhenqiu0...@gmail.com > wrote: > >> Hi Yu, >> >> Thanks for your reminder about the timeline of delivering >>> the > basic >> function DDL in release 1.10. >> As I replied to Xuefu, the "CREATE FUNCTION" and "DROP FUNCTION" >> can >> relatively easy achieve by revising the existing PR. >> Definitely, I probably need to start to work on a basic >>> version > of >> PR for >> "ALTER FUNCTION" and "SHOW FUNCTIONS". >> Please let me know if you have any suggestion to better >>> align the > timeline >> of the ongoing catalog related efforts. >> >> Best Regards >> Peter Huang >> >> >> On Sat, Nov 9, 2019 at 7:26 PM Yu Li >>> wrote: >>> Thanks for driving this Peter! >>> >>> I agree it would be great if we could include this >>> feature in >> 1.10. >>> However, FWIW, since we are following the time-based >>> release >> policy [1] >> and >>> 1.10 release is approaching its feature freeze (planned >>> to be > at >>> the > end >> of >>> November) [2], I'm a little bit concerned about the >>> schedule. >>> [1] https://cwiki.apache.org/confluence/display/FLINK/Time-based+releases >>> [2] >>> >>> >>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Features-for-Apache-Flink-1-10-td32824.html >>> Best Regards, >>> Yu >>> >>> >>> On Sat, 9 Nov 2019 at 04:12, Xuefu Z > wrote: Hi Peter, Thanks for driving this. I'm all-in for this. 
However, >>> as I >> read the >>> latest FLIP, I have a couple of questions/comments: 1. It seems that "JVM" is proposed as a language type in >> parallel to python. I'm not sure that's very intuitive. JVM stands >>> for >> "java >> virtual machine", so the language is really "JAVA", correct? I >>> know >>> "scala" > is >>> also
Re: [DISCUSS] Release flink-shaded 9.0
@Chesnay: I know you said that you are pretty busy these days. If we can't find anybody else to work on this, when would you be available to create the first RC? On Sun, Nov 17, 2019 at 6:48 AM Hequn Cheng wrote: > Hi, > > Big +1 to release 9.0. > It would be good if we can solve these security vulnerabilities. > > Thanks a lot for your nice work and kick off the release so quickly. > > > On Fri, Nov 15, 2019 at 11:50 PM Ufuk Celebi wrote: > > > From what I can see, the Jackson version bump fixes quite a few > > vulnerabilities. Therefore, I'd be +1 to release flink-shaded 9.0. > > > > Thanks for all the work to verify this on master already. > > > > – Ufuk > > > > > > On Fri, Nov 15, 2019 at 2:26 PM Chesnay Schepler > > wrote: > > > > > Hello, > > > > > > I'd like to kick off the next release for flink-shaded. Background is > > > that we recently bumped jackson to 2.10.1 to fix a variety of security > > > vulnerabilities, and it would be good to include them in the upcoming > > > 1.8.3/1.9.2 releases. > > > > > > The release would contain few changes beyond the jackson changes; > > > flink-shaded can now be compiled on Java 11 and an encoding issue for > > > the NOTICE files was fixed. > > > > > > So overall this should be very little overhead. > > > > > > I have already verified that the master would work with this version > > > (this being a reasonable indicator for it also working in previous > > > version). > > > > > > I'd also appreciate it if someone would volunteer to handle the > release; > > > I'm quite bogged down at the moment :( > > > > > > Regards, > > > > > > Chesnay > > > > > > > > >
[jira] [Created] (FLINK-14840) Use new Executor interface in SQL cli
Aljoscha Krettek created FLINK-14840: Summary: Use new Executor interface in SQL cli Key: FLINK-14840 URL: https://issues.apache.org/jira/browse/FLINK-14840 Project: Flink Issue Type: Sub-task Components: Table SQL / Client Reporter: Aljoscha Krettek Assignee: Aljoscha Krettek Currently, the SQL cli has custom code for job deployment in {{ProgramDeployer}}. We should replace this by using the newly introduced {{Executor}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [VOTE] FLIP-83: Flink End-to-end Performance Testing Framework
+1 (binding) on having the test suite. BTW, it would be good to have a few more details about the performance tests. For example: 1. What do the testing records look like? The size and key distributions. 2. The resources for each task. 3. The intended configuration for the jobs. 4. What exact source and sink would it use? Thanks, Jiangjie (Becket) Qin On Mon, Nov 18, 2019 at 7:25 PM Zhijiang wrote: > +1 (binding)! > > It is a good thing to enhance our testing work. > > Best, > Zhijiang > > > -- > From:Hequn Cheng > Send Time:2019 Nov. 18 (Mon.) 18:22 > To:dev > Subject:Re: [VOTE] FLIP-83: Flink End-to-end Performance Testing Framework > > +1 (binding)! > I think this would be very helpful to detect regression problems. > > Best, Hequn > > On Mon, Nov 18, 2019 at 4:28 PM vino yang wrote: > > > +1 (non-binding) > > > > Best, > > Vino > > > > jincheng sun 于2019年11月18日周一 下午2:31写道: > > > > > +1 (binding) > > > > > > OpenInx 于2019年11月18日周一 下午12:09写道: > > > > > > > +1 (non-binding) > > > > > > > > On Mon, Nov 18, 2019 at 11:54 AM aihua li > > wrote: > > > > > > > > > +1 (non-binding) > > > > > > > > > > Thanks Yu Li for driving on this. > > > > > > > > > > > 在 2019年11月15日,下午8:10,Yu Li 写道: > > > > > > > > > > > > Hi All, > > > > > > > > > > > > I would like to start the vote for FLIP-83 [1] which is discussed > > and > > > > > > reached consensus in the discussion thread [2]. > > > > > > > > > > > > The vote will be open for at least 72 hours (excluding weekend). > > I'll > > > > try > > > > > > to close it by 2019-11-20 21:00 CST, unless there is an objection > > or > > > > not > > > > > > enough votes. > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework > > > > > > [2] https://s.apache.org/7fqrz > > > > > > > > > > > > Best Regards, > > > > > > Yu > > > > > > > > > > > > > > > > > > > > >
Re: [VOTE] FLIP-83: Flink End-to-end Performance Testing Framework
+1 (binding)! It is a good thing to enhance our testing work. Best, Zhijiang -- From:Hequn Cheng Send Time:2019 Nov. 18 (Mon.) 18:22 To:dev Subject:Re: [VOTE] FLIP-83: Flink End-to-end Performance Testing Framework +1 (binding)! I think this would be very helpful to detect regression problems. Best, Hequn On Mon, Nov 18, 2019 at 4:28 PM vino yang wrote: > +1 (non-binding) > > Best, > Vino > > jincheng sun 于2019年11月18日周一 下午2:31写道: > > > +1 (binding) > > > > OpenInx 于2019年11月18日周一 下午12:09写道: > > > > > +1 (non-binding) > > > > > > On Mon, Nov 18, 2019 at 11:54 AM aihua li > wrote: > > > > > > > +1 (non-binding) > > > > > > > > Thanks Yu Li for driving on this. > > > > > > > > > 在 2019年11月15日,下午8:10,Yu Li 写道: > > > > > > > > > > Hi All, > > > > > > > > > > I would like to start the vote for FLIP-83 [1] which is discussed > and > > > > > reached consensus in the discussion thread [2]. > > > > > > > > > > The vote will be open for at least 72 hours (excluding weekend). > I'll > > > try > > > > > to close it by 2019-11-20 21:00 CST, unless there is an objection > or > > > not > > > > > enough votes. > > > > > > > > > > [1] > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework > > > > > [2] https://s.apache.org/7fqrz > > > > > > > > > > Best Regards, > > > > > Yu > > > > > > > > > > > > > >
SQL for Avro GenericRecords on Parquet
I have tried to persist generic Avro records in a parquet file and then read them back via ParquetTableSource using SQL. It seems that the SQL is not executed properly! The persisted records are: Id , type 333,Type1 22,Type2 333,Type1 22,Type2 333,Type1 22,Type2 333,Type1 22,Type2 333,Type1 22,Type2 333,Type1 22,Type2 The SQL "SELECT id ,recordType_ FROM ParquetTable" returns the above (which is correct). Running "SELECT id ,recordType_ FROM ParquetTable where recordType_='Type1'" will result in: 333,Type1 22,Type1 333,Type1 22,Type1 333,Type1 22,Type1 333,Type1 22,Type1 333,Type1 22,Type1 333,Type1 22,Type1 As if the equals sign were an assignment rather than a comparison… Am I doing something wrong? Is it an issue of GenericRecord vs SpecificRecord?
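[Editorial note] The expected WHERE-clause semantics described above can be sketched with plain Java as a reference point: filtering should only drop non-matching rows, never rewrite their values. The `Row` class below is a hypothetical stand-in for the (id, recordType_) tuples in the mail, not a Flink class.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Sketch of what "WHERE recordType_ = 'Type1'" should do, using plain
// Java streams. Row is an illustrative record type for this example only.
public class WhereClauseSketch {

    public static final class Row {
        public final long id;
        public final String recordType;

        public Row(long id, String recordType) {
            this.id = id;
            this.recordType = recordType;
        }
    }

    // A WHERE clause is a predicate: keep matching rows, leave all
    // column values untouched.
    public static List<Row> whereTypeEquals(List<Row> rows, String type) {
        return rows.stream()
            .filter(r -> r.recordType.equals(type))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(new Row(333, "Type1"), new Row(22, "Type2"));
        List<Row> out = whereTypeEquals(rows, "Type1");
        // Only (333, Type1) survives; the reported behavior instead kept
        // all rows and rewrote recordType_, as if '=' were assignment.
        System.out.println(out.size() + " " + out.get(0).id); // prints "1 333"
    }
}
```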
Re: [VOTE] FLIP-83: Flink End-to-end Performance Testing Framework
+1 (binding)! I think this would be very helpful to detect regression problems. Best, Hequn On Mon, Nov 18, 2019 at 4:28 PM vino yang wrote: > +1 (non-binding) > > Best, > Vino > > jincheng sun 于2019年11月18日周一 下午2:31写道: > > > +1 (binding) > > > > OpenInx 于2019年11月18日周一 下午12:09写道: > > > > > +1 (non-binding) > > > > > > On Mon, Nov 18, 2019 at 11:54 AM aihua li > wrote: > > > > > > > +1 (non-binding) > > > > > > > > Thanks Yu Li for driving on this. > > > > > > > > > 在 2019年11月15日,下午8:10,Yu Li 写道: > > > > > > > > > > Hi All, > > > > > > > > > > I would like to start the vote for FLIP-83 [1] which is discussed > and > > > > > reached consensus in the discussion thread [2]. > > > > > > > > > > The vote will be open for at least 72 hours (excluding weekend). > I'll > > > try > > > > > to close it by 2019-11-20 21:00 CST, unless there is an objection > or > > > not > > > > > enough votes. > > > > > > > > > > [1] > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework > > > > > [2] https://s.apache.org/7fqrz > > > > > > > > > > Best Regards, > > > > > Yu > > > > > > > > > > > > > >
[jira] [Created] (FLINK-14839) JobGraph#userJars & JobGraph#classpaths are non-null
Zili Chen created FLINK-14839: - Summary: JobGraph#userJars & JobGraph#classpaths are non-null Key: FLINK-14839 URL: https://issues.apache.org/jira/browse/FLINK-14839 Project: Flink Issue Type: Improvement Reporter: Zili Chen Assignee: Zili Chen Fix For: 1.10.0 So we can improve our code a bit. -- This message was sent by Atlassian Jira (v8.3.4#803005)
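[Editorial note] The simplification this ticket enables — guaranteeing the jar/classpath collections are never null so call sites can drop null checks — can be sketched with a small holder class. The class and method names below are illustrative only, not the actual JobGraph code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Sketch of the invariant FLINK-14839 proposes: the list fields are
// always non-null (possibly empty), so callers never null-check them.
// NonNullListsSketch is a hypothetical holder for illustration.
public class NonNullListsSketch {

    // Initialized eagerly, never reassigned to null.
    private final List<String> userJars = new ArrayList<>();

    public void addUserJar(String jar) {
        // Reject null elements at the boundary instead of at every reader.
        userJars.add(Objects.requireNonNull(jar, "jar must not be null"));
    }

    // Always returns a non-null (unmodifiable) view, possibly empty.
    public List<String> getUserJars() {
        return Collections.unmodifiableList(userJars);
    }
}
```

With this invariant, consumers can iterate `getUserJars()` directly instead of guarding with `if (jars != null)`.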
Re: [DISCUSS] PyFlink User-Defined Function Resource Management
Thanks for driving this discussion, Dian! +1 for this proposal. It will help to reduce container failure due to the memory overuse. Some comments left in the design doc. Best, Yangze Guo On Mon, Nov 18, 2019 at 4:06 PM Xintong Song wrote: > > Sorry for the late reply. > > +1 for the general proposal. > > And one remainder, to use UNKNOWN resource requirement, we need to make > sure optimizer knowns which operators use off-heap managed memory, and > compute and set a fraction to the operators. See FLIP-53[1] for more > details, and I would suggest you to double check with @Zhu Zhu who works on > this part. > > Thank you~ > > Xintong Song > > > [1] > https://cwiki.apache.org/confluence/display/FLINK/FLIP-53%3A+Fine+Grained+Operator+Resource+Management > > On Tue, Nov 12, 2019 at 11:53 AM Dian Fu wrote: > > > Hi Jincheng, > > > > Thanks for the reply and also looking forward to the feedback from the > > community. > > > > Thanks, > > Dian > > > > > 在 2019年11月11日,下午2:34,jincheng sun 写道: > > > > > > Hi all, > > > > > > +1, Thanks for bring up this discussion Dian! > > > > > > The Resource Management is very important for PyFlink UDF. So, It's great > > > if anyone can add more comments or inputs in the design doc or feedback > > in > > > ML. :) > > > > > > Best, > > > Jincheng > > > > > > Dian Fu 于2019年11月5日周二 上午11:32写道: > > > > > >> Hi everyone, > > >> > > >> In FLIP-58[1] it will add the support of Python user-defined stateless > > >> function for Python Table API. It will launch a separate Python process > > for > > >> Python user-defined function execution. The resources used by the Python > > >> process should be managed properly by Flink’s resource management > > >> framework. FLIP-49[2] has proposed a unified memory management framework > > >> and PyFlink user-defined function resource management should be based on > > >> it. Jincheng, Hequn, Xintong, GuoWei and I discussed offline about > > this. 
I > > >> draft a design doc[3] and want to start a discussion about PyFlink > > >> user-defined function resource management. > > >> > > >> Welcome any comments on the design doc or giving us feedback on the ML > > >> directly. > > >> > > >> Regards, > > >> Dian > > >> > > >> [1] > > >> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-58%3A+Flink+Python+User-Defined+Stateless+Function+for+Table > > >> [2] > > >> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > > >> [3] > > >> > > https://docs.google.com/document/d/1LQP8L66Thu2yVv6YRSfmF9EkkMnwhBHGjcTQ11GUmFc/edit#heading=h.4q4ggaftf78m > > > >
Re: [VOTE] FLIP-83: Flink End-to-end Performance Testing Framework
+1 (non-binding) Best, Vino jincheng sun 于2019年11月18日周一 下午2:31写道: > +1 (binding) > > OpenInx 于2019年11月18日周一 下午12:09写道: > > > +1 (non-binding) > > > > On Mon, Nov 18, 2019 at 11:54 AM aihua li wrote: > > > > > +1 (non-binding) > > > > > > Thanks Yu Li for driving on this. > > > > > > > 在 2019年11月15日,下午8:10,Yu Li 写道: > > > > > > > > Hi All, > > > > > > > > I would like to start the vote for FLIP-83 [1] which is discussed and > > > > reached consensus in the discussion thread [2]. > > > > > > > > The vote will be open for at least 72 hours (excluding weekend). I'll > > try > > > > to close it by 2019-11-20 21:00 CST, unless there is an objection or > > not > > > > enough votes. > > > > > > > > [1] > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework > > > > [2] https://s.apache.org/7fqrz > > > > > > > > Best Regards, > > > > Yu > > > > > > > > >
[jira] [Created] (FLINK-14838) Cleanup the description about container number config option in Scala and python shell doc
vinoyang created FLINK-14838: Summary: Cleanup the description about container number config option in Scala and python shell doc Key: FLINK-14838 URL: https://issues.apache.org/jira/browse/FLINK-14838 Project: Flink Issue Type: Improvement Components: Documentation Reporter: vinoyang Currently, the config option {{-n}} for Flink on Yarn has not been supported since Flink 1.8. FLINK-12362 did the cleanup job for this config option. However, the Scala shell and Python docs still contain some description of {{-n}}, which may confuse users. This issue is used to track the cleanup work. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-14837) Support Function DDL in TableEnvironment
Zhenqiu Huang created FLINK-14837: - Summary: Support Function DDL in TableEnvironment Key: FLINK-14837 URL: https://issues.apache.org/jira/browse/FLINK-14837 Project: Flink Issue Type: Sub-task Components: Table SQL / API Affects Versions: 1.10.0 Reporter: Zhenqiu Huang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-14836) Unable to set yarn container number for scala shell in yarn mode
Jeff Zhang created FLINK-14836: -- Summary: Unable to set yarn container number for scala shell in yarn mode Key: FLINK-14836 URL: https://issues.apache.org/jira/browse/FLINK-14836 Project: Flink Issue Type: Bug Affects Versions: 1.10.0 Reporter: Jeff Zhang In FLINK-12362, "-n" was removed, which makes it impossible to set the container number for the scala shell in yarn mode. The root cause is that start-scala-shell doesn't use `flink run`, which means it doesn't support all the parameters of `flink run` (including -p). Either we need to revert FLINK-12362, or make start-scala-shell use `flink run`. -- This message was sent by Atlassian Jira (v8.3.4#803005)
Re: [DISCUSS] PyFlink User-Defined Function Resource Management
Sorry for the late reply. +1 for the general proposal. And one reminder: to use UNKNOWN resource requirements, we need to make sure the optimizer knows which operators use off-heap managed memory, and computes and sets a fraction for those operators. See FLIP-53 [1] for more details, and I would suggest double-checking with @Zhu Zhu, who works on this part. Thank you~ Xintong Song [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-53%3A+Fine+Grained+Operator+Resource+Management On Tue, Nov 12, 2019 at 11:53 AM Dian Fu wrote: > Hi Jincheng, > > Thanks for the reply and also looking forward to the feedback from the > community. > > Thanks, > Dian > > > 在 2019年11月11日,下午2:34,jincheng sun 写道: > > > > Hi all, > > > > +1, Thanks for bring up this discussion Dian! > > > > The Resource Management is very important for PyFlink UDF. So, It's great > > if anyone can add more comments or inputs in the design doc or feedback > in > > ML. :) > > > > Best, > > Jincheng > > > > Dian Fu 于2019年11月5日周二 上午11:32写道: > > > >> Hi everyone, > >> > >> In FLIP-58[1] it will add the support of Python user-defined stateless > >> function for Python Table API. It will launch a separate Python process > for > >> Python user-defined function execution. The resources used by the Python > >> process should be managed properly by Flink’s resource management > >> framework. FLIP-49[2] has proposed a unified memory management framework > >> and PyFlink user-defined function resource management should be based on > >> it. Jincheng, Hequn, Xintong, GuoWei and I discussed offline about > this. 
> >> > >> Regards, > >> Dian > >> > >> [1] > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-58%3A+Flink+Python+User-Defined+Stateless+Function+for+Table > >> [2] > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > >> [3] > >> > https://docs.google.com/document/d/1LQP8L66Thu2yVv6YRSfmF9EkkMnwhBHGjcTQ11GUmFc/edit#heading=h.4q4ggaftf78m > >
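[Editorial note] The fraction computation Xintong mentions — the optimizer assigning each managed-memory-using operator a share of the slot's managed memory when requirements are UNKNOWN — can be sketched as follows. The equal-split policy below is an illustrative assumption for this sketch, not the exact FLIP-53 algorithm, and the operator names are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of assigning managed-memory fractions to operators that declared
// UNKNOWN resource requirements. Here every managed-memory user in a slot
// gets an equal fraction; FLIP-53 describes the real, weighted scheme.
public class ManagedMemoryFractionSketch {

    public static Map<String, Double> assignFractions(List<String> managedMemoryUsers) {
        Map<String, Double> fractions = new HashMap<>();
        if (managedMemoryUsers.isEmpty()) {
            return fractions; // no managed-memory users, nothing to assign
        }
        double share = 1.0 / managedMemoryUsers.size();
        for (String op : managedMemoryUsers) {
            fractions.put(op, share); // each operator gets an equal slice
        }
        return fractions;
    }
}
```

A Python UDF operator would then size its external Python process from its fraction of the slot's total off-heap managed memory.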