[jira] [Created] (FLINK-12723) Adds a wiki page about setting up a Python Table API development environment

2019-06-03 Thread Dian Fu (JIRA)
Dian Fu created FLINK-12723: --- Summary: Adds a wiki page about setting up a Python Table API development environment Key: FLINK-12723 URL: https://issues.apache.org/jira/browse/FLINK-12723 Project: Flink

[jira] [Created] (FLINK-12722) Adds Python Table API tutorial

2019-06-03 Thread Dian Fu (JIRA)
Dian Fu created FLINK-12722: --- Summary: Adds Python Table API tutorial Key: FLINK-12722 URL: https://issues.apache.org/jira/browse/FLINK-12722 Project: Flink Issue Type: Sub-task

[jira] [Created] (FLINK-12721) make flink-json more precisely when handle integer type

2019-06-03 Thread aloyszhang (JIRA)
aloyszhang created FLINK-12721: -- Summary: make flink-json more precisely when handle integer type Key: FLINK-12721 URL: https://issues.apache.org/jira/browse/FLINK-12721 Project: Flink Issue

[jira] [Created] (FLINK-12720) Add the Python Table API Sphinx docs

2019-06-03 Thread Dian Fu (JIRA)
Dian Fu created FLINK-12720: --- Summary: Add the Python Table API Sphinx docs Key: FLINK-12720 URL: https://issues.apache.org/jira/browse/FLINK-12720 Project: Flink Issue Type: Sub-task

[jira] [Created] (FLINK-12719) Add the catalog API for the Python Table API

2019-06-03 Thread Dian Fu (JIRA)
Dian Fu created FLINK-12719: --- Summary: Add the catalog API for the Python Table API Key: FLINK-12719 URL: https://issues.apache.org/jira/browse/FLINK-12719 Project: Flink Issue Type: Sub-task

[jira] [Created] (FLINK-12718) allow users to specify hive-site.xml location to configure hive metastore client in HiveCatalog

2019-06-03 Thread Bowen Li (JIRA)
Bowen Li created FLINK-12718: Summary: allow users to specify hive-site.xml location to configure hive metastore client in HiveCatalog Key: FLINK-12718 URL: https://issues.apache.org/jira/browse/FLINK-12718

[jira] [Created] (FLINK-12717) Add windows support for the Python shell script

2019-06-03 Thread Dian Fu (JIRA)
Dian Fu created FLINK-12717: --- Summary: Add windows support for the Python shell script Key: FLINK-12717 URL: https://issues.apache.org/jira/browse/FLINK-12717 Project: Flink Issue Type: Sub-task

[jira] [Created] (FLINK-12716) Add an interactive shell for Python Table API

2019-06-03 Thread Dian Fu (JIRA)
Dian Fu created FLINK-12716: --- Summary: Add an interactive shell for Python Table API Key: FLINK-12716 URL: https://issues.apache.org/jira/browse/FLINK-12716 Project: Flink Issue Type: Sub-task

[jira] [Created] (FLINK-12715) Hive-1.2.1 build is broken

2019-06-03 Thread Rui Li (JIRA)
Rui Li created FLINK-12715: -- Summary: Hive-1.2.1 build is broken Key: FLINK-12715 URL: https://issues.apache.org/jira/browse/FLINK-12715 Project: Flink Issue Type: Sub-task Reporter:

Re: [DISCUSS] Support Local Aggregation in Flink

2019-06-03 Thread vino yang
Hi Ken, Thanks for your reply. As I said before, we try to reuse Flink's state concept (for fault tolerance and to guarantee "Exactly-Once" semantics), so we did not consider a cache. In addition, if we use Flink's state, OOM-related issues are not a key problem we need to consider. Best, Vino Ken

[jira] [Created] (FLINK-12714) confusion about flink time window TimeWindow#getWindowStartWithOffset

2019-06-03 Thread Wenshuai Hou (JIRA)
Wenshuai Hou created FLINK-12714: Summary: confusion about flink time window TimeWindow#getWindowStartWithOffset Key: FLINK-12714 URL: https://issues.apache.org/jira/browse/FLINK-12714 Project: Flink

[jira] [Created] (FLINK-12713) deprecate descriptor, validator, and factory of ExternalCatalog

2019-06-03 Thread Bowen Li (JIRA)
Bowen Li created FLINK-12713: Summary: deprecate descriptor, validator, and factory of ExternalCatalog Key: FLINK-12713 URL: https://issues.apache.org/jira/browse/FLINK-12713 Project: Flink

[jira] [Created] (FLINK-12712) deprecate ExternalCatalog and its subclasses and impls

2019-06-03 Thread Bowen Li (JIRA)
Bowen Li created FLINK-12712: Summary: deprecate ExternalCatalog and its subclasses and impls Key: FLINK-12712 URL: https://issues.apache.org/jira/browse/FLINK-12712 Project: Flink Issue Type:

Re: [DISCUSS] Support Local Aggregation in Flink

2019-06-03 Thread Ken Krugler
Hi all, Cascading implemented this “map-side reduce” functionality with an LRU cache. That worked well, as then the skewed keys would always be in the cache. The API let you decide the size of the cache, in terms of number of entries. Having a memory limit would have been better for many of
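
The bounded cache described in this message can be sketched roughly as follows, reading it as a least-recently-used (LRU) cache sized by number of entries. This is only an illustration under stated assumptions, not Cascading's or Flink's actual implementation: the function name `lru_pre_aggregate`, the `max_entries` parameter, and the choice of summation as the aggregate are all hypothetical.

```python
from collections import OrderedDict


def lru_pre_aggregate(records, max_entries):
    """Yield partial (key, sum) pairs while holding at most max_entries keys.

    Hypothetical sketch of a "map-side reduce": frequently seen (skewed) keys
    stay in the cache and are combined locally, while rarely seen keys are
    evicted and emitted as partial results for the downstream reducer.
    """
    cache = OrderedDict()
    for key, value in records:
        if key in cache:
            cache[key] += value
            cache.move_to_end(key)  # mark the key as most recently used
        else:
            cache[key] = value
            if len(cache) > max_entries:
                # Evict the least recently used key and emit its partial sum.
                yield cache.popitem(last=False)
    # End of input: flush whatever is still cached.
    yield from cache.items()


if __name__ == "__main__":
    skewed = [("a", 1), ("b", 1), ("a", 1), ("c", 1), ("a", 1)]
    for partial in lru_pre_aggregate(skewed, max_entries=2):
        print(partial)
```

The downstream operator still has to merge the partial results per key, since an evicted key may reappear later and be emitted more than once. A limit on entries rather than bytes is what the message describes; a memory-based limit would need a size estimate per entry.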

Re: [ANNOUNCE] Apache Flink-shaded 7.0 released

2019-06-03 Thread Till Rohrmann
Thanks a lot Jincheng and to the community for making this release possible. Cheers, Till On Mon, Jun 3, 2019 at 2:14 PM Hequn Cheng wrote: > Thanks a lot to Jincheng and Chesnay and to the community making this > release possible! > > Best, Hequn > > On Mon, Jun 3, 2019 at 3:16 PM Jark Wu

[jira] [Created] (FLINK-12711) Separate function implementation and definition

2019-06-03 Thread Timo Walther (JIRA)
Timo Walther created FLINK-12711: Summary: Separate function implementation and definition Key: FLINK-12711 URL: https://issues.apache.org/jira/browse/FLINK-12711 Project: Flink Issue Type:

[jira] [Created] (FLINK-12710) Unify built-in and user-defined functions in the API

2019-06-03 Thread Timo Walther (JIRA)
Timo Walther created FLINK-12710: Summary: Unify built-in and user-defined functions in the API Key: FLINK-12710 URL: https://issues.apache.org/jira/browse/FLINK-12710 Project: Flink Issue

Re: [ANNOUNCE] Apache Flink-shaded 7.0 released

2019-06-03 Thread Hequn Cheng
Thanks a lot to Jincheng and Chesnay and to the community making this release possible! Best, Hequn On Mon, Jun 3, 2019 at 3:16 PM Jark Wu wrote: > Thanks Jincheng for your effort! > > On Sat, 1 Jun 2019 at 05:19, Bowen Li wrote: > > > Thanks Jincheng for driving this release! > > > > On Thu,

[jira] [Created] (FLINK-12709) Implement RestartBackoffTimeStrategyFactoryLoader

2019-06-03 Thread Zhu Zhu (JIRA)
Zhu Zhu created FLINK-12709: --- Summary: Implement RestartBackoffTimeStrategyFactoryLoader Key: FLINK-12709 URL: https://issues.apache.org/jira/browse/FLINK-12709 Project: Flink Issue Type: Sub-task

Re: Key state does not support migration

2019-06-03 Thread Tzu-Li (Gordon) Tai
Hi Richard, Schema evolution for data types that are used as keys is not allowed because, potentially, if the schema of the key changes, hash codes of keys may also change and can break partitioning for internal state managed by Flink. There are of course some evolution scenarios that would not
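
The partitioning concern in this message can be illustrated with a simplified sketch. It assumes a key is routed by hashing its serialized form modulo the number of partitions; Flink's real mechanism (based on the key's hash code and key groups) differs in detail, and the names `partition_for`, `old_key`, and `new_key` are hypothetical.

```python
import json
import zlib


def partition_for(key: dict, num_partitions: int) -> int:
    """Route a key to a partition by hashing its serialized form.

    Simplified stand-in for hash partitioning of keyed state; this is not
    Flink's actual key-group assignment.
    """
    encoded = json.dumps(key, sort_keys=True).encode("utf-8")
    return zlib.crc32(encoded) % num_partitions


if __name__ == "__main__":
    old_key = {"user_id": 42}                  # key under the old schema
    new_key = {"user_id": 42, "region": None}  # same logical key, evolved schema
    print(partition_for(old_key, 8), partition_for(new_key, 8))
```

Because the two serialized forms differ, the hashes generally differ as well, so state written under the old key may no longer be found on the partition that the evolved key maps to.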

Re: Key state does not support migration

2019-06-03 Thread Till Rohrmann
Hi Richard, I've pulled in Gordon who worked on this feature. He should be able to tell you about the current limitations of Flink's schema evolution. Cheers, Till On Wed, May 29, 2019 at 1:44 PM Richard Deurwaarder wrote: > Hello, > > I am running into the problem where (avro) schema

Re: [DISCUSS] FLIP-39: Flink ML pipeline and ML libs

2019-06-03 Thread Stavros Kontopoulos
Hi, Some portion of the code could be migrated to the new Table API, no? I am saying that because the new API design is based on scikit-learn and the old one was also inspired by it. Best, Stavros On Wed, May 22, 2019 at 1:24 PM Shaoxuan Wang wrote: > Another consensus (from the offline

[jira] [Created] (FLINK-12708) Introduce new Interfaces for source and sink to make Blink runner work

2019-06-03 Thread Jark Wu (JIRA)
Jark Wu created FLINK-12708: --- Summary: Introduce new Interfaces for source and sink to make Blink runner work Key: FLINK-12708 URL: https://issues.apache.org/jira/browse/FLINK-12708 Project: Flink

Re: [DISCUSS] Support Local Aggregation in Flink

2019-06-03 Thread vino yang
Hi Piotr, The localKeyBy API returns an instance of KeyedStream (we just added an inner flag to identify the local mode), which Flink already provides. Users can call all the APIs (especially the *window* APIs) that KeyedStream provides. So if users want to use local aggregation, they should

[jira] [Created] (FLINK-12707) Close minicluster will cause memory leak when there are StreamTask closed abnormal

2019-06-03 Thread liuzhaokun (JIRA)
liuzhaokun created FLINK-12707: -- Summary: Close minicluster will cause memory leak when there are StreamTask closed abnormal Key: FLINK-12707 URL: https://issues.apache.org/jira/browse/FLINK-12707

Re: [DISCUSS] Support Local Aggregation in Flink

2019-06-03 Thread Piotr Nowojski
Hi, +1 for the idea from my side. I’ve even attempted to add a similar feature quite some time ago, but didn’t get enough traction [1]. I’ve read through your document and I couldn’t find any mention of when the pre-aggregated result should be emitted downstream. I think that’s

[jira] [Created] (FLINK-12706) Introduce ShuffleManager interface and its configuration

2019-06-03 Thread Andrey Zagrebin (JIRA)
Andrey Zagrebin created FLINK-12706: --- Summary: Introduce ShuffleManager interface and its configuration Key: FLINK-12706 URL: https://issues.apache.org/jira/browse/FLINK-12706 Project: Flink

Re: [DISCUSS] Support Local Aggregation in Flink

2019-06-03 Thread sf lee
Excited and Big +1 for this feature. SHI Xiaogang 于2019年6月3日周一 下午3:37写道: > Nice feature. > Looking forward to having it in Flink. > > Regards, > Xiaogang > > vino yang 于2019年6月3日周一 下午3:31写道: > > > Hi all, > > > > As we mentioned in some conference, such as Flink Forward SF 2019 and > QCon > >

Re: [DISCUSS] FLIP-33: Standardize connector metrics

2019-06-03 Thread Piotr Nowojski
Hi again :) > - pending.bytes, Gauge > - pending.messages, Gauge +1 And true, instead of overloading one of the metrics, it is better when the user can choose to provide only one of them. Re 2: > If I understand correctly, this metric along with the pending messages / > bytes would answer the

Re: [DISCUSS] Support Local Aggregation in Flink

2019-06-03 Thread SHI Xiaogang
Nice feature. Looking forward to having it in Flink. Regards, Xiaogang vino yang 于2019年6月3日周一 下午3:31写道: > Hi all, > > As we mentioned in some conference, such as Flink Forward SF 2019 and QCon > Beijing 2019, our team has implemented "Local aggregation" in our inner > Flink fork. This feature

[DISCUSS] Support Local Aggregation in Flink

2019-06-03 Thread vino yang
Hi all, As we mentioned at some conferences, such as Flink Forward SF 2019 and QCon Beijing 2019, our team has implemented "Local aggregation" in our internal Flink fork. This feature can effectively alleviate data skew. Currently, keyed streams are widely used to perform aggregating operations

Re: [ANNOUNCE] Apache Flink-shaded 7.0 released

2019-06-03 Thread Jark Wu
Thanks Jincheng for your effort! On Sat, 1 Jun 2019 at 05:19, Bowen Li wrote: > Thanks Jincheng for driving this release! > > On Thu, May 30, 2019 at 11:40 PM Terry Wang wrote: > > > Wow~ Glad to see this! > > Thanks Jincheng and Chesnay for your effort! > > > > > 在 2019年5月31日,下午1:53,jincheng

Re: Checkpoint / Two Phase Commit

2019-06-03 Thread Piotr Nowojski
Hi, Sorry for the late answer. > As I understand how it operates, the pre-phase state is when the checkpoint > is initiated and the checkpoint barrier advances from source to sink. Once > the pre-phase is complete (and successful), then the next step in the > process is where the "Sink" operator is

[jira] [Created] (FLINK-12705) Allow user to specify the Hive version in use

2019-06-03 Thread Rui Li (JIRA)
Rui Li created FLINK-12705: -- Summary: Allow user to specify the Hive version in use Key: FLINK-12705 URL: https://issues.apache.org/jira/browse/FLINK-12705 Project: Flink Issue Type: Sub-task