[jira] [Created] (FLINK-14458) Assert implementation classes of catalog objects

2019-10-18 Thread Bowen Li (Jira)
Bowen Li created FLINK-14458:


 Summary: Assert implementation classes of catalog objects
 Key: FLINK-14458
 URL: https://issues.apache.org/jira/browse/FLINK-14458
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API
Reporter: Bowen Li
Assignee: Bowen Li


We need to assert the implementation classes of catalog objects, including 
table/view/function/partition/stats/db, to make sure they are the right 
implementations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] FLIP-68: Extend Core Table System with Pluggable Modules

2019-10-18 Thread Bowen Li
Thanks Dawid and everyone.

I'm glad to announce that we have unanimously approved this FLIP
with 5 +1 votes: 3 binding (Timo, Jark, Dawid) and 2 non-binding (Xuefu,
Peter), and no -1 votes.

This FLIP will now move to the implementation phase, targeting Flink 1.10.


On Fri, Oct 18, 2019 at 1:29 AM Dawid Wysakowicz wrote:

> Thank you Bowen for the update. Great to hear we can have just
>
> ModuleFactory#createModule(Map<String, String>)
>
> +1 for the FLIP. Nice design BTW ;)
>
> Best,
>
> Dawid
>
>
> On 17/10/2019 18:36, Bowen Li wrote:
> > Thanks for pointing them out, Dawid. I've gone over the overall doc again
> > and corrected the above typos.
> >
> > - ModuleManager#listFunctions() returns Set<String>
> > - ModuleManager holds a LinkedHashMap<String, Module> to keep loaded
> > modules in order
> > - ModuleFactory#createModule(Map<String, String>) returns Module
> >
> >
> > On Thu, Oct 17, 2019 at 2:27 AM Dawid Wysakowicz wrote:
> >
> >> Hi all,
> >>
> >> Generally I'm fine with the design. Before I cast my +1 I wanted to
> >> clarify one thing. Is the module name in ModuleFactory#createModule
> >> necessary? Can't it be just?:
> >>
> >> interface ModuleFactory extends TableFactory {
> >>    Module createModule(Map<String, String> properties);
> >> }
> >>
> >> The name under which the module was registered should not affect the
> >> implementation of the module as far as I can tell. Could we remove this
> >> parameter from the method?
> >>
> >> I also spotted a few "bugs" in the design, but they do not affect the
> >> outcome of the design, as they are either just artifacts of refactoring
> the
> >> FLIP or affect only the internal implementation:
> >>
> >>- there is a typo in the ModuleFactory#createModule return type. It
> >>should be Module instead of Plugin
> >>- the return type of ModuleManager#listFunctions() should be
> >>Set<String> instead of Set<Set<String>>, right?
> >>- we cannot use a list to store the modules in ModuleManager, if I am not
> >>mistaken. We need to store them in a Map to e.g. be able to unload
> >>modules by their names.
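
For illustration, a minimal sketch of the simplified design under discussion. The Module and ModuleFactory shapes follow the FLIP as quoted above; EchoModule/EchoModuleFactory are made-up names, and the requiredContext()/supportedProperties() methods inherited from TableFactory are omitted for brevity:

    import java.util.Collections;
    import java.util.Map;
    import java.util.Set;

    // A made-up module exposing a single function name;
    // Module#listFunctions() returns Set<String> per the corrected FLIP.
    public final class EchoModule implements Module {
        @Override
        public Set<String> listFunctions() {
            return Collections.singleton("echo");
        }
    }

    // The factory no longer receives the registration name: the module
    // behaves identically regardless of the name it is loaded under.
    public final class EchoModuleFactory implements ModuleFactory {
        @Override
        public Module createModule(Map<String, String> properties) {
            return new EchoModule();
        }
    }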
> >>
> >> Best,
> >>
> >> Dawid
> >> On 17/10/2019 04:16, Jark Wu wrote:
> >>
> >> +1
> >>
> >> Thanks,
> >> Jark
> >>
> >> On Thu, 17 Oct 2019 at 04:44, Peter Huang wrote:
> >>
> >>
> >> +1 Thanks
> >>
> >> On Wed, Oct 16, 2019 at 12:48 PM Xuefu Z <usxu...@gmail.com> wrote:
> >>
> >>
> >> +1 (non-binding)
> >>
> >> On Wed, Oct 16, 2019 at 2:26 AM Timo Walther <twal...@apache.org> wrote:
> >>
> >>
> >> +1
> >>
> >> Thanks,
> >> Timo
> >>
> >>
> >> On 15.10.19 20:50, Bowen Li wrote:
> >>
> >> Hi all,
> >>
> >> I'd like to kick off a voting thread for FLIP-68: Extend Core Table
> >>
> >> System
> >>
> >> with Pluggable Modules [1], as we have reached consensus in [2].
> >>
> >> The voting period will be open for at least 72 hours, ending at 7pm
> >>
> >> Oct
> >>
> >> 18
> >>
> >> UTC.
> >>
> >> Thanks,
> >> Bowen
> >>
> >> [1]
> >>
> >>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> >>
> >> [2] https://www.mail-archive.com/dev@flink.apache.org/msg29894.html
> >>
> >> --
> >> Xuefu Zhang
> >>
> >> "In Honey We Trust!"
> >>
> >>
> >>
>
>


Re: [DISCUSS] Rename the SQL ANY type to OPAQUE type

2019-10-18 Thread Xuefu Z
Thanks to Timo for bringing up an interesting topic.

Personally, "OPAQUE" doesn't seem very intuitive with respect to types. (It
suits pretty well to glasses, thought. :)) Anyway, could we just use
"UNKNOWN", which is more explicit and true reflects its nature?

Thanks,
Xuefu


On Fri, Oct 18, 2019 at 7:51 AM Timo Walther  wrote:

> Hi everyone,
>
> Stephan pointed out that our naming of a generic/blackbox/opaque type in
> SQL might not be intuitive for users. The term ANY rather describes a
> "super-class of all types", which is not the case in our type system. Our
> current ANY type stands for a type that is just a blackbox within SQL,
> serialized by some custom serializer, that can only be modified within
> UDFs.
>
> I also gathered feedback from a training instructor and native English
> speaker (David in CC) where I received the following:
>
> "The way I’m thinking about this is this: there’s a concept here that
> people have to become aware of, which is that Flink SQL is able to
> operate generically on opaquely typed things — and folks need to be able
> to connect what they see in code examples, etc. with this concept (which
> they may be unaware of initially).
> I feel like ANY misses the mark a little bit, but isn’t particularly
> bad. I do worry that it may cause some confusion about its purpose and
> power. I think OPAQUE would more clearly express what’s going on."
>
> Also resources like Wikipedia [1] show that this terminology is common:
>
> "a data type whose concrete data structure is not defined [...] its
> values can only be manipulated by calling subroutines that have access
> to the missing information"
>
> I would therefore vote for refactoring the type name because it is not
> used much yet.
>
> Implications are:
>
> - a new parser keyword "OPAQUE" and changed SQL parser
>
> - changes for logical type root, logical type visitors, and their usages
>
> What do you think?
>
> Thanks,
>
> Timo
>
> [1] https://en.wikipedia.org/wiki/Opaque_data_type
>
>
>

-- 
Xuefu Zhang

"In Honey We Trust!"


Re: [DISCUSS] FLIP-59: Enable execution configuration from Configuration object

2019-10-18 Thread Aljoscha Krettek
Hi,

In general, I’m also for “execution” compared to just “exec”. For some of these
options, though, I’m wondering whether “pipeline.” or “job.”
makes more sense. Over time, a lot of things have accumulated in
ExecutionConfig, but many of them are not execution-related, I think. For
example, auto-type-registration would make more sense as
“pipeline.auto-type-registration”. For some other options, I think we should
consider not exposing them via the configuration if we don’t want
to keep them in the long term.

I’ll try to categorise what I think:

Don’t expose:
 - defaultInputDependencyConstraint (I think this is an internal flag for the 
Blink runner)
 - executionMode (I think this is also Blink internals)
 - printProgressDuringExecution (I don’t know if this flag still does anything)

Maybe don’t expose:
 - defaultKryoSerializerClasses
 - setGlobalJobParameters (if we expose it it should be “pipeline”)

pipeline/job:
 - autoTypeRegistration
 - autoWatermarkInterval
 - closureCleaner
 - disableGenericTypes
 - enableAutoGeneratedUIDs
 - forceAvro
 - forceKryo
 - setMaxParallelism
 - setParallelism
 - objectReuse (this one is hard, could be execution)
 - registeredKryoTypes
 - registeredPojoTypes
 - timeCharacteristic
 - isChainingEnabled
 - cachedFile

execution:
 - latencyTrackingInterval
 - setRestartStrategy
 - taskCancellationIntervalMillis
 - taskCancellationTimeoutMillis
 - bufferTimeout

checkpointing: (this might be “execution.checkpointing”)
 - useSnapshotCompression
 - 
 - defaultStateBackend
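
To make the proposed grouping concrete, a hypothetical flink-conf.yaml fragment under this scheme (all key names below are illustrative, not decided):

    # pipeline.*: properties of the program itself
    pipeline.auto-watermark-interval: 200 ms
    pipeline.max-parallelism: 128
    pipeline.object-reuse: false
    # execution.*: properties of one particular execution
    execution.buffer-timeout: 100 ms
    execution.checkpointing.snapshot-compression: true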

What do you think?

Best,
Aljoscha


> On 17. Oct 2019, at 09:32, Timo Walther  wrote:
> 
> Sounds good to me.
> 
> Thanks,
> 
> Timo
> 
> 
> On 17.10.19 09:30, Kostas Kloudas wrote:
>> Hi Timo,
>> 
>> I agree that distinguishing between "executor" and "execution" when
>> scanning through a configuration file can be difficult. These names
>> were mainly influenced by the fact that FLIP-73 introduced the
>> "Executor".
>> In addition, I agree that "deployment" or "deploy" sound like good
>> alternatives. Between the two, I would go with "deployment" (although
>> I like "deploy" more, as it is more imperative) for the simple
>> reason that we do not use verbs anywhere else (I think) in config
>> options.
>> 
>> Now for the "exec" or "execution", personally I like the longer
>> version as it is clearer.
>> 
>> So, to summarise, I would vote for "deployment", "execution", and
>> "pipeline" for job invariants, like the jars.
>> 
>> What do you think?
>> 
>> Cheers,
>> Kostas
>> 
>> On Wed, Oct 16, 2019 at 5:28 PM Timo Walther  wrote:
>>> Hi Kostas,
>>> 
>>> can we still discuss the naming of the properties? For me, having
>>> "execution" and "exector" as prefixes might be confusing in the future
>>> and difficult to identify if you scan through a list of properties.
>>> 
>>> How about `deployment` and `execution`? Or `deployer` and `exec`?
>>> 
>>> Regards,
>>> Timo
>>> 
>>> On 16.10.19 16:31, Kostas Kloudas wrote:
 Hi all,
 
 Thanks for opening the discussion!
 
 I like the idea, so +1 from my side and actually this is aligned with
 our intentions for the FLIP-73 effort.
 
 For the naming convention of the parameters introduced in the FLIP, my
 proposal would be to have the full word "execution" instead of the
 shorter "exec".
 The reason for this is that in the context of FLIP-73, we are also
 planning to introduce some new configuration parameters and the
 convention we
 are currently using is the following:
 
 pipeline.***: for job parameters that will not change between
 executions of the same job, e.g. the jar location
 executor.***: for parameters relevant to the instantiation of the
 correct executor, e.g. YARN, detached, etc
 execution.***: for parameters that are relevant to a specific
 execution of a given pipeline, e.g. parallelism or savepoint settings
 
 I understand that sometimes the boundaries may not be that clear for a
 parameter, but I hope this will not be an issue for most of the
 parameters.
 
 I will also open a FLIP with some additional parameters, but until then,
 this is the scheme that we are planning to follow.
 
 Cheers,
 Kostas
 
 
 
 On Mon, Sep 2, 2019 at 9:26 AM Dawid Wysakowicz  
 wrote:
> Hi Gyula,
> 
> Yes, you are right, we were also considering the external configurer. The
> reason we suggest the built-in method is that it is more tightly coupled
> with the place the options are actually set. Therefore our hope is that,
> whenever somebody e.g. adds new fields to the ExecutionConfig, he/she
> also updates the configure method. I am not entirely against your
> suggestion though, if this is the preferred way in the community.
> 
> Does anyone have any comments regarding the option keys?
> 
> Best,
> 
> Dawid
> 
> On 30/08/2019 14:57, Gyula Fóra wrote:
>> Hi 

[DISCUSS] Rename the SQL ANY type to OPAQUE type

2019-10-18 Thread Timo Walther

Hi everyone,

Stephan pointed out that our naming of a generic/blackbox/opaque type in 
SQL might not be intuitive for users. The term ANY rather describes a 
"super-class of all types", which is not the case in our type system. Our 
current ANY type stands for a type that is just a blackbox within SQL, 
serialized by some custom serializer, that can only be modified within UDFs.


I also gathered feedback from a training instructor and native English 
speaker (David in CC) where I received the following:


"The way I’m thinking about this is this: there’s a concept here that 
people have to become aware of, which is that Flink SQL is able to 
operate generically on opaquely typed things — and folks need to be able 
to connect what they see in code examples, etc. with this concept (which 
they may be unaware of initially).
I feel like ANY misses the mark a little bit, but isn’t particularly 
bad. I do worry that it may cause some confusion about its purpose and 
power. I think OPAQUE would more clearly express what’s going on."


Also resources like Wikipedia [1] show that this terminology is common:

"a data type whose concrete data structure is not defined [...] its 
values can only be manipulated by calling subroutines that have access 
to the missing information"
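
To make the distinction concrete, an illustrative pair of queries against a table with an opaquely typed column (hypothetical schema and function; not finalized syntax):

    -- 'payload' is a blackbox: the engine can carry it around and hand it to
    -- a UDF that knows the concrete class, but cannot interpret it itself.
    SELECT id, myUdf(payload) FROM events;  -- OK: the UDF has the missing information
    SELECT id, payload + 1 FROM events;     -- fails: built-in operators cannot see inside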


I would therefore vote for refactoring the type name because it is not 
used much yet.


Implications are:

- a new parser keyword "OPAQUE" and changed SQL parser

- changes for logical type root, logical type visitors, and their usages

What do you think?

Thanks,

Timo

[1] https://en.wikipedia.org/wiki/Opaque_data_type




[jira] [Created] (FLINK-14457) Shift down ClusterClient#configuration

2019-10-18 Thread Zili Chen (Jira)
Zili Chen created FLINK-14457:
-

 Summary: Shift down ClusterClient#configuration
 Key: FLINK-14457
 URL: https://issues.apache.org/jira/browse/FLINK-14457
 Project: Flink
  Issue Type: Sub-task
  Components: Client / Job Submission
Reporter: Zili Chen
Assignee: Zili Chen
 Fix For: 1.10.0


Towards a {{ClusterClient}} interface. A follow-up could be to figure out what 
configuration is actually used. I suspect we don't have to hold this 
configuration object.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14456) Remove or shift down all fields in ClusterClient

2019-10-18 Thread Zili Chen (Jira)
Zili Chen created FLINK-14456:
-

 Summary: Remove or shift down all fields in ClusterClient
 Key: FLINK-14456
 URL: https://issues.apache.org/jira/browse/FLINK-14456
 Project: Flink
  Issue Type: Sub-task
  Components: Client / Job Submission
Reporter: Zili Chen
Assignee: Zili Chen
 Fix For: 1.10.0


Towards a {{ClusterClient}} interface.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [REMINDER] Ensuring build stability

2019-10-18 Thread Jark Wu
+1 to encouraging every developer to subscribe to builds@f.a.o, and any
suggestions about the build notifications are welcome.

I think everyone should pay more attention to failed builds, not only
those who merged commits. The build is still not stable.
If you find a failed test case in the failed builds, you can search
JIRA [1] by the test name to see whether an issue has already been
created.
If not, it would be great to create one (and cc whoever is familiar
with the area, if you know).
If there is one, you can attach the link of the new failed build log to the
JIRA issue to give more clues.

Best,
Jark

[1]:
https://issues.apache.org/jira/browse/FLINK-14402?jql=project%20%3D%20FLINK%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20text%20~%20%22testOneToOneAtLeastOnceRegularSink%22%20ORDER%20BY%20createdDate%20DESC
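
For readability, the JQL encoded in [1] above is:

    project = FLINK
      AND status in (Open, "In Progress", Reopened)
      AND text ~ "testOneToOneAtLeastOnceRegularSink"
    ORDER BY createdDate DESC

Swapping in the name of the test you are investigating gives the same search for any other failure.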

On Fri, 18 Oct 2019 at 20:45, Gary Yao  wrote:

> Hi community,
>
> We created the bui...@flink.apache.org mailing list as an effort to be
> more
> transparent about build instabilities, especially about issues that were
> only surfacing in CRON builds at the time, such as Scala 2.12 and Java 9
> compilation errors.
>
> While the situation has gotten better, there are still occasional issues
> concerning the CRON builds that are reported late (e.g., [1][2]). The
> mailing
> list was created almost 2 months ago. However, last time Chesnay checked,
> there were only 21 subscribers. To maintain the high quality of our
> releases,
> we should aim at keeping the master and the release branches stable at all
> times. Therefore, I encourage everyone who is developing Flink to monitor
> the
> builds mailing list. To subscribe, all you need to do is to send an empty
> email to:
>
> builds-subscr...@flink.apache.org
>
> Best,
> Gary
>
> [1] https://issues.apache.org/jira/browse/FLINK-14186
> [2] https://issues.apache.org/jira/browse/FLINK-14226
>


[REMINDER] Ensuring build stability

2019-10-18 Thread Gary Yao
Hi community,

We created the bui...@flink.apache.org mailing list as an effort to be more
transparent about build instabilities, especially about issues that were
only
surfacing in CRON builds at the time, such as Scala 2.12 and Java 9
compilation errors.

While the situation has gotten better, there are still occasional issues
concerning the CRON builds that are reported late (e.g., [1][2]). The
mailing
list was created almost 2 months ago. However, last time Chesnay checked,
there were only 21 subscribers. To maintain the high quality of our
releases,
we should aim at keeping the master and the release branches stable at all
times. Therefore, I encourage everyone who is developing Flink to monitor
the
builds mailing list. To subscribe, all you need to do is to send an empty
email to:

builds-subscr...@flink.apache.org

Best,
Gary

[1] https://issues.apache.org/jira/browse/FLINK-14186
[2] https://issues.apache.org/jira/browse/FLINK-14226


[jira] [Created] (FLINK-14455) Improve join reorder feature without table statistics

2019-10-18 Thread Leonard Xu (Jira)
Leonard Xu created FLINK-14455:
--

 Summary: Improve join reorder feature without table statistics
 Key: FLINK-14455
 URL: https://issues.apache.org/jira/browse/FLINK-14455
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner
Affects Versions: 1.9.0
Reporter: Leonard Xu


TPC-DS query85.sql fails when TABLE_OPTIMIZER_JOIN_REORDER_ENABLED is 
enabled, and succeeds when it is disabled.
I analyzed the optimized plan and think it is a bad case for the join reorder 
feature.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[RESULT] [VOTE] Release 1.9.1, release candidate #1

2019-10-18 Thread Jark Wu
Hi everyone,

I'm happy to announce that we have unanimously approved this release.

There are 6 approving votes, 3 of which are binding:
* Jark
* Jingsong
* Hequn
* Kurt (binding)
* Till (binding)
* Jincheng (binding)

There are no disapproving votes.

Thanks everyone!

Cheers,
Jark


Re: [VOTE] Release 1.9.1, release candidate #1

2019-10-18 Thread Jark Wu
Thanks everyone for checking the release and voting!
I’ll summarize the voting result in a separate email.

Cheers,
Jark

On Fri, 18 Oct 2019 at 16:41, jincheng sun  wrote:

> +1(binding)
> - Verified signatures and checksums.
> - Verified a local flink cluster and ran the streaming WordCount example
> without suspicious output.
> - Verified all POM files point to the right version.
>
> Best,
> Jincheng
>
Till Rohrmann wrote on Thu, Oct 17, 2019 at 11:45 PM:
>
> > +1 (binding)
> >
> > - Verified signatures and checksums
> > - Source archive does not contain any new binaries
> > - Successfully built Flink from sources for Hadoop 2.7.5 and Scala 2.12
> > - Verified that no new dependencies have been added
> > - Verified that the last release-1.9 cron job passed. The cron job being
> > executed on the 03.10. had one failing profile where no output was
> > generated during maven-remote-resources-plugin:1.5:process. I think this
> is
> > not a problem.
> > - Started a local flink cluster and ran the streaming WordCount example
> > - Verified that logs and out file don't contain suspicious output
> > modulo FLINK-14385
> >
> > Cheers,
> > Till
> >
> > On Mon, Oct 14, 2019 at 3:01 PM Kurt Young  wrote:
> >
> > > +1
> > >
> > > - Verify that the source archives do not contain any binaries
> > > - Start the cluster locally and ran some examples successfully
> > >
> > > Best,
> > > Kurt
> > >
> > >
> > > On Mon, Oct 14, 2019 at 4:32 AM Jark Wu  wrote:
> > >
> > > > Thanks @Hequn and @Yun Tang, I set the fixVersion of FLINK-14385 to
> > 1.8.3
> > > > and 1.9.2.
> > > >
> > > > Btw, I would really appreciate it if any PMC member could help with the
> > > > verification.
> > > > I hope we can release 1.9.1 this week.
> > > > The vote will still be open until we reach at least 3 binding votes
> or
> > > any
> > > > blocker issue is found.
> > > >
> > > > Thanks,
> > > > Jark
> > > >
> > > > > On Oct 14, 2019, at 02:11, Yun Tang wrote:
> > > > >
> > > > > I think FLINK-14385 <https://issues.apache.org/jira/browse/FLINK-14385>
> > > > was not introduced in release-1.9 but has existed since we decided to release
> > > > hadoop-free Flink from 1.8 [1]
> > > > >
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/FLINK-11266
> > > > >
> > > > > Best
> > > > > Yun Tang
> > > > > 
> > > > > From: Hequn Cheng 
> > > > > Sent: Sunday, October 13, 2019 17:28
> > > > > To: dev 
> > > > > Subject: Re: [VOTE] Release 1.9.1, release candidate #1
> > > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > Did the following checks and all succeeded.
> > > > > - Verified signatures and hashes.
> > > > > - Built from the source archive.
> > > > > - Check repository contains all artifacts.
> > > > > - Test WordCount on a local standalone cluster.
> > > > >a. Both streaming and batch
> > > > >b. Web UI works fine
> > > > > - Test WordCount on yarn cluster, with 4 nodes.
> > > > >a. Both streaming and batch
> > > > >b. Web UI works fine
> > > > >c. Read write hdfs files.
> > > > >d. Test session mode and non-session mode.
> > > > >
> > > > > Minor comments (not blockers)
> > > > > - One comment about the website pr.
> > > > > - Found an issue that there is a
> > > > > ClassNotFoundException (java.lang.ClassNotFoundException:
> > > > > org.apache.hadoop.yarn.exceptions.YarnException) when running the setup
> > > > > example (under Local Flink Cluster). It may be confusing for new users and
> > > > > would be nice to improve. (
> > > > > https://issues.apache.org/jira/browse/FLINK-14385)
> > > > >
> > > > > Best, Hequn
> > > > >
> > > > > On Sat, Oct 12, 2019 at 6:01 PM Jark Wu  wrote:
> > > > >
> > > > >> Hi Jingsong,
> > > > >>
> > > > >> Thanks for verifying. I updated the fixVersion to 1.9.2 for these
> > > > issues.
> > > > >>
> > > > >> Best,
> > > > >> Jark
> > > > >>
> > > > >>> On Oct 12, 2019, at 16:45, Jingsong Li wrote:
> > > > >>>
> > > > >>> +1 (non-binding)
> > > > >>>
> > > > >>> - Check if checksums files match the corresponding release files
> > > > >>> - Check if GPG files match the corresponding release files
> > > > >>> - Verify that the source archives do not contain any binaries
> > > > >>> - Build the source with Maven to ensure all source files have
> > Apache
> > > > >> headers
> > > > >>> - Check that all POM files point to the same version (1.9.1)
> > > > >>> - Start a local cluster with both Scala 2.11 and 2.12, and shut down;
> > > > >>> verified out and log files, verified the web UI, ran examples.
> > > > >>> All succeeded.
> > > > >>>
> > > > >>> Hi Jark, there are some JIRA issues that still use fix version 1.9.0;
> > > > >>> do you need to modify their fix version?
> > > > >>> https://issues.apache.org/jira/browse/FLINK-14328
> > > > >>> https://issues.apache.org/jira/browse/FLINK-14327
> > > > >>> https://issues.apache.org/jira/browse/FLINK-14215
> > > > >>> https://issues.apache.org/jira/browse/FLINK-14072
> > > > >>> https://issues.apache.org/jira/browse/FLINK-12576
> > >

[jira] [Created] (FLINK-14454) Usages of SavepointSerializers.setFailWhenLegacyStateDetected are never reverted

2019-10-18 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-14454:


 Summary: Usages of 
SavepointSerializers.setFailWhenLegacyStateDetected are never reverted
 Key: FLINK-14454
 URL: https://issues.apache.org/jira/browse/FLINK-14454
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.8.0
Reporter: Chesnay Schepler


The {{AbstractOperatorRestoreTestBase}} and {{SavepointMigrationTestBase}} use 
{{SavepointSerializers#setFailWhenLegacyStateDetected}} to disable the logic 
that fails reading savepoints when they contain legacy state.

This logic is not re-enabled after the test.
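
A sketch of the pattern at issue; the setter is the one named in the description, while the test-lifecycle framing is an assumed illustration of the problem and a possible fix, not the actual test code:

    import org.junit.AfterClass;
    import org.junit.BeforeClass;

    // The test bases flip a static, JVM-wide switch before the tests run...
    @BeforeClass
    public static void disableLegacyStateCheck() {
        SavepointSerializers.setFailWhenLegacyStateDetected(false);
    }

    // ...but a restoring counterpart like this is missing, so later tests in
    // the same JVM fork silently inherit the relaxed behavior.
    @AfterClass
    public static void restoreLegacyStateCheck() {
        SavepointSerializers.setFailWhenLegacyStateDetected(true);
    }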



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14453) Support building pipelined regions from base topology

2019-10-18 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-14453:
---

 Summary: Support building pipelined regions from base topology
 Key: FLINK-14453
 URL: https://issues.apache.org/jira/browse/FLINK-14453
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Affects Versions: 1.10.0
Reporter: Zhu Zhu
 Fix For: 1.10.0


Previously, pipelined regions could only be built from FailoverTopology. With 
FLINK-14451, which extends FailoverTopology from the base topology, 
PipelinedRegionComputeUtil can be modified a bit to be able to build regions from 
the base topology.
This is a prerequisite of FLINK-14312, and it also enables SchedulingTopology to 
provide pipelined region info.

For more details, see FLINK-14330 and the [design 
doc|https://docs.google.com/document/d/1f88luAOfUQ6Pm4JkxYexLXpfH-crcXJdbubi1pS2Y5A/edit#].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14452) Keep only one execution topology in scheduler

2019-10-18 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-14452:
---

 Summary: Keep only one execution topology in scheduler
 Key: FLINK-14452
 URL: https://issues.apache.org/jira/browse/FLINK-14452
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Affects Versions: 1.10.0
Reporter: Zhu Zhu
 Fix For: 1.10.0


With FLINK-14450 and FLINK-14451, the SchedulingTopology and FailoverTopology 
are able to share one default implementation. We can change the scheduler to 
create and keep only one such execution topology instance to save computation 
and memory cost.

For more details, see FLINK-14330 and the [design 
doc|https://docs.google.com/document/d/1f88luAOfUQ6Pm4JkxYexLXpfH-crcXJdbubi1pS2Y5A/edit#].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14451) Change FailoverTopology to extend base topology

2019-10-18 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-14451:
---

 Summary: Change FailoverTopology to extend base topology
 Key: FLINK-14451
 URL: https://issues.apache.org/jira/browse/FLINK-14451
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Affects Versions: 1.10.0
Reporter: Zhu Zhu
 Fix For: 1.10.0


This task is to change FailoverTopology to extend the base topology 
introduced in FLINK-14330. ExecutionGraphToSchedulingTopologyAdapter (the default 
implementation of SchedulingTopology) should also implement FailoverTopology and 
replace DefaultFailoverTopology.

For more details, see FLINK-14330 and the [design 
doc|https://docs.google.com/document/d/1f88luAOfUQ6Pm4JkxYexLXpfH-crcXJdbubi1pS2Y5A/edit#].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14450) Change SchedulingTopology to extend base topology

2019-10-18 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-14450:
---

 Summary: Change SchedulingTopology to extend base topology
 Key: FLINK-14450
 URL: https://issues.apache.org/jira/browse/FLINK-14450
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Affects Versions: 1.10.0
Reporter: Zhu Zhu
 Fix For: 1.10.0


This task is to change SchedulingTopology to extend the base topology 
introduced in FLINK-14330.

For more details, see FLINK-14330 and the [design 
doc|https://docs.google.com/document/d/1f88luAOfUQ6Pm4JkxYexLXpfH-crcXJdbubi1pS2Y5A/edit#].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [NOTICE] Binary licensing is now auto-generated

2019-10-18 Thread Aljoscha Krettek
Nice! :-)

> On 18. Oct 2019, at 11:13, Zhu Zhu  wrote:
> 
> Thanks Chesnay for this improvement!
> 
> vino yang wrote on Fri, Oct 18, 2019 at 10:29 AM:
> 
>> Thanks Chesnay! Great job!
>> 
>> Best,
>> Vino
>> 
>> Jark Wu wrote on Fri, Oct 18, 2019 at 9:51 AM:
>> 
>>> Thanks Chesnay! This is really a great job!
>>> 
>>> Best,
>>> Jark
>>> 
 On Oct 17, 2019, at 22:03, 未来阳光 <2217232...@qq.com> wrote:
 
 Thanks for this improvement Chesnay !
 
 
 
 
 ---Original---
 From: "Chesnay Schepler">>> Date: Thu, Oct 17, 2019 21:37 PM
 To: "dev@flink.apache.org">>> Subject: [NOTICE] Binary licensing is now auto-generated
 
 
 Hello,
 
 I just merged FLINK-14008 to 1.8, 1.9 and 1.10, which means that from
 now on the tricky part of the binary licensing (NOTICE-binary,
 licenses-binary) is automatically generated during the release process.
 
 As such these files have been removed from the root directory of the
 project (thus, you don't have to update these things anymore ;)).
 
 This also means that only builds of flink-dist that were built as part
 of the release process will have these files attached.
 
 I have updated the Licensing guide
>>> accordingly.
>>> 
>>> 
>> 



[jira] [Created] (FLINK-14449) SavepointMigrationTestBase deadline should be set up in the test

2019-10-18 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-14449:


 Summary: SavepointMigrationTestBase deadline should be set up in 
the test
 Key: FLINK-14449
 URL: https://issues.apache.org/jira/browse/FLINK-14449
 Project: Flink
  Issue Type: Sub-task
  Components: Tests
Affects Versions: 1.8.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.10.0


The {{SavepointMigrationTestBase}} contains a {{static final Deadline}} that is 
used in all tests. In practice this means that the deadline is quite 
unreliable, since it is set up when the class is initialized, as opposed to when 
any test is run.

If fork-reuse is enabled the tests consistently fail with a timeout for this 
reason.
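
A sketch of the difference; the five-minute timeout and the field names are illustrative:

    import java.time.Duration;
    import org.apache.flink.api.common.time.Deadline;
    import org.junit.Before;

    // Unreliable: the countdown starts when the class is initialized,
    // which can be long before an individual test actually runs.
    private static final Deadline DEADLINE = Deadline.fromNow(Duration.ofMinutes(5));

    // Proposed: arm the deadline per test instead.
    private Deadline deadline;

    @Before
    public void setUpDeadline() {
        deadline = Deadline.fromNow(Duration.ofMinutes(5));
    }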



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14448) Enable fork-reuse for flink-tests

2019-10-18 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-14448:


 Summary: Enable fork-reuse for flink-tests
 Key: FLINK-14448
 URL: https://issues.apache.org/jira/browse/FLINK-14448
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.10.0


Umbrella issue for enabling fork-reuse in flink-tests to speed up test 
execution.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[DISCUSS] FLIP-80: Expression String Serializable and Deserializable

2019-10-18 Thread Jark Wu
Hi everyone,

I would like to start a discussion [1] about how to make Expressions
serializable to and deserializable from strings. Expression is the general
interface for all kinds of expressions in Flink Table API & SQL; it represents
a logical tree for producing a computation result. In FLIP-66 [2] and
FLIP-70 [3], we introduced watermark and computed column syntax in DDL. The
watermark strategy and computed columns are both represented as Expressions.
In order to persist watermark and computed column information in the catalog,
we need to figure out how to persist and restore Expressions.
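
For context, these are the kinds of expressions that need to survive a catalog round trip, written in the DDL syntax introduced by FLIP-66/70 (connector properties omitted):

    CREATE TABLE orders (
      price DOUBLE,
      quantity INT,
      cost AS price * quantity,                      -- computed column: an Expression
      ts TIMESTAMP(3),
      WATERMARK FOR ts AS ts - INTERVAL '5' SECOND   -- watermark strategy: an Expression
    ) WITH (
      ...
    );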

FLIP-80:
https://docs.google.com/document/d/1LxPEzbPuEVWNixb1L_USv0gFgjRMgoZuMsAecS_XvdE/edit?usp=sharing

Thanks for any feedback!

Best,
Jark

[1]:
https://docs.google.com/document/d/1LxPEzbPuEVWNixb1L_USv0gFgjRMgoZuMsAecS_XvdE/edit?usp=sharing
[2]:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-66%3A+Support+time+attribute+in+SQL+DDL
[3]:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-70%3A+Flink+SQL+Computed+Column+Design


[jira] [Created] (FLINK-14447) Network metrics doc table render confusion

2019-10-18 Thread vinoyang (Jira)
vinoyang created FLINK-14447:


 Summary: Network metrics doc table render confusion
 Key: FLINK-14447
 URL: https://issues.apache.org/jira/browse/FLINK-14447
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Reporter: vinoyang


More detail: 
https://ci.apache.org/projects/flink/flink-docs-master/monitoring/metrics.html#network-deprecated-use-default-shuffle-service-metrics



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Drop Python 2 support for 1.10

2019-10-18 Thread Dian Fu
Hi all,

Thank you all for the votes.

So far we have received 4 +1 votes: 3 binding (Jincheng, Hequn, Bowen) and 1 
non-binding (Vino), and there are no -1 votes.

Therefore, I'm glad to announce that the proposal "Drop Python 2 support for 
1.10" has passed.

Thanks,
Dian

> On Oct 16, 2019, at 3:07 PM, vino yang wrote:
> 
> +1
> 
> Bowen Li wrote on Wed, Oct 16, 2019 at 5:12 AM:
> 
>> +1
>> 
>> On Sun, Oct 13, 2019 at 10:54 PM Hequn Cheng  wrote:
>> 
>>> +1
>>> 
>>> Thanks a lot for driving this, Dian!
>>> 
>>> On Mon, Oct 14, 2019 at 1:46 PM jincheng sun 
>>> wrote:
>>> 
 +1
 
 Dian Fu wrote on Mon, Oct 14, 2019 at 1:21 PM:
 
> Hi all,
> 
> I would like to start the vote for "Drop Python 2 support for 1.10",
> which was discussed and reached consensus in the discussion thread [1].
> 
> The vote will be open for at least 72 hours. Unless there is an
 objection,
> I will try to close it by Oct 17, 2019 18:00 UTC if we have received
> sufficient votes.
> 
> Regards,
> Dian
> 
> [1]
> 
 
>>> 
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Drop-Python-2-support-for-1-10-td33824.html
> <
> 
 
>>> 
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Drop-Python-2-support-for-1-10-td33824.html
>> 
 
>>> 
>> 



Re: [NOTICE] Binary licensing is now auto-generated

2019-10-18 Thread Zhu Zhu
Thanks Chesnay for this improvement!

vino yang wrote on Fri, Oct 18, 2019 at 10:29 AM:

> Thanks Chesnay! Great job!
>
> Best,
> Vino
>
> Jark Wu wrote on Fri, Oct 18, 2019 at 9:51 AM:
>
> > Thanks Chesnay! This is really a great job!
> >
> > Best,
> > Jark
> >
> > > On Oct 17, 2019, at 22:03, 未来阳光 <2217232...@qq.com> wrote:
> > >
> > > Thanks for this improvement Chesnay !
> > >
> > >
> > >
> > >
> > > ---Original---
> > > From: "Chesnay Schepler" > > Date: Thu, Oct 17, 2019 21:37 PM
> > > To: "dev@flink.apache.org" > > Subject: [NOTICE] Binary licensing is now auto-generated
> > >
> > >
> > > Hello,
> > >
> > > I just merged FLINK-14008 to 1.8, 1.9 and 1.10, which means that from
> > > now on the tricky part of the binary licensing (NOTICE-binary,
> > > licenses-binary) is automatically generated during the release process.
> > >
> > > As such these files have been removed from the root directory of the
> > > project (thus, you don't have to update these things anymore ;)).
> > >
> > > This also means that only builds of flink-dist that were built as part
> > > of the release process will have these files attached.
> > >
> > > I have updated the Licensing guide
> > accordingly.
> >
> >
>


Re: [VOTE] Release 1.9.1, release candidate #1

2019-10-18 Thread jincheng sun
+1(binding)
- Verified signatures and checksums.
- Verified a local flink cluster and ran the streaming WordCount example
without suspicious output.
- Verified all POM files point to the right version.

Best,
Jincheng

Till Rohrmann wrote on Thu, Oct 17, 2019 at 11:45 PM:

> +1 (binding)
>
> - Verified signatures and checksums
> - Source archive does not contain any new binaries
> - Successfully built Flink from sources for Hadoop 2.7.5 and Scala 2.12
> - Verified that no new dependencies have been added
> - Verified that the last release-1.9 cron job passed. The cron job being
> executed on the 03.10. had one failing profile where no output was
> generated during maven-remote-resources-plugin:1.5:process. I think this is
> not a problem.
> - Started a local flink cluster and ran the streaming WordCount example
> - Verified that logs and out file don't contain suspicious output
> modulo FLINK-14385
>
> Cheers,
> Till
>
> On Mon, Oct 14, 2019 at 3:01 PM Kurt Young  wrote:
>
> > +1
> >
> > - Verify that the source archives do not contain any binaries
> > - Start the cluster locally and ran some examples successfully
> >
> > Best,
> > Kurt
> >
> >
> > On Mon, Oct 14, 2019 at 4:32 AM Jark Wu  wrote:
> >
> > > Thanks @Hequn and @Yun Tang, I set the fixVersion of FLINK-14385 to
> 1.8.3
> > > and 1.9.2.
> > >
> > > Btw, I would really appreciate it if any PMC member could help with the
> > > verification.
> > > I hope we can release 1.9.1 this week.
> > > The vote will still be open until we reach at least 3 binding votes or
> > any
> > > blocker issue is found.
> > >
> > > Thanks,
> > > Jark
> > >
> > > > On Oct 14, 2019, at 02:11, Yun Tang wrote:
> > > >
> > > > I think FLINK-14385 <https://issues.apache.org/jira/browse/FLINK-14385>
> > > was not introduced in release-1.9 but has existed since we decided to release
> > > hadoop-free Flink from 1.8 [1]
> > > >
> > > >
> > > > [1] https://issues.apache.org/jira/browse/FLINK-11266
> > > >
> > > > Best
> > > > Yun Tang
> > > > 
> > > > From: Hequn Cheng 
> > > > Sent: Sunday, October 13, 2019 17:28
> > > > To: dev 
> > > > Subject: Re: [VOTE] Release 1.9.1, release candidate #1
> > > >
> > > > +1 (non-binding)
> > > >
> > > > Did the following checks and all succeeded.
> > > > - Verified signatures and hashes.
> > > > - Built from the source archive.
> > > > - Check repository contains all artifacts.
> > > > - Test WordCount on a local standalone cluster.
> > > >a. Both streaming and batch
> > > >b. Web UI works fine
> > > > - Test WordCount on yarn cluster, with 4 nodes.
> > > >a. Both streaming and batch
> > > >b. Web UI works fine
> > > >c. Read write hdfs files.
> > > >d. Test session mode and non-session mode.
> > > >
> > > > Minor comments (not blockers)
> > > > - One comment about the website pr.
> > > > - Found an issue that there is a
> > > > ClassNotFoundException (java.lang.ClassNotFoundException:
> > > > org.apache.hadoop.yarn.exceptions.YarnException) when running the setup
> > > > example (under Local Flink Cluster). It may be confusing for new users and
> > > > would be nice to improve. (
> > > > https://issues.apache.org/jira/browse/FLINK-14385)
> > > >
> > > > Best, Hequn
> > > >
> > > > On Sat, Oct 12, 2019 at 6:01 PM Jark Wu  wrote:
> > > >
> > > >> Hi Jingsong,
> > > >>
> > > >> Thanks for verifying. I updated the fixVersion to 1.9.2 for these
> > > issues.
> > > >>
> > > >> Best,
> > > >> Jark
> > > >>
> > > >>> On Oct 12, 2019, at 16:45, Jingsong Li wrote:
> > > >>>
> > > >>> +1 (non-binding)
> > > >>>
> > > >>> - Check if checksums files match the corresponding release files
> > > >>> - Check if GPG files match the corresponding release files
> > > >>> - Verify that the source archives do not contain any binaries
> > > >>> - Build the source with Maven to ensure all source files have
> Apache
> > > >> headers
> > > >>> - Check that all POM files point to the same version (1.9.1)
> > > >>> - Start a local cluster with both Scala 2.11 and 2.12, and shut down;
> > > >>> verified out and log files, verified the web UI, ran examples.
> > > >>> All succeeded.
> > > >>>
> > > >>> Hi Jark, there are some JIRA issues that still use fix version 1.9.0;
> > > >>> do you need to modify their fix version?
> > > >>> https://issues.apache.org/jira/browse/FLINK-14328
> > > >>> https://issues.apache.org/jira/browse/FLINK-14327
> > > >>> https://issues.apache.org/jira/browse/FLINK-14215
> > > >>> https://issues.apache.org/jira/browse/FLINK-14072
> > > >>> https://issues.apache.org/jira/browse/FLINK-12576
> > > >>>
> > > >>> Best,
> > > >>> Jingsong Lee
> > > >>>
> > > >>>
> > > >>> On Wed, Oct 9, 2019 at 3:32 PM Jark Wu  wrote:
> > > >>>
> > >  +1 from my side.
> > > 
> > >  - checked signatures and hashes
> > >  - checked that all POM files point to the same version
> > >  - verified that the source archives do not contain any binaries
> > >  - built the source release with Scala 2.12 and Scala 2.11
> > successfully
> > > >>

Re: [DISCUSS] Improve Flink logging with contextual information

2019-10-18 Thread Gyula Fóra
Hi all!

Thanks for the answers; this has been very helpful, and we were able to set up a
similar scheme using the env variables.

Cheers,
Gyula

On Tue, Oct 15, 2019 at 9:55 AM Paul Lam  wrote:

> +1 to Rong’s approach. We use a similar solution to the log context
> problem
> on YARN setups. FYI.
>
> WRT container contextual information, we collect logs via ELK so that
> the log file paths (which contain the application id and container id) and
> the host
> are attached to the logs. But if you don’t want a new log collector, you
> can
> also use the system env variables in your log pattern. Flink sets the
> container
> information into the system env variables, which can be found in the
> container
> launch script.
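
A minimal illustration of that approach, assuming a Log4j 2 pattern layout and an environment variable exported by the container launch script (the variable name is illustrative):

    <!-- log4j2.xml: the env lookup resolves when the configuration loads,
         which is fine for values that are constant within one container -->
    <PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss,SSS} [${env:CONTAINER_ID}] %-5p %c - %m%n"/>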
>
> WRT job contextual information, we’ve tried MDC on task threads but it
> ended
> up with poor readability because Flink system threads do not have the
> MDC
> variables set (in my case, user info), so now we use the user name in system env as
> the logger pattern variable instead. However, for job id/name, I’m afraid
> that
> they cannot be found in the default system env variables. You may need
> to find a way to set them into the system env or system properties.
>
> Best,
> Paul Lam
>
> > 在 2019年10月15日,12:50,Rong Rong  写道:
> >
> > Hi Gyula,
> >
> > Sorry for the late reply. I think it is definitely a challenge in terms
> of
> > log visibility.
> > However, for your requirement I think you can customize your Flink job by
> > utilizing a customized log formatter/encoder (e.g. log4j.properties or
> > logback.xml) and a suitable logger implementation.
> >
> > One example you can follow is to provide customFields in your log
> encoding
> > [1,2] and utilizing a supported Appender to append your log to a file.
> > You can also utilize a more customized appender to log the data into some
> > external database (for example, ElasticSearch and access via Kibana).
> >
> > One challenge you might face is how to configure this contextual
> > information dynamically. In our setup, this contextual information is
> > configured via system env params when the job launches, so loggers can
> > dynamically resolve it at start time.
> >
> > Please let me know if any of the suggestions above helps.
> >
> > Cheers,
> > Rong
> >
> > [1]
> >
> https://github.com/logstash/logstash-logback-encoder/blob/master/src/test/resources/logback-test.xml#L13
> > [2] https://github.com/logstash/logstash-logback-encoder
> >
> > On Thu, Oct 3, 2019 at 1:56 AM Gyula Fóra  wrote:
> >
> >> Hi all!
> >>
> >> We have been thinking that it would be a great improvement to add
> >> contextual information to the Flink logs:
> >>
> >> - Container / yarn / host info to JM/TM logs
> >> - Job info (job id/ jobname) to task logs
> >>
>> I think this should be similar to how the metric scopes are set up and should
> be
> >> able to provide the same information for logs. Ideally it would be user
> >> configurable.
> >>
> >> We are wondering what would be the best way to do this, and would like
> to
> >> ask for opinions or past experiences.
> >>
> >> Our natural first thought was setting NDC / MDC in the different threads
> >> but it seems to be a somewhat fragile mechanism as it can be easily
> >> "cleared" or deleted by the user.
> >>
> >> What do you think?
> >>
> >> Gyula
> >>
>
>


Re: [VOTE] FLIP-68: Extend Core Table System with Pluggable Modules

2019-10-18 Thread Dawid Wysakowicz
Thank you Bowen for the update. Great to hear we can have just

ModuleFactory#createModule(Map<String, String>)

+1 for the FLIP. Nice design BTW ;)

Best,

Dawid


On 17/10/2019 18:36, Bowen Li wrote:
> Thanks for pointing them out, Dawid. I've gone over the overall doc again
> and corrected the above typos.
>
> - ModuleManager#listFunctions() returns Set<String>
> - ModuleManager holds a LinkedHashMap<String, Module> to keep loaded
> modules in order
> - ModuleFactory#createModule(Map<String, String>) returns Module
>
>
> On Thu, Oct 17, 2019 at 2:27 AM Dawid Wysakowicz 
> wrote:
>
>> Hi all,
>>
>> Generally I'm fine with the design. Before I cast my +1 I wanted to
>> clarify one thing. Is the module name in ModuleFactory#createModule
>> necessary? Can't it be just?:
>>
>> interface ModuleFactory extends TableFactory {
>>    Module createModule(Map<String, String> properties);
>> }
>>
>> The name under which the module was registered should not affect the
>> implementation of the module as far as I can tell. Could we remove this
>> parameter from the method?
>>
>> I also spotted a few "bugs" in the design, but they do not affect the
>> outcome of the design, as they are either just artifacts of refactoring the
>> FLIP or affect only the internal implementation:
>>
>>- there is a typo in the ModuleFactory#createModule return type. It
>>should be Module instead of Plugin
>>- the return type of ModuleManager#listFunctions() should be
>>Set<String> instead of Set<Set<String>>, right?
>>- we cannot use a list to store the modules in ModuleManager, if I am not
>>mistaken. We need to store them in a Map to e.g. be able to unload the
>>modules by their names.
>>
>> Best,
>>
>> Dawid
>> On 17/10/2019 04:16, Jark Wu wrote:
>>
>> +1
>>
>> Thanks,
>> Jark
>>
>> On Thu, 17 Oct 2019 at 04:44, Peter Huang wrote:
>>
>>
>> +1 Thanks
>>
>> On Wed, Oct 16, 2019 at 12:48 PM Xuefu Z wrote:
>>
>>
>> +1 (non-binding)
>>
>> On Wed, Oct 16, 2019 at 2:26 AM Timo Walther wrote:
>>
>>
>> +1
>>
>> Thanks,
>> Timo
>>
>>
>> On 15.10.19 20:50, Bowen Li wrote:
>>
>> Hi all,
>>
>> I'd like to kick off a voting thread for FLIP-68: Extend Core Table
>>
>> System
>>
>> with Pluggable Modules [1], as we have reached consensus in [2].
>>
>> The voting period will be open for at least 72 hours, ending at 7pm
>>
>> Oct
>>
>> 18
>>
>> UTC.
>>
>> Thanks,
>> Bowen
>>
>> [1]
>>
>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
>>
>> [2] https://www.mail-archive.com/dev@flink.apache.org/msg29894.html
>>
>> --
>> Xuefu Zhang
>>
>> "In Honey We Trust!"
>>
>>
>>


