Re: [DISCUSS] Contribute Pulsar Flink connector back to Flink

2019-09-19 Thread Becket Qin
Hi Sijie,

Yes, we will have to support existing old connectors and new connectors in
parallel for a while. We have to take that maintenance overhead because
existing connectors have been used by the users for a long time. I guess it
may take at least a year for us to fully remove the old connectors.

Process wise, we can do the same for Pulsar connector. But I am not sure if
we want to have the same burden on Pulsar connector, and I would like to
understand the benefit of doing that.

For users, the benefit of having the old Pulsar connector checked in seems
limited because 1) that code base will be immediately deprecated in the
next release in 3-4 months; 2) users can always use it even if it is not in
the Flink code base. Admittedly it is not as convenient as having it in
Flink code base, but the inconvenience doesn't seem severe either. And after 3-4 months, users
can just use the new connector in Flink repo.

For Flink developers, the old connector code base is not something that we
want to evolve later. Instead, this code will be deprecated and
removed. So why do we want to get a beta version out to attract people to
use something we don't want to maintain?

Thanks,

Jiangjie (Becket) Qin



On Fri, Sep 20, 2019 at 10:12 AM Sijie Guo  wrote:

> Thanks everyone here. Sorry for jumping into the discussion here.
>
> I am not very familiar with the deprecation process in Flink. If I have
> misunderstood the process, please correct me.
>
> As far as I understand, FLIP-27 is introducing a new unified API for
> connectors. After it introduces the new API
> and before moving all the existing connectors from the old API to the new
> API, both the old API and the new API will co-exist
> for a while until Flink moves all existing connectors to the new API. So the
> Pulsar connector (using the old API) can
> follow the deprecation process together with the other connectors using the
> old API, and with the deprecation of the old API itself, no?
>
> If that's the case, I think contributing the current connector back to
> Flink rather than maintaining it outside Flink
> would provide a few more benefits. We can deprecate the existing
> streamnative/pulsar-flink repo and point the users
> to the connector in the Flink repo. So all the review processes will happen
> within Flink for both the old connector and
> the new connector. It also reduces confusion for users, as the
> documentation and code base live in one place.
>
> Thoughts?
>
> - Sijie
>
>
>
>
> On Fri, Sep 20, 2019 at 12:53 AM Becket Qin  wrote:
>
> > Thanks for the explanation, Stephan. I have a few questions / thoughts.
> >
> > So that means we will remove the old connector without a major version
> > bump, is that correct?
> >
> > I am not 100% sure if mixing 1.10 connectors with 1.11 connectors will
> > always work because we saw some dependency class collisions in the past.
> To
> > make it safe we may have to maintain the old code for one more release.
> >
> > To be honest I am still wondering if we have to put the old connector in
> > the Flink repo. If we check in the old connector to Flink, we will end up
> > in the following situation:
> > 1. Old connector in streamnative/pulsar-flink repo.
> > 2. Old connector in Flink Repo, which may be different from the one in
> > Pulsar repo. (Added in 1.10, deprecated in 1.11, removed in 1.12)
> > 3. New connector in Flink Repo.
> >
> > We need to think about how to make the users in each case happy.
> > - For users of (1), I assume Sijie and Yijie will have to maintain the
> code
> > a bit longer for its own compatibility even after we have (2). In that
> > case, bugs found in old connector may or may not need to be fixed in both
> > Flink and the streamnative/pulsar-flink repo.
> > - For users of (2), will we provide bug fixes? If we do, it will be a
> > little awkward because those bug fixes will be immediately deprecated in
> > 1.11, and removed in 1.12. So we are essentially asking users to migrate
> > away from the bug fix. After Flink 1.12, users may still have to switch
> to
> > use (3) due to the potential dependency class conflicts mentioned above.
> > - Users of (3) have a much easier life and don't need to worry too much.
> >
> > The above story seems a little complicated to tell. I think it will be
> much
> > easier to not have (2) at all.
> > 1. Old connector in streamnative/pulsar-flink repo.
> > 3. New connector in Flink Repo.
> >
> > - Old connector will only be maintained in streamnative/pulsar-flink repo
> > until it is fully deprecated. Users can always use the existing Pulsar
> > connector in that repo.
> > - New connector will be in Flink repo and maintained like the other
> > connectors.
> >
> > This seems much simpler for users to understand and they will not be blocked
> > from using the old connector. If the concern is about the quality of the
> > connector in streamnative/pulsar-flink repo, is it enough for us just to
> > review the code in streamnative/pulsar-flink connector to make sure it
> > looks good from Flink's perspective?
> >
> > What do you think?
> >
> > 

[jira] [Created] (FLINK-14138) Show Pending Slots in Job Detail

2019-09-19 Thread Yadong Xie (Jira)
Yadong Xie created FLINK-14138:
--

 Summary: Show Pending Slots in Job Detail
 Key: FLINK-14138
 URL: https://issues.apache.org/jira/browse/FLINK-14138
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Reporter: Yadong Xie
 Attachments: 屏幕快照 2019-09-20 下午12.04.00.png, 屏幕快照 2019-09-20 
下午12.04.05.png

It is hard to troubleshoot when all subtasks stay in the SCHEDULED status
(just like the screenshot below) after users submit a job.

!屏幕快照 2019-09-20 下午12.04.00.png|width=494,height=258!

The most common reason for this problem is that a vertex has requested more
resources than the cluster can provide. A pending slots tab could help users
check which vertex or subtask is blocked.

!屏幕快照 2019-09-20 下午12.04.05.png|width=576,height=163!

 

REST API needed:

Add a /jobs/:jobid/pending-slots API to get pending slots data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14137) Show Attempt History in Vertex SubTask

2019-09-19 Thread Yadong Xie (Jira)
Yadong Xie created FLINK-14137:
--

 Summary: Show Attempt History in Vertex SubTask
 Key: FLINK-14137
 URL: https://issues.apache.org/jira/browse/FLINK-14137
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Reporter: Yadong Xie
 Attachments: 屏幕快照 2019-09-20 上午11.32.54.png, 屏幕快照 2019-09-20 
上午11.32.59.png

According to the 
[docs|https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jobs-jobid-vertices-vertexid-subtasks-subtaskindex],
there may exist more than one attempt for a subtask, but there is no way to get
the attempt history list from the REST API, so users have no way to know if the
subtask has failed before.

!屏幕快照 2019-09-20 上午11.32.54.png|width=499,height=205!

We can add an Attempt History tab under the Subtasks drawer on the job vertex
page; here is a demo below.

!屏幕快照 2019-09-20 上午11.32.59.png|width=518,height=203!

REST API needed:

Add a /jobs/:jobid/vertices/:vertexid/subtasks/:subtaskindex/attempts API to get
the attempt history.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] FLIP-56: Dynamic Slot Allocation

2019-09-19 Thread Xintong Song
I'm not sure if I understand the implementation plan you suggested
correctly. To my understanding, it seems that all the steps except for step
5 have to happen in strict order.

   - Profiles to be used in step 2 are reported in step 1.
   - The SlotProfile in TaskExecutorGateway#requestSlot in step 3 comes from
   the profiles used in step 2.
   - Only if the RM requests slots from the TM with profiles (step 3) can the
   TM do the proper bookkeeping (step 4).
   - Step 5 can be done as long as we have step 2.
   - Step 6 relies on both step 4 and step 5, for proper bookkeeping on
   both the TM and RM sides before enabling non-default profiles.

That means we can only work on the steps in the following order.
1-2-3-4-6
   \-5-/

What I'm trying to achieve with the current plan is to have most of the
implementation steps run in parallel, as follows, so that Andrey and I can
work concurrently without blocking each other too much.
1-2-3-4
   \5-6-7


I also agree that it would be good not to add too many separate code paths. I
would suggest leaving that decision to implementation time. E.g., if by
the time we do the TM-side bookkeeping, the RM side has already implemented
requesting slots with profiles, then we do not need to separate the code
paths.


To that end, I think it makes sense to adjust steps 5-7 to first use default
slot resource profiles for all the bookkeeping, and replace them with the
requested profiles at the end.


What do you think?


Thank you~

Xintong Song



On Thu, Sep 19, 2019 at 7:59 PM Till Rohrmann  wrote:

> I think besides points 1 and 3 there are no dependencies between the RM
> and TM side changes. Also, I'm not sure whether it makes sense to split the
> slot manager changes up into the proposed steps 5, 6 and 7.
>
> I would highly recommend not adding too much duplicate logic/separate code
> paths because it just adds blind spots which are probably not as well
> tested as the old code paths.
>
> Cheers,
> Till
>
> On Thu, Sep 19, 2019 at 11:58 AM Xintong Song 
> wrote:
>
> > Thanks for the comments, Till.
> >
> > - Agree on removing SlotID.
> >
> > - Regarding the implementation plan, it is true that we can possibly
> reduce
> > the code separated by the feature option. But I think to do that we need to
> > introduce more dependencies between implementation steps. With the
> current
> > plan, we can easily separate steps on the RM side and the TM side, and
> > start concurrently working on them after quickly updating the interfaces
> in
> > between. The feature will come alive when the steps on both RM/TM sides
> are
> > finished. Since we are planning to have two people (Andrey and me)
> working
> > on this FLIP, I think the current plan is probably more convenient.
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Thu, Sep 19, 2019 at 5:09 PM Till Rohrmann 
> > wrote:
> >
> > > Hi Xintong,
> > >
> > > thanks for starting the vote. The general plan looks good. Hence +1
> from
> > my
> > > side. I still have some minor comments one could think about:
> > >
> > > * As we no longer have predetermined slots on the TaskExecutor, I think
> > we
> > > can get rid of the SlotID. Instead, an allocated slot will be
> identified
> > by
> > > the AllocationID and the TaskManager's ResourceID in order to
> > differentiate
> > > duplicate registrations.
> > > * For the implementation plan, I believe there is only one tiny part on
> > the
> > > SlotManager for which we need a separate code path/feature flag which
> is
> > > how we find a matching slot. Everything else should be possible to
> > > implement in a way that it works with dynamic and static slot
> allocation:
> > > 1. Let TMs register with default slot profile at RM
> > > 2. Change SlotManager to use reported slot profiles instead of
> > > pre-calculated profiles
> > > 3. Replace SlotID with SlotProfile in TaskExecutorGateway#requestSlot
> > > 4. Extend TM to support dynamic slot allocation (aka proper
> bookkeeping)
> > > (can happen concurrently to any of steps 2-3)
> > > 5. Add bookkeeping to SlotManager (for pending TMs and registered TMs)
> > but
> > > still only use default slot profiles for matching with slot requests
> > > 6. Allow to match slot requests with reported resources instead of
> > default
> > > slot profiles (here we could use a feature flag to switch between
> dynamic
> > > and static slot allocation)
> > >
> > > Wdyt?
> > >
> > > Cheers,
> > > Till
> > >
> > > On Thu, Sep 19, 2019 at 9:45 AM Andrey Zagrebin 
> > > wrote:
> > >
> > > > Hi Xintong,
> > > >
> > > > Thanks for starting the vote, +1 from my side.
> > > >
> > > > Best,
> > > > Andrey
> > > >
> > > > On Tue, Sep 17, 2019 at 4:26 PM Xintong Song 
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I would like to start the vote for FLIP-56 [1], on which a
> consensus
> > is
> > > > > reached in this discussion thread [2].
> > > > >
> > > > > The vote will be open for at least 72 hours. I'll try to close it
> > after
> > > > > Sep. 20 15:00 

Re: [SURVEY] How many people are using customized RestartStrategy(s)

2019-09-19 Thread Zhu Zhu
Thanks Steven for the feedback!
Could you share more information about the metrics you add in your
customized restart strategy?
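
For clarity, the documented configuration that will keep working looks like
the following minimal, self-contained sketch (strategy values taken from
Oytun's snippet quoted below):

    import org.apache.flink.api.common.restartstrategy.RestartStrategies;
    import org.apache.flink.api.common.time.Time;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class RestartStrategyExample {
        public static void main(String[] args) {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();
            // Built-in failure-rate strategy: at most 2 failures within a
            // 1-minute measuring interval, 10 minutes between restart attempts.
            env.setRestartStrategy(RestartStrategies.failureRateRestart(
                    2, Time.minutes(1), Time.minutes(10)));
        }
    }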

Thanks,
Zhu Zhu

Steven Wu  wrote on Fri, Sep 20, 2019 at 7:11 AM:

> We do use config like "restart-strategy:
> org.foobar.MyRestartStrategyFactoryFactory", mainly to add additional
> metrics beyond the Flink-provided ones.
>
> On Thu, Sep 19, 2019 at 4:50 AM Zhu Zhu  wrote:
>
>> Thanks everyone for the input.
>>
>> The RestartStrategy customization is not recognized as a public interface
>> as it is not explicitly documented.
>> Since the feedback from this survey indicates it is not used, I'll conclude
>> that we do not need to support customized RestartStrategy for the new
>> scheduler in Flink 1.10.
>>
>> Other usages are still supported, including all the strategies and
>> configuring ways described in
>> https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/task_failure_recovery.html#restart-strategies
>> .
>>
>> Feel free to share in this thread if you have any concerns about it.
>>
>> Thanks,
>> Zhu Zhu
>>
>> Zhu Zhu  wrote on Thu, Sep 12, 2019 at 10:33 PM:
>>
>>> Thanks Oytun for the reply!
>>>
>>> Sorry for not having stated it clearly. When saying "customized
>>> RestartStrategy", we mean that users implement an
>>> *org.apache.flink.runtime.executiongraph.restart.RestartStrategy* by
>>> themselves and use it by configuring like "restart-strategy:
>>> org.foobar.MyRestartStrategyFactoryFactory".
>>>
>>> The usage of restart strategies you mentioned will keep working with the
>>> new scheduler.
>>>
>>> Thanks,
>>> Zhu Zhu
>>>
>>> Oytun Tez  wrote on Thu, Sep 12, 2019 at 10:05 PM:
>>>
 Hi Zhu,

 We are using a custom restart strategy like this:

 environment.setRestartStrategy(failureRateRestart(2, Time.minutes(1),
 Time.minutes(10)));


 ---
 Oytun Tez

 *M O T A W O R D*
 The World's Fastest Human Translation Platform.
 oy...@motaword.com — www.motaword.com


 On Thu, Sep 12, 2019 at 7:11 AM Zhu Zhu  wrote:

> Hi everyone,
>
> I wanted to reach out to you and ask how many of you are using a
> customized RestartStrategy[1] in production jobs.
>
> We are currently developing the new Flink scheduler[2] which interacts
> with restart strategies in a different way. We have to re-design the
> interfaces for the new restart strategies (so-called
> RestartBackoffTimeStrategy). Existing customized RestartStrategies will not
> work any more with the new scheduler.
>
> We want to know whether we should keep a way
> to customize RestartBackoffTimeStrategy so that existing customized
> RestartStrategies can be migrated.
>
> I'd appreciate it if you can share your status if you are using a customized
> RestartStrategy. That will be valuable for us to make decisions.
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-master/dev/task_failure_recovery.html#restart-strategies
> [2] https://issues.apache.org/jira/browse/FLINK-10429
>
> Thanks,
> Zhu Zhu
>



[jira] [Created] (FLINK-14136) Operator Topology and Metrics Inside Vertex

2019-09-19 Thread Yadong Xie (Jira)
Yadong Xie created FLINK-14136:
--

 Summary: Operator Topology and Metrics Inside Vertex
 Key: FLINK-14136
 URL: https://issues.apache.org/jira/browse/FLINK-14136
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Reporter: Yadong Xie
 Attachments: Kapture 2019-09-17 at 14.31.46.gif, screenshot.png, 
screenshot2.png

In the screenshot below, users can get vertex topology data on the job detail
page, but the operator topology and metrics inside a vertex are missing from
the graph.

!screenshot.png|width=477,height=206!

There are actually two operators in the first vertex, named "Source: Custom
Source" and "Timestamps/Watermarks", but users can only see "Source: Custom
Source -> Timestamps/Watermarks" at the vertex level.

We can already get some metrics at the operator level, such as records-in and
records-out, from the metrics REST API (in the screenshot below).

!screenshot2.png|width=475,height=210!

If we can get the operators' topology data inside a vertex, users can see the
whole operator topology with records-received and records-sent information at
a glance. We think this would be quite useful for troubleshooting a job's
problems while it is running. Here is a demo in the gif below.

!Kapture 2019-09-17 at 14.31.46.gif|width=563,height=286!

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14135) Introduce vectorized orc InputFormat for blink runtime

2019-09-19 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-14135:


 Summary: Introduce vectorized orc InputFormat for blink runtime
 Key: FLINK-14135
 URL: https://issues.apache.org/jira/browse/FLINK-14135
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / ORC
Reporter: Jingsong Lee


VectorizedOrcInputFormat is introduced to read ORC data in batches.

When returning each row of data, instead of eagerly retrieving each field, we
use BaseRow's abstraction to return a columnar, row-like view.

This will greatly improve downstream filtering scenarios, since there is no
need to access redundant fields of filtered-out data.
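
As a rough sketch of the "columnar row view" idea (the class below is
illustrative only; it is not the actual blink runtime type):

    // A row "view" over column vectors: reading a field indexes into the
    // column arrays instead of materializing the whole row, so fields that
    // a downstream filter or projection never asks for are never touched.
    final class ColumnarRowView {
        private final long[][] longColumns; // one batch-sized array per column
        private int rowId;

        ColumnarRowView(long[][] longColumns) {
            this.longColumns = longColumns;
        }

        // Move the view to the next row in O(1), without per-field copying.
        void pointTo(int rowId) {
            this.rowId = rowId;
        }

        // Field access happens lazily, on demand.
        long getLong(int columnIndex) {
            return longColumns[columnIndex][rowId];
        }
    }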



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Contribute Pulsar Flink connector back to Flink

2019-09-19 Thread Sijie Guo
Thanks everyone here. Sorry for jumping into the discussion here.

I am not very familiar with the deprecation process in Flink. If I have
misunderstood the process, please correct me.

As far as I understand, FLIP-27 is introducing a new unified API for
connectors. After it introduces the new API
and before moving all the existing connectors from the old API to the new API,
both the old API and the new API will co-exist
for a while until Flink moves all existing connectors to the new API. So the
Pulsar connector (using the old API) can
follow the deprecation process together with the other connectors using the
old API, and with the deprecation of the old API itself, no?

If that's the case, I think contributing the current connector back to
Flink rather than maintaining it outside Flink
would provide a few more benefits. We can deprecate the existing
streamnative/pulsar-flink repo and point the users
to the connector in the Flink repo. So all the review processes will happen
within Flink for both the old connector and
the new connector. It also reduces confusion for users, as the
documentation and code base live in one place.

Thoughts?

- Sijie




On Fri, Sep 20, 2019 at 12:53 AM Becket Qin  wrote:

> Thanks for the explanation, Stephan. I have a few questions / thoughts.
>
> So that means we will remove the old connector without a major version
> bump, is that correct?
>
> I am not 100% sure if mixing 1.10 connectors with 1.11 connectors will
> always work because we saw some dependency class collisions in the past. To
> make it safe we may have to maintain the old code for one more release.
>
> To be honest I am still wondering if we have to put the old connector in
> the Flink repo. If we check in the old connector to Flink, we will end up in
> the following situation:
> 1. Old connector in streamnative/pulsar-flink repo.
> 2. Old connector in Flink Repo, which may be different from the one in
> Pulsar repo. (Added in 1.10, deprecated in 1.11, removed in 1.12)
> 3. New connector in Flink Repo.
>
> We need to think about how to make the users in each case happy.
> - For users of (1), I assume Sijie and Yijie will have to maintain the code
> a bit longer for its own compatibility even after we have (2). In that
> case, bugs found in old connector may or may not need to be fixed in both
> Flink and the streamnative/pulsar-flink repo.
> - For users of (2), will we provide bug fixes? If we do, it will be a
> little awkward because those bug fixes will be immediately deprecated in
> 1.11, and removed in 1.12. So we are essentially asking users to migrate
> away from the bug fix. After Flink 1.12, users may still have to switch to
> use (3) due to the potential dependency class conflicts mentioned above.
> - Users of (3) have a much easier life and don't need to worry too much.
>
> The above story seems a little complicated to tell. I think it will be much
> easier to not have (2) at all.
> 1. Old connector in streamnative/pulsar-flink repo.
> 3. New connector in Flink Repo.
>
> - Old connector will only be maintained in streamnative/pulsar-flink repo
> until it is fully deprecated. Users can always use the existing Pulsar
> connector in that repo.
> - New connector will be in Flink repo and maintained like the other
> connectors.
>
> This seems much simpler for users to understand and they will not be blocked
> from using the old connector. If the concern is about the quality of the
> connector in streamnative/pulsar-flink repo, is it enough for us just to
> review the code in streamnative/pulsar-flink connector to make sure it
> looks good from Flink's perspective?
>
> What do you think?
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
> On Thu, Sep 19, 2019 at 6:58 PM Stephan Ewen  wrote:
>
> > My take would be the following:
> >
> >   - If we merge the connector now and replace it with a FLIP-27 version
> > before the 1.10 release, then we need no deprecation process
> >   - If we don't manage to replace it with a FLIP-27 version before the
> 1.10
> > release, then it is good that we have the other version, so no users get
> > blocked.
> >
> > In the latter case we can see how we want to do it. Immediate removal of
> > the old version or deprecation label and keeping it for one more release.
> > Given that you should be able to use a Flink 1.10 connector with Flink
> 1.11
> > as well (stable public APIs) there is also a workaround if you need an
> old
> > connector in a newer version. So immediate removal might even be
> feasible.
> >
> >
> > On Thu, Sep 19, 2019 at 11:09 AM Becket Qin 
> wrote:
> >
> > > Hi Stephan,
> > >
> > > Thanks for the clarification. I completely agree with you and Thomas on
> > the
> > > process of adding connectors to Flink repo. However, I am wondering
> what
> > is
> > > the deprecation process? Given the main concern here was that we may
> have
> > > to maintain two Pulsar connector code bases until the old one is
> removed
> > > from the repo, it would be good to know how long we have to do that.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin

[jira] [Created] (FLINK-14134) Introduce LimitableTableSource to optimize limit

2019-09-19 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-14134:


 Summary: Introduce LimitableTableSource to optimize limit
 Key: FLINK-14134
 URL: https://issues.apache.org/jira/browse/FLINK-14134
 Project: Flink
  Issue Type: Sub-task
Reporter: Jingsong Lee


SQL: SELECT * FROM t1 LIMIT 1

Currently the source will scan the full table. If we introduce
LimitableTableSource and let the source know the limit, the source only needs
to read one row.
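
A sketch of what such an interface could look like (illustrative only; the
final API may differ):

    import org.apache.flink.table.sources.TableSource;

    // Illustrative sketch: lets the planner push a LIMIT down into the source.
    public interface LimitableTableSource<T> {

        // Whether a limit has already been pushed into this source.
        boolean isLimitPushedDown();

        // Returns a copy of this source that stops reading after `limit` rows.
        TableSource<T> applyLimit(long limit);
    }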



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14133) Improve batch sql and hive integrate performance

2019-09-19 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-14133:


 Summary: Improve batch sql and hive integrate performance
 Key: FLINK-14133
 URL: https://issues.apache.org/jira/browse/FLINK-14133
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Hive, Table SQL / Planner
Reporter: Jingsong Lee


We have now largely merged the batch functionality of the blink planner and
largely integrated Hive. But there are still many performance problems, and we
need to ensure baseline performance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: How to implement grouping set in stream

2019-09-19 Thread Dian Fu
AFAIK, grouping sets are already supported for streaming in the blink planner.
You could check FLINK-12192 for details.
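
For example, with the blink planner something like the following works in
streaming mode (a minimal sketch; the "orders" table and its fields are made
up and assumed to be registered beforehand):

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.java.StreamTableEnvironment;

    public class GroupingSetsExample {
        public static void main(String[] args) {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();
            // Grouping sets on streams require the blink planner.
            EnvironmentSettings settings = EnvironmentSettings.newInstance()
                    .useBlinkPlanner().inStreamingMode().build();
            StreamTableEnvironment tEnv =
                    StreamTableEnvironment.create(env, settings);

            // Aggregate at several granularities in a single query.
            Table result = tEnv.sqlQuery(
                    "SELECT region, product, SUM(amount) AS total " +
                    "FROM orders " +
                    "GROUP BY GROUPING SETS ((region, product), (region), ())");
        }
    }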

Regards,
Dian

> On Sep 10, 2019, at 6:51 PM, 刘建刚  wrote:
> 
>   I want to implement grouping sets in streaming. I am new to Flink SQL. I
> want to find an example that teaches me how to define my own rule and
> implement the corresponding operator. Can anyone give me any suggestions?



Re: [SURVEY] How many people are using customized RestartStrategy(s)

2019-09-19 Thread Steven Wu
We do use config like "restart-strategy:
org.foobar.MyRestartStrategyFactoryFactory", mainly to add additional
metrics beyond the Flink-provided ones.

On Thu, Sep 19, 2019 at 4:50 AM Zhu Zhu  wrote:

> Thanks everyone for the input.
>
> The RestartStrategy customization is not recognized as a public interface
> as it is not explicitly documented.
> Since the feedback from this survey indicates it is not used, I'll conclude
> that we do not need to support customized RestartStrategy for the new
> scheduler in Flink 1.10.
>
> Other usages are still supported, including all the strategies and
> configuring ways described in
> https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/task_failure_recovery.html#restart-strategies
> .
>
> Feel free to share in this thread if you have any concerns about it.
>
> Thanks,
> Zhu Zhu
>
> Zhu Zhu  wrote on Thu, Sep 12, 2019 at 10:33 PM:
>
>> Thanks Oytun for the reply!
>>
>> Sorry for not having stated it clearly. When saying "customized
>> RestartStrategy", we mean that users implement an
>> *org.apache.flink.runtime.executiongraph.restart.RestartStrategy* by
>> themselves and use it by configuring like "restart-strategy:
>> org.foobar.MyRestartStrategyFactoryFactory".
>>
>> The usage of restart strategies you mentioned will keep working with the
>> new scheduler.
>>
>> Thanks,
>> Zhu Zhu
>>
>> Oytun Tez  wrote on Thu, Sep 12, 2019 at 10:05 PM:
>>
>>> Hi Zhu,
>>>
>>> We are using a custom restart strategy like this:
>>>
>>> environment.setRestartStrategy(failureRateRestart(2, Time.minutes(1),
>>> Time.minutes(10)));
>>>
>>>
>>> ---
>>> Oytun Tez
>>>
>>> *M O T A W O R D*
>>> The World's Fastest Human Translation Platform.
>>> oy...@motaword.com — www.motaword.com
>>>
>>>
>>> On Thu, Sep 12, 2019 at 7:11 AM Zhu Zhu  wrote:
>>>
 Hi everyone,

 I wanted to reach out to you and ask how many of you are using a
 customized RestartStrategy[1] in production jobs.

 We are currently developing the new Flink scheduler[2] which interacts
 with restart strategies in a different way. We have to re-design the
 interfaces for the new restart strategies (so-called
 RestartBackoffTimeStrategy). Existing customized RestartStrategies will not
 work any more with the new scheduler.

 We want to know whether we should keep a way
 to customize RestartBackoffTimeStrategy so that existing customized
 RestartStrategies can be migrated.

 I'd appreciate it if you can share your status if you are using a customized
 RestartStrategy. That will be valuable for us to make decisions.

 [1]
 https://ci.apache.org/projects/flink/flink-docs-master/dev/task_failure_recovery.html#restart-strategies
 [2] https://issues.apache.org/jira/browse/FLINK-10429

 Thanks,
 Zhu Zhu

>>>


Re: [DISCUSS] FLIP-68: Extend Core Table System with Modular Plugins

2019-09-19 Thread Bowen Li
Thanks everyone for your feedback. I've converted it to a FLIP wiki [1].

Please take another look. If there are no more concerns, I'd like to start a
voting thread for it.

Thanks

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Modular+Plugins




On Tue, Sep 17, 2019 at 11:25 AM Bowen Li  wrote:

> Hi devs,
>
> We'd like to kick off a conversation on "FLIP-68:  Extend Core Table
> System with Modular Plugins" [1].
>
> The modular approach was raised in discussion of how to support Hive
> built-in functions in FLIP-57 [2]. As we discussed and looked deeper, we
> think it’s a good opportunity to broaden the design and the corresponding
> problem it aims to solve. The motivation is to expand Flink’s core table
> system and enable users to do customizations by writing pluggable modules.
>
> There are two aspects of the motivation:
> 1. Empower users to write code and do customized development for the Flink
> table core
> 2. Enable users to integrate Flink with the cores and built-in objects of
> other systems, so users can seamlessly reuse what they are familiar with in
> other SQL systems as core and built-ins of the Flink table system
>
> Please take a look; feedback is welcome.
>
> Bowen
>
> [1]
> https://docs.google.com/document/d/17CPMpMbPDjvM4selUVEfh_tqUK_oV0TODAUA9dfHakc/edit?usp=sharing
> [2]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-57-Rework-FunctionCatalog-td32291.html
>


[jira] [Created] (FLINK-14132) Extend core table system with modular plugins

2019-09-19 Thread Bowen Li (Jira)
Bowen Li created FLINK-14132:


 Summary: Extend core table system with modular plugins
 Key: FLINK-14132
 URL: https://issues.apache.org/jira/browse/FLINK-14132
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / API
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.10.0


please see FLIP-68 
(https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Modular+Plugins)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

2019-09-19 Thread Bowen Li
Another reason I prefer "CREATE TEMPORARY BUILTIN FUNCTION" over "ALTER
BUILTIN FUNCTION xxx TEMPORARILY" is: what if users want to drop the
temporary built-in function in the same session? With the former, they
can run something like "DROP TEMPORARY BUILTIN FUNCTION"; with the latter,
I'm not sure how users can "restore" the original built-in function
easily from an "altered" function without introducing further nonstandard
SQL syntax.
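
To illustrate with the former, the full lifecycle could look like the
following sketch (proposed syntax only, not available in any release; the
function name and class are made up, and tableEnv is an existing
TableEnvironment):

    // Override the built-in concat for this session only (proposed syntax).
    tableEnv.sqlUpdate(
        "CREATE TEMPORARY BUILTIN FUNCTION concat AS 'com.example.MyConcat'");

    // Queries in this session now resolve concat to com.example.MyConcat.

    // Drop the override; the original built-in concat applies again.
    tableEnv.sqlUpdate("DROP TEMPORARY BUILTIN FUNCTION concat");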

Also, please pardon me, as I realized using the net value may not be a good
idea... I'm trying to fit this vote into the cases listed in the Flink Bylaws [1].

From the following result, the majority seems to be #2 too, as it has the
most approval so far and doesn't have a strong "-1".

#1: 3 (+1), 1 (0), 4 (-1)
#2: 4 (0), 3 (+1), 1 (+0.5)
   * Dawid: -1/0 depending on the keyword
#3: 2 (+1), 3 (-1), 3 (0)

[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120731026

On Thu, Sep 19, 2019 at 10:30 AM Bowen Li  wrote:

> Hi,
>
> Thanks everyone for your votes. I summarized the result as following:
>
> #1: 3 (+1), 1 (0), 4 (-1) - net: -1
> #2: 4 (0), 2 (+1), 1 (+0.5) - net: +2.5
>     Dawid: -1/0 depending on the keyword
> #3: 2 (+1), 3 (-1), 3 (0) - net: -1
>
> Given the result, I'd like to change my vote for #2 from 0 to +1, to make
> it a stronger case with net +3.5. So the votes so far are:
>
> #1: 3 (+1), 1 (0), 4 (-1) - net: -1
> #2: 4 (0), 3 (+1), 1 (+0.5) - net: +3.5
>     Dawid: -1/0 depending on the keyword
> #3: 2 (+1), 3 (-1), 3 (0) - net: -1
>
> What do you think? Do you think we can conclude with this result? Or would
> you like to take it as a formal FLIP vote with 3 days voting period?
>
> BTW, I'd prefer "CREATE TEMPORARY BUILTIN FUNCTION" over "ALTER BUILTIN
> FUNCTION xxx TEMPORARILY" because
> 1. the syntax is more consistent with "CREATE FUNCTION" and "CREATE
> TEMPORARY FUNCTION"
> 2. "ALTER BUILTIN FUNCTION xxx TEMPORARILY" implies it alters a built-in
> function but it actually doesn't, the logic only creates a temp function
> with higher priority than that built-in function in ambiguous resolution
> order; and it would behave inconsistently with "ALTER FUNCTION".
>
>
>
> On Thu, Sep 19, 2019 at 2:58 AM Fabian Hueske  wrote:
>
>> I agree, it's very similar from the implementation point of view and in
>> its implications.
>>
>> IMO, the difference is mostly on the mental model for the user.
>> Instead of having a special class of temporary functions that have
>> precedence over builtin functions it suggests to temporarily change
>> built-in functions.
>>
>> Fabian
>>
>> On Thu, Sep 19, 2019 at 11:52 AM, Kurt Young  wrote:
>> > Hi Fabian,
>> >
>> > I think it's almost the same with #2 with different keyword:
>> >
>> > CREATE TEMPORARY BUILTIN FUNCTION xxx
>> >
>> > Best,
>> > Kurt
>> >
>> >
>> > On Thu, Sep 19, 2019 at 5:50 PM Fabian Hueske 
>> wrote:
>> >
>> > > Hi,
>> > >
>> > > I thought about it a bit more and think that there is some good value
>> in
>> > my
>> > > last proposal.
>> > >
>> > > A lot of complexity comes from the fact that we want to allow
>> overriding
>> > > built-in functions, which are addressed differently from other functions
>> > (and
>> > > db objects).
>> > > We could just have "CREATE TEMPORARY FUNCTION" do exactly the same
>> thing
>> > as
>> > > "CREATE FUNCTION" and treat both functions exactly the same except
>> that:
>> > > 1) temp functions disappear at the end of the session
>> > > 2) temp function are resolved before other functions
>> > >
>> > > This would be Dawid's proposal from the beginning of this thread (in
>> case
>> > > you still remember... ;-) )
>> > >
>> > > Temporarily overriding built-in functions would be supported with an
>> > > explicit command like
>> > >
>> > > ALTER BUILTIN FUNCTION xxx TEMPORARILY AS ...
>> > >
>> > > This would also address the concerns about accidentally changing the
>> > > semantics of built-in functions.
>> > > IMO, it can't get much more explicit than the above command.
>> > >
>> > > Sorry for bringing up a new option in the middle of the discussion,
>> but
>> > as
>> > > I said, I think it has a bunch of benefits and I don't see major
>> > drawbacks
>> > > (maybe you do?).
>> > >
>> > > What do you think?
>> > >
>> > > Fabian
>> > >
>> > > On Thu, Sep 19, 2019 at 11:24 AM, Fabian Hueske <fhue...@gmail.com> wrote:
>> > >
>> > > > Hi everyone,
>> > > >
>> > > > I thought again about option #1 and something that I don't like is
>> that
>> > > > the resolved address of xyz is different in "CREATE FUNCTION xyz"
>> and
>> > > > "CREATE TEMPORARY FUNCTION xyz".
>> > > > IMO, adding the keyword "TEMPORARY" should only change the
>> lifecycle of
>> > > > the function, but not where it is located. This implicitly changed
>> > > location
>> > > > might be confusing for users.
>> > > > After all, a temp function should behave pretty much like any other
>> > > > function, except for the fact that it disappears when the session is
>> > > closed.
>> > > >
>> > > > 

Re: [DISCUSS] Contribute Pulsar Flink connector back to Flink

2019-09-19 Thread Becket Qin
Thanks for the explanation, Stephan. I have a few questions / thoughts.

So that means we will remove the old connector without a major version
bump, is that correct?

I am not 100% sure if mixing 1.10 connectors with 1.11 connectors will
always work because we saw some dependency class collisions in the past. To
make it safe we may have to maintain the old code for one more release.

To be honest I am still wondering if we have to put the old connector in
the Flink repo. If we check in the old connector to Flink, we will end up in
the following situation:
1. Old connector in streamnative/pulsar-flink repo.
2. Old connector in Flink Repo, which may be different from the one in
Pulsar repo. (Added in 1.10, deprecated in 1.11, removed in 1.12)
3. New connector in Flink Repo.

We need to think about how to make the users in each case happy.
- For users of (1), I assume Sijie and Yijie will have to maintain the code
a bit longer for its own compatibility even after we have (2). In that
case, bugs found in old connector may or may not need to be fixed in both
Flink and the streamnative/pulsar-flink repo.
- For users of (2), will we provide bug fixes? If we do, it will be a
little awkward because those bug fixes will be immediately deprecated in
1.11, and removed in 1.12. So we are essentially asking users to migrate
away from the bug fix. After Flink 1.12, users may still have to switch to
use (3) due to the potential dependency class conflicts mentioned above.
- Users of (3) have a much easier life and don't need to worry too much.

The above story seems a little complicated to tell. I think it will be much
easier to not have (2) at all.
1. Old connector in streamnative/pulsar-flink repo.
3. New connector in Flink Repo.

- Old connector will only be maintained in streamnative/pulsar-flink repo
until it is fully deprecated. Users can always use the existing Pulsar
connector in that repo.
- New connector will be in Flink repo and maintained like the other
connectors.

This seems much simpler for users to understand and they will not be blocked
from using the old connector. If the concern is about the quality of the
connector in streamnative/pulsar-flink repo, is it enough for us just to
review the code in streamnative/pulsar-flink connector to make sure it
looks good from Flink's perspective?

What do you think?

Thanks,

Jiangjie (Becket) Qin


On Thu, Sep 19, 2019 at 6:58 PM Stephan Ewen  wrote:

> My take would be the following:
>
>   - If we merge the connector now and replace it with a FLIP-27 version
> before the 1.10 release, then we need no deprecation process
>   - If we don't manage to replace it with a FLIP-27 version before the 1.10
> release, than it is good that we have the other version, so no users get
> blocked.
>
> In the latter case we can see how we want to do it. Immediate removal of
> the old version or deprecation label and keeping it for one more release.
> Given that you should be able to use a Flink 1.10 connector with Flink 1.11
> as well (stable public APIs) there is also a workaround if you need an old
> connector in a newer version. So immediate removal might even be feasible.
>
>
> On Thu, Sep 19, 2019 at 11:09 AM Becket Qin  wrote:
>
> > Hi Stephan,
> >
> > Thanks for the clarification. I completely agree with you and Thomas on
> the
> > process of adding connectors to Flink repo. However, I am wondering what
> is
> > the deprecation process? Given the main concern here was that we may have
> > to maintain two Pulsar connector code bases until the old one is removed
> > from the repo, it would be good to know how long we have to do that.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Thu, Sep 19, 2019 at 3:54 PM Stephan Ewen  wrote:
> >
> > > Some quick thoughts on the connector contribution process. I basically
> > > reiterate here what Thomas mentioned in another thread about the
> Kinesis
> > > connector.
> > >
> > > For connectors, we should favor a low-overhead contribution process,
> and
> > > accept user code and changes more readily than in the core system.
> > > That is because connectors have both a big variety of scenarios they get
> > > used in (only through use and many small contributions do they become
> > > really useful over time) and, at the same time, committers do not use
> > > the connectors themselves and usually cannot foresee too well what is
> > > needed.
> > >
> > > Furthermore, a missing connector (or connector feature) is often a
> > bigger
> > > show stopper for users than a missing API or system feature.
> > >
> > > Along these lines of thought, the conclusion would be to take the Pulsar
> > > connector now, focus the review on legal/dependencies/rough code style
> > and
> > > conventions, label it as "beta" (in the sense of "new code" that is
> "not
> > > yet tested through longer use") and go ahead. And then evolve it
> quickly
> > > without putting formal blockers in the way, meaning also adding a new
> > FLIP
> > > 27 version when it is 

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

2019-09-19 Thread Bowen Li
Hi,

Thanks everyone for your votes. I summarized the result as following:

#1: 3 (+1), 1 (0), 4 (-1) - net: -1
#2: 4 (0), 2 (+1), 1 (+0.5) - net: +2.5
    Dawid: -1/0 depending on the keyword
#3: 2 (+1), 3 (-1), 3 (0) - net: -1

Given the result, I'd like to change my vote for #2 from 0 to +1, to make
it a stronger case with net +3.5. So the votes so far are:

#1: 3 (+1), 1 (0), 4 (-1) - net: -1
#2: 4 (0), 3 (+1), 1 (+0.5) - net: +3.5
    Dawid: -1/0 depending on the keyword
#3: 2 (+1), 3 (-1), 3 (0) - net: -1

What do you think? Do you think we can conclude with this result? Or would
you like to take it as a formal FLIP vote with 3 days voting period?

BTW, I'd prefer "CREATE TEMPORARY BUILTIN FUNCTION" over "ALTER BUILTIN
FUNCTION xxx TEMPORARILY" because
1. the syntax is more consistent with "CREATE FUNCTION" and "CREATE
TEMPORARY FUNCTION"
2. "ALTER BUILTIN FUNCTION xxx TEMPORARILY" implies it alters a built-in
function but it actually doesn't, the logic only creates a temp function
with higher priority than that built-in function in ambiguous resolution
order; and it would behave inconsistently with "ALTER FUNCTION".



On Thu, Sep 19, 2019 at 2:58 AM Fabian Hueske  wrote:

> I agree, it's very similar from the implementation point of view and in
> its implications.
>
> IMO, the difference is mostly on the mental model for the user.
> Instead of having a special class of temporary functions that have
> precedence over builtin functions it suggests to temporarily change
> built-in functions.
>
> Fabian
>
> On Thu, Sep 19, 2019 at 11:52 AM, Kurt Young  wrote:
>
> > Hi Fabian,
> >
> > I think it's almost the same with #2 with different keyword:
> >
> > CREATE TEMPORARY BUILTIN FUNCTION xxx
> >
> > Best,
> > Kurt
> >
> >
> > On Thu, Sep 19, 2019 at 5:50 PM Fabian Hueske  wrote:
> >
> > > Hi,
> > >
> > > I thought about it a bit more and think that there is some good value
> in
> > my
> > > last proposal.
> > >
> > > A lot of complexity comes from the fact that we want to allow
> overriding
> > > built-in functions, which are addressed differently from other functions
> > (and
> > > db objects).
> > > We could just have "CREATE TEMPORARY FUNCTION" do exactly the same
> thing
> > as
> > > "CREATE FUNCTION" and treat both functions exactly the same except
> that:
> > > 1) temp functions disappear at the end of the session
> > > 2) temp function are resolved before other functions
> > >
> > > This would be Dawid's proposal from the beginning of this thread (in
> case
> > > you still remember... ;-) )
> > >
> > > Temporarily overriding built-in functions would be supported with an
> > > explicit command like
> > >
> > > ALTER BUILTIN FUNCTION xxx TEMPORARILY AS ...
> > >
> > > This would also address the concerns about accidentally changing the
> > > semantics of built-in functions.
> > > IMO, it can't get much more explicit than the above command.
> > >
> > > Sorry for bringing up a new option in the middle of the discussion, but
> > as
> > > I said, I think it has a bunch of benefits and I don't see major
> > drawbacks
> > > (maybe you do?).
> > >
> > > What do you think?
> > >
> > > Fabian
> > >
> > > On Thu, Sep 19, 2019 at 11:24 AM, Fabian Hueske <fhue...@gmail.com> wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > I thought again about option #1 and something that I don't like is
> that
> > > > the resolved address of xyz is different in "CREATE FUNCTION xyz" and
> > > > "CREATE TEMPORARY FUNCTION xyz".
> > > > IMO, adding the keyword "TEMPORARY" should only change the lifecycle
> of
> > > > the function, but not where it is located. This implicitly changed
> > > location
> > > > might be confusing for users.
> > > > After all, a temp function should behave pretty much like any other
> > > > function, except for the fact that it disappears when the session is
> > > closed.
> > > >
> > > > Approach #2 with the additional keyword would make that pretty clear,
> > > IMO.
> > > > However, I neither like GLOBAL (for reasons mentioned by Dawid) or
> > > BUILDIN
> > > > (we are not adding a built-in function).
> > > > So I'd be OK with #2 if we find a good keyword. In fact, approach #2
> > > could
> > > > also be an alias for approach #3 to avoid explicit specification of
> the
> > > > system catalog/db.
> > > >
> > > > Approach #3 would be consistent with other db objects and the "CREATE
> > > > FUNCTION" statement.
> > > > Adding system catalog/db seems rather complex, but then again how
> often
> > > do
> > > > we expect users to override built-in functions? If this becomes a
> major
> > > > issue, we can still add option #2 as an alias.
> > > >
> > > > Not sure what's the best approach from an internal point of view,
> but I
> > > > certainly think that consistent behavior is important.
> > > > Hence my votes are:
> > > >
> > > > -1 for #1
> > > > 0 for #2
> > > > 0 for #3
> > > >
> > > > Btw. Did we consider a completely separate command for overriding
> > > built-in
> > > > functions 

Re: [DISCUSS] FLIP-66: Support time attribute in SQL DDL

2019-09-19 Thread Jark Wu
Hi everyone,

Thanks all for the valuable suggestions and feedback so far.
Before starting the vote, I would like to summarize the proposed DDL syntax
from the mailing list discussion.

## Rowtime Attribute (Watermark Syntax)

CREATE TABLE table_name (
  WATERMARK FOR <rowtime_field> AS <watermark_strategy_expression>
) WITH (
  ...
)

It marks an existing field <rowtime_field> as the rowtime attribute, and the
watermark is generated by the expression <watermark_strategy_expression>.
<watermark_strategy_expression> can be an arbitrary expression which returns a
nullable BIGINT or TIMESTAMP as the watermark value.

For common cases, users can use the following expressions to define a
strategy:
1. Bounded out-of-orderness: the strategy can be "rowtimeField - INTERVAL
'string' timeUnit".
2. Preserve the watermark from the source: the strategy can be
"SYSTEM_WATERMARK()".

## Proctime Attribute

CREATE TABLE table_name (
  ...
  proc AS SYSTEM_PROCTIME()
) WITH (
  ...
)

It uses the computed column syntax to add an additional column with the
proctime attribute. Here SYSTEM_PROCTIME() is a built-in function.
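
Putting the two together, a concrete definition could look like the following
sketch (the DDL uses the proposed syntax, which is not part of any release
yet; the table and field names are made up, and tEnv is an existing
TableEnvironment):

    tEnv.sqlUpdate(
        "CREATE TABLE orders (" +
        "  order_id BIGINT," +
        "  amount DOUBLE," +
        "  ts TIMESTAMP(3)," +
        // proctime attribute via a computed column
        "  proc AS SYSTEM_PROCTIME()," +
        // rowtime attribute: bounded out-of-orderness of 5 seconds on ts
        "  WATERMARK FOR ts AS ts - INTERVAL '5' SECOND" +
        ") WITH (" +
        "  ..." +  // connector properties elided
        ")");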

For more details and the implementations, please refer to the design doc:
https://docs.google.com/document/d/1-SecocBqzUh7zY6HBYcfMlG_0z-JAcuZkCvsmN3LrOw/edit?ts=5d822dba

Feel free to leave further feedback!

Thanks,
Jark

On Thu, 19 Sep 2019 at 11:23, Kurt Young  wrote:

> +1 to start vote process.
>
> Best,
> Kurt
>
>
> On Thu, Sep 19, 2019 at 10:54 AM Jark Wu  wrote:
>
> > Hi everyone,
> >
> > Thanks all for joining the discussion in the doc[1].
> > It seems that the discussion is converged and there is a consensus on the
> > current FLIP document.
> > If there is no objection, I would like to convert it into cwiki FLIP page
> > and start voting process.
> >
> > For more details, please refer to the design doc (it is slightly changed
> > since the initial proposal).
> >
> > Thanks,
> > Jark
> >
> > [1]:
> >
> >
> https://docs.google.com/document/d/1-SecocBqzUh7zY6HBYcfMlG_0z-JAcuZkCvsmN3LrOw/edit?ts=5d8258cd
> >
> > On Mon, 16 Sep 2019 at 16:12, Kurt Young  wrote:
> >
> > > After some review and discussion in the google document, I think it's
> > time
> > > to
> > > convert this design to a cwiki flip page and start voting process.
> > >
> > > Best,
> > > Kurt
> > >
> > >
> > > On Mon, Sep 9, 2019 at 7:46 PM Jark Wu  wrote:
> > >
> > > > Hi all,
> > > >
> > > > Thanks all for so much feedback received in the doc so far.
> > > > I saw a general agreement on using computed column to support
> proctime
> > > > attribute and extract timestamps.
> > > > So we will prepare a computed column FLIP and share in the dev ML
> soon.
> > > >
> > > > Feel free to leave more comments!
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > >
> > > >
> > > > On Fri, 6 Sep 2019 at 13:50, Dian Fu  wrote:
> > > >
> > > > > Hi Jark,
> > > > >
> > > > > Thanks for bringing up this discussion and the detailed design doc.
> > > This
> > > > > is definitely a critical feature for streaming SQL jobs. I have
> left
> > a
> > > > few
> > > > > comments in the design doc.
> > > > >
> > > > > Thanks,
> > > > > Dian
> > > > >
> > > > > > On Sep 6, 2019, at 11:48 AM, Forward Xu  wrote:
> > > > > >
> > > > > > Thanks Jark for this topic. This will be very useful.
> > > > > >
> > > > > >
> > > > > > Best,
> > > > > >
> > > > > > ForwardXu
> > > > > >
> > > > > > Danny Chan  wrote on Fri, Sep 6, 2019 at 11:26 AM:
> > > > > >
> > > > > >> Thanks Jark for bringing up this topic, this is definitely an
> > > > > >> important feature
> > > > > >> for SQL, especially for DDL users.
> > > > > >>
> > > > > >> I will spend some time reviewing this design doc, really thanks.
> > > > > >>
> > > > > >> Best,
> > > > > >> Danny Chan
> > > > > >> On Sep 6, 2019 at 11:19 AM +0800, Jark Wu wrote:
> > > > > >>> Hi everyone,
> > > > > >>>
> > > > > >>> I would like to start discussion about how to support time
> > > attribute
> > > > in
> > > > > >> SQL
> > > > > >>> DDL.
> > > > > >>> In Flink 1.9, we already introduced a basic SQL DDL to create a
> > > > table.
> > > > > >>> However, it doesn't support to define time attributes. This
> makes
> > > > users
> > > > > >>> can't
> > > > > >>> apply window operations on the tables created by DDL which is a
> > bad
> > > > > >>> experience.
> > > > > >>>
> > > > > >>> In FLIP-66, we propose a syntax for watermark to define rowtime
> > > > > attribute
> > > > > >>> and propose to use computed column syntax to define proctime
> > > > attribute.
> > > > > >>> But computed column is another big topic and should deserve a
> > > > separate
> > > > > >>> FLIP.
> > > > > >>> If we have a consensus on the computed column approach, we will
> > > start
> > > > > >>> computed column FLIP soon.
> > > > > >>>
> > > > > >>> FLIP-66:
> > > > > >>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1-SecocBqzUh7zY6HBYcfMlG_0z-JAcuZkCvsmN3LrOw/edit#
> > > > > >>>
> > > > > >>> Thanks for any feedback!
> > > > > >>>
> > > > > >>> Best,
> > > > > >>> Jark
> > > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>


[jira] [Created] (FLINK-14131) Support choosing new version failover strategy via configuration

2019-09-19 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-14131:
---

 Summary: Support choosing new version failover strategy via 
configuration
 Key: FLINK-14131
 URL: https://issues.apache.org/jira/browse/FLINK-14131
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Affects Versions: 1.10.0
Reporter: Zhu Zhu
 Fix For: 1.10.0


FLINK-10429 introduces new version failover strategies.
There are two different new version failover strategies, and there can be more
in the future.
Users should be able to choose the proper strategy for their jobs via 
configuration.
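
For illustration, choosing a strategy via the existing configuration key could
look like this (the key and the "full"/"region" values exist today; names for
further new strategies are exactly what this ticket leaves open):

    import org.apache.flink.configuration.Configuration;

    Configuration conf = new Configuration();
    // "region" restarts only the pipelined region containing the failed
    // task; "full" restarts the whole job.
    conf.setString("jobmanager.execution.failover-strategy", "region");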



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [PROPOSAL] Merge NewClusterClient into ClusterClient

2019-09-19 Thread Zili Chen
Thanks for your suggestions and information.

I'll therefore reopen the corresponding pull request.
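
For reference, the merge will make ClusterClient expose roughly the following
(a sketch: the method set comes from the proposal quoted below, while the
exact signatures are my assumption):

    import java.util.concurrent.CompletableFuture;
    import org.apache.flink.api.common.JobID;
    import org.apache.flink.api.common.JobSubmissionResult;
    import org.apache.flink.runtime.jobgraph.JobGraph;
    import org.apache.flink.runtime.jobmaster.JobResult;

    // The two methods moving from NewClusterClient into ClusterClient,
    // already implemented by RestClusterClient and MiniClusterClient.
    public interface MergedClientMethods {
        CompletableFuture<JobSubmissionResult> submitJob(JobGraph jobGraph);
        CompletableFuture<JobResult> requestJobResult(JobID jobId);
    }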

Best,
tison.


Till Rohrmann  wrote on Wed, Sep 18, 2019 at 10:55 PM:

> No reason to keep the separation. The NewClusterClient interface was only
> introduced to add new methods and not having to implement them for the
> other ClusterClient implementations.
>
> Cheers,
> Till
>
> On Wed, Sep 18, 2019 at 3:17 PM Aljoscha Krettek 
> wrote:
>
> > I agree that NewClusterClient and ClusterClient can be merged now that
> > there is no pre-FLIP-6 code base anymore.
> >
> > Side note, there are a lot of methods in ClusterClient that should not
> > really be there, in my opinion:
> >  - all the getOptimizedPlan*() methods
> >  - the run() methods. In the end, only submitJob should be required
> >
> > We should also see what Till (cc’ed) says, maybe he has an opinion on why
> > the separation should be kept.
> >
> > Best,
> > Aljoscha
> >
> > > On 18. Sep 2019, at 11:54, Zili Chen  wrote:
> > >
> > > Hi Xiaogang,
> > >
> > > Thanks for your reply.
> > >
> > > According to the feature discussion thread[1] client API enhancement
> is a
> > > planned
> > > feature towards 1.10 and thus I think this thread is valid if we can
> > reach
> > > a consensus
> > > and introduce new client API in this development cycle.
> > >
> > > Best,
> > > tison.
> > >
> > > [1]
> > >
> >
> https://lists.apache.org/thread.html/22639ca7de62a18f50e90db53e73910bd99b7f00c82f7494f4cb035f@%3Cdev.flink.apache.org%3E
> > >
> > >
> > > SHI Xiaogang  wrote on Wed, Sep 18, 2019 at 3:03 PM:
> > >
> > >> Hi Tison,
> > >>
> > >> Thanks for bringing this.
> > >>
> > >> I think it's fine to break the backward compatibility of the client API
> > >> now that ClusterClient is not well designed for public usage.
> > >> But from my perspective, we should postpone any modification to
> existing
> > >> interfaces until we come to an agreement on new client API. Otherwise,
> > our
> > >> users may have to adapt their implementations more than once.
> > >>
> > >> Regards,
> > >> Xiaogang
> > >>
> > >> Jeff Zhang  wrote on Wed, Sep 18, 2019 at 10:49 AM:
> > >>
> > >>> Thanks for raising this discussion. Overall +1 to merge
> > NewClusterClient
> > >>> into ClusterClient.
> > >>>
> > >>> 1. I think it is OK to break the backward compatibility. The current
> > >>> client API is not so clean, which has already caused issues for
> > >>> downstream projects and Flink itself.
> > >>> In the Flink Scala shell, I noticed this kind of unreadable code:
> > >>> Option[Either[MiniCluster, ClusterClient[_]]]
> > >>>
> > >>>
> > >>
> >
> https://github.com/apache/flink/blob/master/flink-scala-shell/src/main/scala/org/apache/flink/api/scala/FlinkShell.scala#L138
> > >>> I also created tickets and PRs to try to simplify it.
> > >>> https://github.com/apache/flink/pull/8546
> > >>> https://github.com/apache/flink/pull/8533
> > >>>   Personally I don't think we need to keep backward compatibility for
> > >>> a poorly designed API, otherwise it will bring lots of unnecessary
> > >>> overhead.
> > >>>
> > >>> 2. Another concern is that I notice there're many implementation
> > details
> > >> in
> > >>> ClusterClient. I think we should just expose a thin interface, so
> maybe
> > >> we
> > >>> can create an interface ClusterClient which includes as few methods as
> > >>> possible, and move all the implementation to AbstractClusterClient.
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> Zili Chen  wrote on Wed, Sep 18, 2019 at 9:46 AM:
> > >>>
> >  Hi devs,
> > 
> >  FLINK-14096[1] was created yesterday. It aims at merging the
> bridge
> >  class NewClusterClient into ClusterClient because with the effort
> >  under FLINK-10392 this bridge class is no longer necessary.
> > 
> >  Technically, in the current codebase all implementations of the interface
> >  NewClusterClient are subclasses of ClusterClient, so the work
> >  required is no more than moving method declarations. It helps us use
> >  the type signature ClusterClient instead of something like
> >  <C extends ClusterClient & NewClusterClient>; we cannot use the
> >  latter if we aren't in a type variable context. This should not affect
> >  anything internal in Flink's scope.
> >  However, as mentioned by Kostas in the JIRA and a previous
> discussion
> >  under a commit[2], it seems that we provide some levels of backward
> >  compatibility for ClusterClient and thus it's better to start a
> public
> >  discussion here.
> > 
> >  There are two concerns from my side.
> > 
> >  1. How much impact this proposal brings to users programming
> directly
> >  to ClusterClient?
> > 
> >  The specific changes here are add two methods `submitJob` and
> >  `requestJobResult` which are already implemented by
> RestClusterClient
> >  and MiniClusterClient. Users would only be affected if they create
> >  a class that inherits ClusterClient and doesn't implement these
> >  methods. Besides, users who create a class that implements
> >  NewClusterClient would be 

[jira] [Created] (FLINK-14130) Remove ClusterClient.run() methods

2019-09-19 Thread Aljoscha Krettek (Jira)
Aljoscha Krettek created FLINK-14130:


 Summary: Remove ClusterClient.run() methods
 Key: FLINK-14130
 URL: https://issues.apache.org/jira/browse/FLINK-14130
 Project: Flink
  Issue Type: Sub-task
  Components: Client / Job Submission
Reporter: Aljoscha Krettek


{{ClusterClient}} is an internal interface of the {{flink-clients}} package. It 
should only be concerned with submitting {{JobGraphs}} to a cluster, which is 
what {{submitJob()}} does. 

The {{run()}} methods are concerned with unpacking programs or job-with-jars 
and at the end use {{submitJob()}} in some way, they should reside in some 
other component. The only valid remaining run method is {{run(PackagedProgram 
prog, int parallelism)}}, this could be in {{PackagedProgramUtils}}. The other 
{{run()}} methods are actually only used in one test: 
{{ClientTest.shouldSubmitToJobClient()}}. I don't think that test is valid 
anymore; it evolved over a very long time and no longer tests what it was 
originally supposed to test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] FLIP-66: Support Time Attribute in SQL DDL

2019-09-19 Thread Jark Wu
Hi,

There are some new valuable suggestions and comments in the design
documentation since the vote started.
I will update the documentation and summarize it in the [DISCUSS] thread. I
hereby cancel the vote.
I will start a new vote once we reach a general consensus on the changes.

Thanks,
Jark


On Thu, 19 Sep 2019 at 12:05, Jark Wu  wrote:

> Hi all,
>
> I would like to start the vote for FLIP-66 [1], which is discussed and
> reached a consensus in the discussion thread[2].
>
> The vote will be open for at least 72 hours. I'll try to close it after
> Sep. 24 08:00 UTC, unless there is an objection or not enough votes.
>
> Thanks,
> Jark
>
> [1]:
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-66%3A+Support+time+attribute+in+SQL+DDL
> 
> [2]:
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-66-Support-time-attribute-in-SQL-DDL-tt32766.html
>
>


[jira] [Created] (FLINK-14129) HiveTableSource should implement ProjectableTableSource

2019-09-19 Thread Rui Li (Jira)
Rui Li created FLINK-14129:
--

 Summary: HiveTableSource should implement ProjectableTableSource
 Key: FLINK-14129
 URL: https://issues.apache.org/jira/browse/FLINK-14129
 Project: Flink
  Issue Type: Improvement
Reporter: Rui Li
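
The ticket has no description; for context, ProjectableTableSource is the interface for projection push-down, so implementing it would let HiveTableSource read only the columns a query needs. A hedged sketch of the contract (simplified; the real interface is org.apache.flink.table.sources.ProjectableTableSource):

    // Simplified stand-in for Flink's TableSource.
    interface TableSource<T> {}

    // Hedged sketch of the projection push-down contract that
    // HiveTableSource would implement.
    interface ProjectableTableSource<T> {
        // Return a copy of this source that reads only the given field
        // indexes, so the Hive reader can skip unused columns entirely.
        TableSource<T> projectFields(int[] fields);
    }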






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14128) Remove the description of restart strategy customization

2019-09-19 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-14128:
---

 Summary: Remove the description of restart strategy customization
 Key: FLINK-14128
 URL: https://issues.apache.org/jira/browse/FLINK-14128
 Project: Flink
  Issue Type: Improvement
  Components: Documentation, Runtime / Coordination
Affects Versions: 1.10.0
Reporter: Zhu Zhu
 Fix For: 1.10.0


Restart strategy customization was not a public interface since it was not 
documented.
Since existing RestartStrategy implementations will not be supported with the 
new scheduler introduced in FLINK-10429, we'd better not mark restart strategy 
customization as a public interface.

This means we need to remove the description of restart strategy customization 
which was recently added in FLINK-13898.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-19 Thread Jan Lukavský

Hi Stephen,

that sounds like an option. I'm not sure if this would work on 
JDK 11, though. JDK 11 is where I found this issue in the first place, because on 
JDK < 9 I can inject the generated file into the application class loader 
(which is a URLClassLoader). This class loader changed in JDK 9, so this is 
no longer (straightforwardly) possible.


I think that the way to go would be to use the RemoteEnvironment with 
a manually spawned MiniCluster.


But thanks for the suggestion!

Jan

On 9/19/19 12:36 PM, Stephan Ewen wrote:

Hi Jan!

You can also add an additional class path URL to take classes from. Does
that help?
Is there a directory where the generated class files go, so that a
local file URL could reference them and add them to the user code class
loader?

Best,
Stephan


On Tue, Sep 10, 2019 at 10:28 AM Till Rohrmann  wrote:


Hi Jan,

sorry for my late response. I think the main concern about using the
context class loader is the unpredictability and magic it adds to a
component where you actually don't wanna be surprised. Moreover, if we now
add support for the context class loader, then at a later time another
component might use this feature as well. This component would then have to
make sure that the new class loader it sets has the current context class
loader as its parent because otherwise it might break your use case.

But on the other hand, I admit that I don't have a good solution to your
problem at hand other than using the RemoteEnvironment.

One comment concerning your proposed class loader resolution. I think it
adds a bit too much magic which is hard to understand for the user. It
would be better if the system would fail with a descriptive error message
instead.

Cheers,
Till

On Thu, Sep 5, 2019 at 12:55 PM Jan Lukavský  wrote:


Hi Till and Aljoscha,

I was investigating the other options, but unfortunately all of them 
look a little complicated (although possible, of course). But before 
going into more complicated solutions, I'd like to know what issues 
you actually see with using the context class loader. I can think of one 
difficulty - if (for whatever reason) the context class loader doesn't 
contain (in itself or as a parent) the class loader that loaded Flink core 
classes, that would probably cause trouble. So, what about a solution 
where we take as the parent class loader of FlinkUserCodeClassLoaders a class 
loader that is:

   a) context class loader of current thread, if it either is actually
class loader of flink core classes, or if it contains this class loader
in its parent hierarchy, or

   b) class loader of flink core classes

That way, class loader of flink core classes would always be in parent
hierarchy of FlinkUserCodeClassLoaders. Would that solve the issues you
see? It works for me.

Jan
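
For concreteness, a minimal sketch of the resolution rule described above (class and method names are illustrative, not Flink's API):

    // Minimal sketch of the proposed parent resolution: prefer the context
    // class loader if the Flink core class loader appears in its parent
    // chain (case a), otherwise fall back to the Flink core class loader
    // (case b).
    final class ParentClassLoaderResolution {

        static ClassLoader resolveParent(Class<?> anyFlinkCoreClass) {
            ClassLoader flinkCoreLoader = anyFlinkCoreClass.getClassLoader();
            ClassLoader context = Thread.currentThread().getContextClassLoader();
            // Walk the parent hierarchy of the context class loader.
            for (ClassLoader cl = context; cl != null; cl = cl.getParent()) {
                if (cl == flinkCoreLoader) {
                    return context; // (a) context loader contains Flink core
                }
            }
            return flinkCoreLoader; // (b) fall back to Flink core's loader
        }
    }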

On 9/3/19 4:52 PM, Jan Lukavský wrote:

Answers inline.

On 9/3/19 4:01 PM, Till Rohrmann wrote:

How so? Does your REPL add the generated classes to the system class
loader? I assume the system class loader is used to load the Flink
classes.

No, it does not. It cannot on JDK >= 9 (or would have to hack into
jdk.internal.loader.ClassLoaders, which I don't want to :)). It just
creates another class loader, and is able to create a jar from
generated files. The jar is used for remote execution.

Ideally, what you would like to have is the option to provide the parent
class loader which is used to load user code to the LocalEnvironment. This one
could then be forwarded to the TaskExecutor where it is used to generate
the user code class loader. But this is a bigger effort.

I'm not sure how this differs from using the context classloader? Maybe
there is a subtle difference in that this is a little more explicit. On
the other hand, users normally do not modify class loaders, so the
practical impact is IMHO negligible. But maybe this opens another
possibility - we probably could add an optional ClassLoader parameter to
LocalEnvironment, with a default value of
FlinkRunner.class.getClassLoader()? That might be a good compromise.

The downside to this approach is that it requires you to create a jar file
and to submit it via a REST call. The upside is that it is closer to the
production setting.

Yes, a REPL has to do that anyway to support distributed computing, so
this is not an issue.

Jan


Cheers,
Till

On Tue, Sep 3, 2019 at 3:47 PM Jan Lukavský  wrote:


On the other hand, if you say that the contract of LocalEnvironment is
to execute as if it had all classes on its class loader, then it
currently breaks this contract. :-)

Jan

On 9/3/19 3:45 PM, Jan Lukavský wrote:

Hi Till,

hmm, that sounds like it might work. I would have to incorporate this
(either as default, or on demand) into Apache Beam. Would you see any
disadvantages of this approach? Would you suggest making this the default
behavior for the local Beam FlinkRunner? I can introduce a configuration
option to turn this behavior on, but that would bring additional
maintenance burden, etc., etc.

Jan

On 9/3/19 3:38 PM, Till Rohrmann 

Re: [VOTE] FLIP-56: Dynamic Slot Allocation

2019-09-19 Thread Till Rohrmann
I think besides points 1 and 3 there are no dependencies between the RM
and TM side changes. Also, I'm not sure whether it makes sense to split the
slot manager changes up into the proposed steps 5, 6 and 7.

I would highly recommend not adding too much duplicate logic or separate code
paths because it just adds blind spots which are probably not as well
tested as the old code paths.

Cheers,
Till

On Thu, Sep 19, 2019 at 11:58 AM Xintong Song  wrote:

> Thanks for the comments, Till.
>
> - Agree on removing SlotID.
>
> - Regarding the implementation plan, it is true that we can possibly reduce
> the amount of code separated by the feature flag. But I think to do that we need to
> introduce more dependencies between implementation steps. With the current
> plan, we can easily separate steps on the RM side and the TM side, and
> start concurrently working on them after quickly updating the interfaces in
> between. The feature will come alive when the steps on both RM/TM sides are
> finished. Since we are planning to have two people (Andrey and me) working
> on this FLIP, I think the current plan is probably more convenient.
>
> Thank you~
>
> Xintong Song
>
>
>
> On Thu, Sep 19, 2019 at 5:09 PM Till Rohrmann 
> wrote:
>
> > Hi Xintong,
> >
> > thanks for starting the vote. The general plan looks good. Hence +1 from
> my
> > side. I still have some minor comments one could think about:
> >
> > * As we no longer have predetermined slots on the TaskExecutor, I think we
> > can get rid of the SlotID. Instead, an allocated slot will be identified by
> > the AllocationID and the TaskManager's ResourceID in order to differentiate
> > duplicate registrations (see the sketch after this message).
> > * For the implementation plan, I believe there is only one tiny part on
> the
> > SlotManager for which we need a separate code path/feature flag which is
> > how we find a matching slot. Everything else should be possible to
> > implement in a way that it works with dynamic and static slot allocation:
> > 1. Let TMs register with default slot profile at RM
> > 2. Change SlotManager to use reported slot profiles instead of
> > pre-calculated profiles
> > 3. Replace SlotID with SlotProfile in TaskExecutorGateway#requestSlot
> > 4. Extend TM to support dynamic slot allocation (aka proper bookkeeping)
> > (can happen concurrently to any of steps 2-3)
> > 5. Add bookkeeping to SlotManager (for pending TMs and registered TMs)
> but
> > still only use default slot profiles for matching with slot requests
> > 6. Allow to match slot requests with reported resources instead of
> default
> > slot profiles (here we could use a feature flag to switch between dynamic
> > and static slot allocation)
> >
> > Wdyt?
> >
> > Cheers,
> > Till
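
Regarding the first bullet above, a toy sketch of the resulting slot identifier; both fields are plain strings here, standing in for Flink's AllocationID and ResourceID classes:

    import java.util.Objects;

    // Toy sketch: identify an allocated slot by the AllocationID plus the
    // TaskManager's ResourceID instead of a predetermined SlotID.
    final class AllocatedSlotId {

        private final String allocationId; // stands in for AllocationID
        private final String resourceId;   // stands in for the TM's ResourceID

        AllocatedSlotId(String allocationId, String resourceId) {
            this.allocationId = allocationId;
            this.resourceId = resourceId;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof AllocatedSlotId)) {
                return false;
            }
            AllocatedSlotId that = (AllocatedSlotId) o;
            // The same allocation reported from the same TM is a duplicate
            // registration and can be deduplicated on this key.
            return allocationId.equals(that.allocationId)
                    && resourceId.equals(that.resourceId);
        }

        @Override
        public int hashCode() {
            return Objects.hash(allocationId, resourceId);
        }
    }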
> >
> > On Thu, Sep 19, 2019 at 9:45 AM Andrey Zagrebin 
> > wrote:
> >
> > > Hi Xintong,
> > >
> > > Thanks for starting the vote, +1 from my side.
> > >
> > > Best,
> > > Andrey
> > >
> > > On Tue, Sep 17, 2019 at 4:26 PM Xintong Song 
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I would like to start the vote for FLIP-56 [1], on which a consensus
> is
> > > > reached in this discussion thread [2].
> > > >
> > > > The vote will be open for at least 72 hours. I'll try to close it
> after
> > > > Sep. 20 15:00 UTC, unless there is an objection or not enough votes.
> > > >
> > > > Thank you~
> > > >
> > > > Xintong Song
> > > >
> > > >
> > > > [1]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-56%3A+Dynamic+Slot+Allocation
> > > >
> > > > [2]
> > > >
> > > >
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-56-Dynamic-Slot-Allocation-td31960.html
> > > >
> > >
> >
>


Re: [SURVEY] How many people are using customized RestartStrategy(s)

2019-09-19 Thread Zhu Zhu
Thanks everyone for the input.

The RestartStrategy customization is not recognized as a public interface
as it is not explicitly documented.
Since the feedback in this survey indicates it is not used, I'll conclude that we
do not need to support customized RestartStrategy for the new scheduler in
Flink 1.10.

Other usages are still supported, including all the strategies and
configuring ways described in
https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/task_failure_recovery.html#restart-strategies
.
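
For readers unfamiliar with the term, a hedged sketch of what such a customized strategy looks like; the interfaces below are simplified stand-ins for the ones in org.apache.flink.runtime and may not match them exactly (the configuration side is spelled out in the quoted mail below):

    import java.util.concurrent.ScheduledFuture;
    import java.util.concurrent.TimeUnit;

    // Simplified stand-ins for Flink's runtime interfaces.
    interface RestartCallback {
        void triggerFullRecovery();
    }

    interface ScheduledExecutor {
        ScheduledFuture<?> schedule(Runnable command, long delay, TimeUnit unit);
    }

    interface RestartStrategy {
        boolean canRestart();
        void restart(RestartCallback restarter, ScheduledExecutor executor);
    }

    // The kind of user implementation this survey asks about, wired up via
    // "restart-strategy: org.foobar.MyRestartStrategyFactoryFactory".
    class MyRestartStrategy implements RestartStrategy {

        @Override
        public boolean canRestart() {
            return true; // this toy strategy always allows another restart
        }

        @Override
        public void restart(RestartCallback restarter, ScheduledExecutor executor) {
            // Wait 10 seconds, then trigger a full recovery of the job.
            executor.schedule(restarter::triggerFullRecovery, 10, TimeUnit.SECONDS);
        }
    }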

Feel free to share in this thread if you have any concerns about it.

Thanks,
Zhu Zhu

Zhu Zhu wrote on Thu, Sep 12, 2019 at 10:33 PM:

> Thanks Oytun for the reply!
>
> Sorry for not having stated it clearly. When saying "customized
> RestartStrategy", we mean that users implement an
> *org.apache.flink.runtime.executiongraph.restart.RestartStrategy* by
> themselves and use it by configuring like "restart-strategy:
> org.foobar.MyRestartStrategyFactoryFactory".
>
> The usage of restart strategies you mentioned will keep working with the
> new scheduler.
>
> Thanks,
> Zhu Zhu
>
> Oytun Tez wrote on Thu, Sep 12, 2019 at 10:05 PM:
>
>> Hi Zhu,
>>
>> We are using custom restart strategy like this:
>>
>> environment.setRestartStrategy(failureRateRestart(2, Time.minutes(1),
>> Time.minutes(10)));
>>
>>
>> ---
>> Oytun Tez
>>
>> *M O T A W O R D*
>> The World's Fastest Human Translation Platform.
>> oy...@motaword.com — www.motaword.com
>>
>>
>> On Thu, Sep 12, 2019 at 7:11 AM Zhu Zhu  wrote:
>>
>>> Hi everyone,
>>>
>>> I wanted to reach out to you and ask how many of you are using a
>>> customized RestartStrategy[1] in production jobs.
>>>
>>> We are currently developing the new Flink scheduler[2] which interacts
>>> with restart strategies in a different way. We have to re-design the
>>> interfaces for the new restart strategies (so called
>>> RestartBackoffTimeStrategy). Existing customized RestartStrategy will not
>>> work any more with the new scheduler.
>>>
>>> We want to know whether we should keep the way
>>> to customized RestartBackoffTimeStrategy so that existing customized
>>> RestartStrategy can be migrated.
>>>
>>> I'd appreciate if you can share the status if you are using customized
>>> RestartStrategy. That will be valuable for us to make decisions.
>>>
>>> [1]
>>> https://ci.apache.org/projects/flink/flink-docs-master/dev/task_failure_recovery.html#restart-strategies
>>> [2] https://issues.apache.org/jira/browse/FLINK-10429
>>>
>>> Thanks,
>>> Zhu Zhu
>>>
>>


Re: Confluence permission for FLIP creation

2019-09-19 Thread Fabian Hueske
Hi Terry,

I gave you permissions.

Thanks, Fabian

On Thu, Sep 19, 2019 at 04:09, Terry Wang wrote:

> Hi all,
>
> As communicated in an email thread, I'm proposing a Flink SQL DDL
> enhancement. I have a draft design doc that I'd like to convert to a
> FLIP. Thus, it would be great if someone could grant me write access
> to Confluence. My Confluence ID is zjuwangg.
>
> It would be nice if any of you can help on this.
>
> Best,
> Terry Wang
>
>
>
>


Re: [DISCUSS] Contribute Pulsar Flink connector back to Flink

2019-09-19 Thread Stephan Ewen
My take would be the following:

  - If we merge the connector now and replace it with a FLIP-27 version
before the 1.10 release, then we need no deprecation process
  - If we don't manage to replace it with a FLIP-27 version before the 1.10
release, then it is good that we have the other version, so no users get
blocked.

In the latter case we can see how we want to do it. Immediate removal of
the old version or deprecation label and keeping it for one more release.
Given that you should be able to use a Flink 1.10 connector with Flink 1.11
as well (stable public APIs) there is also a workaround if you need an old
connector in a newer version. So immediate removal might even be feasible.


On Thu, Sep 19, 2019 at 11:09 AM Becket Qin  wrote:

> Hi Stephan,
>
> Thanks for the clarification. I completely agree with you and Thomas on the
> process of adding connectors to Flink repo. However, I am wondering what is
> the deprecation process? Given the main concern here was that we may have
> to maintain two Pulsar connector code bases until the old one is removed
> from the repo, it would be good to know how long we have to do that.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Thu, Sep 19, 2019 at 3:54 PM Stephan Ewen  wrote:
>
> > Some quick thoughts on the connector contribution process. I basically
> > reiterate here what Thomas mentioned in another thread about the Kinesis
> > connector.
> >
> > For connectors, we should favor a low-overhead contribution process, and
> > accept user code and changes more readily than in the core system.
> > That is because connectors have a big variety of scenarios they get
> > used in (only through use and many small contributions do they become
> > really useful over time) and, at the same time, committers do not use
> > the connectors themselves and usually cannot foresee too well what is
> > needed.
> >
> > Furthermore, a missing connector (or connector feature) is often a
> bigger
> > show stopper for users than a missing API or system feature.
> >
> > Along these lines of thought, the conclusion would be to take the Pulsar
> > connector now, focus the review on legal/dependencies/rough code style
> and
> > conventions, label it as "beta" (in the sense of "new code" that is "not
> > yet tested through longer use") and go ahead. And then evolve it quickly
> > without putting formal blockers in the way, meaning also adding a new
> FLIP
> > 27 version when it is there.
> >
> > Best,
> > Stephan
> >
> >
> >
> > On Thu, Sep 19, 2019 at 3:47 AM Becket Qin  wrote:
> >
> > > Hi Yijie,
> > >
> > > Could you please follow the FLIP process to start a new FLIP
> [DISCUSSION]
> > > thread in the mailing list?
> > >
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals#FlinkImprovementProposals-Process
> > >
> > > I see two FLIP-69 discussion in the mailing list now. So there is a
> FLIP
> > > number collision. Can you change the FLIP number to 72?
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Thu, Sep 19, 2019 at 12:23 AM Rong Rong 
> wrote:
> > >
> > > > Hi Yijie,
> > > >
> > > > Thanks for sharing the pulsar FLIP.
> > > > Would you mind enabling comments/suggestions on the google doc link?
> > This
> > > > way the contributors from the community can comment on the doc.
> > > >
> > > > Best,
> > > > Rong
> > > >
> > > > On Mon, Sep 16, 2019 at 5:43 AM Yijie Shen <
> henry.yijies...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hello everyone,
> > > > >
> > > > > I've drafted a FLIP that describes the current design of the Pulsar
> > > > > connector:
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#
> > > > >
> > > > > Please take a look and let me know what you think.
> > > > >
> > > > > Thanks,
> > > > > Yijie
> > > > >
> > > > > On Sat, Sep 14, 2019 at 12:08 AM Rong Rong 
> > > wrote:
> > > > > >
> > > > > > Hi All,
> > > > > >
> > > > > > Sorry for joining the discussion late and thanks Yijie & Sijie
> for
> > > > > driving
> > > > > > the discussion.
> > > > > > I also think the Pulsar connector would be a very valuable
> addition
> > > to
> > > > > > Flink. I can also help out a bit on the review side :-)
> > > > > >
> > > > > > Regarding the timeline, I also share concerns with Becket on the
> > > > > > relationship between the new Pulsar connector and FLIP-27.
> > > > > > There's also another discussion just started by Stephan on
> dropping
> > > > Kafka
> > > > > > 9/10 support on next Flink release [1].  Although the situation
> is
> > > > > somewhat
> > > > > > different, and Kafka 9/10 connector has been in Flink for almost
> > 3-4
> > > > > years,
> > > > > > based on the discussion I am not sure if a major version release
> > is a
> > > > > > requirement for removing old connector support.
> > > > > >
> > > > > > I think there shouldn't be a blocker if we agree the old
> connector
> > > will
> > > > > be
> > > > 

Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-19 Thread Stephan Ewen
Hi Jan!

You can also add an additional class path URL to take classes from. Does
that help?
Is there a directory where the generated class files go, so that a
local file URL could reference them and add them to the user code class
loader?

Best,
Stephan


On Tue, Sep 10, 2019 at 10:28 AM Till Rohrmann  wrote:

> Hi Jan,
>
> sorry for my late response. I think the main concern about using the
> context class loader is the unpredictability and magic it adds to a
> component where you actually don't wanna be surprised. Moreover, if we now
> add support for the context class loader, then at a later time another
> component might use this feature as well. This component would then have to
> make sure that the new class loader it sets has the current context class
> loader as its parent because otherwise it might break your use case.
>
> But on the other hand, I admit that I don't have a good solution to your
> problem at hand other than using the RemoteEnvironment.
>
> One comment concerning your proposed class loader resolution. I think it
> adds a bit too much magic which is hard to understand for the user. It
> would be better if the system would fail with a descriptive error message
> instead.
>
> Cheers,
> Till
>
> On Thu, Sep 5, 2019 at 12:55 PM Jan Lukavský  wrote:
>
> > Hi Till and Aljoscha,
> >
> > I was investigating the other options, but unfortunately all of them
> > look a little complicated (although possible, of course). But before
> > going into more complicated solutions, I'd like to know what issues
> > you actually see with using the context class loader. I can think of one
> > difficulty - if (for whatever reason) the context class loader doesn't
> > contain (in itself or as a parent) the class loader that loaded Flink core
> > classes, that would probably cause trouble. So, what about a solution
> > where we take as the parent class loader of FlinkUserCodeClassLoaders a class
> > loader that is:
> >
> >   a) context class loader of current thread, if it either is actually
> > class loader of flink core classes, or if it contains this class loader
> > in its parent hierarchy, or
> >
> >   b) class loader of flink core classes
> >
> > That way, class loader of flink core classes would always be in parent
> > hierarchy of FlinkUserCodeClassLoaders. Would that solve the issues you
> > see? It works for me.
> >
> > Jan
> >
> > On 9/3/19 4:52 PM, Jan Lukavský wrote:
> > > Answers inline.
> > >
> > > On 9/3/19 4:01 PM, Till Rohrmann wrote:
> > >> How so? Does your REPL add the generated classes to the system class
> > >> loader? I assume the system class loader is used to load the Flink
> > >> classes.
> > > No, it does not. It cannot on JDK >= 9 (or would have to hack into
> > > jdk.internal.loader.ClassLoaders, which I don't want to :)). It just
> > > creates another class loader, and is able to create a jar from
> > > generated files. The jar is used for remote execution.
> > >>
> > >> Ideally, what you would like to have is the option to provide the
> parent
> > >> class loader which is used to load user code to the LocalEnvironment.
> > >> This one
> > >> could then be forwarded to the TaskExecutor where it is used to
> generate
> > >> the user code class loader. But this is a bigger effort.
> > > I'm not sure how this differs from using context classloader? Maybe
> > > there is subtle difference in that this is a little more explicit. On
> > > the other hand, users normally do not modify class loaders, so the
> > > practical impact is IMHO negligible. But maybe this opens another
> > > possibility - we probably could add optional ClassLoader parameter to
> > > LocalEnvironment, with default value of
> > > FlinkRunner.class.getClassLoader()? That might be a good compromise.
> > >>
> > >> The downside to this approach is that it requires you to create a jar
> > >> file
> > >> and to submit it via a REST call. The upside is that it is closer to
> the
> > >> production setting.
> > >
> > > Yes, a REPL has to do that anyway to support distributed computing, so
> > > this is not an issue.
> > >
> > > Jan
> > >
> > >>
> > >> Cheers,
> > >> Till
> > >>
> > >> On Tue, Sep 3, 2019 at 3:47 PM Jan Lukavský  wrote:
> > >>
> > >>> On the other hand, if you say, that the contract of LocalEnvironment
> is
> > >>> to execute as if it had all classes on its class loader, then it
> > >>> currently breaks this contract. :-)
> > >>>
> > >>> Jan
> > >>>
> > >>> On 9/3/19 3:45 PM, Jan Lukavský wrote:
> >  Hi Till,
> > 
> >  hmm, that sounds like it might work. I would have to incorporate this
> >  (either as default, or on demand) into Apache Beam. Would you see
> any
> >  disadvantages of this approach? Would you suggest to make this
> default
> >  behavior for local beam FlinkRunner? I can introduce a configuration
> >  option to turn this behavior on, but that would bring additional
> >  maintenance burden, etc., etc.
> > 
> >  Jan
> > 
> >  On 9/3/19 3:38 PM, 

[jira] [Created] (FLINK-14127) Better BackPressure Detection in WebUI

2019-09-19 Thread Yadong Xie (Jira)
Yadong Xie created FLINK-14127:
--

 Summary: Better BackPressure Detection in WebUI
 Key: FLINK-14127
 URL: https://issues.apache.org/jira/browse/FLINK-14127
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Affects Versions: 1.10.0
Reporter: Yadong Xie
 Fix For: 1.10.0
 Attachments: 屏幕快照 2019-09-19 下午6.00.05.png, 屏幕快照 2019-09-19 
下午6.00.57.png, 屏幕快照 2019-09-19 下午6.01.43.png

According to the 
[Document|https://ci.apache.org/projects/flink/flink-docs-release-1.9/monitoring/back_pressure.html],
the backpressure monitor is only triggered on request and is currently not 
available via metrics. This means that in the web UI we have no way to show the 
backpressure states of all vertices at the same time. Users need to 
click every vertex to get its backpressure state.

!屏幕快照 2019-09-19 下午6.00.05.png|width=510,height=197!

In Flink 1.9.0 and above, there are four metrics available (outPoolUsage, 
inPoolUsage, floatingBuffersUsage, exclusiveBuffersUsage). We can use these 
metrics to determine whether there is possible backpressure, and then use the 
backpressure REST API to confirm it.
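
As a hedged illustration of that two-step approach (the 0.9 threshold below is an assumption for illustration, not a documented value):

    // Sketch: flag a vertex as possibly backpressured from its buffer-usage
    // metrics, then confirm via the sampling-based backpressure REST API.
    final class BackPressureHeuristic {

        static boolean possiblyBackpressured(double outPoolUsage) {
            // A nearly full output buffer pool means downstream is not
            // keeping up, so this task is likely backpressured.
            return outPoolUsage >= 0.9;
        }

        public static void main(String[] args) {
            double outPoolUsage = 0.95; // sample metric value
            if (possiblyBackpressured(outPoolUsage)) {
                // Only now query the backpressure REST API to confirm,
                // since that monitor is triggered on request.
                System.out.println("possible backpressure; confirm via REST API");
            }
        }
    }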

Here is a table taken from 
[https://flink.apache.org/2019/07/23/flink-network-stack-2.html]

!屏幕快照 2019-09-19 下午6.00.57.png|width=516,height=304!

 

We can display the possible backpressure status on the vertex graph, so that 
users can see the backpressure states of all vertices and locate potential 
problems quickly.

 

!屏幕快照 2019-09-19 下午6.01.43.png|width=572,height=277!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

2019-09-19 Thread Fabian Hueske
I agree, it's very similar from the implementation point of view and the
implications.

IMO, the difference is mostly in the mental model for the user.
Instead of having a special class of temporary functions that take
precedence over built-in functions, it suggests temporarily changing
built-in functions.

Fabian

On Thu, Sep 19, 2019 at 11:52, Kurt Young wrote:

> Hi Fabian,
>
> I think it's almost the same as #2 with a different keyword:
>
> CREATE TEMPORARY BUILTIN FUNCTION xxx
>
> Best,
> Kurt
>
>
> On Thu, Sep 19, 2019 at 5:50 PM Fabian Hueske  wrote:
>
> > Hi,
> >
> > I thought about it a bit more and think that there is some good value in
> my
> > last proposal.
> >
> > A lot of complexity comes from the fact that we want to allow overriding
> > built-in functions, which are addressed differently from other functions
> > (and db objects).
> > We could just have "CREATE TEMPORARY FUNCTION" do exactly the same thing
> as
> > "CREATE FUNCTION" and treat both functions exactly the same except that:
> > 1) temp functions disappear at the end of the session
> > 2) temp functions are resolved before other functions
> >
> > This would be Dawid's proposal from the beginning of this thread (in case
> > you still remember... ;-) )
> >
> > Temporarily overriding built-in functions would be supported with an
> > explicit command like
> >
> > ALTER BUILTIN FUNCTION xxx TEMPORARILY AS ...
> >
> > This would also address the concerns about accidentally changing the
> > semantics of built-in functions.
> > IMO, it can't get much more explicit than the above command.
> >
> > Sorry for bringing up a new option in the middle of the discussion, but
> as
> > I said, I think it has a bunch of benefits and I don't see major
> drawbacks
> > (maybe you do?).
> >
> > What do you think?
> >
> > Fabian
> >
> > On Thu, Sep 19, 2019 at 11:24, Fabian Hueske <fhue...@gmail.com> wrote:
> >
> > > Hi everyone,
> > >
> > > I thought again about option #1 and something that I don't like is that
> > > the resolved address of xyz is different in "CREATE FUNCTION xyz" and
> > > "CREATE TEMPORARY FUNCTION xyz".
> > > IMO, adding the keyword "TEMPORARY" should only change the lifecycle of
> > > the function, but not where it is located. This implicitly changed
> > location
> > > might be confusing for users.
> > > After all, a temp function should behave pretty much like any other
> > > function, except for the fact that it disappears when the session is
> > closed.
> > >
> > > Approach #2 with the additional keyword would make that pretty clear,
> > IMO.
> > > However, I like neither GLOBAL (for reasons mentioned by Dawid) nor
> > > BUILTIN (we are not adding a built-in function).
> > > So I'd be OK with #2 if we find a good keyword. In fact, approach #2
> > could
> > > also be an alias for approach #3 to avoid explicit specification of the
> > > system catalog/db.
> > >
> > > Approach #3 would be consistent with other db objects and the "CREATE
> > > FUNCTION" statement.
> > > Adding system catalog/db seems rather complex, but then again how often
> > do
> > > we expect users to override built-in functions? If this becomes a major
> > > issue, we can still add option #2 as an alias.
> > >
> > > Not sure what's the best approach from an internal point of view, but I
> > > certainly think that consistent behavior is important.
> > > Hence my votes are:
> > >
> > > -1 for #1
> > > 0 for #2
> > > 0 for #3
> > >
> > > Btw. Did we consider a completely separate command for overriding
> > built-in
> > > functions like "ALTER BUILTIN FUNCTION xxx TEMPORARILY AS ..."?
> > >
> > > Cheers, Fabian
> > >
> > >
> > > On Thu, Sep 19, 2019 at 11:03, JingsongLee wrote:
> > >
> > >> I know Hive and Spark can shadow built-in functions by temporary
> > function.
> > >> Mysql, Oracle, Sql server can not shadow.
> > >> User can use full names to access functions instead of shadowing.
> > >>
> > >> So I think it is a completely new thing, and the direct way to deal
> with
> > >> new things is to add new grammar. So,
> > >> +1 for #2, +0 for #3, -1 for #1
> > >>
> > >> Best,
> > >> Jingsong Lee
> > >>
> > >>
> > >> --
> > >> From:Kurt Young 
> > >> Send Time: 2019-09-19 (Thursday) 16:43
> > >> To:dev 
> > >> Subject:Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog
> > >>
> > >> And let me make my vote complete:
> > >>
> > >> -1 for #1
> > >> +1 for #2 with different keyword
> > >> -0 for #3
> > >>
> > >> Best,
> > >> Kurt
> > >>
> > >>
> > >> On Thu, Sep 19, 2019 at 4:40 PM Kurt Young  wrote:
> > >>
> > >> > Looks like I'm the only person who is willing to +1 to #2 for now
> :-)
> > >> > But I would suggest to change the keyword from GLOBAL to
> > >> > something like BUILTIN.
> > >> >
> > >> > I think #2 and #3 are almost the same proposal, just with different
> > >> > format to indicate whether it wants to override built-in functions.
> > >> >
> > >> > My biggest 

Re: [VOTE] FLIP-56: Dynamic Slot Allocation

2019-09-19 Thread Xintong Song
Thanks for the comments, Till.

- Agree on removing SlotID.

- Regarding the implementation plan, it is true that we can possibly reduce
the amount of code separated by the feature flag. But I think to do that we need to
introduce more dependencies between implementation steps. With the current
plan, we can easily separate steps on the RM side and the TM side, and
start concurrently working on them after quickly updating the interfaces in
between. The feature will come alive when the steps on both RM/TM sides are
finished. Since we are planning to have two people (Andrey and me) working
on this FLIP, I think the current plan is probably more convenient.

Thank you~

Xintong Song



On Thu, Sep 19, 2019 at 5:09 PM Till Rohrmann  wrote:

> Hi Xintong,
>
> thanks for starting the vote. The general plan looks good. Hence +1 from my
> side. I still have some minor comments one could think about:
>
> * As we no longer have predetermined slots on the TaskExecutor, I think we
> can get rid of the SlotID. Instead, an allocated slot will be identified by
> the AllocationID and the TaskManager's ResourceID in order to differentiate
> duplicate registrations.
> * For the implementation plan, I believe there is only one tiny part on the
> SlotManager for which we need a separate code path/feature flag which is
> how we find a matching slot. Everything else should be possible to
> implement in a way that it works with dynamic and static slot allocation:
> 1. Let TMs register with default slot profile at RM
> 2. Change SlotManager to use reported slot profiles instead of
> pre-calculated profiles
> 3. Replace SlotID with SlotProfile in TaskExecutorGateway#requestSlot
> 4. Extend TM to support dynamic slot allocation (aka proper bookkeeping)
> (can happen concurrently to any of steps 2-3)
> 5. Add bookkeeping to SlotManager (for pending TMs and registered TMs) but
> still only use default slot profiles for matching with slot requests
> 6. Allow to match slot requests with reported resources instead of default
> slot profiles (here we could use a feature flag to switch between dynamic
> and static slot allocation)
>
> Wdyt?
>
> Cheers,
> Till
>
> On Thu, Sep 19, 2019 at 9:45 AM Andrey Zagrebin 
> wrote:
>
> > Hi Xintong,
> >
> > Thanks for starting the vote, +1 from my side.
> >
> > Best,
> > Andrey
> >
> > On Tue, Sep 17, 2019 at 4:26 PM Xintong Song 
> > wrote:
> >
> > > Hi all,
> > >
> > > I would like to start the vote for FLIP-56 [1], on which a consensus is
> > > reached in this discussion thread [2].
> > >
> > > The vote will be open for at least 72 hours. I'll try to close it after
> > > Sep. 20 15:00 UTC, unless there is an objection or not enough votes.
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > >
> > > [1]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-56%3A+Dynamic+Slot+Allocation
> > >
> > > [2]
> > >
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-56-Dynamic-Slot-Allocation-td31960.html
> > >
> >
>


Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

2019-09-19 Thread Kurt Young
Hi Fabian,

I think it's almost the same as #2 with a different keyword:

CREATE TEMPORARY BUILTIN FUNCTION xxx

Best,
Kurt


On Thu, Sep 19, 2019 at 5:50 PM Fabian Hueske  wrote:

> Hi,
>
> I thought about it a bit more and think that there is some good value in my
> last proposal.
>
> A lot of complexity comes from the fact that we want to allow overriding
> built-in functions which are differently addressed as other functions (and
> db objects).
> We could just have "CREATE TEMPORARY FUNCTION" do exactly the same thing as
> "CREATE FUNCTION" and treat both functions exactly the same except that:
> 1) temp functions disappear at the end of the session
> 2) temp functions are resolved before other functions
>
> This would be Dawid's proposal from the beginning of this thread (in case
> you still remember... ;-) )
>
> Temporarily overriding built-in functions would be supported with an
> explicit command like
>
> ALTER BUILTIN FUNCTION xxx TEMPORARILY AS ...
>
> This would also address the concerns about accidentally changing the
> semantics of built-in functions.
> IMO, it can't get much more explicit than the above command.
>
> Sorry for bringing up a new option in the middle of the discussion, but as
> I said, I think it has a bunch of benefits and I don't see major drawbacks
> (maybe you do?).
>
> What do you think?
>
> Fabian
>
> On Thu, Sep 19, 2019 at 11:24, Fabian Hueske <fhue...@gmail.com> wrote:
>
> > Hi everyone,
> >
> > I thought again about option #1 and something that I don't like is that
> > the resolved address of xyz is different in "CREATE FUNCTION xyz" and
> > "CREATE TEMPORARY FUNCTION xyz".
> > IMO, adding the keyword "TEMPORARY" should only change the lifecycle of
> > the function, but not where it is located. This implicitly changed
> location
> > might be confusing for users.
> > After all, a temp function should behave pretty much like any other
> > function, except for the fact that it disappears when the session is
> closed.
> >
> > Approach #2 with the additional keyword would make that pretty clear,
> IMO.
> > However, I like neither GLOBAL (for reasons mentioned by Dawid) nor
> > BUILTIN (we are not adding a built-in function).
> > So I'd be OK with #2 if we find a good keyword. In fact, approach #2
> could
> > also be an alias for approach #3 to avoid explicit specification of the
> > system catalog/db.
> >
> > Approach #3 would be consistent with other db objects and the "CREATE
> > FUNCTION" statement.
> > Adding system catalog/db seems rather complex, but then again how often
> do
> > we expect users to override built-in functions? If this becomes a major
> > issue, we can still add option #2 as an alias.
> >
> > Not sure what's the best approach from an internal point of view, but I
> > certainly think that consistent behavior is important.
> > Hence my votes are:
> >
> > -1 for #1
> > 0 for #2
> > 0 for #3
> >
> > Btw. Did we consider a completely separate command for overriding
> built-in
> > functions like "ALTER BUILTIN FUNCTION xxx TEMPORARILY AS ..."?
> >
> > Cheers, Fabian
> >
> >
> > On Thu, Sep 19, 2019 at 11:03, JingsongLee wrote:
> >
> >> I know Hive and Spark can shadow built-in functions by temporary
> function.
> >> Mysql, Oracle, Sql server can not shadow.
> >> User can use full names to access functions instead of shadowing.
> >>
> >> So I think it is a completely new thing, and the direct way to deal with
> >> new things is to add new grammar. So,
> >> +1 for #2, +0 for #3, -1 for #1
> >>
> >> Best,
> >> Jingsong Lee
> >>
> >>
> >> --
> >> From:Kurt Young 
> >> Send Time: 2019-09-19 (Thursday) 16:43
> >> To:dev 
> >> Subject:Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog
> >>
> >> And let me make my vote complete:
> >>
> >> -1 for #1
> >> +1 for #2 with different keyword
> >> -0 for #3
> >>
> >> Best,
> >> Kurt
> >>
> >>
> >> On Thu, Sep 19, 2019 at 4:40 PM Kurt Young  wrote:
> >>
> >> > Looks like I'm the only person who is willing to +1 to #2 for now :-)
> >> > But I would suggest to change the keyword from GLOBAL to
> >> > something like BUILTIN.
> >> >
> >> > I think #2 and #3 are almost the same proposal, just with different
> >> > format to indicate whether it wants to override built-in functions.
> >> >
> >> > My biggest reason to choose it is I want this behavior to be consistent
> >> > with temporary tables. I will give some examples to show the behavior
> >> > and also make sure I'm not misunderstanding anything here.
> >> >
> >> > For most DBs, when users create a temporary table with:
> >> >
> >> > CREATE TEMPORARY TABLE t1
> >> >
> >> > It's actually equivalent with:
> >> >
> >> > CREATE TEMPORARY TABLE `current_db`.t1
> >> >
> >> > If users change the current database, they will not be able to access t1
> >> > without a fully qualified name, i.e. db1.t1 (assuming db1 is the current database
> when
> >> > this temporary table is created).
> >> >
> >> > Only #2 and #3 

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

2019-09-19 Thread Fabian Hueske
Hi,

I thought about it a bit more and think that there is some good value in my
last proposal.

A lot of complexity comes from the fact that we want to allow overriding
built-in functions, which are addressed differently from other functions (and
db objects).
We could just have "CREATE TEMPORARY FUNCTION" do exactly the same thing as
"CREATE FUNCTION" and treat both functions exactly the same except that:
1) temp functions disappear at the end of the session
2) temp functions are resolved before other functions
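
A toy sketch of the lookup order those two rules imply (names are illustrative, not Flink's FunctionCatalog API):

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Optional;

    // Toy sketch of the proposed resolution order: temporary functions are
    // consulted first and vanish with the session; built-in and catalog
    // functions are resolved afterwards, unchanged.
    final class FunctionResolutionSketch {

        private final Map<String, Object> temporaryFunctions = new HashMap<>();
        private final Map<String, Object> otherFunctions = new HashMap<>();

        Optional<Object> resolve(String name) {
            Object temp = temporaryFunctions.get(name); // rule 2: temp first
            if (temp != null) {
                return Optional.of(temp);
            }
            return Optional.ofNullable(otherFunctions.get(name));
        }

        void endSession() {
            temporaryFunctions.clear(); // rule 1: temp functions disappear
        }
    }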

This would be Dawid's proposal from the beginning of this thread (in case
you still remember... ;-) )

Temporarily overriding built-in functions would be supported with an
explicit command like

ALTER BUILTIN FUNCTION xxx TEMPORARILY AS ...

This would also address the concerns about accidentally changing the
semantics of built-in functions.
IMO, it can't get much more explicit than the above command.

Sorry for bringing up a new option in the middle of the discussion, but as
I said, I think it has a bunch of benefits and I don't see major drawbacks
(maybe you do?).

What do you think?

Fabian

On Thu, Sep 19, 2019 at 11:24, Fabian Hueske wrote:

> Hi everyone,
>
> I thought again about option #1 and something that I don't like is that
> the resolved address of xyz is different in "CREATE FUNCTION xyz" and
> "CREATE TEMPORARY FUNCTION xyz".
> IMO, adding the keyword "TEMPORARY" should only change the lifecycle of
> the function, but not where it is located. This implicitly changed location
> might be confusing for users.
> After all, a temp function should behave pretty much like any other
> function, except for the fact that it disappears when the session is closed.
>
> Approach #2 with the additional keyword would make that pretty clear, IMO.
> However, I like neither GLOBAL (for reasons mentioned by Dawid) nor BUILTIN
> (we are not adding a built-in function).
> So I'd be OK with #2 if we find a good keyword. In fact, approach #2 could
> also be an alias for approach #3 to avoid explicit specification of the
> system catalog/db.
>
> Approach #3 would be consistent with other db objects and the "CREATE
> FUNCTION" statement.
> Adding system catalog/db seems rather complex, but then again how often do
> we expect users to override built-in functions? If this becomes a major
> issue, we can still add option #2 as an alias.
>
> Not sure what's the best approach from an internal point of view, but I
> certainly think that consistent behavior is important.
> Hence my votes are:
>
> -1 for #1
> 0 for #2
> 0 for #3
>
> Btw. Did we consider a completely separate command for overriding built-in
> functions like "ALTER BUILTIN FUNCTION xxx TEMPORARILY AS ..."?
>
> Cheers, Fabian
>
>
> On Thu, Sep 19, 2019 at 11:03, JingsongLee wrote:
>
>> I know Hive and Spark can shadow built-in functions by temporary function.
>> Mysql, Oracle, Sql server can not shadow.
>> User can use full names to access functions instead of shadowing.
>>
>> So I think it is a completely new thing, and the direct way to deal with
>> new things is to add new grammar. So,
>> +1 for #2, +0 for #3, -1 for #1
>>
>> Best,
>> Jingsong Lee
>>
>>
>> --
>> From:Kurt Young 
> Send Time: 2019-09-19 (Thursday) 16:43
>> To:dev 
>> Subject:Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog
>>
>> And let me make my vote complete:
>>
>> -1 for #1
>> +1 for #2 with different keyword
>> -0 for #3
>>
>> Best,
>> Kurt
>>
>>
>> On Thu, Sep 19, 2019 at 4:40 PM Kurt Young  wrote:
>>
>> > Looks like I'm the only person who is willing to +1 to #2 for now :-)
>> > But I would suggest to change the keyword from GLOBAL to
>> > something like BUILTIN.
>> >
>> > I think #2 and #3 are almost the same proposal, just with different
> >> > format to indicate whether it wants to override built-in functions.
>> >
> >> > My biggest reason to choose it is I want this behavior to be consistent
> >> > with temporary tables. I will give some examples to show the behavior
>> > and also make sure I'm not misunderstanding anything here.
>> >
> >> > For most DBs, when users create a temporary table with:
>> >
>> > CREATE TEMPORARY TABLE t1
>> >
>> > It's actually equivalent with:
>> >
> >> > CREATE TEMPORARY TABLE `current_db`.t1
>> >
> >> > If users change the current database, they will not be able to access t1
> >> > without a fully qualified name, i.e. db1.t1 (assuming db1 is the current database when
>> > this temporary table is created).
>> >
>> > Only #2 and #3 followed this behavior and I would vote for this since
>> this
> >> > makes such behavior consistent across temporary tables and functions.
>> >
> >> > Why I'm not voting for #3 is that a special catalog and database just looks
> >> > very hacky to me. It gives the impression that our built-in functions are
> >> > saved in a special catalog and database, which is actually not the case.
> >> > Introducing a dedicated keyword
> >> > like CREATE TEMPORARY BUILTIN FUNCTION looks more clear and
>> > 

Re: [DISCUSS] Add ARM CI build to Flink (information-only)

2019-09-19 Thread Stephan Ewen
My gut feeling is that having a CI that only runs on a specific command
will not help too much.

What about going with nightly builds then? We could set up the ARM CI the
same way as the Travis CI nightly builds (cron builds). They report build
failures to "bui...@flink.apache.org".
Maybe Chesnay or Jark could help with what needs to be done to post to that
mailing list?

A requirement would be that the builds are stable from the ARM
perspective, meaning that there are currently no failures caused by
ARM-specific issues.

What do the others think?


On Tue, Sep 3, 2019 at 4:40 AM Xiyuan Wang  wrote:

> The ARM CI trigger has been changed to the `github comment` way only. It means
> that a PR won't start the ARM test unless a `check_arm` comment is added,
> like what I did in the PR[1].
>
> A POC for the Flink nightly end-to-end test job has been created as well[2].
> I'll keep improving it.
>
> Any feedback or question?
>
>
> [1]: https://github.com/apache/flink/pull/9416
>  https://github.com/apache/flink/pull/9416#issuecomment-527268203
> [2]: https://github.com/theopenlab/openlab-zuul-jobs/pull/631
>
>
> Thanks
>
Xiyuan Wang wrote on Mon, Aug 26, 2019 at 7:41 PM:
>
> > Before the ARM CI is ready, I can disable the CI test for each PR and let it
> > only be triggered by a PR comment.  It's quite easy for OpenLab to do this.
> >
> > OpenLab has many job pipelines[1].  Now I use the `check` pipeline in
> > https://github.com/apache/flink/pull/9416. The job trigger contains
> > github_action and github_comment[2]. I can create a new pipeline for Flink;
> > the new trigger can contain only github_comment, like:
> >
> > trigger:
> >   github:
> >  - event: pull_request
> >action: comment
> >comment: (?i)^\s*recheck_arm_build\s*$
> >
> > That way the ARM job will not be run for every PR. It'll only be run for
> > PRs which have a `recheck_arm_build` comment.
> >
> > Then once ARM CI is ready, I can add it back.
> >
> >
> > Nightly tests can be added as well, of course. There is a kind of job in
> > OpenLab called a `periodic job`. We can use it for Flink's daily nightly tests.
> > If any error occurs, the report can be sent to bui...@flink.apache.org as
> > well.
> >
> > [1]:
> >
> https://github.com/theopenlab/openlab-zuul-jobs/blob/master/zuul.d/pipelines.yaml
> > [2]:
> >
> https://github.com/theopenlab/openlab-zuul-jobs/blob/master/zuul.d/pipelines.yaml#L10-L19
> >
> >> Stephan Ewen wrote on Mon, Aug 26, 2019 at 6:13 PM:
> >
> >> Adding CI builds for ARM only makes sense when we actually take them into
> >> account as "blocking a merge"; otherwise there is no point in having them.
> >> So we would need to be prepared to do that.
> >>
> >> The cases where something runs on UNIX/x64 but fails on ARM are few,
> >> and so far they seem to have been related to libraries or some magic that
> >> tries to do system-dependent actions outside Java.
> >>
> >> One worthwhile discussion could be whether to run the ARM CI builds as
> >> part
> >> of the nightly tests, not on every commit.
> >> There are a lot of nightly tests, for example for different Java /
> Scala /
> >> Hadoop versions.
> >>
> >> On Mon, Aug 26, 2019 at 10:46 AM Xiyuan Wang 
> >> wrote:
> >>
> >> > Sorry, maybe my words were misleading.
> >> >
> >> > We are just starting to add ARM support, so the CI is non-voting at this
> >> > moment to avoid blocking normal Flink development.
> >> >
> >> > But once the ARM CI works well and is stable enough, we should mark it as
> >> > voting. It means that in the future, if the ARM test fails in a PR, the
> >> > PR cannot be merged. The test log should tell developers what the error
> >> > is. If a developer needs to debug the details on an ARM VM, OpenLab can
> >> > provide one.
> >> >
> >> > Adding ARM CI can make sure Flink supports ARM natively.
> >> >
> >> > I left a workflow in the PR, I'd like to print it here:
> >> >
> >> >    1. Add the basic build script to ensure the CI system and build job
> >> >    work as expected. The job should be marked as non-voting first; it means
> >> >    a CI test failure won't block a Flink PR from being merged.
> >> >    2. Add the test script to run unit/integration tests. At this step the
> >> >    --fn parameter will be added to mvn test. It will run the full set of
> >> >    test cases in Flink, so that we can find which tests fail on ARM.
> >> >    3. Fix the test failures one by one.
> >> >    4. Once all the tests pass, remove the --fn parameter and keep
> >> >    watching the CI's status for some days. If some bugs arise then, fix
> >> >    them as we usually do for travis-ci.
> >> >    5. Once the CI is stable enough, remove the non-voting tag, so that
> >> >    the ARM CI will be the same as travis-ci, i.e. one of the gates for
> >> >    Flink PRs.
> >> >    6. Finally, the Flink community can announce and release an ARM
> >> >    version of Flink.
> >> >
> >> >
> >> > Chesnay Schepler wrote on Mon, Aug 26, 2019 at 2:25 PM:
> >> >
> >> >> I'm sorry, but if these issues are only fixed later 

Re: Confluence permission for FLIP creation

2019-09-19 Thread Zili Chen
Thanks!

Best,
tison.


Till Rohrmann wrote on Thu, Sep 19, 2019 at 5:34 PM:

> Granted. You should now be able to create pages, Tison.
>
> Cheers,
> Till
>
> On Thu, Sep 19, 2019 at 11:01 AM Zili Chen  wrote:
>
> > Hi devs,
> >
> > I'd like to create a page about the ongoing JobClient FLIP. Could you
> > grant me Confluence permission for FLIP creation?
> >
> > My Confluence ID is tison.
> >
> > Best,
> > tison.
> >
>


Re: [DISCUSS] Contribute Pulsar Flink connector back to Flink

2019-09-19 Thread Fabian Hueske
Hi,

You should have the necessary permissions.
Let me know if something doesn't work.

Cheers, Fabian

On Thu, Sep 19, 2019 at 10:56, Yijie Shen <henry.yijies...@gmail.com> wrote:

> Hi Fabian,
>
> It's yijieshen
>
> Thanks for your help!
>
> On Thu, Sep 19, 2019 at 4:52 PM Fabian Hueske  wrote:
> >
> > Hi Yijie,
> >
> > I can give you permission if you tell me your Confluence user name.
> >
> > Thanks, Fabian
> >
> > On Thu, Sep 19, 2019 at 10:49, Yijie Shen <henry.yijies...@gmail.com> wrote:
> >
> > > > > > Could you please follow the FLIP process to start a new FLIP
> > > [DISCUSSION]
> > > > > > thread in the mailing list?
> > >
> > > It seems that I don't have permission to create pages in FLIP.
> > >
> > >
> > > > > > > Thanks for sharing the pulsar FLIP.
> > > > > > > Would you mind enabling comments/suggestions on the google doc
> > > link?
> > >
> > > Sorry, my bad, I've enabled comments for the doc.
> > > Does it work for now?
> > >
> > > On Thu, Sep 19, 2019 at 4:07 PM Fabian Hueske 
> wrote:
> > > >
> > > > +1 to what Stephan (and Thomas) said.
> > > >
> > > > On Thu, Sep 19, 2019 at 09:54, Stephan Ewen <se...@apache.org> wrote:
> > > >
> > > > > Some quick thoughts on the connector contribution process. I
> basically
> > > > > reiterate here what Thomas mentioned in another thread about the
> > > Kinesis
> > > > > connector.
> > > > >
> > > > > For connectors, we should favor a low-overhead contribution
> process,
> > > and
> > > > > accept user code and changes more readily than in the core system.
> > > > > That is because connectors have a big variety of scenarios they get
> > > > > used in (only through use and many small contributions do they become
> > > > > really useful over time) and, at the same time, committers do not use
> > > > > the connectors themselves and usually cannot foresee too well what is
> > > > > needed.
> > > > >
> > > > > Furthermore, a missing connector (or connector feature) is often a
> > > bigger
> > > > > show stopper for users than a missing API or system feature.
> > > > >
> > > > > Along these lines of thought, the conclusion would be to take the
> Pulsar
> > > > > connector now, focus the review on legal/dependencies/rough code
> style
> > > and
> > > > > conventions, label it as "beta" (in the sense of "new code" that is
> > > "not
> > > > > yet tested through longer use") and go ahead. And then evolve it
> > > quickly
> > > > > without putting formal blockers in the way, meaning also adding a
> new
> > > FLIP
> > > > > 27 version when it is there.
> > > > >
> > > > > Best,
> > > > > Stephan
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Sep 19, 2019 at 3:47 AM Becket Qin 
> > > wrote:
> > > > >
> > > > > > Hi Yijie,
> > > > > >
> > > > > > Could you please follow the FLIP process to start a new FLIP
> > > [DISCUSSION]
> > > > > > thread in the mailing list?
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals#FlinkImprovementProposals-Process
> > > > > >
> > > > > > I see two FLIP-69 discussion in the mailing list now. So there
> is a
> > > FLIP
> > > > > > number collision. Can you change the FLIP number to 72?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jiangjie (Becket) Qin
> > > > > >
> > > > > > On Thu, Sep 19, 2019 at 12:23 AM Rong Rong 
> > > wrote:
> > > > > >
> > > > > > > Hi Yijie,
> > > > > > >
> > > > > > > Thanks for sharing the pulsar FLIP.
> > > > > > > Would you mind enabling comments/suggestions on the google doc
> > > link?
> > > > > This
> > > > > > > way the contributors from the community can comment on the doc.
> > > > > > >
> > > > > > > Best,
> > > > > > > Rong
> > > > > > >
> > > > > > > On Mon, Sep 16, 2019 at 5:43 AM Yijie Shen <
> > > henry.yijies...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hello everyone,
> > > > > > > >
> > > > > > > > I've drafted a FLIP that describes the current design of the
> > > Pulsar
> > > > > > > > connector:
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
> https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#
> > > > > > > >
> > > > > > > > Please take a look and let me know what you think.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Yijie
> > > > > > > >
> > > > > > > > On Sat, Sep 14, 2019 at 12:08 AM Rong Rong <
> walter...@gmail.com>
> > > > > > wrote:
> > > > > > > > >
> > > > > > > > > Hi All,
> > > > > > > > >
> > > > > > > > > Sorry for joining the discussion late and thanks Yijie &
> Sijie
> > > for
> > > > > > > > driving
> > > > > > > > > the discussion.
> > > > > > > > > I also think the Pulsar connector would be a very valuable
> > > addition
> > > > > > to
> > > > > > > > > Flink. I can also help out a bit on the review side :-)
> > > > > > > > >
> > > > > > > > > Regarding the timeline, I also share concerns with Becket
> on

[jira] [Created] (FLINK-14126) Elasticsearch Xpack Machine Learning doesn't support ARM

2019-09-19 Thread wangxiyuan (Jira)
wangxiyuan created FLINK-14126:
--

 Summary: Elasticsearch Xpack Machine Learning doesn't support ARM
 Key: FLINK-14126
 URL: https://issues.apache.org/jira/browse/FLINK-14126
 Project: Flink
  Issue Type: Sub-task
  Components: Tests
Affects Versions: 1.9.0
Reporter: wangxiyuan
 Fix For: 2.0.0


The Elasticsearch X-Pack Machine Learning function is enabled by default if the 
version is >= 6.0, but this feature doesn't support the ARM architecture. As a 
result, in some e2e tests Elasticsearch fails to start.

We should disable the ML feature on ARM in this case.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Confluence permission for FLIP creation

2019-09-19 Thread Till Rohrmann
Granted. You should now be able to create pages, Tison.

Cheers,
Till

On Thu, Sep 19, 2019 at 11:01 AM Zili Chen  wrote:

> Hi devs,
>
> I'd like to create a page about the ongoing JobClient FLIP. Could you
> grant me Confluence permission for FLIP creation?
>
> My Confluence ID is tison.
>
> Best,
> tison.
>


Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

2019-09-19 Thread Fabian Hueske
Hi everyone,

I thought again about option #1 and something that I don't like is that the
resolved address of xyz is different in "CREATE FUNCTION xyz" and "CREATE
TEMPORARY FUNCTION xyz".
IMO, adding the keyword "TEMPORARY" should only change the lifecycle of the
function, but not where it is located. This implicitly changed location
might be confusing for users.
After all, a temp function should behave pretty much like any other
function, except for the fact that it disappears when the session is closed.

Approach #2 with the additional keyword would make that pretty clear, IMO.
However, I like neither GLOBAL (for reasons mentioned by Dawid) nor BUILTIN
(we are not adding a built-in function).
So I'd be OK with #2 if we find a good keyword. In fact, approach #2 could
also be an alias for approach #3 to avoid explicit specification of the
system catalog/db.

Approach #3 would be consistent with other db objects and the "CREATE
FUNCTION" statement.
Adding system catalog/db seems rather complex, but then again how often do
we expect users to override built-in functions? If this becomes a major
issue, we can still add option #2 as an alias.

Not sure what's the best approach from an internal point of view, but I
certainly think that consistent behavior is important.
Hence my votes are:

-1 for #1
0 for #2
0 for #3

Btw. Did we consider a completely separate command for overriding built-in
functions like "ALTER BUILTIN FUNCTION xxx TEMPORARILY AS ..."?
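
To make the alternatives concrete, the candidate syntaxes would look roughly
like this (a sketch only -- the keywords and the system catalog/db name are
placeholders taken from this thread, not agreed grammar):

-- #1: plain 1-part identifier, implicitly shadowing the built-in function
CREATE TEMPORARY FUNCTION explode AS '...';

-- #2: dedicated keyword marking the override (GLOBAL, BUILTIN, or similar)
CREATE TEMPORARY GLOBAL FUNCTION explode AS '...';

-- #3: fully qualified name in a reserved system catalog/database
CREATE TEMPORARY FUNCTION system.system.explode AS '...';

-- the separate command floated above
ALTER BUILTIN FUNCTION explode TEMPORARILY AS '...';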

Cheers, Fabian


On Thu, Sep 19, 2019 at 11:03 AM JingsongLee
 wrote:

> I know Hive and Spark can shadow built-in functions with a temporary function.
> MySQL, Oracle, and SQL Server cannot shadow them.
> Users can use fully qualified names to access functions instead of shadowing.
>
> So I think it is a completely new thing, and the direct way to deal with
> new things is to add new grammar. So,
> +1 for #2, +0 for #3, -1 for #1
>
> Best,
> Jingsong Lee
>
>
> --
> From:Kurt Young 
> Send Time: Thursday, Sep 19, 2019, 16:43
> To:dev 
> Subject:Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog
>
> And let me make my vote complete:
>
> -1 for #1
> +1 for #2 with different keyword
> -0 for #3
>
> Best,
> Kurt
>
>
> On Thu, Sep 19, 2019 at 4:40 PM Kurt Young  wrote:
>
> > Looks like I'm the only person who is willing to +1 to #2 for now :-)
> > But I would suggest changing the keyword from GLOBAL to
> > something like BUILTIN.
> >
> > I think #2 and #3 are almost the same proposal, just with a different
> > format to indicate whether it wants to override built-in functions.
> >
> > My biggest reason to choose it is that I want this behavior to be
> > consistent with temporary tables. I will give some examples to show the
> > behavior and also make sure I'm not misunderstanding anything here.
> >
> > For most DBs, when a user creates a temporary table with:
> >
> > CREATE TEMPORARY TABLE t1
> >
> > it's actually equivalent to:
> >
> > CREATE TEMPORARY TABLE `current_db`.t1
> >
> > If the user changes the current database, they will not be able to
> > access t1 without a fully qualified name, i.e. db1.t1 (assuming db1 is
> > the current database when this temporary table is created).
> >
> > Only #2 and #3 follow this behavior, and I would vote for it since it
> > makes the behavior consistent across temporary tables and functions.
> >
> > Why I'm not voting for #3 is that a special catalog and database just
> > looks very hacky to me. It gives the impression that our built-in
> > functions are saved in a special catalog and database, which is
> > actually not the case. Introducing a dedicated keyword like CREATE
> > TEMPORARY BUILTIN FUNCTION looks clearer and more straightforward. One
> > can argue that we should avoid introducing a new keyword, but it's
> > also very rare that a system can overwrite built-in functions. Since
> > we decided to support this, introducing a new keyword is not a big
> > deal IMO.
> >
> > Best,
> > Kurt
> >
> >
> > On Thu, Sep 19, 2019 at 3:07 PM Piotr Nowojski 
> > wrote:
> >
> >> Hi,
> >>
> >> It is a quite long discussion to follow and I hope I didn’t
> misunderstand
> >> anything. From the proposals presented by Xuefu I would vote:
> >>
> >> -1 for #1 and #2
> >> +1 for #3
> >>
> >> Besides #3 being IMO more general and more consistent, having qualified
> >> names (#3) would help/make easier for someone to use cross
> >> databases/catalogs queries (joining multiple data sets/streams). For
> >> example with some functions to manipulate/clean up/convert the stored
> data
> >> in different catalogs registered in the respective catalogs.
> >>
> >> Piotrek
> >>
> >> > On 19 Sep 2019, at 06:35, Jark Wu  wrote:
> >> >
> >> > I agree with Xuefu that inconsistent handling with all the other
> >> objects is
> >> > not a big problem.
> >> >
> >> > Regarding to option#3, the special "system.system" namespace may
> confuse
> >> > users.
> >> > Users need to know the set of built-in function names to know when to
> >> use
> >> > "system.system" 

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

2019-09-19 Thread Piotr Nowojski
After reading Kurt’s reasoning I’m bumping my vote for #2 from -1 to +0, or 
even +0.5, so my final vote is:

-1 for #1
+0.5 for #2
+1 for #3

Re the confusion about “system_db”: quite a lot of DBs store meta tables in a
system (and often hidden) db/schema, so I don’t think doing the same with
built-in functions would be that big of a deal. In the end, for both #2 and #3
the user will have to check the documentation for the syntax of overriding
built-in functions the first time they want to do it.

Piotrek

> On 19 Sep 2019, at 11:03, JingsongLee  wrote:
> 
> I know Hive and Spark can shadow built-in functions with a temporary function.
> MySQL, Oracle, and SQL Server cannot shadow them.
> Users can use fully qualified names to access functions instead of shadowing.
> 
> So I think it is a completely new thing, and the direct way to deal with new 
> things is to add new grammar. So,
> +1 for #2, +0 for #3, -1 for #1
> 
> Best,
> Jingsong Lee
> 
> 
> --
> From:Kurt Young 
> Send Time: Thursday, Sep 19, 2019, 16:43
> To:dev 
> Subject:Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog
> 
> And let me make my vote complete:
> 
> -1 for #1
> +1 for #2 with different keyword
> -0 for #3
> 
> Best,
> Kurt
> 
> 
> On Thu, Sep 19, 2019 at 4:40 PM Kurt Young  wrote:
> 
>> Looks like I'm the only person who is willing to +1 to #2 for now :-)
>> But I would suggest changing the keyword from GLOBAL to
>> something like BUILTIN.
>>
>> I think #2 and #3 are almost the same proposal, just with a different
>> format to indicate whether it wants to override built-in functions.
>>
>> My biggest reason to choose it is that I want this behavior to be
>> consistent with temporary tables. I will give some examples to show the
>> behavior and also make sure I'm not misunderstanding anything here.
>>
>> For most DBs, when a user creates a temporary table with:
>>
>> CREATE TEMPORARY TABLE t1
>>
>> it's actually equivalent to:
>>
>> CREATE TEMPORARY TABLE `current_db`.t1
>>
>> If the user changes the current database, they will not be able to
>> access t1 without a fully qualified name, i.e. db1.t1 (assuming db1 is
>> the current database when this temporary table is created).
>>
>> Only #2 and #3 follow this behavior, and I would vote for it since it
>> makes the behavior consistent across temporary tables and functions.
>>
>> Why I'm not voting for #3 is that a special catalog and database just
>> looks very hacky to me. It gives the impression that our built-in
>> functions are saved in a special catalog and database, which is
>> actually not the case. Introducing a dedicated keyword like CREATE
>> TEMPORARY BUILTIN FUNCTION looks clearer and more straightforward. One
>> can argue that we should avoid introducing a new keyword, but it's
>> also very rare that a system can overwrite built-in functions. Since
>> we decided to support this, introducing a new keyword is not a big
>> deal IMO.
>> 
>> Best,
>> Kurt
>> 
>> 
>> On Thu, Sep 19, 2019 at 3:07 PM Piotr Nowojski 
>> wrote:
>> 
>>> Hi,
>>> 
>>> It is a quite long discussion to follow and I hope I didn’t misunderstand
>>> anything. From the proposals presented by Xuefu I would vote:
>>> 
>>> -1 for #1 and #2
>>> +1 for #3
>>> 
>>> Besides #3 being IMO more general and more consistent, having qualified
>>> names (#3) would help/make easier for someone to use cross
>>> databases/catalogs queries (joining multiple data sets/streams). For
>>> example with some functions to manipulate/clean up/convert the stored data
>>> in different catalogs registered in the respective catalogs.
>>> 
>>> Piotrek
>>> 
 On 19 Sep 2019, at 06:35, Jark Wu  wrote:
 
 I agree with Xuefu that inconsistent handling with all the other
>>> objects is
 not a big problem.
 
 Regarding to option#3, the special "system.system" namespace may confuse
 users.
 Users need to know the set of built-in function names to know when to
>>> use
 "system.system" namespace.
 What will happen if user registers a non-builtin function name under the
 "system.system" namespace?
 Besides, I think it doesn't solve the "explode" problem I mentioned at
>>> the
 beginning of this thread.
 
 So here is my vote:
 
 +1 for #1
 0 for #2
 -1 for #3
 
 Best,
 Jark
 
 
 On Thu, 19 Sep 2019 at 08:38, Xuefu Z  wrote:
 
> @Dawid, Re: we also don't need additional referencing the
>>> specialcatalog
> anywhere.
> 
> True. But once we allow such reference, then user can do so in any
>>> possible
> place where a function name is expected, for which we have to handle.
> That's a big difference, I think.
> 
> Thanks,
> Xuefu
> 
> On Wed, Sep 18, 2019 at 5:25 PM Dawid Wysakowicz <
> wysakowicz.da...@gmail.com>
> wrote:
> 
>> @Bowen I am not suggesting introducing additional catalog. I think we
> need
>> to get rid of the current built-in catalog.

Re: [VOTE] FLIP-56: Dynamic Slot Allocation

2019-09-19 Thread Till Rohrmann
Hi Xintong,

thanks for starting the vote. The general plan looks good. Hence +1 from my
side. I still have some minor comments one could think about:

* As we no longer have predetermined slots on the TaskExecutor, I think we
can get rid of the SlotID. Instead, an allocated slot will be identified by
the AllocationID and the TaskManager's ResourceID in order to differentiate
duplicate registrations (a sketch of this follows after the list below).
* For the implementation plan, I believe there is only one tiny part on the
SlotManager for which we need a separate code path/feature flag which is
how we find a matching slot. Everything else should be possible to
implement in a way that it works with dynamic and static slot allocation:
1. Let TMs register with default slot profile at RM
2. Change SlotManager to use reported slot profiles instead of
pre-calculated profiles
3. Replace SlotID with SlotProfile in TaskExecutorGateway#requestSlot
4. Extend TM to support dynamic slot allocation (aka proper bookkeeping)
(can happen concurrently to any of steps 2-3)
5. Add bookkeeping to SlotManager (for pending TMs and registered TMs) but
still only use default slot profiles for matching with slot requests
6. Allow to match slot requests with reported resources instead of default
slot profiles (here we could use a feature flag to switch between dynamic
and static slot allocation)
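
As a sketch of that identification scheme (class and field names here are
illustrative stand-ins, not the actual Flink types):

import java.util.Objects;

/**
 * Sketch: without predetermined slots there is no fixed slot index, so an
 * allocated slot is identified by its allocation plus the TaskManager it
 * lives on (which also tells duplicate registrations apart).
 */
final class AllocatedSlotKey {
    private final String allocationId;   // stand-in for AllocationID
    private final String taskManagerId;  // stand-in for the TM's ResourceID

    AllocatedSlotKey(String allocationId, String taskManagerId) {
        this.allocationId = Objects.requireNonNull(allocationId);
        this.taskManagerId = Objects.requireNonNull(taskManagerId);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof AllocatedSlotKey)) {
            return false;
        }
        AllocatedSlotKey that = (AllocatedSlotKey) o;
        return allocationId.equals(that.allocationId)
                && taskManagerId.equals(that.taskManagerId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(allocationId, taskManagerId);
    }
}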

Wdyt?

Cheers,
Till

On Thu, Sep 19, 2019 at 9:45 AM Andrey Zagrebin 
wrote:

> Hi Xintong,
>
> Thanks for starting the vote, +1 from my side.
>
> Best,
> Andrey
>
> On Tue, Sep 17, 2019 at 4:26 PM Xintong Song 
> wrote:
>
> > Hi all,
> >
> > I would like to start the vote for FLIP-56 [1], on which a consensus is
> > reached in this discussion thread [2].
> >
> > The vote will be open for at least 72 hours. I'll try to close it after
> > Sep. 20 15:00 UTC, unless there is an objection or not enough votes.
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-56%3A+Dynamic+Slot+Allocation
> >
> > [2]
> >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-56-Dynamic-Slot-Allocation-td31960.html
> >
>


Re: [DISCUSS] Contribute Pulsar Flink connector back to Flink

2019-09-19 Thread Becket Qin
Hi Stephan,

Thanks for the clarification. I completely agree with you and Thomas on the
process of adding connectors to Flink repo. However, I am wondering what is
the deprecation process? Given the main concern here was that we may have
to maintain two Pulsar connector code bases until the old one is removed
from the repo, it would be good to know how long we have to do that.

Thanks,

Jiangjie (Becket) Qin

On Thu, Sep 19, 2019 at 3:54 PM Stephan Ewen  wrote:

> Some quick thoughts on the connector contribution process. I basically
> reiterate here what Thomas mentioned in another thread about the Kinesis
> connector.
>
> For connectors, we should favor a low-overhead contribution process, and
> accept user code and changes more readily than in the core system.
> That is because connectors have both a big variety of scenarios they get
> used in (only through use and many small contributions do they become
> really useful over time) and at the same time, committers do not use
> the connector themselves and usually cannot foresee too well what is
> needed.
>
> Furthermore, a missing connector (or connector feature) is often a bigger
> show stopper for users than a missing API or system feature.
>
> Along these lines of thought, the conclusion would be to take the Pulsar
> connector now, focus the review on legal/dependencies/rough code style and
> conventions, label it as "beta" (in the sense of "new code" that is "not
> yet tested through longer use") and go ahead. And then evolve it quickly
> without putting formal blockers in the way, meaning also adding a new FLIP
> 27 version when it is there.
>
> Best,
> Stephan
>
>
>
> On Thu, Sep 19, 2019 at 3:47 AM Becket Qin  wrote:
>
> > Hi Yijie,
> >
> > Could you please follow the FLIP process to start a new FLIP [DISCUSSION]
> > thread in the mailing list?
> >
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals#FlinkImprovementProposals-Process
> >
> > I see two FLIP-69 discussion in the mailing list now. So there is a FLIP
> > number collision. Can you change the FLIP number to 72?
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Thu, Sep 19, 2019 at 12:23 AM Rong Rong  wrote:
> >
> > > Hi Yijie,
> > >
> > > Thanks for sharing the pulsar FLIP.
> > > Would you mind enabling comments/suggestions on the google doc link?
> This
> > > way the contributors from the community can comment on the doc.
> > >
> > > Best,
> > > Rong
> > >
> > > On Mon, Sep 16, 2019 at 5:43 AM Yijie Shen 
> > > wrote:
> > >
> > > > Hello everyone,
> > > >
> > > > I've drafted a FLIP that describes the current design of the Pulsar
> > > > connector:
> > > >
> > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#
> > > >
> > > > Please take a look and let me know what you think.
> > > >
> > > > Thanks,
> > > > Yijie
> > > >
> > > > On Sat, Sep 14, 2019 at 12:08 AM Rong Rong 
> > wrote:
> > > > >
> > > > > Hi All,
> > > > >
> > > > > Sorry for joining the discussion late and thanks Yijie & Sijie for
> > > > driving
> > > > > the discussion.
> > > > > I also think the Pulsar connector would be a very valuable addition
> > to
> > > > > Flink. I can also help out a bit on the review side :-)
> > > > >
> > > > > Regarding the timeline, I also share concerns with Becket on the
> > > > > relationship between the new Pulsar connector and FLIP-27.
> > > > > There's also another discussion just started by Stephan on dropping
> > > Kafka
> > > > > 9/10 support on next Flink release [1].  Although the situation is
> > > > somewhat
> > > > > different, and Kafka 9/10 connector has been in Flink for almost
> 3-4
> > > > years,
> > > > > based on the discussion I am not sure if a major version release
> is a
> > > > > requirement for removing old connector supports.
> > > > >
> > > > > I think there shouldn't be a blocker if we agree the old connector
> > will
> > > > be
> > > > > removed once FLIP-27 based Pulsar connector is there. As Stephan
> > > stated,
> > > > it
> > > > > is easier to contribute the source sooner and adjust it later.
> > > > > We should also ensure we clearly communicate the message: for
> > example,
> > > > > putting an experimental flag on the pre-FLIP27 connector page of
> the
> > > > > website, documentations, etc. Any other thoughts?
> > > > >
> > > > > --
> > > > > Rong
> > > > >
> > > > > [1]
> > > > >
> > > >
> > >
> >
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/DISCUSS-Drop-older-versions-of-Kafka-Connectors-0-9-0-10-for-Flink-1-10-td29916.html
> > > > >
> > > > >
> > > > > On Fri, Sep 13, 2019 at 8:15 AM Becket Qin 
> > > wrote:
> > > > >
> > > > > > Technically speaking, removing the old connector code is a
> > backwards
> > > > > > incompatible change which requires a major version bump, i.e.
> Flink
> > > > 2.x.
> > > > > > Given that we don't have a clear plan on when to have the next
> > major
> > > > > > version release, it seems 

[jira] [Created] (FLINK-14125) Display memory and CPU usage in the overview page

2019-09-19 Thread Yadong Xie (Jira)
Yadong Xie created FLINK-14125:
--

 Summary: Display memory and CPU usage in the overview page
 Key: FLINK-14125
 URL: https://issues.apache.org/jira/browse/FLINK-14125
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Affects Versions: 1.10.0
Reporter: Yadong Xie
 Fix For: 1.10.0
 Attachments: 屏幕快照 2019-09-19 下午5.03.14.png

In the overview page of the Web UI, besides the task slots and jobs, we could
add cluster-level memory and CPU usage metrics; these metrics are already
available in the Blink branch.

!屏幕快照 2019-09-19 下午5.03.14.png|width=564,height=283!

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

2019-09-19 Thread JingsongLee
I know Hive and Spark can shadow built-in functions with a temporary function.
MySQL, Oracle, and SQL Server cannot shadow them.
Users can use fully qualified names to access functions instead of shadowing.
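
For example (a sketch -- the catalog/db names are made up), instead of
shadowing, one can always reference the catalog function explicitly:

SELECT mycatalog.mydb.explode(arr) FROM t;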

So I think it is a completely new thing, and the direct way to deal with new 
things is to add new grammar. So,
+1 for #2, +0 for #3, -1 for #1

Best,
Jingsong Lee


--
From:Kurt Young 
Send Time: Thursday, Sep 19, 2019, 16:43
To:dev 
Subject:Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

And let me make my vote complete:

-1 for #1
+1 for #2 with different keyword
-0 for #3

Best,
Kurt


On Thu, Sep 19, 2019 at 4:40 PM Kurt Young  wrote:

> Looks like I'm the only person who is willing to +1 to #2 for now :-)
> But I would suggest changing the keyword from GLOBAL to
> something like BUILTIN.
>
> I think #2 and #3 are almost the same proposal, just with a different
> format to indicate whether it wants to override built-in functions.
>
> My biggest reason to choose it is that I want this behavior to be
> consistent with temporary tables. I will give some examples to show the
> behavior and also make sure I'm not misunderstanding anything here.
>
> For most DBs, when a user creates a temporary table with:
>
> CREATE TEMPORARY TABLE t1
>
> it's actually equivalent to:
>
> CREATE TEMPORARY TABLE `current_db`.t1
>
> If the user changes the current database, they will not be able to
> access t1 without a fully qualified name, i.e. db1.t1 (assuming db1 is
> the current database when this temporary table is created).
>
> Only #2 and #3 follow this behavior, and I would vote for it since it
> makes the behavior consistent across temporary tables and functions.
>
> Why I'm not voting for #3 is that a special catalog and database just
> looks very hacky to me. It gives the impression that our built-in
> functions are saved in a special catalog and database, which is
> actually not the case. Introducing a dedicated keyword like CREATE
> TEMPORARY BUILTIN FUNCTION looks clearer and more straightforward. One
> can argue that we should avoid introducing a new keyword, but it's
> also very rare that a system can overwrite built-in functions. Since
> we decided to support this, introducing a new keyword is not a big
> deal IMO.
>
> Best,
> Kurt
>
>
> On Thu, Sep 19, 2019 at 3:07 PM Piotr Nowojski 
> wrote:
>
>> Hi,
>>
>> It is a quite long discussion to follow and I hope I didn’t misunderstand
>> anything. From the proposals presented by Xuefu I would vote:
>>
>> -1 for #1 and #2
>> +1 for #3
>>
>> Besides #3 being IMO more general and more consistent, having qualified
>> names (#3) would help/make easier for someone to use cross
>> databases/catalogs queries (joining multiple data sets/streams). For
>> example with some functions to manipulate/clean up/convert the stored data
>> in different catalogs registered in the respective catalogs.
>>
>> Piotrek
>>
>> > On 19 Sep 2019, at 06:35, Jark Wu  wrote:
>> >
>> > I agree with Xuefu that inconsistent handling with all the other
>> objects is
>> > not a big problem.
>> >
>> > Regarding to option#3, the special "system.system" namespace may confuse
>> > users.
>> > Users need to know the set of built-in function names to know when to
>> use
>> > "system.system" namespace.
>> > What will happen if user registers a non-builtin function name under the
>> > "system.system" namespace?
>> > Besides, I think it doesn't solve the "explode" problem I mentioned at
>> the
>> > beginning of this thread.
>> >
>> > So here is my vote:
>> >
>> > +1 for #1
>> > 0 for #2
>> > -1 for #3
>> >
>> > Best,
>> > Jark
>> >
>> >
>> > On Thu, 19 Sep 2019 at 08:38, Xuefu Z  wrote:
>> >
>> >> @Dawid, Re: we also don't need additional referencing the
>> specialcatalog
>> >> anywhere.
>> >>
>> >> True. But once we allow such reference, then user can do so in any
>> possible
>> >> place where a function name is expected, for which we have to handle.
>> >> That's a big difference, I think.
>> >>
>> >> Thanks,
>> >> Xuefu
>> >>
>> >> On Wed, Sep 18, 2019 at 5:25 PM Dawid Wysakowicz <
>> >> wysakowicz.da...@gmail.com>
>> >> wrote:
>> >>
>> >>> @Bowen I am not suggesting introducing additional catalog. I think we
>> >> need
>> >>> to get rid of the current built-in catalog.
>> >>>
>> >>> @Xuefu in option #3 we also don't need additional referencing the
>> special
>> >>> catalog anywhere else besides in the CREATE statement. The resolution
>> >>> behaviour is exactly the same in both options.
>> >>>
>> >>> On Thu, 19 Sep 2019, 08:17 Xuefu Z,  wrote:
>> >>>
>>  Hi Dawid,
>> 
>>  "GLOBAL" is a temporary keyword that was given to the approach. It
>> can
>> >> be
>>  changed to something else for better.
>> 
>>  The difference between this and the #3 approach is that we only need
>> >> the
>>  keyword for this create DDL. For other places (such as function
>>  referencing), no keyword or special namespace is needed.
>> 
>>  Thanks,
>>  Xuefu
>> 
>>  On Wed, Sep 18, 2019 at 

Confluence permission for FLIP creation

2019-09-19 Thread Zili Chen
Hi devs,

I'd like to create a page about the ongoing JobClient FLIP. Could you
grant me Confluence permission for FLIP creation?

My Confluence ID is tison.

Best,
tison.


Re: [DISCUSS] Contribute Pulsar Flink connector back to Flink

2019-09-19 Thread Yijie Shen
Hi Fabian,

It's yijieshen

Thanks for your help!

On Thu, Sep 19, 2019 at 4:52 PM Fabian Hueske  wrote:
>
> Hi Yijie,
>
> I can give you permission if you tell me your Confluence user name.
>
> Thanks, Fabian
>
> On Thu, Sep 19, 2019 at 10:49 AM Yijie Shen <
> henry.yijies...@gmail.com> wrote:
>
> > > > > Could you please follow the FLIP process to start a new FLIP
> > [DISCUSSION]
> > > > > thread in the mailing list?
> >
> > It seems that I don't have permission to create pages in FLIP.
> >
> >
> > > > > > Thanks for sharing the pulsar FLIP.
> > > > > > Would you mind enabling comments/suggestions on the google doc
> > link?
> >
> > Sorry, my bad, I've enabled comments for the doc.
> > Does it work for now?
> >
> > On Thu, Sep 19, 2019 at 4:07 PM Fabian Hueske  wrote:
> > >
> > > +1 to what Stephan (and Thomas) said.
> > >
> > > > On Thu, Sep 19, 2019 at 9:54 AM Stephan Ewen <
> > se...@apache.org> wrote:
> > >
> > > > Some quick thoughts on the connector contribution process. I basically
> > > > reiterate here what Thomas mentioned in another thread about the
> > Kinesis
> > > > connector.
> > > >
> > > > For connectors, we should favor a low-overhead contribution process,
> > and
> > > > accept user code and changes more readily than in the core system.
> > > > That is because connectors have both a big variety of scenarios they
> > get
> > > > used in (only through use and many small contributions do they become
> > > > really useful over time) and at the same time, committers do not
> > use
> > > > the connector themselves and usually cannot foresee too well what is
> > > > needed.
> > > >
> > > > Furthermore, a missing connector (or connector feature) is often a
> > bigger
> > > > show stopper for users than a missing API or system feature.
> > > >
> > > > Along these lines of thought, the conclusion would be to take the Pulsar
> > > > connector now, focus the review on legal/dependencies/rough code style
> > and
> > > > conventions, label it as "beta" (in the sense of "new code" that is
> > "not
> > > > yet tested through longer use") and go ahead. And then evolve it
> > quickly
> > > > without putting formal blockers in the way, meaning also adding a new
> > FLIP
> > > > 27 version when it is there.
> > > >
> > > > Best,
> > > > Stephan
> > > >
> > > >
> > > >
> > > > On Thu, Sep 19, 2019 at 3:47 AM Becket Qin 
> > wrote:
> > > >
> > > > > Hi Yijie,
> > > > >
> > > > > Could you please follow the FLIP process to start a new FLIP
> > [DISCUSSION]
> > > > > thread in the mailing list?
> > > > >
> > > > >
> > > > >
> > > >
> > https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals#FlinkImprovementProposals-Process
> > > > >
> > > > > I see two FLIP-69 discussion in the mailing list now. So there is a
> > FLIP
> > > > > number collision. Can you change the FLIP number to 72?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > > On Thu, Sep 19, 2019 at 12:23 AM Rong Rong 
> > wrote:
> > > > >
> > > > > > Hi Yijie,
> > > > > >
> > > > > > Thanks for sharing the pulsar FLIP.
> > > > > > Would you mind enabling comments/suggestions on the google doc
> > link?
> > > > This
> > > > > > way the contributors from the community can comment on the doc.
> > > > > >
> > > > > > Best,
> > > > > > Rong
> > > > > >
> > > > > > On Mon, Sep 16, 2019 at 5:43 AM Yijie Shen <
> > henry.yijies...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hello everyone,
> > > > > > >
> > > > > > > I've drafted a FLIP that describes the current design of the
> > Pulsar
> > > > > > > connector:
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#
> > > > > > >
> > > > > > > Please take a look and let me know what you think.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Yijie
> > > > > > >
> > > > > > > On Sat, Sep 14, 2019 at 12:08 AM Rong Rong 
> > > > > wrote:
> > > > > > > >
> > > > > > > > Hi All,
> > > > > > > >
> > > > > > > > Sorry for joining the discussion late and thanks Yijie & Sijie
> > for
> > > > > > > driving
> > > > > > > > the discussion.
> > > > > > > > I also think the Pulsar connector would be a very valuable
> > addition
> > > > > to
> > > > > > > > Flink. I can also help out a bit on the review side :-)
> > > > > > > >
> > > > > > > > Regarding the timeline, I also share concerns with Becket on
> > the
> > > > > > > > relationship between the new Pulsar connector and FLIP-27.
> > > > > > > > There's also another discussion just started by Stephan on
> > dropping
> > > > > > Kafka
> > > > > > > > 9/10 support on next Flink release [1].  Although the
> > situation is
> > > > > > > somewhat
> > > > > > > > different, and Kafka 9/10 connector has been in Flink for
> > almost
> > > > 3-4
> > > > > > > years,
> > > > > > > > based on the discussion I am not sure if a major version
> > release
> > > > is a
> > > > > > > > 

Re: [DISCUSS] Contribute Pulsar Flink connector back to Flink

2019-09-19 Thread Fabian Hueske
Hi Yijie,

I can give you permission if you tell me your Confluence user name.

Thanks, Fabian

On Thu, Sep 19, 2019 at 10:49 AM Yijie Shen <
henry.yijies...@gmail.com> wrote:

> > > > Could you please follow the FLIP process to start a new FLIP
> [DISCUSSION]
> > > > thread in the mailing list?
>
> It seems that I don't have permission to create pages in FLIP.
>
>
> > > > > Thanks for sharing the pulsar FLIP.
> > > > > Would you mind enabling comments/suggestions on the google doc
> link?
>
> Sorry, my bad, I've enabled comments for the doc.
> Does it work for now?
>
> On Thu, Sep 19, 2019 at 4:07 PM Fabian Hueske  wrote:
> >
> > +1 to what Stephan (and Thomas) said.
> >
> > On Thu, Sep 19, 2019 at 9:54 AM Stephan Ewen <
> se...@apache.org> wrote:
> >
> > > Some quick thoughts on the connector contribution process. I basically
> > > reiterate here what Thomas mentioned in another thread about the
> Kinesis
> > > connector.
> > >
> > > For connectors, we should favor a low-overhead contribution process,
> and
> > > accept user code and changes more readily than in the core system.
> > > That is because connectors have both a big variety of scenarios they
> get
> > > used in (only through use and many small contributions do they become
> > > really useful over time) and at the same time, committers do not
> use
> > > the connector themselves and usually cannot foresee too well what is
> > > needed.
> > >
> > > Furthermore, a missing connector (or connector feature) is often a
> bigger
> > > show stopper for users than a missing API or system feature.
> > >
> > > Along these lines of thought, the conclusion would be to take the Pulsar
> > > connector now, focus the review on legal/dependencies/rough code style
> and
> > > conventions, label it as "beta" (in the sense of "new code" that is
> "not
> > > yet tested through longer use") and go ahead. And then evolve it
> quickly
> > > without putting formal blockers in the way, meaning also adding a new
> FLIP
> > > 27 version when it is there.
> > >
> > > Best,
> > > Stephan
> > >
> > >
> > >
> > > On Thu, Sep 19, 2019 at 3:47 AM Becket Qin 
> wrote:
> > >
> > > > Hi Yijie,
> > > >
> > > > Could you please follow the FLIP process to start a new FLIP
> [DISCUSSION]
> > > > thread in the mailing list?
> > > >
> > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals#FlinkImprovementProposals-Process
> > > >
> > > > I see two FLIP-69 discussion in the mailing list now. So there is a
> FLIP
> > > > number collision. Can you change the FLIP number to 72?
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > > On Thu, Sep 19, 2019 at 12:23 AM Rong Rong 
> wrote:
> > > >
> > > > > Hi Yijie,
> > > > >
> > > > > Thanks for sharing the pulsar FLIP.
> > > > > Would you mind enabling comments/suggestions on the google doc
> link?
> > > This
> > > > > way the contributors from the community can comment on the doc.
> > > > >
> > > > > Best,
> > > > > Rong
> > > > >
> > > > > On Mon, Sep 16, 2019 at 5:43 AM Yijie Shen <
> henry.yijies...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hello everyone,
> > > > > >
> > > > > > I've drafted a FLIP that describes the current design of the
> Pulsar
> > > > > > connector:
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#
> > > > > >
> > > > > > Please take a look and let me know what you think.
> > > > > >
> > > > > > Thanks,
> > > > > > Yijie
> > > > > >
> > > > > > On Sat, Sep 14, 2019 at 12:08 AM Rong Rong 
> > > > wrote:
> > > > > > >
> > > > > > > Hi All,
> > > > > > >
> > > > > > > Sorry for joining the discussion late and thanks Yijie & Sijie
> for
> > > > > > driving
> > > > > > > the discussion.
> > > > > > > I also think the Pulsar connector would be a very valuable
> addition
> > > > to
> > > > > > > Flink. I can also help out a bit on the review side :-)
> > > > > > >
> > > > > > > Regarding the timeline, I also share concerns with Becket on
> the
> > > > > > > relationship between the new Pulsar connector and FLIP-27.
> > > > > > > There's also another discussion just started by Stephan on
> dropping
> > > > > Kafka
> > > > > > > 9/10 support on next Flink release [1].  Although the
> situation is
> > > > > > somewhat
> > > > > > > different, and Kafka 9/10 connector has been in Flink for
> almost
> > > 3-4
> > > > > > years,
> > > > > > > based on the discussion I am not sure if a major version
> release
> > > is a
> > > > > > > requirement for removing old connector supports.
> > > > > > >
> > > > > > > I think there shouldn't be a blocker if we agree the old
> connector
> > > > will
> > > > > > be
> > > > > > > removed once FLIP-27 based Pulsar connector is there. As
> Stephan
> > > > > stated,
> > > > > > it
> > > > > > > is easier to contribute the source sooner and adjust it later.
> > > > > > > We should also ensure we clearly 

Re: [DISCUSS] Contribute Pulsar Flink connector back to Flink

2019-09-19 Thread Yijie Shen
> > > Could you please follow the FLIP process to start a new FLIP [DISCUSSION]
> > > thread in the mailing list?

It seems that I don't have permission to create pages in FLIP.


> > > > Thanks for sharing the pulsar FLIP.
> > > > Would you mind enabling comments/suggestions on the google doc link?

Sorry, my bad, I've enabled comments for the doc.
Does it work for now?

On Thu, Sep 19, 2019 at 4:07 PM Fabian Hueske  wrote:
>
> +1 to what Stephan (and Thomas) said.
>
> On Thu, Sep 19, 2019 at 9:54 AM Stephan Ewen  wrote:
>
> > Some quick thoughts on the connector contribution process. I basically
> > reiterate here what Thomas mentioned in another thread about the Kinesis
> > connector.
> >
> > For connectors, we should favor a low-overhead contribution process, and
> > accept user code and changes more readily than in the core system.
> > That is because connectors have both a big variety of scenarios they get
> > used in (only through use and many small contributions do they become
> > really useful over time) and at the same time, committers do not use
> > the connector themselves and usually cannot foresee too well what is
> > needed.
> >
> > Furthermore, a missing connector (or connector feature) is often a bigger
> > show stopper for users than a missing API or system feature.
> >
> > Along these lines of thought, the conclusion would be to take the Pulsar
> > connector now, focus the review on legal/dependencies/rough code style and
> > conventions, label it as "beta" (in the sense of "new code" that is "not
> > yet tested through longer use") and go ahead. And then evolve it quickly
> > without putting formal blockers in the way, meaning also adding a new FLIP
> > 27 version when it is there.
> >
> > Best,
> > Stephan
> >
> >
> >
> > On Thu, Sep 19, 2019 at 3:47 AM Becket Qin  wrote:
> >
> > > Hi Yijie,
> > >
> > > Could you please follow the FLIP process to start a new FLIP [DISCUSSION]
> > > thread in the mailing list?
> > >
> > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals#FlinkImprovementProposals-Process
> > >
> > > I see two FLIP-69 discussion in the mailing list now. So there is a FLIP
> > > number collision. Can you change the FLIP number to 72?
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Thu, Sep 19, 2019 at 12:23 AM Rong Rong  wrote:
> > >
> > > > Hi Yijie,
> > > >
> > > > Thanks for sharing the pulsar FLIP.
> > > > Would you mind enabling comments/suggestions on the google doc link?
> > This
> > > > way the contributors from the community can comment on the doc.
> > > >
> > > > Best,
> > > > Rong
> > > >
> > > > On Mon, Sep 16, 2019 at 5:43 AM Yijie Shen 
> > > > wrote:
> > > >
> > > > > Hello everyone,
> > > > >
> > > > > I've drafted a FLIP that describes the current design of the Pulsar
> > > > > connector:
> > > > >
> > > > >
> > > > >
> > > >
> > >
> > https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#
> > > > >
> > > > > Please take a look and let me know what you think.
> > > > >
> > > > > Thanks,
> > > > > Yijie
> > > > >
> > > > > On Sat, Sep 14, 2019 at 12:08 AM Rong Rong 
> > > wrote:
> > > > > >
> > > > > > Hi All,
> > > > > >
> > > > > > Sorry for joining the discussion late and thanks Yijie & Sijie for
> > > > > driving
> > > > > > the discussion.
> > > > > > I also think the Pulsar connector would be a very valuable addition
> > > to
> > > > > > Flink. I can also help out a bit on the review side :-)
> > > > > >
> > > > > > Regarding the timeline, I also share concerns with Becket on the
> > > > > > relationship between the new Pulsar connector and FLIP-27.
> > > > > > There's also another discussion just started by Stephan on dropping
> > > > Kafka
> > > > > > 9/10 support on next Flink release [1].  Although the situation is
> > > > > somewhat
> > > > > > different, and Kafka 9/10 connector has been in Flink for almost
> > 3-4
> > > > > years,
> > > > > > based on the discussion I am not sure if a major version release
> > is a
> > > > > > requirement for removing old connector supports.
> > > > > >
> > > > > > I think there shouldn't be a blocker if we agree the old connector
> > > will
> > > > > be
> > > > > > removed once FLIP-27 based Pulsar connector is there. As Stephan
> > > > stated,
> > > > > it
> > > > > > is easier to contribute the source sooner and adjust it later.
> > > > > > We should also ensure we clearly communicate the message: for
> > > example,
> > > > > > putting an experimental flag on the pre-FLIP27 connector page of
> > the
> > > > > > website, documentations, etc. Any other thoughts?
> > > > > >
> > > > > > --
> > > > > > Rong
> > > > > >
> > > > > > [1]
> > > > > >
> > > > >
> > > >
> > >
> > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/DISCUSS-Drop-older-versions-of-Kafka-Connectors-0-9-0-10-for-Flink-1-10-td29916.html
> > > > > >
> > > > > >
> > > > > > On Fri, Sep 13, 2019 at 8:15 AM Becket Qin 
> > > > 

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

2019-09-19 Thread Kurt Young
And let me make my vote complete:

-1 for #1
+1 for #2 with different keyword
-0 for #3

Best,
Kurt


On Thu, Sep 19, 2019 at 4:40 PM Kurt Young  wrote:

> Looks like I'm the only person who is willing to +1 to #2 for now :-)
> But I would suggest changing the keyword from GLOBAL to
> something like BUILTIN.
>
> I think #2 and #3 are almost the same proposal, just with a different
> format to indicate whether it wants to override built-in functions.
>
> My biggest reason to choose it is that I want this behavior to be
> consistent with temporary tables. I will give some examples to show the
> behavior and also make sure I'm not misunderstanding anything here.
>
> For most DBs, when a user creates a temporary table with:
>
> CREATE TEMPORARY TABLE t1
>
> it's actually equivalent to:
>
> CREATE TEMPORARY TABLE `current_db`.t1
>
> If the user changes the current database, they will not be able to
> access t1 without a fully qualified name, i.e. db1.t1 (assuming db1 is
> the current database when this temporary table is created).
>
> Only #2 and #3 follow this behavior, and I would vote for it since it
> makes the behavior consistent across temporary tables and functions.
>
> Why I'm not voting for #3 is that a special catalog and database just
> looks very hacky to me. It gives the impression that our built-in
> functions are saved in a special catalog and database, which is
> actually not the case. Introducing a dedicated keyword like CREATE
> TEMPORARY BUILTIN FUNCTION looks clearer and more straightforward. One
> can argue that we should avoid introducing a new keyword, but it's
> also very rare that a system can overwrite built-in functions. Since
> we decided to support this, introducing a new keyword is not a big
> deal IMO.
>
> Best,
> Kurt
>
>
> On Thu, Sep 19, 2019 at 3:07 PM Piotr Nowojski 
> wrote:
>
>> Hi,
>>
>> It is a quite long discussion to follow and I hope I didn’t misunderstand
>> anything. From the proposals presented by Xuefu I would vote:
>>
>> -1 for #1 and #2
>> +1 for #3
>>
>> Besides #3 being IMO more general and more consistent, having qualified
>> names (#3) would help/make easier for someone to use cross
>> databases/catalogs queries (joining multiple data sets/streams). For
>> example with some functions to manipulate/clean up/convert the stored data
>> in different catalogs registered in the respective catalogs.
>>
>> Piotrek
>>
>> > On 19 Sep 2019, at 06:35, Jark Wu  wrote:
>> >
>> > I agree with Xuefu that inconsistent handling with all the other
>> objects is
>> > not a big problem.
>> >
>> > Regarding to option#3, the special "system.system" namespace may confuse
>> > users.
>> > Users need to know the set of built-in function names to know when to
>> use
>> > "system.system" namespace.
>> > What will happen if user registers a non-builtin function name under the
>> > "system.system" namespace?
>> > Besides, I think it doesn't solve the "explode" problem I mentioned at
>> the
>> > beginning of this thread.
>> >
>> > So here is my vote:
>> >
>> > +1 for #1
>> > 0 for #2
>> > -1 for #3
>> >
>> > Best,
>> > Jark
>> >
>> >
>> > On Thu, 19 Sep 2019 at 08:38, Xuefu Z  wrote:
>> >
>> >> @Dawid, Re: we also don't need additional referencing the
>> specialcatalog
>> >> anywhere.
>> >>
>> >> True. But once we allow such reference, then user can do so in any
>> possible
>> >> place where a function name is expected, for which we have to handle.
>> >> That's a big difference, I think.
>> >>
>> >> Thanks,
>> >> Xuefu
>> >>
>> >> On Wed, Sep 18, 2019 at 5:25 PM Dawid Wysakowicz <
>> >> wysakowicz.da...@gmail.com>
>> >> wrote:
>> >>
>> >>> @Bowen I am not suggesting introducing additional catalog. I think we
>> >> need
>> >>> to get rid of the current built-in catalog.
>> >>>
>> >>> @Xuefu in option #3 we also don't need additional referencing the
>> special
>> >>> catalog anywhere else besides in the CREATE statement. The resolution
>> >>> behaviour is exactly the same in both options.
>> >>>
>> >>> On Thu, 19 Sep 2019, 08:17 Xuefu Z,  wrote:
>> >>>
>>  Hi Dawid,
>> 
>>  "GLOBAL" is a temporary keyword that was given to the approach. It
>> can
>> >> be
>>  changed to something else for better.
>> 
>>  The difference between this and the #3 approach is that we only need
>> >> the
>>  keyword for this create DDL. For other places (such as function
>>  referencing), no keyword or special namespace is needed.
>> 
>>  Thanks,
>>  Xuefu
>> 
>>  On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz <
>>  wysakowicz.da...@gmail.com>
>>  wrote:
>> 
>> > Hi,
>> > I think it makes sense to start voting at this point.
>> >
>> > Option 1: Only 1-part identifiers
>> > PROS:
>> > - allows shadowing built-in functions
>> > CONS:
>> > - inconsistent with all the other objects, both permanent & temporary
>> > - does not allow shadowing catalog functions
>> >
>> > Option 2: Special keyword for built-in function
>> > I think this is quite similar to 

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

2019-09-19 Thread Kurt Young
Looks like I'm the only person who is willing to +1 to #2 for now :-)
But I would suggest to change the keyword from GLOBAL to
something like BUILTIN.

I think #2 and #3 are almost the same proposal, just with different
format to indicate whether it want to override built-in functions.

My biggest reason to choose it is I want this behavior be consistent
with temporal tables. I will give some examples to show the behavior
and also make sure I'm not misunderstanding anything here.

For most DBs, when user create a temporary table with:

CREATE TEMPORARY TABLE t1

It's actually equivalent with:

CREATE TEMPORARY TABLE `curent_db`.t1

If user change current database, they will not be able to access t1 without
fully qualified name, .i.e db1.t1 (assuming db1 is current database when
this temporary table is created).
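
To make the consequence concrete (a sketch; the database names are made up):

USE db2;                 -- current database changes after t1 was created
SELECT * FROM t1;        -- fails: t1 now resolves to db2.t1
SELECT * FROM db1.t1;    -- works: fully qualified name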

Only #2 and #3 follow this behavior, and I would vote for it since it
makes the behavior consistent across temporary tables and functions.

Why I'm not voting for #3 is that a special catalog and database just looks
very hacky to me. It gives the impression that our built-in functions are
saved in a special catalog and database, which is actually not the case.
Introducing a dedicated keyword like CREATE TEMPORARY BUILTIN FUNCTION looks
clearer and more straightforward. One can argue that we should avoid
introducing a new keyword, but it's also very rare that a system can
overwrite built-in functions. Since we decided to support this, introducing
a new keyword is not a big deal IMO.

Best,
Kurt


On Thu, Sep 19, 2019 at 3:07 PM Piotr Nowojski  wrote:

> Hi,
>
> It is a quite long discussion to follow and I hope I didn’t misunderstand
> anything. From the proposals presented by Xuefu I would vote:
>
> -1 for #1 and #2
> +1 for #3
>
> Besides #3 being IMO more general and more consistent, having qualified
> names (#3) would help/make easier for someone to use cross
> databases/catalogs queries (joining multiple data sets/streams). For
> example with some functions to manipulate/clean up/convert the stored data
> in different catalogs registered in the respective catalogs.
>
> Piotrek
>
> > On 19 Sep 2019, at 06:35, Jark Wu  wrote:
> >
> > I agree with Xuefu that inconsistent handling with all the other objects
> is
> > not a big problem.
> >
> > Regarding to option#3, the special "system.system" namespace may confuse
> > users.
> > Users need to know the set of built-in function names to know when to use
> > "system.system" namespace.
> > What will happen if user registers a non-builtin function name under the
> > "system.system" namespace?
> > Besides, I think it doesn't solve the "explode" problem I mentioned at
> the
> > beginning of this thread.
> >
> > So here is my vote:
> >
> > +1 for #1
> > 0 for #2
> > -1 for #3
> >
> > Best,
> > Jark
> >
> >
> > On Thu, 19 Sep 2019 at 08:38, Xuefu Z  wrote:
> >
> >> @Dawid, Re: we also don't need additional referencing the specialcatalog
> >> anywhere.
> >>
> >> True. But once we allow such reference, then user can do so in any
> possible
> >> place where a function name is expected, for which we have to handle.
> >> That's a big difference, I think.
> >>
> >> Thanks,
> >> Xuefu
> >>
> >> On Wed, Sep 18, 2019 at 5:25 PM Dawid Wysakowicz <
> >> wysakowicz.da...@gmail.com>
> >> wrote:
> >>
> >>> @Bowen I am not suggesting introducing additional catalog. I think we
> >> need
> >>> to get rid of the current built-in catalog.
> >>>
> >>> @Xuefu in option #3 we also don't need additional referencing the
> special
> >>> catalog anywhere else besides in the CREATE statement. The resolution
> >>> behaviour is exactly the same in both options.
> >>>
> >>> On Thu, 19 Sep 2019, 08:17 Xuefu Z,  wrote:
> >>>
>  Hi Dawid,
> 
>  "GLOBAL" is a temporary keyword that was given to the approach. It can
> >> be
>  changed to something else for better.
> 
>  The difference between this and the #3 approach is that we only need
> >> the
>  keyword for this create DDL. For other places (such as function
>  referencing), no keyword or special namespace is needed.
> 
>  Thanks,
>  Xuefu
> 
>  On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz <
>  wysakowicz.da...@gmail.com>
>  wrote:
> 
> > Hi,
> > I think it makes sense to start voting at this point.
> >
> > Option 1: Only 1-part identifiers
> > PROS:
> > - allows shadowing built-in functions
> > CONS:
> > - inconsistent with all the other objects, both permanent & temporary
> > - does not allow shadowing catalog functions
> >
> > Option 2: Special keyword for built-in function
> > I think this is quite similar to the special catalog/db. The thing I
> >> am
> > strongly against in this proposal is the GLOBAL keyword. This keyword
>  has a
> > meaning in rdbms systems and means a function that is present for a
> > lifetime of a session in which it was created, but available in all
> >>> other
> > sessions. Therefore I really don't 

[jira] [Created] (FLINK-14124) potential memory leak in netty server

2019-09-19 Thread YufeiLiu (Jira)
YufeiLiu created FLINK-14124:


 Summary: potential memory leak in netty server
 Key: FLINK-14124
 URL: https://issues.apache.org/jira/browse/FLINK-14124
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Network
Affects Versions: 1.6.3
Reporter: YufeiLiu
 Attachments: image-2019-09-19-15-53-32-294.png

I have a job running on Flink 1.4.2; the end of the pipeline uses the Phoenix
JDBC driver to write records into Apache Phoenix:

mqStream
    .keyBy(0)
    .window(TumblingProcessingTimeWindows.of(Time.of(300, TimeUnit.SECONDS)))
    .process(new MyProcessWindowFunction())
    .addSink(new PhoenixSinkFunction());

But the off-heap memory of the sink subtask's TaskManager keeps increasing,
most likely caused by DirectByteBuffer.
Analyzing a heap dump, I find hundreds of DirectByteBuffer objects, each
referencing over 3 MB of memory, and they all trace back to the Flink Netty
Server Thread.
 !image-2019-09-19-15-53-32-294.png! 

It only happens in the sink task; other nodes work fine. At first I thought it
was a problem with Phoenix, but the heap dump shows the memory is consumed by
Netty. I don't know much about Flink's network stack; I would appreciate it if
someone could tell me the likely cause or how to dig further.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Contribute Pulsar Flink connector back to Flink

2019-09-19 Thread Fabian Hueske
+1 to what Stephan (and Thomas) said.

On Thu, Sep 19, 2019 at 9:54 AM Stephan Ewen  wrote:

> Some quick thoughts on the connector contribution process. I basically
> reiterate here what Thomas mentioned in another thread about the Kinesis
> connector.
>
> For connectors, we should favor a low-overhead contribution process, and
> accept user code and changes more readily than in the core system.
> That is because connectors have both a big variety of scenarios they get
> used in (only through use and many small contributions do they become
> really useful over time) and at the same time, committers do not use
> the connector themselves and usually cannot foresee too well what is
> needed.
>
> Furthermore, a missing connector (or connector feature) is often a bigger
> show stopper for users than a missing API or system feature.
>
> Along these lines of thought, the conclusion would be to take the Pulsar
> connector now, focus the review on legal/dependencies/rough code style and
> conventions, label it as "beta" (in the sense of "new code" that is "not
> yet tested through longer use") and go ahead. And then evolve it quickly
> without putting formal blockers in the way, meaning also adding a new FLIP
> 27 version when it is there.
>
> Best,
> Stephan
>
>
>
> On Thu, Sep 19, 2019 at 3:47 AM Becket Qin  wrote:
>
> > Hi Yijie,
> >
> > Could you please follow the FLIP process to start a new FLIP [DISCUSSION]
> > thread in the mailing list?
> >
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals#FlinkImprovementProposals-Process
> >
> > I see two FLIP-69 discussion in the mailing list now. So there is a FLIP
> > number collision. Can you change the FLIP number to 72?
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Thu, Sep 19, 2019 at 12:23 AM Rong Rong  wrote:
> >
> > > Hi Yijie,
> > >
> > > Thanks for sharing the pulsar FLIP.
> > > Would you mind enabling comments/suggestions on the google doc link?
> This
> > > way the contributors from the community can comment on the doc.
> > >
> > > Best,
> > > Rong
> > >
> > > On Mon, Sep 16, 2019 at 5:43 AM Yijie Shen 
> > > wrote:
> > >
> > > > Hello everyone,
> > > >
> > > > I've drafted a FLIP that describes the current design of the Pulsar
> > > > connector:
> > > >
> > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#
> > > >
> > > > Please take a look and let me know what you think.
> > > >
> > > > Thanks,
> > > > Yijie
> > > >
> > > > On Sat, Sep 14, 2019 at 12:08 AM Rong Rong 
> > wrote:
> > > > >
> > > > > Hi All,
> > > > >
> > > > > Sorry for joining the discussion late and thanks Yijie & Sijie for
> > > > driving
> > > > > the discussion.
> > > > > I also think the Pulsar connector would be a very valuable addition
> > to
> > > > > Flink. I can also help out a bit on the review side :-)
> > > > >
> > > > > Regarding the timeline, I also share concerns with Becket on the
> > > > > relationship between the new Pulsar connector and FLIP-27.
> > > > > There's also another discussion just started by Stephan on dropping
> > > Kafka
> > > > > 9/10 support on next Flink release [1].  Although the situation is
> > > > somewhat
> > > > > different, and Kafka 9/10 connector has been in Flink for almost
> 3-4
> > > > years,
> > > > > based on the discussion I am not sure if a major version release
> is a
> > > > > requirement for removing old connector supports.
> > > > >
> > > > > I think there shouldn't be a blocker if we agree the old connector
> > will
> > > > be
> > > > > removed once FLIP-27 based Pulsar connector is there. As Stephan
> > > stated,
> > > > it
> > > > > is easier to contribute the source sooner and adjust it later.
> > > > > We should also ensure we clearly communicate the message: for
> > example,
> > > > > putting an experimental flag on the pre-FLIP27 connector page of
> the
> > > > > website, documentations, etc. Any other thoughts?
> > > > >
> > > > > --
> > > > > Rong
> > > > >
> > > > > [1]
> > > > >
> > > >
> > >
> >
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/DISCUSS-Drop-older-versions-of-Kafka-Connectors-0-9-0-10-for-Flink-1-10-td29916.html
> > > > >
> > > > >
> > > > > On Fri, Sep 13, 2019 at 8:15 AM Becket Qin 
> > > wrote:
> > > > >
> > > > > > Technically speaking, removing the old connector code is a
> > backwards
> > > > > > incompatible change which requires a major version bump, i.e.
> Flink
> > > > 2.x.
> > > > > > Given that we don't have a clear plan on when to have the next
> > major
> > > > > > version release, it seems unclear how long the old connector code
> > > will
> > > > be
> > > > > > there if we check it in right now. Or will we remove the old
> > > connector
> > > > > > without a major version bump? In any case, it sounds not quite
> user
> > > > > friendly to those who might use the old Pulsar connector. I
> am
> > > not
> > > > sure
> > > > > > if it is worth these 

Re: [DISCUSS] Contribute Pulsar Flink connector back to Flink

2019-09-19 Thread Stephan Ewen
Some quick thoughts on the connector contribution process. I basically
reiterate here what Thomas mentioned in another thread about the Kinesis
connector.

For connectors, we should favor a low-overhead contribution process, and
accept user code and changes more readily than in the core system.
That is because connectors are used in a big variety of scenarios (only
through use and many small contributions do they become really useful over
time), and at the same time committers do not use the connectors themselves
and usually cannot foresee too well what is needed.

Furthermore, a missing connector (or connector feature) is often a bigger
show stopper for users than a missing API or system feature.

Along these lines of thought, the conclusion would be to take the Pulsar
connector now, focus the review on legal/dependencies/rough code style and
conventions, label it as "beta" (in the sense of "new code" that is "not
yet tested through longer use"), and go ahead. And then evolve it quickly
without putting formal blockers in the way, meaning also adding a new
FLIP-27 version when it is there.

Best,
Stephan



On Thu, Sep 19, 2019 at 3:47 AM Becket Qin  wrote:

> Hi Yijie,
>
> Could you please follow the FLIP process to start a new FLIP [DISCUSSION]
> thread in the mailing list?
>
>
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals#FlinkImprovementProposals-Process
>
> I see two FLIP-69 discussion in the mailing list now. So there is a FLIP
> number collision. Can you change the FLIP number to 72?
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Thu, Sep 19, 2019 at 12:23 AM Rong Rong  wrote:
>
> > Hi Yijie,
> >
> > Thanks for sharing the pulsar FLIP.
> > Would you mind enabling comments/suggestions on the google doc link? This
> > way the contributors from the community can comment on the doc.
> >
> > Best,
> > Rong
> >
> > On Mon, Sep 16, 2019 at 5:43 AM Yijie Shen 
> > wrote:
> >
> > > Hello everyone,
> > >
> > > I've drafted a FLIP that describes the current design of the Pulsar
> > > connector:
> > >
> > >
> > >
> >
> https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#
> > >
> > > Please take a look and let me know what you think.
> > >
> > > Thanks,
> > > Yijie
> > >
> > > On Sat, Sep 14, 2019 at 12:08 AM Rong Rong 
> wrote:
> > > >
> > > > Hi All,
> > > >
> > > > Sorry for joining the discussion late and thanks Yijie & Sijie for
> > > driving
> > > > the discussion.
> > > > I also think the Pulsar connector would be a very valuable addition
> to
> > > > Flink. I can also help out a bit on the review side :-)
> > > >
> > > > Regarding the timeline, I also share concerns with Becket on the
> > > > relationship between the new Pulsar connector and FLIP-27.
> > > > There's also another discussion just started by Stephan on dropping
> > Kafka
> > > > 9/10 support on next Flink release [1].  Although the situation is
> > > somewhat
> > > > different, and Kafka 9/10 connector has been in Flink for almost 3-4
> > > years,
> > > > based on the discussion I am not sure if a major version release is a
> > > > requirement for removing old connector supports.
> > > >
> > > > I think there shouldn't be a blocker if we agree the old connector
> will
> > > be
> > > > removed once FLIP-27 based Pulsar connector is there. As Stephan
> > stated,
> > > it
> > > > is easier to contribute the source sooner and adjust it later.
> > > > We should also ensure we clearly communicate the message: for
> example,
> > > > putting an experimental flag on the pre-FLIP27 connector page of the
> > > > website, documentations, etc. Any other thoughts?
> > > >
> > > > --
> > > > Rong
> > > >
> > > > [1]
> > > >
> > >
> >
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/DISCUSS-Drop-older-versions-of-Kafka-Connectors-0-9-0-10-for-Flink-1-10-td29916.html
> > > >
> > > >
> > > > On Fri, Sep 13, 2019 at 8:15 AM Becket Qin 
> > wrote:
> > > >
> > > > > Technically speaking, removing the old connector code is a
> backwards
> > > > > incompatible change which requires a major version bump, i.e. Flink
> > > 2.x.
> > > > > Given that we don't have a clear plan on when to have the next
> major
> > > > > version release, it seems unclear how long the old connector code
> > will
> > > be
> > > > > there if we check it in right now. Or will we remove the old
> > connector
> > > > > without a major version bump? In any case, it sounds not quite user
> > > > > friendly to the those who might use the old Pulsar connector. I am
> > not
> > > sure
> > > > > if it is worth these potential problems in order to have the Pulsar
> > > source
> > > > > connector checked in one or two months earlier.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > > On Thu, Sep 12, 2019 at 3:52 PM Stephan Ewen 
> > wrote:
> > > > >
> > > > > > Agreed, if we check in the old code, we should make it clear that
> > it
> > > will
> > > > > > be 

Re: [VOTE] FLIP-56: Dynamic Slot Allocation

2019-09-19 Thread Andrey Zagrebin
Hi Xintong,

Thanks for starting the vote, +1 from my side.

Best,
Andrey

On Tue, Sep 17, 2019 at 4:26 PM Xintong Song  wrote:

> Hi all,
>
> I would like to start the vote for FLIP-56 [1], on which a consensus is
> reached in this discussion thread [2].
>
> The vote will be open for at least 72 hours. I'll try to close it after
> Sep. 20 15:00 UTC, unless there is an objection or not enough votes.
>
> Thank you~
>
> Xintong Song
>
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-56%3A+Dynamic+Slot+Allocation
>
> [2]
>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-56-Dynamic-Slot-Allocation-td31960.html
>


Re: java8 lambdas and exceptions lead to compile error

2019-09-19 Thread Till Rohrmann
Hi,

if there is an easy way to make it also work with Java 1.8.0_77, I guess we
could change it. That way we would make the life of our users easier.

The solution proposed in JDK-8054569 seems quite simple. The only downside
I see is that it could easily fall victim to a future refactoring/cleanup
if we don't add some context/comment explaining why the explicit type has
been introduced. Alternatively, we could state on the website which Java
version you need to build Flink.

Cheers,
Till

On Thu, Sep 19, 2019 at 8:53 AM zz  wrote:

> Hey all,
> Recently, I used Flink for secondary development. When compiling Flink
> master (up-to-date) with Java 1.8.0_77, I got the following error:
>
> compile (default-compile) on project flink-table-api-java: Compilation
> failure
>
> /home/*/zzsmdfj/sflink/flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/operations/utils/factories/Cal
> culatedTableFactory.java:[90,53] unreported exception X; must be caught or
> declared to be thrown
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute
> (MojoExecutor.java:213)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute
> (MojoExecutor.java:154)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute
> (MojoExecutor.java:146)
> at
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject
> (LifecycleModuleBuilder.java:117)
> at
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject
> (LifecycleModuleBuilder.java:81)
> at
>
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build
> (SingleThreadedBuilder.java:51)
> at org.apache.maven.lifecycle.internal.LifecycleStarter.execute
> (LifecycleStarter.java:128)
> at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:309)
> at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:194)
> at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:107)
> at org.apache.maven.cli.MavenCli.execute (MavenCli.java:955)
> at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:290)
> at org.apache.maven.cli.MavenCli.main (MavenCli.java:194)
> at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke
> (NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke
> (DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke (Method.java:498)
> at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced
> (Launcher.java:289)
> at org.codehaus.plexus.classworlds.launcher.Launcher.launch
> (Launcher.java:229)
> at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode
> (Launcher.java:415)
> at org.codehaus.plexus.classworlds.launcher.Launcher.main
> (Launcher.java:356)
> Caused by: org.apache.maven.plugin.compiler.CompilationFailureException:
> Compilation failure
>
> If using Java 1.8.0_102 to compile, the build succeeds. It may be a case of
> bug JDK-8054569.
>
> Is that a problem, and what should I do about it? Any comments would be
> appreciated.
>
> issue:https://issues.apache.org/jira/browse/FLINK-14093
>


Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table module

2019-09-19 Thread Kurt Young
IIUC, it's good to see that both serializable (table descriptions from DDL)
and unserializable (tables with a DataStream underneath) tables are treated
uniformly as CatalogTable.

Can I also assume that functions coming either from a function class (from
DDL) or from function objects (instantiated by the user) will also be
treated uniformly as CatalogFunction?

This would greatly simplify and unify the current API-level concepts and design.

And it seems only one thing is left: how do we deal with
ConnectTableDescriptor? It's actually very similar to a serializable
CatalogTable; both carry text properties, which are even the same. Is there
any chance we can further unify this with CatalogTable?

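For illustration, a rough sketch of the overlap (class and method names
follow the current APIs as far as I can tell; whether and how they get
unified is exactly the open question):

// Assuming a TableEnvironment tableEnv is in scope.
// Descriptor path: the definition boils down to a map of text properties.
Map<String, String> props = tableEnv
    .connect(new Kafka().version("universal").topic("orders"))
    .withFormat(new Json())
    .withSchema(new Schema().field("id", Types.LONG))
    .toProperties();

// Catalog path: a serializable CatalogTable carries essentially the same
// text properties, plus the schema.
CatalogTable table = new CatalogTableImpl(
    TableSchema.builder().field("id", Types.LONG).build(),
    props,
    "a table described purely by properties");
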
Best,
Kurt


On Thu, Sep 19, 2019 at 3:13 PM Jark Wu  wrote:

> Thanks Dawid for the design doc.
>
> In general, I’m +1 to the FLIP.
>
>
> +1 to the single-string and parse way to express object path.
>
> +1 to deprecate registerTableSink & registerTableSource.
> But I would suggest providing an easy way to register a custom
> source/sink before we drop them (this is another story).
> Currently, it’s not easy to implement a custom connector descriptor.
>
> Best,
> Jark
>
>
> > 在 2019年9月19日,11:37,Dawid Wysakowicz  写道:
> >
> > Hi JingsongLee,
> > From my understanding they can. Underneath they will be CatalogTables.
> The
> > difference is the lifetime of the tables. Plus some of the user facing
> > interfaces cannot be persisted, e.g. DataStream. Therefore we must have
> > separate methods for that. In the end the temporary tables are held in
> > memory as CatalogTables.
> > Best,
> > Dawid
> >
> > On Thu, 19 Sep 2019, 10:08 JingsongLee,  .invalid>
> > wrote:
> >
> >> Hi dawid:
> >> Can temporary tables achieve the same capabilities as catalog table?
> >> like statistics: CatalogTableStatistics, CatalogColumnStatistics,
> >> PartitionStatistics
> >> like partition support: we have added some catalog equivalent interfaces
> >> on TableSource/TableSink: getPartitions, getPartitionFieldNames
> >> Maybe it's not a good idea to add these interfaces to
> >> TableSource/TableSink. What do you think?
> >>
> >> Best,
> >> Jingsong Lee
> >>
> >>
> >> --
> >> From:Kurt Young 
> >> Send Time:2019年9月18日(星期三) 17:54
> >> To:dev 
> >> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table
> >> module
> >>
> >> Hi all,
> >>
> >> Sorry to join this party late. Big +1 to this flip, especially for the
> >> dropping
> >> "registerTableSink & registerTableSource" part. These are indeed legacy
> >> and we should try to unify them through CatalogTable after we introduce
> >> the concept of Catalog.
> >>
> >> From my understanding, what we register should all be metadata;
> >> TableSource/TableSink should only be the one who is responsible to do
> >> the real work, i.e. reading and writing data according to the schema and
> >> other information like computed columns, partitions, etc.
> >>
> >> Best,
> >> Kurt
> >>
> >>
> >> On Wed, Sep 18, 2019 at 5:14 PM JingsongLee  >> .invalid>
> >> wrote:
> >>
> >>> After some development and thinking, I have a general understanding.
> >>> +1 to registering a source/sink does not fit into the SQL world.
> >>> I am OK to have a deprecated registerTemporarySource/Sink to stay
> >>> compatible with the old ways.
> >>>
> >>> Best,
> >>> Jingsong Lee
> >>>
> >>>
> >>> --
> >>> From:Timo Walther 
> >>> Send Time:2019年9月17日(星期二) 08:00
> >>> To:dev 
> >>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table
> >>> module
> >>>
> >>> Hi Dawid,
> >>>
> >>> thanks for the design document. It fixes big concept gaps due to
> >>> historical reasons with proper support for serializability and catalog
> >>> support in mind.
> >>>
> >>> I would not mind a registerTemporarySource/Sink, but the problem that I
> >>> see is that many people think that this is the recommended way of
> >>> registering a table source/sink which is not true. We should guide
> users
> >>> to either use connect() or DDL API which can be validated and stored in
> >>> catalog.
> >>>
> >>> Also from a concept perspective, registering a source/sink does not fit
> >>> into the SQL world. SQL does not know about source/sinks but only about
> >>> tables. If the responsibility of a TableSource/TableSink is just a pure
> >>> physical data consumer/producer that is not connected to the actual
> >>> logical table schema, we would need a possibility of defining time
> >>> attributes and interpreting/converting a changelog. This should be done
> >>> by the framework with information from the DDL/connect() and not be
> >>> defined in every table source.
> >>>
> >>> Regards,
> >>> Timo
> >>>
> >>>
> >>> On 09.09.19 14:16, JingsongLee wrote:
>  Hi dawid:
> 
>  It is difficult to describe specific examples.
>  Sometimes users will generate some java converters through some
>   Java code, or generate some Java 

Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table module

2019-09-19 Thread Jark Wu
Thanks Dawid for the design doc. 

In general, I’m +1 to the FLIP.


+1 to the single-string and parse way to express object path. 

+1 to deprecate registerTableSink & registerTableSource. 
But I would suggest providing an easy way to register a custom source/sink
before we drop them (this is another story). 
Currently, it’s not easy to implement a custom connector descriptor.
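
To make the two paths concrete, a minimal sketch (MyCustomTableSource and
MyCustomConnectorDescriptor are made-up placeholders for user code, and a
TableEnvironment tableEnv is assumed):

// Legacy path, slated for deprecation: hand a TableSource instance
// directly to the environment.
tableEnv.registerTableSource("my_table", new MyCustomTableSource());

// Descriptor path: requires first implementing a ConnectorDescriptor
// (plus a matching TableFactory), which is the part that is not easy today.
tableEnv.connect(new MyCustomConnectorDescriptor())
    .withSchema(new Schema().field("id", Types.LONG))
    .registerTableSource("my_table");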

Best,
Jark


> 在 2019年9月19日,11:37,Dawid Wysakowicz  写道:
> 
> Hi JingsongLee,
> From my understanding they can. Underneath they will be CatalogTables. The
> difference is the lifetime of the tables. Plus some of the user facing
> interfaces cannot be persisted, e.g. DataStream. Therefore we must have
> separate methods for that. In the end the temporary tables are held in
> memory as CatalogTables.
> Best,
> Dawid
> 
> On Thu, 19 Sep 2019, 10:08 JingsongLee, 
> wrote:
> 
>> Hi dawid:
>> Can temporary tables achieve the same capabilities as catalog table?
>> like statistics: CatalogTableStatistics, CatalogColumnStatistics,
>> PartitionStatistics
>> like partition support: we have added some catalog equivalent interfaces
>> on TableSource/TableSink: getPartitions, getPartitionFieldNames
>> Maybe it's not a good idea to add these interfaces to
>> TableSource/TableSink. What do you think?
>> 
>> Best,
>> Jingsong Lee
>> 
>> 
>> --
>> From:Kurt Young 
>> Send Time:2019年9月18日(星期三) 17:54
>> To:dev 
>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table
>> module
>> 
>> Hi all,
>> 
>> Sorry to join this party late. Big +1 to this flip, especially for the
>> dropping
>> "registerTableSink & registerTableSource" part. These are indeed legacy
>> and we should try to unify them through CatalogTable after we introduce
>> the concept of Catalog.
>> 
>> From my understanding, what we register should all be metadata;
>> TableSource/TableSink should only be the one who is responsible to do
>> the real work, i.e. reading and writing data according to the schema and
>> other information like computed columns, partitions, etc.
>> 
>> Best,
>> Kurt
>> 
>> 
>> On Wed, Sep 18, 2019 at 5:14 PM JingsongLee > .invalid>
>> wrote:
>> 
>>> After some development and thinking, I have a general understanding.
>>> +1 to registering a source/sink does not fit into the SQL world.
>>> I am OK to have a deprecated registerTemporarySource/Sink to stay
>>> compatible with the old ways.
>>> 
>>> Best,
>>> Jingsong Lee
>>> 
>>> 
>>> --
>>> From:Timo Walther 
>>> Send Time:2019年9月17日(星期二) 08:00
>>> To:dev 
>>> Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table
>>> module
>>> 
>>> Hi Dawid,
>>> 
>>> thanks for the design document. It fixes big concept gaps due to
>>> historical reasons with proper support for serializability and catalog
>>> support in mind.
>>> 
>>> I would not mind a registerTemporarySource/Sink, but the problem that I
>>> see is that many people think that this is the recommended way of
>>> registering a table source/sink which is not true. We should guide users
>>> to either use connect() or DDL API which can be validated and stored in
>>> catalog.
>>> 
>>> Also from a concept perspective, registering a source/sink does not fit
>>> into the SQL world. SQL does not know about source/sinks but only about
>>> tables. If the responsibility of a TableSource/TableSink is just a pure
>>> physical data consumer/producer that is not connected to the actual
>>> logical table schema, we would need a possibility of defining time
>>> attributes and interpreting/converting a changelog. This should be done
>>> by the framework with information from the DDL/connect() and not be
>>> defined in every table source.
>>> 
>>> Regards,
>>> Timo
>>> 
>>> 
>>> On 09.09.19 14:16, JingsongLee wrote:
 Hi dawid:
 
 It is difficult to describe specific examples.
 Sometimes users will generate some java converters through some
  Java code, or generate some Java classes through third-party
  libraries. Of course, these can be best done through properties.
 But this requires additional work from users. My suggestion is to
 keep this way of passing Java instances/classes, as it is user-friendly.
 
 Best,
 Jingsong Lee
 
 
 --
 From:Dawid Wysakowicz 
 Send Time:2019年9月6日(星期五) 16:21
 To:dev 
 Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table
>>> module
 
 Hi all,
 @Jingsong Could you elaborate a bit more on what you mean by
 "some Connectors are difficult to convert all states to properties"
 All the Flink provided connectors will definitely be expressible with
>>> properties (In the end you should be able to use them from DDL). I think
>> if
>>> a TableSource is complex enough that it handles filter push down,
>> partition
>>> support etc. should rather be made 

Re: [DISCUSS] FLIP-57 - Rework FunctionCatalog

2019-09-19 Thread Piotr Nowojski
Hi,

It is quite a long discussion to follow, and I hope I didn’t misunderstand
anything. From the proposals presented by Xuefu I would vote:

-1 for #1 and #2
+1 for #3

Besides #3 being IMO more general and more consistent, having qualified
names (#3) would make it easier to write cross-database/cross-catalog
queries (joining multiple data sets/streams), for example with functions to
manipulate/clean up/convert the data stored in the different catalogs,
registered in the respective catalogs.
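
For example (a hypothetical sketch: catalog, database, and function names
are made up, and DDL for fully qualified temporary functions is what option
#3 would enable; a TableEnvironment tEnv is assumed):

// Register a clean-up function per catalog, under fully qualified names.
tEnv.sqlUpdate("CREATE TEMPORARY FUNCTION cat1.db1.clean_ts AS 'com.example.CleanTs'");
tEnv.sqlUpdate("CREATE TEMPORARY FUNCTION cat2.db2.clean_ts AS 'com.example.CleanTsV2'");

// One query joining data from two catalogs, each side cleaned by the
// function registered in its own catalog.
Table joined = tEnv.sqlQuery(
    "SELECT cat1.db1.clean_ts(a.ts), cat2.db2.clean_ts(b.ts) " +
    "FROM cat1.db1.events AS a JOIN cat2.db2.events AS b ON a.id = b.id");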

Piotrek 

> On 19 Sep 2019, at 06:35, Jark Wu  wrote:
> 
> I agree with Xuefu that inconsistent handling with all the other objects is
> not a big problem.
> 
> Regarding to option#3, the special "system.system" namespace may confuse
> users.
> Users need to know the set of built-in function names to know when to use
> "system.system" namespace.
> What will happen if a user registers a non-builtin function name under the
> "system.system" namespace?
> Besides, I think it doesn't solve the "explode" problem I mentioned at the
> beginning of this thread.
> 
> So here is my vote:
> 
> +1 for #1
> 0 for #2
> -1 for #3
> 
> Best,
> Jark
> 
> 
> On Thu, 19 Sep 2019 at 08:38, Xuefu Z  wrote:
> 
>> @Dawid, Re: we also don't need additional referencing the specialcatalog
>> anywhere.
>> 
>> True. But once we allow such a reference, users can use it in any place
>> where a function name is expected, which we would then have to handle.
>> That's a big difference, I think.
>> 
>> Thanks,
>> Xuefu
>> 
>> On Wed, Sep 18, 2019 at 5:25 PM Dawid Wysakowicz <
>> wysakowicz.da...@gmail.com>
>> wrote:
>> 
>>> @Bowen I am not suggesting introducing additional catalog. I think we
>> need
>>> to get rid of the current built-in catalog.
>>> 
>>> @Xuefu in option #3 we also don't need additional referencing the special
>>> catalog anywhere else besides in the CREATE statement. The resolution
>>> behaviour is exactly the same in both options.
>>> 
>>> On Thu, 19 Sep 2019, 08:17 Xuefu Z,  wrote:
>>> 
 Hi Dawid,
 
 "GLOBAL" is a temporary keyword that was given to the approach. It can
>> be
 changed to something better.
 
 The difference between this and the #3 approach is that we only need
>> the
 keyword for this CREATE DDL. For other places (such as function
 referencing), no keyword or special namespace is needed.
 
 Thanks,
 Xuefu
 
 On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz <
 wysakowicz.da...@gmail.com>
 wrote:
 
> Hi,
> I think it makes sense to start voting at this point.
> 
> Option 1: Only 1-part identifiers
> PROS:
> - allows shadowing built-in functions
> CONS:
> - inconsistent with all the other objects, both permanent & temporary
> - does not allow shadowing catalog functions
> 
> Option 2: Special keyword for built-in function
> I think this is quite similar to the special catalog/db. The thing I
>> am
> strongly against in this proposal is the GLOBAL keyword. This keyword
 has a
> meaning in RDBMS systems and denotes a function that is present for the
> lifetime of the session in which it was created, but available in all
>>> other
> sessions. Therefore I really don't want to use this keyword in a
 different
> context.
> 
> Option 3: Special catalog/db
> 
> PROS:
> - allows shadowing built-in functions
> - allows shadowing catalog functions
> - consistent with other objects
> CONS:
> - we introduce a special namespace for built-in functions
> 
> I don't see a problem with introducing the special namespace. In the
>>> end
 it
> is very similar to the keyword approach. In this case the catalog/db
> combination would be the "keyword"
> 
> Therefore my votes:
> Option 1: -0
> Option 2: -1 (I might change to +0 if we can come up with a better
 keyword)
> Option 3: +1
> 
> Best,
> Dawid
> 
> 
> On Thu, 19 Sep 2019, 05:12 Xuefu Z,  wrote:
> 
>> Hi Aljoscha,
>> 
>> Thanks for the summary and these are great questions to be
>> answered.
 The
>> answer to your first question is clear: there is a general
>> agreement
>>> to
>> override built-in functions with temp functions.
>> 
>> However, your second and third questions are sort of related, as a
> function
>> reference can be either just a function name (like "func") or in the
>>> form
> or
>> "cat.db.func". When a reference is just function name, it can mean
> either a
>> built-in function or a function defined in the current cat/db. If
>> we
>> support overriding a built-in function with a temp function, such
>> overriding can also cover a function in the current cat/db.
>> 
>> I think what Timo referred as "overriding a catalog function"
>> means a
> temp
>> function defined as "cat.db.func" overrides a catalog function
>> "func"
 in

java8 lambdas and exceptions lead to compile error

2019-09-19 Thread zz
Hey all,
Recently, I used Flink for secondary development. When compiling Flink
master (up-to-date) with Java 1.8.0_77, I got the following error:

compile (default-compile) on project flink-table-api-java: Compilation
failure
/home/*/zzsmdfj/sflink/flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/operations/utils/factories/Cal
culatedTableFactory.java:[90,53] unreported exception X; must be caught or
declared to be thrown
at org.apache.maven.lifecycle.internal.MojoExecutor.execute
(MojoExecutor.java:213)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute
(MojoExecutor.java:154)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute
(MojoExecutor.java:146)
at
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject
(LifecycleModuleBuilder.java:117)
at
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject
(LifecycleModuleBuilder.java:81)
at
org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build
(SingleThreadedBuilder.java:51)
at org.apache.maven.lifecycle.internal.LifecycleStarter.execute
(LifecycleStarter.java:128)
at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:309)
at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:194)
at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:107)
at org.apache.maven.cli.MavenCli.execute (MavenCli.java:955)
at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:290)
at org.apache.maven.cli.MavenCli.main (MavenCli.java:194)
at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke
(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke
(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke (Method.java:498)
at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced
(Launcher.java:289)
at org.codehaus.plexus.classworlds.launcher.Launcher.launch
(Launcher.java:229)
at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode
(Launcher.java:415)
at org.codehaus.plexus.classworlds.launcher.Launcher.main
(Launcher.java:356)
Caused by: org.apache.maven.plugin.compiler.CompilationFailureException:
Compilation failure

If using Java 1.8.0_102 to compile, the build succeeds. It may be a case of
bug JDK-8054569.

Is that a problem, and what should I do about it? Any comments would be
appreciated.

issue:https://issues.apache.org/jira/browse/FLINK-14093


[jira] [Created] (FLINK-14123) Change taskmanager.memory.fraction default value to 0.6

2019-09-19 Thread liupengcheng (Jira)
liupengcheng created FLINK-14123:


 Summary: Change taskmanager.memory.fraction default value to 0.6
 Key: FLINK-14123
 URL: https://issues.apache.org/jira/browse/FLINK-14123
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Configuration
Affects Versions: 1.9.0
Reporter: liupengcheng


Currently, we are testing Flink batch jobs such as TeraSort; however, the
job ran for only a short while before failing due to an OOM.
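
For reference, the proposed default corresponds to the following
flink-conf.yaml entry (the value is the one this issue proposes):

{code}
# Relative amount of memory the TaskManager reserves for managed memory
# (sorting, hash tables, caching of intermediate results).
taskmanager.memory.fraction: 0.6
{code}

The failure we hit: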

 
{code:java}
org.apache.flink.client.program.ProgramInvocationException: Job failed. (JobID: 
a807e1d635bd4471ceea4282477f8850)
at 
org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:262)
at 
org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:338)
at 
org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:326)
at 
org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:62)
at 
org.apache.flink.api.scala.ExecutionEnvironment.execute(ExecutionEnvironment.scala:539)
at 
com.github.ehiggs.spark.terasort.FlinkTeraSort$.main(FlinkTeraSort.scala:89)
at 
com.github.ehiggs.spark.terasort.FlinkTeraSort.main(FlinkTeraSort.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:604)
at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:466)
at 
org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:274)
at 
org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:746)
at 
org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:273)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:205)
at 
org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1007)
at 
org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1080)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1886)
at 
org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1080)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution 
failed.
at 
org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)
at 
org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:259)
... 23 more
Caused by: java.lang.RuntimeException: Error obtaining the sorted input: Thread 
'SortMerger Reading Thread' terminated due to an exception: GC overhead limit 
exceeded
at 
org.apache.flink.runtime.operators.sort.UnilateralSortMerger.getIterator(UnilateralSortMerger.java:650)
at 
org.apache.flink.runtime.operators.BatchTask.getInput(BatchTask.java:1109)
at org.apache.flink.runtime.operators.NoOpDriver.run(NoOpDriver.java:82)
at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:504)
at 
org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:369)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:705)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:530)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Thread 'SortMerger Reading Thread' terminated 
due to an exception: GC overhead limit exceeded
at 
org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:831)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
at 
org.apache.flink.api.common.typeutils.base.array.BytePrimitiveArraySerializer.deserialize(BytePrimitiveArraySerializer.java:84)
at 
org.apache.flink.api.common.typeutils.base.array.BytePrimitiveArraySerializer.deserialize(BytePrimitiveArraySerializer.java:33)
at 
org.apache.flink.api.scala.typeutils.CaseClassSerializer.deserialize(CaseClassSerializer.scala:121)
at 
org.apache.flink.api.scala.typeutils.CaseClassSerializer.deserialize(CaseClassSerializer.scala:114)
at 
org.apache.flink.api.scala.typeutils.CaseClassSerializer.deserialize(CaseClassSerializer.scala:32)
at 
org.apache.flink.runtime.plugable.ReusingDeserializationDelegate.read(ReusingDeserializationDelegate.java:57)
at