Re: [DISCUSS] Support temporary tables in SQL API

2019-07-25 Thread Aljoscha Krettek
Thanks for pushing the discussion, Dawid! I’m also fine with option #3.

Aljoscha

> On 24. Jul 2019, at 12:04, Dawid Wysakowicz wrote:
> 
> Hi all,
> 
> Thank you Xuefu for clarifying your opinion. Now we have 3 votes for each of 
> the options. To conclude this discussion I am willing to change my vote to 
> option 3 as I had only a slight preference towards option 2.
> 
> Therefore the final results of the poll are as follows:
> 
> option #2: 2 votes(Kurt, Aljoscha)
> 
> option #3: 4 votes(Timo, Jingsong, Xuefu, me)
> 
> I will prepare appropriate PRs according to the decision (unless somebody 
> objects). We will revisit the long-term solution in a separate thread as part 
> of the 1.10 release after 1.9 is released.
> 
> Thank you all for your opinions!
> 
> Best,
> 
> Dawid
> 
> On 24/07/2019 09:35, Aljoscha Krettek wrote:
>> Isn’t https://issues.apache.org/jira/browse/FLINK-13279 already a sign that 
>> there are surprises for users if we go with option #3?
>> 
>> Aljoscha
>> 
>>> On 24. Jul 2019, at 00:33, Xuefu Z <usxu...@gmail.com> wrote:
>>> 
>>> I favored #3 if that wasn't obvious.
>>> 
>>> The usability issue with #2 makes Hive too hard to use. #3 keeps the old
>>> behavior for existing users who don't have Hive and thus have only one,
>>> in-memory catalog. If a user does register Hive, he/she understands that
>>> there are multiple catalogs and that a fully qualified table name is
>>> necessary. Thus, #3 has no impact (and no surprises) for existing users,
>>> and the new requirement of fully qualified names is only for users of the
>>> new feature (multiple catalogs), which seems very natural.
>>> 
>>> Thanks,
>>> Xuefu
>>> 
>>> On Tue, Jul 23, 2019 at 5:47 AM Dawid Wysakowicz <dwysakow...@apache.org>
>>> wrote:
>>> 
>>>> I think we all agree so far that we should implement one of the short term
>>>> solutions for 1.9 release (#2 or #3) and continue the discussion on option
>>>> #1 for the next release. Personally I prefer option #2, because it is
>>>> closest to the current behavior and as Kurt said it is the most intuitive
>>>> one, but I am also fine with option #3
>>>> 
>>>> To sum up the opinions so far:
>>>> 
>>>> *option #2: 3 votes(Kurt, Aljoscha, me)*
>>>> 
>>>> *option #3: 2 votes(Timo, Jingsong)*
>>>> 
>>>> I wasn't sure which option out of the two Xuefu prefers.
>>>> 
>>>> I would like to conclude the discussion by the end of tomorrow, so that we
>>>> can prepare a proper fix as soon as possible. Therefore I would suggest to
>>>> proceed with the option that gets the most votes until tomorrow (*July
>>>> 24th 12:00 CET*), unless there are some hard objections.
>>>> 
>>>> 
>>>> Comment on option #1 concerns:
>>>> 
>>>> I agree with Jingsong on that. I think there are some benefits of the
>>>> approach, as it makes Flink in control of the temporary tables.
>>>> 
>>>> 1. We have a unified behavior across all catalogs. Also for the catalogs
>>>> that do not support temporary tables natively.
>>>> 
>>>> 2. As Flink is in control of the temporary tables it makes it easier to
>>>> control their lifecycle.
>>>> 
>>>> Best,
>>>> 
>>>> Dawid
>>>> On 23/07/2019 11:40, JingsongLee wrote:
>>>> 
>>>> And I think we should recommend user to use catalog api to
>>>> createTable and createFunction,(I guess most scenarios do
>>>> not use temporary objects) in this way, it is good to option #3
>>>> 
>>>> Best, JingsongLee
>>>> 
>>>> 
>>>> --
>>>> From:JingsongLee <lzljs3620...@aliyun.com.INVALID>

Re: [DISCUSS] Support temporary tables in SQL API

2019-07-24 Thread Dawid Wysakowicz
Hi all,

Thank you Xuefu for clarifying your opinion. Now we have 3 votes for
each of the options. To conclude this discussion I am willing to change
my vote to option 3 as I had only a slight preference towards option 2.

Therefore the final results of the poll are as follows:

/option #2: 2 votes(Kurt, Aljoscha)/

/option #3: 4 votes(Timo, Jingsong, Xuefu, me)/

I will prepare appropriate PRs according to the decision (unless
somebody objects). We will revisit the long-term solution in a separate
thread as part of the 1.10 release after 1.9 is released.

Thank you all for your opinions!

Best,

Dawid

On 24/07/2019 09:35, Aljoscha Krettek wrote:
> Isn’t https://issues.apache.org/jira/browse/FLINK-13279 already a sign that there 
> are surprises for users if we go with option #3?
>
> Aljoscha
>
>> On 24. Jul 2019, at 00:33, Xuefu Z wrote:
>>
>> I favored #3 if that wasn't obvious.
>>
>> The usability issue with #2 makes Hive too hard to use. #3 keeps the old
>> behavior for existing users who don't have Hive and thus have only one,
>> in-memory catalog. If a user does register Hive, he/she understands that
>> there are multiple catalogs and that a fully qualified table name is
>> necessary. Thus, #3 has no impact (and no surprises) for existing users,
>> and the new requirement of fully qualified names is only for users of the
>> new feature (multiple catalogs), which seems very natural.
>>
>> Thanks,
>> Xuefu
>>
>> On Tue, Jul 23, 2019 at 5:47 AM Dawid Wysakowicz <dwysakow...@apache.org>
>> wrote:
>>
>>> I think we all agree so far that we should implement one of the short term
>>> solutions for 1.9 release (#2 or #3) and continue the discussion on option
>>> #1 for the next release. Personally I prefer option #2, because it is
>>> closest to the current behavior and as Kurt said it is the most intuitive
>>> one, but I am also fine with option #3
>>>
>>> To sum up the opinions so far:
>>>
>>> *option #2: 3 votes(Kurt, Aljoscha, me)*
>>>
>>> *option #3: 2 votes(Timo, Jingsong)*
>>>
>>> I wasn't sure which option out of the two Xuefu prefers.
>>>
>>> I would like to conclude the discussion by the end of tomorrow, so that we
>>> can prepare a proper fix as soon as possible. Therefore I would suggest to
>>> proceed with the option that gets the most votes until tomorrow (*July
>>> 24th 12:00 CET*), unless there are some hard objections.
>>>
>>>
>>> Comment on option #1 concerns:
>>>
>>> I agree with Jingsong on that. I think there are some benefits of the
>>> approach, as it makes Flink in control of the temporary tables.
>>>
>>> 1. We have a unified behavior across all catalogs. Also for the catalogs
>>> that do not support temporary tables natively.
>>>
>>> 2. As Flink is in control of the temporary tables it makes it easier to
>>> control their lifecycle.
>>>
>>> Best,
>>>
>>> Dawid
>>> On 23/07/2019 11:40, JingsongLee wrote:
>>>
>>> And I think we should recommend user to use catalog api to
>>> createTable and createFunction,(I guess most scenarios do
>>> not use temporary objects) in this way, it is good to option #3
>>>
>>> Best, JingsongLee
>>>
>>>
>>> --
>>> From:JingsongLee <lzljs3620...@aliyun.com.INVALID>
>>> Send Time:2019-07-23 (Tuesday) 17:35
>>> To:dev <dev@flink.apache.org>
>>> Subject:Re: [DISCUSS] Support temporary tables in SQL API
>>>
>>> Thanks Dawid and other people.
>>> +1 for using option #3 for 1.9.0 and go with option #1
>>> in 1.10.0.
>>>
>>> Regarding Xuefu's concern, I don't know how necessary it is for each 
>>> catalog to
>>> deal with tmpView. I think Catalog is different from DB, we can have single 
>>> concept for tmpView, that make user easier to understand.
>>>
>>> Regarding option #2, It is hard to use if we let user to use fully 
>>> qualified name for hive catalog. Would this experience be too bad to use?
>>>
>>> Best, Jingsong Lee
>>>
>>>
>>> --
>>> From:Kurt Young mailto:ykt...

Re: [DISCUSS] Support temporary tables in SQL API

2019-07-24 Thread Aljoscha Krettek
Isn’t https://issues.apache.org/jira/browse/FLINK-13279 already a sign that there 
are surprises for users if we go with option #3?

Aljoscha

> On 24. Jul 2019, at 00:33, Xuefu Z wrote:
> 
> I favored #3 if that wasn't obvious.
> 
> The usability issue with #2 makes Hive too hard to use. #3 keeps the old
> behavior for existing users who don't have Hive and thus have only one,
> in-memory catalog. If a user does register Hive, he/she understands that
> there are multiple catalogs and that a fully qualified table name is
> necessary. Thus, #3 has no impact (and no surprises) for existing users,
> and the new requirement of fully qualified names is only for users of the
> new feature (multiple catalogs), which seems very natural.
> 
> Thanks,
> Xuefu
> 
> On Tue, Jul 23, 2019 at 5:47 AM Dawid Wysakowicz <dwysakow...@apache.org>
> wrote:
> 
>> I think we all agree so far that we should implement one of the short term
>> solutions for 1.9 release (#2 or #3) and continue the discussion on option
>> #1 for the next release. Personally I prefer option #2, because it is
>> closest to the current behavior and as Kurt said it is the most intuitive
>> one, but I am also fine with option #3
>> 
>> To sum up the opinions so far:
>> 
>> *option #2: 3 votes(Kurt, Aljoscha, me)*
>> 
>> *option #3: 2 votes(Timo, Jingsong)*
>> 
>> I wasn't sure which option out of the two Xuefu prefers.
>> 
>> I would like to conclude the discussion by the end of tomorrow, so that we
>> can prepare a proper fix as soon as possible. Therefore I would suggest to
>> proceed with the option that gets the most votes until tomorrow (*July
>> 24th 12:00 CET*), unless there are some hard objections.
>> 
>> 
>> Comment on option #1 concerns:
>> 
>> I agree with Jingsong on that. I think there are some benefits of the
>> approach, as it makes Flink in control of the temporary tables.
>> 
>> 1. We have a unified behavior across all catalogs. Also for the catalogs
>> that do not support temporary tables natively.
>> 
>> 2. As Flink is in control of the temporary tables it makes it easier to
>> control their lifecycle.
>> 
>> Best,
>> 
>> Dawid
>> On 23/07/2019 11:40, JingsongLee wrote:
>> 
>> And I think we should recommend user to use catalog api to
>> createTable and createFunction,(I guess most scenarios do
>> not use temporary objects) in this way, it is good to option #3
>> 
>> Best, JingsongLee
>> 
>> 
>> --
>> From:JingsongLee <lzljs3620...@aliyun.com.INVALID>
>> Send Time:2019-07-23 (Tuesday) 17:35
>> To:dev <dev@flink.apache.org>
>> Subject:Re: [DISCUSS] Support temporary tables in SQL API
>> 
>> Thanks Dawid and other people.
>> +1 for using option #3 for 1.9.0 and go with option #1
>> in 1.10.0.
>> 
>> Regarding Xuefu's concern, I don't know how necessary it is for each catalog 
>> to
>> deal with tmpView. I think Catalog is different from DB, we can have single 
>> concept for tmpView, that make user easier to understand.
>> 
>> Regarding option #2, It is hard to use if we let user to use fully qualified 
>> name for hive catalog. Would this experience be too bad to use?
>> 
>> Best, Jingsong Lee
>> 
>> 
>> --
>> From:Kurt Young <ykt...@gmail.com>
>> Send Time:2019-07-23 (Tuesday) 17:03
>> To:dev <dev@flink.apache.org>
>> Subject:Re: [DISCUSS] Support temporary tables in SQL API
>> 
>> Thanks Dawid for driving this discussion.
>> Personally, I would +1 for using option #2 for 1.9.0 and go with option #1
>> in 1.10.0.
>> 
>> Regarding Xuefu's concern about option #1, I think we could also try to
>> reuse the in-memory catalog
>> for the builtin temporary table storage.
>> 
>> Regarding to option #2 and option #3, from user's perspective, IIUC option
>> #2 allows user to have
>> simple name to reference temporary table and should use fully qualified
>> name for external catalogs.
>> But option #3 provide the opposite behavior, user can use simple name for
>> external tables after he
>> changed current catalog an

Re: [DISCUSS] Support temporary tables in SQL API

2019-07-23 Thread Xuefu Z
I favored #3 if that wasn't obvious.

The usability issue with #2 makes Hive too hard to use. #3 keeps the old
behavior for existing users who don't have Hive and thus have only one,
in-memory catalog. If a user does register Hive, he/she understands that
there are multiple catalogs and that a fully qualified table name is
necessary. Thus, #3 has no impact (and no surprises) for existing users,
and the new requirement of fully qualified names is only for users of the
new feature (multiple catalogs), which seems very natural.

Thanks,
Xuefu

On Tue, Jul 23, 2019 at 5:47 AM Dawid Wysakowicz 
wrote:

> I think we all agree so far that we should implement one of the short term
> solutions for 1.9 release (#2 or #3) and continue the discussion on option
> #1 for the next release. Personally I prefer option #2, because it is
> closest to the current behavior and as Kurt said it is the most intuitive
> one, but I am also fine with option #3
>
> To sum up the opinions so far:
>
> *option #2: 3 votes(Kurt, Aljoscha, me)*
>
> *option #3: 2 votes(Timo, Jingsong)*
>
> I wasn't sure which option out of the two Xuefu prefers.
>
> I would like to conclude the discussion by the end of tomorrow, so that we
> can prepare a proper fix as soon as possible. Therefore I would suggest to
> proceed with the option that gets the most votes until tomorrow (*July
> 24th 12:00 CET*), unless there are some hard objections.
>
>
> Comment on option #1 concerns:
>
> I agree with Jingsong on that. I think there are some benefits of the
> approach, as it makes Flink in control of the temporary tables.
>
> 1. We have a unified behavior across all catalogs. Also for the catalogs
> that do not support temporary tables natively.
>
> 2. As Flink is in control of the temporary tables it makes it easier to
> control their lifecycle.
>
> Best,
>
> Dawid
> On 23/07/2019 11:40, JingsongLee wrote:
>
> And I think we should recommend user to use catalog api to
>  createTable and createFunction,(I guess most scenarios do
>  not use temporary objects) in this way, it is good to option #3
>
> Best, JingsongLee
>
>
> --------------------------
> From:JingsongLee
> Send Time:2019-07-23 (Tuesday) 17:35
> To:dev
> Subject:Re: [DISCUSS] Support temporary tables in SQL API
>
> Thanks Dawid and other people.
> +1 for using option #3 for 1.9.0 and go with option #1
>  in 1.10.0.
>
> Regarding Xuefu's concern, I don't know how necessary it is for each catalog 
> to
>  deal with tmpView. I think Catalog is different from DB, we can have single 
> concept for tmpView, that make user easier to understand.
>
> Regarding option #2, It is hard to use if we let user to use fully qualified 
> name for hive catalog. Would this experience be too bad to use?
>
> Best, Jingsong Lee
>
>
> --
> From:Kurt Young  
> Send Time:2019-07-23 (Tuesday) 17:03
> To:dev  
> Subject:Re: [DISCUSS] Support temporary tables in SQL API
>
> Thanks Dawid for driving this discussion.
> Personally, I would +1 for using option #2 for 1.9.0 and go with option #1
> in 1.10.0.
>
> Regarding Xuefu's concern about option #1, I think we could also try to
> reuse the in-memory catalog
> for the builtin temporary table storage.
>
> Regarding to option #2 and option #3, from user's perspective, IIUC option
> #2 allows user to have
> simple name to reference temporary table and should use fully qualified
> name for external catalogs.
> But option #3 provide the opposite behavior, user can use simple name for
> external tables after he
> changed current catalog and current database, but have to use fully
> qualified name for temporary
> tables. IMO, option #2 will be more straightforward.
>
> Best,
> Kurt
>
>
> On Tue, Jul 23, 2019 at 4:01 PM Aljoscha Krettek  
> 
> wrote:
>
>
> I would be fine with option 3) but I think option 2) is the more implicit
> solution that has less surprising behaviour.
>
> Aljoscha
>
>
> On 22. Jul 2019, at 23:59, Xuefu Zhang   
> wrote:
>
> Thanks to Dawid for initiating the discussion. Overall, I agree with Timo
> that for 1.9 we should have some quick and simple solution, leaving time
> for more thorough discussions for 1.10.
>
> In particular, I'm not fully with solution #1. For one thing, it seems
> proposing storing all temporary objects in a memory map in
>
> CatalogManager,
>
> and the memory map duplicates the functionality of the in-memory catalog,
> which also store temporary objects. For another, as pointed out by the
> google doc, different db may handle the temporary tables 

Re: [DISCUSS] Support temporary tables in SQL API

2019-07-23 Thread Dawid Wysakowicz
I think we all agree so far that we should implement one of the
short-term solutions for the 1.9 release (#2 or #3) and continue the
discussion on option #1 for the next release. Personally I prefer option
#2, because it is closest to the current behavior and, as Kurt said, it
is the most intuitive one, but I am also fine with option #3.

To sum up the opinions so far:

/option #2: 3 votes(Kurt, Aljoscha, me)/

/option #3: 2 votes(Timo, Jingsong)/

I wasn't sure which option out of the two Xuefu prefers.

I would like to conclude the discussion by the end of tomorrow, so that
we can prepare a proper fix as soon as possible. Therefore I suggest we
proceed with the option that has the most votes by tomorrow
(*July 24th 12:00 CET*), unless there are some hard objections.


Comment on option #1 concerns:

I agree with Jingsong on that. I think there are some benefits to the
approach, as it puts Flink in control of the temporary tables.

1. We have unified behavior across all catalogs, including the catalogs
that do not support temporary tables natively.

2. Because Flink is in control of the temporary tables, it is easier to
control their lifecycle.
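
To make the two points above concrete, here is a minimal Python sketch of
Flink-managed temporary tables. It is an illustration under assumptions, not
Flink's implementation: `SessionCatalogManager`, `create_temporary_table`,
`get_table` and `close` are hypothetical names, catalogs are modeled as plain
dictionaries, and letting a temporary table shadow a permanent one with the
same path is an assumed choice.

```python
class SessionCatalogManager:
    """Hypothetical session-scoped manager; not part of Flink's API."""

    def __init__(self, catalogs):
        # catalogs: catalog name -> {(database, table): definition}
        self._catalogs = catalogs
        # Temporary tables are kept outside every catalog, keyed by
        # the full (catalog, database, table) path.
        self._temporary = {}

    def create_temporary_table(self, path, definition):
        # Works for any catalog, even one without native temporary tables.
        self._temporary[path] = definition

    def get_table(self, path):
        # Assumption: a temporary table shadows a permanent one.
        if path in self._temporary:
            return self._temporary[path]
        catalog, database, table = path
        return self._catalogs.get(catalog, {}).get((database, table))

    def close(self):
        # Lifecycle is controlled in one place: the session end.
        self._temporary.clear()
```

Because the temporary map sits outside any catalog, the same calls behave
identically whether or not the underlying catalog knows about temporary
tables, and `close()` gives a single place to end their lifecycle.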

Best,

Dawid

On 23/07/2019 11:40, JingsongLee wrote:
> And I think we should recommend user to use catalog api to
>  createTable and createFunction,(I guess most scenarios do
>  not use temporary objects) in this way, it is good to option #3
>
> Best, JingsongLee
>
>
> --
> From:JingsongLee 
> Send Time:2019-07-23 (Tuesday) 17:35
> To:dev 
> Subject:Re: [DISCUSS] Support temporary tables in SQL API
>
> Thanks Dawid and other people.
> +1 for using option #3 for 1.9.0 and go with option #1
>  in 1.10.0.
>
> Regarding Xuefu's concern, I don't know how necessary it is for each catalog 
> to
>  deal with tmpView. I think Catalog is different from DB, we can have single 
> concept for tmpView, that make user easier to understand.
>
> Regarding option #2, It is hard to use if we let user to use fully qualified 
> name for hive catalog. Would this experience be too bad to use?
>
> Best, Jingsong Lee
>
>
> --------------
> From:Kurt Young 
> Send Time:2019-07-23 (Tuesday) 17:03
> To:dev 
> Subject:Re: [DISCUSS] Support temporary tables in SQL API
>
> Thanks Dawid for driving this discussion.
> Personally, I would +1 for using option #2 for 1.9.0 and go with option #1
> in 1.10.0.
>
> Regarding Xuefu's concern about option #1, I think we could also try to
> reuse the in-memory catalog
> for the builtin temporary table storage.
>
> Regarding to option #2 and option #3, from user's perspective, IIUC option
> #2 allows user to have
> simple name to reference temporary table and should use fully qualified
> name for external catalogs.
> But option #3 provide the opposite behavior, user can use simple name for
> external tables after he
> changed current catalog and current database, but have to use fully
> qualified name for temporary
> tables. IMO, option #2 will be more straightforward.
>
> Best,
> Kurt
>
>
> On Tue, Jul 23, 2019 at 4:01 PM Aljoscha Krettek 
> wrote:
>
>> I would be fine with option 3) but I think option 2) is the more implicit
>> solution that has less surprising behaviour.
>>
>> Aljoscha
>>
>>> On 22. Jul 2019, at 23:59, Xuefu Zhang  wrote:
>>>
>>> Thanks to Dawid for initiating the discussion. Overall, I agree with Timo
>>> that for 1.9 we should have some quick and simple solution, leaving time
>>> for more thorough discussions for 1.10.
>>>
>>> In particular, I'm not fully with solution #1. For one thing, it seems
>>> proposing storing all temporary objects in a memory map in CatalogManager,
>>> and the memory map duplicates the functionality of the in-memory catalog,
>>> which also store temporary objects. For another, as pointed out by the
>>> google doc, different db may handle the temporary tables differently, and
>>> accordingly it may make more sense to let each catalog to handle its
>>> temporary objects.
>>>
>>> Therefore, postponing the fix buys us time to flush out all the details.
>>>
>>> Thanks,
>>> Xuefu
>>>
>>> On Mon, Jul 22, 2019 at 7:19 AM Timo Walther  wrote:
>>>
>>>> Thanks for summarizing our offline discussion Dawid! Even though I would
>>>> prefer solution 1 instead of releasing half-baked features, I also
>>>> understand that the Table API should not further block the next release.
>>>> Therefore, I would be fine with solution 3 but introduce 

Re: [DISCUSS] Support temporary tables in SQL API

2019-07-23 Thread JingsongLee
Also, I think we should recommend that users use the catalog API for
createTable and createFunction (I guess most scenarios do not use
temporary objects). In that case, option #3 works well.

Best, JingsongLee


--
From:JingsongLee 
Send Time:2019-07-23 (Tuesday) 17:35
To:dev 
Subject:Re: [DISCUSS] Support temporary tables in SQL API

Thanks Dawid and other people.
+1 for using option #3 for 1.9.0 and go with option #1
 in 1.10.0.

Regarding Xuefu's concern, I don't know how necessary it is for each catalog to
 deal with tmpView. I think Catalog is different from DB, we can have single 
concept for tmpView, that make user easier to understand.

Regarding option #2, It is hard to use if we let user to use fully qualified 
name for hive catalog. Would this experience be too bad to use?

Best, Jingsong Lee


--
From:Kurt Young 
Send Time:2019-07-23 (Tuesday) 17:03
To:dev 
Subject:Re: [DISCUSS] Support temporary tables in SQL API

Thanks Dawid for driving this discussion.
Personally, I would +1 for using option #2 for 1.9.0 and go with option #1
in 1.10.0.

Regarding Xuefu's concern about option #1, I think we could also try to
reuse the in-memory catalog
for the builtin temporary table storage.

Regarding to option #2 and option #3, from user's perspective, IIUC option
#2 allows user to have
simple name to reference temporary table and should use fully qualified
name for external catalogs.
But option #3 provide the opposite behavior, user can use simple name for
external tables after he
changed current catalog and current database, but have to use fully
qualified name for temporary
tables. IMO, option #2 will be more straightforward.

Best,
Kurt


On Tue, Jul 23, 2019 at 4:01 PM Aljoscha Krettek 
wrote:

> I would be fine with option 3) but I think option 2) is the more implicit
> solution that has less surprising behaviour.
>
> Aljoscha
>
> > On 22. Jul 2019, at 23:59, Xuefu Zhang  wrote:
> >
> > Thanks to Dawid for initiating the discussion. Overall, I agree with Timo
> > that for 1.9 we should have some quick and simple solution, leaving time
> > for more thorough discussions for 1.10.
> >
> > In particular, I'm not fully with solution #1. For one thing, it seems
> > proposing storing all temporary objects in a memory map in CatalogManager,
> > and the memory map duplicates the functionality of the in-memory catalog,
> > which also store temporary objects. For another, as pointed out by the
> > google doc, different db may handle the temporary tables differently, and
> > accordingly it may make more sense to let each catalog to handle its
> > temporary objects.
> >
> > Therefore, postponing the fix buys us time to flush out all the details.
> >
> > Thanks,
> > Xuefu
> >
> > On Mon, Jul 22, 2019 at 7:19 AM Timo Walther  wrote:
> >
> >> Thanks for summarizing our offline discussion Dawid! Even though I would
> >> prefer solution 1 instead of releasing half-baked features, I also
> >> understand that the Table API should not further block the next release.
> >> Therefore, I would be fine with solution 3 but introduce the new
> >> user-facing `createTemporaryTable` methods as synonyms of the existing
> >> ones already. This allows us to deprecate the methods with undefined
> >> behavior as early as possible.
> >>
> >> Thanks,
> >> Timo
> >>
> >>
> >> Am 22.07.19 um 16:13 schrieb Dawid Wysakowicz:
> >>> Hi all,
> >>>
> >>> When working on FLINK-13279[1] we realized we could benefit from a
> >>> better temporary objects support in the Catalog API/Table API.
> >>> Unfortunately we are already long past the feature freeze that's why I
> >>> wanted to get some opinions from the community how should we proceed
> > >>> with this topic. I tried to prepare a summary of the current state and
> > >>> 3 different suggested approaches that we could take. Please see the
> >>> attached document[2]
> >>>
> >>> I will appreciate your thoughts!
> >>>
> >>>
> >>> [1] https://issues.apache.org/jira/browse/FLINK-13279
> >>>
> >>> [2]
> >>>
> >>
> https://docs.google.com/document/d/1RxLj4tDB9GXVjF5qrkM38SKUPkvJt_BSefGYTQ-cVX4/edit?usp=sharing
> >>>
> >>>
> >>
> >>
>
>


Re: [DISCUSS] Support temporary tables in SQL API

2019-07-23 Thread JingsongLee
Thanks Dawid and everyone else.
+1 for using option #3 for 1.9.0 and going with option #1
in 1.10.0.

Regarding Xuefu's concern, I don't know how necessary it is for each catalog to
deal with tmpView. I think a Catalog is different from a DB; we can have a single
concept for tmpView, which makes it easier for users to understand.

Regarding option #2, it is hard to use if we make users use fully qualified
names for the Hive catalog. Would that experience be too poor?

Best, Jingsong Lee


--
From:Kurt Young 
Send Time:2019-07-23 (Tuesday) 17:03
To:dev 
Subject:Re: [DISCUSS] Support temporary tables in SQL API

Thanks Dawid for driving this discussion.
Personally, I would +1 for using option #2 for 1.9.0 and go with option #1
in 1.10.0.

Regarding Xuefu's concern about option #1, I think we could also try to
reuse the in-memory catalog
for the builtin temporary table storage.

Regarding to option #2 and option #3, from user's perspective, IIUC option
#2 allows user to have
simple name to reference temporary table and should use fully qualified
name for external catalogs.
But option #3 provide the opposite behavior, user can use simple name for
external tables after he
changed current catalog and current database, but have to use fully
qualified name for temporary
tables. IMO, option #2 will be more straightforward.

Best,
Kurt


On Tue, Jul 23, 2019 at 4:01 PM Aljoscha Krettek 
wrote:

> I would be fine with option 3) but I think option 2) is the more implicit
> solution that has less surprising behaviour.
>
> Aljoscha
>
> > On 22. Jul 2019, at 23:59, Xuefu Zhang  wrote:
> >
> > Thanks to Dawid for initiating the discussion. Overall, I agree with Timo
> > that for 1.9 we should have some quick and simple solution, leaving time
> > for more thorough discussions for 1.10.
> >
> > In particular, I'm not fully with solution #1. For one thing, it seems
> > proposing storing all temporary objects in a memory map in CatalogManager,
> > and the memory map duplicates the functionality of the in-memory catalog,
> > which also store temporary objects. For another, as pointed out by the
> > google doc, different db may handle the temporary tables differently, and
> > accordingly it may make more sense to let each catalog to handle its
> > temporary objects.
> >
> > Therefore, postponing the fix buys us time to flush out all the details.
> >
> > Thanks,
> > Xuefu
> >
> > On Mon, Jul 22, 2019 at 7:19 AM Timo Walther  wrote:
> >
> >> Thanks for summarizing our offline discussion Dawid! Even though I would
> >> prefer solution 1 instead of releasing half-baked features, I also
> >> understand that the Table API should not further block the next release.
> >> Therefore, I would be fine with solution 3 but introduce the new
> >> user-facing `createTemporaryTable` methods as synonyms of the existing
> >> ones already. This allows us to deprecate the methods with undefined
> >> behavior as early as possible.
> >>
> >> Thanks,
> >> Timo
> >>
> >>
> >> Am 22.07.19 um 16:13 schrieb Dawid Wysakowicz:
> >>> Hi all,
> >>>
> >>> When working on FLINK-13279[1] we realized we could benefit from a
> >>> better temporary objects support in the Catalog API/Table API.
> >>> Unfortunately we are already long past the feature freeze that's why I
> >>> wanted to get some opinions from the community how should we proceed
> > >>> with this topic. I tried to prepare a summary of the current state and
> > >>> 3 different suggested approaches that we could take. Please see the
> >>> attached document[2]
> >>>
> >>> I will appreciate your thoughts!
> >>>
> >>>
> >>> [1] https://issues.apache.org/jira/browse/FLINK-13279
> >>>
> >>> [2]
> >>>
> >>
> https://docs.google.com/document/d/1RxLj4tDB9GXVjF5qrkM38SKUPkvJt_BSefGYTQ-cVX4/edit?usp=sharing
> >>>
> >>>
> >>
> >>
>
>


Re: [DISCUSS] Support temporary tables in SQL API

2019-07-23 Thread Kurt Young
Thanks Dawid for driving this discussion.
Personally, I would +1 using option #2 for 1.9.0 and going with option #1
in 1.10.0.

Regarding Xuefu's concern about option #1, I think we could also try to
reuse the in-memory catalog for the built-in temporary table storage.

Regarding option #2 and option #3, from the user's perspective, IIUC option
#2 lets users reference temporary tables by simple name, while tables in
external catalogs require fully qualified names. Option #3 provides the
opposite behavior: users can use simple names for external tables after
changing the current catalog and current database, but have to use fully
qualified names for temporary tables. IMO, option #2 will be more
straightforward.
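
A rough Python sketch of the difference as described above. It is only a
model of my reading (IIUC), not Flink code: it assumes temporary tables live
in a built-in catalog, and `BUILTIN`, `qualify`, `resolve_option2` and
`resolve_option3` are hypothetical names. Only simple and fully qualified
names are handled.

```python
# Assumed home of temporary tables (hypothetical catalog/database names).
BUILTIN = ("default_catalog", "default_database")


def qualify(name, base):
    """Expand `name` to (catalog, database, table), using `base` for simple names."""
    parts = name.split(".")
    if len(parts) == 3:
        return tuple(parts)
    if len(parts) == 1:
        return (base[0], base[1], parts[0])
    raise ValueError("only simple or fully qualified names are modeled")


def resolve_option2(name, current, tables):
    """Option #2 as read here: simple names always resolve in the built-in catalog."""
    path = qualify(name, BUILTIN)
    return path if path in tables else None


def resolve_option3(name, current, tables):
    """Option #3 as read here: simple names resolve in the current catalog/database."""
    path = qualify(name, current)
    return path if path in tables else None


tables = {
    ("default_catalog", "default_database", "tmp_orders"),  # temporary table
    ("hive", "sales", "orders"),                            # external table
}
current = ("hive", "sales")  # after switching catalog and database to Hive
```

With `current` pointing at Hive, `resolve_option2("tmp_orders", current, tables)`
finds the temporary table while `resolve_option2("orders", current, tables)`
finds nothing; under `resolve_option3` the two results are swapped, and the
temporary table needs its fully qualified name.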

Best,
Kurt


On Tue, Jul 23, 2019 at 4:01 PM Aljoscha Krettek 
wrote:

> I would be fine with option 3) but I think option 2) is the more implicit
> solution that has less surprising behaviour.
>
> Aljoscha
>
> > On 22. Jul 2019, at 23:59, Xuefu Zhang  wrote:
> >
> > Thanks to Dawid for initiating the discussion. Overall, I agree with Timo
> > that for 1.9 we should have some quick and simple solution, leaving time
> > for more thorough discussions for 1.10.
> >
> > In particular, I'm not fully with solution #1. For one thing, it seems
> > proposing storing all temporary objects in a memory map in CatalogManager,
> > and the memory map duplicates the functionality of the in-memory catalog,
> > which also store temporary objects. For another, as pointed out by the
> > google doc, different db may handle the temporary tables differently, and
> > accordingly it may make more sense to let each catalog to handle its
> > temporary objects.
> >
> > Therefore, postponing the fix buys us time to flush out all the details.
> >
> > Thanks,
> > Xuefu
> >
> > On Mon, Jul 22, 2019 at 7:19 AM Timo Walther  wrote:
> >
> >> Thanks for summarizing our offline discussion Dawid! Even though I would
> >> prefer solution 1 instead of releasing half-baked features, I also
> >> understand that the Table API should not further block the next release.
> >> Therefore, I would be fine with solution 3 but introduce the new
> >> user-facing `createTemporaryTable` methods as synonyms of the existing
> >> ones already. This allows us to deprecate the methods with undefined
> >> behavior as early as possible.
> >>
> >> Thanks,
> >> Timo
> >>
> >>
> >> Am 22.07.19 um 16:13 schrieb Dawid Wysakowicz:
> >>> Hi all,
> >>>
> >>> When working on FLINK-13279[1] we realized we could benefit from
> >>> better support for temporary objects in the Catalog API/Table API.
> >>> Unfortunately we are already long past the feature freeze, which is why
> >>> I wanted to get some opinions from the community on how we should
> >>> proceed with this topic. I tried to prepare a summary of the current
> >>> state and 3 different suggested approaches that we could take. Please
> >>> see the attached document[2].
> >>>
> >>> I will appreciate your thoughts!
> >>>
> >>>
> >>> [1] https://issues.apache.org/jira/browse/FLINK-13279
> >>>
> >>> [2]
> >>> https://docs.google.com/document/d/1RxLj4tDB9GXVjF5qrkM38SKUPkvJt_BSefGYTQ-cVX4/edit?usp=sharing
> >>>
> >>>
> >>
> >>
>
>


Re: [DISCUSS] Support temporary tables in SQL API

2019-07-23 Thread Aljoscha Krettek
I would be fine with option 3) but I think option 2) is the more implicit 
solution that has less surprising behaviour.

Aljoscha

> On 22. Jul 2019, at 23:59, Xuefu Zhang  wrote:
> 
> Thanks to Dawid for initiating the discussion. Overall, I agree with Timo
> that for 1.9 we should have some quick and simple solution, leaving time
> for more thorough discussions for 1.10.
> 
> In particular, I'm not fully on board with solution #1. For one thing, it
> seems to propose storing all temporary objects in a memory map in
> CatalogManager, and the memory map duplicates the functionality of the
> in-memory catalog, which also stores temporary objects. For another, as
> pointed out by the Google doc, different DBs may handle temporary tables
> differently, and accordingly it may make more sense to let each catalog
> handle its temporary objects.
> 
> Therefore, postponing the fix buys us time to flesh out all the details.
> 
> Thanks,
> Xuefu
> 
> On Mon, Jul 22, 2019 at 7:19 AM Timo Walther  wrote:
> 
>> Thanks for summarizing our offline discussion Dawid! Even though I would
>> prefer solution 1 instead of releasing half-baked features, I also
>> understand that the Table API should not further block the next release.
>> Therefore, I would be fine with solution 3 but introduce the new
>> user-facing `createTemporaryTable` methods as synonyms of the existing
>> ones already. This allows us to deprecate the methods with undefined
>> behavior as early as possible.
>> 
>> Thanks,
>> Timo
>> 
>> 
>> Am 22.07.19 um 16:13 schrieb Dawid Wysakowicz:
>>> Hi all,
>>> 
>>> When working on FLINK-13279[1] we realized we could benefit from
>>> better support for temporary objects in the Catalog API/Table API.
>>> Unfortunately we are already long past the feature freeze, which is why
>>> I wanted to get some opinions from the community on how we should
>>> proceed with this topic. I tried to prepare a summary of the current
>>> state and 3 different suggested approaches that we could take. Please
>>> see the attached document[2].
>>> 
>>> I will appreciate your thoughts!
>>> 
>>> 
>>> [1] https://issues.apache.org/jira/browse/FLINK-13279
>>> 
>>> [2]
>>> https://docs.google.com/document/d/1RxLj4tDB9GXVjF5qrkM38SKUPkvJt_BSefGYTQ-cVX4/edit?usp=sharing
>>> 
>>> 
>> 
>> 



Re: [DISCUSS] Support temporary tables in SQL API

2019-07-22 Thread Xuefu Zhang
Thanks to Dawid for initiating the discussion. Overall, I agree with Timo
that for 1.9 we should have some quick and simple solution, leaving time
for more thorough discussions for 1.10.

In particular, I'm not fully on board with solution #1. For one thing, it
seems to propose storing all temporary objects in a memory map in
CatalogManager, and the memory map duplicates the functionality of the
in-memory catalog, which also stores temporary objects. For another, as
pointed out by the Google doc, different DBs may handle temporary tables
differently, and accordingly it may make more sense to let each catalog
handle its temporary objects.

Therefore, postponing the fix buys us time to flesh out all the details.
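
As a rough illustration of the two placements discussed above, the following Java sketch contrasts a manager-level map of temporary objects with a catalog that tracks its own. The types are toy stand-ins, not Flink's Catalog or CatalogManager; databases are left out for brevity, and the lookup order (temporary entries checked first) is an assumption made for the example, not something taken from the design document.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Toy types only; they are not Flink's Catalog or CatalogManager.
final class TemporaryObjectsSketch {

    // Variant A: each catalog handles its own temporary objects.
    static final class ToyCatalog {
        private final Map<String, String> permanent = new HashMap<>();
        private final Map<String, String> temporary = new HashMap<>();

        void createTable(String name, String definition) {
            permanent.put(name, definition);
        }

        void createTemporaryTable(String name, String definition) {
            temporary.put(name, definition);
        }

        // Assumption: a temporary table shadows a permanent one of the same name.
        Optional<String> getTable(String name) {
            String temp = temporary.get(name);
            return Optional.ofNullable(temp != null ? temp : permanent.get(name));
        }
    }

    // Variant B: the manager keeps one map for all temporary objects,
    // keyed by "catalog.table", next to the registered catalogs.
    static final class ToyCatalogManager {
        private final Map<String, ToyCatalog> catalogs = new HashMap<>();
        private final Map<String, String> temporaryByPath = new HashMap<>();

        void registerCatalog(String name, ToyCatalog catalog) {
            catalogs.put(name, catalog);
        }

        void createTemporaryTable(String catalog, String table, String definition) {
            temporaryByPath.put(catalog + "." + table, definition);
        }

        // Assumption: the manager checks its own map before asking the catalog.
        Optional<String> getTable(String catalog, String table) {
            String temp = temporaryByPath.get(catalog + "." + table);
            if (temp != null) {
                return Optional.of(temp);
            }
            ToyCatalog c = catalogs.get(catalog);
            return c == null ? Optional.empty() : c.getTable(table);
        }
    }

    public static void main(String[] args) {
        ToyCatalog hive = new ToyCatalog();
        hive.createTable("orders", "permanent orders");

        ToyCatalogManager manager = new ToyCatalogManager();
        manager.registerCatalog("hive", hive);
        manager.createTemporaryTable("hive", "orders", "temporary orders");

        System.out.println(manager.getTable("hive", "orders").get()); // temporary orders
        System.out.println(hive.getTable("orders").get());            // permanent orders
    }
}
```

If the in-memory catalog already behaves like variant A, adding variant B on top means two places hold temporary objects, which is the duplication raised above.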

Thanks,
Xuefu

On Mon, Jul 22, 2019 at 7:19 AM Timo Walther  wrote:

> Thanks for summarizing our offline discussion Dawid! Even though I would
> prefer solution 1 instead of releasing half-baked features, I also
> understand that the Table API should not further block the next release.
> Therefore, I would be fine with solution 3 but introduce the new
> user-facing `createTemporaryTable` methods as synonyms of the existing
> ones already. This allows us to deprecate the methods with undefined
> behavior as early as possible.
>
> Thanks,
> Timo
>
>
> Am 22.07.19 um 16:13 schrieb Dawid Wysakowicz:
> > Hi all,
> >
> > When working on FLINK-13279[1] we realized we could benefit from
> > better support for temporary objects in the Catalog API/Table API.
> > Unfortunately we are already long past the feature freeze, which is why
> > I wanted to get some opinions from the community on how we should
> > proceed with this topic. I tried to prepare a summary of the current
> > state and 3 different suggested approaches that we could take. Please
> > see the attached document[2].
> >
> > I will appreciate your thoughts!
> >
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-13279
> >
> > [2]
> > https://docs.google.com/document/d/1RxLj4tDB9GXVjF5qrkM38SKUPkvJt_BSefGYTQ-cVX4/edit?usp=sharing
> >
> >
>
>


Re: [DISCUSS] Support temporary tables in SQL API

2019-07-22 Thread Timo Walther
Thanks for summarizing our offline discussion Dawid! Even though I would 
prefer solution 1 instead of releasing half-baked features, I also 
understand that the Table API should not further block the next release. 
Therefore, I would be fine with solution 3 but introduce the new 
user-facing `createTemporaryTable` methods as synonyms of the existing 
ones already. This allows us to deprecate the methods with undefined 
behavior as early as possible.
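
A minimal sketch of what "synonym plus deprecation" could look like in Java is shown below. ToyTableEnvironment is a stand-in and not Flink's TableEnvironment; registerTable stands for "one of the existing methods", and both signatures are assumptions made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for a table environment; not Flink's TableEnvironment.
final class ToyTableEnvironment {
    private final Map<String, String> temporaryTables = new HashMap<>();

    // New, explicitly named entry point (hypothetical signature).
    void createTemporaryTable(String name, String definition) {
        temporaryTables.put(name, definition);
    }

    /**
     * Existing entry point, kept as a synonym so that old code keeps working.
     *
     * @deprecated use {@link #createTemporaryTable(String, String)} instead.
     */
    @Deprecated
    void registerTable(String name, String definition) {
        createTemporaryTable(name, definition);
    }

    boolean hasTemporaryTable(String name) {
        return temporaryTables.containsKey(name);
    }

    public static void main(String[] args) {
        ToyTableEnvironment env = new ToyTableEnvironment();
        env.registerTable("old_style", "...");        // still works, but flagged
        env.createTemporaryTable("new_style", "...");
        System.out.println(env.hasTemporaryTable("old_style")); // true
        System.out.println(env.hasTemporaryTable("new_style")); // true
    }
}
```

Because the old method only delegates, both names behave identically, and the deprecation warning can steer users to the new name before the old behavior is changed or removed.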


Thanks,
Timo


Am 22.07.19 um 16:13 schrieb Dawid Wysakowicz:

Hi all,

When working on FLINK-13279[1] we realized we could benefit from
better support for temporary objects in the Catalog API/Table API.
Unfortunately we are already long past the feature freeze, which is why
I wanted to get some opinions from the community on how we should
proceed with this topic. I tried to prepare a summary of the current
state and 3 different suggested approaches that we could take. Please
see the attached document[2].

I will appreciate your thoughts!


[1] https://issues.apache.org/jira/browse/FLINK-13279

[2]
https://docs.google.com/document/d/1RxLj4tDB9GXVjF5qrkM38SKUPkvJt_BSefGYTQ-cVX4/edit?usp=sharing






[DISCUSS] Support temporary tables in SQL API

2019-07-22 Thread Dawid Wysakowicz
Hi all,

When working on FLINK-13279[1] we realized we could benefit from
better support for temporary objects in the Catalog API/Table API.
Unfortunately we are already long past the feature freeze, which is why
I wanted to get some opinions from the community on how we should
proceed with this topic. I tried to prepare a summary of the current
state and 3 different suggested approaches that we could take. Please
see the attached document[2].

I will appreciate your thoughts!


[1] https://issues.apache.org/jira/browse/FLINK-13279

[2]
https://docs.google.com/document/d/1RxLj4tDB9GXVjF5qrkM38SKUPkvJt_BSefGYTQ-cVX4/edit?usp=sharing



