Re: [DISCUSS] Support Interactive Programming in Flink Table API

Xingcan Cui Wed, 21 Nov 2018 08:46:39 -0800

Hi all,

Thanks for the replies.

@Becket I think whether to put the persist/cache methods in a separate util 
class or inside the DataSet/Table depends on what we want to introduce. The 
former sounds more like a data storage component, where users might even 
retrieve a stored DataSet/Table via an ID or something similar, whereas the 
latter sounds like just a cache mechanism. I’m not quite sure which one we 
really need, but either approach is acceptable to me.

@Shaoxuan Yes, maybe “generally” is a more accurate word here. As the Table API 
only works with row-type records, I just wondered whether a cache for it can 
be generalized to arbitrary data types. Anyway, if contributions can be made to 
enhance the Table API and rebuild other libs on it, that’s not a problem. 
Another point is, as I replied to @Becket, whether we introduce only a cache 
mechanism or a data storage component. IMO, compared to data storage, the cache 
could be volatile, which means it serves only to (possibly) accelerate 
execution and doesn’t need to guarantee that the DataSets/Tables still exist.
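
To make the distinction concrete, here is a minimal Java sketch. It is not 
Flink API and not part of the proposal: `VolatileCache`, `DataStore`, and all 
method names are hypothetical, and plain in-memory maps stand in for the real 
backends. It only illustrates the two contracts: a cache may drop an entry, so 
callers must be able to recompute it, while a storage component is expected to 
return what was stored under an ID.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

public class CacheVsStorageSketch {

    /** Hypothetical volatile cache: entries may vanish, so callers supply a way to recompute. */
    public static final class VolatileCache<T> {
        private final Map<String, T> entries = new HashMap<>();

        public void put(String id, T value) {
            entries.put(id, value);
        }

        /** Simulates losing an entry, e.g. under memory pressure or after a failure. */
        public void evict(String id) {
            entries.remove(id);
        }

        /** Returns the cached value if still present; otherwise recomputes and re-caches it. */
        public T getOrRecompute(String id, Supplier<T> recompute) {
            T cached = entries.get(id);
            if (cached != null) {
                return cached;
            }
            T fresh = recompute.get();
            entries.put(id, fresh);
            return fresh;
        }
    }

    /** Hypothetical data storage: once stored, a value is expected to be retrievable by ID. */
    public static final class DataStore<T> {
        private final Map<String, T> entries = new HashMap<>();

        public void store(String id, T value) {
            entries.put(id, value);
        }

        public Optional<T> lookup(String id) {
            return Optional.ofNullable(entries.get(id));
        }
    }

    public static void main(String[] args) {
        VolatileCache<String> cache = new VolatileCache<>();
        cache.put("t1", "rows of t1");
        cache.evict("t1"); // the cache is allowed to lose the entry
        // The caller still gets a result because it can be recomputed.
        System.out.println(cache.getOrRecompute("t1", () -> "rows of t1"));

        DataStore<String> store = new DataStore<>();
        store.store("t1", "rows of t1");
        // A storage component is expected to still hold the data when asked by ID.
        System.out.println(store.lookup("t1").isPresent());
    }
}
```

With the cache contract, correctness never depends on the cached copy; with 
the storage contract, users may rely on the stored result being there.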

What do you think?

Best,
Xingcan

> On Nov 21, 2018, at 5:44 AM, Ruidong Li <[email protected]> wrote:
> 
> Hi Becket,
> 
> I think the Flink Service is a good abstraction, with which we can easily
> build Interactive Programming or other features.
> We might introduce the concept of a 'Session'; then we can think of Flink
> Services as system processes and user jobs as user processes, so
> lifecycle management needs to be discussed.
> 
> Kind Regards
> Xpray
> 
> 
> 
> On Wed, Nov 21, 2018, at 1:10 AM, Xingcan Cui <[email protected]> wrote:
> 
>> Hi Becket,
>> 
>> Thanks for bringing this up! The intermediate cache problem has long
>> been a pain point of the Flink streaming model. As far as I know, it’s
>> quite a blocker for iterative operations in batch-related libs such as
>> Gelly and FlinkML.
>> 
>> Actually, there’s an old JIRA[1] that aims to solve the cache problem more
>> “thoroughly”. Compared with your proposal, it implements persistence at
>> the DataSet level, which also allows internal operations built on the
>> DataSet API to benefit.
>> 
>> I totally understand the importance of the Table API, but I just wonder
>> whether we should consider this problem from a broader perspective, i.e.,
>> adding a `PersistentService` rather than a `TablePersistentService` (as
>> described in the "Flink Services" section).
>> 
>> Thanks,
>> Xingcan
>> 
>> [1] https://issues.apache.org/jira/browse/FLINK-1730
>> 
>>> On Nov 20, 2018, at 8:56 AM, Becket Qin <[email protected]> wrote:
>>> 
>>> Hi all,
>>> 
>>> As a few recent email threads have pointed out, it is a promising
>>> opportunity to enhance Flink Table API in various aspects, including
>>> functionality and ease of use among others. One of the scenarios where we
>>> feel Flink could improve is interactive programming. To explain the
>>> issues and facilitate the discussion on the solution, we put together the
>>> following document with our proposal.
>>> 
>>> 
>>> https://docs.google.com/document/d/1d4T2zTyfe7hdncEUAxrlNOYr4e5IMNEZLyqSuuswkA0/edit?usp=sharing
>>> 
>>> Feedback and comments are very welcome!
>>> 
>>> Thanks,
>>> 
>>> Jiangjie (Becket) Qin
>> 
>>
